|Action|Three Waivers (ID, DD, DS) Redesign|
|Comment Period|Ends 3/31/2021|
SIS bait and switch
Section 12 VAC 30-122-200 requires the use of the SIS and provides some specific requirements for the use of the scores generated by that instrument; unfortunately, it does not provide specific regulatory requirements for the implementation of the SIS, and it overly limits the areas to be considered when applying the results of the instrument.
- Positive role of the SIS user's manual: All of the accolades showered on the SIS by those promoting this regulation ("proven valid and reliable," "norm referenced," "nationally tested," "multidimensional," "scientifically proven," "replicated in peer-reviewed journals," etc.) are true only for the SIS approach outlined in the Supports Intensity Scale User's Manual published by AAIDD. The key role of the user's manual in earning these accolades was identified as a strength of the SIS by Y. Viriyangkura in his 2013 work Understanding the support needs of people with intellectual disability and related developmental disability through cluster analysis and factor analysis of statewide data (page 66): "the SIS users manual provides detailed instructions and case studies on how to administer the measurement. When assessors understand how to conduct the SIS the quality of the information from the scale is likely to increase." The Department of Health and Human Services guidelines for Responsible Conduct in Data Management make the same point: "prevention (i.e., forestalling problems with data collection) ... is best demonstrated by the standardization of protocol developed in a comprehensive and detailed procedures manual for data collection. Poorly written manuals increase the risk of failing to identify problems and errors." These same guidelines indicate that "regardless of the discipline, comprehensive documentation of the collection process before during and after the activity is essential to preserving data integrity." Clearly, adherence to the user's manual (the manual specific to the SIS, and the one used in all of the claimed replications) is essential to every one of the accolades repeatedly showered on the SIS.
- Implementation of the SIS in Virginia is not consistent with the user's manual: Since its introduction in Virginia, the scoring sheet and instructions provided to respondents during administration of the SIS have deviated significantly from the scoring instructions in the user's manual. Stark evidence of this fact is printed as a disclaimer at the bottom of each scoring sheet: "the material contained herein does not constitute a change in SIS scoring metric and is not part of the system users manual." The statement that the material is not part of the user's manual is correct; the claim that it does not constitute a change in the scoring metric needs more scrutiny. That claim may be true if it refers only to changes made since the instrument was introduced to Virginia; it is definitely not true if it is meant to declare that the SIS scoring metric identified in the user's manual has not been changed. In fact, both the scoring metric identified in the user's manual and the use of SIS scores to establish levels of support need have been significantly changed by the implementation practices used in Virginia (see below). These significant changes render suspect any extension of the SIS's accolades to the Virginia system, because the lack of a standardized approach to a measurement instrument guarantees that the proven results of one are not transferable to the other. B. Rammstedt, in the journal Measurement Instruments for the Social Sciences, 2019 (page 2), states that "measurement instruments are the central tools to acquire sound scientifically based knowledge ... requiring at least a standardized approach to collecting information and integrating survey responses or other participant data, before making inferences at the construct level and quantifying individual differences."
This analysis warrants significant skepticism toward the Virginia system for implementing the SIS, because there are clear inconsistencies between the Virginia system and the tested system in the user's manual. Tellingly, every one of these inconsistencies has the effect of lowering an individual's score:
- The Virginia system implements a "dominant activity" approach to scoring the type of support when multiple support types are used in a single activity. The user's manual, however, is very clear that the highest type of support, not the dominant type, should be recorded in the scoring metric. Page 76: "the highest rating of the different types of support needed should be recorded in the case where multiple supports are needed." Page 25: "each activity should be rated according to what dimensions of support are needed to promote participation of the person in successfully completing all aspects of the activity." And page 75 gives a direct instruction in precisely the area where the dominant activity approach is abused: "when another person is needed to complete a function or task in place of the individual this should be rated full physical assistance." Thus, individual scores generated by the dominant activity approach will frequently be lower than scores generated by following the tested user's manual.
- The Virginia system violates even its own dominant activity standard in order to further lower individual scores. The threshold for a higher type-of-support rating in the user's manual is clear: any inclusion of a support type in the successful completion of a function results in the higher rating. The Virginia system reverses this standard 180°, imposing a similar threshold that consistently lowers scores instead. Specifically, when full physical assistance is clearly the dominant activity required for the successful completion of a function, the State system treats any contribution by the individual, no matter how minuscule, as a reason to record not the dominant activity but the next lower score. Examples abound from actual SIS assessments: in library use, where every aspect of successfully finding a book required full physical assistance, the assessor lowered the score to partial because the individual could indicate a subject area (i.e., animals); in banking, again dominated by full physical assistance in accounting and check writing, the assessor lowered the score to partial because the individual could physically hand the check to the teller; and in doctor's appointments, where full physical assistance was required for scheduling, reporting, and follow-through, the assessor lowered the score to partial because the individual could say "sick" to start the process. The examples go on, but the point is made: the user's manual requirement to score the successful completion of all aspects of the activity is not being adhered to, and the State system has completely reversed the direction of the threshold analysis in the user's manual. As a result, individual scores will consistently be significantly lower than scores generated by following the tested user's manual.
- The Virginia system has implemented response restrictions that preclude entering the appropriate score for an individual. Even when the individual, all respondents, and the assessor unanimously agree on a particular score for an item, the computer frequently refuses to accept the entry because of blanket, non-individualized restrictions placed on the higher scores. No such restriction is identified anywhere in the user's manual, creating a clear inconsistency that lowers an individual's score in the Virginia system. While the total number of restrictions is unclear, more of them, in more areas, continue to appear with each new SIS assessment, both in the scoring of the domain areas and in the Virginia supplemental questions. Since the State strictly prohibits respondents from having even pencil and paper to document these restrictions, no one but the State knows the actual number of areas negatively impacted. These restrictions impose excessive standardization, which can destroy data integrity, as reported in a National Academies of Sciences, Engineering, and Medicine workshop summary, 2011 (page 10): "standardization can entail the loss of information and too much standardization may make extensive evidence uninformative and misleading." Since these restrictions have always required the recording of a lower score and never a higher score, individual scores will consistently be significantly lower than scores generated by following the tested user's manual.
- The Virginia system for applying SIS scores to establish levels of support need, required by regulation (12 VAC 30-122-200 A.4), restricts consideration to only 2 of the 3 sections of SIS scores and allows consideration of only half of the elements in the Support Needs Scale section. This approach is clearly not consistent with the user's manual, which instructs that all sections be used in full when establishing an individual's level of support need. According to E. Drost of California State University, in Education Research and Perspectives, January 2011 (pages 113-114), such a restriction in the number of items considered can significantly reduce the reliability claimed by an instrument, to below acceptable confidence levels. The validity of the SIS instrument likewise depends on the excluded areas. M. Wehmeyer et al., in the American Journal of Intellectual and Developmental Disabilities, January 2009, make this clear on pages 3 and 13: "moreover, scores from different sections of the SIS made unique contributions to explaining variance associated with a variety of support need proxies.... Finally, it is clear that models including both SIS SNI scores and the section 3 medical and behavioral raw scores were stronger predictors than any one section alone and any potential use should involve all of these indicators." One example of the impact of these restrictions is the exclusion of lifelong learning. Given the regulatory mandate to provide training and outcome progress as the dominant type of support required in services, assessing an individual's ability to learn, in order to determine how much support will be needed in providing that training, would appear particularly pertinent to establishing the individual's level of support need in these services. M. Wehmeyer et al., in the Journal of Special Education Technology, April 2012, report that "across most groups, lifelong learning was the domain in which the highest intensity of support needs was reported"; thus, this exclusion is completely inappropriate for the needs being evaluated and significantly lowers overall individual scores. Similar but distinct problems arise from the other exclusions, again resulting in mis-evaluation and lower overall scores. The Virginia system of applying SIS scores to establish levels of support need increases the systemic risk that support needs will be mis- or under-evaluated, because individual scores will consistently be significantly lower than the results generated by adherence to the user's manual.
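To make the first of these scoring differences concrete, the gap between the manual's "highest type of support" rule and a "dominant (most frequent) type" rule can be sketched in a few lines. The numeric codes and the per-step ratings below are purely hypothetical illustrations, not the actual SIS metric:

```python
from statistics import mode

# Hypothetical ordinal codes for type of support (illustrative only,
# NOT the actual SIS metric): 0 = none ... 4 = full physical assistance.

def score_per_manual(ratings):
    """User's manual rule: record the HIGHEST type of support needed
    when multiple support types are used within one activity."""
    return max(ratings)

def score_dominant(ratings):
    """'Dominant activity' rule: record the MOST FREQUENT support type,
    even when some steps require full physical assistance."""
    return mode(ratings)

# Hypothetical activity: three steps need verbal prompting (1), but one
# step can only be completed with full physical assistance (4).
steps = [1, 1, 1, 4]
print(score_per_manual(steps))  # 4 — manual rule records full physical
print(score_dominant(steps))    # 1 — dominant rule records a lower type
```

Whenever the highest-rated step is not also the most frequent one, the dominant rule produces the lower of the two results, which is the direction of bias described above.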
The Virginia system for implementing the SIS creates significant, distinct, and meaningful differences between the scores and level assignments it generates and those that would be generated using the procedures in the tested user's manual. These different results, as shown above, cannot lay claim to the same degree of accuracy, reliability, or validity as the SIS administered per the user's manual. In fact, when contacted and directly asked about the dominant activity approach, an AAIDD representative could not identify any study in which the dominant activity approach to scoring had been employed, much less one in which it was proven accurate, reliable, and valid; when pressed, the representative offered only "perhaps some of the international studies." Likewise, the representative was unable to identify any information in the public domain or in peer-reviewed articles about the dominant activity approach. Regardless of what the international studies show, B. Rammstedt (citation above, page 5) indicates that such results are not readily transferable to the Virginia population.
- The Virginia system for implementing the SIS is an untested, unproven, and highly dubious system of data collection and use, of exactly the kind the Department of Health and Human Services warns against in their guidelines for Responsible Conduct in Data Management: "while the degree of impact from faulty data collection may vary by discipline and the nature of investigation, there is the potential to cause disproportionate harm when these research results are used to support public policy recommendations."
- The risk is magnified when you consider that acceptance of the regulation as written would establish a precedent allowing completely unchecked manipulation of SIS scoring and its use for level assignments, inviting more egregious abuses in the future. The State, which has a sordid history in this area, having spent most of the 21st century ranked 49th among states for resource availability and having required federal lawsuits to even care, has openly declared that this system is being used to stretch available resources to cover more individuals. A clandestine, denied, and unchecked ability to make changes that lower individuals' scores would provide a mechanism to achieve this objective illegitimately, at the expense of justice and equity for individuals.
Preemptively, because the State provides no opportunity for rejoinder: State representatives have occasionally responded to criticisms by asserting that the instrument is "robust," which would be a valid response were it true. Robustness, however, can be tested comparatively, statistically, and empirically, and there is absolutely no direct evidence that any such verification has ever been attempted or completed for the changes implemented by the Virginia system. Moreover, given both the direction and the magnitude of the changes from the user's manual indicated above, the claim of robustness would fail even the minimal non-statistical requirement for robustness provided by T. Plumper and E. Neumayer in their work Robustness Test and Statistical Inference: "most applied scholars even today define robustness through an extreme bounds analysis: a baseline model estimate is robust to plausible alternative model specifications [i.e., scoring changes] if and only if all estimates have the same direction and are statistically significant." As the analysis above makes painstakingly clear, the Virginia system changes the direction of the specifications, and any direct statistical comparison between results generated by the user's manual and by the Virginia system would find statistically significantly lower scores from the Virginia system; claims of robustness therefore do not answer the criticisms in this analysis. Finally, it is important to consider the source: AAIDD is making millions off of the system by keeping its State customer happy, and its offer of a robustness smokescreen, without any empirical testing or delineated rationale to support the claim, should be taken with more than a grain of salt.
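For illustration, the kind of paired comparison that could test whether one scoring implementation yields systematically lower results than another can be sketched as follows. The score values are invented; a consistent, statistically significant one-directional gap would contradict any claim that the two systems are interchangeable:

```python
from math import sqrt
from statistics import mean, stdev

def systematic_gap(baseline, alternative, t_crit=2.0):
    """Return True when the alternative scores differ from the baseline
    in one consistent direction AND the mean paired difference is
    statistically significant (paired t statistic beyond t_crit)."""
    diffs = [a - b for a, b in zip(alternative, baseline)]
    one_direction = all(d < 0 for d in diffs) or all(d > 0 for d in diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))
    return one_direction and abs(t) > t_crit

# Invented example: manual-based scores vs. consistently lower scores
# for the same six hypothetical individuals.
manual_based = [10, 12, 9, 11, 13, 10]
lowered = [8, 9, 7, 9, 10, 8]
print(systematic_gap(manual_based, lowered))  # True: consistent, significant gap
```

A result of True for real assessment data would show a systematic shift between the two implementations, which is precisely the condition under which results proven for one system cannot simply be claimed for the other.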
Recommendations:
- 1st: The State could demonstrate fidelity to the model it continually uses to justify this regulation and implement the SIS with strict adherence to the user's manual.
- 2nd: The State should acknowledge the existence of changes from the tested and proven user's manual; provide a justification, rationale, and empirical evidence for the appropriateness of those changes; and make adjustments in level assignments as warranted by the information discovered in that analysis and by verification of its use of the instrument.
- 3rd: Regulatory protections for individuals subjected to the SIS for determining their level of support needs, and hence their resources, should be written directly into the regulation, prohibiting any changes to the scoring system that are not verified as appropriate and preventing the implementation of future changes without an appropriate system of empirical evaluation and meaningful checks and balances.
- 4th: The prohibition against the individual and respondents having even blank paper and a pencil should be rescinded to promote the preservation of data integrity, which is essential, as the Department of Health and Human Services guidelines for Responsible Conduct in Data Management indicate: "regardless of the discipline, comprehensive documentation of the collection process before, during and after the activity is essential to preserving data integrity." Without this change, an independent check to preserve data integrity will not be possible.
The State has gone to great lengths to sell the SIS and these regulatory changes by dressing them up in the language of scientific certainty and the cloak of reliability and validity, BUT WHAT WE WERE PROMISED IS NOT WHAT WE GOT.