What now? Reporting and communication strategies
Background
Once a state has determined the impact of an interruption or irregularity on student scores, it must then decide how those scores should be treated. The options range from doing nothing to suppressing the scores entirely. We will review these options, discuss the implications of each, and then turn to communication strategies.
Options
- Do nothing
- Estimate scores on unaffected items
- Adjust scores on affected items
- Give credit for affected items
- Invalidate scores
Do Nothing
Involves scoring and reporting in the usual manner. Possibly a good option if analyses reveal little to no impact.
Pro:
- Fewest immediate implications for the state’s reporting and accountability system
Cons:
- Although overall results may suggest the impact is not systematic, some students or schools may still be adversely affected.
- Any adverse or beneficial impact that is ignored erodes longitudinal comparability for future administrations.
Estimate Scores on Unaffected Items
If the issue affected only a few items, it may be possible to estimate the affected students’ scores (or all students’ scores) using only the remaining items; that is, the affected items would be dropped from scoring (a simple rescoring sketch follows below).
Pro:
- Students will receive scores, which minimizes the near-term impact on reporting and accountability.
Cons:
- Possible change in test specifications
- Scores will be somewhat less precise
- It is important to evaluate the impact on the construct and on precision along the full scale (i.e., the conditional standard error) for assessments that are adjusted.
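A minimal sketch of this option, assuming dichotomously scored (0/1) items and a simple percent-correct rescoring; the function name, the data, and the affected-item indices are hypothetical. An operational program would instead re-estimate scores under the assessment's IRT model and examine the conditional standard error at each score point.

```python
# Minimal sketch: rescore students on unaffected items only.
# Assumes dichotomously scored (0/1) items and a percent-correct metric;
# an operational program would re-estimate IRT ability and report the
# conditional standard error at each score point.

def rescore_without_items(responses, affected_items):
    """responses: dict of student_id -> list of 0/1 item scores.
    affected_items: indices of items to drop from scoring."""
    affected = set(affected_items)
    rescored = {}
    for student, items in responses.items():
        kept = [score for i, score in enumerate(items) if i not in affected]
        rescored[student] = sum(kept) / len(kept)  # percent correct on unaffected items
    return rescored

# Hypothetical example: item 2 was affected by the interruption.
responses = {"S001": [1, 1, 0, 1, 0], "S002": [0, 1, 1, 1, 1]}
print(rescore_without_items(responses, affected_items=[2]))
```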
Adjust Affected Scores
Involves adjusting student scores to offset the estimated impact of the issue (a minimal sketch follows below).
Pros:
- Provides a path forward to report scores.
- In principle, attends to the estimated impact of the disruption.
Cons:
- It is unlikely that all students were affected evenly, so a single adjustment may work well in some cases but not others.
- More complex or conditional adjustments may be difficult to implement and explain.
- The adjustments themselves may introduce error into the outcomes, adding uncertainty to accountability uses and longitudinal comparability.
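A minimal sketch of a single uniform adjustment, assuming the impact analysis produced one estimated offset (in score points) for the affected group; all names and values are illustrative. As noted above, a constant adjustment cannot reflect uneven impact across students.

```python
# Minimal sketch of a single uniform adjustment. A more defensible
# approach would condition the adjustment on, e.g., how much of the test
# each student completed before the interruption.

def adjust_scores(scores, affected_students, estimated_impact):
    """Add a constant offset to the scores of affected students."""
    return {
        student: score + (estimated_impact if student in affected_students else 0)
        for student, score in scores.items()
    }

# Hypothetical values for illustration only.
scores = {"S001": 312, "S002": 298, "S003": 305}
print(adjust_scores(scores, affected_students={"S002"}, estimated_impact=4))
```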
Give Credit for Affected Items
Involves simply counting the affected items as correct, for either all students or only the affected students (a brief sketch follows below).
Pros:
- Preserves the ability to report scores and directly addresses the threat that the interruption may depress performance.
- Targets the items determined to be problematic.
Cons:
- May artificially increase scores for some students.
- Can inflate error and erode longitudinal comparability.
- Limited utility when the impact of the interruption cannot be clearly isolated to a small number of items.
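A brief sketch of crediting affected items as correct, assuming 0/1 item scores; the data and the choice of affected students are hypothetical.

```python
# Minimal sketch: count affected items as correct, either for all
# students or only for the affected students.

def credit_affected_items(responses, affected_items, affected_students=None):
    """Set affected items to correct (1). If affected_students is None,
    the credit is applied to every student."""
    credited = {}
    for student, items in responses.items():
        if affected_students is None or student in affected_students:
            items = [1 if i in affected_items else score
                     for i, score in enumerate(items)]
        credited[student] = list(items)
    return credited

# Hypothetical example: item 2 was affected, and only S001 was disrupted.
responses = {"S001": [1, 0, 0, 1], "S002": [0, 1, 0, 1]}
print(credit_affected_items(responses, affected_items={2}, affected_students={"S001"}))
```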
Invalidate
The state may choose to invalidate the scores of all students affected by the issue. This typically involves suppressing student-level scores; however, there is a range of alternatives.
Pros:
- Upholds an “abundance of caution” to prevent potential detrimental impact.
- Prevents misuse or misinterpretation of results.
Con:
- Implications for reporting and accountability in the current and subsequent years.
Invalidation Options
Florida Experience
- At the beginning of the testing windows, and again later in the window, some students experienced difficulties logging in.
- The majority of the issues were the result of cyber-attacks and problems with system updates.
- The Legislature called for an independent third-party review of the entire assessment system, with the report to be released by Sept. 1, 2015.
Individual and Group-Level Stakes
Individual:
- To be promoted, Grade 3 students must score at or above Level 2 on ELA (“good cause” exemptions exist, however).
- Students must score at or above Level 3 on Grade 10 ELA and the Algebra 1 EOC to graduate (retakes and alternative assessments are provided for).
- EOCs count as 30% of the course grade.
Group:
- School grades, which include learning gains, acceleration, and improvement in the performance of the lowest 25% of students
- School recognition dollars
- District grades (based on school grades)
- Teacher evaluation (value-added model; VAM scores count at least 33% toward the evaluation)
Reporting Individual Scores
- Although Grade 3 ELA was paper-based, legislative action during test administration called for districts to be provided lists of students scoring in the lowest quintile statewide, for consideration, along with other factors, in promotion and retention decisions (a simple quintile flag is sketched below).
- The lists were provided on June 4, 2015. No score information was released.
- Normally, about 7%-9% of students are retained each year; only 4% were retained in 2014-2015.
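A simple sketch of flagging the lowest quintile statewide, assuming a conventional percentile-rank definition (percent scoring below, plus half of those tied); it is not Florida's actual procedure or file layout, and the data are illustrative.

```python
# Minimal sketch: flag students whose statewide percentile rank is below 20.

def lowest_quintile(scores):
    """scores: dict of student_id -> score. Returns the flagged student IDs."""
    values = list(scores.values())
    n = len(values)
    flagged = set()
    for student, score in scores.items():
        below = sum(1 for v in values if v < score)
        tied = sum(1 for v in values if v == score)
        pr = 100.0 * (below + 0.5 * tied) / n  # percentile rank, 0-100
        if pr < 20:
            flagged.add(student)
    return flagged

# Illustrative data: 100 students with scores 1-100 flags the bottom 20.
scores = {"S%03d" % i: i for i in range(1, 101)}
print(sorted(lowest_quintile(scores)))
```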
Reporting Individual Scores
- All other results were held until after the third-party report was released.
- Because no EOC scores were available until the release in October 2015, districts were unable to use the scores to calculate course grades.
- Initial scores were released as both state percentile ranks and T-scores (the T-score metric is sketched below).
- The assessment graduation requirement for the new tests applied mostly to underclassmen.
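A short sketch of the T-score metric (mean 50, standard deviation 10); the reference mean and standard deviation below are placeholders, not the values used operationally.

```python
# Minimal sketch: convert raw scores to T-scores, T = 50 + 10 * (x - mean) / sd.

def to_t_scores(raw_scores, ref_mean, ref_sd):
    """raw_scores: dict of student_id -> raw score; returns T-scores."""
    return {student: 50 + 10 * (x - ref_mean) / ref_sd
            for student, x in raw_scores.items()}

# Hypothetical reference values for illustration only.
raw = {"S001": 34, "S002": 41, "S003": 27}
print(to_t_scores(raw, ref_mean=34.0, ref_sd=7.0))
```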
Reporting Group-Level Results
- 2014-2015 results were always intended to be used only as a baseline for school, district, and teacher accountability.
- Baseline school grades were released following standard setting in fall 2015 and State Board approval of cut scores in January 2016. The 2015 scores were retrofitted with the new achievement level cuts for this purpose (a classification sketch follows below).
- Schools or districts could appeal grades or VAM scores if previously unreported technical issues could be shown to have potentially affected performance.
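A small sketch of retrofitting achievement level cuts, classifying scale scores against an ordered list of level minimums; the cut scores shown are placeholders, not the Board-approved values.

```python
# Minimal sketch: classify a scale score into an achievement level given
# the minimum scale score for each level above Level 1.
import bisect

def achievement_level(score, cuts):
    """cuts: ascending minimum scores for Levels 2, 3, 4, ...; returns the level."""
    return bisect.bisect_right(cuts, score) + 1

cuts = [300, 310, 325, 340]  # hypothetical minimums for Levels 2-5
for s in (295, 312, 340):
    print(s, "-> Level", achievement_level(s, cuts))
```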
Communications
Communications – During Administration
- Balance between the need to be transparent and avoiding additional, unwarranted anxiety
- Emphasis on the security of student data and test content with respect to cyber-attacks
Communications – Immediately Following Test Administration
- Difficult balance between withholding scores and stakeholder desire to see results in a timely fashion
- Provision and use of Grade 3 ELA percentile ranks
- Unavailability of EOC scores for use in calculating course grades while awaiting the report (“Can’t use what you don’t have.”)
Communications – After Release of 3rd-Party Study (Sept. 1, 2015)
- Highlight what was positive in the report.
Regarding student-level decisions:
- The report noted that scores for students taking computer-based tests should not be used as the sole determinant for promotion or graduation eligibility. In Florida, this has always been the case.
- Students who earned a passing score (determined using equipercentile linking to the prior year; sketched below) still met the assessment graduation requirement.
- Students who did not earn a passing score have options: retakes, SAT/ACT concordant scores, or comparative scores on the state college-readiness test.
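A rough sketch of an equipercentile link to the prior year's scale, assuming unsmoothed score distributions from both years and illustrative data; an operational linking would use smoothed distributions and the program's psychometric software.

```python
# Minimal sketch: map a new-form score to the prior-year scale by matching
# percentile ranks in the two score distributions (equipercentile linking).
import numpy as np

def equipercentile_link(new_scores, old_scores, x):
    """Return the old-scale score with the same percentile rank as x on the new form."""
    new_sorted = np.sort(new_scores)
    old_sorted = np.sort(old_scores)
    # Percentile rank of x in the new-form distribution (0-100).
    pr = 100.0 * np.searchsorted(new_sorted, x, side="right") / len(new_sorted)
    # Old-form score at that percentile rank.
    return float(np.percentile(old_sorted, pr))

# Illustrative distributions only (not actual Florida data).
rng = np.random.default_rng(0)
new_form = rng.normal(500, 25, size=1000)
old_form = rng.normal(320, 20, size=1000)
print(round(equipercentile_link(new_form, old_form, 520), 1))
```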
Communications – After Release of 3rd-Party Study (Sept. 1, 2015)
Regarding group-level decisions:
- The report noted that the evidence supports the use of test data in the aggregate for both paper-based and computer-based assessments.
- Results were therefore used for group-level decisions, such as calculating scores for inclusion in teacher evaluations, calculating school grades, and setting achievement level cut scores.
Communications – After Release of Scores (October 2015)
- Reiterated the state’s position on student- and group-level decisions, as well as its intent to use the test data for setting achievement standards.
- Percentile ranks and T-scores were released for all assessments not invalidated during test administration.
- Districts were instructed to use them as needed, along with conversion tables for translating T-scores to new scale scores and the provisional cut scores recommended by the Commissioner for State Board adoption.
Florida Summary
Florida’s approach was ultimately to report all scores (the “do nothing” approach), but on a delayed timeline. The required delay in reporting scores, together with the baseline nature of the 2014-2015 assessment, led to policies that essentially held students, teachers, schools, and districts harmless.