1
Overview of the FL DOE Data Forensics Program Steve Addicott, Vice President Dennis Maynes, Chief Scientist Caveon Test Security October 29, 2013
2
Outline of Presentation FLDOE Data Forensics Goals Data Forensics (DF) Process Student-Level Invalidations School-Level Flags and Requests Q&A
3
FLDOE Data Forensics Goals Uphold fairness and validity of test results “I believe that those of us in the field of assessment must now take even greater leadership on the issue of test data integrity.” Dr. Greg Cizek, NCME President 2012-13 CCSSO Conference, June 2013 “I…urge you to do everything you can to ensure the integrity of the data used to measure student achievement.” Arne Duncan US Secretary of Education Letter to Chief State School Officers, June 2011
4
Other Goals “Measure and Manage” Identify risks and irregularities Take action based on data and analysis Communicate zero tolerance of misbehavior to students and educators
5
Data Forensics Process Analyses of test data: first build a “model” of typical question responses, then identify unusual patterns which indicate test scores may not be trustworthy. Examples…
6
Prescriptions for Use of Data Forensics OBP Guidebook Handbook
7
“I urge us to reframe our concerns about test data integrity not as cheating concerns, but as a validity issue.” Dr. Greg Cizek, NCME President 2012-13 CCSSO Conference, June 2013
8
Testing Examiner’s Role Ensure (and then certify) the test administration is fair and proper Declare scores invalid when fairness and validity are negatively impacted Decision depends upon fairness and validity, not whether an individual cheated
9
FLDOE Data Forensics Focus on two groups: Students and Schools. Administrations: EOC, FCAT 2.0, FCAT Retakes. Utilize VERY conservative thresholds.
10
Conservative Thresholds “…it seems to make the most sense to prioritize the allocation of resources.” “If an assessment budget only permits investigating the ‘worst of the worst’, then those resources should be allocated to digging deeper into possible test data invalidity, whether that means instances that exceed a 5 SD criterion, the 20 most outlying test centers, 1% of classrooms, or whatever the resources will allow.” Dr. Greg Cizek, NCME President 2012-13, CCSSO Conference, June 2013
11
A quick discussion of conservative thresholds…. Chance of being hit by lightning = 1 in a million Chance of winning the lottery = 1 in 10 million Chance of DNA false-positive = 1 in 30 million to 1 in a billion Chance of tests being flagged and taken independently = 1 in a TRILLION
12
Statistics Used Similarity Erasures Gains/Losses
13
Similarity Our Most Powerful & “Credible” Statistic Measures degree of similarity between 2 or more test instances Analyze each test instance against all other test instances in the same school
14
Erasures Based on estimated answer changing rates from: Wrong-to-Right Anything-to-Wrong Find answer sheets with unusually many WtR answer changes Extreme statistical outliers could involve tampering, “panic cheating”, etc.
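One way to picture the erasure screen is a standardized count of wrong-to-right changes against an estimated statewide rate. This is a minimal sketch assuming a simple binomial model and an illustrative 2% per-item WtR rate; the actual FLDOE/Caveon erasure model and thresholds are not specified on this slide.

```python
# Hedged sketch: flag answer sheets with unusually many wrong-to-right (WtR)
# erasures. The statewide WtR rate per item and the 5-SD flag are illustrative
# assumptions, not the program's actual model.
import math

def wtr_z_score(wtr_erasures: int, items: int, statewide_wtr_rate: float) -> float:
    """Standardized WtR erasure count under a simple binomial model."""
    expected = items * statewide_wtr_rate
    sd = math.sqrt(items * statewide_wtr_rate * (1.0 - statewide_wtr_rate))
    return (wtr_erasures - expected) / sd

# Example: 14 WtR erasures on a 50-item test, assumed statewide rate of 2% per item.
z = wtr_z_score(wtr_erasures=14, items=50, statewide_wtr_rate=0.02)
print(f"WtR z-score: {z:.1f}")  # extreme outliers (e.g., beyond 5 SD) would be flagged
```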
15
Unusual Gains/Losses Predict score using prior year info. Measure large score increases/decreases against predicted score Extreme Gains/Losses may result from: Pre-knowledge, i.e., “Drill It and Kill It” Coaching Student development—visual acuity
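As a rough illustration of the gains/losses idea, the sketch below predicts this year's score from last year's with a simple least-squares line and flags large standardized residuals. The linear model and the 4-SD flag are assumptions for illustration, not the program's actual method.

```python
# Hedged sketch of a gains/losses screen: predict this year's scale score from
# last year's and flag large standardized residuals.
import statistics

def fit_line(prior, current):
    """Ordinary least-squares slope/intercept for predicting current from prior."""
    mean_x, mean_y = statistics.fmean(prior), statistics.fmean(current)
    sxx = sum((x - mean_x) ** 2 for x in prior)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(prior, current))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def flag_gains(prior, current, threshold_sd=4.0):
    """Return indices of students whose gain/loss is an extreme outlier."""
    slope, intercept = fit_line(prior, current)
    residuals = [y - (slope * x + intercept) for x, y in zip(prior, current)]
    sd = statistics.stdev(residuals)
    return [i for i, r in enumerate(residuals) if abs(r) / sd >= threshold_sd]
```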
16
Student-level Analysis Similarity Analysis only Most credible, strongest No flagging for erasures or gains Invalidate test scores with Similarity Index ≥ 12 Notification letters to be sent to parents Supporting info sent to districts Chances of seeing two (or more) students’ tests so similar, with each doing his/her own work: 0.000000000001
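The figures on this slide are consistent with reading the Similarity Index as the negative base-10 logarithm of that probability, so an index of 12 corresponds to odds of about 1 in a trillion. That reading is an assumption for illustration, not a definition taken from the slide; the exact index is Caveon's.

```python
# Hedged sketch: if the Similarity Index is -log10 of the probability of seeing
# tests this similar under independent work (an assumption consistent with the
# numbers on this slide), an index of 12 corresponds to p = 1e-12.
import math

def similarity_index(p_value: float) -> float:
    return -math.log10(p_value)

print(round(similarity_index(0.000000000001), 1))  # 12.0
```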
17
Steps for Calculating Similarity
1. Analyze each grade/subject statewide to create a model of “normal” test-taking behaviors
2. Use the student’s performance to compute the probability of an incorrect/correct answer on all items
3. Calculate the probability that two students will answer an item identically (Expected Identical Correct/Incorrect vs. Observed Identical Correct/Incorrect)
4. Tests are flagged when the number of identical responses is much greater than expected
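A compact way to see steps 2-4 is to combine two students' per-item probabilities of a correct answer into an expected number of identical responses, then standardize the observed count. The sketch below does this under a uniform-distractor assumption and a normal approximation; both are simplifications and not Caveon's exact formula.

```python
# Hedged sketch of steps 2-4: expected vs. observed identical responses for a
# pair of students. p1[i], p2[i] are each student's modeled probability of
# answering item i correctly (step 2). The uniform-distractor assumption and
# the normal approximation are illustrative simplifications.
import math

def expected_identical(p1, p2, n_distractors=3):
    """Expected counts of identical-correct and identical-incorrect answers."""
    exp_correct = sum(a * b for a, b in zip(p1, p2))
    exp_incorrect = sum((1 - a) * (1 - b) / n_distractors for a, b in zip(p1, p2))
    return exp_correct, exp_incorrect

def similarity_z(p1, p2, observed_identical, n_distractors=3):
    """Standardized excess of observed identical answers over expectation."""
    probs = [a * b + (1 - a) * (1 - b) / n_distractors for a, b in zip(p1, p2)]
    mean = sum(probs)
    var = sum(p * (1 - p) for p in probs)
    return (observed_identical - mean) / math.sqrt(var)
```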
18
Example of Flagged Students
19
Example: 5th Grade Reading Cluster Identifies possible collusion 2 students passed, but their results break the assumption of independent test taking, i.e., the results are not trustworthy.
20
Invalidation Support Materials District Invalidation Spreadsheet Draft Parent Notification Letter Appeals Guide Similarity “Cluster” Spreadsheets (sent upon request by FL DOE)
21
District Invalidations Spreadsheet
22
Similarity Cluster Spreadsheets* Explanations Examinees Summary results Alignment Actual responses *Similarity “Cluster” Spreadsheets sent upon request from FL DOE
24
Alignment Detail Letters = identical correct Numbers = identical incorrect
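As an illustration of how such an alignment row could be rendered, the sketch below shows identical correct answers as the chosen letter, identical incorrect answers as the distractor's position, and non-matching items as a dot. The dot convention and the A-D option coding are assumptions; the slide only states that letters mark identical correct and numbers mark identical incorrect answers.

```python
# Hedged sketch of building an alignment row for two students' responses.
def alignment_row(responses_a, responses_b, key):
    out = []
    for a, b, correct in zip(responses_a, responses_b, key):
        if a != b:
            out.append(".")                        # not identical
        elif a == correct:
            out.append(a)                          # identical correct -> letter
        else:
            out.append(str("ABCD".index(a) + 1))   # identical incorrect -> number
    return "".join(out)

print(alignment_row("ACBDA", "ACADA", "ABBDC"))  # "A3.D1"
```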
25
School-Level Analysis “… I am talking about educator cheating. I don't know if it's 5% or 10%, but I doubt it's 1/10 of 1%. What I do know for certain is that there is uniformly more cheating than we think there is.” Dr. Greg Cizek, NCME President 2012-13 CCSSO Conference, June 2013
26
School-Level Analysis Similarity, Erasures, and Gains Flagged schools conduct internal review Extreme instances may prompt formal investigations and sanctions
28
School-Level Student Data One row of data per test result (student & subject) Identifying information Student Test result Similarity information Erasure information Other information (if available)
29
Identifying Information
Caveon ID | UIN
Student ID | Subject
PAS or CBT ID | Test Name
Last Name | Core Form
First Name | Test Form
MI | Test Group
Grade | Test Date
District | Raw Score
School | Scale Score
District Name | Passed
School Name | Achievement Level
30
Similarity Information
Caveon ID | Expected Incorrect
Similarity Index | Percent Match
Similarity Cluster | Closest Similarity Index
Cluster Identifier** | Closest Match ID
Cluster Index** | Closest Last Name
Matching Test ID | Closest First Name
Questions in Common | Source-Copier Index**
Correct Matches | Dominant Score**
Incorrect Matches | Non-Dominant Score**
Expected Correct | Standardized Difference**
** Only present when Similarity Index > 12
31
Was the Similarity Due to Small groups or large groups? Answer copying? Communication between students? Poor proctoring? Something else? Disclosure of answers Buddy system “Chunking and redirecting”
33
Snippet of Similarity Information
Columns: Caveon ID, Similarity Index, Similarity Cluster, Matching Test ID, Correct Matches, Incorrect Matches, Expected Correct, Expected Incorrect, Percent Match, Closest Similarity Index
[Sample rows of student-level similarity data]
34
Answering the questions Sort/filter by Similarity Cluster How many clusters? What are the sizes? What are the index values? Plot matches on seating chart Are students close? Is there a pattern? Teachers and/or other groupings
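If the cluster spreadsheet is loaded into pandas, the questions above can be answered with a groupby on the Similarity Cluster column. The file name and the assumption that unflagged rows have a blank cluster value are illustrative; the column names follow the fields listed earlier.

```python
# Hedged sketch of the cluster review; file name and exact headers are assumed.
import pandas as pd

df = pd.read_excel("similarity_cluster.xlsx")   # or pd.read_csv(...)
clusters = df.dropna(subset=["Similarity Cluster"]).groupby("Similarity Cluster")

print("Number of clusters:", clusters.ngroups)
print(clusters["Caveon ID"].count().rename("size"))            # cluster sizes
print(clusters["Similarity Index"].max().rename("max index"))  # index values per cluster
```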
35
Patterns to check on seating charts Separation, associations, and index values Index values above 5 Tight groups (communication/pencil tapping?) “Close” pairs (answer copying/blind spots?) Wide separation (cell phones?) Index values between 3 and 5 Content disclosure Chunking/redirecting Index values below 3 Separated pairs are probably noise Larger groups could indicate something else
36
Clustering examples - #1
10 pairs, 3 triplets, 1 quad, 1 quint, 1 (23), 1 (43)
For clusters > 3, use Matching IDs to create sub-clusters
If some index values are very high, filter out the very small index values
Caveon ID | Similarity Index | Similarity Cluster | Matching Test ID
23510 | 1.944 | _0b157dfb | 23737
23597 | 1.6601 | _0b157dfb | 23740
23736 | 37.6168 | _0b157dfb | 23737
23737 | 37.6168 | _0b157dfb | 23736
23740 | 15.6448 | _0b157dfb | 23737
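One way to build the sub-clusters is to treat each Caveon ID / Matching Test ID pair as an edge and take connected components, after dropping edges below an index cutoff. The sketch below uses rows from the example table above; the cutoff of 3 is an illustrative assumption.

```python
# Hedged sketch: split a large cluster into sub-clusters via connected components,
# filtering out very small index values first.
from collections import defaultdict

def sub_clusters(rows, min_index=3.0):
    """rows: iterable of (caveon_id, matching_test_id, similarity_index)."""
    graph = defaultdict(set)
    for a, b, idx in rows:
        if idx >= min_index:
            graph[a].add(b)
            graph[b].add(a)
    seen, components = set(), []
    for node in graph:
        if node in seen:
            continue
        stack, comp = [node], set()
        while stack:
            cur = stack.pop()
            if cur in comp:
                continue
            comp.add(cur)
            stack.extend(graph[cur] - comp)
        seen |= comp
        components.append(sorted(comp))
    return components

rows = [(23510, 23737, 1.944), (23597, 23740, 1.6601),
        (23736, 23737, 37.6168), (23740, 23737, 15.6448)]
print(sub_clusters(rows))  # [[23736, 23737, 23740]] once the small-index pairs are dropped
```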
37
Clustering examples - #2
Split large groups
If some index values are very high, filter out the very small index values
Caveon ID | Matching Test ID | Closest Similarity Index
23681 | 23701 | 10.24
23701 | 23681 | 10.24
23512 | 23681 | 9.117
23762 | 23681 | 9.073
23704 | 23681 | 6.762
23709 | 23681 | 6.675
23677 | 23701 | 2.915
23565 | 23512 | 2.754
23517 | 23681 | 2.653
23755 | 23709 | 2.642
23628 | 23701 | 2.388
23590 | 23755 | 2.168
23690 | 23512 | 2.139
23661 | 23701 | 1.735
23601 | 23709 | 1.471
23728 | 23690 | 1.345
38
Erasure Information
Only present for paper-and-pencil tests
Students do not erase frequently
Use seating charts and student associations
Fields: Caveon ID, Erasure Index, Wrong-to-Right Erasures, Any-to-Wrong Erasures, Right-to-Wrong Erasures, Wrong-to-Wrong Erasures, WTR Delta Flag, WTR Delta Difference
39
Summary Goal: Fair and valid testing for all students DOE to conduct Data Forensics on FCAT test data Focus on Individual students -- extremely similar tests Schools—Similarity, Gains, and Erasures
40
Follow Up Questions? Victoria.Ash@fldoe.org