Students’ Perceptions of Item Modifications: Using Cognitive Labs and Questionnaires
Andrew Roach
Paper presented as part of “Designing and Evaluating Modified Items for Students with Disabilities: Research Results,” Coordinated Session, NCME 2009 Annual Convention, San Diego, CA
Special thanks to Cori Wixson, Tanya Talapatra, and Tamika LaSalle for their assistance in coding the think-aloud videos.

Objectives
- To discuss the rationale for including students’ perceptions in research on test item development and modification.
- To discuss possible research strategies for collecting student responses.
- To present data from post-assessment questionnaires and cognitive lab studies.

Why collect student response data? Support from the Test Standards:
- “Questioning test takers about their performance strategies can yield evidence that enriches the definition of a construct…” (p. 12).
- “Process studies involving examinees from different subgroups can assist in determining the extent to which capabilities irrelevant or ancillary to the construct may be differentially influencing (student) performance” (p. 12).
- “Educational tests…may be advocated on the grounds that their use will improve student motivation…. Where such claims are central to the rationale of testing, the direct examination of testing consequences necessarily assumes even greater importance” (p. 17).

Why collect student response data? Item enhancements or modifications can be conceptualized as a form of educational intervention; students’ perceptions are therefore essential evidence about the acceptability of these assessment strategies. Acceptability refers to an individual’s perceptions of the appropriateness, fairness, and reasonableness of an intervention (Kazdin, 1981).

Using Student Response Data: Applications to Test Item Modifications
Test development or item enhancements/modifications → Cognitive lab study → Additional modifications or enhancements based on results → Field test → Post-test survey

Study #1: CAAVES Cognitive Lab: An Initial Application of Think-Aloud Methodology
Purpose: To evaluate the influence of test item modifications on students’ problem-solving and test-taking behaviors.
Our study involved three components:
1. Students completed a series of 16 assessment items (8 reading; 8 mathematics).
2. Students were asked to think aloud as they completed or solved these items.
3. We asked follow-up questions about students’ perceptions of the assessment items.

Distribution of Item Modifications
  Test A: X X X X
  Test B: X X X X
  X = item modifications used.

Sample Size by Sub-group
Group | Test A group | Test B group | Total
Students without disabilities | 2 | 1 | 3
Students with disabilities (not eligible for AA-MAS) | 1 | 2 | 3
Students with disabilities (eligible for AA-MAS) | 1 | 2 | 3

Method
- We explained the think-aloud procedures, had students restate their understanding of the process, and modeled thinking aloud on a practice item.
- We used a script adapted from a study conducted by Johnstone, Bottsford-Miller, and Thompson (2006).
- Students were prompted only when they were silent for 10 consecutive seconds (a minimal coding sketch of this rule follows below).
- If students verbalized infrequently, we reminded them to “keep thinking aloud” or “keep talking”; otherwise, we generally did not give encouragement or support.
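
The deck states the 10-second silence rule but not how prompts were logged. The sketch below is a minimal, hypothetical Python helper (the function name, timestamp format, and example values are assumptions, not part of the study) that applies the rule to timestamped utterances from a coded session.

```python
# Hypothetical helper: locate the points in a coded think-aloud session
# where the 10-second silence rule would have triggered a researcher prompt.

SILENCE_THRESHOLD = 10.0  # seconds of silence before prompting

def find_prompt_points(utterances, session_end):
    """Return (silence_start, silence_end) gaps that reach the threshold.

    utterances: list of (start, end) timestamps in seconds, sorted by start.
    session_end: total length of the session in seconds.
    """
    gaps = []
    last_speech_end = 0.0
    for start, end in utterances:
        if start - last_speech_end >= SILENCE_THRESHOLD:
            gaps.append((last_speech_end, start))
        last_speech_end = max(last_speech_end, end)
    # Check for a final silence between the last utterance and session end.
    if session_end - last_speech_end >= SILENCE_THRESHOLD:
        gaps.append((last_speech_end, session_end))
    return gaps

# Example: a 12-second gap between two utterances yields one prompt point.
print(find_prompt_points([(0.0, 4.5), (16.5, 30.0)], session_end=35.0))
# -> [(4.5, 16.5)]
```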

Results: Data on Reading Items
Group | Condition | % of items correct | Time per item (mean) | Miscues on passages (mean) | Fluency on passages (mean wpm) | Researcher prompts per item (mean)
Students without disabilities | Original | 83.3% | 79.6 s | – | – | .49
Students without disabilities | Modified | 83.3% | 51.0 s | – | – | .29
SWDs (not eligible) | Original | 83.3% | 123.8 s | – | – | .65
SWDs (not eligible) | Modified | 75.0% | 100.5 s | – | – | .28
SWDs (eligible for AA-MAS) | Original | 66.7% | 149.4 s | – | – | .81
SWDs (eligible for AA-MAS) | Modified | 75.0% | 98.5 s | – | – | .28
– = value not available.

Results: Data on Math Items
Group | Condition | % of items correct | Time per item (mean) | Prompts per item (mean) | Correct strategy used | Incorrect strategy used | Appeared to guess*
Students without disabilities | Original | 66.7% | 65.8 s | – | 66.7% (8) | 25.0% (3) | 16.7% (2)
Students without disabilities | Modified | 50.0% | 54.1 s | – | 50.0% (6) | 33.3% (4) | 41.7% (5)
SWDs (not eligible) | Original | 50.0% | 125.2 s | – | 41.7% (5) | 50.0% (6) | 16.7% (3)
SWDs (not eligible) | Modified | 75.0% | 126.2 s | – | 41.7% (5) | 50.0% (6) | 41.7% (5)
SWDs (eligible for AA-MAS) | Original | 33.0% | 102.5 s | – | 25.0% (3) | 58.3% (7) | 50.0% (6)
SWDs (eligible for AA-MAS) | Modified | 50.0% | 72.8 s | .08 | 8.3% (1) | 58.3% (7) | 83.3% (10)
– = value not available.

Results: Use of Visuals
- Visuals in reading passages/items: Most SWDs (67%) saw the visuals as helpful and providing support on reading questions and passages. All (100%) of the students without disabilities indicated the pictures made no difference.
- Visuals/graphs in mathematics items: Students with (50%) and without disabilities (67%) generally saw the visuals and graphs as helpful and providing support. However, 33% of SWDs indicated that the visuals/graphs were distracting or made items harder.

Students’ Comments: Use of Visuals
- “The one talking about the $100 bills…well, it showed me--and I was understanding--how it goes with what it was talking about, and I looked at it and it helped me even more.” -- Student with disability (eligible for AA-MAS)
- “When people do math, they’re working on a sheet and what’s the point of looking at a picture. It doesn’t really help you. For example, on (questions) #1 and #2, those two pictures were really messing me up.” -- Student with disability (not eligible for AA-MAS)

Results: Removing Answer Choices
- Reading: SWDs (with one exception) perceived no difference in difficulty between items with 3 or 4 answer choices. Conversely, 67% of the students without disabilities identified the 3-answer modification as making the reading items easier.
- Mathematics: Students without disabilities (67%) and non-eligible SWDs (67%) generally indicated that removing an answer choice made the math items easier. Some non-eligible students appeared to use the answer choices to help solve math items, but it was not clear that they used this same strategy in reading.
- “If you didn’t get the answer right the first time, you know you only had 3 choices to go back and look at, instead of 4.” -- Student without disability

Results: Format of Analogies
Most students (including 2/3 of SWDs) found the traditional analogy format easier (i.e., “meteor:space::dolphin:_______”). Some students indicated they had been taught analogies in this format and that it was familiar to them. The performance results were consistent with this: SWDs correctly answered all of the traditional-format analogy items but missed items with the modified format (i.e., “meteor is to space as dolphin is to ___”) 40% of the time.

Study #2: Post-Test Survey
- Original and modified versions of the 39-item tests were field-tested experimentally using DEA’s online test delivery system.
- A large sample of grade-eight students (N = 755) from four states (AZ, HI, ID, and IN) participated in the study.
- The sample comprised three groups: SWOD (n = 269), SWD-NE (n = 236), and SWD-E (n = 250).
- Students received 13 items in each of three conditions: Original, Modified, and Modified with Reading Support.
- After the test, students completed a follow-up survey containing seven questions about their perceptions of particular item modifications.

Results: Relative Difficulty of Items
- Most students reported the test was about the same difficulty all the way through (61% for reading; 46% for mathematics).
- Some students reported the test was easier toward the beginning (19% for reading; 29% for mathematics), even though some students received the Modified or Modified with Reading Support conditions first.
- Actual field-test results showed decreases in student performance on each successive part across groups in both content areas, independent of the order of conditions (i.e., Original, Modified, or Modified with Reading Support).

Results: Relative Difficulty of Items
- Fewer students in the SWD-E group reported the reading test was the same difficulty throughout (49% versus 71% of SWODs).
- Fewer SWDs reported the mathematics test was the same difficulty all the way through (42% and 41% of students in the SWD-NE and SWD-E groups, respectively, compared to 54% of SWODs).

Results: Adding Visuals to Items
- Reading items: 62% of students in the SWD-E group reported the visuals provided helpful clues, compared to 50% of students in the SWD-NE group and 44% in the SWOD group.
- Mathematics items: 58% of students in the SWD-E group reported the visuals gave helpful clues, compared to 44% of students in the SWD-NE group and 37% in the SWOD group.

Results: Using Bold Font for Key Terms
We expected this modification to be most strongly endorsed by the SWD-E group, but fewer students in the eligible group reported bold type as helpful for vocabulary items (73%) than in the SWD-NE (81%) and SWOD (84%) groups. Actual performance data indicated that, for the 17 items with key vocabulary terms in bold type, difficulty was lower in the Modified condition than in the Original condition.

Results: Reading Support
More students in the SWD-E group reported that reading support made the items easier (67% on the reading test; 68% on the mathematics test) compared to students in the SWOD group (41% for reading; 40% for mathematics). Field-test results in both content areas, however, indicated only small differences in student performance between the Modified condition and the Modified with Reading Support condition (effect sizes of .07 for reading and .05 for mathematics items).
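
The deck reports these effect sizes without naming the statistic; a standardized mean difference such as Cohen’s d, sketched below under that assumption, is a common choice for comparing the two conditions:

```latex
\[
d = \frac{\bar{X}_{\text{M+RS}} - \bar{X}_{\text{M}}}{s_{\text{pooled}}},
\qquad
s_{\text{pooled}} = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}
\]
```

On this scale, values of .07 and .05 correspond to mean differences of well under a tenth of a standard deviation.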

Study #3: CMAADI Cognitive Lab Study
- A replication and extension of the CAAVES study.
- 60 students with and without disabilities in grades 4-8 and 10.
- We collected data on:
  - Students’ mental effort/mental ease for each item (“How hard did you have to work to answer the reading/math item above?”, rated from “Not very hard” to “Very hard”; see the standardization sketch below).
  - Students’ instructional experiences.
  - Students’ oral reading fluency.
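
The outcome tables later in the deck report a “Cognitive Ease z-score,” but the deck does not define the computation. The following is a minimal Python sketch, assuming (both assumptions, not stated in the study) that the raw effort ratings are standardized against a reference set of ratings and sign-reversed so that higher z-scores mean greater ease:

```python
import statistics

def cognitive_ease_z(effort_ratings):
    """Standardize raw mental-effort ratings and flip the sign so that
    higher z-scores mean greater ease (sign convention assumed)."""
    mean = statistics.mean(effort_ratings)
    sd = statistics.stdev(effort_ratings)
    return [-(r - mean) / sd for r in effort_ratings]

# Example: five items rated from 1 ("not very hard") to 5 ("very hard").
print(cognitive_ease_z([1, 2, 2, 4, 5]))
```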

Test Construction
- Twelve items per grade level were selected from Arizona’s item pool for inclusion in the study by the state’s item modification/writing team members.
- The items were used to create two versions of each test, Forms A and B. Each version included 6 items in their original forms and 6 items that had been modified.
- Using two versions of the test allowed us to compare behavior and responses on the original (O) and modified (M) versions of each item at each of six grade levels (4th through 8th and 10th grades); a counterbalancing sketch follows below.
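
A small illustration of this counterbalancing. The alternating assignment rule and the item IDs below are assumptions made for the sketch; the deck says only that each form carried 6 original and 6 modified items, with each item appearing in both versions across the two forms.

```python
# Illustrative counterbalancing only; the actual assignment pattern
# used in the study is not specified in the deck.

def build_forms(item_ids):
    """Assign each item as Original (O) on one form and Modified (M) on the other."""
    form_a, form_b = [], []
    for i, item in enumerate(item_ids):
        if i % 2 == 0:
            form_a.append((item, "O"))  # original on Form A...
            form_b.append((item, "M"))  # ...modified on Form B
        else:
            form_a.append((item, "M"))
            form_b.append((item, "O"))
    return form_a, form_b

form_a, form_b = build_forms([f"item{i:02d}" for i in range(1, 13)])
print(form_a)  # 6 O + 6 M
print(form_b)  # the complementary versions of the same 12 items
```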

Sample: Demographics (cells show n (%): Non-Eligible | Eligible | Total)
Sex
  Male*: 6 (27%) | 13 (65%) | 19 (45%)
  Female*: 16 (73%) | 7 (35%) | 23 (55%)
Ethnicity
  White (Not Hispanic): 13 (54%) | 6 (27%) | 19 (41%)
  Black or African American (Not Hispanic): 1 (4%) | 1 (5%) | 2 (4%)
  Hispanic or Latino: 9 (38%) | 13 (59%) | 22 (48%)
  American Indian or Alaskan Native: 0 (0%) | 1 (5%) | 1 (2%)
  Asian or Pacific Islander: 1 (4%) | 1 (5%) | 2 (4%)
Grade
  Fourth: 6 (25%) | 3 (14%) | 9 (20%)
  Fifth: 5 (21%) | 3 (14%) | 8 (17%)
  Sixth: 3 (13%) | 3 (14%) | 6 (13%)
  Seventh: 2 (8%) | 3 (14%) | 5 (11%)
  Eighth: 3 (13%) | 2 (9%) | 5 (11%)
  Tenth: 5 (21%) | 8 (36%) | 13 (28%)

Sample: Prior Achievement Levels by Year (state assessments; cells show n (%))
Non-Eligible
Reading
  Year 1: Exceeds 0 (0%), Meets 20 (100%), Approaches 0 (0%)
  Year 2: Exceeds 0 (0%), Meets 19 (100%), Approaches 0 (0%)
  Year 3: Exceeds 0 (0%), Meets 12 (86%), Approaches 2 (14%), Falls Far Below 0 (0%)
Mathematics
  Year 1: Exceeds 1 (5%), Meets 19 (95%), Approaches 0 (0%)
  Year 2: Exceeds 2 (11%), Meets 15 (83%), Approaches 1 (6%), Falls Far Below 0 (0%)
  Year 3: Exceeds 0 (0%), Meets 12 (100%), Approaches 0 (0%)
Eligible
Reading
  Year 1: Approaches 0 (0%), Falls Far Below 10 (100%)
  Year 2: Meets 0 (0%), Approaches 1 (8%), Falls Far Below 12 (92%)
  Year 3: Meets 0 (0%), Approaches 3 (25%), Falls Far Below 9 (75%)
Mathematics
  Year 1: Approaches 0 (0%), Falls Far Below 10 (100%)
  Year 2: Approaches 0 (0%), Falls Far Below 15 (100%)
  Year 3: Meets 0 (0%), Approaches 3 (20%), Falls Far Below 12 (80%)

Sample: Oral Reading Fluency (wpm; cells show Mean (SD), n: Non-Eligible | Eligible | Total; – = mean not available)
  Full sample: – (34.45), n = 24 | – (25.97), n = 21 | – (47.78), n = 45
  Fourth grade: – (34.87), n = 6 | – (30.04), n = 3 | – (45.67), n = 9
  Fifth grade: – (25.05), n = 5 | – (23.12), n = 3 | – (44.00), n = 8
  Sixth grade: – (54.88), n = 3 | – (16.64), n = 3 | – (61.50), n = 6
  Seventh grade: – (4.24), n = 2 | – (29.31), n = 3 | – (58.09), n = 5
  Eighth grade: – (37.58), n = 3 | – (0.00), n = 1 | – (68.43), n = 4
  Tenth grade: – (40.91), n = 5 | – (23.96), n = 8 | – (42.61), n = 13

Reading Content Coverage by Grade (cells show Mean (SD), n: Non-Eligible | Eligible | Total)
  Full sample: 2.38 (0.62), n = 17 | 2.00 (0.51), n = 14 | 2.21 (0.60), n = 31
  Fourth grade: 2.60 (0.00), n = 3 | 2.80 (0.28), n = 2 | 2.68 (0.18), n = 5
  Fifth grade: 2.05 (0.30), n = 4 | 2.20 (0.00), n = 1 | 2.08 (0.27), n = 5
  Sixth grade: 2.07 (0.12), n = 3 | 1.83 (0.49), n = 3 | 1.95 (0.34), n = 6
  Seventh grade: 2.50 (0.00), n = 2 | 2.50 (0.00), n = 2 | 2.50 (0.00), n = 4
  Eighth grade: 3.20 (0.69), n = 3 | 1.40 (0.00), n = 1 | 2.75 (1.06), n = 4
  Tenth grade: 1.83 (1.18), n = 2 | 1.67 (0.00), n = 5 | 1.71 (0.49), n = 7

Math Content Coverage by Grade (cells show Mean (SD), n: Non-Eligible | Eligible | Total)
  Full sample: 1.43 (0.64), n = 17 | 1.04 (0.58), n = 16 | 1.24 (0.63), n = 33
  Fourth grade: 2.00 (0.00), n = 3 | 1.25 (0.12), n = 2 | 1.70 (0.41), n = 5
  Fifth grade: 1.94 (0.63), n = 4 | – | 1.94 (0.63), n = 4
  Sixth grade: 1.27 (0.46), n = 3 | 1.57 (0.93), n = 3 | 1.42 (0.68), n = 6
  Seventh grade: 1.25 (0.12), n = 2 | 1.50 (0.00), n = 2 | 1.38 (0.16), n = 4
  Eighth grade: 1.07 (0.23), n = 3 | 1.20 (0.00), n = 1 | 1.10 (0.20), n = 4
  Tenth grade: 0.50 (0.71), n = 2 | 0.65 (0.30), n = 8 | 0.62 (0.36), n = 10

Reading Test Outcome Measures by Condition & Group (cells show Mean (SD))
Test raw score (0 to 3):
  Total (n = 41): Original 1.76 (0.92) | Modified 2.07 (0.79) | Difference +0.31
  Non-Eligible (n = 26): Original 2.00 (0.98) | Modified 2.15 (0.78) | Difference +0.15
  Eligible (n = 15): Original 1.33 (0.62) | Modified 1.93 (0.80) | Difference +0.60
Cognitive ease z-score:
  Total (n = 41): Original 0.17 (0.08) | Modified 0.26 (0.09) | Difference +0.09
  Non-Eligible (n = 26): Original 0.12 (0.11) | Modified 0.25 (0.09) | Difference +0.13
  Eligible (n = 15): Original 0.24 (0.12) | Modified 0.27 (0.18) | Difference +0.03

Mathematics Test Outcome Measures by Condition & Group (cells show Mean (SD))
Test raw score (0 to 3):
  Total (n = 40, 41): Original 1.60 (0.84) | Modified 1.85 (0.88) | Difference +0.25
  Non-Eligible (n = 24): Original 1.83 (0.82) | Modified 2.21 (0.78) | Difference +0.38
  Eligible (n = 16, 17): Original 1.25 (0.77) | Modified 1.35 (0.79) | Difference +0.10
Cognitive ease z-score:
  Total (n = 38): Original -0.42 (0.09) | Modified -0.10 (0.09) | Difference +0.32
  Non-Eligible (n = 23): Original -0.42 (0.13) | Modified 0.01 (0.13) | Difference +0.43
  Eligible (n = 15): Original -0.49 (0.12) | Modified -0.27 (0.10) | Difference +0.22

Reading Mean total test scores by condition and group

Reading mean cognitive ease score by condition and group

Math mean total test scores by condition and group

Math mean cognitive ease score by condition and group

Students’ Perceptions of Tests and Items

Cognitive efficiency plot for grade 7 reading items #4-6
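
The deck does not define the cognitive efficiency plot. A common construction from the cognitive-load literature (Paas & van Merriënboer, 1993) standardizes accuracy and mental effort and plots one against the other, summarizing each item as:

```latex
\[
E = \frac{z_{\text{performance}} - z_{\text{effort}}}{\sqrt{2}}
\]
```

Under this reading, items plotted above the diagonal where standardized performance equals standardized effort (E > 0) deliver relatively high accuracy for the effort invested, while items below it (E < 0) cost more effort than their accuracy returns.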

Take Away Ideas
- Verbalizing “automatized” procedures and skills (i.e., items at low DOK levels or with high p-values) is difficult, but many of our test items sit at lower levels of cognitive complexity.
- Follow-up questions may provide valuable information that makes think-aloud data easier to understand and interpret (Branch, 2000; Fonteyn, Kuipers, & Grobe, 1993; Johnstone, Bottsford-Miller, & Thompson, 2006).

Take Away Ideas
- SWDs often appeared unfamiliar with some concepts (e.g., percentages). In these cases, item modifications are unlikely to provide the necessary support or facilitate access.
- Reading fluency may be an issue for SWDs. In some cases, SWDs’ slower reading rates resulted in testing sessions almost twice as long as their peers’. How could (or should) technology be used to address this barrier?

Take Away Ideas
To understand students’ cognitive processing and problem-solving behavior, researchers must understand:
- the instructional/assessment task;
- individual participants’ knowledge about the task; and
- how prior knowledge may affect processing and problem solving during the task (Pressley & Afflerbach, 1995).

References
Branch, J. L. (2000). Investigating the information-seeking processes of adolescents: The value of using think-alouds and think-afters. Library and Information Science Research, 22(4), 371–392.
Ericsson, K. A., & Simon, H. A. (1993). Protocol analysis: Verbal reports as data (Rev. ed.). Cambridge, MA: MIT Press.
Johnstone, C. J., Bottsford-Miller, N. A., & Thompson, S. J. (2006). Using the think aloud method (cognitive labs) to evaluate test design for students with disabilities and English language learners (Technical Report 44). Minneapolis, MN: University of Minnesota, National Center on Educational Outcomes.
Pressley, M., & Afflerbach, P. (1995). Verbal protocols of reading: The nature of constructively responsive reading. Hillsdale, NJ: Lawrence Erlbaum.