
1 Using Psychometric Analysis to Drive Mathematics Standardized Assessment Decision Making Mike Mazzarella George Mason University

2 Problem In the 2011-2012 school year, the state of Virginia shifted its standardized math assessments to computer-based assessments with multiple item formats. Some believe the drop in scores was caused by the change in item formats; others believe the test became generally more difficult. Are these test results being extensively analyzed? Are adjustments being made based on psychometric analysis?

3 Literature Review Computer-Based Assessments ▫Students scored higher on a computer-based test than on the same test on pencil and paper (Threlfall et al., 2007) ▫Computer-based assessment can be less reliable than pencil-and-paper tests (Shapiro & Gebhardt, 2012) Assessment: Item Format ▫Students achieve higher on multiple-choice than on open-ended questions (Ozuru et al., 2010; Jodoin, 2003) ▫Different strategies are used for different item formats (Katz, Bennett, & Berger, 2000)

4 Literature Review, Continued Assessment: Question Type ▫Different cognitive skills are used for straightforward vs. word problems (Fuchs et al., 2008) ▫Knowledge transfer occurs while solving different question types (Belenky & Nokes-Malach, 2013; Day & Goldstone, 2012) Data-Driven Decision Making ▫DDDM informs instruction (Mandinach, 2012; Schifter et al., 2014) and curriculum design (Mandinach & Gummer, 2015), but little research has been done using DDDM to inform standardized testing in specific subjects

5 Research Questions 1.What are the psychometric properties, particularly validity, of a computer-based Algebra 2 assessment? 2.How can the properties of this assessment inform data-driven decision making for creating standardized tests?

6 Math Measure (Mazzarella, 2015) Modeled after Virginia Standards of Learning (SOL) Algebra 2 exam ▫Three Strands: Expressions and Operations, Equations and Inequalities, and Functions and Statistics Mix of different item formats (multiple choice vs. technology enhanced) and question types (straightforward vs. word problem) Two versions ▫Same prompts, different item format for each prompt

7 Participants and Method 148 High School Students ▫Large, Diverse, Suburban High School ▫Enrolled in an On-Level Algebra 2 Course Students randomly assigned to 1 of 2 versions Test administered on computer during the students’ class period ▫90 minutes to complete the exam

8 Data Analysis Classical Test Theory (CTT) ▫Descriptive statistics ▫Reliability (Cronbach’s alpha) ▫Difficulty ▫Discrimination Rasch Analysis ▫Misfit Items ▫3-parameter logistic ▫Wright map
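The CTT quantities named on this slide (difficulty, discrimination, and Cronbach's alpha) can all be computed directly from a scored response matrix. Below is a minimal sketch, assuming dichotomous 0/1 scoring; the function and variable names, and the simulated data, are illustrative placeholders rather than the study's actual analysis pipeline.

```python
# Minimal CTT sketch: difficulty, discrimination, and Cronbach's alpha
# from a 0/1-scored response matrix (rows = students, columns = items).
import numpy as np

def ctt_stats(responses: np.ndarray):
    """Return Cronbach's alpha, item difficulty, and item discrimination."""
    n_students, n_items = responses.shape
    total = responses.sum(axis=1)

    # Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total-score variance)
    item_var = responses.var(axis=0, ddof=1)
    alpha = (n_items / (n_items - 1)) * (1 - item_var.sum() / total.var(ddof=1))

    # Difficulty: proportion of students answering each item correctly
    difficulty = responses.mean(axis=0)

    # Discrimination: corrected item-total (point-biserial) correlation,
    # correlating each item with the total score excluding that item
    discrimination = np.array([
        np.corrcoef(responses[:, j], total - responses[:, j])[0, 1]
        for j in range(n_items)
    ])
    return alpha, difficulty, discrimination

# Example with simulated data (not the study's data)
rng = np.random.default_rng(0)
sim = (rng.random((86, 25)) < 0.6).astype(int)
alpha, diff, disc = ctt_stats(sim)
print(f"alpha={alpha:.3f}, easiest item p={diff.max():.2f}, lowest r={disc.min():.2f}")
```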

9 Results

                            Test A      Test B
  n                         86          62
  Mean Score                17.0233     18.3871
  Median                    19
  Standard Dev.             6.5129      4.8704
  Skewness                  -0.3956     -0.1075
  Kurtosis                  -0.6510     -0.6814
  Cronbach's α              0.8815      0.7761
  SEM                       2.2417      2.3044
  Low Difficulty            0.2093      0.1774
  High Difficulty           0.8837      0.9516
  Low Discrimination        0.1123      -0.1115
  High Discrimination       0.6226      0.5279
  Item Separation Index     4.1273      3.4176
  Person Separation Index   2.5484      1.8852
  # of Potential Misfits    3           3
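As a quick consistency check, the reported SEM values follow from the classical relation SEM = SD × √(1 − α). The short sketch below plugs in the standard deviations and alphas from the table above; the small differences from the reported 2.2417 and 2.3044 are rounding in the tabled inputs.

```python
# SEM = SD * sqrt(1 - reliability), using the values from the results table above
import math

for version, sd, alpha in [("Test A", 6.5129, 0.8815), ("Test B", 4.8704, 0.7761)]:
    sem = sd * math.sqrt(1 - alpha)
    print(f"{version}: SEM ≈ {sem:.4f}")  # ≈ 2.2420 and 2.3046, matching the table within rounding
```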

10 Results, continued Lower reliability and separation indices on Test B are likely due to the smaller number of participants Multiple-choice questions are more likely to have low discrimination Misfit items were more likely to be multiple-choice items than technology-enhanced items Straightforward items were more likely to be answered correctly than word problems The Wright map shows a similar distribution of items and participants for both versions
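The misfit counts and the Wright map come from the Rasch analysis, in which the probability of a correct response depends only on the gap between person ability θ and item difficulty b. The sketch below shows one common way misfit items are flagged, via the outfit mean-square statistic; it assumes dichotomous scoring, and the person and item measures are illustrative placeholders, not the study's estimates.

```python
# Rasch model: P(correct) = exp(theta - b) / (1 + exp(theta - b)).
# Outfit mean-square for an item is the average squared standardized residual;
# values well above 1 flag potential misfit.
import numpy as np

def outfit_msq(responses: np.ndarray, theta: np.ndarray, b: np.ndarray) -> np.ndarray:
    """responses: students x items (0/1); theta: person abilities; b: item difficulties."""
    p = 1 / (1 + np.exp(-(theta[:, None] - b[None, :])))  # expected P(correct) per response
    var = p * (1 - p)                                      # binomial variance per response
    z2 = (responses - p) ** 2 / var                        # squared standardized residuals
    return z2.mean(axis=0)                                 # outfit MSQ per item

# Illustrative use with placeholder measures (logits)
rng = np.random.default_rng(1)
theta = rng.normal(0, 1, 86)   # hypothetical person abilities
b = rng.normal(0, 1, 25)       # hypothetical item difficulties
data = (rng.random((86, 25)) < 1 / (1 + np.exp(-(theta[:, None] - b[None, :])))).astype(int)
print(np.round(outfit_msq(data, theta, b), 2))
```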

11 Limitations Lack of generalizability ▫School demographics, subject, specific curriculum Imbalance of students taking each test ▫More students taking Test A than Test B Classes randomly assigned, not students ▫Different teachers ▫Significant difference in achievement for one teacher

12 Implications and Further Research For teachers: knowing how to best prepare students for tests ▫Content vs. characteristics of the test For test-makers: scrutinizing and adjusting the standardized tests that millions of students take ▫Are they really taking the data into consideration? For researchers: are there significant differences across item formats or question types? ▫If so, why? ▫How do these characteristics impact student achievement? ▫Are there differences in achievement by gender?
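One way a researcher could probe the item-format question raised here is to compare each student's subscores across the two formats. Below is a minimal sketch using a paired t-test from SciPy; the subscores are simulated placeholders, not results from this study.

```python
# Paired t-test comparing each student's multiple-choice subscore with their
# technology-enhanced subscore; all data here are simulated placeholders.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
mc_scores = rng.normal(10.0, 2.5, 148)             # hypothetical multiple-choice subscores
te_scores = mc_scores - rng.normal(0.8, 1.5, 148)  # hypothetical technology-enhanced subscores

res = stats.ttest_rel(mc_scores, te_scores)
print(f"t = {res.statistic:.2f}, p = {res.pvalue:.4f}")
# A small p-value would suggest a format effect worth investigating further.
```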

13 References Belenky, D. M., & Nokes-Malach, T. J. (2013). Mastery-approach goals and knowledge transfer: An investigation into the effects of task structure and framing instruction. Learning and Individual Differences, 25, 21-34. doi:10.1016/j.lindif.2013.02.004 Day, S. B., & Goldstone, R. L. (2012). The import of knowledge export: Connecting findings and theories of transfer of learning. Educational Psychologist, 47(3), 153-176. doi:10.1080/00461520.2012.696438

14 References, continued Fuchs, L. S., Fuchs, D., Stuebing, K., Fletcher, J. M., Hamlett, C. L., & Lambert, W. (2008). Problem solving and computational skill: Are they shared or distinct aspects of mathematical cognition? Journal of Educational Psychology, 100(1), 30-47. doi:10.1037/0022-0663.100.1.30 Jodoin, M. G. (2003). Measurement efficiency of innovative item formats in computer-based testing. Journal of Educational Measurement, 40(1), 1-15. doi:10.1111/j.1745-3984.2003.tb01093. Katz, I. R., Bennett, R. E., & Berger, A. E. (2000). Effects of response format on difficulty of SAT-mathematics items: It's not the strategy. Journal of Educational Measurement, 37(1), 39-57. doi:10.1111/j.1745-3984.2000.tb01075.x

15 References, continued Mandinach, E. (2012). A perfect time for data use: Using data-driven decision making to inform practice. Educational Psychologist, 47(2), 71-85. doi:10.1080/00461520.2012.667064 Mandinach, E., & Gummer, E. (2015). Data-driven decision making: Components of the enculturation of data use in education. Teachers College Record, 117(4). Retrieved from http://psycnet.apa.org.mutex.gmu.edu Ozuru, Y., Best, R., Bell, C., Witherspoon, A., & McNamara, D. S. (2010). Influence of question format and text availability on the assessment of expository text comprehension. Cognition and Instruction, 25(4), 399-438. doi:10.1080/07370000701632371

16 References, continued Schifter, C. C., Natarajan, U., Ketelhut, D. J., & Kirchgessner, A. (2014). Data-driven decision making: Facilitating teacher use of student data to inform classroom instruction. Contemporary Issues in Technology & Teacher Education, 14(4), 419-432. Retrieved from http://psycnet.apa.org.mutex.gmu.edu Shapiro, E. S., & Gebhardt, S. N. (2012). Comparing computer-adaptive and curriculum-based measurement models of assessment. School Psychology Review, 41(3), 295-305. Threlfall, J., Pool, P., Homer, M., & Swinnerton, B. (2007). Implicit aspects of paper and pencil mathematics assessment that come to light through the use of the computer. Educational Studies in Mathematics, 66(3), 335-348. doi:10.1007/s10649-006-9078-5
