Can an Intelligent Tutoring System Predict Math Proficiency as Well as a Standardized Test?
Mingyu Feng
Co-authored with Neil T. Heffernan, Joseph E. Beck & Kenneth R. Koedinger
The Need for Assessment
- Pressure from NCLB
- Poor student performance on standardized tests
- Pressure from teachers, parents, and other stakeholders
- Desire to use assessment results in a data-driven manner
- Desire for more immediate feedback on how students are doing
Related work on assessing
- Corbett & Bhatnagar (1997): used a series of short tests to adjust the knowledge tracing process to predict individual differences more accurately
- Beck & Sison (2006): used knowledge tracing to construct a student model that predicts student performance at both coarse and fine grain sizes
- Feng, Heffernan & Koedinger (2006): assistance metrics and longitudinal modeling
- Anozie & Junker (2006): monthly summaries of online metrics, examining the changing influence of online metrics on MCAS performance over time
How to evaluate model effectiveness?
Measures that have been used:
- Mean Absolute Error (MAE) / Mean Absolute Deviation (MAD): relative closeness to the real scores
- R squared and the Bayesian Information Criterion (BIC)
- Simulation study (Feng, Heffernan & Koedinger, 2006)
- "Proxy" measure method (Beck & Sison, 2006)
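The first two measures are simple to compute. A minimal sketch in Python (the function names are ours, for illustration only):

```python
import numpy as np

def mean_absolute_deviation(predicted, actual):
    """MAD/MAE: the average absolute gap between predicted and real scores."""
    predicted, actual = np.asarray(predicted), np.asarray(actual)
    return np.abs(predicted - actual).mean()

def r_squared(predicted, actual):
    """Proportion of variance in the real scores explained by the predictions."""
    predicted, actual = np.asarray(predicted), np.asarray(actual)
    ss_res = np.sum((actual - predicted) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return 1.0 - ss_res / ss_tot
```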
Focus of this work
- Replicate our previous study predicting student math proficiency at the end of a school year
- Propose a new method for evaluating the predictive accuracy of a student model relative to the standardized test: can we predict math proficiency as well as a standardized test, the MCAS?
Predict math proficiency
Data:
- 392 students who used ASSISTment during 2004 to 2005 and did at least 39 problems
- 8th grade MCAS scores (2005)
- 10th grade MCAS scores (2007)
Online data collected in the ASSISTment system:
- Student proficiency score: how the student did on main problems
- Assistance metrics: how the student interacted with the system
Student proficiency score
- An estimate of student performance that takes into account the difficulty of the items
- Obtained by training a Rasch model on data from 2004 to 2008: student responses on main questions, 14,274 student accounts, 2,797 questions
- The trained model yields theta (the student proficiency score) and beta (the item difficulty parameter)
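To make the estimation concrete, here is a minimal joint maximum-likelihood sketch in Python; the triple-based data format, learning rate, and iteration count are our assumptions for illustration, not the fitting procedure used in the study (production IRT fits typically use specialized packages):

```python
import numpy as np

def fit_rasch(responses, n_iters=500, lr=0.05):
    """Joint MLE for a Rasch model via gradient ascent.
    responses: iterable of (student_idx, item_idx, correct) triples."""
    s_idx = np.array([r[0] for r in responses])
    i_idx = np.array([r[1] for r in responses])
    y = np.array([r[2] for r in responses], dtype=float)
    theta = np.zeros(s_idx.max() + 1)  # student proficiencies
    beta = np.zeros(i_idx.max() + 1)   # item difficulties
    for _ in range(n_iters):
        p = 1.0 / (1.0 + np.exp(-(theta[s_idx] - beta[i_idx])))
        resid = y - p  # per-response gradient w.r.t. the logit
        theta += lr * np.bincount(s_idx, weights=resid, minlength=theta.size)
        beta -= lr * np.bincount(i_idx, weights=resid, minlength=beta.size)
        beta -= beta.mean()  # center difficulties to fix the scale's origin
    return theta, beta
```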
Rasch model
- "Used particularly in psychometrics, for analyzing data from assessments to measure things such as abilities, attitudes" (Wikipedia)
- A one-parameter IRT model
- The probability of a correct response is modelled as a logistic function of the difference between the student's proficiency and the difficulty of the question.
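In standard notation, with theta_i the proficiency of student i and beta_j the difficulty of question j, the slide's logistic relation is:

```latex
P(X_{ij} = 1 \mid \theta_i, \beta_j) = \frac{e^{\theta_i - \beta_j}}{1 + e^{\theta_i - \beta_j}}
```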
Assistance metrics
- % correct on main questions and scaffolds
- Average number of hints, bottom-out hints, attempts, and seconds a student needs to solve a question
- Total number of questions done
- Total number of minutes the student worked in the system
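These are straightforward per-student aggregates over the interaction log. A sketch with pandas, where the log file and its column names are our assumptions:

```python
import pandas as pd

# Hypothetical log: one row per main-question attempt.
log = pd.read_csv("assistment_log.csv")  # assumed columns: student_id,
                                         # correct, hints, bottom_out_hints,
                                         # attempts, seconds

metrics = log.groupby("student_id").agg(
    pct_correct=("correct", "mean"),
    avg_hints=("hints", "mean"),
    avg_bottom_out_hints=("bottom_out_hints", "mean"),
    avg_attempts=("attempts", "mean"),
    avg_seconds=("seconds", "mean"),
    total_questions=("correct", "size"),
    total_minutes=("seconds", lambda s: s.sum() / 60.0),
)
```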
Modelling: "lean" model
- Uses only the student proficiency score (theta)
- Correlates fairly well with the real 8th grade MCAS score: r = .731
Modelling: backwards linear regression
- Independent variables: assistance metrics (including quadratic and interaction terms for each assistance feature)
- Dependent variable: 8th grade MCAS
- Number of variables that survived: 36
- Predicted score: MCAS8'
- r = .864
Modelling: backwards linear regression with theta
- Independent variables: assistance metrics (including quadratic and interaction terms for each assistance feature) plus theta
- Dependent variable: 8th grade MCAS
- Number of variables that survived: 35
- Predicted score: MCAS8''
- r = .874, MAD = 4.7 (8.7% of the total score)
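A sketch of the procedure behind both regression slides, assuming statsmodels and scikit-learn; the p-value threshold and the features/mcas8_scores DataFrames are illustrative assumptions, not the study's exact settings:

```python
import pandas as pd
import statsmodels.api as sm
from sklearn.preprocessing import PolynomialFeatures

def backward_regression(X, y, p_threshold=0.10):
    """Backward elimination: repeatedly drop the least significant
    predictor until every remaining p-value is below the threshold."""
    cols = list(X.columns)
    while cols:
        model = sm.OLS(y, sm.add_constant(X[cols])).fit()
        pvals = model.pvalues.drop("const")
        worst = pvals.idxmax()
        if pvals[worst] < p_threshold:
            return model, cols  # every survivor is significant
        cols.remove(worst)
    return None, []

# features: per-student assistance metrics (add theta for the fuller model);
# the degree-2 expansion supplies the quadratic and interaction terms.
poly = PolynomialFeatures(degree=2, include_bias=False)
X2 = pd.DataFrame(
    poly.fit_transform(features),
    columns=poly.get_feature_names_out(features.columns),
    index=features.index,
)
model, survivors = backward_regression(X2, mcas8_scores)
```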
Evaluate our prediction
- A prediction error of 8.7% does not sound bad. But should we be satisfied, or should we push even harder?
- What is a good comparison? One reasonable choice: the MCAS, the standardized test itself.
- How are we doing at estimating student math proficiency compared to the MCAS?
Evaluate our prediction
We propose to use scores from multiple years, assuming that:
- student math proficiency at 8th grade and at 10th grade are highly correlated with each other, and
- the measurement errors are relatively independent.
Whichever score better predicts the 10th grade MCAS score is the better assessment of student math skill at 8th grade. We will compare our prediction with the 2005 MCAS test on how well each correlates with the 2007 MCAS scores.
Indirect method: first predict the 8th grade MCAS, then use the predicted 8th grade MCAS score to predict the 10th grade MCAS.

Regression model:
- Dependent variable: MCAS8
- Independent variables: assistance metrics + theta
- Method: backwards regression
- Predicted value: MCAS8''

Correlations: theta predicts MCAS8 at r = 0.731; MCAS8'' correlates with MCAS8 at r = 0.874 and with MCAS10 at r = 0.729.

Notation:
- theta: student proficiency score
- MCAS8: 8th grade MCAS score
- MCAS8'': our prediction of the 8th grade MCAS score using assistance metrics and theta
- MCAS10: 10th grade MCAS score
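The comparison itself reduces to two correlations over the same students; a minimal sketch, assuming aligned score arrays mcas8, mcas8_pred (MCAS8''), and mcas10:

```python
from scipy.stats import pearsonr

# Real 8th grade test vs. our prediction, each correlated with the
# 10th grade outcome for the same 392 students.
r_test, _ = pearsonr(mcas8, mcas10)
r_model, _ = pearsonr(mcas8_pred, mcas10)
print(f"MCAS8   -> MCAS10: r = {r_test:.3f}")
print(f"MCAS8'' -> MCAS10: r = {r_model:.3f}")  # the slides report .729
```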
Scatter plots
A summary correlation table

Independent variables        Correlation with
Assistance    Theta          MCAS8    MCAS10
              X              .731     .628
X                            .864     .723
X             X              .874     .729

- The student proficiency score correlates fairly well with the real MCAS scores (p < .05).
- We can get a better prediction of MCAS8 by taking into consideration how the student interacts with the system: attempts, help-seeking behavior, performance on scaffolding questions, speed, etc. (r = .864).
- Our student model does as well as the MCAS test at predicting the MCAS score two years later (fractionally, but not meaningfully, worse).
Compare evaluation methods
- Beck & Sison (2006) found 3 tests that measure constructs extremely similar to the standardized test they were interested in. They took the arithmetic mean of those tests as a proxy measure for the true score on the original measure.
- In Feng, Heffernan & Koedinger (2006), we ran a simulation study by "splitting" a standardized test into two parts; the predictive power of the standardized test is determined by how well student performance on one half of the test predicts performance on the other half.

"Proxy" measure
- Pros: can be done quickly
- Cons: construct validity can be an issue
Simulation study
- Pros: quickness
- Cons: measurement error within the same day; item-level data is not always accessible
Longitudinal approach
- Pros: avoids the confound of measurement error and gives a fairer baseline
- Cons: takes longer and requires more effort to collect data across years
Conclusion
- Online metrics do a good job of predicting math proficiency.
- Our student model does as well as the MCAS test at predicting the MCAS score two years later.
- We propose a longitudinal approach to evaluating prediction accuracy.
- It is a long-term prediction job: collection of online data started in Sept. 2004; 8th grade MCAS: May 2005; 10th grade MCAS: May 2007; data collection finished in Nov. 2007.