Core Methods in Educational Data Mining

Similar presentations
Advanced Methods and Analysis for the Learning and Social Sciences PSY505 Spring term, 2012 January 23, 2012.

Design of Experiments Lecture I
Advanced Methods and Analysis for the Learning and Social Sciences PSY505 Spring term, 2012 March 12, 2012.
Knowledge Inference: Advanced BKT Week 4 Video 5.
Improving learning by improving the cognitive model: A data- driven approach Cen, H., Koedinger, K., Junker, B. Learning Factors Analysis - A General Method.
Ryan S.J.d. Baker Adam B. Goldstein Neil T. Heffernan Detecting the Moment of Learning.
Planning under Uncertainty
Formula Auditing, Data Validation, and Complex Problem Solving
Educational Data Mining Overview John Stamper PSLC Summer School /25/2011 1PSLC Summer School 2011.
Spreadsheet Problem Solving
1 Prediction of Software Reliability Using Neural Network and Fuzzy Logic Professor David Rine Seminar Notes.
Financial Statement Modeling & Spreadsheet Engineering “Training in spreadsheet modeling improves both the efficiency and effectiveness with which analysts.
Advanced Methods and Analysis for the Learning and Social Sciences PSY505 Spring term, 2012 February 13, 2012.
Sample size vs. Error A tutorial By Bill Thomas, Colby-Sawyer College.
In the name of God, the Most Gracious, the Most Merciful. Ehsan Khoddam Mohammadi M.J.Mahzoon Koosha K.Moogahi.
Advanced Methods and Analysis for the Learning and Social Sciences PSY505 Spring term, 2012 April 2, 2012.
Curvilinear 2 Modeling Departures from the Straight Line (Curves and Interactions)
Special Topics in Educational Data Mining HUDK5199 Spring term, 2013 February 4, 2013.
Multigroup Models Byrne Chapter 7 Brown Chapter 7.
Feature Engineering Studio March 1, Let’s start by discussing the HW.
Core Methods in Educational Data Mining HUDK4050 Fall 2015.
Core Methods in Educational Data Mining HUDK4050 Fall 2015.
Core Methods in Educational Data Mining HUDK4050 Fall 2015.
Chapter 6 Neural Network.
Advanced Methods and Analysis for the Learning and Social Sciences PSY505 Spring term, 2012 January 25, 2012.
Core Methods in Educational Data Mining HUDK4050 Fall 2014.
Special Topics in Educational Data Mining HUDK5199 Spring, 2013 April 3, 2013.
Core Methods in Educational Data Mining HUDK4050 Fall 2014.
Advanced Methods and Analysis for the Learning and Social Sciences PSY505 Spring term, 2012 February 6, 2012.
Core Methods in Educational Data Mining HUDK4050 Fall 2014.
Core Methods in Educational Data Mining HUDK4050 Fall 2014.
Learning Analytics isn’t new Ways in which we might build on the long history of adaptive learning systems within contemporary online learning design Professor.
HUDK5199: Special Topics in Educational Data Mining
Core Methods in Educational Data Mining
Core Methods in Educational Data Mining
Core Methods in Educational Data Mining
Core Methods in Educational Data Mining
Michael V. Yudelson Carnegie Mellon University
Deep Feedforward Networks
Randomness in Neural Networks
Solver & Optimization Problems
Error Correcting Code.
Reinforcement learning (Chapter 21)
Using the Excel Creation Template to Create a Variable Parameter Problem (Macro Enabled “Alpha 1.4.2”) Getting started – Example 1 Note – You should be.
Basics of Group Analysis
Reinforcement learning (Chapter 21)
Special Topics in Educational Data Mining
Intelligent Information System Lab
Chapter 3 Component Reliability Analysis of Structures.
Core Methods in Educational Data Mining
HUDK5199: Special Topics in Educational Data Mining
Core Methods in Educational Data Mining
Core Methods in Educational Data Mining
Big Data, Education, and Society
of the Artificial Neural Networks.
Addressing the Assessing Challenge with the ASSISTment System
vms x Year 8 Mathematics Equations
Neil T. Heffernan, Joseph E. Beck & Kenneth R. Koedinger
The Organizational Impacts on Software Quality and Defect Estimation
Michal Rosen-Zvi University of California, Irvine
The loss function, the normal equation,
Core Methods in Educational Data Mining
Deep Knowledge Tracing
CMSC201 Computer Science I for Majors Final Exam Information
Mathematical Foundations of BME Reza Shadmehr
Core Methods in Educational Data Mining
Getting started – Example 1
Core Methods in Educational Data Mining
David Kauchak CS158 – Spring 2019
Two’s Complement & Binary Arithmetic
Presentation transcript:

Core Methods in Educational Data Mining EDUC 691 Spring 2019

Performance Factors Analysis What are the important differences in assumptions between PFA and BKT? What does PFA offer that BKT doesn’t? What does BKT offer that PFA doesn’t?

What does each of these parameters mean?
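
For reference, the standard PFA model (Pavlik, Cen, & Koedinger, 2009) predicts performance from a per-skill intercept plus weighted counts of the student's prior successes and failures on each skill:

    m = sum over skills j of ( beta_j + gamma_j * s_j + rho_j * f_j )
    p(m) = 1 / (1 + e^(-m))

Here s_j and f_j are the student's prior successes and failures on skill j, gamma_j and rho_j weight those counts, and beta_j captures the overall easiness of the skill (or of the item or item-type, depending on the variant).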

Let’s build PFA Using file pfa-modelfit-set-v3.xlsx

Q1: The first thing we need to do is to create a column that represents the success so far on skill 1, 2, and 3. This will be used with PFA's gamma parameter. We'll put these in columns H, I, and J. What should go in cell H2? If you're not sure, try each of these:
=IF(C2=1,$F2,0),IF(C2=1,H1+$F2,H1)
=IF($A2<>$A1,IF(C2=1,$F2,0),IF(C2=1,H1+$F2,H1))
=IF(C2=0,$F2,1),IF(C2=1,H1+$F2,H1)
=IF($A2<>$A1,IF(C2=0,$F2,1),IF(C2=1,H1+$F2,H1))
=IF(C2=0,$F2,1),IF(C2=1,H1, H1+$F2)
=IF($A2<>$A1,IF(C2=0,$F2,1),IF(C2=1,H1, H1+$F2))
=IF(C2=1,$F2,0),IF(C2=1,H1*$F2,H1)
=IF($A2<>$A1,IF(C2=1,$F2,0),IF(C2=1,H1+$F2,$F2))

Q2: Next, you need to create a column that represents the incorrect answers so far on skill 1, 2, and 3. This will be used with PFA's rho parameter. We'll put these in columns K, L, and M. What should go in cell K2? (Remember, if you're not sure, try each of these.)
=IF($A2<>$A3,IF(C2=1,$G2,0),IF(C2=1,K1+$G2,1))
=IF($A2<>$A3,IF(C2=1,$G2,0),IF(C2=1,K1+$G2,0))
=IF($A2<>$A3,IF(C2=1,$G2,0),IF(C2=1,K1+$G2,K1-$G2))
=IF($A2<>$A3,IF(C2=1,$G2,0),IF(C2=1,K1+$G2,K1))
=IF(C2=1,$G3,0)
=IF($A2<>$A1,IF(C2=1,$G2,0),IF(C2=1,K1+$G2,K1))
=IF($A2<>$A1,IF(C2=1,$G2,0),IF(C2=1,K1,K1+$G2))
=IF($A2<>$A3,IF(C2=1,$G2,0),IF(C2=1,K1,K1+$G2))
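
As a cross-check on the spreadsheet logic for columns H-J and K-M, here is a minimal Python sketch of the same bookkeeping: for each student and skill, count successes and failures on attempts before the current row. The column names (student, skill, correct) are assumptions for illustration and may not match the workbook's actual layout.

    import pandas as pd

    def add_prior_counts(df):
        # Assumes columns 'student', 'skill', 'correct' (1/0), with rows already
        # in chronological order within each student. These names are illustrative.
        grouped = df.groupby(['student', 'skill'])['correct']
        cum_success = grouped.cumsum()          # successes up to and including this row
        cum_attempts = grouped.cumcount() + 1   # attempts up to and including this row
        # Subtract the current row so that only *prior* attempts are counted.
        df['prior_success'] = cum_success - df['correct']
        df['prior_failure'] = (cum_attempts - cum_success) - (1 - df['correct'])
        return df

Whether the workbook counts the current attempt as well is exactly what Q1 and Q2 ask you to pin down; the sketch above uses the usual PFA convention of counting only prior attempts.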

Q3: Now you need to compute the gamma parameters for the student's history of success. Note that the gamma weights are on sheet "fit". Copy =fit!$F$1*H2 into cell N2 and propagate it down. What should O2 be?
=fit!$F$2*I2
=fit!$F$1*H2
=fit!$F$1*I2
=fit!$F$2*H2

Step: OK, propagate from O2 down, and do the same thing for column P, using =fit!$F$3*J2. Put =SUM(N2:P2) into cell Q2 and copy it down. Now you have all the success parameters added together for the three skills.

Q4: Now you need to create the rho parameters for the student's history of failure. What should R2 be?
=fit!$F$2*K2
=fit!$F$2*L2
=fit!$F$2*M2
=fit!$F$3*L2
=fit!$F$4*K2
=fit!$F$5*L2
=fit!$F$6*M2

Step: Propagate that down, and create the corresponding values for columns S and T. Put =SUM(R2:T2) into cell U2 and copy it down. Now you have all the failure parameters added together for the three skills. Put =fit!$F$7 into column V and copy it down.

Q5: Now we can calculate m! What's m?
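
In the standard PFA formulation, m is the linear predictor (the logit): the weighted sum of the student's prior successes and failures across the skills, plus the intercept. The logistic transform then converts m into a probability of a correct response.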

Q5 part 2: What do you put in cell W2?
=V2
=Q2+U2
=Q2/U2
=Q2*U2
=Q2+U2+V2
=Q2*U2*V2
=(Q2*U2)/V2
=EXP(Q2+U2-V2)
=EXP(Q2+U2+V2)
=EXP(Q2*U2*V2)

Q6: What goes in cell X2, p(m)?
=W2
=(W2*-1)
=EXP(W2)
=EXP(W2*-1)
=1/(1+EXP(W2*-1))
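
Putting Q3 through Q6 together, here is a small numeric sketch of the whole calculation in Python. The parameter values and counts are made up for illustration; the real weights live on the "fit" sheet.

    import math

    gamma = [0.2, 0.3, 0.1]    # illustrative success weights for skills 1-3
    rho = [-0.1, 0.05, 0.0]    # illustrative failure weights for skills 1-3
    beta = -0.5                # illustrative intercept

    prior_success = [2, 0, 1]  # analogue of columns H-J for one row
    prior_failure = [1, 0, 0]  # analogue of columns K-M for one row

    m = (sum(g * s for g, s in zip(gamma, prior_success))   # column Q analogue
         + sum(r * f for r, f in zip(rho, prior_failure))   # column U analogue
         + beta)                                            # column V analogue
    p = 1 / (1 + math.exp(-m))                              # p(m)
    print(m, p)                                             # m is about -0.1, p is about 0.475

This mirrors the column structure in the workbook: Q holds the gamma-weighted successes, U the rho-weighted failures, V the intercept, W the resulting m, and X the probability p(m).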

Q7: You've got PFA! Now it's time to fit the seven parameters. Go to the sheet "fit". What is the SSR currently?
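
A reminder of what is being minimized: SSR here is presumably the sum of squared residuals between each observed response (1 or 0) and the model's predicted p(m), i.e. SSR = sum over rows of (actual - p(m))^2, so lower is better. Exactly which cells the "fit" sheet uses for this is defined in the workbook itself.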

Q8: What happens if you change gamma-skill-1 to be 1? What does the SSR become?

Q9: Is the model better or worse than the model you got for Question 7?
Better
Worse
The Same

Q10: What does it mean to increase gamma-skill-1 from 0 to 1?
It means that getting skill 1 right improves your performance on future items involving skill 1
It means that getting skill 1 right worsens your performance on future items involving skill 1
It means that getting skill 1 wrong improves your performance on future items involving skill 1
It means that getting skill 1 wrong worsens your performance on future items involving skill 1
It means that your SSR gets better!

Q11: Use the Excel Solver to find the optimal parameters for this model. (You may need to install it as an add-in.) Make sure to use the GRG Nonlinear solving method and leave "Make Unconstrained Variables Non-Negative" unchecked. What is the resultant SSR?
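
For comparison, the same kind of fit can be done outside Excel. Below is a minimal, self-contained Python sketch that minimizes the SSR of a three-skill PFA model with scipy; the data and starting values are made up for illustration, and Nelder-Mead is used here instead of Excel's gradient-based GRG Nonlinear method.

    import numpy as np
    from scipy.optimize import minimize

    # Each row: prior successes on skills 1-3, prior failures on skills 1-3, actual correctness.
    # Purely illustrative numbers, not the workbook's data.
    data = np.array([
        [0, 0, 0, 0, 0, 0, 0],
        [1, 0, 0, 0, 0, 0, 1],
        [1, 0, 1, 1, 0, 0, 1],
        [2, 1, 1, 1, 0, 1, 0],
    ], dtype=float)

    def pfa_ssr(params, rows):
        # params = [gamma1, gamma2, gamma3, rho1, rho2, rho3, beta]
        gamma, rho, beta = params[0:3], params[3:6], params[6]
        s, f, actual = rows[:, 0:3], rows[:, 3:6], rows[:, 6]
        m = s @ gamma + f @ rho + beta
        p = 1.0 / (1.0 + np.exp(-m))
        return np.sum((actual - p) ** 2)

    x0 = np.zeros(7)  # start from all-zero parameters, as in the spreadsheet
    result = minimize(pfa_ssr, x0, args=(data,), method='Nelder-Mead')
    print(result.x, result.fun)  # fitted parameters and the SSR they achieve

Like Solver, this finds a local optimum from the given starting point, so different starting values can lead to different fits.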

Questions? Comments?

Can you do better than the solver? Take 5 minutes

What does each of these mean? When might you legitimately get them?
r < 0
g < r
g < 0

How does PFA represent learning? (As opposed to just better predicted performance because you've gotten it right.) Is it r? Is it the average of r and g?

Let's play with b values in the spreadsheet. Any questions?

b Parameters: Pavlik proposes three different b parameters: item, item-type, and skill. These result in different numbers of parameters, and in greater or lesser potential concern about over-fitting. What are the circumstances where you might want item versus skill?
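
For a concrete sense of the trade-off (numbers invented for illustration): with 3 skills, 20 item types, and 200 items, a skill-level b adds 3 parameters, an item-type-level b adds 20, and an item-level b adds 200, in each case on top of the same gammas and rhos. More b parameters can capture real differences in item difficulty, but each extra parameter is another opportunity to over-fit, especially when there is little data per item.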

Other questions, comments, concerns about PFA?

Deep Knowledge Tracing (DKT) (Piech et al., 2015): Based on "deep learning", specifically recurrent neural networks / long short-term memory (LSTM) networks. Fits on the sequence of student performance across skills. Predicts performance on future items within the system. Can fit very complex functions, including very complex relationships between items over time.
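
As a rough illustration of the kind of architecture involved (a minimal sketch, not the authors' implementation): each attempt is encoded as a one-hot (skill, correctness) vector, a recurrent layer summarizes the history, and the output layer gives a predicted probability of correctness for every skill at the next step. The layer sizes and the use of PyTorch here are choices made for this sketch.

    import torch
    import torch.nn as nn

    class DKTSketch(nn.Module):
        # Minimal DKT-style model: one-hot (skill, correctness) in, p(correct) per skill out.
        def __init__(self, n_skills, hidden_size=64):
            super().__init__()
            # Input width 2 * n_skills: one slot for (skill, correct), one for (skill, incorrect).
            self.rnn = nn.LSTM(input_size=2 * n_skills, hidden_size=hidden_size, batch_first=True)
            self.out = nn.Linear(hidden_size, n_skills)

        def forward(self, x):                   # x: (batch, seq_len, 2 * n_skills)
            h, _ = self.rnn(x)                  # hidden state summarizes the student's history so far
            return torch.sigmoid(self.out(h))   # (batch, seq_len, n_skills) predicted correctness

    model = DKTSketch(n_skills=3)
    x = torch.zeros(1, 5, 6)   # one student, five attempts, all-zero input just to show the shapes
    preds = model(x)           # predicted p(correct) for each skill after each attempt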

DKT: The initial paper reported massively better performance than the original BKT or PFA (Piech et al., 2015).

DKT: (Xiong et al., 2016) reported that (Piech et al., 2015) had used the same data points for both training and test, due to a miscommunication about the data set. DKT doesn't do quite as well when this error is fixed, but it is still moderately better than the original PFA or BKT.

DKT: (Khajah et al., 2016) compared DKT to modern extensions of BKT on the same data set; it was particularly beneficial to re-fit the item-skill mappings. (Wilson et al., 2016) compared DKT to temporal IRT on the same data set. Bottom line: all three approaches appear to perform comparably well.

Additional issue: (Yeung & Yeung, 2018) report degenerate behavior for DKT: getting answers right can lead to lower estimated knowledge, and probability estimates can swing wildly within short periods of time. They propose a regularization method to moderate these swings.
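
In outline (a paraphrase, not the paper's exact notation): the regularized loss keeps DKT's usual cross-entropy objective but adds a reconstruction term that also asks the model to predict the current answer, plus penalties on how much consecutive predictions differ, each with its own weight. The intended effect is to discourage both degenerate behaviors above by making the knowledge estimates change more smoothly and consistently over time.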

Extension for latent knowledge estimation: (Zhang et al., 2017) propose an extension to DKT that uses an item-skill mapping as well as DKT; the latent skill is still difficult to interpret. (Lee & Yeung, 2019) propose an alternative to DKT that attempts to output more interpretable latent skill estimates; some degenerate behavior is still reported.

Watch this space: this is an ongoing, rapidly moving discussion about algorithms, and time will tell which approaches are best.

Interpretability of modern approaches:
Is prediction of immediate future correctness the right indicator?
Are skill estimates more useful when prediction of immediate future correctness is better?
What is the ultimate goal: predicting immediate performance, or understanding what knowledge students carry forward?

Questions? Comments?

Let’s discuss assignment C3

Final project Does everyone have project groups? Any questions?

Next Class: Wednesday, April 17: Knowledge Structure Discovery, 2pm-3:50pm
Readings:
Baker, R.S. (2015) Big Data and Education. Ch. 7, V6, V7.
Desmarais, M.C., Meshkinfam, P., Gagnon, M. (2006) Learned Student Models with Item to Item Knowledge Structures. User Modeling and User-Adapted Interaction, 16 (5), 403-434.
Desmarais, M.C., Naceur, R. (2013) A Matrix Factorization Method for Mapping Items to Skills and for Enhancing Expert-Based Q-Matrices. Proceedings of the International Conference on Artificial Intelligence in Education, 441-450.
Cen, H., Koedinger, K., Junker, B. (2006) Learning Factors Analysis - A General Method for Cognitive Model Evaluation and Improvement. Proceedings of the International Conference on Intelligent Tutoring Systems, 164-175.
Koedinger, K.R., McLaughlin, E.A., Stamper, J.C. (2012) Automated Student Modeling Improvement. Proceedings of the 5th International Conference on Educational Data Mining, 17-24.
Assignment C3 due

The End