Educational data mining overview & Introduction to Exploratory Data Analysis Ken Koedinger CMU Director of PSLC Professor of Human-Computer Interaction.

Presentation transcript:

Educational data mining overview & Introduction to Exploratory Data Analysis Ken Koedinger CMU Director of PSLC Professor of Human-Computer Interaction & Psychology Carnegie Mellon University

Plan
- Because it is technical, we will start with learning curve formulas …
- Then go to exploratory data analysis
- Return in the next session to the use of these formulas in Item Response Theory and Learning Factors Analysis
- (This provides some “spaced” practice for you)

Overview
- Questions on yesterday’s intro?
- Another example of learning curves
- Quantitative models of learning curves: power law, logistic regression
- Exercise:
  - Goals: 1) get familiar with the data, 2) learn/practice Excel skills
  - Tasks: 1) create a “step table”, 2) graph learning curves using a) error rate & b) assistance score

Student Performance As They Practice with the LISP Tutor

Production Rule Analysis
Evidence for the production rule as an appropriate unit of knowledge acquisition

Using learning curves to evaluate a cognitive model
Lisp Tutor Model
- Learning curves are used to validate the cognitive model
- Curves fit better when organized by knowledge components (productions) rather than surface forms (programming language terms)
But curves are not smooth for some production rules
- “Blips” in learning curves indicate the knowledge representation may not be right
- Corbett, Anderson, O’Brien (1995)
Let me illustrate …

Curve for “Declare Parameter” production rule
- How are steps with blips different from others?
- What’s the unique feature or factor explaining these blips?
- What’s happening on the 6th & 10th opportunities?

Can modify the cognitive model using the unique factor present at “blips”
- Blips occur when the to-be-written program has 2 parameters
- Split Declare-Parameter by the parameter-number factor:
  - Declare-first-parameter
  - Declare-second-parameter
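The split can be sketched in a few lines of Python. This is only an illustration under assumptions: the record fields `student`, `kc` and `param_position` are hypothetical names, not the Lisp Tutor's or DataShop's actual format, and the steps are assumed to be in time order. After relabeling, opportunities are recounted per student and knowledge component, since each new component gets its own learning curve.

```python
from collections import defaultdict

# Hypothetical mapping from the parameter-number factor to the new component names.
NEW_NAMES = {1: "Declare-first-parameter", 2: "Declare-second-parameter"}

def split_by_parameter_number(steps):
    """Relabel Declare-Parameter steps and renumber opportunities per student and KC."""
    counts = defaultdict(int)
    result = []
    for step in steps:  # steps are assumed to be in time order
        row = dict(step)  # copy so the input records are left untouched
        if row["kc"] == "Declare-Parameter":
            row["kc"] = NEW_NAMES.get(row["param_position"], row["kc"])
        counts[(row["student"], row["kc"])] += 1
        row["opportunity"] = counts[(row["student"], row["kc"])]
        result.append(row)
    return result

# Made-up example: one student declares a first parameter twice, then a second one.
steps = [
    {"student": "s1", "kc": "Declare-Parameter", "param_position": 1},
    {"student": "s1", "kc": "Declare-Parameter", "param_position": 1},
    {"student": "s1", "kc": "Declare-Parameter", "param_position": 2},
]
relabeled = split_by_parameter_number(steps)
```

With the split in place, the third step counts as the first opportunity for Declare-second-parameter rather than the third opportunity for Declare-Parameter, which is the kind of step where a blip would otherwise show up.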

Learning curve analysis
The Power Law of Learning (Newell & Rosenbloom, 1993): $Y = a X^{b}$
- Y: error rate
- X: opportunities to practice a skill
- a: error rate on the 1st opportunity
- b: learning rate
After the log transformation, $\log Y = \log a + b \log X$:
- “a” sets the “intercept”, or starting point, of the learning curve
- “b” is the “slope”, or steepness, of the learning curve
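One way to estimate a and b is a straight-line fit after taking logs of both X and Y. The sketch below uses ordinary least squares from the standard library only; the data are made up for illustration, and every error rate must be greater than 0 for the log to exist. It is one possible fitting procedure, not necessarily the one used for the curves in these slides.

```python
import math

def fit_power_law(opportunities, error_rates):
    """Fit Y = a * X**b by ordinary least squares on (log X, log Y)."""
    xs = [math.log(x) for x in opportunities]
    ys = [math.log(y) for y in error_rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                         # slope of the log-log line
    a = math.exp(mean_y - b * mean_x)     # intercept, transformed back from log a
    return a, b

# Made-up error rates generated exactly from Y = 0.5 * X**-0.4.
opps = [1, 2, 3, 4, 5, 6]
errs = [0.5 * x ** -0.4 for x in opps]
a, b = fit_power_law(opps, errs)
```

Because error rates fall with practice, the fitted b comes out negative here; a steeper (more negative) slope means faster learning.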

More sophisticated learning curve model
Generalized Power Law to fit learning curves
- Logistic regression (Draney, Wilson, Pirolli, 1995)
Assumptions
- Different students may initially know more or less => use an intercept parameter for each student
- Students learn at the same rate => no slope parameter for each student
- Some productions may be better known than others => use an intercept parameter for each production
- Some productions are easier to learn than others => use a slope parameter for each production
These assumptions are reflected in the detailed math model …

More sophisticated learning curve model
Probability of getting a step correct (p) is modeled, on the log-odds scale, as a sum of terms:
- if student i performed this step ($X_i = 1$), add the overall “smarts” of that student, $\alpha_i$
- if skill j is needed for this step ($Y_j = 1$), add the easiness of that skill, $\beta_j$
- also add the product of the number of opportunities to learn, $T_j$, and the amount gained for each opportunity, $\gamma_j$
$$\ln\frac{p}{1-p} = \sum_i \alpha_i X_i + \sum_j \beta_j Y_j + \sum_j \gamma_j Y_j T_j$$
Use logistic regression because the response is discrete (correct or not)
- Probability (p) is transformed by “log odds”
- It is “stretched out” with an “s curve” so that it does not bump up against 0 or 1
(Related to “Item Response Theory”, which is behind standardized tests …)
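To make the terms concrete, here is a minimal sketch that turns given parameter values into a predicted probability. It assumes the parameters have already been estimated; the function name and the numbers are hypothetical and for illustration only. Each skill needed on the step contributes its easiness plus its per-opportunity gain times the number of prior opportunities, and the logistic function maps the resulting log odds back to a probability between 0 and 1.

```python
import math

def p_correct(student_smarts, skills):
    """Predicted probability of a correct step.

    student_smarts: intercept for the student.
    skills: list of (easiness, gain_per_opportunity, opportunities) tuples,
            one per skill needed on this step.
    """
    log_odds = student_smarts
    for easiness, gain, opportunities in skills:
        log_odds += easiness + gain * opportunities
    # Inverse of the log-odds transform (the "s curve").
    return 1.0 / (1.0 + math.exp(-log_odds))

# Made-up values: a hard skill (easiness -1.0) that improves by 0.3 per opportunity.
early = p_correct(0.2, [(-1.0, 0.3, 1)])
later = p_correct(0.2, [(-1.0, 0.3, 5)])
```

With these made-up values the predicted probability rises with practice, and it can approach but never reach 1, which is the reason for working on the log-odds scale.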

TWO_CIRCLES_IN_SQUARE problem: Initial screen

TWO_CIRCLES_IN_SQUARE problem: An error a few steps later

TWO_CIRCLES_IN_SQUARE problem: Student follows hint & completes the problem

Exported File Loaded into Excel

DataShop Export & Using Excel
Get files from:
- Go to Learnlab.org
- Click on “Enabling Technologies”
- Click on “Meetings”
- Click on “Documents”
Don’t do this yet … Demo …

Demo: Export Step Roll Up from DataShop …
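For readers who want to see what a step roll-up and the two learning curves involve outside Excel, here is a rough Python sketch. It is not DataShop's export code. The field names (`student`, `step`, `kc`, `outcome`) and the sample log are hypothetical, and it assumes that an error means the first attempt on a step was not correct (an incorrect answer or a hint request) and that the assistance score is the number of incorrect attempts plus hint requests on a step.

```python
from collections import defaultdict

def step_rollup(transactions):
    """Collapse a time-ordered transaction log into one row per student-step."""
    rows = {}  # dicts keep insertion order, so rows stay in time order
    for t in transactions:
        key = (t["student"], t["step"])
        if key not in rows:
            rows[key] = {"student": t["student"], "step": t["step"], "kc": t["kc"],
                         "first_attempt": t["outcome"], "incorrects": 0, "hints": 0}
        if t["outcome"] == "incorrect":
            rows[key]["incorrects"] += 1
        elif t["outcome"] == "hint":
            rows[key]["hints"] += 1
    # Number each student's opportunities to practice each knowledge component.
    seen = defaultdict(int)
    table = []
    for row in rows.values():
        seen[(row["student"], row["kc"])] += 1
        row["opportunity"] = seen[(row["student"], row["kc"])]
        table.append(row)
    return table

def learning_curves(table):
    """Return {opportunity: (error rate, mean assistance score)}."""
    by_opp = defaultdict(list)
    for row in table:
        by_opp[row["opportunity"]].append(row)
    curves = {}
    for opp in sorted(by_opp):
        rows = by_opp[opp]
        error_rate = sum(r["first_attempt"] != "correct" for r in rows) / len(rows)
        assistance = sum(r["incorrects"] + r["hints"] for r in rows) / len(rows)
        curves[opp] = (error_rate, assistance)
    return curves

# Made-up log: two students, two steps each, one knowledge component.
log = [
    {"student": "s1", "step": "p1-a", "kc": "circle-area", "outcome": "incorrect"},
    {"student": "s1", "step": "p1-a", "kc": "circle-area", "outcome": "hint"},
    {"student": "s1", "step": "p1-a", "kc": "circle-area", "outcome": "correct"},
    {"student": "s1", "step": "p2-a", "kc": "circle-area", "outcome": "correct"},
    {"student": "s2", "step": "p1-a", "kc": "circle-area", "outcome": "correct"},
    {"student": "s2", "step": "p2-a", "kc": "circle-area", "outcome": "incorrect"},
    {"student": "s2", "step": "p2-a", "kc": "circle-area", "outcome": "correct"},
]
curves = learning_curves(step_rollup(log))
```

In this tiny sample the error rate stays flat across the two opportunities while the assistance score drops, which shows why it can be worth graphing both measures.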

Now try it yourself … Follow instructions in download from two slides ago …

END