
Discovery with Models Week 8 Video 1

Discovery with Models: The Big Idea  A model of a phenomenon is developed  Via  Prediction  Clustering  Knowledge Engineering  This model is then used as a component in another analysis

Can be used in Prediction  The created model’s predictions are used as predictor variables in predicting a new variable  E.g. Classification, Regression
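To make this concrete, here is a minimal sketch of the idea. All data, the detector rule, and the names (`gaming_detector`, the student records) are hypothetical, invented for illustration: an existing model's output becomes a predictor variable when fitting a new model.

```python
# Hypothetical sketch: a previously built detector's output is reused
# as a predictor variable in a new regression model.

def gaming_detector(hint_requests, fast_actions):
    """Pre-existing model: estimates the probability a student is
    gaming the system. A made-up rule, not a real published detector."""
    return min(1.0, 0.1 * hint_requests + 0.05 * fast_actions)

# Step 1: apply the existing model to each student's log data.
students = [
    {"hints": 8, "fast": 12, "exam": 55},
    {"hints": 1, "fast": 2,  "exam": 88},
    {"hints": 5, "fast": 9,  "exam": 63},
]
for s in students:
    s["p_gaming"] = gaming_detector(s["hints"], s["fast"])

# Step 2: use the detector's predictions as the predictor variable in a
# new model of exam score (simple least-squares line on one feature).
xs = [s["p_gaming"] for s in students]
ys = [s["exam"] for s in students]
n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
intercept = mean_y - slope * mean_x
# With these toy numbers the slope is negative: more inferred gaming,
# lower predicted exam score.
```

In a real analysis the regression step would typically use a statistics package and proper cross-validation; the point here is only the pipeline shape: model in, feature out, new model on top.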

Can be used in Relationship Mining  The relationships between the created model’s predictions and additional variables are studied  This can enable a researcher to study the relationship between a complex latent construct and a wide variety of observable constructs  E.g. Correlation mining, Association Rule Mining
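A minimal illustration of this with made-up numbers: correlate a model's inferred probabilities of a latent construct (here, boredom) with an observable variable (seconds spent off-task). Both data lists are hypothetical.

```python
# Hypothetical sketch of relationship mining over model outputs:
# correlate inferred probabilities of a latent construct with an
# observable variable.

import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Model-inferred boredom probabilities vs. observed off-task seconds.
p_bored  = [0.1, 0.4, 0.7, 0.2, 0.9]
off_task = [30, 80, 150, 45, 200]
r = pearson_r(p_bored, off_task)
# r is strongly positive for these toy numbers.
```

Because the model can be applied to every student automatically, this same correlation can be computed against dozens of observable variables at once, which is what makes relationship mining over model outputs practical.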

Models on top of Models  Another area of Discovery with Models is composing models out of other models  Examples:  Models of Gaming the System and Help-Seeking use Bayesian Knowledge-Tracing models as components (Baker et al., 2004, 2008a, 2008b; Aleven et al., 2004, 2006)  Models of Preparation for Future Learning use models of Gaming the System as components (Baker et al., 2011)  Models of Affect use models of Off-Task Behavior as components (Baker et al., 2012)  Models predicting standardized exam scores use models of Affect and Off-Task Behavior as components (Pardos et al., 2013)  Models predicting college attendance use models of Affect and Off-Task Behavior as components (San Pedro et al., 2013)


Models on top of Models  When I talk about this, people often worry about building a model on top of imperfect models  Will the error “pile up”?

Models on top of Models  May not be as big a risk as people worry about  If the final model successfully predicts the final construct, do we care if the model it uses internally is imperfect?  Systematic error at each step is taken into account at the next step!  That said, there are some validity issues that should be taken into account – more on that in a minute
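The point about systematic error can be shown with a toy example (all numbers invented): the component model below consistently overestimates skill by 0.2, yet a final model fit on those biased outputs still recovers the outcome exactly, because the constant bias is absorbed into the fitted intercept.

```python
# Toy demonstration: systematic error in a component model is taken
# into account when the next model is fit on its outputs.

true_skill = [0.2, 0.4, 0.6, 0.8]
base_pred  = [t + 0.2 for t in true_skill]   # component model, biased +0.2
outcome    = [10 * t for t in true_skill]    # construct we ultimately want

# Fit the final model (least-squares line) on the BIASED predictions.
n = len(base_pred)
mx = sum(base_pred) / n
my = sum(outcome) / n
slope = sum((x - mx) * (y - my) for x, y in zip(base_pred, outcome)) / \
        sum((x - mx) ** 2 for x in base_pred)
intercept = my - slope * mx

preds = [slope * x + intercept for x in base_pred]
errors = [abs(p - y) for p, y in zip(preds, outcome)]
# errors are all zero: the fit absorbed the constant bias.
```

Note the hedge this example deliberately builds in: only *systematic* error gets absorbed this way; random error, or bias that differs across subgroups, is exactly the validity issue the following slides warn about.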

“Increasingly Important…”  Baker & Yacef (2009) argued that Discovery with Models is a key emerging area of EDM  I think that’s still true, although it has been a bit slower to become prominent than I expected way back in 2009

DWM analyses we’ve previously talked about in class  San Pedro et al. (2013) – Week 1, Lecture 6  Aleven et al. (2004, 2006) – Week 3, Lecture 5  Beck et al. (2008) – Week 4, Lecture 5  Baker et al. (2010, 2012) – Week 4, Lecture 5  Fancsali (2013) – Week 5, Lecture 2  Baker et al. (in press) – Week 6, Lecture 2  Gowda et al. (2012) – Week 6, Lecture 2

DWM analysis we’ll talk about in next lecture  Dawson, S., Macfadyen, L., Lockyer, L., & Mazzochi- Jones, D. (2011). Using social network metrics to assess the effectiveness of broad-based admission practices. Australasian Journal of Educational Technology, 27(1),

Advantages of DWM

Advantages of DWM  Possible to Analyze Phenomena at Scale  Even for constructs that are  latent  expensive to label by hand

Advantages of DWM  Possible to Analyze Phenomena at Scale  At scales that are infeasible even for constructs that are quick & easy to label by hand  Scales easily from hundreds to millions of students  Entire years or (eventually) entire lifetimes  Predicting Nobel Prize winners from kindergarten drawings?

Future Nobel Prize Winner?

Advantages of DWM  Supports inspecting and reconsidering coding later  Leaves clear data trails  Can substitute imperfect model with a better model later and re-run  Promotes replicability, discussion, debate, and scientific progress

Disadvantages of DWM  Easy to Do Wrong!

Discovery with Models: Here There Be Monsters “Rar.”

Discovery with Models: Here There Be Monsters  It’s really easy to do something badly wrong for some types of “Discovery with Models” analyses  No warnings when you do

Think Validity  Validity is always important for model creation  Doubly important for discovery with models  Discovery with Models almost always involves applying a model to new data  How confident are you that your model will apply to the new data?

Challenges to Valid Application  Many challenges to valid application of a model within a discovery with models analysis

Challenges to Valid Application  Is model valid for population?  Is model valid for all tutor lessons? (or other differences)  Is model valid for setting of use? (classroom versus homework?)  Is the model valid in the first place? (especially important for knowledge engineered models)
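One lightweight sanity check along these lines, sketched with hypothetical data and group labels: before reusing a model in a downstream analysis, compare its accuracy across the subpopulations (or lessons, or settings) it will be applied to.

```python
# Hedged sketch of a basic validity check: per-subgroup accuracy of an
# existing model's predictions against held-out hand labels.

from collections import defaultdict

records = [
    # (group, model_prediction, true_label) -- invented data
    ("urban", 1, 1), ("urban", 0, 0), ("urban", 1, 1), ("urban", 1, 0),
    ("rural", 1, 0), ("rural", 0, 1), ("rural", 0, 0), ("rural", 1, 0),
]

hits = defaultdict(int)
totals = defaultdict(int)
for group, pred, label in records:
    totals[group] += 1
    hits[group] += (pred == label)

accuracy = {g: hits[g] / totals[g] for g in totals}
# A large accuracy gap between groups is a warning sign that the model
# may not be valid for part of the population it is being applied to.
gap = max(accuracy.values()) - min(accuracy.values())
```

This only addresses the first bullet above (population validity); checks for lesson, setting, and construct validity need their own held-out labels, but they follow the same compare-across-conditions pattern.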

Next lecture  Discovery with Models: Case Study