1
Data Mining in Education
Ryan S. Baker University of Pennsylvania
2
Over the last decades, the explosion in data has transformed field after field
3
Big Data in Climate Science
In climate science, methods for dealing with big data have led to better climate models that give us a few days’ warning about major storms.
4
Big Data in Physics In high-energy physics, it’s made it possible to analyze the data of 40 million proton collisions per second in looking for evidence of the once-elusive Higgs Boson.
5
Big Data in Biology And in biology, it’s enabled researchers to sequence at least one version of the more than 3 billion nucleotides in the human genome, and researchers are now conducting individualized gene research building off this work.
6
Big Data in E-Commerce
7
Big Data in Education The moment is arriving when we can obtain and utilize similar amounts of data in education
8
Interactive Learning Environments
As more learning takes place within educational software and online learning environments of various types, it becomes much easier to gather very rich data on individual students’ learning and engagement within specific subjects. For example, a student might use a science simulation like Inq-ITS to learn science content and inquiry skills. Or they might learn scientific inquiry skills and content within a virtual environment like EcoMUVE. They might learn math skills in an action game like Zombie Division: the student has a set of weapons with numbers associated with them, a 2 for a sword or a 5 for a gauntlet, and they can divide a skeleton if the weapon’s number divides the number on the skeleton’s chest. Or they might learn math in a conceptual story-based learning environment like Reasoning Mind, or by doing math problems in a workbook-like environment like ASSISTments. All of these environments generate rich data streams that have been used in EDM analyses. And this kind of software is becoming more widespread every day. Systems like the Cognitive Tutor, ASSISTments, and Reasoning Mind are used by tens or hundreds of thousands of students, one or two days a week.
9
MOOCs and online courses
10
Student Log Data
*000:22:297 READY .
*000:25:875 APPLY-ACTION WINDOW; LISP-TRANSLATOR::AUTHORINGTOOL-TRANSLATOR, CONTEXT; 3FACTOR-CROSS-XPL-4, SELECTIONS; (GROUP3_CLASS_UNDER_XPL), ACTION; UPDATECOMBOBOX, INPUT; "Two crossover events are very rare.",
*000:25:890 GOOD-PATH
*000:25:890 HISTORY P-1; (COMBOBOX-XPL-TRACE SIMBIOSYS),
*000:25:890 READY
*000:29:281 APPLY-ACTION SELECTIONS; (GROUP4_CLASS_UNDER_XPL), INPUT; "The largest group is parental since crossovers are uncommon.",
*000:29:281 GOOD-PATH
*000:29:281 HISTORY
*000:29:281 READY
*001:20:733 APPLY-ACTION SELECTIONS; (ORDER_GENES_OBS_XPL), INPUT; "The Q and q alleles have interchanged between the parental and SCO genotypes.",
*001:20:733 SWITCHED-TO-EDITOR
*001:20:748 NO-CONFLICT-SET
*001:20:748 READY
*001:32:498 APPLY-ACTION INPUT; "The Q and q alleles have interchanged between the parental and DCO genotypes.",
*001:32:498 GOOD-PATH
*001:32:498 HISTORY
*001:32:498 READY
*001:37:857 APPLY-ACTION SELECTIONS; (ORDER_GENES_UNDER_XPL), INPUT; "In the DCO group BOTH outer genes cross over so the interchanged gene is the middle one.",
*001:37:857 GOOD-PATH
For example, as a student uses one of these interactive learning environments, the student will make hundreds of meaningful actions each hour – pausing and thinking before making an incorrect answer, asking for help, rapidly changing settings on a simulation, running away from a skeleton. When the data is logged, these behaviors provide us with incredibly rich detail about learning and engagement, that we can analyze.
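As a rough illustration of how log entries like those on this slide could be turned into analyzable features, the sketch below extracts each entry's timestamp and event name and computes the pause before each student action. The reading of the timestamp as minutes:seconds:milliseconds and the regular expression are assumptions made for illustration, not a documented specification of the log format. The helper names are hypothetical.

```python
import re
from collections import Counter

# Assumed entry shape: "*MMM:SS:mmm EVENT-NAME ...". Interpreting the three
# numeric fields as minutes, seconds, and milliseconds is a guess.
ENTRY = re.compile(r"\*(\d{3}):(\d{2}):(\d{3})\s+([A-Z-]+)")

def parse_entries(log_text):
    """Return (milliseconds_since_start, event_name) for every entry found."""
    entries = []
    for minutes, seconds, millis, event in ENTRY.findall(log_text):
        t_ms = (int(minutes) * 60 + int(seconds)) * 1000 + int(millis)
        entries.append((t_ms, event))
    return entries

def pauses_before_actions(entries):
    """Milliseconds between each APPLY-ACTION and the entry just before it."""
    return [
        t_ms - entries[i - 1][0]
        for i, (t_ms, event) in enumerate(entries)
        if event == "APPLY-ACTION" and i > 0
    ]

# A shortened excerpt of the slide's log, with action details elided.
sample = (
    "*000:22:297 READY . "
    "*000:25:875 APPLY-ACTION WINDOW; ... "
    "*000:25:890 GOOD-PATH "
    "*000:25:890 READY "
    "*000:29:281 APPLY-ACTION SELECTIONS; ... "
    "*000:29:281 GOOD-PATH"
)
entries = parse_entries(sample)
event_counts = Counter(event for _, event in entries)
pauses = pauses_before_actions(entries)
print(event_counts)
print(pauses)
```

Pause lengths and event counts of this kind are examples of the behaviors (pausing, help-seeking, rapid changes) that the slide describes as raw material for analysis.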
11
Grade and Outcome Data
12
Student Engagement Data
13
Large-scale data
Log data from systems used by hundreds of thousands of students per year: ALEKS, Cognitive Tutor, Reasoning Mind
Whole-university-system data on course-taking and outcomes
14
Data in Education Used to Be
Dispersed Hard to Collect Small-Scale
15
Data Today
16
Data Today
17
PSLC DataShop (Koedinger et al, 2008, 2010)
>250,000 hours of students using educational software within LearnLabs and other settings >30 million student actions, responses & annotations
18
To quote the Society for Learning Analytics Research: “the measurement, collection, analysis and reporting of data about learners and their contexts, for purposes of understanding and optimizing learning and the environments in which it occurs.”
EDM and learning analytics methods have some similarities with traditional data mining methods, but as in the other areas where data mining methods have been common (bioinformatics, medical informatics, business analytics, data analysis methods in physics, and so on), the unique features of the domain of education lead to the development of unique methods.
19
Goals
Joint goal of exploring the “big data” now available on learners and learning
To promote:
New scientific discoveries & to advance science of learning
Better assessment of learners along multiple dimensions (social, cognitive, emotional, meta-cognitive, etc.; individual, group, institutional, etc.)
Better real-time support for learners
20
Many types of EDM/LA Method (Baker & Siemens, 2014; building off of Baker & Yacef, 2009)
Prediction Structure Discovery Relationship mining Distillation of data for human judgment Discovery with models
21
Prediction Develop a model which can infer a single aspect of the data (predicted variable) from some combination of other aspects of the data (predictor variables) Which students are bored? Which students will fail the class? Infer something that matters, so we can do something about it
22
Structure Discovery Find structure and patterns in the data that emerge “naturally” No specific target or predictor variable What problems map to the same skills? Are there groups of students who approach the same curriculum differently? Which students develop more social relationships in MOOCs?
23
Relationship Mining Discover relationships between variables in a data set with many variables Are there trajectories through a curriculum that are more or less effective? Which aspects of the design of educational software have implications for student engagement?
24
Many applications Dropout/success prediction
Automated detection of learning, engagement, emotion, strategy, for better individualization Better reporting for teachers and other stakeholders Basic discovery in education
25
Individualization requires
Determining something about the student Knowing what matters Doing the right thing about it
26
Determining something about the student
Knowing what matters Doing the right thing about it
27
Stuff We Can Infer: Complex Learning
Is the student learning to solve complex problems that require inquiry? (Sao Pedro et al., 2013; Baker & Clarke-Midura, 2013) Is the student developing rich conceptual understanding in domains such as physics? (Shute & Ventura, 2013; Rowe et al., 2015)
28
Stuff We Can Infer: Robust Learning
Will the student remember what they learned? (Jastrzembski et al., 2006; Pavlik et al., 2008; Wang & Beck, 2012) Is the student prepared for future learning? (Baker et al., 2011; Hershkovitz et al., 2013)
29
Stuff We Can Infer: Meta-Cognition
How confident is the student? (Litman et al., 2006; McQuiggan, Mott, & Lester, 2008; Arroyo et al., 2009) Is the student asking for help when they need it? (Aleven et al., 2004, 2006) Is the student persisting in the face of challenge? (Ventura et al., 2012)
30
Stuff We Can Infer: Disengaged Behaviors
Gaming the System (Baker et al., 2004, 2008, 2010; Walonoski & Heffernan, 2006; Beal, Qu, & Lee, 2007) Carelessness (San Pedro et al., 2011; Hershkovitz et al., 2011) Inexplicable Behavior (Rowe et al., 2009; Wixon et al., 2012)
31
Stuff We Can Infer: Affect (Emotion in Context)
Boredom Frustration Confusion Engaged Concentration/Flow Curiosity Excitement Situational Interest Joy/Delight (D’Mello et al., 2008; Mavrikis, 2008; Arroyo et al., 2009; Conati & Maclaren, 2009; Lee et al., 2011; Sabourin et al., 2011; Baker et al., 2012, 2014; Paquette et al., 2014, 2015; Pardos et al., 2014; Kai et al., 2015)
32
No physical sensors needed
Now feasible to infer these constructs solely from student interaction with the learning system
33
Example Automated detectors of student engagement and affect in ASSISTments (Pardos et al., 2013; Ocumpaugh et al., 2014)
34
Process
35
Field Observations of Student Engagement and Affect
Using BROMP observation protocol (Ocumpaugh et al., 2015)
>150 coders certified in USA, Philippines, India, UK
Synchronized to log files with Android app HART
36
Use data mining to find behaviors that co-occur with human observations
Distill features of interaction hypothesized to correlate to desired construct
37
Use data mining to find behaviors that co-occur with human observations
Distill features of interaction hypothesized to correlate to desired construct Best to use theoretical understanding and automated discovery together (Sao Pedro et al., 2012; Paquette et al., 2015)
38
Use data mining to find behaviors that co-occur with human observations
Try a small set of data mining/prediction/classification algorithms that fit different kinds of patterns: Decision Trees, Decision Rules, Step Regression, Naïve Bayes, K*
39
Test model generalizability on new students and new populations
In this case, students in rural, urban, and suburban schools in the Northeastern USA, diverse in terms of SES, race, and ethnicity
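One common way to check generalizability to new students is to split the data at the student level, so that no student appears in both the training and the test data. The sketch below is a minimal illustration of that idea; the function name, fold count, and toy student IDs are hypothetical, and it does not cover the separate question of testing on new populations.

```python
import random
from collections import defaultdict

def student_level_folds(student_ids, k=5, seed=0):
    """Assign whole students (not individual rows) to one of k folds.

    student_ids[i] is the student who produced data row i. Returns a list of
    k (train_rows, test_rows) pairs in which no student is on both sides.
    """
    rows_by_student = defaultdict(list)
    for row, sid in enumerate(student_ids):
        rows_by_student[sid].append(row)

    students = sorted(rows_by_student)
    random.Random(seed).shuffle(students)  # reproducible shuffle

    folds = []
    for f in range(k):
        held_out = students[f::k]  # every k-th student, starting at f
        held_out_set = set(held_out)
        test = [r for s in held_out for r in rows_by_student[s]]
        train = [r for s in students if s not in held_out_set
                 for r in rows_by_student[s]]
        folds.append((train, test))
    return folds

# Toy data: eight rows produced by five hypothetical students.
student_ids = ["s1", "s1", "s2", "s3", "s3", "s3", "s4", "s5"]
folds = student_level_folds(student_ids, k=2)
for train, test in folds:
    test_students = sorted({student_ids[r] for r in test})
    print(test_students, "held out;", len(train), "training rows")
```

Splitting by row instead would let a model see other actions from the same student during training, which tends to overstate how well it will work on students it has never seen.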
40
Model Goodness (Pardos et al., 2013)
Construct               Algorithm     A'      Kappa
Boredom                 JRip          0.632   0.229
Frustration             Naïve Bayes   0.681   0.301
Engaged Concentration   K*            0.678   0.358
Confusion               J48           0.736   0.274
Off-Task                REPTree       0.819   0.506
Gaming                  K*            0.802   0.370
The affect detectors’ predictive performance was evaluated using A' [28] and Cohen’s Kappa [18]. An A' value (which is approximately the same as the area under the ROC curve [28]) of 0.5 for a model indicates chance-level performance for correctly determining the presence or absence of an affective state in a clip, and a value of 1.0 indicates perfect performance. Cohen’s Kappa assesses the degree to which the model is better than chance at identifying the affective state in a clip. A Kappa of 0 indicates chance-level performance, while a Kappa of 1 indicates perfect performance. A Kappa of 0.45 is equivalent to a detector that is 45% better than chance at identifying affect. As discussed in [37], all of the affect and behavior detectors performed better than chance. Detector goodness was somewhat lower than had been previously seen for Cognitive Tutor Algebra [cf. 6], but better than had been seen in other published models inferring student affect in an intelligent tutoring system solely from log files (where average Kappa ranged from below zero to 0.19 when fully stringent validation was used) [19, 22, 44].
The best detector of engaged concentration involved the K* algorithm, achieving an A' of 0.678 and a Kappa of 0.358. The best boredom detector was found using the JRip algorithm, achieving an A' of 0.632 and a Kappa of 0.229. The best confusion detector used the J48 algorithm, having an A' of 0.736 and a Kappa of 0.274. The best detector of off-task behavior was found using the REP-Tree algorithm, with an A' value of 0.819 and a Kappa of 0.506. The best gaming detector involved the K* algorithm, having an A' value of 0.802 and a Kappa of 0.370. These levels of detector goodness indicate models that are clearly informative, though there is still considerable room for improvement.
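To make the two metrics concrete, here is a minimal sketch of how Cohen's Kappa and A' can be computed for a binary detector. It uses the standard definitions: Kappa compares observed agreement with the agreement expected by chance, and A' is computed as the probability that a randomly chosen positive clip receives a higher confidence than a randomly chosen negative clip. The toy labels and confidences are invented for illustration and are not data from the study.

```python
def cohens_kappa(labels, predictions):
    """Cohen's Kappa for two equal-length sequences of binary (0/1) codes.

    Assumes chance agreement is below 1 (both codes are not constant).
    """
    n = len(labels)
    observed = sum(1 for y, p in zip(labels, predictions) if y == p) / n
    p_label = sum(labels) / n
    p_pred = sum(predictions) / n
    # Agreement expected if labels and predictions were independent.
    expected = p_label * p_pred + (1 - p_label) * (1 - p_pred)
    return (observed - expected) / (1 - expected)

def a_prime(labels, confidences):
    """Chance that a positive clip outranks a negative clip (ties count half).

    Assumes at least one positive and one negative label.
    """
    pos = [c for y, c in zip(labels, confidences) if y == 1]
    neg = [c for y, c in zip(labels, confidences) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Invented example: 1 = affective state observed in the clip, 0 = not observed.
labels = [1, 1, 0, 0, 1, 0, 0, 0]
confidences = [0.9, 0.6, 0.7, 0.2, 0.8, 0.3, 0.4, 0.1]
predictions = [1 if c >= 0.5 else 0 for c in confidences]

kappa = cohens_kappa(labels, predictions)
aprime = a_prime(labels, confidences)
print(round(kappa, 3), round(aprime, 3))
```

Note that Kappa depends on the 0.5 threshold used to turn confidences into predictions, while A' depends only on how the confidences rank the clips.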
The detectors emerging from the data mining process had some systematic error in prediction due to the use of re-sampling in the training sets (models were validated on the original, non-resampled data), where the average confidence of the resultant models was systematically higher or lower than the proportion of the affective states in the original data set. This type of bias does not affect correlation to other variables since relative order of predictions is unaffected, but it can reduce model interpretability. To increase model interpretability, model confidences were rescaled to have the same mean as the original distribution, using linear interpolation. Rescaling the confidences this way does not impact model goodness, as it does not change the relative ordering of model assessments.
Application of Affect and Behavior Models to Broader Data Set
Once the detectors of student affect and behavior were developed, they were applied to the data set used in this paper. As mentioned, this data set was comprised of 2,107,108 actions in 494,150 problems completed by 3,747 students in three school districts. The result was a sequence of predictions of student affect and behavior across the history of each student’s use of the ASSISTment system.
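The notes do not spell out the exact rescaling procedure, so the following is only one possible sketch of an order-preserving rescale that matches a target mean. It linearly shrinks confidences toward 0 when the mean is too high and toward 1 when the mean is too low. The function name and the example values are hypothetical.

```python
def rescale_to_mean(confidences, target_mean):
    """Rescale confidences in [0, 1] so that their mean equals target_mean.

    Assumes 0 < target_mean < 1. The mapping is linear and increasing, so
    the relative ordering of the confidences is unchanged.
    """
    mean = sum(confidences) / len(confidences)
    if target_mean <= mean:
        # Mean too high: interpolate each value toward 0.
        factor = target_mean / mean
        return [c * factor for c in confidences]
    # Mean too low: interpolate each value toward 1.
    factor = (1 - target_mean) / (1 - mean)
    return [1 - (1 - c) * factor for c in confidences]

# Invented example: detector confidences average 0.6, but the construct is
# assumed to occur in only 30% of the original clips.
raw = [0.9, 0.6, 0.7, 0.2]
lowered = rescale_to_mean(raw, 0.3)
raised = rescale_to_mean(raw, 0.8)
print([round(c, 3) for c in lowered])
print([round(c, 3) for c in raised])
```

Because the mapping is increasing, rank-based measures such as A' and rank correlations with other variables are the same before and after rescaling, which is consistent with the claim in the notes.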
41
Result Models can make inference in real-time (20 second delay)
Models can be applied at scale to retrospective log files
42
Determining something about the student
Knowing what matters Doing the right thing about it
43
Example Take automated detectors of engagement, affect, and learning in ASSISTments Applied to several years of entire-year student data Thousands of students Millions of actions within the software
44
Engagement and Standardized Exam Score (Pardos et al., 2013, 2014)
Detectors applied to whole year of data for 1,393 students
Gaming the system (r = -0.36)
Boredom (r = -0.2)
Engaged concentration (r = +0.36)
We then evaluate the relationship of these measures to student outcomes, such as state exam scores. Applying the ASSISTments detectors to a whole year of data for around 1,400 students, we found end-of-year exam scores to be negatively correlated with boredom and gaming the system, and positively correlated with engaged concentration.
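The r values on this slide are correlation coefficients. As a minimal sketch of how such a value could be computed, the code below implements the Pearson correlation between a per-student detector summary and an exam score. Whether the study used exactly this statistic is an assumption, and the numbers are invented for illustration, not taken from the study.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

# Invented per-student values: proportion of actions flagged as gaming the
# system over the year, and an end-of-year exam score.
gaming_rate = [0.00, 0.10, 0.20, 0.30]
exam_score = [90, 80, 85, 60]

r = pearson_r(gaming_rate, exam_score)
print(round(r, 2))
```

A negative value, as in this toy example, means that students with more detected gaming tended to have lower scores; it does not by itself show that gaming caused the lower scores.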
45
College Attendance (San Pedro, Baker, Bowers, & Heffernan, 2013)
The detectors can predict whether a student will go to college or not, ~6 years later, 69% of the time for new students
46
College Attendance (San Pedro, Baker, Bowers, & Heffernan, 2013)
And the model can indicate what aspects of a student’s behavior are predictive of college attendance
Alex is less likely to go to college. Top predictive factors: he is getting confused and gaming the system…
Maria is less likely to go to college. Top predictive factors: she is getting bored and careless…
47
College Major (San Pedro et al., 2014, 2015)
And these same constructs can also predict what a student will major in when they get to college
48
Another Example Student interaction within a MOOC in data science can predict whether the student will eventually submit a scientific paper in the field (Wang et al., under review) Forum lurkers are more likely to submit a scientific paper than forum posters!
49
Determining something about the student
Knowing what matters Doing the right thing about it
50
What do we do? When we know that a student is bored… or gaming the system… or has shallow learning… or etc. etc. etc.
51
Huge Space of Potential Interventions
52
Huge Space of Potential Interventions
Automated interventions delivered by animated agents
53
Huge Space of Potential Interventions
Stealth interventions that change learner experience in subtle ways
54
Huge Space of Potential Interventions
Reports to instructors, guidance counselors, parents, students themselves…
55
Examples of Use (at scale)
ALEKS – Models of prerequisite structure and knowledge used to select material for students High school and college math and science
56
Examples of Use (at scale)
Reasoning Mind – automated detectors of engagement used to provide analytics to regional coordinators about teacher effectiveness Elementary school math
57
Examples of Use (not at scale)
Project LISTEN – data mining used to determine which strategies work for which students, and to help select which stories to give students to read Elementary school reading
58
Examples of Use (at scale)
ASSISTments – automated detectors of engagement and knowledge used to determine how to re-design learning experiences to increase effectiveness Middle school math
59
Examples of Use (not yet at scale)
ASSISTments – automated detectors of engagement and knowledge used to make predictions about student outcomes and provide reports to school guidance counselors Middle school math
60
Examples of Use (at scale)
Course Signals, Zogotech, Soomo – At-risk prediction models used to provide actionable information to instructors and academic advisors College
61
Huge Space of Potential Interventions
Still an open area for the field And an area of considerable ongoing research for my lab
62
The Big Idea Thanks to the big data now becoming available on student learning
63
The Big Idea Thanks to the big data now becoming available on student learning And a combination of data mining and knowledge engineering
64
The Big Idea Thanks to the big data now becoming available on student learning And a combination of data mining and knowledge engineering We can make inferences about students in real-time
65
The Big Idea Thanks to the big data now becoming available on student learning And a combination of data mining and knowledge engineering We can make inferences about students in real-time That are predictive of long-term outcomes
66
Eventual Goal Track a student’s engagement now
67
Eventual Goal Track a student’s engagement now
Predict the longer-term impact
68
Eventual Goal Track a student’s engagement now
Predict the longer-term impact Intervene to help re-engage students and support their learning
69
Eventual Goal Track a student’s engagement now
Predict the longer-term impact Intervene to help re-engage students and support their learning Helping to create an educational system more sensitive to individual learners’ needs
70
Get Involved Lots of opportunities to learn more about this emerging field, right here in the Boston area
71
Get Involved
MHE’s data science group right here in Boston has some amazing leaders in EDM
Excellent groups at MIT, HarvardX as well
Learning Analytics Summer Institute Local Meetings often held in Boston, though none scheduled for this year
ACM 2017 will be held in Cambridge
72
Get Involved Edm-announce mailing list
73
See our free online MOOT “Big Data and Education”
Learn More twitter.com/BakerEDMLab Baker EDM Lab weibo.com/u/ Baker EDM Lab See our free online MOOT “Big Data and Education” Offered as EdX MOOC, next iteration in a few months All lab publications available online – Google “Ryan Baker”