Download presentation
Presentation is loading. Please wait.
Published byVictor Simon Modified over 9 years ago
1
Ground Truth for Behavior Detection Week 3 Video 1
2
Welcome to Week 3 Over the last two weeks, we’ve discussed prediction models This week, we focus on a type of prediction models called behavior detectors
3
Behavior Detectors Automated models that can infer from log files whether a student is behaving in a certain way We discussed examples of this off-task behavior and gaming detectors In the San Pedro et al. case study in week 1
4
Behaviors people have detected
5
Disengaged Behaviors Gaming the System (Baker et al., 2004, 2008, 2010; Cheng & Vassileva, 2005; Walonoski & Heffernan, 2006; Beal, Qu, & Lee, 2007) Off-Task Behavior (Baker, 2007; Cetintas et al., 2010) Carelessness (San Pedro et al., 2011; Hershkovitz et al., 2011) WTF Behavior (Rowe et al., 2009; Wixon et al., UMAP2012) 5
6
Meta-Cognitive Behaviors Help Avoidance (Aleven et al., 2004, 2006) Unscaffolded Self-Explanation (Shih et al., 2008; Baker, Gowda, & Corbett, 2011) Exploration Behaviors (Amershi & Conati, 2009) 6
7
If you’re not interested in behavior detectors… Feel free to take the week off and rejoin us in a week Although there may still be stuff that’s interesting to you this week
8
Ground Truth The first big issue with developing behavior detectors is… Where do you get the prediction labels?
9
Behavior Labels are Noisy No perfect way to get indicators of student behavior It’s not truth It’s ground truth (Truthiness?)
10
Behavior Labels are Noisy Another way to think of it In some areas, there are “gold-standard” measures As good as gold With behavior detection, we have to work with “bronze-standard” measures Gold’s less expensive cousin
11
Is this a problem? Not really It does limit how good we can realistically expect our models to be If your training labels have inter-rater agreement of Kappa = 0.62 You probably should not expect (or want) your detector to have Kappa = 0.75
12
Ultimately “Anything worth doing is worth doing badly.” – Herb Simon Picture taken from CMU tribute page
13
Sources of Ground Truth for Behavior Self-report Field observations Text replays Video coding
14
Self-report Fairly common for constructs like affect and self- efficacy Not as common for labeling behavior
15
Are you gaming the system right now? Yes No
16
Was recommended to me by an IRB compliance officer in 2003 Are you gaming the system right now? Yes No
17
Field Observations One or more observers watch students and take systematic notes on student behavior Takes some training to do right (Ocumpaugh et al., 2012) http://www.columbia.edu/~rsb2162/bromp.html Free Android App developed by my group (Baker et al., 2011) http://www.columbia.edu/~rsb2162/HART8.6.zip
18
Text replays Pretty-prints of student interaction behavior from the logs
19
Examples
23
Major Advantages Blazing fast to conduct 8 to 40 seconds per observation
24
Notes Decent inter-rater reliability is possible Agree with other measures of constructs Can be used to train behavior detectors
25
Major Limitations Limited range of constructs you can code Gaming the System – yes Collaboration in online chat – yes (Prata et al, 2008) Frustration, Boredom – sometimes Off-Task Behavior outside of software – no Collaborative Behavior outside of software – no
26
Major Limitations Lower precision (because lower bandwidth of observation)
27
Video Coding Can be videos of live behavior in classrooms or screen replay videos Slowest method Replicable and precise Some challenges in camera positioning If you don’t get this right, it can be difficult to code your data
28
Inter-Rater Reliability Good to check for inter-rater reliability when using expert coding Researchers typically try to shoot for Kappa = 0.6 or higher for the expert coding But ignore magic numbers… Any real signal is something you can try to re-capture I’d rather have 1,000 data points with Kappa = 0.5 Than 100 data points with Kappa = 0.7
29
Once you have ground truth… You can build your detector! Once you’ve synchronized your ground truth to the log data…
30
Next Lecture Data Synchronization and Grain-Sizes
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.