1
Big Data, Education, and Society
March 28, 2018
2
Assignment 2
Any questions on Assignment 2?
Remember, it is due tomorrow – post to the forum on the same thread you used last week
3
Validating model generalizability
Will your predictive/inferential model work in the situation you want to use it in?
4
Different than statistical significance
You can have a hugely statistically significant result
But if it’s drawn from a different population or context than the one you want to apply it in, it may be inapplicable
5
Over-fitting
Your model fits to noise rather than signal
Your model fits to features of your current data set rather than the broader set of contexts where you want to apply it
6
Training-test split
Building your model on some data, testing on other data
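A minimal sketch of a training-test split, using only the Python standard library. The function name, the 25% test fraction, and the stand-in data are assumptions made for illustration.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows, then return (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    # The first n_test shuffled rows are held out; the rest are for model building.
    return shuffled[n_test:], shuffled[:n_test]

rows = list(range(20))  # stand-in for 20 data points
train, test = train_test_split(rows)
print(len(train), "training rows,", len(test), "test rows")
```

The model is built only on train; test is touched once, to evaluate it.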
7
Cross-validation
Repeatedly building your model on some data, testing on other data
4-fold
A, B, C -> D
A, B, D -> C
A, C, D -> B
B, C, D -> A
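The 4-fold rotation can be sketched as a small generator. The function name is hypothetical, and the fold labels stand in for actual subsets of the data.

```python
def k_fold_splits(folds):
    """Yield (training_folds, test_fold) pairs, holding out each fold once."""
    for held_out in folds:
        training = [fold for fold in folds if fold != held_out]
        yield training, held_out

splits = list(k_fold_splits(["A", "B", "C", "D"]))
for training, held_out in splits:
    print(", ".join(training), "->", held_out)
```

This produces the same four splits as listed on the slide, in a different order.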
8
Common mistake 8 years ago
Multiple data points for the same student
Divide those data points into different folds
Same student is in both training and test set
Why is this a problem?
9
Common mistake 8 years ago
Multiple data points for the same student
Divide those data points into different folds
Same student is in both training and test set
Why is this a problem?
Usually addressed now through student-level cross-validation
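If one student's data points land in both training and test folds, the test no longer checks whether the model generalizes to new students. A minimal sketch of student-level fold assignment follows; the row layout (student_id, features, label) and the function name are assumptions for illustration.

```python
def student_level_folds(rows, n_folds):
    """Put every row belonging to a given student into the same fold.

    rows: list of (student_id, features, label) tuples (hypothetical layout).
    Returns a list of n_folds lists of rows.
    """
    students = sorted({student_id for student_id, _, _ in rows})
    # Folds are assigned per student, not per data point.
    fold_of = {student: i % n_folds for i, student in enumerate(students)}
    folds = [[] for _ in range(n_folds)]
    for row in rows:
        folds[fold_of[row[0]]].append(row)
    return folds

# 24 made-up data points spread over 6 students.
rows = [(f"s{i % 6}", [i], i % 2) for i in range(24)]
folds = student_level_folds(rows, n_folds=3)
```

Cross-validation then proceeds as usual over these folds, so no student contributes to both sides of any split.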
10
Cross-group validation (Ocumpaugh et al., 2014)
Train on N groups, test on 1 group
Example: Train on Urban and Suburban students, test on Rural students
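A sketch of cross-group splitting, holding out each group in turn. The function name and the placeholder rows are hypothetical; this illustrates the idea and is not the procedure from Ocumpaugh et al. (2014).

```python
def cross_group_splits(rows_by_group):
    """Yield (held_out_group, train_rows, test_rows), holding out each group once."""
    for held_out, test_rows in rows_by_group.items():
        train_rows = [
            row
            for group, rows in rows_by_group.items()
            if group != held_out
            for row in rows
        ]
        yield held_out, train_rows, test_rows

# Placeholder rows tagged with their group; real rows would hold features and labels.
rows_by_group = {
    "Urban": [("Urban", i) for i in range(4)],
    "Suburban": [("Suburban", i) for i in range(5)],
    "Rural": [("Rural", i) for i in range(3)],
}

splits = {
    held_out: (train_rows, test_rows)
    for held_out, train_rows, test_rows in cross_group_splits(rows_by_group)
}
train, test = splits["Rural"]  # train on Urban and Suburban, test on Rural
print(len(train), "training rows,", len(test), "Rural test rows")
```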
11
All-group validation
Train on all groups, test on held-out set from all groups
Check performance on each group
Example: Train on Urban, Suburban, Rural
Test on new Urban, Test on new Suburban, Test on new Rural
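The per-group check can be sketched as below. Accuracy is used only as an example metric, the function name is hypothetical, and the labels and predictions are made-up toy values chosen to show how an overall score can hide one group.

```python
from collections import defaultdict

def accuracy_by_group(groups, y_true, y_pred):
    """Return {group: accuracy} over held-out predictions."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, pred in zip(groups, y_true, y_pred):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}

# Toy held-out set: made-up values, not real results.
groups = ["Urban", "Urban", "Suburban", "Suburban", "Rural", "Rural"]
y_true = [1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1]

per_group = accuracy_by_group(groups, y_true, y_pred)
overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(per_group, overall)
```

In this toy case the overall accuracy is 4 out of 6, while every Rural prediction is wrong.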
12
Why…
Cross-group instead of all-group?
All-group instead of cross-group?
13
What are some groups…
It might make sense to split by during validation?
14
Of course…
Testing across all these groups requires having enough data for all of them!
Or indeed, any data at all
15
The perniciousness of convenience samples
Much easier to collect data for suburban middle-class students than for other groups in the USA
16
Questions? Comments?
17
Contextual cross-validation
Easy example is lessons in tutors or levels in games (Baker et al., 2008; Karumbaiah et al., under review)
18
Contextual cross-validation
Easy example is lessons in tutors or levels in games (Baker et al., 2008; Karumbaiah et al., under review)
What are some other examples of contexts to validate across?
19
Far generalizability
Generalizability across learning systems – Paquette’s work last week
20
Important Consideration
Where do you want to be able to use your model?
New students?
New schools?
New populations?
New software content?
21
Common Practice
Different model for every school or university
Test for overall performance
No attention to how well it captures performance on subgroups within school or university
Is this good enough?
If not, what is a rational and affordable alternative?
22
Politics can get in the way
Limitations on demographic data
Limitations on IEP data
Limitations on data, period
23
Questions? Comments?
24
Debate
Imagine that we are developing a model to predict whether a teenager will engage in school violence
25
Debate
Is it best to have a model that is:
Moderately accurate for all students
Very accurate for most groups of students, but very inaccurate for one group (10%) of students
Very accurate for 100% of students, but has different models for different students (i.e. the same behavior is punished for some students but not others)
26
Questions? Comments?
27
Upcoming office hours
April 4, 9:30am-10:30am, or by appointment