Published by Yanti Sudirman. Modified over 6 years ago.
1
Teaching Analytics with Case Studies: Finding Love in a Classification Tree
Ruth Hummel, PhD JMP Academic Ambassador
2
Stats about Online Dating
3
Dating Apps, OKCupid, and Ethics
large-public-dataset-of-dating-site-users
4
Experimental Interventions
Other Demographics Response
5
Modeling for Change
Explanatory: Explain the relationship between variables. Interpret coefficients, e.g.,
  “For women, after age 21 each additional year of age corresponded to a 3.4% decrease in the rate of finding a romantic partner during that year.”
  “Seeing suggested matches with the same education level corresponded to a 4.1% increase in the rate of finding a romantic partner during that year.”
Predictive: Find a model that performs well on predicting similar data. E.g., score online dating users according to their demographic, personality, and preference information in order to predict the likelihood of their success in finding a romantic partner.
Prescriptive: Change something in order to achieve the gains your model suggests. Since the Education Level Matching resulted in higher success, implement this for all users (or for users in certain target groups where the success rate for this is especially high).
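To make the explanatory reading concrete, here is a minimal sketch of how a logistic-regression coefficient can be turned into a "percent change per unit" statement. It assumes the quoted percentages are changes in odds, which the slide does not state (it says "rate"), and the coefficient value below is hypothetical, chosen only for illustration and not taken from the talk.

```python
import math

def percent_change_in_odds(coefficient):
    """Convert a logistic-regression coefficient (change in log-odds per
    one-unit increase in the predictor) into a percent change in odds."""
    return (math.exp(coefficient) - 1) * 100

# Hypothetical coefficient for one additional year of age (an assumption,
# not a value reported in the presentation).
age_coefficient = -0.0346

change = percent_change_in_odds(age_coefficient)
print(f"{change:.1f}% change in odds per additional year of age")
```

A negative coefficient gives a percent decrease and a positive one gives a percent increase; a coefficient of zero means no change in odds.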
6
Explore the Data…
9
Analysis Plan
Univariate Exploratory Data Analysis (and Data Cleaning, if needed)
Bivariate Exploratory Data Analysis
Explanatory or Predictive Models? Let’s look at:
  Multiple Logistic Regression, main effects and interactions
  Classification Tree
…If we had a continuous response, we would look at:
  Multiple Linear Regression, main effects and interactions
  Regression Tree
10
*Importance of Holding Out Validation Data
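One common way to hold out validation data is a random split made before any model is fit. The sketch below is an illustration under assumptions: the 30% validation fraction, the seed, and the function name are choices made here, not settings from the talk.

```python
import random

def split_train_validation(rows, validation_fraction=0.3, seed=42):
    """Randomly split rows into (training, validation) lists.

    The validation rows are set aside and used only to check how well a
    model fit on the training rows predicts data it has not seen.
    """
    if not 0 < validation_fraction < 1:
        raise ValueError("validation_fraction must be between 0 and 1")
    shuffled = list(rows)                  # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    n_validation = round(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]

# Stand-in row identifiers; a real analysis would split the user records.
rows = list(range(100))
train, validation = split_train_validation(rows)
print(len(train), len(validation))
```

Fixing the seed keeps the split identical across runs, so different models can be compared on the same held-out rows.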
11
Logistic Regression
15
Interaction: Among users shown matches with their own education level, highly educated users have a LOWER chance of finding love than medium-educated users. This pattern does not hold for the random education intervention group.
17
Classification Tree
19
Classification Tree (Partition)
20
First Split
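As a rough illustration of what choosing a first split involves, the sketch below searches one numeric predictor for the threshold that gives the lowest weighted Gini impurity. Gini is a common textbook criterion and is an assumption here; the software used in the talk may rank splits with a different statistic. The ages and outcomes are toy values invented for the example.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means the node is pure)."""
    if not labels:
        return 0.0
    p_yes = sum(labels) / len(labels)
    return 2 * p_yes * (1 - p_yes)

def best_split(values, labels):
    """Return (threshold, weighted_impurity) for the best binary split,
    or None if all predictor values are identical."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between two equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        if best is None or weighted < best[1]:
            best = (threshold, weighted)
    return best

# Toy data (invented): age and whether the user found a partner (1 = yes).
ages = [22, 25, 27, 31, 35, 40, 44, 52]
found_partner = [1, 1, 1, 1, 0, 1, 0, 0]

threshold, impurity = best_split(ages, found_partner)
print(f"split at age < {threshold}, weighted Gini = {impurity}")
```

A full tree repeats this search over every predictor, takes the best one, and then applies the same procedure inside each resulting branch.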
21
Second Split
22
Third Split
23
Lots more splits…
24
After 14 Splits
28
Partition – Profiler
31
Change Classification Cutoff of the Probability?
33
In the “Probability of Yes” distribution, what probability corresponds to the 80th percentile (i.e., what probability cutoff would let us classify the most likely 20% as “Yes” – even if they aren’t actually very likely to be “Yes”)?
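The percentile question above can be sketched in a few lines. This example uses the nearest-rank percentile definition, which is one of several conventions, and the predicted probabilities are invented for illustration; none of them reaches 0.5, so a default 0.5 cutoff would classify nobody as "Yes".

```python
import math

def percentile_cutoff(probabilities, percentile):
    """Nearest-rank percentile: the smallest value with at least
    `percentile` percent of the data at or below it."""
    ordered = sorted(probabilities)
    rank = math.ceil(percentile * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]

# Invented predicted probabilities of "Yes" for ten users.
probs = [0.05, 0.08, 0.10, 0.12, 0.15, 0.18, 0.22, 0.27, 0.31, 0.40]

cutoff = percentile_cutoff(probs, 80)
# Classify as "Yes" only the users strictly above the 80th percentile.
predicted_yes = [p for p in probs if p > cutoff]
print(cutoff, predicted_yes)
```

With ten users, the two highest probabilities land above the cutoff, which is the "most likely 20%" the question asks for.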
35
True Positives False Positives
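True and false positives depend on the cutoff applied to the predicted probability. The following sketch counts them for two cutoffs; the probabilities, outcomes, and cutoff values are invented for illustration and do not come from the presentation's data.

```python
def confusion_counts(probabilities, actual, cutoff):
    """Count TP, FP, TN, FN when 'Yes' is predicted for probability > cutoff."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for p, y in zip(probabilities, actual):
        predicted_yes = p > cutoff
        if predicted_yes and y == 1:
            counts["TP"] += 1   # predicted Yes, actually Yes
        elif predicted_yes and y == 0:
            counts["FP"] += 1   # predicted Yes, actually No
        elif y == 0:
            counts["TN"] += 1   # predicted No, actually No
        else:
            counts["FN"] += 1   # predicted No, actually Yes
    return counts

# Invented predicted probabilities and actual outcomes (1 = found a partner).
probs = [0.05, 0.10, 0.15, 0.22, 0.27, 0.31, 0.35, 0.40]
actual = [0, 0, 0, 1, 0, 0, 1, 1]

default_counts = confusion_counts(probs, actual, cutoff=0.5)
lowered_counts = confusion_counts(probs, actual, cutoff=0.27)
print(default_counts)
print(lowered_counts)
```

Lowering the cutoff trades missed "Yes" cases for false alarms: more true positives are caught, at the cost of some false positives.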
36
Back to Logistic Regression to change the classification cutoff…
37
False Positives True Positives
38
Bootstrap Forest
41
Boosted Tree
44
https://gizmodo.com/the-future-of-online-dating-is-unsexy-and-brutally-effe-1819781116