Topic: Problem with Sampling
Reading: Baker, Stephen (2009). They’ve Got Your Number: Data, Digits and Destiny - How the Numerati Are Changing Our Lives. (Chap 7, Lover)
Group 5: Shu Min, Yan Ling (Presenter), Yi Mou
Outline
- Online dating story
- Problem with sampling: small sample, not randomized, credibility of results from questionnaire
- Cell phone experiment
- Problem with sampling: voluntary sample, not randomized
- Conclusion
Online dating story
- Aim: reveal the algorithms of love through an online dating service, Chemistry.com
- The author and his wife each describe the other's characteristics as those of their ideal partner
Questionnaire
- Personal details
- Ideal partner one is willing to consider
- Personality
- Essay to describe oneself
- Length of index and ring fingers
Involuntary data not available to user
- Record the clicks and measure which types of potential dates appear to interest the user the most
- Study the behavior, observe trends, and recommend people who are similar to the user
Hormones <-> Personality
- Science-based matching algorithm designed by Helen Fisher
- Fisher’s theory is that four different hormones mold our personalities
- Director (Testosterone), Negotiator (Estrogen), Builder (Serotonin) and Explorer (Dopamine)
Outcome
- Compatibility is matched on various attributes: personality groupings, culture, educational level, etc.
- The algorithms have completed their work: the author and his wife are a match for each other
Problem with sampling
- Small sample
- Not randomized
- Credibility of results from questionnaire
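The small-sample problem can be illustrated with a short simulation. This is only a sketch under stated assumptions: the population of "compatibility scores" is made up, and `sample_mean_spread` is a hypothetical helper, not anything from the reading. It shows that means computed from very small random samples vary far more from sample to sample than means computed from larger ones.

```python
import random
import statistics

def sample_mean_spread(population, sample_size, trials=500, seed=0):
    """Standard deviation of the sample mean over repeated simple random samples."""
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        sample = rng.sample(population, sample_size)  # draw without replacement
        means.append(sum(sample) / sample_size)
    return statistics.stdev(means)

# Hypothetical population: 10,000 made-up scores between 0 and 100.
rng = random.Random(42)
population = [rng.uniform(0, 100) for _ in range(10_000)]

# A sample of 2 (e.g. one couple) versus a sample of 200.
spread_small = sample_mean_spread(population, sample_size=2)
spread_large = sample_mean_spread(population, sample_size=200)
print(f"spread with n=2:   {spread_small:.2f}")
print(f"spread with n=200: {spread_large:.2f}")
```

The larger sample gives a much tighter estimate, which is why a result based on a single couple says little about how well the matching works in general.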
Cell phone experiment
- Objective: observe behavior patterns across different profiles of people
- Experimenter: Nathan Eagle, a PhD student at MIT’s Media Lab
- Methodology: distributed cell phones to 100 graduate students (25 business, 75 engineering); the phones were equipped with software to record the movements and interactions of the students
- Duration: 1 year
Outcome
- Nathan Eagle observed that the two groups of subjects had different lifestyle patterns
- He built models of the individuals
- Each person’s life was very predictable
Problem with sampling
- Not random
- Voluntary sample
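The effect of a voluntary sample can be sketched in the same way. Everything here is an assumption made for illustration: the population of daily phone-usage hours is invented, and the rule that heavier users are more likely to volunteer is a hypothetical self-selection mechanism, not a finding from the experiment. The sketch compares a simple random sample of 100 with a sample of 100 drawn only from those who volunteer.

```python
import random

rng = random.Random(7)

# Hypothetical population: made-up daily phone usage in hours (0 to 10).
population = [rng.uniform(0, 10) for _ in range(20_000)]

def volunteers(people, rng):
    """Assumed self-selection: the chance of volunteering rises with usage."""
    return [hours for hours in people if rng.random() < hours / 10]

def mean(values):
    return sum(values) / len(values)

true_mean = mean(population)

# Simple random sample: every person is equally likely to be chosen.
random_sample = rng.sample(population, 100)

# Voluntary sample: chosen only from people who opted in.
voluntary_sample = rng.sample(volunteers(population, rng), 100)

print(f"population mean:       {true_mean:.2f}")
print(f"random sample mean:    {mean(random_sample):.2f}")
print(f"voluntary sample mean: {mean(voluntary_sample):.2f}")
```

Under these assumptions the voluntary sample overstates average usage, because the people who opt in differ systematically from the population, while the random sample stays close to the true mean.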
Conclusion
- Representative sample
- Underlying distribution of the population
- Randomization
- Size of sample
- Type of sampling methods
Thank you!