Enhancing Policy Decision Making with Large-Scale Digital Traces Vanessa Frias-Martinez University of Maryland NFAIS, February 2014
5.9 billion | 87% | 3.2 billion unique users | 45% | mobile devices >> humans
Have you ever heard of DATAFICATION? 1. Yes 2. No
Mobile Digital Footprints… …for Social Good?
Research Goal To extract human behavioral information from mobile digital traces in order to assist decision makers in organizations working for social development
DECISION MAKERS (Energy, Health, Education, Safety, Transportation): Interviews, surveys: information to assist on policy decisions. RESEARCH: TOOLS (Data Mining, Machine Learning, Statistical) applied to MOBILE DIGITAL TRACES yield BEHAVIORAL INSIGHTS, to enhance or complement that information in an affordable manner.
OUTLINE
Outline – Cell Phone Data – Projects with Social Impact: CenCell, AlertImpact
Cell Phone Data
Call Detail Records (CDR): anonymized; granularity 1-4 km². CDR fields: Caller | Callee | Date | Duration | Geolocation
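As an illustration of the record layout above, here is a minimal sketch of a CDR as a Python data structure. The pipe-separated input, the field names, the duration unit (seconds) and the use of a cell tower ID as the geolocation are all assumptions made for this example; the slide only lists the five fields.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CallDetailRecord:
    caller: str       # anonymized caller ID
    callee: str       # anonymized callee ID
    date: datetime    # call start time
    duration_s: int   # call duration; seconds is an assumed unit
    tower_id: str     # geolocation, assumed here to be the serving tower ID


def parse_cdr(line: str) -> CallDetailRecord:
    """Parse one line laid out as Caller | Callee | Date | Duration | Geolocation."""
    caller, callee, date, duration, tower = (f.strip() for f in line.split("|"))
    return CallDetailRecord(
        caller, callee, datetime.fromisoformat(date), int(duration), tower
    )


# Made-up record, for illustration only.
record = parse_cdr("a91f | 07c2 | 2009-04-17 09:30:00 | 125 | BTS-0042")
```

Keeping the record immutable (frozen=True) fits data that is only read and aggregated, never edited.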
Modeling Human Behavior Over 270 variables
Cost-Effective Census Maps From Cell Phone Data CenCell
Motivation: Census Maps [Map legend: A/B, C+, C, D, E]
National Statistical Institutes
Important Data Comes at a Price: Expensive; Low-resource regions
Can the variables extracted from Call Detail Records be used as predictors of regional socioeconomic levels (SELs)?
Cost-effective Maps. Today: NSI carries out surveys. With Cell Phone Data (REDUCE COSTS): NSI surveys a subset of regions; Forecasting Models; Predict the Present
Methodology
Classifying SELs - Training: Consumption, Social and Mobility variables (aggregated per 1-4 km² area) + SEL → CLASSIFIER
Classifying SELs - Testing: Consumption, Social and Mobility variables (aggregated) → CLASSIFIER → SEL
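The training and testing steps can be sketched in a few lines. The example below uses a nearest-centroid classifier as a deliberately simple stand-in; it is not the Random Forests or EM clustering reported in the evaluation. The three-number feature vectors (consumption, social, mobility) and the function names are hypothetical.

```python
from collections import defaultdict
from math import dist
from statistics import fmean


def train_centroids(features, labels):
    """Training: average the feature vectors of the regions sharing each SEL."""
    grouped = defaultdict(list)
    for vector, label in zip(features, labels):
        grouped[label].append(vector)
    return {
        label: tuple(fmean(column) for column in zip(*vectors))
        for label, vectors in grouped.items()
    }


def predict_sel(centroids, vector):
    """Testing: assign the SEL whose centroid is closest to the region's vector."""
    return min(centroids, key=lambda label: dist(centroids[label], vector))


# Made-up vectors (consumption, social, mobility), one per surveyed region.
train_x = [(0.9, 0.8, 0.7), (0.8, 0.9, 0.8), (0.2, 0.3, 0.1), (0.1, 0.2, 0.2)]
train_y = ["A", "A", "C", "C"]

centroids = train_centroids(train_x, train_y)
predicted = predict_sel(centroids, (0.85, 0.75, 0.9))
```

In the setting the slides describe, the training regions would be the subset surveyed by the NSI, and the prediction step would cover the regions that were not surveyed.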
Experimental Evaluation
Datasets: data for a city in Latin America
NSI – 1200 regions (GUs) – SEL values
Call Detail Records – 6 months, 500K customers – City has 920 coverage areas – 279 variables per coverage area
Evaluation Results: Random Forests, 86%, 3 SELs (A, B, C); EM Clustering, 68%, 6 SELs (A, B, …, F)
Human Behavior and Census Variables
Large Scale Quantitative Analysis Consumption Social Mobility
Insights Consumption Variables Mobility Variables
AlertImpact Understanding the Impact of Health Alerts using Cell Phone Data
H1N1 Mexico Timeline: Preflu; Medical Alert, 17th April; Closing Schools, 27th April; Suspension, 1st May; Reopen, 6th May
Can we measure the impact that government alerts had on the mobility of the population?
Evaluation
Call Records from 1st Jan till 31st May 2009 – Compute mobility as the number of different BTSs (base transceiver stations) visited
Stages – Medical Alert - Stage 1 (17th-27th April) – Closing Schools - Stage 2 (28th April-1st May) – Suspension of Essential Activities - Stage 3 (1st May-6th May)
Baselines – same periods, different year (2008)
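The mobility measure and the baseline comparison can be sketched as follows. This is a minimal illustration under assumptions: events are (user, day, BTS) tuples, mobility is averaged per user, and the helper names are hypothetical. The sample data is made up.

```python
from collections import defaultdict
from datetime import date
from statistics import fmean


def mean_towers_visited(events, start, end):
    """Average number of distinct BTSs per user for events within [start, end]."""
    towers = defaultdict(set)
    for user, day, bts in events:
        if start <= day <= end:
            towers[user].add(bts)
    return fmean(len(seen) for seen in towers.values()) if towers else 0.0


def relative_change(baseline, observed):
    """Fractional change versus the baseline (negative means reduced mobility)."""
    return (observed - baseline) / baseline


# Made-up events: two users, four towers each in 2008, three each in 2009.
events_2008 = [(u, date(2008, 4, 18), b)
               for u in ("u1", "u2") for b in ("bts1", "bts2", "bts3", "bts4")]
events_2009 = [(u, date(2009, 4, 18), b)
               for u in ("u1", "u2") for b in ("bts1", "bts2", "bts3")]

# Stage 1 (17th-27th April) compared with the same period in 2008.
baseline = mean_towers_visited(events_2008, date(2008, 4, 17), date(2008, 4, 27))
stage1 = mean_towers_visited(events_2009, date(2009, 4, 17), date(2009, 4, 27))
change = relative_change(baseline, stage1)
```

With this toy data the change is -0.25, that is, a 25% reduction relative to the baseline period.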
Changes in Mobility [Figure: mobility versus baseline across the Alert, Closed, Shutdown and Reopen periods, with markers at April 27th, May 1st and May 6th] Mobility reduced between 10% and 30%
Changes in Epidemic Spreading: Baseline (“preflu” behavior all weeks) vs. Intervention (alert, closed, shutdown). Epidemic peak postponed 40 hours. Number of infected agents at the peak reduced by 10%.
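To illustrate why reduced contact can both delay and lower an epidemic peak, here is a minimal sketch using a discrete-time SIR compartment model. This is not the agent-based simulation behind the slide, whose details are not given here. The rates, the initial infected fraction, and the 20% contact reduction applied for the whole run are all assumed values.

```python
def simulate_sir(beta, gamma=0.2, infected0=0.001, days=200):
    """Discrete-time SIR model; returns the infected fraction for each day."""
    s, i = 1.0 - infected0, infected0
    infected = [i]
    for _ in range(days):
        new_infections = beta * s * i   # contacts between susceptible and infected
        recoveries = gamma * i
        s -= new_infections
        i += new_infections - recoveries
        infected.append(i)
    return infected


def peak(infected):
    """Return (day, infected fraction) at the epidemic peak."""
    day = max(range(len(infected)), key=infected.__getitem__)
    return day, infected[day]


# Baseline contact rate (assumed value).
baseline_day, baseline_peak = peak(simulate_sir(beta=0.5))
# Assumed 20% fewer contacts, applied for the whole run as a simplification.
reduced_day, reduced_peak = peak(simulate_sir(beta=0.4))
```

In this toy model the reduced-contact run peaks later and lower than the baseline run. The size of the shift depends entirely on the assumed parameters, so it should not be compared with the figures on the slide.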
University Campus Statistically Significant Decrease during Stages 2 and 3
Airport Statistically Significant Increase during Stages 2 and 3
Take Away Message
Geolocated traces allow us to quantitatively – Model human behavior – Measure behavioral changes – Predict/Classify external sources of information
Future Enhance and complement the tools currently used by decision makers in organizations working for social good – Use of open datasets, social media and other digital traces
Thanks!!