University of Washington, Autumn 2018

Slides:



Advertisements
Similar presentations
A Survey of Botnet Size Measurement PRESENTED: KAI-HSIANG YANG ( 楊凱翔 ) DATE: 2013/11/04 1/24.
Advertisements

Part I: Introductory Materials Introduction to Graph Theory Dr. Nagiza F. Samatova Department of Computer Science North Carolina State University and Computer.
Automatic Gender Identification using Cell Phone Calling Behavior Presented by David.
6 am 11 am 5 pm Fig. 5: Population density estimates using the aggregated Markov chains. Colour scale represents people per km. Population Activity Estimation.
Mobile Phone Networks Dr. Hassan Nojumi1 MOBLIE PHONE NETWORKS Dr. Hassan Nojumi.
Auto Technologies Inc. Auto Technologies Call Cap Telemanagement System “ATTS”
Information Flow using Edge Stress Factor Communities Extraction from Graphs Implied by an Instant Messages Corpus Franco Salvetti University of Colorado.
“A.T.T.S.” Never Miss A Call! Automated Tracking Telemanagement System.
MDG data at the sub-national level: relevance, challenges and IAEG recommendations Workshop on MDG Monitoring United Nations Statistics Division Kampala,
Concept Switching Azadeh Shakery. Concept Switching: Problem Definition C1C2Ck …
II. Methodological issues Environment & Migration.
Case Study: Characterizing Diseased States from Expression/Regulation Data Tuck et al., BMC Bioinformatics, 2006.
MT 219 Marketing Unit Two The Marketing Environment and Marketing Research Note: This seminar will be recorded by the instructor.
Title Mobility, Migration, and Mobile Phones Amy Wesolowski The Santa Fe Institute & Carnegie Mellon University 30 June 2010.
Applications in Mobile Technology for Travel Data Collection 2012 Border to Border Transportation Conference South Padre Island, Texas November, 13, 2012.
CRIM6660 Terrorist Networks Lesson 1: Introduction, Terms and Definitions.
Mining Coherent Dense Subgraphs across Multiple Biological Networks Vahid Mirjalili CSE 891.
Oscar E. Cariceo, MSW, NSWM Chile Chapter
Graph clustering to detect network modules
Cellular Records Review and Analysis Part 1: AT&T
Mobility Trajectory Mining Human Mobility Modeling at Metropolitan Scales Sibren Isaacman 2012 Mobisys Jie Feng 2016 THU FIBLab.
Impact of Interference on Multi-hop Wireless Network Performance
Cohesive Subgraph Computation over Large Graphs
Cellular Records Review and Analysis Part 1: AT&T
national scope = unique IE opportunity
Administrative data, calling patterns and spatial economics: Impact evaluation drawing on multiple data sources Nathaniel Young (EBRD)
Session 3: Innovative solutions
Discussion of “When is Hard to Make Ends Meet?”
Mobility diversity and socio-economic development
Segmentation, Targeting, and Positioning
Groups of vertices and Core-periphery structure
An analytical framework to nowcast well-being using Big Data
Ramifications of Digital Citizenship
Fraud Mobility Ken Meiser VP- Identity Solutions.
UKES Annual Conference
1st November, 2016 Transport Modelling – Developing a better understanding of Short Lived Events Marcel Pooke – Operational Modelling & Visualisation Manager.
© 2013 Jones and Bartlett Learning, LLC, an Ascend Learning Company All rights reserved. Page 1 Fundamentals of Information Systems.
International Labour Organisation
Trends in my profession, Information Technology
Tertiary Education Graduate Tracer Study
High Throughput Route Selection in Multi-Rate Ad Hoc Wireless Networks
Processes and patterns of global migration
Lecture 13 CSE 331 Oct 1, 2012.
Who’s connected, who isn’t and why?
Keeping Students on Track Using Technological Retention Tools
Sibren Isaacman Loyola University Maryland
Mining Social Networks. Contents  What are Social Networks  Why Analyse Them?  Analysis Techniques.
University of Washington, Autumn 2018
Communications Infrastructure
Lecture 12 CSE 331 Sep 26, 2011.
University of Washington, Autumn 2018
Mobile Phone Technology
University of Washington, Autumn 2018
University of Washington, Autumn 2018
University of Washington, Autumn 2018
University of Washington, Autumn 2018
International Statistics
University of Washington, Autumn 2018
Predicting poverty and wealth from mobile phone metadata
Communications Infrastructure
University of Washington, Autumn 2018
Protocols.
Practical Applications Using igraph in R Roger Stanton
CSE 421, University of Washington, Autumn 2006
Challenge: How does this link to development?
Dissemination and use of aggregate data: structures and functionality
Data Pre-processing Lecture Notes for Chapter 2
Binhai Zhu Computer Science Department, Montana State University
Protocols.
CSE 421 Richard Anderson Autumn 2019, Lecture 3
Presentation transcript:

University of Washington, Autumn 2018 Call Data Records Lecture 28: CSE 490c 12/3/18 University of Washington, Autumn 2018

University of Washington, Autumn 2018 Topics Data Science for Development AI for Social Good Today Call Data Records 12/3/18 University of Washington, Autumn 2018

University of Washington, Autumn 2018 Announcements Homework 7 Due Tonight Programming Assignment 4 Due December 11 You have received the teaching evaluation link. Complete evaluations by December 9 A high response rate is very important for meaningful results. We will send reminder emails to non-responders during the evaluation period. In addition, studies show that instructor involvement can increase response rates as much as 15-20%. 12/3/18 University of Washington, Autumn 2018

University of Washington, Autumn 2018 Telco Information Call Data Records (Call Detail Records) Meta data on individual calls Cell Tower Logs Information of handset connections with towers Cell Tower Locations 12/3/18 University of Washington, Autumn 2018

Access to cell phone data Proprietary to Telco Provide competitive advantage Possibly for marketing or data services Linkage with mobile money Subject to government privacy regulations Possible access to aggregated data 12/3/18 University of Washington, Autumn 2018

University of Washington, Autumn 2018 Reading for this week An Investigation of Phone Upgrades in Remote Community Cellular Networks, Kushal Shah et al., ICTD 2017 12/3/18 University of Washington, Autumn 2018

University of Washington, Autumn 2018 Call Data Records Meta data associated with calls Source number Destination number Source Tower (ID) Destination Tower (ID) [might be missing] Time Duration Status 12/3/18 University of Washington, Autumn 2018

University of Washington, Autumn 2018 Types of studies How people use technology Populations studies (where people are) Event studies (what happens when) Epidemiology studies Economic studies Distinction between Aggregate Studies and deriving information about individuals 12/3/18 University of Washington, Autumn 2018

University of Washington, Autumn 2018 Working with CDRs Preprocess data for higher level structure Align data with other sources Tower data Economic / Population Data Compute home location Determine movement patterns 12/3/18 University of Washington, Autumn 2018

University of Washington, Autumn 2018 Call Graph Directed or undirected graph on calls Measurement of call volume Detection of high in- degree and out-degree nodes Identification of social network 12/3/18 University of Washington, Autumn 2018

University of Washington, Autumn 2018 Call Graph Analysis Global analysis versus individual analysis Is the goal to understand aggregate properties or individual properties Feature identification of individual’s calls Number of calls Length of calls Missed calls Incoming vs out going Time of day Neighborhood Frequent caller neighborhood Nodes of distance two 12/3/18 University of Washington, Autumn 2018

Social Network Identification I How do you identify a callers Social Network from a call graph? Start with the Induced Subgraph on the Neighborhood of the individual 12/3/18 University of Washington, Autumn 2018

Social Network Identification II Identify highly connected groups of vertices in Neighborhood Graph Finding a maximum Clique is NP-Complete Heuristics Maximum degree subgraph Maximum density subgraph 12/3/18 University of Washington, Autumn 2018

Degree-K Subgraph problem While there is a vertex of degree less than K Delete all vertices of degree less than K 12/3/18 University of Washington, Autumn 2018

Maximum Density Subgraph problem Find an induced subgraph S that maximizes ration Edge(S)/Vertices(S) Polynomial time algorithms using Network Flow techniques Related to degree K subgraph problem 12/3/18 University of Washington, Autumn 2018

How good a predictor is the call graph of an individual's income? 12/3/18 University of Washington, Autumn 2018

University of Washington, Autumn 2018 Phone use studies Given demographic information, study phone use behavior Call volume, call timings (time of day, date), number of contacts Studies of Sim Card Churn Inference of demographics from behavior Mehrotra et al. (2012), Differences in Phone Use Between Men and Women: Quantitative Evidence from Rwanda. 12/3/18 University of Washington, Autumn 2018

University of Washington, Autumn 2018 Migration Studies Movement of people is an area of significant study Lack of census data make this hard to study Short term migration What are the patterns Is it possible to distinguish between shorter term and permanent migration Forced migration and droughts Match to climate data Question on local versus long distance migration Technical issues in definitions of movement 12/3/18 University of Washington, Autumn 2018

University of Washington, Autumn 2018 Blumenstock et al. (2015), Predicting poverty and wealth from mobile phone metadata Economic Studies Predict economic status at a local (e.g. District) level Household surveys are expensive. Idea is to use Cell Phone Data to expand surveys Correlate household surveys with CDR Compute wide range of properties of CDR Construct machine learning model to predict household assets (from survey data) Apply to all call records in the data set 12/3/18 University of Washington, Autumn 2018

University of Washington, Autumn 2018 Epidemiology Correlating human movement data and disease frequency Substantial work on Malaria and Call Data Records Key use case is malaria elimination Understanding if cases are local infections or from other regions Understand movement from high incidence to low incidence areas Technical modelling work that combines migration and economic studies 12/3/18 University of Washington, Autumn 2018

University of Washington, Autumn 2018 Event studies Look at impact of events in data sets High volume of calls related to disasters, elections, holidays Spike in call volumes has been observed associated with earth quakes Significant interest in call data records and disaster response Technical issues related to infrastructure and economic displacement 12/3/18 University of Washington, Autumn 2018

Additional challenges on CDR Analysis Countries often have multiple Telcos Getting data from all Telcos is even harder Is data from one Telco sufficient? Increasing use of other media for communication Encrypted messaging Apps such as WhatsApp Analysis of Social Media company data should be even more restricted than CDR Aleksandr Kogan 12/3/18 University of Washington, Autumn 2018