STARTING EXPLORING MOBILE PHONE DATA IN THE SANDBOX Pilar Rey del Castillo.


Mobile phone data in the Sandbox
Special case: only since October 2014
Limited information provided in the dataset
Still very interesting to analyse
  – Sensors of human and social behaviour (location...)
  – Example of the requirements of the exploratory step, compared with other types of data in the Sandbox
  – Aim: describe the initial steps in attempting to produce meaningful results for statistical purposes

Location or positioning data
Concept in the mobile phone & statistics context
User assigned to a number of neighbouring antennas for load balancing reasons
Types
  – Active
  – Passive: Call Detail Records (CDRs)...
Passive location: occasional samples of the approximate locations of the phone's user

Mobile phones datasets (1)
D4D Challenge: Orange's "Data for Development" in Ivory Coast
Anonymised Call Detail Records (CDRs) of outgoing phone calls & SMS exchanges
  – Orange's customers in Ivory Coast
  – Between December 1, 2011 and April 28, 2012 (150 days, ≈ 5 months)
Sandbox IT infrastructure: perfect

Mobile phones datasets (2)
Total antenna-to-antenna traffic on an hourly basis (≈ 5 million customers)
Individual trajectories for customers for two-week time windows

Literature exploiting location
Supplementary information at the micro level (ground truth)
  – Lausanne Data Collection Campaign (Nokia)
  – Reality Mining Project (MIT)
  – Ad hoc experiments, conducting surveys…: Isaacman et al. (2011), De Oliveira et al. (2011)
  – …
Just CDRs: assumptions on the users' behaviour…
  – Orange Data Challenges (Ivory Coast, Senegal)
  – Järv et al. (Estonia, 2012)
  – Kung et al. (Portugal, Ivory Coast, Saudi Arabia, Boston, Milan, 2014)
  – …

Ivory Coast data
Positioning data: our aim is human home -> work commuting figures
Way to proceed: obtain results under certain assumptions and compare
First assumptions
  – Orange's customers represent the population (96 subscriptions per 100 inhabitants, 2013)
  – Behaviour of the customer sample is representative of mobility behaviour (to be assessed later)

2nd step: model to draw meaningful information
Problem of oscillations: antennas aggregated by section = county × urbanization, giving 157 sections
Problem of giving a meaning to the user's location: daily & weekly patterns of use as discriminative features
  – Isaacman et al. (2011):
      home: weekends + weekdays between 7 pm & 7 am
      work: weekdays between 1 pm & 5 pm
  – Kung et al. (2014):
      home: weekdays between 8 pm & 8 am
      work: weekdays between 8 am & 8 pm
Apart from other sophisticated filtering…
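
The two ideas on this slide (aggregating antennas into sections to dampen oscillations, then using time-of-use windows to label a location) can be sketched as follows. This is only an illustrative sketch, not the method actually used in the Sandbox: it applies the Isaacman et al. (2011) windows as quoted above and takes the most frequent section in each window. The record layout `(user, timestamp, antenna)`, the `antenna_to_section` lookup and both function names are assumptions made for the example.

```python
from collections import Counter, defaultdict
from datetime import datetime


def time_window(ts):
    """Label a timestamp 'home', 'work' or None, using the windows quoted
    from Isaacman et al. (2011): home = weekends plus weekdays 7 pm to 7 am,
    work = weekdays 1 pm to 5 pm."""
    is_weekend = ts.weekday() >= 5  # Monday is 0, so 5 and 6 are the weekend
    if is_weekend or ts.hour >= 19 or ts.hour < 7:
        return "home"
    if 13 <= ts.hour < 17:
        return "work"
    return None


def locate_home_work(records, antenna_to_section):
    """Return {user: (home_section, work_section)}.

    records: iterable of (user, datetime, antenna_id) tuples (assumed layout).
    antenna_to_section: hypothetical lookup from antenna id to section id.
    A user's home (work) is the section seen most often in the home (work)
    window; None means no observation fell in that window.
    """
    counts = defaultdict(lambda: {"home": Counter(), "work": Counter()})
    for user, ts, antenna in records:
        label = time_window(ts)
        section = antenna_to_section.get(antenna)
        if label is None or section is None:
            continue  # outside both windows, or antenna not mapped
        counts[user][label][section] += 1

    located = {}
    for user, by_label in counts.items():
        home = by_label["home"].most_common(1)
        work = by_label["work"].most_common(1)
        located[user] = (home[0][0] if home else None,
                         work[0][0] if work else None)
    return located
```

Counting at section level rather than antenna level means that a phone bouncing between neighbouring antennas still contributes to a single section. Swapping in the Kung et al. (2014) windows would only require changing `time_window`.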


Commuting in Ivory Coast
Sample of customers
  – 51% cluster 1
  – 28% cluster 2
  – 21% cluster 3
Almost 50% of the sample with home -> work located
Estimate cross-tabulation of commuting between Ivory Coast sections
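
As an illustration of the cross-tabulation step, the sketch below counts home -> work pairs over the users for whom both places were located and expresses each pair as a percentage of those users. The input layout `{user: (home_section, work_section)}` and the function names are assumptions for the example; no weighting from the sample to the population is attempted here.

```python
from collections import Counter


def commuting_crosstab(home_work):
    """Count users per (home_section, work_section) pair.

    home_work: dict {user: (home_section, work_section)} (assumed layout);
    users with either place missing (None) are left out.
    """
    return Counter(
        (home, work)
        for home, work in home_work.values()
        if home is not None and work is not None
    )


def commuting_shares(crosstab):
    """Convert pair counts into percentages of all located users."""
    total = sum(crosstab.values())
    if total == 0:
        return {}
    return {pair: 100.0 * n / total for pair, n in crosstab.items()}
```

Pairs with identical home and work sections are kept, since they represent people who live and work in the same section; whether to report them alongside the between-section flows is a presentation choice.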

Main commutes (%) home -> work between sections

Final remarks
CDRs are a useful tool to learn and test new methods (although no reliable figures have been produced)
Only a portion of the possible ways to exploit CDRs has been explored: a promising source (more research needed)
Another possible research strand: develop an "OfficialStatistics" app for smartphones to gather ground truth


References
de Oliveira, R., Karatzoglou, A., Cerezo, P. C., de Vicuña, A. A. L. and Oliver, N. (2011), "Towards a psychographic user model from mobile phone usage", in Desney S. Tan, Saleema Amershi, Bo Begole, Wendy A. Kellogg & Manas Tungare, ed., 'CHI Extended Abstracts', ACM.
Isaacman, S., Becker, R., Cáceres, R., Kobourov, S., Martonosi, M., Rowland, J. and Varshavsky, A. (2011), "Identifying Important Places in People's Lives from Cellular Network Data", Lecture Notes in Computer Science, Vol. 6696.
Järv, O., Ahas, R., Saluveer, E., Derudder, B. and Witlox, F. (2012), "Mobile Phones in a Traffic Flow: A Geographical Perspective to Evening Rush Hour Traffic Analysis Using Call Detail Records", PLoS ONE 7(11).
Kung, K. S., Greco, K., Sobolevsky, S. and Ratti, C. (2014), "Exploring Universal Patterns in Human Home-Work Commuting from Mobile Phone Data", PLoS ONE 9(6).
