Description of national ongoing/intended data processing

Presentation transcript:

Description of national ongoing/intended data processing
Roberta Radini (Istat)
1st Internal Meeting of WP5 Mobile Phone Data
Madrid, 7 June

Outline
Description of ongoing data processing:
  Classifying urban population
  Tool used: Sociometer
The next steps

Classifying municipality population (Users)
Resident: a person is a Resident in an area A when his/her home is inside A. The person's mobility therefore tends to be from and towards home.
Commuter: a person is a Commuter between an area B and an area A if his/her home is in B while the work/school place is in A. The daily mobility of this person is therefore mainly between B and A.
Dynamic Resident: a person is a Dynamic Resident between an area A and an area B if his/her home is in A while the work/school place is in B. A Dynamic Resident is a sort of "opposite" of the Commuter.
Visitor: a person is a Visitor in an area A if his/her home and work/school places are both outside A, and his/her presence inside the area is limited to a period of time long enough to perform some activities in A.
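The four definitions can be read as a small decision rule relative to a reference area A. The sketch below is only an illustration of that reading: the function name `classify_user`, the `visits_area` flag and the "Not present" label are hypothetical and do not come from the presentation, and the sketch assumes the home and work/school areas are already known (in practice they are inferred from the call data).

```python
from typing import Optional


def classify_user(home: str, work: Optional[str], area: str,
                  visits_area: bool = False) -> str:
    """Classify a user relative to `area` (hypothetical helper).

    home        -- municipality of the user's home
    work        -- municipality of the work/school place, or None if unknown
    area        -- the reference area A
    visits_area -- True if the user was observed in A for a limited period
    """
    if home == area:
        # Home inside A: Resident, unless work/school is in another area.
        if work is None or work == area:
            return "Resident"
        return "Dynamic Resident"
    if work == area:
        # Home outside A, work/school inside A.
        return "Commuter"
    if visits_area:
        # Home and work/school both outside A, occasional presence in A.
        return "Visitor"
    # Assumption: users never seen in A get no class for that area.
    return "Not present"


print(classify_user("PISA", "CASCINA", "PISA"))  # home in A, work elsewhere
print(classify_user("CASCINA", "PISA", "PISA"))  # home elsewhere, work in A
```

Note that the same person gets a different class depending on the reference area: a Dynamic Resident of Pisa who works in Cascina is a Commuter from the point of view of Cascina.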

A methodology to classify the users
A methodology to classify the users needs a condensed representation of each user's activities, which we can define as the user's behavior profile, called the Individual Call Profile (ICP).
We can organize the telephone data for each SIM and calling place by:
time of day, divided into three slots:
  t1 = [00:00-08:00) morning
  t2 = [08:00-19:00) daytime
  t3 = [19:00-24:00) night
day of the week, divided into weekdays and weekend.
The ICP is obtained from the set of CDRs by counting the frequency of calls in each combination.

Example set of CDRs:

Cod SIM   Municipality   Date         Hour
123643    PISA           06/02/2017   11:00
123643    PISA           06/02/2017   12:05
123643    PISA           07/02/2017   12:15
123643    PISA           08/02/2017   14:03
123643    PISA           08/02/2017   14:13
123643    CASCINA        09/02/2017   09:42
123643    CASCINA        09/02/2017   15:42
123643    PISA           11/02/2017   07:45
123643    PISA           12/02/2017   10:01
123643    PISA           12/02/2017   12:18
….
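As a rough illustration of how an ICP could be built from the sample CDRs on this slide, the sketch below counts records per SIM, municipality, day type and time slot. It rests on assumptions that are not stated in the presentation: dates are read as day/month/year, each CDR counts as one event (the actual tool may count distinct days of presence instead), and the names `time_slot` and `build_icp` are hypothetical.

```python
from collections import Counter
from datetime import datetime

# The ten sample CDRs shown on the slide: (SIM code, municipality, date, hour).
CDRS = [
    ("123643", "PISA", "06/02/2017", "11:00"),
    ("123643", "PISA", "06/02/2017", "12:05"),
    ("123643", "PISA", "07/02/2017", "12:15"),
    ("123643", "PISA", "08/02/2017", "14:03"),
    ("123643", "PISA", "08/02/2017", "14:13"),
    ("123643", "CASCINA", "09/02/2017", "09:42"),
    ("123643", "CASCINA", "09/02/2017", "15:42"),
    ("123643", "PISA", "11/02/2017", "07:45"),
    ("123643", "PISA", "12/02/2017", "10:01"),
    ("123643", "PISA", "12/02/2017", "12:18"),
]


def time_slot(hour: int) -> str:
    """Map an hour of the day to the slots t1, t2, t3 defined on the slide."""
    if hour < 8:
        return "t1"  # [00:00-08:00)
    if hour < 19:
        return "t2"  # [08:00-19:00)
    return "t3"      # [19:00-24:00)


def build_icp(cdrs):
    """Count CDRs per (SIM, municipality, day type, time slot)."""
    icp = Counter()
    for sim, municipality, date, hour in cdrs:
        # Assumption: dates are day/month/year.
        ts = datetime.strptime(f"{date} {hour}", "%d/%m/%Y %H:%M")
        day_type = "weekend" if ts.weekday() >= 5 else "weekday"
        icp[(sim, municipality, day_type, time_slot(ts.hour))] += 1
    return icp


icp = build_icp(CDRS)
for key, count in sorted(icp.items()):
    print(key, count)
```

Under these assumptions the sample SIM has most of its activity in Pisa during weekday daytime, a smaller amount in Cascina during weekday daytime, and some weekend activity in Pisa.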

Classifying the behavior
By applying the K-Means clustering algorithm to the Individual Call Profiles (ICPs), we can extract a set of clusters. Each cluster, called a Prototype, is a group of homogeneous behaviors of the population discovered in the data.
The corresponding k centroids, called Stereotypes, are the set of representative behaviors of the population.
Experts have specified the initial set of reference profiles, the Archetypes. These archetypes are a sort of "perfect example" of a behavioral profile, and aim to synthesize the typical behavior of each user category.
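The clustering step can be sketched with a plain K-Means (Lloyd's algorithm) followed by a nearest-archetype lookup for each centroid. Everything concrete below is an assumption made for illustration: the six-value ICP vectors (days of presence in t1, t2, t3 on weekdays, then on weekends), the archetype values, the simplistic initialization from the first k points, and the function name `kmeans`. This is not the Sociometer implementation.

```python
from math import dist

# Toy ICP vectors (hypothetical):
# [weekday t1, weekday t2, weekday t3, weekend t1, weekend t2, weekend t3]
ICPS = [
    [5, 5, 5, 2, 2, 2],  # present in every slot
    [0, 5, 0, 0, 0, 0],  # weekday daytime only
    [0, 1, 0, 0, 0, 0],  # one isolated presence
    [4, 5, 5, 2, 2, 1],
    [0, 4, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 0],
]

# Expert-defined reference profiles (values are illustrative assumptions).
ARCHETYPES = {
    "Resident": [5, 5, 5, 2, 2, 2],
    "Commuter": [0, 5, 0, 0, 0, 0],
    "Visitor": [0, 1, 0, 0, 1, 0],
}


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm; the first k points are the initial centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


centroids, labels = kmeans(ICPS, k=3)

# Label each centroid (stereotype) with the closest archetype.
cluster_class = [min(ARCHETYPES, key=lambda name: dist(c, ARCHETYPES[name]))
                 for c in centroids]
user_class = [cluster_class[label] for label in labels]
print(user_class)
```

Each user inherits the class of the archetype closest to the centroid of his/her cluster, so the experts only need to describe a handful of reference profiles rather than label individual users.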

The Sociometer
The core of the analysis is a tool called the Sociometer. This tool, implemented by the University of Pisa and the NRC, is able to classify the behavior of each user in the CDRs.
Since 2015, ISTAT, the NRC and the University of Pisa have been collaborating, with the aim of using the Sociometer to classify people's individual and collective behavior in order to compute statistical models.

The Sociometer: Individual call profile and Data Mining
From the ICPs, a set of clusters is extracted by using the K-Means algorithm.
[Diagram: the ICP is fed to the classification algorithm, which assigns one of four classes: Resident, Dynamic Resident, Commuter, Visitor]

The next steps
We are currently installing and configuring the Sociometer tool on our IT Big Data platform, Cloudera.
We expect to have the first results soon.
We will then move on to the integration phase with the administrative data on the resident and commuting population.
For tourism estimates we need a set of CDRs at least at the regional level. We are currently requesting permission from the Guarantor to obtain data for all of Tuscany, as well as for Lazio and Campania.
In the new request, ISTAT asks for wider and more detailed data sets, because information on antennas and their locations is indispensable for accurate processing of the CDRs and for better quantification of data quality.

Thanks