Big Data in Official Statistics: Generalities


1 Big Data in Official Statistics: Generalities
Antonino Virgillito

2 Big Data in Official Statistics: the International Setting - 1
Strategic vision of HLG, June 2011: «We are in a changeover from a society with little or no data available to one that has an abundance of data… Another important point is that nowadays it is much easier to get data that cover more than the traditional national statistics users would need. We do not, however, have the mechanisms in place to make full use of these data»

3 Big Data in Official Statistics: the International Setting - 2
HLG Working paper 2013/6, January 2013: «Apart from generating new commercial opportunities in the private sector, Big data is also potentially very interesting as an input for official statistics; either for use on its own, or in combination with more traditional data sources such as sample surveys and administrative registers»

4 Big Data in Official Statistics: the International Setting - 3
Scheveningen Memorandum, September 2013: «Acknowledge that Big Data represent new opportunities and challenges for Official Statistics, and therefore encourage the European Statistical System and its partners to effectively examine the potential of Big Data sources in that regard.»

5 Big Data for OS: The Concept
Big Data can be an input for official statistics: either for use on its own or in combination with more traditional data sources such as sample surveys and administrative registers

6 Big Data Sources UNECE Classification
Social Networks (human-sourced information)
Traditional Business systems (process-mediated data)
Internet of Things (machine-generated data)

7 Social Networks (human-sourced information)
Interactions with news media and social media, job posting
Humans interacting with devices (also mobile) produce data

8 Social Networks (human-sourced information)
Interactions with news media and social media, job posting
Humans interacting with devices (also mobile) produce data
Example:
- Blog posts
- Twitter messages

9 Social Networks (human-sourced information)
Interactions with news media and social media, job posting
Humans interacting with devices (also mobile) produce data
Example:
- Blog posts
- Twitter messages
- User-generated maps

10 Traditional Business systems (process-mediated data)
Data collected by traditional systems in a passive mode
Example: Web search logs

11 Traditional Business systems (process-mediated data)
Data collected by traditional systems in a passive mode
Example:
- Web search logs
- Medical records

12 Traditional Business systems (process-mediated data)
Data collected by traditional systems in a passive mode
Example:
- Web search logs
- Medical records
- Commercial transactions
- Banking/stock records

13 Internet of Things (machine-generated data)
Sensors and machines used to measure and record the events and situations in the physical world.
Example: Traffic sensors

14 Internet of Things (machine-generated data)
Sensors and machines used to measure and record the events and situations in the physical world.
Example:
- Traffic sensors
- Environmental sensors

15 How can Big Data be included in official statistics?
As the only source (replacement/new statistics)
- Traffic intensity statistics (NL) and ‘Billion Prices’ project (MIT)
As the main source, with survey/admin. data as benchmark
- Google Trends-like approaches; (regular) benchmarking needed
As an additional source for statistics based on survey/admin. data
- For example, to enable small area estimation
As ‘supplier’ of missing data
- For example, use data on level of education from the internet to fill gaps in the education register
But also for nowcasting and to increase timeliness!
Don’t use it
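The "main source with survey/admin. data as benchmark" option can be illustrated with a minimal sketch. It is not a method taken from the presentation: it assumes a simple pro-rata adjustment in which a Big Data indicator is rescaled so that its mean in each period matches a survey benchmark, and the most recent factor is carried forward to periods that have no benchmark yet (a rough nowcast). The function name, the data layout and the figures are all hypothetical.

```python
def benchmark_indicator(indicator, benchmarks):
    """Rescale a Big Data indicator so its period means match survey benchmarks.

    indicator  -- list of (period, value) pairs in time order (hypothetical layout)
    benchmarks -- dict mapping period -> benchmark level from a survey/admin. source
    Periods without a benchmark reuse the most recent adjustment factor,
    which yields a provisional, nowcast-style figure.
    """
    # Sum and count of the raw indicator within each period.
    totals, counts = {}, {}
    for period, value in indicator:
        totals[period] = totals.get(period, 0.0) + value
        counts[period] = counts.get(period, 0) + 1

    adjusted = []
    factor = 1.0  # used until the first benchmark becomes available
    for period, value in indicator:
        if period in benchmarks and totals[period] != 0:
            period_mean = totals[period] / counts[period]
            factor = benchmarks[period] / period_mean
        adjusted.append((period, value * factor))
    return adjusted


# Illustrative (made-up) figures: two observations in Q1, one in Q2.
raw = [("2013Q1", 10.0), ("2013Q1", 20.0), ("2013Q2", 30.0)]
survey = {"2013Q1": 30.0}  # the Q2 benchmark is not yet published
result = benchmark_indicator(raw, survey)
```

In this toy case the Q1 values are scaled to average the survey level, and the Q2 value is adjusted with the Q1 factor until its own benchmark arrives. Production benchmarking methods are considerably more refined than this pro-rata rule.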

16 Why do Big Data look so appealing to NSIs?
Competitive pressure
- The private sector may take advantage of Big Data and produce more and more statistics that attempt to beat official statistics on timeliness and relevance
- The “Official Statistics” trademark could slowly lose reputation and relevance unless NSIs get on board
Funding constraints
- Economic crisis (??) urges organizations to look for ways to increase efficiency and cut costs
- Because traditional data collection is so cost-intensive, interest in alternative data sources and Big Data is growing

17 Why do Big Data look so appealing to NSIs?
Improving quality of traditional statistics
- Providing new auxiliary information that NSIs could exploit to:
  - Build and maintain better sampling frames
  - Design better samples
  - Build better calibration estimators
  - Further reduce nonresponse bias
Reducing respondents’ burden
Potential for discovering new knowledge
- New well-being indicators
- Agriculture and environment statistics
- New measures of consumers’ confidence
- Consumer behavior beyond HBS
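To make the idea of calibration estimators built on auxiliary information concrete, here is a minimal sketch of post-stratification, the simplest form of calibration. It assumes that known population totals per stratum are available from some auxiliary source (possibly one derived from Big Data). The function names, field names and numbers are hypothetical and do not come from the presentation.

```python
def poststratify(sample, population_totals):
    """Adjust design weights so weighted stratum counts match known totals.

    sample            -- list of dicts with keys 'stratum', 'weight', 'y' (hypothetical layout)
    population_totals -- dict mapping stratum -> known population count (auxiliary information)
    Returns a new list of units with calibrated weights.
    """
    # Weighted sample count in each stratum under the original design weights.
    weighted_counts = {}
    for unit in sample:
        s = unit["stratum"]
        weighted_counts[s] = weighted_counts.get(s, 0.0) + unit["weight"]

    calibrated = []
    for unit in sample:
        s = unit["stratum"]
        g = population_totals[s] / weighted_counts[s]  # calibration factor
        calibrated.append({**unit, "weight": unit["weight"] * g})
    return calibrated


def estimate_total(sample):
    """Weighted estimate of the population total of y."""
    return sum(unit["weight"] * unit["y"] for unit in sample)


# Illustrative (made-up) sample and auxiliary totals.
sample = [
    {"stratum": "A", "weight": 10.0, "y": 1.0},
    {"stratum": "A", "weight": 10.0, "y": 3.0},
    {"stratum": "B", "weight": 10.0, "y": 2.0},
]
aux_totals = {"A": 30.0, "B": 10.0}
calibrated = poststratify(sample, aux_totals)
```

Here stratum A is under-represented by the design weights (20 against a known total of 30), so its weights are inflated and the estimated total of y rises accordingly. General calibration estimators extend this to several auxiliary variables at once.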

18 Issues and Challenges
Legislative, regulating access to data
- Possibly different legislation country by country
Privacy
- Possible privacy-by-design strategies
Financial
- Private providers for Big Data
Management
- Including training

19 Issues and Challenges – Statistical methodology
Representativeness
- Difficult to define target population, survey population and survey frame
Linking methods of Big Data with statistical units (individuals, families, enterprises, …)
Estimation procedures
Quality of the results

20 Issues and Challenges – Collecting Big Data
Big Data originated from the need to manage data that grew inside organizations as a consequence of their business
- No collection involved
In statistical offices we do not have such a situation, because our “input” data is always generated from the collection phase
- Big Data, too, have to be gathered from external sources
Most common sources of Big Data for statistical purposes
- Datasets from external providers
- Data extracted from the Internet

21 International Initiatives
UN Global Working Group
UNECE-HLG Big Data Sandbox
ESSNet project on Big Data

