United Nations Economic Commission for Europe Statistical Division NTTS 2015 – Satellite Workshop on Big Data March 9, 2015 Computing Energy Consumption.


1 United Nations Economic Commission for Europe, Statistical Division
NTTS 2015 – Satellite Workshop on Big Data, March 9, 2015
Computing Energy Consumption from Smart Meters Data
Antonino Virgillito, Project Consultant, UNECE; Istat

2 The Role of Big Data in the Modernisation of Statistical Production and Services, NTTS 2015, March 9, 2015
Introduction
Smart meters are electronic meters that enable automated collection of electricity consumption data from households and small businesses
Smart meter data is a good example of Big Data
– High volume and high velocity
– Low variety: highly structured
Good potential for statistics
– Instead of surveying individual utilities or households, we could collect the data directly from the smart meter entity, which would greatly reduce response burden

3 Objectives
Showing the feasibility of computing statistics on energy consumption starting from smart meter data
Testing how to handle privacy-sensitive data in a shared environment
Testing how to aggregate data using the Big Data tools available in the Sandbox

4 The Team
Lily Ma, Andrew Murray, Antonino Virgillito, Marco Puts, Stephen Ball

5 The Datasets
Irish Data
– Real data, privacy sensitive
– Power consumption of 6500 households over 1 year
– 160 million records, 2.5 GB
Canadian Data
– Synthetic data generated from its real counterpart
– Power consumption of 41000 households over 1 month
– 16 million records, 1 GB

6 Handling Privacy Issues – Irish Data
The Irish dataset could not be released freely under the terms of Irish legislation
Proper precautions had to be taken when moving the data to the Sandbox and storing it there
– The datasets were stored on a USB key (encrypted and password-protected) handled only by the Irish institute representative (Andrew)
– A directory was created on the Sandbox and access permission was granted only to team members
– Andrew and Toni transferred the data from Toni’s computer to the Sandbox via FTP
– The data was removed from the computer right after the operations were completed

7 Handling Privacy Issues – Canadian Data
It was not possible to move the Canadian datasets outside the boundaries of StatsCan
Lily implemented a method that alters the data, removing every reference to the real data and changing the values of the measurements
The resulting statistics are detached from the real numbers while maintaining a realistic distribution
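The slides do not describe how the Canadian data was altered. As an illustration only, the following is a minimal sketch of one possible approach, assuming readings arrive as (meter_id, hour, kwh) tuples: each meter ID is replaced by a surrogate and each meter's values are rescaled by a random factor, which keeps the shape of its load profile. The function name `anonymise`, the tuple layout and the 0.8 to 1.2 scaling range are all hypothetical, and this is not the method used at StatsCan.

```python
import random

def anonymise(readings, seed=0):
    """Replace meter IDs with surrogates and rescale values per meter.

    Hypothetical sketch: `readings` is a list of (meter_id, hour, kwh).
    """
    rng = random.Random(seed)
    surrogate = {}  # real meter ID -> surrogate ID
    factor = {}     # real meter ID -> random scaling factor
    result = []
    for meter_id, hour, kwh in readings:
        if meter_id not in surrogate:
            surrogate[meter_id] = f"syn-{len(surrogate):05d}"
            # Assumed range; one factor per meter keeps the profile shape.
            factor[meter_id] = rng.uniform(0.8, 1.2)
        result.append((surrogate[meter_id], hour, kwh * factor[meter_id]))
    return result

readings = [("1001", 0, 1.0), ("1001", 1, 2.0), ("2002", 0, 4.0)]
synthetic = anonymise(readings, seed=42)
```

A real disclosure-control method would need more than this (for example, breaking the link between record order and meter identity); the sketch only shows the two ideas named on the slide, removing references and changing values.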

8 Experiment Details
Data aggregation was carried out in the Sandbox environment using the Pig tool
The full datasets were aggregated in a single pass, computing the power consumption at hourly level
– The script was only 4 lines long
Aggregation performance was satisfactory
– 2.5 GB aggregated in less than 2 minutes
Since the Pig language does not natively define statistical functions, a third-party extension (developed and made freely available by LinkedIn) was loaded and used
– User Defined Functions can extend the Pig language arbitrarily
– Several useful functions are freely available
Aggregated data was processed again in R in order to produce visualizations
– Tools used: Processing and Pentaho
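The Pig script itself is not reproduced in the transcript. The same single-pass hourly aggregation can be sketched in plain Python, assuming each record is a (meter_id, timestamp, kwh) tuple; the function name `aggregate_hourly` and the record layout are assumptions made for illustration, and the median stands in for the kind of statistical function the third-party extension supplied.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, median

def aggregate_hourly(records):
    """Group (meter_id, timestamp, kwh) readings by calendar hour.

    Returns {hour_start: {"total": ..., "mean": ..., "median": ...}}.
    """
    by_hour = defaultdict(list)
    for meter_id, timestamp, kwh in records:
        # Truncate the timestamp to the start of its hour.
        hour_start = timestamp.replace(minute=0, second=0, microsecond=0)
        by_hour[hour_start].append(kwh)
    return {
        hour: {"total": sum(vals), "mean": mean(vals), "median": median(vals)}
        for hour, vals in sorted(by_hour.items())
    }

sample = [
    ("m1", datetime(2015, 3, 9, 10, 0), 0.5),
    ("m1", datetime(2015, 3, 9, 10, 30), 0.25),
    ("m2", datetime(2015, 3, 9, 10, 0), 1.0),
    ("m2", datetime(2015, 3, 9, 11, 0), 0.75),
]
hourly = aggregate_hourly(sample)
```

This in-memory version only illustrates the grouping logic; at 160 million records the work would be distributed, which is what Pig provided in the Sandbox.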

9 Visualizations
[Figure: weekly consumption per hour of day over a year (IE); annotated periods: winter, summer, mid-seasons]
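A week-by-hour-of-day view like this one can be derived from hourly totals by regrouping them on (ISO week, hour of day). The sketch below assumes the input is a list of (hour_start, kwh) pairs; the function name `week_hour_matrix` is hypothetical, and the presentation does not say how the original chart data was prepared.

```python
from collections import defaultdict
from datetime import datetime

def week_hour_matrix(hourly_totals):
    """Sum consumption into cells keyed by (ISO week number, hour of day)."""
    matrix = defaultdict(float)
    for hour_start, kwh in hourly_totals:
        iso_week = hour_start.isocalendar()[1]
        matrix[(iso_week, hour_start.hour)] += kwh
    return dict(matrix)

sample = [
    (datetime(2015, 3, 9, 18), 2.0),   # Monday, ISO week 11
    (datetime(2015, 3, 10, 18), 1.5),  # Tuesday, same week
    (datetime(2015, 3, 16, 18), 3.0),  # Monday, ISO week 12
]
cells = week_hour_matrix(sample)
```

Each cell then maps to one coloured tile in a heat map, with weeks on one axis and hours of the day on the other.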

10 Visualizations
[Figure: hourly consumption per day (CAN)]

11 Conclusions and Findings
We showed that data from smart meters could be used to compute statistics on energy consumption easily and at a very detailed level
The key issues are data availability and privacy
– Is this approach feasible in production?
Technology findings:
– Big data tools were tested with positive results
– Reuse of methods: aggregation scripts were written quickly and could be used on both datasets
Privacy findings:
– Two ways of overcoming privacy issues
– The use of synthetic datasets can enable working in common environments and sharing methods and techniques

