Energy Issues in Data Analytics Domenico Talia Carmela Comito Università della Calabria & CNR-ICAR Italy

1 Energy Issues in Data Analytics Domenico Talia Carmela Comito Università della Calabria & CNR-ICAR Italy talia@deis.unical.it

2 Motivations for Taking Care of Data
- Data is everywhere (big, complex, real-time, unstructured).
- Putting data at the center of research on energy issues may bring benefits (today the focus is on algorithms).
- Cost metrics for data management techniques (communication, storage, access, query, analysis) will help professionals and users save energy in data-intensive applications.
- Energy-scalable data management is important for sustainable data science.

3 Data Availability or Data Deluge?
Every life process today is data intensive. The information stored in digital data archives is enormous and its size is still growing very rapidly.

4 Data Availability or Data Deluge?
Some decades ago the main problem was the shortage of information. Now the challenge is the very large volume of information to deal with, the complexity of processing it, and the extraction of significant and useful parts or summaries.

5 Complex Big Problems …
Bigger and more complex problems must be solved by using large-scale distributed computing systems. Data sources are ever larger and ubiquitous (Web, sensor networks, mobile devices, telescopes, …).

6 … and Big Data
Even where accessible, much data in many fields cannot be read by humans, so the huge amount of data available today requires smart data analysis techniques to help people deal with it. Scalable algorithms, techniques, and systems are needed (time and energy scalability).

7 Data: From Storing to Analysis
Storing data is not the only problem. A key issue is to analyse, mine, and process data to make it useful.
Source: The Economist

8 Towards Models for Energy-aware Data Management
- The main focus today is on energy-aware algorithms, tasks, and applications.
- The other side of the coin is data and the costs of operating on it.
- Abstract energy-cost models for exchanging, accessing, and transforming data are primary elements of energy-aware data management at large scale.
- They are useful for sustainable data science.
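As a rough illustration of what an abstract energy-cost model might look like, the sketch below assigns an assumed per-megabyte energy coefficient to each data operation named on the slide (exchange, access, transform) and sums the contributions. The coefficient values and the function name `estimate_energy_joules` are hypothetical placeholders, not measurements or definitions from this work.

```python
# Hypothetical per-operation energy coefficients in joules per megabyte.
# These numbers are illustrative assumptions, not measured values.
COST_J_PER_MB = {
    "exchange": 0.50,   # sending or receiving data over a network
    "access": 0.10,     # reading data from storage
    "transform": 0.25,  # processing or converting data
}


def estimate_energy_joules(workload_mb):
    """Return the estimated energy (joules) for a workload.

    workload_mb maps an operation name to the data volume in MB
    that the operation touches.
    """
    total = 0.0
    for operation, megabytes in workload_mb.items():
        if operation not in COST_J_PER_MB:
            raise KeyError(f"no cost coefficient for operation: {operation}")
        if megabytes < 0:
            raise ValueError("data volume must be non-negative")
        total += COST_J_PER_MB[operation] * megabytes
    return total


# Example: a 14 MB data set that is exchanged, accessed, and transformed once.
example = {"exchange": 14.0, "access": 14.0, "transform": 14.0}
estimated = estimate_energy_joules(example)
```

A linear model like this is only a starting point. Real coefficients would have to be fitted per device and per operation from measurements.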

9 An Example: Energy-aware Mining of Data
- We evaluated the energy cost of analyzing data with some well-known data mining techniques on mobile devices.
- Our main interest was in how the energy consumed by the same technique changes as the dimensions of the data change.
- Tests were run with different data set sizes, numbers of attributes, and numbers of classes.
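The kind of experiment described above can be sketched as a parameter sweep: run the same technique on data sets of different sizes and attribute counts and record a cost for each run. The sketch below uses a plain Lloyd's k-means on random data and takes wall-clock time multiplied by an assumed constant power draw as a stand-in for energy. That proxy, the constant `ASSUMED_POWER_WATTS`, and the function names are assumptions for illustration; the slides do not say how energy was measured on the devices.

```python
import random
import time

ASSUMED_POWER_WATTS = 1.0  # hypothetical constant device power draw


def kmeans(points, k, iterations=10, seed=0):
    """Plain Lloyd's k-means; returns the final list of centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        # Assignment step: group points by nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if the cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def sweep(instance_counts, attribute_counts, k=3, seed=0):
    """Run k-means for every (instances, attributes) pair and record cost."""
    rng = random.Random(seed)
    results = []
    for n in instance_counts:
        for d in attribute_counts:
            data = [[rng.random() for _ in range(d)] for _ in range(n)]
            start = time.perf_counter()
            kmeans(data, k, seed=seed)
            elapsed = time.perf_counter() - start
            results.append({
                "instances": n,
                "attributes": d,
                "seconds": elapsed,
                # Time-based proxy only; not a real energy measurement.
                "joules_estimate": elapsed * ASSUMED_POWER_WATTS,
            })
    return results


rows = sweep(instance_counts=[200, 400], attribute_counts=[2, 4])
```

On a real device the timing proxy would be replaced by readings from a power monitor or the platform's battery statistics.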

10 Data Mining Techniques
- Energy characterization of data mining techniques running on mobile devices
  - k-means (data clustering)
  - J48 (data classification)
  - Apriori (association rules)
- Common performance parameters
  - Number of instances (data set size)
  - Number of attributes
- Algorithm-specific performance parameters
  - k-means: number of clusters
  - J48: decision tree size
  - Apriori: number of rules, minimum support, and minimum confidence
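For readers unfamiliar with the Apriori parameters listed above, the following minimal sketch shows how the support of an itemset and the confidence of a rule are usually computed. The toy transactions and the function names are invented for illustration and are unrelated to the data sets used in the experiments.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of (antecedent and consequent) divided by support of antecedent."""
    joint = support(transactions, set(antecedent) | set(consequent))
    base = support(transactions, antecedent)
    return joint / base if base > 0 else 0.0


# Toy transactions, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

s = support(transactions, {"bread", "milk"})       # 2 of 4 transactions
c = confidence(transactions, {"bread"}, {"milk"})  # 2 of the 3 with bread
```

A rule is kept only when its support and confidence reach the chosen minimum thresholds, so raising either threshold reduces the number of rules the algorithm has to generate and evaluate.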

11 k-means (1)
- Increasing the number of instances, with different numbers of produced clusters

12 k-means (2)
- Increasing the number of attributes, with different numbers of produced clusters

13 Apriori (1)
- Increasing the number of instances, with different numbers of attributes

14 Apriori (2)
- Increasing the data set size, with different numbers of rules

15 Apriori (3)
- Increasing the data set size, with different minimum confidence values

16 J48
- Increasing the number of instances, with different numbers of attributes

17 Results on different devices
- Results obtained with different smartphones
  - Sony Xperia P: 1 GHz dual-core ARM processor and 1 GB RAM
  - HTC Hero: 528 MHz Qualcomm processor and 288 MB RAM

18 Results on different devices
- Results obtained with different smartphones
  - Sony Xperia P: 1 GHz dual-core ARM processor and 1 GB RAM
  - HTC Hero: 528 MHz Qualcomm processor and 288 MB RAM

19 Results on different devices
- Results obtained with different smartphones
  - Sony Xperia P: 1 GHz dual-core ARM processor and 1 GB RAM
  - HTC Hero: 528 MHz Qualcomm processor and 288 MB RAM
  - Samsung Galaxy ACE: 800 MHz Qualcomm processor and 512 MB RAM

20 Concluding Remarks
- Data-intensive applications demand energy cost models based on data characteristics.
- This should be done for sensors, smartphones, HPC servers, and clouds; in general, for large-scale computing systems.
- Sustainable data center services and applications may benefit from these models.
- Preliminary experiments produced useful data.

21 Data Sets
- Census (http://archive.ics.uci.edu/ml/datasets/Census+Income)
  - Used with k-means
  - Data set size: 14 MB
  - Number of instances: 244348
  - Number of attributes: 11
- Census_disc (http://archive.ics.uci.edu/ml/datasets/Census+Income)
  - Used with Apriori
  - Data set size: 19 MB
  - Number of instances: 333011
  - Number of attributes: 11
- Covertype (http://archive.ics.uci.edu/ml/datasets/Covertype)
  - Used with J48
  - Data set size: 14.5 MB
  - Number of instances: 114556
  - Number of attributes: 55
