Big Data Quality Challenges for the Internet of Things (IoT) Vassilis Christophides INRIA Paris (MUSE team)



The Internet of Things (IoT)
• Networks of physical objects (a.k.a. things) with embedded sensing and actuating capabilities that communicate with other objects and information systems
  – item identification (tagging things)
  – sensors (feeling things)
  – nanotechnology (shrinking things)
  – …
• “Things” are highly heterogeneous:
  – Small (RFID tag) or big (car)
  – Fixed (fridge) or mobile (activity tracker)
  – Environment-oriented (thermostat) or person-oriented (body analyzer)

Urban Computing

Quantified Self
• physical exercise
• sleep quality
• biochemical markers
• psychological state
• genetics

Big Data = Transactions + Interactions + Observations
[Figure: data scale (megabytes, gigabytes, terabytes, petabytes) against increasing data variety, velocity & veracity]
IoT devices are reporting even more personal data than humans are!
Source: hortonworks.com/blog/7-key-drivers-for-the-big-data-market

The 5+1 Vs of Big Data
• Variability*: data in change (evolving data distributions, models, etc.)
• Value: insights + understanding, automation + optimization
Adapted from www-05.ibm.com/fr/events/netezzaDM_2012/Solutions_Big_Data.pdf

IoT Data Value Chain
• Capture, track & monitor
• Transmit data to the external environment
• Ingest, store & integrate data (PROCESS)
• Analyze data, control & automate (ANALYSE)

IoT Data Quality (DQ)
• Agility:
  – Availability: almost real-time, any place
  – Accessibility: mostly in verticals, privacy constrained
• Relevance:
  – Fitness: depends on the granularity of observations & measurements in thematic, spatial & temporal dimensions
• Usability:
  – Trustworthiness: observations or measurements in the wild

IoT Data Quality (DQ)
• Reliability:
  – Accuracy: depends on device calibration & sensing method
  – Validity: depends on the resource constraints (connectivity, bandwidth, power, memory, storage & processing capabilities) of devices and data infrastructure
  – Completeness: due to data variety, complete domain knowledge is infeasible; due to data variability, domain knowledge quickly becomes obsolete
  – Integrity: usually relative to a collection of raw data series originating from different devices
Image credit: Ben Stansall/Agence France-Presse/Getty Images
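One possible reading of integrity "relative to a collection" of series is a cross-device consistency check. The sketch below is illustrative only: it assumes the devices are co-located and measure the same quantity at the same instant, and the function name `inconsistent_devices` and the `tolerance` parameter are hypothetical, not taken from the slides.

```python
from statistics import median


def inconsistent_devices(readings, tolerance):
    """Return the ids of devices whose reading disagrees with the collection.

    readings  : dict mapping device id -> value observed at the same time/place
                (assumption: all devices measure the same quantity)
    tolerance : maximum accepted absolute distance from the collection median
    """
    if not readings:
        return []
    # The median is used as the reference because it is not dragged
    # by a single faulty device the way the mean would be.
    center = median(readings.values())
    return sorted(d for d, v in readings.items() if abs(v - center) > tolerance)


if __name__ == "__main__":
    sample = {"a": 21.0, "b": 21.4, "c": 20.8, "d": 35.2}
    print(inconsistent_devices(sample, tolerance=2.0))
```

A median reference only makes sense when most devices in the collection are healthy; with few devices or correlated faults, a different reference would be needed.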

In Search of IoT DQ Solutions
• Let the data speak for itself!
  – Learn models (semantics) from the data that are robust to the presence of noise (and anomalies)
  – Detect deviations of data from learned models
  – Evolve learned models according to data deviations
• Computing with Big Data!
  – Volume: scalable algorithms (efficiency vs. accuracy)
  – Variety: looking at the condition and context of data deviations
  – Velocity: incremental and online algorithms
Source: B. Saha, D. Srivastava, “Data Quality: the ‘other’ Face of Big Data”, ICDE 2013
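The learn, detect and evolve loop can be sketched with an incremental model. This is a minimal illustration, not the method of the talk: it assumes a single numeric stream, models "normal" behavior with an exponentially weighted mean and variance, and flags readings more than `threshold` standard deviations away. The class name and the parameters `alpha`, `threshold` and `warmup` are hypothetical choices.

```python
import math


class OnlineDeviationDetector:
    """Incrementally learn a mean/variance model and flag deviating readings."""

    def __init__(self, alpha=0.1, threshold=3.0, warmup=5):
        self.alpha = alpha          # forgetting factor: higher adapts faster
        self.threshold = threshold  # deviation limit, in standard deviations
        self.warmup = warmup        # readings to see before flagging anything
        self.n = 0
        self.mean = 0.0
        self.var = 0.0

    def update(self, x):
        """Return True if x deviates from the learned model, then evolve the model."""
        self.n += 1
        if self.n == 1:
            self.mean = x
            return False
        std = math.sqrt(self.var)
        deviates = (self.n > self.warmup and std > 0
                    and abs(x - self.mean) > self.threshold * std)
        # A deviating reading is down-weighted so that one glitch barely moves
        # the model, while a persistent shift is still absorbed over time.
        a = self.alpha * (0.1 if deviates else 1.0)
        delta = x - self.mean
        self.mean += a * delta
        self.var = (1 - a) * (self.var + a * delta * delta)
        return deviates


if __name__ == "__main__":
    detector = OnlineDeviationDetector()
    stream = [20.0, 20.5, 19.5, 20.2, 19.8, 20.1, 19.9, 35.0, 20.0]
    print([detector.update(x) for x in stream])
```

Each reading is processed once in constant time and memory, which is the property the Velocity bullet asks for. The model cannot by itself tell a glitch from a meaningful event; it only reports that a reading is unusual.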

Towards DQ-aware IoT Analytics
• Analyze a single data stream:
  – How can we incrementally detect deviations from data regions of normal behavior?
  – How can we distinguish between data glitches, meaningful events or even malicious attacks?
  – What types of data deviations can be identified (distance, density, contextual) and at what granularity level?
• Analyze multiple data streams:
  – How can we compute online correlations across time/space in the case of missing or delayed data?
  – How can we progressively evolve extracted knowledge patterns (motifs, episodes)?
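As a rough illustration of the online-correlation question, the sketch below keeps a running Pearson correlation between two streams and tolerates missing or delayed readings. It assumes readings carry a shared timestamp, pairs them only when both streams have reported for that timestamp (pairwise deletion), and uses Welford-style updates; the class and method names are hypothetical.

```python
import math


class OnlinePairCorrelation:
    """Running Pearson correlation of streams "x" and "y", paired by timestamp."""

    def __init__(self):
        self.pending = {}  # timestamp -> (stream, value) waiting for its partner
        self.n = 0
        self.mean_x = self.mean_y = 0.0
        self.m2_x = self.m2_y = self.cov = 0.0

    def observe(self, stream, timestamp, value):
        """Record one reading; readings may arrive late or out of order."""
        other = self.pending.get(timestamp)
        if other is None or other[0] == stream:
            # Partner not seen yet (or a repeat from the same stream): wait.
            self.pending[timestamp] = (stream, value)
            return
        del self.pending[timestamp]
        x, y = (value, other[1]) if stream == "x" else (other[1], value)
        self._add_pair(x, y)

    def _add_pair(self, x, y):
        # Welford-style incremental update of means and co-moments.
        self.n += 1
        dx = x - self.mean_x
        dy = y - self.mean_y
        self.mean_x += dx / self.n
        self.mean_y += dy / self.n
        self.m2_x += dx * (x - self.mean_x)
        self.m2_y += dy * (y - self.mean_y)
        self.cov += dx * (y - self.mean_y)

    def correlation(self):
        """Return the correlation so far, or None if it is not yet defined."""
        if self.n < 2 or self.m2_x == 0 or self.m2_y == 0:
            return None
        return self.cov / math.sqrt(self.m2_x * self.m2_y)


if __name__ == "__main__":
    c = OnlinePairCorrelation()
    for t, v in [(1, 1.0), (2, 2.0), (3, 3.0), (4, 4.0)]:
        c.observe("x", t, v)
    # y arrives late and out of order, and never reports for t=3.
    for t, v in [(4, 8.0), (1, 2.0), (2, 4.0)]:
        c.observe("y", t, v)
    print(c.n, c.correlation())
```

Unmatched readings stay in `pending` indefinitely in this sketch; a real deployment would need to evict entries after some maximum delay.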

Key Analytics to Delivering Value in IoT
Source: chain-key-to-delivering-business-value-in-iot

Thank you!

The Three Domains of Information
Source: Barry Devlin, “The Big Data Zoo --- Taming the Beasts

Computing with Things: Challenges
• Things are different from servers in a data center: they are used in the wild, and they are often constrained by limited connectivity, bandwidth, power, memory, storage & processing capabilities
• Things are different from UI clients: they usually have no on-board UI, and they follow an M2M communication paradigm more than a UI client-to-server interaction paradigm
• Things may communicate directly with peers: not all communication is thin-client traffic to a parent server in the cloud, and the hub-and-spoke model presents serious limitations for very large numbers of devices

Image source: re-workblog.tumblr.com
