Sensor Data Management Egemen Tanin Department of Computer Science and Software Engineering University of Melbourne.

Goals & Fundamental Observations
- Goal: Improve sensor network lifetime AND:
  - Maintain the current DBMS abstraction and facilities while introducing algorithms to run queries efficiently
  - Add new capabilities for emerging applications, such as summarization of data to get rid of irrelevant data
- Observations:
  - Communication in sensors is much more energy hungry than computation
  - Sensor networks are made up of simple devices with no extensive data storage

Additional challenges
- The overall system is very volatile
  - Changes in environment conditions can render readings inaccessible
  - Failure of nodes cannot be easily fixed
  - Nodes can run low on power over time
- Data is dynamic
  - New data is being appended all the time
- Serving multiple queries concurrently is problematic
  - Sensors are physically limited in what they can observe at a given time

Fundamental Approaches
- Collect all the sensor data to one or more data centers
- Use a classical DBMS
  - Energy inefficiencies due to redundant data collection, a central point of failure, and hot spots near the root; data has to be collected at the highest frequency for all potential queries, all the time
  - Current DBMSs are not fast enough for high-update applications
  - Many facilities are redundant: RDBMSs were built 25+ years ago
  - Lack of certain convenient operations, e.g., continuous queries
- Rebuild a DBMS for sensor networks and fix some of the problems in a central setting?
  - Still energy inefficient due to centralization

Approaches Contd.
- In-network storage and processing, along with a capability to inject and collect data from anywhere in the network for any number of centers
- Already implied by communication costs dominating the computation costs in the network
- But storage limitations require eliminating some data
- Fundamentally different from current commercial RDBMSs

Query Classification for Sensor Networks
- Continuous queries: commonly span a long period of time
- Snapshot queries: collect data about now or some other point in time
- Historical queries: collect summary data about the past

Additional Operators
- Use of only some of the sensors
- Aggregation of data from multiple sensors
- Correlation of data from sensors

Example Query
SELECT min(humidity), town
FROM sensors
WHERE state = 'Queensland'
GROUP BY town
HAVING max(temperature) > 30
DURATION [now, now min]
SAMPLING PERIOD 30 min
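As a rough illustration of what the example query asks for, the sketch below evaluates its WHERE, GROUP BY, HAVING, and SELECT parts over one sampling period of readings held in memory. The function name `run_query`, the record layout, and the sample towns are assumptions made for this example; the DURATION and SAMPLING PERIOD clauses are not modeled.

```python
from collections import defaultdict


def run_query(readings, state="Queensland", threshold=30):
    """Evaluate the example query over one sampling period (hypothetical helper)."""
    groups = defaultdict(list)
    for r in readings:
        if r["state"] == state:            # WHERE state = 'Queensland'
            groups[r["town"]].append(r)    # GROUP BY town
    result = {}
    for town, rows in groups.items():
        # HAVING max(temperature) > 30
        if max(x["temperature"] for x in rows) > threshold:
            # SELECT min(humidity), town
            result[town] = min(x["humidity"] for x in rows)
    return result


# Made-up readings, for illustration only.
readings = [
    {"town": "Cairns", "state": "Queensland", "temperature": 33, "humidity": 70},
    {"town": "Cairns", "state": "Queensland", "temperature": 29, "humidity": 64},
    {"town": "Toowoomba", "state": "Queensland", "temperature": 24, "humidity": 50},
    {"town": "Geelong", "state": "Victoria", "temperature": 35, "humidity": 40},
]
result = run_query(readings)
print(result)
```

In a sensor network the same grouping and filtering would ideally happen inside the network rather than at a center, which is the point of the approaches that follow.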

Extending SQL
- Example: Cougar Sensor Network Database System (by Yong Yao and Johannes Gehrke)
- Uses an SQL-like interface
- After in-network processing, data is fed to a center
- Optimizes for both resource usage and reaction time
- Assumes that sensors are time synchronized
- Each type of sensor is represented as an Abstract Data Type (ADT)
- Each sensor is then an object of that ADT
- Relations are virtual and append-only

Cougar Contd.
- Has SELECT, FROM, WHERE, GROUP BY, HAVING, DURATION, and EVERY clauses
- Now extended with Gaussian ADTs (GADTs) to run probabilistic queries, since sensors collect noisy data from physical phenomena:
SELECT *
FROM sensors
WHERE sensor.temp.prob([10,20]) >= 0.6
'Get the temperature data from sensors if it is within ±5 of 15 degrees with at least 60 percent probability'
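The probabilistic predicate can be made concrete with a small sketch. It assumes each sensor's temperature is modeled as a Gaussian with a known mean and standard deviation, and it computes the probability mass inside [10, 20] from the normal CDF. The function names and the sample sensors are hypothetical; this is not Cougar's implementation.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of a Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def prob_in_range(mean, std, low, high):
    """Probability that a Gaussian reading falls inside [low, high]."""
    return normal_cdf(high, mean, std) - normal_cdf(low, mean, std)


def select_probable(sensors, low=10.0, high=20.0, min_prob=0.6):
    """Ids of sensors whose temperature lies in [low, high] with probability >= min_prob."""
    return [
        s["id"]
        for s in sensors
        if prob_in_range(s["mean"], s["std"], low, high) >= min_prob
    ]


# Made-up sensor models: (mean, std) of the temperature estimate.
sensors = [
    {"id": "s1", "mean": 15.0, "std": 2.0},   # tightly centered in the range
    {"id": "s2", "mean": 15.0, "std": 10.0},  # centered but very noisy
    {"id": "s3", "mean": 21.0, "std": 1.0},   # mostly above the range
]
selected = select_probable(sensors)
print(selected)
```

Note that a sensor centered on 15 degrees can still fail the predicate when its noise is large, which is exactly the distinction a plain range predicate cannot express.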

Execution Steps
- Broadcast the query to the network
- Collect data back
- Not all data may be relevant and summarization of data may be utilized
- Further analysis on a central system can be done if needed later
- Note: Either a human or an automated system can be the origin of the query

Data Collection
- Energy x Delay is the main composite metric
- Methods:
  - Direct Independent Transmission
  - PEGASIS (Power-Efficient Gathering for Sensor Information Systems)
  - Binary Chain-based Scheme
  - Chain-based Three-level Scheme
  - Directed Diffusion
  - Tree-based Schemes
  - Multi-path Schemes
  - Hybrids

Direct Independent Transmission
- Each node transmits to a center independently
- Very energy inefficient
- Nodes must watch out for collision and take turns
- Hence the last message can be transmitted after a significant delay
- First response may be very fast

PEGASIS
- By Stephanie Lindsey and Cauligi Raghavendra
- Assumes all nodes know the location of every other node
- All nodes should be able to transmit data to the center in one hop
- A greedy algorithm is used to construct a chain of sensor nodes starting farthest from the center
- The chain is formed a priori
- After every hop, data aggregation can be done
- Leadership is transferred sequentially
- May be energy efficient but delay is O(n)
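The greedy chain construction can be sketched as follows, assuming global knowledge of node positions as the slide states. The chain starts at the node farthest from the center and repeatedly appends the nearest node not yet on the chain; data is then folded toward a leader from both ends. The names `build_chain` and `gather`, the coordinates, and the use of `max` as the aggregate are assumptions for illustration only.

```python
import math


def build_chain(nodes, center):
    """Greedy chain: start farthest from the center, then always add the nearest unvisited node."""
    remaining = dict(nodes)
    current = max(remaining, key=lambda n: math.dist(remaining[n], center))
    chain = [current]
    pos = remaining.pop(current)
    while remaining:
        nxt = min(remaining, key=lambda n: math.dist(remaining[n], pos))
        chain.append(nxt)
        pos = remaining.pop(nxt)
    return chain


def gather(chain, readings, leader, combine=max):
    """Fold readings along the chain toward the leader; return (aggregate, hop count)."""
    i = chain.index(leader)
    aggregate = readings[leader]
    # One pass from the start of the chain, one from the far end, both toward the leader.
    for side in (chain[:i], chain[i + 1:][::-1]):
        partial = None
        for node in side:
            value = readings[node]
            partial = value if partial is None else combine(partial, value)
        if partial is not None:
            aggregate = combine(aggregate, partial)
    # Every non-leader node transmits exactly once, so hops grow linearly with n.
    return aggregate, len(chain) - 1


nodes = {"a": (0, 0), "b": (1, 0), "c": (2, 0), "d": (3, 0)}
chain = build_chain(nodes, center=(10, 0))
readings = {"a": 3, "b": 9, "c": 4, "d": 1}
print(chain, gather(chain, readings, leader="c"))
```

The linear hop count is where the O(n) delay on the slide comes from: aggregation saves energy, but messages still travel node by node along the chain.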

PEGASIS
[Figure: PEGASIS chain. Labels: Sensors, Start, End, Leader, To Center]

Binary Chain-based Scheme
- By the same authors as PEGASIS
- It is a chain-based scheme like PEGASIS
- Nodes are classified into levels
- All nodes receiving a message at one level rise to the next level
- At each level, the number of nodes is halved
- This is a CDMA-only scheme (to prevent collisions)
- Delay is O(log n)
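A minimal sketch of the halving idea: at each step, neighboring nodes on the chain pair up, one transmits, and the receiver rises to the next level with the combined value. Simultaneous transmissions are assumed to be collision free (the CDMA assumption on the slide). The function name `binary_gather` and the choice of `max` as the aggregate are assumptions for this example.

```python
def binary_gather(chain, readings, combine=max):
    """Aggregate along a chain by pairwise halving; return (aggregate, number of steps)."""
    level = [(node, readings[node]) for node in chain]
    steps = 0
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                _sender, sent = level[i]
                receiver, held = level[i + 1]
                # The receiver combines the incoming value and rises one level.
                next_level.append((receiver, combine(sent, held)))
            else:
                # An unpaired node rises unchanged.
                next_level.append(level[i])
        level = next_level
        steps += 1
    return level[0][1], steps


chain = list("abcdefgh")
readings = {node: value for value, node in enumerate(chain, start=1)}
print(binary_gather(chain, readings))
```

With 8 nodes the loop runs 3 times, matching the O(log n) delay claimed for the scheme.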

Binary Chain-based Scheme Contd.
[Figure labels: Step 1, Step 2, Step 3, Step 4, To Center]

Chain-based Three-level Scheme
- By the same authors as PEGASIS
- The binary scheme does not work in non-CDMA settings
- Again, a chain is formed, as in PEGASIS, but the network is partitioned into groups that are far away from each other to allow simultaneous transmissions
- Within a group, nodes transmit at the same time
- One node of the group aggregates and goes to the next level
- In the next level, all nodes are divided into two groups
- Finally, all send to one node, which sends to a center

Directed Diffusion
- By Chalermek Intanagonwiwat, Ramesh Govindan, and Deborah Estrin
- Consists of:
  - Interest propagation
    E.g., location=[(100,100),(10, 200)], temperature=[10,20]
  - Gradient setup
  - Data delivery along reinforced path
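The three phases can be illustrated with a heavily simplified sketch. An interest is flooded from the sink, each node keeps a single gradient pointing at the neighbor from which it first heard the interest, and matching data follows the gradients back. Real directed diffusion keeps gradients to several neighbors and reinforces a path based on observed data, which is not modeled here. All names, the topology, and the reading format are assumptions.

```python
from collections import deque


def matches(interest, reading):
    """True if a reading falls inside the interest's rectangle and temperature range."""
    (x1, y1), (x2, y2) = interest["location"]
    x, y = reading["pos"]
    low, high = interest["temperature"]
    return (
        min(x1, x2) <= x <= max(x1, x2)
        and min(y1, y2) <= y <= max(y1, y2)
        and low <= reading["temperature"] <= high
    )


def propagate_interest(neighbors, sink):
    """Flood the interest; each node's gradient is the neighbor it first heard it from."""
    gradient = {sink: None}
    queue = deque([sink])
    while queue:
        node = queue.popleft()
        for nb in neighbors[node]:
            if nb not in gradient:
                gradient[nb] = node
                queue.append(nb)
    return gradient


def deliver(gradient, source):
    """Follow gradients from a source back to the sink and return the path taken."""
    path = [source]
    while gradient[path[-1]] is not None:
        path.append(gradient[path[-1]])
    return path


interest = {"location": [(100, 100), (10, 200)], "temperature": [10, 20]}
reading = {"pos": (50, 150), "temperature": 15}
neighbors = {
    "sink": ["a", "b"],
    "a": ["sink", "c"],
    "b": ["sink", "c"],
    "c": ["a", "b", "src"],
    "src": ["c"],
}
gradient = propagate_interest(neighbors, "sink")
if matches(interest, reading):
    print(deliver(gradient, "src"))
```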

Directed Diffusion Contd.

Tree-based Schemes
- A routing tree rooted at a base station is used
- The tree used to distribute the query is also used to collect the data
- Example: TinyDB (by Samuel Madden, Michael Franklin, Joseph Hellerstein, and Wei Hong)

TinyDB Contd.
- Uses an epoch-based mechanism
- Main disadvantage is that it can lose large subtrees/data due to a central point of failure
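To show both the benefit and the weakness of tree-based collection, the sketch below aggregates an average up a routing tree by passing (sum, count) partial states from children to parents, and then repeats the run with one failed interior node. The function name `aggregate`, the tree, and the readings are assumptions; epochs and TinyDB's actual protocol are not modeled.

```python
def aggregate(tree, readings, node, failed=frozenset()):
    """Return the (sum, count) partial state of the subtree at node, or None if node failed."""
    if node in failed:
        return None  # nothing below a failed node reaches its parent
    total, count = readings[node], 1
    for child in tree.get(node, []):
        part = aggregate(tree, readings, child, failed)
        if part is not None:
            total += part[0]
            count += part[1]
    return total, count


# Hypothetical routing tree rooted at the base station's neighbor "root".
tree = {"root": ["a", "b"], "a": ["c", "d"], "b": []}
readings = {"root": 20, "a": 22, "b": 24, "c": 26, "d": 28}

healthy = aggregate(tree, readings, "root")
degraded = aggregate(tree, readings, "root", failed={"a"})
print(healthy, healthy[0] / healthy[1])
print(degraded, degraded[0] / degraded[1])
```

Each node sends one small partial state regardless of subtree size, but losing node "a" silently drops the readings of "c" and "d" as well, which is the subtree-loss problem noted above.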

Extensions
- Report data only if it has changed from the previous report, or consider whether a re-report will affect the final aggregation at all
- Adapting to changing conditions in the network
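The change-based reporting rule can be sketched as a sensor that stays silent unless its reading has moved away from the last value it reported. The class name `SuppressingSensor` and the tolerance `threshold` are assumptions added for illustration; the slide only says "changed", which corresponds to a threshold of zero.

```python
class SuppressingSensor:
    """Reports a reading only when it differs enough from the last reported value."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold      # assumed tolerance, not from the slides
        self.last_reported = None

    def sample(self, value):
        """Return the value if it should be transmitted, otherwise None."""
        if self.last_reported is None or abs(value - self.last_reported) > self.threshold:
            self.last_reported = value
            return value
        return None


sensor = SuppressingSensor(threshold=0.5)
samples = [20.0, 20.2, 20.4, 21.0, 21.1]
reported = [v for v in (sensor.sample(s) for s in samples) if v is not None]
print(reported)
```

Since communication dominates energy use, every suppressed report is a direct saving, at the cost of the center working with slightly stale values.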

Multi-path Schemes
- To prevent failures, the same sensor value can be sent along multiple paths
- The main disadvantage is that the final value may now be an approximation rather than an exact value
- E.g., the scheme by Suman Nath, Philip Gibbons, Srinivasan Seshan, and Zachary Anderson
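Why multi-path delivery affects exactness can be seen in a toy example. This is not the cited authors' scheme; it only shows that duplicated copies leave a duplicate-insensitive aggregate such as MAX untouched while a naive SUM is overcounted. The function name and the readings are assumptions.

```python
def deliver_multipath(readings, paths_per_node):
    """Return every (node, value) copy that arrives when each value travels on several paths."""
    arrived = []
    for node, value in readings.items():
        arrived.extend((node, value) for _ in range(paths_per_node))
    return arrived


readings = {"a": 3, "b": 7, "c": 5}
copies = deliver_multipath(readings, paths_per_node=2)

naive_sum = sum(value for _, value in copies)   # duplicates are counted twice
max_value = max(value for _, value in copies)   # unaffected by duplicates
dedup_sum = sum(dict(copies).values())          # exact, but needs per-node ids
print(naive_sum, max_value, dedup_sum)
```

Exact de-duplication requires carrying node ids all the way to the base station, which defeats in-network aggregation; summaries that tolerate duplicates without ids are generally approximate, hence the disadvantage noted above.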

Hybrids
- E.g., by Amit Manjhi, Suman Nath, and Philip Gibbons
- Benefits of both a tree mechanism and a multi-path mechanism
[Figure labels: Base Station, Tree, Multi-path]

Storing Data versus Data Collection
- Rather than collecting data from individual sensors for every given query, sensors can be made to store their data in the network for point retrieval at a later time
- Similar to creating rendezvous points

Geographic Hash Tables (GHTs)
- By Sylvia Ratnasamy, Brad Karp, Li Yin, Fang Yu, Deborah Estrin, Ramesh Govindan, and Scott Shenker
- Assumes each node knows its location
- Limited to point queries
- Hashes keys to geographic locations
- Stores a key-value pair on the sensor closest to the location
- Geographic routing is used to access this data with a key later on
- Replication on nearby nodes can be used for load sharing and failure resistance
- Regions of data, rather than individual sensor readings, can also be hashed as an extension
- The idea is, in general, similar to publish-subscribe
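The put/get behavior described above can be sketched in a few lines. A key is hashed to a point in the deployment area and the pair is stored on the node nearest that point, so any node can later find the data from the key alone. The hash construction, the 100 x 100 field, the node layout, and the class name `GHT` are assumptions; a global nearest-node search stands in for geographic routing, and replication is not modeled.

```python
import hashlib
import math


def hash_to_location(key, width=100.0, height=100.0):
    """Map a key to a deterministic point inside the deployment area."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    x = int.from_bytes(digest[:4], "big") / 2**32 * width
    y = int.from_bytes(digest[4:8], "big") / 2**32 * height
    return (x, y)


def home_node(key, nodes):
    """The node closest to the key's hashed location (stand-in for geographic routing)."""
    target = hash_to_location(key)
    return min(nodes, key=lambda n: math.dist(nodes[n], target))


class GHT:
    """Toy geographic hash table over a fixed set of node positions."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.storage = {name: {} for name in nodes}

    def put(self, key, value):
        self.storage[home_node(key, self.nodes)].setdefault(key, []).append(value)

    def get(self, key):
        return self.storage[home_node(key, self.nodes)].get(key, [])


nodes = {"n1": (10, 10), "n2": (90, 10), "n3": (10, 90), "n4": (90, 90), "n5": (50, 50)}
ght = GHT(nodes)
ght.put("elephant-sighting", {"seen_by": "n2", "time": 17})
ght.put("elephant-sighting", {"seen_by": "n4", "time": 42})
print(home_node("elephant-sighting", nodes), ght.get("elephant-sighting"))
```

Because the hash scatters nearby keys across the field, a lookup works only for an exact key, which is why the scheme is limited to point queries.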

GHTs Contd.
[Figure labels: Storage, Query Source, Query Source, Data Source, x, x]

Range Queries
- GHTs do not work for range queries
- An approach similar to the Binary Chain-based Scheme can be used in one-dimensional settings, but with storage, rather than collection, as the goal
[Figure: Road Sensors a through j]

Multidimensional Indexing
- For multidimensional indexing, we can use:
  - Grid files with multidimensional range hashing
  - Quadtrees with block hashing
- It is less clear how to map R-trees or k-d trees using hashing
- In general, research on this front is in its infancy
- Load balancing and minimizing communication overhead are critical issues

DIMENSIONS System
- By Deepak Ganesan, Ben Greenstein, Denis Perelyubskiy, Deborah Estrin, and John Heidemann

DIFS System
- By Benjamin Greenstein, Deborah Estrin, Ramesh Govindan, Sylvia Ratnasamy, and Scott Shenker
- A multi-rooted method
- Nodes hold histograms
- Even load distribution
- I.e., we have many roots

Fractional Cascading
- By Jie Gao, Leonidas Guibas, John Hershberger, and Li Zhang
- Requests are commonly local, i.e., from a given node
- GHTs can store data afar
- Hence: keep a fraction of distant data and keep detailed local data (use exponential decay)

Locality Preserving Hashing: DIM system
- By Xin Li, Young Jin Kim, Ramesh Govindan, and Wei Hong

Additional Issues: Data Aging
- Algorithms are needed for summarizing aging data on sensors
- E.g., DIMENSIONS uses a monotonically decreasing function to discard data over time by creating new summaries

Summary and Future Directions
- In-network processing is gaining momentum
  - Either collect data in an efficient manner
  - Or store data by creating good rendezvous-based mechanisms
- Complex data aggregation mechanisms for sophisticated data analysis are commonly cited as a good research direction
- Subquery generation and subquery trading are also a good research direction
- Indexing with complex query processing is also in its infancy