Queries over Streaming Sensor Data. Sam Madden, DB Lunch, October 12, 2001.



Outline: Background; Server Side Solutions (Fjords, Sensor Proxies, CACQ); Sensor Side Solutions (Catalog Management, Aggregation); Future Work

Background: Sensor Networks

Sensor Networks: Small, low-cost, battery-powered microprocessors with 1–4 sensors (light, temperature, vibration, acceleration, AC power, humidity). 10 kBit – 1 Mbit wireless networks, 100 ft range. “Ad-hoc” networking: no predefined routes. The OS and networking communities at Cal, MIT, and UCLA are committed.

SmartDust: Sensor nets are motivated by the “SmartDust Vision”: millimeter-scale microprocessors, sensors, and wireless communication for pennies. Deployed in thousands, with no concern for the reliability of a single sensor. Requires position detection, fault tolerance, aggregation, etc.

Rene / Mica Motes: SmartDust stand-in, ~2 cm x 3 cm, OTS. Processor: Atmel 8535, 4 MHz, 5 mA. Radio: RFM TR [...] MHz, 10 kBits, ~25 mJ/msg, [...] msg/sec. Memory: 512 B RAM, 8k Flash, 32k EEPROM (Flash R/O, EEPROM slow). Power: 575 mAh battery; peak load 19.5 mA, idle 3.1 mA, sleeping 10 uA.

TinyOS: Lightweight OS for sensors. Event-based. Active-message, multi-hop networking. Auto-idling. Network reprogramming, time synchronization, etc. [18] J. Hill, R. Szewczyk, A. Woo, S. Hollar, D. Culler, and K. Pister. System architecture directions for networked sensors. In Proceedings of the 9th International Conference on Architectural Support for Programming Languages and Operating Systems, November 2000.

Applications of Sensor Nets: Space monitoring (power, light, temp in buildings; temperature, humidity); traffic; military; structural; personal networks.

Database Opportunities: All applications depend on data processing. A declarative query language over sensors is attractive. Want “to combine and aggregate data streaming from motes.” Sounds like a database…

Database Challenges: Sensors are unreliable (they come on and offline, with variable bandwidth). Sensors push data. Sensors stream data. Sensors have limited memory, power, and bandwidth. Sensors have processors.

Outline: Background; Server Side Solutions (Fjords, Sensor Proxies, CACQ); Sensor Side Solutions (Catalog Management, Aggregation); Future Work

Fjords: A query plan abstraction to handle lack of reliability and streaming, push-based data. Combine push and pull in arbitrary combinations. Use connectors between operators to isolate them from flow direction. “Bracket Model” (Graefe ’93).

Fjords (Continued): Operators assume a non-blocking queue interface between each other. Queues implement push vs. pull. Pull from A to B: suspend A, schedule B until it produces data; A cannot go forward until B produces data. Push from B to A: A polls, and a scheduler thread invokes B until it produces data; A can process other inputs while waiting for B. Supports parallelism between operators via queues, state machines, and the OS (e.g. NIC buffers, DMA) in an operator-transparent way.
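
The push and pull connectors described above can be sketched roughly as follows. This is a minimal illustration, not the Fjords implementation: the names `PushQueue`, `PullQueue`, `make_scan`, and `merge_operator` are hypothetical, and the "scheduler" is reduced to a direct call into the producer.

```python
from collections import deque


class PushQueue:
    """Push connector: the producer appends when it has data; the consumer polls."""

    def __init__(self):
        self._buf = deque()

    def put(self, tup):
        self._buf.append(tup)

    def get(self):
        # Non-blocking: return None when nothing has arrived yet.
        return self._buf.popleft() if self._buf else None


class PullQueue(PushQueue):
    """Pull connector: get() runs the producer until it yields a tuple."""

    def __init__(self, producer_step):
        super().__init__()
        # producer_step(queue) puts one tuple and returns True, or returns False when done.
        self._step = producer_step

    def get(self):
        while not self._buf:
            if not self._step(self):
                return None  # producer exhausted
        return self._buf.popleft()


def make_scan(rows):
    """Hypothetical stored-data producer that emits one row per step."""
    it = iter(rows)

    def step(queue):
        for row in it:
            queue.put(row)
            return True
        return False

    return step


def merge_operator(pull_q, push_q, rounds):
    """Consumer that polls a sensor input and pulls from a stored input."""
    out = []
    for _ in range(rounds):
        s = push_q.get()  # may be None: the sensor is slow, so move on
        if s is not None:
            out.append(("sensor", s))
        r = pull_q.get()
        if r is not None:
            out.append(("table", r))
    return out
```

The operator code is the same for both inputs; only the queue type decides whether data is pushed or pulled, which is the isolation from flow direction that the slide describes.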

Fjords Example [Figure: query plan whose operators are connected by Push and Pull queues]. Samuel Madden, Michael J. Franklin. Fjording the Stream: An Architecture for Queries over Streaming Sensor Data. International Conference on Data Engineering, to appear, February 2002.


Fjords Applications Combine traffic streams with web-based accident reports Francis Li, Sam Madden, Megan Thomas. Traffic Visualization.

Operators for Streaming Data: Need special operators for dealing with streams (see P. Seshadri, et al. The design and implementation of a sequence database system. VLDB ’96). In particular, streams can’t be joined or sorted in the traditional sense. Solution: use windows, e.g. “Zipper Join”.
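
To illustrate why windows make a stream join feasible, here is a sketch of a generic sliding-window equality join. It is not claimed to be the “Zipper Join” named on the slide; the function name `window_join`, the `(timestamp, key, value)` tuple layout, and the window rule are assumptions made for the example.

```python
import heapq
from collections import deque


def window_join(left, right, window):
    """Join two timestamp-ordered streams of (timestamp, key, value) tuples.

    Two tuples match when their keys are equal and their timestamps differ
    by at most `window`. State is bounded because old tuples are evicted.
    """
    recent = (deque(), deque())  # recent tuples seen on each input
    merged = heapq.merge(
        ((ts, 0, k, v) for ts, k, v in left),
        ((ts, 1, k, v) for ts, k, v in right),
    )
    for ts, side, k, v in merged:
        # Evict tuples that have fallen out of the window on both inputs.
        for buf in recent:
            while buf and buf[0][0] < ts - window:
                buf.popleft()
        # Probe the other input's window for matching keys.
        for _ots, ok, ov in recent[1 - side]:
            if ok == k:
                yield (k, v, ov) if side == 0 else (k, ov, v)
        recent[side].append((ts, k, v))
```

An unbounded stream never ends, so a traditional join would have to buffer everything; the window turns that into a fixed amount of state per input.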

Sensor Proxy: An energy-sensitive database operator. Buffers sensor tuples and routes them to multiple user queries to hide query load from sensors. Pushes aggregation operators into sensors to reduce communications load. Dynamically adjusts sample rate based on user demand. Pushes results into Fjords so that other operators don’t block waiting on slow or dead sensors.

Some Results: Pushing predicates into sensors can vastly reduce costs. Atmel simulator, 100 samples / sec, 5 vehicles / sec: 7x power savings.

CACQ: Expect hundreds to thousands of queries over the same sensor sources. CACQ = Continuously Adaptive Continuous Queries. Continuous Queries: long-running queries which combine selections and joins to improve efficiency (see Chen, NiagaraCQ, SIGMOD 2000). [Figure: Query 1 and Query 2 apply separate selections (Stocks.symbol = ‘MSFT’, Stocks.symbol = ‘APPL’) to Stock Quotes; the combined plan applies one shared selection for ‘MSFT’ and ‘APPL’ to a single Stock Quotes stream]
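
The idea of combining the selections of many queries can be sketched as a shared index over the constants the queries compare against. This is only an illustration of the sharing idea under assumed names (`SharedEqualityFilter`, `register`, `route`); it is not the CACQ or NiagaraCQ implementation.

```python
from collections import defaultdict


class SharedEqualityFilter:
    """One filter serving many queries of the form `attr = constant`."""

    def __init__(self, attr="symbol"):
        self.attr = attr
        self._by_value = defaultdict(set)  # constant -> ids of queries selecting it

    def register(self, query_id, value):
        self._by_value[value].add(query_id)

    def route(self, tup):
        # A single hash lookup replaces one predicate evaluation per query.
        return set(self._by_value.get(tup[self.attr], ()))
```

With hundreds of queries over the same stream, each incoming tuple costs one lookup instead of one comparison per registered query.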

CACQ (Cont.): Continuous adaptivity from Eddies. Route tuples differently, depending on selectivity and cost estimates of operators. [Figure: static dataflow vs. eddy]
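
A rough sketch of routing by selectivity and cost follows. It uses a simplified greedy ranking (cost divided by observed drop rate) as a stand-in; it is not the routing policy of the Eddies work, and the names `Op` and `eddy` are hypothetical.

```python
class Op:
    """A filter operator that tracks how many tuples it has seen and passed."""

    def __init__(self, name, pred, cost=1.0):
        self.name, self.pred, self.cost = name, pred, cost
        self.seen = 0
        self.passed = 0

    def pass_rate(self):
        # Smoothed estimate so an unseen operator starts at 0.5.
        return (self.passed + 1) / (self.seen + 2)

    def apply(self, tup):
        ok = bool(self.pred(tup))
        self.seen += 1
        self.passed += ok
        return ok


def eddy(tuples, ops):
    """Route each tuple through the operators in an order chosen per tuple."""
    out = []
    for tup in tuples:
        # Cheap operators that drop the most tuples are tried first.
        order = sorted(ops, key=lambda op: op.cost / (1.0 - op.pass_rate()))
        if all(op.apply(tup) for op in order):  # stops at the first failure
            out.append(tup)
    return out
```

Because the order is recomputed from running statistics, the plan adapts as selectivities drift, which a static dataflow fixed at optimization time cannot do.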

CACQ (cont.): Combining CA with CQ is a win. CQ increases the number of simultaneous queries. Adaptivity is well suited to long-running queries. Eddies allow us to avoid the ugly query-optimization phase in traditional CQ. Eddies + Streams == few copies, unlike traditional CQ.

CACQ (cont) Look for a paper in SIGMOD 2002 (fingers crossed!)

Outline: Background; Server Side Solutions (Fjords, Sensor Proxies, CACQ); Sensor Side Solutions (Catalog Management, Aggregation); Future Work

Sensor Side Solutions: CACQ + Fjords provide interface + performance on QP, but sensors still need help: locate / identify sensors; reduce power consumption; take advantage of processors?; improve responsiveness.

Cataloging Sensors: To query sensors, we need a way to locate them, identify their properties, and extract values. Goal: drop a bunch of sensors around the DBMS and allow them to be queried without manual effort. Idea: add a layer to each sensor which advertises its capabilities.

Catalog (Continued)
    # temperature sensor
    field {
        name : "temp"                  # optional
        type : int
        units : celsius
        min : -20
        max : 100
        bits : 8
        sample_cost : 10.0 J           # optional -- for use in costing
        sample_time : 10.0 ms          # optional -- for use in costing
        input : adc2                   # optional : read from adc channel 1
        sends : ondemand
        accessorEvent : GET_TEMPERATURE_DATA
        responseEvent : TEMPERATURE_DATA_READY
    }
Compiled in 27 bytes of memory. Layer to register with Telegraph. Can be “push” or “pull”.

Aggregating Over Sensors: The Sensor Proxy combines user queries and pushes down aggregates. Goal: save energy, increase efficiency. Idea: take advantage of the routing hierarchy (example soon!).

Why bother with aggregation? Individual sensor readings are of limited use. Interest is in higher-level properties, e.g. what vehicles drove through, or what the spread of temperatures in the building is. We have a processor and network on board; let's use them. We cannot survive without aggregation: delivering a message to all nodes is much easier than delivering a message from each node to a central point, and delivering a large amount of data from every node is harder still (see the connectivity experiment). Forwarding raw information is too expensive: scarce energy, scarce bandwidth, multihop performance penalty.

Aggregation challenges: Inherently unreliable environment; certain information is unavailable or expensive to obtain. How many nodes are present? How many nodes are supposed to respond? What is the error distribution (in particular, what about malicious nodes)? Trying to build an infrastructure that removes all uncertainty from the application may not be feasible: do we want to build distributed transactions? Information trickles in one message at a time; we never have complete and up-to-date information about the neighborhood. What type of information should we expect from aggregation? Streams; robust estimates.

Scenario: Count. Goal: Count the number of nodes in the network. The number of children is unknown. [Figure: animation of partial counts (including ½ shares) reported by each sensor; axes: Sensor # vs. Time]

Counting Lessons: Take advantage of redundancy to improve accuracy (reply to all parents, not just one). Use broadcast to reduce the number of messages. The result is a stream of values, which is much more robust to failures, movement, or collision than a single value.
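
The counting scenario can be simulated in a few lines. This sketch assumes one reading of the slides: every node reports in each epoch, and a node with several parents splits its partial count evenly among them so that nothing is counted twice. The function name `count_stream` and the topology encoding are hypothetical.

```python
from fractions import Fraction


def count_stream(parents, root, epochs):
    """Return the root's count estimate after each epoch.

    parents maps each node to the list of nodes it reports to
    (an empty list for the root).
    """
    nodes = list(parents)
    heard = {n: {} for n in nodes}  # heard[p][child] = last share received from child
    stream = []
    for _ in range(epochs):
        # A node's estimate is itself plus the latest share heard from each child.
        est = {n: 1 + sum(heard[n].values(), Fraction(0)) for n in nodes}
        stream.append(est[root])
        for n in nodes:
            for p in parents[n]:
                # Reply to all parents, splitting the count between them.
                heard[p][n] = est[n] / len(parents[n])
    return stream
```

For a root 0 with children 1 and 2, and a node 3 that reports to both 1 and 2, the root's estimates form a stream that climbs to the true count and then stays there. If a message is lost, the next epoch repairs the estimate, which is the robustness the slide attributes to streamed results.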

Aggregation in network programming. Network programming problem: reliable delivery of a large number of messages to all nodes in range, while exploiting the broadcast nature of the medium. Basic setup: broadcast a known number of idempotent program fragments; each node keeps a bitmap of fragments received (1 = packet received). Two stages of the problem: single hop and multihop. Solutions (single hop, dense cell): Broadcasting the program is trivial, since the central node broadcasts. Feedback from nodes: broadcast a request from the central node, “Is anyone missing packets in this packet range?” Convergence: no replies to the request.

Aggregation in multihop network programming. Broadcasting the program: use flooding; remember the last 8 packets forwarded, and use that cache to decide whether to forward or not. Feedback from nodes: distribute requests for feedback using flooding; after some delay, respond if any packets are missing locally. Responses from children: AND with the local bitmap, store the result locally, forward the request. Suboptimal because there are no local fixups. Convergence: no replies to the request.
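
The bitmap aggregation step can be sketched with integers used as bitmaps (bit i set means fragment i was received). The helper names `aggregate_bitmaps` and `missing_fragments` are hypothetical and the message handling is omitted; this only shows the AND combination described above.

```python
def aggregate_bitmaps(local, children, n_fragments):
    """AND the local bitmap with the bitmaps reported by children."""
    combined = local & ((1 << n_fragments) - 1)
    for child in children:
        combined &= child  # a fragment counts only if every node has it
    return combined


def missing_fragments(bitmap, n_fragments):
    """List the fragment indexes that at least one node is still missing."""
    return [i for i in range(n_fragments) if not (bitmap >> i) & 1]
```

A parent only forwards one combined bitmap, so the feedback traffic stays constant in size no matter how many descendants it has.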

Aggregation over streams. Inherent uncertainty of the system: can nodes communicate, do they have enough power, have they moved? Computing a complete single answer can be very expensive, and may not be possible. Partial estimates have their own value. Aggregation over streams: values reflect the current best estimates. Self-stabilizing: in the absence of changes, converges to a desired value within N steps.

What does it mean to aggregate? (The DB Perspective) General-purpose solution: apply standard aggregation operators like COUNT, MIN, MAX, AVERAGE, and SUM to any set of sensors. The previous examples are application-specific. In sensors, operators may be arbitrary signal processing functions. Provide grouping semantics, e.g. ‘select avg(temp) group by trunc(light/10)’. In sensor networks, groups may be random samples. [Figure: sensor readings t1 through t9]
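
The grouped query ‘select avg(temp) group by trunc(light/10)’ can be evaluated in the network by passing partial state up the routing tree rather than raw readings. The sketch below assumes a (sum, count) pair per group as the partial state; the function names `local_partial`, `merge`, and `finalize` are hypothetical.

```python
import math
from collections import defaultdict


def local_partial(readings):
    """readings: iterable of (temp, light). Returns group -> [sum, count]."""
    part = defaultdict(lambda: [0.0, 0])
    for temp, light in readings:
        group = math.trunc(light / 10)
        part[group][0] += temp
        part[group][1] += 1
    return part


def merge(a, b):
    """Combine the partial states of two nodes or subtrees."""
    out = defaultdict(lambda: [0.0, 0])
    for part in (a, b):
        for group, (total, count) in part.items():
            out[group][0] += total
            out[group][1] += count
    return out


def finalize(part):
    """Turn partial state into the final per-group averages."""
    return {group: total / count for group, (total, count) in part.items()}
```

Averages cannot be merged directly, which is why the partial state carries the sum and the count separately until the last step.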

Identifying Groups: Need a way to identify groups. Idea: a set of membership criteria is pushed down, and nodes determine their membership set based on those criteria. Nodes can be in multiple but not unlimited groups, e.g. “Group 1 : 0 <= t < 10, Group 2 : 10 <= t < 20, …”. Need a way to evaluate aggregation predicates by group. May want to allow grouping and aggregation predicates to be expressed together to take advantage of broadcast effects.

Local Query Rewrite: Intermediate nodes may determine that it's faster to evaluate an aggregate by asking children a different question. Example 1: MAX(t). Once we have a guess T for MAX, ask children to report iff t > T, rather than asking all children to compute a local maximum. Example 2: Network programming. Rather than asking nodes what packets they have, ask them to report iff packets are missing. Is this a general technique? Maybe: inform the child of a guess at the aggregate and ask it to refute it. Works for average (within an error bound), not count.
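
Example 1 can be sketched as follows, assuming the parent already holds a guess T and that children stay silent unless they can refute it. The function name `max_with_suppression` and the dictionary encoding of children are hypothetical.

```python
def max_with_suppression(readings, guess):
    """readings maps child id -> local value t; only children with t > guess reply.

    Returns the resulting maximum and the number of messages sent.
    """
    replies = {child: t for child, t in readings.items() if t > guess}
    return max(replies.values(), default=guess), len(replies)
```

With readings of 17, 23, and 19 and a guess of 20, only one child replies instead of three; a good guess suppresses every reply.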

Wins and pitfalls of aggregation. Aggregation over natural network topology: aggregation over an arbitrary subset of the network may be a loss. Really dense cells: aggregation does not help with the starvation problem; use message suppression via the query rewrite technique. Still beneficial in a multihop scenario.

Advanced Aggregation Tricks: Break the network protocol boundary. Use the analog reading from the channel over time to determine aggregates. [Figure: simple example showing Time, Sum, and individual Reading values]

Outline: Background; Server Side Solutions (Fjords, Sensor Proxies, CACQ); Sensor Side Solutions (Catalog Management, Aggregation); Future Work

DBMS side: efficient catalog management; moving object databases; query optimization techniques. Sensor side: efficient grouping; joins over network topology; non-standard aggregate functions. Somewhere in between: histograms and other correlations; sampling and compression for streams.