Querying Sensor Networks


Querying Sensor Networks Sam Madden UC Berkeley November 18th, 2002 @ Madison

Introduction
What are sensor networks?
Programming sensor networks is hard, especially if you want to build a "real" application. Example: a vehicle tracking application took 2 grad students 2 weeks to build and hundreds of lines of code.
Declarative queries are easy, and can be faster and more robust than most applications! The vehicle tracking query took 2 minutes to write and worked just as well:
SELECT MAX(mag)
FROM sensors
WHERE mag > thresh
SAMPLE INTERVAL 64ms

Overview Sensor Networks Why Queries in Sensor Nets TinyDB Features Demo Focus: Tiny Aggregation The Next Step


Device Capabilities
"Mica Motes":
8-bit, 4 MHz processor (roughly a PC AT)
40 kbit/s radio; time to send 1 bit = ~800 instructions, so reducing communication is good
4 KB RAM, 128 KB flash, 512 KB EEPROM
Sensor board expansion slot; standard board has light & temperature sensors, accelerometer, magnetometer, microphone, & buzzer
Other, more powerful platforms exist, e.g. Sensoria WINS nodes
Trend towards smaller devices: "Smart Dust" – Kris Pister, et al.

Sensor Net Sample Apps
Habitat monitoring: storm petrels on Great Duck Island, microclimates on the James Reserve.
Earthquake monitoring in shake-test sites.
Vehicle detection: sensors dropped from a UAV along a road collect data about passing vehicles and relay it back to the UAV.
(Photo: traditional monitoring apparatus.)

Key Constraint: Power
Lifetime from one pair of AA batteries: 2-3 days at full power, 6 months at a 2% duty cycle.
Communication dominates cost, because it takes so long (~30 ms) to send / receive a message.

TinyOS
Operating system from David Culler's group at Berkeley.
C-like programming environment.
Provides a messaging layer and abstractions for major hardware components.
Split-phase, highly asynchronous, interrupt-driven programming model.
Hill, Szewczyk, Woo, Culler, & Pister. "System Architecture Directions for Networked Sensors." ASPLOS 2000.
See http://webs.cs.berkeley.edu/tos

Communication In Sensor Nets
Radio communication has high link-level losses: typically about 20% @ 5m.
Newest versions of TinyOS provide link-level acknowledgments; no end-to-end acknowledgments.
Ad-hoc neighbor discovery.
Two major routing techniques: tree-based hierarchy and geographic.
(Diagram: tree-based routing hierarchy over nodes A-F.)

Overview Sensor Networks Why Queries in Sensor Nets TinyDB Features Demo Focus: Tiny Aggregation The Next Step

Declarative Queries for Sensor Networks
Example 1:
SELECT nodeid, light
FROM sensors
WHERE light > 400
SAMPLE PERIOD 1s
(Result: one row per node per 1 s "epoch", with columns such as light, temp, accel.)
Example 2, rooms w/ volume > 200:
SELECT roomNo, AVG(volume)
FROM sensors
GROUP BY roomNo
HAVING AVG(volume) > 200

Declarative Benefits In Sensor Networks
Vastly simplifies execution for large networks, since locations are described by predicates and operations are over groups.
Enables tolerance to faults, since the system is free to choose where and when operations happen.
Data independence: the system is free to choose where data lives and how it is represented.

Computing In Sensor Nets Is Hard
Why? Limited power (must optimize for it!), lossy communication, zero administration, limited processing capabilities, storage, and bandwidth.
In power-based optimization, we choose:
Where data is processed and how data is routed (exploit operator semantics! avoid dead nodes).
How to order operators, sampling, etc.
What kinds of indices to apply, which data to prioritize, …

Overview Sensor Networks Why Queries in Sensor Nets TinyDB Features Demo Focus: Tiny Aggregation The Next Step

TinyDB A distributed query processor for networks of Mica motes Available today! Goal: Eliminate the need to write C code for most TinyOS users Features Declarative queries Temporal + spatial operations Multihop routing In-network storage

TinyDB @ 10,000 Ft
(Almost) all queries are continuous and periodic.
Written in a SQL-like language with extensions for: sample rate, offline delivery, temporal aggregation.
(Diagram: a query injected at the root flows down the routing tree; results flow back up through node subsets {A,B,C,D,E,F}, {B,D,E,F}, {D,E,F}.)

TinyDB Demo

Applications + Early Adopters Some demo apps: Network monitoring Vehicle tracking “Real” future deployments: Environmental monitoring @ GDI (and James Reserve?) Generic Sensor Kit Building Monitoring Demo!

TinyDB Architecture (Per node)
TupleRouter: fetches readings (for ready queries), builds tuples, applies operators, delivers results (up tree).
SelOperator: filters readings.
AggOperator: combines local & neighbor readings.
Schema: "catalog" of commands & attributes (more later).
TinyAlloc: reusable memory allocator!
(Diagram also shows the network and radio stack beneath these components.)
Footprint: ~10,000 lines C code, ~5,000 lines Java, ~3,200 bytes RAM (w/ 768-byte heap), ~58 kB compiled code (3x larger than the 2nd-largest TinyOS program).

Overview Sensor Networks Why Queries in Sensor Nets TinyDB Features Demo Focus: Tiny Aggregation The Next Step

TAG
In-network processing of aggregates.
Aggregates are a common operation; in-network processing reduces costs, depending on the type of aggregate.
Focus on "spatial aggregation" (versus "temporal aggregation").
Exploitation of operator and functional semantics.
Tiny AGgregation (TAG), Madden, Franklin, Hellerstein, Hong. OSDI 2002 (to appear).

Aggregation Framework
As in extensible databases, we support any aggregation function conforming to:
Agg_n = {f_init, f_merge, f_evaluate}
f_init{a0} → <a0>
f_merge{<a1>, <a2>} → <a12>
f_evaluate{<a1>} → aggregate value
(merge is associative and commutative!)
The <..> values are Partial State Records (PSRs), just like parallel database systems, e.g. Bubba!
Example: AVERAGE
AVG_init{v} → <v, 1>
AVG_merge{<S1, C1>, <S2, C2>} → <S1 + S2, C1 + C2>
AVG_evaluate{<S1, C1>} → S1 / C1
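To make the three-function decomposition concrete, here is a minimal C sketch of AVG as {f_init, f_merge, f_evaluate} over a sum/count partial state record. The type and function names are illustrative, not TinyDB's.

    #include <stdio.h>

    /* Partial State Record (PSR) for AVG: running sum and count. */
    typedef struct { long sum; long count; } AvgPSR;

    /* f_init: turn a single sensor reading into a PSR. */
    AvgPSR avg_init(long v) { AvgPSR p = { v, 1 }; return p; }

    /* f_merge: combine two PSRs (associative and commutative). */
    AvgPSR avg_merge(AvgPSR a, AvgPSR b) {
        AvgPSR p = { a.sum + b.sum, a.count + b.count };
        return p;
    }

    /* f_evaluate: extract the final aggregate value from a PSR. */
    double avg_evaluate(AvgPSR p) { return (double)p.sum / p.count; }

    int main(void) {
        /* A parent merges its own reading with PSRs received from children. */
        AvgPSR local = avg_init(406);
        AvgPSR child = avg_merge(avg_init(453), avg_init(442));
        AvgPSR total = avg_merge(local, child);
        printf("AVG = %.2f over %ld readings\n", avg_evaluate(total), total.count);
        return 0;
    }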

Query Propagation Review
SELECT AVG(light)…
(Diagram: the query is disseminated over the routing tree of nodes A-F.)

Pipelined Aggregates
After the query propagates, during each epoch:
Each sensor samples local sensors once.
Combines them with PSRs from children.
Outputs a PSR representing the aggregate state in the previous epoch.
After (d-1) epochs, the PSR for the whole tree is output at the root (d = depth of the routing tree).
If desired, partial state from the top k levels could be output in the kth epoch.
To avoid combining PSRs from different epochs, sensors must cache values from children.
(Diagram: a value from node 2 produced at time t arrives at node 1 at time t+1; a value from node 5 produced at time t arrives at node 1 at time t+3.)
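A simplified C sketch of the per-epoch pipeline for a COUNT query, assuming a node that caches child PSRs by epoch and reports the previous epoch's state to its parent; the helper names and the toy trace in main() are made up.

    #include <stdio.h>

    #define MAX_EPOCHS 16

    /* COUNT partial state, indexed by the epoch the value belongs to. */
    static int psr[MAX_EPOCHS];

    /* Called when a child's PSR for some epoch is received. */
    void receive_child_psr(int epoch, int child_count) {
        psr[epoch] += child_count;   /* COUNT merge is just addition */
    }

    /* Called once per epoch: add the local sample, then return the PSR
       for the previous epoch (children had until now to report it). */
    int end_of_epoch(int epoch) {
        psr[epoch] += 1;             /* this node counts itself */
        return (epoch > 0) ? psr[epoch - 1] : 0;
    }

    int main(void) {
        /* Toy trace for a node with two children reporting every epoch. */
        for (int e = 0; e < 4; e++) {
            receive_child_psr(e, 1);
            receive_child_psr(e, 1);
            int out = end_of_epoch(e);
            if (e == 0)
                printf("epoch 0: nothing complete to report yet\n");
            else
                printf("epoch %d: report count %d for epoch %d\n", e, out, e - 1);
        }
        return 0;
    }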

Illustration: Pipelined Aggregation
SELECT COUNT(*) FROM sensors
(Animation over a five-node routing tree of depth d: the count reported at the root grows as the pipeline fills: 1 in epoch 1, 3 in epoch 2, 4 in epoch 3, and 5 from epoch 4 onward.)

Grouping
If the query is grouped, sensors apply the grouping expression each epoch; PSRs are tagged with their group.
When a PSR (with group) is received:
If it belongs to a stored group, merge it with the existing PSR.
If not, just store it.
At the end of each epoch, transmit one PSR per group.

Group Eviction
Problem: the number of groups in any one iteration may exceed the available storage on a sensor.
Solution: evict! (Partial Preaggregation*)
Choose one or more groups to forward up the tree; rely on nodes further up the tree, or the root, to recombine groups properly.
What policy to choose? Intuitively: the least popular group, since we don't want to evict a group that will receive more values this epoch.
Experiments suggest: policy matters very little; evicting as many groups as will fit into a single message is good.
* Per-Åke Larson. Data Reduction by Partial Preaggregation. ICDE 2002.
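A rough C sketch of the eviction step under the assumptions above: when the group table is full, forward the least-popular groups, enough to fill one outgoing message, and let ancestors recombine them. Table sizes, message capacity, and the COUNT-style PSR are illustrative.

    #include <stdio.h>

    #define MAX_GROUPS        8   /* storage available on the node   */
    #define PSRS_PER_MESSAGE  3   /* how many PSRs fit in one packet */

    typedef struct {
        int group_id;
        int psr;          /* e.g. a COUNT partial state record       */
        int popularity;   /* readings merged into this group so far  */
        int in_use;
    } GroupSlot;

    static GroupSlot table[MAX_GROUPS];

    /* Evict: send the least-popular groups up the tree, enough to fill
       one message, and free their slots for new groups. */
    void evict_groups(void) {
        for (int sent = 0; sent < PSRS_PER_MESSAGE; sent++) {
            int victim = -1;
            for (int i = 0; i < MAX_GROUPS; i++) {
                if (table[i].in_use &&
                    (victim < 0 || table[i].popularity < table[victim].popularity))
                    victim = i;
            }
            if (victim < 0) break;
            printf("forwarding group %d (psr=%d) to parent\n",
                   table[victim].group_id, table[victim].psr);
            table[victim].in_use = 0;   /* parent (or root) recombines it */
        }
    }

    int main(void) {
        for (int i = 0; i < MAX_GROUPS; i++)
            table[i] = (GroupSlot){ .group_id = i, .psr = i + 1,
                                    .popularity = i % 3, .in_use = 1 };
        evict_groups();   /* table is full: make room for incoming groups */
        return 0;
    }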

TAG Advantages
In-network processing reduces communication; important for power and contention.
Continuous stream of results: in the absence of faults, will converge to the right answer.
Lots of optimizations, based on the shared radio channel and the semantics of operators.

Simulation Environment
Chose to simulate to allow 1000's of nodes and control of topology, connectivity, and loss.
Java-based simulation & visualization for validating algorithms and collecting data.
Coarse-grained, event-based simulation: sensors arranged on a grid, radio connectivity by Euclidean distance.
Communication model:
Lossless: all neighbors hear all messages.
Lossy: messages lost with probability that increases with distance.
Symmetric links; no collisions, hidden terminals, etc.

Simulation Results
2500 nodes on a 50x50 grid; depth ≈ 10; ≈ 20 neighbors per node.
Some aggregates require dramatically more state!

Taxonomy of Aggregates
TAG insight: classify aggregates according to various functional properties; this yields a general set of optimizations that can automatically be applied.
Property                 Examples                                      Affects
Partial State            MEDIAN: unbounded, MAX: 1 record              Effectiveness of TAG
Duplicate Sensitivity    MIN: dup. insensitive, AVG: dup. sensitive    Routing Redundancy
Exemplary vs. Summary    MAX: exemplary, COUNT: summary                Applicability of Sampling, Effect of Loss
Monotonic                COUNT: monotonic, AVG: non-monotonic          Hypothesis Testing, Snooping

Optimization: Channel Sharing ("Snooping")
Insight: the shared channel enables optimizations.
Suppress messages that won't affect the aggregate. E.g., in a MAX query, a sensor with value v hears a neighbor with value ≥ v, so it doesn't report. Applies to all exemplary, monotonic aggregates; can be applied to summary aggregates too if imprecision is allowed.
Learn about query advertisements it missed: if a sensor shows up in a new environment, it can learn about queries by looking at neighbors' messages. The root doesn't have to explicitly rebroadcast the query!
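A hedged C sketch of snooping-based suppression for a MAX query: if the node overhears a neighbor report a value at least as large as its own, its own report cannot change the result, so it stays quiet. The callbacks are stand-ins, not a real TinyOS radio API.

    #include <stdio.h>

    static int my_max = 37;          /* this node's local MAX PSR          */
    static int heard_max = -1;       /* best value snooped from neighbors  */

    /* Called for every message overheard on the shared channel. */
    void snoop(int neighbor_value) {
        if (neighbor_value > heard_max) heard_max = neighbor_value;
    }

    /* End of epoch: only transmit if our value could still matter. */
    void maybe_report(void) {
        if (my_max > heard_max)
            printf("send MAX PSR %d to parent\n", my_max);
        else
            printf("suppressed: neighbor already reported %d >= %d\n",
                   heard_max, my_max);
    }

    int main(void) {
        snoop(12);
        snoop(51);        /* a sibling reports 51; our 37 is now irrelevant */
        maybe_report();
        return 0;
    }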

Optimization: Hypothesis Testing
Insight: the root can provide information that will suppress readings that cannot affect the final aggregate value.
E.g., tell all the nodes that the MIN is definitely < 50; nodes with value ≥ 50 need not participate.
Depends on monotonicity.
How is the hypothesis computed? A blind guess, a statistically informed guess, or observation over the first few levels of the tree / rounds of the aggregate.

Experiment: Hypothesis Testing Uniform Value Distribution, Dense Packing, Ideal Communication

Optimization: Use Multiple Parents
For duplicate-insensitive aggregates, or aggregates that can be expressed as a linear combination of parts: send (part of) the aggregate to all parents.
Decreases variance, dramatically when there are lots of parents.
No splitting:    E[count] = c·p          Var[count] = c²·p·(1-p)
With splitting:  E[count] = 2·(c/2)·p    Var[count] = 2·(c/2)²·p·(1-p)
(Diagram: node C sends its count either entirely to parent A, or half to parent A and half to parent B.)
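A small C program that just works through the expectation/variance arithmetic above for an assumed count c and per-link delivery probability p, with and without splitting across two parents.

    #include <stdio.h>

    int main(void) {
        double c = 10.0;   /* count held at the node        */
        double p = 0.8;    /* per-link delivery probability */

        /* No splitting: the whole count rides one link. */
        double e_single   = c * p;
        double var_single = c * c * p * (1 - p);

        /* Splitting: half the count to each of two parents. */
        double e_split    = 2 * (c / 2) * p;
        double var_split  = 2 * (c / 2) * (c / 2) * p * (1 - p);

        printf("no split:   E=%.2f  Var=%.2f\n", e_single, var_single);
        printf("with split: E=%.2f  Var=%.2f\n", e_split,  var_split);
        /* Same expectation, half the variance. */
        return 0;
    }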

Multiple Parents Results
Interestingly, this technique is much better than the previous analysis predicted! Losses aren't independent!
Instead of focusing data on a few critical links, splitting spreads data over many links.
(Plot: no splitting vs. with splitting across a critical link.)

Fun Stuff Sophisticated, sensor network specific aggregates Temporal aggregates

Temporal Aggregates
TAG was about "spatial" aggregates: inter-node, at the same time.
Want to be able to aggregate across time as well. Two types:
Windowed: AGG(size, slide, attr)
Decaying: AGG(comb_func, attr)
Demo!
(Diagram: a window of size 4 sliding by 2 over readings R1, R2, R3, R4, R5, R6, …)
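An illustrative C sketch of the windowed form AGG(size, slide, attr), here a windowed MAX kept in a small ring buffer that emits a result every `slide` readings; the sizes and readings are invented.

    #include <stdio.h>

    #define SIZE  4   /* window size (readings)    */
    #define SLIDE 2   /* emit every SLIDE readings */

    static int window[SIZE];
    static int count = 0;      /* readings seen so far */

    void new_reading(int r) {
        window[count % SIZE] = r;     /* ring buffer of the last SIZE readings */
        count++;
        if (count >= SIZE && (count - SIZE) % SLIDE == 0) {
            int wmax = window[0];
            for (int i = 1; i < SIZE; i++)
                if (window[i] > wmax) wmax = window[i];
            printf("WINMAX over last %d readings = %d\n", SIZE, wmax);
        }
    }

    int main(void) {
        int readings[] = { 5, 9, 3, 7, 8, 2 };   /* R1..R6 */
        for (int i = 0; i < 6; i++) new_reading(readings[i]);
        return 0;
    }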

Isobar Finding

TAG Summary
In-network query processing is a big win for many aggregate functions.
By exploiting general functional properties of operators, optimizations are possible; this requires new aggregates to be tagged with their properties.
Up next: non-aggregate query processing optimizations – a flavor of things to come!

Overview Sensor Networks Why Queries in Sensor Nets TinyDB Features Demo Focus: Tiny Aggregation The Next Step

Acquisitional Query Processing
Cynical question: what's really different about sensor networks?
Low power? (Laptops!) Lots of nodes? (Distributed DBs!) Limited processing capabilities? (Moore's Law!)

Answer
Long-running queries on physically embedded devices that control when and with what frequency data is collected!
Versus traditional systems, where data is provided a priori.
Next: an acquisitional teaser…

ACQP: What's Different?
How does the user control acquisition? Specify rates or lifetimes; trigger queries in response to events.
Which nodes have relevant data? Need a node index; construct the topology such that nodes that are queried together route together.
What sensors should be sampled? Treat sampling as an operator; sample the cheapest sensors first.
Which samples should be transmitted? Not all of them, if bandwidth or power is limited; those that are most "valuable"?

Operator Ordering: Interleave Sampling + Selection
SELECT light, mag
FROM sensors
WHERE pred1(mag) AND pred2(light)
SAMPLE INTERVAL 1s
Energy cost of sampling mag >> cost of sampling light: 1500 uJ vs. 90 uJ.
At 1 sample/sec, the total power savings could be as much as 4 mW, the same as the processor!
Candidate orderings:
1. Sample light, sample mag, apply pred1, apply pred2
2. Sample light, apply pred2, sample mag, apply pred1
3. Sample mag, apply pred1, sample light, apply pred2
Correct ordering (unless pred1 is very selective): #2, which defers the expensive mag sample until pred2(light) passes.
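A hedged C sketch of ordering #2: sample the cheap light sensor, apply its predicate, and only pay for the expensive magnetometer sample when that predicate passes. The per-sample costs are the slide's numbers; the sampling functions and thresholds are stand-ins.

    #include <stdio.h>

    /* Per-sample energy costs from the slide (microjoules). */
    #define COST_MAG_UJ    1500
    #define COST_LIGHT_UJ    90

    static long energy_spent_uj = 0;

    int sample_light(void) { energy_spent_uj += COST_LIGHT_UJ; return 420; }
    int sample_mag(void)   { energy_spent_uj += COST_MAG_UJ;   return  55; }

    int pred2_light(int light) { return light > 400; }
    int pred1_mag(int mag)     { return mag   > 100; }

    /* One acquisition in the cheap-first order:
       sample light, test pred2, and only then sample mag and test pred1. */
    void acquire_tuple(void) {
        int light = sample_light();
        if (!pred2_light(light)) return;       /* expensive sample avoided */
        int mag = sample_mag();
        if (!pred1_mag(mag)) return;
        printf("emit tuple: light=%d mag=%d\n", light, mag);
    }

    int main(void) {
        acquire_tuple();
        printf("energy spent: %ld uJ\n", energy_spent_uj);
        return 0;
    }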

Optimizing in ACQP
Model sampling as an "expensive predicate."
Some subtleties: attributes referenced in multiple predicates (which to "charge"?); attributes must be fetched before operators that use them can be applied.
Solution: treat sampling as a separate task; build a partial order on sampling and predicates; solve for the cheapest schedule using a series-parallel scheduling algorithm (Monma & Sidney, 1979), as in other optimization work (e.g. Ibaraki & Kameda, TODS, 1984, or Hellerstein, TODS, 1998).

Exemplary Aggregate Pushdown
SELECT WINMAX(light, 8s, 8s)
FROM sensors
WHERE mag > x
SAMPLE INTERVAL 1s
Unless mag > x is very selective, the correct ordering is:
Sample light; check if it's the maximum.
If it is: sample mag, check the predicate, and if satisfied, update the maximum.
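A C sketch of the same pushdown: track the running WINMAX of light and only sample the magnetometer, to evaluate mag > x, when the new light reading could actually become the maximum. Helper names and constants are invented.

    #include <stdio.h>
    #include <limits.h>

    static int window_max = INT_MIN;   /* running WINMAX(light) for this window */

    int sample_light(void) { return 512; }   /* stand-in for the real sensor  */
    int sample_mag(void)   { return  77; }   /* expensive: only when needed   */
    #define X_THRESH 50                      /* the query's "mag > x"         */

    void one_sample_period(void) {
        int light = sample_light();
        if (light <= window_max) return;     /* cannot change WINMAX: skip mag */
        int mag = sample_mag();              /* pushdown: predicate checked    */
        if (mag > X_THRESH)                  /* only for candidate maxima      */
            window_max = light;
        printf("window_max = %d\n", window_max);
    }

    int main(void) {
        one_sample_period();
        return 0;
    }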

Summary
Declarative queries are the right interface for data collection in sensor nets!
Aggregation is a fundamental operation for which there are many possible network-aware optimizations.
Current research: an acquisitional query processing framework that addresses many of the new issues that arise in sensor networks, e.g.:
Order of sampling and selection.
Languages, indices, and approximations that give the user control over which data enters the system.
TinyDB release available: http://telegraph.cs.berkeley.edu/tinydb

Questions?

Simulation Screenshot

TinyAlloc
Handle-based compacting memory allocator, used for the catalog and queries.
    Handle h;
    call MemAlloc.alloc(&h, 10);
    …
    (*h)[0] = "Sam";
    call MemAlloc.lock(h);
    tweakString(*h);
    call MemAlloc.unlock(h);
    call MemAlloc.free(h);
(Diagram: user program, master pointer table, free bitmap, and heap, before and after compaction.)
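A condensed C sketch of the handle-based idea (not the TinyAlloc source): user code holds a handle that indexes a master pointer table, so compaction can move heap blocks and patch only the table, and locking pins a block while raw pointers are in use.

    #include <stdio.h>
    #include <string.h>

    #define HEAP_SIZE    64
    #define MAX_HANDLES   8

    typedef struct { int offset; int size; int locked; int used; } Entry;

    static char  heap[HEAP_SIZE];
    static Entry table[MAX_HANDLES];     /* master pointer table */
    static int   heap_top = 0;           /* bump allocation      */

    typedef int Handle;                  /* index into the table */

    Handle mem_alloc(int size) {
        if (heap_top + size > HEAP_SIZE) return -1;
        for (Handle h = 0; h < MAX_HANDLES; h++) {
            if (!table[h].used) {
                table[h] = (Entry){ heap_top, size, 0, 1 };
                heap_top += size;
                return h;
            }
        }
        return -1;
    }

    char *deref(Handle h)      { return &heap[table[h].offset]; }
    void  mem_lock(Handle h)   { table[h].locked = 1; }
    void  mem_unlock(Handle h) { table[h].locked = 0; }
    void  mem_free(Handle h)   { table[h].used = 0; }

    /* Compaction: slide live blocks toward the bottom of the heap in
       increasing offset order and patch their table entries; handles
       held by user code stay valid. Postponed while anything is locked. */
    void compact(void) {
        for (Handle h = 0; h < MAX_HANDLES; h++)
            if (table[h].used && table[h].locked) return;
        int new_top = 0;
        for (;;) {
            Handle next = -1;
            for (Handle h = 0; h < MAX_HANDLES; h++)
                if (table[h].used && table[h].offset >= new_top &&
                    (next < 0 || table[h].offset < table[next].offset))
                    next = h;
            if (next < 0) break;
            memmove(&heap[new_top], &heap[table[next].offset], table[next].size);
            table[next].offset = new_top;
            new_top += table[next].size;
        }
        heap_top = new_top;
    }

    int main(void) {
        Handle a = mem_alloc(10), b = mem_alloc(10);
        strcpy(deref(b), "Sam");
        mem_free(a);          /* leaves a hole at the bottom of the heap */
        compact();            /* "Sam" moves, but handle b still works   */
        printf("%s (offset now %d)\n", deref(b), table[b].offset);
        return 0;
    }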

Schema
Attribute & command interface:
At INIT(), components register the attributes and commands they support.
Commands are implemented via wiring; attributes are fetched via an accessor command.
Catalog API allows local and remote queries over known attributes / commands.
Demo of adding an attribute and executing a command.

Q1: Expressiveness
Simple data collection satisfies most users.
How much of what people want to do is just simple aggregates? Anecdotally, most of it: EE people want filters + simple statistics (unless they can have signal processing).
However, we'd like to satisfy everyone!

Query Language
New features:
Joins
Event-based triggers (via extensible catalog)
In-network & nested queries
Split-phase (offline) delivery (via buffers)

Sample Query 1
Bird counter:
CREATE BUFFER birds(uint16 cnt)
  SIZE 1
ON EVENT bird-enter(…)
  SELECT b.cnt+1
  FROM birds AS b
  OUTPUT INTO b
  ONCE

Sample Query 2
Birds that entered and left within time t of each other:
ON EVENT bird-leave AND bird-enter WITHIN t
  SELECT bird-leave.time, bird-leave.nest
  WHERE bird-leave.nest = bird-enter.nest
  ONCE

Sample Query 3
Delta compression:
SELECT light
FROM buf, sensors
WHERE |s.light - buf.light| > t
OUTPUT INTO buf
SAMPLE PERIOD 1s

Sample Query 4
Offline delivery + event chaining:
CREATE BUFFER equake_data(uint16 loc, uint16 xAccel, uint16 yAccel)
  SIZE 1000
  PARTITION BY NODE

SELECT xAccel, yAccel
FROM sensors
WHERE xAccel > t OR yAccel > t
SIGNAL shake_start(…)
SAMPLE PERIOD 1s

ON EVENT shake_start(…)
SELECT loc, xAccel, yAccel
FROM sensors
OUTPUT INTO BUFFER equake_data(loc, xAccel, yAccel)
SAMPLE PERIOD 10ms

Event-Based Processing
Enables internal and chained actions.
Language semantics: events are inter-node; buffers can be global.
Implementation plan: events and buffers must be local, since n-to-n communication is not (well) supported.
Next: operator expressiveness.

Attribute-Driven Topology Selection
Observation: internal queries are often over a local area* (or some other subset of the network), e.g. regions with light value in [10,20].
Idea: build the topology for those queries based on the values of range-selected attributes.
Requires range attributes and connectivity to be relatively static.
* Heidemann et al., Building Efficient Wireless Sensor Networks with Low-Level Naming. SOSP, 2001.

Attribute-Driven Query Propagation
SELECT … WHERE a > 5 AND a < 12
Precomputed intervals == "Query Dissemination Index"
(Diagram: node 4 forwards the query only toward children 1, 2, 3 whose precomputed intervals [1,10], [7,15], [20,40] overlap the query range.)

Attribute-Driven Parent Selection
Even without intervals, expect that sending to the parent with the closest value will help.
Example: node 4 with range [3,6] choosing among parents 1, 2, 3:
[3,6] ∩ [1,10] = [3,6]
[3,6] ∩ [7,15] = ø
[3,6] ∩ [20,40] = ø
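A toy C sketch of the interval test behind these two slides: intersect this node's attribute range with each candidate parent's precomputed interval and prefer one with a non-empty overlap. The intervals come from the slide; the selection rule is a plausible guess.

    #include <stdio.h>

    typedef struct { int lo, hi; } Interval;   /* closed interval [lo, hi] */

    /* Intersection of two intervals; empty is signalled by lo > hi. */
    Interval intersect(Interval a, Interval b) {
        Interval r = { a.lo > b.lo ? a.lo : b.lo,
                       a.hi < b.hi ? a.hi : b.hi };
        return r;
    }

    int main(void) {
        Interval mine = { 3, 6 };                              /* this node's range */
        Interval parents[] = { {1, 10}, {7, 15}, {20, 40} };   /* candidates 1..3   */

        int best = -1;
        for (int i = 0; i < 3; i++) {
            Interval ov = intersect(mine, parents[i]);
            if (ov.lo <= ov.hi) {
                printf("parent %d overlaps: [%d,%d]\n", i + 1, ov.lo, ov.hi);
                if (best < 0) best = i;
            } else {
                printf("parent %d: no overlap\n", i + 1);
            }
        }
        if (best >= 0) printf("choose parent %d\n", best + 1);
        return 0;
    }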

Hot off the press…

Grouping
GROUP BY expr: expr is an expression over one or more attributes. Evaluation of expr yields a group number; each reading is a member of exactly one group.
Example: SELECT max(light) FROM sensors GROUP BY TRUNC(temp/10)
Sensor ID   Light   Temp   Group
1           45      25     2
2           27      28     2
3           66      34     3
4           68      37     3
Result:
Group   max(light)
2       45
3       68

Having
HAVING preds: filters out groups that do not satisfy the predicate (versus WHERE, which filters out tuples that do not satisfy the predicate).
Example: SELECT max(temp) FROM sensors GROUP BY light HAVING max(temp) < 100
Yields all groups with maximum temperature under 100.

Group Eviction
Problem: the number of groups in any one iteration may exceed the available storage on a sensor.
Solution: evict!
Choose one or more groups to forward up the tree; rely on nodes further up the tree, or the root, to recombine groups properly.
What policy to choose? Intuitively: the least popular group, since we don't want to evict a group that will receive more values this epoch.
Experiments suggest: policy matters very little; evicting as many groups as will fit into a single message is good.

Experiment: Basic TAG Dense Packing, Ideal Communication

Experiment: Hypothesis Testing Uniform Value Distribution, Dense Packing, Ideal Communication

Experiment: Effects of Loss

Experiment: Benefit of Cache

Pipelined Aggregates
After the query propagates, during each epoch:
Each sensor samples local sensors once.
Combines them with PSRs from children.
Outputs a PSR representing the aggregate state in the previous epoch.
After (d-1) epochs, the PSR for the whole tree is output at the root (d = depth of the routing tree).
If desired, partial state from the top k levels could be output in the kth epoch.
To avoid combining PSRs from different epochs, sensors must cache values from children.
(Diagram: a value from node 2 produced at time t arrives at node 1 at time t+1; a value from node 5 produced at time t arrives at node 1 at time t+3.)

Pipelining Example
(Animation over a five-node tree; each node keeps a table of partial state records <SID, epoch, agg>.)
Epoch 0: the leaves report <4,0,1> and <5,0,1>.
Epoch 1: interior nodes forward <2,0,2> and <3,0,2>.
Epoch 2: the root holds <1,0,3>.
Epoch 3: the root outputs <1,0,5>, the complete count for epoch 0.
Epoch 4: <1,1,5> follows; a new complete count now emerges every epoch.

Our Stream Semantics
One stream, 'sensors'; we control data rates.
Joins between that stream and buffers are allowed; joins are always landmark, forward in time, one tuple at a time.
The result of a query over 'sensors' is either a single tuple (at the time of the query) or a stream.
Easy to interface to more sophisticated systems.
Temporal aggregates enable fancy window operations.

Formal Spec.
ON EVENT <event> [<boolop> <event> ... WITHIN <window>]
[SELECT {<expr> | agg(<expr>) | temporalagg(<expr>)}
 FROM [sensors | <buffer> | events]]
[WHERE {<pred>}]
[GROUP BY {<expr>}]
[HAVING {<pred>}]
[ACTION [<command> [WHERE <pred>]
        | BUFFER <bufname> SIGNAL <event>({<params>})
        | (SELECT ...) [INTO BUFFER <bufname>]]]
[SAMPLE PERIOD <seconds>
   [FOR <nrounds>]
   [INTERPOLATE <expr>]
   [COMBINE {temporal_agg(<expr>)}]
 | ONCE]

Buffer Commands
[AT <pred>:]
CREATE [<type>] BUFFER <name> ({<type>})
  PARTITION BY [<expr>]
  SIZE [<ntuples>, <nseconds>]
  [AS SELECT ...
   [SAMPLE PERIOD <seconds>]]

DROP BUFFER <name>