INT 598 Data Management for Sensor Networks Silvia Nittel Spatial Information Science & Engineering University of Maine Fall 2006
INT598: Sensor System Foundation 2 © Dr. Silvia Nittel, NCGIA, University of Maine, 2006
Overview
- Data Collection and Aggregation
- Programming sensor networks
- In-network data aggregation
- In-network query processing
- In-network data storage and indexing
- Multi-Resolution Storage
Data Collection Scenarios
- Embed numerous distributed devices to monitor and interact with the physical world.
- Exploit spatially and temporally dense, in situ sensing and actuation.
- Network these devices so that they can coordinate to perform higher-level identification and tasks.
- Requires robust distributed systems of hundreds or thousands of devices.
(Deborah Estrin, UCLA)
Indoor Applications
Intel cubicle space
- Sensing: light and sound sensors on the ceiling or cubicle walls
- Actuation: detecting occupied cubicles and disturbing conversations outside of cubicles
Outdoor Applications
Napa Valley vineyard
- Sensing: humidity and temperature sensors at vines
- Actuation: ventilators to remove fog, and localized heaters
- Queries: monitoring micro-climates at vines
Example
- A "Macroscope" in the Redwoods. ACM SenSys 2005.
- Observation over ca. 60 days.
Database Management System
- Database (DB): collection of stored and streamed data
- Database Management System (DBMS): software to store, manage, access, and query the data, with a simple-to-use user interface and query language
- Database System: both together
Viewing a SN as a DBS
- Assumption: a sensor network (SN) can be viewed as a distributed database system (DBS).
  - Each sensor node is a database system that can accept, process, and answer queries.
  - Each node participates in the execution of global, distributed queries.
- The user poses declarative queries to the SN as a whole. The DBMS figures out how to process the query.
- Tiny (small-footprint) database management systems (DBMS) that run on sensor nodes are available.
  - Examples: TinyDB (UC Berkeley), Cougar (Cornell University)
- However, the computing environment is constrained:
  - Existing database technology must be adapted.
  - In-network data storage, data aggregation, and query processing
Sensor Network DBMS
Objectives:
- Users model the application data and data needs.
- No low-level programming of the sensor nodes and the data gathering details
- "What should be done", not "how should it be done"
Approach:
- Declarative SQL-style queries
- Intelligent query processing
- Fault mitigation
Example query:
  SELECT MAX(temperat) FROM sensors WHERE temperat > thresh SAMPLE PERIOD 64ms
[Figure: an application sends queries and triggers through TinyDB to the sensor network, and data flows back. © S. Madden, 2005.]
Declarative Queries
Sensor schema:
  Epoch  Nodeid  Light  Temp  Accel  Sound
  0      1       455    x     x      x
  0      2       389    x     x      x
  1      1       422    x     x      x
  1      2       405    x     x      x
Second table on the slide:
  Epoch  Nodeid  AVG(sound)  Temp  Sound
  0      1       360         x     x
  0      2       520         x     x
  1      1       370         x     x
  1      2       520         x     x
Examples:
  SELECT nodeid, light FROM sensors WHERE light < 400 EPOCH DURATION 1s
  SELECT roomId, AVG(sound) FROM sensors GROUP BY roomId HAVING AVG(sound) > 400 EPOCH DURATION 1s
© S. Madden, 2005.
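The two example queries can be read as a filter and a grouped aggregate over the rows of one epoch. The following is a minimal Python sketch of those semantics, not TinyDB code. The light values come from the sensor schema table; the `room` assignments and per-node `sound` values are hypothetical (the sound numbers are borrowed from the second table purely for illustration), as are the function names.

```python
from collections import defaultdict

# Light values are taken from the slide; room and sound are hypothetical.
READINGS = [
    {"epoch": 0, "nodeid": 1, "light": 455, "room": "A", "sound": 360},
    {"epoch": 0, "nodeid": 2, "light": 389, "room": "B", "sound": 520},
    {"epoch": 1, "nodeid": 1, "light": 422, "room": "A", "sound": 370},
    {"epoch": 1, "nodeid": 2, "light": 405, "room": "B", "sound": 520},
]


def select_dim_nodes(readings, max_light=400):
    """SELECT nodeid, light FROM sensors WHERE light < max_light (per epoch)."""
    return [
        (r["epoch"], r["nodeid"], r["light"])
        for r in readings
        if r["light"] < max_light
    ]


def loud_rooms(readings, min_avg=400):
    """SELECT roomId, AVG(sound) ... GROUP BY roomId HAVING AVG(sound) > min_avg."""
    groups = defaultdict(list)
    for r in readings:
        # One group per (epoch, room): a result row is produced every epoch.
        groups[(r["epoch"], r["room"])].append(r["sound"])
    result = []
    for (epoch, room), sounds in sorted(groups.items()):
        avg = sum(sounds) / len(sounds)
        if avg > min_avg:  # the HAVING clause filters whole groups
            result.append((epoch, room, avg))
    return result
```

In this sketch the WHERE clause filters individual readings, while the HAVING clause filters groups after aggregation; both are re-evaluated once per epoch.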
Queries over Sensor Networks
Query types:
- "Snapshot" queries
  - Report the current temperature reading of sensor node #1.
- Continuous queries
  - Report the temperature readings of sensor nodes #1 to #10 over the next 10 minutes at an interval of 1 min.
- Event queries
  - Report when temperature values are above threshold 1.
Most common: spatio-temporal queries
- Point queries ("report temperature in room 324")
- Spatial window queries ("report temperature values from region A")
Queries over Sensor Networks
Most common (cont.): spatio-temporal (ST) aggregation (average, max, min, etc.)
- Temporal aggregation ("max temperature value in the last 24h")
- Spatial aggregation ("average temperature value of all sensors on the first floor")
- Basic aggregation: min, max, average, sum, count, etc.
- Holistic aggregates: estimation
In-Network Data Aggregation
Data stream processing:
- Sample rate
- Temporal aggregation
- Stream mining for local events
- Uncertainty/inaccuracy
Each sensor node:
- Production of a data stream
- Local processing of the data stream
- Processing of aggregated data
- Minimize communication
Computation is pushed to the data collection points: local and locally coordinated processing of data "in the network".
[Figure: a query is issued to a routing tree over nodes A to F; partial state records {D,E,F}, {B,D,E,F}, and {A,B,C,D,E,F} are combined on the way up.]
Execution of Aggregates
- Flexible, ad-hoc communication topology (network level)
- Aggregate computation over sensor networks consists of two phases:
  - a (query) distribution phase, in which aggregate queries are pushed down into the network, and
  - a (data) collection phase, in which the aggregate values are continually routed up from children to parents.
- Query semantics partition time into epochs of a fixed duration. For each epoch, a single aggregate value (when not grouping) must be produced that combines the readings of all devices in the network during that epoch.
Query Distribution Phase
1. When a sensor node n receives a request r to aggregate (e.g. max(temp)), it awakens, synchronizes its clock according to timing information in the message, and prepares to participate in aggregation.
- In the tree-based routing scheme, n chooses the sender s of the message as its parent.
- In addition, the query r includes the interval in which the sender s expects to hear partial state records from n.
Query Distribution Phase
2. n then forwards the query request r down the network, setting the delivery interval for its children to be slightly before the time its parent expects to see n's partial state record.
- In the tree-based approach, this forwarding consists of a broadcast of r, so that any nodes that did not hear the previous round also receive it and become children of n (if there are any).
- These nodes continue to forward the request in this manner until the query has been propagated throughout the network (n-coverage).
- Special cases: geo-routing
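The distribution phase can be sketched as a flood over the radio neighborhood graph in which each node adopts the first sender it hears as its parent, followed by assigning each level of the tree a transmission slot that precedes its parent's slot. This is a simplified Python sketch under stated assumptions: the topology is one possible reading of the A to F example, the breadth-first order stands in for "first message heard", and dividing the epoch into equal slots per tree level is a hypothetical scheduling rule, not necessarily what TinyDB does.

```python
from collections import deque

# Hypothetical radio neighborhood; node A receives the query first.
NEIGHBORS = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A"],
    "D": ["B", "E", "F"],
    "E": ["D"],
    "F": ["D"],
}


def distribute_query(neighbors, root):
    """Flood the query; each node picks the first sender it hears as parent."""
    parent = {root: None}
    depth = {root: 0}
    queue = deque([root])
    while queue:
        sender = queue.popleft()
        for node in neighbors[sender]:
            if node not in parent:  # node has not heard the query yet
                parent[node] = sender
                depth[node] = depth[sender] + 1
                queue.append(node)
    return parent, depth


def transmit_slots(depth, epoch_ms):
    """Start time (ms into the epoch) at which each node sends to its parent.

    Deeper nodes get earlier slots, so children report before their parents.
    """
    max_depth = max(depth.values())
    slot_ms = epoch_ms / (max_depth + 1)
    return {node: (max_depth - d) * slot_ms for node, d in depth.items()}


PARENT, DEPTH = distribute_query(NEIGHBORS, "A")
SLOTS = transmit_slots(DEPTH, epoch_ms=1000)
```

With this topology, E and F transmit first, then D, then B and C, and finally A, so every parent has heard its children before its own slot begins.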
Data Collection Phase
3. During the epoch after query propagation, each sensor node listens for messages from its children during the interval it specified when forwarding the query.
- It then computes a partial state record that combines any child values it heard with its own local sensor readings (aggregation).
- Finally, during the transmission interval requested by its parent, the node transmits this partial state record up the network.
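The collection phase for the max(temp) example can be sketched as a bottom-up pass over the routing tree: each node merges what it heard from its children with its own reading and forwards a single value. The tree and the temperature readings below are hypothetical, and the message counts ignore retransmissions and lost packets.

```python
# Hypothetical routing tree (child -> parent) and one reading per node.
PARENT = {"A": None, "B": "A", "C": "A", "D": "B", "E": "D", "F": "D"}
TEMPS = {"A": 18.5, "B": 21.0, "C": 19.2, "D": 24.7, "E": 23.1, "F": 20.4}


def collect_max(parent, readings):
    """Compute MAX in the network: one partial state record per node."""
    children = {node: [] for node in parent}
    for node, p in parent.items():
        if p is not None:
            children[p].append(node)

    def partial_state(node):
        # Combine the node's own reading with its children's partial states.
        state = readings[node]
        for child in children[node]:
            state = max(state, partial_state(child))
        return state

    root = next(node for node, p in parent.items() if p is None)
    return partial_state(root)


def messages_in_network(parent):
    """Each non-root node sends exactly one record per epoch."""
    return sum(1 for p in parent.values() if p is not None)


def messages_centralized(parent):
    """Without aggregation, every reading is relayed hop by hop to the root."""
    total = 0
    for node in parent:
        while parent[node] is not None:
            total += 1
            node = parent[node]
    return total
```

For this six-node tree, in-network aggregation needs 5 messages per epoch, while relaying every raw reading to the root needs 10; the gap grows with the depth of the tree.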
In-Network Data Estimation Query Processing
Types of phenomena:
- Discrete. Example: did a truck pass or not?
- Continuous (fields). Example: temperature field, toxic clouds
Types of spatial queries: window or point queries
- Discrete sensor readings
- Estimation over continuous phenomena
- Computation is pushed to the nodes
- Still: computationally complex and expensive
[Figure: spatial window query]
Detecting and Tracking Continuous Phenomena
[Figure: spatial window query over a toxic plume]
Tracking Continuous Phenomena
- Network configuration based on qualitative characteristics of a phenomenon
- Collaboration with M. Worboys
In-Network Query Processing
Key: acquisitional query processing
- Traditional query processing: query processing on stored data
- Sensor network query processing: acquiring the data from sensors
- The acquisitional query processor controls when, where, and with what frequency data is collected.
Acquisitional Query Processing
Basic strategies:
- Continuous queries with rates or lifetimes
  - “SELECT temperature FROM sensors WHERE location=ESRB AND EPOCH=1h UNTIL DATE=11/25/06”
- Events for asynchronous triggering
Optimization strategies:
- E.g. avoiding unnecessary acquisition
- Sampling as a query operator
- Choosing where to sample via co-acquisition
- Index-like data structures
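"Avoiding unnecessary acquisition" can be illustrated with a conjunctive predicate over two sensors of very different sampling cost: if the cheap sensor already fails its test, the expensive sensor never needs to be sampled. The sketch below orders predicates by cost only, which is a simplification (a real optimizer would also weigh selectivity); the energy costs, attribute names, and thresholds are hypothetical.

```python
def evaluate_conjunction(predicates, acquire):
    """Evaluate an AND of predicates, sampling sensors lazily.

    predicates: list of (attribute, sampling_cost, test) tuples.
    acquire:    function that samples one attribute on demand.
    Returns (result, energy_spent, attributes_sampled).
    """
    spent = 0.0
    sampled = []
    # Cheapest sensor first, so a failure avoids the expensive samples.
    for attr, cost, test in sorted(predicates, key=lambda p: p[1]):
        spent += cost
        sampled.append(attr)
        if not test(acquire(attr)):
            return False, spent, sampled  # short-circuit: skip the rest
    return True, spent, sampled


# Hypothetical per-sample energy costs (arbitrary units).
PREDICATES = [
    ("accel", 50.0, lambda v: v > 0.5),  # expensive sensor
    ("light", 1.0, lambda v: v > 400),   # cheap sensor
]

dark_room = {"light": 350, "accel": 0.9}
bright_room = {"light": 450, "accel": 0.9}

dark_result = evaluate_conjunction(PREDICATES, dark_room.__getitem__)
bright_result = evaluate_conjunction(PREDICATES, bright_room.__getitem__)
```

In the dark room only the light sensor is sampled, at a cost of 1 unit instead of 51; in the bright room both sensors have to be sampled.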
Lifetime Queries
Lifetime vs. sample rate:
  SELECT … EPOCH DURATION 10 s
  SELECT … LIFETIME 30 days
Extra: allow a MAX SAMPLE PERIOD
- Discard some samples: the node samples more often than it transmits.
- Sampling is cheaper than transmitting.
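A lifetime clause implies a sample period once the energy budget is known. The following sketch shows the arithmetic under a deliberately simple energy model: a fixed battery budget, a fixed cost per sample and per transmission, and no idle or listening cost. All numbers and the function name `plan_lifetime` are hypothetical.

```python
def plan_lifetime(battery_j, lifetime_s, sample_j, transmit_j,
                  max_sample_period_s=None):
    """Derive a sample period (and transmit fraction) from a target lifetime."""
    # How many sample-and-transmit epochs the battery can pay for in total.
    epochs = battery_j / (sample_j + transmit_j)
    period_s = lifetime_s / epochs
    if max_sample_period_s is None or period_s <= max_sample_period_s:
        return {"sample_period_s": period_s, "transmit_fraction": 1.0}

    # The lifetime would force sampling too rarely: sample at the maximum
    # allowed period and transmit only a fraction of the samples instead.
    samples = lifetime_s / max_sample_period_s
    energy_left = battery_j - samples * sample_j
    if energy_left <= 0:
        raise ValueError("battery cannot sustain the requested sample period")
    fraction = min(1.0, energy_left / (samples * transmit_j))
    return {"sample_period_s": max_sample_period_s,
            "transmit_fraction": fraction}


THIRTY_DAYS_S = 30 * 24 * 3600  # 2,592,000 seconds

# Hypothetical budget: 2592 J battery, 1 mJ per sample, 9 mJ per transmission.
plain = plan_lifetime(2592.0, THIRTY_DAYS_S, 0.001, 0.009)
capped = plan_lifetime(2592.0, THIRTY_DAYS_S, 0.001, 0.009,
                       max_sample_period_s=5.0)
```

With these numbers, a 30-day lifetime corresponds to a period of about 10 s. If the period is capped at 5 s, the node can still last 30 days by transmitting roughly 4 out of every 9 samples, which works only because sampling is much cheaper than transmitting.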
Adaptive & Decentralized Operator Placement
Main idea:
- Place operators near the data sources.
- The greater a source's sample rate, the closer the operator is placed to it.
- For each operator:
  - Explore candidate neighbors.
  - Migrate to lower-cost placements.
  - This exploration requires extra messages.
[Figure: an operator placed between two sources with sample rates "Rate A" and "Rate B"]
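The placement idea can be sketched as a greedy local search: the cost of hosting an operator on a node is the sum of rate times hop count over all of its data flows, and the operator migrates to a neighbor whenever that neighbor is cheaper. The topology, the rates, and the cost model below are hypothetical simplifications; the overhead of the exploration messages is not modeled.

```python
from collections import deque

# Hypothetical topology: sources "a" and "b", relay nodes "x" and "y", sink "s".
NEIGHBORS = {
    "a": ["x"],
    "x": ["a", "y", "s"],
    "y": ["x", "b"],
    "b": ["y"],
    "s": ["x"],
}

# Messages per epoch exchanged with each endpoint (inputs and output).
FLOWS = {"a": 10, "b": 1, "s": 1}


def hop_counts(neighbors, start):
    """Breadth-first hop distance from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def placement_cost(neighbors, node, flows):
    """Messages times hops per epoch if the operator runs on node."""
    dist = hop_counts(neighbors, node)
    return sum(rate * dist[endpoint] for endpoint, rate in flows.items())


def migrate(neighbors, start, flows):
    """Move the operator to cheaper neighbors until no neighbor improves."""
    current = start
    while True:
        candidates = [current] + list(neighbors[current])
        best = min(candidates,
                   key=lambda n: placement_cost(neighbors, n, flows))
        if placement_cost(neighbors, best, flows) >= \
                placement_cost(neighbors, current, flows):
            return current
        current = best


FINAL = migrate(NEIGHBORS, "y", FLOWS)
```

Starting at node "y" (cost 23), the operator moves to "x" (cost 13) and then settles on the high-rate source "a" (cost 5), which matches the rule that the higher-rate source pulls the operator toward it.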
"Adaptivity" in Databases
- Adaptivity: changing query plans on the fly
  - Typically at the physical level:
    - where the plan runs
    - ordering of operators
    - instantiations of operators, e.g. hash join vs. merge join
- Non-traditional
  - Conventionally, complete plans are built prior to execution, using cost estimates (collected from history).
- Important in volatile or long-running environments
  - where a priori estimates are unlikely to be good
  - e.g., sensor networks
In-Network Data Storage
Storage challenges:
- Method: transmit all measurements to a central database for storage
- Advantage: unconstrained search on historic data
- Disadvantage: high power consumption
[Figure: centralized storage (db) compared with hierarchical in-network storage, which supports queries on different levels of detail]
Summary
- Declarative query processing
  - Simplifies data collection in sensornets
  - In-network processing, query optimization for performance
- Acquisitional query processing
  - Focus on costs associated with sampling data
  - New challenge of sensornets, other streaming systems?
- Adaptive join placement
  - In-network optimization
  - Some benefit, but practicality unclear
  - Operator pushdown still a good idea