Slide 1: Model-driven Data Acquisition in Sensor Networks
Amol Deshpande (1,4), Carlos Guestrin (2,4), Sam Madden (3,4), Joe Hellerstein (1,4), Wei Hong (4)
(1) UC Berkeley, (2) Carnegie Mellon University, (3) MIT, (4) Intel Research - Berkeley
Slide 2: Sensor networks and distributed systems
- A collection of devices that can sense, actuate, and communicate over a wireless network
- Available resources: 4 MHz 8-bit CPU, 40 Kbps wireless, 3V battery (lasts days or months)
- Sensors for temperature, humidity, pressure, sound, magnetic fields, acceleration, visible and ultraviolet light, etc.
- Analogous issues arise in other distributed systems, including data streams and the Internet
Slide 3: Real deployments
- Great Duck Island (monitoring Leach's Storm Petrel)
- Redwoods
- Precision agriculture
- Fabrication monitoring
Slide 4: Example: Intel Berkeley Lab deployment
Slide 5: Analogy: Sensor net as a database
- TinyDB: a declarative, SQL-style query interface to the network
- Distribute the query; collect the query answer or data at every time step
- Declarative interface: sensor nets are not just for PhDs, and deployment time decreases
- Data aggregation can reduce communication
Slide 6: Limitations of the existing approach
- The whole process (distribute the SQL-style query, collect data at every time step) is redone every time the query changes
- Query distribution: every node must receive the query
- Data collection: every node must wake up at every time step
- Data loss is ignored: no quality guarantees
- Data inefficient: correlations are ignored
Slide 7: Sensor net data is correlated
- Spatial-temporal correlation and inter-attribute correlation
- Data is not i.i.d., so missing data shouldn't be ignored
- Observing one sensor gives information about other sensors (and about future values)
- Observing one attribute gives information about other attributes
(A sketch of fitting a model that captures these correlations follows.)
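As a concrete illustration (not from the talk itself): a minimal sketch that fits a joint Gaussian to historical readings, where the off-diagonal covariance entries capture exactly the inter-sensor correlations described above. The array history, a hypothetical (epochs x sensors) matrix of past readings, is an assumption for illustration.

    import numpy as np

    def learn_gaussian(history):
        """Estimate a joint Gaussian P(X1, ..., Xn) from historical readings."""
        mu = history.mean(axis=0)              # per-sensor mean
        sigma = np.cov(history, rowvar=False)  # off-diagonals: inter-sensor correlation
        return mu, sigma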
Slide 8: Model-driven data acquisition: overview
Given an SQL-style query with a desired confidence, a probabilistic model produces a data-gathering plan; conditioning the model on the new observations yields a posterior belief, which is reused when the next query arrives.
Strengths of model-based data acquisition:
- Observe fewer attributes
- Exploit correlations
- Reuse information between queries
- Directly deal with missing data
- Answer more complex (probabilistic) queries
Slide 9: Probabilistic models and queries
User's perspective:

    SELECT nodeId, temp ± 0.5°C, conf(.95)
    FROM sensors
    WHERE nodeId in {1..8}

The system selects and observes a subset of nodes, e.g. {3, 6, 8}.

Query result:

    Node   1      2      3      4      5      6      7      8
    Temp.  17.3   18.1   17.4   16.1   19.2   21.3   17.5   16.3
    Conf.  98%    95%    100%   99%    95%    100%   98%    100%
Slide 10: Probabilistic models and queries
- Joint distribution P(X1, ..., Xn), learned from historical data
- Probabilistic query, e.g.: value of X2 ± ε with prob. > 1 − δ; if the prior probability is below 1 − δ, the query cannot yet be answered
- Observe attributes, e.g. observe X1 = 18 and compute P(X2 | X1 = 18): with the higher posterior probability, the query can be answered (a conditioning sketch follows)
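A minimal sketch of the conditioning step, assuming the Gaussian model fitted in the earlier sketch: computing a posterior such as P(X2 | X1 = 18) reduces to standard Gaussian conditioning. The function name and index conventions are illustrative, not from the talk.

    import numpy as np

    def condition(mu, sigma, obs, values, hidden):
        """Posterior over `hidden` indices given readings `values` at `obs` indices."""
        s_oo = sigma[np.ix_(obs, obs)]
        s_ho = sigma[np.ix_(hidden, obs)]
        gain = s_ho @ np.linalg.inv(s_oo)           # regression of hidden on observed
        post_mu = mu[hidden] + gain @ (values - mu[obs])
        post_sigma = sigma[np.ix_(hidden, hidden)] - gain @ s_ho.T
        return post_mu, post_sigma

    # e.g. belief over sensor 2 after observing sensor 1 at 18°C:
    # post_mu, post_sigma = condition(mu, sigma, obs=[0], values=np.array([18.0]), hidden=[1])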
Slide 11: Dynamic models: filtering
- Joint distribution over the attributes at each time t (e.g., a Kalman filter), learned from historical data
- Observe attributes (e.g., observe X1 = 18) and condition on the observations at time t
- Carrying the belief forward in time means fewer observations are needed for future queries
(A sketch of the prediction step follows.)
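A minimal sketch of the prediction step of such a dynamic model, assuming a learned linear transition matrix A and transition noise covariance Q (both assumptions for illustration); the measurement update is just the condition step from the previous sketch, applied to whichever sensors were actually observed at the new time step.

    import numpy as np

    def kf_predict(mu, sigma, A, Q):
        """Roll the belief over X^t forward to a prior over X^{t+1}."""
        return A @ mu, A @ sigma @ A.T + Q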
Slide 12: Supported queries
- Value query: Xi ± ε with prob. at least 1 − δ
- SELECT and range query: Xi ∈ [a, b] with prob. at least 1 − δ (e.g., which sensors have temperature greater than 25°C?)
- Aggregation: average ± ε of a subset of attributes with prob. > 1 − δ; aggregation and selection can be combined (e.g., what is the probability that more than 10 sensors have temperature greater than 25°C?)
- Queries require solving integrals; many are computed in closed form, some require numerical integration or sampling (a closed-form range example follows)
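For the range queries above, a minimal closed-form sketch: under a Gaussian posterior, P(Xi ∈ [a, b]) is a difference of normal CDFs, so no sampling is needed for this query class. Here post_mu and post_var are the posterior mean and variance from the earlier conditioning sketch.

    from scipy.stats import norm

    def range_prob(post_mu, post_var, a, b):
        """P(a <= Xi <= b) under a Gaussian posterior N(post_mu, post_var)."""
        sd = post_var ** 0.5
        return norm.cdf(b, loc=post_mu, scale=sd) - norm.cdf(a, loc=post_mu, scale=sd)

    # e.g. report "temperature > 25°C" when
    # range_prob(post_mu, post_var, 25.0, float("inf")) >= 1 - delta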
Slide 13: Model-driven data acquisition: overview (continued)
The same pipeline (SQL-style query with desired confidence, probabilistic model, data-gathering plan, condition on new observations, posterior belief), now asking:
- What sensors do we observe?
- How do we collect the observations?
Slide 14: Acquisition costs
- Attributes have different acquisition costs
- Exploit correlation through the probabilistic model: observing a cheaper, correlated sensor may suffice
- Must also consider the networking cost of reaching the chosen sensors
Slide 15: Network model and plan format
- Assume a known (quasi-static) network topology
- Cost of collecting a subset S of sensor values: C(S) = Ca(S) + Ct(S)
- Ct(S) is the expected cost of a traversal of S, defined by a (1.5-approximate) TSP tour under lossy communication
- Goal: find a subset S that is sufficient to answer the query at minimum cost C(S) (a cost sketch follows)
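A minimal sketch of the cost model C(S) = Ca(S) + Ct(S). For brevity it approximates the tour with a nearest-neighbor heuristic, a stand-in for the 1.5-approximate TSP the talk uses; acq_cost (per-node acquisition costs) and dist (expected pairwise transmission costs) are hypothetical inputs.

    def plan_cost(S, acq_cost, dist, root=0):
        """C(S) = Ca(S) + Ct(S): acquisition cost plus a tour visiting S from the root."""
        ca = sum(acq_cost[i] for i in S)
        tour, rest = [root], set(S) - {root}
        while rest:                                  # greedy nearest-neighbor tour
            nxt = min(rest, key=lambda j: dist[tour[-1]][j])
            tour.append(nxt)
            rest.remove(nxt)
        ct = sum(dist[u][v] for u, v in zip(tour, tour[1:])) + dist[tour[-1]][root]
        return ca + ct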
Slide 16: Choosing observation plan
- When is a subset S sufficient? Consider a range query: Xi ∈ [a, b] with prob. > 1 − δ
- If we observe S = s: Ri(s) = max{ P(Xi ∈ [a, b] | s), 1 − P(Xi ∈ [a, b] | s) }
- The value of S is unknown before observing, so take the expectation: Ri(S) = ∫ p(s) Ri(s) ds
- Optimization problem: minimize C(S) subject to Ri(S) ≥ 1 − δ (a greedy search sketch follows)
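A minimal sketch of a greedy variant of this search: grow S with the attribute that buys the most expected confidence per unit cost until the query's threshold is met. Here expected_conf(S) stands for the Ri(S) integral above and cost(S) for C(S) (e.g., the previous sketch with its other arguments fixed); both callables are assumptions for illustration.

    def greedy_plan(candidates, expected_conf, cost, delta):
        """Grow S until the expected confidence R_i(S) reaches 1 - delta."""
        S = set()
        while expected_conf(S) < 1 - delta:
            remaining = candidates - S
            if not remaining:
                break                        # nothing left to observe; best effort
            def gain(x):                     # confidence bought per unit of extra cost
                return ((expected_conf(S | {x}) - expected_conf(S))
                        / max(cost(S | {x}) - cost(S), 1e-9))
            S.add(max(remaining, key=gain))
        return S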
Slide 17: BBQ system
The pipeline instantiated:
- Queries: value, range, average
- Model: multivariate Gaussians, learned from historical data; conditioning is equivalent to a Kalman filter and reduces to simple matrix operations
- Planning: exhaustive or greedy search over subsets, with a factor-1.5 TSP approximation for the traversal
Slide 18: Experimental results
- Redwood trees and Intel Lab datasets
- Models learned from data: a static model and a dynamic model (Kalman filter with time-indexed transition probabilities)
- Evaluated on a wide range of queries
Slide 19: Cost versus confidence level
Slide 20: Obtaining approximate values
Query: true temperature value ± ε with confidence 95%
Slide 21: Approximate range queries
Query: temperature in [T1, T2] with confidence 95%
Slide 22: Comparison to other methods
Slide 23: Intel Lab traversals
Slide 24: BBQ system: extensions
Recap of the BBQ pipeline (value/range/average queries, multivariate Gaussians learned from historical data, Kalman-filter-equivalent conditioning via simple matrix operations, exhaustive or greedy planning with a factor-1.5 TSP approximation), plus extensions:
- More complex queries
- Other probabilistic models
- More advanced planning
- Outlier detection
- Dynamic networks
- Continuous queries
- ...
Slide 25: Conclusions
Model-driven data acquisition:
- Observe fewer attributes
- Exploit correlations
- Reuse information between queries
- Directly deal with missing data
- Answer more complex (probabilistic) queries
A basis for future sensor network systems.