1 DIMENSIONS: Why do we need a new Data Handling architecture for sensor networks? Deepak Ganesan, Deborah Estrin (UCLA), John Heidemann (USC/ISI) Presenter:


1 DIMENSIONS: Why do we need a new Data Handling architecture for sensor networks?
Deepak Ganesan, Deborah Estrin (UCLA), John Heidemann (USC/ISI)
Presenter: Vijay Sundaram

2 Deployment: Microclimate monitoring at James Reserve Park (UC Riverside)
Example queries:
- How well does the data fit a model of the variation of temperature with altitude?
- Send a robotic agent to the edge between low and high precipitation regions.
- Get detailed data from the node with maximum precipitation from Sept to Dec 2003.
- Hmm… I wonder why packet loss is so high. Get a connectivity map of the network for all transmit power settings.
[Figure: Weather Sensor Network]

3 Goals
- Flexible spatio-temporal querying
  - Provide ability to mine for interesting patterns and features in data.
  - Drill down on details.
- Distributed long-term networked data storage
  - Preserve ability for long-term data mining, while catering to node storage constraints.
- Performance
  - Reasonable accuracy for a wide range of queries
  - Low communication (energy) overhead

4 How can we achieve goals?
- Exploit redundancy in data
  - Potentially huge gains from lossy compression exploiting spatio-temporal correlation.
- Exploit rarity of interesting features
  - Preserve only interesting features.
- Exploit scale of sensor network
  - Large distributed storage, although limited local storage.
- Exploit low cost of approximate query processing
  - Allow approximate query processing that obtains sufficiently accurate responses.

5 Can existing systems satisfy design goals?
[Figure: Data Correlation vs. Decentralization. Axes: Exploited Data Correlation (None, Temporal, Spatial) and Degree of Decentralization (Centralized, Hierarchical, Fully Distributed). Systems plotted: P2P (DHT, Gnutella), Web Caches, Centralized Data Collection, Geo-Spatial Data Mining, Streaming Media (MPEG-2), Wireless Sensor Networks.]

6 DIMENSIONS Design: Key Ideas
- Construct a hierarchy of lossy compressed summaries of data using wavelet compression.
- Queries "drill down" from the root of the hierarchy to focus the search on small portions of the network.
- Progressively age lossy data along the spatio-temporal hierarchy to enable long-term storage.
[Figure: hierarchy with Level 0, Level 1, Level 2; labels: PROGRESSIVELY LOSSY, PROGRESSIVELY AGE]

7 Roadmap
- Why wavelets?
- Example Precipitation Hierarchy
- Spatial and Temporal Processing internals
- Initial Results: Precipitation Dataset

8 Enabling Technique: Wavelets
- Very popular signal processing approach that provides good time and frequency localization.
  - Used in JPEG2000 and geo-spatial data mining.
- Preserves spatio-temporal features (edges, discontinuities) while providing a good approximation of long-term trends in data.
- Efficient distributed implementation possible.
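The localization property can be illustrated with a minimal sketch, assuming a single level of the orthonormal Haar transform on a 1-D series. The function name `haar_step` and the toy signal are illustrative and do not come from the slides. The detail coefficients are nonzero only where the jump occurs, while the approximation coefficients carry the overall trend.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar transform on an even-length list."""
    s = 1 / math.sqrt(2)
    # Pairwise scaled sums give the coarse trend.
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    # Pairwise scaled differences respond only to local change.
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

# Toy series: flat at 2.0, then a jump to 10.0 inside the third pair.
signal = [2.0] * 5 + [10.0] * 3
approx, detail = haar_step(signal)

# Indices of pairs whose detail coefficient is non-negligible.
edges = [i for i, d in enumerate(detail) if abs(d) > 1e-9]
print(edges)
```

Only the pair that straddles the jump produces a detail coefficient, which is why thresholding small coefficients keeps edges while discarding smooth regions.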

9 Sample Architecture: Precipitation Hierarchy
- Local processing: construct lossy time-series summary (zero communication cost).
- Spatial data processing: hierarchical lossy compression
  - Organize network into hierarchy. At each higher level, reduce the number of participating nodes by a factor of 4.
  - At each step of the hierarchy, summarize data from 4 quadrants and propagate wavelet coefficients.
- Example query: What is the maximum precipitation between Sept-Dec 2002?
  - Direct the query to the quadrant that best matches the query.
[Figure labels: decreasing temporal resolution, decreasing spatial resolution]
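The factor-of-4 reduction can be sketched as follows, assuming a square grid whose side is a power of two and using the 2x2 block mean as a stand-in for the summary (the block mean is proportional to the 2-D Haar approximation coefficient). The names `summarize_level` and `build_hierarchy` are hypothetical; the slides propagate thresholded wavelet coefficients rather than plain means.

```python
def summarize_level(grid):
    """Merge each 2x2 block of cells into one coarser cell holding the block mean."""
    n = len(grid)
    return [
        [
            (grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]) / 4.0
            for c in range(0, n, 2)
        ]
        for r in range(0, n, 2)
    ]

def build_hierarchy(grid):
    """Return [level 0 grid, level 1 grid, ...] up to a single root cell."""
    levels = [grid]
    while len(levels[-1]) > 1:
        levels.append(summarize_level(levels[-1]))
    return levels

# Toy 4x4 field of readings 0..15.
grid = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
levels = build_hierarchy(grid)

# Number of participating cells at each level: each step divides by 4.
sizes = [len(level) ** 2 for level in levels]
print(sizes)
```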

10 Spatial Decomposition
- Recursively split network into non-overlapping square grids.
- At each level of the hierarchy:
  - Elect a cluster-head.
  - Cluster-head combines and summarizes data from 4 quadrants.
  - Cluster-head propagates compressed data to the next level of the hierarchy.
- Routing protocol: GPSR variant (DCS, Ratnasamy et al.)
[Figure: Hierarchy construction]
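A minimal sketch of the recursive split, under one loud assumption: the slides do not say how the cluster-head is elected, so this example simply picks the node closest to the cell centre and breaks ties by lowest node id. The function `build_clusters` and the dictionary layout are hypothetical.

```python
def build_clusters(nodes, x0, y0, size, depth):
    """Recursively split the square [x0, x0+size) x [y0, y0+size) into quadrants.

    nodes: list of (node_id, x, y). Returns a nested dict, or None for an empty cell.
    """
    inside = [n for n in nodes if x0 <= n[1] < x0 + size and y0 <= n[2] < y0 + size]
    if not inside:
        return None
    cx, cy = x0 + size / 2, y0 + size / 2
    # Assumed election rule: nearest to the cell centre, ties broken by node id.
    head = min(inside, key=lambda n: ((n[1] - cx) ** 2 + (n[2] - cy) ** 2, n[0]))
    cell = {"origin": (x0, y0), "size": size, "head": head[0], "children": []}
    if depth > 0 and len(inside) > 1:
        half = size / 2
        for qx, qy in ((x0, y0), (x0 + half, y0), (x0, y0 + half), (x0 + half, y0 + half)):
            child = build_clusters(inside, qx, qy, half, depth - 1)
            if child is not None:
                cell["children"].append(child)
    return cell

# 16 nodes placed at the centres of a 4x4 grid of unit cells.
nodes = [(i, (i % 4) + 0.5, (i // 4) + 0.5) for i in range(16)]
tree = build_clusters(nodes, 0.0, 0.0, 4.0, depth=2)
print(tree["head"], [child["head"] for child in tree["children"]])
```

Each cluster-head would then summarize the data held by the heads in its `children` list before forwarding upward.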

11 Wavelet Compression Internals
- Filter
  - Haar filter
  - Daubechies 9/7 filter
- Cost Metric
  - Communication budget
  - Error bound
- Pipeline: Input Data (x, y, time) → Wavelet Subband Decomposition → Thresholding + Quantization + Drop Subbands → Lossless Encoder → Compressed Output (x, y, time)
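The pipeline can be sketched end to end on a 1-D series, assuming a Haar filter, a fixed threshold, uniform quantization, and Python's `zlib` as a stand-in lossless encoder. These choices, and the names `haar_forward`, `haar_inverse`, `compress`, and `decompress`, are illustrative assumptions and not the authors' implementation; dropping whole subbands is omitted.

```python
import math
import struct
import zlib

def haar_forward(x):
    """Multi-level orthonormal Haar transform; len(x) must be a power of two."""
    out = list(x)
    n = len(out)
    s = 1 / math.sqrt(2)
    while n > 1:
        half = n // 2
        approx = [(out[2 * i] + out[2 * i + 1]) * s for i in range(half)]
        detail = [(out[2 * i] - out[2 * i + 1]) * s for i in range(half)]
        out[:n] = approx + detail
        n = half
    return out

def haar_inverse(coeffs):
    """Inverse of haar_forward."""
    out = list(coeffs)
    n = 1
    s = 1 / math.sqrt(2)
    while n < len(out):
        merged = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged += [(a + d) * s, (a - d) * s]
        out[:2 * n] = merged
        n *= 2
    return out

def compress(x, threshold, step):
    """Transform, zero small coefficients, quantize, then losslessly encode."""
    coeffs = haar_forward(x)
    kept = [c if abs(c) >= threshold else 0.0 for c in coeffs]
    quantized = [int(round(c / step)) for c in kept]
    return zlib.compress(struct.pack(f"<{len(quantized)}i", *quantized))

def decompress(blob, n, step):
    """Decode, dequantize, and invert the transform."""
    quantized = struct.unpack(f"<{n}i", zlib.decompress(blob))
    return haar_inverse([q * step for q in quantized])

series = [10.0] * 32 + [20.0] * 32
blob = compress(series, threshold=1.0, step=0.5)
restored = decompress(blob, len(series), step=0.5)
print(len(blob), max(abs(a - b) for a, b in zip(series, restored)))
```

Raising `threshold` or `step` trades reconstruction error for fewer bytes, which is where the communication-budget and error-bound cost metrics would plug in.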

12 Initial Results with Precipitation Dataset: Communication Overhead
- 15x12 grid (50 km edge) of precipitation data from 1949-1994, from the Pacific Northwest †.
- Gridded before processing.
- Handpicked choice of threshold, quantization intervals, and subbands to drop.
- Huffman encoder at output.
- Very large compression ratio up the hierarchy.

Level | Raw data size (Kb), R | Mean data sent to next level (Kb), M | Compression ratio = R/M
1     | 262.5                 | 5.6                                  | 46.6
2     | 984.4                 | 3.8                                  | 257.2
3     | 3937.7                | 7.4                                  | 987
4     | 11813.2               | 2.5                                  | 2286.2

† M. Widmann and C. Bretherton. 50 km resolution daily precipitation for the Pacific Northwest, 1949-94.
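The output stage can be illustrated with a small Huffman coder. This is a generic textbook construction using `heapq`, not the encoder used for the results above, and the quantized values are made up for the example. After thresholding, most coefficients are zero, so zero receives the shortest code.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Return {symbol: bitstring} for a Huffman code built from symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: partial_code}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        # Prefix the two lightest subtrees with 0 and 1 and merge them.
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

# Hypothetical quantized coefficients: mostly zeros after thresholding.
quantized = [0] * 12 + [3, 3, -1, 7]
code = huffman_code(quantized)
bits = "".join(code[s] for s in quantized)
print(code, len(bits))
```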

13 Find maximum annual precipitation for each year.
- Exact answer for 89% of queries.
- Within 90% of answer for >95% of queries.
- Queries require less than 3% of network.
- Good performance on average with very low lookup overhead.
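A drill-down maximum query can be sketched as below, assuming each coarser cell stores the exact maximum of its four children. That assumption makes the toy query always exact; with lossy wavelet summaries, as in the results above, the drill-down can follow the wrong quadrant, which is why only some queries return the exact answer. The names `build_max_pyramid` and `drill_down_max` are hypothetical.

```python
def build_max_pyramid(grid):
    """Return [level 0 grid, level 1 grid, ...], each coarser cell holding its 2x2 block max."""
    levels = [grid]
    while len(levels[-1]) > 1:
        g = levels[-1]
        n = len(g)
        levels.append([
            [max(g[r][c], g[r][c + 1], g[r + 1][c], g[r + 1][c + 1]) for c in range(0, n, 2)]
            for r in range(0, n, 2)
        ])
    return levels

def drill_down_max(levels):
    """Follow the best-matching quadrant from the root down to a single leaf cell."""
    r = c = 0
    visited = 1  # the root summary
    for level in range(len(levels) - 2, -1, -1):
        g = levels[level]
        quads = [(2 * r + dr, 2 * c + dc) for dr in (0, 1) for dc in (0, 1)]
        visited += len(quads)
        r, c = max(quads, key=lambda rc: g[rc[0]][rc[1]])
    return (r, c), levels[0][r][c], visited

# Toy 8x8 field with one clear peak at row 5, column 2.
grid = [[float(r + c) for c in range(8)] for r in range(8)]
grid[5][2] = 99.0
cell, value, visited = drill_down_max(build_max_pyramid(grid))
print(cell, value, visited)
```

The query touches one root summary plus four cells per level instead of every leaf, which is the source of the low lookup overhead.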

14 Locate boundary in annual precipitation between Low and High Precipitation Areas
- Error metric: number of nodes greater than 1 pixel distance from drill-down boundary.
- Accuracy: within 25% error for 93% of the queries (or within 13% error for 75% of the queries).
- Less than 5% of the network queried.
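One possible reading of the error metric is sketched below. It assumes the metric counts true-boundary cells that lie more than one pixel (Chebyshev distance) from every cell of the boundary found by drill-down; the slide does not define the distance or the direction of comparison, so both are assumptions, and `boundary_error` is a hypothetical name.

```python
def boundary_error(true_boundary, drilldown_boundary, tolerance=1):
    """Count true-boundary cells farther than `tolerance` pixels from the drill-down boundary."""
    if not drilldown_boundary:
        return len(true_boundary)

    def dist(a, b):
        # Chebyshev distance: diagonal neighbours count as one pixel away.
        return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

    misses = 0
    for cell in true_boundary:
        if min(dist(cell, other) for other in drilldown_boundary) > tolerance:
            misses += 1
    return misses

# Made-up boundaries given as (row, col) cells.
true_boundary = [(0, 0), (1, 1), (2, 2), (3, 3)]
found = [(0, 1), (1, 2), (2, 5)]
misses = boundary_error(true_boundary, found)
print(misses, misses / len(true_boundary))
```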

15 Open Issues
- Load balancing and robustness
  - Hierarchical model vs. peer model: lot of work in p2p systems…
- Irregular node placement
  - Use wavelet extensions for irregular node placement. Computationally more expensive.
  - Gridify dataset with interpolation.
- Providing query guarantees
  - Can we bound the error in the response obtained for a drill-down query at a particular level of the hierarchy?
- Implementation on iPAQ/mote network
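Gridifying irregularly placed readings could look like the following sketch. The slides do not name an interpolation method, so inverse-distance weighting is an assumption chosen for brevity, and `gridify` is a hypothetical helper.

```python
def gridify(samples, width, height, power=2.0):
    """Estimate a value at every integer grid point from scattered samples.

    samples: list of (x, y, value). Returns a height x width list of lists.
    """
    grid = []
    for gy in range(height):
        row = []
        for gx in range(width):
            num = den = 0.0
            exact = None
            for x, y, v in samples:
                d2 = (x - gx) ** 2 + (y - gy) ** 2
                if d2 == 0.0:
                    exact = v  # a sample sits exactly on this grid point
                    break
                w = 1.0 / d2 ** (power / 2)  # weight falls off with distance
                num += w * v
                den += w
            row.append(exact if exact is not None else num / den)
        grid.append(row)
    return grid

# Two made-up sensors on a 3x1 strip; the middle point is equidistant from both.
samples = [(0, 0, 10.0), (2, 0, 20.0)]
print(gridify(samples, width=3, height=1))
```

Once the data sits on a regular grid, the standard wavelet decomposition applies unchanged, at the cost of interpolation error.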

16 Summary
DIMENSIONS provides a holistic data handling architecture for sensor networks that can:
- Support a wide range of sensor-network usage and query models (using drill-down querying of wavelet summaries).
- Provide a gracefully degrading lossy storage model (by progressively ageing summaries).
- Offer the ability to tune energy expended for query performance (tunable lossy compression).

17 Different optimization metrics
[Figure: comparison of Internet-based Peer-to-Peer Systems, Geo-Spatial Data Mining, Web Caches, Streaming Media (MPEG-2), and Wireless Sensor Networks. Labels: Energy, Latency, Bandwidth, Approximate Results OK, Spatio-temporal Query Performance, Lookup Cost, Resource Constraints.]

18 Other Examples: Packet Loss
- Different example of a dataset that exhibits spatial correlation:
  - Throughput from one transmitter to proximate receivers is correlated.
  - Throughput from multiple proximate transmitters to one receiver is correlated.
- Typically, what we want to query is the deviations from normal and average throughput.

19 Packet-Loss Dataset: Get Throughput vs. Distance Map
- Involves expensive transfer of 12x14 map from each node.
- Good approximate results can be obtained from querying compressed data.

20 Long-term Storage: Concepts
- Data is progressively aged, both locally and along the hierarchy.
- Summaries that cover larger areas and longer time periods are retained for much longer than raw time series.
[Figure labels: Slower Ageing, Wavelet Coefficients]
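Progressive ageing can be sketched as a retention policy keyed on hierarchy level. The specific rule here (a 7-day base that doubles per level) is a made-up assumption for illustration, and `retention` and `age_store` are hypothetical names; the slides state only that coarser summaries are retained for much longer than raw data.

```python
def retention(level, base=7, factor=2):
    """Assumed policy: level-0 data is kept `base` days, each level up `factor` times longer."""
    return base * factor ** level

def age_store(store, now):
    """Return the entries still retained at time `now` (in days).

    store: list of dicts with 'level' and 'created' keys.
    """
    return [e for e in store if now - e["created"] <= retention(e["level"])]

store = [
    {"level": 0, "created": 0},  # raw time series
    {"level": 1, "created": 0},  # summary over a small area
    {"level": 2, "created": 0},  # summary over a larger area
]
print([e["level"] for e in age_store(store, now=10)])
print([e["level"] for e in age_store(store, now=20)])
```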

21 Load Balancing and Robustness: Concepts
- Hierarchical Model
  - Naturally fits wavelet processing.
  - Strict hierarchies are vulnerable to node failures. Failures near the root of the hierarchy can be expensive to repair.
- Decentralized Peer Model
  - Summaries communicated to multiple nodes probabilistically.
  - Better robustness, but incurs greater communication overhead.

