
1 Query and Storage in Wireless Sensor Networks

2 The Problem ◊How to perform efficient query and storage in wireless sensor networks? ◊Design goals:  Distributed design that works for hundreds of nodes in a sensor network  Reduce communication overhead in query and storage ◊Challenges/Issues:  Scale of the network: # of nodes  Distributed solution

3 The Solution Main ideas: ◊Store locally, access globally ◊Location-based solution  Use the location information of each node  Build location-based indexing  Exploit geographic routing ◊Exploit locality  Store relevant event information locally, so search is easier ◊Hierarchical indexing to scale  Divide into multiple tiers/levels  Tradeoff between storage overhead and query steps

4 Overview of DIMs Reference paper: "Multi-dimensional Range Queries in Sensor Networks", ACM SenSys 2003. ◊Events (multi-dimensional space) are mapped onto zones of the network (2-D space) ◊Data locality: events with comparable attribute values are stored at the same location in the network ◊Locality-preserving geographic hash ◊Events are routed to and stored at the node owning the hashed zone ◊Queries are routed to and resolved by the appropriate nodes

5 Zone ◊Rectangle R on the x-y plane (the entire network) ◊A subrectangle Z is a zone if Z is obtained by dividing R k times, satisfying the following property:  After the i-th division, 1 ≤ i ≤ k, R is partitioned into 2^i equal rectangles. If i is odd (even), the division is parallel to the y-axis (x-axis). ◊k is the level of the zone, level(Z) = k

6 Zone Identification ◊code(Z)  Bit string of length level(Z)  Bits are assigned left to right, one per division: for an odd division, the bit is 0 if Z lies in the left half of the current rectangle, else 1; for an even division, the bit is 0 if Z lies in the bottom half, else 1 ◊addr(Z)  Centroid of the zone rectangle
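As an illustration of the zone encoding, here is a minimal Python sketch assuming the field is the unit square (not code from the paper):

```python
def zone_code(x, y, level):
    """Compute the level-bit code of the zone containing point (x, y).
    Odd divisions split the current rectangle left/right, even
    divisions split it bottom/top, per the DIM zone definition."""
    code = []
    x0, x1, y0, y1 = 0.0, 1.0, 0.0, 1.0      # current rectangle
    for i in range(1, level + 1):
        if i % 2 == 1:                        # odd: parallel to y-axis
            mid = (x0 + x1) / 2
            bit = 0 if x < mid else 1
            x0, x1 = (x0, mid) if bit == 0 else (mid, x1)
        else:                                 # even: parallel to x-axis
            mid = (y0 + y1) / 2
            bit = 0 if y < mid else 1
            y0, y1 = (y0, mid) if bit == 0 else (mid, y1)
        code.append(str(bit))
    return "".join(code)

# e.g. zone_code(0.2, 0.7, 3) -> '010': left half, top half, left quarter
```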

7 Zone Terminology ◊Sibling subtree of a zone  The other (left/right) subtree under the zone's parent in the zone tree ◊Backup zone  If a node's sibling subtree is on the left (right), its backup zone is the rightmost (leftmost) zone in that sibling subtree

8 Zones Example

9 Associating Zones with Nodes ◊Sensor field divided into zones, which can be of different sizes ◊Zone ownership  A owns Z_A → Z_A is the largest zone that contains only node A  Some zones may not have a node owner → backup(Z) is the owner

10 Algorithm for Zone Ownership ◊Each node maintains its four boundaries  Initialized to the network boundary ◊Send messages to learn the locations of neighbors  If a neighbor responds, the node adjusts its boundaries accordingly  Else the boundary is 'undecided' ◊Undecided boundaries are resolved during querying or event insertion

11 Event Insertion ◊Hashing an event to a zone ◊Routing an event to its owner ◊Resolving undecided zone boundaries during insertion

12 Event Insertion Hashing an Event to a Zone ◊There are m attributes A_1, A_2, …, A_m, and attribute values have been normalized to [0, 1) ◊To assign a k-bit zone code to an event:  For i in [1, m], if A_i < 0.5, the i-th bit of the zone code is 0, else 1.  For i in [m+1, 2m], if A_{i-m} < 0.25 or 0.5 ≤ A_{i-m} < 0.75, then the i-th bit is 0, else 1.  And so on, until all k bits are assigned

13 Event Insertion Hashing an Event to a Zone ◊Example: hash an event to a 5-bit zone code  First bit is 0 on: [0, 0.5)  Second bit is 0 on: [0, 0.25) and [0.5, 0.75)  Third bit is 0 on: [0, 0.125), [0.25, 0.375), [0.5, 0.625), [0.75, 0.875)  Zone code = 01110
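Read literally, bit j of the code is just a successive binary-fraction digit of one attribute, taken in round-robin order. A minimal Python sketch of this reading (an illustration, not the paper's code; the example event values below are hypothetical, chosen to reproduce the slide's code 01110):

```python
def dim_zone_code(attrs, k):
    """Hash an event with m normalized attributes (each in [0, 1))
    to a k-bit DIM zone code: bit j comes from attribute j mod m,
    with each full pass over the attributes halving the ranges again."""
    m = len(attrs)
    bits = []
    for j in range(k):
        i = j % m              # attribute supplying this bit
        depth = j // m         # number of completed halvings
        # the (depth+1)-th binary-fraction digit of attrs[i]
        bits.append(str(int(attrs[i] * 2 ** (depth + 1)) % 2))
    return "".join(bits)

# Hypothetical event: dim_zone_code([0.3, 0.8], 5) returns '01110'
```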

14 Event Insertion Routing an Event to its Owner ◊GPSR delivers the message to node A  Message contains: event E, code(E), target location, owner, location of owner, … ◊A encodes the event to code_new(E)  Updates the message if code_new(E) is longer than the code in the message ◊A checks whether code(A) is a longer match with code(E) than the previous owner's code  If yes, it updates the message by setting itself as the owner ◊If code(A) and code(E) are identical and A's boundaries are known, A is the owner of E and stores it ◊Else A routes E toward its owner by invoking GPSR
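A hedged sketch of this per-hop decision (the message fields, node attributes, common_prefix helper, and gpsr_forward stub are illustrative assumptions, not the paper's API; dim_zone_code is the hash sketched above):

```python
def handle_insert(node, msg):
    """One hop of DIM event insertion (cf. slide 14).  node.code is
    this node's zone code; gpsr_forward stands in for geographic
    routing toward the current owner's location."""
    event = msg["event"]
    # Refine the event's code if this node can encode it more precisely.
    new_code = dim_zone_code(event["attrs"], len(node.code))
    if len(new_code) > len(msg["code"]):
        msg["code"] = new_code
    # Claim ownership if our code matches code(E) better than the
    # previous owner's code does.
    if common_prefix(node.code, msg["code"]) > \
       common_prefix(msg["owner_code"], msg["code"]):
        msg["owner"], msg["owner_code"] = node.id, node.code
    # Store locally if this node owns the event's zone outright.
    if node.code == msg["code"] and node.boundaries_known:
        node.store(event)
    else:
        gpsr_forward(msg, msg["owner"])

def common_prefix(a, b):
    """Length of the common prefix of two zone-code bit strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n
```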

15 Event Insertion Resolving Undecided Boundaries ◊Suppose node C receives event E ◊If code(C) = code(E) and all of C's boundaries are known, C stores the event ◊If C has undecided boundaries, its zone may overlap with another node's  C sets itself as the owner and forwards the message using GPSR perimeter mode ◊If the message comes back to C unchanged  C concludes it is the owner and stores the event

16 Event Insertion Resolving Undecided Boundaries ◊If an intermediate node X marks itself as the owner but code(E) is unchanged  X recognizes the zone overlap with C, adjusts its own boundaries, and messages C to update its boundaries ◊If an intermediate node D refines code(E)  D tries to deliver the message to the new zone  Another node X may still overlap with C  X shrinks its zone and sends C messages to do the same  C updates its undecided boundary

17 Event Insertion Example ◊Nodes A and B have both claimed the same zone 0 ◊Node A generates an event E with code(E) = 0 ◊The event is forwarded to B in perimeter mode ◊B and A exchange messages to shrink their zones

18 Queries Routing and Resolving ◊Routing for point queries is the same as for events ◊Range queries  The query is initially routed to the zone corresponding to the entire range  It is progressively split into smaller sub-queries so that each sub-query can be resolved by a single node

19 Queries Splitting Queries ◊If the range of Q's first attribute contains the value 0.5, A divides Q into two sub-queries, one with range 0 to 0.5 and the other 0.5 to 1 ◊If Q_A, the part of the query area that overlaps zone(A), is empty, A stops splitting ◊Else A continues splitting the query along successive attribute ranges, recomputing Q_A, until a sub-query is small enough to fit entirely in zone(A) (a sketch of one splitting step follows below)
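A minimal sketch of one splitting step, assuming a query is a list of (lo, hi) attribute ranges over [0, 1) (the full algorithm recurses over successive attributes and finer midpoints):

```python
def split_query(query, attr=0, mid=0.5):
    """Split a range query at the midpoint of one attribute's interval
    (cf. slide 19).  query is a list of (lo, hi) ranges, one per attribute."""
    lo, hi = query[attr]
    if lo < mid < hi:
        left, right = list(query), list(query)
        left[attr], right[attr] = (lo, mid), (mid, hi)
        return [left, right]
    return [query]        # range lies entirely in one half: no split

# e.g. split_query([(0.3, 0.8), (0.6, 0.9)]) yields two sub-queries:
#   [(0.3, 0.5), (0.6, 0.9)] and [(0.5, 0.8), (0.6, 0.9)]
```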

20 Queries Splitting Queries Example ◊Suppose there is a node A with code(A) = 0110 ◊Split a query Q against A's zone

21 Query Resolution ◊Once a sub-query falls into a zone, the node owner resolves the query and sends the reply to the querier ◊The other sub-queries are forwarded to other nodes

22 Analysis on DIMs ◊Metrics  Average insertion cost – average number of messages required to insert an event into the network  Average query delivery cost – average number of messages required to route a query message to all the relevant nodes ◊Compared against alternatives  GHT-R and flooding

23 Average Insertion Cost

24 Summary ◊DIM builds a flat indexing structure in sensor networks  Store similar events in local nodes  Use geographic routing to query events ◊Use location information  Determine the zone code  Hashing an event to a zone code ◊Limitations:  Flat structure can incur large overhead in a large network

25 Why do we need archival storage? Applications need historical sensor information: trigger events (a crash in traffic monitoring, a break-in in surveillance, a natural disaster in environmental monitoring) lead to requests for past information, and this requires archival storage. Reference paper: "TSAR: A Two Tier Sensor Storage Architecture Using Interval Skip Graphs", ACM SenSys 2005.

26 Existing Storage and Indexing Approaches ◊Streaming query systems  TinyDB (Madden 2005), etc.  Data storage and indexing are performed outside the network  Optimized for continuous queries; high energy cost if used for archival, since data must be transmitted to a central data store ◊In-network storage and indexing  DCS, GHT (Ratnasamy 2002)  Dimensions (Ganesan 2003)  Directed Diffusion (Intanagonwiwat 2000)  Limited by the lack of sufficient, energy-efficient storage and of communication and computation resources on current sensor platforms

27 Technology Trends

Platform     Radio (µJ/byte)   Flash (µJ/byte)   Max flash size
Mica2        30                4.5               0.5 MB
MicaZ        3.4               4.5               0.5 MB
Telos        3.4               1                 1 MB
UMass NAND   -                 0.01              >1 GB

Per the slide's 100x/1000x annotations, communication costs two to three orders of magnitude more energy per byte than storage on new NAND flash. New flash technologies enable large storage systems on small, energy-constrained sensors.

28 Hierarchical Storage and Indexing Hierarchical deployments are being used to provide scaling, e.g., the James Reserve (CENS): higher-powered micro-servers (proxies) are deployed between the application and resource-constrained sensor nodes. Key challenge: exploit proxy resources to perform intelligent search across data on resource-constrained nodes.

29 Key Ideas in TSAR ◊Exploit storage trends for archival  Use cheap, low-power, high-capacity flash memory in preference to communication ◊Index at proxies, store at sensors  Exploit proxy resources to conserve sensor resources and improve system performance ◊Extract key searchable attributes  Distill sensor data into concise attributes, such as ranges of time or value, that can be used for location and retrieval but require less energy to transmit

30 TSAR Architecture 1. Distributed index: an Interval Skip Graph-based index among proxies exploits proxy resources to locate data stored on sensors in response to queries. 2. Summarization process: extracts identifying information, e.g., the time period during which events were detected, the range of event values, etc. 3. Local sensor data archive: stores detailed sensor information, e.g., images and events.

33 Example - Camera Sensing and storage: the sensor node (e.g., a Cyclops camera) archives the image locally, summarizes it, and transmits the summary with a record handle to its proxy, e.g., Birds(t1, t2) = 1.

34 Example - Indexing The summary and its location information, e.g., Birds(t1, t2) = 1, are stored and indexed at a proxy in the network of proxies.

35 Example - Querying and Retrieval A query such as "Birds in interval (t1, t2)?" is sent to any proxy.

36 Example - Querying and Retrieval The index is used to locate the sensors holding matching records.

37 Example - Querying and Retrieval The record is retrieved from sensor storage and returned to the application.

38 Outline ◊Introduction and Motivation ◊Architecture ◊Example ◊Design  Skip Graph  Interval Search  Interval and Sparse Interval Skip Graph ◊Experimental Results ◊Related Work ◊Conclusion and Future Directions

39 Goals of Index Structure The index should:  support range queries over time or value  be fully distributed among proxies  support interval keys indicating a range in time or value

40 What is a Skip Graph? A distributed extension of Skip Lists (Pugh '90), with a single key and associated pointers per element (Aspnes & Shah 2003; Harvey et al. 2003). Properties:  Probabilistically balanced - no global rebalancing needed  Ordered by key - provides efficient range queries  Fully distributed - data is indexed in place  log(N) search and insert  No single root - load balancing, robustness

41 Interval Search Given intervals [low, high] and a query value x:  1 - order the intervals by low  2 - find the first interval with high ≥ x  3 - scan until low > x, reporting the intervals that contain x ◊Example: query x = 4 against a set of intervals over [0, 10] (a code sketch follows slide 44)

44 Simple Interval Skip Graph Derived from the interval tree (Cormen et al. 1990). Method: index two increasing values, low and the running max of the high endpoints, and search on either as needed. Example intervals: [0,3], [0,1], [1,5], [2,4], [5,8], [6,10], [8,9], [9,12], each annotated with its running max.  Interval keys: YES  log(N) search: YES  log(N) update: NO (worst case O(N))
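A centralized Python sketch of the "two increasing sequences" idea (illustrative only; TSAR distributes these orderings across proxies in a skip graph):

```python
import bisect

def build_index(intervals):
    """Sort intervals by low endpoint and precompute the running max of
    the high endpoints -- the two nondecreasing sequences the simple
    interval skip graph indexes."""
    ivals = sorted(intervals)               # ordered by low
    maxes, cur = [], float("-inf")
    for _, hi in ivals:
        cur = max(cur, hi)
        maxes.append(cur)
    return ivals, maxes

def stab(ivals, maxes, x):
    """Find all intervals containing x: jump to the first position whose
    running max reaches x, then scan while low <= x."""
    start = bisect.bisect_left(maxes, x)    # first running max >= x
    hits = []
    for lo, hi in ivals[start:]:
        if lo > x:
            break
        if hi >= x:
            hits.append((lo, hi))
    return hits

# e.g. stab(*build_index([(0, 3), (1, 5), (2, 4), (5, 8)]), 4)
#   -> [(1, 5), (2, 4)]
```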

45 Adaptive Summarization How accurately should the summary information represent the original data? Detailed summaries → more summaries and a more precise index; a more precise index → fewer wasted queries.

46 Adaptive Summarization Conversely: approximate summaries → fewer summaries and a less precise index; a less precise index → more wasted queries.

47 Adaptive Summarization Goal: balance update and query cost. Approach: adaptation.  γ = summarization factor (summaries / data)  r = EWMA(wasted queries / data)  Target range: r_0  Decrease γ if r > r_0  Increase γ if r < r_0 (a sketch follows below)
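A hedged sketch of one adaptation round (the symbol γ and the constants r0, eps, alpha, and step below are illustrative assumptions, not values from the paper):

```python
def adapt_gamma(gamma, r_prev, wasted_queries, data_items,
                r0=0.05, eps=0.01, alpha=0.2, step=0.9):
    """One round of adaptive summarization (cf. slide 47): r is an
    exponentially weighted moving average (EWMA) of the wasted-query
    rate, and gamma is nudged to keep r near the target r0."""
    sample = wasted_queries / max(data_items, 1)
    r = alpha * sample + (1 - alpha) * r_prev     # EWMA update
    if r > r0 + eps:
        gamma *= step      # too many wasted queries: decrease gamma
    elif r < r0 - eps:
        gamma /= step      # few wasted queries: increase gamma
    return gamma, r
```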

48 Prototype and Experiments ◊Software: Em* (proxy), TinyOS (sensor) ◊Hardware: Stargate (proxy), Mica2 mote (sensor) ◊Network: 802.11 ad hoc multihop between proxies; B-MAC at an 11% duty cycle on the sensors ◊Data: James Reserve [CENS] dataset - temperature readings every 30 s for 34 days ◊For the physical experiments, the data stream was stored on the sensor node and replayed

49 Index Performance How does index performance scale with the number of proxies and the size of the dataset? ◊Tested in: Em* emulation ◊Tasks: insert, query ◊Variables: number of proxies (1-48), size of dataset ◊Metric: proxy-to-proxy messages

50 Index Results ◊The sparse skip graph provides a >2x decrease in message traffic for small numbers of proxies ◊The sparse skip graph shows virtually flat message cost for larger index sizes

51 Query Performance What is the query performance on real hardware and real data? ◊Tested on: 4 Stargate proxies and 12 Mica2 sensors in a 3-level multi-hop tree configuration ◊Task: query ◊Variable: size of dataset ◊Metric: query latency (ms)

52 Query Results ◊Sensor link latency dominates; proxy link delay is negligible ◊The sensor communication consists of only a query and a response - the minimum needed to retrieve the data ◊This validates the approach of using proxy resources to minimize the number of expensive sensor operations

53 Adaptive Summarization Results How well does the adaptation mechanism track changes in conditions? ◊Tested in: Em*, EmTOSSIM emulation ◊Task: data and queries ◊Variable: query/data ratio (0.03, 0.1, 0.2) ◊Metric: summarization factor γ ◊As the query rate is varied, the summary rate adapts: the algorithm tracks data and query dynamics

54 Related Work ◊In-network Storage:  DCS (Ratnasamy 2002)  Dimensions (Ganesan 2003)  … ◊In-network Indexing:  GHT (Ratnasamy 2002)  DIFS (Greenstein 2003)  DIM (Li 2003)  … ◊Hierarchical Sensor Systems:  Tenet (CENS, USC) ◊Sensor Flash File Systems:  ELF (Dai 2004)  Matchbox (Hill et al. 2000)

55 Summary ◊Proposed novel Interval Skip Graph-based index structure and adaptive summarization mechanism for multi-tier sensor archival storage. ◊Implemented these ideas in the TSAR system. ◊Demonstrated index scalability, query performance, and adaptation of summarization factor, both in emulation and running on real hardware.

