Presentation is loading. Please wait.

Presentation is loading. Please wait.

Data Centric Storage using GHT Lecture 13 October 14, 2004 EENG 460a / CPSC 436 / ENAS 960 Networked Embedded Systems & Sensor Networks Andreas Savvides.

Similar presentations


Presentation on theme: "Data Centric Storage using GHT Lecture 13 October 14, 2004 EENG 460a / CPSC 436 / ENAS 960 Networked Embedded Systems & Sensor Networks Andreas Savvides."— Presentation transcript:

1 Data Centric Storage using GHT Lecture 13 October 14, 2004 EENG 460a / CPSC 436 / ENAS 960 Networked Embedded Systems & Sensor Networks Andreas Savvides andreas.savvides@yale.edu Office: AKW 212 Tel 432-1275 Course Website http://www.eng.yale.edu/enalab/courses/een g460a

2 Data centric Storage In Sensornets with GHT Data centric Storage In Sensornets with GHT S. Ratnasamy, B. Karp, S. Shenker, D. Estrin, R. Govindan, L. Yin and F. Yu MONET Special Issue on Sensor Networks, August 2003

3 Overview  Data Centric Storage –Data is stored inside the network – each name corresponds to a location in space –All data with the same name will be stored at the same sensor network location –E.g an elephant sighting  Why Data centric Storage? –Energy efficiency –Robustness against mobility and node failures –Scalability

4 Keywords and Terminology  Observation ♦ low-level readings from sensors ♦ low-level readings from sensors ♦ e.g. Detailed temperature readings ♦ e.g. Detailed temperature readings  Events ♦ Predefined constellations of low-level observations ♦ Predefined constellations of low-level observations ♦ e.g. temperature greater than 75 F ♦ e.g. temperature greater than 75 F  Queries ♦ Used to elicit information from sensor network ♦ Used to elicit information from sensor network

5 Performance Metric:Total Usage /Hotspot Usage  Use communication as a cost function for energy consumption  Total Usage –Total number of packets sent in the Sensor network  Hotspot Usage –The maximal number of packets send by a particular sensor node  Costs used in the evaluation –Message flooding cost O(n) –Point-to-point routing cost –n is the number of nodes

6 Alternative Storage Schemes  External Storage (ES) –Events propagated and stored at an external location  Local Storage (LS) –Events stored locally at the detecting node –Queries are flooded to all nodes and the events are sent back  Data Centric Storage (DCS) –Data for an event stored within the sensor network –Queries are directed to the node that stores the data

7 External Storage (ES) External storage event

8 Local Storage (LS) event Queries flooded at all the nodes

9 Why do we need DCS?  Scalability  Robustness against Node failures and Node mobility  To achieve Energy-efficiency

10 Design Criterial: Scalability & Robustness  Node failures  Topology changes  System scale to large number of nodes  Energy Constraints  Persistence –(k,v) pair must remain available to queries, despite sensor node failures and changes in sensor network topology  Consistency –A query k must be routed correctly to a node where (k,v) pairs are stored – if these node change, then they should do this consisently  Scaling in Database Size  Topological generality – system should scale well on a large number of topologies

11 Assumptions in DCS  Large Scale networks whose approximate geographic boundaries are known  Nodes have short range communication and are within the radio range of several other nodes  Nodes know their own locations by GPS or some localization scheme  Communication to the outside world takes place by one or more access points

12 Data Centric Storage  Relevant Data are stored by “name” at nodes within the Sensor network  All data with the same general name will be stored at the same sensor-net node. e.g. (“elephant sightings”) e.g. (“elephant sightings”)  Queries for data with a particular name are then sent directly to the node storing those named data

13 Data centric Storage Elephant Sighting source:lass.cs.umass.edu

14 Geographic Hash Table  Events are named with keys and both the storage and the retrieval are performed using keys  GHT provides (key, value) based associative memory

15 Geographic Hash Table Operations  GHT supports two operations ♦ Put(k,v)-stores v (observed data) according to the key k ♦ Put(k,v)-stores v (observed data) according to the key k ♦ Get(k)-retrieve whatever value is associated with key k ♦ Get(k)-retrieve whatever value is associated with key k  Hash function ♦ Hash the key in to the geographic coordinates ♦ Hash the key in to the geographic coordinates ♦ Put() and Get() operations on the same key “k” hash k to the same location ♦ Put() and Get() operations on the same key “k” hash k to the same location

16 Storing Data in GHT Put (“elephant”, data) (12,24) Hash (‘elephant’)=(12,24) source:lass.cs.umass.edu

17 Retrieving data in GHT Get (“elephant”) Hash (‘elephant’)=(12,24) (12,24)

18 Geographic Hash Table Node A Node B

19 Algorithms Used By GHT  Geographic hash Table uses GPSR for Routing( Greedy Perimeter Stateless Routing )  PEER-TO-PEER look up system ( data object is associated with key and each node in the system is responsible for storing a certain range of keys ) ( data object is associated with key and each node in the system is responsible for storing a certain range of keys )

20 Algorithm (Contd) GPSR- Packets are marked with position of destinations and each node is aware of its position GPSR- Packets are marked with position of destinations and each node is aware of its position  Greedy forwarding algorithm  Perimeter forwarding algorithm A B A B

21 GPSR: Right-Hand Rule In Perimeter Forwarding x y z 1 2 3

22 Home Node and Home perimeter  Home node: Node geographically nearest to the destination coordinates of the packet –Serves as the rendezvous point for Get() and Put() operations on the same key  In GHT packet is not addressed to specific node but only to a specific location –Use GPSR to find the home node –only perimeter mode of GPSR to find Home Perimeter  Home Perimeter – perimeter that encloses the destination –Start from the home node, and use perimeter mode to make a cycle and return to the home node

23 Problems  Robustness could be affected » Nodes could move (i.d. of Home node?) » Node failure can Occur » Deployment of new Nodes  Not Scalable » Storage capacity of the home nodes » Bottleneck at Home nodes

24 Solutions to the problems  Perimeter refresh protocol – mostly addresses the robustness issue  Structured Replication – address the scalability issue – how to handle storage of many events

25 Perimeter refresh protocol   Replicates stored data for key k at nodes around the location to which k hashes – –Stores a copy of the key value pair at each node on the home perimeter – –Each node on the perimeter is called a replica node   How do you ensure consistency & persistence – –A node becomes the home node if a packet for a particular key arrives at that node   The perimeter refresh protocols periodically sends out refresh packets – –After a time period T h generate a refresh packet that contains the data for that key – –Packet forwarded on the home perimeter in the same way as Get() and Put() – –The refresh packet will take a tour of the home perimeter regardless the changes in the network topology since the key’s insertion – –This property maintains the perimeter

26 Perimeter Refresh Protocol  How do you guard against node failures –When a replica node receives a packet it did not originate, it caches the data in the refresh and sets up a takeover timer T t –Timer is reset each time a refresh from another node arrives –If the timer expires the replica node initiates a refresh packet addressed to the key’s hashed location  Note: That particular node does not determine a new home node. The GHT routing causes the refresh to reach a node home node

27 Perimeter Refresh Protocol E F B D A C L home Replica Assume key k hashes at location L A is closest to L so it becomes the home node

28 Perimeter Refresh Protocol E D F C B L Replica home Replica Suppose the node A dies

29 Time Specifications  Refresh time (T h )  Take over time (T t )  Death time (T d ) General rule T d >T h and T t >T h T d >T h and T t >T h In GHT Td=3Th and Tt=2Th

30 Characteristics Of Refresh Packet  Refresh packet is addressed to the hashed location of the key  Every (T h ) secs the home node will generate refresh packet  Refresh packet contains the data stored for the key and routed exactly as get() and put() operations  Refresh packet always travels along the home perimeter

31 Structured Replication  Too many events are detected then home node will become the hotspot of communication.  Structured replication is used to address the scaling problem  Hierarchical decomposition of the key space –Event names have a certain hierarchy depth

32 Structured Replication

33  A node that detects a new event, stores that event to its closest mirror – this is easily computable  This reduces the storage cost, but increases the query cost  GHT has to route the queries to all mirror nodes –Queries are routes recursively  First route query to the root, then to the first level and then to the second level mirrors  Structured replication becomes more useful for frequently detected events

34 Evaluation  Simulation to test if the protocol is functioning correctly  Done in the ns-2 network simulator using an IEEE 802.11 mac –This is a well known event driven simulator for ad-hoc networks  Larger scale simulations for the comparative study where done with a custom simulator

35 Comparative Study  Simulation compares the following schemes –External Storage (ES) –Local Storage (LS) –Normal DCS – A query returns a separate message for each detected event –Summarized DCS(S-DCS): A query returns a single message regardless of the number of detected events –Structured Replication DCS (SR_DCS) – Assuming an optimal level of SR  Comparison based on Cost  Comparison based on Total usage and Hot spot usage

36 Assumptions in comparison  Asymptotic costs of O(n) for floods and O( n) for point to point routing  Event locations are distributed randomly  Event locations are not known in advance  No more than one query for each event type (Q –Queries in total) (Q –Queries in total)  Assume access points to be the most heavily used area of the sensor network

37 Comparison based on Hot-spot/Total Usage  n - Number of nodes  T - Number of Event types  Q – Number Of Event types queried for  D total – Total number of detected events  D Q - Number of detected events for queries

38 DCS TYPES  Normal DCS – Query returns a separate message for each detected event  Summarized DCS – Query returns a single message regardless of the number of detected events (usually summary is preferred) (usually summary is preferred)

39 Comparison Study – contd.. ESLSDCS Total Hotspot

40 Observations from the Comparison DCS is preferable only in cases where  Sensor network is Large  There are many detected events and not all event types queried D total >>max(D q, Q) D total >>max(D q, Q)

41 Simulations  To check the Robustness of GHT  To compare the Storage methods in terms of total and hot spot usage

42 Simulation Setup  ns-2  Node Density – 1node/256m 2  Radio Range – 40 m  Number of Nodes -50,100,150,200  Mobility Rate -0,0.1,1m/s  Query generation Rate -2qps  Event types – 20  Events detected -10/type  Refresh interval -10 s

43 Performance metrics  Availability of data stored to Queriers (In terms of success rate) (In terms of success rate)  Loads placed on the nodes participating in GHT (hotspot usage)

44 Simulation Results for Robustness  GHT offers perfect availability of stored events in static case  It offers high availability when nodes are subjected to mobility and failures

45 Simulation Results under varying Q Number of nodes is constant= 10000

46 Simulation results under varying N Number of Queries Q =50

47 Simulation Results for comparison of 3-storage methods  S-DCS have low hot-spot usage under varying “Q”  S-DCS is has the lowest hot-spot usage under varying “n”

48 Conclusion  Data centric storage entails naming of data and storing data at nodes within the sensor network  GHT- hashes the key (events) in to geographical co-ordinates and stores a key-value pair at the sensor node geographically nearest to the hash  GHT uses Perimeter Refresh Protocol and structured replication to enhance robustness and scalability  DCS is useful in large sensor networks and there are many detected events but not all event types are Queried

49 REFERENCES  Deepak Ganesan, Deborah Estrin, John Heidemann, Dimensions: why do we need a new data handling architecture for sensor networks?, ACM SIGCOMM Computer Communication Review, Volume 33 Issue 1, January 2003 Scott Shenker, Sylvia Ratnasamy, Brad Karp, Ramesh Govindan, Deborah Estrin, Data-centric storage in sensornets, ACM SIGCOMM Computer Communication Review, Volume 33 Issue 1, January 2003 Dimensions: why do we need a new data handling architecture for sensor networks? Data-centric storage in sensornetsDimensions: why do we need a new data handling architecture for sensor networks? Data-centric storage in sensornets  Sylvia Ratnasamy, Brad Karp, Scott Shenker, Deborah Estrin, Ramesh Govindan, Li Yin, Fang Yu, Data-centric storage in sensornets with GHT, a geographic hash table, Mobile Networks and Applications, Volume 8 Issue 4, August 2003 Data-centric storage in sensornets with GHT, a geographic hash tableData-centric storage in sensornets with GHT, a geographic hash table  Chalermek Intanagonwiwat, Ramesh Govindan, Deborah Estrin, John Heidemann, Fabio Silva, Directed diffusion for wireless sensor networking, IEEE/ACM Transactions on Networking (TON), Volume 11 Issue, February 2003 Directed diffusion for wireless sensor networkingDirected diffusion for wireless sensor networking  R. Govindan, J. M. Hellerstein, W. Hong, S. Madden, M. Franklin, S. Shenker, The Sensor Network as a Database, USC Technical Report No. 02-771, September 2002 The Sensor Network as a DatabaseThe Sensor Network as a Database


Download ppt "Data Centric Storage using GHT Lecture 13 October 14, 2004 EENG 460a / CPSC 436 / ENAS 960 Networked Embedded Systems & Sensor Networks Andreas Savvides."

Similar presentations


Ads by Google