Published by Elvin Dixon. Modified over 9 years ago.
1
Supporting Aggregate Queries Over Ad-Hoc Wireless Sensor Networks
2
Abstract: This paper proposes a generic, query-based scheme for extracting data from sensor networks.
3
Main Aim of the Paper
The main idea behind this paper is to show how a generic query interface for data aggregation can be applied to ad-hoc networks of sensor devices.
The authors emphasize that this technique allows arbitrary data in a sensor network to be queried without building a custom application.
The paper also studies generic aggregation techniques.
4
Introduction
Advances in computing technology have led to the production of wireless, battery-powered smart sensors.
As large sensor networks are deployed, a need arises for tools to collect and query data from these networks.
5
Aggregation
Aggregation is an important issue from the standpoint of network performance and longevity.
It drastically reduces the amount of data routed through the network.
This increases throughput and extends the life of battery-powered sensor networks.
It also allows the computation to be optimized and lets programmers issue declarative, SQL-style queries.
6
Background
The authors discuss the relevant design aspects of the following:
Motes
TinyOS
Ad-hoc Sensor Networks
Aggregation in Database Systems
The paper then summarizes aggregation in database systems and discusses how these techniques provide a useful and well-defined framework for computing aggregates in sensor networks.
7
Motes
Configuration: 4 MHz Atmel microprocessor; RAM: 512 bytes; code space: 8 kB; 917 MHz RFM radio running at 10 kb/s; EEPROM: 32 kB.
Sensor options: light, temperature, magnetic field, acceleration, sound and power.
Tradeoff: the power consumption of each sensor node is dominated by the cost of transmitting and receiving messages.
Message delivery is unreliable by default.
8
A TinyOS Sensor Mote
9
TinyOS
TinyOS makes it possible to deploy ad-hoc networks of sensors that can locate each other and route data without any prior knowledge of the network topology.
It helps in writing programs that capture and process sensor data and transmit messages over the radio.
10
Ad-hoc Sensor Networks
Each sensor has a unique id. Sensors route data by building a routing tree.
One sensor is appointed as the root, which interfaces the querying user to the rest of the network.
Constant topology maintenance makes it easy to adapt to network changes caused by the mobility of certain nodes or by the addition or deletion of sensors.
11
The root broadcasts a message asking sensors to organize into a routing tree.
The message contains the root's id and its level, i.e. its distance from the root, which is zero.
A sensor that hears the message chooses the sender as its parent, through which it will route messages to the root.
This approach routes data efficiently towards the root.
It does not address point-to-point routing.
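The tree construction described above can be sketched as a flood from the root. This is only an illustration under stated assumptions: the slide does not say how levels are assigned beyond the root, so the sketch assumes each sensor adopts the first sender it hears as its parent and sets its level to the sender's level plus one. The names `build_routing_tree` and `links` are hypothetical.

```python
from collections import deque

def build_routing_tree(links, root):
    """Return {node: (parent, level)} for every node reachable from root.

    links maps each node to the set of nodes within radio range.
    Assumption: a node adopts the first sender it hears as its parent.
    """
    tree = {root: (None, 0)}          # the root has no parent and level 0
    queue = deque([root])
    while queue:
        sender = queue.popleft()
        level = tree[sender][1]
        # Every neighbour that has not joined the tree yet hears the broadcast.
        for neighbour in sorted(links.get(sender, ())):
            if neighbour not in tree:
                tree[neighbour] = (sender, level + 1)
                queue.append(neighbour)   # it rebroadcasts in turn
    return tree

# Hypothetical five-node network.
links = {
    "root": {"a", "b"},
    "a": {"root", "b", "c"},
    "b": {"root", "a", "d"},
    "c": {"a"},
    "d": {"b"},
}
tree = build_routing_tree(links, "root")
```

Each entry of `tree` gives the parent a node would forward to and its distance from the root, which is all a node needs to route data upward.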
12
Aggregation in Database Systems
In SQL-based database systems, aggregation is defined by an aggregate function and a grouping predicate.
Aggregate function: specifies how a set of values should be combined to compute an aggregate, e.g. COUNT, MIN, MAX, AVERAGE and SUM.
Eg: SELECT AVERAGE(temp) FROM sensors
13
Most database systems allow user-defined functions (UDFs) that specify more complex aggregates.
Grouping predicate: partitions values into groups based on certain attributes.
Eg: SELECT TRUNC(temp/10), AVERAGE(light)
FROM sensors
GROUP BY TRUNC(temp/10)
HAVING AVERAGE(light) > 50
The above query partitions the sensor readings into groups by their temperature reading and computes the average light within each group; the HAVING clause keeps only the groups whose average light exceeds 50.
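To make the semantics of the grouping query concrete, here is a minimal centralized sketch in Python. It is not how the paper evaluates the query inside the network; the function name and the sample readings are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
import math

def average_light_by_temp_band(readings, threshold=50):
    """Group (temp, light) readings by TRUNC(temp/10) and average the light.

    Only groups whose average light exceeds threshold are returned,
    mirroring the HAVING clause.
    """
    sums = defaultdict(lambda: [0.0, 0])      # band -> [light total, count]
    for temp, light in readings:
        band = math.trunc(temp / 10)
        sums[band][0] += light
        sums[band][1] += 1
    return {
        band: total / count
        for band, (total, count) in sums.items()
        if total / count > threshold
    }

# Hypothetical readings as (temp, light) pairs.
readings = [(12, 40), (18, 80), (25, 30), (27, 50), (31, 90)]
result = average_light_by_temp_band(readings)
```

With these sample readings, the band for temperatures 20 to 29 has an average light of 40 and is dropped by the threshold, while the other two bands are kept.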
14
Generic Aggregation Techniques
One way to implement sensor network aggregation is a centralized, server-based approach.
However, the focus is on the distributed, in-network approach, since it has the potential for both lower latency and lower power consumption than the server-based approach.
It is assumed throughout that the user is stationed at a desktop PC with a large memory.
15
Advantages of in-network approach
Consider computing an aggregate over a group of sensors arranged as in the following figure.
Dotted lines represent connections between sensors.
Solid lines represent the routing tree imposed on top of this graph to allow sensors to propagate data to the root along a single path.
17
In fig (a), each sensor is labeled with its distance from the root, which is the number of messages required to get its data to the host PC.
Summing these numbers gives the 16 messages required to route all aggregation information to the root.
In fig (b), sensors with no children simply transmit their readings to their parents.
Only one message is sent along each edge, because aggregation is performed by the sensors themselves.
Intermediate nodes combine their own readings with the readings of their children via the aggregation function f and propagate the partial aggregate, along with any additional data required to update the aggregate, up the tree.
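The two message counts can be compared with a short sketch. The tree below is a hypothetical example, not the one in the paper's figure, and the function names are illustrative. In the centralized case every reading travels its full depth to the root; in the in-network case each non-root node sends exactly one message to its parent.

```python
def depth(node, parent):
    """Number of hops from node up to the root."""
    hops = 0
    while parent[node] is not None:
        node = parent[node]
        hops += 1
    return hops

def centralized_messages(parent):
    """Every reading is forwarded hop by hop to the root."""
    return sum(depth(node, parent) for node in parent)

def in_network_messages(parent):
    """Each non-root node sends one partial aggregate to its parent."""
    return sum(1 for p in parent.values() if p is not None)

# Hypothetical routing tree given as child -> parent.
parent = {"root": None, "a": "root", "b": "root", "c": "a", "d": "a", "e": "c"}
central = centralized_messages(parent)
local = in_network_messages(parent)
```

The gap between the two counts grows with the depth of the tree, since deep nodes are charged once per hop in the centralized scheme.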
18
The amount of data transmitted in this solution depends on the aggregate.
The focus is on a class of aggregates that is particularly well suited to the in-network regime. Such aggregates can be expressed as an aggregate function f over the sets a and b such that
f(a ∪ b) = g(f(a), f(b))
for some combining function g.
For now, aggregate queries are assumed not to specify groups.
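As an illustration of the property f(a ∪ b) = g(f(a), f(b)), AVERAGE can be made to fit by carrying a (sum, count) pair as the partial aggregate. This pair is an example of the "additional data required to update the aggregate"; the representation and the helper `finish` are assumptions of this sketch rather than something the slides specify, and a and b are treated as disjoint collections of readings.

```python
def f(values):
    """Partial aggregate for AVERAGE: a (sum, count) pair."""
    return (sum(values), len(values))

def g(left, right):
    """Combine two partial aggregates computed over disjoint sets."""
    return (left[0] + right[0], left[1] + right[1])

def finish(state):
    """Turn the final (sum, count) pair into the average."""
    total, count = state
    return total / count

a = [20, 22]          # readings below one child
b = [30]              # readings below another child
combined = g(f(a), f(b))
average = finish(combined)
```

A plain average of the two child averages would be wrong here (21 and 30 average to 25.5, while the true average of the three readings is 24), which is why the count has to travel with the sum.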
19
Injecting a Query
Computing an aggregate consists of two phases.
Propagation phase: aggregate queries are pushed down into the sensor network.
Aggregation phase: aggregate values are propagated up from the children to their parents.
In the network discovery algorithm, leaf nodes must discover that they are leaves and propagate singular aggregates up to their parents.
20
Thus, when a sensor p receives an aggregate a, either from another sensor or from the user, it transmits a and begins listening.
If p has any children, it will hear them retransmit a to their own children and will know that it is not a leaf.
If, after some time t, p has heard no children, it concludes that it is a leaf and transmits its current sensor value up the routing tree.
If p has children, they are assumed to report within time t, and p then computes the value of a applied to its own value and the values reported by its children.
21
Choosing too short a duration for t leads to missed reports from children.
When a query is injected, the time interval is therefore set long enough for messages to propagate down to the leaves and back up the routing tree:
T = 2 * (d_p - d_tree) * (t_xmit + t_process)
where t_xmit is the time to send a message and t_process is the time to process an aggregation request.
This approach is undesirable because it leads to long computation times.
The major limitation of the tree-based routing approach is that it is not suitable for peer-to-peer routing.
22
Streaming Aggregates
Sensor networks are inherently unreliable.
Individual radio transmissions can fail.
Nodes can move.
All this makes it difficult to guarantee that no portion of the network was detached during a particular aggregate computation.
Eg: if p broadcasts a and its only child c misses the message for some reason, p will never hear c rebroadcast, so the entire network below p is excluded from the aggregate computation and the end result is probably incorrect.
23
This problem can be addressed by double-checking aggregates, i.e. by requesting that the aggregate be computed several times at the root of the network.
The drawback of this technique is that it requires retransmitting the aggregate request down the network multiple times, at a significant message overhead.
Instead, a pipelined aggregate scheme is proposed, in which time is divided into intervals of duration i.
24
Properties of Pipelined Aggregate
After aggregates have propagated up from the leaves, a new aggregate arrives every i seconds.
If t is the total time for an aggregation request to propagate down to the leaves and back to the root, the user nevertheless begins to see approximations of the aggregate after the first interval has elapsed.
The reported aggregate changes as the sensor readings and the underlying network change.
The drawback of this approach is that a number of additional messages are transmitted to extract the first aggregate over all sensors.
This scheme improves both the robustness of aggregates and throughput.
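The behaviour described above, early approximations that settle on the full aggregate, can be reproduced with a small simulation. This is a sketch under several assumptions that the slides do not spell out: every node has already heard the query, the aggregate is SUM, readings stay constant, no messages are lost, and in each interval a node combines its own reading with whatever its children reported in the previous interval. The name `pipelined_sum` is hypothetical.

```python
def pipelined_sum(parent, readings, intervals):
    """Return the value seen at the root after each interval."""
    children = {node: [] for node in parent}
    root = None
    for node, par in parent.items():
        if par is None:
            root = node
        else:
            children[par].append(node)

    reported = {node: 0 for node in parent}   # nothing heard before interval 1
    history = []
    for _ in range(intervals):
        # Each node combines its reading with last interval's child reports.
        current = {
            node: readings[node] + sum(reported[c] for c in children[node])
            for node in parent
        }
        history.append(current[root])
        reported = current
    return history

# Hypothetical chain root <- a <- b.
parent = {"root": None, "a": "root", "b": "a"}
readings = {"root": 1, "a": 2, "b": 4}
history = pipelined_sum(parent, readings, intervals=4)
```

In this chain the root first reports only its own reading, then includes one more level per interval until the full sum is reached, after which a fresh complete value is available every interval.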
25
Pipelined Computation of Aggregates
26
Shared Channel
The previous algorithms have ignored the fact that sensors communicate over a shared radio channel.
Exploiting the shared channel increases message efficiency.
It increases the number of sensors participating in any aggregate.
It reduces the number of messages sent, by snooping.
The inherently broadcast nature of radio offers communication redundancy, which improves reliability.
27
Multiple parents issue
Consider the case where sensor s sends a count c to a single parent: the expected value of the transmitted count is p * c and its variance is c^2 * p * (1 - p), where p here denotes the probability that the transmission succeeds.
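The single-parent figures can be checked by enumerating the two possible outcomes. The sketch assumes p is the probability that the message is delivered, so the parent receives c with probability p and 0 otherwise; the values of c and p are hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def mean_and_variance(outcomes):
    """Exact mean and variance of a list of (value, probability) pairs."""
    mean = sum(prob * value for value, prob in outcomes)
    variance = sum(prob * (value - mean) ** 2 for value, prob in outcomes)
    return mean, variance

def single_parent(c, p):
    """The parent receives c with probability p, and nothing otherwise."""
    return mean_and_variance([(c, p), (0, 1 - p)])

c = 10                    # hypothetical count held by sensor s
p = Fraction(9, 10)       # hypothetical delivery probability
mean, variance = single_parent(c, p)
```

Both results agree with the closed forms p * c and c^2 * p * (1 - p) quoted on the slide.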
28
Hypothesis Testing
The main drawback of the shared channel approach is that it requires input from every node in the network to compute an aggregate.
Hypothesis testing: when computing a MAX or MIN, a sensor can snoop on the values its peers report and omit its own value if it is aware that it cannot affect the final value of the aggregate.
29
In this approach, when computing a MIN, leaf nodes need not send a message if their value is greater than the minimum observed over the top k levels.
If we assume sensor values to be independent and randomly distributed, then a particular node must transmit with probability 1/2^k, which is low even for small values of k.
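The suppression rule for MIN can be sketched as follows. The function names are hypothetical. `should_transmit` encodes the rule that a node stays silent when an overheard value is already at least as small as its own, and `transmit_probability` simply evaluates the 1/2^k figure quoted above rather than deriving it.

```python
def should_transmit(own_value, overheard):
    """For a MIN aggregate: send only if own_value could lower the minimum.

    overheard is the list of values the node has snooped so far.
    """
    return not overheard or own_value < min(overheard)

def transmit_probability(k):
    """Transmission probability quoted in the text for the top k levels."""
    return 1 / 2 ** k

silent = should_transmit(5, [3, 8])      # 3 is already smaller: stay silent
speaks = should_transmit(2, [3, 8])      # 2 would become the new minimum
```

A node whose value ties the overheard minimum also stays silent under this rule, since reporting it would not change the result.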
30
Summary of Techniques
Pipelining aggregates increases throughput and smooths over the intermittent losses inherent in radio communication.
Snooping on the radio reduces message load and improves the accuracy of aggregates.
Hypothesis testing inverts problems and further reduces the number of messages sent.
31
Grouping
Grouping computes aggregates over partitions of sensor readings. The basic technique is to push down a set of predicates that specify group membership, ask sensors to choose the group they belong to and then, as answers flow back, update the aggregate values in the appropriate groups.
Each group predicate specifies a group id, a sensor attribute and a range of sensor values.
Groups are assumed to be disjoint and defined over the same attribute, which need not be the attribute being aggregated.
32
When a sensor that is not a leaf receives a message from a child, it checks the group number.
If the child is in the same group as the sensor, it combines the two values. Otherwise, it stores the value of the child's group along with its own value for forwarding in the next interval.
A predicate over group aggregates (such as a HAVING clause) is only sent into the network if it can potentially be used to reduce the number of messages that must be sent.
Eg: for the predicate MAX(attr) > x, information about groups with MAX(attr) <= x need not be transmitted up the tree.
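The per-node grouping step can be sketched as below. The names are hypothetical, the predicates are assumed to be half-open ranges [low, high) over a single attribute, and MAX is used as the example aggregate. A node picks its own group, merges child partials that fall in a group it already holds and stores the others for forwarding.

```python
def assign_group(value, predicates):
    """Return the id of the group whose range contains value, if any."""
    for group_id, (low, high) in predicates.items():
        if low <= value < high:
            return group_id
    return None

def merge_groups(own, child_messages, combine):
    """Merge child partial aggregates into this node's per-group table."""
    merged = dict(own)
    for message in child_messages:
        for group_id, partial in message.items():
            if group_id in merged:
                # Same group: combine the two partial aggregates.
                merged[group_id] = combine(merged[group_id], partial)
            else:
                # Different group: keep it separately for forwarding.
                merged[group_id] = partial
    return merged

# Hypothetical groups over temperature; the aggregate is MAX(light).
predicates = {1: (0, 20), 2: (20, 40)}
own_group = assign_group(15, predicates)        # this node reads temp = 15
own = {own_group: 70}                           # and light = 70
child_messages = [{1: 90}, {2: 30}]
table = merge_groups(own, child_messages, max)
```

The resulting table is what the node would send to its parent: one partial aggregate per group it has seen.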
33
Since the number of groups can exceed the available storage on any one sensor, a way to evict groups is needed.
Evicting partially computed groups is known as partial pre-aggregation. Evicting groups with low membership is likely a good policy, as they are least likely to be combined with other sensor readings.
Evicting groups forces information about the current time interval into higher-level nodes in the tree.
34
In the above diagram of the standard pipelined scheme, aggregates are computed over values from the previous time interval, which presents an inconsistency.
35
Related Work
The Cougar project at Cornell discusses queries over sensor networks, but it only considers moving selection operators onto sensors.
Groups at USC/ISI and UCLA have contributed work on networking within the sensor network community.
The solution builds on a number of papers published by the TinyOS group at UC Berkeley describing the design of motes, TinyOS and the implementation of the networking protocols used to construct ad-hoc sensor networks.
36
Future Work
Researchers at UC Berkeley are currently working with the sensor testbed built by the TinyOS group to empirically verify the algorithms presented in this paper.
There is a need for experimental and mathematical validation of many of the techniques presented in this paper.
37
Challenges
The authors have not explored the tradeoffs between fully pipelined communication and techniques such as sending values only when sensor readings change.
It is not clear how this approach will behave when sensors move. The routing tree construction algorithm allows moving nodes to reattach, but it is unclear how movements and disconnections affect the value of aggregates.
The problem of computing multiple simultaneous aggregates over a single sensor network has yet to be explored.
38
Conclusion
By applying generic aggregation operations, this approach offers the ability to query arbitrary data in a sensor network.
It is possible to compute aggregates robustly while providing rapid and continuous updates of their value to the user by pipelining the flow of data through the sensor network.
By snooping on messages in the shared channel and applying techniques for hypothesis testing, it is possible to improve performance.