Sep Multiple Query Optimization for Wireless Sensor Networks Shili Xiang Hock Beng Lim Kian-Lee Tan (ICDE 2007) Presented by Shan Bai
2 Highlight Introduction Background Challenge Goal of this paper Multiple Query Optimization Base station optimization In-network optimization Discussion References
3 Introduction Background: Wireless sensor networks (WSNs) are deployed in many important applications to query the physical world (environmental monitoring, healthcare monitoring, military surveillance, tracking of goods and manufacturing processes, traffic monitoring, etc.). The sensor network needs to support the efficient processing of multiple queries.
4 Introduction User requirements: Users can issue declarative queries without having to worry about how the data are generated, processed, and transferred within the network, or how sensor nodes are (re)programmed to satisfy changing user interests (user transparency). The WSN should be able to handle several user requests concurrently by running multiple queries. Challenge: Users request various data from the network at the same time, and both the users and their interests can change over time.
5 Introduction Current situation: Several sensor data query processing systems, such as Cougar [16, 4] and TinyDB [9], have been developed by the database research community. However, most existing work on sensor data query processing has focused on the optimization and execution of a single long-running query. As a result, these systems cannot amortize the data acquisition, computation, and communication cost of fetching the common data for multiple queries. Running the queries independently can also lead to bandwidth contention and even data loss as a result of transmission collisions (which may in turn require retransmission).
6 Introduction Goal of this paper: To design a lightweight but effective scheme that supports multiple data acquisition and aggregation queries in a wireless sensor network while minimizing the number of radio transmissions. Similar queries should share the limited communication and computational resources.
8 Multiple Query Optimization Base station optimization: Use the base station as a filter to reduce duplicate data accesses from the sensor network, and as a screen to hide the query dynamics as much as possible.
9 Multiple Query Optimization Base station optimization
Given a set of queries Q submitted to the base station, rewrite them into a new query set Q’. Ideally, the data requested by the queries in Q’ is just enough to answer the queries in Q, and data needed by several queries in Q is acquired only once by the queries in Q’.
COST MODEL
VOL: cost of one query, i.e. the number of its result dissemination messages per unit of time
C: the whole data space
d: the average depth of nodes in the network
sel(p): the selectivity of predicate p
Because of multi-hop routing, the cost of a data acquisition query qi with sampling period si can be estimated from these quantities (the formula itself is missing from this text).
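As a rough sketch of how the quantities defined above could combine, assume (this form is an assumption, not taken from the slide) that the cost is the number of matching tuples per sample, sel(p) times |C|, multiplied by the average hop count d and divided by the sampling period. The function and parameter names below are hypothetical.

```python
def estimated_cost(selectivity, data_space_size, avg_depth, sampling_period):
    """Assumed cost form: result messages per unit time for one acquisition query.

    selectivity      -- sel(p), fraction of the data space matching the predicate
    data_space_size  -- |C|, size of the whole data space
    avg_depth        -- d, average depth (hop count) of nodes in the network
    sampling_period  -- s_i, time between two samples of the query
    """
    if sampling_period <= 0:
        raise ValueError("sampling_period must be positive")
    # Tuples produced per sample, each forwarded over avg_depth hops on average.
    tuples_per_sample = selectivity * data_space_size
    return tuples_per_sample * avg_depth / sampling_period


# Hypothetical numbers: half of a 100-tuple space, 3 hops, one sample every 4 time units.
cost_q1 = estimated_cost(0.5, 100, 3, 4)
```

Under this assumed form, halving the sampling period doubles the estimated cost, which is why the choice of sampling period matters when queries are rewritten together.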
10 Multiple Query Optimization Base station optimization: Define a metric, Benefit, to quantify the cost savings achieved by query rewriting. It is beneficial to rewrite q1 and q2 into q’ if and only if Benefit12 > 0. We have the following theorem, the proof of which is omitted due to space constraints. Theorem 1. Benefit12 > 0 only if GCD(s1, s2) == s1 or GCD(s1, s2) == s2. (GCD: greatest common divisor of the sampling periods s1 and s2.)
11 Multiple Query Optimization Base station optimization: For a newly submitted query qi, the algorithm identifies the most beneficial synthetic query qj to rewrite with qi.
12 Multiple Query Optimization Base station optimization: It is possible that multiple synthetic queries can benefit from the newly integrated synthetic query.
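Theorem 1 gives a necessary condition only, so it can serve as a cheap pre-filter before the full Benefit computation. A minimal sketch, with a hypothetical function name:

```python
from math import gcd


def may_be_beneficial(s1, s2):
    """Necessary condition from Theorem 1: one sampling period divides the other.

    Returning True does NOT mean Benefit12 > 0; it only means the pair
    cannot be ruled out by the theorem.
    """
    g = gcd(s1, s2)
    return g == s1 or g == s2


# 2048 divides 4096, so this pair passes the filter.
print(may_be_beneficial(2048, 4096))
# GCD(4096, 6144) is 2048, which equals neither period, so rewriting cannot pay off.
print(may_be_beneficial(4096, 6144))
```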
13 Multiple Query Optimization Base station optimization: Iterative query insertion algorithm. The main idea: whenever a synthetic query is updated, it is checked against the synthetic query list to see whether rewriting it with another synthetic query is beneficial. If so, the most beneficial pair is rewritten, and the newly updated synthetic query is checked against the list again. The process terminates when no further beneficial rewriting exists. To achieve this, after Integrate(qid, qi) in line 16 of Algorithm 1 has updated the synthetic query qid into a new one, call Insert(qid, Qsyn). The iterative query insertion algorithm is expected to remove more redundancy among the data requested by user queries.
14 Multiple Query Optimization Base station optimization: To make the multi-query optimization scheme perform well for dynamic workloads, where user queries are inserted at different frequencies and run for various durations, a parameter α is introduced to adjust the query termination algorithm according to the properties of the application workload.
15 Multiple Query Optimization Base station optimization Summary: When the system holds several queries with substantial similarities among them, query insertion and termination can most likely be handled at the base station without affecting the sensor network. This reduces the number of radio messages and improves scalability in the number of concurrent queries.
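The iteration described above can be sketched as follows. This is not the authors' Algorithm 1: the query model (a one-dimensional value range plus a sampling period), the cost function, and the names `Query`, `cost`, `integrate`, `benefit`, and `insert` are all assumptions chosen to keep the example small.

```python
from dataclasses import dataclass
from math import gcd


@dataclass(frozen=True)
class Query:
    lo: int      # lower bound of the requested value range (hypothetical 1-D predicate)
    hi: int      # upper bound of the requested value range
    period: int  # sampling period


def cost(q):
    # Toy cost (assumption): size of the requested range per unit of time.
    return (q.hi - q.lo) / q.period


def integrate(a, b):
    # Smallest query that covers both ranges and samples at the GCD of the periods.
    return Query(min(a.lo, b.lo), max(a.hi, b.hi), gcd(a.period, b.period))


def benefit(a, b):
    # Cost saved by running one integrated query instead of a and b separately.
    return cost(a) + cost(b) - cost(integrate(a, b))


def insert(q, synthetic):
    """Insert q into the synthetic query list, rewriting while it pays off."""
    while True:
        best = max(synthetic, key=lambda s: benefit(q, s), default=None)
        if best is None or benefit(q, best) <= 0:
            synthetic.append(q)  # no beneficial rewriting is left
            return synthetic
        synthetic.remove(best)
        q = integrate(q, best)   # re-check the updated synthetic query


synthetic = []
insert(Query(0, 10, 4), synthetic)
insert(Query(20, 30, 4), synthetic)  # merging would cover the unused gap 10..20
insert(Query(0, 30, 2), synthetic)   # absorbs both earlier queries in two rounds
```

Each rewriting removes one entry from the list, so the loop always terminates. In the usage above, the third query first integrates with one stored query and the result is then checked again, which is the re-checking step the slide describes.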
17 Multiple Query Optimization In-network optimization: Base station optimization cannot share the commonality among queries at the finest granularity. It also cannot take advantage of special properties of sensor nodes, such as the broadcast nature of radio transmission. With in-network optimization, sensor nodes make local decisions themselves and adaptively handle the query workload over time.
18 Multiple Query Optimization In-network optimization: Sharing over time. Sharing becomes more progressive over time by scheduling the data acquisition and transmission of all queries as a whole. At the end of a query’s propagation phase, setSampleRate is triggered, which may start (or restart) the node’s clock to fire at the GCD of the “epoch durations” of all the queries. The epoch start time on sensor nodes is set to a multiple of the epoch duration rather than to the arrival time of a new query (here we assume that every epoch duration is a multiple of 2048 ms).
19 Multiple Query Optimization In-network optimization: Sharing over space. Each sensor node dynamically selects a route (parent) that is aware of the query space; meanwhile, it tries to take advantage of the broadcast nature of the radio channel to satisfy multiple queries in one message.
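The timing rule above can be illustrated with a short sketch. It is not the implementation of setSampleRate; the function names and the example epoch durations are hypothetical.

```python
from functools import reduce
from math import gcd


def clock_period(epochs_ms):
    """Period at which the node clock fires: GCD of all epoch durations."""
    return reduce(gcd, epochs_ms)


def due_queries(t_ms, epochs_ms):
    """Queries whose epoch starts at time t_ms (epoch starts are multiples of the duration)."""
    return [qid for qid, epoch in epochs_ms.items() if t_ms % epoch == 0]


def next_epoch_start(now_ms, epoch_ms):
    """First multiple of epoch_ms at or after now_ms (ceiling division)."""
    return -(-now_ms // epoch_ms) * epoch_ms


# Hypothetical workload: two queries with epochs of 4096 ms and 6144 ms.
epochs = {"q1": 4096, "q2": 6144}
tick = clock_period(epochs.values())
shared = due_queries(12288, epochs)
start = next_epoch_start(5000, 4096)
```

Because epoch starts are aligned to multiples of the epoch duration instead of query arrival times, the two queries above coincide every 12288 ms, and one sample can then serve both.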
20 Multiple Query Optimization In-network optimization: Sharing over space. [Figure: routing tree organized as a DAG (directed acyclic graph) rooted at the base station, with sensor nodes A through H.]
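The slides do not state the rule a node uses to pick its parent. Purely as an illustration of query-aware routing, the sketch below assumes one possible heuristic: among the candidate parents in the DAG, prefer the neighbor that already forwards data for the most queries the node itself must answer. The function name, the data layout, and the tie-breaking rule are all hypothetical.

```python
def choose_parent(my_queries, candidate_parents):
    """Pick the neighbor that shares the most queries with this node.

    my_queries        -- set of query ids this node must answer
    candidate_parents -- dict: neighbor id -> set of query ids it already forwards
    Ties are broken by the smallest neighbor id; returns None without candidates.
    """
    if not candidate_parents:
        return None

    def shared(neighbor):
        return len(my_queries & candidate_parents[neighbor])

    # sorted() fixes the iteration order, and max() keeps the first maximum.
    return max(sorted(candidate_parents), key=shared)


# Hypothetical neighborhood of a node that answers q1 and q2.
parent = choose_parent(
    {"q1", "q2"},
    {"B": {"q1"}, "C": {"q1", "q2"}, "D": set()},
)
```

Routing through a parent that is already active for the same queries lets one broadcast message be overheard and reused for several queries, which is the effect the slide attributes to sharing over space.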
22 Discussion Pros: The two optimization tiers are similar and complementary to each other. Both can eliminate the duplicate transmission of the same data for several data acquisition queries.
23 Discussion Cons: The base station optimization is somewhat more constrained by granularity, while the in-network optimization results in a larger result message size. In in-network optimization, aggregation queries can only share with other aggregation queries if semantic correctness is to be guaranteed. The authors should provide more detailed evidence of the performance improvement over traditional single-query optimization, for example concrete algorithms and simulations that combine base station and in-network optimization as a whole. Robust transmission is also needed for data that is requested by many user queries.
24 References
[1] A. Demers, J. Gehrke, R. Rajaraman, N. Trigoni, and Y. Yao. The Cougar project: A work-in-progress report. SIGMOD Record, 32(4).
[2] S. Madden, M. J. Franklin, J. M. Hellerstein, and W. Hong. TinyDB: An acquisitional query processing system for sensor networks. ACM TODS, 30(1), November.
[3] S. Xiang, H. B. Lim, and K. L. Tan. Impact of multi-query optimization in sensor networks. In Proc. of DMSN.
[4] A. Demers, J. Gehrke, R. Rajaraman, N. Trigoni, and Y. Yao. The Cougar project: A work-in-progress
[9] S. Madden, M. J. Franklin, J. M. Hellerstein, and W. Hong. TinyDB: An acquisitional query processing system for sensor networks. ACM TODS, 30(1), November.
[16] Y. Yao and J. Gehrke. Query processing for sensor networks. In Proc. of CIDR, 2003.
25 Questions/Comments?