CountTorrent: Ubiquitous Access to Query Aggregates in Dynamic and Mobile Sensor Networks
Abhinav Kamra, Vishal Misra and Dan Rubenstein, Columbia University (ACM SenSys 2007)
Presenter: Justin
A few definitions
- Distributive queries (e.g. MIN, MAX, COUNT, SUM)
  - Form: f(p ∪ q) = f(f(p), f(q)), where p and q are disjoint sets of nodes
  - e.g. COUNT: f(p ∪ q) = |p| + |q|
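The distributive property can be checked with a small sketch. The readings in `p` and `q` and the helper names `aggregate` and `combine_partials` are hypothetical, chosen only for illustration. Each query is modeled as a pair of functions: one lifts a single reading to a partial aggregate, the other merges two partial aggregates.

```python
from functools import reduce

# Each aggregate is described by (lift, combine): lift maps one reading to a
# partial aggregate, combine merges two partial aggregates.
AGGREGATES = {
    "COUNT": (lambda reading: 1, lambda a, b: a + b),
    "SUM": (lambda reading: reading, lambda a, b: a + b),
    "MIN": (lambda reading: reading, min),
    "MAX": (lambda reading: reading, max),
}

def aggregate(name, readings):
    """Compute the aggregate over a non-empty list of readings."""
    lift, combine = AGGREGATES[name]
    return reduce(combine, (lift(r) for r in readings))

def combine_partials(name, left, right):
    """Merge two partial aggregates computed over disjoint node sets."""
    _, combine = AGGREGATES[name]
    return combine(left, right)

p = [4, 7, 1]  # hypothetical readings from node set p
q = [9, 2]     # hypothetical readings from the disjoint node set q

for name in AGGREGATES:
    direct = aggregate(name, p + q)
    merged = combine_partials(name, aggregate(name, p), aggregate(name, q))
    print(name, direct, merged)  # the two values agree for every query
```

For COUNT the merged value is simply |p| + |q|, which matches the example on the slide.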
A few definitions (cont.)
- Two types of queries:
  - Duplicate-sensitive: e.g. SUM, COUNT
  - Duplicate-insensitive: e.g. MIN, MAX
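A minimal sketch of why the distinction matters, assuming the same partial aggregate reaches a node twice (for instance over two different paths). The values and function names are hypothetical.

```python
def merge_sum(a, b):
    """Merge step for a duplicate-sensitive query (SUM)."""
    return a + b

def merge_max(a, b):
    """Merge step for a duplicate-insensitive query (MAX)."""
    return max(a, b)

# Hypothetical partial aggregates computed over one set of nodes.
partial_sum = 12
partial_max = 7

# The same partial is merged with itself, as if it had arrived twice.
sum_twice = merge_sum(partial_sum, partial_sum)
max_twice = merge_max(partial_max, partial_max)

print(sum_twice == partial_sum)  # False: the duplicate inflates SUM
print(max_twice == partial_max)  # True: the duplicate leaves MAX unchanged
```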
Traditional data aggregation
- Goal: combine data values while routing to the sink
- Two schemes
  - Tree-based
  - Multi-path
[Figure: example topologies, tree-based (spanning tree) and multi-path (DAG).]
Traditional data aggregation (cont.)
- Tree-based
  - Error-prone in dynamic networks
  - Not accurate in failure-prone settings
- Multi-path
  - Bandwidth overkill in stable networks
  - Has to avoid duplicate and redundant data
  - Still loses accuracy in high-mobility and high-loss scenarios
CountTorrent: An adaptive approach
- Adapts to network conditions:
  - Stable networks: accurate tree-based aggregation
  - Dynamic networks: multi-path aggregation, accuracy degrades gracefully
- Completely distributed: local decisions
- Can compute duplicate-sensitive and duplicate-insensitive query aggregates
CountTorrent: A conceptual overview
- Divide and conquer strategy
  - Arrange information in a hierarchy using a (prefix-free) binary labeling
  - Combine disjoint information
  - Adapt the labeling as the network changes
CountTorrent: Label assignment
- Each node is assigned a unique (binary) label by its parent
- When a new node joins:
  - It chooses one of its neighbors as parent
  - The parent splits its label S into two separate labels, S0 and S1
  - The child is given label S1
[Figure: node h4 joins a network of nodes h1, h2 and h3 and chooses h1 as its parent.]
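The splitting rule can be sketched as follows. The slide states only that the child receives S1; that the parent keeps S0 is an assumption here, as are the `Node` class, the join order, and the choice of h1 as every node's parent.

```python
class Node:
    def __init__(self, name, label=""):
        self.name = name
        self.label = label  # "" stands for the empty label

    def adopt(self, child):
        """Split this node's label S: keep S0 (assumed), give S1 to the child."""
        s = self.label
        self.label = s + "0"
        child.label = s + "1"
        return child

def is_prefix_free(labels):
    """True if no label is a prefix of a different label."""
    return not any(
        a != b and b.startswith(a) for a in labels for b in labels
    )

# Hypothetical join order: h1 starts alone, then h2, h3 and h4 each pick h1.
h1 = Node("h1")
h2 = h1.adopt(Node("h2"))
h3 = h1.adopt(Node("h3"))
h4 = h1.adopt(Node("h4"))

labels = [n.label for n in (h1, h2, h3, h4)]
print(labels)
print(is_prefix_free(labels))
```

Because every split replaces one label S by the pair S0 and S1, the label set stays prefix-free and still covers the whole label space.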
CountTorrent: Data combining
- After a label is assigned to each node
  - All labels can be merged to form ε (the empty label)
[Figure: label hierarchy merging up to ε, shown with the value 21.]
CountTorrent: Data combining
- Aggregating with tuples
  - Definition: tuple = (binary label, aggregate value) pair
  - If two labels differ only in the last bit, merge the tuples
- Neighbors exchange tuples randomly
- Merge any tuples if possible
[Figure: tuple sets of Node A and Node B before and after an exchange; for example, (11, 5) and (10, 3) merge into (1, 8).]
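The merge rule can be sketched as a small function. It assumes a COUNT or SUM query, so values are combined by addition, and it assumes the input labels are prefix-free; the function name `merge_tuples` is hypothetical.

```python
def merge_tuples(tuples):
    """Repeatedly merge tuples whose labels differ only in the last bit.

    Values are combined by addition (COUNT/SUM assumed).
    Returns a dict mapping label -> aggregate value.
    """
    table = dict(tuples)
    merged = True
    while merged:
        merged = False
        # Longest labels first, so merges can cascade toward the root.
        for label in sorted(table, key=len, reverse=True):
            if not label or label not in table:
                continue  # empty label has no sibling; or already merged away
            sibling = label[:-1] + ("1" if label[-1] == "0" else "0")
            if sibling in table:
                table[label[:-1]] = table.pop(label) + table.pop(sibling)
                merged = True
    return table

# Tuples taken from the slide: (10, 3) and (11, 5) merge into (1, 8).
partial = merge_tuples([("10", 3), ("11", 5), ("001", 2), ("011", 1)])
print(partial)

# A complete prefix-free label set collapses to the empty label.
complete = merge_tuples([("000", 1), ("001", 1), ("01", 1), ("1", 1)])
print(complete)
```

In the first call, (001, 2) and (011, 1) stay separate because their labels differ in the middle bit, not the last one. In the second call the single remaining tuple carries the empty label and the full aggregate.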
CountTorrent: Variants
- Static CountTorrent
  - Exchange tuples with neighbors
- Dynamic CountTorrent
  - An additional tuple cache contains the tuples associated with the node's children and with its own label
[Figure: nodes a, b, c and d with their tuple sets, in two phases labeled "Send up" and "Update down". Node b leaves; node a resets the value of "1100" to 0; node d requires a new label from a.]
CountTorrent: Fine-tuning
- Random exchange is not efficient: convergence is slow
- Optimizations:
  - Intelligent Selection
    - Carefully choose the data to send to neighbors
    - Minimize redundant and duplicate tuple exchanges
  - Preferred Diffusion
    - Carefully choose the neighbor to send data to
    - Fast convergence in stable networks
CountTorrent: Intelligent Selection
- Node A sending to neighbor B
  - Remembers what was sent to B
  - Remembers what was received from B
  - Only sends tuples that are useful for B
[Figure: tuple exchange between Node A and Node B. Node A won't send (11, 5) again. Node A won't send (0010, 2) to Node B, since "001" is a prefix of "0010".]
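One way to read "useful for B" is sketched below: a tuple is skipped if B is already known to hold a tuple with the same label or with a label that is a prefix of it. The `PeerState` class and this exact usefulness test are assumptions made for illustration; the slide gives only the two examples.

```python
class PeerState:
    """What one node remembers about a single neighbor."""

    def __init__(self):
        self.sent = set()      # tuples already sent to the neighbor
        self.received = set()  # tuples received from the neighbor

    def known_labels(self):
        """Labels the neighbor is known to hold."""
        return {label for label, _ in self.sent | self.received}

    def select(self, held):
        """Return the held tuples not covered by any label the neighbor has."""
        known = self.known_labels()
        return [
            (label, value)
            for label, value in held
            if not any(label.startswith(k) for k in known)
        ]

# Node A's memory of neighbor B, using tuples from the slide.
a_state = PeerState()
a_state.sent.add(("11", 5))                    # already sent to B
a_state.received.update({("10", 3), ("001", 2)})  # received from B

held_by_a = [("11", 5), ("01", 3), ("0010", 2)]
to_send = a_state.select(held_by_a)
print(to_send)
```

Here (11, 5) is dropped because it was already sent, and (0010, 2) is dropped because B holds the label "001", which is a prefix of "0010".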
CountTorrent: Preferred Diffusion
- Preferential forwarding:
  - If any tuple is useful for the parent, send to the parent
  - Else, if any tuple is useful for a child, send to that child
  - Else, send to another neighbor
- Stable networks: mimics tree-based aggregation
- Dynamic networks: mix of tree-based and multi-path
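The forwarding priority can be sketched as a single selection function. The name `choose_target` and the `has_useful` callback are hypothetical. How "another neighbor" is picked is not stated on the slide; this sketch assumes the first other neighbor that has something useful to receive, and returns `None` if there is none.

```python
def choose_target(parent, children, others, has_useful):
    """Pick the neighbor to send to: parent first, then children, then others.

    has_useful(neighbor) returns True if at least one held tuple is useful
    for that neighbor.
    """
    if parent is not None and has_useful(parent):
        return parent
    for child in children:
        if has_useful(child):
            return child
    # Assumption: fall back to the first other neighbor with useful tuples.
    for neighbor in others:
        if has_useful(neighbor):
            return neighbor
    return None

children = ["c1", "c2"]
others = ["n1", "n2"]

# Hypothetical scenarios: the set lists neighbors with useful tuples pending.
print(choose_target("p", children, others, {"p", "c1"}.__contains__))
print(choose_target("p", children, others, {"c2", "n1"}.__contains__))
print(choose_target("p", children, others, {"n2"}.__contains__))
```

When the topology is stable, the parent branch fires almost every time, which is why the behavior resembles tree-based aggregation.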
Simulations & Experiments
- Simulations:
  - Comparison with other aggregation methods
  - Effect of node joins and failures
  - Aggregation in a mobile network
- Experiments on TOSSIM and motes:
  - CountTorrent implementation on Crossbow MICAz motes
CountTorrent accuracy
Bandwidth usage
Adapting to node joins & failures
COUNT aggregate in a mobile network
CountTorrent on TOSSIM
CountTorrent on MICAz motes
Conclusions
- Robust: accurate even in lossy networks
- Adaptive: data communication adapts to changing topology
- Handles mobility: close to accurate aggregates
- Bandwidth-efficient: adapts to the stability of the network to maintain accuracy
- Ubiquitous: all nodes get the aggregate by design