Published by Ashlynn Floyd. Modified over 9 years ago.
1 Report of Advanced Database Topics Project. Instructor: Dr. Rahgozar. Euhanna Ghadimi, Ali Abbasi, Kave Pashaii. Data Storage Selection in Sensor Networks
2 Outline 1. Introduction Definition, Applications, Differences, Storage 2. Queries 2.1. Querying in Cougar 2.2. Querying in TinyDB 2.3. In-network Aggregation 3. Other Issues
3 Introduction From a data storage point of view, a sensor network database is “a distributed database that collects physical measurements about the environment, indexes them, and serves queries from users and other applications external to or from within the network”. Research in sensor network databases is relatively new and can benefit from current efforts in data streams and P2P networks.
4 The long-term goal Embed numerous distributed devices to monitor and interact with the physical world: in workspaces, hospitals, homes, vehicles, and “the environment” (water, soil, air…). Network these devices so that they can coordinate to perform higher-level tasks. Requires robust distributed systems of tens of thousands of devices. [Figure labels: Disaster Response; Circulatory Net]
5 Sensor Net Sample Apps Traditional monitoring apparatus. Earthquake monitoring in shake-test sites. Vehicle detection: sensors along a road collect data about passing vehicles. Habitat monitoring: storm petrels on Great Duck Island, microclimates on James Reserve.
6 Overview of research Sensor network challenges. One approach: Directed diffusion. Basic algorithm. Initial simulation results (Intanagonwiwat). Other interesting localized algorithms in progress: Aggregation (Kumar); Adaptive fidelity (Xu); Address-free architecture, Time synch (Elson); Localization (Bulusu, Girod); Self-configuration using robotic nodes (Bulusu, Cerpa); Instrumentation and debugging (Jerry Zhao)
7 The Challenge is Dynamics! The physical world is dynamic Dynamic operating conditions Dynamic availability of resources … particularly energy! Dynamic tasks Devices must adapt automatically to the environment Too many devices for manual configuration Environmental conditions are unpredictable Unattended and un-tethered operation is key to many applications
8 Approach Energy is the bottleneck resource And communication is a major consumer--avoid communication over long distances Pre-configuration and global knowledge are not applicable Achieve desired global behavior through localized interactions Empirically adapt to observed environment Leverage points Small-form-factor nodes, densely distributed to achieve Physical locality to sensed phenomena Application-specific, data-centric networks Data processing/aggregation inside the network
9 Directed Diffusion Concepts Application-aware communication primitives expressed in terms of named data (not in terms of the nodes generating or requesting data) Consumer of data initiates interest in data with certain attributes Nodes diffuse the interest towards producers via a sequence of local interactions This process sets up gradients in the network which channel the delivery of data Reinforcement and negative reinforcement used to converge to efficient distribution Intermediate nodes opportunistically fuse interests, aggregate, correlate or cache data
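The interest and gradient mechanism described on this slide can be sketched roughly as follows. This is a deliberately simplified illustration, not the directed diffusion protocol itself: it assumes a hypothetical five-node topology, floods the interest breadth-first from the sink, and keeps only one gradient per node (the neighbor the interest was first heard from), which stands in for the reinforced path. The names `diffuse_interest` and `deliver` are made up for this sketch.

```python
from collections import deque

def diffuse_interest(neighbors, sink):
    """Flood an interest from the sink; each node remembers one gradient,
    i.e. the neighbor it first heard the interest from (a simplification)."""
    gradient = {sink: None}
    queue = deque([sink])
    while queue:
        node = queue.popleft()
        for nb in neighbors[node]:
            if nb not in gradient:
                gradient[nb] = node  # data from nb will flow toward node
                queue.append(nb)
    return gradient

def deliver(gradient, source):
    """Follow gradients hop by hop from the source back to the sink."""
    path = [source]
    while gradient[path[-1]] is not None:
        path.append(gradient[path[-1]])
    return path

# Hypothetical topology: two alternative relays (a, b) between sink and c.
neighbors = {
    "sink": ["a", "b"],
    "a": ["sink", "c"],
    "b": ["sink", "c"],
    "c": ["a", "b", "source"],
    "source": ["c"],
}
path = deliver(diffuse_interest(neighbors, "sink"), "source")

# Recovering from the failure of relay "a": drop it and diffuse again.
survivors = {n: [m for m in nbs if m != "a"]
             for n, nbs in neighbors.items() if n != "a"}
repaired = deliver(diffuse_interest(survivors, "sink"), "source")
```

Note that nothing here addresses a node by identity when requesting data; the sink only names what it wants, and the path emerges from local neighbor-to-neighbor exchanges.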
10 Illustrating Directed Diffusion [Figure: four sink/source panels: setting up gradients; sending data; recovering from node failure; reinforcing stable path]
11 Sensor Network Tomography: Key Ideas and Challenges Kinds of tomograms: network health, resource-level indicators, responses to external stimuli. Can exchange resource health during low-level housekeeping functions such as radio synchronization. Key challenge: energy efficiency. Need to aggregate local representations; algorithms must auto-scale; outlier indicators are different.
12 Self-configuring networks using and supporting robotic nodes (Bulusu, Cerpa, Estrin, Heidemann, Mataric, Sukhatme) Robotics introduces self-mobile nodes and adaptively placed nodes. Self-configuring ad hoc networks in the context of an unpredictable RF environment. Place nodes for network augmentation or formation. Place beacons for localization granularity.
13 Programming Sensor Nets Is Hard Months of lifetime required from small batteries: 3-5 days naively; can’t recharge often; interleave sleep with processing. Lossy, low-bandwidth, short-range communication: nodes coming and going; ~20% loss @ 5m; multi-hop. Remote, zero-administration deployments. Highly distributed environment. Limited development tools: embedded, LEDs for debugging! 200-800 instructions per bit transmitted! High-level abstraction is needed!
14 A Solution: Declarative Queries Users specify the data they want Simple, SQL-like queries Using predicates, not specific addresses Same spirit as Cougar – Our system: TinyDB Challenge is to provide: Expressive & easy-to-use interface High-level operators Well-defined interactions “Transparent Optimizations” that many programmers would miss Sensor-net specific techniques Power efficient execution framework Question: do sensor networks change query processing? Yes!
15 Overview TinyDB: Queries for Sensor Nets Processing Aggregate Queries (TAG) Taxonomy & Experiments Acquisitional Query Processing Other Research Future Directions
17 TinyDB Demo
18 TinyDB Architecture [Figure: TinyDB (Query Processor and Schema) layered over the Multihop Network and TinyOS]
Schema: “Catalog” of commands & attributes.
Queries: SELECT AVG(temp) WHERE light > 400. Operators shown: Filter light > 400; Agg avg(temp); sampling via get(‘temp’) / got(‘temp’), feeding Tables and Samples.
Results: T:1, AVG: 225; T:2, AVG: 250.
Schema entry example: Name: temp; Time to sample: 50 uS; Cost to sample: 90 uJ; Calibration Table: 3; Units: Deg. F; Error: ± 5 Deg F; Get f: getTempFunc().
TinyDB size: ~10,000 lines embedded C code; ~5,000 lines (PC-side) Java; ~3200 bytes RAM (w/ 768 byte heap); ~58 kB compiled code (3x larger than 2nd largest TinyOS program).
19 Declarative Queries for Sensor Networks Example query 1, “Find the sensors in bright nests.”:
SELECT nodeid, nestNo, light FROM sensors WHERE light > 400 EPOCH DURATION 1s
Result (Sensors):
Epoch | Nodeid | nestNo | Light
0 | 1 | 17 | 455
0 | 2 | 25 | 389
1 | 1 | 17 | 422
1 | 2 | 25 | 405
20 Aggregation Queries
Query 2: SELECT AVG(sound) FROM sensors EPOCH DURATION 10s
Query 3, “Count the number of occupied nests in each loud region of the island.”:
SELECT region, CNT(occupied), AVG(sound) FROM sensors GROUP BY region HAVING AVG(sound) > 200 EPOCH DURATION 10s
Result of query 3 (regions w/ AVG(sound) > 200):
Epoch | region | CNT(…) | AVG(…)
0 | North | 3 | 360
0 | South | 3 | 520
1 | North | 3 | 370
1 | South | 3 | 520
21 Overview TinyDB: Queries for Sensor Nets Processing Aggregate Queries (TAG) Taxonomy & Experiments Acquisitional Query Processing Other Research Future Directions
22 Tiny Aggregation (TAG) In-network processing of aggregates. Common data analysis operation, a.k.a. gather operation or reduction in parallel programming. Communication reducing; operator-dependent benefit. Across nodes during same epoch. Exploit query semantics to improve efficiency!
23 Query Propagation Via Tree-Based Routing Tree-based routing used in: query delivery, data collection. Topology selection is important; e.g. Krishnamachari, DEBS 2002; Intanagonwiwat, ICDCS 2002; Heidemann, SOSP 2001; LEACH/SPIN, Heinzelman et al. MOBICOM 99; SIGMOD 2003. Continuous process; mitigates failures. [Figure: query Q: SELECT … propagated over nodes A-F, with results R:{…} returned]
24 Basic Aggregation In each epoch: each node samples local sensors once; generates a partial state record (PSR) from local readings and readings from children; outputs the PSR during its assigned comm. interval. At end of epoch, the PSR for the whole network is output at the root. New result on each successive epoch. Extras: predicate-based partitioning via GROUP BY. [Figure: routing tree over nodes 1-5]
25 Illustration: Aggregation SELECT COUNT(*) FROM sensors [Figure: five-sensor routing tree beside a sensor # × interval # grid for one epoch; interval 4: a partial count of 1 is reported]
26 Illustration: Aggregation SELECT COUNT(*) FROM sensors [Figure: interval 3: a partial count of 2 is reported]
27 Illustration: Aggregation SELECT COUNT(*) FROM sensors [Figure: interval 2: partial counts of 1 and 3 are reported]
28 Illustration: Aggregation SELECT COUNT(*) FROM sensors [Figure: interval 1: the full count of 5 is reported]
29 Illustration: Aggregation SELECT COUNT(*) FROM sensors [Figure: interval 4 of the next epoch: a partial count of 1 is reported again]
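The COUNT(*) walk-through on these slides can be reproduced with a small sketch. The tree shape below is an assumption chosen so that five sensors span four levels; the function names are hypothetical and the code models only message counts, not radio timing. It also compares against a centralized scheme in which every raw reading is relayed hop by hop to the root.

```python
# Hypothetical 5-node routing tree (an assumption): node 1 is the root.
CHILDREN = {1: [2, 3], 2: [], 3: [4], 4: [5], 5: []}

def in_network_count(children, node, sent):
    """Return COUNT(*) for the subtree rooted at `node` and record the
    single partial state record (PSR) each node outputs in the epoch."""
    psr = 1  # the node counts itself
    for child in children[node]:
        psr += in_network_count(children, child, sent)
    sent[node] = psr
    return psr

def centralized_messages(children, root):
    """Radio messages needed if every raw reading travels hop by hop to the root."""
    total, stack = 0, [(root, 0)]
    while stack:
        node, hops = stack.pop()
        total += hops  # a reading at depth `hops` costs `hops` transmissions
        stack.extend((c, hops + 1) for c in children[node])
    return total

sent = {}
total = in_network_count(CHILDREN, 1, sent)
in_network_messages = len(sent) - 1  # every node except the root transmits once
```

With this topology each non-root node transmits exactly one PSR per epoch, whereas forwarding raw readings costs more transmissions the deeper the tree gets.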
30 Interval Assignment: An Approach SELECT COUNT(*)… 4 intervals / epoch; interval # = level. [Figure: tree over nodes 1-5 with levels marked, and a per-node timeline across the epoch's comm intervals 4, 3, 2, 1 marking each node as listening (L), transmitting (T), or sleeping (Z)] Pipelining: increase throughput by delaying result arrival until a later epoch. Madden, Szewczyk, Franklin, Culler. Supporting Aggregate Queries Over Ad-Hoc Wireless Sensor Networks. WMCSA 2002. CSMA for collision avoidance. Time intervals for power conservation. Many variations (e.g. Yao & Gehrke, CIDR 2003). Time sync (e.g. Elson & Estrin, OSDI 2002).
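A minimal sketch of the "interval # = level" rule, under the assumption that a node transmits in the interval equal to its own level, listens in the interval just before it (when its children transmit), and sleeps otherwise. The function name `schedule` and the single-letter states are illustrative only.

```python
def schedule(level, intervals_per_epoch=4):
    """Return one state per comm interval, in the order the intervals occur
    within an epoch (highest interval number first, counting down to 1).
    T = transmit, L = listen for children, Z = sleep."""
    states = []
    for interval in range(intervals_per_epoch, 0, -1):
        if interval == level:
            states.append("T")   # report own PSR to the parent
        elif interval == level + 1:
            states.append("L")   # children (one level deeper) transmit now
        else:
            states.append("Z")   # radio off to conserve power
    return "".join(states)

# Timeline for each level of a 4-level tree, root (level 1) first.
timelines = {level: schedule(level) for level in (1, 2, 3, 4)}
```

Under these assumptions every node is awake for at most two of the four intervals, which is where the power saving comes from.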
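31 Aggregation Framework As in extensible databases, we support any aggregation function conforming to: $Agg_n = \{f_{init}, f_{merge}, f_{evaluate}\}$, where $f_{init}\{a_0\} \to \langle a_0 \rangle$; $f_{merge}\{\langle a_1 \rangle, \langle a_2 \rangle\} \to \langle a_{12} \rangle$; $f_{evaluate}\{\langle a_1 \rangle\} \to$ aggregate value. Each $\langle \cdot \rangle$ is a Partial State Record (PSR). Example: Average. $AVG_{init}\{v\} \to \langle v, 1 \rangle$; $AVG_{merge}\{\langle S_1, C_1 \rangle, \langle S_2, C_2 \rangle\} \to \langle S_1 + S_2, C_1 + C_2 \rangle$; $AVG_{evaluate}\{\langle S, C \rangle\} \to S/C$. Restriction: merge must be associative and commutative.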
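The init / merge / evaluate structure for AVERAGE can be written out as a short sketch. It assumes the partial state record is a (sum, count) pair, consistent with the S/C evaluation on the slide; the Python function names are illustrative and are not TinyDB's API.

```python
from functools import reduce

def avg_init(v):
    """Turn one sensor reading into a partial state record (sum, count)."""
    return (v, 1)

def avg_merge(a, b):
    """Combine two partial state records; associative and commutative."""
    return (a[0] + b[0], a[1] + b[1])

def avg_evaluate(psr):
    """Produce the final aggregate value S / C from a partial state record."""
    s, c = psr
    return s / c

readings = [10, 20, 30, 40]
psrs = [avg_init(v) for v in readings]

# Two different merge orders, as two different routing trees might produce.
balanced = avg_merge(avg_merge(psrs[0], psrs[1]), avg_merge(psrs[2], psrs[3]))
chained = reduce(avg_merge, reversed(psrs))
```

Because merge is associative and commutative, the shape of the routing tree does not change the result, which is why the restriction on the slide matters.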
32 Types of Aggregates SQL supports MIN, MAX, SUM, COUNT, AVERAGE Any function over a set can be computed via TAG In network benefit for many operations E.g. Standard deviation, top/bottom N, spatial union/intersection, histograms, etc. Compactness of PSR
33 Overview TinyDB: Queries for Sensor Nets Processing Aggregate Queries (TAG) Taxonomy & Experiments Acquisitional Query Processing Other Research Future Directions
34 Simulation Environment Evaluated TAG via simulation Coarse grained event based simulator Sensors arranged on a grid Two communication models Lossless: All neighbors hear all messages Lossy: Messages lost with probability that increases with distance Communication (message counts) as performance metric
35 Taxonomy of Aggregates TAG insight: classify aggregates according to various functional properties Yields a general set of optimizations that can automatically be applied Properties Partial State Monotonicity Exemplary vs. Summary Duplicate Sensitivity Drives an API!
36 Partial State Growth of PSR vs. number of aggregated values (n): Distributive: |PSR| = 1 (e.g. MIN). Algebraic: |PSR| = c (e.g. AVG). Holistic: |PSR| = n (e.g. MEDIAN). Unique: |PSR| = d (e.g. COUNT DISTINCT), d = # of distinct values. Content sensitive: |PSR| < n (e.g. HISTOGRAM). (“Data Cube”, Gray et al.)
Property | Examples | Affects
Partial State | MEDIAN: unbounded, MAX: 1 record | Effectiveness of TAG
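To make the growth rates concrete, the sketch below returns the number of values a partial state record would need to carry for four of the aggregates named on the slide. It is an illustration only: the function `psr_size` is hypothetical, and it assumes AVG carries a (sum, count) pair, so its constant c is 2.

```python
def psr_size(aggregate, values):
    """Number of values in the partial state record after aggregating `values`."""
    if aggregate == "MIN":
        return 1                    # just the running minimum
    if aggregate == "AVG":
        return 2                    # (sum, count): a constant c
    if aggregate == "MEDIAN":
        return len(values)          # every value must be kept: n
    if aggregate == "COUNT DISTINCT":
        return len(set(values))     # one entry per distinct value: d
    raise ValueError(f"unknown aggregate: {aggregate}")

values = [3, 1, 3, 2]
sizes = {agg: psr_size(agg, values)
         for agg in ("MIN", "AVG", "MEDIAN", "COUNT DISTINCT")}
```

The smaller the record stays as n grows, the more communication in-network aggregation can save.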
37 Benefit of In-Network Processing Simulation results: 2500 nodes, 50x50 grid, depth = ~10, neighbors = ~20, uniform dist. Aggregate & depth dependent benefit! [Figure: results by aggregate class: Holistic, Unique, Distributive, Algebraic]
38 Monotonicity & Exemplary vs. Summary
Property | Examples | Affects
Partial State | MEDIAN: unbounded, MAX: 1 record | Effectiveness of TAG
Monotonicity | COUNT: monotonic, AVG: non-monotonic | Hypothesis Testing, Snooping
Exemplary vs. Summary | MAX: exemplary, COUNT: summary | Applicability of Sampling, Effect of Loss
39 Channel Sharing (“Snooping”) Insight: Shared channel can reduce communication Suppress messages that won’t affect aggregate E.g., MAX Applies to all exemplary, monotonic aggregates Only snoop in listen/transmit slots Future work: explore snooping/listening tradeoffs
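Snooping for MAX can be sketched as follows, assuming a single radio neighborhood in which every node overhears every earlier transmission in the epoch. The function name and the sample readings are made up for illustration.

```python
def snooping_max(readings):
    """`readings` lists node values in transmit order within one neighborhood.
    A node stays silent if it has already overheard a value at least as
    large as its own, since its reading cannot change MAX."""
    best_heard = None
    sent = []
    for value in readings:
        if best_heard is None or value > best_heard:
            sent.append(value)   # this value may still raise the maximum
            best_heard = value
        # otherwise: suppressed, no message transmitted
    return sent

readings = [40, 55, 30, 55, 70, 10]
sent = snooping_max(readings)
```

Here only half of the nodes transmit, yet the maximum of the transmitted values equals the maximum of all readings.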
40 Hypothesis Testing Insight: Guess from root can be used for suppression E.g. ‘MIN < 50’ Works for monotonic & exemplary aggregates Also summary, if imprecision allowed How is hypothesis computed? Blind or statistically informed guess Observation over network subset
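A rough sketch of hypothesis testing for MIN: the root broadcasts a guess such as 'MIN < 50' and only nodes whose readings satisfy it respond. The function name, the readings, and the handling of a failed guess are assumptions made for this illustration.

```python
def hypothesis_min(readings, guess):
    """Return (minimum, messages_sent) when only nodes with a reading
    below `guess` report. If nobody reports, the guess was too low and
    the root would have to retry with a weaker hypothesis."""
    reports = [v for v in readings if v < guess]
    if not reports:
        return None, 0
    return min(reports), len(reports)

readings = [72, 48, 65, 31, 90]
good_guess = hypothesis_min(readings, 50)   # two of five nodes report
bad_guess = hypothesis_min(readings, 20)    # nobody reports
```

A well-informed guess prunes most messages, while a guess that is too aggressive costs an extra round.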
41 Experiment: Snooping vs. Hypothesis Testing Setup: uniform value distribution, dense packing, ideal communication. [Figure annotations: Pruning in Network; Pruning at Leaves]
42 Duplicate Sensitivity
Property | Examples | Affects
Partial State | MEDIAN: unbounded, MAX: 1 record | Effectiveness of TAG
Monotonicity | COUNT: monotonic, AVG: non-monotonic | Hypothesis Testing, Snooping
Exemplary vs. Summary | MAX: exemplary, COUNT: summary | Applicability of Sampling, Effect of Loss
Duplicate Sensitivity | MIN: dup. insensitive, AVG: dup. sensitive | Routing Redundancy
43 Use Multiple Parents Use graph structure: increase delivery probability with no communication overhead. For duplicate insensitive aggregates, or aggs expressible as sum of parts: send (part of) aggregate to all parents, in just one message, via multicast. Assuming independence, decreases variance. Example: SELECT COUNT(*), with node A holding count $c$, parents B and C, and root R. With one parent: $P(\text{link xmit successful}) = p$, $P(\text{success from } A \to R) = p^2$, $E(cnt) = c \cdot p^2$, $Var(cnt) = c^2 \cdot p^2 \cdot (1 - p^2) \equiv V$. With # of parents $= n$ (the figure shows $n = 2$, each parent receiving $c/n$): $E(cnt) = n \cdot (c/n \cdot p^2)$, $Var(cnt) = n \cdot (c/n)^2 \cdot p^2 \cdot (1 - p^2) = V/n$.
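The expectation and variance claims on this slide can be checked numerically. The sketch below enumerates every delivery outcome exactly, assuming (as the slide does) that the n two-hop paths fail independently, each succeeding with probability p squared and carrying c/n of the count. The function name `count_stats` and the sample numbers are illustrative.

```python
from itertools import product

def count_stats(c, p, n):
    """Exact mean and variance of the count received at the root when a
    count `c` is split evenly over `n` independent two-hop paths."""
    q = p * p          # both links of a path must succeed
    share = c / n      # portion of the count sent via each parent
    mean = 0.0
    second_moment = 0.0
    for outcome in product([0, 1], repeat=n):   # 1 = path delivered
        prob = 1.0
        for ok in outcome:
            prob *= q if ok else (1.0 - q)
        received = share * sum(outcome)
        mean += prob * received
        second_moment += prob * received * received
    return mean, second_moment - mean * mean

one_parent = count_stats(c=10, p=0.9, n=1)
two_parents = count_stats(c=10, p=0.9, n=2)
```

The expected count is the same in both cases, while the variance with two parents is half of the single-parent variance, matching V/n.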
44 Multiple Parents Results Better than previous analysis expected! Losses aren’t independent! Insight: spreads data over many links. [Figure labels: Critical Link!; No Splitting; With Splitting]
45 Taxonomy Related Insights Communication Reducing In-network Aggregation (Partial State) Hypothesis Testing (Exemplary & Monotonic) Snooping (Exemplary & Monotonic) Sampling Quality Increasing Multiple Parents (Duplicate Insensitive) Child Cache
46 TAG Contributions Simple but powerful data collection language Vehicle tracking: SELECT ONEMAX(mag,nodeid) EPOCH DURATION 50ms Distributed algorithm for in-network aggregation Communication Reducing Power Aware Integration of sleeping, computation Predicate-based grouping Taxonomy driven API Enables transparent application of techniques to Improve quality (parent splitting) Reduce communication (snooping, hypo. testing)
47 Overview TinyDB: Queries for Sensor Nets Processing Aggregate Queries (TAG) Taxonomy & Experiments Acquisitional Query Processing Other Research Future Directions
48 Acquisitional Query Processing (ACQP) Closed world assumption does not hold: could generate an infinite number of samples. An acquisitional query processor controls when, where, and with what frequency data is collected, versus traditional systems where data is provided a priori. Madden, Franklin, Hellerstein, and Hong. The Design of an Acquisitional Query Processor for Sensor Networks. SIGMOD, 2003.
49 ACQP: What’s Different? How should the query be processed? Sampling as a first class operation Event – join duality How does the user control acquisition? Rates or lifetimes Event-based triggers Which nodes have relevant data? Index-like data structures Which samples should be transmitted? Prioritization, summary, and rate control
50 Operator Ordering: Interleave Sampling + Selection SELECT light, mag FROM sensors WHERE pred1(mag) AND pred2(light) EPOCH DURATION 1s. E(sampling mag) >> E(sampling light): 1500 uJ vs. 90 uJ. [Figure: three query plans over mag and light with selections (pred1) and (pred2): the traditional DBMS plan, and two ACQP plans labeled Costly and Cheap] Correct ordering (unless pred1 is very selective and pred2 is not): sample light and apply pred2 first, then sample mag and apply pred1 only if pred2 passes. At 1 sample / sec, total power savings could be as much as 3.5mW. Comparable to processor!
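The ordering argument can be made concrete with a small expected-energy calculation. It uses the per-sample costs from the slide (1500 uJ for mag, 90 uJ for light); the selectivities are made-up values, and the model ignores everything except sampling energy.

```python
E_MAG = 1500.0    # uJ per magnetometer sample (from the slide)
E_LIGHT = 90.0    # uJ per light sample (from the slide)

def expected_energy(first_cost, first_selectivity, second_cost):
    """Expected sampling energy per epoch when the second sensor is only
    sampled if the first predicate passes (probability first_selectivity)."""
    return first_cost + first_selectivity * second_cost

# Traditional plan: sample both sensors, then filter.
traditional = E_MAG + E_LIGHT

# Assumed selectivities of 0.5 for both predicates.
light_first = expected_energy(E_LIGHT, 0.5, E_MAG)
mag_first = expected_energy(E_MAG, 0.5, E_LIGHT)

# The exception from the slide: pred1 very selective, pred2 not.
light_first_exc = expected_energy(E_LIGHT, 0.99, E_MAG)
mag_first_exc = expected_energy(E_MAG, 0.01, E_LIGHT)
```

With equal selectivities, sampling the cheap sensor first is clearly better; only when pred1 almost always fails and pred2 almost always passes does sampling mag first come out ahead.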
51 Exemplary Aggregate Pushdown SELECT WINMAX(light,8s,8s) FROM sensors WHERE mag > x EPOCH DURATION 1s. Novel, general pushdown technique. Mag sampling is the most expensive operation! [Figure: the traditional DBMS plan samples mag and light, applies (mag>x), then WINMAX; the ACQP plan samples light, applies (light > MAX), then samples mag and applies (mag>x) before WINMAX]
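The pushdown idea for one WINMAX window can be sketched as below: the cheap light reading is compared with the running maximum first, and the expensive mag sample is only taken when the light reading could actually change the result. The function name, the callback `sample_mag`, and the sample data are assumptions for illustration.

```python
def winmax_pushdown(light_samples, sample_mag, x):
    """Compute MAX(light) over one window for rows where mag > x, sampling
    mag only when light exceeds the current maximum.
    Returns (max_light, number_of_mag_samples_taken)."""
    current_max = None
    mag_samples_taken = 0
    for t, light in enumerate(light_samples):
        if current_max is not None and light <= current_max:
            continue               # cannot raise MAX: skip the costly mag sample
        mag_samples_taken += 1
        if sample_mag(t) > x:
            current_max = light
    return current_max, mag_samples_taken

light = [5, 3, 9, 9, 2, 12, 1, 4]     # one reading per 1 s epoch in an 8 s window
mag = [10, 10, 1, 10, 10, 10, 10, 10]
result, taken = winmax_pushdown(light, lambda t: mag[t], x=5)

# Reference answer: sample everything, filter, then take the maximum.
reference = max(l for l, m in zip(light, mag) if m > 5)
```

In this example the pushdown version returns the same maximum while taking four mag samples instead of eight.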
52 Lifetime Queries Lifetime vs. sample rate SELECT … EPOCH DURATION 10 s SELECT … LIFETIME 30 days Extra: Allow a MAX SAMPLE PERIOD Discard some samples Sampling cheaper than transmitting
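One way to picture how a LIFETIME clause could be turned into a sample rate is a simple energy budget, sketched below. This is not TinyDB's actual estimator; the function name, the battery capacity, and the per-epoch energy are hypothetical numbers chosen for the example.

```python
def epoch_duration_for_lifetime(battery_joules, lifetime_days,
                                joules_per_epoch, idle_watts=0.0):
    """Return the epoch duration (seconds) that spreads the battery's
    energy evenly over the requested lifetime."""
    lifetime_seconds = lifetime_days * 86400
    budget_watts = battery_joules / lifetime_seconds - idle_watts
    if budget_watts <= 0:
        raise ValueError("requested lifetime cannot be met even without sampling")
    return joules_per_epoch / budget_watts

# Hypothetical node: 2592 J of usable energy, 0.01 J per sample-and-send epoch.
period = epoch_duration_for_lifetime(battery_joules=2592, lifetime_days=30,
                                     joules_per_epoch=0.01)
```

With these assumed numbers the node can afford roughly one epoch every 10 seconds; a MAX SAMPLE PERIOD would then cap how slow that rate is allowed to become.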
53 (Single Node) Lifetime Prediction
54 Overview TinyDB: Queries for Sensor Nets Processing Aggregate Queries (TAG) Taxonomy & Experiments Acquisitional Query Processing Other Research Future Directions
55 Sensor Network Challenge Problems Temporal aggregates Sophisticated, sensor network specific aggregates Isobar Finding Vehicle Tracking Lossy compression Wavelets Hellerstein, Hong, Madden, and Stanek. Beyond Average. IPSN 2003 “Isobar Finding”
56 TinyDB Deployments Initial efforts: Network monitoring Vehicle tracking Ongoing deployments: Environmental monitoring Generic Sensor Kit Building Monitoring Golden Gate Bridge
57 Data Storage Recently introduced. Larger capacity, larger battery power. Ordinary sensors send their data to it. It replies to queries. (Sheng et al., ACM MobiHoc 2006)
58 Problems Data storage placement (Sheng et al. paper). Data storage selection. Our method: an adaptive and decentralized method.
59 Costs in the system
60 Overall cost
61 Our method
62 Our method (Cont.)
63 Our results Very Good !!
64 References Book: Wireless Sensor Networks: An Information Processing Approach, by F. Zhao and L. Guibas, Elsevier, 2004. Papers:
[1] Bo Sheng, Qun Li, and Weizhen Mao. Data Storage Placement in Sensor Networks. ACM MobiHoc 2006, Florence, Italy, May 22-25, 2006.
[2] B. Bonfils, P. Bonnet. Adaptive and Decentralized Operator Placement for In-Network Query Processing. Springer Verlag, 2003.
65 References (Cont.)
[3] S. Bhattacharya, H. Kim, S. Prabh, and T. Abdelzaher. Energy-conserving data placement and asynchronous multicast in wireless sensor networks. In Proceedings of the 1st international conference on Mobile systems, applications and services, pages 173–185, New York, NY, USA, 2003. ACM Press.
[4] H. S. Kim, T. F. Abdelzaher, and W. H. Kwon. Minimum-energy asynchronous dissemination to mobile sinks in wireless sensor networks. In Proceedings of the 1st international conference on Embedded networked sensor systems, pages 193–204, New York, NY, USA, 2003. ACM Press.
[5] A. Trigoni, Y. Yao, A. Demers, J. Gehrke, and R. Rajaraman. Multi-Query Optimization for Sensor Networks. In the International Conference on Distributed Processing on Sensor Systems (DCOSS), 2005.
66 References (Cont.)
[6] S. Madden, M. J. Franklin, J. M. Hellerstein, W. Hong. The Design of an Acquisitional Query Processor For Sensor Networks. Proc. Int. Conf. on Management of Data (SIGMOD), San Diego (USA), 2003.
[7] P. Bonnet, J. Gehrke, P. Seshadri. Towards Sensor Database Systems. Lecture Notes in Computer Science, Springer Verlag, 2001.