Published by Silas Hutchinson. Modified over 9 years ago.
1
Data Quality and Query Cost in Pervasive Sensing Systems
David J. Yates
Bentley College, Computer Information Systems Dept.
Waltham, Massachusetts, USA
dyates@bentley.edu
2
Joint Work With …
Erich Nahum, IBM T.J. Watson Research Center, 19 Skyline Drive, Hawthorne, New York, USA
James Kurose and Prashant Shenoy, Dept. of Computer Science, University of Massachusetts, Amherst, Massachusetts, USA
3
Talk Outline
Data quality and query cost for pervasive sensing systems
Motivation and introduction
Pervasive sensing applications
Resource-constrained sensor fields
Sensor networks and backbone networks
Data management techniques to conserve resources
Sensor network data server and cache
Query cost, data quality, delay, value deviation
Cost and quality performance
Summary and Conclusions
4
Research Contributions
Define and quantify data quality and query cost performance in pervasive sensing systems
Develop policies that approximate sensor field values using cached values for nearby locations
Prove analytic upper bound on sensor field query rate
Show cost and quality win-win for pervasive sensing applications for which response time is most important
Show cost vs. quality tradeoff for sensing applications for which accuracy is most important
Results are robust with respect to the manner in which the query workload changes
5
Pervasive Sensing Applications
Microsensors, on-board processing, wireless interfaces feasible at very small scale – can monitor phenomena “up close”
Enables spatially and temporally dense monitoring and control
Pervasive sensing will reveal previously unobservable phenomena
Data center management
Manufacturing engineering
Environmental monitoring
Natural disaster response
Embedded, energy-constrained (wireless, small form-factor), unattended systems
6
Sensors Embedded in Infrastructure
The day after a moderate earthquake jolts the city of San Francisco, building inspectors check on the structural integrity of an office building in the financial district. Sensors embedded in the walls of the building to monitor and record vibration data confirm that the structure is safe to enter. (Intel 2005)
7
From Sensor Networks to Applications
Sensor fields (blue), backbone (yellow), monitoring & control applications (red)
Queries submitted from sensing applications
Replies received from sensor fields
Our focus – Data management at data server
[Figure labels: Light, Sound, Data server / Gateway (and cache), Routers & Switches, Sensing Application]
Embedded, energy-constrained (wireless, small form-factor), unattended systems
8
Data Server Node Without Cache
[Figure labels: sensor network query queue, gateway reply queue, sensor field, queries, replies; sensors s; query locations l_1 and l_2 with timestamps {t_1} and {t_2}]
l_i = query location i
t_i = timestamp associated with value sampled in sensor field at location i
s = sensor
9
Data Server Node Without Cache
[Figure labels: sensor network query queue, gateway reply queue, sensor field, queries, replies, Query m, Reply m; sensors s; query locations l_1 and l_2 with timestamps {t_1} and {t_2}]
l_i = query location i
t_i = timestamp associated with value sampled in sensor field at location i
s = sensor
End-to-end delay occurs between Query m and Reply m.
Value deviation is between the value in Reply m and the value at l_i as Reply m leaves the gateway reply queue.
10
Data Server Node With Cache
[Figure labels: sensor network query queue, gateway query queue, cache update queue, cache, gateway reply queue, sensor field; queries, updates or replies, hit, miss or prefetch, updates, Query m, Reply m; sensors s; query locations l_1, l_2, l_3]
l_i = query location; el_i = cache entry for query location
t_i = timestamp of value associated with location i
v_i = value in cache associated with location i
el_i = { l_i, v_i, t_i }
s = sensor
Locations l_1 and l_2 are cached in entries el_1 and el_2
For a cache hit or a miss, end-to-end delay occurs between Query m and Reply m. Also, value deviation is between the value in Reply m and the value at l_i as Reply m leaves the gateway reply queue.
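The cache entry { l_i, v_i, t_i } and the hit/miss paths can be sketched in Python as follows. This is a minimal illustration, not the talk's implementation: `DataServerCache`, `query_sensor_field`, and the grid-coordinate `Location` type are hypothetical names, and the freshness test against a maximum age (the parameter T introduced with the lookup policies) is an assumed rule.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

Location = Tuple[int, int]  # hypothetical: a location as grid coordinates


@dataclass
class CacheEntry:
    location: Location   # l_i
    value: float         # v_i
    timestamp: float     # t_i, time at which the value was sampled


class DataServerCache:
    """Answers a query from the cache when the entry is fresh, else queries the field."""

    def __init__(self, max_age: float,
                 query_sensor_field: Callable[[Location, float], float]):
        self.max_age = max_age                        # assumed age parameter T, in seconds
        self.query_sensor_field = query_sensor_field  # hypothetical miss handler
        self.entries: Dict[Location, CacheEntry] = {}
        self.hits = 0
        self.misses = 0

    def lookup(self, location: Location, now: float) -> float:
        entry = self.entries.get(location)
        # Hit: an entry exists and is younger than max_age.
        # With max_age = 0 every lookup misses; with max_age = inf every cached entry hits.
        if entry is not None and now - entry.timestamp < self.max_age:
            self.hits += 1
            return entry.value
        # Miss: forward the query to the sensor field and cache the reply.
        self.misses += 1
        value = self.query_sensor_field(location, now)
        self.entries[location] = CacheEntry(location, value, now)
        return value
```

Counting `misses` gives the number of sensor field queries, which is the quantity a cost metric would be built on.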
11
Query Cost and Data Quality
Cost to query location l_i is normalized such that …
Normalized quality using softmax normalization
12
Caching and Lookup Policies
All hits
All misses
Simple lookup
Piggyback queries
Greedy age-based lookup
Greedy distance-based lookup
Median-of-3 lookup
[Figure labels group the policies into “precise lookups and queries” and “approximate lookups and queries”]
Policies incorporate an age parameter T
T can be 0, finite, or infinite
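The slide names the approximate policies without defining them, so the Python sketch below shows only one plausible reading of each name. Everything in it is an assumption: the `Entry` type, the neighborhood `radius`, and the selection rules (nearest fresh entry, newest fresh entry, median of the three nearest fresh values) are illustrative choices, not the definitions used in the talk.

```python
import math
import statistics
from typing import List, NamedTuple, Optional, Tuple


class Entry(NamedTuple):
    location: Tuple[float, float]
    value: float
    timestamp: float


def fresh_neighbors(cache: List[Entry], target: Tuple[float, float],
                    now: float, max_age: float, radius: float) -> List[Entry]:
    """Cached entries younger than max_age (T) and within radius of the target."""
    return [e for e in cache
            if now - e.timestamp < max_age
            and math.dist(e.location, target) <= radius]


def greedy_distance_lookup(cache, target, now, max_age, radius) -> Optional[float]:
    """Assumed rule: answer with the value of the nearest fresh entry."""
    candidates = fresh_neighbors(cache, target, now, max_age, radius)
    if not candidates:
        return None  # caller would fall back to querying the sensor field
    return min(candidates, key=lambda e: math.dist(e.location, target)).value


def greedy_age_lookup(cache, target, now, max_age, radius) -> Optional[float]:
    """Assumed rule: answer with the value of the most recently sampled fresh entry."""
    candidates = fresh_neighbors(cache, target, now, max_age, radius)
    if not candidates:
        return None
    return max(candidates, key=lambda e: e.timestamp).value


def median_of_3_lookup(cache, target, now, max_age, radius) -> Optional[float]:
    """Assumed rule: answer with the median value of the three nearest fresh entries."""
    candidates = fresh_neighbors(cache, target, now, max_age, radius)
    nearest = sorted(candidates, key=lambda e: math.dist(e.location, target))[:3]
    if len(nearest) < 3:
        return None
    return statistics.median([e.value for e in nearest])
```

Returning `None` keeps the fallback decision (query the sensor field, or block behind a pending query) outside the lookup itself.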
13
Research Contributions
Defined and quantified data quality and query cost performance in pervasive sensing systems
Developed policies that approximate sensor field values using cached values for nearby locations
Prove analytic upper bound on sensor field query rate
Show cost and quality win-win for pervasive sensing applications for which response time is most important
Show cost vs. quality tradeoff for sensing applications for which accuracy is most important
Results are robust with respect to the manner in which the query workload changes
14
Lab Trace Data
Trace data from multi-sensor motes deployed at Intel Berkeley lab (Deshpande 2004)
15
Lab Environment and Workload
2.3 million readings taken over 35+ days
Use readings with largest changes in value in our simulator (light measured in Lux)
Changes occur slowly relative to correlated changes (about 1 location every 1.4 seconds)
But, range of values is large
Applications determine values for A and T
16
Bounded Resource Consumption
N is set of locations in sensor field
Cache entry for each location used by multiple queries for periods of T seconds (requires blocking behind pending queries)
Sensor field query rate can be bounded by: … queries per second
Proof: Induction on size of N
Sensor field transmissions dominate resource consumption
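One way to see why reusing each cache entry for T seconds bounds the sensor field query rate: if a reply for a location serves every query for that location during the following T seconds, that location can trigger at most one sensor field query per T seconds, so under this reading the long-run rate cannot exceed |N|/T however fast queries arrive. The Python sketch below checks that reading by simulation. It is an illustration under stated assumptions, not the talk's proof: `simulate_field_queries`, the Poisson arrivals, uniformly chosen locations, and a zero sensor field delay are all hypothetical choices.

```python
import random


def simulate_field_queries(num_locations: int, max_age: float,
                           query_rate: float, duration: float,
                           seed: int = 0) -> int:
    """Count sensor field queries when each cached value is reused for max_age seconds."""
    rng = random.Random(seed)
    last_fetch = [float("-inf")] * num_locations  # time of last field query per location
    field_queries = 0
    now = 0.0
    while True:
        now += rng.expovariate(query_rate)        # assumed Poisson query arrivals
        if now >= duration:
            break
        loc = rng.randrange(num_locations)        # all locations equally likely
        if now - last_fetch[loc] >= max_age:      # entry missing or too old: query the field
            field_queries += 1
            last_fetch[loc] = now
    return field_queries


if __name__ == "__main__":
    n, t, duration = 50, 9.0, 900.0
    for rate in (0.9, 9.0, 90.0):
        observed = simulate_field_queries(n, t, rate, duration) / duration
        print(f"query rate {rate:5.1f}/s -> field rate {observed:.2f}/s (|N|/T = {n / t:.2f}/s)")
```

Raising the application query rate pushes the observed sensor field rate toward |N|/T but not past it, apart from the start-up effect of the first fetch for each location.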
17
Data Quality Driven by Response Time
Picking a large value of A means delay is more important than value deviation
Consider normalized quality when A = 0.9
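The quality formula itself is not part of this text, so the Python sketch below only illustrates how a weight A could trade delay against value deviation. It assumes that both inputs are already normalized to [0, 1] (0 is best) and that quality is a convex combination of the two; `weighted_quality` and that combination are assumptions, not the talk's definition.

```python
def weighted_quality(norm_delay: float, norm_deviation: float, a: float) -> float:
    """Hypothetical quality score in [0, 1]; larger a puts more weight on delay."""
    if not 0.0 <= a <= 1.0:
        raise ValueError("a must lie in [0, 1]")
    # Low normalized delay and low normalized deviation both raise quality.
    return a * (1.0 - norm_delay) + (1.0 - a) * (1.0 - norm_deviation)


if __name__ == "__main__":
    # A fast but inaccurate reply: low delay, high value deviation.
    print(weighted_quality(0.2, 0.8, a=0.9))  # delay-driven quality
    print(weighted_quality(0.2, 0.8, a=0.1))  # accuracy-driven quality
```

Under this assumed form, the same fast-but-stale reply scores well when A = 0.9 and poorly when A = 0.1, which is the contrast the two halves of the talk explore.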
18
Cost and Quality Performance when Response Time drives Quality
Trace-driven Changes
A = 0.9, T = 90 sec
Query rate = 0.9 lps
Change rate = 1.4 lps
Approximate greedy lookups outperform other policies
There is a win-win here!
19
Delay when Response Time drives Quality
Trace-driven Changes
20
Research Contributions
Defined and quantified data quality and query cost performance in pervasive sensing systems
Developed policies that approximate sensor field values using cached values for nearby locations
Proved analytic upper bound on sensor field query rate
Showed cost and quality win-win for pervasive sensing applications for which response time is most important
Show cost vs. quality tradeoff for sensing applications for which accuracy is most important
Results are robust with respect to the manner in which the query workload changes
21
Data Quality Driven by Accuracy
Choosing a small value of A means value deviation is more important to data quality than delay
For example, consider normalized quality when A = 0.1
22
Cost vs. Quality when Accuracy drives Quality
Trace-driven Changes
A = 0.1, T = 90 sec
Query rate = 0.9 lps
Change rate = 1.4 lps
There is a tradeoff between cost and quality here
23
Value Deviation when Accuracy drives Quality
Trace-driven Changes
Significant differences in accuracy between policies
24
Cost and Quality Trends when Response Time drives Quality
Trace-driven Changes
A = 0.9, T = 9 sec
Query rate = 90, 9, and 0.9 lps
Again, there is a win-win here!
25
Cost vs. Quality Trends when Accuracy drives Quality
Trace-driven Changes
A = 0.1, T = 9 sec
Query rate = 90, 9, and 0.9 lps
Same relative performance
26
Talk Summary
Define and quantify data quality and query cost performance in pervasive sensing systems
Develop policies that approximate sensor field values using cached values for nearby locations
Prove analytic upper bound on sensor field query rate
Show cost and quality win-win for pervasive sensing applications for which response time is most important
Show cost vs. quality tradeoff for sensing applications for which accuracy is most important
Results are robust with respect to the manner in which the query workload changes
27
Thank You!
Further questions?
David J. Yates
Bentley College, Computer Information Systems Dept.
Waltham, Massachusetts, USA
dyates@bentley.edu
28
Emergency Response Applications
Fire erupts in a warehouse in an industrial section of town. A sensing system installed in the building feeds detailed data to fire crews arriving on the scene, describing the location, characteristic and etiology of the fire, and predicting its future path. The result: firefighters are able to work quickly and safely to bring the blaze under control. (Intel 2005)
29
Technology & Market Trends
Three of the 7 companies named by Gartner as “Cool Vendors in Emerging Trends and Technologies” in 2005 produced hardware and/or software for sensor networks (Reynolds et al. 2005)
IDC has identified supply chain management as the largest sensor network market in the short-term and predicts that the domestic market for RFID sensors will exceed $1 billion in 2007 (C. Boone 2003)
30
Data Quality and Query Cost: Research Issues
What form do data quality and query cost performance take?
Can we bound resource consumption?
Which policies provide best cost and quality when value deviation is more important than delay?
Which policies provide best performance when delay is more important than value deviation?
How does the manner in which the environment changes impact performance?
31
Softmax Normalization
Requires that we know only the mean and standard deviation for our system delays and value deviations
Normalization makes transformed values lie in the range [0,1]
Used in neural networks and data mining for pattern recognition and data classification (Bridle 1990, Bishop 1995, Han and Kamber 2000)
Reaches “softly” towards maximum and minimum values, never quite getting there (Rodríguez 2004)
Transformation is more or less linear in the middle range, and has a nonlinearity at both ends (Rodríguez 2004)
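The properties listed for softmax normalization match a commonly used form: standardize each value with the mean and standard deviation, then pass the result through the logistic function. The Python sketch below uses that form as an assumption, since the exact expression used in the talk is not given here; `softmax_normalize` is a hypothetical name.

```python
import math
import statistics
from typing import List, Sequence


def softmax_normalize(values: Sequence[float]) -> List[float]:
    """Map values into (0, 1) using only their mean and standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0.0:
        # All values identical: every value sits at the mean, which maps to 0.5.
        return [0.5] * len(values)
    # The logistic of the z-score is nearly linear around the mean and
    # flattens toward 0 and 1 at the extremes without ever reaching them.
    return [1.0 / (1.0 + math.exp(-(v - mean) / std)) for v in values]


if __name__ == "__main__":
    delays = [0.05, 0.06, 0.07, 0.08, 2.5]  # hypothetical delays in seconds, one outlier
    for delay, normalized in zip(delays, softmax_normalize(delays)):
        print(f"{delay:5.2f} s -> {normalized:.3f}")
```

The outlier is pulled toward 1 without stretching the scale for the typical values, which is one reason this normalization suits quantities like delay with occasional large excursions.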
32
Query Workload Model
Query workload consists of polling component and random component
Parameterize to yield many workloads proposed by others, e.g., [Madd03] (Berkeley), [Lu02] (Virginia), [Deme03] (Cornell), [Jami03] (MIT), [Inta03][Zhao03] (USC), [Desh03] (CMU), [Olst03] (Stanford)
These components are specified using two parameters:
the period of the polling component
λ = average query arrival rate for a process that represents the random component
Example: 9 queries with fixed interarrival times + 81 queries with exponentially distributed interarrival times = 90 queries / second
All locations are equally likely to be queried
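The two-component workload can be sketched in Python as a merge of a fixed-period stream and a Poisson stream. This is an illustrative generator, not the simulator used in the talk: `generate_workload`, the 50 locations, and the choice to draw every query's location uniformly at random are assumptions. The example mirrors the slide's numbers, with 9 polled queries per second plus a random component with λ = 81 per second.

```python
import random
from typing import List, Tuple


def generate_workload(poll_period: float, random_rate: float,
                      num_locations: int, duration: float,
                      seed: int = 0) -> List[Tuple[float, int]]:
    """Return (arrival_time, location) pairs sorted by arrival time."""
    rng = random.Random(seed)
    times: List[float] = []

    # Polling component: fixed interarrival time of poll_period seconds.
    k = 0
    while k * poll_period < duration:
        times.append(k * poll_period)
        k += 1

    # Random component: exponentially distributed interarrival times
    # with mean 1 / random_rate (a Poisson process of rate lambda).
    t = rng.expovariate(random_rate)
    while t < duration:
        times.append(t)
        t += rng.expovariate(random_rate)

    times.sort()
    # All locations are equally likely to be queried.
    return [(arrival, rng.randrange(num_locations)) for arrival in times]


if __name__ == "__main__":
    workload = generate_workload(poll_period=1 / 9, random_rate=81.0,
                                 num_locations=50, duration=100.0)
    print(f"average rate: {len(workload) / 100.0:.1f} queries / second")
```

Varying the two parameters moves the workload between pure polling and purely random arrivals while the total rate can stay fixed.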
33
Models for Changes to Environment
1. Changes at each location are independent
2. Changes at each location correlated in space and time. Models developed at USC (Jindal 2004)
3. Changes taken from real-world sensor readings at Intel Berkeley lab (Deshpande 2004)
Our focus - Models 2. and 3.
34
Delay when Accuracy drives Quality
[Figure panels: Correlated Changes; Trace-driven Changes]
Large all misses delay has important impact on quality, but is discounted by choice of A = 0.1
35
Results from Two Models
For correlated and trace-driven sensor network models
When delay is more important than value deviation, policies that approximate values using cached values for nearby locations provide best cost and best quality performance
When value deviation is more important than delay, there is a cost vs. quality tradeoff
  Policies that always query (and cache) the specified location provide the best quality performance
  Policies that approximate values using cached values for nearby locations provide best cost performance
What happens if we vary the query rate relative to the rate at which the environment changes?
36
Summary and Conclusions
Data Quality and Query Cost in Pervasive Sensing Systems
Define and quantify data quality and query cost performance in Pervasive Sensing Systems
Blocking behind pending queries bounds sensor field query rate
When delay is more important than value deviation, policies that approximate values using cached values for nearby locations provide best cost and quality performance
When value deviation is more important than delay, there is a cost vs. quality tradeoff
  Policies that always query (and cache) the specified location provide the best quality performance
  Policies that approximate values using cached values for nearby locations provide best cost performance
Results are robust with respect to the manner in which the environment changes
37
References I
(Christopher Bishop 1995) Neural Networks for Pattern Recognition. Oxford University Press, Oxford.
(C. Boone 2003) “U.S. RFID for the Retail Supply Chain Spending Forecast and Analysis, 2003-2008,” IDC, December 2003.
(John Bridle 1990) “Probabilistic interpretation of feed-forward classification network outputs, with relationships to statistical pattern recognition,” In Neurocomputing: Algorithms, Architectures and Applications, Volume 6, Springer-Verlag, Berlin.
(A. Deshpande, C. Guestrin, S. Madden, J.M. Hellerstein, W. Hong 2004) “Model-Driven Data Acquisition in Sensor Networks,” In International Conference on Very Large Data Bases (VLDB), Toronto, August 2004.
38
References II
(J. Han and M. Kamber 2000) Data Mining: Concepts and Techniques. Morgan Kaufmann Publishers, San Francisco, California.
(Intel 2005) Intel Corp., “Expanding Usage Models for Pervasive Sensing Systems,” Technology@Intel Magazine, August 2005.
(A. Jindal and K. Psounis 2004) “Modeling spatially-correlated sensor network data,” In IEEE International Conference on Sensor and Ad hoc Communications and Networks (SECON), Santa Clara, California, October 2004.
(Reynolds et al. 2005) Martin Reynolds, Alan Mac Neela, Carol Rozwell, and Anne-Marie Roussel, “Cool Vendors in Emerging Trends and Technologies,” Gartner Research Report, March 2005.
39
References III
(Caroline Rodríguez 2004) “A computational environment for data preprocessing in supervised classification,” M.Sc. Thesis, Department of Mathematics, University of Puerto Rico, Mayagüez, July 2004.
(J. Sikander 2004) “Microsoft RFID Technology Overview,” Microsoft Corp., November 2004.