Sensor Network Databases1 Overview: Chapter 6  Sensor Network Databases  Sensor networks are conceptually a distributed DB  Store collected data  Indexes.

Slides:



Advertisements
Similar presentations
Data and Computer Communications
Advertisements

C6 Databases.
Scalable Content-Addressable Network Lintao Liu
©Silberschatz, Korth and Sudarshan12.1Database System Concepts Chapter 12: Part C Part A:  Index Definition in SQL  Ordered Indices  Index Sequential.
Searching on Multi-Dimensional Data
1 Data-Centric Storage in Sensornets with GHT, A Geographic Hash Table Sylvia Ratnasamy, Scott Shenker, Brad Karp, Ramesh Govindan, Deborah Estrin, Li.
1 Sensor Network Databases Ref: Wireless sensor networks---An information processing approach Feng Zhao and Leonidas Guibas (chapter 6)
File Management Chapter 12. File Management A file is a named entity used to save results from a program or provide data to a program. Access control.
Gossip Scheduling for Periodic Streams in Ad-hoc WSNs Ercan Ucan, Nathanael Thompson, Indranil Gupta Department of Computer Science University of Illinois.
Multidimensional Data. Many applications of databases are "geographic" = 2­dimensional data. Others involve large numbers of dimensions. Example: data.
Data-Centric Storage in Sensor Networks With GHT Khaldoun A. Ibrahim,
Approximating Sensor Network Queries Using In-Network Summaries Alexandra Meliou Carlos Guestrin Joseph Hellerstein.
Managing Data Resources
Multi-dimensional Range Query in Sensor Networks Xin Li,Young Jim Kim, Ramesh Govindan (University of Southern California ) Wei Hong (Intel Research Lab.
The Cougar Approach to In-Network Query Processing in Sensor Networks By Yong Yao and Johannes Gehrke Cornell University Presented by Penelope Brooks.
Carnegie Mellon University Complex queries in distributed publish- subscribe systems Ashwin R. Bharambe, Justin Weisz and Srinivasan Seshan.
Dissemination protocols for large sensor networks Fan Ye, Haiyun Luo, Songwu Lu and Lixia Zhang Department of Computer Science UCLA Chien Kang Wu.
Distributed Quad-Tree for Spatial Querying in Wireless Sensor Networks (WSNs) Murat Demirbas, Xuming Lu Dept of Computer Science and Engineering, University.
A Survey of Wireless Sensor Network Data Collection Schemes by Brett Wilson.
Probabilistic Data Aggregation Ling Huang, Ben Zhao, Anthony Joseph Sahara Retreat January, 2004.
Quick Review of Apr 15 material Overflow –definition, why it happens –solutions: chaining, double hashing Hash file performance –loading factor –search.
Chapter 3: Data Storage and Access Methods
Distributed Quad-Tree for Spatial Querying in Wireless Sensor Networks (WSNs) Murat Demirbas, Xuming Lu Dept of Computer Science and Engineering, University.
SESSION 7 MANAGING DATA DATARESOURCES. File Organization Terms and Concepts Field: Group of words or a complete number Record: Group of related fields.
Managing Data Resources. File Organization Terms and Concepts Bit: Smallest unit of data; binary digit (0,1) Byte: Group of bits that represents a single.
Sensor Networks Storage Sanket Totala Sudarshan Jagannathan.
Distributed Data Stores – Facebook Presented by Ben Gooding University of Arkansas – April 21, 2015.
An adaptive framework of multiple schemes for event and query distribution in wireless sensor networks Vincent Tam, Keng-Teck Ma, and King-Shan Lui IEEE.
Systems analysis and design, 6th edition Dennis, wixom, and roth
CSC271 Database Systems Lecture # 30.
1 Pertemuan 20 Teknik Routing Matakuliah: H0174/Jaringan Komputer Tahun: 2006 Versi: 1/0.
Load Balancing of In-Network Data-Centric Storage Schemes in Sensor Networks Mohamed Aly In collaboration with Kirk Pruhs and Panos K. Chrysanthis Advanced.
7.1 Managing Data Resources Chapter 7 Essentials of Management Information Systems, 6e Chapter 7 Managing Data Resources © 2005 by Prentice Hall.
EN : Adv. Storage and TP Systems Cost-Based Query Optimization.
Mutlidimensional Indices Instructor: Randal Burns Lecture for 29 November 2005 Computer Science Johns Hopkins University.
March 6th, 2008Andrew Ofstad ECE 256, Spring 2008 TAG: a Tiny Aggregation Service for Ad-Hoc Sensor Networks Samuel Madden, Michael J. Franklin, Joseph.
Sensor Database System Sultan Alhazmi
Benjamin AraiUniversity of California, Riverside Reliable Hierarchical Data Storage in Sensor Networks Song Lin – Benjamin.
Physical Database Design I, Ch. Eick 1 Physical Database Design I About 25% of Chapter 20 Simple queries:= no joins, no complex aggregate functions Focus.
Query Processing for Sensor Networks Yong Yao and Johannes Gehrke (Presentation: Anne Denton March 8, 2003)
Frontiers in Massive Data Analysis Chapter 3.  Difficult to include data from multiple sources  Each organization develops a unique way of representing.
C6 Databases. 2 Traditional file environment Data Redundancy and Inconsistency: –Data redundancy: The presence of duplicate data in multiple data files.
FEN Introduction to the database field:  Applications, concepts and terminology Seminar: Introduction to relational databases.
Zone Sharing: A Hot-Spots Decomposition Scheme for Data-Centric Storage in Sensor Networks Mohamed Aly Nicholas Morsillo Panos K. Chrysanthis Kirk Pruhs.
DIST: A Distributed Spatio-temporal Index Structure for Sensor Networks Anand Meka and Ambuj Singh UCSB, 2005.
MANAGING DATA RESOURCES ~ pertemuan 7 ~ Oleh: Ir. Abdul Hayat, MTI.
The Replica Location Service The Globus Project™ And The DataGrid Project Copyright (c) 2002 University of Chicago and The University of Southern California.
Rendezvous Regions: A Scalable Architecture for Service Location and Data-Centric Storage in Large-Scale Wireless Sensor Networks Karim Seada, Ahmed Helmy.
TELE202 Lecture 6 Routing in WAN 1 Lecturer Dr Z. Huang Overview ¥Last Lecture »Packet switching in Wide Area Networks »Source: chapter 10 ¥This Lecture.
Physical Database Design Purpose- translate the logical description of data into the technical specifications for storing and retrieving data Goal - create.
Physical Database Design I, Ch. Eick 1 Physical Database Design I Chapter 16 Simple queries:= no joins, no complex aggregate functions Focus of this Lecture:
Teknik Routing Pertemuan 10 Matakuliah: H0524/Jaringan Komputer Tahun: 2009.
STDCS: A Spatio-Temporal Data-Centric Storage Scheme For Real-Time Sensornet Applications Mohamed Aly (University of Pittsburgh & Yahoo, Inc.) In collaboration.
Chapter 8 Physical Database Design. Outline Overview of Physical Database Design Inputs of Physical Database Design File Structures Query Optimization.
1 Querying the Physical World Son, In Keun Lim, Yong Hun.
Attribute Allocation in Large Scale Sensor Networks Ratnabali Biswas, Kaushik Chowdhury, and Dharma P. Agrawal International Workshop on Data Management.
1 Overview of Query Evaluation Chapter Outline  Query Optimization Overview  Algorithm for Relational Operations.
KDDCS: A Load-Balanced In- Network Data-Centric Storage Scheme for Sensor Networks Mohamed Aly In collaboration with Kirk Pruhs and Panos K. Chrysanthis.
Multidimensional Access Structures COMP3017 Advanced Databases Dr Nicholas Gibbins –
Managing Data Resources File Organization and databases for business information systems.
Data Transformation: Normalization
Updating SF-Tree Speaker: Ho Wai Shing.
Multidimensional Access Structures
Distributed database approach,
Physical Database Design and Performance
MANAGING DATA RESOURCES
Indexing and Hashing Basic Concepts Ordered Indices
Outline Ganesan, D., Greenstein, B., Estrin, D., Heidemann, J., and Govindan, R. Multiresolution storage and search in sensor networks. Trans. Storage.
Data-Centric Networking
Overview: Chapter 2 Localization and Tracking
Presentation transcript:

Sensor Network Databases1 Overview: Chapter 6  Sensor Network Databases  Sensor networks are conceptually a distributed DB  Store collected data  Indexes the data  Serves queries from users on the data  DB abstraction  Separates logical data view(naming,access,operations) from implementation details  Loss of efficiency, but increased ease of use

Sensor Network Databases2 Challenges  Transferring data to central server not feasible  Need in-network storage  Node memories/storage are limited  Older data needs to be discarded  Classic DB performance metrics not suitable  Query languages need additional operators  Support for continuous long-running queries  Need to correlate current readings with past statistics

Sensor Network Databases3 Querying the Physical Environment  Need a high-level query language  Separate query from underlying implementation details  Allow non-expert users easy access to data  Query execution plan dependent on system state and implementation  Continuous and historical queries need special operators

Sensor Network Databases4 Query Interface: Cougar Sensor DB  Each type of sensor is an abstract data type (ADT)  Similar to Object-Relational DB  Can perform functions on sensor data  Distributed query processing  Query transmitted to sensors  Only those that satisfy query respond  Eliminate need to transmit all sensor data to central server

Sensor Network Databases5 Query Interface: Probabilistic Queries  Uncertainty in data caused by noise etc.  Requests for exact sensor values not appropriate  Need query language that can handle uncertainty  Gaussian ADT (GADT) models uncertainty as continuous pdf  Query: retrieve all values from sensors with value X with probability at least Y

Sensor Network Databases6 Database Organization  Centralized data storage  Sensors forward all data to centralized server  Queries do not incur additional network overhead  Nodes near server act as routing hotspots and become depleted of energy  In-network data storage  Storage points for data in the network act as rendezvous points between queries and data  Load is balanced across network  Query time depends on indexing scheme  In-network storage allows aggregation of data ahead of query processing

Sensor Network Databases7 Query Propagation & Aggregation  Server-based approach  Data is sent to server and aggregated at server  In-network aggregation  Data is aggregated at sensors  Query propagated to sensors  Need efficient routing protocol  How to aggregate the data  Can some data be computed ahead of time and then aggregated?

Sensor Network Databases8 TinyDB  SQL-style query interface  SQL operators: count, min, max, sum, average  Extension operators: median, histogram  In-network aggregate query-processing  Sensitive to resource constraints and lossy communication channels

Sensor Network Databases9 Query Processing Scheduling  Processing based on sampling period  Sampling period divided into time intervals  Aggregation results reported at end of each sampling period  Choice of sampling period is important  Routing tree depends on number of intervals in sampling period  Nodes schedule processing, communication etc. based on routing tree

Sensor Network Databases10 Optimizations  Aggregation can be pipelined to increase throughput  Nodes can snoop packets of others to make early decisions  Report only changed data  Adaptive aggregation to support changing network conditions

Sensor Network Databases11 Data-Centric Storage  Flexible storage needed when queries can originate from within network  Server-based approaches flood queries to all nodes  Data-Centric Storage (DCS) make use of rendezvous points to aggregate queries and data

Sensor Network Databases12 Geographical Hash Table (GHT)  Mapping from node attribute to storage location by hash function  Distributes data evenly across network  Nodes know their location  Objects have keys  Nodes responsible for storing a range of keys  Geographical routing algorithm  Any node can locate storage node for any attribute  Nodes put and get data to/from storage  Hashtable interface  Data replicated at nodes near the storage point for a key

Sensor Network Databases13 Data Indices & Range Queries  Sensor network DBs need to support range queries  Query specifies value range for each attribute  GHTs and base TinyDB model not adequate as is  Need to index data to support complex queries  Metrics for measuring index  Speed gains for query processing  Size of index  Costs of building and maintaining index

Sensor Network Databases14 1-D Indices  1-Dimensional indices  Most prevalent type of DB index  Indexes data parameterized by single value  Implemented with data structures: B-trees, hash tables, etc.  Not directly mapable to distributed implementation  1D Index for range queries  Pre-compute and store answer to certain range queries  Compute answer to arbitrary range query from pre- computed answers  Adapting to distributed storage model requires partial aggregation of results

Sensor Network Databases15 Multi-Dimensional Indices  Can’t simply create multiple 1D indices  Queries will retrieve many more records than necessary  Indexing scheme must take into account need to range queries and distributed storage  Quickly eliminate irrelevant records  Top-down hierarchical approach  K-D trees, quad trees, R trees etc.  Assumes orthogonal ranges  Non-orthogonal range searching requires more complex schemes

Sensor Network Databases16 Multi-Resolution Summarization  Aggregate summaries of data can be used to “drill down” to more specific queries  Saves on storage space and network communication  Need appropriate data structures (e.g., quad trees, etc.)  Indexing based on data structure  Match queries based on ranges from child branches at each node  Non-matching branches pruned off

Sensor Network Databases17 Partitioning the Summaries  Balances workload across nodes  DIFS  Nodes distributed in 2D space  Use multi-rooted quad-trees to partition spatial domain  Wider spatial domain of node = narrower value range indexed  Couples spatial domain decomposition to value indexing  Uses a GHT to locate index nodes

Sensor Network Databases18 Fractional Cascading  Intuition: queries exhibit temporal and spatial locality and are not for arbitrary data  Idea: store information about other nodes at each node, but nodes only know “fraction” of info from far nodes  Little duplicate info  Spatial locality: nearby neighbors know a lot about local information  Can still satisfy queries of distant nodes

Sensor Network Databases19 Locality-Preserving Hashing  Map high-dimensional attribute space to a plane  Close values in high-D space are close in the plane  Spatial domain divided into zones  Locality-preserving geographical hash  Maps attribute space to spatial domain  Values that are similar are mapped to nearby nodes

Sensor Network Databases20 Data Aging  Continuous data acquisition and limited storage = data storage problem  Older data is aggregated and summarized  Eventually older data is discarded  Aging functions are difficult to construct  Aging policy is application dependent

Sensor Network Databases21 Indexing Motion Data  Continuously changing sensor data  Fixed index structure not appropriate  Heavy index update penalty for continuous data  Need to support temporal queries, predictive queries, etc.  Apriori motion knowledge  Time is treated as another attribute  Standard indexing methods may apply  No apriori knowledge  More likely in physical world  Dynamic index. Updates only on critical events