1 Querying the Physical World ------Cornell University Event Detection Services Using Data Service Middleware in Distributed Sensor Networks ------University.

Presentation transcript:

1 Querying the Physical World Cornell University Event Detection Services Using Data Service Middleware in Distributed Sensor Networks University of Virginia Presented By Gary UVA CS 862 Presentation

2 Comparison between these two papers (Querying the Physical World vs. Event Detection Service)  No avi value for each data item, so not really real-time based, vs. an avi value for each data item, so really real-time based.  Point of special interest: represents device functions, vs. provides an event detection service.  Concentrates on individual motes vs. group-based robust coordination.  Provides a database-like abstraction to applications.

3 Outline --- Querying the Physical World Device Networks & Their Query Processing  Description of Device Networks  Three kinds of queries  Two approaches Device Database System  Device & Function  User representation  Internal representation  Queries Query Processing over Device Database System  Performance Metrics  Distributed Query Execution Plans  Experiments Discussions

4 Outline --- Event Detection Service Motivation Data services in sensor networks Data Service Middleware (DSWare) Pay more attention to Event Detection Service Experiments and performance Discussions

5 Device Networks & Their Query Processing Description of Device Network The widespread deployment of sensors, actuators and mobile devices is transforming the physical world into a computing platform. Emerging networking techniques ensure that devices are interconnected and accessible from local- or wide-area networks. Using this new computing platform, users interact with portions of the physical world.

6 Three kinds of Queries Historical queries These are typically aggregate queries over historical data obtained from the device network. An example --- For each rainfall sensor in 1800 JPA, display the average level of rainfall for Snapshot queries These queries concern the device network at a given point in time. An example --- Retrieve the current rainfall level for all sensors in 1800 JPA. Long-running queries These queries concern the device network over a time interval. An example --- For the next 5 hours, retrieve every 30 seconds the rainfall level for all sensors in 1800 JPA.

7 Two Approaches Device database system Definition --- A database system that enables distributed query processing over a device network. The warehousing approach Definition --- In this approach, data are extracted from the devices in a predefined way and stored in a centralized database system that is responsible for query processing.

8 Two Approaches --- warehousing Advantages of the warehousing approach: It is well suited for aggregate queries over historical data. Disadvantages of the warehousing approach: It uses valuable resources to transfer large amounts of raw data from the devices to the database server. It disassociates access to the devices from the query workload.

9 Two Approaches --- Device database system Device database system  Device & Function  User representation  Internal representation  Queries

10 Device & Function Device Each device is a mini-server that supports a set of functions and can process portions of the queries directly at the device. Function A function either a) acquires, stores and processes data, or b) triggers an action in the physical world. Synchronous function  It returns its result immediately, on demand.  It is used to monitor continuous phenomena, for example, a function that returns the rainfall level. Asynchronous function  It returns its result after an arbitrary period of time.  It is used to monitor threshold events, for example, a function that detects an abnormal rainfall level.
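
A minimal Python sketch of the two kinds of device function on this slide. The class `RainfallDevice`, its method names and the 50 mm threshold are hypothetical and not taken from the paper: the synchronous function returns a reading on demand, while the asynchronous one delivers its result later through a callback, only when a threshold event occurs.

```python
from typing import Callable, List, Tuple


class RainfallDevice:
    """Hypothetical device with one synchronous and one asynchronous function."""

    def __init__(self, device_id: str, threshold_mm: float) -> None:
        self.device_id = device_id
        self.threshold_mm = threshold_mm
        self._level_mm = 0.0
        self._listeners: List[Callable[[str, float], None]] = []

    def get_rainfall_level(self) -> float:
        """Synchronous function: returns the current reading immediately."""
        return self._level_mm

    def on_abnormal_rainfall(self, callback: Callable[[str, float], None]) -> None:
        """Asynchronous function: the result arrives later, via the callback."""
        self._listeners.append(callback)

    def sense(self, level_mm: float) -> None:
        """Simulates a new physical measurement arriving at the device."""
        self._level_mm = level_mm
        if level_mm > self.threshold_mm:
            for notify in self._listeners:
                notify(self.device_id, level_mm)


alerts: List[Tuple[str, float]] = []
device = RainfallDevice("s1", threshold_mm=50.0)
device.on_abnormal_rainfall(lambda dev, mm: alerts.append((dev, mm)))
device.sense(12.0)   # below the threshold: no asynchronous result
device.sense(73.5)   # above the threshold: the callback fires
print(device.get_rainfall_level(), alerts)
```

The caller of the asynchronous function never polls; it only registers interest, which is what makes this style suitable for threshold events.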

11 User representation Devices are represented as ADTs Abstract Data Type (ADT) objects ADT objects are objects that are single attribute values encapsulating a collection of related data. ADT objects provide controlled access to encapsulated data through a well-defined interface. An example: RFSensors (Sensor,X,Y) provides Sensor.getRainfallLevel()

12 Internal representation Device functions are represented as virtual relations. Virtual relation It is a tabular representation of a function. A record in it contains the input arguments and the output argument of the function it is associated with. Arguments of device function: a1, …, aM. Attributes of virtual relation: Device ADT ID, a1, …, aM, output value, time stamp. Properties of a virtual relation: It is append-only. It is naturally partitioned across all devices represented by the same device ADT.
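
The virtual relation described on this slide can be sketched in Python as an append-only table partitioned by device. The names `VirtualRecord` and `VirtualRelation` are illustrative assumptions; only the attribute list (device ADT ID, arguments, output value, time stamp) comes from the slide.

```python
import time
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class VirtualRecord:
    device_id: str      # Device ADT ID
    args: Tuple         # input arguments a1..aM of the device function
    value: float        # output value of the function call
    timestamp: float    # time stamp of the call


class VirtualRelation:
    """Append-only relation, partitioned by device (hypothetical sketch)."""

    def __init__(self) -> None:
        self._partitions: Dict[str, List[VirtualRecord]] = {}

    def append(self, device_id: str, args: Tuple, value: float,
               timestamp: Optional[float] = None) -> VirtualRecord:
        stamp = time.time() if timestamp is None else timestamp
        record = VirtualRecord(device_id, tuple(args), value, stamp)
        self._partitions.setdefault(device_id, []).append(record)
        return record

    def partition(self, device_id: str) -> List[VirtualRecord]:
        """Records held by one device; a copy, so callers cannot mutate it."""
        return list(self._partitions.get(device_id, []))

    def scan(self) -> List[VirtualRecord]:
        """All records across all partitions."""
        return [r for part in self._partitions.values() for r in part]


vr = VirtualRelation()
vr.append("s1", (), 42.0, timestamp=0.0)
vr.append("s2", (), 61.0, timestamp=0.0)
vr.append("s1", (), 55.0, timestamp=30.0)
print(len(vr.partition("s1")), [r.value for r in vr.scan()])
```

There is deliberately no update or delete method: records are only ever added, one per function call.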

13 Queries Historical queries Snapshot queries They are naturally formulated as declarative queries in SQL An example of long-running query SELECT R.Sensor.getRainfallLevel() FROM RFSensors R WHERE R.Sensor.getRainfallLevel() > 50 AND $every(30) The function $every(30) specifies that a new record is inserted every 30 seconds into the append-only virtual relation corresponding to the function RFSensor.getRainfallLevel().
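
To make the effect of `$every(30)` concrete, here is a small Python simulation of the long-running query above. It assumes that `$every(30)` means "sample each sensor once per 30-second period, then apply the selection"; the function `run_long_running_query`, the simulated clock and the `fake_reading` data are all hypothetical.

```python
from typing import Callable, Iterable, List, Tuple


def run_long_running_query(
    sensors: Iterable[str],
    read_level: Callable[[str, int], float],
    period_s: int = 30,
    duration_s: int = 120,
    threshold: float = 50.0,
) -> List[Tuple[int, str, float]]:
    """Simulate the query over a virtual clock instead of sleeping."""
    answers = []
    sensor_list = list(sensors)
    for t in range(0, duration_s, period_s):      # $every(period_s)
        for sensor in sensor_list:
            level = read_level(sensor, t)          # one new virtual record
            if level > threshold:                  # WHERE ... > 50
                answers.append((t, sensor, level))
    return answers


def fake_reading(sensor: str, t: int) -> float:
    """Made-up rainfall data: each sensor rises slowly over time."""
    base = {"s1": 40.0, "s2": 55.0}[sensor]
    return base + t / 10.0


answers = run_long_running_query(["s1", "s2"], fake_reading)
for row in answers:
    print(row)
```

In this run only `s2` ever exceeds the threshold, so the answer contains one record per period for `s2` and none for `s1`.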

14 Query Processing over Device Database System Performance Metrics Traditional performance metrics  Throughput --- average number of queries processed per unit of time New performance metrics  Resource Usage --- The total amount of energy consumed by the devices when executing a query.  Response time --- time needed by the system to produce all answer records to a query.  Reaction Time --- The interval between the time a function, called on devices, returns the value and the time the corresponding answer is produced on the front-end.

15 Distributed Query Execution Plans Query --- Retrieve every 30 seconds the rainfall level if it is greater than 50 mm. SELECT VR.value FROM VRFSensorsGetRainfallLevel VR, RFSensors R WHERE VR.Sensor = R.Sensor AND VR.value > 50 AND $every(30)

16 Plan T  Data extracted from the devices are materialized in the relation VR, which is located on the front-end.  Both R and VR are on the front-end, and the join is executed on the front-end.  Relation R is joined with relation VR (using join condition VR.Sensor = R.Sensor AND VR.value > 50).

17 Plan A  It is a simple tree where R is joined on the front-end with relation VR partitioned across a set of devices.  The front-end asks each device to measure the rainfall level and to transfer the resulting virtual records back to the front-end.  Each virtual record arriving on the front-end is then joined with relation R.  Disadvantage --- All devices with rainfall sensors transmit data to the front-end, while the query only concerns the sensors that measure a rainfall level greater than 50.

18 Plan B  Define a semi-join between R and the partitions of VR located on the devices. The semi-join projects out the joining attribute from R (here the device ID Sensor) and sends it to all devices.  On the devices, whenever the rainfall level is measured, a virtual record is generated and joined with the portion of relation R sent by the front-end (using joining condition R.Sensor = VR.Sensor AND VR.value > 50).  If the joining condition is satisfied, the virtual record is sent back to the front-end to be joined with the complete records from relation R.

19 Plan C  It only pushes the selection (VR.value > 50) onto the device. Only records that verify the condition are sent back to the front-end where they are joined with relation R.  Compared to Plan B, there is no subset of relation R transmitted to the devices.
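
The difference between Plans A, B and C comes down to which virtual records leave the devices. The following Python sketch counts them for a toy data set; the function name `uplink_records`, the readings and the contents of R are invented for illustration, and Plan T is left out because it differs from Plan A only in materializing VR on the front-end.

```python
from typing import Dict, List, Set

THRESHOLD_MM = 50.0  # the selection VR.value > 50 from the query


def uplink_records(readings: Dict[str, List[float]],
                   r_sensors: Set[str]) -> Dict[str, int]:
    """Count the virtual records each plan sends from devices to the front-end."""
    # Plan A: every measurement is shipped; join and selection run on the front-end.
    plan_a = sum(len(values) for values in readings.values())
    # Plan B: the device holds the Sensor column of R and applies the full
    # join condition (R.Sensor = VR.Sensor AND VR.value > 50) locally.
    plan_b = sum(
        1
        for device, values in readings.items()
        if device in r_sensors
        for value in values
        if value > THRESHOLD_MM
    )
    # Plan C: only the selection is pushed down; the join stays on the front-end.
    plan_c = sum(
        1 for values in readings.values() for value in values
        if value > THRESHOLD_MM
    )
    return {"A": plan_a, "B": plan_b, "C": plan_c}


readings = {
    "s1": [10.0, 60.0, 70.0],
    "s2": [20.0, 30.0, 40.0],
    "s3": [80.0, 90.0, 5.0],   # s3 is not listed in relation R
}
counts = uplink_records(readings, r_sensors={"s1", "s2"})
print(counts)
```

Plan B sends the fewest records here, but it also has to ship a fragment of R down to every device first, which this count does not include.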

20 Resource usage for sensors located outside a flood area With Plan A, data is sent back to the front-end whenever it is generated. With Plan B, a semi-join is pushed to the device. The condition on the rainfall level is checked on the device, and no data is sent back because the sensor is outside the flood area. Plan B pays the initial cost of transferring a fragment of relation R to the devices. This initial cost is amortized (compared to Plan A) over the lifespan of the long-running query. With Plan C, a selection is pushed to the device. The condition on the rainfall level is checked on the device, and again no data is sent back because the sensor is outside the flood area.

21 Resource usage for sensors located inside a flood area With all plans, data is always sent back to the front-end. The initial cost of Plan B is never amortized here, so the curve for Plan B stays above the others as time increases. Question: Why do Plan C and Plan A have almost identical curves? Because the cost of performing a selection is low compared to the cost of sending data.
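
The reasoning on the two resource-usage slides can be checked with a toy cost model in Python. The cost constants below (`c_send`, `c_select`, `c_fragment`) are arbitrary assumed units, not measurements from the paper; the only property that matters is that a selection is much cheaper than a transmission.

```python
from typing import Dict


def plan_costs(n_samples: int, n_matching: int, c_send: float = 1.0,
               c_select: float = 0.01, c_fragment: float = 20.0) -> Dict[str, float]:
    """Per-device resource usage after n_samples measurements (assumed units)."""
    return {
        # Plan A: every sample is transmitted, nothing is evaluated locally.
        "A": n_samples * c_send,
        # Plan B: one-off transfer of the R fragment, then a local check per
        # sample, and a transmission only for matching samples.
        "B": c_fragment + n_samples * c_select + n_matching * c_send,
        # Plan C: local selection per sample, transmission of matches only.
        "C": n_samples * c_select + n_matching * c_send,
    }


outside = plan_costs(n_samples=100, n_matching=0)    # sensor outside the flood
inside = plan_costs(n_samples=100, n_matching=100)   # sensor inside the flood
print("outside:", outside)
print("inside: ", inside)
```

Outside the flood area Plan B's initial cost is eventually amortized against Plan A, and Plan C is cheapest. Inside the flood area every sample matches, so Plan B's initial cost is never recovered and Plans A and C differ only by the small selection cost.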

22 Conclusion of Plans  Pushing a selection, as in Plan C, is optimal. This is intuitive, since the query filters out uninteresting events generated on the devices.  Pushing the selection allows the device database system to efficiently trade increased processing on the devices for reduced communication.

23 Discussions I love the idea of using virtual relations to represent device functions. The complete query semantics over a device database are not given here. There is no avi value for each data item, so it is not really real-time based. Individual nodes are not important, and a mote's sensor may get damaged and report a wrong value, so group-based coordination should be introduced.

24 Event Detection Service

25 Motivation Sensor networks are data-centric and real-time based -- Abstraction of real-time data semantics needed Individual nodes in sensor networks are unreliable -- Group-based robust coordination needed Detection of some events relies on more than one type of sensor data -- The relationship can help to increase the reliability of data decisions

26 Data Services in sensor networks Queries (location, frequency, duration) Data/Event dissemination Data Aggregation Data-centric Storage/Caching Event Detection Data Security and Access Authorization

27 Data Service Middleware (DSWare) Data Storage: map the key to a logical node; map a logical node to multiple physical nodes. Caching: spread copies along the routing path. Comparison: Data Storage keeps static copies and provides reliability; Caching keeps variable copies and improves performance. [Figure: Services in Data Service Middleware. Labels: Application; DSWare (database-like abstraction); Real-time Scheduling; Subscription; Event Detection; Group Management; Aggregation; Data Storage; Caching; Authorization; Sensor nodes.]
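
A rough Python sketch of the two-step Data Storage mapping named on this slide: key to logical node, then logical node to several physical nodes. The slide does not say how keys are mapped, so the SHA-256 hash, the `DataStore` class and the node names are all assumptions made for illustration.

```python
import hashlib
from typing import Dict, FrozenSet, List, Optional


def logical_node_for(key: str, n_logical: int) -> int:
    """Map a key to a logical node id (assumed scheme: hash modulo node count)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % n_logical


class DataStore:
    """Each logical node is backed by several physical nodes (replicas)."""

    def __init__(self, replicas: Dict[int, List[str]]) -> None:
        # replicas must use the logical ids 0 .. len(replicas) - 1 as keys.
        self.replicas = replicas
        self.storage: Dict[str, Dict[str, float]] = {
            node: {} for nodes in replicas.values() for node in nodes
        }

    def put(self, key: str, value: float) -> int:
        logical = logical_node_for(key, len(self.replicas))
        for node in self.replicas[logical]:      # static copy on every replica
            self.storage[node][key] = value
        return logical

    def get(self, key: str, failed: FrozenSet[str] = frozenset()) -> Optional[float]:
        logical = logical_node_for(key, len(self.replicas))
        for node in self.replicas[logical]:      # first surviving replica answers
            if node not in failed and key in self.storage[node]:
                return self.storage[node][key]
        return None


store = DataStore({0: ["n1", "n2"], 1: ["n3", "n4"]})
logical = store.put("temperature/room-7", 21.5)
first_replica = store.replicas[logical][0]
print(store.get("temperature/room-7", failed=frozenset({first_replica})))
```

The read still succeeds after one physical node fails, which is the reliability benefit the slide attributes to static copies.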

28 Problems with current event detection schemes An external node collects reports of atomic events and determines whether the compound event occurs. [Figure: atomic event reports about an explosion are sent to an external node, which determines the occurrence of compound events.]  Reduces possible in-network processing and creates unnecessary concentrated traffic around the decision node.  Increases detection delay (unacceptable for some time-critical applications).

29 Event Detection Service in DSWare Event: an activity in the environment that is of interest to the application and that can be monitored or detected. Example: an explosion is detected in the area through high temperature, light intensity change and acoustic changes. Hierarchy of events  Atomic event: detected through a single sensor's observation, e.g. high temperature, light intensity change, acoustic change.  Compound event: consists of a set of atomic events and is detected based on the detection of those atomic events, e.g. explosion.

30 Event Detection Scheme in DSWare Confidence  Every compound event detection report has a confidence value, which indicates the reliability of the report.  The confidence function is designed based on data semantics: relative importance of different atomic sub-events; temporal continuity of events; statistical models; similarity among adjacent regions. Waiting Time Window  The time that an aggregation node waits for the arrival of all possible atomic event reports.  When the time window (TW) expires, a compound event is reported if the confidence value reaches the minimum confidence required for this event.  Avoids endless waiting when messages are lost.  Enables event detection based on the partial information collected.
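
The waiting time window can be sketched in Python as follows. This is an assumed, simplified model: reports are `(arrival_time, sub_event)` pairs, the decision is made once at window expiry, and the confidence function used in the example (fraction of expected sub-events received) is a placeholder rather than the one designed in DSWare.

```python
from typing import Callable, Iterable, Optional, Set, Tuple


def detect_in_window(
    reports: Iterable[Tuple[float, str]],
    window_start: float,
    window_len: float,
    confidence: Callable[[Set[str]], float],
    min_confidence: float,
) -> Optional[float]:
    """Return the confidence if the compound event is reported at timeout, else None."""
    window_end = window_start + window_len
    # Only reports that arrive before the window expires are considered.
    received = {name for t, name in reports if window_start <= t < window_end}
    value = confidence(received)
    return value if value >= min_confidence else None


EXPECTED = {"T", "L", "A"}  # sub-events of the compound event


def fraction_received(received: Set[str]) -> float:
    """Placeholder confidence: share of expected sub-events that arrived."""
    return len(received & EXPECTED) / len(EXPECTED)


# The L report arrives after the 5-second window has expired (or was lost).
reports = [(0.5, "T"), (1.2, "A"), (9.0, "L")]
lenient = detect_in_window(reports, 0.0, 5.0, fraction_received, min_confidence=0.6)
strict = detect_in_window(reports, 0.0, 5.0, fraction_received, min_confidence=0.8)
print(lenient, strict)
```

The group leader decides at timeout with whatever has arrived, so a lost report lowers the confidence instead of blocking detection indefinitely.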

31 A Simple Example: Explosion (E) Sub-events: high temperature (T), special light (L), acoustic changes (A). Confidence function: f = [0.6 * BOOL(T) + 0.3 * BOOL(L) + 0.3 * BOOL(A)] * h (h: history factor, which increases if the explosion event has been detected in the previous waiting time window. Assume 1≤h≤2). Minimum confidence: 0.8. [Figure: timeline at the group leader showing reports of T, A and L arriving within a shifting time window, with one L report lost. Annotated confidence values: f=0.3h (no report), f=0.6h, f=0.9h (report E), f=1.2h (report E).]
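
A worked version of this example in Python. The additive weights (0.6 for T, 0.3 for L, 0.3 for A) are an assumption inferred from the confidence values annotated in the slide's figure (0.3h, 0.6h, 0.9h, 1.2h); the function names are hypothetical.

```python
import math
from typing import Iterable

WEIGHTS = {"T": 0.6, "L": 0.3, "A": 0.3}   # assumed from the figure annotations
MIN_CONFIDENCE = 0.8                        # minimum confidence from the slide


def confidence(received: Iterable[str], h: float = 1.0) -> float:
    """f = [0.6*BOOL(T) + 0.3*BOOL(L) + 0.3*BOOL(A)] * h, with 1 <= h <= 2."""
    if not 1.0 <= h <= 2.0:
        raise ValueError("history factor h must satisfy 1 <= h <= 2")
    return sum(WEIGHTS[event] for event in set(received)) * h


def should_report(received: Iterable[str], h: float = 1.0) -> bool:
    """Report the explosion if the confidence reaches the minimum."""
    return confidence(received, h) >= MIN_CONFIDENCE


for events in (["L"], ["T"], ["T", "A"], ["T", "A", "L"]):
    print(events, round(confidence(events), 2), should_report(events))
print(["T"], "with h=1.5:", should_report(["T"], h=1.5))
```

With h = 1, T alone (0.6) is not enough, while T plus A (0.9) is. If the explosion was detected in the previous window and h has grown to 1.5, T alone already reaches 0.9 and triggers a report.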

32 Some other issues in event detection Temporal resolution –Some events last much longer than the sensing interval of a sensor. So probably some applications will report a single event repetitively, which is unnecessary. Spatial resolution –If the size of a detection group is too small compared to the event, there might be several groups in this event’s coverage that will report the same event.

33 Performance in Reduction of Communication Baseline: --Only one report of an environment property is generated from a group during each sensing interval. --All reports are sent to an outside node, and the entire analysis is done there. DSWare requires less communication than the baseline.

34 Performance in Differentiating Events and Event-like Factors How can a repeated report of an event be differentiated from an event-like factor? How does the performance vary with different time window sizes and different minimum confidence values?

35 Discussions The idea of an event detection service is well developed and thoroughly discussed. In DSWare, data is replicated on multiple physical nodes that are mapped to a single logical node, so consistency among these nodes is a key issue. The paper mentions "weak consistency", but what is the definition of "weak consistency" in a sensor network? Since multiple physical nodes are mapped to a single logical node, why is data caching needed? What are the different purposes of introducing both? It is mentioned that an application can specify the actual scheduling scheme in the sensor network based on its most important concerns. But is that a good thing for an application to do? It does not seem to be a simple task.

36 Discussions --- (cont.) What is the position of real-time scheduling in the system? How is real-time behavior provided? Two questions about Fig 5.  How can a repeated report of an event be differentiated from an event-like factor?  How does the performance vary with different time window sizes and different minimum confidence values? A small typo:  In the last sentence before 5.1, “an explosion event will be reported if the Confidence_E is not less than 0.9” should be “an explosion event will be reported if the Confidence_E is no less than 0.9”