1
HiFi: Network-centric Query Processing in the Physical World
SAP Research Forum, February 2005
Mike Franklin, UC Berkeley
2
Mike Franklin UC Berkeley EECS
Introduction
Receptors everywhere!
  Wireless sensor networks, RFID technologies, digital homes, network monitors, ...
Large-scale deployments will take the form of High Fan-In Systems
3
High Fan-in Systems
Large numbers of receptors = large data volumes
Hierarchical, successive aggregation
The “Bowtie”
4
High Fan-in Example (SCM)
[Figure: supply-chain hierarchy. Labels: Receptors; Dock doors, Shelves; Warehouses, Stores; Regional Centers; Headquarters]
5
Properties
High Fan-In, globally-distributed architecture.
Large data volumes generated at edges.
  Filtering and cleaning must be done there.
Successive aggregation as you move inwards.
  Summaries/anomalies continually, details later.
Strong temporal focus.
Strong spatial/geographic focus.
Streaming data and stored data.
Integration within and across enterprises.
6
Design Space: Time
Filtering, Cleaning, Alerts
Monitoring, Time-series
Data mining (recent history)
Archiving (provenance and schema evolution)
[Figure: axis “Time Scale”, from seconds to years. Labels: On-the-fly processing; Disk-based processing; Stream/Disk Processing]
7
Design Space: Geography
Filtering, Cleaning, Alerts
Monitoring, Time-series
Data mining (recent history)
Archiving (provenance and schema evolution)
[Figure: axis “Geographic Scope”, from local to global. Labels: Several Readers; Regional Centers; Central Office]
8
Design Space: Resources
Filtering, Cleaning, Alerts
Monitoring, Time-series
Data mining (recent history)
Archiving (provenance and schema evolution)
[Figure: axis “Individual Resources”, from tiny to huge. Labels: Devices; Stargates/Desktops; Clusters/Grids]
9
Design Space: Data
Filtering, Cleaning, Alerts
Monitoring, Time-series
Data mining (recent history)
Archiving (provenance and schema evolution)
[Figure: axes “Degree of Detail” and “Aggregate Data Volume”. Labels: Dup Elim (history: hrs); Interesting Events (history: days); Trends/Archive (history: years)]
10
State of the Art
Current approaches: hand-coded, script-based
  Expensive, one-off, brittle, hard to deploy and keep running
Piecemeal/stovepipe systems
  Each type of receptor (RFID, sensors, etc.) handled separately
Standards efforts not addressing this:
  Protocol design bent
  Different “data models” at each level
  Reinventing “query languages” at each level
No end-to-end, integrated middleware for managing distributed receptor data
11
HiFi
A data management infrastructure for high fan-in environments
Uniform Declarative Framework
  Every node is a data stream processor that speaks SQL-ese
  Stream-oriented queries at all levels
Hierarchical, stream-based views as an organizing principle
12
Why Declarative? (database dogma)
Independence: data, location, platform
  Allows the system to adapt over time
Many optimization opportunities
  In a complex system, automatic optimization is key.
  Also, optimization across multiple applications.
Simplifies Programming ???
13
Building HiFi
14
Integrating RFID & Sensors (the “loudmouth” query)
15
A Tale of Two Systems
TinyDB
  Declarative query processing for wireless sensor networks
  In-network aggregation
  Released as part of TinyOS Open Source Distribution
TelegraphCQ
  Data stream processor
  Continuous, adaptive query processing with aggressive sharing
  Built by modifying PostgreSQL
  Open source “beta” release out now; new release soon
16
The Network is the Database: TinyDB
Basic idea: treat the sensor net as a “virtual table”.
System hides details/complexities of devices, changing topologies, failures, …
System is responsible for efficient execution.
Developed on TinyOS/Motes
http://telegraph.cs.berkeley.edu/tinydb
Example query:
  SELECT MAX(mag)
  FROM sensors
  WHERE mag > thresh
  SAMPLE PERIOD 64ms
[Figure: App, TinyDB, and Sensor Network. Labels on the arrows: “Query, Trigger” and “Data”]
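The in-network aggregation idea behind the MAX(mag) query can be sketched roughly as follows. This is a minimal illustration in Python, not TinyDB code: the `Mote` class, the routing-tree shape, and the `partial_max` helper are all assumptions made for the example. Each node forwards only one partial maximum to its parent instead of every raw reading.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Mote:
    """Hypothetical sensor node: one local reading plus its children in a routing tree."""
    mag: float
    children: List["Mote"] = field(default_factory=list)


def partial_max(node: Mote, thresh: float) -> Optional[float]:
    """Return MAX(mag) over this subtree, counting only readings with mag > thresh.

    Returns None when no reading in the subtree passes the filter.
    """
    candidates = [node.mag] if node.mag > thresh else []
    for child in node.children:
        child_max = partial_max(child, thresh)
        if child_max is not None:
            # Only a single value per child travels up the tree.
            candidates.append(child_max)
    return max(candidates) if candidates else None


# Assumed example topology: a root with three children, one of which has a child.
root = Mote(3.0, [Mote(7.5, [Mote(2.0)]), Mote(9.1), Mote(1.0)])
result = partial_max(root, thresh=5.0)
```

Because MAX can be combined from partial results, each link carries one value per sample period regardless of how many nodes sit below it.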
17
TelegraphCQ: Data Stream Monitoring
Streaming Data
  Network monitors
  Sensor Networks, RFID
  News feeds, Stock tickers, …
B2B and Enterprise apps
  Trade Reconciliation, Order Processing, etc.
(Quasi) real-time flow of events and data
Manage these flows to drive business processes.
Can mine flows to create and adjust business rules.
Can also “tap into” flows for on-line analysis.
http://telegraph.cs.berkeley.edu
18
Data Stream Processing
[Figure: Traditional Database compared with Data Stream Processor. Labels: Queries, Data, Result Tuples]
Data streams are unending
Continuous, long running queries
Real-time processing
19
Windowed Queries
A typical streaming query:
  SELECT S.city, AVG(temp)
  FROM SOME_STREAM S [range by '5 seconds' slide by '5 seconds']
  WHERE S.state = 'California'
  GROUP BY S.city
Window Clause:
  range by: “I want to look at 5 seconds worth of data”
  slide by: “I want a result tuple every 5 seconds”
[Figure: Data Stream divided into Windows, each producing Result Tuple(s)]
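The semantics of the windowed query can be approximated in a few lines of Python. This is only a sketch under stated assumptions: the tuple layout `(timestamp_sec, state, city, temp)` and the function name `tumbling_avg` are invented for illustration, and the code processes a finite list in one pass, whereas a stream processor would emit each window's results as the window closes. With range and slide both set to 5 seconds, the windows do not overlap.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumed tuple layout: (timestamp_sec, state, city, temp)
Reading = Tuple[float, str, str, float]


def tumbling_avg(readings: Iterable[Reading],
                 width: float = 5.0,
                 state: str = "California") -> List[Tuple[float, str, float]]:
    """Return (window_start, city, avg_temp) for each non-overlapping window."""
    sums: Dict[Tuple[float, str], List[float]] = defaultdict(lambda: [0.0, 0])
    for ts, st, city, temp in readings:
        if st != state:                          # WHERE S.state = 'California'
            continue
        window_start = (ts // width) * width     # window this reading falls into
        acc = sums[(window_start, city)]         # GROUP BY S.city, per window
        acc[0] += temp
        acc[1] += 1
    return [(w, c, total / n) for (w, c), (total, n) in sorted(sums.items())]


sample = [
    (0.5, "California", "Berkeley", 60.0),
    (1.5, "California", "Berkeley", 62.0),
    (2.0, "Oregon", "Portland", 50.0),
    (6.0, "California", "Berkeley", 64.0),
    (7.0, "California", "Oakland", 66.0),
]
averages = tumbling_avg(sample)
```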
20
TelegraphCQ Architecture
[Figure: architecture diagram. Components labeled: Proxy; TelegraphCQ Front End (Listener, Parser, Planner, Mini-Executor, Catalog); TelegraphCQ Wrapper ClearingHouse (Wrappers); Shared Memory (Query Plan Queue, Eddy Control Queue, Query Result Queues); Shared Memory Buffer Pool; Disk; TelegraphCQ Back End (Modules, Scans, CQEddy, Split)]
21
The HiFi System
[Figure: system diagram. Labels: TelegraphCQ; TinyDB; Stargates; Sensor Networks & RFID Readers; RFID Wrappers; PC]
22
Basic HiFi Architecture
[Figure: several nodes, each labeled “HiFi Glue” and “DSQP”, plus an “MDR”]
Hierarchical federation of nodes
Each node:
  Data Stream Query Processor (DSQP)
  HiFi Glue
Views drive system functionality
Metadata Repository (MDR)
Management
Query Planning
Archiving
Internode coordination and communication
23
HiFi Processing Pipelines: The CSAVA Framework
[Figure: CSAVA pipeline with its generalization. Stages: Clean (Single Tuple); Smooth (Window); Arbitrate (Multiple Receptors); Validate (Join w/Stored Data); Analyze (On-line Data Mining)]
24
CSAVA Processing: Clean
  CREATE VIEW cleaned_rfid_stream AS
    (SELECT receptor_id, tag_id
     FROM rfid_stream rs
     WHERE read_strength >= strength_T)
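The Clean view above is a per-tuple filter, which maps naturally onto a generator. The following Python sketch shows the same logic; the `RfidRead` record type and the threshold value used in the example are assumptions for illustration, not part of HiFi.

```python
from typing import Iterable, Iterator, NamedTuple, Tuple


class RfidRead(NamedTuple):
    """Assumed shape of one raw tuple in rfid_stream."""
    receptor_id: str
    tag_id: str
    read_strength: float


def clean(stream: Iterable[RfidRead], strength_t: float) -> Iterator[Tuple[str, str]]:
    """Yield (receptor_id, tag_id) for reads at or above the strength threshold."""
    for read in stream:
        if read.read_strength >= strength_t:   # WHERE read_strength >= strength_T
            yield (read.receptor_id, read.tag_id)


reads = [
    RfidRead("r1", "t1", 0.9),
    RfidRead("r1", "t2", 0.2),   # weak read, dropped
    RfidRead("r2", "t1", 0.5),
]
cleaned = list(clean(reads, strength_t=0.5))
```

Since the filter looks at one tuple at a time and keeps no state, it can run at the edge, close to the reader.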
25
CSAVA Processing: Smooth (pipeline so far: Clean, Smooth)
  CREATE VIEW smoothed_rfid_stream AS
    (SELECT receptor_id, tag_id
     FROM cleaned_rfid_stream [range by '5 sec', slide by '5 sec']
     GROUP BY receptor_id, tag_id
     HAVING count(*) >= count_T)
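The Smooth step counts how often each (receptor, tag) pair appears within one window and keeps only pairs seen often enough. A minimal Python sketch of that per-window logic follows; the function name `smooth` and the sample data are assumptions, and windowing itself (cutting the stream into 5-second slices) is taken as already done.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Pair = Tuple[str, str]  # (receptor_id, tag_id)


def smooth(window: Iterable[Pair], count_t: int) -> List[Pair]:
    """Keep pairs that occur at least count_t times in this window."""
    counts = Counter(window)                  # GROUP BY receptor_id, tag_id
    return sorted(pair for pair, n in counts.items() if n >= count_t)  # HAVING


one_window = [("r1", "t1"), ("r1", "t1"), ("r1", "t1"), ("r1", "t2"), ("r2", "t1"), ("r2", "t1")]
smoothed = smooth(one_window, count_t=2)
```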
26
CSAVA Processing: Arbitrate (pipeline so far: Clean, Smooth, Arbitrate)
  CREATE VIEW arbitrated_rfid_stream AS
    (SELECT receptor_id, tag_id
     FROM smoothed_rfid_stream rs [range by '5 sec', slide by '5 sec']
     GROUP BY receptor_id, tag_id
     HAVING count(*) >= ALL
       (SELECT count(*)
        FROM smoothed_rfid_stream [range by '5 sec', slide by '5 sec']
        WHERE tag_id = rs.tag_id
        GROUP BY receptor_id))
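The Arbitrate query assigns each tag to the receptor that saw it most often in the window; the `>= ALL` condition means that tied receptors are all kept. The Python sketch below mirrors that logic for a single window. The names `arbitrate`, `shelf0`, and `shelf1` are illustrative assumptions.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

Pair = Tuple[str, str]  # (receptor_id, tag_id)


def arbitrate(window: Iterable[Pair]) -> List[Pair]:
    """Keep, for every tag, the receptor(s) with the highest read count."""
    counts = Counter(window)                       # count(*) per (receptor, tag)
    best: Dict[str, int] = defaultdict(int)
    for (_, tag), n in counts.items():
        best[tag] = max(best[tag], n)              # highest count seen for this tag
    # count(*) >= ALL (...) holds exactly when the count equals the per-tag maximum.
    return sorted((rec, tag) for (rec, tag), n in counts.items() if n == best[tag])


one_window = (
    [("shelf0", "A")] * 3 + [("shelf1", "A")]      # A: shelf0 wins 3 to 1
    + [("shelf1", "B")] * 2                        # B: only seen by shelf1
    + [("shelf0", "C"), ("shelf1", "C")]           # C: tie, both kept
)
arbitrated = arbitrate(one_window)
```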
27
CSAVA Processing: Validate (pipeline so far: Clean, Smooth, Arbitrate, Validate)
  CREATE VIEW validated_tags AS
    (SELECT tag_name
     FROM arbitrated_rfid_stream rs [range by '5 sec', slide by '5 sec'],
          known_tag_list tl
     WHERE tl.tag_id = rs.tag_id)
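The Validate step is a join between the arbitrated stream and a stored table of known tags, so unknown tag IDs drop out. A minimal Python sketch, assuming the stored table fits in a dictionary keyed by `tag_id` (the names `validate` and `known_tag_list` contents are made up for the example):

```python
from typing import Dict, Iterable, List, Tuple

Pair = Tuple[str, str]  # (receptor_id, tag_id)


def validate(window: Iterable[Pair], known_tags: Dict[str, str]) -> List[str]:
    """Return tag_name for every read whose tag_id appears in the known-tag table."""
    # Inner join on tag_id: reads with unknown tags produce no output.
    return [known_tags[tag_id] for _, tag_id in window if tag_id in known_tags]


known_tag_list = {"t1": "milk", "t2": "soap"}      # assumed stored data
one_window = [("r1", "t1"), ("r2", "t9"), ("r2", "t2")]
validated = validate(one_window, known_tag_list)
```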
28
CSAVA Processing: Analyze (pipeline: Clean, Smooth, Arbitrate, Validate, Analyze)
  CREATE VIEW tag_count AS
    (SELECT tag_name, count(*)
     FROM validated_tags vt [range by '5 min', slide by '1 min']
     GROUP BY tag_name)
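Unlike the earlier stages, the Analyze view uses overlapping windows: each result covers 5 minutes of data, and a new result is produced every minute. The Python sketch below shows one way to read that clause. The event layout `(timestamp_sec, tag_name)`, the half-open window convention, and the function name `sliding_counts` are assumptions; the slide does not specify exact boundary behavior.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Event = Tuple[int, str]  # assumed layout: (timestamp_sec, tag_name)


def sliding_counts(events: Sequence[Event],
                   range_sec: int = 300,
                   slide_sec: int = 60) -> List[Tuple[int, Dict[str, int]]]:
    """Return (window_end, counts per tag_name) for each window [end - range, end)."""
    if not events:
        return []
    last = max(ts for ts, _ in events)
    results = []
    end = slide_sec
    while True:
        in_window = [name for ts, name in events if end - range_sec <= ts < end]
        results.append((end, dict(Counter(in_window))))   # GROUP BY tag_name
        if end > last:            # the latest event has been covered at least once
            break
        end += slide_sec          # advance by the slide, not by the range
    return results


sample = [(10, "milk"), (70, "milk"), (130, "soap"), (310, "milk")]
windows = sliding_counts(sample)
```

Each event contributes to up to five consecutive results here, which is the difference from the non-overlapping 5-second windows used by the Smooth and Arbitrate views.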
29
Ongoing Work
Bridging the physical-digital divide
  VICE: A “Virtual Device” Interface
Hierarchical query processing
  Automatic Query planning & dissemination
Complex event processing
  Unifying event and data processing
30
Virtual Device (VICE) Layer
“Metaphysical* Data Independence”
*The branch of philosophy that deals with the ultimate nature of reality and existence. (name due to Shawn Jeffery)
31
The Virtues of VICE
A simple RFID Experiment
  2 adjacent shelves, 8 ft each
  10 EPC-tagged items each, plus 5 moved between them.
  RFID antenna on each shelf.
32
Ground Truth
33
Raw RFID Readings
34
After VICE Processing
Under the covers (in this case): Cleaning, Smoothing, and Arbitration
35
Other VICE Uses
Once you have the right abstractions:
  “Soft Sensors”
  Quality and lineage streams
  Pushdown of external validation information
  Power management and other optimizations
  Data Archiving
  Model-based sensing
  “Non-declarative” code
  …
36
Hierarchical Query Processing
“I provide raw readings for Soda Hall”
“I provide avg daily values for Berkeley”
“I provide avg weekly values for California”
“I provide national monthly values for the US”
Continuous and Streaming
Automatic placement and optimization
Hierarchical
Temporal granularity vs. geographic scope
Sharing of lower-level streams
37
Complex Event Processing
Needed for monitoring and actuation
Key to prioritization (e.g., of detail data)
Exploit duality of data and events
Shared Processing
“Semantic Windows”
Challenge: a single system that simultaneously handles events spanning seconds to years.
38
Next Steps
Archiving and Detail Data
  Dealing with transient overloads
  Rate matching between stored and streaming data
  Scheduling large archive transfers
System design & deployment
  Tools for provisioning and evaluating receptor networks
System monitoring & management
  Leverage monitoring infrastructure for introspection
39
Conclusions
Receptors everywhere: High Fan-In Systems
Current middleware solutions are complex & brittle
Uniform declarative framework is the key
The HiFi project is exploring this approach
Our initial prototype
  Leveraged TelegraphCQ and TinyDB
  Demonstrated RFID/multiple sensor integration
  Validated the HiFi approach
We have an ambitious on-going research agenda
See http://hifi.cs.berkeley.edu for more info.
40
Acknowledgements
Team HiFi: Shawn Jeffery, Sailesh Krishnamurthy, Frederick Reiss, Shariq Rizvi, Eugene Wu, Nathan Burkhart, Owen Cooper, Anil Edakkunni
Experts in VICE: Gustavo Alonso, Wei Hong, Jennifer Widom
Funding and/or Reduced-Price Gizmos from NSF, Intel, UC MICRO program, and Alien Technologies