Network Computing Laboratory HiFi Systems: Network-Centric Query Processing for the Physical World Michael J. Franklin, Shawn R. Jeffery, et al. UC Berkeley.

Presentation transcript:

Network Computing Laboratory
HiFi Systems: Network-Centric Query Processing for the Physical World
Michael J. Franklin, Shawn R. Jeffery, et al.
UC Berkeley TelegraphCQ Team
2nd CIDR Conf., 2005

Korea Advanced Institute of Science and Technology
Table of Contents
One-line Comment
Motivating Scenario
HiFi System with CSAVA processing stages
Internal Architecture of a HiFi Node
Critiques
New Ideas 1 and 2

One-line Comment
This is preliminary work describing the group's vision of distributing their TelegraphCQ system across a hierarchical network.

Motivating Scenario: Supply Chain Management
"Smart shelves" continuously monitor item addition and removal.
The information is sent back through the supply chain.

High Fan-In System
[Figure: a high fan-in system. Labels: Ursa-Minor (TinyDB-based); Ursa-Major (TelegraphCQ w/ Archiving); Mid-tier Stargate; Mid-tier Processing Node.]

Characteristics of HiFi Systems
High fan-in, globally distributed architecture
Large data volumes generated at the edges
Filtering and cleaning must be done there
Successive aggregation as you move inwards
Summaries/anomalies continually, details later
Strong temporal focus
Strong spatial/geographic focus
Streaming data and stored data
Integration within and across enterprises

A View on this Example
[Figure. Processing labels: Filtering, Cleaning, Alerts; Monitoring, Time-series; Data mining (recent history); Archiving (provenance and schema evolution). Geographic scope, from local to global: Several Readers, Regional Centers, Central Office.]

High fan-in system levels with associated CSAVA processing stages
Levels (figure labels): Headquarters, Regional Centers, Warehouse, Warehouse Doors, Receptor, RFID
Clean: remove anomalies
Smooth: interpolate for lost/garbled readings
Arbitrate: remove duplicates
Validate: correlate with business rules
Analyze: tactical decision support
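The first three CSAVA stages can be sketched as a small pipeline over one window of RFID readings. This is only an illustration under stated assumptions, not the HiFi implementation: the `Reading` type, the function names, the threshold values, and the rule used in `arbitrate` (assign a tag to the receptor that read it most often) are all hypothetical, and `smooth` is simplified to a count threshold rather than true interpolation.

```python
from collections import Counter
from typing import NamedTuple


class Reading(NamedTuple):
    """One RFID reading (hypothetical schema for this sketch)."""
    receptor_id: str
    tag_id: str
    strength: float


def clean(readings, strength_t):
    """Clean: drop anomalous readings whose signal strength is below strength_t."""
    return [r for r in readings if r.strength >= strength_t]


def smooth(readings, count_t):
    """Smooth (simplified): keep (receptor, tag) pairs read at least count_t times."""
    counts = Counter((r.receptor_id, r.tag_id) for r in readings)
    return {pair: n for pair, n in counts.items() if n >= count_t}


def arbitrate(pair_counts):
    """Arbitrate (assumed rule): when several receptors report the same tag,
    keep only the receptor with the most reads."""
    best = {}
    for (receptor_id, tag_id), n in pair_counts.items():
        if tag_id not in best or n > best[tag_id][1]:
            best[tag_id] = (receptor_id, n)
    return {tag_id: receptor_id for tag_id, (receptor_id, _) in best.items()}


# Illustrative window of readings (made-up values).
window = [
    Reading("shelf1", "tagA", 0.9),
    Reading("shelf1", "tagA", 0.8),
    Reading("shelf1", "tagA", 0.7),
    Reading("shelf2", "tagA", 0.6),
    Reading("shelf2", "tagA", 0.6),
    Reading("shelf2", "tagB", 0.1),  # weak read, removed by clean
    Reading("shelf2", "tagC", 0.9),  # read only once, removed by smooth
]

locations = arbitrate(smooth(clean(window, strength_t=0.5), count_t=2))
print(locations)
```

In this toy window, tagA is reported by both shelves and is assigned to shelf1, which read it more often. Validate and Analyze depend on business rules and are left out of the sketch.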

Internal Architecture of a HiFi Node
[Figure. Components: Metadata Repository, Data Stream Processor, Cache Manager, Data Listener, Resource Manager, Query Dispatcher, Local View Manager, Query Placement Service, Query Listener, Control Manager, Data Disseminator, Query Planner, DSP Manager, Archive Manager, Logical Query Planner, Physical Query Planner, HiFi Glue. Legend: Data Flow, Query Flow, Control Flow.]

Critiques
Strong points:
They classify and formulate five distinct data processing stages.
They developed a prototype system (in VLDB 05).
Weak points:
Designing the MDR is critical, but no initial effort has been made.
No new system requirements are identified.
The solutions are not technically deep.

New Idea 1
[Figure. Components: Data Source, CQ engine, Web Server, SP Accel, Clients. Labels: Filtered out, By-passing, Buffering.]

New Idea 1: related to SPAccel
Designing a front-end component (cache??)
Filtering out unwanted input data
By-passing data matching query predicates
Buffering data for windowed queries (views) or distributed queries
Buffering query results
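One possible reading of the front-end component is a router that sends each incoming tuple down one of three paths. The sketch below is hypothetical: the `FrontEnd` class, its predicates, and the choice of a fixed-size `deque` as the window buffer are assumptions made for illustration and are not taken from SPAccel or HiFi.

```python
from collections import deque


class FrontEnd:
    """Hypothetical front-end placed in front of a continuous-query engine."""

    def __init__(self, wanted, bypass_predicate, window_size):
        self.wanted = wanted                      # is the tuple relevant to any query?
        self.bypass_predicate = bypass_predicate  # does it already satisfy a simple query?
        self.buffer = deque(maxlen=window_size)   # most recent tuples for windowed queries
        self.bypassed = []                        # tuples delivered straight to clients
        self.dropped = 0                          # count of filtered-out tuples

    def offer(self, tup):
        """Route one tuple and report which path it took."""
        if not self.wanted(tup):
            self.dropped += 1
            return "filtered"
        if self.bypass_predicate(tup):
            self.bypassed.append(tup)
            return "bypassed"
        self.buffer.append(tup)  # the oldest tuple is evicted once the window is full
        return "buffered"


# Illustrative predicates and tuples (made-up values).
front_end = FrontEnd(
    wanted=lambda t: t["strength"] >= 0.5,
    bypass_predicate=lambda t: t["tag_id"] == "tagA",
    window_size=2,
)
routes = [
    front_end.offer(t)
    for t in [
        {"tag_id": "tagA", "strength": 0.9},
        {"tag_id": "tagB", "strength": 0.2},
        {"tag_id": "tagC", "strength": 0.8},
        {"tag_id": "tagD", "strength": 0.7},
        {"tag_id": "tagE", "strength": 0.6},
    ]
]
print(routes)
```

The fixed-size buffer evicts the oldest tuple first, which is only one of many possible replacement policies.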

Issues Expected
Cache replacement mechanism
How to index cached elements
What to cache? How much?

New Idea 2: processing stream data for OLAP queries

                     OLTP                            OLAP
Users                clerk, IT professional          knowledge worker
Function             day-to-day operations           decision support
DB design            application-oriented            subject-oriented
Data                 current, up-to-date;            historical, summarized,
                     detailed, flat relational;      multidimensional;
                     isolated                        integrated, consolidated
Usage                repetitive                      ad-hoc
Access               read/write,                     lots of scans
                     index/hash on prim. key
Unit of work         short, simple transaction       complex query
# Records accessed   tens                            millions
# Users              thousands                       hundreds
DB size              100MB-GB                        100GB-TB
Metric               transaction throughput          query throughput/response

A Sample Data Cube
[Figure: a data cube with three dimensions. Date: 1Q, 2Q, 3Q, 4Q, sum. Product: CD, video, camera, sum. Country: USA, Canada, Mexico, sum.]

New Idea 2: stream data in terms of the OLAP domain
OLAP queries:
are inherently multidimensional
span a long time
need data from multiple sources
Processing OLAP queries is:
memory intensive
computation intensive

Naïve Solution
Pre-compute popular computation paths
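A rough sketch of what pre-computing popular computation paths could look like for a three-dimensional cube: aggregates are materialized only for a chosen set of group-bys as facts arrive. The fact schema, the `sales` measure, the sample values, and the choice of "popular" group-bys are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

DIMENSIONS = ("country", "product", "quarter")


def all_group_bys(dimensions):
    """Every subset of the dimensions, i.e. the full lattice of cuboids."""
    return [c for k in range(len(dimensions) + 1) for c in combinations(dimensions, k)]


def precompute(facts, group_bys):
    """Materialize SUM(sales) for each requested group-by (a tuple of dimension names)."""
    cube = {dims: defaultdict(int) for dims in group_bys}
    for fact in facts:
        for dims in group_bys:
            key = tuple(fact[d] for d in dims)
            cube[dims][key] += fact["sales"]
    return cube


# Illustrative facts (made-up values).
facts = [
    {"country": "USA", "product": "CD", "quarter": "1Q", "sales": 10},
    {"country": "USA", "product": "video", "quarter": "1Q", "sales": 5},
    {"country": "Canada", "product": "CD", "quarter": "2Q", "sales": 7},
    {"country": "Mexico", "product": "camera", "quarter": "2Q", "sales": 3},
]

# Only the group-bys assumed to be popular are materialized, not the whole lattice.
popular = [("country",), ("product", "quarter"), ()]
cube = precompute(facts, popular)

print(cube[("country",)][("USA",)])  # total sales for USA
print(cube[()][()])                  # grand total
```

With three dimensions the full lattice has eight cuboids; materializing only the popular ones trades query coverage for memory and computation, the two costs the preceding slide names.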

Supplementary Slide
Cleaning:
    CREATE VIEW cleaned_rfid_stream AS (
        SELECT receptor_id, tag_id
        FROM rfid_stream rs
        WHERE read_strength >= strength_T)
Smoothing:
    CREATE VIEW smoothed_rfid_stream AS (
        SELECT receptor_id, tag_id
        FROM cleaned_rfid_stream
        GROUP BY receptor_id, tag_id
        HAVING count(*) >= count_T)
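The cleaning and smoothing views can be tried out on a static table using Python's built-in `sqlite3` module. This is only an approximation under stated assumptions: an ordinary table stands in for a single window of `rfid_stream`, SQLite is not a stream engine, and the values chosen for `strength_T` and `count_T` and the sample rows are made up.

```python
import sqlite3

STRENGTH_T = 0.5  # assumed value for strength_T
COUNT_T = 2       # assumed value for count_T

conn = sqlite3.connect(":memory:")

# A plain table stands in for one window of the stream.
conn.execute(
    "CREATE TABLE rfid_stream (receptor_id TEXT, tag_id TEXT, read_strength REAL)"
)
conn.executemany(
    "INSERT INTO rfid_stream VALUES (?, ?, ?)",
    [
        ("shelf1", "tagA", 0.9),
        ("shelf1", "tagA", 0.8),
        ("shelf1", "tagB", 0.2),  # below the strength threshold
        ("shelf1", "tagB", 0.9),  # only one strong read of tagB remains
        ("shelf2", "tagC", 0.7),  # read only once
    ],
)

# Views cannot take bound parameters, so the constants are inlined.
conn.execute(
    "CREATE VIEW cleaned_rfid_stream AS "
    "SELECT receptor_id, tag_id FROM rfid_stream rs "
    f"WHERE read_strength >= {STRENGTH_T}"
)
conn.execute(
    "CREATE VIEW smoothed_rfid_stream AS "
    "SELECT receptor_id, tag_id FROM cleaned_rfid_stream "
    "GROUP BY receptor_id, tag_id "
    f"HAVING count(*) >= {COUNT_T}"
)

rows = conn.execute(
    "SELECT receptor_id, tag_id FROM smoothed_rfid_stream "
    "ORDER BY receptor_id, tag_id"
).fetchall()
print(rows)
```

Only the (shelf1, tagA) pair survives both views in this sample, since it is the only pair with at least two sufficiently strong reads.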