Reducing Execution Overhead in a Data Stream Manager (MPDS 2003, San Diego)

Presentation transcript:

Slide 1: Reducing Execution Overhead in a Data Stream Manager (MPDS 2003, San Diego)
Don Carney (Brown University), Uğur Çetintemel (Brown University), Mitch Cherniack (Brandeis University), Alex Rasin (Brown University), Michael Stonebraker (MIT), Stan Zdonik (Brown University)

Slide 2: Aurora from the Sky
[Figure; labels: Queries, App, QoS]

Slide 3: Aurora from the Sky
[Figure; labels: App, QoS]

Slide 4: Runtime Operation
Basic Architecture
[Figure: architecture diagram; labels: inputs, Router, outputs, Scheduler, QoS Monitor, Box Processors, Catalog, Storage Manager, Buffer, Persistent Store, queues q1, q2, …, qi, …, qn]

Slide 5: Execution Model
- Traditional thread-driven execution
  - Thread per query or operator
  - Resource management done by OS
    - Easy to program
    - Scalability problems
- State-based execution
  - Single scheduler thread maintains execution queue
  - Small number of worker threads execute execution queue entries
  - Enables application-specific allocation of resources
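The state-based model on this slide can be sketched roughly as follows. This is a minimal illustration, not Aurora's implementation: the function name `run_state_based`, the "most pending tuples first" policy, and the representation of boxes as plain Python callables are all assumptions made for the example.

```python
import queue
import threading


def run_state_based(boxes, inputs, num_workers=2):
    """Run each box over its pending tuples using one scheduler thread
    and a small, fixed pool of worker threads.

    boxes  : dict mapping a box name to a function of one tuple (hypothetical)
    inputs : dict mapping a box name to its list of pending tuples
    """
    exec_queue = queue.Queue()   # the execution queue kept by the scheduler
    results = {}
    results_lock = threading.Lock()

    def worker():
        # Workers only execute queue entries; they make no scheduling decisions.
        while True:
            entry = exec_queue.get()
            if entry is None:    # sentinel: no more work
                break
            name, batch = entry
            out = [boxes[name](t) for t in batch]
            with results_lock:
                results.setdefault(name, []).extend(out)

    def scheduler():
        # Placeholder policy (an assumption): boxes with more pending tuples first.
        # An application-specific policy would replace this ordering.
        for name in sorted(inputs, key=lambda n: len(inputs[n]), reverse=True):
            exec_queue.put((name, inputs[name]))
        for _ in range(num_workers):
            exec_queue.put(None)

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    sched = threading.Thread(target=scheduler)
    for w in workers:
        w.start()
    sched.start()
    sched.join()
    for w in workers:
        w.join()
    return results
```

The thread count stays fixed at `num_workers + 1` however many boxes exist, in contrast with the thread-per-operator model where it grows with the query network.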

Slide 6: State-Based vs. Thread-Based

Slide 7: Scheduling
- Two-level scheduling
  - Inter-query scheduling (which query?)
  - Intra-query scheduling (operation order?)
- Batching
  - Tuple trains
    - Fewer box executions -> fewer scheduling decisions
    - Also, better memory utilization
  - Superbox scheduling
    - Multiple boxes per decision -> fewer scheduling decisions
    - Memory utilization: allocate for entire superbox at once
- State monitoring (# tuples, latencies, etc.)
  - Incremental and approximate
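One way to picture the two scheduling levels is the sketch below. It is an illustration under stated assumptions, not the scheduler described in the talk: the `Query` class, the static `priority` field, and the upstream-to-downstream box order are all hypothetical choices made for the example.

```python
from collections import deque


class Query:
    """A hypothetical query: an ordered chain of boxes plus a queue of pending tuples."""

    def __init__(self, name, boxes, priority):
        self.name = name
        self.boxes = boxes        # list of one-argument callables, upstream first
        self.priority = priority  # assumed static priority; higher runs first
        self.pending = deque()


def pick_query(queries):
    """Inter-query level: choose which query runs next."""
    ready = [q for q in queries if q.pending]
    return max(ready, key=lambda q: q.priority) if ready else None


def run_query(query, train_size):
    """Intra-query level: take a train of tuples and fix the box order."""
    n = min(train_size, len(query.pending))
    train = [query.pending.popleft() for _ in range(n)]
    for box in query.boxes:       # assumed order: upstream to downstream
        train = [box(t) for t in train]
    return train


def schedule(queries, train_size=2):
    """Repeat the two decisions until no query has pending tuples."""
    log = []
    while True:
        q = pick_query(queries)
        if q is None:
            return log
        log.append((q.name, run_query(q, train_size)))
```

Each loop iteration is one scheduling decision, so a larger `train_size` means fewer decisions for the same number of tuples.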

Slide 8: Runtime Operation
Scheduling: Minimizing Per-Tuple Processing Overhead
Train scheduling, for a box A feeding a box B with input tuples x, y, z (each item listed is one scheduler action):
- Without trains: A(x), A(y), A(z), B(A(x)), B(A(y)), B(A(z))
- Box trains: B(A(x)), B(A(y)), B(A(z))
- Tuple trains: A(z, y, x), B(A(z), A(y), A(x))
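The three cases on this slide can be compared by counting scheduler actions. The sketch below assumes boxes are plain one-argument functions arranged in a linear chain; the function names are hypothetical and chosen only for this illustration.

```python
def run_per_tuple(boxes, tuples):
    """No trains: one scheduler action per (box, tuple) pair."""
    actions = 0
    data = list(tuples)
    for box in boxes:
        out = []
        for t in data:
            actions += 1
            out.append(box(t))
        data = out
    return data, actions


def run_box_trains(boxes, tuples):
    """Box trains: one action pushes a single tuple through the whole chain."""
    actions = 0
    out = []
    for t in tuples:
        actions += 1
        for box in boxes:
            t = box(t)
        out.append(t)
    return out, actions


def run_tuple_trains(boxes, tuples):
    """Tuple trains: one action runs a single box over the whole train."""
    actions = 0
    data = list(tuples)
    for box in boxes:
        actions += 1
        data = [box(t) for t in data]
    return data, actions


# Two boxes (A then B) and three tuples, mirroring the slide's example.
A = lambda x: x + 1
B = lambda x: x * 2
for run in (run_per_tuple, run_box_trains, run_tuple_trains):
    outputs, actions = run([A, B], [1, 2, 3])
    print(run.__name__, outputs, actions)
```

All three produce the same outputs; only the number of scheduler actions differs (6, 3, and 2 for this input).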

Slide 9: Tuple Trains and Superboxes

Slide 10: Overheads

Slide 11: Overheads

Slide 12: Other Issues
- Priority assignment
- Box execution order
- QoS

Slide 13: Stay Tuned!
- SIGMOD demo
- VLDB ’03 paper “Operator Scheduling in a Data Stream Environment”

Slide 14: A little closer
[Figure; labels: App, QoS]

Slide 15: A little closer
[Figure; labels: App, QoS]

Slide 16: Aurora from the Sky
[Figure; labels: Query, App, QoS]