The Design of the Borealis Stream Processing Engine Brandeis University, Brown University, MIT Magdalena BalazinskaNesime Tatbul MIT Brown.

Slides:



Advertisements
Similar presentations
Network Resource Broker for IPTV in Cloud Computing Lei Liang, Dan He University of Surrey, UK OGF 27, G2C Workshop 15 Oct 2009 Banff,
Advertisements

Load Management and High Availability in Borealis Magdalena Balazinska, Jeong-Hyon Hwang, and the Borealis team MIT, Brown University, and Brandeis University.
Analysis of : Operator Scheduling in a Data Stream Manager CS561 – Advanced Database Systems By Eric Bloom.
Scalable Content-Addressable Network Lintao Liu
MapReduce Online Created by: Rajesh Gadipuuri Modified by: Ying Lu.
DEXA 2005 Control-based Quality Adaptation in Data Stream Management Systems (DSMS) Yicheng Tu†, Mohamed Hefeeda‡, Yuni Xia†, Sunil Prabhakar†, and Song.
The Design of the Borealis Stream Processing Engine Daniel J. Abadi1, Yanif Ahmad2, Magdalena Balazinska1, Ug ̆ur C ̧ etintemel2, Mitch Cherniack3, Jeong-Hyon.
The Design of the Borealis Stream Processing Engine CIDR 2005 Brandeis University, Brown University, MIT Kang, Seungwoo Ref.
Fault-Tolerance in the Borealis Distributed Stream Processing System Magdalena Balazinska, Hari Balakrishnan, Samuel Madden, and Michael Stonebraker MIT.
Aurora Proponent Team Wei, Mingrui Liu, Mo Rebuttal Team Joshua M Lee Raghavan, Venkatesh.
Chapter 10: Stream-based Data Management Title: Design, Implementation, and Evaluation of the Linear Road Benchmark on the Stream Processing Core Authors:
Quality-Of-Service (QoS) Panel Mitch Cherniack Brandeis David Maier OGI Rajeev Motwani Stanford Johannes GehrkeCornell Hari BalakrishnanMIT SWiM, Stanford.
P2P: Advanced Topics Filesystems over DHTs and P2P research Vyas Sekar.
Scalable Adaptive Data Dissemination Under Heterogeneous Environment Yan Chen, John Kubiatowicz and Ben Zhao UC Berkeley.
Distributed Systems Fall 2009 Replication Fall 20095DV0203 Outline Group communication Fault-tolerant services –Passive and active replication Highly.
Monitoring Streams -- A New Class of Data Management Applications Don Carney Brown University Uğur ÇetintemelBrown University Mitch Cherniack Brandeis.
Wide-area cooperative storage with CFS
Monitoring Streams -- A New Class of Data Management Applications Don Carney Brown University Uğur ÇetintemelBrown University Mitch Cherniack Brandeis.
Winter Retreat Connecting the Dots: Using Runtime Paths for Macro Analysis Mike Chen, Emre Kıcıman, Anthony Accardi, Armando Fox, Eric Brewer
Google Distributed System and Hadoop Lakshmi Thyagarajan.
Take An Internal Look at Hadoop Hairong Kuang Grid Team, Yahoo! Inc
Reliable, Robust Data Collection in Sensor Networks Murali Rangan Russell Sears Fall 2005 – Sensornet.
PETAL: DISTRIBUTED VIRTUAL DISKS E. K. Lee C. A. Thekkath DEC SRC.
Using Probabilistic Models for Data Management in Acquisitional Environments Sam Madden MIT CSAIL With Amol Deshpande (UMD), Carlos Guestrin (CMU)
CS An Overlay Routing Scheme For Moving Large Files Su Zhang Kai Xu.
The Design of the Borealis Stream Processing Engine CIDR 2005 Brandeis University, Brown University, MIT Kang, Seungwoo Ref.
Managing Service Metadata as Context The 2005 Istanbul International Computational Science & Engineering Conference (ICCSE2005) Mehmet S. Aktas
Grid Data Management A network of computers forming prototype grids currently operate across Britain and the rest of the world, working on the data challenges.
Replication Mechanisms for a Distributed Time Series Storage and Retrieval Service Mugurel Ionut Andreica Politehnica University of Bucharest Iosif Charles.
Chord & CFS Presenter: Gang ZhouNov. 11th, University of Virginia.
An Integration Framework for Sensor Networks and Data Stream Management Systems.
Master’s Thesis (30 credits) By: Morten Lindeberg Supervisors: Vera Goebel and Jarle Søberg Design, Implementation, and Evaluation of Network Monitoring.
Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications Xiaozhou Li COS 461: Computer Networks (precept 04/06/12) Princeton University.
Scalable Systems Software Center Resource Management and Accounting Working Group Face-to-Face Meeting October 10-11, 2002.
A new model and architecture for data stream management.
Introduction to dCache Zhenping (Jane) Liu ATLAS Computing Facility, Physics Department Brookhaven National Lab 09/12 – 09/13, 2005 USATLAS Tier-1 & Tier-2.
Aurora – system architecture Pawel Jurczyk. Currently used DB systems Classical DBMS: –Passive repository storing data (HADP – human-active, DBMS- passive.
Integrating Scale Out and Fault Tolerance in Stream Processing using Operator State Management Author: Raul Castro Fernandez, Matteo Migliavacca, et al.
A Collaborative Framework for Scientific Data Analysis and Visualization Jaliya Ekanayake, Shrideep Pallickara, and Geoffrey Fox Department of Computer.
Load Shedding in Stream Databases – A Control-Based Approach Yicheng Tu, Song Liu, Sunil Prabhakar, and Bin Yao Department of Computer Science, Purdue.
A Utility-based Approach to Scheduling Multimedia Streams in P2P Systems Fang Chen Computer Science Dept. University of California, Riverside
Introduction to Query Optimization, R. Ramakrishnan and J. Gehrke 1 Introduction to Query Optimization Chapter 13.
By Jeff Dean & Sanjay Ghemawat Google Inc. OSDI 2004 Presented by : Mohit Deopujari.
A new model and architecture for data stream management.
Minimizing Churn in Distributed Systems P. Brighten Godfrey, Scott Shenker, and Ion Stoica UC Berkeley SIGCOMM’06.
CS 6401 Overlay Networks Outline Overlay networks overview Routing overlays Resilient Overlay Networks Content Distribution Networks.
Control-Based Load Shedding in Data Stream Management Systems Yicheng Tu and Sunil Prabhakar Department of Computer Sciences, Purdue University April 3,
NC STATE UNIVERSITY / MCNC Protecting Network Quality of Service Against Denial of Service Attacks Douglas S. Reeves  S. Felix Wu  Fengmin Gong DARPA.
Monitoring Streams -- A New Class of Data Management Applications based on paper and talk by authors below, slightly adapted for CS561: Don Carney Brown.
By Nitin Bahadur Gokul Nadathur Department of Computer Sciences University of Wisconsin-Madison Spring 2000.
Network Computing Laboratory Load Balancing and Stability Issues in Algorithms for Service Composition Bhaskaran Raman & Randy H.Katz U.C Berkeley INFOCOM.
Danilo Florissi, Yechiam Yemini (YY), Sushil da Silva, Hao Huang Columbia University, New York, NY 10027
Control-based Quality Adaptation in Data Stream Management Systems (DSMS) Yicheng Tu†, Song Liu‡, Sunil Prabhakar†, and Bin Yao‡ † Department of Computer.
Control-Based Load Shedding in Data Stream Management Systems Yicheng Tu and Sunil Prabhakar Department of Computer Sciences, Purdue University April 3,
Online Parameter Optimization for Elastic Data Stream Processing Thomas Heinze, Lars Roediger, Yuanzhen Ji, Zbigniew Jerzak (SAP SE) Andreas Meister (University.
CONCEPTS OF REAL-TIME OPERATING SYSTEM. OBJECTIVE  To Understand Why we need OS?  To identify Types of OS  To Define Real - Time Systems  To Classify.
Towards a High Performance Extensible Grid Architecture Klaus Krauter Muthucumaru Maheswaran {krauter,
S. Sudarshan CS632 Course, Mar 2004 IIT Bombay
Architecture and Algorithms for an IEEE 802
Applying Control Theory to Stream Processing Systems
Data Stream Management System (DSMS)
湖南大学-信息科学与工程学院-计算机与科学系
Presenter Kyungho Jeon 11/17/2018.
Multimedia Data Stream Management System
Load Shedding in Stream Databases – A Control-Based Approach
Streaming Sensor Data Fjord / Sensor Proxy Multiquery Eddy
Sam Madden MIT CSAIL With Amol Deshpande (UMD), Carlos Guestrin (CMU)
TelegraphCQ: Continuous Dataflow Processing for an Uncertain World
Adaptive Query Processing (Background)
An Analysis of Stream Processing Languages
Presentation transcript:

The Design of the Borealis Stream Processing Engine Brandeis University, Brown University, MIT Magdalena BalazinskaNesime Tatbul MIT Brown

Distributed Stream Processing Data Sources Client Apps U   Node 1 Node 2 Node 3Node 4 U  Data sources push tuples continuously Operators process windows of tuples

Where are we today? Data models, operators, query languages Efficient single-site processing Single-site resource management [STREAM, TelegraphCQ, NiagaraCQ, Gigascope, Aurora] Basic distributed systems  Servers [TelegraphCQ, Medusa/Aurora]  or sensor networks [TinyDB, Cougar]  Tolerance to node failures  Simple load management

Challenges Tuple revisions  Revisions on input streams  Fault-tolerance Dynamic & distributed query optimization...

Causes for Tuple Revisions Data sources revise input streams: “On occasion, data feeds put out a faulty price [...] and send a correction within a few hours” [MarketBrowser] Temporary overloads cause tuple drops Temporary failures cause tuple drops Average price 1 hour (1pm,$10) (2pm,$12)(3pm,$11) (4:05,$11)(4:15,$10)(2:25,$9)

Current Data Model time : tuple timestamp headerdata (time,a1,...,an)

New Data Model for Revisions time : tuple timestamp type : tuple type  insertion, deletion, replacement id : unique identifier of tuple on stream headerdata (time,type,id,a1,...,an)

Revisions: Design Options Restart query network from a checkpoint Let operators deal with revisions  Operators must keep their own history  (Some) streams can keep history

Revision Processing in Borealis Closed model: revisions produce revisions Average price 1 hour (1pm,$10) (2pm,$12)(3pm,$11) (2:25,$9) Average price 1 hour (2pm,$12) (3pm,$11)(2pm,$11)

Revision Processing in Borealis Connection points (CPs) store history Operators pull the history they need  Operator CP Oldest tuple in history Most recent tuple in history Revision Input queue Stream

Fault-Tolerance through Replication Goal: Tolerate node and network failures U  Node 1’  Node 2’ Node 3’  U s1s1 s2s2 Node 1 U  Node 3 U  Node 2  s3s3 s4s4 s3’s3’

Reconciliation State reconciliation alternatives  Propagate tuples as revisions  Restart query network from a checkpoint  Propagate UNDO tuple Output stream revision alternatives  Correct individual tuples  Stream of deletions followed by insertions  Single UNDO tuple followed by insertions

Fault-Tolerance Approach If an input stream fails, find another replica No replica available, produce tentative tuples Correct tentative results after failures STABLE UPSTREAM FAILURE STABILIZING Missing or tentative inputs Failure heals Another upstream failure in progress Reconcile state Corrected output

Challenges Tuple revisions  Revisions on input streams  Fault-tolerance Dynamic & distributed query optimization...

Optimization in a Distributed SPE Goal: Optimized resource allocation Challenges:  Wide variation in resources High-end servers vs. tiny sensors  Multiple resources involved CPU, memory, I/O, bandwidth, power  Dynamic environment Changing input load and resource availability  Scalability Query network size, number of nodes

Quality of Service A mechanism to drive resource allocation Aurora model  QoS functions at query end-points  Problem: need to infer QoS at upstream nodes An alternative model  Vector of Metrics (VM) carried in tuples  Operators can change VM  A Score Function to rank tuples based on VM  Optimizers can keep and use statistics on VM

Example Application: Warfighter Physiologic Status Monitoring (WPSM) Area State Confidence Thermal Hydration Cognitive Life Signs Wound Detection 90% 60% 100% 90% 80% Physiologic Models   

Ranking Tuples in WPSM SF(VM) = VM.confidence x ADF(VM.age) age decay function Score Function Sensors Model1 Model2 ([age, confidence], value) VM Merge Models may change confidence. HRate RRate Temp

Borealis Optimizer Hierarchy End-point Monitor(s)Global Optimizer Local Monitor Local Optimizer Neighborhood Optimizer Borealis Node

Optimization Tactics Priority scheduling Modification of query plans  Commuting operators  Using alternate operator implementations Allocation of query fragments to nodes Load shedding

Correlation-based Load Distribution Goal: Minimize end-to-end latency Key ideas:  Balance load across nodes to avoid overload  Group boxes with small load correlation together  Maximize load correlation among nodes Connected Plan AB S1 CD S2 r 2r 2cr 4cr Cut Plan S1 S2 AB CD r 2r 3cr

Load Shedding Goal: Remove excess load at all nodes and links Shedding at node A relieves its descendants Distributed load shedding  Neighbors exchange load statistics  Parent nodes shed load on behalf of children  Uniform treatment of CPU and bandwidth problem Load balancing or Load shedding?

Local Load Shedding PlanLoss LocalApp 1 : 25% App 2 : 58% CPU Load = 5Cr 1, Cap = 4Cr 1 CPU Load = 4Cr 2, Cap = 2Cr 2 A must shed 20%B must shed 50% 4CC C 3C r1r1 r1r1 r2r2 r2r2 App 1 App 2 Node ANode B Goal: Minimize total loss 25% 58% CPU Load = 3.75Cr 2, Cap = 2Cr 2 B must shed 47% No overload A must shed 20% r2’r2’

Distributed Load Shedding CPU Load = 5Cr 1, Cap = 4Cr 1 CPU Load = 4Cr 2, Cap = 2Cr 2 A must shed 20%B must shed 50% 4CC C 3C r1r1 r1r1 r2r2 r2r2 App 1 App 2 Node ANode B 9% 64% PlanLoss LocalApp 1 : 25% App 2 : 58% DistributedApp 1 : 9% App 2 : 64% smaller total loss! No overload A must shed 20% r2’r2’ r 2 ’’ Goal: Minimize total loss

Extending Optimization to Sensor Nets Sensor proxy as an interface Moving operators in and out of the sensor net Adjusting sensor sampling rates  The WPSM Example: Proxy data control Merge Sensors Models control message from the optimizer

Conclusions Next generation streaming applications require a flexible processing model  Distributed operation  Dynamic result and query modification  Dynamic and scalable optimization  Server and sensor network integration  Tolerance to node and network failures Borealis has been iteratively designed, driven by real applications First prototype release planned for Spring’05