The Design of the Borealis Stream Processing Engine (CIDR 2005) Brandeis University, Brown University, MIT Presenter: Seungwoo Kang (Ref. Borealis.ppt)
Outline Motivation and Goal Requirements Technical issues Dynamic revision of query results Dynamic query modification Flexible and scalable optimization Fault tolerance Conclusion
Motivation and Goal Envision the second-generation SPE Fundamental requirements of many streaming applications are not supported by first-generation SPEs Three fundamental requirements: dynamic revision of query results, dynamic query modification, flexible and scalable optimization Present the functionality and preliminary design of Borealis
Requirements Dynamic revision of query results Dynamic query modification Flexible and highly scalable optimization
Changes in the models Data model: revision support, three types of messages Query model: support modification of operator box semantics via control lines QoS model: introduce QoS metrics on a per-tuple basis; each tuple (message) carries a VM (Vector of Metrics) and is ranked by a score function
Architecture of Borealis Modify and extend both systems, Aurora and Medusa, with a set of features and mechanisms
1. Dynamic revision of query results Problem: corrections of previously reported data from data sources; highly volatile and unpredictable characteristics of data sources such as sensors; late arrivals and out-of-order data; first-generation SPEs just accept approximate or imperfect results Goal: support processing of revisions and correct previously output results
Causes for Tuple Revisions
Data sources revise input streams: "On occasion, data feeds put out a faulty price [...] and send a correction within a few hours" [MarketBrowser]
Temporary overloads cause tuple drops
Data arrives late and misses its processing window
[Example: a 1-hour average-price window over (1pm,$10), (2pm,$12), (3pm,$11); the late tuple (2:25,$9) arrives after (4:05,$11) and (4:15,$10), well past its window]
New Data Model for Revisions Each message is (time, type, id, a1, ..., an): a header with time (tuple timestamp), type (insertion, deletion, or replacement), and id (unique identifier of the tuple on its stream), followed by the data attributes a1, ..., an
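To make the revision model concrete, here is a minimal Python sketch (not Borealis code; field names follow the slide's schema) of a revision-aware message and how a stateful consumer might apply each message type:

    from dataclasses import dataclass
    from typing import Tuple

    @dataclass
    class Message:
        time: int      # tuple timestamp
        type: str      # "insertion", "deletion", or "replacement"
        id: str        # unique identifier of the tuple on its stream
        data: Tuple    # payload attributes (a1, ..., an)

    def apply(view: dict, msg: Message) -> None:
        """Maintain a keyed view of the stream as revisions arrive."""
        if msg.type == "insertion":
            view[msg.id] = msg.data
        elif msg.type == "deletion":
            view.pop(msg.id, None)
        elif msg.type == "replacement":
            view[msg.id] = msg.data   # overwrite the previously reported value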
Revision Processing in Borealis Connection points (CPs) store a bounded history of the stream, from the oldest to the most recent tuple within the history bound; when a revision arrives on the input queue, operators pull from the CP the portion of history they need to reprocess
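A rough sketch of the connection-point idea, assuming a simple bounded buffer and a hypothetical pull interface (the real CP also serves ordinary downstream subscriptions):

    from collections import deque

    class ConnectionPoint:
        def __init__(self, history_bound: int):
            # history[0] is the oldest retained tuple, history[-1] the most recent
            self.history = deque(maxlen=history_bound)

        def append(self, msg):
            self.history.append(msg)

        def pull(self, start_time, end_time):
            """Return buffered tuples whose timestamps fall in [start_time, end_time]."""
            return [m for m in self.history if start_time <= m.time <= end_time]

A windowed operator that receives a revision with timestamp t could call cp.pull(t - window_size, t + window_size), recompute the affected windows, and emit its own revision messages for any outputs that change.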
Discussion Runtime overhead Assumptions: input revision messages are < 1% of the input; revision message processing can be deferred Processing cost Storage cost Application's needs / requirements: whether it is interested in revision messages or not; what to do with a revision message (rollback, compensation?) Application's QoS requirement: delay bound
2. Dynamic query modification
Problem: change certain attributes of the query at runtime; manual query substitution has high overhead and is slow to take effect
Goal: low-overhead, fast, and automatic modifications
Control lines: provided to each operator box; carry messages with revised box parameters and new box functions
Timing problem: specify precisely which data is to be processed with which control parameters; data may be ready for processing too late or too early
Old data arriving when only the new parameters are in force is processed with the buffered old parameters; new data that was already processed with the old parameters is corrected via revision messages and time travel
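One way to picture the timing rule is to keep parameter versions tagged with their effective time, so every tuple is processed with the version valid at its timestamp; this is only an illustrative sketch, not the Borealis control-line implementation:

    import bisect

    class Box:
        def __init__(self, params):
            self.times = [0]          # effective times of parameter versions, sorted
            self.params = [params]    # version in force from times[i] onward

        def control(self, effective_time, new_params):
            """Handle a control message carrying revised box parameters."""
            i = bisect.bisect_right(self.times, effective_time)
            self.times.insert(i, effective_time)
            self.params.insert(i, new_params)

        def params_for(self, tuple_time):
            """Old data still sees the buffered old parameters; data already processed
            with stale parameters is fixed by revision messages and time travel."""
            i = bisect.bisect_right(self.times, tuple_time) - 1
            return self.params[max(i, 0)]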
Time travel Motivation: rewind history and then repeat it, or move forward into the future Connection Point (CP) and CP view: an independent view of a CP with a view range (start_time, max_time) Operations: undo (a deletion message rolls the state back to time t), replay (reprocess history from time t), and a prediction function for travel into the future
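Building on the ConnectionPoint sketch above, a CP view might look roughly like this (interfaces are assumptions, including reuse of the earlier Message type):

    class CPView:
        def __init__(self, cp, start_time, max_time):
            self.cp = cp
            self.start_time, self.max_time = start_time, max_time

        def undo(self, t):
            """Emit deletion messages that roll the downstream state back to time t."""
            return [Message(m.time, "deletion", m.id, ())
                    for m in self.cp.pull(t, self.max_time) if m.time > t]

        def replay(self, t):
            """Re-emit retained history from time t forward."""
            return self.cp.pull(t, self.max_time)

        def predict(self, horizon, prediction_fn):
            """For travel into the future, generate tuples with a prediction function."""
            history = self.cp.pull(self.start_time, self.max_time)
            return prediction_fn(history, horizon)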
3. Optimization in a Distributed SPE Goal: optimized resource allocation Challenges: wide variation in resources (high-end servers vs. tiny sensors), multiple resources involved (CPU, memory, I/O, bandwidth, power), dynamic environment (changing input load and resource availability), scalability (query network size, number of nodes)
Quality of Service A mechanism to drive resource allocation Aurora model: QoS functions at query end-points; problem: need to infer QoS at upstream nodes An alternative model: a Vector of Metrics (VM) carried in tuples Content: the tuple's importance or its age Performance: arrival time, total resources consumed, ... Operators can change the VM A score function ranks tuples based on the VM Optimizers can keep and use statistics on the VM
Example Application: Warfighter Physiologic Status Monitoring (WPSM)
Physiologic models report a state and a confidence per area:
Area             Confidence
Thermal          90%
Hydration        60%
Cognitive        100%
Life Signs       90%
Wound Detection  80%
Score Function
SF(VM) = VM.confidence x ADF(VM.age), where ADF is an age decay function
[Diagram: sensor streams (HRate, RRate, Temp) feed Model1 and Model2, whose outputs are merged; each tuple carries VM = ([age, confidence], value); models may change the confidence]
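A small sketch of this score function, with an assumed exponential age decay (the actual ADF is application-defined):

    def adf(age, half_life=60.0):
        """Age decay function: the weight halves every half_life seconds (assumption)."""
        return 0.5 ** (age / half_life)

    def score(vm):
        """SF(VM) = VM.confidence x ADF(VM.age); higher-scoring tuples matter more."""
        return vm["confidence"] * adf(vm["age"])

    # e.g. score({"confidence": 0.9, "age": 30}) ~= 0.9 * 0.71 ~= 0.64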
Borealis Optimizer Hierarchy [Diagram: end-point monitor(s) feed a global optimizer; each Borealis node runs a local monitor and a local optimizer; a neighborhood optimizer coordinates nearby nodes; monitors collect statistics that trigger optimizer decisions]
Optimization Tactics Priority scheduling - Local Modification of query plans - Local Changing order of commuting operators Using alternate operator implementations Allocation of query fragments to nodes - Neighborhood, Global Load shedding - Local, Neighborhood, Global
Priority scheduling Choose the box with the highest QoS gradient: evaluate the predicted-QoS score function on each message using the values in its VM, and compute an average QoS gradient per box by comparing the average QoS-impact scores of its inputs and outputs Then process the highest QoS-impact message from that box's input queue Enables out-of-order processing and has an inherent load-shedding behavior
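A sketch of that scheduling decision, assuming each box keeps recent input/output samples and the score function above; this is illustrative bookkeeping, not the Borealis scheduler:

    def qos_gradient(box, score):
        """Average QoS-impact of outputs minus that of inputs for one box."""
        ins = [score(m.vm) for m in box.recent_inputs]
        outs = [score(m.vm) for m in box.recent_outputs]
        if not ins or not outs:
            return 0.0
        return sum(outs) / len(outs) - sum(ins) / len(ins)

    def schedule(boxes, score):
        box = max(boxes, key=lambda b: qos_gradient(b, score))   # highest-gradient box
        msg = max(box.input_queue, key=lambda m: score(m.vm))    # highest-impact message
        return box, msg   # low-score messages may never be picked: implicit shedding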
Correlation-based Load Distribution
Assumes network bandwidth is abundant and network transfer delays are negligible
Goal: minimize end-to-end latency
Key ideas: balance load across nodes to avoid overload; group boxes with small load correlation together; maximize load correlation among nodes
[Diagram: a "connected plan" (boxes A, B on node S1; C, D on node S2) versus a "cut plan" that splits the operator chains across S1 and S2, with input rates r and 2r and different transfer costs (2cr and 4cr vs. 3cr)]
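To illustrate the grouping idea, a toy sketch that co-locates the pair of boxes whose load time series are least correlated, so one box's peaks offset the other's troughs; the actual Borealis algorithm is more elaborate:

    import statistics

    def correlation(xs, ys):
        """Pearson correlation of two load time series."""
        mx, my = statistics.mean(xs), statistics.mean(ys)
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sx = sum((x - mx) ** 2 for x in xs) ** 0.5
        sy = sum((y - my) ** 2 for y in ys) ** 0.5
        return cov / (sx * sy) if sx and sy else 0.0

    def pair_boxes(load_series):
        """Greedily co-locate the two boxes with the smallest load correlation."""
        names = list(load_series)
        pairs = []
        while len(names) > 1:
            a, b = min(((x, y) for i, x in enumerate(names) for y in names[i + 1:]),
                       key=lambda p: correlation(load_series[p[0]], load_series[p[1]]))
            pairs.append((a, b))
            names.remove(a)
            names.remove(b)
        return pairs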
Load Shedding Goal: Remove excess load at all nodes and links Shedding at node A relieves its descendants Distributed load shedding Neighbors exchange load statistics Parent nodes shed load on behalf of children Uniform treatment of CPU and bandwidth problem Load balancing or Load shedding?
Local Load Shedding
Goal: minimize total loss
[Example: Node A has CPU load 5Cr1 against capacity 4Cr1, so it must shed 20% of its load; Node B has load 4Cr2 against capacity 2Cr2, so it must shed 50%. Acting locally, A drops 25% of App 1's input; B's load then falls to 3.75Cr2, still over its 2Cr2 capacity, so B must shed 47% of its remaining load, all on App 2's path. Resulting loss with the local plan: App 1 25%, App 2 58%]
Distributed Load Shedding
Goal: minimize total loss
[Same setup: A must shed 20% of its load and B 50%. With neighbors exchanging load statistics, parent Node A sheds on B's behalf as well, dropping 9% of App 1's input and 64% of App 2's input, which leaves B with no overload. Resulting loss: App 1 9%, App 2 64%, a smaller total loss than the local plan's 25% / 58%]
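The shedding arithmetic behind these numbers can be sketched as follows, assuming (as the slide's diagram suggests) box costs of 4C and C on Node A, C and 3C on Node B, and equal input rates; this illustrates only the accounting, not the paper's optimization algorithm:

    def required_shed(load, capacity):
        """Fraction of load that must be removed so a node fits within its capacity."""
        return max(0.0, 1.0 - capacity / load)

    print(required_shed(5.0, 4.0))   # 0.2  -> Node A must shed 20% of its load
    print(required_shed(4.0, 2.0))   # 0.5  -> Node B must shed 50% of its load

    # Distributed plan (drop 9% on App 1's path, 64% on App 2's path, both at Node A):
    load_a = 4 * 0.91 + 1 * 0.36     # 4.00 -> fits A's capacity of 4Cr
    load_b = 1 * 0.91 + 3 * 0.36     # 1.99 -> fits B's capacity of 2Cr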
Fault-Tolerance through Replication Goal: tolerate node and network failures [Diagram: the query network running on Nodes 1, 2, 3 is replicated on Nodes 1', 2', 3'; replicas consume the same input streams (s1, s2) and produce replicated intermediate and output streams (s3, s3', s4)]
Fault-Tolerance Approach
If an input stream fails, find another replica; if no replica is available, produce tentative tuples; correct the tentative results after the failure
[State machine: STABLE -> UPSTREAM FAILURE on missing or tentative inputs; UPSTREAM FAILURE -> STABILIZING when the failure heals (reconcile state, emit corrected output); STABILIZING returns to STABLE, or back to UPSTREAM FAILURE if another upstream failure is in progress]
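A sketch of that replica state machine (transition method names are assumptions, not the paper's API):

    class Replica:
        STABLE, UPSTREAM_FAILURE, STABILIZING = "STABLE", "UPSTREAM FAILURE", "STABILIZING"

        def __init__(self):
            self.state = self.STABLE

        def on_missing_or_tentative_input(self):
            if self.state == self.STABLE:
                self.state = self.UPSTREAM_FAILURE   # keep running, emit tentative tuples

        def on_failure_healed(self):
            if self.state == self.UPSTREAM_FAILURE:
                self.state = self.STABILIZING        # reconcile state, emit corrections

        def on_stabilized(self, another_upstream_failure=False):
            if self.state == self.STABILIZING:
                self.state = (self.UPSTREAM_FAILURE if another_upstream_failure
                              else self.STABLE)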
Conclusions Next generation streaming applications require a flexible processing model Distributed operation Dynamic result and query modification Dynamic and scalable optimization Server and sensor network integration Tolerance to node and network failures