R-Storm: Resource Aware Scheduling in Storm


R-Storm: Resource Aware Scheduling in Storm. Boyang Peng, Mohammad Hosseini, Zhihao Hong, Reza Farivar, Roy Campbell

Introduction Storm is an open-source, distributed, real-time data stream processing system used for real-time analytics, online machine learning, and continuous computation.

R-Storm: Resource Aware Scheduling in Storm A scheduler that takes resource availability and resource requirements into account. Goals: satisfy resource requirements, and improve throughput by maximizing resource utilization and minimizing network latency.

Storm Overview Reference: storm.apache.org

Storm Topology Reference: storm.apache.org

Definitions of Storm Terms Tuple - the basic unit of data that is processed. Stream - an unbounded sequence of tuples. Component - a processing operator in a Storm topology, either a Spout or a Bolt. Task - an instantiation of a Spout or Bolt. Executor - a thread spawned in a worker process that may execute one or more tasks. Worker Process - a process spawned by Storm that may run one or more executors.

Logical connections in a Storm Topology

Storm topology physical connections

Motivation Default Storm uses a naïve round-robin scheduler and a naïve load limiter (worker slots). Neither accounts for resources that lack graceful performance degradation, such as memory.

R-Storm: Resource Aware Scheduling in Storm On micro-benchmarks, 30-47% higher throughput. On Yahoo! Storm applications, R-Storm outperforms default Storm by around 50% in overall throughput.

Problem Formulation Targeting 3 types of resources: CPU, memory, and network. Limited resource budget for each node; specific resource needs for each task. Goal: improve throughput by maximizing resource utilization and minimizing network latency.

Problem Formulation Set of all tasks Ƭ = {τ1, τ2, τ3, …}; each task τi has resource demands: a CPU requirement cτi, a network bandwidth requirement bτi, and a memory requirement mτi. Set of all nodes N = {θ1, θ2, θ3, …}; each node has a total available CPU budget W1, bandwidth budget W2, and memory budget W3.

Problem Formulation Qi: throughput contribution of node θi. Assign tasks to a subset of nodes N′ ⊆ N that minimizes the total resource waste:
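To make the objective concrete, here is a minimal sketch (not the paper's exact formulation) of "resource waste": the leftover CPU, bandwidth, and memory on each node that receives tasks. Function and variable names are illustrative, not from the paper.

```python
def resource_waste(budgets, assignment):
    """Leftover resources across the nodes used by an assignment.

    budgets: {node: (cpu, bandwidth, memory) budget of that node}
    assignment: {node: [(cpu, bandwidth, memory) demand of each task placed there]}
    Returns the sum of unused resource capacity over the three dimensions.
    """
    waste = 0.0
    for node, tasks in assignment.items():
        budget = budgets[node]
        # Total demand placed on this node, per resource dimension
        used = [sum(t[i] for t in tasks) for i in range(3)]
        # Unused capacity (clamped at zero) is the waste the scheduler minimizes
        waste += sum(max(budget[i] - used[i], 0.0) for i in range(3))
    return waste
```

A smaller waste means the chosen subset of nodes is packed more tightly, which is what the formulation above asks the scheduler to achieve.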

Problem Formulation This is a Quadratic Multiple 3-Dimensional Knapsack Problem (we call it QM3DKP), which is NP-hard. Computing optimal or even approximate solutions can be hard and time-consuming, but real-time systems need fast scheduling and must re-compute schedules when failures occur.

Soft Constraints vs. Hard Constraints Soft constraints: CPU and network resources, which degrade gracefully under oversubscription. Hard constraints: memory; oversubscribe it and it's game over.

Observations on Network Latency 1. Inter-rack communication is the slowest 2. Inter-node communication is slow 3. Inter-process communication is faster 4. Intra-process communication is the fastest
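This latency hierarchy can be sketched as a small classifier over placements. The `(rack, node, worker)` triple is an illustrative model of where a task runs, not an API from Storm or the paper.

```python
def comm_tier(a, b):
    """Classify the communication tier between two task placements.

    a, b: (rack, node, worker) triples describing where each task runs.
    Tiers are ordered from fastest (intra-process) to slowest (inter-rack).
    """
    if a == b:
        return "intra-process"   # same worker process: fastest
    if a[:2] == b[:2]:
        return "inter-process"   # same node, different worker
    if a[0] == b[0]:
        return "inter-node"      # same rack, different node
    return "inter-rack"          # slowest
```

A resource-aware scheduler exploits this ordering by preferring placements that keep communicating tasks in the faster tiers.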

Heuristic Algorithm A greedy approach: design a 3D resource space in which each resource maps to an axis (this generalizes to an nD resource space), with trivial overhead. For each task, with a resource vector representing its requirements, we attempt to find the node whose resource-availability vector is closest, based on minimum Euclidean distance, while not violating hard constraints.
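The greedy step can be sketched as follows. This is a simplification for illustration only: the real R-Storm heuristic also orders tasks by topology traversal and folds network proximity into the selection, which this sketch omits. All names are hypothetical.

```python
import math

def greedy_schedule(tasks, nodes):
    """Greedy sketch of distance-based placement in a 3D resource space.

    tasks: list of (cpu, network, memory) demand vectors
    nodes: {node: [cpu, network, memory] currently available} (mutated in place)
    Returns {task_index: node}, or None if a task's memory (hard constraint)
    cannot be satisfied anywhere.
    """
    placement = {}
    for i, demand in enumerate(tasks):
        best, best_dist = None, math.inf
        for node, avail in nodes.items():
            if avail[2] < demand[2]:      # memory is a hard constraint
                continue
            # Pick the node whose availability vector is closest to the demand
            d = math.dist(demand, avail)  # Euclidean distance in resource space
            if d < best_dist:
                best, best_dist = node, d
        if best is None:
            return None                   # no node can hold this task's memory
        placement[i] = best
        # Deduct the placed task's demand from the chosen node's availability
        nodes[best] = [a - r for a, r in zip(nodes[best], demand)]
    return placement
```

Minimizing the Euclidean distance tends to match each task to a node whose remaining capacity it fills most closely, which is what keeps resource waste low.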

Heuristic Algorithm

Heuristic Algorithm [Figure: a switch connecting worker nodes 1-6]

Our proposed heuristic algorithm has the following properties: Tasks of components that communicate with each other have the highest priority to be scheduled in close network proximity to each other. No hard resource constraint is violated. Resource waste on nodes is minimized. Placement is based on minimum Euclidean distance while not violating hard constraints.

Evaluation Micro-benchmarks: Linear Topology, Diamond Topology, Star Topology. Industry-inspired benchmarks: PageLoad Topology, Processing Topology. Network-bound versus computation-bound workloads.

Evaluation Setup Used Emulab.net as a testbed and to emulate inter-rack latency across two sides. 1 host for Nimbus + ZooKeeper; 12 hosts as worker nodes. All hosts: Ubuntu 12.04 LTS, 1-core Intel CPU, 2 GB RAM, 100 Mb NIC. [Figure: two racks, V0 with nodes 1-6 and V1 with nodes 7-12]

Micro-benchmarks Linear Topology Diamond Topology Star Topology

Network Bound vs. CPU Bound Network-bound: a speed-of-light test that bottlenecks on the network. Computation-bound: arbitrary computation (finding prime numbers).

Network-bound Micro-benchmark 50% improvement

Network-bound Micro-benchmark Topologies 30% improvement

Network-bound Micro-benchmark Topologies 47% improvement

Computation Bound Micro-benchmark These experiments are not an exactly fair comparison, but they illustrate an important point: more is not always better. Giving Storm more resources does not make it run any faster.

CPU Utilization

Lessons Learned for Computation-bound Workloads More nodes might not be better while parallelism is constant; you need the right amount. Having more nodes and spreading executors unnecessarily far increases network distance and latency.

Yahoo! Sample Topologies Layout of topologies provided by Yahoo!, implemented abstractly for our evaluation. These two topologies are used by Yahoo! for processing event-level data from their advertising platforms to allow near real-time analytical reporting.

PageLoad Topology 50% improvement

Processing Topology 47% improvement

Throughput comparison of running multiple topologies

Multiple Topologies Results PageLoad topology: R-Storm (25496 tuples/10sec) vs. default Storm (16695 tuples/10sec); R-Storm is around 53% higher. Processing topology: R-Storm (67115 tuples/10sec) vs. default Storm (10 tuples/sec); orders of magnitude higher.

Current Status Merged into the Apache Storm 0.11 release: https://github.com/apache/storm/ Starting to be used at Yahoo! Inc.
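In the merged Apache Storm scheduler, per-component resource requirements and per-node capacities are driven by configuration. A sketch of the relevant settings is below; the key names follow Apache Storm's resource-aware-scheduler configuration of that era, but you should check the `defaults.yaml` of your specific version before relying on them.

```yaml
# Per-component resource requirements (defaults a topology can override)
topology.component.cpu.pcore.percent: 10.0          # % of one physical core per executor
topology.component.resources.onheap.memory.mb: 128.0
topology.component.resources.offheap.memory.mb: 0.0
topology.worker.max.heap.size.mb: 768.0

# Per-node capacity advertised by each supervisor
supervisor.memory.capacity.mb: 2048.0
supervisor.cpu.capacity: 100.0
```

The scheduler matches the declared task demands against the advertised node capacities, mirroring the resource vectors used in the formulation above.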

Future Work Experiments to quantify latency. Multi-tenancy: topology priorities and per-user resource guarantees. Elasticity: dynamically adjusting parallelism to meet SLAs.

Questions?