A Case for Performance- Centric Network Allocation Gautam Kumar, Mosharaf Chowdhury, Sylvia Ratnasamy, Ion Stoica UC Berkeley.

Slides:



Advertisements
Similar presentations
Orchestra Managing Data Transfers in Computer Clusters Mosharaf Chowdhury, Matei Zaharia, Justin Ma, Michael I. Jordan, Ion Stoica UC Berkeley.
Advertisements

A MapReduce Workflow System for Architecting Scientific Data Intensive Applications By Phuong Nguyen and Milton Halem phuong3 or 1.
Load Balancing Parallel Applications on Heterogeneous Platforms.
Towards Predictable Datacenter Networks
Varys Efficient Coflow Scheduling Mosharaf Chowdhury,
CoMPI: Enhancing MPI based applications performance and scalability using run-time compression. Rosa Filgueira, David E.Singh, Alejandro Calderón and Jesús.
SDN + Storage.
Coflow A Networking Abstraction For Cluster Applications UC Berkeley Mosharaf Chowdhury Ion Stoica.
LIBRA: Lightweight Data Skew Mitigation in MapReduce
Sharing Cloud Networks Lucian Popa, Gautam Kumar, Mosharaf Chowdhury Arvind Krishnamurthy, Sylvia Ratnasamy, Ion Stoica UC Berkeley.
EHarmony in Cloud Subtitle Brian Ko. eHarmony Online subscription-based matchmaking service Available in United States, Canada, Australia and United Kingdom.
Locality-Aware Dynamic VM Reconfiguration on MapReduce Clouds Jongse Park, Daewoo Lee, Bokyeong Kim, Jaehyuk Huh, Seungryoul Maeng.
MIS 2000 Class 20 System Development Process Updated 2014.
CS 268: Lecture 8 Router Support for Congestion Control Ion Stoica Computer Science Division Department of Electrical Engineering and Computer Sciences.
My Point of View about Bandwidth Sharing Bin Wang.
Making Sense of Performance in Data Analytics Frameworks Kay Ousterhout, Ryan Rasti, Sylvia Ratnasamy, Scott Shenker, Byung-Gon Chun.
Reference: Message Passing Fundamentals.
UC Berkeley Improving MapReduce Performance in Heterogeneous Environments Matei Zaharia, Andy Konwinski, Anthony Joseph, Randy Katz, Ion Stoica University.
CS 268: Lecture 8 (Router Support for Congestion Control) Ion Stoica February 19, 2002.
UC Berkeley Improving MapReduce Performance in Heterogeneous Environments Matei Zaharia, Andy Konwinski, Anthony Joseph, Randy Katz, Ion Stoica University.
Making Sense of Performance in Data Analytics Frameworks Kay Ousterhout Joint work with Ryan Rasti, Sylvia Ratnasamy, Scott Shenker, Byung-Gon Chun UC.
UC Berkeley Improving MapReduce Performance in Heterogeneous Environments Matei Zaharia, Andy Konwinski, Anthony Joseph, Randy Katz, Ion Stoica University.
Lecture Nine Database Planning, Design, and Administration
Network Sharing Issues Lecture 15 Aditya Akella. Is this the biggest problem in cloud resource allocation? Why? Why not? How does the problem differ wrt.
OSN Research As If Sociology Mattered Krishna P. Gummadi Networked Systems Research Group MPI-SWS.
1 Enabling Large Scale Network Simulation with 100 Million Nodes using Grid Infrastructure Hiroyuki Ohsaki Graduate School of Information Sci. & Tech.
Basic Communication Operations Based on Chapter 4 of Introduction to Parallel Computing by Ananth Grama, Anshul Gupta, George Karypis and Vipin Kumar These.
COLLABORATIVE SPECTRUM MANAGEMENT FOR RELIABILITY AND SCALABILITY Heather Zheng Dept. of Computer Science University of California, Santa Barbara.
The Only Constant is Change: Incorporating Time-Varying Bandwidth Reservations in Data Centers Di Xie, Ning Ding, Y. Charlie Hu, Ramana Kompella 1.
GridIS: an Incentive-based Grid Scheduling Lijuan Xiao, Yanmin Zhu, Lionel M. Ni, Zhiwei Xu 19th International Parallel and Distributed Processing Symposium.
CPS216: Advanced Database Systems (Data-intensive Computing Systems) Introduction to MapReduce and Hadoop Shivnath Babu.
Dominant Resource Fairness: Fair Allocation of Multiple Resource Types Ali Ghodsi, Matei Zaharia, Benjamin Hindman, Andy Konwinski, Scott Shenker, Ion.
Improving MBMS Security in 3G Wenyuan Xu Rutgers University.
Example: Sorting on Distributed Computing Environment Apr 20,
1 Andreea Chis under the guidance of Frédéric Desprez and Eddy Caron Scheduling for a Climate Forecast Application ANR-05-CIGC-11.
MC 2 : Map Concurrency Characterization for MapReduce on the Cloud Mohammad Hammoud and Majd Sakr 1.
Chapter Seventeen Policymaking. Copyright © Houghton Mifflin Company. All rights reserved Public Policies and Purposes A public policy is a general.
Using Map-reduce to Support MPMD Peng
CSC480 Software Engineering Lecture 10 September 25, 2002.
Network-Aware Scheduling for Data-Parallel Jobs: Plan When You Can
Performance Analysis of Preemption-aware Scheduling in Multi-Cluster Grid Environments Mohsen Amini Salehi, Bahman Javadi, Rajkumar Buyya Cloud Computing.
Dynamic Slot Allocation Technique for MapReduce Clusters School of Computer Engineering Nanyang Technological University 25 th Sept 2013 Shanjiang Tang,
Efficient Live Checkpointing Mechanisms for computation and memory-intensive VMs in a data center Kasidit Chanchio Vasabilab Dept of Computer Science,
Design Issues of Prefetching Strategies for Heterogeneous Software DSM Author :Ssu-Hsuan Lu, Chien-Lung Chou, Kuang-Jui Wang, Hsiao-Hsi Wang, and Kuan-Ching.
Dzmitry Kliazovich University of Luxembourg, Luxembourg
Advanced Computer Networks Lecture 1 - Parallelization 1.
Finite State Machines (FSM) OR Finite State Automation (FSA) - are models of the behaviors of a system or a complex object, with a limited number of defined.
MapReduce & Hadoop IT332 Distributed Systems. Outline  MapReduce  Hadoop  Cloudera Hadoop  Tutorial 2.
XCP: eXplicit Control Protocol Dina Katabi MIT Lab for Computer Science
Unit - 4 Introduction to the Other Databases.  Introduction :-  Today single CPU based architecture is not capable enough for the modern database.
Handling Data Skew in Parallel Joins in Shared-Nothing Systems Yu Xu, Pekka Kostamaa, XinZhou (Teradata) Liang Chen (University of California) SIGMOD’08.
Presented by Qifan Pu With many slides from Ali’s NSDI talk Ali Ghodsi, Matei Zaharia, Benjamin Hindman, Andy Konwinski, Scott Shenker, Ion Stoica.
Group # 14 Dhairya Gala Priyank Shah. Introduction to Grid Appliance The Grid appliance is a plug-and-play virtual machine appliance intended for Grid.
Dominant Resource Fairness: Fair Allocation of Multiple Resource Types Ali Ghodsi, Matei Zaharia, Benjamin Hindman, Andy Konwinski, Scott Shenker, Ion.
6.888 Lecture 8: Networking for Data Analytics Mohammad Alizadeh Spring  Many thanks to Mosharaf Chowdhury (Michigan) and Kay Ousterhout (Berkeley)
PACMan: Coordinated Memory Caching for Parallel Jobs Ganesh Ananthanarayanan, Ali Ghodsi, Andrew Wang, Dhruba Borthakur, Srikanth Kandula, Scott Shenker,
1 Performance Impact of Resource Provisioning on Workflows Gurmeet Singh, Carl Kesselman and Ewa Deelman Information Science Institute University of Southern.
Aalo Efficient Coflow Scheduling Without Prior Knowledge Mosharaf Chowdhury, Ion Stoica UC Berkeley.
Resilient Distributed Datasets A Fault-Tolerant Abstraction for In-Memory Cluster Computing Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave,
INTRODUCTION TO HIGH PERFORMANCE COMPUTING AND TERMINOLOGY.
The Future of Apache Flink®
Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center
Sujata Ray Dey Maheshtala College Computer Science Department
Managing Data Transfer in Computer Clusters with Orchestra
Speaker : Che-Wei Chang
Improving Datacenter Performance and Robustness with Multipath TCP
EECS 582 Midterm Review Mosharaf Chowdhury EECS 582 – F16.
Effective Social Network Quarantine with Minimal Isolation Costs
Akshay Tomar Prateek Singh Lohchubh
Sujata Ray Dey Maheshtala College Computer Science Department
Presentation transcript:

A Case for Performance- Centric Network Allocation Gautam Kumar, Mosharaf Chowdhury, Sylvia Ratnasamy, Ion Stoica UC Berkeley

Datacenter Applications

3 Data Parallelism Applications execute in several computation stages and require transfer of data between these stages (communication). Computation in a stage is split across multiple nodes. Network has an important role to play, 33% of the running time in Facebook traces. (Orchestra, SIGCOMM 2011) M M M M R R R R J J M M R R M M R R J J M M R R J J Map Reduce Join (*RoPE, NSDI 2012) * HotCloud June 12, 2012

4 Data Parallelism Users, often, do not know what network support they require.  Final execution graph created by the framework. Frameworks know more, provide certain communication primitives.  e.g., Shuffle, Broadcast etc. HotCloud June 12, 2012

5 Scope Private clusters running data parallel applications. Little concern for adversarial behavior. Application level inefficiencies dealt extrinsically. HotCloud June 12, 2012

Current Proposals

7 Explicit Accounting Virtual cluster based network reservations. (Oktopus, SIGCOMM 2011) Time-varying network reservations. (SIGCOMM 2012) DRAWBACK: Exact network requirements often not known; non work- conserving. HotCloud June 12, 2012

8 Fairness-Centric Flow level fairness or Per-Flow. (TCP) Fairness with respect to the sources. (Seawall, NSDI 2012) Proportionality in terms of total number of VMs. (FairCloud, SIGCOMM 2012) DRAWBACK: Gives little guidance to developers about the performance they can expect while scaling their applications. HotCloud June 12, 2012

9 In this work... A new perspective to share the network amongst data-parallel applications – performance-centric allocations:  enabling users to reason about the performance of their applications when they scale them up.  enabling applications to effectively parallelize to preserve the intuitive mapping between scale-up and speed-up. Contrast / relate performance-centric proposals with fairness-centric proposals. HotCloud June 12, 2012

Performance- Centric Allocations

1 λ λ/2 λ λ λ Shuffle Broadcast Types of Transfers* (*Orchestra, SIGCOMM 2011) HotCloud June 12, 2012

1212 λ λ λ λ λ λ/2 λ λ λ 2λ λ/2 2X Scale UP Total Data = 4λ Total Data = 2λ λ/2 ShuffleBroadcast Scaling up the application HotCloud June 12, 2012

1313 Performance -Centric Allocations Understand the support that the application needs from the network to effectively parallelize. At a sweet spot – framework knows application’s network requirements. HotCloud June 12, 2012

1414 Shuffle-only clusters λ/2 λ 2λ AmAm ArAr BmBm BrBr λ BmBm BrBr λ/2 tAmtAm tAstAs tArtAr tBmtBm tBrtBr tBstBs HotCloud June 12, 2012

1515 Shuffle-only Clusters λ/2 λ 2λ AmAm ArAr BmBm BrBr λ BmBm BrBr λ/2 tAmtAm t A s = 2λ/α tArtAr t A m /2t A r /2t B s = λ/2α = t A s /4 α α t B < t A /2 λ/2 λ 2λ AmAm ArAr BmBm BrBr λ BmBm BrBr λ/2 tAmtAm t A s = 2λ/α tArtAr t A m /2 t A r /2t B s = λ/α = t A s /2 α α/2 t B = t A /2 Per-FlowProportional HotCloud June 12, 2012

1616 Broadcast-only Clusters λ λ 2λ AmAm ArAr BmBm BrBr λ BmBm BrBr λ λ λ tAmtAm tAstAs tArtAr tBmtBm tBrtBr tBstBs HotCloud June 12, 2012

1717 Broadcast-only Clusters λ λ 2λ AmAm ArAr BmBm BrBr λ BmBm BrBr λ λ λ tAmtAm t A s = 2λ/α tArtAr t A m /2t A r /2t B s = 2λ/α = t A s α α/2 t B > t A /2 λ 2λ AmAm ArAr BmBm BrBr λ BmBm BrBr tAmtAm t A s = 2λ/α tArtAr t A m /2t A r /2t B s = λ/α = t A s /2 t B = t A /2 α α λ λ λ λ Proportional Per-Flow HotCloud June 12, 2012

1818 Recap TCP in shuffle gives more than requisite speed-up and thus hurts performance of small jobs. Proportionality achieves the right balance. Proportionality in broadcast limits parallelism. TCP achieves the right balance. Degree of Parallelism Speed Up TCP (Shuffle) Prop. (Shuffle) TCP. (Broadcast) Prop. (Broadcast) HotCloud June 12, 2012

1919 Complexity of a transfer x N -transfer if x is the factor by which the amount of data transferred increases when a scale up of N is done, x  [1, N]. Shuffle is a 1 N -transfer and broadcast is an N N -transfer. Performance-centric allocations encompass x. HotCloud June 12, 2012

Heterogeneous Frameworks and Congested Resources

2121 Share given based on the complexity of the transfer. The job completion time of both jobs degrades uniformly in the event of contention. Both finish in 6s Both finish in 4s 2G AmAm ArAr BmBm BrBr 1G A’ m A’ r 1G A’ m A’ r 0.5G 1G1G B’ m B’ r 1G1G B’ m B’ r 500Mbps 2Gb to send 500Mbps 2Gb to send ~333Mbps 2Gb to send ~666Mbps 4Gb to send 2X Scale UP 0.5G 1G HotCloud June 12, 2012

2 Network Parallelism Isolation between the speed-up due to the scale- up for the application and the performance degradation due to finite resources. y’ y N (α)(α) X α : degradation due to limited resources y : old running time y’: new running time after a scale-up of N HotCloud June 12, 2012

Summary

2424 Understand performance-centric allocations and their relationship with fairness-centric proposals.  Proportionality is the performance-centric approach for shuffle-only clusters.  Breaks down for broadcasts, per-flow is the performance-centric approach for broadcast-only clusters. An attempt to a performance-centric proposal for heterogeneous transfers.  Understand what happens when resources get congested. HotCloud June 12, 2012

Future Work

2626 A more rigorous formulation.  Some questions to be answered: different N 1 and N 2 on both sides of the stage etc. Analytical and experimental evaluation of the policies.  Whether redistribution of completion time or total savings. HotCloud June 12, 2012

Thank you