Hierarchical Scheduling for Diverse Datacenter Workloads

Presentation transcript:

Hierarchical Scheduling for Diverse Datacenter Workloads. Arka A. Bhattacharya, David Culler, Ali Ghodsi, Scott Shenker, and Ion Stoica (University of California, Berkeley); Eric Friedman (International Computer Science Institute, Berkeley). ACM SoCC'13.

Hierarchical Scheduling A key feature of cloud schedulers: it enables resource scheduling that reflects organizational priorities. The distinguishing property of hierarchical scheduling, absent in flat (non-hierarchical) scheduling, is that if some node in the hierarchy is not using its resources, they are redistributed among that node's sibling nodes rather than among all leaf nodes.

Hierarchical Share Guarantee Assign to each node in the weighted tree a guaranteed share of the resources. A node n_i is guaranteed at least a fraction x of the resources held by its parent, where x = w_i / Σ_{n_j ∈ A(C(P(n_i)))} w_j, with w_i the weight of node n_i, P(·) the parent of a node, C(·) the set of children of a node, and A(·) the subset of demanding nodes. A leaf node is demanding if it asks for more resources than are allocated to it, whereas an internal node is demanding if any of its children are demanding.
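The guarantee above can be sketched as a small helper. This is a minimal illustration, not code from the paper; the function name and data layout are invented for the example:

```python
def guaranteed_share(weights, demanding, i, parent_resources):
    """Minimum share of the parent's resources guaranteed to child i.

    Only demanding siblings count in the denominator, so the share of a
    non-demanding sibling flows to its demanding siblings first.
    """
    total = sum(w for w, d in zip(weights, demanding) if d)
    return parent_resources * weights[i] / total

# Two equally weighted demanding children of a 480-server parent are
# each guaranteed 240 servers; if one stops demanding, the other is
# guaranteed all 480.
```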

Example Given 480 servers. (Slide tree diagram omitted; the node allocations shown are 240, 48, 80, 48, 96, 96, 96, 96, and 160 servers.)

Multi-resource Scheduling Workloads in data centers tend to be diverse: CPU-intensive, memory-intensive, or I/O-intensive. Ignoring the actual resource needs of jobs leads to poor performance isolation and low job throughput.

Dominant Resource Fairness (DRF) A generalization of max-min fairness to multiple resource types: maximize the minimum dominant share among the users in the system. A user's dominant share s_i is the maximum of that user's shares across all resources; the dominant resource is the resource corresponding to the dominant share. The intuition behind DRF is that in a multi-resource environment, a user's allocation should be governed by the user's dominant share.

Example (Slide figure omitted.) Job 1's dominant resource is memory and Job 2's is CPU; the dominant share shown is 60%.

How DRF Works Given a set of users, each with a resource demand vector (the resources required to execute one job). Starts with every user allocated zero resources. Repeatedly picks the user with the lowest dominant share, and launches one of that user's jobs if enough resources are available in the system.

Example System with 9 CPUs and 18 GB RAM. User A: <1 CPU, 4 GB> User B: <3 CPUs, 1 GB>
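The procedure above fits in a few lines. The sketch below (variable names and tie-breaking rule are illustrative choices, not specified by the slides) replays the 9-CPU / 18 GB example:

```python
def drf(capacity, demands):
    """Progressive DRF: repeatedly grant one task to the user with the
    lowest dominant share, while the user's next task still fits."""
    used = {u: [0.0] * len(capacity) for u in demands}
    remaining = list(capacity)
    tasks = {u: 0 for u in demands}
    while True:
        # Dominant share = max over resources of used / capacity.
        share = {u: max(r / c for r, c in zip(used[u], capacity))
                 for u in demands}
        fits = [u for u in demands
                if all(d <= r for d, r in zip(demands[u], remaining))]
        if not fits:
            break
        u = min(fits, key=lambda v: (share[v], v))  # tie-break by name
        for j, d in enumerate(demands[u]):
            used[u][j] += d
            remaining[j] -= d
        tasks[u] += 1
    return tasks, used

tasks, used = drf([9, 18], {"A": [1, 4], "B": [3, 1]})
# A ends with 3 tasks (<3 CPU, 12 GB>) and B with 2 tasks (<6 CPU, 2 GB>);
# both dominant shares equalize at 2/3.
```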

Hierarchical DRF (H-DRF) Static H-DRF Collapsed hierarchies Naive H-DRF Dynamic H-DRF

Static H-DRF A static version of DRF that handles hierarchies. Algorithm: given the hierarchy structure and the amount of resources in the system, start with every leaf node allocated zero resources, then repeatedly allocate resources to a leaf node until no more resources can be assigned to any node.

Resource Allocation in Static H-DRF Start at the root of the tree and traverse down to a leaf, at each step picking the demanding child with the smallest dominant share. (Internal nodes are assigned the sum of all the resources assigned to their immediate children.) Allocate the chosen leaf an ε amount of its resource demand, which increases that node's dominant share by ε.
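The traversal can be sketched as below. This is a simplified illustration, not the paper's pseudocode: the node layout is invented, each leaf carries its own ε-sized "step" vector, and "demanding" is conflated with "one more step still fits":

```python
def dominant_share(vec, capacity):
    return max(v / c for v, c in zip(vec, capacity))

def consumption(node, capacity):
    """A leaf holds its own allocation; an internal node sums its children."""
    if "children" in node:
        vecs = [consumption(c, capacity) for c in node["children"]]
        return [sum(v[j] for v in vecs) for j in range(len(capacity))]
    return node["alloc"]

def demanding(node, remaining):
    if "children" in node:
        return any(demanding(c, remaining) for c in node["children"])
    # Simplification: demanding while one more epsilon-step fits
    # (small tolerance guards against float rounding).
    return all(e <= r + 1e-9 for e, r in zip(node["step"], remaining))

def static_hdrf(root, capacity):
    remaining = list(capacity)
    while demanding(root, remaining):
        node = root
        while "children" in node:  # descend to a leaf
            node = min((c for c in node["children"] if demanding(c, remaining)),
                       key=lambda c: dominant_share(consumption(c, capacity),
                                                    capacity))
        for j, e in enumerate(node["step"]):  # epsilon-allocate at the leaf
            node["alloc"][j] += e
            remaining[j] -= e

# Two leaves under the root of a <10 CPU, 10 GPU> cluster: one wants
# only CPUs, the other only GPUs.
a = {"alloc": [0.0, 0.0], "step": [0.1, 0.0]}
b = {"alloc": [0.0, 0.0], "step": [0.0, 0.1]}
root = {"children": [a, b]}
static_hdrf(root, [10.0, 10.0])
# a converges to (nearly) all 10 CPUs, b to (nearly) all 10 GPUs.
```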

Example Given 10 CPUs and 10 GPUs.

Weakness of Static H-DRF Re-calculating the static H-DRF allocation from scratch on every task completion or arrival is computationally infeasible.

Collapsed Hierarchies Converts the hierarchical scheduler into a flat one and applies the weighted DRF algorithm. This works when only one resource is involved, but it violates the hierarchical share guarantee for internal nodes in the hierarchy.

Example (Slide figure omitted: flattening the hierarchy rooted at nr gives n1,1, with demand <1,1>, a 50% share and n2,1, with demand <1,0>, a 25% share.)

Weighted DRF Each user i is associated with a weight vector W_i = {w_{i,1}, …, w_{i,m}}, where w_{i,j} represents the weight of user i for resource j. The dominant share becomes s_i = max_j (u_{i,j} / w_{i,j}), where u_{i,j} is user i's share of resource j. If the weights of all users are set to 1, weighted DRF reduces to DRF.

Weighted DRF in Collapsed Hierarchies Each node n_i has a weight w_i. Let w_{i,j} = w_i for 1 ≤ j ≤ m. Then the ratio between the dominant shares allocated to users a and b equals w_a / w_b.
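The weighted dominant share formula is a one-liner; this sketch (function name is illustrative) also shows the reduction to plain DRF when all weights are 1:

```python
def weighted_dominant_share(shares, weights):
    """s_i = max_j shares[j] / weights[j] (weighted DRF dominant share)."""
    return max(u / w for u, w in zip(shares, weights))

# With all weights 1 this is the plain DRF dominant share: max share.
# Doubling a user's weight on a resource halves that resource's
# contribution to the user's dominant share.
```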

Example Given Collapsed Hierarchies nr n1,1 <1,1> 50%

Naive H-DRF A natural adaptation of the original DRF to the hierarchical setting. However, the hierarchical share guarantee is violated for leaf nodes, and leaf nodes can starve.

Example (Slide figure omitted: Static H-DRF compared with Naive H-DRF, where a node's dominant share reaches 1.0.)

Dynamic H-DRF Does not suffer from starvation and satisfies the hierarchical share guarantee. Two key features: rescaling to minimum nodes, and ignoring blocked nodes.

Rescaling to Minimum Nodes Compute the resource consumption of an internal node as follows: (1) find the demanding child with minimum dominant share M; (2) rescale every child's resource consumption vector so that its dominant share becomes M; (3) add all the children's rescaled vectors to obtain the internal node's resource consumption vector.
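The three steps above can be sketched as follows. This is a simplified illustration that assumes every child passed in is demanding (the slides restrict M to demanding children); the function name and layout are invented:

```python
def rescale_to_minimum(children, capacity):
    """Internal node's consumption under dynamic H-DRF rescaling: scale
    every child's vector so its dominant share equals the minimum
    dominant share M among the children, then sum the scaled vectors."""
    def dom(vec):
        return max(v / c for v, c in zip(vec, capacity))
    m = min(dom(v) for v in children)
    total = [0.0] * len(capacity)
    for vec in children:
        scale = m / dom(vec) if dom(vec) > 0 else 0.0
        for j, v in enumerate(vec):
            total[j] += v * scale
    return total

parent = rescale_to_minimum([[5.0, 0.0], [0.0, 10.0]], [10.0, 10.0])
# Dominant shares are 0.5 and 1.0, so M = 0.5; the second child is
# scaled down to <0, 5> and the parent's vector is <5, 5>.
```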

Example Given 10 CPUs and 10 GPUs. After n2,1 finishes a job and releases 1 CPU, its consumption vector drops from <0.5, 0> to <0.4, 0>; its sibling's vector <0, 1> is rescaled to <0, 0.4>, so the parent's vector becomes <0.4, 0.4> and its dominant share falls from 0.5 to 0.4. (Slide figure omitted.)

Ignoring Blocked Nodes Dynamic H-DRF only considers non-blocked nodes for rescaling. A leaf node is blocked if any of the resources it requires is saturated, or if the node is non-demanding. An internal node is blocked if all of its children are blocked.
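The blocked test is a simple recursion over the tree. A minimal sketch, with an invented node layout (a leaf records whether it is demanding and its per-resource demand):

```python
def is_blocked(node, saturated):
    """Blocked test from the definition above: a leaf is blocked if it is
    non-demanding or needs a saturated resource; an internal node is
    blocked if all of its children are blocked."""
    if "children" in node:
        return all(is_blocked(c, saturated) for c in node["children"])
    if not node["demanding"]:
        return True
    return any(node["demand"][j] > 0 and saturated[j]
               for j in range(len(saturated)))

leaf_a = {"demanding": True, "demand": [1, 0]}  # needs CPUs only
leaf_b = {"demanding": True, "demand": [0, 1]}  # needs GPUs only
parent = {"children": [leaf_a, leaf_b]}
# With the CPU (resource 0) saturated, leaf_a is blocked but the parent
# is not, because leaf_b only needs the still-available GPU.
```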

Example (Slide figure omitted: Static H-DRF with and without ignoring blocked nodes; the dominant share shown is 1/3.)

Allocation Properties Hierarchical share guarantee. Group strategy-proofness: no group of users can misrepresent their resource requirements in such a way that all of them are weakly better off and at least one of them is strictly better off. Recursive scheduling. H-DRF does not satisfy population monotonicity (PM), which requires that a node exiting the system never decreases the resource allocation of any other node in the hierarchy tree.

Example (Slide figure omitted: a counterexample to population monotonicity.)

Evaluation - Hierarchical Sharing 49 Amazon EC2 servers. Dominant resources: CPU for n1,1, n2,1, and n2,2; GPU for n1,2.

Result Pareto efficiency: no node in the hierarchy can be allocated an extra task on the cluster without reducing the share of some other node.

Conclusion Proposed H-DRF, a hierarchical multi-resource scheduler that avoids job starvation and maintains the hierarchical share guarantee. Future work: DRF under placement constraints, and efficient allocation-vector updates.