Dominant Resource Fairness: Fair Allocation of Multiple Resource Types
Ali Ghodsi, Matei Zaharia, Benjamin Hindman, Andy Konwinski, Scott Shenker, Ion Stoica
Presented by Qifan Pu, with many slides from Ali's NSDI talk

What is Fair Sharing?
n users want to share a resource (e.g., CPU)
– Solution: allocate each user 1/n of the shared resource
Generalized by max-min fairness
– Handles users who want less than their fair share
– E.g., user 1 wants no more than 20%
Generalized by weighted max-min fairness
– Give weights to users according to importance
– E.g., user 1 gets weight 1, user 2 weight 2
[Figure: CPU-share bars: an equal split (33% each); max-min with one capped user (20%/40%/40%); weighted shares (33%/66%)]
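Max-min fairness can be computed by progressive filling ("water-filling"): repeatedly hand out the leftover capacity, capping anyone who hits her demand. A minimal sketch, not from the talk; the name max_min_shares and its interface are illustrative:

```python
def max_min_shares(demands, weights=None, capacity=1.0):
    """(Weighted) max-min fairness by progressive filling.

    Splits `capacity` so no user exceeds her demand; leftover capacity is
    repeatedly re-split among still-unsatisfied users, in proportion to
    their weights."""
    n = len(demands)
    weights = weights or [1.0] * n
    alloc = [0.0] * n
    active = set(range(n))           # users whose demand is not yet met
    remaining = capacity
    while active and remaining > 1e-12:
        total_w = sum(weights[i] for i in active)
        # Tentatively give each active user her weighted slice of what is left.
        for i in active:
            alloc[i] += remaining * weights[i] / total_w
        remaining = 0.0
        # Reclaim anything beyond a user's demand; she drops out, satisfied.
        for i in list(active):
            if alloc[i] >= demands[i]:
                remaining += alloc[i] - demands[i]
                alloc[i] = demands[i]
                active.discard(i)
    return alloc

# Three users, one wanting no more than 20% of the CPU, as on the slide:
print(max_min_shares([0.2, 1.0, 1.0]))   # -> [0.2, 0.4, 0.4]
```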

Why Care about Fairness?
Desirable properties of max-min fairness
– Isolation policy: a user gets her fair share irrespective of the demands of other users
– Flexibility: separates mechanism from policy (proportional sharing, priority, reservation, ...)
Many schedulers use max-min fairness
– Datacenters: Hadoop's fair scheduler, capacity scheduler, Quincy
– OS: round-robin, proportional sharing, lottery scheduling, Linux CFS, ...
– Networking: WFQ, WF2Q, SFQ, DRR, CSFQ, ...

Why is Fair Sharing Useful?
Weighted fair sharing / proportional shares
– User 1 gets weight 2, user 2 weight 1
Priorities
– Give user 1 weight 1000, user 2 weight 1
Reservations
– Ensure user 1 gets 10% of a resource
– Give user 1 weight 10, and keep the sum of weights ≤ 100
[Figure: CPU-share bars: a 66%/33% weighted split, and a reservation where user 1 holds 10% alongside 50% and 40% shares]
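Assuming the max_min_shares sketch from the previous slide is in scope, all three policies fall out of the same mechanism purely through weights:

```python
# Proportional shares: user 1 weight 2, user 2 weight 1 -> 66% / 33%
print(max_min_shares([1.0, 1.0], weights=[2, 1]))     # -> [0.667, 0.333]

# Priorities: a huge weight lets user 1 effectively preempt user 2.
print(max_min_shares([1.0, 1.0], weights=[1000, 1]))  # -> [~0.999, ~0.001]

# Reservation: weight 10 with the weight sum capped at 100 -> at least 10%.
print(max_min_shares([1.0, 1.0, 1.0], weights=[10, 50, 40]))  # -> [0.1, 0.5, 0.4]
```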

Heterogeneous Resource Demands
Most tasks need ~<2 CPU, 2 GB RAM>, but:
– Some tasks are memory-intensive
– Some tasks are CPU-intensive
[Figure: per-task CPU and memory demands from a 2000-node Hadoop cluster at Facebook (Oct 2010)]

Problem
Single-resource example
– 1 resource: CPU
– Each user wants a certain number of CPUs per task
– Max-min fairness applies directly: give each user 50% of the CPU
Multi-resource example
– 2 resources: CPUs & memory
– User 1 wants <1 CPU, 4 GB> per task
– User 2 wants <3 CPU, 1 GB> per task
– What is a fair allocation?
[Figure: the single-resource case splits the CPU 50%/50%; the multi-resource split is left as "?"]

Model
Users have tasks according to a demand vector
– E.g., a user's tasks each need <2 R1, 3 R2, 1 R3>
– Not needed in practice; the scheduler can simply measure actual consumption
Resources are given in multiples of demand vectors
Divisible resources

What is Fair?

Desirable Fair Sharing Properties
Many desirable properties:
– Share guarantee
– Strategy-proofness
– Envy-freeness
– Pareto efficiency
– Single-resource fairness
– Bottleneck fairness
– Population monotonicity
– Resource monotonicity
DRF focuses on the first four properties

First Try: Asset Fairness
Asset fairness: equalize each user's sum of resource shares
Cluster with 70 CPUs, 70 GB RAM
– U1 needs <2 CPU, 2 GB> per task
– U2 needs <1 CPU, 2 GB> per task
Asset fairness yields
– U1: 15 tasks: 30 CPUs, 30 GB (∑ = 60)
– U2: 20 tasks: 20 CPUs, 40 GB (∑ = 60)
Problem: U1 has less than 50% of both CPUs and RAM; she would be better off in a separate cluster with half the resources
[Figure: per-resource shares under asset fairness: CPU 43% (U1) vs. 28% (U2); RAM 43% (U1) vs. 57% (U2)]
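A quick check of the slide's arithmetic, as a sketch with hypothetical helper names; asset fairness equalizes each user's sum of per-resource shares:

```python
CPUS, RAM = 70.0, 70.0        # cluster: 70 CPUs, 70 GB RAM
u1_task = (2, 2)              # U1 needs <2 CPU, 2 GB> per task
u2_task = (1, 2)              # U2 needs <1 CPU, 2 GB> per task

def asset_sum(num_tasks, demand):
    """Sum of per-resource shares: the quantity asset fairness equalizes."""
    return num_tasks * demand[0] / CPUS + num_tasks * demand[1] / RAM

print(asset_sum(15, u1_task))  # 30/70 + 30/70 ~ 0.857
print(asset_sum(20, u2_task))  # 20/70 + 40/70 ~ 0.857 -> equalized
# Yet U1 holds only 43% of the CPUs and 43% of the RAM, both below the 50%
# she would get from a private half-size cluster: no share guarantee.
```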

Lessons from Asset Fairness
"You shouldn't do worse than if you ran a smaller, private cluster equal in size to your fair share"
Thus, given N users, each user should get ≥ 1/N of her dominant resource (i.e., the resource she consumes most of)

Cheating the Scheduler
Some users will game the system to get more resources
Real-life examples:
– A cloud provider had quotas on map and reduce slots; some users found that the map quota was low, so they implemented their maps inside reduce slots!
– A search company provided dedicated machines to users who could ensure a certain level of utilization (e.g., 80%); users ran busy-loops to inflate utilization

Two Important Properties
Strategy-proofness
– A user should not be able to increase her allocation by lying about her demand vector
– Intuition: users are incentivized to report their true resource requirements
Envy-freeness
– No user would ever strictly prefer another user's lot in an allocation
– Intuition: no one wants to trade places with any other user

Challenge
A fair sharing policy that provides:
– Strategy-proofness
– Share guarantee
Max-min fairness for a single resource had these properties
– Goal: generalize max-min fairness to multiple resources

Dominant Resource Fairness
A user's dominant resource is the resource of which she has the biggest share
– Example: total resources: <10 CPU, 4 GB>; user 1's allocation: <2 CPU, 1 GB>
– User 1's dominant resource is memory, as 1/4 > 2/10 (1/5)
A user's dominant share is the fraction of her dominant resource that she is allocated
– User 1's dominant share is 25% (1/4)
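These two definitions fit in a few lines. A minimal sketch; the helper name dominant_share is illustrative:

```python
def dominant_share(alloc, total):
    """Largest fraction of any single resource that this user is allocated."""
    return max(a / t for a, t in zip(alloc, total))

total = (10, 4)    # cluster: <10 CPU, 4 GB>
user1 = (2, 1)     # user 1 holds <2 CPU, 1 GB>
print(dominant_share(user1, total))   # 0.25 -> memory, since 1/4 > 2/10
```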

Dominant Resource Fairness (2)
Apply max-min fairness to dominant shares: equalize the dominant share of the users
– Example: total resources: <9 CPU, 18 GB>
– User 1 demand: <1 CPU, 4 GB>; dominant resource: memory
– User 2 demand: <3 CPU, 1 GB>; dominant resource: CPU
– DRF gives user 1 three tasks (3 CPUs, 12 GB) and user 2 two tasks (6 CPUs, 2 GB), so each reaches a 66% dominant share (see the sketch below)
[Figure: per-resource shares of the 9 CPUs and 18 GB under DRF, with both users at a 66% dominant share]
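The allocation above can be produced by progressive filling over dominant shares, in the spirit of Algorithm 1 in the DRF paper. A sketch under that reading; the stopping rule here (stop as soon as the chosen user's next task no longer fits) is a simplification:

```python
def dominant_share(alloc, total):           # as on the previous slide
    return max(a / t for a, t in zip(alloc, total))

def drf_allocate(total, demands):
    """Repeatedly launch one task for the user with the lowest dominant share."""
    n, m = len(demands), len(total)
    consumed = [0.0] * m                     # cluster-wide usage per resource
    alloc = [[0.0] * m for _ in range(n)]    # per-user allocation
    while True:
        u = min(range(n), key=lambda i: dominant_share(alloc[i], total))
        if any(consumed[r] + demands[u][r] > total[r] for r in range(m)):
            return alloc                     # chosen user's task no longer fits
        for r in range(m):
            consumed[r] += demands[u][r]
            alloc[u][r] += demands[u][r]

total = (9, 18)                        # 9 CPUs, 18 GB RAM
demands = [(1, 4), (3, 1)]             # user 1: <1 CPU, 4 GB>; user 2: <3 CPU, 1 GB>
print(drf_allocate(total, demands))    # [[3.0, 12.0], [6.0, 2.0]] -> both at 66%
```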

DRF is Fair
– DRF is strategy-proof
– DRF satisfies the share guarantee
– DRF allocations are envy-free
See the DRF paper for proofs

Properties of Policies
Comparison of asset fairness, CEEI (Competitive Equilibrium from Equal Incomes), and DRF:
Property                   Asset   CEEI   DRF
Share guarantee                     ✔      ✔
Strategy-proofness           ✔             ✔
Pareto efficiency            ✔      ✔      ✔
Envy-freeness                ✔      ✔      ✔
Single resource fairness     ✔      ✔      ✔
Bottleneck res. fairness            ✔      ✔
Population monotonicity      ✔             ✔
Resource monotonicity

DRF Inside Mesos on EC2
[Figure: user 1's and user 2's per-resource shares and their dominant shares over time in a Mesos cluster on EC2]

Fairness in Today's Datacenters
Hadoop Fair Scheduler / Capacity Scheduler / Quincy
– Each machine consists of k slots (e.g., k = 14)
– Run at most one task per slot
– Give jobs "equal" numbers of slots, i.e., apply max-min fairness to slot counts
This is what the DRF paper compares against (a toy contrast appears below)
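A toy illustration, with an assumed machine size (not Hadoop's actual code), of why counting slots ignores actual demands: a slot is a fixed 1/k slice of every resource, so a memory-hungry task overcommits RAM while leaving CPU stranded:

```python
MACHINE = {"cpu": 16.0, "ram_gb": 64.0}          # assumed machine dimensions
K = 14                                           # k slots per machine
SLOT = {r: v / K for r, v in MACHINE.items()}    # one slot ~ <1.14 CPU, 4.57 GB>

task = {"cpu": 1.0, "ram_gb": 10.0}              # a memory-intensive task
for r in SLOT:
    gap = task[r] - SLOT[r]
    print(r, "over-committed by" if gap > 0 else "stranded by", round(abs(gap), 2))
```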

Utilization of DRF vs. Slots
Simulation of the Facebook workload
[Figure: cluster utilization over time under DRF vs. slot-based fair sharing]

Follow-ups & Adoption
Academia:
– Many papers in both CS and economics (330 citations since 2011)
– DRFQ: extends DRF to packet processing
– Choosy: DRF with placement constraints
– Hierarchical scheduling for DRF
Industry:
– Mesos
– YARN's fair scheduler supports multiple resources via DRF

Why Doesn't Google Use DRF?
From the Borg paper: "Quota allocation is handled outside of Borg, and is intimately tied to our physical capacity planning, whose results are reflected in the price and availability of quota in different datacenters… The use of quota reduces the need for policies like DRF."

Efficiency-Fairness Trade-off
DRF can leave resources under-utilized
DRF schedules at the level of tasks, which can lead to sub-optimal job completion times
Fairness is fundamentally at odds with overall efficiency; how should we trade them off?
[Figure: the earlier <9 CPU, 18 GB> example, where both users sit at a 66% dominant share while some resources go unused]

Others
Does Pareto efficiency still hold in the dynamic case?
Is it really that easy to determine the demand vector?
– E.g., do all Spark tasks specify their memory demand?
DRF assumes Leontief utility functions
Does it apply to network bandwidth?