Long-Term Resource Fairness: Towards Economic Fairness on Pay-as-you-use Computing Systems. Shanjiang Tang, Bu-Sung Lee, Bingsheng He, Haikun Liu, School of Computer Engineering, Nanyang Technological University.

Presentation transcript:

Long-Term Resource Fairness: Towards Economic Fairness on Pay-as-you-use Computing Systems
Shanjiang Tang, Bu-Sung Lee, Bingsheng He, Haikun Liu
School of Computer Engineering, Nanyang Technological University

Pay-As-You-Use is Pervasive
Charge users based on the amount of resources used over time (e.g., hourly).
Advantages – Elasticity – Flexibility – Cost efficiency
Pay-as-you-use is becoming common and popular. – Supercomputing, Cloud Computing

Twitter’s Cluster
One week of data from a Twitter production cluster [Delimitrou et al., ASPLOS'14].
[Figure: resource utilization of the one-week Twitter trace.]
User resource demands are heterogeneous. – Users have different demands. – A user’s demand changes over time. → Static provisioning/partitioning causes underutilization.
Resource utilization is a critical problem in such pay-as-you-use environments. – Providers waste resources (wasted investment and lost profit). – Users waste money.

To Share or Not To Share?
Resource sharing can improve resource utilization. – Allow underloaded users to release resources to other users. – Allow overloaded users to temporarily use more resources (from others). → Reduces idle resources at runtime. → Resolves resource contention across users.
What about fairness? – If fairness is not addressed, resource sharing is unlikely to be adopted in pay-as-you-use environments.

Pay-as-you-use Fairness: Resource-as-you-pay
The total resource service a user gains should be proportional to her payment. This is a Service-Level Agreement (SLA). – Example: if A pays $60 and B pays $40, A should receive 60% and B 40% of the resource service. – Resource service = resources-per-time × service time.
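Illustration (not from the slides): a minimal sketch of the two quantities above, payment-proportional entitlements and resource service as resources-per-time multiplied by service time. The function names and the GB-hour unit are my own choices.

```python
def entitled_shares(payments):
    """Payment-proportional entitlements: share_i = payment_i / total payment."""
    total = sum(payments.values())
    return {user: pay / total for user, pay in payments.items()}

def resource_service(resources_per_time, service_time):
    """Resource service = resources-per-time x service time
    (e.g., 4 GB held for 3 hours = 12 GB-hours)."""
    return resources_per_time * service_time

if __name__ == "__main__":
    print(entitled_shares({"A": 60, "B": 40}))   # {'A': 0.6, 'B': 0.4}
    print(resource_service(4, 3))                # 12 (GB-hours)
```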

Example: Amazon EC2 Reserved Instances – Pay a one-time fee for a long term in advance. – Get a discount on the hourly charge compared with on-demand pricing. E.g., m3.xlarge: on-demand $0.28/h, reserved instance $0.06/h.
Table: 3-year RI percentage saving over on-demand [data from AWS]
Annual Utilization | Medium Utilization RI | Heavy Utilization RI
20%  | -8%  | -99%
40%  | 30%  | 1%
60%  | 43%  | 34%
80%  | 50%  |
100% | 54%  | 60%
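To show why the saving depends on annual utilization, here is a rough sketch of the cost comparison. Only the hourly rates ($0.28 on-demand, $0.06 reserved) come from the slide; the upfront reservation fee below is a hypothetical placeholder, so the printed percentages will not reproduce the table, only its shape: negative savings at low utilization, growing savings at high utilization.

```python
# Sketch of RI-vs-on-demand savings over a 3-year term (hypothetical upfront fee).
HOURS_3Y = 3 * 365 * 24

def saving_percent(utilization, on_demand_rate=0.28, ri_rate=0.06, upfront=2000.0):
    """Percentage saving of a 3-year reserved instance over on-demand usage.
    This models a medium-utilization-style RI (upfront fee + discounted rate for
    hours actually used); a heavy-utilization RI is billed for every hour of the
    term, which is why the table shows a large negative 'saving' at 20%."""
    hours_used = utilization * HOURS_3Y
    on_demand_cost = on_demand_rate * hours_used
    ri_cost = upfront + ri_rate * hours_used
    return 100.0 * (on_demand_cost - ri_cost) / on_demand_cost

for u in (0.2, 0.4, 0.6, 0.8, 1.0):
    print(f"{u:.0%} utilization -> {saving_percent(u):6.1f}% saving vs on-demand")
```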

Desirable Properties in Pay-as-you-use Computing
Resource-as-you-pay fairness guarantee.
Non-trivial workload incentive – Users should submit non-trivial workloads. – Users should be willing to yield resources to others when they have no need. – Improves cost efficiency.
Truthfulness – Users cannot gain by cheating. – Users are encouraged to be honest with each other.

Fair Policy in Existing Systems
State of the art: max-min fairness – Each time, select the user with the minimum allocation/share ratio. – Only the present demand is considered (memoryless).
Memoryless fairness has severe problems in pay-as-you-use environments, violating the following properties: – Resource-as-you-pay fairness guarantee. – Non-trivial workload incentive and sharing incentive. – Truthfulness (users may benefit by cheating).
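A minimal sketch of the memoryless selection rule described above (not the authors' code): one resource unit at a time goes to the demanding user with the smallest allocation-to-share ratio.

```python
def maxmin_allocate(demands, shares, capacity, unit=1):
    """Memoryless max-min: repeatedly give `unit` resources to the demanding
    user with the smallest allocation/share ratio, until capacity or demands
    run out. `shares` are the users' (payment-based) entitlements."""
    alloc = {u: 0 for u in demands}
    remaining = capacity
    while remaining > 0:
        pending = [u for u in demands if alloc[u] < demands[u]]
        if not pending:
            break
        u = min(pending, key=lambda x: alloc[x] / shares[x])
        grant = min(unit, demands[u] - alloc[u], remaining)
        alloc[u] += grant
        remaining -= grant
    return alloc

# e.g., equal shares, 100-unit cluster:
print(maxmin_allocate({"A": 33, "B": 21, "C": 80}, {"A": 1, "B": 1, "C": 1}, 100))
# -> {'A': 33, 'B': 21, 'C': 46}
```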

Problems with Memoryless Fairness
Resource-as-you-pay fairness problem – Example: A and B equally pay for a total resource of 100 units.
[Figure sequence: for each time step t1 through t4, the slides show the users' new demands, the current allocation, the accumulated resource usage, and the unsatisfied demand under memoryless max-min fairness.]
Because each step only considers the current demands, the users' accumulated resource usage drifts away from their 50/50 payment shares.
The existing fair policy fails to satisfy resource-as-you-pay fairness!

Memoryless Fairness Violates Sharing Incentives
Non-trivial workload and sharing incentive problem – Under memoryless fairness, yielding resources to others brings no benefit. – Suppose A, B, and C equally pay for a total resource of 100 units, and A has 13 idle resource units. In that case, A can be selfish: leave the units idle or run trivial workloads on them.
[Figure: CPU allocation of A, B, and C, highlighting A's idle resources.]

A Cheating User Benefits under Memoryless Fairness
Truthfulness problem – Suppose A, B, and C equally pay for a cluster of 100 units, with true demands of 33, 21, and 80, respectively. – Case 1: all users are honest. – Case 2: user A cheats and claims a demand larger than its true demand.
[Figure: allocations in Case 1 (A is honest) vs. Case 2 (A is cheating); A's cheating increases its allocation.]
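A small, self-contained sketch of why inflating a claim pays off under memoryless max-min. The true demands 33/21/80 and the 100-unit cluster come from the slide; the inflated claim of 50 is a hypothetical value, since the number shown in the slide's figure is not preserved in this transcript.

```python
def waterfill(demands, capacity):
    """Max-min (water-filling) allocation with equal shares."""
    alloc = dict.fromkeys(demands, 0.0)
    active = set(demands)
    remaining = float(capacity)
    while active and remaining > 1e-9:
        level = remaining / len(active)            # equal split of what is left
        for u in list(active):
            take = min(level, demands[u] - alloc[u])
            alloc[u] += take
            remaining -= take
            if alloc[u] >= demands[u] - 1e-9:      # demand satisfied
                active.remove(u)
    return alloc

honest   = waterfill({"A": 33, "B": 21, "C": 80}, 100)   # A claims its true demand
cheating = waterfill({"A": 50, "B": 21, "C": 80}, 100)   # A inflates its claim (hypothetical value)
print(honest["A"], cheating["A"])   # 33.0 vs ~39.5: A is allocated more by lying
```

Because memoryless fairness never charges this extra allocation back later, the cheater keeps the gain; this is the incentive that long-term fairness removes.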

Our Work
Challenge: can we find a fair sharing policy that satisfies the following properties? – Resource-as-you-pay fairness – Non-trivial workload and sharing incentives – Truthfulness
Our solution: Long-Term Resource Fairness – Ensures resource fairness over a period of time. – Takes historical resource usage into account.

Long-Term Resource Fairness
Basic concept: a loan agreement (lending without interest) – When resources are not needed, a user can lend them to others. – When the user later needs more resources, the others should give them back. → Benefits both the others and the user herself.
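As a rough sketch of this idea (not LTYARN's actual algorithm): rank users by accumulated usage normalized by payment share, so a user who lent resources earlier is repaid first. The helper names below are invented.

```python
def longterm_pick(history, shares, pending):
    """Among users with pending demand, choose the one whose accumulated usage
    is smallest relative to its payment share (the biggest lender so far)."""
    return min(pending, key=lambda u: history[u] / shares[u])

def longterm_allocate(demands, shares, history, capacity, unit=1):
    """Long-term variant of max-min: like memoryless max-min, but ranked by
    accumulated usage, so users who lent resources earlier are repaid first."""
    alloc = {u: 0 for u in demands}
    remaining = capacity
    while remaining > 0:
        pending = [u for u in demands if alloc[u] < demands[u]]
        if not pending:
            break
        u = longterm_pick({v: history[v] + alloc[v] for v in demands}, shares, pending)
        grant = min(unit, demands[u] - alloc[u], remaining)
        alloc[u] += grant
        remaining -= grant
    return alloc

# A lent resources earlier (lower accumulated usage), so A is repaid first now:
print(longterm_allocate({"A": 80, "B": 80}, {"A": 1, "B": 1},
                        {"A": 20, "B": 80}, capacity=100))
# -> {'A': 80, 'B': 20}: accumulated totals end at 100 each, matching equal payments
```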

Long-Term Resource Fairness Satisfies Pay-as-you-use Fairness
Same example: A and B equally pay for a total resource of 100 units.
[Figure sequence: for each time step t1 through t4, the slides show the users' new demands, the current allocation, the resources lent between the users, the accumulated resource usage, and the unsatisfied demand under long-term resource fairness.]
A user who lends resources at one step is paid back at later steps, so the accumulated usage tracks the 50/50 payment shares.
Long-Term Resource Fairness satisfies resource-as-you-pay fairness.
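To make the qualitative difference concrete, here is a small simulation under a hypothetical four-step demand trace (the slides' actual per-step numbers were lost with the figures): A and B pay equally for 100 units per step, A under-uses the first step, and both are busy afterwards.

```python
# Hypothetical demand trace; only the 100-unit, equal-payment setup is from the slide.
demands_per_step = [{"A": 20, "B": 100}, {"A": 100, "B": 100},
                    {"A": 100, "B": 100}, {"A": 100, "B": 100}]

def allocate(demands, capacity, rank):
    """Grant one unit at a time to the pending user with the smallest rank()."""
    alloc = {u: 0 for u in demands}
    left = capacity
    while left > 0:
        pending = [u for u in demands if alloc[u] < demands[u]]
        if not pending:
            break
        u = min(pending, key=lambda x: rank(x, alloc))
        alloc[u] += 1
        left -= 1
    return alloc

for policy in ("memoryless", "long-term"):
    used = {"A": 0, "B": 0}                                  # accumulated usage
    for step in demands_per_step:
        rank = (lambda u, a: a[u]) if policy == "memoryless" \
               else (lambda u, a: used[u] + a[u])            # history-aware rank
        alloc = allocate(step, 100, rank)
        for u in used:
            used[u] += alloc[u]
    print(policy, used)
# memoryless ends {'A': 170, 'B': 230}: A's early lending is forgotten.
# long-term  ends {'A': 200, 'B': 200}: A is repaid, matching the 50/50 payments.
```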

Other Properties of Long-Term Resource Fairness
Satisfies the non-trivial workload and sharing incentives – Running trivial workloads wastes money. – Not sharing idle resources wastes money.
Users cannot gain by lying (strategy-proofness).
Proof sketches are in the paper.

LTYARN
Implements Long-Term Resource Fairness in YARN – Extends memoryless max-min fairness to long-term max-min fairness. – Adds a few components to the resource manager.
Supports both full long-term and time-window-based fairness requirements.
Currently supports a single resource type (main memory).

LTYARN Design
Quantum Updater (QU) – Estimates task execution time. – Updates the resource usage history periodically.
Resource Controller (RC) – Manages and updates the resources of each queue.
Resource Allocator (RA) – Performs long-term resource allocation. – Runs when there are pending tasks and idle resources.
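The slide only names the three components; purely as an illustrative skeleton of how they could fit together (all class, method, and field names below are invented, not LTYARN's real code):

```python
from collections import namedtuple

Task = namedtuple("Task", "queue memory_mb elapsed_seconds")   # hypothetical task shape

class QuantumUpdater:
    """Estimates task runtimes and periodically folds completed work into the
    per-queue resource-usage history (the 'memory' of long-term fairness)."""
    def __init__(self, history):
        self.history = history                       # queue -> accumulated MB*s

    def on_tick(self, running_tasks):
        for t in running_tasks:
            self.history[t.queue] += t.memory_mb * t.elapsed_seconds

class ResourceController:
    """Tracks each queue's (payment-based) share and its pending demand."""
    def __init__(self, shares):
        self.shares = shares                         # queue -> weight
        self.pending = {q: 0 for q in shares}        # queue -> pending demand (MB)

class ResourceAllocator:
    """Runs when a node reports idle memory and some queue has pending tasks;
    offers the memory to the queue with the smallest history/share ratio."""
    def __init__(self, qu, rc):
        self.qu, self.rc = qu, rc

    def allocate(self, idle_mb):
        candidates = [q for q, d in self.rc.pending.items() if d > 0]
        if not candidates or idle_mb <= 0:
            return None
        q = min(candidates, key=lambda q: self.qu.history[q] / self.rc.shares[q])
        grant = min(idle_mb, self.rc.pending[q])
        self.rc.pending[q] -= grant
        return q, grant

# Tiny demo: the queue that has used less relative to its share is served first.
qu = QuantumUpdater({"fb": 10_000, "tpch": 50_000})
rc = ResourceController({"fb": 1, "tpch": 1}); rc.pending.update({"fb": 4096, "tpch": 4096})
print(ResourceAllocator(qu, rc).allocate(2048))   # ('fb', 2048)
```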

Evaluation
A Hadoop cluster – 10 nodes, each with two Intel X5675 CPUs (6 cores per CPU at 3.07 GHz), 24 GB DDR3 memory, and 56 GB hard disks. – YARN 2.2.0, configured with 24 GB of memory per node.
Macro-benchmarks – Synthetic Facebook workload – Purdue workload – Hive/TPC-H – Spark
Detailed setups are in the paper.

Metrics
Evaluation metrics – Fairness degree for each user (> 1 means a sharing benefit; < 1 means a sharing loss). – Resource-as-you-pay fairness. – Application performance.
Benchmark scenario – The four macro-benchmarks equally share the cluster. – Each benchmark runs in a separate queue. – Window size = 1 day.
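The slides do not give the exact formula for the fairness/sharing degree (it is defined in the paper); read loosely as "> 1 means benefit, < 1 means loss", it can be sketched as a simple with-sharing over without-sharing ratio:

```python
def sharing_degree(with_sharing, without_sharing):
    """Rough reading of the slide's metric: ratio of what a user obtains in the
    shared cluster vs. in its static partition. > 1 means the user benefits from
    sharing; < 1 means it loses. (The exact definition is in the paper.)"""
    return with_sharing / without_sharing

# e.g., 120 task-hours of work completed with sharing vs. 100 without -> degree 1.2
print(sharing_degree(120, 100))   # 1.2
```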

Sharing Benefit/Loss
LTYARN enables sharing benefits. – Sharing benefit degree: the degree of benefit under the shared cluster relative to the non-sharing case. – Sharing loss degree: the degree of loss under the shared cluster relative to the non-sharing case.
[Figure: sharing benefit/loss degrees under (a) YARN and (b) LTYARN.]

Sharing Benefit/Loss
LTYARN enables sharing benefits for all applications.
[Figure: sharing benefit/loss degrees under (a) YARN and (b) LTYARN.]

Resource-as-you-pay Fairness Results
LTYARN achieves resource-as-you-pay fairness.

Performance Results
Sharing always achieves better performance.
Long-term fairness is comparable to memoryless fairness (max-min).

Conclusions
Max-min resource fairness is memoryless and unsuitable for pay-as-you-use computing.
We define long-term resource fairness, which satisfies the desirable properties.
We develop LTYARN by integrating long-term resource fairness into YARN. – Homepage:

Future Work
Implement Long-Term Resource Fairness in other systems/schedulers. – Mesos, Quincy, Choosy, etc.
Extend Long-Term Resource Fairness to multiple resources. – CPU, memory, network I/O, etc.

We are Hosting IEEE CloudCom 2014 in Singapore
Deadline for paper submissions: July 31, 2014
Notification of paper acceptance: September 2, 2014
Conference: December 15-18, 2014

Thanks! Questions?