Achieving Elasticity for Cloud MapReduce Jobs Khaled Salah IEEE CloudNet 2013 – San Francisco November 13, 2013.

Presentation transcript:

p2 Outline
- Background and motivation
- Use cases of our analytical model
- Analytical model
- Derived performance metrics
- Numerical results
- Conclusions and future work

p3 Background and Motivation
- MapReduce (MR) is a popular paradigm for parallelizing large-scale data processing on cloud clusters.
- The MR paradigm is a key enabler for Big Data analytics.
- MR jobs include, e.g., web search engine requests.
- In cloud computing, a critical research problem is how to achieve elasticity for MR jobs as workload conditions change over time.

p4 Elasticity
- Elasticity is how fast the cloud responds (or autoscales) to a given workload to reach perfect capacity.
  - Overprovisioning: D(t) < R(t)
  - Underprovisioning: D(t) > R(t)
  - Perfect provisioning: D(t) = R(t)
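The three provisioning cases can be written as a small comparison. This is a minimal sketch that follows the inequalities exactly as the slide states them. The slide does not define D(t) and R(t), so the argument names `d` and `r`, the function name, and the optional tolerance `tol` are assumptions for illustration only.

```python
def provisioning_state(d: float, r: float, tol: float = 0.0) -> str:
    """Classify one time instant using the slide's inequalities.

    d and r are the values of D(t) and R(t) at that instant (hypothetical
    argument names). tol is a hypothetical dead band so that tiny
    differences still count as perfect provisioning.
    """
    if d < r - tol:
        return "overprovisioned"   # D(t) < R(t)
    if d > r + tol:
        return "underprovisioned"  # D(t) > R(t)
    return "perfect"               # D(t) = R(t), within tol


if __name__ == "__main__":
    for d, r in [(3, 5), (7, 5), (5, 5)]:
        print(d, r, provisioning_state(d, r))
```

An autoscaler could evaluate this at each monitoring interval and scale only when the state stays off "perfect" for several consecutive intervals.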

p5 MapReduce Jobs

p6 Usefulness of our model (1/2)
- In elasticity and autoscaling: given workload conditions, we can estimate the number of VMs required to meet the SLO delay requirements
  - And not by trial and error
  - CPU utilization can be misleading
- Determine the number of slave nodes required to execute MR jobs

p7 Usefulness of our model (2/2)
- In call admission
  - To accept or deny cloud requests based on meeting the SLO delay
  - Available compute resources are not enough
- Estimating the end-to-end delay for elastic MR jobs

p8 Typical Cloud Datacenter Architecture

p9 M/G/1/K Queueing Model

p10 M/G/1/K

p11 Analysis Approach
- The challenge in analyzing such a queueing system is to compute [expression missing] or the PDF of the generally distributed random variable X representing the service times
- The mean service time E[X]: [equation missing]
- Then, the second stage random service time B for these N parallel workers can be expressed as [equation missing]
- E[B] can be expressed as [equation missing]

p12 Analysis Approach
- For the Reducer stage, [equation missing]
- E[R] can be expressed as [equation missing]
- Therefore, the mean service time E[X]: [equation missing]
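The slide equations for the parallel stages are not preserved here, so the following is only an illustrative sketch, not the author's derivation. It assumes that a stage with N parallel workers finishes when its slowest worker finishes, and that worker times are independent and exponentially distributed with rate `rate`. Under those assumptions the mean stage time is H_N / rate, where H_N is the N-th harmonic number. All function names are hypothetical.

```python
import random


def harmonic(n: int) -> float:
    """Return the n-th harmonic number H_n = 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / k for k in range(1, n + 1))


def expected_parallel_stage_time(n_workers: int, rate: float) -> float:
    """Mean of the maximum of n_workers i.i.d. exponential(rate) times.

    Assumption: the stage ends when the slowest worker ends.
    """
    return harmonic(n_workers) / rate


def simulate_parallel_stage_time(n_workers: int, rate: float,
                                 trials: int = 20000, seed: int = 1) -> float:
    """Monte Carlo estimate of the same mean, used as a sanity check."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += max(rng.expovariate(rate) for _ in range(n_workers))
    return total / trials


if __name__ == "__main__":
    n, mu = 4, 2.0
    print("closed form:", expected_parallel_stage_time(n, mu))
    print("simulated:  ", simulate_parallel_stage_time(n, mu))
```

Because H_N grows roughly like ln N, the slowest of N workers takes noticeably longer than a single worker's mean time 1/rate, which is one reason a per-stage model is needed instead of simply dividing the work by N.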

p13 Performance
- Given:
  - Incoming load
  - JS and service rates for each mapper & reducer
  - Queue size
- Formulas for:
  - Response time
  - Throughput
  - Loss probability
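To make the three metrics concrete, here is a hedged sketch. It swaps in the simpler M/M/1/K queue (exponential service times) for the M/G/1/K model named in the slides, because the deck's closed-form expressions are not preserved in this transcript. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class QueueMetrics:
    loss_probability: float  # probability an arriving request is dropped
    throughput: float        # accepted requests per unit time
    response_time: float     # mean time in system for accepted requests


def mm1k_metrics(arrival_rate: float, service_rate: float, k: int) -> QueueMetrics:
    """Steady-state metrics of an M/M/1/K queue (K = max requests in system)."""
    if arrival_rate <= 0 or service_rate <= 0 or k < 1:
        raise ValueError("rates must be positive and k must be at least 1")
    rho = arrival_rate / service_rate
    # State probabilities are proportional to rho**n. For rho > 1 the
    # weights are rescaled by rho**k to avoid floating-point overflow.
    if rho <= 1.0:
        weights = [rho ** n for n in range(k + 1)]
    else:
        weights = [(1.0 / rho) ** (k - n) for n in range(k + 1)]
    total = sum(weights)
    probs = [w / total for w in weights]
    loss = probs[k]                          # system full on arrival
    throughput = arrival_rate * (1.0 - loss)
    mean_in_system = sum(n * p for n, p in enumerate(probs))
    response = mean_in_system / throughput   # Little's law
    return QueueMetrics(loss, throughput, response)


if __name__ == "__main__":
    print(mm1k_metrics(arrival_rate=8.0, service_rate=10.0, k=100))
```

The inputs mirror the "Given" list above: incoming load, a service rate, and the queue size. Replacing the exponential assumption with the slides' general service-time distribution would change the state probabilities but not the way the three metrics are derived from them.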

p14 Numerical Example
- We fix the system size K to 100 requests. We fix [value missing]
- [symbol missing] depends on two factors: (1) m, the number of mappers per node, and (2) the execution speed of each node.
  - If we assume a mapper takes 500 ms to be executed on a single node, and with homogeneous splitting, then [value missing] ms.

p15 Numerical Example
- Similarly, [symbol missing] depends on two factors: (1) n, the number of reducers per node, and (2) the execution speed of each node.
  - If we assume a reducer takes 100 ms to be executed on a single node, and with homogeneous splitting, then [value missing] ms.
- For autoscaling, we assume that the mappers and reducers always autoscale with a ratio of 2:1. That is, one reducer is needed for two mappers, or m = 2n.
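The autoscaling use case can be sketched as a search for the smallest configuration that meets an SLO delay. This is an illustration under loud assumptions, not the author's model. The 500 ms and 100 ms single-node times are taken here as the map-stage and reduce-stage times. Homogeneous splitting is assumed to divide each time evenly across workers. The mean service time is assumed to be the sum of the two stage times. The queue is approximated as M/M/1/K with K = 100. All function names and the arrival rate and SLO in the usage lines are hypothetical.

```python
from typing import Optional


def mean_service_time_ms(n_reducers: int, map_ms: float = 500.0,
                         reduce_ms: float = 100.0, ratio: int = 2) -> float:
    """Assumed mean job service time with `ratio` mappers per reducer."""
    m_mappers = ratio * n_reducers
    return map_ms / m_mappers + reduce_ms / n_reducers


def mm1k_response_time_ms(arrival_per_s: float, service_ms: float,
                          k: int = 100) -> float:
    """Mean response time of an M/M/1/K queue (a stand-in for M/G/1/K)."""
    rho = arrival_per_s * service_ms / 1000.0
    # Weights proportional to rho**n, rescaled when rho > 1 to avoid overflow.
    if rho <= 1.0:
        weights = [rho ** n for n in range(k + 1)]
    else:
        weights = [(1.0 / rho) ** (k - n) for n in range(k + 1)]
    total = sum(weights)
    probs = [w / total for w in weights]
    throughput = arrival_per_s * (1.0 - probs[k])
    mean_in_system = sum(n * p for n, p in enumerate(probs))
    return 1000.0 * mean_in_system / throughput  # Little's law, in ms


def min_reducers_for_slo(arrival_per_s: float, slo_ms: float, k: int = 100,
                         max_reducers: int = 1000) -> Optional[int]:
    """Smallest reducer count (mappers = 2x) meeting the SLO, or None."""
    for n in range(1, max_reducers + 1):
        service = mean_service_time_ms(n)
        if mm1k_response_time_ms(arrival_per_s, service, k) <= slo_ms:
            return n
    return None


if __name__ == "__main__":
    n = min_reducers_for_slo(arrival_per_s=10.0, slo_ms=150.0)
    print("reducers:", n, "mappers:", None if n is None else 2 * n)
```

A linear scan is used because the assumed response time decreases as workers are added. The same loop works with any other delay model that takes a worker count and returns a mean delay.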

p16 Service Delay vs. Workload

p17

p18

p19

p20 Concluding Remarks
- We presented an analytical model to estimate the minimum number of cloud resources required for executing MapReduce jobs on the cloud.
- Closed-form solutions were derived for key SLO performance metrics such as response time, blocking probability, and throughput.
- Simulation results confirm the correctness of our analytical model.
- Future work will focus on implementation.

p21 Thank you!

p22 Q&A