University of Minnesota

Optimizing MapReduce Provisioning in the Cloud
Michael Cardosa, Aameek Singh†, Himabindu Pucha†, Abhishek Chandra
Department of Computer Science, University of Minnesota
† IBM Almaden Research Center

MapReduce Provisioning Problem
- Platform: virtualized cloud environment, which enables virtualized MapReduce clusters
- Several MapReduce jobs from different users
- Goal: optimize system-wide metrics such as throughput, energy, load distribution, and user costs
- Problem: at the cloud service provider level, how can we harvest opportunities to increase performance, save energy, or reduce user costs?

MapReduce Platform: Hadoop
- Open-source implementation of the MapReduce distributed computing framework
- Used widely: Yahoo, Facebook, NYT, (Google)
[Figure label: Input Data]

Hadoop Clusters
- Distributed data: replicated chunks
- Distributed computation: map/reduce tasks
- Traditional: dedicated physical nodes

Virtual Hadoop Clusters
- Run Hadoop on top of VMs
- E.g., Amazon Elastic MapReduce = Hadoop + Amazon EC2
[Figure labels: Server Pool, VM Pool, Hadoop Processes]

Roadmap
- Intro & Problem
- Platform Overview
- Spatio-Temporal Insights for Provisioning
- Building Blocks for MapReduce Provisioning
- Case Study: Performance optimization
- Case Study: Energy optimization

Spatio-Temporal Insights for Provisioning
- Initial focus: energy savings
- Goal: minimize energy usage
- Energy + cooling ~ 42% of total cost [Hamilton08]
- Problem: how to place the VMs on available physical servers to minimize energy usage?
- Minimize Cumulative Machine Uptime (CMU)

VM Placement: Spatial Fit
- Co-place complementary workloads
[Figure labels: Job 1, Job 2, Job 3, Job 4]

Which placement is better?
[Figure: placements A and B of VMs with runtimes of 20 min, 10 min, 100 min, and 20 min; label: SHUTDOWN]

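The comparison between placements can be made concrete with a small sketch of the CMU metric. It assumes that all VMs on a server start together and that the server can be shut down as soon as its longest-running VM finishes; the function name and the runtimes are hypothetical and are not the values from the slide's figure.

```python
def cumulative_machine_uptime(placement):
    """Sum, over all servers, of the time each server must stay powered on.

    `placement` is a list of servers; each server is a list of VM runtimes
    in minutes. Assumption: a server stays up until its longest VM finishes.
    """
    return sum(max(runtimes) for runtimes in placement if runtimes)


# Hypothetical runtimes in minutes, two VM slots per server.
mixed = [[100, 10], [100, 10]]    # long and short VMs share a server
grouped = [[100, 100], [10, 10]]  # similar runtimes are placed together

print(cumulative_machine_uptime(mixed))    # 200
print(cumulative_machine_uptime(grouped))  # 110
```

Under these assumptions, grouping VMs with similar runtimes lets the second server shut down after 10 minutes instead of idling alongside a long-running VM.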
Time Balancing
[Figure label: Time Balance]

Building Blocks for Provisioning
- Objective-driven resource provisioning
- MapReduce jobs
- Job profiling, cluster scaling, migration
- Cloud execution environment
- Initial provisioning, continuous optimization

Building Blocks for Provisioning
- Job Profiling: MapReduce job runtime estimation
  - Based on number of VMs allocated to the job
  - Based on input data size
  - Offline and online profiling
- Cluster Scaling: changing the number of VMs allocated to a particular MapReduce job
  - Affects runtime of the job; relies on the Job Profiling model
- Migration: useful for continuous optimization
  - Load balancing, VM consolidation

Job Profiling: Runtime Estimation
- Based on number of VMs

Job Profiling: Runtime Estimation
- Based on input data size

Job Profiling: Runtime Estimation
- Online profiling: additional refinement

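As an illustration of offline job profiling, the sketch below fits a runtime model from past runs and uses it to estimate the runtime of a new configuration. The model form, runtime = a + b * (input size / number of VMs), is an assumption chosen for simplicity and is not taken from the slides; the function names and profile data are hypothetical.

```python
def fit_runtime_model(samples):
    """Least-squares fit of runtime = a + b * (input_gb / num_vms).

    `samples` is a list of (num_vms, input_gb, runtime_min) tuples from
    past runs. The linear model form is an illustrative assumption.
    """
    xs = [gb / vms for vms, gb, _ in samples]
    ys = [runtime for _, _, runtime in samples]
    n = len(samples)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx            # minutes per (GB per VM)
    a = mean_y - b * mean_x  # fixed overhead in minutes
    return a, b


def estimate_runtime(model, num_vms, input_gb):
    """Predict the runtime in minutes for a given VM count and input size."""
    a, b = model
    return a + b * input_gb / num_vms


# Hypothetical offline profile: (num_vms, input_gb, runtime_min).
profile = [(2, 10, 27.0), (4, 10, 14.5), (8, 10, 8.25), (4, 20, 27.0)]
model = fit_runtime_model(profile)
print(round(estimate_runtime(model, 10, 40), 2))  # 22.0
```

Online profiling could refine such a model by refitting as progress measurements arrive from a running job.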
Cluster Scaling
- Increasing allocated resources (typical): add VMs to the virtualized Hadoop cluster
- Job performance increases, runtime decreases
- E.g., for time balancing: energy reasons
- E.g., for load balancing and deadlines: performance

Cluster Scaling: Time Balancing
[Figure label: Time Balance]

Case Study: Performance & Deadlines
- Goal: meet deadlines for MapReduce jobs
  - Determine initial allocation accurately
  - Dynamically adjust allocation to meet the deadline if necessary
- Monitoring: use offline profiling to estimate the number of VMs needed based on past performance
- Actuation: online profiling provides trigger points to invoke cluster scaling

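The deadline case study can be sketched in two steps: an initial allocation from an offline estimate, and a trigger that rescales the cluster when observed progress projects a missed deadline. The sketch assumes a runtime model of fixed overhead plus perfectly divisible work, and linear scaling when VMs are added; both are simplifying assumptions, and the function names and numbers are hypothetical.

```python
import math


def vms_needed(fixed_min, work_vm_min, deadline_min):
    """Smallest VM count whose estimated runtime meets the deadline.

    Assumed model: runtime(n) = fixed_min + work_vm_min / n.
    """
    if deadline_min <= fixed_min:
        raise ValueError("deadline is shorter than the fixed overhead")
    return math.ceil(work_vm_min / (deadline_min - fixed_min))


def rescale_if_late(elapsed_min, progress, current_vms, deadline_min):
    """Return the VM count needed to finish by the deadline.

    `progress` is the fraction of the job completed so far (0 < progress <= 1).
    Assumption: remaining work scales linearly with the number of VMs.
    """
    time_left = deadline_min - elapsed_min
    if progress <= 0 or time_left <= 0:
        raise ValueError("need positive progress and time remaining")
    remaining_min = elapsed_min * (1 - progress) / progress
    remaining_vm_min = remaining_min * current_vms
    return max(current_vms, math.ceil(remaining_vm_min / time_left))


# Initial allocation: 2 min overhead, 200 VM-minutes of work, 30 min deadline.
print(vms_needed(2, 200, 30))           # 8
# Trigger point: half done after 20 min on 8 VMs, with 10 min left.
print(rescale_if_late(20, 0.5, 8, 30))  # 16
```

The second function never shrinks the cluster; a job that is ahead of schedule keeps its current allocation in this sketch.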
Case Study: Energy Savings
- Goal: minimize energy consumption from the execution of a large batch of MapReduce jobs
  - Energy + cooling ~ 42% of total cost [Hamilton08]
  - Pass energy savings on to users
- Problem: how to place the VMs on available physical servers to minimize energy usage?
  - Minimize Cumulative Machine Uptime (CMU)

Case Study: Energy Savings
- Use Job Profiling to place similar-runtime VMs together for initial provisioning
- Use Job Profiling to adjust the number of VMs in each cluster to adjust runtimes if needed
- Monitoring: online profiling to determine when energy could be saved by using migration or cluster scaling
- Actuation: use Cluster Scaling or Migration to dynamically adjust for inaccuracies/unknowns in initial provisioning

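One simple way to place similar-runtime VMs together is to sort VMs by estimated runtime and fill servers in that order. This sort-and-fill heuristic is an illustrative stand-in, not the placement algorithm from the talk; it assumes every server has the same number of VM slots, and the names and runtimes are hypothetical.

```python
def place_by_runtime(vm_runtimes, slots_per_server):
    """Group VMs with similar estimated runtimes onto the same server.

    Sorts runtimes in descending order and fills servers one at a time.
    Assumption: all servers have `slots_per_server` identical VM slots.
    """
    ordered = sorted(vm_runtimes, reverse=True)
    return [ordered[i:i + slots_per_server]
            for i in range(0, len(ordered), slots_per_server)]


def uptime(placement):
    """Total minutes servers stay on if each shuts down after its last VM."""
    return sum(max(server) for server in placement)


# Hypothetical estimated runtimes (minutes) from job profiling.
estimates = [95, 12, 100, 10, 50, 45]
arrival_order = [estimates[i:i + 2] for i in range(0, len(estimates), 2)]
by_runtime = place_by_runtime(estimates, 2)

print(uptime(arrival_order))  # 245
print(by_runtime)             # [[100, 95], [50, 45], [12, 10]]
print(uptime(by_runtime))     # 162
```

If the estimates turn out to be inaccurate at run time, cluster scaling or migration would be the corrective actions, as the slide notes.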
Conclusion
- Framework: building blocks (STEAMEngine) for the optimization of MapReduce provisioning from a cloud service provider perspective
- Preliminary evaluations to validate the usefulness of each building block
- Approaches for applying the building blocks to meet specific goals, e.g., performance, energy

Thank you! Questions?