Energy Prediction for I/O Intensive Workflow Applications 1 MASc Exam Hao Yang NetSysLab The Electrical and Computer Engineering Department The University.

Presentation transcript:

Energy Prediction for I/O Intensive Workflow Applications. MASc Exam. Hao Yang, NetSysLab, The Electrical and Computer Engineering Department, The University of British Columbia.

Background - Workflow Applications. Figure: Montage workflow (legend: computation, file dependency). Characteristics: file-based communication; large number of tasks; large amount of I/O; common data access patterns.

Background - Application Execution. Diagram: workflow runtime engine; application tasks, each with local storage; central storage system (e.g., GPFS, NFS). Annotations: file-based communication; large I/O volume; I/O bottleneck.

Background - Intermediate Storage System. Diagram: workflow runtime engine; compute nodes running application tasks, each with local storage, aggregated into an intermediate storage; central storage system (e.g., GPFS, NFS). Annotations: stage in; stage out.

Background - Context of this thesis. This work focuses on workflow application execution on intermediate storage systems.

Research Problem – Energy Consumption. The pursuit of performance used to dominate conventional computing. Energy efficiency is the new concern. Figure labels: computing equipment; energy bill.

Research Problem - Configuration Decisions. Configuring the runtime system is complex (example: resource allocation decision). Figure: Montage workload, Energy-Delay Product (EDP).

Research Problem - Questions. Q1: What performance optimizations in storage systems lead to energy savings? Q2: What is the performance and energy impact of power-centric tuning techniques? Q3: How can users balance time-to-solution and energy consumption when given a target application?

Outline: Background; Research Problem; Methodology; Evaluation; Conclusion.

Methodology – Building Energy Consumption Predictor. The goal of this work is to build an energy consumption predictor to aid system configuration and provisioning decisions. The predictor should answer what-if questions (e.g., is configuration A better than configuration B from the energy perspective?) and support a customizable optimization metric (e.g., energy consumption, performance-energy product).

Methodology – Energy Model. Diagram: compute nodes running application tasks, each with local storage; intermediate storage; workflow runtime engine (diagram points labeled A, B, C, D). Execution states: idle; network transfer; storage I/O; task processing. Power profiles: (shown in figure).

Methodology – Energy Model. Execution states: idle; network transfer; I/O ops (read, write); task processing. Energy = power profile * predicted times, taken over the execution states.

Methodology – Energy Model. How to seed the energy model? Power states: use synthetic benchmarks to measure the power consumption in each state. Time estimates: augment a performance predictor to track the time spent in each state.
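
The energy model on the two slides above can be sketched in a few lines of Python. This is a minimal illustration, not the thesis implementation: the state names follow the slides, but the function name `predict_energy`, the dictionary layout, and every numeric value are assumptions made for this example. It also assumes each power value is the total draw of a node while it is in that state.

```python
# Illustrative sketch only: state names follow the slides; all numbers are hypothetical.
STATES = ("idle", "network_transfer", "storage_io", "task_processing")


def predict_energy(power_profile_w, predicted_time_s):
    """Return energy in joules: sum over states of power (W) * predicted time (s)."""
    missing = [s for s in STATES
               if s not in power_profile_w or s not in predicted_time_s]
    if missing:
        raise KeyError(f"missing states: {missing}")
    return sum(power_profile_w[s] * predicted_time_s[s] for s in STATES)


# Hypothetical power profile, as would be seeded from synthetic benchmarks.
power_profile_w = {
    "idle": 100.0,
    "network_transfer": 120.0,
    "storage_io": 130.0,
    "task_processing": 150.0,
}

# Hypothetical per-state times, as would be produced by a performance predictor.
predicted_time_s = {
    "idle": 10.0,
    "network_transfer": 5.0,
    "storage_io": 4.0,
    "task_processing": 20.0,
}

energy_j = predict_energy(power_profile_w, predicted_time_s)
print(f"predicted energy: {energy_j} J")
```

The two dictionaries mirror the two seeding steps: the power profile comes from measurement, the per-state times from prediction, so either can be replaced independently.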

Methodology – Building Energy Consumption Predictor. Sources of inaccuracies: homogeneity; power meter; time prediction; model simplification (metadata, scheduling, …). References: L. B. Costa, S. Al-Kiswany, H. Yang, and M. Ripeanu, “Supporting Storage Configuration for I/O Intensive Workflows”, In Proceedings of the 28th ACM International Conference on Supercomputing, ICS'14 (Acceptance Rate: 20%), June. L. B. Costa, S. Al-Kiswany, A. Barros, H. Yang, and M. Ripeanu, “Predicting Intermediate Storage Performance for Workflow Applications”, In Proceedings PDSW'13.

Evaluation Outline: synthetic benchmarks (workflow patterns); real workflow applications; predicting the energy impact of power-tuning techniques; predicting energy-performance tradeoffs.

Evaluation - Platform (Grid5000 Lyon site). Taurus cluster (11 nodes): two 2.3GHz Intel Xeon E CPUs (each with 6 cores), 32GB memory, 10 Gbps NIC. Sagittaire cluster (16 nodes): two 2.4GHz AMD Opteron CPUs (each with one core), 2GB RAM, 1 Gbps NIC. Power measurement: one SME Omegawatt power meter per node, 0.01W power resolution at 1Hz sampling rate. Figure labels: idle; app; storage I/O; net transfer.

Evaluation – Synthetic benchmarks: Workflow Patterns. Figure: Montage workflow, with the pipeline and reduce patterns.

Evaluation – Synthetic benchmarks: Workflow Patterns.

Evaluation – Synthetic benchmarks: Workflow Patterns. Results using the Default Storage System configuration (DSS): average 88% accuracy; 20x-30x faster than running the actual benchmark; 200x-300x fewer resources (machines * runtime).

Evaluation – Synthetic benchmarks: Workflow Patterns. Q1: What are the energy savings that performance optimizations in storage can bring? Figure: pipeline energy consumption under two configurations, DSS (Default Storage System configuration) and WOSS (Workflow-Optimized Storage System configuration). The predictor is accurate in both configurations and indicates which configuration is preferable from the energy perspective. References: S. Al-Kiswany, L. B. Costa, H. Yang, E. Vairavanathan, M. Ripeanu, “The Case for Cross-Layer Optimizations in Storage: A Workflow-Optimized Storage System”, IEEE Transactions on Parallel and Distributed Systems (TPDS), under review, submitted in June 2014. L. B. Costa, H. Yang, E. Vairavanathan, A. Barros, K. Maheshwari, G. Fedak, D. S. Katz, M. Wilde, M. Ripeanu and S. Al-Kiswany, “The Case for Workflow-Aware Storage: An Opportunity Study using MosaStore”, Journal of Grid Computing.

Evaluation – Real Workflow Applications. Figures: BLAST workflow; Montage workflow.

Evaluation – Real Workflow Applications. BLAST result (energy 89%, time 95%). Montage result (energy 84%, time 86%).

Evaluation – CPU Throttling. CPU throttling is an important technique where processors run at less-than-maximum frequency to conserve power. This technique can prolong the execution time while reducing instantaneous power. Q2: What is the energy and performance impact of CPU throttling? Is it application-specific? CPU-bound application: BLAST. I/O-bound application: pipeline benchmark.

Evaluation – CPU Throttling. Figures: BLAST result (energy, time); pipeline result (energy, time). Frequency levels: 1200MHz, 1800MHz, 2300MHz. Annotations: 17% savings when using maximum throttling; 96% cost when using maximum CPU throttling. Conclusion: the computational and I/O characteristics of an application determine whether throttling yields energy savings or energy costs. The predictor can be used to make these decisions.
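
Why throttling can help one application and hurt another follows from energy = power * time. The toy model below is a sketch under stated assumptions, not the model used in the thesis: it assumes CPU time scales inversely with frequency, I/O time is unaffected by frequency, and node power is a fixed base plus a part that grows linearly with frequency. All function names and wattage/time values are hypothetical; only the three frequency levels come from the slide.

```python
F_MAX_MHZ = 2300.0  # highest frequency level listed on the slide


def run_time_s(cpu_time_at_max_s, io_time_s, freq_mhz):
    """Assumption: CPU time scales inversely with frequency; I/O time is unchanged."""
    return cpu_time_at_max_s * (F_MAX_MHZ / freq_mhz) + io_time_s


def node_power_w(freq_mhz, base_w=100.0, freq_dependent_w=50.0):
    """Assumption: power = fixed base + a part that scales linearly with frequency."""
    return base_w + freq_dependent_w * (freq_mhz / F_MAX_MHZ)


def energy_j(cpu_time_at_max_s, io_time_s, freq_mhz):
    """Energy in joules = power (W) * run time (s)."""
    return node_power_w(freq_mhz) * run_time_s(cpu_time_at_max_s, io_time_s, freq_mhz)


# Hypothetical applications: seconds of CPU work at full speed and seconds of I/O.
cpu_bound = {"cpu_time_at_max_s": 100.0, "io_time_s": 5.0}
io_bound = {"cpu_time_at_max_s": 5.0, "io_time_s": 100.0}

for name, app in (("CPU-bound", cpu_bound), ("I/O-bound", io_bound)):
    for freq in (1200.0, 1800.0, 2300.0):
        t = run_time_s(freq_mhz=freq, **app)
        e = energy_j(freq_mhz=freq, **app)
        print(f"{name} @ {freq:.0f} MHz: time={t:.1f} s, energy={e:.0f} J")
```

In this toy model, throttling lowers the energy of the I/O-bound application because its run time barely changes, while it raises the energy of the CPU-bound application because the longer run time outweighs the lower power draw.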

Evaluation – Predicting Energy Delay Product. Q3: How can users balance time-to-solution and energy consumption when given a target application? The user's optimization metric can be performance (use more machines), energy, or the Energy-Delay Product (EDP, energy * time). Consider the allocation decision. The Montage workload on two clusters is used to demonstrate the prediction.
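
The allocation decision described above can be illustrated with a short sketch that picks the number of nodes minimizing a user-chosen metric. The EDP definition (energy * time) is from the slide; the helper names and the per-allocation predictions are hypothetical values invented for this example, not results from the thesis.

```python
def edp(energy_j, time_s):
    """Energy-Delay Product as defined on the slide: energy * time."""
    return energy_j * time_s


def best_allocation(predictions, metric):
    """Return the node count whose predicted (energy, time) minimizes the metric."""
    return min(predictions, key=lambda nodes: metric(*predictions[nodes]))


# Hypothetical predictor output: node count -> (energy in J, time in s).
predictions = {
    2: (40_000.0, 400.0),
    4: (44_000.0, 220.0),
    8: (70_000.0, 150.0),
}

best_by_energy = best_allocation(predictions, lambda e, t: e)
best_by_time = best_allocation(predictions, lambda e, t: t)
best_by_edp = best_allocation(predictions, edp)

print("lowest energy:", best_by_energy, "nodes")
print("lowest time:", best_by_time, "nodes")
print("lowest EDP:", best_by_edp, "nodes")
```

With these made-up numbers the three metrics choose three different allocations, which is the situation in which a user needs a predictor for both energy and time rather than for either alone.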

Evaluation – Predicting Energy Delay Product. Figures: Montage EDP at Taurus; Montage EDP at Sagittaire.

Conclusion. This thesis presents an energy consumption predictor in the workflow application domain. The proposed energy model and prediction framework achieve adequate accuracy to be useful for the energy-oriented configurations this work targets.

Resulting Publications.
Energy Prediction: H. Yang, L. B. Costa and M. Ripeanu, “Energy Prediction for I/O Intensive Workflows Applications”, submitted to the 7th Workshop on Many-Task Computing on Clouds, Grids, and Supercomputers (MTAGS) 2014 (co-located with Supercomputing/SC 2014), under review.
Performance Prediction and Provisioning: L. B. Costa, S. Al-Kiswany, H. Yang, and M. Ripeanu, “Supporting Storage Configuration and Provisioning for I/O Intensive Workflows”, in preparation. L. B. Costa, S. Al-Kiswany, H. Yang, and M. Ripeanu, “Supporting Storage Configuration for I/O Intensive Workflows”, In Proceedings of ICS'14 (acceptance rate: 20%), June. L. B. Costa, S. Al-Kiswany, A. Barros, H. Yang, and M. Ripeanu, “Predicting Intermediate Storage Performance for Workflow Applications”, In Proceedings PDSW'13.
Evaluating Storage Systems for Scientific Data in the Cloud: K. Maheshwari, J. Wozniak, H. Yang, D. S. Katz, M. Ripeanu, V. Zavala, M. Wilde, “Evaluating Storage Systems for Scientific Data in the Cloud”, In Proceedings of the 5th Workshop on Scientific Cloud Computing (ScienceCloud), co-located with ACM HPDC 2014 (Best Paper Award).
A Workflow-Optimized Storage System: S. Al-Kiswany, L. B. Costa, H. Yang, E. Vairavanathan, M. Ripeanu, “A Software Defined Storage for Scientific Workflow Applications”, in preparation. S. Al-Kiswany, L. B. Costa, H. Yang, E. Vairavanathan, M. Ripeanu, “The Case for Cross-Layer Optimizations in Storage: A Workflow-Optimized Storage System”, IEEE Transactions on Parallel and Distributed Systems (TPDS), under review, submitted in June 2014. L. B. Costa, H. Yang, E. Vairavanathan, A. Barros, K. Maheshwari, G. Fedak, D. S. Katz, M. Wilde, M. Ripeanu and S. Al-Kiswany, “The Case for Workflow-Aware Storage: An Opportunity Study using MosaStore”, accepted by Journal of Grid Computing, 2014.

Backup Slides - The system model and model seeding. System deployment configuration: number of storage nodes; number of client nodes; chunk size; replication level; … Platform performance parameters: manager service time; storage service time; client service time; remote network service time; local network service time. Workload description: I/O traces; task dependency graph. Reference: L. B. Costa, S. Al-Kiswany, H. Yang, and M. Ripeanu, “Supporting Storage Configuration for I/O Intensive Workflows”, In Proceedings of the 28th ACM International Conference on Supercomputing, ICS'14, June.
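
The three groups of predictor inputs listed on this slide could be represented as plain data structures, as in the sketch below. The field names are derived from the slide labels; the class names, the units (bytes, seconds), the trace layout, the validation rules, and the example values are all assumptions made for illustration and are not taken from the actual predictor.

```python
from dataclasses import dataclass, field


@dataclass
class DeploymentConfig:
    storage_nodes: int
    client_nodes: int
    chunk_size_bytes: int      # unit is an assumption
    replication_level: int


@dataclass
class PlatformParameters:
    # Service times in seconds (unit is an assumption).
    manager_service_time_s: float
    storage_service_time_s: float
    client_service_time_s: float
    remote_network_service_time_s: float
    local_network_service_time_s: float


@dataclass
class WorkloadDescription:
    io_traces: list = field(default_factory=list)          # hypothetical trace records
    task_dependencies: dict = field(default_factory=dict)  # task -> tasks it depends on


def validate(config: DeploymentConfig) -> None:
    """Basic sanity checks; these rules are illustrative assumptions."""
    if config.storage_nodes < 1 or config.client_nodes < 1:
        raise ValueError("need at least one storage node and one client node")
    if config.replication_level > config.storage_nodes:
        raise ValueError("replication level cannot exceed the number of storage nodes")


# Hypothetical example values.
config = DeploymentConfig(storage_nodes=4, client_nodes=8,
                          chunk_size_bytes=1_048_576, replication_level=2)
workload = WorkloadDescription(
    io_traces=[("task_a", "write", "file_1"), ("task_b", "read", "file_1")],
    task_dependencies={"task_b": ["task_a"]},
)
validate(config)
print(config)
```

Keeping the deployment configuration separate from the platform parameters and the workload reflects the slide's grouping: the first is what a user varies in what-if questions, while the other two are seeded once per platform and application.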

Backup Slides - Limitations: simplification of the model; short tasks / small workload; not validated using new devices (e.g., SSD).

Backup Slides - Alternative Approaches: utilization; detailed simulation; machine learning.

Backup Slides - Combined states.

Backup Slides - Energy Composition (pipeline benchmark): idle energy 64%; app processing 9.2%; storage operations 15.8%; network transfer 10.6%.

Backup Slides - Sagittaire power profiles. Figure values: 175W, 25W, 8W, 7W.