Always race to sleep? i.e. how we managed to confuse ourselves by talking about two kinds of race to sleep.

Two kinds of race to sleep:
Single node: amortize static power cost
Many nodes: minimize parallelization overhead

1. Background: Common system power and work rate behaviors
2. Analysis: Required conditions for race to sleep
3. Empirical data: When are those conditions met in Hadoop?

Two common behaviors for system power:
Resource proportional
Not resource proportional

Three common behaviors for system work rate:
Linear speed up
Parallelization overhead
Bottleneck elsewhere

2. Analysis: Required conditions for race to sleep

Work rate, power, and power efficiency, where power efficiency = work rate / power

Race to sleep? i.e. operate at highest work rate?
Increasing efficiency: Yes
Decreasing efficiency: No
Turning point exists: Somewhat
Yes and no: Time benefit, no energy benefit; or energy benefit, no time benefit

Race to sleep? i.e. operate at highest work rate?
Required condition for race to sleep:
Go faster if % increase in work rate ≥ % increase in power
Go slower otherwise
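The condition above follows directly from the definition power efficiency = work rate / power. A short derivation, using symbols that are not on the slides (W and P for the old work rate and power, w and p for the fractional increases in each):

```latex
\begin{align*}
\eta &= \frac{W}{P}, \qquad
\eta' = \frac{(1+w)\,W}{(1+p)\,P} = \frac{1+w}{1+p}\,\eta \\
\eta' \ge \eta
  &\iff \frac{1+w}{1+p} \ge 1
  \iff w \ge p \qquad (\text{assuming } 1 + p > 0)
\end{align*}
```

So going faster keeps or improves power efficiency exactly when the percentage increase in work rate is at least the percentage increase in power.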

e.g.
Old work rate = A, new work rate = 1.1A
Old power = B, new power = 1.05B
Old power efficiency = A / B
New power efficiency = (1.1 / 1.05) × (A / B) = (1.1 / 1.05) × old power efficiency
The 10% increase in work rate exceeds the 5% increase in power, so the required condition for race to sleep holds: go faster.
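The worked example can be checked with a few lines of Python. This is a minimal sketch: the function names are made up for illustration, and A and B are set to 1.0 because only the ratios matter.

```python
def efficiency_ratio(old_rate, new_rate, old_power, new_power):
    """Return new power efficiency divided by old power efficiency."""
    return (new_rate / old_rate) / (new_power / old_power)


def should_go_faster(old_rate, new_rate, old_power, new_power):
    """Apply the slide's condition: go faster if the fractional increase
    in work rate is at least the fractional increase in power."""
    rate_increase = new_rate / old_rate - 1.0
    power_increase = new_power / old_power - 1.0
    return rate_increase >= power_increase


# The slide's example with A = B = 1.0 (arbitrary units, an assumption).
A, B = 1.0, 1.0
print(should_go_faster(A, 1.1 * A, B, 1.05 * B))
print(efficiency_ratio(A, 1.1 * A, B, 1.05 * B))
```

The second value printed is the factor 1.1 / 1.05 from the example, i.e. roughly a 4.8% gain in power efficiency.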

3. Empirical data: When are those conditions met in Hadoop?

Race to sleep, by workload (work rate and power efficiency were plotted for each; plots not reproduced):
Hadoop sort, 10GB terasort format: No
HDFS read, 10GB: Yes
HDFS write, 10GB: No
Hadoop shuffle, 10GB: Yes

That was multi-node power efficiency. Single-node power efficiency is a different picture.

Always race to sleep?
Maybe the question should be: always use as many resources as possible?
Takeaway:
Single node: amortize static power cost (awake nodes should race to sleep)
Many nodes: minimize parallelization overhead (as few nodes awake as possible)
Increase resources if the resulting % work rate increase ≥ % power increase
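The last rule can be read as a simple stepping procedure over measured operating points. The sketch below is one possible reading, not the authors' method: the function name is hypothetical and all numbers are invented for illustration, not taken from the Hadoop measurements. It stops at the first step where power grows faster than work rate, which matches the "turning point" case only if efficiency rises and then falls.

```python
def pick_resource_level(levels):
    """levels: list of (resources, work_rate, power), sorted by resources.

    Step up to the next level while the relative increase in work rate
    is at least the relative increase in power; return the resources
    of the last level accepted.
    """
    chosen = levels[0]
    for nxt in levels[1:]:
        _, rate, power = chosen
        _, next_rate, next_power = nxt
        if next_rate / rate >= next_power / power:
            chosen = nxt
        else:
            break
    return chosen[0]


# Hypothetical measurements: (nodes, work rate, power). Work rate shows
# parallelization overhead in both cases.
proportional_power = [(1, 100, 200), (2, 198, 390), (4, 360, 780), (8, 500, 1560)]
static_power_cost = [(1, 100, 300), (2, 198, 400), (4, 360, 600), (8, 500, 1000)]

print(pick_resource_level(proportional_power))
print(pick_resource_level(static_power_cost))
```

With these made-up numbers, a large static power cost pushes the chosen level higher, because adding resources amortizes the static cost before parallelization overhead dominates.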

Other junk …

Power efficiency = energy efficiency
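One way to see the equality, assuming power is constant (or averaged) over the interval in which the work is done; the symbol t for elapsed time is not on the slides:

```latex
\[
\text{energy efficiency}
  = \frac{\text{work}}{\text{energy}}
  = \frac{\text{work}}{\text{power} \times t}
  = \frac{\text{work} / t}{\text{power}}
  = \frac{\text{work rate}}{\text{power}}
  = \text{power efficiency}
\]
```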
