Efficient and Scalable Computation of the Energy and Makespan Pareto Front for Heterogeneous Computing Systems Kyle M. Tarplee 1, Ryan Friese 1, Anthony.

Slides:



Advertisements
Similar presentations
Algorithm Design Methods (I) Fall 2003 CSE, POSTECH.
Advertisements

Energy-efficient Task Scheduling in Heterogeneous Environment 2013/10/25.
Primal Dual Combinatorial Algorithms Qihui Zhu May 11, 2009.
Integer Optimization Basic Concepts Integer Linear Program(ILP): A linear program except that some or all of the decision variables must have integer.
MATH 224 – Discrete Mathematics
Counting the bits Analysis of Algorithms Will it run on a larger problem? When will it fail?
1 Transportation problem The transportation problem seeks the determination of a minimum cost transportation plan for a single commodity from a number.
ISE480 Sequencing and Scheduling Izmir University of Economics ISE Fall Semestre.
Online Scheduling with Known Arrival Times Nicholas G Hall (Ohio State University) Marc E Posner (Ohio State University) Chris N Potts (University of Southampton)
Progress in Linear Programming Based Branch-and-Bound Algorithms
Merge Sort 4/15/2017 6:09 PM The Greedy Method The Greedy Method.
Multiobjective Optimization Chapter 7 Luke, Essentials of Metaheuristics, 2011 Byung-Hyun Ha R1.
Branch and Bound Searching Strategies
Dealing with NP-Complete Problems
Complexity Analysis (Part I)
1 A Second Stage Network Recourse Problem in Stochastic Airline Crew Scheduling Joyce W. Yen University of Michigan John R. Birge Northwestern University.
EDA (CS286.5b) Day 11 Scheduling (List, Force, Approximation) N.B. no class Thursday (FPGA) …
System-Wide Energy Minimization for Real-Time Tasks: Lower Bound and Approximation Xiliang Zhong and Cheng-Zhong Xu Dept. of Electrical & Computer Engg.
1 Branch and Bound Searching Strategies 2 Branch-and-bound strategy 2 mechanisms: A mechanism to generate branches A mechanism to generate a bound so.
Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under contract.
DAST 2005 Week 4 – Some Helpful Material Randomized Quick Sort & Lower bound & General remarks…
5-1 Chapter 5 Tree Searching Strategies. 5-2 Breadth-first search (BFS) 8-puzzle problem The breadth-first search uses a queue to hold all expanded nodes.
1 Planning and Scheduling to Minimize Tardiness John Hooker Carnegie Mellon University September 2005.
Daniel Kroening and Ofer Strichman Decision Procedures An Algorithmic Point of View Deciding ILPs with Branch & Bound ILP References: ‘Integer Programming’
Energy, Energy, Energy  Worldwide efforts to reduce energy consumption  People can conserve. Large percentage savings possible, but each individual has.
Evolutionary Multi-objective Optimization – A Big Picture Karthik Sindhya, PhD Postdoctoral Researcher Industrial Optimization Group Department of Mathematical.
1 Chapter 2 Program Performance – Part 2. 2 Step Counts Instead of accounting for the time spent on chosen operations, the step-count method accounts.
Decision Procedures An Algorithmic Point of View
On comparison of different approaches to the stability radius calculation Olga Karelkina Department of Mathematics University of Turku MCDM 2011.
Lecture 3 – Parallel Performance Theory - 1 Parallel Performance Theory - 1 Parallel Computing CIS 410/510 Department of Computer and Information Science.
Column Generation Approach for Operating Rooms Planning Mehdi LAMIRI, Xiaolan XIE and ZHANG Shuguang Industrial Engineering and Computer Sciences Division.
Chapter 2.6 Comparison of Algorithms modified from Clifford A. Shaffer and George Bebis.
Week 2 CS 361: Advanced Data Structures and Algorithms
Network Aware Resource Allocation in Distributed Clouds.
ROBUST RESOURCE ALLOCATION OF DAGS IN A HETEROGENEOUS MULTI-CORE SYSTEM Luis Diego Briceño, Jay Smith, H. J. Siegel, Anthony A. Maciejewski, Paul Maxwell,
CALTECH CS137 Winter DeHon CS137: Electronic Design Automation Day 12: February 13, 2002 Scheduling Heuristics and Approximation.
Analysis of Algorithms These slides are a modified version of the slides used by Prof. Eltabakh in his offering of CS2223 in D term 2013.
The Fast Optimal Voltage Partitioning Algorithm For Peak Power Density Minimization Jia Wang, Shiyan Hu Department of Electrical and Computer Engineering.
Stochastic DAG Scheduling using Monte Carlo Approach Heterogeneous Computing Workshop (at IPDPS) 2012 Extended version: Elsevier JPDC (accepted July 2013,
DR. Gatot F. Hertono, MSc. Design and Analysis of ALGORITHM (Session 2)
A Node and Load Allocation Algorithm for Resilient CPSs under Energy-Exhaustion Attack Tam Chantem and Ryan M. Gerdes Electrical and Computer Engineering.
Program Efficiency & Complexity Analysis. Algorithm Review An algorithm is a definite procedure for solving a problem in finite number of steps Algorithm.
A Faster Approximation Scheme for Timing Driven Minimum Cost Layer Assignment Shiyan Hu*, Zhuo Li**, and Charles J. Alpert** *Dept of ECE, Michigan Technological.
1 Short Term Scheduling. 2  Planning horizon is short  Multiple unique jobs (tasks) with varying processing times and due dates  Multiple unique jobs.
The Greedy Method. The Greedy Method Technique The greedy method is a general algorithm design paradigm, built on the following elements: configurations:
Kanpur Genetic Algorithms Laboratory IIT Kanpur 25, July 2006 (11:00 AM) Multi-Objective Dynamic Optimization using Evolutionary Algorithms by Udaya Bhaskara.
DIVERSITY PRESERVING EVOLUTIONARY MULTI-OBJECTIVE SEARCH Brian Piper1, Hana Chmielewski2, Ranji Ranjithan1,2 1Operations Research 2Civil Engineering.
1 Iterative Integer Programming Formulation for Robust Resource Allocation in Dynamic Real-Time Systems Sethavidh Gertphol and Viktor K. Prasanna University.
Integer LP In-class Prob
Rounding scheme if r * j  1 then r j := 1  When the number of processors assigned in the continuous solution is between 0 and 1 for each task, the speed.
A Fast Genetic Algorithm Based Static Heuristic For Scheduling Independent Tasks on Heterogeneous Systems Gaurav Menghani Department of Computer Engineering,
Sporadic model building for efficiency enhancement of the hierarchical BOA Genetic Programming and Evolvable Machines (2008) 9: Martin Pelikan, Kumara.
Incremental Run-time Application Mapping for Heterogeneous Network on Chip 2012 IEEE 14th International Conference on High Performance Computing and Communications.
Algorithmics - Lecture 41 LECTURE 4: Analysis of Algorithms Efficiency (I)
Branch and Bound Searching Strategies
Determining Optimal Processor Speeds for Periodic Real-Time Tasks with Different Power Characteristics H. Aydın, R. Melhem, D. Mossé, P.M. Alvarez University.
Chapter 15 Running Time Analysis. Topics Orders of Magnitude and Big-Oh Notation Running Time Analysis of Algorithms –Counting Statements –Evaluating.
Complexity Analysis (Part I)
Data Driven Resource Allocation for Distributed Learning
Merge Sort 7/29/ :21 PM The Greedy Method The Greedy Method.
Merge Sort 11/28/2018 2:18 AM The Greedy Method The Greedy Method.
Merge Sort 11/28/2018 8:16 AM The Greedy Method The Greedy Method.
MURI Kickoff Meeting Randolph L. Moses November, 2008
Merge Sort 1/17/2019 3:11 AM The Greedy Method The Greedy Method.
Algorithm Design Methods
Merge Sort 5/2/2019 7:53 PM The Greedy Method The Greedy Method.
1.2 Guidelines for strong formulations
Complexity Analysis (Part I)
Complexity Analysis (Part I)
1.2 Guidelines for strong formulations
Presentation transcript:

Efficient and Scalable Computation of the Energy and Makespan Pareto Front for Heterogeneous Computing Systems Kyle M. Tarplee 1, Ryan Friese 1, Anthony A. Maciejewski 1, H.J. Siegel 1,2 1 Department of Electrical and Computer Engineering 2 Department of Computer Science Colorado State University Fort Collins, Colorado, USA

Problem Statement static scheduling  single bag-of-tasks  task assigned to only one machine  machine runs one task at a time heterogeneous tasks and machines desire to reduce energy consumption (operating cost) and makespan to aid decision makers  find high-quality schedules for both energy and makespan  desire computationally efficient algorithms to compute Pareto fronts 2

Outline linear programming lower bound recovering a feasible allocation to form an upper bound Pareto front generation techniques comparison with NSGA-II complexity and scalability 3

Preliminaries simplifying approx: tasks are divisible among machines T i − number of tasks of type i M j − number of machines of type j x ij − number of tasks of type i assigned to machine type j ETC ij − estimated time to compute for a task of type i running on a machine of type j finishing time of machine type j (lower bound): APC ij − average power consumption for a task of type i running on a machine of type j APC Øj − idle power consumption for a machine of type j 4

Objective Functions makespan (lower bound) energy consumed by the bag-of-tasks (lower bound) note that energy is a function of makespan when we have non-zero idle power consumption 5

Linear Bi-Objective Optimization Problem 6

Linear Programming Lower Bound Generation solve for x ij ∈ℝ to obtain a lower bound  generally infeasible solution to the actual scheduling problem  a lower bound for each objective function (tight in practice) solve for x ij ∈ℤ to obtain a tighter lower bound  requires branch and bound (or similar)  typically this is computationally prohibitive  still making the assumption that tasks are divisible among machines 7

Recovering a Feasible Allocation: Round Near 8 find ẋ ij ∈ℤ that is “near” to x ij while maintaining the task assignment constraint for task type (each row in x) do  let n = T i −∑ j floor(x ij )  let f j = frac(x ij ) be the fractional part of x ij  let set K be the indices of the n largest f j  ẋ ij = ceil(x ij ) if j ∈ K else floor(x ij ) x i =30650 ẋ i = xi =xi = ẋ i = xi =xi = ẋ i = xi =xi = ẋ i = 42754

Recovering a Feasible Allocation: Local Assignment 9 given integer number of tasks for each machine type assign tasks to actual machines within a homogeneous machine type via a greedy algorithm for each machine type (column in ẋ ) do longest processing time algorithm  while any tasks are unassigned  assign longest task (irrevocably) to machine that has the earliest finish time  update machine finish time

Pareto Front Generation Procedure step 1 weighted sum scalarization  step 1.1 find the utopia (ideal) and nadir (non-ideal) points  step 1.2 sweep  between 0 and 1  at each step solve the linear programming problem  step 1.3 remove duplicates linear objective functions and convex constraints  convex, lower bounds on Pareto front step 2 round each solution step 3 remove duplicates step 4 locally assign each solution step 5 remove duplicates and dominated solutions full allocation is an upper bound on the true Pareto front 10

Simulation Results simulation setup  ETC matrix derived from actual systems  9 machine types, 36 machines, 4 machines per type  30 task types, 1100 tasks, tasks per type Non-dominated Sorting Genetic Algorithm II  NSGA-II is another algorithm for finding the Pareto front  an adaptation of classical genetic algorithms  basic seed  min energy, min-min completion time, and random  full allocation seed  all solutions from upper bound Pareto front 11

Pareto Fronts 12

Pareto Fronts (Zoomed) 13

Lower Bound to Full Allocation, No Idle Power 14

Effect of Idle Power 15

Complexity 16 let the ETC matrix be T×M linear programming lower bound  average complexity (T+M) 2 (TM+1) rounding step: T(M log M) local assignment step:  number of tasks for machine type j is n j = ∑ i x ij  worst cast complexity is M max j (n j log n j + n j log M j ) complexity of all steps is dominated by either  linear programming solver  local assignment complexity of linear programming solver is independent of the number tasks and machines  depends number of task types and machine types

Impact of Number of Tasks 17 number of task types, machines, and machine types held constant

Impact of Number of Machines 18 number of tasks, task types, and machine types held constant

Impact of Number of Task Types 19 number of tasks, machines, and machine types held constant

Contributions algorithms to compute  tight lower bounds on the energy and makespan  quickly recovers near optimal feasible solutions  high quality bi-objective Pareto fronts evaluation of the scaling properties of the proposed algorithms 20

Conclusions this linear programming approach to Pareto front generation is efficient, accurate, and practical bounds are tight when  a small percentage of tasks are divided  a large number of tasks assigned to each machine type and individual machine asymptotic solution quality and runtime are very reasonable 21

QUESTIONS? 22

BACKUP SLIDES 23

Lower Bound to Full Allocation, 10% Idle Power 24

Impact of Number of Machine Types 25 number of tasks, task types, and machines held constant