Parallel Processing & Distributed Systems
Chapter 2
Thoai Nam

Amdahl's Law, Gustafson's Law, Sun & Ni's Law
- Speedup & Efficiency
- Amdahl's Law
- Gustafson's Law
- Sun & Ni's Law

Speedup & Efficiency
- Speedup: S = Time(the most efficient sequential algorithm) / Time(parallel algorithm)
- Efficiency: E = S / N, where N is the number of processors
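As a minimal illustration (not part of the original slides), the Python sketch below computes speedup and efficiency from two measured runtimes; the timing values are hypothetical placeholders.

```python
# Minimal sketch: speedup and efficiency from measured runtimes.

def speedup(t_seq, t_par):
    """S = time of the best sequential algorithm / time of the parallel algorithm."""
    return t_seq / t_par

def efficiency(t_seq, t_par, n):
    """E = S / N, where N is the number of processors."""
    return speedup(t_seq, t_par) / n

if __name__ == "__main__":
    t1, t8 = 120.0, 20.0               # hypothetical runtimes (seconds) on 1 and 8 processors
    print(speedup(t1, t8))             # 6.0
    print(efficiency(t1, t8, 8))       # 0.75
```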

Amdahl's Law – Fixed Problem Size (1)
- The main objective is to produce the results as soon as possible
  - Examples: video compression, computer graphics, VLSI routing, etc.
- Implications
  - The upper bound on speedup is 1/α, where α is the sequential fraction
  - Make the sequential bottleneck as small as possible
  - Optimize the common case
- A modified Amdahl's law for fixed problem size includes the overhead

Amdahl's Law – Fixed Problem Size (2)
[Diagram: a fixed workload with a sequential part T_s and a parallel part T_p; on N processors the parallel part is divided among P0 ... P9.]
- T_s = α T(1)
- T_p = (1 - α) T(1)
- T(N) = α T(1) + (1 - α) T(1) / N, where N is the number of processors

Amdahl's Law – Fixed Problem Size (3)
- Speedup: S(N) = T(1) / T(N) = 1 / (α + (1 - α)/N)
- As N → ∞, the speedup approaches the upper bound 1/α
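The following sketch (not from the original slides) evaluates Amdahl's fixed-size speedup for a given sequential fraction α; the value of α used here is hypothetical.

```python
# Minimal sketch of Amdahl's Law for a fixed problem size.
# alpha is the sequential fraction of the work; n is the processor count.

def amdahl_speedup(alpha, n):
    """S(N) = 1 / (alpha + (1 - alpha) / N)."""
    return 1.0 / (alpha + (1.0 - alpha) / n)

if __name__ == "__main__":
    alpha = 0.05                       # hypothetical: 5% of the work is sequential
    for n in (1, 4, 16, 64, 1024):
        print(n, round(amdahl_speedup(alpha, n), 2))
    # The speedup approaches the upper bound 1/alpha = 20 as n grows.
```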

Enhanced Amdahl's Law
- With overhead included: T(N) = α T(1) + (1 - α) T(1)/N + T_overhead, so S(N) = 1 / (α + (1 - α)/N + T_overhead/T(1))
- The overhead includes parallelism and interaction overheads
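A small sketch, assuming the overhead is modeled as a single term T_overhead added to the parallel execution time; the names and numbers are illustrative, not from the slides.

```python
# Sketch of the enhanced (overhead-aware) Amdahl speedup.

def amdahl_speedup_with_overhead(alpha, n, t1, t_overhead):
    """S(N) = T(1) / (alpha*T(1) + (1 - alpha)*T(1)/N + T_overhead)."""
    t_n = alpha * t1 + (1.0 - alpha) * t1 / n + t_overhead
    return t1 / t_n

if __name__ == "__main__":
    # Hypothetical numbers: 100 s sequential time, 5% sequential fraction,
    # and 2 s of parallelism/interaction overhead per run.
    for n in (4, 16, 64):
        print(n, round(amdahl_speedup_with_overhead(0.05, n, 100.0, 2.0), 2))
```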

Gustafson's Law – Fixed Time (1)
- The user wants more accurate results within a time limit
  - The execution time is fixed as the system scales
  - Examples: FEM for structural analysis, FDM for fluid dynamics
- Properties of a work metric
  - Easy to measure
  - Architecture independent
  - Easy to model with an analytical expression
  - No additional experiment needed to measure the work
  - The measure of work should scale linearly with the sequential time complexity of the algorithm
- The time-constrained model seems to be the most generally viable one!

Gustafson's Law – Fixed Time (2)
[Diagram: the parallel execution time T(N) is held fixed; the scaled workload, run sequentially, would take T(1).]
- α = T_s / T(N), the sequential fraction of the fixed parallel execution time
- T(1) = α T(N) + (1 - α) N T(N)

Gustafson's Law – Fixed Time, without overhead
- Time = Work × k, where k converts work to time
- Scaled work: W(N) = α W + (1 - α) N W
- Speedup: S(N) = W(N) / W = α + (1 - α) N
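A brief sketch (not from the slides) of Gustafson's scaled speedup; α denotes the sequential fraction of the fixed execution time, and the example value is hypothetical.

```python
# Minimal sketch of Gustafson's Law (fixed-time, scaled speedup).

def gustafson_speedup(alpha, n):
    """S(N) = alpha + (1 - alpha) * N."""
    return alpha + (1.0 - alpha) * n

if __name__ == "__main__":
    alpha = 0.05                       # hypothetical sequential fraction
    for n in (1, 4, 16, 64, 1024):
        print(n, gustafson_speedup(alpha, n))
    # Unlike Amdahl's fixed-size model, the scaled speedup grows linearly in N.
```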

Gustafson's Law – Fixed Time, with overhead
- With the overhead W_0 included, the work to be executed becomes W(N) = W + W_0
- Speedup: S(N) = (α + (1 - α) N) / (1 + W_0 / W)

Sun and Ni's Law – Fixed Memory (1)
- Scale to the largest problem that fits in memory; equivalently, fix the memory usage per processor
- Speedup
  - Time(1)/Time(N) for the scaled-up problem is not appropriate
  - For a simple workload profile, the speedup is expressed in terms of G(N), the increase in parallel workload as the memory capacity increases N times

Sun and Ni's Law – Fixed Memory (2)

Sun and Ni's Law – Fixed Memory (3)
- Definition: a function g is a homomorphism if there exists a function ḡ such that, for any real number c and variable x, g(cx) = ḡ(c) g(x).
- Theorem: if W = g(M) for some homomorphism g, then, with all data shared by all available processors, the simplified memory-bounded speedup is
  S(N) = (α + (1 - α) G(N)) / (α + (1 - α) G(N)/N), where G(N) = ḡ(N).
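As an illustration (not part of the original slides), the memory-bounded speedup can be evaluated for a given scaling function G(N); both α and G below are hypothetical choices.

```python
# Minimal sketch of Sun and Ni's memory-bounded speedup.
# G(n) models how the parallel workload grows when memory grows n times.

def sun_ni_speedup(alpha, n, G):
    """S(N) = (alpha + (1 - alpha)*G(N)) / (alpha + (1 - alpha)*G(N)/N)."""
    g = G(n)
    return (alpha + (1.0 - alpha) * g) / (alpha + (1.0 - alpha) * g / n)

if __name__ == "__main__":
    alpha = 0.05                                   # hypothetical sequential fraction
    for n in (4, 16, 64):
        print(n, round(sun_ni_speedup(alpha, n, lambda k: k ** 1.5), 2))
    # Here G(N) = N**1.5 > N, e.g. an O(M**1.5) computation on memory M
    # (dense matrix multiplication is one such case).
```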

Sun and Ni's Law – Fixed Memory (4)
Proof: let the memory requirement of the parallel workload W_n be M, i.e. W_n = g(M). M is the memory requirement when one node is available. With N nodes available, the memory capacity increases to NM. Using all of the available memory, the scaled parallel portion is W*_n = g(NM) = ḡ(N) g(M) = G(N) W_n.

Sun and Ni's Law – Fixed Memory (5)
- When the problem size is independent of the system, the problem size is fixed: G(N) = 1, and the formula reduces to Amdahl's Law.
- When memory is increased N times, the workload also increases N times: G(N) = N, and the formula reduces to Gustafson's Law.
- For most scientific and engineering applications, the computation requirement increases faster than the memory requirement, so G(N) > N and the memory-bounded speedup exceeds the fixed-time speedup.
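To make the three regimes concrete, here is a short comparison (illustrative values only, not from the slides) of the speedup predicted when G(N) = 1, G(N) = N, and G(N) > N:

```python
# Comparing the three regimes of the memory-bounded speedup formula.

def memory_bounded_speedup(alpha, n, g_n):
    return (alpha + (1.0 - alpha) * g_n) / (alpha + (1.0 - alpha) * g_n / n)

if __name__ == "__main__":
    alpha, n = 0.05, 64                       # hypothetical values
    print("G(N) = 1     (Amdahl):        ", round(memory_bounded_speedup(alpha, n, 1.0), 2))
    print("G(N) = N     (Gustafson):     ", round(memory_bounded_speedup(alpha, n, float(n)), 2))
    print("G(N) = N^1.5 (memory-bounded):", round(memory_bounded_speedup(alpha, n, n ** 1.5), 2))
```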