
Computer Science 320 Measuring Sizeup

Speedup vs Sizeup If we add more processors, we should be able to solve a problem of a given size faster. If we add more processors, we should also be able to increase the size of the problem we can solve in a given amount of time.

Speedup vs Sizeup T(N, K) says that the running time T is a function of the problem size N and the number of processors K. N(T, K) says that the problem size N is a function of the running time T and the number of processors K.

What Is Sizeup? Sizeup is the size of the problem a parallel version can solve on K processors relative to the size a sequential version can solve on one processor in the same running time T: Sizeup(T, K) = N_par(T, K) / N_seq(T, 1). Ideally, sizeup is linear in K.

What Is Sizeup Efficiency? SizeupEff(T, K) = Sizeup(T, K) / K. Usually a fraction less than 1.
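
As a concrete illustration, here is a minimal Python sketch that computes sizeup and sizeup efficiency from measured problem sizes. The sizes below are hypothetical placeholder values, not measurements of any real program.

# Largest problem size the sequential version finishes in the target time T
n_seq = 1_000_000
# Largest problem size the parallel version finishes in the same time T,
# keyed by the number of processors K (hypothetical values)
n_par = {1: 1_000_000, 2: 1_950_000, 4: 3_800_000, 8: 7_300_000}

for k, n in sorted(n_par.items()):
    sizeup = n / n_seq          # Sizeup(T, K) = N_par(T, K) / N_seq(T, 1)
    sizeup_eff = sizeup / k     # SizeupEff(T, K) = Sizeup(T, K) / K
    print(f"K={k}: Sizeup={sizeup:.2f}, SizeupEff={sizeup_eff:.2f}")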

Gustafson’s Law The sequential portion of a parallel program puts an upper bound on the efficiency it can achieve. Don’t run a problem of the same size on more and more processors; instead, scale up the problem size so that the running time stays the same.

Gustafson’s Law Determine what the running time would be on a single processor for the larger problem size attained by using K processors, where T(N, K) is held constant: T(N, 1) = F * T(N, K) + K * (1 – F) * T(N, K).
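
For example (illustrative numbers only, not from any measured program): with a sequential fraction F = 0.1 and K = 8 processors, T(N, 1) = 0.1 * T(N, 8) + 8 * 0.9 * T(N, 8) = 7.3 * T(N, 8); a single processor would take 7.3 times as long as the 8 processors on the scaled-up problem.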

Speedup and Efficiency From T(N, 1) = F * T(N, K) + K * (1 – F) * T(N, K), dividing by T(N, K) gives Speedup(N, K) = F + K – K * F and Eff(N, K) = F / K + 1 – F. As K increases, speedup keeps increasing without limit, and efficiency approaches 1 – F as K goes to infinity. This is unlike Amdahl, who says speedup approaches 1 / F and efficiency approaches 0 as K increases.
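
A small sketch comparing the two models; the sequential fraction F here is an assumed illustrative value, and the Amdahl expression 1 / (F + (1 – F) / K) is included only for comparison with the scaled (Gustafson) speedup above.

# Assumed sequential fraction (illustrative, not measured)
F = 0.1

for K in (1, 2, 4, 8, 16, 32):
    gustafson = F + K - K * F           # scaled speedup: F + K - K*F
    amdahl = 1 / (F + (1 - F) / K)      # fixed-size speedup, approaches 1/F
    print(f"K={K:2d}: Gustafson speedup={gustafson:5.2f}  Amdahl speedup={amdahl:5.2f}")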

Different Assumptions Amdahl: defines the sequential fraction F with respect to the running time on one processor. Gustafson: defines the sequential fraction F with respect to the running time on K processors.

Problem Size Laws Running time is held constant, but N varies; the running time model with model parameters a and d is T(N, K) = a + 1 / K * d * N. Solve for N to get the problem size model: N(T, K) = 1 / d * K * (T – a). This is the First Problem Size Law.
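
A minimal sketch of the First Problem Size Law. The model parameters a (fixed time independent of N) and d (time per problem element, divided among the K processors) are illustrative assumptions, not fitted values.

a = 2.0       # seconds of running time independent of N
d = 0.001     # seconds per problem element, divided among K processors
T = 60.0      # target running time in seconds

def running_time(n, k):
    """T(N, K) = a + (1/K) * d * N"""
    return a + d * n / k

def problem_size(t, k):
    """N(T, K) = (1/d) * K * (T - a), from solving T(N, K) = t for N"""
    return k * (t - a) / d

for K in (1, 2, 4, 8):
    N = problem_size(T, K)
    print(f"K={K}: N={N:,.0f}, check T(N, K)={running_time(N, K):.1f} s")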

Ideal Sizeup and Efficiency N(T, K) = 1 / d * K * (T – a). Using the First Problem Size Law to determine sizeup and efficiency, we get Sizeup(T, K) = K and SizeupEff(T, K) = 1.
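
To see why, substitute the First Problem Size Law into the sizeup definition: Sizeup(T, K) = N_par(T, K) / N_seq(T, 1) = [1 / d * K * (T – a)] / [1 / d * 1 * (T – a)] = K, and therefore SizeupEff(T, K) = K / K = 1.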

Realistic Sizeup and Efficiency The sequential portion’s running time does increase as N goes up: T(N, K) = (a + b * N) + 1 / K * (c + d * N). Solving for N gives N(T, K) = (K * T – K * a – c) / (K * b + d). This is the Second Problem Size Law. Then Sizeup(T, K) = (K * G + K) / (K * G + 1), where G = b / d. As K increases, Sizeup(T, K) approaches (G + 1) / G = 1 + 1 / G, so SizeupEff(T, K) approaches 0.
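
A sketch of the Second Problem Size Law and the resulting sizeup. The parameters a, b, c, d are illustrative assumptions (a + b * N is the sequential portion, (c + d * N) / K the parallelizable portion); c is set to 0 here so the measured ratio matches the closed-form sizeup exactly.

a, b, c, d = 2.0, 1e-5, 0.0, 1e-3
T = 60.0
G = b / d     # ratio of sequential growth rate b to parallelizable growth rate d

def problem_size(t, k):
    """N(T, K) = (K*T - K*a - c) / (K*b + d)"""
    return (k * t - k * a - c) / (k * b + d)

n_seq = problem_size(T, 1)
for K in (1, 2, 4, 16, 1000):
    measured = problem_size(T, K) / n_seq        # Sizeup(T, K) from the size model
    closed_form = (K * G + K) / (K * G + 1)      # slide formula with G = b/d
    print(f"K={K:4d}: sizeup={measured:6.2f}  formula={closed_form:6.2f}")
print(f"limit as K grows: 1 + 1/G = {1 + 1 / G:.2f}")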

Sizeup or Speedup? Fine-tune and test speedup during development. Focus on sizeup during operation.