

CS 584

Logic
The art of thinking and reasoning in strict accordance with the limitations and incapacities of the human misunderstanding. The basis of logic is the syllogism, consisting of a major and minor premise and a conclusion.

Example
- Major Premise: Sixty men can do a piece of work sixty times as quickly as one man.
- Minor Premise: One man can dig a post-hole in sixty seconds.
- Conclusion: Sixty men can dig a post-hole in one second.

Performance Analysis: "Tar Baby"
- Ask the right questions
- Questions to consider
  - What is time?
  - What is work?
- Objectivity is the key
  - Take a step back from your program

Performance Analysis Statements
- There is always a trade-off between time and solution quality.
- We should compare the quality of the answer for a given execution time.
- For any performance reporting, find and clearly state the quality measure.

Efficiency
- Efficiency is defined as speedup/P
- With superlinear speedup, efficiency > 1
  - Does cache make a processor work at 110%?
- Why is communication not considered work but rather overhead?
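The definition above can be sketched in a few lines of Python. The timings are hypothetical values chosen only to show what a superlinear case looks like; they are not measurements from the course.

```python
def speedup(t_serial, t_parallel):
    """Conventional speedup: serial time divided by parallel time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, p):
    """Efficiency is speedup divided by the number of processors P."""
    return speedup(t_serial, t_parallel) / p


# Hypothetical timings (seconds), invented for illustration only.
t_one_proc = 100.0
t_eight_procs = 11.0

e = efficiency(t_one_proc, t_eight_procs, 8)
print(f"efficiency on 8 processors: {e:.3f}")
if e > 1.0:
    print("superlinear: efficiency exceeds 1")
```

An efficiency above 1 usually signals that the single-processor baseline was handicapped (for example, by a working set that did not fit in cache), not that each processor did more than its share of work.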

Speedup
- Conventional speedup measures the reduction in execution time: serial time divided by parallel time.
- Consider running a problem on a slow parallel computer and on a faster one.
  - Same serial component
  - Speedup will be lower on the faster computer.
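A small numeric sketch of this effect follows. It assumes that the "same serial component" is a fixed number of seconds that does not shrink when the processors get faster (communication or I/O, for instance). The function name, the workload, and the speeds are all hypothetical.

```python
def conventional_speedup(work, speed, serial_seconds, p):
    """Speedup when only the `work / speed` part is divided among p processors.

    serial_seconds is assumed to be identical on one processor and on p
    processors, and independent of processor speed.
    """
    t_one = serial_seconds + work / speed
    t_p = serial_seconds + work / (speed * p)
    return t_one / t_p


# Hypothetical machines: same serial component, processors 10x faster on one.
slow = conventional_speedup(work=100.0, speed=1.0, serial_seconds=1.0, p=10)
fast = conventional_speedup(work=100.0, speed=10.0, serial_seconds=1.0, p=10)

print(f"slow machine speedup: {slow:.2f}")
print(f"fast machine speedup: {fast:.2f}")
```

The faster machine finishes sooner in absolute terms, yet reports the smaller speedup, because the fixed serial component is a larger fraction of its run time.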

Speedup and Amdahl's Law
- Conventional speedup penalizes faster absolute speed.
- Assumption that task size is constant as the computing power increases results in an exaggeration of task overhead.
- Scaling the problem size reduces these distortion effects.

Solution
- Gustafson introduces scaled speedup.
- Scale the problem size as you increase the number of processors.
- Calculated in two ways
  - Experimentally
  - Analytical models

Traditional Speedup
\[ \mathrm{Speedup} = \frac{C_1(N)}{C_P(N)} \]
- $C_1$ is complexity (time) taken on a single processor
- $C_P$ is complexity (time) taken on P processors

Scaled Speedup
\[ \mathrm{Speedup} = \frac{C_1(PN)}{C_P(PN)} \]
- $C_1$ is complexity (time) taken on a single processor
- $C_P$ is complexity (time) taken on P processors

Experimental Scaled Speedup
- Keep the ratio N/P constant between single processor case and many processor case when testing
- Example: Calculate the speedup for 8 and 16 processors.
  - N/P = 256
- How big should the problem be?
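Holding N/P fixed means the problem size grows linearly with the processor count, N = (N/P) * P. A minimal sketch for the ratio given in the example (the function name is made up for illustration):

```python
RATIO = 256  # N/P, taken from the example above


def scaled_problem_size(p, ratio=RATIO):
    """Problem size N that keeps N/P equal to `ratio` on p processors."""
    return ratio * p


for p in (1, 8, 16):
    print(f"P = {p:2d}  ->  N = {scaled_problem_size(p)}")
```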

Using analytical models
- Examine the control flow of the algorithm
- Find a general algebraic form for the complexity (execution time).
- Fit the curve with experimental data.
- If the fit is poor, find the missing terms and repeat.
- Calculate the scaled speedup using formula.
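The curve-fitting step can be sketched as an ordinary least-squares fit. The model form T(N, P) = a*N/P + b*P is an assumption chosen for illustration, and the "measurements" are synthetic values generated from a = 1, b = 5, not real timings. A real program may need additional terms.

```python
def fit_model(samples):
    """Least-squares fit of T = a*(N/P) + b*P via the 2x2 normal equations.

    samples: iterable of (n, p, seconds) tuples.
    Returns the coefficients (a, b).
    """
    s11 = s12 = s22 = r1 = r2 = 0.0
    for n, p, t in samples:
        x1, x2 = n / p, float(p)
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
        r1 += x1 * t
        r2 += x2 * t
    det = s11 * s22 - s12 * s12
    a = (r1 * s22 - r2 * s12) / det
    b = (s11 * r2 - s12 * r1) / det
    return a, b


def max_residual(samples, a, b):
    """Largest absolute gap between model and data; large values suggest missing terms."""
    return max(abs(a * n / p + b * p - t) for n, p, t in samples)


# Synthetic (N, P, seconds) data, invented for illustration only.
samples = [
    (512, 2, 266.0),
    (512, 4, 148.0),
    (1024, 4, 276.0),
    (1024, 8, 168.0),
    (2048, 8, 296.0),
]

a, b = fit_model(samples)
print(f"a = {a:.3f}, b = {b:.3f}, worst residual = {max_residual(samples, a, b):.3f}")
```

If the worst residual is large relative to the measured times, the model is likely missing a term (a log P communication cost, say), and the fit should be repeated with that term included.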

Example
- Serial Time = N seconds
- Parallel Time = N/P + 5P seconds
- Let N/P = 128
- Scaled Speedup for 4 processors is:
\[ \mathrm{Speedup} = \frac{C_1(PN)}{C_P(PN)} = \frac{4(128)}{4(128)/4 + 5(4)} \]
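The arithmetic in this example can be checked with a short script. It uses only the two timing models stated in the example. The traditional-speedup comparison at a fixed N = 128 is an added illustration, not part of the original slide.

```python
def serial_time(n):
    """Serial time from the example: N seconds."""
    return float(n)


def parallel_time(n, p):
    """Parallel time from the example: N/P + 5P seconds."""
    return n / p + 5 * p


def scaled_speedup(n_per_proc, p):
    """Speedup when the problem grows with P so that N/P stays fixed."""
    n = n_per_proc * p
    return serial_time(n) / parallel_time(n, p)


def traditional_speedup(n, p):
    """Speedup for a problem size N that does not change with P."""
    return serial_time(n) / parallel_time(n, p)


print(round(scaled_speedup(128, 4), 2))       # 512 / 148 -> 3.46
print(round(traditional_speedup(128, 4), 2))  # 128 / 52  -> 2.46
```

With the problem scaled to N = 512, the 5P overhead term is a smaller fraction of the parallel run time, so the scaled speedup on 4 processors comes out higher than the fixed-size one.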

Traditional Speedup
[Figure: speedup versus number of processors, showing the ideal and measured curves.]

Scaled Speedup
[Figure: speedup versus number of processors, showing the ideal curve and curves for a small, a medium, and a large problem.]

Assignment
- Problems on the web
- Create a model for your program
- Use the model to calculate
  - traditional speedup
  - scaled speedup
- Experimentally calculate the values
- Compare the results.