Parallel Computing: Demand for High Performance


Parallel Computing: Demand for High Performance
ITCS 4/5145 Parallel Programming, UNC-Charlotte, B. Wilkinson, 2012. July 10, 2013. slides1a-1

Parallel Computing
Using more than one computer, or a computer with more than one processor, to solve a problem.

Motives
Usually faster computation. The idea is very simple: n computers operating simultaneously can achieve the result faster, although it will not be n times faster, for various reasons (one standard way to quantify this is sketched below).
Other motives include fault tolerance, a larger amount of memory available, ...
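One standard way to see why n processors rarely give an n-times speedup is Amdahl's law (not part of the slide itself; shown here only as an illustration). The serial fraction used below is a made-up value, not a figure from the course.

# A minimal sketch: Amdahl's law speedup on n processors.
# The serial fraction f = 0.05 is a hypothetical value chosen only for illustration.

def amdahl_speedup(n, f):
    """Speedup on n processors when a fraction f of the work must run serially."""
    return 1.0 / (f + (1.0 - f) / n)

if __name__ == "__main__":
    f = 0.05
    for n in (2, 8, 64, 1024):
        print(f"n = {n:5d}: speedup = {amdahl_speedup(n, f):7.2f}  (ideal would be {n})")

Even with only 5% serial work, 1024 processors give a speedup of under 20, which is one reason "it will not be n times faster".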

Parallel programming has been around for more than 50 years. Gill wrote in 1958*:
“... There is therefore nothing new in the idea of parallel programming, but its application to computers. The author cannot believe that there will be any insuperable difficulty in extending it to computers. It is not to be expected that the necessary programming techniques will be worked out overnight. Much experimenting remains to be done. After all, the techniques that are commonly used in programming today were only won at the cost of considerable toil several years ago. In fact the advent of parallel programming may do something to revive the pioneering spirit in programming which seems at the present to be degenerating into a rather dull and routine occupation ...”
* Gill, S. (1958), “Parallel Programming,” The Computer Journal, vol. 1, April, pp. 2-10.

Some problems needing a large number of computers
Computer animation:
1995: Toy Story, processed on 200 processors (a cluster of 100 dual-processor machines).
1999: Toy Story 2, a 1400-processor system.
2001: Monsters, Inc., 3500 processors (250 servers each containing 14 processors).
The frame rate stays relatively constant, but increasing computing power provides more realistic animation.
Sequencing the human genome:
Celera Corporation used 150 four-processor servers plus a server with 16 processors.
Facts from: http://www.eng.utah.edu/~cs5955/material/mattson12.pdf

“Grand Challenge” Problems
Problems that cannot be solved in a “reasonable” time with today’s computers. (Obviously, an execution time of 10 years is always unreasonable.)
Examples:
Global weather forecasting
Modeling the motion of astronomical bodies
Modeling large DNA structures
...

Weather Forecasting
The atmosphere is modeled by dividing it into three-dimensional cells. The calculations for each cell (temperature, pressure, humidity, etc.) are repeated many times to model the passage of time.
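A minimal sketch of this style of simulation, assuming one scalar value per cell and a toy update rule (averaging each cell with its neighbors); a real forecast model solves far more complex equations, and the grid here is deliberately tiny.

# Toy cell-based simulation: a 3-D grid of cells updated over many time steps.
# The "physics" (averaging with the six face neighbors) is a made-up stand-in.
import numpy as np

def step(field):
    """One time step: replace each interior cell by the mean of itself
    and its six face neighbors."""
    new = field.copy()
    new[1:-1, 1:-1, 1:-1] = (
        field[1:-1, 1:-1, 1:-1]
        + field[:-2, 1:-1, 1:-1] + field[2:, 1:-1, 1:-1]
        + field[1:-1, :-2, 1:-1] + field[1:-1, 2:, 1:-1]
        + field[1:-1, 1:-1, :-2] + field[1:-1, 1:-1, 2:]
    ) / 7.0
    return new

if __name__ == "__main__":
    temperature = np.random.rand(32, 32, 10)   # tiny grid; the real problem has ~5 x 10^8 cells
    for _ in range(100):                       # repeat to model the passage of time
        temperature = step(temperature)
    print(temperature.mean())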

Global Weather Forecasting
Suppose the global atmosphere is divided into 1 mile × 1 mile × 1 mile cells to a height of 10 miles: about 5 × 10⁸ cells. Suppose the calculation in each cell requires 200 floating point operations. Then one time step requires about 10¹¹ floating point operations in total.
To forecast the weather over 7 days using 1-minute intervals, a computer operating at 1 Gflops (10⁹ floating point operations/s) takes 10⁶ seconds, or over 10 days. Obviously that will not work.
To perform the calculation in 5 minutes requires a computer operating at 3.4 Tflops (3.4 × 10¹² floating point operations/s).
A typical Intel processor operates in the 20-100 Gflops region and a GPU at up to 500 Gflops, so we will have to use multiple computers/cores. 3.4 Tflops is achievable with a parallel cluster, but note that 200 floating point operations per cell and 1 × 1 × 1 mile cells are just guesses.
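The slide's back-of-the-envelope numbers can be reproduced with a few lines; every input below (cell count, 200 operations per cell, 7 days of 1-minute steps, the 5-minute target) comes from the slide's own assumptions.

# Worked version of the slide's arithmetic.
cells = 5e8                                   # ~5 x 10^8 cells (1 x 1 x 1 mile, 10 miles high)
flops_per_step = cells * 200                  # = 1e11 floating point operations per time step
time_steps = 7 * 24 * 60                      # 7 days at 1-minute intervals = 10,080 steps
total_flops = flops_per_step * time_steps     # ~1e15 operations for the whole forecast

seconds_at_1gflops = total_flops / 1e9        # ~1e6 s, i.e. over 10 days
rate_for_5_minutes = total_flops / (5 * 60)   # ~3.4e12 flops/s, i.e. ~3.4 Tflops

print(f"operations per step : {flops_per_step:.1e}")
print(f"total operations    : {total_flops:.1e}")
print(f"days at 1 Gflops    : {seconds_at_1gflops / 86400:.1f}")
print(f"rate for 5 minutes  : {rate_for_5_minutes:.2e} flops/s")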

“Erik P. DeBenedictis of Sandia National Laboratories theorizes that a zettaFLOPS (ZFLOPS) computer is required to accomplish full weather modeling of two week time span.[1] Such systems might be built around 2030.[2]”
Wikipedia, “FLOPS,” http://en.wikipedia.org/wiki/FLOPS
[1] DeBenedictis, Erik P. (2005). “Reversible logic for supercomputing.” Proc. 2nd Conference on Computing Frontiers. New York, NY: ACM Press. pp. 391–402.
[2] “IDF: Intel says Moore's Law holds until 2029.” Heise Online. April 4, 2008.
1 ZFLOPS = 10²¹ FLOPS

Modeling Motion of Astronomical Bodies
Each body is attracted to every other body by gravitational forces. The movement of each body is predicted by calculating the total force on it and applying Newton’s laws (in the simple case) to determine its motion.

Modeling Motion of Astronomical Bodies
Each body experiences N-1 forces from the N-1 other bodies, so determining the (three-dimensional) force on one body is an O(N) calculation. With N bodies, approximately N² calculations are needed, i.e. O(N²).*
After determining the new positions of the bodies, the calculations are repeated, i.e. N² × T calculations where T is the number of time steps; the total complexity is O(N²T).
* There is an O(N log₂ N) algorithm, which we will cover in the course.
(A sketch of the direct O(N²) calculation follows.)
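A minimal sketch of the direct O(N²) force calculation, assuming point masses and plain Newtonian gravity; the function name and the tiny random test data are invented for illustration.

# Direct O(N^2) N-body force calculation (the "basic" algorithm on the slide).
import numpy as np

G = 6.674e-11  # gravitational constant

def total_forces(pos, mass):
    """Total gravitational force on each of the N bodies.
    pos: (N, 3) positions, mass: (N,) masses.  N-1 forces per body => O(N^2) work."""
    n = len(mass)
    forces = np.zeros((n, 3))
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            r = pos[j] - pos[i]                      # vector from body i to body j
            dist = np.linalg.norm(r)
            forces[i] += G * mass[i] * mass[j] * r / dist**3
    return forces

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pos = rng.random((100, 3)) * 1e9     # only 100 bodies here; a galaxy has ~10^11
    mass = rng.random(100) * 1e24
    print(total_forces(pos, mass)[0])    # total force on the first body

One time step would then update each body's velocity and position from this force (Newton's second law) and repeat for T steps, which is where the N² × T total cost comes from.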

A galaxy might have, say, 10¹¹ stars. Even if each calculation could be done in 1 µs (an extremely optimistic figure), one iteration takes:
about 10⁹ years using the N² algorithm, or
almost a year using the N log₂ N algorithm, assuming each calculation takes the same time (which may not be true).
Then multiply the time by the number of time steps!
We may set the N-body problem as an assignment using the basic O(N²) algorithm. However, you do not have 10⁹ years to get the solution and turn it in.
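To see why the O(N log₂ N) algorithm matters at this scale, the sketch below simply counts the calculations per iteration for N = 10¹¹; the per-calculation time and the resulting years quoted on the slide are rough figures and are not recomputed here.

# Operation counts per iteration for the two N-body algorithms.
import math

N = 1e11                         # stars in the galaxy (slide's figure)
pairwise = N**2                  # direct method: ~1e22 calculations
tree = N * math.log2(N)          # N log2 N method: ~3.7e12 calculations

print(f"O(N^2) calculations      : {pairwise:.1e}")
print(f"O(N log2 N) calculations : {tree:.1e}")
print(f"reduction factor         : {pairwise / tree:.1e}")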

My code, with graphics, running on the coit-grid server and forwarded to a client PC. How to do this will be explained later, with the assignment.

Questions