1a-1.1 Parallel Computing Demand for High Performance ITCS 4/5145 Parallel Programming UNC-Charlotte, B. Wilkinson Dec 27, 2012 slides1a-1.


1a-1.2 Parallel Computing

Using more than one computer, or a computer with more than one processor, to solve a problem.

Motives
– Usually faster computation. Very simple idea: n computers operating simultaneously can achieve the result faster, although it will not be n times faster, for various reasons.
– Other motives include: fault tolerance, larger amount of memory available, ...

1a-1.3 Parallel programming has been around for more than 50 years. Gill wrote in 1958*:

“... There is therefore nothing new in the idea of parallel programming, but its application to computers. The author cannot believe that there will be any insuperable difficulty in extending it to computers. It is not to be expected that the necessary programming techniques will be worked out overnight. Much experimenting remains to be done. After all, the techniques that are commonly used in programming today were only won at the cost of considerable toil several years ago. In fact the advent of parallel programming may do something to revive the pioneering spirit in programming, which seems at the present to be degenerating into a rather dull and routine occupation...”

* Gill, S. (1958), “Parallel Programming,” The Computer Journal, vol. 1, April, pp.

1a.4 Some problems needing a large number of computers

Computer animation:
– Toy Story: processed on 200 processors (a cluster of 100 dual-processor machines).
– Toy Story 2: a larger processor system.
– Monsters, Inc. (2001): 3500 processors (250 servers each containing 14 processors).
The frame rate is relatively constant, but increasing computing power provides more realistic animation.

Sequencing the human genome:
– The Celera Corporation used 150 four-processor servers plus a server with 16 processors.

Facts from:

1a-1.5 “Grand Challenge” Problems

Ones that cannot be solved in a “reasonable” time with today’s computers. Obviously, an execution time of 10 years is always unreasonable.

Examples
– Global weather forecasting
– Modeling motion of astronomical bodies
– Modeling large DNA structures
– …

1a-1.6 Weather Forecasting

The atmosphere is modeled by dividing it into three-dimensional cells. The calculation for each cell (temperature, pressure, humidity, etc.) is repeated many times to model the passage of time.

1a-1.7 Global Weather Forecasting

Suppose the global atmosphere is divided into 1 mile × 1 mile × 1 mile cells to a height of 10 miles – about 5 × 10^8 cells. Suppose the calculation in each cell requires 200 floating point operations. Then about 10^11 floating point operations are necessary in total for one time step. To forecast the weather over 7 days using 1-minute intervals, a computer operating at 1 Gflops (10^9 floating point operations/s) takes about 10^6 seconds, or over 10 days. Obviously that will not work. To perform the calculation in 5 minutes requires a computer operating at 3.4 Tflops (3.4 × 10^12 floating point operations/sec). In 2012, a typical Intel processor operates in the Gflops region and a GPU at up to 500 Gflops, so we will have to use multiple computers/cores. 3.4 Tflops is achievable with a parallel cluster, but note that 200 floating point operations per cell and 1 × 1 × 1 mile cells are just guesses.
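The arithmetic on this slide can be reproduced in a few lines as a sanity check. This is a rough sketch: the cell count, flops per cell, and target times are the slide's own estimates, not measured values.

```python
# Back-of-envelope check of the weather-forecasting estimate above.
cells = 5e8                 # 1 x 1 x 1 mile cells up to 10 miles altitude
flops_per_cell = 200        # floating point operations per cell per step

flops_per_step = cells * flops_per_cell    # 1e11 flops per time step
steps = 7 * 24 * 60                        # 7 days of 1-minute intervals

total_flops = flops_per_step * steps       # ~1e15 flops overall
serial_days = total_flops / 1e9 / 86400    # time on a 1 Gflops machine
required_rate = total_flops / (5 * 60)     # rate needed to finish in 5 min

print(f"serial time:   {serial_days:.1f} days")           # over 10 days
print(f"required rate: {required_rate / 1e12:.2f} Tflops") # ~3.4 Tflops
```

The serial time comes out at just under 12 days and the required rate at about 3.4 Tflops, matching the figures quoted above.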

“Erik P. DeBenedictis of Sandia National Laboratories theorizes that a zettaFLOPS (ZFLOPS) computer is required to accomplish full weather modeling of a two-week time span. [1] Such systems might be built around 2030. [2]” – Wikipedia, “FLOPS”

[1] DeBenedictis, Erik P. (2005), “Reversible Logic for Supercomputing,” Proc. 2nd Conference on Computing Frontiers, New York, NY: ACM Press, pp. 391–402.
[2] “IDF: Intel says Moore's Law holds until 2029,” Heise Online, April 4,

1 ZFLOPS = 10^21 FLOPS. 1a.8

1a-1.9 Modeling Motion of Astronomical Bodies

Each body is attracted to every other body by gravitational forces. The movement of each body is predicted by calculating the total force on it and applying Newton’s laws (in the simple case) to determine its motion.

1a-1.10 Modeling Motion of Astronomical Bodies

Each body has N−1 forces on it from the N−1 other bodies – an O(N) calculation to determine the force on one body (three dimensional). With N bodies, approximately N^2 calculations, i.e. O(N^2).* After determining the new positions of the bodies, the calculations are repeated, i.e. N^2 × T calculations, where T is the number of time steps; the total complexity is O(N^2 T).

* There is an O(N log2 N) algorithm, which we will cover in the course.
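One time step of the direct method just described can be sketched as follows. This is a minimal illustration, not the course code: the function name, the softening-free force formula, and the simple Euler update are all assumptions made here for brevity.

```python
import math

G = 6.674e-11  # gravitational constant (SI units)

def nbody_step(pos, vel, mass, dt):
    """Advance all bodies one time step using the direct O(N^2) method."""
    n = len(pos)
    # Force phase: each body feels the N-1 other bodies, so the nested
    # loops below are the O(N^2) cost per time step.
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = [pos[j][k] - pos[i][k] for k in range(3)]
            r = math.sqrt(d[0]**2 + d[1]**2 + d[2]**2)
            s = G * mass[j] / r**3  # acceleration per unit displacement
            for k in range(3):
                acc[i][k] += s * d[k]
    # Update phase: apply Newton's laws (simple Euler integration here)
    # to move the bodies.
    for i in range(n):
        for k in range(3):
            vel[i][k] += acc[i][k] * dt
            pos[i][k] += vel[i][k] * dt
    return pos, vel
```

Running T time steps repeats the N^2 force evaluations T times, which is exactly the O(N^2 T) total cost stated above.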

1a-1.11 A galaxy might have, say, 10^11 stars. Even if each calculation is done in 1 µs (an extremely optimistic figure), it takes:
– 10^9 years for one iteration using the N^2 algorithm, or
– almost a year for one iteration using the N log2 N algorithm, assuming the calculations take the same time (which may not be true).
Then multiply the time by the number of time steps! We may set the N-body problem as one assignment using the basic O(N^2) algorithm. However, you do not have 10^9 years to get the solution and turn it in.
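These order-of-magnitude claims can be checked in a few lines. This is a rough sketch assuming 10^11 stars and 1 µs per force calculation, so the arithmetic is self-consistent with the 10^9-year figure; note the O(N log2 N) estimate ignores the constant factor hidden by the O-notation, so it comes out lower than the slide's "almost a year".

```python
import math

N = 1e11        # stars in the galaxy
t_calc = 1e-6   # seconds per force calculation (very optimistic)
YEAR = 365 * 24 * 3600

# One iteration of each algorithm, counting only force calculations.
direct_years = N * N * t_calc / YEAR             # O(N^2) algorithm
tree_days = N * math.log2(N) * t_calc / 86400    # O(N log2 N) algorithm

print(f"direct: {direct_years:.1e} years")  # a few times 1e8 years
print(f"tree:   {tree_days:.0f} days")      # tens of days (no constant factor)
```

Either way the direct algorithm is hopeless at this scale, while the N log2 N algorithm brings one iteration into feasible territory.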

1a-1.12 Demo (Nbodies.wmv): my code with graphics running on the coit-grid server and forwarded to a client PC. How to do this will be explained later with the assignment.

1a-1.13 Questions