
1 Parallel Programming in C with MPI and OpenMP
Michael J. Quinn
Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

2 Chapter 1: Motivation and History

3 Outline
Motivation
Modern scientific method
Evolution of supercomputing
Modern parallel computers
Seeking concurrency
Data clustering case study
Programming parallel computers

4 Why Faster Computers?
Solve compute-intensive problems faster
 Make infeasible problems feasible
 Reduce design time
Solve larger problems in same amount of time
 Improve answer's precision
 Reduce design/analysis time
Gain competitive advantage

5 Definitions
Parallel computing
 Using a parallel computer to solve a single problem faster
Parallel computer
 Multiple-processor system supporting parallel programming
Parallel programming
 Programming in a language that supports concurrency explicitly

6 Why MPI?
MPI = "Message Passing Interface"
Standard specification for message-passing libraries
Libraries available on virtually all parallel computers
Free libraries also available for networks of workstations or commodity clusters

7 Why OpenMP?
OpenMP is an application programming interface (API) for shared-memory systems
Supports higher-performance parallel programming of symmetric multiprocessors

8 Classical Science
[Diagram: Nature, Observation, Theory, Physical Experimentation]

9 Modern Scientific Method
[Diagram: Nature, Observation, Theory, Physical Experimentation, Numerical Simulation]

10 Evolution of Supercomputing
World War II
 Hand-computed artillery tables
 Need to speed computations
 ENIAC
Cold War
 Nuclear weapon design
 Intelligence gathering
 Code-breaking

11 Supercomputer
General-purpose computer
Solves individual problems at high speeds, compared with contemporary systems
Typically costs $10 million or more
Traditionally found in government labs

12 Commercial Supercomputing
Started in capital-intensive industries
 Petroleum exploration
 Automobile manufacturing
Other companies followed suit
 Pharmaceutical design
 Consumer products

13 50 Years of Speed Increases
ENIAC: 350 flops
Today: > 1 trillion flops

14 CPUs 1 Million Times Faster
Faster clock speeds
Greater system concurrency
 Multiple functional units
 Concurrent instruction execution
 Speculative instruction execution

15 Systems 1 Billion Times Faster
Processors are 1 million times faster
Combine thousands of processors
Parallel computer
 Multiple processors
 Supports parallel programming
Parallel computing = using a parallel computer to execute a program faster

16 Microprocessor Revolution
[Chart: speed (log scale) vs. time for micros, minis, mainframes, and ECL supercomputers]

17 Cray (ECL) Machines

18 Early Multicomputers
Caltech's Cosmic Cube (Seitz and Fox)
Commercial copy-cats
 nCUBE Corporation
 Intel's Supercomputer Systems Division
 Lots more
Thinking Machines Corporation

19 Copy-cat Strategy
Microprocessor
 1% peak performance of supercomputer
 0.1% cost of supercomputer
Parallel computer = 1000 microprocessors
 10× peak performance of supercomputer
 Same cost as supercomputer

20 Why Didn't Everybody Buy One?
Supercomputer ≠ collection of CPUs
 Peak performance ≠ delivered speed
 Inadequate I/O
Software
 Inadequate operating systems
 Inadequate programming environments

21 Beowulf Concept
NASA (Sterling and Becker)
Commodity processors
Commodity interconnect
Linux operating system
Message Passing Interface (MPI) library
High performance/$ for certain applications

22 Commercial (CMOS) Parallel Systems
SMPs (Sun, Dell)
IBM SP: multicomputer with custom interconnect
Supercomputers (Cray X1, NEC SX-6)

23 Cray X1

24 NEC SX-6

25 Advanced Strategic Computing Initiative
U.S. nuclear policy changes
 Moratorium on testing
 Production of new weapons halted
Numerical simulations needed to maintain existing stockpile
Five supercomputers costing up to $100 million each

26 ASCI White (10 teraops/sec)

27 Japan's Earth Simulator
Through collaboration with various national agencies and industries, and with the support of the Japanese nation, the Earth Simulator Project is dedicated to serving society. Basic principles:
1. Quantitative prediction and assessment of variations of the atmosphere, ocean, and solid earth.
2. Production of reliable data to protect human lives and property from natural disasters and environmental destruction.
3. Contribution to a symbiotic relationship between human activities and nature.
4. Promotion of innovative and epoch-making simulation in fields such as industry, bioscience, and energy.

28 Earth Simulator (40 teraflops)

29 Seeking Concurrency
Data dependence graphs
Data parallelism
Functional parallelism
Pipelining

30 Data Dependence Graph
Directed graph
Vertices = tasks
Edges = dependences

31 Data Parallelism
Independent tasks apply the same operation to different elements of a data set
Okay to perform operations concurrently

for i ← 0 to 99 do
  a[i] ← b[i] + c[i]
endfor

32 Functional Parallelism
Independent tasks apply different operations to different data elements
First and second statements can execute concurrently
Third and fourth statements can execute concurrently

a ← 2
b ← 3
m ← (a + b) / 2
s ← (a² + b²) / 2
v ← s − m²

33 Pipelining
Divide a process into stages
Produce several items simultaneously

34 Partial Sums Pipeline

35 Data Clustering
Data mining = looking for meaningful patterns in large data sets
Data clustering = organizing a data set into clusters of "similar" items
Data clustering can speed retrieval of related items

36 Document Vectors
[Plot: documents (Alice in Wonderland, A Biography of Jules Verne, The Geology of Moon Rocks, The Story of Apollo 11) positioned by frequency of the terms "Moon" and "Rocket"]

37 Document Clustering

38 Clustering Algorithm
Compute document vectors
Choose initial cluster centers
Repeat
 Compute performance function
 Adjust centers
Until function value converges or max iterations have elapsed
Output cluster centers

39 Data Parallelism Opportunities
Operation being applied to a data set
Examples
 Generating document vectors
 Finding closest center to each vector
 Picking initial values of cluster centers

40 Functional Parallelism Opportunities
Draw data dependence diagram
Look for sets of nodes such that there are no paths from one node to another

41 Data Dependence Diagram
[Diagram nodes: Build document vectors, Choose cluster centers, Compute function value, Adjust cluster centers, Output cluster centers]

42 Programming Parallel Computers
Extend compilers: translate sequential programs into parallel programs
Extend languages: add parallel operations
Add parallel language layer on top of sequential language
Define totally new parallel language and compiler system

43 Strategy 1: Extend Compilers
Parallelizing compiler
 Detect parallelism in sequential program
 Produce parallel executable program
Focus on making Fortran programs parallel

44 Extend Compilers (cont.)
Advantages
 Can leverage millions of lines of existing serial programs
 Saves time and labor
 Requires no retraining of programmers
 Sequential programming easier than parallel programming

45 Extend Compilers (cont.)
Disadvantages
 Parallelism may be irretrievably lost when programs are written in sequential languages
 Performance of parallelizing compilers on a broad range of applications is still an open question

46 Strategy 2: Use Libraries
Add functions to a sequential language
 Create and terminate processes
 Synchronize processes
 Allow processes to communicate

47 Use Libraries (cont.)
Advantages
 Easiest, quickest, and least expensive
 Allows existing compiler technology to be leveraged
 New libraries can be ready soon after new parallel computers are available

48 Use Libraries (cont.)
Disadvantages
 Lack of compiler support to catch errors
 Easy to write programs that are difficult to debug

49 Strategy 3: Add a Parallel Programming Layer
Lower layer
 Core of computation
 Process manipulates its portion of data to produce its portion of result
Upper layer
 Creation and synchronization of processes
 Partitioning of data among processes
A few research prototypes have been built based on these principles

50 Strategy 4: Create a Parallel Language
Develop a parallel language "from scratch"
 occam is an example
Add parallel constructs to an existing language
 Fortran 90
 C*

51 New Parallel Languages (cont.)
Advantages
 Allows programmer to communicate parallelism to compiler
 Improves probability that executable will achieve high performance
Disadvantages
 Requires development of new compilers
 New languages may not become standards
 Programmer resistance

52 Strategy 5: Use Directives
Directives annotate constructs to specify parallelism and data distribution
Approach followed by High Performance Fortran (HPF) and OpenMP
Program should work correctly if directives are ignored; this is difficult to enforce
Compiler is needed; implementation is not as hard as extending the language, but not as simple as using libraries

53 Current Status
Low-level approach is most popular
 Augment existing language with low-level parallel constructs
 MPI and OpenMP are examples
Advantages of low-level approach
 Efficiency
 Portability
Disadvantage: more difficult to program and debug

54 Summary (1/2)
High performance computing
 U.S. government
 Capital-intensive industries
 Many companies and research labs
Parallel computers
 Commercial systems
 Commodity-based systems

55 Summary (2/2)
Power of CPUs keeps growing exponentially
Parallel programming environments changing very slowly
Two standards have emerged
 MPI library, for processes that do not share memory
 OpenMP directives, for processes that do share memory

