Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Parallel Programming in C with MPI and OpenMP
Michael J. Quinn
Chapter 1: Motivation and History
Outline
- Motivation
- Modern scientific method
- Evolution of supercomputing
- Modern parallel computers
- Seeking concurrency
- Data clustering case study
- Programming parallel computers
Why Faster Computers?
- Solve compute-intensive problems faster
  - Make infeasible problems feasible
  - Reduce design time
- Solve larger problems in the same amount of time
  - Improve the answer's precision
  - Reduce design/analysis time
- Gain competitive advantage
Definitions
- Parallel computing: using a parallel computer to solve single problems faster
- Parallel computer: a multiple-processor system supporting parallel programming
- Parallel programming: programming in a language that supports concurrency explicitly
Why MPI?
- MPI = "Message Passing Interface"
- Standard specification for message-passing libraries
- Libraries available on virtually all parallel computers
- Free libraries also available for networks of workstations or commodity clusters
Why OpenMP?
- OpenMP is an application programming interface (API) for shared-memory systems
- Supports high-performance parallel programming of symmetric multiprocessors
Classical Science
(figure: cycle linking Nature, Observation, Theory, and Physical Experimentation)
Modern Scientific Method
(figure: the classical cycle of Nature, Observation, Theory, and Physical Experimentation, extended with Numerical Simulation)
Evolution of Supercomputing
- World War II
  - Hand-computed artillery tables
  - Need to speed computations
  - ENIAC
- Cold War
  - Nuclear weapon design
  - Intelligence gathering
  - Code-breaking
Supercomputer
- General-purpose computer
- Solves individual problems at high speeds, compared with contemporary systems
- Typically costs $10 million or more
- Traditionally found in government labs
Commercial Supercomputing
- Started in capital-intensive industries
  - Petroleum exploration
  - Automobile manufacturing
- Other companies followed suit
  - Pharmaceutical design
  - Consumer products
50 Years of Speed Increases
- ENIAC: 350 flops
- Today: > 1 trillion flops
CPUs 1 Million Times Faster
- Faster clock speeds
- Greater system concurrency
  - Multiple functional units
  - Concurrent instruction execution
  - Speculative instruction execution
Systems 1 Billion Times Faster
- Processors are 1 million times faster
- Combine thousands of processors
- Parallel computer
  - Multiple processors
  - Supports parallel programming
- Parallel computing = using a parallel computer to execute a program faster
Microprocessor Revolution
(figure: speed on a log scale versus time for micros, minis, mainframes, and ECL supercomputers)
Cray (ECL) Machines
Early Multicomputers
- Caltech's Cosmic Cube (Seitz and Fox)
- Commercial copy-cats
  - nCUBE Corporation
  - Intel's Supercomputer Systems Division
  - Lots more
- Thinking Machines Corporation
Copy-cat Strategy
- Microprocessor
  - 1% peak performance of supercomputer
  - 0.1% cost of supercomputer
- Parallel computer = 1000 microprocessors
  - 10x peak performance of supercomputer
  - Same cost as supercomputer
Why Didn't Everybody Buy One?
- Supercomputer CPUs
  - Peak performance speed
  - Inadequate I/O
- Software
  - Inadequate operating systems
  - Inadequate programming environments
Beowulf Concept
- NASA (Sterling and Becker)
- Commodity processors
- Commodity interconnect
- Linux operating system
- Message Passing Interface (MPI) library
- High performance/$ for certain applications
Commercial (CMOS) Parallel Systems
- SMPs (Sun, Dell)
- IBM SP: multicomputer with custom interconnect
- Supercomputers (Cray X1, NEC SX-6)
Cray X1

NEC SX-6
Advanced Strategic Computing Initiative
- U.S. nuclear policy changes
  - Moratorium on testing
  - Production of new weapons halted
- Numerical simulations needed to maintain existing stockpile
- Five supercomputers costing up to $100 million each
ASCI White (10 teraops/sec)
Japan's Earth Simulator
Through collaboration with various national agencies and industries, and with the support of the Japanese nation, the Earth Simulator Project is dedicated to serving society. Basic principles:
1. Quantitative prediction and assessment of variations of the atmosphere, ocean, and solid earth.
2. Production of reliable data to protect human lives and property from natural disasters and environmental destruction.
3. Contribution to a symbiotic relationship between human activities and nature.
4. Promotion of innovative and epoch-making simulation in fields such as industry, bioscience, and energy.
Earth Simulator (40 teraflops)
Seeking Concurrency
- Data dependence graphs
- Data parallelism
- Functional parallelism
- Pipelining
Data Dependence Graph
- Directed graph
- Vertices = tasks
- Edges = dependences
Data Parallelism
- Independent tasks apply the same operation to different elements of a data set
- Okay to perform operations concurrently

for i ← 0 to 99 do
  a[i] ← b[i] + c[i]
endfor
Functional Parallelism
- Independent tasks apply different operations to different data elements
- First and second statements are independent
- Third and fourth statements are independent

a ← 2
b ← 3
m ← (a + b) / 2
s ← (a^2 + b^2) / 2
v ← s - m^2
Pipelining
- Divide a process into stages
- Produce several items simultaneously
Partial Sums Pipeline
Data Clustering
- Data mining = looking for meaningful patterns in large data sets
- Data clustering = organizing a data set into clusters of "similar" items
- Data clustering can speed retrieval of related items
Document Vectors
(figure: documents such as Alice in Wonderland, A Biography of Jules Verne, The Geology of Moon Rocks, and The Story of Apollo 11 plotted against the terms "Moon" and "Rocket")
Document Clustering
Clustering Algorithm
- Compute document vectors
- Choose initial cluster centers
- Repeat
  - Compute performance function
  - Adjust centers
- Until function value converges or max iterations have elapsed
- Output cluster centers
Data Parallelism Opportunities
- Operation being applied to a data set
- Examples
  - Generating document vectors
  - Finding the closest center to each vector
  - Picking initial values of cluster centers
Functional Parallelism Opportunities
- Draw data dependence diagram
- Look for sets of nodes such that there are no paths from one node to another
Data Dependence Diagram
(figure: nodes for Build document vectors, Choose cluster centers, Compute function value, Adjust cluster centers, and Output cluster centers)
Programming Parallel Computers
- Extend compilers: translate sequential programs into parallel programs
- Extend languages: add parallel operations
- Add parallel language layer on top of sequential language
- Define totally new parallel language and compiler system
Strategy 1: Extend Compilers
- Parallelizing compiler
  - Detect parallelism in sequential program
  - Produce parallel executable program
- Focus on making Fortran programs parallel
Extend Compilers (cont.)
Advantages
- Can leverage millions of lines of existing serial programs
- Saves time and labor
- Requires no retraining of programmers
- Sequential programming is easier than parallel programming
Extend Compilers (cont.)
Disadvantages
- Parallelism may be irretrievably lost when programs are written in sequential languages
- Performance of parallelizing compilers on a broad range of applications is still an open question
Strategy 2: Use Libraries
- Add functions to a sequential language
  - Create and terminate processes
  - Synchronize processes
  - Allow processes to communicate
Use Libraries (cont.)
Advantages
- Easiest, quickest, and least expensive
- Allows existing compiler technology to be leveraged
- New libraries can be ready soon after new parallel computers are available
Use Libraries (cont.)
Disadvantages
- Lack of compiler support to catch errors
- Easy to write programs that are difficult to debug
Strategy 3: Add a Parallel Programming Layer
- Lower layer
  - Core of computation
  - Process manipulates its portion of data to produce its portion of result
- Upper layer
  - Creation and synchronization of processes
  - Partitioning of data among processes
- A few research prototypes have been built based on these principles
Strategy 4: Create a Parallel Language
- Develop a parallel language "from scratch"
  - occam is an example
- Add parallel constructs to an existing language
  - Fortran 90
  - C*
New Parallel Languages (cont.)
Advantages
- Allows programmer to communicate parallelism to compiler
- Improves probability that executable will achieve high performance
Disadvantages
- Requires development of new compilers
- New languages may not become standards
- Programmer resistance
Strategy 5: Use Directives
- Directives annotate constructs to specify parallelism and data distribution
- Approach followed by High Performance Fortran (HPF) and OpenMP
- Program should work correctly if directives are ignored; this is difficult to enforce
- A compiler is needed; implementation is not as hard as extending the language, but not as simple as using libraries
Current Status
- Low-level approach is most popular
  - Augment existing language with low-level parallel constructs
  - MPI and OpenMP are examples
- Advantages of low-level approach
  - Efficiency
  - Portability
- Disadvantage: more difficult to program and debug
Summary (1/2)
- High-performance computing
  - U.S. government
  - Capital-intensive industries
  - Many companies and research labs
- Parallel computers
  - Commercial systems
  - Commodity-based systems
Summary (2/2)
- Power of CPUs keeps growing exponentially
- Parallel programming environments changing very slowly
- Two standards have emerged
  - MPI library, for processes that do not share memory
  - OpenMP directives, for processes that do share memory