1
Today's Software For Tomorrow's Hardware: An Introduction to Parallel Computing
Rahul S. Sampath
May 9th, 2007
2
Computational Power Today…
3
Floating Point Operations Per Second (FLOPS)
Humans doing long division: milliflops (1/1000th of one FLOP)
Cray-1 supercomputer, 1976, $8M: 80 MFLOPS
Pentium II, 400 MHz: 100 MFLOPS
Typical high-end PC today: ~1 GFLOPS
Sony PlayStation 3, 2006: 2 TFLOPS
IBM TRIPS, 2010 (one-chip solution, CPU only): 1 TFLOPS
IBM Blue Gene, < 2010 (with 65,536 microprocessors): 360 TFLOPS
4
Why do we need more?
"DOS addresses only 1 MB of RAM because we cannot imagine any application needing more." -- Microsoft, 1980
"640K ought to be enough for anybody." -- Bill Gates, 1981
Bottom line: demand for computational power will continue to increase.
5
Some Computationally Intensive Applications Today
Computer-aided surgery
Medical imaging
Molecular dynamics (MD) simulations
FEM simulations with > 10^10 unknowns
Galaxy formation and evolution: a 17-million-particle Cold Dark Matter cosmology simulation
6
Any application that can be scaled up should be treated as a computationally intensive application.
7
The Need for Parallel Computing
Memory (RAM): there is a theoretical limit on the RAM your computer can address.
32-bit systems: 4 GB (2^32 bytes)
64-bit systems: 16 exabytes (2^64 bytes, millions of TB); the arithmetic is spelled out below.
Speed: upgrading microprocessors can't help you anymore. FLOPS is not the bottleneck, memory is; what we need is more registers.
Think pre-computing, a higher-bandwidth memory bus, L2/L3 caches, compiler optimizations, assembly language... the asylum.
Or... think parallel.
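As a quick sanity check on those address-space limits (arithmetic added here, not on the original slide):

\[
2^{32}\ \text{bytes} = 4\,294\,967\,296\ \text{bytes} = 4\ \text{GiB},
\qquad
2^{64}\ \text{bytes} = 2^{32} \times 4\ \text{GiB} = 16\ \text{EiB} \approx 1.8 \times 10^{19}\ \text{bytes}.
\]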
8
Hacks
If speed is not an issue, is an out-of-core implementation an option?
Parallel programs can be converted into out-of-core implementations easily (see the sketch below).
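A minimal sketch of the out-of-core idea in C (not from the slides; the file name big.dat and the chunk size are illustrative assumptions): instead of holding the whole data set in RAM, the program streams it through a fixed-size buffer, much as a parallel code touches only one rank's portion at a time.

/* Out-of-core sum: process a large file of doubles in fixed-size chunks. */
#include <stdio.h>
#include <stdlib.h>

#define CHUNK (1 << 20)   /* 1M doubles (8 MB) resident in RAM at any time */

int main(void) {
    FILE *f = fopen("big.dat", "rb");          /* hypothetical input file */
    if (!f) { perror("big.dat"); return 1; }

    double *buf = malloc(CHUNK * sizeof(double));
    if (!buf) { fclose(f); return 1; }

    double sum = 0.0;
    size_t n;

    /* Read one chunk, process it, discard it, repeat. */
    while ((n = fread(buf, sizeof(double), CHUNK, f)) > 0)
        for (size_t i = 0; i < n; i++)
            sum += buf[i];

    printf("sum = %g\n", sum);
    free(buf);
    fclose(f);
    return 0;
}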
9
Parallel Algorithms
10
The Key Questions
Why? Memory, speed, or both?
What kind of platform? Shared memory or distributed computing?
Typical size of the application: small (< 32 processors), medium (32 - 256 processors), or large (> 256 processors)?
How much time and effort do you want to invest?
How many times will the component be used in a single execution of the program?
11
Factors to Consider in Any Parallel Algorithm Design
Load balancing: give equal work to all processors at all times.
Efficient memory management: give an equal amount of data to all processors.
Minimize communication, especially iterative communication: processors should work independently as much as possible.
Overlap communication and computation: if communication is necessary, try to do some useful work in the background as well (see the sketch after this list).
Optimal work: keep the sequential part of the parallel algorithm as close as possible to the best sequential algorithm.
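A hedged sketch of the "overlap communication and computation" idea, using non-blocking MPI calls; the 1-D halo exchange, the array size N, and the compute_interior/compute_boundary placeholders are illustrative assumptions, not part of the original talk.

/* Sketch: hide a halo exchange behind interior work using non-blocking MPI. */
#include <mpi.h>

#define N 1024   /* local array size per process (illustrative) */

static void compute_interior(double *u) { (void)u; /* placeholder: work on u[1..N-2] */ }
static void compute_boundary(double *u) { (void)u; /* placeholder: work on u[0], u[N-1] */ }

static void halo_step(double *u, int rank, int size) {
    int left  = (rank == 0)        ? MPI_PROC_NULL : rank - 1;
    int right = (rank == size - 1) ? MPI_PROC_NULL : rank + 1;
    double from_left, from_right;
    MPI_Request req[4];

    /* 1. Start the communication ... */
    MPI_Irecv(&from_left,  1, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &req[0]);
    MPI_Irecv(&from_right, 1, MPI_DOUBLE, right, 1, MPI_COMM_WORLD, &req[1]);
    MPI_Isend(&u[0],       1, MPI_DOUBLE, left,  1, MPI_COMM_WORLD, &req[2]);
    MPI_Isend(&u[N - 1],   1, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &req[3]);

    /* 2. ... do the work that does not need the neighbours' data ... */
    compute_interior(u);

    /* 3. ... then wait for the messages and finish the rest. */
    MPI_Waitall(4, req, MPI_STATUSES_IGNORE);
    compute_boundary(u);
}

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double u[N] = {0.0};
    halo_step(u, rank, size);

    MPI_Finalize();
    return 0;
}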
12
Differences Between Sequential and Parallel Algorithms
Not all data is accessible at all times: computations must be as localized as possible, and you can't have random access.
There is a new dimension to the existing algorithm, division of work: which processor does what portion of the work?
If communication cannot be avoided: how will it be initiated? What type of communication? What are the pre-processing and post-processing operations?
The order of operations can be critical for performance.
13
Parallel Algorithm Approaches
Data-parallel approach: partition the data among the processors; each processor executes the same set of commands (see the sketch below).
Control-parallel approach: partition the tasks to be performed among the processors; each processor executes different commands.
Hybrid approach: switch between the two approaches at different stages of the algorithm. Most parallel algorithms fall in this category.
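A hedged sketch of the data-parallel approach in C with MPI (the problem size and the per-element work are made up for illustration): every process runs the same program, works on its own slice of the index range, and a reduction combines the partial results.

/* Data-parallel sketch: same program everywhere, each rank owns one slice. */
#include <mpi.h>
#include <stdio.h>

#define N 1000000   /* global problem size (illustrative) */

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Partition the index range [0, N) among the processes. */
    long lo = (long)N *  rank      / size;
    long hi = (long)N * (rank + 1) / size;

    /* Every rank executes the same commands on its own portion. */
    double local = 0.0, global = 0.0;
    for (long i = lo; i < hi; i++)
        local += 1.0 / (i + 1);       /* stand-in for real per-element work */

    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0) printf("harmonic sum H(%d) = %f\n", N, global);

    MPI_Finalize();
    return 0;
}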
14
Performance Metrics
Speedup
Overhead
Scalability: fixed-size and iso-granular
Efficiency: speedup per processor
Iso-efficiency: problem size as a function of p needed to keep efficiency constant (the standard definitions are written out below)
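The standard textbook definitions behind these metrics (as in, e.g., Grama et al. from the references; written here with T_1 for the best sequential time and T_p for the parallel time on p processors):

\[
S(p) = \frac{T_1}{T_p}, \qquad
E(p) = \frac{S(p)}{p} = \frac{T_1}{p\,T_p}, \qquad
T_o(p) = p\,T_p - T_1 \ \ (\text{total overhead}).
\]

The iso-efficiency function then gives the rate at which the problem size must grow with p so that E(p) stays constant.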
15
The Take-Home Message
A good parallel algorithm is NOT a simple extension of the corresponding sequential algorithm.
What model to use? Problem dependent; e.g., a + b + c + d + ... = (a + b) + (c + d) + ... (illustrated below).
Not much choice, really: it is a big investment, but it can really be worth it.
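To spell out that example (an illustration added here, not from the slides): re-associating the sum lets pairs be added in parallel,

\[
a+b+c+d+e+f+g+h \;=\; \bigl((a+b)+(c+d)\bigr) + \bigl((e+f)+(g+h)\bigr),
\]

so p numbers can be combined in about log2(p) parallel steps instead of p - 1 sequential ones. Note that with floating point the re-association can change the rounding, and hence the answer, slightly.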
16
Parallel Programming
17
How does a parallel program work?
You request a certain number of processors.
You set up a communicator, which gives a unique id to each processor: its rank.
Every processor executes the same program. Inside the program, you query for the rank and use it to decide what to do, and you exchange messages between processors using their ranks.
In theory, you only need 3 functions: Isend, Irecv, and Wait (see the minimal sketch below).
In practice, you can optimize communication depending on the underlying network topology: Message Passing Standards...
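A minimal, hedged sketch of that pattern in C (the value 42 and the two-rank exchange are made up for illustration); it uses only the three calls named above, in their MPI spellings MPI_Isend, MPI_Irecv and MPI_Wait:

/* Every process runs this same program, asks the communicator for its rank,
 * and uses the rank to decide what to do. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* my unique id */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* number of processes requested */

    MPI_Request req;
    int value;

    if (rank == 0 && size > 1) {
        value = 42;
        MPI_Isend(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);
        MPI_Wait(&req, MPI_STATUS_IGNORE);
        printf("rank 0 sent %d to rank 1\n", value);
    } else if (rank == 1) {
        MPI_Irecv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &req);
        MPI_Wait(&req, MPI_STATUS_IGNORE);
        printf("rank 1 received %d from rank 0\n", value);
    }

    MPI_Finalize();
    return 0;
}

With an implementation such as MPICH or LAM (listed later), this would typically be built with mpicc and launched with something like mpirun -np 2 ./a.out; the exact launcher varies between implementations.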
18
Message Passing Standards
The standards define a set of primitive communication operations. The vendors implementing a standard on a given machine are responsible for optimizing those operations for that machine.
Popular standards:
Message Passing Interface (MPI)
OpenMP (Open Multi-Processing): strictly a directive-based standard for shared-memory parallelism rather than message passing (see the sketch below)
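For contrast, a minimal sketch of the OpenMP style (added here for illustration): one process, many threads on shared memory, with the work-sharing expressed as compiler directives rather than explicit messages. It computes the same sum as the earlier MPI sketch.

/* OpenMP sketch: threads share memory, so no explicit partitioning or
 * messages are written; the directive does the work-sharing. */
#include <omp.h>
#include <stdio.h>

int main(void) {
    const int n = 1000000;
    double sum = 0.0;

    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++)
        sum += 1.0 / (i + 1);     /* stand-in for real per-element work */

    printf("harmonic sum = %f (max threads: %d)\n", sum, omp_get_max_threads());
    return 0;
}

This would typically be compiled with an OpenMP flag (e.g. -fopenmp with gcc); without it the pragma is ignored and the code runs sequentially.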
19
Languages that support MPI
Fortran 77
C/C++
Python
Matlab
20
MPI Implementations
MPICH: ftp://info.mcs.anl.gov/pub/mpi
LAM: http://www.mpi.nd.edu/lam/download
CHIMP: ftp://ftp.epcc.ed.ac.uk/pub/chimp/release
WinMPI (Windows): ftp://csftp.unomaha.edu/pub/rewini/WinMPI
W32MPI (Windows): http://dsg.dei.uc.pt/wmpi/intro.html
21
Open Source Parallel Software
PETSc (linear and nonlinear solvers): http://www-unix.mcs.anl.gov/petsc/petsc-as/
ScaLAPACK (linear algebra): http://www.netlib.org/scalapack/scalapack_home.html
SPRNG (random number generation): http://sprng.cs.fsu.edu/
ParaView (visualization): http://www.paraview.org/HTML/Index.html
NAMD (molecular dynamics): http://www.ks.uiuc.edu/Research/namd/
Charm++ (parallel objects): http://charm.cs.uiuc.edu/research/charm/
22
References
Parallel Programming with MPI, Peter S. Pacheco
Introduction to Parallel Computing, A. Grama, A. Gupta, G. Karypis, V. Kumar
MPI: The Complete Reference, William Gropp et al.
http://www-unix.mcs.anl.gov/mpi/
http://www.erc.msstate.edu/mpi
http://www.epm.ornl.gov/~walker/mpi
http://www.erc.msstate.edu/mpi/mpi-faq.html (FAQ)
comp.parallel.mpi (newsgroup)
http://www.mpi-forum.org (MPI Forum)
23
Thank You