Parallel Computing Team 1 Aakanksha Gupta, Solomon Walker, Guanghong Wang
Topics
What is Parallel Computing?
Why Use Parallel Computing?
Concepts and Terminology
Parallel Computer Memory Architectures
Parallel Programming Models
Designing Parallel Programs
Parallel Examples
What is Parallel Computing?
Serial Computing:
A problem is broken into a discrete series of instructions
Instructions are executed sequentially, one after another
Executed on a single processor
Only one instruction may execute at any moment in time

Parallel Computing:
A problem is broken into discrete parts that can be solved concurrently
Each part is further broken down into a series of instructions
Instructions from each part execute simultaneously on different processors
An overall control/coordination mechanism is employed
Parallel Computing
The compute resources are typically:
A single computer with multiple processors/cores
An arbitrary number of such computers connected by a network

The computational problem should be able to:
Be broken apart into discrete pieces of work that can be solved simultaneously
Execute multiple program instructions at any moment in time
Be solved in less time with multiple compute resources than with a single compute resource
Why Use Parallel Computing?
Concepts and Terminology
Von Neumann Architecture
Flynn's Classical Taxonomy
Amdahl's Law: speedup = 1 / ((P / N) + S), where P = parallel fraction, N = number of processors, and S = serial fraction
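As a quick illustration of Amdahl's Law, the sketch below (not from the slides; the helper name and sample numbers are made up) computes the predicted speedup for a given parallel fraction and processor count.

```c
#include <stdio.h>

/* Hypothetical helper illustrating Amdahl's Law:
   speedup = 1 / (S + P/N), with P + S = 1. */
static double amdahl_speedup(double parallel_fraction, int num_procs) {
    double serial_fraction = 1.0 - parallel_fraction;
    return 1.0 / (serial_fraction + parallel_fraction / num_procs);
}

int main(void) {
    /* A 95% parallel program on 8 processors: roughly 5.9x speedup. */
    printf("speedup = %.2f\n", amdahl_speedup(0.95, 8));
    return 0;
}
```

Even with an unlimited number of processors the speedup is bounded by 1/S, which is why the serial fraction dominates scalability.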
Parallel Computer Memory Architectures
Shared Memory
Uniform Memory Access (UMA)
Non-Uniform Memory Access (NUMA)
Distributed Memory
Parallel Computer Memory Architectures
Hybrid Distributed-Shared Memory
Parallel Programming Models
Shared Memory Model (without threads)
Processes/tasks share a common address space, which they read from and write to asynchronously
Locks/semaphores are used to control access to the shared memory
Implementation: native OS support on UNIX (e.g., POSIX shared memory)
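A minimal sketch of this model, assuming POSIX shared memory and a named semaphore on a UNIX-like system (the object names "/demo_shm" and "/demo_sem" are made up; on Linux, link with -lrt -lpthread). It shows one process mapping a shared counter and guarding the update with a semaphore; in practice a second process would open the same objects.

```c
#include <fcntl.h>
#include <semaphore.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    /* Create/open a named shared memory object and size it for one int. */
    int fd = shm_open("/demo_shm", O_CREAT | O_RDWR, 0600);
    ftruncate(fd, sizeof(int));
    int *shared = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);

    /* A named semaphore acts as the lock around the shared counter. */
    sem_t *lock = sem_open("/demo_sem", O_CREAT, 0600, 1);

    sem_wait(lock);          /* acquire the lock */
    *shared += 1;            /* critical section: update shared data */
    sem_post(lock);          /* release the lock */

    printf("counter = %d\n", *shared);

    munmap(shared, sizeof(int));
    close(fd);
    sem_close(lock);
    shm_unlink("/demo_shm");
    sem_unlink("/demo_sem");
    return 0;
}
```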
Parallel Programming Models
Threads Model
A type of shared memory programming: a single "heavy weight" process can have multiple "light weight", concurrent execution paths (threads)
Threads implementations commonly comprise:
A library of subroutines that are called from within parallel source code
A set of compiler directives embedded in either serial or parallel source code
Implementations: POSIX Threads and OpenMP
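As a small illustration of the directive-based flavor (OpenMP), the sketch below parallelizes a loop across a team of threads; the harmonic-sum workload is made up. Compile with an OpenMP-aware compiler, e.g., gcc -fopenmp.

```c
#include <omp.h>
#include <stdio.h>

int main(void) {
    double sum = 0.0;

    /* Each thread handles a share of the iterations; the reduction
       clause combines the per-thread partial sums at the end. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 1; i <= 1000; i++) {
        sum += 1.0 / i;
    }

    printf("harmonic sum = %f (threads available: %d)\n",
           sum, omp_get_max_threads());
    return 0;
}
```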
Parallel Programming Models
Distributed Memory / Message Passing Model
Multiple tasks can reside on the same physical machine and/or across an arbitrary number of machines
Tasks exchange data through communications, by sending and receiving messages
Data transfer usually requires cooperative operations to be performed by each process (e.g., a send matched by a receive)
Implementation: Message Passing Interface (MPI)
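A minimal MPI sketch of the cooperative send/receive pair (the integer being sent is arbitrary); run with at least two tasks, e.g., mpirun -np 2 ./a.out.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        int value = 42;
        /* The send must be matched by a receive on the other task. */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        int value;
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("task 1 received %d from task 0\n", value);
    }

    MPI_Finalize();
    return 0;
}
```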
Parallel Programming Models
Data Parallel Model
Address space is treated globally
Parallel work focuses on performing operations on a data set
A set of tasks works collectively on the same data structure
Each task performs the same operation on its partition of the work
Implementations: Coarray Fortran, Unified Parallel C (UPC), Chapel
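The sketch below is plain C, not UPC, Chapel, or Coarray Fortran syntax; it only illustrates the idea of every task applying the same operation to its own partition of one globally viewed array. The array, task count, operation, and the serial task loop (standing in for genuinely concurrent tasks) are all made up for illustration.

```c
#include <stdio.h>

#define N 16
#define NTASKS 4

static double global_array[N];   /* logically shared, globally addressed data */

/* Every task runs the same code on its own slice of the global array. */
static void do_my_partition(int task_id) {
    int chunk = N / NTASKS;
    int begin = task_id * chunk;
    for (int i = begin; i < begin + chunk; i++) {
        global_array[i] = 2.0 * i;   /* same operation on every partition */
    }
}

int main(void) {
    for (int t = 0; t < NTASKS; t++)   /* stand-in for concurrent tasks */
        do_my_partition(t);
    printf("global_array[N-1] = %.1f\n", global_array[N - 1]);
    return 0;
}
```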
Parallel Programming Models
Hybrid Model
A combination of the message passing model (MPI) with the threads model (OpenMP)
Threads perform computationally intensive kernels using local, on-node data
Communications between processes on different nodes occur over the network using MPI
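A minimal hybrid sketch: OpenMP threads do the on-node computation while MPI handles communication between processes. The workload (a per-rank sum) is made up; compile with an MPI wrapper compiler and OpenMP enabled, e.g., mpicc -fopenmp.

```c
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Computationally intensive kernel on local, on-node data. */
    double local_sum = 0.0;
    #pragma omp parallel for reduction(+:local_sum)
    for (int i = 0; i < 1000000; i++) {
        local_sum += (double)i * (rank + 1);
    }

    /* Communication between processes occurs over MPI. */
    double global_sum = 0.0;
    MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM,
               0, MPI_COMM_WORLD);

    if (rank == 0) printf("global sum = %e\n", global_sum);
    MPI_Finalize();
    return 0;
}
```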
Parallel Programming Models
SPMD: Single Program Multiple Data
A "high level" programming model that can be built upon any combination of the previously mentioned models; all tasks run the same program, but each may work on different data
MPMD: Multiple Program Multiple Data
Also a "high level" model, but each task may run a different program
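A tiny SPMD sketch using MPI (the model can also be built on threads or other approaches): one program is launched on every task, and each task branches on its rank to select its own portion of the data.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Same program, multiple data: the rank selects this task's slice. */
    printf("task %d of %d working on its own slice of the data\n", rank, size);

    MPI_Finalize();
    return 0;
}
```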
Designing Parallel Programs
How to make serial programs parallel
Parallelization
Automatic: a compiler or tool parallelizes the program where it can
Manual: the programmer denotes where and how they want parallelization to occur
Data dependencies between tasks determine what can or cannot be parallelized, and in what order parallel tasks can be done

Partitioning
Determines what work each processing unit handles, and in what order (a small sketch of the two schemes follows below)
Block: each chunk of work is assigned to a processing unit, which holds it and works on it until the chunk is complete
Cyclic: the work is split into discrete chunks that are cycled through; when a chunk is complete, the processing unit moves on to the next available chunk
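A tiny sketch of how block and cyclic partitioning assign work item indices to processing units; the item and unit counts are made up.

```c
#include <stdio.h>

#define N 12   /* total work items */
#define P 3    /* processing units  */

int main(void) {
    /* Block: each unit holds one contiguous chunk until it is complete. */
    int chunk = N / P;
    for (int unit = 0; unit < P; unit++)
        printf("block : unit %d handles items %d..%d\n",
               unit, unit * chunk, unit * chunk + chunk - 1);

    /* Cyclic: items are dealt out round-robin, one at a time. */
    for (int unit = 0; unit < P; unit++) {
        printf("cyclic: unit %d handles items", unit);
        for (int i = unit; i < N; i += P)
            printf(" %d", i);
        printf("\n");
    }
    return 0;
}
```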
Parallel Examples
In-depth look: Array Processing
Say you want to perform a function on every element of an array
You could process every array element in order, but you could also parallelize the task
Block form: divide the array elements into contiguous groups based on how many processing units are available, then have each processing unit work on its chunk until completion; chunk size = (number of array elements) / (number of processing units)
Cyclic form: treat each array element as a discrete unit of work; each processing unit computes a result for one element, then moves on to the next element not currently being worked on by another processing unit
This example is embarrassingly parallel: it has little to no data dependency, which is otherwise a huge roadblock in designing most practical parallel programs
In examples that are not embarrassingly parallel, the timing of when functions complete and keeping related work on the same processing unit become important considerations
Complex parallel programs must also consider granularity, the ratio of computation to communication; programs with a lot of data dependency spend more time communicating, which lowers this ratio (finer granularity)
The sketch below expresses the block and cyclic forms with OpenMP scheduling
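A sketch of the two forms using OpenMP (an assumption; the slides do not name a specific threading library): schedule(static) gives each thread one contiguous block of elements, while schedule(static, 1) deals elements out cyclically. The array and the operations applied to it are made up; link with -lm for sqrt.

```c
#include <math.h>
#include <omp.h>
#include <stdio.h>

#define N 1000000

int main(void) {
    static double a[N];

    /* Block decomposition: chunk size defaults to N / (number of threads). */
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < N; i++)
        a[i] = sqrt((double)i);

    /* Cyclic decomposition: each thread takes every T-th element. */
    #pragma omp parallel for schedule(static, 1)
    for (int i = 0; i < N; i++)
        a[i] = a[i] * a[i];

    printf("a[N-1] = %.1f\n", a[N - 1]);   /* approximately N-1 */
    return 0;
}
```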
PI Calculation
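The slide body for this example is not included in the transcript; a common version of it is the Monte Carlo "dart board" method, sketched below with OpenMP as an assumption. Each thread throws darts at the unit square, and the fraction landing inside the quarter circle estimates PI/4; the dart count and seed are made up.

```c
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    const long darts = 10000000;
    long hits = 0;

    #pragma omp parallel reduction(+:hits)
    {
        /* Per-thread seed so threads do not share random-number state. */
        unsigned int seed = 1234u + omp_get_thread_num();

        #pragma omp for
        for (long i = 0; i < darts; i++) {
            double x = (double)rand_r(&seed) / RAND_MAX;   /* POSIX rand_r */
            double y = (double)rand_r(&seed) / RAND_MAX;
            if (x * x + y * y <= 1.0)
                hits++;                                     /* inside circle */
        }
    }

    printf("pi ~= %f\n", 4.0 * hits / darts);
    return 0;
}
```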
Simple Heat Equation
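Again the slide body is not in the transcript; the sketch below assumes the usual explicit finite-difference form of the 2-D heat equation, with the update of interior points parallelized by an OpenMP directive. The grid size, coefficients, step count, and initial hot spot are all made up.

```c
#include <stdio.h>

#define NX 100
#define NY 100

static double u[NX][NY], u_new[NX][NY];

int main(void) {
    /* cx = alpha*dt/dx^2, cy = alpha*dt/dy^2 (made-up stable values). */
    const double cx = 0.1, cy = 0.1;

    u[NX / 2][NY / 2] = 100.0;   /* initial hot spot in the middle */

    for (int step = 0; step < 500; step++) {
        /* Update interior points from their four neighbors. */
        #pragma omp parallel for
        for (int i = 1; i < NX - 1; i++)
            for (int j = 1; j < NY - 1; j++)
                u_new[i][j] = u[i][j]
                    + cx * (u[i + 1][j] + u[i - 1][j] - 2.0 * u[i][j])
                    + cy * (u[i][j + 1] + u[i][j - 1] - 2.0 * u[i][j]);

        /* Copy the new values back for the next time step. */
        #pragma omp parallel for
        for (int i = 1; i < NX - 1; i++)
            for (int j = 1; j < NY - 1; j++)
                u[i][j] = u_new[i][j];
    }

    printf("center temperature after 500 steps: %f\n", u[NX / 2][NY / 2]);
    return 0;
}
```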
References
"Designing and Building Parallel Programs". Ian Foster.
"Introduction to Parallel Computing". Ananth Grama, Anshul Gupta, George Karypis, Vipin Kumar.
"Overview of Recent Supercomputers". A.J. van der Steen, Jack Dongarra. OverviewRecentSupercomputers.2008.pdf
Questions?