1
Shared-Memory Paradigm & OpenMP
FDI 2007 Track Q Day 3 – Morning Session
2
Characteristics of Shared-Memory Machines
Modest number of processors, e.g., 8, 16, 32.
One bus, one memory, one address space.
Major issues: bus contention, cache coherency, synchronization.
Incremental parallelization is easy.
NUMA vs. UMA: the former scales better, but with increased latency for some data.
Important trends in computer architecture imply that shared-memory parallelism will be increasingly important, i.e., multicore.
Notes:
Major issues: with OpenMP, you don't have to worry about the first two problems, and even synchronization isn't usually a problem, at least as far as correctness goes. Performance is another matter: all three can definitely affect it.
Example of bus contention and cache coherency overhead: for i = 1, n do in parallel a[i] = b[i] + c[i] (note the benefit of chunking).
NUMA is not that common right now. Dual core.
Hybrid codes? Not a first priority, but they may become increasingly important.
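The vector-add example from the notes can be sketched in C roughly as follows. The function name vector_add and the chunk parameter are illustrative assumptions, not part of the slides. With a chunk size of 1, neighbouring elements of a (which usually share a cache line) are written by different threads, which tends to increase coherency traffic; larger chunks give each thread a contiguous block.

```c
#include <assert.h>

/* Hypothetical helper: a[i] = b[i] + c[i], with iterations handed out
 * to threads in contiguous chunks of `chunk` iterations each.
 * If the compiler's OpenMP option is off, the pragma is ignored and
 * the loop simply runs serially with the same result. */
void vector_add(double *a, const double *b, const double *c, int n, int chunk)
{
    int i;
    assert(n >= 0 && chunk >= 1);

    #pragma omp parallel for schedule(static, chunk)
    for (i = 0; i < n; i++) {
        a[i] = b[i] + c[i];   /* iterations are independent: no synchronization needed */
    }
}
```

Calling vector_add(a, b, c, n, 1) and vector_add(a, b, c, n, 1024) on large arrays and timing both is one way to observe the effect of chunking; the result in a is the same either way.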
3
OpenMP
Compiler directives, library routines, and environment variables for specifying shared-memory thread-based parallelism.
Specified for Fortran and C/C++.
Supported by many compilers.
Directives allow work sharing, synchronization, and sharing and privatizing of data.
Directives are ignored by the compiler unless a command-line option is specified.
What's a thread?
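A minimal sketch of the three ingredients, assuming a compiler that defines the standard _OPENMP macro when its OpenMP option is on. The function names are made up for illustration. The omp.h calls are guarded so the same file still compiles, and runs serially, when the directives are ignored.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>            /* library routines, only when OpenMP is enabled */
#endif

/* Library routine: upper bound on the team size of the next parallel region. */
int max_threads(void)
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;               /* directives ignored: one thread */
#endif
}

/* Directive: every thread in the team executes the block once. */
int count_team_threads(void)
{
    int count = 0;

    #pragma omp parallel
    {
        #pragma omp atomic  /* synchronization: protect the shared counter */
        count++;
    }
    assert(count >= 1);
    return count;
}
```

With GCC, for example, the option is -fopenmp, and the environment variable OMP_NUM_THREADS sets the default team size at run time (e.g. OMP_NUM_THREADS=4 ./a.out). Without the option, count_team_threads() returns 1.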
4
OpenMP
Other aspects of the OpenMP model:
Explicit, user-defined parallelism.
SPMD.
Fork/join model: only a master thread is executing when outside a parallel region. The parallel directive causes multiple threads to be started (or continued), each executing all or a part of the specified block.
For further information:
OpenMP Quick Guide (pdf)
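The fork/join and SPMD ideas can be sketched as follows. The names fork_join_demo, visits, and capacity are hypothetical and exist only for this illustration. Every thread runs the same block and uses its thread number to decide which slot to mark.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif

static int thread_id(void)
{
#ifdef _OPENMP
    return omp_get_thread_num();   /* 0 is the master thread */
#else
    return 0;
#endif
}

/* Marks visits[id] = 1 for each thread id that ran the parallel block
 * and returns the size of the team. */
int fork_join_demo(int *visits, int capacity)
{
    int team_size = 0;
    int t;
    assert(visits != NULL && capacity >= 1);

    /* Before the fork: only the master thread is executing. */
    for (t = 0; t < capacity; t++) visits[t] = 0;

    #pragma omp parallel shared(visits, team_size)
    {
        /* SPMD: same code on every thread, different data per thread id. */
        int id = thread_id();
        if (id < capacity) visits[id] = 1;

        #pragma omp atomic
        team_size++;
    }

    /* After the join: back to the master thread alone. */
    return team_size;
}
```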
5
Loop-based Parallelism in OpenMP
#pragma omp parallel for [clause [clause ...]]
for ( … ) { }
where clause is one of the following:
  private (list)
  shared (list)
  copyin (list)
  firstprivate (list)
  lastprivate (list)
  reduction (operator: list)
  ordered
  schedule (kind [, chunk_size])
  nowait (accepted only on a standalone #pragma omp for, not on the combined parallel for)
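A small sketch of a few of these clauses in use. The function scale_and_last is a hypothetical example, not from the slides. firstprivate gives each thread its own copy of factor initialized from the original, and lastprivate copies back the value of last from the sequentially final iteration.

```c
#include <assert.h>

/* Hypothetical example: out[i] = in[i] * factor; returns the value
 * computed by the last iteration (i = n - 1). */
double scale_and_last(const double *in, double *out, int n, double factor)
{
    double last = 0.0;
    int i;
    assert(n >= 1);   /* lastprivate needs at least one iteration to be meaningful */

    #pragma omp parallel for firstprivate(factor) lastprivate(last) schedule(static)
    for (i = 0; i < n; i++) {
        last = in[i] * factor;   /* private copy inside the loop */
        out[i] = last;
    }
    return last;                 /* value from iteration n - 1 */
}
```

The loop index i is made private automatically, so it does not need to appear in a clause.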
6
Loop-based Parallelism in OpenMP: Example
sum = 0;
#pragma omp parallel for \
    private( ... ) \
    shared ( ... ) \
    reduction ( ... )
for (i=0; i<n; i++) {
    k = 2*i-1;
    j = k + 1;
    x = a[k] * sin(pi*b[k]);
    c[i] = func(a[j], b[j], x) * beta;
    sum = sum + c[i]*c[i];
}
7
Loop-based Parallelism in OpenMP: A 2nd Example
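Monte Carlo estimate of pi: generate random coordinates (x, y) in the unit square and count how many land inside the unit circle. The fraction that lands inside approaches pi/4, the area of the quarter circle, so pi is approximately 4 * (count inside) / (count generated).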
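A minimal sketch of this estimate with a parallel loop and a reduction on the hit count. The slides do not say which random number generator they use; this version assumes a counter-based scheme (a SplitMix64-style bit mixer applied to the iteration index) so that no generator state is shared between threads and the result does not depend on the number of threads. The names mix64, to_unit, and estimate_pi are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* SplitMix64-style mixer: maps a counter to well-scrambled 64 bits. */
static uint64_t mix64(uint64_t z)
{
    z += 0x9E3779B97F4A7C15ULL;
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

/* Top 53 bits scaled to a double in [0, 1). */
static double to_unit(uint64_t bits)
{
    return (double)(bits >> 11) * (1.0 / 9007199254740992.0);
}

/* Estimates pi from n points in the unit square. */
double estimate_pi(long n, uint64_t seed)
{
    long hits = 0;
    long i;
    assert(n > 0);

    #pragma omp parallel for reduction(+: hits) schedule(static)
    for (i = 0; i < n; i++) {
        /* x and y are declared inside the loop, so they are private. */
        double x = to_unit(mix64(seed + 2 * (uint64_t)i));
        double y = to_unit(mix64(seed + 2 * (uint64_t)i + 1));
        if (x * x + y * y <= 1.0) hits++;
    }
    return 4.0 * (double)hits / (double)n;
}
```

Calling rand() inside the loop instead would not be a safe substitute, since it keeps hidden global state that the threads would share.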