Published by Kerry Fowler. Modified over 9 years ago.
1
OpenMP (Open Multi-Processing)
David Valentine
Computer Science, Slippery Rock University
2
The Buzz for OpenMP
There are more than a dozen events at SC12 with "OpenMP" in their titles.
OpenMP is celebrating 15 years: booth #2237.
API designed for C/C++ and Fortran.
Shared-memory parallelism, in a multicore world.
As such, it offers an incremental learning curve for current programmers:
  Start with serial code.
  Grab the obvious parallelizable sections to get the quickest results (Amdahl's Law).
3
Shared Memory Parallelism
Our world has already gone multicore.
How best can we take advantage of the cores already on the desktop without spending weeks on low-level thread manipulation?
There are several choices:
  OpenMP
  Cilk
  Threading Building Blocks (TBB)
4
OpenMP (Open Multi-Processing)
openmp.org
Started in 1997, as a continuation of ANSI X3H5.
Supported by industry (HP, IBM, Intel, Sun, et al.) and government (DoE).
Designed for shared memory, multicore.
Thread-based parallelism.
Explicit programmer control.
Fork-join model.
5
OpenMP Fork-Join Model
Explicit programmer control.
Can use the thread number ( omp_get_thread_num() ) to set different tasks per thread in a parallel region.
[Figure: fork-join diagram; source: wikipedia.com]
6
OpenMP
Made of 3 components:
  Compiler directives (20 as of 3.1)
    #pragma omp parallel will spawn a parallel region
  Run-time library routines (32)
    int myNum = omp_get_thread_num();
  Environment variables (9)
    setenv OMP_NUM_THREADS 8
7
OpenMP Goals
Their 4 stated goals are:
  Standardization
  Lean and Mean
  Ease of Use
  Portability
CS2 students see their programs "go parallel" with just 2 or 3 lines of additional code!
At this level we are just exposing them to the concept of multicore, shared-memory parallelism.
8
General Code Structure (from https://computing.llnl.gov/tutorials/openMP/#Abstract)

    #include <omp.h>

    int main(void) {
        int var1, var2, var3;

        /* Serial code ... */

        /* Beginning of parallel section. Fork a team of threads.
           Specify variable scoping. */
        #pragma omp parallel private(var1, var2) shared(var3)
        {
            /* Parallel section executed by all threads */
            /* Other OpenMP directives */
            /* Run-time library calls */
            /* All threads join master thread and disband */
        } /* parallel block */

        /* Resume serial code ... */
        return 0;
    } /* main */
9
The Obligatory Hello World Example
Compile with OpenMP enabled:
  In Visual Studio: Project > Properties > Configuration Properties > C/C++ > Language > OpenMP Support: Yes
  Or, with gcc, use -fopenmp

    #include <stdio.h>
    #include <omp.h>

    int main() {
        printf("Getting started...\n\n");

        /* The next statement is executed once by every thread in the team */
        #pragma omp parallel
        printf("Hello World from thread %i of %i\n",
               omp_get_thread_num(), omp_get_num_threads());

        printf("\nThat's all Folks!\n");
        return 0;
    }
10
For CS1/CS2
NB: most programs are severely I/O bound.
But we are looking for only:
  a simple counting loop (for),
  where each iteration is independent,
  and which has enough work to distribute across our cores.
The first two requirements are easy; the third can involve "handicapping" the loop work.
We won't show them nearly all of OpenMP; we just want to whet their appetites here.
Tell them the Truth, tell them nothing but the Truth, but for heaven's sake don't tell them ALL the Truth!
11
E.g., Trapezoidal Rule

    float trap(float xLo, float xHi, int numIntervals) {
        float area;     //area under the curve (the integral)
        float width;    //width of each trapezoid
        float x;        //our points of evaluation
        float sum;      //sum up all our f(x)'s

        sum = 0.0;                              //init our summing var
        width = (xHi - xLo) / numIntervals;     //width of each trap
        for (int i = 1; i < numIntervals; i++) {    //get the interior points
            x = xLo + i * width;    //each iter. independent of others
            sum += f(x);            //add the interior value
        }//for
        sum += (f(xLo) + f(xHi)) / 2.0;     //add the endpoints
        area = width * sum;                 //calc the total area
        return area;                        //return the approximation
    }//trap
12
E.g., Trapezoidal Rule
Students add two lines:
  #include <omp.h>
  #pragma omp parallel for ...
When they see the cores all "redline" at 100%, they are hooked.

    #pragma omp parallel for private(x) reduction(+:sum)
    for (int i = 1; i < numIntervals; i++) {    //get the interior points
        x = xLo + i * width;    //each iteration independent of others
        sum += f(x);            //add the interior value
    }//for