cOMPunity: The OpenMP Community
Barbara Chapman, cOMPunity, University of Houston
Contents
- A brief cOMPunity history
- Board of Directors
- Finances
- Workshops
- Meantime, in the rest of the world…
- Membership
- Our web presence
- Participation in ARB committee work
- OpenMP futures
cOMPunity Goals
- Provide continuity for workshops
- Participate in the work of the ARB
- Promote the API
- Provide information, primarily via the website
- To join the ARB, it was necessary to found a non-profit company
  - Officially based in the US state of Delaware
  - Founded at the end of 2001
Board of Directors
Must have four directors according to the by-laws. Current:
- Barbara Chapman (CEO, finances)
- Mark Bull (Secretary)
- Dieter an Mey (web services)
- Mitsuhisa Sato (Asia)
- Alistair Rendell (Australasia)
- Eduard Ayguade (OpenMP language)
- Mike Voss (workshops)
- Rudi Eigenmann (workshops)
We need to hold elections soon.
Finances
Goal: build up a reserve to ensure workshops.
Expenses:
- Annual fee to the agent in the State of Delaware
- Delaware franchise tax (only $25 at present)
- Web domain registration
- The ARB has waived the membership fee
Income: membership fees and surplus from workshops.
Current balance: ca. $18,400.
Non-profit status is recognized for federal tax purposes.
All cOMPunity work is done without pay.
OpenMP Focus Workshops
- First workshop at Lund, Sweden, 1999
- Since 2000, organized annually:
  - EWOMP in Europe
  - WOMPAT in North America
  - WOMPEI in Asia
- Strong regional participation
- Aachen introduced the OMPlab to the format
wallcraf@nlr.com
Comments from the First Workshop
- It is easy to get OpenMP code up and running, and it can be done incrementally.
- It is also easy to combine OpenMP and MPI: a straightforward migration path for MPI code.
- OpenMP is well suited to SPMD-style coding.
- It is not easy to optimize for cache, but it is essential for good performance; compilers should do the cache optimization.
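A minimal sketch of the incremental experience reported above: start from a serial loop and parallelize it by adding a single directive. The function and variable names here are illustrative, not from the slides; without an OpenMP compiler the pragma is simply ignored and the loop runs serially, which is exactly the single-source property the deck praises.

```c
/* Incremental parallelization: the serial dot product becomes parallel
 * by adding one directive; the reduction clause makes the shared sum safe. */
double dot(const double *x, const double *y, int n) {
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++)
        sum += x[i] * y[i];
    return sum;
}
```

Compiled without `-fopenmp` this is the original serial code; compiled with it, the loop iterations are split across the team.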
OpenMP Language: Comments
- Some workarounds are required when porting:
  - Fortran 90 constructs are not truly supported
  - Array reductions are not possible; threadprivate is needed for such variables
  - I/O needs more consideration
- Extensions have been proposed for I/O and synchronization
- Extensions may be required for new kinds of HPC applications
- Libraries are needed
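To illustrate the array-reduction gap mentioned above: since OpenMP 2.x allows the reduction clause only on scalars in C, an array reduction such as a histogram must be written by hand. This is a hedged sketch of the common workaround (thread-private accumulation plus a merge under a critical section); the names are illustrative.

```c
#define NBINS 8

/* Manual array reduction: each thread fills a private histogram, then the
 * partial histograms are merged one thread at a time under a critical
 * section. The nowait avoids an extra barrier before the merge. */
void histogram(const int *values, int n, int hist[NBINS]) {
    for (int b = 0; b < NBINS; b++) hist[b] = 0;
    #pragma omp parallel
    {
        int local[NBINS] = {0};
        #pragma omp for nowait
        for (int i = 0; i < n; i++)
            local[values[i] % NBINS]++;
        #pragma omp critical
        for (int b = 0; b < NBINS; b++) hist[b] += local[b];
    }
}
```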
Performance: Comments
- Scalable applications have been developed
- Some specific performance problems, e.g. I/O and memory allocation
- Significant differences in overheads for OpenMP constructs on different platforms
- On cc-NUMA systems, performance is too dependent on OS and system load
- The EPCC OpenMP microbenchmarks are available at http://www.epcc.ed.ac.uk/research/openmpbench/
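A rough sketch in the spirit of the EPCC microbenchmarks cited above: time many empty parallel regions and report the mean cost per region. This is only an assumption about the measurement idea, not the EPCC code; the real suite subtracts a serial reference time and uses much more careful statistics.

```c
#include <time.h>

volatile int sink;  /* keeps the region body from being optimized away */

/* Mean wall-clock cost of entering and leaving a parallel region,
 * in seconds, averaged over reps repetitions. */
double overhead_per_region(int reps) {
    clock_t t0 = clock();
    for (int r = 0; r < reps; r++) {
        #pragma omp parallel
        { sink = r; }
    }
    clock_t t1 = clock();
    return (double)(t1 - t0) / CLOCKS_PER_SEC / reps;
}
```

Comparing this number across platforms is what reveals the "significant differences in overheads" the slide reports.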
OpenMP Language: Comments
- A major problem with OpenMP is its range of applicability
- It needs significant extension for use on cc-NUMA and distributed-memory systems
- Data and thread placement may need to be coordinated explicitly by the user
- There are some vendor-specific extensions, but no standards
Summary: First Meeting
- High level of satisfaction with the development experience, but understanding of cache optimizations was often limited
- A mostly SPMD programming style was adopted
- Many are using OpenMP together with MPI
- Some language and performance problems were identified
- Much discussion of cc-NUMA performance
- Confidence in the market for OpenMP was expressed
Uptake of OpenMP
- Widely available
- Single source code
- Ease of programming
- Major industry codes have been ported to OpenMP
- Many hybrid codes in use
- Many more users: experts and novice parallel programmers
ECHAM5: OpenMP vs. MPI
[Speedup chart on an IBM p690. Courtesy of Oswald Hahn, MPI for Meteorology.]
What has the ARB Been Doing?
- OpenMP 1.0 to 2.0 to 2.5
- Much clearer specs
- Some nasty parts, especially flush and the memory model
  - This is actually a problem elsewhere too (e.g. Pthreads)
- The tools work didn't produce interfaces for performance tools or debuggers
- Lots of ideas for new features are waiting for discussion
Life is Short, Remember?
It's official: OpenMP is easier to use than MPI!
Life is Short, Remember?
It's official: OpenMP is easier to use than MPI and UPC! (Not actually tested on real subjects.)
Workshops: What has Changed?
- The workshops are now merged into one international workshop (IWOMP)
- The location will rotate: IWOMP 2006 in Europe, IWOMP 2007 in Australia
- We need a steering committee; email suggestions for this to chapman@cs.uh.edu
- How does this affect the date of the event?
Workshops: Format and Content
- What about the content?
- Should we be working harder to get new kinds of users? If so, how?
- Publishable papers?
- Contributions from other sources?
- OMPlab
- Tutorial
Status: Membership
- The BOD wanted membership in cOMPunity to be more or less free to academics
- The solution was to make it part of workshop registration; in other words, participants at workshops are members
- In the past, this gave a de facto discount for attendance at multiple workshops
- Now there is only one annual workshop, so its attendees are (pretty much) the current members
- You can also join individually (the fee is $50)
Membership, ctd.
What is the benefit of being a member?
- The ability to participate in ARB deliberations; this needs to be better organized
- A members-only discussion list
New proposal: membership runs for two years from an attended workshop, up to the workshop in that year (no matter what the date).
Our Web Presence
- www.compunity.org seems to be pretty useful
- It was managed at UH, with input from BOD members; now managed by RWTH Aachen
- www.iwomp.org
Participation in ARB
Participation in ARB committees: ARB, Tools, Futures and Language.
- ARB: Barbara Chapman; reports are produced by Matthijs van Waveren (Fujitsu)
- Tools: various, including the originators of the POMP interface
- Language: UPC Barcelona, but no regular participation on the OpenMP 2.5 committee
Challenges and Opportunities
- Single-processor optimization: multiple virtual processors on a single chip need multi-threading
- Applications outside scientific computing: compute-intensive commercial and financial applications need HPC technology, and multiprocessor game platforms are coming
- Clusters and distributed shared memory: clusters are the fastest-growing HPC platform; can OpenMP play a greater role?
- Does OpenMP have the right language features for these?
Completeness
- If we don't cover a broad enough range of parallel applications, someone else will
- Explicit threading, distributed programming?
- Is OpenMP able to meet the needs of asynchronous or scalable computing?
- Is there an inherent problem, or is some work on the language needed?
- The risk: fragmentation of parallel programming APIs, which would be bad for HPC
OpenMP 3.0
- A list of proposed features was prepared this week
- Not all of them have a concrete proposal
- They are listed in the following slides
- The order of listing does NOT imply anything about priority, overall importance, or the status of a proposal
OpenMP 3.0: Suggested Features
- Task queues (there is a proposal)
- Semaphores (there is a proposal)
- A collapse clause to allow parallelization of perfect loop nests (there is a proposal)
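A hedged sketch of what the proposed collapse clause would buy (this syntax was later adopted in OpenMP 3.0, so it matches the eventual standard): collapse(2) asks the runtime to distribute all N*M iterations of a perfect nest across the team, rather than only the N outer iterations, which matters when the outer trip count is close to the thread count.

```c
#define N 33
#define M 33

/* Sum a grid using the proposed collapse clause. With collapse(2) the
 * N*M iteration space is flattened before scheduling, so 32 threads get
 * balanced work even though the outer loop has only 33 iterations. */
double sum_grid(void) {
    static double grid[N][M];
    double sum = 0.0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < M; j++)
            grid[i][j] = 1.0;
    #pragma omp parallel for collapse(2) reduction(+:sum)
    for (int i = 0; i < N; i++)
        for (int j = 0; j < M; j++)
            sum += grid[i][j];
    return sum;
}
```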
OpenMP 3.0: Suggested Features
- Parallelization of wavefront loop nests (there is a proposal)
- Thread groups, named sections and precedence relationships (there is a proposal)
- Add an internal control variable and an environment variable to control slave-thread stack size (there is a proposal)
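To make the wavefront item concrete, here is a sketch of how such nests are handled by hand today, assuming the classic dependence pattern g[i][j] = g[i-1][j] + g[i][j-1] (an assumed example, not from the slides). Rows cannot be split among threads directly, but every cell on an anti-diagonal i+j = d is independent, so the loop over one diagonal can be a plain omp for; the proposal would let the compiler perform this restructuring.

```c
#define N 6

/* Wavefront by anti-diagonals: sweep d = i + j from 2 to 2*(N-1); all
 * cells on one diagonal are independent and run in parallel. With
 * one-valued boundaries, g[i][j] counts monotone lattice paths. */
double wavefront_demo(void) {
    double g[N][N] = {{0}};
    for (int i = 0; i < N; i++) g[i][0] = g[0][i] = 1.0;  /* inputs */
    for (int d = 2; d <= 2 * (N - 1); d++) {
        int lo = d - (N - 1) > 1 ? d - (N - 1) : 1;  /* clip to interior */
        int hi = d - 1 < N - 1 ? d - 1 : N - 1;
        #pragma omp parallel for
        for (int i = lo; i <= hi; i++) {
            int j = d - i;
            g[i][j] = g[i - 1][j] + g[i][j - 1];
        }
    }
    return g[2][2];  /* 6 monotone paths from the boundary */
}
```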
OpenMP 3.0: Suggested Features
- An automatic data-scoping clause (there is a proposal)
- A SCHEDULE clause for sections (there is no proposal)
- An error-reporting mechanism (there is no proposal)
OpenMP 3.0: Suggested Features
- More kinds of schedules, including one where enough can be assumed to make NOWAIT useful (there are several proposals)
- Reductions with user-defined functions, especially min/max reductions in C/C++ (there is no proposal)
- Array reductions in C/C++ (there is no proposal)
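The min/max item above refers to a real gap: in OpenMP 2.5, min and max are reduction operators in Fortran but not in C/C++. A hedged sketch of the usual hand-written substitute, assuming illustrative names: each thread reduces into a private minimum, and the team combines the partial results under a critical section.

```c
#include <float.h>

/* Manual min reduction for C, where no min reduction operator exists:
 * per-thread private minimum, then a serialized combine step. */
double array_min(const double *a, int n) {
    double global_min = DBL_MAX;
    #pragma omp parallel
    {
        double local_min = DBL_MAX;
        #pragma omp for nowait
        for (int i = 0; i < n; i++)
            if (a[i] < local_min) local_min = a[i];
        #pragma omp critical
        if (local_min < global_min) global_min = local_min;
    }
    return global_min;
}
```

The critical section runs once per thread rather than once per element, so the serialization cost is negligible for large n.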
OpenMP 3.0: Suggested Features
- A reduce clause/construct to force a reduction inside a parallel region (there is no proposal)
- Insist on (instead of merely permitting) multiple copies of internal control variables (there is a proposal)
- Define interactions with standard thread APIs (there is no proposal)
OpenMP 3.0: Suggested Features
- An INDIRECT clause to specify partially parallel loops (there is a proposal)
- Add library routines to support nested parallelism (team ids, global thread ids, etc.) (there is no proposal)
- If a POMP-like profiling interface never happens, some basic profiling mechanism (there is no proposal)
OpenMP 3.0: Suggested Features
- Support for default(private) in C/C++ (there is no proposal)
- Additional clauses to make workshare more flexible (there is no proposal)
- Include Fortran 2003 in the set of base languages (there is no proposal)
OpenMP 3.0: Suggested Features
- Non-flushing lock routines (there is no proposal)
- Support for atomic writes (there is no proposal)
OpenMP 3.0: Proposed Fixes
- Remove the possibility of storage reuse for private variables
- Define more clearly where constructors/destructors are called
- Define clearly how threadprivate objects should be initialized
- Widen the scope of persistence of threadprivate in nested parallel regions
OpenMP 3.0: Proposed Fixes
- Allow unsigned integers as parallel-loop iteration variables in C/C++
- Fix the C/C++ directive grammar
- Address the reading of environment variables when libraries are loaded multiple times
Validating OpenMP 2.5 for Fortran and C/C++
Matthias Mueller, HLRS High Performance Computing Center Stuttgart / University of Houston
Moving OpenMP Forward
- What else matters? Modularity? Libraries? …?
- Even more widely: some users have been asking for a variety of hints and/or assertions to give more information to the compiler
- This is not really OpenMP-specific
Moving OpenMP Forward
Tools committee:
- Many users complain about the relative lack of tools
- How can we help get better tools?
- Can we share infrastructure to get more open-source tools?
- What kind of tool support is (most) important?
cOMPunity Activities
- Participation in ARB committees (ARB, Futures/Language, Tools); this requires commitment
- Workshops
- Web presence
- Other?
- We need to participate in the 3.0 effort
Outlook
Let's round up those cycles!!!
Elections
- Current officers are willing to serve
- Must have at least four
- Roles: Chair, Secretary, Finances, Outreach, Workshops, Regional events
OpenMP ARB: Current Organization
OpenMP Board of Directors:
- Greg Astfalk, HP (Chair)
- David Klepacki, IBM
- Ken Miura, Fujitsu
- Sanjiv Shah, KSL/Intel
- Josh Simons, Sun
OpenMP ARB (administrative; one representative per member)
OpenMP Officers:
- Sanjiv Shah, CEO
- David Poulsen, CFO
- Larry Meadows, Secretary
OpenMP Committees (the actual work; one representative per member):
- Language: Mark Bull
- Futures: Mark Bull
- Debug: Bronis de Supinski
- Performance: Bronis de Supinski
- MPIT: Bronis de Supinski
OpenMP 3.0: Pointer-Chasing Loops
Can OpenMP today handle pointer-chasing loops? Yes it can, at least for simple cases:

    nodeptr list, p;
    for (p = list; p != NULL; p = p->next)
        process(p->data);

    nodeptr list, p;
    #pragma omp parallel private(p)
    for (p = list; p != NULL; p = p->next)
        #pragma omp single nowait
        process(p->data);
OpenMP 3.0: Pointer-Chasing Loops
A better way has been proposed: workqueuing.

    #pragma omp parallel taskq private(p)
    for (p = list; p != NULL; p = p->next)
        #pragma omp task
        process(p->data);

Key concept: separate work iteration from work generation, which are combined in omp for. Syntactic variations have been proposed by Sun and the Nanos threads group. This method is very flexible.
Reference: Shah, Haab, Petersen and Throop, EWOMP 1999.
Parallelization of Loop Nests

    do i = 1, 33
        do j = 1, 33
            ... loop body ...
        end do
    end do

With 32 threads, how can we get good load balance without manually collapsing the loops? Can we handle non-rectangular and/or imperfect nests?
Portability of OpenMP
- Thread stacksize: different vendor defaults, different ways to request a given size; we need to standardize this
- Behavior of code between parallel regions: do threads sleep? Busy-wait? Can the user control this? Again, we need to standardize the options
OpenMP Enhancements: OpenMP Must Be More Modular
Define how OpenMP interfaces to "other stuff":
- How can an OpenMP program work with components implemented with OpenMP?
- How can OpenMP work with other thread environments?
- Support library writers: OpenMP needs an analog of MPI's contexts
We don't have any solid proposals on the table to deal with these problems.
Automatic Data Scoping
Create a standard way to ask the compiler to figure out data scoping. When in doubt, the compiler serializes the construct.

    int j;
    double x, result[COUNT];
    #pragma omp parallel for default(automatic)
    for (j = 0; j < COUNT; j++) {
        x = bigCalc(j);
        result[j] = hugeCalc(x);
    }

Ask the compiler to figure out that "x" should be private.
Execution with Reduced Synchronization
Part of the computation of the gradient of hydrostatic pressure in the POP code. The runtime execution model (c stands for chunk) is a dataflow execution model associated with the translated code.

    !$OMP PARALLEL
    !$OMP DO
          do i = 1, imt
             RHOKX(imt,i) = 0.0
          enddo
    !$OMP ENDDO
    !$OMP DO
          do i = 1, imt
             do j = 1, jmt
                if (k .le. KMU(j,i)) then
                   RHOKX(j,i) = DXUR(j,i)*p5*RHOKX(j,i)
                endif
             enddo
          enddo
    !$OMP ENDDO
    !$OMP DO
          do i = 1, imt
             do j = 1, jmt
                if (k > KMU(j,i)) then
                   RHOKX(j,i) = 0.0
                endif
             enddo
          enddo
    !$OMP ENDDO
          if (k == 1) then
    !$OMP DO
             do i = 1, imt
                do j = 1, jmt
                   RHOKMX(j,i) = RHOKX(j,i)
                enddo
             enddo
    !$OMP ENDDO
    !$OMP DO
             do i = 1, imt
                do j = 1, jmt
                   SUMX(j,i) = 0.0
                enddo
             enddo
    !$OMP ENDDO
          endif
    !$OMP SINGLE
          factor = dzw(kth-1)*grav*p5
    !$OMP END SINGLE
    !$OMP DO
          do i = 1, imt
             do j = 1, jmt
                SUMX(j,i) = SUMX(j,i) + factor *   &
                            (RHOKX(j,i) + RHOKMX(j,i))
             enddo
          enddo
    !$OMP ENDDO
    !$OMP END PARALLEL
Producer/Consumer Example
The correct version according to 2.5:

Producer:
          data = ...
    !$omp flush(data,flag)
          flag = 1
    !$omp flush(flag)

Consumer:
          do
    !$omp flush(flag)
          while (flag .eq. 0)
    !$omp flush(data)
          ... = data
Workshops
- Since 2000, organized annually: EWOMP in Europe, WOMPAT in North America, WOMPEI in Asia
- Strong regional participation
- Aachen introduced the OMPlab to the format
- These have been a niche event
- Most OpenMP users are satisfied (or at least not thinking about how it could evolve); OpenMP is supposed to be easy, right?
What's in a Flush?
- Flush writes data to memory and reads it back from memory; it doesn't synchronize threads
- According to the new rules:
  - The compiler is free to reorder flush directives if they are on different variables
  - Two flushes on the same variables must be seen by all threads in the same order