Embedded Systems MPSoC Architectures OpenMP: Exercises Alberto Bosio

Exercise 1
● Take a moment to examine the source code and note how OpenMP directives and library routines are being used.
● Use the following command to compile the code:
gcc -fopenmp omp_hello.c -o hello
● To run the code, simply type the command ./hello and the program should run.
● How many threads were created? Why?
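For reference, a minimal sketch of what an omp_hello.c along these lines typically contains (the exact source distributed with the exercises may differ):

#include <omp.h>
#include <stdio.h>

int main(void) {
    #pragma omp parallel
    {
        int tid = omp_get_thread_num();   /* this thread's id within the team */
        printf("Hello World from thread = %d\n", tid);
        if (tid == 0)                     /* only the master reports the team size */
            printf("Number of threads = %d\n", omp_get_num_threads());
    }
    return 0;
}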

Exercise 1
● Vary the number of threads and re-run Hello World. Set the number of threads to use by means of the OMP_NUM_THREADS environment variable (a usage sketch follows this slide).
● Do you know other ways to set the number of threads?
● Your output should look similar to the sample below. The actual order of the output strings may vary.
● Try executing the program several times to verify whether the execution order changes or not.
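A typical shell session for this experiment (assuming a bash-like shell and the hello executable built in the previous step):

export OMP_NUM_THREADS=4   # ask the OpenMP runtime for 4 threads
./hello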

Exercise 1
Hello World from thread = 0
Number of threads = 4
Hello World from thread = 3
Hello World from thread = 1
Hello World from thread = 2

Exercise 2
● Write a simple program that obtains information about your OpenMP environment. You can modify the "hello" program to do this.
● Using the appropriate OpenMP functions, have the master thread query and print the following (see the sketch after this list):
● The number of processors available
● The number of threads being used
● The maximum number of threads available
● Whether you are in a parallel region
● Whether dynamic threads are enabled
● Whether nested parallelism is supported
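A minimal sketch of such a query program using standard OpenMP runtime routines (the output format is illustrative, not the required solution):

#include <omp.h>
#include <stdio.h>

int main(void) {
    #pragma omp parallel
    {
        /* Let only the master thread print, to avoid interleaved output. */
        if (omp_get_thread_num() == 0) {
            printf("Processors available    = %d\n", omp_get_num_procs());
            printf("Threads being used      = %d\n", omp_get_num_threads());
            printf("Max threads available   = %d\n", omp_get_max_threads());
            printf("In parallel region?       %d\n", omp_in_parallel());
            printf("Dynamic threads enabled?  %d\n", omp_get_dynamic());
            printf("Nested parallelism?       %d\n", omp_get_nested());
        }
    }
    return 0;
}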

Exercise 3
● This example demonstrates use of the OpenMP for work-sharing construct.
● It specifies dynamic scheduling of threads and assigns a specific number of iterations to be done by each thread (see the sketch below).
● After reviewing the source code, compile and run the executable (assuming OMP_NUM_THREADS is still set to 4).
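A sketch of the kind of loop omp_workshare1.c is built around (array names, sizes, and the chunk value here are illustrative):

#include <omp.h>
#include <stdio.h>

#define N     16
#define CHUNK 4   /* iterations handed to a thread at a time */

int main(void) {
    float a[N], b[N], c[N];
    for (int i = 0; i < N; i++)
        a[i] = b[i] = (float)i;

    /* Dynamic scheduling: idle threads grab the next CHUNK iterations. */
    #pragma omp parallel for schedule(dynamic, CHUNK)
    for (int i = 0; i < N; i++) {
        c[i] = a[i] + b[i];
        printf("Thread %d did iteration %d\n", omp_get_thread_num(), i);
    }
    return 0;
}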

Exercise 3
gcc -fopenmp omp_workshare1.c -o workshare1
./workshare1 | sort
● Review the output. Note that it is piped through the sort utility. This makes it easier to see how the loop iterations were actually scheduled.
● Run the program a couple more times and review the output. What do you see?

Exercise 3
● Edit the workshare1 source file and switch to static scheduling (the one-line change is sketched below). Recompile and run the modified program. Notice the difference in output compared to dynamic scheduling.
● Rerun the program. Does the output change?
● With static scheduling, the allocation of work is deterministic and should not change between runs.
● Every thread gets work to do.
● Reflect on possible performance differences between dynamic and static scheduling.
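The edit amounts to changing the schedule kind on the work-sharing directive (chunk size illustrative, as before):

/* before: chunks handed out on demand as threads become idle */
#pragma omp parallel for schedule(dynamic, CHUNK)

/* after: chunks assigned to threads once, deterministically, at loop entry */
#pragma omp parallel for schedule(static, CHUNK)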

Exercise 4: Sections
● This example demonstrates use of the OpenMP sections work-sharing construct.
● Note how the parallel region is divided into separate sections, each of which will be executed by one thread (sketched below).
● As before, compile and execute the program after reviewing it.
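A minimal sketch of the sections pattern (the actual omp_workshare2.c may do different work in each section):

#include <omp.h>
#include <stdio.h>

int main(void) {
    #pragma omp parallel
    {
        #pragma omp sections
        {
            #pragma omp section   /* one thread executes this section */
            printf("Section 1 run by thread %d\n", omp_get_thread_num());

            #pragma omp section   /* another (possibly the same) thread executes this one */
            printf("Section 2 run by thread %d\n", omp_get_thread_num());
        } /* implicit barrier at the end of the sections construct */
    }
    return 0;
}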

Exercise 4: Sections
● Run the program several times and observe any differences in output.
gcc -fopenmp omp_workshare2.c -o workshare2
./workshare2

Exercise 5
● This example performs a matrix multiply by distributing the iterations of the operation between the available threads (the loop structure is sketched below). After reviewing the source code, compile and run the program.
● Review the output. It shows which thread did each iteration and the final result matrix. Run the program again, but this time pipe the output through sort (as in Exercise 3) to see clearly which threads execute which iterations.
● Do the loop iterations match the schedule(static, chunk) clause for the matrix multiply loop in the code?
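A sketch of the parallel loop structure such a matrix-multiply example typically uses (matrix size, names, and the chunk value are illustrative):

#include <omp.h>
#include <stdio.h>

#define N     4
#define CHUNK 2

int main(void) {
    double a[N][N], b[N][N], c[N][N];

    /* Initialize the operands; zero the result matrix. */
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            a[i][j] = i + j;
            b[i][j] = i * j;
            c[i][j] = 0.0;
        }

    /* Distribute the rows of the result matrix among the threads. */
    #pragma omp parallel for schedule(static, CHUNK)
    for (int i = 0; i < N; i++) {
        printf("Thread %d computes row %d\n", omp_get_thread_num(), i);
        for (int j = 0; j < N; j++)
            for (int k = 0; k < N; k++)
                c[i][j] += a[i][k] * b[k][j];
    }
    return 0;
}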

Bug fixing
● There are many things that can go wrong when developing OpenMP programs.
● The omp_bugX.X series of programs demonstrates just a few. See if you can figure out what the problem is in each case and then fix it.
● The buggy behavior will differ for each example. Some hints are provided on the next slide.

Bug Fixing
● omp_bug1: Fails compilation.
● omp_bug2: Thread identifiers are wrong; wrong answers.
● omp_bug3: Run-time error.
● omp_bug4: Causes a segmentation fault.
● omp_bug5: Program hangs.
● omp_bug6: Fails compilation.

Exercise 6
● Parallelize the numerical integration (pi calculation) code using OpenMP (one possible parallelization is sketched below).
● What variables can be shared?
● What variables need to be private?
● What variables should be set up for reductions?
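A common way to parallelize the midpoint-rule pi integration with a reduction (a sketch: the step count and names are illustrative, and the course code may differ):

#include <omp.h>
#include <stdio.h>

int main(void) {
    const long num_steps = 100000000;             /* shared, read-only */
    const double step = 1.0 / (double)num_steps;  /* shared, read-only */
    double sum = 0.0;                             /* reduction variable */

    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < num_steps; i++) {
        double x = (i + 0.5) * step;              /* private to each iteration */
        sum += 4.0 / (1.0 + x * x);
    }

    printf("pi = %.15f\n", step * sum);
    return 0;
}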

Exercise 6
● Modify the pi calculation example so you can vary the number of steps. Will the calculated pi value change?
● Get the total time for the calculation using omp_get_wtime.
● Implement the computational core in a separate function, and call it while varying the number of threads spawned.
● Observe the differences in elapsed time. What happens if you use more threads than available processors?
● Re-implement it without using the reduction clause (one way is sketched below). Is it slower? Faster?
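A sketch of the no-reduction variant, timed with omp_get_wtime (names and the thread counts tried are illustrative): each thread accumulates a private partial sum, which is then combined under a critical section.

#include <omp.h>
#include <stdio.h>

/* Computational core in its own function, as the exercise suggests. */
static double compute_pi(long num_steps, int nthreads) {
    double step = 1.0 / (double)num_steps;
    double sum = 0.0;

    omp_set_num_threads(nthreads);
    #pragma omp parallel
    {
        double partial = 0.0;            /* private accumulator per thread */
        #pragma omp for
        for (long i = 0; i < num_steps; i++) {
            double x = (i + 0.5) * step;
            partial += 4.0 / (1.0 + x * x);
        }
        #pragma omp critical             /* combine once per thread, not per step */
        sum += partial;
    }
    return step * sum;
}

int main(void) {
    for (int t = 1; t <= 8; t *= 2) {
        double t0 = omp_get_wtime();
        double pi = compute_pi(100000000, t);
        double t1 = omp_get_wtime();
        printf("%d threads: pi = %.15f in %.3f s\n", t, pi, t1 - t0);
    }
    return 0;
}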