Parallel Computing Project (OpenMP using Linux for a Parallel Application) Summer 2008 Group Project Instructor: Prof. Nagi Mekhiel August 12th, 2008 Ravi Illapani, Kyunghee Ko, Lixiang Zhang

OpenMP Parallel Computing Solution Stack

Recall: Basic Idea of OpenMP
- The program generated by the compiler is executed by multiple threads, one thread per processor or core.
- Each thread performs part of the work: parallel parts are executed by multiple threads, sequential parts by a single thread.
- Dependences in the parallel parts require synchronization between threads.

Recall Basic Idea: How OpenMP Works
- The user decides what is parallel in the program and makes any changes needed to the original source code, e.g. removing dependences in parts that should run in parallel.
- The user inserts directives telling the compiler how statements are to be executed:
  - which parts of the program are parallel
  - how to assign code in parallel regions to threads
  - data-sharing attributes: shared, private, threadprivate, ...

How the User Interacts with the Compiler
- The compiler generates explicit threaded code, shielding the user from many details of the multithreaded code.
- The compiler figures out the details of the code each thread needs to execute.
- The compiler does not check that the programmer's directives are correct! The programmer must make sure the required synchronization is inserted.
- The result is a multithreaded object program.

OpenMP Compilers and Platforms
- Intel C++ and Fortran compilers from Intel: Intel IA-32 Linux/Windows systems; Intel Itanium-based Linux/Windows systems
- Fujitsu/Lahey Fortran, C and C++: Intel Linux systems, Fujitsu Solaris systems
- HP Fortran/C/C++: HP-UX PA-RISC/Itanium, HP Tru64 Unix
- IBM XL Fortran and C from IBM: IBM AIX systems
- Guide Fortran and C/C++ from Intel's KAI Software Lab: Intel Linux/Windows systems
- PGF77/PGF90 compilers from The Portland Group (PGI): Intel Linux/Solaris/Windows/NT systems
- Freeware: Omni, OdinMP, OMPi, OpenUH, ...
Check information at

Structure of a Compiler
- Front end: reads in the source program, ensures that it is error-free, builds the intermediate representation (IR).
- Middle end: analyzes and optimizes the program as much as possible; "lowers" the IR to a machine-like form.
- Back end: determines the layout of program data in memory; generates object code for the target architecture and optimizes it.

OpenMP Implementation

OpenMP Implementation (cont'd)
- If the program is compiled sequentially, the OpenMP comments and pragmas are ignored.
- If the code is compiled for parallel execution, the comments and/or pragmas are read and drive the translation into a parallel program.
- Ideally, one source serves both the sequential and the parallel program (a big maintenance plus). Usually this is accomplished by choosing a specific compiler option.

OpenMP Implementation (cont'd)
The implementation:
- transforms OpenMP programs into multithreaded code
- figures out the details of the work to be performed by each thread
- arranges storage for the different data and performs their initializations: shared, private, ...
- manages threads: creates, suspends, wakes up, and terminates them
- implements thread synchronization

Implementation-Defined Issues
OpenMP leaves some issues to the implementation:
- the default number of threads
- the default schedule and the default for schedule(runtime)
- the number of threads used to execute nested parallel regions
- the behaviour in case of thread exhaustion
- and many others...
Despite many similarities, each implementation is a little different from all the others.

Butterfly Effect
The butterfly effect is a phrase that encapsulates the more technical notion of sensitive dependence on initial conditions in chaos theory: small variations in the initial conditions of a dynamical system may produce large variations in its long-term behavior. True to the name, when we gave the parameters a small change, we got totally different results.

System Overview
The classical model assumes a magnetic pendulum that is attracted by three magnets, each magnet having a distinct color. The magnets are located underneath the pendulum on a circle centered at the pendulum's mount point. They are strong enough to attract the pendulum so that it will not come to rest in the center position.

System Overview (cont'd)

Beeman Integration Algorithm
The formula used to compute the positions at time t + Δt is:

    x(t + Δt) = x(t) + v(t)·Δt + (1/6)·[4a(t) − a(t − Δt)]·Δt²

and this is the formula used to update the velocities:

    v(t + Δt) = v(t) + (1/6)·[2a(t + Δt) + 5a(t) − a(t − Δt)]·Δt

Simulation Results
Exp 1: single core vs. dual core; performance with respect to the number of threads; serial vs. parallel. 32 tests were conducted.


Exp 2: simulation when the number of magnets is changed; simulation of the behavior of the pendulum. 5 tests were conducted.


Exp 3
In this experiment, we simulate the pendulum in a field of 2 magnets with varying values of the friction and gravitation forces. A total of 63 simulations were run.


Exp 4
In this experiment, we simulate the pendulum in a field of 3 magnets with varying values of the friction and gravitation forces. A total of 63 simulations were run.


Exp 5
In this experiment, we simulate the pendulum in a field of 8 magnets with varying values of the friction and gravitation forces. A total of 26 simulations were run.


Conclusion
- Even though the hardware is available, effective programming is required to maximize code efficiency.
- Complex simulations can be performed faster using a parallel architecture.
- OpenMP helps!
- Simple: everybody can learn it in 11 weeks.
- Not so simple: don't stop learning! Keep learning it for better performance.
