INTEL CONFIDENTIAL Implementing a Task Decomposition Introduction to Parallel Programming – Part 9.

Presentation transcript:

INTEL CONFIDENTIAL Implementing a Task Decomposition Introduction to Parallel Programming – Part 9

Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners.

Review & Objectives
Previously:
- Described how the OpenMP task pragma differs from the for pragma
- Showed how to code task decomposition solutions for while loops and recursive tasks with the OpenMP task construct
At the end of this part you should be able to:
- Design and implement a task decomposition solution

Case Study: The N Queens Problem
Is there a way to place N queens on an N-by-N chessboard such that no queen threatens another queen?

A Solution to the 4 Queens Problem

Exhaustive Search

Design #1 for Parallel Search
Create threads to explore different parts of the search tree simultaneously.
If a node has children:
- The thread creates the child nodes
- The thread explores one child node itself
- The thread creates a new thread for every other child node

Design #1 for Parallel Search
[Figure labels: Thread W, New Thread X, New Thread Y, New Thread Z]

Pros and Cons of Design #1
Pros:
- Simple design, easy to implement
- Balances work among threads
Cons:
- Too many threads created
- Lifetime of threads too short
- Overhead costs too high

Design #2 for Parallel Search
- One thread is created for each subtree rooted at a particular depth
- Each thread sequentially explores its subtree
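A minimal sketch of the idea behind Design #2, under stated assumptions. It fixes the depth at 1 (one subtree per first-row placement) and uses an OpenMP loop with a static schedule, so the assignment of subtrees to threads is decided up front. This is a swapped-in technique: it does not literally create one thread per subtree, and the names `safe`, `count_subtree`, `count_queens_static`, and the bound `MAX_N 16` are hypothetical, not from the deck.

```c
#define MAX_N 16  /* assumed upper bound on the board size */

/* Return 1 if a queen at (row, col) is not attacked by the queens
   already placed in rows 0..row-1; places[r] is the column in row r. */
static int safe(const int places[], int row, int col)
{
    for (int r = 0; r < row; r++) {
        int d = places[r] - col;
        if (d == 0 || d == row - r || d == r - row)
            return 0;
    }
    return 1;
}

/* Sequentially count the solutions in the subtree that starts at `row`. */
static int count_subtree(int n, int row, int places[])
{
    if (row == n)
        return 1;
    int count = 0;
    for (int col = 0; col < n; col++) {
        if (safe(places, row, col)) {
            places[row] = col;
            count += count_subtree(n, row + 1, places);
        }
    }
    return count;
}

/* Each first-row placement roots one subtree; the static schedule
   fixes which thread explores which subtrees before any work starts. */
int count_queens_static(int n)
{
    int total = 0;
    #pragma omp parallel for schedule(static) reduction(+:total)
    for (int col = 0; col < n; col++) {
        int places[MAX_N];   /* private board for this subtree */
        places[0] = col;
        total += count_subtree(n, 1, places);
    }
    return total;
}
```

Because the assignment is fixed, a thread that draws small subtrees simply goes idle, which is the imbalance the deck attributes to this design.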

Design #2 in Action
[Figure labels: Thread 1, Thread 2, Thread 3]

Pros and Cons of Design #2
Pros:
- Thread creation/termination time minimized
Cons:
- Subtree sizes may vary dramatically
- Some threads may finish long before others
- Imbalanced workloads lower efficiency

Design #3 for Parallel Search
- Main thread creates a work pool: a list of subtrees to explore
- Main thread creates a finite number of co-worker threads
- Each subtree exploration is done by a single thread
- Inactive threads go to the pool to get more work

Work Pool Analogy
- More rows than workers
- Each worker takes an unpicked row and picks the crop
- After completing a row, the worker takes another unpicked row
- Process continues until all rows have been harvested
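The analogy can be sketched in a few lines, assuming the "rows" are just indices and "picking" a row means incrementing a counter for it. The function name `harvest` and its parameters are hypothetical. Each worker claims the next unpicked row inside a critical section, works on it outside the critical section, and stops when no rows remain.

```c
/* Every thread in the team acts as one worker. picked[r] records how
   many times row r was harvested; the return value is the total number
   of rows harvested by all workers. */
int harvest(int picked[], int num_rows)
{
    int next_row = 0;    /* shared: index of the next unpicked row */
    int harvested = 0;   /* shared: rows completed so far */

    #pragma omp parallel
    {
        for (;;) {
            int row;
            /* Claiming a row is the only step that needs exclusive access. */
            #pragma omp critical
            {
                row = (next_row < num_rows) ? next_row++ : -1;
            }
            if (row < 0)
                break;           /* nothing left to pick */
            picked[row]++;       /* "pick the crop" in this row */
            #pragma omp atomic
            harvested++;
        }
    }
    return harvested;
}
```

Since each row index is handed out exactly once, no two workers touch the same `picked[row]`, so only the claim and the shared total need protection.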

Design #3 in Action
[Figure labels: Thread 1, Thread 2, Thread 3, Thread 3, Thread 1]

Pros and Cons of Strategy #3
Pros:
- Thread creation/termination time minimized
- Workload balance better than strategy #2
Cons:
- Threads need exclusive access to the data structure containing the work to be done, which is a sequential component
- Workload balance worse than strategy #1
Conclusion:
- Good compromise between designs 1 and 2

Implementing Strategy #3 for N Queens
The work pool consists of N boards representing the N possible placements of a queen on the first row.

Parallel Program Design
- One thread creates the list of partially filled-in boards
- Fork: create one thread per core
- Each thread repeatedly gets a board from the list, searches for solutions, and adds to the solution count, until no boards remain on the list
- Join: occurs when the list is empty
- One thread prints the number of solutions found

Search Tree Node Structure

    /* The board struct contains information about a node in the
       search tree; i.e., a partially filled-in board. The work pool
       is a singly linked list of board structs. */
    struct board {
        int pieces;            /* # of queens on board */
        int places[MAX_N];     /* Queen's position in each row */
        struct board *next;    /* Next search tree node */
    };

Key Code in main Function

    struct board *stack;
    ...
    stack = NULL;
    for (i = 0; i < n; i++) {
        initial = (struct board *)malloc(sizeof(struct board));
        initial->pieces = 1;
        initial->places[0] = i;
        initial->next = stack;
        stack = initial;
    }
    num_solutions = 0;
    search_for_solutions(n, stack, &num_solutions);
    printf("The %d-queens puzzle has %d solutions\n", n, num_solutions);
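The pool-building loop can be packaged as a helper, sketched here under assumptions: the function name `build_pool`, the bound `MAX_N 16`, and the error handling (range check, `malloc` failure cleanup) are not from the deck, whose value for `MAX_N` is not shown.

```c
#include <stdlib.h>

#define MAX_N 16   /* assumed bound; the deck does not give its value */

struct board {
    int pieces;            /* # of queens on board */
    int places[MAX_N];     /* queen's column in each row */
    struct board *next;    /* next search tree node */
};

/* Build the work pool: one board per placement of a queen in row 0.
   Boards are pushed on the front, so the last placement ends up on top.
   Returns NULL if n is out of range or an allocation fails. */
struct board *build_pool(int n)
{
    struct board *stack = NULL;

    if (n < 1 || n > MAX_N)
        return NULL;

    for (int i = 0; i < n; i++) {
        struct board *initial = malloc(sizeof *initial);
        if (initial == NULL) {
            /* Release what was built so far before giving up. */
            while (stack != NULL) {
                struct board *t = stack;
                stack = stack->next;
                free(t);
            }
            return NULL;
        }
        initial->pieces = 1;
        initial->places[0] = i;
        initial->next = stack;
        stack = initial;
    }
    return stack;
}
```

For n = 4 this yields a four-element list whose top board has its queen in column 3 and whose last board has it in column 0.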

Insertion of OpenMP Code

    struct board *stack;
    ...
    stack = NULL;
    for (i = 0; i < n; i++) {
        initial = (struct board *)malloc(sizeof(struct board));
        initial->pieces = 1;
        initial->places[0] = i;
        initial->next = stack;
        stack = initial;
    }
    num_solutions = 0;
    #pragma omp parallel
    search_for_solutions(n, stack, &num_solutions);
    printf("The %d-queens puzzle has %d solutions\n", n, num_solutions);

Original C Function to Get Work

    void search_for_solutions(int n, struct board *stack, int *num_solutions)
    {
        struct board *ptr;
        void search(int, struct board *, int *);

        while (stack != NULL) {
            ptr = stack;               /* take the top board */
            stack = stack->next;       /* advance to the next board */
            search(n, ptr, num_solutions);
            free(ptr);
        }
    }

C/OpenMP Function to Get Work

    void search_for_solutions(int n, struct board *stack, int *num_solutions)
    {
        struct board *ptr;
        void search(int, struct board *, int *);

        while (stack != NULL) {
            #pragma omp critical
            {
                ptr = stack;           /* take the top board */
                stack = stack->next;   /* advance to the next board */
            }
            search(n, ptr, num_solutions);
            free(ptr);
        }
    }

Original C Search Function

    void search(int n, struct board *ptr, int *num_solutions)
    {
        int i;
        int no_threats(struct board *);

        if (ptr->pieces == n) {
            (*num_solutions)++;        /* all n queens placed */
        } else {
            ptr->pieces++;
            for (i = 0; i < n; i++) {
                ptr->places[ptr->pieces-1] = i;   /* try column i in the new row */
                if (no_threats(ptr))
                    search(n, ptr, num_solutions);
            }
            ptr->pieces--;
        }
    }
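The deck declares `no_threats` but does not show its body. One possible version is sketched below, assuming `places[row]` holds the column of the queen in that row and that only the most recently placed queen needs checking, since `search` only recurses from boards that already passed the test. The bound `MAX_N 16` is also an assumption.

```c
#define MAX_N 16   /* assumed bound; not given in the deck */

struct board {
    int pieces;            /* # of queens on board */
    int places[MAX_N];     /* queen's column in each row */
    struct board *next;    /* next search tree node */
};

/* Return 1 if the newest queen (row pieces-1) shares no column and no
   diagonal with any queen in an earlier row, 0 otherwise. Rows cannot
   clash because each row holds exactly one queen. */
int no_threats(struct board *ptr)
{
    int last = ptr->pieces - 1;

    for (int row = 0; row < last; row++) {
        int dcol = ptr->places[row] - ptr->places[last];
        int drow = last - row;
        if (dcol == 0 || dcol == drow || dcol == -drow)
            return 0;
    }
    return 1;
}
```

For example, with queens at columns 0 and 2 in rows 0 and 1 the function returns 1, while columns 0 and 1 (a diagonal) or 0 and 0 (same column) return 0.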

C/OpenMP Search Function

    void search(int n, struct board *ptr, int *num_solutions)
    {
        int i;
        int no_threats(struct board *);

        if (ptr->pieces == n) {
            #pragma omp critical
            (*num_solutions)++;        /* shared counter: update exclusively */
        } else {
            ptr->pieces++;
            for (i = 0; i < n; i++) {
                ptr->places[ptr->pieces-1] = i;   /* try column i in the new row */
                if (no_threats(ptr))
                    search(n, ptr, num_solutions);
            }
            ptr->pieces--;
        }
    }
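As an aside, a single increment of a shared counter can also be protected with OpenMP's `atomic` construct instead of `critical`. The sketch below is an alternative, not the deck's method; the helper names `record_solution` and `count_many` are hypothetical, and whether `atomic` is actually cheaper depends on the compiler and hardware.

```c
/* Record one solution in the shared counter. The atomic construct
   protects just this one read-modify-write of *num_solutions. */
void record_solution(int *num_solutions)
{
    #pragma omp atomic
    (*num_solutions)++;
}

/* Small driver: many threads bump the same counter concurrently. */
int count_many(int reps)
{
    int total = 0;   /* shared by all threads in the loop below */

    #pragma omp parallel for
    for (int i = 0; i < reps; i++)
        record_solution(&total);

    return total;
}
```

Without the `atomic` line, concurrent increments could be lost and `count_many` could return less than `reps` when run with several threads.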

Only One Problem: It Doesn't Work!
- The OpenMP program throws an exception
- Culprit: the variable stack
[Figure labels: Heap, stack]

Problem Site

    int main()
    {
        struct board *stack;
        ...
        #pragma omp parallel
        search_for_solutions(n, stack, &num_solutions);
        ...
    }

    void search_for_solutions(int n, struct board *stack, int *num_solutions)
    {
        ...
        while (stack != NULL) ...

1. Both Threads Point to Top
[Figure: Thread 1 and Thread 2 each hold their own stack pointer]

2. Thread 1 Grabs First Element
[Figure: Thread 1's ptr refers to the first element; Thread 1's stack has advanced]

3. Thread 2 Grabs Next Element
Error #1: Thread 2 grabs the same element

4. Thread 1 Deletes Element
Error #2: Thread 2's stack pointer dangles

Demonstrate Error #2
- Thread 1 hits the critical region and reads stack
- Thread 1 copies stack to ptr
- Thread 1 advances stack
- Thread 1 exits the critical region
- Thread 1 frees ptr; Thread 2's stack now points to an undefined value
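The root cause is ordinary C parameter passing: `search_for_solutions` receives `stack` by value, so each thread's call works on its own copy of the pointer. The single-threaded sketch below shows the effect with a stripped-down node type; `struct node`, `advance_copy`, and `advance_shared` are hypothetical names used only for illustration.

```c
struct node {
    int value;
    struct node *next;
};

/* Same shape as the slide's signature: `stack` is a by-value copy, so
   advancing it is invisible to the caller (and to any other thread
   that was handed the same starting value). */
struct node *advance_copy(struct node *stack)
{
    stack = stack->next;   /* changes only this call's copy */
    return stack;
}

/* With the address of the head pointer, the update lands in the one
   variable that every caller shares. */
void advance_shared(struct node **stack)
{
    *stack = (*stack)->next;
}
```

After `advance_copy(head)` the caller's `head` still refers to the first node, which is why a second thread can grab the same element and later be left holding a freed one; after `advance_shared(&head)` the caller's `head` has moved on.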

Remedy 1: Make stack Static
Why would this work?
- Thread 1 enters the critical region
- Thread 1 copies stack to ptr
- Thread 1 advances stack
- Thread 1 exits the critical region
- Thread 1 frees ptr; no dangling pointer remains
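The deck shows no code for this remedy, so the following is only a sketch of how it might look. A file-scope static variable has a single instance that all OpenMP threads share (unless it is declared threadprivate), so an update made by one thread inside the critical section is the value the next thread reads. The helpers `push_board` and `take_board` and the bound `MAX_N 16` are assumptions.

```c
#include <stddef.h>

#define MAX_N 16   /* assumed bound; not given in the deck */

struct board {
    int pieces;
    int places[MAX_N];
    struct board *next;
};

/* One work-pool head for the whole program, shared by every thread. */
static struct board *stack = NULL;

/* Push a board on the pool; intended to be called by one thread
   before the parallel region starts. */
void push_board(struct board *b)
{
    b->next = stack;
    stack = b;
}

/* Pop the top board, or return NULL when the pool is empty. The test
   and the update happen together inside the critical section. */
struct board *take_board(void)
{
    struct board *ptr;

    #pragma omp critical
    {
        ptr = stack;
        if (ptr != NULL)
            stack = ptr->next;
    }
    return ptr;
}
```

The cost is that the pool becomes a global that any function in the file can modify, which is the drawback the indirection remedy avoids.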

Remedy 2: Use Indirection (Best choice)
[Figure labels: Thread 1, Thread 2, &stack]
Both threads receive &stack, the address of the single stack pointer. The data is now encapsulated inside function calls and is no longer exposed as a global/static variable that could be overwritten.

Corrected main Function

    struct board *stack;
    ...
    stack = NULL;
    for (i = 0; i < n; i++) {
        initial = (struct board *)malloc(sizeof(struct board));
        initial->pieces = 1;
        initial->places[0] = i;
        initial->next = stack;
        stack = initial;
    }
    num_solutions = 0;
    #pragma omp parallel
    search_for_solutions(n, &stack, &num_solutions);
    printf("The %d-queens puzzle has %d solutions\n", n, num_solutions);

Corrected Stack Access Function

    void search_for_solutions(int n, struct board **stack, int *num_solutions)
    {
        struct board *ptr;
        void search(int, struct board *, int *);

        while (*stack != NULL) {
            #pragma omp critical
            {
                ptr = *stack;              /* take the top board */
                *stack = (*stack)->next;   /* advance the shared head */
            }
            search(n, ptr, num_solutions);
            free(ptr);
        }
    }
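One point the corrected function appears to leave open: the `while (*stack != NULL)` test is evaluated outside the critical section, so two threads could both see one remaining board and the second would then dereference a NULL head. The sketch below is a variant, not the deck's code, that tests for emptiness and pops inside the same critical section. The body of `no_threats`, the driver `count_queens`, and `MAX_N 16` are assumptions.

```c
#include <stdlib.h>

#define MAX_N 16   /* assumed bound; not given in the deck */

struct board {
    int pieces;
    int places[MAX_N];
    struct board *next;
};

/* Assumed helper: checks only the newest queen against earlier rows. */
static int no_threats(const struct board *ptr)
{
    int last = ptr->pieces - 1;
    for (int row = 0; row < last; row++) {
        int dcol = ptr->places[row] - ptr->places[last];
        if (dcol == 0 || dcol == last - row || dcol == row - last)
            return 0;
    }
    return 1;
}

static void search(int n, struct board *ptr, int *num_solutions)
{
    if (ptr->pieces == n) {
        #pragma omp critical
        (*num_solutions)++;
    } else {
        ptr->pieces++;
        for (int i = 0; i < n; i++) {
            ptr->places[ptr->pieces - 1] = i;
            if (no_threats(ptr))
                search(n, ptr, num_solutions);
        }
        ptr->pieces--;
    }
}

/* Emptiness test and pop happen together under one critical section. */
static void search_for_solutions(int n, struct board **stack,
                                 int *num_solutions)
{
    for (;;) {
        struct board *ptr;
        #pragma omp critical
        {
            ptr = *stack;
            if (ptr != NULL)
                *stack = ptr->next;
        }
        if (ptr == NULL)
            break;               /* pool is empty: this thread is done */
        search(n, ptr, num_solutions);
        free(ptr);
    }
}

/* Hypothetical driver: returns the solution count, or -1 on bad input
   or allocation failure. */
int count_queens(int n)
{
    struct board *stack = NULL;
    int num_solutions = 0;

    if (n < 1 || n > MAX_N)
        return -1;
    for (int i = 0; i < n; i++) {
        struct board *initial = malloc(sizeof *initial);
        if (initial == NULL) {
            while (stack != NULL) {
                struct board *t = stack;
                stack = stack->next;
                free(t);
            }
            return -1;
        }
        initial->pieces = 1;
        initial->places[0] = i;
        initial->next = stack;
        stack = initial;
    }
    #pragma omp parallel
    search_for_solutions(n, &stack, &num_solutions);
    return num_solutions;
}
```

Each board is owned by exactly one thread once it has been popped, so `search` can modify it freely; only the pool head and the solution counter are shared.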
