Presentation is loading. Please wait.

Presentation is loading. Please wait.

INTEL CONFIDENTIAL Implementing a Task Decomposition Introduction to Parallel Programming – Part 9.

Similar presentations


Presentation on theme: "INTEL CONFIDENTIAL Implementing a Task Decomposition Introduction to Parallel Programming – Part 9."— Presentation transcript:

1 INTEL CONFIDENTIAL Implementing a Task Decomposition Introduction to Parallel Programming – Part 9

2 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Review & Objectives Previously: Described how the OpenMP task pragma is different from the for pragma Showed how to code task decomposition solutions for while loop and recursive tasks, with the OpenMP task construct At the end of this part you should be able to: Design and implement a task decomposition solution 2

3 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Case Study: The N Queens Problem 3 Is there a way to place N queens on an N-by-N chessboard such that no queen threatens another queen?

4 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. A Solution to the 4 Queens Problem 4

5 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Exhaustive Search 5

6 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Design #1 for Parallel Search Create threads to explore different parts of the search tree simultaneously If a node has children The thread creates child nodes The thread explores one child node itself Thread creates a new thread for every other child node 6

7 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Design #1 for Parallel Search 7 Thread W New Thread X New Thread Y New Thread Z

8 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Pros and Cons of Design #1 Pros Simple design, easy to implement Balances work among threads Cons Too many threads created Lifetime of threads too short Overhead costs too high 8

9 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Design #2 for Parallel Search One thread created for each subtree rooted at a particular depth Each thread sequentially explores its subtree 9

10 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Design #2 in Action 10 Thread 1 Thread 2 Thread 3

11 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Pros and Cons of Design #2 Pros Thread creation/termination time minimized Cons Subtree sizes may vary dramatically Some threads may finish long before others Imbalanced workloads lower efficiency 11

12 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Design #3 for Parallel Search Main thread creates work poollist of subtrees to explore Main thread creates finite number of co-worker threads Each subtree exploration is done by a single thread Inactive threads go to pool to get more work 12

13 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Work Pool Analogy More rows than workers Each worker takes an unpicked row and picks the crop After completing a row, the worker takes another unpicked row Process continues until all rows have been harvested 13

14 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Design #3 in Action 14 Thread 1 Thread 2 Thread 3 Thread 3 Thread 1

15 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Pros and Cons of Strategy #3 Pros Thread creation/termination time minimized Workload balance better than strategy #2 Cons Threads need exclusive access to data structure containing work to be done, a sequential component Workload balance worse than strategy #1 Conclusion Good compromise between designs 1 and 2 15

16 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Implementing Strategy #3 for N Queens Work pool consists of N boards representing N possible placements of queen on first row 16

17 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Parallel Program Design One thread creates list of partially filled-in boards Fork: Create one thread per core Each thread repeatedly gets board from list, searches for solutions, and adds to solution count, until no more board on list Join: Occurs when list is empty One thread prints number of solutions found 17

18 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Search Tree Node Structure /* The board struct contains information about a node in the search tree; i.e., partially filled- in board. The work pool is a singly linked list of board structs. */ struct board { int pieces;/* # of queens on board*/ int places[MAX_N]; /* Queens pos in each row */ struct board *next; /* Next search tree node */ }; 18

19 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Key Code in main Function struct board *stack;... stack = NULL; for (i = 0; i < n; i++) { initial=(struct board *)malloc(sizeof(struct board)); initial->pieces = 1; initial->places[0] = i; initial->next = stack; stack = initial; } num_solutions = 0; search_for_solutions (n, stack, &num_solutions); printf ("The %d-queens puzzle has %d solutions\n", n, num_solutions); 19

20 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Insertion of OpenMP Code struct board *stack;... stack = NULL; for (i = 0; i < n; i++) { initial=(struct board *)malloc(sizeof(struct board)); initial->pieces = 1; initial->places[0] = i; initial->next = stack; stack = initial; } num_solutions = 0; #pragma omp parallel search_for_solutions (n, stack, &num_solutions); printf ("The %d-queens puzzle has %d solutions\n", n, num_solutions); 20

21 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Original C Function to Get Work void search_for_solutions (int n, struct board *stack, int *num_solutions) { struct board *ptr; void search (int, struct board *, int *); while (stack != NULL) { ptr = stack; stack = stack->next; search (n, ptr, num_solutions); free (ptr); } 21

22 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. C/OpenMP Function to Get Work void search_for_solutions (int n, struct board *stack, int *num_solutions) { struct board *ptr; void search (int, struct board *, int *); while (stack != NULL) { #pragma omp critical { ptr = stack; stack = stack->next; } search (n, ptr, num_solutions); free (ptr); } 22

23 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Original C Search Function void search (int n, struct board *ptr, int *num_solutions) { int i; int no_threats (struct board *); if (ptr->pieces == n) { (*num_solutions)++; } else { ptr->pieces++; for (i = 0; i < n; i++) { ptr->places[ptr->pieces-1] = i; if (no_threats(ptr)) search (n, ptr, num_solutions); } ptr->pieces--; } 23

24 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. C/OpenMP Search Function void search (int n, struct board *ptr, int *num_solutions) { int i; int no_threats (struct board *); if (ptr->pieces == n) { #pragma omp critical (*num_solutions)++; } else { ptr->pieces++; for (i = 0; i < n; i++) { ptr->places[ptr->pieces-1] = i; if (no_threats(ptr)) search (n, ptr, num_solutions); } ptr->pieces--; } 24

25 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Only One Problem: It Doesnt Work! OpenMP program throws an exception Culprit: Variable stack 25 Heap stack

26 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Problem Site int main () { struct board *stack;... #pragma omp parallel search_for_solutions(n, stack, &num_solutions);... } void search_for_solutions (int n, struct board *stack, int *num_solutions) {... while (stack != NULL)... 26

27 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. 1. Both Threads Point to Top 27 stack Thread 1Thread 2

28 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. 2. Thread 1 Grabs First Element 28 stack Thread 1Thread 2 stack ptr

29 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. 3. Thread 2 Grabs Next Element 29 Thread 1Thread 2 stack ptr stack ptr Error #1 Thread 2 grabs same element

30 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. 4. Thread 1 Deletes Element 30 stack Thread 1Thread 2 stack ptr ? Error #2 Thread 2s stack pointer dangles

31 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Demonstrate error #2 31 stack Thread 1Thread 2 stack ptr Thread 1 gets hits critical region & reads stack

32 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Demonstrate error #2 32 stack Thread 1Thread 2 stack ptr Thread 1 copies stack to ptr

33 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Demonstrate error #2 33 stack Thread 1Thread 2 stack ptr Thread 1 advances stack

34 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Demonstrate error #2 34 stack Thread 1Thread 2 stack ptr Thread 1 exits critical region

35 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Demonstrate error #2 35 stack Thread 1Thread 2 stack ptr ? Thread 1 frees ptr Thread 2 stack points to undefined value

36 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Remedy 1: Make stack Static 36 Thread 1Thread 2 stack

37 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Remedy 1: Make stack Static 37 Thread 2 stack Thread 1 stack ptr stack ptr Why would this work?

38 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Remedy 1: Make stack Static 38 Thread 2 stack Thread 1 stack ptr stack ptr Thread 1 enters critical region

39 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Remedy 1: Make stack Static 39 Thread 2 stack Thread 1 stack ptr stack ptr Thread 1 copies stack to ptr

40 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Remedy 1: Make stack Static 40 Thread 2 stack Thread 1 stack ptr stack ptr Thread 1 advances stack

41 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Remedy 1: Make stack Static 41 Thread 2 stack Thread 1 stack ptr stack ptr Thread 1 exits critical region

42 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Remedy 1: Make stack Static 42 Thread 2 stack Thread 1 stack ptr stack ptr Thread 1 frees ptr – no dangling memory

43 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Remedy 2: Use Indirection (Best choice) 43 Thread 1Thread 2 &stack Now data is encapsulated inside function calls and no longer susceptible to overwriting global/static variable

44 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Corrected main Function struct board *stack;... stack = NULL; for (i = 0; i < n; i++) { initial=(struct board *)malloc(sizeof(struct board)); initial->pieces = 1; initial->places[0] = i; initial->next = stack; stack = initial; } num_solutions = 0; #pragma omp parallel search_for_solutions (n, &stack, &num_solutions); printf ("The %d-queens puzzle has %d solutions\n", n, num_solutions); 44

45 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. Corrected Stack Access Function void search_for_solutions (int n, struct board **stack, int *num_solutions) { struct board *ptr; void search (int, struct board *, int *); while (*stack != NULL) { #pragma omp critical { ptr = *stack; *stack = (*stack)->next; } search (n, ptr, num_solutions); free (ptr); } 45

46 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners. References Rohit Chandra, Leonardo Dagum, Dave Kohr, Dror Maydan, Jeff McDonald, and Ramesh Menon, Parallel Programming in OpenMP, Morgan Kaufmann (2001). Barbara Chapman, Gabriele Jost, Ruud van der Pas, Using OpenMP: Portable Shared Memory Parallel Programming, MIT Press (2008). Michael J. Quinn, Parallel Programming in C with MPI and OpenMP, McGraw-Hill (2004). 46

47


Download ppt "INTEL CONFIDENTIAL Implementing a Task Decomposition Introduction to Parallel Programming – Part 9."

Similar presentations


Ads by Google