Published by Ashton Miner. Modified over 10 years ago.
1
INTEL CONFIDENTIAL Implementing a Task Decomposition Introduction to Parallel Programming – Part 9
2
Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners.
Review & Objectives
Previously:
- Described how the OpenMP task pragma is different from the for pragma
- Showed how to code task decomposition solutions for while loop and recursive tasks, with the OpenMP task construct
At the end of this part you should be able to:
- Design and implement a task decomposition solution
3
Case Study: The N Queens Problem
Is there a way to place N queens on an N-by-N chessboard such that no queen threatens another queen?
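To make "threatens" concrete, here is a minimal sketch of the test, assuming a placement is stored as one column index per row. The names `queens_threaten` and `placement_is_safe` are hypothetical and do not appear in the slides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: true if queens at (row1, col1) and (row2, col2)
   threaten each other. */
bool queens_threaten(int row1, int col1, int row2, int col2)
{
    return row1 == row2                             /* same row */
        || col1 == col2                             /* same column */
        || abs(row1 - row2) == abs(col1 - col2);    /* same diagonal */
}

/* cols[r] is the column of the queen in row r.
   Returns true if no pair of the n queens threatens each other. */
bool placement_is_safe(const int *cols, int n)
{
    assert(cols != NULL && n >= 0);
    for (int a = 0; a < n; a++)
        for (int b = a + 1; b < n; b++)
            if (queens_threaten(a, cols[a], b, cols[b]))
                return false;
    return true;
}
```

For N = 4, the column sequence 1, 3, 0, 2 passes this check, while 0, 1, 2, 3 (all queens on one diagonal) does not.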
4
A Solution to the 4 Queens Problem
5
Exhaustive Search
6
Design #1 for Parallel Search
Create threads to explore different parts of the search tree simultaneously
If a node has children:
- The thread creates child nodes
- The thread explores one child node itself
- The thread creates a new thread for every other child node
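A rough sketch of this fine-grained style follows. It is not the slides' design verbatim: it substitutes OpenMP tasks for raw threads and turns every viable child into a task rather than keeping one child for the current thread. `MAX_N`'s value, `struct partial`, and all function names are assumptions made for illustration.

```c
#include <assert.h>

#define MAX_N 16   /* assumed upper bound on the board size */

/* Hypothetical node type: a partially filled-in board, copied by value. */
struct partial {
    int pieces;            /* number of rows already filled */
    int places[MAX_N];     /* column of the queen in each filled row */
};

/* Returns 1 if the most recently placed queen is not threatened. */
static int last_queen_safe(const struct partial *b)
{
    int last = b->pieces - 1;
    for (int row = 0; row < last; row++) {
        int diff = b->places[row] - b->places[last];
        if (diff == 0 || diff == last - row || diff == row - last)
            return 0;
    }
    return 1;
}

/* Explores one node; every viable child becomes a separate task. */
static void explore(int n, struct partial node, int *count)
{
    if (node.pieces == n) {
        #pragma omp atomic
        (*count)++;
        return;
    }
    for (int col = 0; col < n; col++) {
        struct partial child = node;          /* each child gets its own board */
        child.places[child.pieces++] = col;
        if (last_queen_safe(&child)) {
            #pragma omp task firstprivate(child)
            explore(n, child, count);
        }
    }
    #pragma omp taskwait                      /* wait for this node's children */
}

int count_solutions_fine_grained(int n)
{
    assert(n >= 1 && n <= MAX_N);
    int count = 0;
    struct partial root = { 0, { 0 } };
    #pragma omp parallel
    #pragma omp single
    explore(n, root, &count);
    return count;
}
```

Compiled without OpenMP support, the pragmas are ignored and the same code runs as an ordinary serial search.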
7
Design #1 for Parallel Search
[Figure labels: Thread W, New Thread X, New Thread Y, New Thread Z]
8
Pros and Cons of Design #1
Pros:
- Simple design, easy to implement
- Balances work among threads
Cons:
- Too many threads created
- Lifetime of threads too short
- Overhead costs too high
9
Design #2 for Parallel Search
- One thread created for each subtree rooted at a particular depth
- Each thread sequentially explores its subtree
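One way to approximate this design in OpenMP is a statically scheduled loop over the subtrees at depth 1, that is, over the first-row placements. This is a sketch under assumptions: `MAX_N`'s value and the function names are hypothetical, and with fewer threads than subtrees a thread is handed several subtrees up front rather than exactly one.

```c
#include <assert.h>

#define MAX_N 16   /* assumed upper bound on the board size */

/* Returns 1 if the queen in row pieces-1 is not threatened by earlier rows. */
static int last_queen_safe(const int *places, int pieces)
{
    int last = pieces - 1;
    for (int row = 0; row < last; row++) {
        int diff = places[row] - places[last];
        if (diff == 0 || diff == last - row || diff == row - last)
            return 0;
    }
    return 1;
}

/* Sequentially counts the solutions below one partially filled board. */
static int count_subtree(int n, int *places, int pieces)
{
    if (pieces == n)
        return 1;
    int count = 0;
    for (int col = 0; col < n; col++) {
        places[pieces] = col;
        if (last_queen_safe(places, pieces + 1))
            count += count_subtree(n, places, pieces + 1);
    }
    return count;
}

/* Subtrees are assigned to threads once, before any searching starts. */
int count_solutions_static(int n)
{
    assert(n >= 1 && n <= MAX_N);
    int total = 0;
    #pragma omp parallel for schedule(static) reduction(+:total)
    for (int first = 0; first < n; first++) {
        int places[MAX_N];        /* private board for this subtree */
        places[0] = first;
        total += count_subtree(n, places, 1);
    }
    return total;
}
```

Because the assignment is fixed in advance, a thread that draws small subtrees simply idles once it finishes.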
10
Design #2 in Action
[Figure labels: Thread 1, Thread 2, Thread 3]
11
Pros and Cons of Design #2
Pros:
- Thread creation/termination time minimized
Cons:
- Subtree sizes may vary dramatically
- Some threads may finish long before others
- Imbalanced workloads lower efficiency
12
Design #3 for Parallel Search
- Main thread creates a work pool: a list of subtrees to explore
- Main thread creates a finite number of co-worker threads
- Each subtree exploration is done by a single thread
- Inactive threads go to the pool to get more work
13
Work Pool Analogy
- More rows than workers
- Each worker takes an unpicked row and picks the crop
- After completing a row, the worker takes another unpicked row
- Process continues until all rows have been harvested
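The analogy can be sketched with a shared counter that names the next unpicked row. The function `harvest`, its parameters, and the critical-section name `row_pool` are made up for this illustration; the slides' own work pool is a linked list rather than a counter.

```c
#include <assert.h>
#include <stddef.h>

/* Each worker (thread) repeatedly claims the next unpicked row.
   times_picked[r] is incremented once by whichever worker claims row r. */
void harvest(int num_rows, int *times_picked)
{
    assert(num_rows >= 0 && times_picked != NULL);
    int next_row = 0;                  /* shared: first row nobody has claimed */
    #pragma omp parallel
    {
        for (;;) {
            int my_row;                /* private to each worker */
            #pragma omp critical(row_pool)
            {
                my_row = next_row;     /* look at the next unpicked row */
                if (my_row < num_rows)
                    next_row++;        /* claim it so no one else takes it */
            }
            if (my_row >= num_rows)
                break;                 /* every row has been claimed */
            times_picked[my_row]++;    /* "pick the crop" in the claimed row */
        }
    }
}
```

Only the claim itself is serialized; the picking happens outside the critical section, so workers operate on different rows at the same time.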
14
Design #3 in Action
[Figure labels: Thread 1, Thread 2, Thread 3, Thread 3, Thread 1]
15
Pros and Cons of Strategy #3
Pros:
- Thread creation/termination time minimized
- Workload balance better than strategy #2
Cons:
- Threads need exclusive access to data structure containing work to be done, a sequential component
- Workload balance worse than strategy #1
Conclusion:
- Good compromise between designs 1 and 2
16
Implementing Strategy #3 for N Queens
Work pool consists of N boards representing the N possible placements of a queen on the first row
17
Parallel Program Design
- One thread creates list of partially filled-in boards
- Fork: Create one thread per core
- Each thread repeatedly gets a board from the list, searches for solutions, and adds to the solution count, until no boards remain on the list
- Join: Occurs when list is empty
- One thread prints number of solutions found
18
Search Tree Node Structure
/* The board struct contains information about a node in the
   search tree; i.e., partially filled-in board. The work pool
   is a singly linked list of board structs. */
struct board {
    int pieces;            /* # of queens on board */
    int places[MAX_N];     /* Queen's position in each row */
    struct board *next;    /* Next search tree node */
};
19
Key Code in main Function
struct board *stack;
...
stack = NULL;
for (i = 0; i < n; i++) {
    initial = (struct board *)malloc(sizeof(struct board));
    initial->pieces = 1;
    initial->places[0] = i;
    initial->next = stack;
    stack = initial;
}
num_solutions = 0;
search_for_solutions (n, stack, &num_solutions);
printf ("The %d-queens puzzle has %d solutions\n", n, num_solutions);
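The pool-building loop could be packaged as a function that also checks the result of `malloc`, which the slide code does not. This is a sketch: `build_work_pool`, `free_work_pool`, and the value of `MAX_N` are assumptions and not part of the original program.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_N 16   /* assumed; the slides do not show the value */

struct board {
    int pieces;            /* # of queens on board */
    int places[MAX_N];     /* queen's position in each row */
    struct board *next;    /* next search tree node */
};

/* Releases every board still on the list. */
void free_work_pool(struct board *stack)
{
    while (stack != NULL) {
        struct board *next = stack->next;
        free(stack);
        stack = next;
    }
}

/* Builds one board per first-row placement; returns NULL if allocation fails. */
struct board *build_work_pool(int n)
{
    assert(n >= 1 && n <= MAX_N);
    struct board *stack = NULL;
    for (int i = 0; i < n; i++) {
        struct board *initial = malloc(sizeof *initial);
        if (initial == NULL) {
            free_work_pool(stack);     /* do not leak the boards built so far */
            return NULL;
        }
        initial->pieces = 1;
        initial->places[0] = i;
        initial->next = stack;         /* push onto the front of the list */
        stack = initial;
    }
    return stack;
}
```

Since each board is pushed onto the front, the list ends up ordered from column n-1 down to column 0.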
20
Insertion of OpenMP Code
struct board *stack;
...
stack = NULL;
for (i = 0; i < n; i++) {
    initial = (struct board *)malloc(sizeof(struct board));
    initial->pieces = 1;
    initial->places[0] = i;
    initial->next = stack;
    stack = initial;
}
num_solutions = 0;
#pragma omp parallel
search_for_solutions (n, stack, &num_solutions);
printf ("The %d-queens puzzle has %d solutions\n", n, num_solutions);
21
Original C Function to Get Work
void search_for_solutions (int n, struct board *stack, int *num_solutions)
{
    struct board *ptr;
    void search (int, struct board *, int *);

    while (stack != NULL) {
        ptr = stack;
        stack = stack->next;
        search (n, ptr, num_solutions);
        free (ptr);
    }
}
22
C/OpenMP Function to Get Work
void search_for_solutions (int n, struct board *stack, int *num_solutions)
{
    struct board *ptr;
    void search (int, struct board *, int *);

    while (stack != NULL) {
        #pragma omp critical
        {
            ptr = stack;
            stack = stack->next;
        }
        search (n, ptr, num_solutions);
        free (ptr);
    }
}
23
Original C Search Function
void search (int n, struct board *ptr, int *num_solutions)
{
    int i;
    int no_threats (struct board *);

    if (ptr->pieces == n) {
        (*num_solutions)++;
    } else {
        ptr->pieces++;
        for (i = 0; i < n; i++) {
            ptr->places[ptr->pieces-1] = i;
            if (no_threats(ptr))
                search (n, ptr, num_solutions);
        }
        ptr->pieces--;
    }
}
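The slides declare `no_threats` but never show its body. One plausible version is sketched below, assuming that only the most recently placed queen needs checking, since `search` extends a board only after every earlier placement has passed the same test. The value of `MAX_N` is also an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_N 16   /* assumed; the slides do not show the value */

struct board {
    int pieces;            /* # of queens on board */
    int places[MAX_N];     /* queen's position in each row */
    struct board *next;    /* next search tree node */
};

/* Returns 1 if the queen in the last filled row is not threatened by any
   queen in an earlier row, and 0 otherwise. */
int no_threats(struct board *ptr)
{
    assert(ptr != NULL && ptr->pieces >= 1 && ptr->pieces <= MAX_N);
    int last = ptr->pieces - 1;
    for (int row = 0; row < last; row++) {
        int col_diff = ptr->places[row] - ptr->places[last];
        int row_diff = last - row;
        if (col_diff == 0                /* same column */
            || col_diff == row_diff      /* same diagonal, one direction */
            || col_diff == -row_diff)    /* same diagonal, other direction */
            return 0;
    }
    return 1;
}
```

Rows never clash because each row holds exactly one queen by construction, so only columns and diagonals are compared.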
24
C/OpenMP Search Function
void search (int n, struct board *ptr, int *num_solutions)
{
    int i;
    int no_threats (struct board *);

    if (ptr->pieces == n) {
        #pragma omp critical
        (*num_solutions)++;
    } else {
        ptr->pieces++;
        for (i = 0; i < n; i++) {
            ptr->places[ptr->pieces-1] = i;
            if (no_threats(ptr))
                search (n, ptr, num_solutions);
        }
        ptr->pieces--;
    }
}
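For a single increment like this one, `#pragma omp atomic` is a possible alternative to a critical section. The sketch below is not taken from the slides, and `record_solution` and `record_many` are hypothetical names. Note that `atomic` and `critical` do not exclude each other, so every update to the counter would need to use the same mechanism.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bump the shared solution count atomically. */
void record_solution(int *num_solutions)
{
    assert(num_solutions != NULL);
    #pragma omp atomic
    (*num_solutions)++;
}

/* Small driver: the threads together record 'hits' solutions. */
int record_many(int hits)
{
    int num_solutions = 0;
    #pragma omp parallel for
    for (int i = 0; i < hits; i++)
        record_solution(&num_solutions);
    return num_solutions;
}
```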
25
Only One Problem: It Doesn't Work!
- OpenMP program throws an exception
- Culprit: Variable stack
[Figure labels: Heap, stack]
26
Problem Site
int main ()
{
    struct board *stack;
    ...
    #pragma omp parallel
    search_for_solutions(n, stack, &num_solutions);
    ...
}

void search_for_solutions (int n, struct board *stack, int *num_solutions)
{
    ...
    while (stack != NULL) ...
27
1. Both Threads Point to Top
[Figure labels: stack, Thread 1, Thread 2]
28
2. Thread 1 Grabs First Element
[Figure labels: Thread 1, Thread 2, stack, ptr]
29
3. Thread 2 Grabs Next Element
Error #1: Thread 2 grabs same element
[Figure labels: Thread 1, Thread 2, stack, ptr]
30
4. Thread 1 Deletes Element
Error #2: Thread 2's stack pointer dangles
[Figure labels: Thread 1, Thread 2, stack, ptr]
31
Demonstrate error #2
Thread 1 reaches the critical region and reads stack
[Figure labels: Thread 1, Thread 2, stack, ptr]
32
Demonstrate error #2
Thread 1 copies stack to ptr
[Figure labels: Thread 1, Thread 2, stack, ptr]
33
Demonstrate error #2
Thread 1 advances stack
[Figure labels: Thread 1, Thread 2, stack, ptr]
34
Demonstrate error #2
Thread 1 exits critical region
[Figure labels: Thread 1, Thread 2, stack, ptr]
35
Demonstrate error #2
Thread 1 frees ptr
Thread 2's stack points to an undefined value
[Figure labels: Thread 1, Thread 2, stack, ptr]
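The root cause is that C passes the pointer by value, so each call to `search_for_solutions` advances only its own private copy of `stack`. The stripped-down sketch below shows the difference between passing the pointer and passing its address. `struct node` and both function names are stand-ins invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>

struct node {              /* stand-in for struct board */
    int value;
    struct node *next;
};

/* Advances only the callee's private copy of the pointer. */
void pop_by_value(struct node *stack)
{
    if (stack != NULL)
        stack = stack->next;           /* the caller's pointer is unchanged */
    (void)stack;                       /* silence "set but unused" warnings */
}

/* Advances the caller's pointer through one level of indirection. */
void pop_by_reference(struct node **stack)
{
    assert(stack != NULL);
    if (*stack != NULL)
        *stack = (*stack)->next;       /* every holder of &top sees this */
}
```

After `pop_by_value(top)` the caller's `top` still names the first element, so another thread can grab the same element and later follow it after it has been freed. After `pop_by_reference(&top)` it names the second element.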
36
Remedy 1: Make stack Static
[Figure labels: Thread 1, Thread 2, stack]
37
Remedy 1: Make stack Static
Why would this work?
[Figure labels: Thread 1, Thread 2, stack, ptr]
38
Remedy 1: Make stack Static
Thread 1 enters critical region
[Figure labels: Thread 1, Thread 2, stack, ptr]
39
Remedy 1: Make stack Static
Thread 1 copies stack to ptr
[Figure labels: Thread 1, Thread 2, stack, ptr]
40
Remedy 1: Make stack Static
Thread 1 advances stack
[Figure labels: Thread 1, Thread 2, stack, ptr]
41
Remedy 1: Make stack Static
Thread 1 exits critical region
[Figure labels: Thread 1, Thread 2, stack, ptr]
42
Remedy 1: Make stack Static
Thread 1 frees ptr; no dangling memory
[Figure labels: Thread 1, Thread 2, stack, ptr]
43
Remedy 2: Use Indirection (Best choice)
Now the data is encapsulated inside function calls and is no longer susceptible to overwriting of a global/static variable
[Figure labels: Thread 1, Thread 2, &stack]
44
Corrected main Function
struct board *stack;
...
stack = NULL;
for (i = 0; i < n; i++) {
    initial = (struct board *)malloc(sizeof(struct board));
    initial->pieces = 1;
    initial->places[0] = i;
    initial->next = stack;
    stack = initial;
}
num_solutions = 0;
#pragma omp parallel
search_for_solutions (n, &stack, &num_solutions);
printf ("The %d-queens puzzle has %d solutions\n", n, num_solutions);
45
Corrected Stack Access Function
void search_for_solutions (int n, struct board **stack, int *num_solutions)
{
    struct board *ptr;
    void search (int, struct board *, int *);

    while (*stack != NULL) {
        #pragma omp critical
        {
            ptr = *stack;
            *stack = (*stack)->next;
        }
        search (n, ptr, num_solutions);
        free (ptr);
    }
}
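One thing to watch in a loop of this shape: the `*stack != NULL` test runs outside the critical section. With a single board left, two threads may both pass the test, and the second one to enter the critical section would then dereference a NULL pointer. A possible variant, sketched here with a hypothetical `take_board` helper and an assumed `MAX_N`, tests and removes the board in one critical section.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_N 16   /* assumed; the slides do not show the value */

struct board {
    int pieces;            /* # of queens on board */
    int places[MAX_N];     /* queen's position in each row */
    struct board *next;    /* next search tree node */
};

/* Removes and returns the top board, or returns NULL if the pool is empty.
   The emptiness test and the removal happen inside one critical section. */
struct board *take_board(struct board **stack)
{
    struct board *ptr;
    assert(stack != NULL);
    #pragma omp critical(work_pool)
    {
        ptr = *stack;
        if (ptr != NULL)
            *stack = ptr->next;
    }
    return ptr;
}
```

The worker loop would then read `while ((ptr = take_board(stack)) != NULL) { search (n, ptr, num_solutions); free (ptr); }`, with no unprotected read of the shared pointer.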