CS 584
Discrete Optimization Problems
A discrete optimization problem can be expressed as a pair (S, f):
- S is the set of all feasible solutions
- f is the cost function
Goal: find a feasible solution x_opt such that f(x_opt) <= f(x) for all x in S.
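As a minimal sketch of the (S, f) formulation, consider a toy 0/1 selection problem solved by brute force. The weights are invented purely for illustration; real instances are far too large to enumerate, which is the point of the search methods that follow.

```python
from itertools import product

# Hypothetical item weights for a toy 0/1 selection problem.
weights = [4, -2, 7]

def f(x):
    # Cost function: total weight of the selected items.
    return sum(w for w, bit in zip(weights, x) if bit)

# S = the set of all feasible solutions: every 0/1 vector of length 3.
S = list(product([0, 1], repeat=3))

# x_opt satisfies f(x_opt) <= f(x) for all x in S.
x_opt = min(S, key=f)
print(x_opt, f(x_opt))  # -> (0, 1, 0) -2
```

With only 3 items, |S| = 8; with n items, |S| = 2^n, which is why S is usually explored as a state-space graph instead of enumerated.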
Discrete Optimization Problems: Examples
- VLSI layout
- Robot motion planning
- Test pattern generation
In most problems, S is very large. S can be represented as a state-space graph, so the optimization problem can be reformulated as a graph search problem.
Discrete Optimization Problems
Many of these problems are NP-hard. Why parallelize?
- Real-time constraints: robot motion planning, speech understanding, task scheduling
- Faster search through bigger search spaces
Search Algorithms
- Depth-First Search
- Breadth-First Search
- Best-First Search
- Branch and Bound: use cost to determine expansion
- Iterative Deepening A*: use cost + heuristic value to determine expansion
Parallel Depth-First Search
The critical issue is distribution of the search space. Static partitioning of unstructured trees leads to poor load balancing.
Dynamic Load Balancing
Consider sequential DFS: at any point, the stack holds the roots of all unexplored subtrees, so the stack itself describes the remaining work.
Parallel DFS
- Each processor performs DFS on a disjoint section of the tree (static initial assignment).
- When a processor finishes its section, it requests unsearched portions of the tree from other processors.
- Unexplored sections are stored on each processor's stack: a donor pops a section off its stack and gives it to the requester.
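The per-processor loop above can be sketched as follows. This is only a sequential model of one processor's behavior: `request_work` is a hypothetical callback standing in for the real interprocessor protocol, and returning `None` stands in for entering termination detection.

```python
def parallel_dfs_worker(local_stack, goal_test, expand, request_work):
    """One processor's loop in parallel DFS (sketch).

    request_work() (hypothetical) asks a donor processor for
    unexplored stack entries and returns a possibly-empty list."""
    while True:
        while local_stack:
            node = local_stack.pop()
            if goal_test(node):
                return node            # solution found
            local_stack.extend(expand(node))
        received = request_work()      # local stack empty: ask a donor
        if not received:
            return None                # would enter termination detection
        local_stack.extend(received)
```

On a toy tree (`expand` as a dict lookup) this behaves exactly like sequential DFS; the parallel behavior comes entirely from what `request_work` returns.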
Parallel DFS: Problems
- Splitting up the work: how much work should you give to another processor?
- Determining a donor processor: who do you request more work from?
Work Splitting Strategies
When splitting up a stack, consider:
- Sending too little work makes the recipient ask again soon; sending too much makes the donor ask again soon. Either way, the number of work requests increases.
- Ideally you would split the search space evenly rather than the stack, but estimating subtree sizes in advance is hard.
- Nodes high in the tree root big subtrees, and vice versa.
Work Splitting Strategies
To avoid sending small amounts of work, nodes beyond a specified depth (the cut-off depth) are not sent. Strategies:
- Send only nodes near the bottom of the stack
- Send nodes near the cut-off depth
- Send half of the nodes between the bottom of the stack and the cut-off depth
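The half-split strategy might look like the sketch below. Representing stack entries as (node, depth) pairs and alternating entries between donor and recipient are assumptions for illustration; a real implementation splits whole untried-successor lists.

```python
def split_work(stack, cutoff_depth):
    """Half-split sketch: give away every other entry between the
    bottom of the stack and the cut-off depth.

    Stack entries are (node, depth) pairs, bottom of stack first."""
    keep, give = [], []
    for i, (node, depth) in enumerate(stack):
        if depth >= cutoff_depth:
            # Beyond the cut-off depth: subtrees are too small to send.
            keep.append((node, depth))
        elif i % 2 == 1:
            give.append((node, depth))
        else:
            keep.append((node, depth))
    return keep, give
```

Alternating entries hedges against bad estimates: since nodes high in the tree root big subtrees, both halves get a mix of shallow (large) and deeper (smaller) work.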
Load Balancing Schemes (who do I request work from?)
- Asynchronous Round Robin: each processor maintains its own target; ask the target, then increment it.
- Global Round Robin: a single target is maintained at a master node.
- Random Polling: randomly select a donor; each processor is chosen with equal probability.
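Two of the three donor-selection rules can be sketched without shared state; the function and parameter names here are assumptions. Global Round Robin would instead fetch-and-increment one counter held at a master node, which needs communication and is omitted.

```python
import random

def make_donor_selector(scheme, rank, nprocs, seed=0):
    """Return a next_donor() function for one processor (sketch)."""
    rng = random.Random(seed)
    target = [(rank + 1) % nprocs]     # private target for ARR

    def next_donor():
        if scheme == "arr":            # Asynchronous Round Robin
            donor = target[0]
            target[0] = (target[0] + 1) % nprocs
            if donor == rank:          # never request work from yourself
                return next_donor()
            return donor
        if scheme == "rp":             # Random Polling
            donor = rank
            while donor == rank:
                donor = rng.randrange(nprocs)
            return donor

    return next_donor
```

The practical trade-off: Global Round Robin spreads requests evenly but the shared counter becomes a bottleneck, while Random Polling gets comparable balance with no shared state.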
Speedups of DFS
Best-First Search
A heuristic is used to direct the search. Maintains two lists:
- Open: unexpanded nodes, sorted by heuristic value
- Closed: already-expanded nodes
The memory requirement is linear in the size of the search space explored.
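The sequential algorithm can be sketched with a heap as the open list. The function signature is an assumption; only the open/closed structure comes from the slide.

```python
import heapq

def best_first_search(start, h, expand, goal_test):
    """Best-first search sketch.

    OPEN is a min-heap of (heuristic value, node) pairs;
    CLOSED is the set of nodes already expanded."""
    open_list = [(h(start), start)]
    closed = set()
    while open_list:
        _, node = heapq.heappop(open_list)   # most promising open node
        if goal_test(node):
            return node
        if node in closed:
            continue                         # duplicate left on the heap
        closed.add(node)
        for succ in expand(node):
            if succ not in closed:
                heapq.heappush(open_list, (h(succ), succ))
    return None
```

Both lists grow with the number of nodes touched, which is the linear memory requirement noted above, and the single shared open list is exactly what the centralized parallel strategy contends over.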
Parallel Best-First Search: Centralized Strategy
- Concurrent processors pick the most promising nodes from a single shared open list.
- Newly generated nodes are placed back on the open list.
[Figure: the centralized strategy. A global open list is maintained at a designated processor. Each processor repeatedly locks the list, picks the best node from the list, places its newly generated nodes in the list, unlocks the list, and then expands the node to generate successors.]
Centralized Best-First Search: Problems
- Termination condition: a processor may find a solution that is not the best solution, so the termination criterion must be modified (how? e.g., keep searching until no open node looks better than the best solution found so far).
- Centralization leads to congestion: the open list must be locked whenever it is accessed.
- Extra work: processors may expand nodes that a sequential search would never touch.
Decentralized Best-First Search
Let each processor maintain its own open list. Issues:
- Load balancing
- Termination (making sure the solution found is the best)
Communication Strategies
- Random: periodically send some of the best nodes to a random processor.
- Ring: periodically exchange best nodes with neighbors.
- Blackboard: select the best node from the local open list and compare its l-value with the blackboard's best. If the l-value is OK, expand the node; if it is BAD, get some nodes from the blackboard; if it is GREAT, give some nodes to the blackboard.
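The blackboard rule might be sketched as below. The `tolerance` and `batch` parameters are invented knobs standing in for whatever thresholds define "OK", "BAD", and "GREAT"; both lists are modeled as heaps of (l-value, node) pairs.

```python
import heapq

def blackboard_exchange(local_open, blackboard, tolerance=1.0, batch=2):
    """Blackboard communication sketch (parameters are assumptions)."""
    if not local_open or not blackboard:
        return
    local_best = local_open[0][0]
    board_best = blackboard[0][0]
    if local_best > board_best + tolerance:
        # Local best is BAD: take some better nodes from the blackboard.
        for _ in range(min(batch, len(blackboard))):
            heapq.heappush(local_open, heapq.heappop(blackboard))
    elif local_best < board_best - tolerance:
        # Local best is GREAT: give some nodes to the blackboard.
        for _ in range(min(batch, len(local_open))):
            heapq.heappush(blackboard, heapq.heappop(local_open))
    # Otherwise the local best is OK: expand it locally (not shown).
```

The tolerance band keeps every processor working on nodes of roughly comparable promise without funneling every expansion through one shared list.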
Ring Communication
Blackboard
What about searching a graph?
Problem: node replication (the same node can be reached along different paths).
Possible solution: assign each node to a processor using a hash function. Whenever a node is generated, check with its assigned processor whether it has already been searched. This check is costly in communication.
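The hashing idea can be sketched in a few lines; the names here are illustrative. The key property is that every processor computes the same owner for the same node, so all duplicate checks for a node are routed to one place.

```python
def owner(node, nprocs):
    # Hash-based assignment: the same node always maps to the same
    # processor, no matter who generated it.
    return hash(node) % nprocs

class DuplicateTable:
    """Per-processor table of nodes already seen (sketch)."""
    def __init__(self):
        self.seen = set()

    def check_and_add(self, node):
        # Returns True if the node was already searched; otherwise
        # records it and returns False.
        if node in self.seen:
            return True
        self.seen.add(node)
        return False
```

The cost is that every generated node triggers a message to its owner before it can be safely expanded, which is why the slide calls the solution costly.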
Speedup Anomalies
Because the amount of work done depends on the order in which nodes happen to be explored, speedup can vary greatly from one execution to the next. Two anomaly types:
- Acceleration: the parallel search does less total work and achieves superlinear speedup.
- Deceleration: the parallel search does more total work and achieves sublinear speedup.
Termination Detection
- Dijkstra's token termination detection: when idle, pass a token to the next processor in a ring; when the initiating processor receives the token back after a full round of idle processors, all are done.
- Tree-based termination detection: associate a weight of 1 with the initial work load; when work is sent out, a portion of the weight goes with it; when a processor finishes, it gives its weight portion back; when processor 0 again holds a weight of 1, all work is done.
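The tree-based (weight) scheme can be sketched as below. Exact fractions are used so weights sum back to exactly 1; the class and method names, and the choice to give away half the weight per donation, are assumptions. A real implementation must also guard against weights becoming too small to split, which fractions sidestep here.

```python
from fractions import Fraction

class WeightTermination:
    """Tree-based termination detection sketch, seen from processor 0."""

    def __init__(self):
        self.weight = Fraction(1)      # the initial work load carries weight 1

    def send_work(self, n_pieces):
        # Give away half the current weight, split among the pieces sent.
        share = self.weight / 2
        self.weight -= share
        return [share / n_pieces] * n_pieces

    def receive_back(self, w):
        # A finished processor returns its weight portion.
        self.weight += w
        return self.weight == 1        # back to weight 1 => all done
```

Because weight is conserved (never created or destroyed, only divided and returned), processor 0 holding weight 1 proves that no work remains anywhere.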