CALTECH CS137 Winter2006 -- DeHon 1 CS137: Electronic Design Automation Day 8: January 27, 2006 Cellular Placement.

1 CS137: Electronic Design Automation, Day 8: January 27, 2006, Cellular Placement

2 Today
Problem
Parallelism
Cellular Automata
Idea
Details
  –Avoid Local Minima
  –Update locations
Results
Directions
Primary Sources
  –Wrighton & DeHon, FPGA 2003
  –Wrighton, MS Thesis, 2003

3 Placement
Problem: Pick locations for all building blocks
  –minimizing energy, delay, area
  –really: minimize wire length, minimize channel density
  –surrogates: minimize squared wire length, minimize bounding box
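
The two surrogate costs named on this slide can be sketched as follows. This is a minimal illustration, not the lecture's code: the function names are hypothetical, and taking the bounding box cost to be the half-perimeter of the box is an assumption (the slide only says "bounding box").

```python
def squared_wire_length(src, sinks):
    """Sum of squared Euclidean distances from a source to each of its sinks."""
    sx, sy = src
    return sum((sx - x) ** 2 + (sy - y) ** 2 for x, y in sinks)


def bounding_box_half_perimeter(points):
    """Half-perimeter of the smallest axis-aligned box containing all points.

    Assumption: half-perimeter is used as the bounding box measure.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


# Hypothetical three-pin net: the first point is the source.
net = [(0, 0), (3, 1), (1, 4)]
sq = squared_wire_length(net[0], net[1:])
bb = bounding_box_half_perimeter(net)
```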

4 Parallelism
What parallelism exists in placement?
  –Evaluate costs of prospective moves
     One set to many prospective locations
     Many moves each to single location
  –Perform moves

5 Cellular Automata
Basic idea: regular array of identical cells with nearest-neighbor communication

6 CA Model
On each cycle:
  –Each cell exchanges values with neighbors
  –Updates state/value based on own state and that of neighbors
  –E.g. Conway’s LIFE

7 Cellular Automata
Physical Advantage:
  –No long wires
Area linear in number of nodes
Minimum delay → small cycle time
Good scaling properties

8 System Architecture Taxonomy (Subject to continuing refinement and embellishment)

9 CA Placement
Can we perform placement in a CA?

10 Mapping
Each cell is a physical placement location
State is a logical node assigned to the cell
Assume:
  –Cell knows own location
  –State knows location of connected nodes

11 Costs
Assume:
  –Cell knows own location
  –State knows location of connected nodes
Cell computes: its cost at that location

12 Moves
Two adjacent cells can exchange graph nodes

13 Moves
Evaluate goodness of proposed swap
  –Each cell considers impact of its graph node being in the other cell
  –Keep if swap reduces cost

14 Move Costs
Only really need to evaluate delta cost
Cost: (src.x - sink.x)^2
Moving sink: d/d(sink.x) = -2 (src.x - sink.x)
Delta move cost is linear in distance
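
A worked check of the delta cost for the squared-distance cost, as a minimal sketch. The function names and the `step` parameter are hypothetical. For a finite step the exact change is -2 * step * (src.x - sink.x) + step^2, so for a one-cell move it is linear in the current distance plus a constant.

```python
def sq_cost(src_x, sink_x):
    """Squared one-dimensional wire length between a source and a sink."""
    return (src_x - sink_x) ** 2


def delta_cost(src_x, sink_x, step):
    """Exact change in squared cost when the sink moves by `step` cells."""
    distance = src_x - sink_x
    return -2 * step * distance + step ** 2


# Moving the sink one cell toward a source that is 6 cells away.
change = delta_cost(src_x=10, sink_x=4, step=1)
```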

15 Parallel Swaps
Pair up and perform N/2 swaps in parallel

16 Movement
Alternate pairings with N, S, E, W neighbors → nodes can move in any direction

17 Basic Idea
Pair up PEs
Compute impact of swaps in parallel
Perform swaps in parallel
Repeat until converged
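
The loop on this slide can be sketched as a sequential simulation. Assumptions, none of which come from the slides: the grid is a list of rows holding node ids, nets are given as an edge list, there are four pairing phases (east/west and north/south, each at even and odd offsets), and the total cost is recomputed for every candidate swap for simplicity, whereas the lecture has each cell evaluate only its own delta. The pairs within one phase are disjoint, which is what would let hardware evaluate them concurrently.

```python
def total_cost(grid, nets):
    """Sum of squared wire lengths over all edges; grid[y][x] holds a node id."""
    pos = {grid[y][x]: (x, y)
           for y in range(len(grid)) for x in range(len(grid[0]))}
    return sum((pos[a][0] - pos[b][0]) ** 2 + (pos[a][1] - pos[b][1]) ** 2
               for a, b in nets)


def phase_pairs(width, height, phase):
    """Disjoint neighbor pairs for phase 0..3 (E/W even, E/W odd, N/S even, N/S odd)."""
    offset = phase % 2
    if phase < 2:
        return [((x, y), (x + 1, y))
                for y in range(height) for x in range(offset, width - 1, 2)]
    return [((x, y), (x, y + 1))
            for x in range(width) for y in range(offset, height - 1, 2)]


def greedy_place(grid, nets, max_sweeps=100):
    """Apply greedy neighbor swaps, phase by phase, until no swap improves cost."""
    height, width = len(grid), len(grid[0])
    for _ in range(max_sweeps):
        improved = False
        for phase in range(4):
            for (x1, y1), (x2, y2) in phase_pairs(width, height, phase):
                before = total_cost(grid, nets)
                grid[y1][x1], grid[y2][x2] = grid[y2][x2], grid[y1][x1]
                if total_cost(grid, nets) < before:
                    improved = True
                else:  # undo a swap that does not reduce cost
                    grid[y1][x1], grid[y2][x2] = grid[y2][x2], grid[y1][x1]
        if not improved:
            break
    return grid
```

Because this version only keeps strictly improving swaps, it can stall in a local minimum.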

18 Problems/Details
Greedy swaps → local minima?
How to update locations of neighbors?
  –…they are moving, too

19 Avoid Greedy
Insert randomness in swaps → Simulated Annealing
Shake up system to get out of local minima
Swap if
  –Randomly decide to swap
  –OR beneficial to swap
Change swap thresholds over time
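
A minimal sketch of the swap rule on this slide. The slide does not say how the random decision is made or how the threshold changes, so both are assumptions here: a non-improving swap is taken with a fixed probability, and that probability shrinks geometrically. All names are hypothetical.

```python
import random


def should_swap(delta_cost, random_swap_prob, rng=random):
    """Swap when beneficial, or otherwise with probability `random_swap_prob`."""
    return delta_cost < 0 or rng.random() < random_swap_prob


def threshold_schedule(initial_prob, decay, steps):
    """Assumed schedule: the random-swap probability decays geometrically."""
    return [initial_prob * decay ** t for t in range(steps)]


schedule = threshold_schedule(initial_prob=0.5, decay=0.5, steps=3)
```

With the probability at zero the rule reduces to the greedy case, which is where the schedule ends up as it decays.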

20 Swap?

21 Impact of Randomness

22 Range Limiting
Eguro, Hauck, & Sharma, DAC 2005

23 Local Swaps Only
Assume there’s an ideal location
Each node takes a biased random walk away from its minimum cost location
Gives node a distribution function around the minimum cost location
If it wanders into a better “minimum cost” home, it then wanders around the new centerpoint
Decreasing temperature restricts effective radius of walk

24 Local Swap Random Walk
Decreasing temperature restricts effective radius of walk

25 How to update locations?
Broadcast? Pipelined Ring? Send to neighbors?
  –Routing network? Tree?
For whom?
  –Everyone? Only things moved? Only things moved a lot?

26 Simple Solution: Ring
Drop value in ring
Shift around entire array
Everyone listens for updates

27 Simple Solution: Ring
Weakness?
  –Serial
  –N cycles to complete
  –N/2 swaps in O(1)
  –Then O(N) to update?

28 Simple Solution: Ring
Linear update bad
Idea: allow staleness
  –Things move slowly
  –Estimate of position not that bad…
  –…and continued operation will correct…
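
A toy model of the ring and why staleness arises, assuming a single update that advances one PE per cycle. The function name and the representation of a view are hypothetical. Until the update has gone all the way around, some PEs still hold the old position.

```python
def ring_broadcast(num_pes, origin, update, cycles):
    """Shift one update around a ring and return each PE's view after `cycles`.

    views[i] is None while PE i has not yet seen the update, meaning it is
    still working from a stale position estimate.
    """
    views = [None] * num_pes
    for step in range(min(cycles, num_pes)):
        views[(origin + step) % num_pes] = update
    return views


update = ("node7", (2, 3))  # hypothetical (node id, new location) message
partial = ring_broadcast(num_pes=4, origin=1, update=update, cycles=2)
complete = ring_broadcast(num_pes=4, origin=1, update=update, cycles=4)
```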

29 Algorithm

30 Algorithm
Update Locations

31 Algorithm
Try Moves

32 Quality vs. Parameters

33 Iso-Quality
Pick point on Iso-Quality Curve that minimizes time

34 FPGA Implementation
Virtex E (180nm)
10ns cycle (100MHz)
150 cycles for 4-phase swap
  –(~40 cycles/swap)
400 LUTs / Placement Engine
Comparing
  –2.2GHz Intel Xeon (L2 512KB)

35 Results

36 Tuning Quality

37 Scaling
Processor cycles: O(N^(4/3))
  –VPR
Systolic cycles
  –O(N^(1/2)): assume geometric refinement; O(N^(1/2)) update
  –O(N^(5/6)): mesh sort, same number of swaps as VPR (N^(4/3) / N^(1/2))

38 Scaling
Also includes technology scaling

39 Variations
Update Schemes
Cost Functions
Larger bins than PEs

40 Update Scheme: Tree
Build Reduce Tree (H-Tree)
Route to root in O(N^(1/2)) time
Route from root to leaves in O(N^(1/2)) time
Pipeline
Same bandwidth as Ring (1/cycle)
But less staleness (only O(N^(1/2)))

41 Reducing Broadcast (Idea 1)
Don’t update things that haven’t moved (much)
  –…or things that move and move back before broadcast
Keep track of staleness
  –How far moved from last broadcast
Give priority to stalest data
Max staleness wins at each tree stage
  –Break ties with randomness
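
The "max staleness wins at each tree stage" rule can be sketched as a pairwise tournament. This is an illustration only: the function name and the (node id, staleness) pair format are hypothetical, and the input list is assumed to be non-empty.

```python
import random


def pick_stalest(candidates, rng=random):
    """Reduce (node_id, staleness) pairs through a binary tree of comparisons.

    At each stage the staler of two candidates advances; ties are broken at
    random. Staleness is the distance moved since the last broadcast.
    """
    level = list(candidates)
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level) - 1, 2):
            a, b = level[i], level[i + 1]
            if a[1] == b[1]:
                next_level.append(rng.choice((a, b)))
            else:
                next_level.append(a if a[1] > b[1] else b)
        if len(level) % 2:  # the unpaired candidate advances unopposed
            next_level.append(level[-1])
        level = next_level
    return level[0]


winner = pick_stalest([("a", 1), ("b", 5), ("c", 3)])
```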

42 Reducing Broadcast (Idea 2)
Update locally
Don’t need to know if someone far away moved by 1 square
…but need to know if near neighbor did
Multigrid/multiscale scheme
  –Only alert nodes in same subtree
  –When change subtrees at a level, alert all nodes underneath

43 Update Scheme: Mesh Route
Can route a permutation in O(N^(1/2)) time on a mesh
Build mesh switching
Make O(N) swaps
Then take O(N^(1/2)) time moving/updating
Becomes full simulated annealing
  –i.e. not just local swaps

44 Cost Functions

45 Cost Functions
Bounding Box → 2 phase update
  –Phase 1: alert source to location of all sinks
  –Phase 2: source communicates bbox extents to all sinks

46 Timing
Linear Update:
  –Topological ordering of netlist
  –Use tree to distribute updates
  –Send updates in netlist order
  –→ get delay in one pass
Mesh:
  –Compute directly with dataflow-style spreading activation
     Wait for all inputs; then send output
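
Why topological order yields the delay in one pass can be sketched as below. It assumes a longest-path arrival time model with a hypothetical `delay` table keyed by edge; neither is given on the slide. Every node is visited only after all of its inputs, so one sweep is enough.

```python
from graphlib import TopologicalSorter


def arrival_times(fanin, delay):
    """Longest-path arrival time per node, visiting nodes in topological order.

    fanin[n] lists the nodes driving n; delay[(m, n)] is the assumed delay on
    edge m -> n. Nodes with no inputs arrive at time 0.
    """
    arrival = {}
    for n in TopologicalSorter(fanin).static_order():
        arrival[n] = max(
            (arrival[m] + delay[(m, n)] for m in fanin.get(n, ())),
            default=0,
        )
    return arrival


# Hypothetical four-node netlist: a and b drive c, and c drives d.
fanin = {"c": ["a", "b"], "d": ["c"]}
delay = {("a", "c"): 2, ("b", "c"): 5, ("c", "d"): 1}
times = arrival_times(fanin, delay)
```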

47 Bins

48 Node Bins
Keep more than one graph node per PE
Local swap of one node from each PE node set each step
  –One with largest benefit?
  –Randomly select based on cost/benefit?
Like rejectionless annealing
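
The two selection options on this slide can be sketched as follows. The slide leaves the random rule open, so the Boltzmann-style weight exp(-delta / temperature) is an assumption, as are the function names and the `deltas` mapping from node id to cost change.

```python
import math
import random


def pick_largest_benefit(deltas):
    """Deterministic option: the node whose move lowers cost the most."""
    return min(deltas, key=deltas.get)


def pick_weighted(deltas, temperature, rng=random):
    """Random option: choose a node with weight exp(-delta / temperature).

    Assumption: this weighting is one plausible reading of "randomly select
    based on cost/benefit"; the slide does not specify it.
    """
    nodes = list(deltas)
    weights = [math.exp(-deltas[n] / temperature) for n in nodes]
    return rng.choices(nodes, weights=weights, k=1)[0]


# Hypothetical bin of three nodes with their cost changes if moved.
bin_deltas = {"a": 3.0, "b": -4.0, "c": 0.0}
best = pick_largest_benefit(bin_deltas)
sampled = pick_weighted(bin_deltas, temperature=2.0, rng=random.Random(0))
```

One node is always selected from the bin, which is the sense in which the scheme resembles rejectionless annealing.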

49 Admin
Parallel Prefix familiarity?
Due today: literature review
There is class on Monday

