CS294-6 Reconfigurable Computing
Day 13, October 6, 1998
Interconnect Partitioning
Previously
Established
–cannot afford fully general interconnect
–must exploit locality in network
Quantified
–locality (geometric growth / Rent's Rule)
Today
Automation to extract/exploit locality
–Partitioning heuristics
  FM
  Spectral
Partitioning
–Hierarchical placement
–Often used to initialize placement
–NP-hard in general
–Fast heuristics exist
–Don't address critical path delay
Partitioning Problem
Given: netlist of interconnected cells
Partition into two (roughly) equal halves (A, B)
Minimize the number of nets shared by the halves
"Roughly equal"
–balance condition: (0.5 − δ)N ≤ |A| ≤ (0.5 + δ)N, for some slack δ
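The problem statement translates directly into code. This is a minimal sketch; the cell/net/side representations are my own illustration, not from the lecture:

```python
def cut_size(nets, side):
    """Number of nets with cells on both sides of the partition.
    nets: list of lists of cell ids; side: dict cell -> 'A' or 'B'."""
    return sum(1 for net in nets if len({side[c] for c in net}) > 1)

def balanced(side, d=0.25):
    """Balance condition from the slide: (0.5 - d)N <= |A| <= (0.5 + d)N."""
    n = len(side)
    a = sum(1 for s in side.values() if s == 'A')
    return (0.5 - d) * n <= a <= (0.5 + d) * n

# Tiny netlist: net [0, 1] stays inside A; nets [1, 2] and [0, 3] cross the cut.
nets = [[0, 1], [1, 2], [0, 3]]
side = {0: 'A', 1: 'A', 2: 'B', 3: 'B'}
```

For this example the partition is balanced and `cut_size(nets, side)` is 2, since two of the three nets straddle the A/B boundary.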
Fiduccia-Mattheyses (Kernighan-Lin refinement)
Randomly partition into two halves
Repeat until no updates:
–Start with all cells free
–Repeat until no cells free:
  Move the free cell with largest gain
  Update gains of its neighbors
  Lock cell in place (record current cost)
–Pick the least-cost point in the previous sequence and use it as the next starting position
Repeat for different random starting points
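The inner loop above can be sketched as one FM "pass". This is a hedged sketch: the helper names `gain`, `apply_move`, and `cut` are placeholders I introduce (a real implementation updates gains incrementally, and enforces the balance condition, which is omitted here for brevity):

```python
def fm_pass(cells, gain, apply_move, cut):
    """One FM pass: repeatedly move the free cell with the largest gain,
    lock it, record the resulting cut size, then roll back to the
    lowest-cut prefix of the move sequence."""
    free = set(cells)
    history = []                        # (cut size after move, cell moved)
    while free:
        best = max(free, key=gain)      # free cell with largest gain
        apply_move(best)
        free.discard(best)              # lock the cell for this pass
        history.append((cut(), best))
    best_i = min(range(len(history)), key=lambda i: history[i][0])
    for _, cell in reversed(history[best_i + 1:]):
        apply_move(cell)                # undo moves past the best point
    return history[best_i][0]
```

Because every cell is moved exactly once per pass and the best prefix is kept, a pass can escape shallow local minima that a pure greedy move sequence would get stuck in.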
FM Cell Gains
Gain = decrease in the number of nets crossing between the partitions if the cell were moved to the other side
FM Recompute Cell Gain
For each net, keep track of the number of cells in each partition: F(net) on the "from" side, T(net) on the "to" side of the move.
Move update (for each net on the moved cell):
–if T(net)==0, increment the gain of each free cell on the F side of the net (that net's contribution goes -1 => 0)
–if T(net)==1, decrement the gain of the lone cell on the T side (its contribution goes +1 => 0)
–decrement F(net), increment T(net)
–if F(net)==1, increment the gain of the remaining F cell (it can now uncut the net)
–if F(net)==0, decrement the gain of all cells on the net (now all on T)
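These update rules can be written out as a short sketch. The data-structure names (`count`, `nets_of`, `gain`, `free`) are my own, not from the slides; nets are tuples of cell ids used as dictionary keys:

```python
def move_cell(c, side, nets_of, count, gain, free):
    """Apply the gain-update rules above when cell c moves.
    side[c] in {0, 1}; count[net] = [cells on side 0, cells on side 1];
    gain[x] = cut reduction if x were moved; free = unlocked cells."""
    F, T = side[c], 1 - side[c]
    free.discard(c)                       # lock the moved cell first
    for net in nets_of[c]:
        if count[net][T] == 0:            # net becomes cut by this move
            for x in net:
                if x in free:
                    gain[x] += 1
        elif count[net][T] == 1:          # lone T cell no longer uncuts it
            for x in net:
                if side[x] == T and x in free:
                    gain[x] -= 1
        count[net][F] -= 1                # decrement F(net), increment T(net)
        count[net][T] += 1
        if count[net][F] == 0:            # net is now uncut
            for x in net:
                if x in free:
                    gain[x] -= 1
        elif count[net][F] == 1:          # remaining F cell can uncut it
            for x in net:
                if side[x] == F and x in free:
                    gain[x] += 1
    side[c] = T
```

Each net on the moved cell is touched once, so the update cost is proportional to the cell's fanout rather than to N.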
FM Recompute (example)
FM Recompute (example)
FM Data Structures
Partition counts A, B
Two gain arrays (one per side)
–Key: constant-time cell update
Cells
–successors (consumers)
–inputs
–locked status
FM Optimization Sequence (ex)
FM Running Time?
Randomly partition into two halves
Repeat until no updates:
–Start with all cells free
–Repeat until no cells free:
  Move the free cell with largest gain
  Update gains of its neighbors
  Lock cell in place (record current cost)
–Pick the least-cost point in the previous sequence and use it as the next starting position
Repeat for different random starting points
FM Running Time
Claim: small number of passes (constant?) to converge
Small (constant?) number of random starts
N cell moves per pass
Each move does K + fanout work updating gains (avg. fanout K)
–assume K-LUTs
Maintain ordered gain list: O(1) per move
–each input/output gain moves up or down by 1
Running time: O(KN)
FM Starts? 21K random starts, 3K network -- Alpert/Kahng
Tool Admin "breather"
–ppw
–autoretime
–limitdepth
–gbufplace
–ar (sliding)
–my_adder
–Makefile
–ex
Spectral Partitioning
Minimize squared wire length -- 1D layout
Start with connection array C (entries c_ij)
"Placement" vector X, with x_i the placement of cell i
cost = 0.5 * sum over all i,j of c_ij * (x_i - x_j)^2
cost sum is X'BX
–B = D - C
–D = diagonal matrix, d_ii = sum over j of c_ij
Spectral Partition
Constraint: X'X = 1
–prevents the trivial solution with all x_i the same
Minimize cost = X'BX subject to the constraint
–minimize L = X'BX - λ(X'X - 1)
–dL/dX = 2BX - 2λX = 0
–(B - λI)X = 0
–X => eigenvector of B
–cost is the corresponding eigenvalue λ
Spectral Partitioning
X (the x_i's) is continuous
Use it to order the nodes
Cut a partition from the order
Spectral Ordering
Midpoint bisection isn't necessarily the best place to cut; consider a graph built from cliques K_(n/4) and K_(n/2)
Spectral Partitioning Options
Can bisect by choosing the midpoint
Can relax the cut criteria
–min cut within some ε of balance
Ratio cut
–minimize cut/(|A|*|B|)
–idea: trade off imbalance for a smaller cut
–more imbalance => smaller |A|*|B|
–so the cut must be much smaller to accept
Spectral vs. FM From Hauck/Boriello ‘96
Improving Spectral
More eigenvalues
–look at clusters in n-dimensional space
–5--70% improvement over EIG1
Spectral Theory
There are conditions under which spectral is optimal
–Boppana
Provides lower/upper bounds on cut size
Improving FM
–clustering
–technology mapping
–initial partitions
–number of runs
–partition size freedom
–replication
Following comparisons from Hauck and Boriello '96
Clustering
Group several leaf cells together into a cluster
Run partitioning on the clusters
Uncluster (keep the partitions)
–iteratively
Run partitioning again
–using the prior result as the starting point
Clustering Benefits
Catches local connectivity which FM might miss
–moving one element at a time, it is hard to see that moving a whole connected group across the partition would help
Faster (smaller N)
–METIS -- among the fastest research partitioners -- exploits this heavily
–...works for spectral, too
FM works better with larger nodes (???)
How to Cluster?
Random
–cheap, some benefit for speed
Greedy "connectivity"
–examine cells in random order
–cluster each with its most highly connected neighbor
–30% better cut, 16% faster than random
Spectral
–look for clusters in the placement
–(ratio-cut like)
Brute-force connectivity (can be O(N^2))
LUT Mapped? Better to partition before LUT mapping.
Initial Partitions?
Random
Pick a random node for one side
–start imbalanced
–run FM from there
Pick a random node and breadth-first search to fill one half
Pick a random node and depth-first search to fill one half
Start with a spectral partition
Initial Partitions
If run several times
–pure random tends to win out
  more freedom / variety of starts
  more variation from run to run
–the others get trapped in local minima
Number of Runs
2 runs -- 10%
10 runs -- 18%
20 runs -- <20% (2% better than 10)
50 runs -- (4% better than 10)
...but?
FM Starts? 21K random starts, 3K network -- Alpert/Kahng
Replication
Trade some additional logic area for smaller cut size
Replication data from: Enos, Hauck, Sarrafzadeh '97
Replication
–5% replication => 38% cut size reduction
–50% replication => 50+% cut size reduction
Partitioning Summary
Two effective heuristics
–Spectral
–FM
Many ways to tweak
–Hauck/Boriello: half the cut size of vanilla FM
–even better with replication
Only addresses cut size, not critical path delay
Next Lecture
LIVE (taping): Wednesday 4pm (here)
Playback: Thursday, class time/place
Programmable Computing Elements
–Computing with Memories