CALTECH CS137 Winter2004 -- DeHon CS137: Electronic Design Automation Day 9: February 9, 2004 Partitioning (Intro, KLFM)

Slides:



Advertisements
Similar presentations
CALTECH CS137 Winter DeHon CS137: Electronic Design Automation Day 14: March 3, 2004 Scheduling Heuristics and Approximation.
Advertisements

Quick Sort, Shell Sort, Counting Sort, Radix Sort AND Bucket Sort
25 May Quick Sort (11.2) CSE 2011 Winter 2011.
Balancing Interconnect and Computation in a Reconfigurable Array Dr. André DeHon BRASS Project University of California at Berkeley Why you don’t really.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 11: March 4, 2008 Placement (Intro, Constructive)
Penn ESE525 Spring DeHon 1 ESE535: Electronic Design Automation Day 10: February 27, 2008 Partitioning 2 (spectral, network flow, replication)
VLSI Layout Algorithms CSE 6404 A 46 B 65 C 11 D 56 E 23 F 8 H 37 G 19 I 12J 14 K 27 X=(AB*CD)+ (A+D)+(A(B+C)) Y = (A(B+C)+AC+ D+A(BC+D)) Dr. Md. Saidur.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 9: February 20, 2008 Partitioning (Intro, KLFM)
Penn ESE Spring DeHon 1 ESE (ESE534): Computer Organization Day 15: March 12, 2007 Interconnect 3: Richness.
CS294-6 Reconfigurable Computing Day 8 September 17, 1998 Interconnect Requirements.
VLSI Layout Algorithms CSE 6404 A 46 B 65 C 11 D 56 E 23 F 8 H 37 G 19 I 12J 14 K 27 X=(AB*CD)+ (A+D)+(A(B+C)) Y = (A(B+C)+AC+ D+A(BC+D)) Dr. Md. Saidur.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 2: January 23, 2008 Covering.
EDA (CS286.5b) Day 5 Partitioning: Intro + KLFM. Today Partitioning –why important –practical attack –variations and issues.
DeHon March 2001 Rent’s Rule Based Switching Requirements Prof. André DeHon California Institute of Technology.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 21: April 15, 2009 Routing 1.
EDA (CS286.5b) Day 6 Partitioning: Spectral + MinCut.
CS294-6 Reconfigurable Computing Day 13 October 6, 1998 Interconnect Partitioning.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 19: April 9, 2008 Routing 1.
EDA (CS286.5b) Day 11 Scheduling (List, Force, Approximation) N.B. no class Thursday (FPGA) …
EDA (CS286.5b) Day 3 Clustering (LUT Map and Delay) N.B. no lecture Thursday.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 16: March 23, 2009 Covering.
CS294-6 Reconfigurable Computing Day 15 October 13, 1998 LUT Mapping.
EDA (CS286.5b) Day 7 Placement (Simulated Annealing) Assignment #1 due Friday.
Lecture 9: Multi-FPGA System Software October 3, 2013 ECE 636 Reconfigurable Computing Lecture 9 Multi-FPGA System Software.
Penn ESE525 Spring DeHon 1 ESE535: Electronic Design Automation Day 10: February 18, 2009 Partitioning 2 (spectral, network flow)
EDA (CS286.5b) Day 19 Covering and Retiming. “Final” Like Assignment #1 –longer –more breadth –focus since assignment #2 –…but ideas are cummulative –open.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 3: January 27, 2008 Clustering (LUT Mapping, Delay) Please work preclass example.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 17: March 30, 2009 Clustering (LUT Mapping, Delay)
Penn ESE Spring DeHon 1 ESE (ESE534): Computer Organization Day 14: February 28, 2007 Interconnect 2: Wiring Requirements and Implications.
Partitioning Outline –What is Partitioning –Partitioning Example –Partitioning Theory –Partitioning Algorithms Goal –Understand partitioning problem –Understand.
Register-Transfer (RT) Synthesis Greg Stitt ECE Department University of Florida.
Domain decomposition in parallel computing Ashok Srinivasan Florida State University COT 5410 – Spring 2004.
Caltech CS184 Winter DeHon 1 CS184a: Computer Architecture (Structure and Organization) Day 13: February 4, 2005 Interconnect 1: Requirements.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 4: January 28, 2015 Partitioning (Intro, KLFM)
CSE 242A Integrated Circuit Layout Automation Lecture: Partitioning Winter 2009 Chung-Kuan Cheng.
Caltech CS184 Winter DeHon 1 CS184a: Computer Architecture (Structure and Organization) Day 15: February 12, 2003 Interconnect 5: Meshes.
CALTECH CS137 Winter DeHon CS137: Electronic Design Automation Day 12: February 13, 2002 Scheduling Heuristics and Approximation.
Algorithms  Al-Khwarizmi, arab mathematician, 8 th century  Wrote a book: al-kitab… from which the word Algebra comes  Oldest algorithm: Euclidian algorithm.
Caltech CS184a Fall DeHon1 CS184a: Computer Architecture (Structures and Organization) Day11: October 30, 2000 Interconnect Requirements.
10/25/ VLSI Physical Design Automation Prof. David Pan Office: ACES Lecture 3. Circuit Partitioning.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 24: April 18, 2011 Covering and Retiming.
CALTECH CS137 Winter DeHon CS137: Electronic Design Automation Day 6: January 23, 2002 Partitioning (Intro, KLFM)
Penn ESE525 Spring DeHon 1 ESE535: Electronic Design Automation Day 6: February 4, 2014 Partitioning 2 (spectral, network flow)
CALTECH CS137 Winter DeHon CS137: Electronic Design Automation Day 10: February 6, 2002 Placement (Simulated Annealing…)
Chapter 18: Searching and Sorting Algorithms. Objectives In this chapter, you will: Learn the various search algorithms Implement sequential and binary.
Caltech CS184 Winter DeHon 1 CS184a: Computer Architecture (Structure and Organization) Day 14: February 10, 2003 Interconnect 4: Switching.
CALTECH CS137 Winter DeHon CS137: Electronic Design Automation Day 13: February 20, 2002 Routing 1.
Penn ESE534 Spring DeHon 1 ESE534: Computer Organization Day 18: April 2, 2014 Interconnect 4: Switching.
CALTECH CS137 Winter DeHon CS137: Electronic Design Automation Day 3: January 12, 2004 Clustering (LUT Mapping, Delay)
Caltech CS184 Winter DeHon 1 CS184a: Computer Architecture (Structure and Organization) Day 16: February 14, 2005 Interconnect 4: Switching.
Circuit Partitioning Divides circuit into smaller partitions that can be efficiently handled Goal is generally to minimize communication between balanced.
CALTECH CS137 Spring DeHon 1 CS137: Electronic Design Automation Day 5: April 12, 2004 Covering and Retiming.
CALTECH CS137 Winter DeHon CS137: Electronic Design Automation Day 10: February 11, 2002 Partitioning 2 (spectral, network flow, replication)
Caltech CS184 Winter DeHon 1 CS184a: Computer Architecture (Structure and Organization) Day 13: February 6, 2003 Interconnect 3: Richness.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 6: January 30, 2013 Partitioning (Intro, KLFM)
Caltech CS184 Winter DeHon 1 CS184a: Computer Architecture (Structure and Organization) Day 12: February 5, 2003 Interconnect 2: Wiring Requirements.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 11: February 25, 2015 Placement (Intro, Constructive)
CALTECH CS137 Fall DeHon 1 CS137: Electronic Design Automation Day 2: September 28, 2005 Covering.
Penn ESE534 Spring DeHon 1 ESE534: Computer Organization Day 17: March 29, 2010 Interconnect 2: Wiring Requirements and Implications.
CALTECH CS137 Winter DeHon 1 CS137: Electronic Design Automation Day 8: January 27, 2006 Cellular Placement.
CALTECH CS137 Fall DeHon 1 CS137: Electronic Design Automation Day 21: November 28, 2005 Routing 1.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 25: April 17, 2013 Covering and Retiming.
CS137: Electronic Design Automation
CS137: Electronic Design Automation
CS137: Electronic Design Automation
ESE534: Computer Organization
CS184a: Computer Architecture (Structures and Organization)
ESE535: Electronic Design Automation
ESE535: Electronic Design Automation
CS137: Electronic Design Automation
Presentation transcript:

CALTECH CS137 Winter DeHon CS137: Electronic Design Automation Day 9: February 9, 2004 Partitioning (Intro, KLFM)

CALTECH CS137 Winter DeHon Today Partitioning –why important –practical attack –variations and issues

CALTECH CS137 Winter DeHon Motivation (1) Divide-and-conquer –trivial case: decomposition –smaller problems easier to solve net win, if super linear Part(n) + 2  T(n/2) < T(n) –problems with sparse connections or interactions –Exploit structure limited cutsize is a common structural property random graphs would not have as small cuts

CALTECH CS137 Winter DeHon Motivation (2) Cut size (bandwidth) can determine area Minimizing cuts –minimize interconnect requirements –increases signal locality Chip (board) partitioning –minimize IO Direct basis for placement

CALTECH CS137 Winter DeHon Bisection Bandwidth Partition design into two equal size halves Minimize wires (nets) with ends in both halves Number of wires crossing is bisection bandwidth lower bw = more locality N/2 cutsize

CALTECH CS137 Winter DeHon Interconnect Area Bisection is lower- bound on IC width –Apply wire dominated (recursively)

CALTECH CS137 Winter DeHon Classic Partitioning Problem Given: netlist of interconnect cells Partition into two (roughly) equal halves (A,B) minimize the number of nets shared by halves “Roughly Equal” –balance condition: (0.5-  )N  |A|  (0.5+  )N

CALTECH CS137 Winter DeHon Balanced Partitioning NP-complete for general graphs –[ND17: Minimum Cut into Bounded Sets, Garey and Johnson] –Reduce SIMPLE MAX CUT –Reduce MAXIMUM 2-SAT to SMC –Unbalanced partitioning poly time Many heuristics/attacks

CALTECH CS137 Winter DeHon KL FM Partitioning Heuristic Greedy, iterative –pick cell that decreases cut and move it –repeat small amount of non-greediness: –look past moves that make locally worse –randomization

CALTECH CS137 Winter DeHon Fiduccia-Mattheyses (Kernighan-Lin refinement) Start with two halves (random split?) Repeat until no updates –Start with all cells free –Repeat until no cells free Move cell with largest gain (balance allows) Update costs of neighbors Lock cell in place (record current cost) –Pick least cost point in previous sequence and use as next starting position Repeat for different random starting points

CALTECH CS137 Winter DeHon Efficiency Tricks to make efficient: Pick move candidate with little work Update costs on move cheaply Efficient data structure –update costs cheap –cheap to find next move

CALTECH CS137 Winter DeHon Ordering and Cheap Update Keep track of Net gain on node == delta net crossings to move a node  cut cost after move = cost - gain Calculate node gain as  net gains for all nets at that node Sort by gain

CALTECH CS137 Winter DeHon FM Cell Gains Gain = Delta in number of nets crossing between partitions

CALTECH CS137 Winter DeHon FM Cell Gains Gain = Delta in number of nets crossing between partitions

CALTECH CS137 Winter DeHon After move node? Update cost each –Newcost=cost-gain Also need to update gains –on all nets attached to moved node –but moves are nodes, so push to all nodes affected by those nets

CALTECH CS137 Winter DeHon FM Recompute Cell Gain For each net, keep track of number of cells in each partition [F(net), T(net)] Move update:(for each net on moved cell) –if T(net)==0, increment gain on F side of net (think -1 => 0)

CALTECH CS137 Winter DeHon FM Recompute Cell Gain For each net, keep track of number of cells in each partition [F(net), T(net)] Move update:(for each net on moved cell) –if T(net)==0, increment gain on F side of net (think -1 => 0) –if T(net)==1, decrement gain on T side of net (think 1=>0)

CALTECH CS137 Winter DeHon FM Recompute Cell Gain Move update:(for each net on moved cell) –if T(net)==0, increment gain on F side of net –if T(net)==1, decrement gain on T side of net –decrement F(net), increment T(net)

CALTECH CS137 Winter DeHon FM Recompute Cell Gain Move update:(for each net on moved cell) –if T(net)==0, increment gain on F side of net –if T(net)==1, decrement gain on T side of net –decrement F(net), increment T(net) –if F(net)==1, increment gain on F cell

CALTECH CS137 Winter DeHon FM Recompute Cell Gain Move update:(for each net on moved cell) –if T(net)==0, increment gain on F side of net –if T(net)==1, decrement gain on T side of net –decrement F(net), increment T(net) –if F(net)==1, increment gain on F cell –if F(net)==0, decrement gain on all cells (T)

CALTECH CS137 Winter DeHon FM Recompute Cell Gain For each net, keep track of number of cells in each partition [F(net), T(net)] Move update:(for each net on moved cell) –if T(net)==0, increment gain on F side of net (think -1 => 0) –if T(net)==1, decrement gain on T side of net (think 1=>0) –decrement F(net), increment T(net) –if F(net)==1, increment gain on F cell –if F(net)==0, decrement gain on all cells (T)

CALTECH CS137 Winter DeHon FM Recompute (example) (work through before advancing to next slide)

CALTECH CS137 Winter DeHon FM Recompute (example)

CALTECH CS137 Winter DeHon FM Data Structures Partition Counts A,B Two gain arrays –One per partition –Key: constant time cell update Cells –successors (consumers) –inputs –locked status Binned by cost  constant time update

CALTECH CS137 Winter DeHon FM Optimization Sequence (ex)

CALTECH CS137 Winter DeHon FM Running Time? Randomly partition into two halves Repeat until no updates –Start with all cells free –Repeat until no cells free Move cell with largest gain Update costs of neighbors Lock cell in place (record current cost) –Pick least cost point in previous sequence and use as next starting position Repeat for different random starting points

CALTECH CS137 Winter DeHon FM Running Time Claim: small number of passes (constant?) to converge Small (constant?) number of random starts N cell updates each round (swap) Updates K + fanout work (avg. fanout K) –assume K-LUTs Maintain ordered list O(1) per move –every io move up/down by 1 Running time: O(KN) –Algorithm significant for its speed (more than quality)

CALTECH CS137 Winter DeHon FM Starts? 21K random starts, 3K network -- Alpert/Kahng So, FM gives a not bad solution quickly

CALTECH CS137 Winter DeHon Weaknesses? Local, incremental moves only –hard to move clusters –no lookahead –[see example] Looks only at local structure

CALTECH CS137 Winter DeHon Improving FM Clustering Technology mapping Initial partitions Runs Partition size freedom Replication Following comparisons from Hauck and Boriello ‘96

CALTECH CS137 Winter DeHon Clustering Group together several leaf cells into cluster Run partition on clusters Uncluster (keep partitions) –iteratively Run partition again –using prior result as starting point instead of random start

CALTECH CS137 Winter DeHon Clustering Benefits Catch local connectivity which FM might miss –moving one element at a time, hard to see move whole connected groups across partition Faster (smaller N) –METIS -- fastest research partitioner exploits heavily –FM work better w/ larger nodes (???)

CALTECH CS137 Winter DeHon How Cluster? Random –cheap, some benefits for speed Greedy “connectivity” –examine in random order –cluster to most highly connected –30% better cut, 16% faster than random Spectral (next time) –look for clusters in placement –(ratio-cut like) Brute-force connectivity (can be O(N 2 ))

CALTECH CS137 Winter DeHon LUT Mapped? Better to partition before LUT mapping.

CALTECH CS137 Winter DeHon Initial Partitions? Random Pick Random node for one side –start imbalanced –run FM from there Pick random node and Breadth-first search to fill one half Pick random node and Depth-first search to fill half Start with Spectral partition

CALTECH CS137 Winter DeHon Initial Partitions If run several times –pure random tends to win out –more freedom / variety of starts –more variation from run to run –others trapped in local minima

CALTECH CS137 Winter DeHon Number of Runs

CALTECH CS137 Winter DeHon Number of Runs % % 20 <20% (2% better than 10) 50 (4% better than 10) …but?

CALTECH CS137 Winter DeHon FM Starts? 21K random starts, 3K network -- Alpert/Kahng

CALTECH CS137 Winter DeHon Unbalanced Cuts Increasing slack in partitions –may allow lower cut size

CALTECH CS137 Winter DeHon Unbalanced Partitions Following comparisons from Hauck and Boriello ‘96

CALTECH CS137 Winter DeHon Replication Trade some additional logic area for smaller cut size –Net win if wire dominated Replication data from: Enos, Hauck, Sarrafzadeh ‘97

CALTECH CS137 Winter DeHon Replication 5%  38% cut size reduction 50%  50+% cut size reduction

CALTECH CS137 Winter DeHon What Bisection doesn’t tell us Bisection bandwidth purely geometrical No constraint for delay –I.e. a partition may leave critical path weaving between halves

CALTECH CS137 Winter DeHon Critical Path and Bisection Minimum cut may cross critical path multiple times. Minimizing long wires in critical path => increase cut size.

CALTECH CS137 Winter DeHon So... Minimizing bisection –good for area –oblivious to delay/critical path

CALTECH CS137 Winter DeHon Partitioning Summary Decompose problem Find locality NP-complete problem linear heuristic (KLFM) many ways to tweak –Hauck/Boriello, Karypis even better with replication only address cut size, not critical path delay

CALTECH CS137 Winter DeHon Admin Assignment #3, #4 out today –Based on 2-level optimization –…but stuff covering now may be highly relevant to your solutions E.g. is group formation a partitioning problem? Today’s supplemental references linked to reading –(Hauck, Karypis, Alpert…)

CALTECH CS137 Winter DeHon Today’s Big Ideas: Divide-and-Conquer Exploit Structure –Look for sparsity/locality of interaction Techniques: –greedy –incremental improvement –randomness avoid bad cases, local minima –incremental cost updates (time cost) –efficient data structures