
Floorplanning Try the online demos at http://foghorn.cadlab.lafayette.edu/cadapplets/

An Example Floorplan Alpha 21364

Floorplanning Problem. Given circuit modules (or cells) and their connections, determine the approximate location of circuit elements, consistent with a hierarchical / building-block design methodology. Modules (the result of partitioning) have fixed area and are generally rectangular. Fixed aspect ratio → hard macro (aka fixed-shape block), with fixed or floating terminals (pins); rotation might be allowed or denied. Flexible shape → soft macro (aka soft module). Note that the green module in the floorplan figure does not fill its “placeholder”. Module dimensions are labeled (w1,h1), ..., (wN,hN). [Bazargan]

Floorplanning (cont.) Objectives: minimize area; determine the best shape of soft modules; minimize total wire length to make the subsequent routing phase easy (short wire length roughly translates into routability). Additional cost components: wire congestion (exact routability measure), wire delays, power consumption, system throughput (e.g., CPI of a processor). Possible additional constraints: fixed location for some modules; fixed die, or a range of die aspect ratios. [Bazargan]

Floorplanning: Why Important? Early stage of physical design. Determines the location of large blocks → detailed placement becomes easier (divide and conquer!). Gives estimates of area, delay, and power → important design decisions. Impacts subsequent design steps (e.g., routing, heat dissipation analysis and optimization). [Figure: two example floorplans of blocks A-L.] Figs: [©Sherwani] [Bazargan]

Floorplan Classes. Slicing, recursively defined as: a module, OR a floorplan that can be partitioned into two slicing floorplans with a horizontal or vertical cut line. Non-slicing: a superset of slicing floorplans that contains the “wheel” shape too. The non-slicing floorplan shown here is the smallest non-slicing floorplan. [Figure: a slicing floorplan of modules 1-7 with its corresponding slicing tree (root 1234567; internal nodes 167, 2345, 67, 234, 34), and a non-slicing floorplan.] [©Sarrafzadeh] [Bazargan]

Non-slicing Floorplan Example. Hierarchical floorplan of order 5. Templates: L5 and R5. [Figure: the templates, and an example floorplan of modules 1-8 with its tree built from R5 nodes.] [©Sarrafzadeh] [Bazargan]

Floorplanning Algorithms. Components: (1) “Placeholder” representation: usually in the form of a tree. Slicing class: Polish expression [Otten]. Non-slicing class: O-tree, Sequence Pair, BSG, etc. It just defines the relative position of modules. (2) Perturbation: going from one floorplan to another, usually done using Simulated Annealing. (3) Floorplan sizing. Definition: given a floorplan tree, choose the best shape for each module to minimize area. Slicing: polynomial, bottom-up algorithm. Non-slicing: NP-hard; use mathematical programming (exact solution). (4) Cost function: area, wire length, ... [Bazargan]

Bounds on Aspect Ratios We can also allow several shapes for each block: For hard blocks, the orientations can be changed: [Pan]

Area Utilization, Hard and Soft Modules. The hierarchy tree and floorplan define “placeholders” for modules. Area utilization depends on how nicely the rigid modules’ shapes are matched. Soft modules can take different shapes to “fill in” empty slots → floorplan sizing. Assuming all modules are soft with absolute flexibility (i.e., fixed area, any aspect ratio), how can you size a floorplan? [Figure: two floorplans of modules m1-m7 for the same tree, one with area 20x22 = 440 and one with area 20x19 = 380.] [Bazargan]

Bounds on Aspect Ratios. If there is no bound on the aspect ratios, can we pack everything tightly? Sure! But we don’t want to lay out blocks as long strips, so we require r_i ≤ h_i/w_i ≤ s_i for each i. [Pan]
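The aspect-ratio bound can be turned into a list of candidate shapes for a soft block. The sketch below is an illustration only: the function name `soft_block_shapes`, the geometric spacing of ratios, and the sample count `n` are assumptions, not part of the slides.

```python
import math


def soft_block_shapes(area, r, s, n=3):
    """Sample n (width, height) shapes of fixed area with r <= h/w <= s.

    Shapes are returned by increasing width (and so decreasing height).
    """
    if not (0 < r <= s) or n < 1:
        raise ValueError("need 0 < r <= s and n >= 1")
    shapes = []
    for i in range(n):
        # Space the aspect ratios geometrically from s down to r (assumption).
        t = i / (n - 1) if n > 1 else 0.0
        ratio = s * (r / s) ** t
        w = math.sqrt(area / ratio)  # from h = ratio * w and w * h = area
        shapes.append((w, area / w))
    return shapes
```

For a block of area 16 with 1/4 ≤ h/w ≤ 4, three samples give the shapes 2x8, 4x4, and 8x2.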

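Floorplan Sizing for Slicing Floorplans. Bottom-up process. Has to be done per floorplan perturbation. Requires O(n) time, where n is the total number of shapes of all the modules. Vertical cut (V) joining a left shape (a_i, b_i) and a right shape (x_j, y_j): the result is (a_i + x_j, max(b_i, y_j)). Horizontal cut (H) joining a top shape (a_i, b_i) and a bottom shape (x_j, y_j): the result is (max(a_i, x_j), b_i + y_j). [©Sarrafzadeh] [Bazargan]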

Sizing Slicing Floorplans. Simple case: all modules are hard macros; no rotation allowed → one shape only. [Figure: slicing tree with root 1234567 and internal nodes 167, 2345, 234, 67, 34 over modules m1-m7; shapes shown for the modules are 4x7, 5x4, 8x8, 4x8, 3x6, 4x5, 7x5; shapes shown for the internal nodes are 9x15, 8x16, 8x11, 4x11, 9x7; the root is 17x16.] Does the exact order in which we perform the sizing matter? No. As long as it’s a bottom-up order, the exact order doesn’t matter (e.g., [34] can be sized before or after [67]). When doing the bottom-up sizing, can we determine the location of the bottom-left corner of each block? [Bazargan]
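The hard-macro case can be sketched as two tree passes: a bottom-up pass that computes bounding boxes and a top-down pass that assigns bottom-left corners (corners cannot be fixed during the bottom-up pass, because a block's offset depends on its siblings and ancestors). The tree encoding as nested tuples, the function names, and the three module shapes below are assumptions made for illustration; `place` recomputes sizes for brevity rather than caching them.

```python
def size(node, shapes):
    """Bottom-up: return (width, height) of the bounding box of a subtree."""
    if isinstance(node, str):            # leaf: a hard macro with one shape
        return shapes[node]
    cut, first, second = node
    w1, h1 = size(first, shapes)
    w2, h2 = size(second, shapes)
    if cut == "V":                       # vertical cut: slices side by side
        return (w1 + w2, max(h1, h2))
    return (max(w1, w2), h1 + h2)        # "H": slices stacked


def place(node, shapes, x=0, y=0, out=None):
    """Top-down: assign the bottom-left corner of every module."""
    if out is None:
        out = {}
    if isinstance(node, str):
        out[node] = (x, y)
        return out
    cut, first, second = node
    w1, _ = size(first, shapes)
    _, h2 = size(second, shapes)
    if cut == "V":                       # first child is the left slice
        place(first, shapes, x, y, out)
        place(second, shapes, x + w1, y, out)
    else:                                # first child is the top slice
        place(second, shapes, x, y, out)
        place(first, shapes, x, y + h2, out)
    return out


# Hypothetical modules: a is left of a column with b on top of c.
shapes = {"a": (4, 7), "b": (5, 4), "c": (3, 6)}
tree = ("V", "a", ("H", "b", "c"))
```

Here `size(tree, shapes)` is (9, 10), and `place` puts a at (0, 0), c at (4, 0), and b at (4, 6).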

Sizing Slicing Floorplans (cont.) What if modules have more than one shape? If area is the only concern: Module A has shapes 4x6, 7x8, 5x6, 6x4, 7x4; which ones should we pick? Module A has shapes 4x6, 5x5, 6x4; which ones should we pick? Dominant points: shape (x1, y1) dominates (x2, y2) if x1 ≤ x2 and y1 ≤ y2. Can area alone be a factor in eliminating a shape (e.g., 4x6 vs. 5x5)? [Figure: shape points a, b, p, q, r; a dominates p, b dominates r, b dominates q.] [Bazargan]
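A minimal sketch of pruning dominated shapes, assuming the reading that a shape is redundant when another shape is no wider and no taller. The function name `prune_dominated` is hypothetical.

```python
def prune_dominated(shapes):
    """Keep only shapes that no other shape fits inside.

    The result is sorted by increasing width and strictly decreasing height.
    """
    kept = []
    for w, h in sorted(set(shapes)):      # increasing width, then height
        # Every kept shape is no wider, so (w, h) survives only if it is
        # strictly shorter than the last kept shape.
        if not kept or h < kept[-1][1]:
            kept.append((w, h))
    return kept
```

On the first list from the slide, 4x6, 7x8, 5x6, 6x4, 7x4, this keeps 4x6 and 6x4. On the second list, 4x6, 5x5, 6x4, nothing is removed: 5x5 has the largest area (25 versus 24) but is not dominated, so area alone does not justify eliminating it.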

Sizing Slicing Floorplans: Example. Combining the shapes of A (a1 = 4x6, a2 = 5x5, a3 = 6x4) with the shapes of B (b1 = 2x7, b2 = 3x4, b3 = 4x2) across a vertical cut:

            a1      a2      a3
    b1      6x7     7x7     8x7
    b2      7x6     8x5     9x4
    b3      8x6     9x5     10x4

Note that the shapes are sorted (decreasing height). Claim: if you sort the shapes on decreasing height, the widths HAVE TO be in increasing order. Why? Each dominating width or height will eliminate the elements to the right of an item in a row, or the elements to the bottom of an element in a column. Why does a1,b1 eliminate a2,b1 and a3,b1? Note that the resulting shapes are also in decreasing height / increasing width order. What if we are dealing with a horizontal cut? [Bazargan]

Slicing Floorplan Sizing Algorithm
Procedure Vertical_Node_Sizing
Input: two sorted lists L = { (a1, b1), ..., (as, bs) } and R = { (x1, y1), ..., (xt, yt) }, where ai < aj and bi > bj for all i < j; xi < xj and yi > yj for all i < j
Output: a sorted list H = { (c1, d1), ..., (cu, du) }, where u ≤ s + t - 1, ci < cj and di > dj for all i < j
begin
    H := ∅
    i := 1, j := 1, k := 1
    while (i ≤ s) and (j ≤ t) do
        m := max(bi, yj)
        (ck, dk) := (ai + xj, m)
        H := H ∪ { (ck, dk) }
        k := k + 1
        if bi = m then i := i + 1
        if yj = m then j := j + 1      // both tests use the pair selected at the top of the iteration
end
What happens if bi == yj? Would the algorithm still work correctly? What change do we have to make to do horizontal node sizing? Can we keep the sorting order? [©Sarrafzadeh] [Bazargan]
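The procedure translates almost line for line into Python. This is a sketch: the horizontal variant, built by transposing the shapes and reusing the vertical merge, is one possible answer to the question on the slide and is an assumption rather than the slide's own method. The sample lists are chosen to match the combined shapes 6x7, 7x6, 8x5, 9x4 of the example slide.

```python
def vertical_node_sizing(L, R):
    """Merge two shape lists across a vertical cut.

    L and R hold (width, height) pairs sorted by increasing width and
    decreasing height; the result has the same ordering.
    """
    H = []
    i = j = 0
    while i < len(L) and j < len(R):
        (a, b), (x, y) = L[i], R[j]
        tallest = max(b, y)
        H.append((a + x, tallest))
        # Advance whichever side fixes the height; advance both on a tie.
        if b == tallest:
            i += 1
        if y == tallest:
            j += 1
    return H


def horizontal_node_sizing(T, B):
    """Same merge for a horizontal cut, done on transposed shapes."""
    def flip(shapes):
        # Swap width and height, and reverse to restore the sort order.
        return [(h, w) for w, h in reversed(shapes)]
    return flip(vertical_node_sizing(flip(T), flip(B)))


A = [(4, 6), (5, 5), (6, 4)]
B = [(2, 7), (3, 4), (4, 2)]
combined = vertical_node_sizing(A, B)   # [(6, 7), (7, 6), (8, 5), (9, 4)]
```

When bi equals yj both indices advance, which is why the output can be shorter than s + t - 1 (four shapes here instead of five).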

Slicing Floorplan Sizing. Input: floorplan tree, module shapes. Start with the sorted shape lists of the modules. In a bottom-up fashion, perform Vertical_Node_Sizing and Horizontal_Node_Sizing. On reaching the root node, we have a list of shapes; select the one that is best in terms of area. In a top-down fashion, traverse the floorplan tree and set module locations. [Bazargan]

Find the Best Area. Recursively combine shape curves, then pick the best. [Figure: slicing trees with V and H cuts over modules 1, 2, 3.] [Pan]

Wire Length. For hyperedges: either of complete graph, MST (minimum spanning tree), or Steiner tree. For each edge: Euclidean distance sqrt( (x1-x2)^2 + (y1-y2)^2 ), i.e., direct lines; or Manhattan distance |x1 - x2| + |y1 - y2|, i.e., only horizontal / vertical lines. [Figure: (a) Steiner tree (length = 11), (b) minimum spanning tree (length = 13), (c) complete graph (length = 32).] [©Sherwani] [Bazargan]
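The edge metrics and two of the hyperedge models can be sketched as follows. The function names and the three-pin net are hypothetical, and the MST uses a simple O(n^2) version of Prim's algorithm; a Steiner-tree estimate is left out because it needs a dedicated heuristic.

```python
import math


def manhattan(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def euclidean(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def complete_graph_length(pins, dist=manhattan):
    """Sum of distances over all pin pairs of one net."""
    return sum(dist(p, q) for i, p in enumerate(pins) for q in pins[i + 1:])


def mst_length(pins, dist=manhattan):
    """Length of a minimum spanning tree over the pins (Prim's algorithm)."""
    if not pins:
        return 0
    # best[i] = cheapest connection from pin i to the tree built so far.
    best = {i: dist(pins[0], pins[i]) for i in range(1, len(pins))}
    total = 0
    while best:
        i = min(best, key=best.get)
        total += best.pop(i)
        for j in best:
            best[j] = min(best[j], dist(pins[i], pins[j]))
    return total


net = [(0, 0), (4, 0), (4, 3)]   # hypothetical pin locations
```

For this net the Manhattan MST length is 7 and the complete-graph length is 14, which illustrates why the complete-graph model overestimates wiring for multi-pin nets.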

Polish Expression. Tree representation of the floorplan: the left child of a V-cut in the tree represents the left slice in the floorplan; the left child of an H-cut in the tree represents the top slice in the floorplan. Polish expression representation: a string of symbols obtained by traversing a binary tree in post-order. Example (modules 1-7): 1 7 6 | - 2 3 4 - | 5 - |. How to uniquely represent a floorplan using a tree? In a Polish expression, can the last character be an operand? Can the first or the second character be an operator? If we traverse a Polish expression from left to right, and use a counter that increments when we see an operand and decrements when we get to an operator, what is the physical meaning of such a counter? Can we get to a count value of zero? At the end, what should be the value? What data structure can we use to convert a Polish expression to a tree? [Bazargan]
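One way to explore these questions is with a stack, sketched below. The counter equals the number of sub-floorplans waiting on the stack, so it must stay at 1 or more and end at exactly 1. Token conventions ("|" for a V-cut, "-" for an H-cut) follow the slide; the function names and the nested-tuple tree encoding are assumptions.

```python
OPERATORS = {"|", "-"}


def is_polish(tokens):
    """Check the counter property of a postfix slicing expression."""
    count = 0
    for t in tokens:
        count += -1 if t in OPERATORS else 1
        if count < 1:          # an operator found fewer than two operands
            return False
    return count == 1          # exactly one floorplan remains


def to_tree(tokens):
    """Convert a valid Polish expression to nested (op, left, right) tuples."""
    stack = []
    for t in tokens:
        if t in OPERATORS:
            right = stack.pop()
            left = stack.pop()
            stack.append((t, left, right))
        else:
            stack.append(t)
    (root,) = stack
    return root


expr = "1 7 6 | - 2 3 4 - | 5 - |".split()
```

Calling `to_tree(expr)` yields a root V-cut whose left subtree holds modules 1, 7, 6 and whose right subtree holds modules 2, 3, 4, 5.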

Normalized Polish Expression. Problem with Polish expressions? Multiple representations for some slicing trees, when more than one cut in one direction cuts a floorplan. Larger solution space. A stochastic algorithm (e.g., Simulated Annealing) will be more biased towards floorplans with multiple representations (they are more likely to be visited). Example: 1 2 - 3 4 | | and 1 2 - 3 | 4 | are two trees for the same floorplan of modules 1-4. Is any of the shown trees better? No. Both generate the same floorplan with the same area (sizing is done in linear time in both cases too). [©Sarrafzadeh] [Bazargan]

Normalized Polish Expression (cont.) Solution? Assign priorities to the cuts. In a top-down tree construction, pick the right-most cut and pick the lowest cut. Result: no two same operators adjacent in the Polish expression (i.e., no “| |” or “- -”). We picked this representation because maybe it’s easier to check. Example (modules 1-5): 1 2 - 5 - 3 | 4 |. [Bazargan]
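The adjacency rule is cheap to test, which is a short illustration of why it is easy to check. The function name `is_normalized` is hypothetical, and the check assumes the expression is already a valid Polish expression.

```python
OPERATORS = {"|", "-"}


def is_normalized(tokens):
    """True if no two identical operators are adjacent."""
    return not any(a == b and a in OPERATORS
                   for a, b in zip(tokens, tokens[1:]))
```

For example, `1 2 - 3 4 | |` is rejected because of the trailing `| |`, while `1 2 - 3 | 4 |` and `1 2 - 5 - 3 | 4 |` are accepted.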

Simulated Annealing Idea originated from observations of crystal formations (e.g., in lava) A crystal is in a low energy state Materials tend to form crystals (global minimum) If at the right temperature (i.e., right speed), a molecule will adhere to a crystal formation Very slowly decrease temperature When very hot, molecules move freely When a molecule gets to a chunk of crystal, it *might* move away due to its high speed When colder, molecules slow down The probability of moving away from a local optimum decreases When the material “freezes”, all molecules are fixed and the material is in minimum energy state [Bazargan]

Simulated Annealing Algorithm. Components: solution space (e.g., slicing floorplans); cost function (e.g., the area of a floorplan), which determines how “good” a particular solution is; perturbation rules (e.g., transforming a floorplan to a new one); simulated annealing engine: a variable T, analogous to temperature; an initial temperature T0 (e.g., T0 = 40,000); a freezing temperature Tfreez (e.g., Tfreez = 0.1); a cooling schedule (e.g., T = 0.95 * T). What properties should the perturbation rules have? Fast; cover all the solution space; not biased towards a particular class of solutions. [Bazargan]

Simulated Annealing Algorithm
Procedure SimulatedAnnealing
    curSolution = random initial solution
    T = T0                                   // initial temperature
    while (T > Tfreez) do
        for i = 1 to NUM_MOVES_PER_TEMP_STEP do
            nextSol = perturb(curSolution)
            Dcost = cost(nextSol) - cost(curSolution)
            if acceptMove(Dcost, T) then
                curSolution = nextSol        // accept the move
        T = coolDown(T)
Procedure acceptMove(Dcost, T)
    if Dcost < 0 then return TRUE            // always accept a good move
    else
        boltz = e^(-Dcost / (k T))           // Boltzmann probability function
        r = random(0,1)                      // uniform random number between 0 and 1
        if r < boltz then return TRUE
        else return FALSE
Think of the boltz exp function as a mapping that takes Dcost into the range [0,1] on average at the beginning, and into [0,0] at the end (“on average” is important). So, when you generate a uniformly random number in [0,1], the chance that it is less than boltz, i.e., that you accept a bad move, is equal to boltz (on average over that particular temperature). [Bazargan]
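A direct Python rendering of the two procedures might look like the sketch below. The parameter names, the default `k=1.0`, the geometric cooling factor `alpha`, and the `rng` argument (anything with a `random()` method) are assumptions; like the pseudocode, it returns the current solution rather than the best one seen.

```python
import math
import random


def accept_move(delta_cost, temperature, k=1.0, rng=random):
    """Always accept improving moves; accept bad ones with Boltzmann probability."""
    if delta_cost < 0:
        return True
    return rng.random() < math.exp(-delta_cost / (k * temperature))


def simulated_annealing(initial, cost, perturb, t0=40000.0, t_freeze=0.1,
                        alpha=0.95, moves_per_step=100, k=1.0, rng=random):
    """Generic annealing loop; perturb(solution, rng) must return a new solution."""
    current = initial
    current_cost = cost(current)
    t = t0
    while t > t_freeze:
        for _ in range(moves_per_step):
            candidate = perturb(current, rng)
            candidate_cost = cost(candidate)
            if accept_move(candidate_cost - current_cost, t, k, rng):
                current, current_cost = candidate, candidate_cost
        t *= alpha                         # cooling schedule T = 0.95 * T
    return current
```

A floorplanner would plug in a Polish expression or sequence pair as the solution, a sizing routine inside `cost`, and one of the move sets as `perturb`.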

Simulated Annealing: Move Acceptance. Good moves are always accepted. Accepting bad moves: when T = T0, the bad-move acceptance probability ≈ 1; when T = Tfreez, the bad-move acceptance probability ≈ 0. Boltzmann probability function: boltz = e^(-Dcost / (k T)), where k is the Boltzmann constant, chosen so that all moves at the initial temperature are accepted. [Bazargan]
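One common way to pick k, shown here as an assumption rather than the slide's prescription, is to solve the Boltzmann formula for k so that a typical bad move is accepted with a chosen probability p0 at T0. The function names and the sample numbers are hypothetical.

```python
import math


def boltzmann(delta_cost, temperature, k):
    """Acceptance probability of a bad move (delta_cost >= 0)."""
    return math.exp(-delta_cost / (k * temperature))


def calibrate_k(avg_bad_delta, t0, p0=0.99):
    """k such that a bad move of size avg_bad_delta is accepted with prob. p0 at T0."""
    return -avg_bad_delta / (t0 * math.log(p0))


k = calibrate_k(avg_bad_delta=100.0, t0=40000.0, p0=0.99)
```

With this k, a cost increase of 100 is accepted with probability 0.99 at T0 = 40,000 and is essentially never accepted at Tfreez = 0.1.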

Simulated Annealing: More Insight... Annealing steps [Bazargan]

Simulated Annealing: More Insight... [Bazargan]

Wong-Liu Floorplanning Algorithm. Uses simulated annealing. Normalized Polish expressions represent floorplans. Cost function: cost = area + λ · totalWireLength. Floorplan sizing is used to determine area. After floorplan sizing, the exact location of each module is known, hence wire length can be calculated. What will a designer do if wire length is more important than the area? [Bazargan]

Wong-Liu Floorplanning Algorithm (cont.) Moves: OP1: exchange two operands that have no other operands in between. OP2: complement a series of operators between two operands. OP3: exchange an adjacent operand and operator if the resulting expression is still a normalized Polish expression. Example: 12|4-3| →(OP1) 12|3-4| →(OP2) 12-3-4| →(OP3) 12-34-|. Why these moves? Can we replace OP2 with a move that flips only one operator? No. It will violate the normalized condition (and if you avoid the violation, it might prevent the move set from reaching all solutions). [©Sarrafzadeh] [Bazargan]
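The three moves can be sketched on a token list as below. The function names and the index conventions (operand rank for OP1, token position for OP2 and OP3) are assumptions; OP3 returns None when the swap would break the normalized Polish property.

```python
COMPLEMENT = {"|": "-", "-": "|"}


def is_normalized_polish(e):
    """Counter property plus no two identical adjacent operators."""
    count = 0
    for prev, t in zip([None] + e, e):
        if t in COMPLEMENT:
            if t == prev:
                return False
            count -= 1
        else:
            count += 1
        if count < 1:
            return False
    return count == 1


def op1(e, i):
    """Swap the i-th operand (0-based) with the next operand."""
    pos = [p for p, t in enumerate(e) if t not in COMPLEMENT]
    out = e[:]
    out[pos[i]], out[pos[i + 1]] = out[pos[i + 1]], out[pos[i]]
    return out


def op2(e, p):
    """Complement the whole operator chain containing position p (an operator)."""
    out = e[:]
    lo = hi = p
    while lo > 0 and out[lo - 1] in COMPLEMENT:
        lo -= 1
    while hi + 1 < len(out) and out[hi + 1] in COMPLEMENT:
        hi += 1
    for q in range(lo, hi + 1):
        out[q] = COMPLEMENT[out[q]]
    return out


def op3(e, p):
    """Swap tokens p and p+1 (one operand, one operator) if still valid."""
    if (e[p] in COMPLEMENT) == (e[p + 1] in COMPLEMENT):
        return None
    out = e[:]
    out[p], out[p + 1] = out[p + 1], out[p]
    return out if is_normalized_polish(out) else None


start = "1 2 | 4 - 3 |".split()
step1 = op1(start, 2)    # 1 2 | 3 - 4 |
step2 = op2(step1, 2)    # 1 2 - 3 - 4 |
step3 = op3(step2, 4)    # 1 2 - 3 4 - |
```

The three steps reproduce the example chain from the slide.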

The Sequence Pair Algorithm Sequence-Pair is a succinct representation of non-slicing floorplans of rectangles Just like Polish Expression for slicing floorplans Represent a non-slicing floorplan by a pair of sequences of blocks Using Simulated Annealing to find a good sequence-pair Can only handle hard blocks i.e., cannot do things like shape-curve computation Essentially macro placement Techniques for soft block shaping exist (e.g., using Lagrangian Relaxation) but are very slow [Pan]

Positive step lines d e a c b f

Is this unique? d e a c b f

Sequence Pair Negative step line sequence: fcbead Positive step line sequence: ecadfb [or ecafdb in the alternative version] [Pan]

Positive Locus and Negative Locus of Block b Negative Locus of Block b [Pan]

Sequence-Pair = (abdecf, cbfade). [Figure: positive loci and negative loci of the blocks.] [Pan]

Geometric Info of Sequence-Pair. Given a placement and the corresponding sequence-pair (P, N): a is right of b ⇔ a is after b in both P and N. [Figure: example placements of blocks a, b, c.]

Geometric Info of Sequence-Pair. Given a placement and the corresponding sequence-pair (P, N): a is above b ⇔ a is before b in P and after b in N. [Figure: example placements of blocks a, b, c.]

Positive Locus and Negative Locus of Block b above left right below Negative Locus of Block b [Pan]

Geometric Info of Sequence-Pair. Given a placement and the corresponding sequence-pair (P, N): a is right of b ⇔ a is after b in both P and N. a is left of b ⇔ a is before b in both P and N. a is above b ⇔ a is before b in P and after b in N. a is below b ⇔ a is after b in P and before b in N. [Pan]
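The four rules amount to a small lookup on the positions of a and b in the two sequences. The function name `relation` is hypothetical; the sample pair (abdecf, cbfade) is the one used in the loci slides.

```python
def relation(P, N, a, b):
    """Return where block a lies relative to block b: right, left, above, below."""
    after_in_p = P.index(a) > P.index(b)
    after_in_n = N.index(a) > N.index(b)
    if after_in_p and after_in_n:
        return "right"
    if not after_in_p and not after_in_n:
        return "left"
    if not after_in_p and after_in_n:
        return "above"
    return "below"


P, N = "abdecf", "cbfade"
```

Every pair of distinct blocks gets exactly one of the four relations, and swapping the arguments gives the opposite relation.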

Sequence Pair Negative step line sequence: fcbead Positive step line sequence: ecadfb [Pan]

From Sequence-Pair to a Floorplan. Labeled grid for (abdecf, cbfade). Given a sequence-pair, the floorplan with smallest area can be found in O(n^2) time. Algorithms with O(n log log n) or O(n log n) time exist, but they are faster than the O(n^2) algorithm only when n is quite large. [Figure: labeled grid and the resulting floorplan of blocks a-f.] [Pan]

From Sequence-Pair to Placement Distance from left (bottom) edge can be found using the longest path algorithm on the horizontal (vertical) constraint graph. Horizontal Constraint Graph Vertical Constraint Graph [Pan]

Sequence Pair (SP). A floorplan is represented by a pair of permutations of the module names, e.g., (1 3 2 4 5, 3 5 4 1 2). A sequence pair (s1, s2) of n modules can represent all possible floorplans formed by the n modules by specifying the pair-wise relationship between the modules. [Pan]

Sequence Pair Consider a pair of modules A and B. If the arrangement of A and B in s1 and s2 are: (…A…B…, …A…B…), then the right boundary of A is on the left hand side of the left boundary of B. (…A…B…, …B…A…), then the upper boundary of B is below the lower boundary of A. [Pan]

Example Consider the sequence pair: (13245,41352 ) Any other SP that is also valid for this packing? 3 2 1 5 4 [Pan]

Floorplan Realization Floorplan realization is the step to construct a floorplan from its representation. How to construct a floorplan from a sequence pair? We can make use of the horizontal and vertical constraint graphs (Gh and Gv). [Pan]

Floorplan Realization. Whenever we see (…A…B…, …A…B…), add an edge from A to B in Gh with weight wA. Whenever we see (…A…B…, …B…A…), add an edge from B to A in Gv with weight hB (B lies below A, so A must start at least hB above the bottom of B). Add a source vertex s to Gh and Gv pointing, with weight 0, to all vertices without incoming edges. Finally, find the longest paths from s to every vertex in Gh and Gv (how?), which are the coordinates of the lower left corner of the module in the packing. [Pan]
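Because the first sequence is already a topological order of Gh (and its reverse is one of Gv), the longest paths can be computed without building the graphs explicitly. The O(n^2) sketch below assumes that an edge carries the width or height of the block it leaves; the function name `realize` and the module sizes are hypothetical.

```python
def realize(P, N, size):
    """Return {module: (x, y)} lower-left corners for sequence pair (P, N).

    size maps each module to (width, height).
    """
    pos_n = {m: i for i, m in enumerate(N)}
    x, y = {}, {}
    for b in P:
        # Blocks left of b: earlier in P (already in x) and earlier in N.
        x[b] = max((x[a] + size[a][0] for a in x if pos_n[a] < pos_n[b]),
                   default=0)
    for b in reversed(P):
        # Blocks below b: later in P (already in y) and earlier in N.
        y[b] = max((y[a] + size[a][1] for a in y if pos_n[a] < pos_n[b]),
                   default=0)
    return {m: (x[m], y[m]) for m in P}


P = [1, 3, 2, 4, 5]
N = [4, 1, 3, 5, 2]
size = {1: (2, 2), 2: (2, 1), 3: (1, 2), 4: (3, 1), 5: (2, 2)}  # hypothetical
corners = realize(P, N, size)
```

With these sizes the five modules pack into a 5x3 bounding box with no dead space: modules 4 and 5 sit on the bottom edge, and modules 1, 3, 2 sit above them from left to right.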

Example. [Figure: the constraint graphs Gh and Gv, with source vertex s and edge weights, for the sequence pair (13245, 41352).] [Pan]

Constraint Graphs How many edges are there in Gh and Gv in total? Are there any transitive edges in Gh and Gv? How to remove the transitive edges? Can we reduce the size of Gh and Gv to linear, i.e., no. of edges is of order O(n), by removing all the transitive edges? [Pan]
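A quick way to explore the edge-count question is to enumerate the edges directly. Since every pair of modules is related either horizontally or vertically, the two graphs together hold n(n-1)/2 edges before any transitive edges are removed. The function name `constraint_edges` is hypothetical, and the source vertex s is left out.

```python
from itertools import combinations


def constraint_edges(P, N):
    """Return (Gh edges, Gv edges) as lists of (from, to) pairs."""
    pos_n = {m: i for i, m in enumerate(N)}
    gh, gv = [], []
    for a, b in combinations(P, 2):      # a comes before b in P
        if pos_n[a] < pos_n[b]:
            gh.append((a, b))            # a is left of b
        else:
            gv.append((b, a))            # b is below a
    return gh, gv


gh, gv = constraint_edges([1, 3, 2, 4, 5], [4, 1, 3, 5, 2])
```

For the pair (13245, 41352) this gives 6 edges in Gh and 4 in Gv, 10 in total. Gh contains 1→3, 3→2, and 1→2, so the last edge is transitive.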

Moves Three kinds of moves in the annealing process: M1: Rotate a module, or change the shape of a module M2: Interchange 2 modules in both sequences M3: Interchange 2 modules in the first sequence Does this set of move operations ensure reachability? Why? [Pan]
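The three moves are simple list and dictionary edits, sketched below with hypothetical function names. Each function returns new objects, so a rejected move can be discarded without undoing anything.

```python
def _swap(seq, a, b):
    """Return a copy of seq with modules a and b exchanged."""
    out = list(seq)
    i, j = out.index(a), out.index(b)
    out[i], out[j] = out[j], out[i]
    return out


def m1_rotate(size, m):
    """M1: rotate module m by swapping its width and height."""
    w, h = size[m]
    return {**size, m: (h, w)}


def m2_swap_in_both(P, N, a, b):
    """M2: interchange two modules in both sequences."""
    return _swap(P, a, b), _swap(N, a, b)


def m3_swap_in_first(P, N, a, b):
    """M3: interchange two modules in the first sequence only."""
    return _swap(P, a, b), list(N)
```

M2 exchanges the positions of the two modules in the packing, while M3 changes the relation between them.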

Pros and Cons of SP. Advantages: simple representation; all floorplans can be represented; the solution space is finite. (How big?) Disadvantages: redundant representation (the representation is not 1-to-1); the size of the constraint graphs, and thus the runtime to construct the floorplan, is quadratic. [Pan]

*-Tree Methods Various methods and representations for nonslicing floorplans Bounded slicing grid (BSG) (1996) O-tree (1999) B*-tree (2000) Corner block list (CBL) (2000) Transitive closure graph (TCG) (2001) These represent nonslicing floorplans by strings and use simulated annealing to optimize the layout.

Other Floorplanning Methods Integer linear programming Uses integer variables to capture “left of,” “right of,” “above” and “below”

Overconstrained Shaping. Why rectangles, L’s, T’s? Available granularity is by site spacing and row height; placers can handle arbitrarily complex region constraints; hard IP reuse and generated modules benefit from shape freedom. Why non-overlapping? The only requirement is: total assigned cell area ≤ total resource area. Roundness and shape simplicity are mythical needs: constructive pin assignment → don’t need roundness; path timing optimization → may even want disconnected shapes. [Kahng]

This is Okay, Really... (Trust Me) 1.0 0.5,0.5 1.0 Blk A Blk B [Kahng]

...The Cells Won’t Mind [Kahng]

Using Floorplan Information: A Typical “Fluid” Placement [I. Markov]

Flat vs. hierarchical placement. Flat: works well for highly interconnected networks. Hierarchical: good choice for SoC. Can hybridize the two to get the best of both worlds. [Lackey et al., IBM, DAC 03]

Other Objective Functions

Motivation. Critical length as a function of technology: the wire length at which delay = clock period. Across-chip wire delays > clock period → multicycle global communication is essential. [Figure: critical length compared with the chip cross-section, annotated 0.43x.] [Saxena (Intel), ISPD03] [Intel]

Wire-pipelining Interconnect delay is distributed among several clock cycles by inserting flip-flops Adds area/power overhead 1cm Delay = 0.67ns (70nm) [Cong, Proc. IEEE 2001] Target Frequency : 3GHz (clock period : 0.33ns) Widely used, e.g., Intel’s Itanium processor

An Example Microarchitecture. [Block diagram: Int Rename, Int Reg File 0, Int Reg File 1, Int Scheduler, EX0-EX3, MDH, Reorder Buffer, Bpred, FTQ, IFetch, D-cache, FP Scheduler, FP Rename, FP Reg File, Bus Interface Unit.] 41 blocks, 21 latch banks. MDH = Memory Disambiguation Hardware. Numbers below the lines indicate the number of instructions flowing across the line (not bit width).

Impact on Microarchitecture Keep throughput critical wires short CPI estimation – Cycle accurate simulation, using superscalar processor simulators, of benchmark programs Simulators : Simplescalar (Wisc.), Turandot (IBM), etc. Benchmarks : SPEC 2000, Mediabench Very slow – A single simulation can take days to run to completion Execution time = num-instr * cycles/instr (CPI) * cycle-time
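The execution-time product can be used to compare design points. In the sketch below all numbers are hypothetical and chosen only to illustrate the trade-off: a shorter cycle time does not help if the extra pipelined wires push CPI up far enough.

```python
def execution_time_s(num_instr, cpi, cycle_time_ns):
    """Execution time in seconds = instructions * cycles/instr * cycle time."""
    return num_instr * cpi * cycle_time_ns * 1e-9


# Hypothetical design points for the same one-billion-instruction program.
slow_clock = execution_time_s(1e9, cpi=1.2, cycle_time_ns=0.50)   # about 0.60 s
fast_clock = execution_time_s(1e9, cpi=2.0, cycle_time_ns=0.33)   # about 0.66 s
```

Here the faster clock loses, because its CPI grows by more than its cycle time shrinks.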

Minimizing CPI. A possible design flow: [Figure: flow connecting μ-arch, CPI estimator, physical design, Freq, and Layout.] A few objectives: optimal microarchitectural configuration for a particular frequency; optimal design frequency, since wire-pipelining may not improve performance (execution time) beyond a certain operating frequency.

Recent approaches. MEVA [Jagannathan, DAC 03]: floorplanning, Simulated Annealing (SA) based, no wire-pipelining. Assumption: each block has multiple implementations. Cost function: CPI * cycle-time. CPI is determined by the chosen μ-arch configuration; cycle-time is determined by the global wire delays. CPI is computed for each configuration beforehand, which is expensive if there are too many candidate configurations. [Figure: μ-arch blocks → Simplescalar → CPI; floorplanning → configuration, cycle-time.]

Microarchitecture Template A way to specify a class of microarchitectures Define underlying building blocks for the architecture model and their connections Individual blocks can still be parameterized Examples: Size/associativity of caches, size of register file etc. Variation in area/latency/delay of a given block Latency variation affects IPC in the architectural space Area/delay affects physical design space Some examples of alternatives.. Cache – size, associativity, latency Branch predictor - size, predictor type Register File – size, latency Instruction scheduler – different scheduling techniques [Jagannathan, DAC03]

Illustration: Cache. 32K data cache: A = 5.04 mm^2, L = 4 (larger area, latency). 8K data cache: A = 1.44 mm^2, L = 2. 8K data cache: A = 1.44 mm^2, L = 1 (smaller area, latency). [Jagannathan, DAC03]

Bus Weights Approaches. Used for floorplanning, incorporating wire latencies. The search space is exponential: with up to k latencies per bus and n busses there are k^n combinations, and each requires a cycle-accurate simulation for performance analysis. Instead, quantify the impact of each wire with a weight, which can be used in physical design optimizations. [Ekpanyapong, DAC 04]: wire weight = number of times it is accessed, determined from simulation profiles. Are access ratios good estimators of criticality? The impact may vary with the loop latency (e.g., the branch misprediction loop through Fetch, Decode, Exec).

Bus Weights Approaches (Contd.) Weighted cost function over: Area = area of the layout; WL = wirelength; WSFL = weighted sum of factor latencies; AR = aspect ratio [Nookala, DAC 05]. Another way of finding wire weights: wire weights are determined using a strategy based on statistical design of experiments. This has some benefits over access ratios, which are an indirect metric: it captures the effect on throughput directly. Can add thermal issues [Nookala, ISLPED06], using HotSpot (built on top of SimpleScalar).

Controlling the Wire Length “Explosion”

An Architectural Solution to Interconnect Tyranny. As seen earlier, alternate scaling scenarios also face interconnect tyranny (albeit to differing degrees). Most promising approach: simplify interconnection complexity architecturally. Modify the wiring histogram shape (i.e., Rent’s parameters) of the design. An example: multi-core microprocessors. Goes counter to the traditional approach of increased integration through block size scaling. [Plot: # wires vs. wirelength.] [Saxena]

Planning a City: Land Usage. [Somewhere in Iowa; population density of Iowa = 20 persons/km^2] [Minneapolis, p.d. = 2700/km^2] [Barcelona = 16000/km^2] [New York = 26000/km^2]

The Future of Chip Design Today’s chips are 2-dimensional [Maly]

3D IC Using Wafer Bonding. [Figure: detailed view and generalized view. Layers 1-5 stacked on a bulk substrate (SOI wafers with bulk substrate removed), joined by inter-layer bonds; labels include bulk wafer, metal level of wafer 1, device level 1, and the dimensions 1 μm, 10 μm, 500 μm.] Adapted from [Das et al., ISVLSI, 2003]

Global Net Length Distribution. Histogram of net length, for various numbers of 3D layers. [Plot “3D Global Net Distributions”: net density (#/mm) versus length (mm) for 1 stratum, 2 strata, and 4 strata.]

3D Floorplanning Problem: getting the heat out! Need to incorporate thermal analysis into design Example of a 3D floorplanner Cong et al., ICCAD 2004; ASPDAC06.