An Effective Clustering Algorithm for Mixed-size Placement. Jianhua Li, Laleh Behjat, and Jie Huang. ISPD 2007, March 20, 2007.

Presentation transcript:

An Effective Clustering Algorithm for Mixed-size Placement
Jianhua Li, Laleh Behjat, and Jie Huang
Schulich School of Engineering, University of Calgary, Calgary, Canada
ISPD, March 20, 2007

Outline
- Introduction
- Previous Work
- Proposed Clustering Algorithm
- Numerical Results
- Conclusions and Future Work

Introduction: What is clustering?
- Application areas
  - VLSI circuit partitioning and placement
- Objective
  - Identify and cluster the groups of cells that are highly interconnected
- Constraints
  - Maximum cluster area/weight
  - Minimum clustering ratio

Introduction: Why clustering?
- Cope with today's increasing design complexity
  - Algorithm scalability, e.g., the FM algorithm
- Speed up the design process
  - Fine-granularity clustering, best choice, etc.
- Improve solution quality
  - Device utilization, layout area, power consumption, etc. in FPGA design

Existing Clustering Algorithms
- Scoreless clustering algorithms
  - No comparison between different potential clusters: FirstChoice
  - Fast, but the result is random
- Score-based clustering algorithms
  - Score comparison between different potential clusters: best choice
  - Relatively slower, but the result is deterministic and better
  - Better choice for placement

Clustering Application in Placement
- Edge clustering algorithms are the most popular techniques
  - FirstChoice and best choice
- Placers using FirstChoice
  - Indirectly: Capo10, FengShui5.1
  - Directly: NTUPlace3
- Placers using best choice
  - Directly: mPL6, APlace3, hATP

Research Motivations
- Analysis of edge clustering algorithms: pairwise operation
- Pros
  - Fast
- Cons
  - Local view of the netlist structure
  - Inconsistent with the force-directed model

Cons of Edge Clustering Algorithms
- From the viewpoint of cell connectivity
  - Considered: connections from the seed cell to its neighbors
  - Not considered: connections among the neighbors

Cons of Edge Clustering Algorithms
- From the viewpoint of the force-directed model
  - Forces from all nets are applied together
  - Not in a pairwise way

Proposed Research Objectives
A new clustering algorithm
- Connectivity model
  - Consider the connectivity as a whole, not pairwise
  - Be consistent with the force-directed model
- Net clustering operation
  - Make clusters based on the net clustering score
  - Make clusters naturally, not pairwise

Proposed Algorithm Procedure
Input: A flat netlist
Output: A clustered netlist
Phase 1: Potential Cluster Identification. For each net:
- Initial Cluster Formation
- Initial Cluster Refinement
- Cluster Score Calculation
Phase 2: Final Cluster Formation
- Net Cluster Formation

Initial Cluster Formation
- Visit each net as a seed net
- Group cells in the seed net and neighbor cells
  - Net1: cells 1, 2, 3, and 4
[Figure: example netlist with nets n1 to n11, split into two partitions labeled "Cluster" and "Netlist"]
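The grouping step on this slide can be sketched roughly as follows. This is only an illustration under stated assumptions: the netlist is modeled as a dictionary from net names to sets of cell IDs, and "neighbor cells" is read as cells that share at least one net with a cell of the seed net. The function name `initial_cluster` and the toy netlist are hypothetical and do not come from the slides.

```python
def initial_cluster(nets, seed_net):
    """Return the seed net's cells plus every cell that shares a net with them.

    nets: dict mapping net name -> set of cell IDs (assumed representation).
    """
    seed_cells = set(nets[seed_net])
    cluster = set(seed_cells)
    for cells in nets.values():
        # A net touching any seed cell contributes all of its cells as neighbors.
        if seed_cells & set(cells):
            cluster |= set(cells)
    return cluster


# Toy netlist (made up for illustration).
nets = {"n1": {1, 2, 3, 4}, "n2": {4, 5}, "n3": {6, 7}}
print(sorted(initial_cluster(nets, "n1")))
```

Everything outside the returned set would form the "Netlist" partition in the slide's terminology.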

Initial Cluster Refinement
- FM-algorithm-based cell movement, until
  - All cell gains in "Cluster" are non-positive
  - All cell gains in "Netlist" are negative
[Figure: the two partitions "Cluster" and "Netlist"]
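To make the stopping rule concrete, here is a minimal sketch of the textbook FM gain (the reduction in cut nets if a cell switches sides) together with the termination test stated on the slide. It assumes unit net weights and omits the move loop, gain buckets, and any area constraint; the exact gain definition used by the authors is not given in the transcript, so treat `fm_gain` and `refinement_done` as hypothetical helpers.

```python
def fm_gain(cell, nets, cluster):
    """Classic FM gain: cut nets removed minus cut nets created by moving `cell`."""
    in_cluster = cell in cluster
    gain = 0
    for cells in nets.values():
        if cell not in cells:
            continue
        same = sum(1 for c in cells if (c in cluster) == in_cluster)
        other = len(cells) - same
        if same == 1:   # cell is alone on its side: moving it uncuts this net
            gain += 1
        if other == 0:  # net lies entirely on the cell's side: moving it cuts this net
            gain -= 1
    return gain


def refinement_done(nets, cluster, all_cells):
    """Stopping rule from the slide: gains <= 0 inside "Cluster", < 0 outside."""
    inside_ok = all(fm_gain(c, nets, cluster) <= 0 for c in cluster)
    outside_ok = all(fm_gain(c, nets, cluster) < 0 for c in all_cells - cluster)
    return inside_ok and outside_ok


nets = {"n1": {1, 2, 3, 4}, "n2": {4, 5}, "n3": {5, 6}}
print(fm_gain(6, nets, {1, 2, 3, 4, 5}))
print(refinement_done(nets, {1, 2, 3, 4}, {1, 2, 3, 4, 5, 6}))
```

The asymmetric rule (non-positive inside, strictly negative outside) means a zero-gain cell stays in the cluster but a zero-gain outsider is still a candidate to be pulled in.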

Cluster Score Calculation
1. Calculate the cluster score
[Figure: the two partitions "Cluster" and "Netlist"]

Cluster Score Calculation
2. Update the incident net scores
- Clustered nets:
- Cut nets:
[Figure: example netlist with nets n1 to n11 and the partitions "Cluster" and "Netlist"]

Potential Cluster Identification
- Final scores for each net after this phase
[Figure: example netlist with nets n1 to n11, annotated with net scores]

Final Cluster Formation
1. Order nets based on scores
2. Cluster and merge nets with score > [...]
[Figure: example netlist with nets n1 to n11; resulting cell clusters {1, 2, 3}, {6, 7, 8}, and {4, 5}]
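The two steps on this slide might be sketched as below. The sketch assumes that net scores are already available from the previous phase, that the cutoff is passed in as a hypothetical `threshold` parameter (its actual value is not preserved in the transcript), and that overlapping high-scoring nets are merged with a simple union-find. Area limits are ignored, and the scores in the example are invented purely for illustration.

```python
def form_clusters(nets, scores, threshold):
    """Merge the cells of every net whose score exceeds `threshold`.

    nets: dict net name -> set of cell IDs; scores: dict net name -> float.
    Returns a sorted list of clusters, each a sorted list of cell IDs.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Step 1: order nets by score, best first.
    for net in sorted(scores, key=scores.get, reverse=True):
        if scores[net] <= threshold:
            break  # remaining nets score too low to be clustered
        # Step 2: cluster the net; shared cells merge overlapping nets.
        cells = sorted(nets[net])
        for c in cells[1:]:
            parent[find(c)] = find(cells[0])

    groups = {}
    for c in list(parent):
        groups.setdefault(find(c), set()).add(c)
    return sorted(sorted(g) for g in groups.values())


nets = {"n1": {1, 2}, "n2": {2, 3}, "n3": {4, 5}, "n4": {3, 4}, "n5": {6, 7, 8}}
scores = {"n1": 0.9, "n2": 0.8, "n3": 0.7, "n4": 0.1, "n5": 0.6}  # made-up values
print(form_clusters(nets, scores, threshold=0.5))
```

Because merging goes through shared cells, each cell ends up in exactly one cluster, which is one plausible reading of "remove cluster overlapping" later in the talk.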

Analogy to Force-directed Model
- Initial Cluster Formation
  - Equivalent: manually allocate cells in an initial cluster
- Initial Cluster Refinement
  - Equivalent: naturally allocate cells based on overall forces
- Cluster Score Calculation
  - Equivalent: globally evaluate the net quality

Algorithm Summary
Characteristics:
- New connectivity computation
  - Identify the natural clusters in a circuit, regardless of the number of cells in each cluster
  - Consistent with the force-directed model
- Net score computation
  - Remove cluster overlapping
  - Globally choose the best nets for clustering

Numerical Results
- Clustering Statistics Experiments
- Placement Experiments

Clustering Statistics Experiments
- Setup
  - Predefined cell clustering ratio (CCR)
  - Compare net clustering ratios (NCR)
- Why compare net clustering ratios?
  - NCR is a measure of interconnect complexity for placement and routing
- Comparison algorithms
  - FirstChoice, best choice
- Benchmark circuits
  - ICCAD04 Mixed-size
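The slides do not define CCR and NCR precisely, so the following sketch rests on an assumption: CCR is taken as the number of clusters divided by the number of original cells, and NCR as the number of nets that still connect two or more clusters divided by the number of original nets. The function name `clustering_ratios` and the example data are hypothetical.

```python
def clustering_ratios(nets, cluster_of):
    """Return (CCR, NCR) under the assumed definitions described above.

    nets: dict net name -> set of cell IDs.
    cluster_of: dict cell ID -> cluster ID (singleton cells map to their own ID).
    """
    num_cells = len(cluster_of)
    num_clusters = len(set(cluster_of.values()))
    # A net survives clustering only if it still spans at least two clusters.
    surviving = sum(
        1 for cells in nets.values()
        if len({cluster_of[c] for c in cells}) > 1
    )
    return num_clusters / num_cells, surviving / len(nets)


nets = {"n1": {1, 2}, "n2": {2, 3}, "n3": {3, 4}}
cluster_of = {1: "A", 2: "A", 3: "B", 4: "B"}
ccr, ncr = clustering_ratios(nets, cluster_of)
print(ccr, ncr)
```

Under these definitions, at a fixed CCR a smaller NCR means more nets were absorbed inside clusters, leaving fewer nets for the placer and router to handle.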

Clustering Ratio Comparison on ICCAD04 Circuits
- On 17 of the 18 benchmark circuits, our algorithm achieved the smallest net clustering ratio
[Table: CCR and NCR for Ours, best choice, and FirstChoice, with averages; numeric values are not preserved in this transcript]

Placement Experiments
- Setup
  - Cluster a netlist using the proposed algorithm
  - Run other placers on the clustered netlist
  - Map the placement result and run Capo10.1 to legalize and refine it
- Comparison placers
  - mPL6, NTUPlace3-LE, Capo10.1, and FengShui5.1
- Benchmark circuits
  - ICCAD04 Mixed-size and ISPD05 Placement Contest

Placement Results on ICCAD04 Benchmarks
- Capo10.1: 15 out of 18 improved HPWL
- FengShui5.1: 14 out of 18 improved HPWL
- mPL6: 15 out of 18 improved HPWL
- NTUPlace3-LE: 18 out of 18 improved HPWL
[Table: HPWL (10^6) and runtime (in seconds), original vs. clustered, for Capo, FengShui, mPL, and NTUPlace3-LE; numeric values are not preserved in this transcript]
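HPWL (half-perimeter wirelength) is the metric compared in these results. As a reminder of what it measures, a minimal sketch follows; it assumes every pin sits at its cell's center (pin offsets are ignored) and uses a hypothetical dictionary-based netlist and position map.

```python
def hpwl(nets, pos):
    """Sum over nets of the half-perimeter of the bounding box of the net's cells.

    nets: dict net name -> set of cell IDs; pos: dict cell ID -> (x, y).
    """
    total = 0.0
    for cells in nets.values():
        xs = [pos[c][0] for c in cells]
        ys = [pos[c][1] for c in cells]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


pos = {1: (0, 0), 2: (3, 4), 3: (1, 1)}
nets = {"n1": {1, 2}, "n2": {1, 2, 3}}
print(hpwl(nets, pos))  # 14.0
```

In the experiments described here, HPWL would be evaluated on the flat netlist after the clustered placement has been mapped and legalized, so that the original and clustered flows are compared on the same nets.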

Placement Results on ISPD05 Benchmarks
- Capo10.1: 0 out of 8 improved HPWL
- mPL6: 5 out of 8 improved HPWL
- NTUPlace3-LE: 7 out of 8 improved HPWL
[Table: HPWL (10^6) and runtime (in seconds), original vs. clustered, for Capo, mPL, and NTUPlace3-LE; numeric values are not preserved in this transcript]

Experimental Summary
- Effective for ICCAD04 benchmark circuits
- Less effective for ISPD05 benchmark circuits

Conclusions and Future Work
- Conclusions
  - A new clustering algorithm for placement
  - A new connectivity model
  - Promising experimental results
- Future work
  - Improve the algorithm efficiency
    - Runtime
  - Improve the algorithm scalability
    - ISPD05 benchmark circuits
  - Integrate into placers

Thank you!

Appendix

Why not just group the clusters?
- Directly cluster nets → directly optimize the placement objective
- To deal with the cluster cell overlapping problem
- A net is a "finer" unit for clustering

Runtime comparison
- Our clustering algorithm is generally 3 to 8 times slower than both FirstChoice and best choice

Results on ISPD05
Probably due to the difference in circuit structure
- ICCAD04: mostly short nets
  - Max net degree: from 17 (ibm05) to 134 (ibm02)
- ISPD05: a large number of long nets
  - Max net degree: from 1935 (adaptec2) to 11869 (bigblue2)
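The net-degree statistic quoted here is simply the largest number of cells on any single net. A small sketch of how one might profile a netlist this way is shown below; the function name `net_degree_stats` and the `long_net_degree` cutoff are hypothetical, and the example netlist is made up rather than taken from either benchmark suite.

```python
def net_degree_stats(nets, long_net_degree):
    """Return (max net degree, fraction of nets with degree >= long_net_degree).

    nets: dict net name -> set of cell IDs.
    """
    degrees = [len(cells) for cells in nets.values()]
    long_fraction = sum(d >= long_net_degree for d in degrees) / len(degrees)
    return max(degrees), long_fraction


nets = {"n1": {1, 2}, "n2": {1, 2, 3}, "n3": {1, 2, 3, 4, 5, 6}, "n4": {2, 3}}
print(net_degree_stats(nets, long_net_degree=4))  # (6, 0.25)
```

Since the initial cluster is seeded from all cells of a net plus their neighbors, a net with thousands of cells would produce a very large initial cluster, which is consistent with the slide's suggestion that circuit structure explains the weaker ISPD05 results.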

NTUPlace3-LE and NTUPlace3
- NTUPlace3-LE
  - Based on the Lp-norm wire model
- NTUPlace3
  - Based on the log-sum-exp wire model
  - State-of-the-art: better performance than NTUPlace3-LE
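Both wire models named on this slide are smooth approximations of the max-minus-min span that HPWL uses in each dimension. The sketch below shows the commonly cited one-dimensional forms: log-sum-exp with a smoothing parameter `gamma`, and the Lp-norm with exponent `p` (which requires positive coordinates). How either placer actually parameterizes or implements these models is not stated in the slides, so the function names and parameter values are assumptions.

```python
import math


def lse_span(xs, gamma):
    """Log-sum-exp approximation of max(xs) - min(xs); tighter as gamma -> 0."""
    hi_ref, lo_ref = max(xs), min(xs)  # shift exponents for numerical stability
    hi = hi_ref + gamma * math.log(sum(math.exp((x - hi_ref) / gamma) for x in xs))
    lo = lo_ref - gamma * math.log(sum(math.exp(-(x - lo_ref) / gamma) for x in xs))
    return hi - lo


def lp_span(xs, p):
    """Lp-norm approximation of max(xs) - min(xs); tighter as p grows (xs > 0)."""
    hi = sum(x ** p for x in xs) ** (1.0 / p)
    lo = sum(x ** (-p) for x in xs) ** (-1.0 / p)
    return hi - lo


xs = [1.0, 2.0, 5.0]  # exact span is 4.0
print(lse_span(xs, gamma=0.1), lp_span(xs, p=50))
```

Both functions are differentiable in the coordinates, which is what lets an analytical placer minimize wirelength with gradient-based methods; summing the x-span and y-span over all nets gives the smoothed counterpart of HPWL.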