High-Speed Circuit-Tuning Techniques Based on Lagrangian Relaxation Charlie Chung-Ping Chen (608)2651145.

High-Speed Circuit-Tuning Techniques Based on Lagrangian Relaxation Charlie Chung-Ping Chen ICCAD 99’ Embedded Tutorial Session 12A
Presentation transcript:

C. Chen High-Speed Circuit-Sizing Techniques based on Lagrangian Relaxation
People Involved
Joint work
– Charlie Chen, University of Wisconsin at Madison
– Chris Chu, Iowa State University
– D. F. Wong, University of Texas at Austin
Publication
– “Fast and Exact Simultaneous Gate and Wire Sizing by Lagrangian Relaxation”, IEEE Transactions on Computer-Aided Design, July 1999

Acknowledgement
Strategic CAD Labs, Intel Corp.
– Steve Burns, Prashant Sawkar, N. Sherwani, and Noel Menezes
IBM T. J. Watson Center
– Chandu Visweswariah
C. Kime, L. He (UWisc-Madison)

Outline
– Motivation
– Overview of Circuit Tuning Techniques
– Lagrangian Relaxation Based Circuit Tuning

Motivation
Workload and design complexity double every 18 months

Motivation
Trends
– Increased custom design
– Aggressive tuning for performance improvement
– Shorter time to market
– Interconnect effects severe
– Signal integrity issues emerging
Circuit Tuning
– Can significantly improve circuit performance and signal integrity without major modification

Manual Sizing
Pros
– Takes advantage of human experience
– Reliable
– Combines directly with other optimization techniques
Cons
– Slow, tedious, limited, and error-prone procedure
– Relies heavily on experience; requires solid training
– Optimality not guaranteed (don’t know when to stop)
[Figure: manual tuning loop: Simulate, Satisfy?, Change, repeated over iterations]

Automatic Circuit Tuning
Pros
– Fast
– Achieves the best performance with interconnect considerations
– Explores alternatives (power/delay/noise tradeoff)
– Boosts productivity
– Optimality guarantee (for convex problems)
– Ensures timing and reliability
Cons
– Complicated tool development and support ($$)
– Tool testing, integration, and training

Good Tuning Algorithm
– Fast
– Optimality guaranteed (for convex problems)
– Versatile
– Easy to use
– Solution quality index (error bound to the optimal solution)
– Simple (easy to develop and maintain)

Static vs. Dynamic Sizing
Static Sizing
– Stage based
– Natural circuit decomposition, large-scale tuning capability
– Very reasonable accuracy (when using a good model)
– No need for sensitization vectors
– Solves for all critical paths in a polynomial formulation
– False paths; potentially inaccurate modeling of slopes of input excitation
Dynamic Sizing
– Simulation based
– More accurate
– No false path problems
– Needs good input vectors; good for circuits whose critical paths are known and limited
– Takes care of only a few scenarios
– Relatively slower

A Simple Sizing Problem
Minimize the maximum delay $D_{\max}$ by changing $w_1, \dots, w_n$
[Figure: circuit with inputs a and b, component sizes $w_1$ to $w_{11}$, and output delays constrained by $D_1 < D_{\max}$, $D_2 < D_{\max}$]

Existing Sizing Works
Algorithm: fast, non-optimal for general problem formulation
– TILOS (J. Fishburn, A. Dunlop, ICCAD ’85)
– Weighted Delay Optimization (J. Cong et al., ICCAD ’95)
Mathematical Programming: slower, optimal
– Geometric Programming (TILOS)
– Augmented Lagrangian (D. P. Marple et al., ’86)
– Sequential Linear Programming (S. Sapatnekar et al.)
– Interior Point Method (S. Sapatnekar et al., TCAD ’93)
– Sequential Quadratic Programming (N. Menezes et al., DAC ’95)
– Augmented Lagrangian + Adjoint Sensitivity (C. Visweswariah et al., ICCAD ’96, ICCAD ’97)
Is there any method that is fast and optimal?

Converge?
[Figure: two camps separated by a question mark. Mathematical Programming (SLP, SQP, Augmented Lagrangian) is labeled Optimal; Algorithm (TILOS, Weighted Delay) is labeled Fast]

Heuristic Approach
TILOS (J. Fishburn et al., ICCAD ’85)
– Find the sensitivity associated with each gate
– Upsize only the one gate with the maximum sensitivity
– Goal: minimize the objective function
[Figure: circuit with inputs a and b and component sizes $w_1$ to $w_{11}$; minimize $D_{\max}$ subject to $D_1 < D_{\max}$, $D_2 < D_{\max}$]
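A minimal sketch of the greedy loop described above, assuming a toy gate chain with a hypothetical RC delay model. The names `chain_delay`, `greedy_upsize`, `r_drv`, `c_load`, and the 10% `bump` factor are illustrative assumptions and are not taken from TILOS itself.

```python
def chain_delay(w, r_drv=1.0, r=1.0, c=1.0, c_load=64.0):
    """Delay of a gate chain under an assumed RC model.

    A fixed driver (resistance r_drv) charges the first gate's input;
    gate i (resistance r / w[i]) charges gate i+1, and the last gate
    charges a fixed load c_load.
    """
    delay = r_drv * c * w[0]
    for i, wi in enumerate(w):
        cap = c * w[i + 1] if i + 1 < len(w) else c_load
        delay += (r / wi) * cap
    return delay


def greedy_upsize(w, target, bump=1.1, max_steps=1000):
    """Repeatedly upsize the single gate with the largest sensitivity."""
    w = list(w)
    for _ in range(max_steps):
        base = chain_delay(w)
        if base <= target:
            break
        best_i, best_sens = None, 0.0
        for i in range(len(w)):
            trial = list(w)
            trial[i] *= bump
            # Sensitivity: delay reduction per unit of added size.
            sens = (base - chain_delay(trial)) / (trial[i] - w[i])
            if sens > best_sens:
                best_i, best_sens = i, sens
        if best_i is None:  # no single upsizing step helps any more
            break
        w[best_i] *= bump
    return w


sized = greedy_upsize([1.0, 1.0], target=15.0)
print(sized, chain_delay(sized))
```

The loop stops as soon as the target is met, so it gives no indication of how far the result is from the optimum, which is the weakness the slides attribute to heuristic approaches.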

Weighted Delay Optimization
J. Cong, ICCAD ’95
– Size one wire at a time in DFS order
– To minimize the weighted delay
– Best weight?
[Figure: drivers and loads connected by wires of sizes $w_1$ to $w_5$, with weighted sink delays $\lambda_1 D_1$ and $\lambda_2 D_2$]
Minimize $\lambda_1 D_1 + \lambda_2 D_2$

Mathematical Programming
Problem Formulation:
Lagrangian:
Optimality (Necessary) Condition (Kuhn-Tucker Condition):
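For reference, the standard textbook form of these three items can be written as follows. The symbols $f$, $g_i$, $x$, and $\lambda_i$ are generic placeholders assumed here, not notation taken from the slides.

```latex
\begin{aligned}
&\text{Problem:} && \min_{x}\ f(x) \quad \text{s.t.}\quad g_i(x) \le 0,\ i = 1,\dots,m \\
&\text{Lagrangian:} && L(x,\lambda) = f(x) + \sum_{i=1}^{m} \lambda_i\, g_i(x), \qquad \lambda_i \ge 0 \\
&\text{Kuhn-Tucker conditions at } x^*: && \nabla f(x^*) + \sum_{i=1}^{m} \lambda_i^* \nabla g_i(x^*) = 0, \\
& && g_i(x^*) \le 0,\quad \lambda_i^* \ge 0,\quad \lambda_i^*\, g_i(x^*) = 0
\end{aligned}
```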

PSLP vs. SQP
– Penalty Sequential Linear Programming
– Sequential Quadratic Programming

Lagrangian Methods
– Augmented Lagrangian
– Lagrangian Relaxation

Lagrangian Relaxation Theory
LRS (Lagrangian Relaxation Subproblem)
– For a convex programming problem, there exist Lagrangian multipliers that lead the LRS to the optimal solution
– For any type of problem, the optimal value of any LRS is a lower bound on the optimal value of the original problem
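The lower-bound property follows in two steps. In the sketch below, the problem is assumed to be in the generic form $\min f(x)$ subject to $g_i(x) \le 0$; $q(\lambda)$ denotes the optimal value of the LRS and $\bar{x}$ is any feasible point (both names are introduced here for illustration).

```latex
\begin{aligned}
q(\lambda) &= \min_{x}\ \Big[ f(x) + \sum_{i} \lambda_i\, g_i(x) \Big]
  && \text{(optimal value of the LRS)} \\
&\le f(\bar{x}) + \sum_{i} \lambda_i\, g_i(\bar{x})
  && \text{(a minimum is no larger than the value at } \bar{x}\text{)} \\
&\le f(\bar{x})
  && (\lambda_i \ge 0 \text{ and } g_i(\bar{x}) \le 0)
\end{aligned}
```

Taking $\bar{x}$ to be the optimal solution of the original problem gives the bound, and no convexity is needed for it.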

Lagrangian Relaxation

Lagrangian Relaxation
[Figure: Lagrangian Relaxation placed between the two camps, Mathematical Programming (SLP, SQP, Augmented Lagrangian) and Algorithm (TILOS, Weighted Delay)]

Lagrangian Relaxation Framework
[Figure: flowchart loop: Weighted Delay Optimization, Converge?, Update Multipliers]

Lagrangian Relaxation Framework
[Figure: delays $D_1$ and $D_2$ shown relative to $D_{\max}$]
More Critical -> More Resource -> More Weight

Weighted Minimization
– Traverse the circuit in topological order
– Resize each component to minimize the Lagrangian during the visit
[Figure: circuit with inputs a and b, component sizes $w_1$, $w_2$, $w_3$, and output delays $D_1$, $D_2$]
Minimize $\lambda_1 D_1 + \lambda_2 D_2$
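The two steps above can be sketched on a single gate chain, where the weighted objective reduces to one path delay. This is a sketch under an assumed RC model: `weighted_resize`, `chain_delay`, `r_drv`, `c_load`, and the size bounds `lo` and `hi` are hypothetical names and values, not taken from the slides. With all other sizes fixed, the terms containing `w[i]` have the form `A * w[i] + B / w[i]`, so the locally optimal size is `sqrt(B / A)`.

```python
import math


def chain_delay(w, r_drv=1.0, r=1.0, c=1.0, c_load=64.0):
    """Delay of a gate chain: a fixed driver, then gate i charging gate i+1."""
    delay = r_drv * c * w[0]
    for i, wi in enumerate(w):
        cap = c * w[i + 1] if i + 1 < len(w) else c_load
        delay += (r / wi) * cap
    return delay


def weighted_resize(w, r_drv=1.0, r=1.0, c=1.0, c_load=64.0,
                    lo=0.1, hi=100.0, sweeps=50):
    """Visit gates in topological order and resize each one locally."""
    w = list(w)
    n = len(w)
    for _ in range(sweeps):
        for i in range(n):  # for a chain, index order is topological order
            upstream_r = r_drv if i == 0 else r / w[i - 1]
            downstream_c = c * w[i + 1] if i + 1 < n else c_load
            # Terms in w[i]: (upstream_r * c) * w[i] + (r * downstream_c) / w[i]
            best = math.sqrt(r * downstream_c / (upstream_r * c))
            w[i] = min(max(best, lo), hi)  # respect the size bounds
    return w


sizes = weighted_resize([1.0, 1.0])
print(sizes, chain_delay(sizes))
```

For this two-gate example the sweeps settle on a geometric progression of sizes, the familiar result for a chain driving a large load.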

Multipliers Adjustment: a subgradient approach
– Subgradient: an extension of the gradient to non-smooth functions
– Experience: a simple heuristic implementation can achieve a very good convergence rate
– Reference: Non-smooth function optimization: N. Z. Shor
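A minimal sketch of the whole loop (solve the weighted subproblem, then adjust the multipliers), assuming a hypothetical two-branch circuit in which both branches load a common driver, so that $D_i = (w_1 + w_2) + k_i / w_i$. The constants in `K`, the constant `step`, and the mean-centered update followed by renormalization are illustrative assumptions, not the update rule used by the authors. With the multipliers summing to one, the subproblem has the closed-form solution $w_i = \sqrt{\lambda_i k_i}$.

```python
import math

K = (4.0, 1.0)  # hypothetical per-branch constants k_i


def solve_lrs(lam):
    """Minimize sum_i lam_i * D_i(w); closed form w_i = sqrt(lam_i * k_i)."""
    return [math.sqrt(l * k) for l, k in zip(lam, K)]


def delays(w):
    shared = sum(w)  # both branches load a common driver
    return [shared + k / wi for k, wi in zip(K, w)]


def lagrangian_relaxation(iters=200, step=0.1):
    lam = [1.0 / len(K)] * len(K)  # start with equal weights
    lower, upper = -math.inf, math.inf
    for _ in range(iters):
        w = solve_lrs(lam)
        d = delays(w)
        # The weighted delay at the LRS optimum is a lower bound.
        lower = max(lower, sum(l * di for l, di in zip(lam, d)))
        # The maximum delay of any sizing is an upper bound.
        upper = min(upper, max(d))
        # Subgradient-style step: shift weight toward the slower branch.
        mean = sum(d) / len(d)
        lam = [max(l + step * (di - mean), 1e-9) for l, di in zip(lam, d)]
        total = sum(lam)
        lam = [l / total for l in lam]  # keep the multipliers summing to one
    return lam, lower, upper


lam, lower, upper = lagrangian_relaxation()
print(lam, lower, upper)
```

The gap `upper - lower` serves as the solution quality index: it bounds how far the current sizing can be from the optimal maximum delay.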

Path Delay Formulation
[Figure: circuit with input arrival times $A_a$, $A_b$, $A_c$, gate delays $d_1$, $d_2$, $d_3$, and output delays $D_1$, $D_2$]
– Number of constraints grows exponentially
– More accurate
– Can exclude false paths

Stage Delay Formulation
[Figure: the same circuit with an added intermediate arrival time $A_e$, alongside $A_a$, $A_b$, $A_c$, gate delays $d_1$, $d_2$, $d_3$, and output delays $D_1$, $D_2$]
– Polynomial size
– Less accurate
– Contains false paths
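The size difference between the two formulations can be illustrated by counting on a small graph. The sketch below assumes a hypothetical layered circuit (`layered_circuit`) in which every gate in one stage drives every gate in the next; a path formulation needs one constraint per input-to-output path, while a stage formulation needs one per connection.

```python
def layered_circuit(stages, width=2):
    """Build a toy DAG: `stages` fully connected layers of `width` gates."""
    fanout = {}
    for s in range(stages):
        for g in range(width):
            fanout[(s, g)] = [(s + 1, h) for h in range(width)]
    sources = [(0, g) for g in range(width)]
    sinks = {(stages, g) for g in range(width)}
    return fanout, sources, sinks


def count_paths(fanout, sources, sinks):
    """Number of source-to-sink paths (one constraint each, path based)."""
    memo = {}

    def paths_from(node):
        if node in sinks:
            return 1
        if node not in memo:
            memo[node] = sum(paths_from(nxt) for nxt in fanout.get(node, ()))
        return memo[node]

    return sum(paths_from(s) for s in sources)


def count_stage_constraints(fanout):
    """Number of connections (one constraint each, stage based)."""
    return sum(len(nxt) for nxt in fanout.values())


for stages in (5, 10, 20):
    fanout, sources, sinks = layered_circuit(stages)
    print(stages, count_paths(fanout, sources, sinks),
          count_stage_constraints(fanout))
```

Doubling the number of stages doubles the stage-based count but squares the per-source path count.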

Compatible?
[Figure: Stage Based and Path Based formulations joined by a question mark]

Both Multipliers Satisfy KCL (Flow Conservation)
[Figure: path-based and stage-based views of the same circuit, with multipliers $\lambda_{53}$ and $\lambda_{31}$ on the edges and, at node 3, incoming multipliers $\lambda_{3,\text{in}}$ and outgoing multipliers $\lambda_{3,\text{out}}$]
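Flow conservation means that, at every internal node, the multipliers on incoming edges sum to the multipliers on outgoing edges. A small sketch of that check follows; the helper `kcl_violations`, the edge-dictionary layout, and the numeric values are assumptions made for illustration.

```python
def kcl_violations(lam, internal_nodes, tol=1e-9):
    """Return {node: inflow - outflow} for nodes that break flow conservation.

    `lam` maps a directed edge (u, v) to its multiplier value.
    """
    bad = {}
    for node in internal_nodes:
        inflow = sum(v for (src, dst), v in lam.items() if dst == node)
        outflow = sum(v for (src, dst), v in lam.items() if src == node)
        if abs(inflow - outflow) > tol:
            bad[node] = inflow - outflow
    return bad


# Multipliers around one internal node "e" with two fan-ins and two fan-outs.
lam = {("a", "e"): 0.3, ("b", "e"): 0.7, ("e", "D1"): 0.6, ("e", "D2"): 0.4}
print(kcl_violations(lam, ["e"]))  # an empty dict means KCL holds

lam[("e", "D2")] = 0.5             # perturb one multiplier
print(kcl_violations(lam, ["e"]))  # node "e" is now reported
```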

Mixed Delay Formulation
[Figure: circuit divided into one Path Based region and two Stage Based regions]

Compatible?
[Figure: Stage Based and Path Based formulations joined by Lagrangian Relaxation]

Hierarchical Objective Function Decomposition
– Divide the Lagrangian into two terms (containing or not containing variable $w_i$)
– Hierarchically update the Lagrangian during resizing

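Intermediate Variables Cancellation
[Figure: stage-based circuit with arrival times $A_a$, $A_b$, $A_c$, $A_e$, output delays $D_1$, $D_2$, and edge multipliers $\lambda_{ae}$, $\lambda_{be}$, $\lambda_{e1}$, $\lambda_{e2}$, $\lambda_{c2}$]
Flow conservation at node e: $\lambda_{ae} + \lambda_{be} = \lambda_{e1} + \lambda_{e2}$
With this condition the terms in the intermediate variable $A_e$ cancel, leaving
$\lambda_{ae}(A_a + d_1) + \lambda_{be}(A_b + d_1) + \lambda_{e1}(d_2 - D_1) + \lambda_{e2}(d_3 - D_2)$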

Decomposition and Pruning
– Flow Decomposition
– Prune out all the gates with zero multipliers

Complementary Condition Implications
At the optimal solution: $\lambda_i (D_i - D_{\max}) = 0$
– Critical path: weight $\lambda_i \ge 0$, path delay $D_i = D_{\max}$
– Non-critical path: weight $\lambda_i = 0$, path delay $D_i < D_{\max}$

Convergence Sequence
– Lagrangian = Lower Bound (Weighted Delay <= Maximum Delay)
– Any Feasible Maximum Delay = Upper Bound
[Figure: upper and lower bounds approaching the Optimal Solution; x-axis: # Iteration, y-axis: Max Delay]

Transistor Sizing Extension

Runtime and Storage Requirement

Runtime versus Circuit Size

Storage versus Circuit Size

Convergence of Subgradient Optimization

Area vs. Delay Tradeoff Curve

Conclusion
Lagrangian Relaxation
– General mathematical programming algorithm
– Optimality guarantee for convex programming problems
– Versatile
– No extra complication (no quadratic penalty function)
– Lagrangian multipliers provide the connection between mathematical programming and algorithmic approaches
– Multipliers satisfy KCL (flow conservation)
– Hierarchical update of the objective function provides extreme efficiency
– Solution quality guaranteed (by providing a lower bound)