Circuit Simulation via Matrix Exponential Method
Speaker: Shih-Hung Weng
Adviser: Chung-Kuan Cheng
Date: 05/31/2013

Presentation transcript:


Foundation of Design Flow
[Figure labels: Logic Synthesis, Placement, Timing Analysis, Routing, ...; Abstraction Layer (lookup table characterization); Circuit Simulation]

Emerging Demands
Full system verification and analysis
  – scalability and performance
[Figure: voltage vs. time plot; labels: on-chip power grid, low frequency]

Publications (1/3)
Circuit Simulation with Matrix Exponential Method:
1. S.-H. Weng, H. Zhuang and C.K. Cheng, “Adaptive Time Stepping for Power Grid Simulation using Matrix Exponential Method”, submitted to IEEE ICCAD
2. S.-H. Weng, Q. Chen and C.K. Cheng, “Circuit Simulation using Matrix Exponential Method for Stiffness Handling and Parallel Processing”, IEEE ICCAD, Nov
3. Q. Chen, W. Schoenmaker, S.-H. Weng, C.K. Cheng, G.-H. Chen, L.-J. Jiang and N. Wong, “A Fast Time-Domain EM-TCAD Coupled Simulation Framework via Matrix Exponential,” IEEE ICCAD, Nov (Best Paper Award Candidate)
4. Y. Li, Q. Cheng, S.-H. Weng, C.K. Cheng and N. Wong, “Globally Stable, Highly Parallelizable Fast Transient Circuit Simulation via Faber Series”, IEEE NewCAS, May
5. S.-H. Weng, Q. Chen and C.K. Cheng, “Time-Domain Analysis of Large-Scale Circuits by Matrix Exponential Method with Adaptive Control”, IEEE Trans. on CAD, Jul
6. Q. Chen, S.-H. Weng and C.K. Cheng, “A Practical Regularization Technique for Modified Nodal Analysis in Large-Scale Time-Domain Circuit Simulation”, IEEE Trans. on CAD, Jun
7. S.-H. Weng, Q. Chen and C.K. Cheng, “Circuit Simulation by Matrix Exponential Method,” IEEE ASIC Conference, Oct
8. S.-H. Weng, P. Du and C.K. Cheng, “A Fast and Stable Explicit Integration Method by Matrix Exponential Operator for Large Scale Circuit Simulation”, IEEE ISCAS, May

Publications (2/3)
Clock Gating Synthesis:
9. S.-H. Weng, Y.-M. Kuo and S.-C. Chang, “Timing Optimization in Sequential Circuit by Exploiting Clock-Gating Logic,” ACM Trans. on DAES, April
10. Y.-M. Kuo, S.-H. Weng, and S.-C. Chang, “A Novel Sequential Circuit Optimization with Clock Gating Logic,” IEEE ICCAD, Nov
High-speed Interconnect:
11. G. Sun, S.-H. Weng, C.K. Cheng, B. Lin and L. Zeng, “An On-Chip Global Broadcast Network Design with Equalized Transmission Lines in the 1024-Core Era”, IEEE SLIP, Jun
12. S.-H. Weng, Y. Zhang, J. F. Buckwalter and C.K. Cheng, “Energy Efficiency Optimization through Co-Design of the Transmitter and Receiver in High-Speed On-Chip Interconnects”, accepted by IEEE Trans. on VLSI
Placement and Routing:
13. C.K. Cheng, P. Du, A.B. Kahng and S.-H. Weng, “Low-Power Gated Bus Synthesis for 3D IC via Rectilinear Shortest-path Steiner Graph,” IEEE ISPD, Mar.
14. P. Du, W. Zhao, S.H. Weng, C.K. Cheng and R.L. Graham, “Character Design and Stamp Algorithms for Character Projection Electron-Beam Lithography,” IEEE ASPDAC, Feb.

Publications (3/3)
Power Grid Analysis:
15. X. Hu, P. Du, S.-H. Weng and C.K. Cheng, “Worst-Case Noise Prediction With Non-zero Current Transition Times for Power Grid Planning,” accepted by IEEE Trans. on VLSI.
16. C.-C. Chou, H.-H. Chuang, T.-L. Wu, S.-H. Weng, and C.K. Cheng, “Eye Prediction of Digital Driver with Power Distribution Network Noise,” IEEE EPEPS, Nov (Best Student Paper Award)
17. P. Du, S.-H. Weng, X. Hu and C.K. Cheng, “Power Grid Sizing via Convex Programming,” IEEE ASIC Conference, Oct
18. P. Du, X. Hu, S.H. Weng, A. Shayan, X. Chen, A. E. Engin and C.K. Cheng, “Worst-Case Noise Prediction with Non-zero Current Transition Times for Early Power Distribution System Verification,” IEEE ISQED, Mar
19. S.-H. Weng, Y.-M. Kuo, S.-C. Chang, and M. Marek-Sadowska, “Timing Analysis Considering IR Drop Waveforms in Power Gating Designs,” IEEE ICCD, Oct

Outline
Numerical Integration in Circuit Simulation
Matrix Exponential Method
  – Krylov Subspace Approximation
  – Rational Krylov Subspace Approximation
  – Parallelism
Experimental Results
Conclusions

Circuit Formulation
Formulated as a system of DAEs [Ho et al. '75], with terms for:
  – resistance & incidence
  – capacitance & inductance
  – branch currents & nodal voltages
  – derivative of charges in nonlinear devices
  – currents of nonlinear devices
  – input sources
Nonlinear devices are linearized by compact model (BSIM, PSP, etc.)
After linearization, solve x(t) with an implicit or explicit numerical method

Numerical Integration (1/2)
Forward Euler (1st order explicit)
  – sparse matrix-vector product
  – stability issue for stiff circuit: unstable result
Backward Euler (1st order implicit)
  – solving a linear system
  – performance & scalability issues
[Figure: forward Euler vs. backward Euler results]
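The stability contrast on this slide can be illustrated with the scalar test equation x' = -lam * x, which stands in for one fast time constant of a stiff circuit. This is a minimal sketch: the function names, the value of `lam`, and the step size are assumptions chosen for illustration, not values from the talk.

```python
def forward_euler(lam, x0, h, steps):
    """Explicit update x <- x + h * f(x) for f(x) = -lam * x."""
    x = x0
    for _ in range(steps):
        x = x + h * (-lam * x)
    return x

def backward_euler(lam, x0, h, steps):
    """Implicit update x <- x + h * f(x_new); for a scalar this is a division,
    for a circuit it is a linear solve with (C/h + G)."""
    x = x0
    for _ in range(steps):
        x = x / (1.0 + h * lam)
    return x

# Hypothetical stiff case: time constant 1 ms, step 10 ms (h * lam = 10).
fe = forward_euler(lam=1000.0, x0=1.0, h=0.01, steps=20)
be = backward_euler(lam=1000.0, x0=1.0, h=0.01, steps=20)
print("forward Euler :", fe)   # grows in magnitude: |1 - h*lam| = 9 per step
print("backward Euler:", be)   # decays: 1 / (1 + h*lam) = 1/11 per step
```

Forward Euler stays bounded only when |1 - h*lam| <= 1, so the fastest time constant dictates the step size, while backward Euler decays for any positive h at the price of a solve per step.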

Numerical Integration (2/2)
Performance = # steps x computation per step

Methods          Computation  Scalability  Error   Stability  Step size
Forward Euler    x=Av         high         O(h^2)  low        tiny
Backward Euler   Ax=b         low          O(h^2)  A-stable   medium
Trapezoidal      Ax=b         low          O(h^3)  A-stable   > Backward Euler
and beyond?      simple       high         O(h^n)  high       large

Annotations: stiffness; more # steps; circuit dependent; lots Ax=b; one Ax=b with fixed step size in C/h+G

Speed by stiffness (columns Linear: High, Mild, Low; Nonlinear: High, Mild, Low), entries in slide order:
Forward Euler:   slow, fast, slow, fast
Backward Euler:  medium, slow
Trapezoidal:     > Backward Euler
and beyond?:     fast


Matrix Exponential Method (1/2)
Analytical solution of the linearized circuit equation
  – Let A = -C^{-1} G, b = C^{-1} u (C can be regularized [TCAD '12])
Let input be piecewise linear
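The slide's equations did not survive as text. With A = -C^{-1}G and b = C^{-1}u as defined on the slide, the textbook closed form for one step of length h would read as below. The state symbol x(t), the step h, and the exact arrangement of terms are assumptions; this is a sketch of the standard result, not necessarily the slide's notation.

```latex
% linearized system
\frac{dx(t)}{dt} = A\,x(t) + b(t), \qquad A = -C^{-1}G, \quad b(t) = C^{-1}u(t)

% exact solution over one step
x(t+h) = e^{Ah}\,x(t) + \int_0^h e^{A(h-\tau)}\, b(t+\tau)\, d\tau

% piecewise-linear input: b(t+\tau) = b(t) + \tau\,\frac{b(t+h)-b(t)}{h}
x(t+h) = e^{Ah}\,x(t) + \left(e^{Ah}-I\right)A^{-1}\,b(t)
       + \left(e^{Ah}-(Ah+I)\right)A^{-2}\,\frac{b(t+h)-b(t)}{h}
```

The last line contains three products involving e^{Ah}, which is the count a one-exponential reformulation would reduce.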

Matrix Exponential Method (2/2)
One-exponential formulation [Al-Mohy & Higham '11]
  – reduces three matrix exponentials to one
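One common way to fold a piecewise-linear input into a single exponential is to augment the state with two extra variables that generate the ramp. The sketch below follows that idea. The function names and the tiny Taylor-based `expm` are assumptions made to keep the example self-contained, and the slide's own augmented matrix may be arranged differently.

```python
import math

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def expm(M):
    """Dense matrix exponential: scaling and squaring around a Taylor series."""
    n = len(M)
    norm = max(sum(abs(x) for x in row) for row in M)
    s = max(0, math.ceil(math.log2(norm)) + 1) if norm > 0 else 0
    S = [[x / 2 ** s for x in row] for row in M]
    E = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in E]
    for k in range(1, 20):
        term = [[x / k for x in row] for row in matmul(term, S)]
        E = [[e + t for e, t in zip(er, tr)] for er, tr in zip(E, term)]
    for _ in range(s):
        E = matmul(E, E)
    return E

def one_exp_step(A, x0, b0, b1, h):
    """Advance x' = A x + b(t) by h, with b linear from b0 to b1 over the step.

    Augmented system: y = [x, z1, z2], z1' = z2, z2' = 0, z(0) = (0, 1),
    so z1 = t and z2 = 1, and x' = A x + slope * z1 + b0 * z2.
    """
    n = len(A)
    slope = [(p - q) / h for p, q in zip(b1, b0)]
    M = [A[i] + [slope[i], b0[i]] for i in range(n)]
    M.append([0.0] * n + [0.0, 1.0])
    M.append([0.0] * (n + 2))
    E = expm([[h * x for x in row] for row in M])
    y0 = list(x0) + [0.0, 1.0]
    return [sum(E[i][k] * y0[k] for k in range(n + 2)) for i in range(n)]

# Scalar check against the three-term closed form (hypothetical numbers).
a, x_init, b_start, b_end, step = -3.0, 1.0, 2.0, 5.0, 0.4
got = one_exp_step([[a]], [x_init], [b_start], [b_end], step)[0]
print(got)
```

Only one exponential (of a matrix two rows larger than A) is evaluated, and in a Krylov setting it would be applied to a vector and never formed explicitly.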

Advantages
Accuracy: analytical solution
  – approximating e^{Ah} by (I + Ah) gives Forward Euler
  – approximating e^{Ah} by (I - Ah)^{-1} gives Backward Euler
Stability: A-stable for passive circuits
[Figure: comparison against a reference solution]
How to compute e^{A} v?
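The two bullet points can be checked against the series of the exponential. The lines below are a short sketch for the homogeneous case (b = 0), which is an assumption made to keep the comparison to one term.

```latex
e^{Ah} = I + Ah + \tfrac{1}{2}(Ah)^2 + \tfrac{1}{6}(Ah)^3 + \cdots

% forward Euler keeps the first two terms
x(t+h) \approx (I + Ah)\,x(t)

% backward Euler: (I - Ah)^{-1} = I + Ah + (Ah)^2 + \cdots \quad (\|Ah\| < 1)
x(t+h) \approx (I - Ah)^{-1}\,x(t)

% both agree with e^{Ah} through the O(h) term, so each has local error O(h^2)
```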

Computation on Matrix Exponential
19 dubious ways [van Loan03]
Categories based on: Series Method, Rational Approximation, Decomposition, Splitting, Quadrature Rule, Krylov Subspace
[Figure labels: e^{A} vs. e^{A} v; small vs. large; spec(A); regular basis and rational basis]


Krylov Subspace Approximation (1/2)
Krylov subspace K(A, v) = span{v, Av, A^2 v, …, A^{m-1} v}
  – orthogonalized by Arnoldi process (sparse matrix-vector multiplication)
  – approximate e^{Ah} v through the small matrix exponential e^{H_m h} (m is about 10~100)
  – posteriori error estimation [Saad92] (fast error estimation)
Properties: scaling invariant, efficiency, adaptivity
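The steps listed above can be sketched in a few lines: Arnoldi builds an orthonormal basis V_m and a small Hessenberg matrix H_m, and e^{Ah} v is approximated by beta * V_m * e^{h H_m} * e_1 with beta = ||v||. The code is a minimal dense sketch under stated assumptions: the helper names, the tridiagonal test matrix, and the Taylor-based `expm` are illustrative, and a circuit simulator would use sparse storage instead.

```python
import math

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def expm(M):
    """Small dense exponential: scaling and squaring around a Taylor series."""
    n = len(M)
    norm = max(sum(abs(x) for x in row) for row in M)
    s = max(0, math.ceil(math.log2(norm)) + 1) if norm > 0 else 0
    S = [[x / 2 ** s for x in row] for row in M]
    E = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in E]
    for k in range(1, 20):
        term = [[x / k for x in row] for row in matmul(term, S)]
        E = [[e + t for e, t in zip(er, tr)] for er, tr in zip(E, term)]
    for _ in range(s):
        E = matmul(E, E)
    return E

def arnoldi(A, v, m):
    """Return beta = ||v||, basis vectors V (list of rows) and the m x m matrix H."""
    beta = math.sqrt(sum(x * x for x in v))
    V = [[x / beta for x in v]]
    H = [[0.0] * m for _ in range(m)]
    for j in range(m):
        w = matvec(A, V[j])                      # the only use of A
        for i in range(j + 1):                   # modified Gram-Schmidt
            H[i][j] = sum(a * b for a, b in zip(w, V[i]))
            w = [a - H[i][j] * b for a, b in zip(w, V[i])]
        h_next = math.sqrt(sum(x * x for x in w))
        if j + 1 == m or h_next < 1e-12:         # done, or invariant subspace found
            break
        H[j + 1][j] = h_next
        V.append([x / h_next for x in w])
    return beta, V, H

def krylov_expmv(A, v, h, m):
    beta, V, H = arnoldi(A, v, m)
    E = expm([[h * x for x in row] for row in H])
    return [beta * sum(E[i][0] * V[i][k] for i in range(len(V))) for k in range(len(v))]

# Hypothetical test matrix: 1-D RC-ladder-like tridiagonal stencil.
n = 20
A = [[-2.0 if i == j else 1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
v = [1.0] + [0.0] * (n - 1)
approx = krylov_expmv(A, v, h=0.5, m=10)
exact = [row[0] for row in expm([[0.5 * x for x in row] for row in A])]
err = max(abs(a - e) for a, e in zip(approx, exact))
print("max error with m = 10:", err)
```

Changing h only rescales the small matrix H before its exponential is taken, so the basis can be reused across step sizes.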

Krylov Subspace Approximation (2/2)
Stiffness affects step size and dimension
  – Arnoldi process captures extreme and clustered eigenvalues
  – Error bound [Saad92]
  – shrink h or increase m for capturing critical eigenvalues
  – remedied by restarted scheme and scaling effect [ICCAD '12]
[Figure: spectrum on axes Real{hλ}, Image{hλ}, highly stiff, from -λ_max to -λ_min; regions captured by Arnoldi process with a small m vs. the critical part for e^{Ah}]


Rational Krylov Subspace Approximation (1/2)
Rational basis (I - γA)^{-1}
  – K((I - γA)^{-1}, v) = {v, (I - γA)^{-1} v, …, (I - γA)^{-m} v}
Arnoldi process:
    for j = 1, 2, ..., m
        solve (I - γA) w = v_j
        for i = 1, 2, ..., j
            H_{i,j} = w^T v_i
            w = w − H_{i,j} v_i
        end
        H_{j+1,j} = ||w||_2
        v_{j+1} = w / H_{j+1,j}
    end
Notes:
  – the solve is carried out as (C + γG) w = C v_j, which avoids regularization of C
  – one LU for linear circuit
  – the subspace for A uses w = A v_j in place of the solve
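The Arnoldi loop above can be turned into a small runnable sketch. It works on a dense A, solves (I - γA) w = v_j by Gaussian elimination, and maps the projected matrix back with (I - H^{-1}) / γ before exponentiating. All helper names and test values are assumptions. A real implementation would factor (C + γG) once with a sparse LU and reuse it, and would guard against breakdown, which this sketch does not.

```python
import math

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def expm(M):
    """Small dense exponential: scaling and squaring around a Taylor series."""
    n = len(M)
    norm = max(sum(abs(x) for x in row) for row in M)
    s = max(0, math.ceil(math.log2(norm)) + 1) if norm > 0 else 0
    S = [[x / 2 ** s for x in row] for row in M]
    E = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in E]
    for k in range(1, 20):
        term = [[x / k for x in row] for row in matmul(term, S)]
        E = [[e + t for e, t in zip(er, tr)] for er, tr in zip(E, term)]
    for _ in range(s):
        E = matmul(E, E)
    return E

def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [bi] for row, bi in zip(M, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (aug[r][n] - sum(aug[r][k] * x[k] for k in range(r + 1, n))) / aug[r][r]
    return x

def rational_krylov_expmv(A, v, h, m, gamma):
    n = len(v)
    shifted = [[float(i == j) - gamma * A[i][j] for j in range(n)] for i in range(n)]
    beta = math.sqrt(sum(x * x for x in v))
    V = [[x / beta for x in v]]
    H = [[0.0] * m for _ in range(m)]
    for j in range(m):
        w = solve(shifted, V[j])                 # solve (I - gamma*A) w = v_j
        for i in range(j + 1):
            H[i][j] = sum(a * b for a, b in zip(w, V[i]))
            w = [a - H[i][j] * b for a, b in zip(w, V[i])]
        if j + 1 < m:
            h_next = math.sqrt(sum(x * x for x in w))
            H[j + 1][j] = h_next
            V.append([x / h_next for x in w])
    # Project back to A: H approximates (I - gamma*A)^{-1}, so A ~ (I - H^{-1}) / gamma.
    inv_cols = [solve(H, [float(i == k) for i in range(m)]) for k in range(m)]
    Am = [[(float(i == j) - inv_cols[j][i]) / gamma for j in range(m)] for i in range(m)]
    E = expm([[h * x for x in row] for row in Am])
    return [beta * sum(E[i][0] * V[i][k] for i in range(m)) for k in range(n)]

# Hypothetical check: with m = n the subspace is the whole space, so the result is exact.
n = 6
A = [[-2.0 if i == j else 1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
v = [1.0] + [0.0] * (n - 1)
approx = rational_krylov_expmv(A, v, h=2.0, m=n, gamma=0.5)
exact = [row[0] for row in expm([[2.0 * x for x in row] for row in A])]
err = max(abs(a - e) for a, e in zip(approx, exact))
print("max error:", err)
```

The same shifted matrix is used for every j, which is why a single factorization suffices for a linear circuit.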

Rational Krylov Subspace Approximation (1/2)
Approximation of e^{Ah} v
Posteriori error estimation [van den Eshof 06]
  – adaptivity

Rational Krylov Subspace Approximation (2/2)
Spectral transformation
  – similar to preconditioning
  – relax stiffness constraint
  – enable large step size with less dimension
[Figure: spectrum on axes Real{hλ}, Image{hλ}, from -λ_max to -λ_min. Labels: transforming spectrum by (I - γA)^{-1} (λ'_min to λ'_max, small gap, determined by γ, within a unit circle); captured by Arnoldi process (small m is acceptable); projecting back to A by 1/γ (I - H^{-1}) (-λ''_max to -λ''_min, critical part for e^{A}); applying large h to 1/γ (I - H^{-1}) (-hλ''_max to -hλ''_min)]
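The effect of the transformation on a stiff spectrum can be seen by mapping individual eigenvalues. In this sketch an eigenvalue λ of A becomes 1 / (1 - γλ) under (I - γA)^{-1}, and the inverse map mirrors the projection back through (I - H^{-1}) / γ. The eigenvalues and the value of γ are made-up numbers for illustration only.

```python
def shift_invert(lam, gamma):
    """Eigenvalue of (I - gamma*A)^{-1} belonging to eigenvalue lam of A."""
    return 1.0 / (1.0 - gamma * lam)

def map_back(mu, gamma):
    """Inverse map, the scalar analogue of (I - H^{-1}) / gamma."""
    return (1.0 - 1.0 / mu) / gamma

eigs = [-1e6, -1e9, -1e12]     # hypothetical stiff spectrum, all in the left half plane
gamma = 1e-9                   # hypothetical shift

mus = [shift_invert(lam, gamma) for lam in eigs]
spread_before = max(abs(lam) for lam in eigs) / min(abs(lam) for lam in eigs)
spread_after = max(mus) / min(mus)

print("transformed eigenvalues:", mus)          # all inside (0, 1]
print("spread before:", spread_before)          # about 1e6
print("spread after :", spread_after)           # about 1e3
```

Eigenvalues with |λ| much smaller than 1/γ land near 1 and those much larger land near 0, so γ decides which part of the spectrum Arnoldi resolves best.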

Rational Krylov Subspace Approximation (2/2)
[Figure: fix γ, sweep m and h; annotation: small step size]

Rational Krylov Subspace Approximation (2/2)
[Figure: fix h, sweep m and γ; annotations: "γ = ", "large error"]

Wrap Up

Methods          Computation  Scalability  Error   Stability  Step size
Forward Euler    x=Av         high         O(h^2)  low        tiny
Backward Euler   Ax=b         low          O(h^2)  A-stable   medium
Trapezoidal      Ax=b         low          O(h^3)  A-stable   > Backward Euler
Krylov Approx    x=Av         high         O(h^n)  high       medium
Rational Krylov  Ax=b         low          O(h^n)  high       large

Speed by stiffness (columns Linear: High, Mild, Low; Nonlinear: High, Mild, Low), entries in slide order:
Forward Euler:    slow, fast, slow, fast
Backward Euler:   medium, slow
Trapezoidal:      > Backward Euler
Krylov Approx:    slow, fast, slow, medium
Rational Krylov:  fast, slow


Parallelism in Krylov Subspace
Arnoldi process
  – sparse matrix-vector multiplication [Bell & Garland '09]
Exponential of a small matrix [Higham '05]
  – dense matrix by matrix operation
[Figure: work split across thread 1, thread 2, ..., thread n-1, thread n]
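Sparse matrix-vector multiplication parallelizes naturally because each output row depends only on its own nonzeros. The sketch below uses the CSR layout and processes rows in independent chunks. The function names and the chunking scheme are assumptions, and the chunks run sequentially here where a GPU or thread pool would run them concurrently.

```python
def csr_matvec_rows(indptr, indices, data, x, rows):
    """Compute y[r] for the given rows only; no row reads another row's result."""
    return [sum(data[k] * x[indices[k]] for k in range(indptr[r], indptr[r + 1]))
            for r in rows]

def csr_matvec(indptr, indices, data, x, n_workers=2):
    """y = A x for a CSR matrix, split into n_workers independent row chunks."""
    n = len(indptr) - 1
    chunk = max(1, -(-n // n_workers))           # ceiling division, at least 1
    y = []
    for start in range(0, n, chunk):
        rows = range(start, min(start + chunk, n))
        # Each chunk could be handed to its own thread or GPU block.
        y.extend(csr_matvec_rows(indptr, indices, data, x, rows))
    return y

# Hypothetical 3 x 3 matrix [[4, 0, 1], [0, 3, 0], [2, 0, 5]] in CSR form.
indptr = [0, 2, 3, 5]
indices = [0, 2, 1, 0, 2]
data = [4.0, 1.0, 3.0, 2.0, 5.0]
y = csr_matvec(indptr, indices, data, [1.0, 2.0, 3.0])
print(y)
```

Because the Arnoldi loop touches A only through this product, its parallel efficiency carries over to the regular Krylov approximation.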

Input Grouping
Constant slope within a step
[Figure: input 1 and input 2 vs. time with breakpoints t1 ... t15; tiny steps due to maintaining constant slope]

Input Grouping
Constant slope within a step
[Figure: group 1 and group 2 vs. time, each with its own breakpoints t1 ... t8; group 1 on thread 1, group 2 on thread 2]
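The grouping idea can be sketched as follows: sources that share the same breakpoints are put in one group, each group is simulated on its own (relying on superposition for a linear circuit), and no group has to stop at another group's breakpoints. The source names, breakpoint values, and function names are hypothetical.

```python
from collections import defaultdict

def group_inputs(sources):
    """Group source names by their tuple of breakpoint times."""
    groups = defaultdict(list)
    for name, times in sources.items():
        groups[tuple(sorted(times))].append(name)
    return dict(groups)

def steps_needed(breakpoint_sets):
    """Steps required when every breakpoint of every source in the set ends a step."""
    merged = set()
    for times in breakpoint_sets:
        merged.update(times)
    return len(merged) - 1

# Hypothetical piecewise-linear sources, breakpoints in arbitrary time units.
sources = {"i1": [0, 1, 3], "i2": [0, 2, 5], "i3": [0, 1, 3]}

groups = group_inputs(sources)
ungrouped_steps = steps_needed(sources.values())
grouped_steps = [steps_needed([times]) for times in groups]

print(groups)            # two groups: i1 and i3 together, i2 alone
print(ungrouped_steps)   # steps when all inputs are simulated together
print(grouped_steps)     # steps per group, each group can run on its own thread
```

The per-group results would be summed at the end, which is valid only as long as the circuit is linear.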


Settings of Experiments
Environment
  – Implemented in Matlab
  – Intel i7 2.67GHz with 4GB memory
Benchmarks
  – Nonlinear and large-scale circuits
  – Power distribution networks
  – IBM power grid testcases [Nassif 08]

Nonlinear and large-scale circuits:
Design  Category     # R    # C    # Trans.  Size  Stiffness
D1      16bit adder  …      …      …         …     …x10^3
D2      ALU          13.6K  4.3K   6502      10K   5.4x10^6
D3      IO           1.26M  34.6K  …         …K    1.6x10^6
D4      Power grid   10.4M  8.6M   0         12M   2.6x10^5

Stiffness: from generalized eigenvalues of (G, C)

Settings of Experiments
Power distribution networks (RC tanks for PCB and package):
Columns: Design, Area (mm^2), # R, # C, # L, Size, Stiffness
Row entries in slide order:
P…:  …K, 15K, 45.7K, 8.7x10^9
P…:  …K, 228K, 688K, 8.3x10^9
P…:  …M, 0.97M, 2.90M, 1.0x10^10
P…:  …M, 2.47M, 7.40M, 1.0x10^10

Settings of Experiments
IBM power grid testcases [Nassif 08]:
Design   # R    # C   # L  # I   # V   Size  Stiffness
ibmpg2t  245K   36K   330  36K   330   164K  3.5x10^12
ibmpg3t  1.60M  201K  955  201K  955   1M    3.4x10^11
ibmpg4t  1.83M  265K  962  266K  962   1.2M  2.5x10^11
ibmpg5t  1.55M  473K  277  473K  539K  2.1M  4.7x10^11
ibmpg6t  2.41M  761K  281  761K  836K  3.2M  3.8x10^11

Nonlinear and Large-scale Circuits
Matrix exponential method (MEXP)
  – Krylov subspace approximation
  – Restarted scheme and parallel SpMV on GPU
Trapezoidal method (TRAP)
  – same adaptive scheme as MEXP

Design  Size  time   m   TRAP       MEXP-Krylov  speedup
D1      …     …ps    …   …s         408.7s       1.64X
D2      10K   100ps  30  3,085.91s  982.14s      3.14X
D3      630K  100ps  30  8,053.45s  535.92s      15.05X
D4      12M   1ns    20  fails      629.56       n/a

Annotation: Parallel SpMV

Power Distribution Networks
Simulate long time span (1μs) for step response
One LU factorization
  – averaged by forward/backward substitutions
MEXP with rational basis adaptively scales h/γ
TRAP uses predetermined step size

Columns: Design; TRAP (h = 10ps): LU(s), Total; MEXP – Rational (γ = …): LU(s), Total; Speedup
Row entries in slide order:
P…:  …m, …m, 15.73X
P…:  …h, …m, 16.96X
P…:  …h, …h, 17.91X
P…:  …h, …h, 18.08X

Annotation: adaptive & large step size

Power Distribution Networks

IBM Testcases
Widely adopted benchmarks
Many input current sources
Same MEXP with rational basis and TRAP

Columns: Design; TRAP (h = 10ps): LU(s), Total(s); MEXP – Rational (γ = …): LU(s), Total(s); Speedup
Rows: ibmpg2t, ibmpg3t, ibmpg4t, ibmpg5t, ibmpg6t (each row: …X)

Annotation: ill alignment

IBM Testcases

IBM Testcases
Applying simple grouping
  – each group of inputs has the same pivot points
  – 6X speedup on average

Columns: Design; TRAP (h = 10ps): LU(s), Total (s); MEXP – Rational (γ = …): # Group, LU (s), Total (s); Speedup
Rows: ibmpg2t, ibmpg3t, ibmpg4t, ibmpg5t, ibmpg6t (each row: …X)

Conclusions
Emerging challenges in circuit simulation
  – scalability and performance
Matrix exponential method
  – accuracy, adaptivity and stability
  – regular and rational Krylov subspace approximation
Effectiveness of matrix exponential method
  – Simulate a large-scale circuit with 12M nodes
  – Nonlinear circuits: 6.61X speedup on average
  – Impulse response for PDNs: 15X speedup
  – IBM testcases: 6X speedup using input grouping

Future Works
Variant basis in Krylov subspace
  – inverted, extended basis
Model Order Reduction and matrix exponential method
  – both exploit Krylov subspaces
  – apply well-developed MOR techniques to MEXP
Hybrid simulation via matrix exponential
  – handle thermal, mechanical phenomena with FEM

Thank you!

Where are we?
Trade-off between stability and performance
[Figure: methods placed on axes of computational effort (low to high) and stability (low to high): Forward Euler; Backward Euler; Trapezoidal Method (SPICE); Matrix Exponential Method [Weng et al. '11]; SILCA [Li & Shi, '03]; ACES [Devgan & Rohrer, '97]; Telescopic [Dong & Li, '10]; Waveform Relaxation [E. Lelarasmee et al., '82]; Domain Decomposition [K. Sun et al., '07]; LIM [J. E. Schutt-Aine, '01]]
Tailored for circuit simulation:
  – Adaptive step control
  – Scaling effect
  – Nonlinear device
  – Parallelization
ETD in numerical community: [Saad '92] [Ban et al. '11] [Aluffi-Pentini et al. '03] [Hochbruck et al. '97]

Adaptive Step Control
Typical circuit behavior
[Figure labels: larger h, smaller h, error budget]

Adaptive Step Size Strategy
Adjustment of step size
  – Krylov subspace approximation: only H_m needs to be scaled (αA → αH_m) and e^{H_m} recalculated
  – Backward Euler: (C/h + G) changes, so the linear system must be solved again
Strategy:
  – maximize step size with a given error budget Err_total
  – errors come from the Krylov subspace method and linearization
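One simple way to realize "maximize step size with a given error budget" is to start from the largest allowed step and shrink it until an error estimate fits the budget. The sketch below is only an illustration of that loop. The `estimate_error` callable, the shrink factor, and the quadratic error model in the example are assumptions, not the estimator or the control law used in the talk.

```python
def largest_step(estimate_error, h_max, budget, shrink=0.5, h_min=1e-18):
    """Return the first step size, starting from h_max, whose estimate fits the budget.

    estimate_error(h) is assumed to be cheap: in the Krylov setting it would
    reuse the existing H_m and only rescale it, with no new Arnoldi run.
    """
    h = h_max
    while h > h_min and estimate_error(h) > budget:
        h *= shrink
    return h

# Hypothetical error model: estimate grows quadratically with the step size.
chosen = largest_step(lambda h: h * h, h_max=1.0, budget=0.01)
print(chosen)   # 0.0625, since 0.125**2 still exceeds 0.01
```

With backward Euler each trial step would require a new factorization of (C/h + G), which is what makes the same search comparatively expensive there.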

Nonlinear Formulation
Decouple nonlinear and linear components
  – constant during Newton's iteration
  – calculate Jacobian matrix J(F)
  – J(F) in MEXP has fewer non-zeros
  – approximate e^{A} F
[Equations: MEXP and BE formulations]

Rational basis A -1 – K(  A -1, v) = {v,  A -1 v, …,  A -m v} – requires more m and smaller h Only Inverted 48 Image{h } Real{h } after shifted-and-inverted only inverted smaller spectrum -1/ min

Different γ
needs large m

Different γ

Spectral Transformation – h = 10p
Small RC mesh, 100 by 100
Different h for Krylov subspace
Different γ for rational Krylov subspace

Spectral Transformation – h = 10f
Small RC mesh, 100 by 100
Different h for Krylov subspace
Different γ for rational Krylov subspace

Spectral Transformation – γ = 10f
Small RC mesh, 100 by 100
Different h for Krylov subspace
Different γ for rational Krylov subspace

Spectral Transformation – γ = 1p
Small RC mesh, 100 by 100
Different h for Krylov subspace
Different γ for rational Krylov subspace

Spectral Transformation – γ = 100p
Small RC mesh, 100 by 100
Different h for Krylov subspace
Different γ for rational Krylov subspace

Sweep γ for Large Range

Sweep γ for Large Range

Difference Between Inverted and Rational

Fixed γ = 1p, sweep time step h

Fixed γ = 1n, sweep time step h

Fixed γ = 1u, sweep time step h

Fixed γ = 1m, sweep time step h

Fixed γ = 1, sweep time step h

Fixed γ = 1k, sweep time step h

Fixed γ = 1M, sweep time step h