Physical Synthesis Ing. Pullini Antonio

Physical Synthesis Ing. Pullini Antonio antonio.pullini@epfl.ch

Outline: Delay of Interconnects; Scaling Trends; WLM and their limits; Physical Synthesis

Interconnects

(Figure: wire cross-sections in the past, present and future.) Wire resistance and metal migration force lower resistance and therefore ‘taller’ geometry. Capacitance couples to neighbors. Total capacitance does not get smaller!

Early Models. Wire width relative to feature size: older technology had wide wires. More cross-section area implies less resistance and more capacitance, so the wire is modeled with capacitance only. (Figure: wire of length L, height H and width W.)

However... With scaling, the width of the wire is reduced and its resistance is no longer negligible. A lumped RC model is a good enough approximation. (Figure: wire of length L, height H and width W.)

Interconnect Resistance. The resistance of a wire is proportional to the wire length (L) and inversely proportional to its cross-section (HW): $R = \rho L / (H W)$. The resistivity $\rho$ is a property of the material. (Figure: wire of length L, height H and width W.)
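The proportionality described on this slide can be checked numerically. The sketch below is only an illustration: the function name, the aluminium resistivity value and the wire dimensions are assumptions, not numbers from the slides.

```python
def wire_resistance(rho, length, height, width):
    """Return R = rho * L / (H * W), with all quantities in SI units."""
    if height <= 0 or width <= 0:
        raise ValueError("height and width must be positive")
    return rho * length / (height * width)


# Assumed value: approximate bulk resistivity of aluminium, in ohm*m.
RHO_AL = 2.7e-8

# Hypothetical wire: 1 mm long, 0.5 um tall, 0.5 um wide.
r_narrow = wire_resistance(RHO_AL, 1e-3, 0.5e-6, 0.5e-6)
# Doubling the height (a "taller" geometry) halves the resistance.
r_tall = wire_resistance(RHO_AL, 1e-3, 1.0e-6, 0.5e-6)

print(f"{r_narrow:.1f} ohm vs {r_tall:.1f} ohm")
```

Doubling the height while keeping the width fixed is the kind of ‘taller’ geometry the Interconnects slide mentions: resistance drops without widening the wire.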

Interconnect Capacitance. Capacitance of a wire = f(shape, distance to surrounding wires, distance to the substrate). Estimating capacitance is a matter of determining where the field lines go. To get an accurate estimate, 2D or 3D electric field solvers are used, e.g. FastCap or Raphael.

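Area Capacitance. (Figure: wire of width W, length L and height H carrying a current, separated from the substrate by a dielectric of thickness t_di, with the electric field lines between wire and substrate.)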
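The slide gives only the geometry. Assuming the standard parallel-plate expression C = ε_di · W · L / t_di applies to this picture, a minimal sketch of the area capacitance could look as follows. The function name, the SiO2 permittivity and the dimensions are illustrative assumptions.

```python
EPS0 = 8.854e-12  # vacuum permittivity in F/m


def area_capacitance(eps_r, width, length, t_di):
    """Parallel-plate estimate C = eps_r * eps0 * W * L / t_di (SI units).

    Fringing fields and coupling to neighboring wires are ignored.
    """
    if t_di <= 0:
        raise ValueError("dielectric thickness must be positive")
    return eps_r * EPS0 * width * length / t_di


# Assumed value: relative permittivity of SiO2.
EPS_R_SIO2 = 3.9

# Hypothetical wire: 1 um wide, 1 mm long, 1 um above the substrate.
c_area = area_capacitance(EPS_R_SIO2, 1e-6, 1e-3, 1e-6)
print(f"{c_area * 1e15:.1f} fF")
```

This estimate ignores the sidewalls of the wire, so it can only underestimate the capacitance of a narrow, tall wire.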

Fringing Capacitance. (Figure: conductor cross-section of height H with its fringing fields; labels H, w, and w ≈ W − H/2.)

Today's Interconnect

Timing Models. The older lumped RC model is no longer valid: at today's frequencies the wavelength is comparable to the wire length, so the interconnect must be considered as a distributed system.

Elmore Delay. Elmore analyzed the distributed model and came up with the figures for delay. (Figure: RC ladder from V_in to V_out with nodes 1 … N; segment i has series resistance R_i and node capacitance C_i.)

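Wire Model. Assume the wire is modeled by N equal-length segments. With r and c the resistance and capacitance per unit length and L the wire length: $\tau_N = (L/N)^2 (rc + 2rc + \dots + Nrc) = rcL^2 \, \frac{N(N+1)}{2N^2}$. For large values of N: $\tau_N \to rcL^2/2 = RC/2$, where $R = rL$ and $C = cL$.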
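The Elmore sum for an RC ladder and the N-segment wire model can be sketched in a few lines. This assumes the usual Elmore expression for a chain, τ = Σ_i C_i (R_1 + … + R_i). The function names are illustrative and not taken from the slides.

```python
def elmore_delay_chain(resistances, capacitances):
    """Elmore delay at the last node of an RC ladder.

    tau = sum over i of C_i * (R_1 + ... + R_i), where R_i is the series
    resistance of segment i and C_i the capacitance at node i.
    """
    if len(resistances) != len(capacitances):
        raise ValueError("need one capacitance per resistance")
    tau = 0.0
    upstream_r = 0.0
    for r_i, c_i in zip(resistances, capacitances):
        upstream_r += r_i          # total resistance between source and node i
        tau += upstream_r * c_i    # node i charges through that resistance
    return tau


def uniform_wire_delay(r_total, c_total, n):
    """Elmore delay of a wire split into n equal segments."""
    seg_r = r_total / n
    seg_c = c_total / n
    return elmore_delay_chain([seg_r] * n, [seg_c] * n)


# With R = C = 1, one lumped segment gives RC, many segments approach RC/2.
for n in (1, 2, 10, 1000):
    print(n, uniform_wire_delay(1.0, 1.0, n))
```

A single lumped segment overestimates the delay of a distributed wire by a factor of two, which is why the segmented model matters for long wires.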

Ideal Scaling
Ideal process scaling:
– Device geometries shrink by $s$ ($s = 0.7\times$); device delay shrinks by $s$
– Wire geometries shrink by $s$, but $h$ shrinks less
R/µ: $\rho / (w s \cdot h) = r/s$
Cc/µ: $h \cdot \varepsilon / (S s) = c_c/s$
C/µ: $\varepsilon \, (w s)(l s) / (t_{di} s)$
R/µ rises, C/µ scales, Cc/µ rises.
(Figure: transistor with terminals S, G, D and wires of height h, width w, length l and spacing S, shown before and after scaling.)

Interconnect Role
Short interconnect
– Used to connect nearby cells, R_driver >> R_interconnect
– Minimize wire C, i.e., use short min-width wires
Medium to long-distance (“global”) interconnect
– R_driver ≈ R_interconnect
– Size wires to trade off area vs. delay
– Increasing width ⇒ capacitance increases, resistance decreases
– Need to find an acceptable tradeoff: the wire sizing problem
“Fat” wires
– Thicker cross-sections in higher metal layers
– Useful for reducing delays for global wires
– Inductance issues, sharing of a limited resource
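The wire sizing tradeoff can be illustrated with a first-order delay model. The sketch assumes a driver resistance in series with a wire whose resistance falls as 1/W and whose capacitance has an area term growing with W plus a fixed fringe term. The model and all parameter values are hypothetical and chosen only to show the shape of the tradeoff.

```python
import math


def wire_delay(width, length, r_drv, c_load, r_sheet, c_area, c_fringe):
    """First-order delay of driver + wire + load (SI units).

    r_sheet  : sheet resistance in ohm/square
    c_area   : area capacitance in F/m^2
    c_fringe : fringe capacitance in F/m (taken as independent of width)
    """
    r_wire = r_sheet * length / width
    c_wire = (c_area * width + c_fringe) * length
    # The driver sees all capacitance; the wire resistance sees half of the
    # wire capacitance (distributed RC) plus the full load.
    return r_drv * (c_wire + c_load) + r_wire * (c_wire / 2 + c_load)


# Hypothetical 5 mm global wire; every value below is an assumption.
PARAMS = dict(length=5e-3, r_drv=100.0, c_load=50e-15,
              r_sheet=0.07, c_area=3e-5, c_fringe=8e-11)

# Sweep widths from 0.2 um to 10 um in 0.1 um steps.
widths = [i * 0.1e-6 for i in range(2, 101)]
best_width = min(widths, key=lambda w: wire_delay(w, **PARAMS))

# Setting d(delay)/dW = 0 in this model gives a closed-form optimum.
analytic_width = math.sqrt(
    PARAMS["r_sheet"] * (PARAMS["c_fringe"] * PARAMS["length"] / 2 + PARAMS["c_load"])
    / (PARAMS["r_drv"] * PARAMS["c_area"])
)

print(f"sweep: {best_width * 1e6:.1f} um, analytic: {analytic_width * 1e6:.2f} um")
```

In this model a wire that is too narrow is dominated by its own resistance and one that is too wide loads the driver with capacitance, so the delay has a minimum in between.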

Block Scaling. Block area often stays the same while the number of cells and nets doubles. Global interconnect lengths don’t shrink; local interconnect lengths shrink.

Microprocessor Interconnect. Local interconnect: S_Local = S_Technology. Global interconnect: S_Global = S_Die. (Source: Intel)

Interconnect Delay Scaling
Delay of a wire of length $l$ (first order):
– $\tau_{int} = (rl)(cl) = rcl^2$
Local interconnects, with scaling factor $s = 0.7$:
– $\tau_{int} = (r/s)(c)(ls)^2 = rcl^2 s$
– Local interconnect delay shrinks by $s$ at best (compare to faster devices)
Global interconnects:
– $\tau_{int} = (r/s)(c/s)(l)^2 = rcl^2 / s^2$
– Global interconnect delay doubles ($1/s^2 \approx 2$ for $s = 0.7$): unsustainable!
– Problem somewhat mitigated using buffers
Interconnect delay is increasingly dominant
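The scaling argument is easy to check numerically. The sketch assumes the slide's formulas read as (r/s)(c)(l·s)² for local wires and (r/s)(c/s)(l)² for global wires, with s = 0.7 per generation. The function name is illustrative.

```python
def delay_scaling(s):
    """Return (local, global) wire delay after one generation, relative to rcl^2.

    Assumed reading of the slide:
      local : r -> r/s, c unchanged, l -> l*s
      global: r -> r/s, c -> c/s,   l unchanged
    """
    local_factor = (1 / s) * 1.0 * s ** 2
    global_factor = (1 / s) * (1 / s) * 1.0
    return local_factor, global_factor


local_factor, global_factor = delay_scaling(0.7)
print(f"local x{local_factor:.2f}, global x{global_factor:.2f}")
```

With s = 0.7 the global factor is 1/0.49, which is where the "doubles" on the slide comes from.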

Delay with technology scaling

Old Style Design Flow

Problems
WLMs are statistical
– Inaccurate
RTL is written without knowledge of physical hierarchy
– Top level nets
– Pins on physical blocks
– Obstruction, blockages and power straps
– RAM, Macro locations
Constraints are estimated

More issues... Handoff: the synthesized netlist is thrown “over the wall” to the vendor/foundry, and it can take weeks to get results.

Wire Load Models (WLM). With traditional flows, all nets with the same fanout have the same estimated interconnect delay during front-end design.

Estimated vs. Real After placement, it is obvious that nets with the same fanout will not have the same interconnect delay
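The gap between a fanout-based estimate and a placement-based one can be sketched as follows. The wire load table, the capacitance per micron and the pin coordinates are made-up values for illustration only, and half-perimeter wirelength (HPWL) stands in for the real routed length.

```python
# Hypothetical wire load table: fanout -> estimated net capacitance in pF.
WLM_CAP_BY_FANOUT = {1: 0.010, 2: 0.018, 3: 0.025, 4: 0.032}

# Assumed wire capacitance per micron of length, in pF/um.
CAP_PER_UM = 0.0002


def wlm_cap(fanout):
    """Wire load model estimate: depends on fanout only."""
    largest = max(WLM_CAP_BY_FANOUT)
    return WLM_CAP_BY_FANOUT[min(fanout, largest)]


def hpwl(pins):
    """Half-perimeter wirelength of the bounding box of the pins, in um."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def placed_cap(pins):
    """Placement-based estimate: proportional to the net's bounding box."""
    return hpwl(pins) * CAP_PER_UM


# Two nets with the same fanout (one driver, two sinks), placed differently.
near_net = [(0, 0), (10, 0), (10, 10)]
far_net = [(0, 0), (500, 0), (500, 400)]

for name, net in (("near", near_net), ("far", far_net)):
    fanout = len(net) - 1
    print(name, wlm_cap(fanout), round(placed_cap(net), 4))
```

Both nets get the same wire load estimate, while the placement-based numbers differ by more than an order of magnitude.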

Placement Issues
Place/Route
– Congestion due to flat placement
– Large number of ECOs perturbs design
Good flat placement can be destroyed by a large ECO
Convergence becomes a moving target

More problems with loops...
Feedback to Synthesis
– Huge SDF and set_load files
Synthesis
– On the critical path, if synthesis needs to “rip up” the path, new nets may be based on CWLM
– Capacity and runtime: throwing a huge (full chip) problem to synthesis

Towards timing closure

Where is the problem? Timing after routing is only slightly different from timing after placement. The inaccuracy lies between synthesis and placement.

Possible solutions?
Get Physical Information Earlier
– Understand placement of macros, RAMs, power, etc., and better constraints for synthesis
Hierarchical Logical/Physical Design
– Can isolate timing problems more efficiently, ECOs can be done with minimal impact
Timing Predictability and Analysis
– Can understand impact of design tradeoffs
Put Synthesis and Placement Together
– Eliminates the manual iteration loop altogether

Need for Physical Synthesis

Outline: High End ASIC Design
– Reference point: “typical” design
– Use of hierarchy in design
– Handoff from logic design to physical design
– Physical tie-ins to synthesis
– Noise problems

Typical ASIC Chip
Borderline “SOC”
– Video graphics chip
– Network interface/router chip
– 0.18 µm technology, 6-8 layers
Design
– 5 large blocks, each with ~12 RAMs
– 5K pins
– 250K instances
– 5 global clocks, 200 derived/gated clocks
– 27K lines of timing constraints, e.g.
  set_output_delay 4000 -clock Clk1 -rise -min -add_delay [get_ports {MemWriteBus[3]}]
  set_false_path -setup -rise -from [get_pins {GR_FE_STAGE1_CNTRL_MISC24BIT_REG_1_16A/CK}]
Care-abouts
– Correctness
– Timing convergence
– On-time delivery

ASIC Flow: Got Hierarchy?
All chips have hard blocks in them
Percent of design starts that are hierarchical increases yearly
– methodology: insulates sub-projects
– tool reasons: capacity
– IP reasons: may not have control over some blocks
High-end ASIC chips are “hierarchical” and have soft blocks that are placed independently

Partitioning. Partition the design so that top-level logic modules = physical clusters. Physical sub-blocks are flattened in floorplanning.

Logic to Physical Flow (simplified)
Input: RTL
– Chip RTL Planning: Synthesis1, Floorplan Generation, Chip Level Time Budgeting
– Block Implementation: Synthesis2 & Placement, Block routing
– Chip Integration: Chip Timing Closure (Pins, Buffers, Global Routing)
– Finalize: Top Level Routing, Extraction, Address (or ignore) Signal Integrity Issues, LVS+DRC
Output: GDSII
(Figure labels: Design Planner, Synthesis/Place, Chip Assembly, Chip Finishing.)

Black Box from Initial RTL model

Tying Logical to Physical Blocks

Getting More Physical

Realistic Timing Numbers after Block Level Design

Key to Timing Closure: Time Budgeting. Applied at many levels: chip, block, path... Start after entities are identified, not necessarily defined. (Figure labels: Constraints, Parasitics.)
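One simple way to picture time budgeting for a path that crosses several blocks is to split the clock period in proportion to early delay estimates. The sketch below is only an assumption about how a budget might be derived. The function name, the segment names and the numbers are hypothetical, and real budgeting tools use more elaborate schemes.

```python
def budget_path(clock_period, estimated_delays):
    """Split a clock period across path segments in proportion to estimates.

    estimated_delays maps a segment name to its estimated delay in ns.
    Returns a dict with the time budget allotted to each segment.
    """
    total = sum(estimated_delays.values())
    if total <= 0:
        raise ValueError("estimated delays must sum to a positive value")
    scale = clock_period / total
    return {segment: delay * scale for segment, delay in estimated_delays.items()}


# Hypothetical path: logic in block A, a top-level route, logic in block B.
estimates = {"block_A": 3.0, "top_route": 1.0, "block_B": 4.0}
budgets = budget_path(10.0, estimates)

for segment, budget in budgets.items():
    print(f"{segment}: {budget:.2f} ns")
```

Because only estimates are needed, a budget like this can be produced as soon as the blocks are identified, before their contents are fully defined.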

Block Implementation: Core of Physical Synthesis
Physical Synthesis
– turn generic logic gates with no placement into
– optimized gates with detailed placement
Requires context from chip:
– shape
– obstructions
– pin positions
– timing constraints, etc.

Physical Synthesis Flow: RTL to Placed Netlist

Timing Effect of Physical Synthesis. (Figure with four panels: After Normal P&R; After Normal P&R + Post Optimization; After PhysicalCompiler Placement; After Wroute with PhysicalCompiler Placement. Axis labels: Positive, Negative. Design: 700K gates, 0.25 µm, 10 ns, 32 RAMs.)

Physical Synthesis Details

Timing Calculation. Calculations use pin-to-pin Steiner routing. Each net is calculated individually. No wireload models are used.
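A per-net length estimate from pin positions can be sketched without a full Steiner router. The example below does not build a Steiner tree. It brackets the Steiner length between the half-perimeter wirelength (a lower bound) and a rectilinear minimum spanning tree (an upper bound). The function names and pin coordinates are illustrative assumptions.

```python
def manhattan(a, b):
    """Rectilinear distance between two (x, y) points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def hpwl(pins):
    """Half-perimeter of the pins' bounding box: lower bound on tree length."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def rectilinear_mst_length(pins):
    """Length of a rectilinear minimum spanning tree (Prim's algorithm).

    An upper bound on the rectilinear Steiner tree length of the net.
    """
    if len(pins) < 2:
        return 0
    # Cheapest known connection from the growing tree to each outside pin.
    best = {i: manhattan(pins[0], pins[i]) for i in range(1, len(pins))}
    total = 0
    while best:
        nxt = min(best, key=best.get)
        total += best.pop(nxt)
        for i in best:
            best[i] = min(best[i], manhattan(pins[nxt], pins[i]))
    return total


# Hypothetical 4-pin net arranged as a cross: a Steiner point helps here.
cross_net = [(0, 5), (10, 5), (5, 0), (5, 10)]
print(hpwl(cross_net), rectilinear_mst_length(cross_net))
```

For this net the bounds are 20 and 30. A Steiner tree with one extra point at (5, 5) achieves 20, which shows why a per-net Steiner estimate can beat a spanning tree.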

Timing Calculation (Cont.)

Integration of Physical Synthesis: Linking with Test. Path reordering can be chosen to reduce wiring congestion.
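Assuming "path" here refers to the scan path through placed flip-flops, reordering by proximity can be sketched with a greedy nearest-neighbor pass. This heuristic is an illustrative stand-in, not the method used by any particular tool. The flip-flop names and coordinates are hypothetical.

```python
def manhattan(a, b):
    """Rectilinear distance between two (x, y) points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def chain_length(order, positions):
    """Total wire length of a chain visiting the cells in the given order."""
    return sum(manhattan(positions[a], positions[b])
               for a, b in zip(order, order[1:]))


def reorder_nearest_neighbor(start, positions):
    """Greedy reordering: always hop to the closest cell not yet in the chain."""
    remaining = set(positions) - {start}
    order = [start]
    while remaining:
        last = positions[order[-1]]
        # sorted() makes tie-breaking deterministic.
        nxt = min(sorted(remaining), key=lambda n: manhattan(last, positions[n]))
        order.append(nxt)
        remaining.remove(nxt)
    return order


# Hypothetical placed flip-flops; the original chain order zigzags.
positions = {"ff0": (0, 0), "ff1": (100, 0), "ff2": (10, 0), "ff3": (90, 0)}
original_order = ["ff0", "ff1", "ff2", "ff3"]
new_order = reorder_nearest_neighbor("ff0", positions)

print(chain_length(original_order, positions), chain_length(new_order, positions))
```

Shorter chain wiring leaves more routing resource for functional nets, which is the congestion benefit the slide points to.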

Physical Synthesis & Power Analysis

Physical Synthesis & Power Optimization. (Figure: two versions of a design, labeled Power Consumed = 2.4 W and Power Consumed = 1.2 W.)
– Gates can be resynthesized based on wire length
– Gated clocks can be inserted by proximity

Assembly of Blocks into chip

Chip Level Timing Closure. (Figure: Block A, Block B and I/O.)
