Lecture 14: Timing Analysis and Timed Simulation


UCSD ECE 111, Prof. Farinaz Koushanfar, Fall 2017
Some slides are courtesy of Prof. Patrick Schaumont.

Topics
- Static and Dynamic Timing Analysis
- Static Timing Analysis
  - Delay Model
  - Path Delay
  - False Paths
  - Timing Constraints
  - FPGA Design Flow & STA: Sorter Example
- Dynamic Timing Analysis
  - FPGA Design Flow & Timed Simulation
  - Delay back-annotation with SDF files
- Demonstration

Static and Dynamic Timing Analysis
[Figure: digital synchronous design with Input, Output, and clock fCLK]
Suppose that we want to automatically find the maximum clock frequency of a given design written in Verilog and synthesized to a given technology.
- Approach 1: Static Timing Analysis = use analysis techniques on the netlist to determine fclk,max
- Approach 2: Dynamic Timing Analysis = use simulation and selected input stimuli to measure the slowest path

Static and Dynamic Timing Analysis
Static Timing Analysis seems ideal, since it is done by a design tool that starts from the netlist and runs automatically.
However, Static Timing Analysis may be pessimistic and report delays that will never occur during operation.
Dynamic Timing Analysis is attractive because we can combine it with the simulations that we already run for functional verification.
However, Dynamic Timing Analysis may be too optimistic, in particular if we do not simulate the worst possible case (which may be hard to predict in a complex netlist).
In this lecture, we put the two techniques side by side.

Static Timing Analysis
STA looks for the worst of the following delays in a chip:
- Delay from input to register
- Delay from register to register
- Delay from register to output
[Figure: IN, three Combinational Logic blocks, OUT]

Static Timing Analysis
STA relies on a model similar to the one we discussed before.
[Figure: Combinational Logic between flip-flops clocked by CLK]
Tclk,min = Tclk->Q + Tlogic + Trouting + Tsetup
- Tclk->Q: property of the component (flip-flop)
- Tlogic + Trouting: property of the netlist
- Tsetup: property of the component (flip-flop)
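As a quick worked instance of the formula above, here is the calculation with made-up numbers. All four delay values are assumptions chosen for illustration; they are not taken from the lecture.

```latex
% Assumed values: T_clk->Q = 0.5 ns, T_logic = 6 ns, T_routing = 3 ns, T_setup = 0.5 ns
\begin{align*}
T_{clk,min} &= T_{clk \to Q} + T_{logic} + T_{routing} + T_{setup} \\
            &= 0.5 + 6.0 + 3.0 + 0.5 = 10\ \text{ns} \\
f_{clk,max} &= \frac{1}{T_{clk,min}} = \frac{1}{10\ \text{ns}} = 100\ \text{MHz}
\end{align*}
```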

Static Timing Analysis
STA also takes delay on the clock connection into account. This is called clock skew. The effect of clock skew is highly dependent on circuit topology.
[Figure: Combinational Logic between flip-flops, with delay Tskew on the CLK connection]
Tclk,min = Tclk->Q + Tlogic + Trouting + Tsetup - Tskew
Here, skew works 'in favor' of us.

Static Timing Analysis
STA also takes delay on the clock connection into account. This is called clock skew. The effect of clock skew is highly dependent on circuit topology.
[Figure: Combinational Logic between flip-flops, with delay Tskew on the CLK connection]
Tclk,min = Tclk->Q + Tlogic + Trouting + Tsetup + Tskew
Here, skew works 'against' us.
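To make the sign of the skew term concrete, the two cases can be compared on one set of numbers. The values are assumptions for illustration: a skew-free minimum period of 10 ns and a clock skew of 0.5 ns. In setup analysis, the favorable case is the one where the clock reaches the capturing flip-flop later than the launching flip-flop.

```latex
% Assumed values: T_clk->Q + T_logic + T_routing + T_setup = 10 ns, T_skew = 0.5 ns
\begin{align*}
\text{skew in our favor:} \quad T_{clk,min} &= 10 - 0.5 = 9.5\ \text{ns} \\
\text{skew against us:}   \quad T_{clk,min} &= 10 + 0.5 = 10.5\ \text{ns}
\end{align*}
```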

Static Timing Analysis
STA also takes delay on the clock connection into account. This is called clock skew.
The effect of clock skew is highly dependent on the circuit topology and on the physical layout of the chip, including the placement of components and the routing of wires.
The best clock skew is 0. Chips distribute the clock signal so that it arrives at every flip-flop at, as nearly as possible, the same moment.

Static Timing Analysis
So, once the technology is chosen, STA needs to find the longest combinational path, consisting of Tlogic + Trouting.
Path delay = delay from an input to an output.
M inputs, N outputs -> up to M x N path delays.
[Figure: Combinational Logic block with path delays of 10 ns, 15 ns, 12 ns, and 9 ns]
Worst-case delay = 15 ns

Delay of a single path
Path delay = Intrinsic Gate Delay + Fan-out Delay + Wire Delay
Intrinsic delay: the delay of a gate when its output is not connected. In this example, the intrinsic delay of each gate is 1 ns.
[Figure: path from In to Out; each gate has 1 ns intrinsic delay]

Delay of a path
Gate Fan-out Delay: delay caused by the input capacitance (Ci) of the gates in the fan-out.
Each gate input represents a unit load.
[Figure: gate with 1 ns intrinsic delay on the path from In to Out, driving gate inputs Ci; fan-out = 4]

Static Timing Analysis
Gate Fan-out Delay: Ci = capacitive load of a CMOS input.
The fan-out delay is an RC delay:
- R ~ drive capability of the inverter output stage
- C ~ loading of the output
[Figure: gate (1 ns) driving 1 load Ci; plot of Vinput versus t, reaching Vi,H after the fan-out delay]

Static Timing Analysis
Gate Fan-out Delay: approximated by N (fan-out) x D (delay per fan-out).
E.g., if a gate has 0.5 ns delay per fan-out, then driving 3 gates costs 1.5 ns of delay.
[Figure: gate (1 ns) driving 1 load versus 3 loads Ci; plot of Vinput versus t, where the 3-load curve reaches Vi,H after a longer delay]

Static Timing Analysis
The path shown has 1 ns + 2 ns + 1 ns = 4 ns delay, where the 2 ns term is the fan-out delay (4 x 0.5 ns).
After placement and routing have completed, additional correction factors can be included (delay of wires).
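The delay model of the preceding slides can be written as one sum over the gates on a path. The example numbers below are assumptions for illustration (three gates with 1 ns intrinsic delay each, fan-outs of 2, 3 and 1, 0.5 ns per fan-out, and no wire delay yet because placement is not known); they do not describe the path in the slide's figure.

```latex
% N_i = fan-out of gate i, D = delay per fan-out (assumed 0.5 ns)
\begin{align*}
T_{path} &= \sum_{i} \left( T_{intrinsic,i} + N_i \cdot D \right) + T_{wire} \\
         &= (1 + 2 \cdot 0.5) + (1 + 3 \cdot 0.5) + (1 + 1 \cdot 0.5) + 0 \\
         &= 2.0 + 2.5 + 1.5 = 6\ \text{ns}
\end{align*}
```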

Static Timing Analysis
STA may find paths that are never activated during operation. Such paths are called false paths.
[Figure: circuit with a Select signal and two Comb Logic blocks]

Static Timing Analysis
STA may find paths that are never activated during operation. Such paths are called false paths.
Designers must indicate such false paths to the STA tools (using constraints).
[Figure: circuit with a Select signal and two Logic blocks; the longest path is marked in red]
Even though the red path has the longest delay, it will never be triggered in practice. This is a false path.
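One common way a false path arises is when two multiplexers share the same select signal. The sketch below is a hypothetical illustration (module name, signal names, and the adders standing in for slow logic are all assumptions, not the circuit from the slide).

```verilog
// Hypothetical false-path illustration; names are not from the lecture.
module false_path_demo(y, a, sel);
  output [3:0] y;
  input  [3:0] a;
  input        sel;

  wire [3:0] slow1 = a + 4'd3;          // stands in for a slow logic block
  wire [3:0] m1    = sel ? slow1 : a;   // mux 1 takes the slow branch when sel = 1
  wire [3:0] slow2 = m1 + 4'd5;         // second slow logic block
  assign     y     = sel ? m1 : slow2;  // mux 2 takes the slow branch when sel = 0
endmodule
```

Structurally, the longest path runs a -> slow1 -> m1 -> slow2 -> y. That path needs sel = 1 at the first multiplexer and sel = 0 at the second, so no input can activate it. A purely topological STA would still report it unless the designer marks it as false.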

Timing Constraints
- Timing constraints are additional directives provided by the designer to the synthesis tools.
- Timing constraints are used by STA to verify that the timing of the resulting circuit is OK: clock period, setup time of flip-flops, etc.
- Timing constraints are used by the Place & Route tools to decide how components should be mapped into a chip.
- The simplest timing constraint is the clock period of the clock signal. Knowing this value, STA can evaluate the slack for each path:
  slack = Tclk - Tclk,min
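A short worked example of the slack definition, using assumed numbers (a path with a minimum period of 10 ns checked against two different clock constraints). Positive slack means the constraint is met with margin; negative slack is reported as a timing error.

```latex
% Assumed value: T_clk,min = 10 ns for the path under analysis
\begin{align*}
T_{clk} = 20\ \text{ns}: \quad \text{slack} &= 20 - 10 = +10\ \text{ns} \quad \text{(constraint met)} \\
T_{clk} = 8\ \text{ns}:  \quad \text{slack} &= 8 - 10 = -2\ \text{ns} \quad \text{(timing error)}
\end{align*}
```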

FPGA Design Flow and Static Timing Analysis
RTL Verilog + Constraints (ucf): pinout, clock period
   No timing information known (apart from cycle boundaries)
-> Synthesis (+translate) -> Mapper
   Timing estimated based on components only (only fan-out and intrinsic delay)
-> Place and Route
   Timing estimated based on final netlist (fan-out, intrinsic, and wire delay)
-> Bitstream Generation

STA on Design Example
[Figure: inputs a1..a4 -> registers -> sort4_stage -> sort4_stage -> registers -> outputs q1..q4. Each sort4_stage contains three sort2 cells; a sort2 cell takes a and b and produces min(a,b) and max(a,b).]

STA on Design Example
This design has many equally bad long paths. As long as we hit 2 sort2 cells per sort4_stage, we are on a long path.
[Figure: the sorter block diagram (registers, two sort4_stage blocks, registers) with some example long paths marked]

STA on Design Example
We will capture this circuit in Verilog, synthesize it, and evaluate the timing with STA. We need:
- sort2
- sort4_stage
- register stage
- top cell

Sort2 cell

    module sorter2(qa, qb, a, b);   // instantiated as sorter2 in sort4_stage
      output [3:0] qa, qb;
      input  [3:0] a, b;
      assign qa = (a > b) ? a : b;  // larger of the two inputs
      assign qb = (a > b) ? b : a;  // smaller of the two inputs
    endmodule

Sort4_stage

    module sort4_stage(q1, q2, q3, q4, a1, a2, a3, a4);
      output [3:0] q1, q2, q3, q4;
      input  [3:0] a1, a2, a3, a4;
      wire   [3:0] w1, w2, w3, w4;
      sorter2 S1 (w1, w2, a1, a2);
      sorter2 S2 (w3, w4, a3, a4);
      sorter2 S3 (q2, q3, w2, w3);  // middle cell compares one output of S1 with one of S2
      assign q1 = w1;
      assign q4 = w4;
    endmodule

[Figure: sort4_stage built from three sort2 cells, with inputs a1..a4]

Register stage

    module regblock(q1, q2, q3, q4, a1, a2, a3, a4, clk);
      output [3:0] q1, q2, q3, q4;
      reg    [3:0] q1, q2, q3, q4;
      input  [3:0] a1, a2, a3, a4;
      input        clk;
      always @(posedge clk) begin
        q1 <= a1;
        q2 <= a2;
        q3 <= a3;
        q4 <= a4;
      end
    endmodule

Toplevel

    module fullsorter(q1, q2, q3, q4, a1, a2, a3, a4, clk);
      output [3:0] q1, q2, q3, q4;
      input  [3:0] a1, a2, a3, a4;
      input        clk;
      wire   [3:0] w1, w2, w3, w4;
      wire   [3:0] v1, v2, v3, v4;
      wire   [3:0] u1, u2, u3, u4;
      regblock    R0(w1, w2, w3, w4, a1, a2, a3, a4, clk);
      sort4_stage S0(v1, v2, v3, v4, w1, w2, w3, w4);
      sort4_stage S1(u1, u2, u3, u4, v1, v2, v3, v4);
      regblock    R1(q1, q2, q3, q4, u1, u2, u3, u4, clk);
    endmodule

Timing Constraints
We will only specify the period of the clock signal. This is done by adding a 'PERIOD' constraint:

    NET "clk" PERIOD = 20 ns HIGH 50 %;
    NET "clk" TNM_NET = "clk";

Inputs to Synthesis (+translate): sorter.v (sort2, sort4_stage, register stage, toplevel) + sorter.ucf

Synthesis, Translate, Map
Next, we synthesize the circuit.
After MAP, the Verilog has been converted into a netlist of FPGA components: LUTs, flip-flops, buffers, etc.
You can look at the netlist with the 'Technology Schematic'.
The MAP step can also run static timing analysis on the netlist.
There is a separate interactive timing analysis tool.

Timing Analyzer
[Screenshot: Timing Analyzer, showing the delay summary and the nets + components in the longest path]

Analysis of the 'PERIOD' constraint

    ============================================================================
    Timing constraint: NET "clk_BUFGP/IBUFG" PERIOD = 20 ns HIGH 50%;
    30002 items analyzed, 0 timing errors detected. (0 setup errors, 0 hold errors)
    Minimum period is 9.739ns.
    ----------------------------------------------------------------------------
    Slack:              10.261ns (requirement - (data path - clock path skew + uncertainty))
    Source:             R0/q4_3 (FF)
    Destination:        R1/q3_2 (FF)
    Requirement:        20.000ns
    Data Path Delay:    9.739ns (Levels of Logic = 12)
    Clock Path Skew:    0.000ns
    Source Clock:       clk_BUFGP rising at 0.000ns
    Destination Clock:  clk_BUFGP rising at 20.000ns
    Clock Uncertainty:  0.000ns

Notes: a period is measured from register output to register output. The path starts at R0/q4_3 (q4, bit 3), passes through 12 levels of logic, and ends at R1/q3_2; its 9.739 ns includes setup.

STA on Design Example (after MAP)
[Figure: the sorter block diagram with the longest path marked, from R0/q4_3 to R1/q3_2, 9.739 ns]
STA will also tell us what components are in this path.

STA on Design Example (after MAP)

    Data Path: R0/q4_3 to R1/q3_2
    Delay type         Delay(ns)  Logical Resource(s)
    ----------------------------  -------------------
    Tiockiq              0.442    R0/q4_3
    net (fanout=4)     e 0.100    w4<3>
    Tilo                 0.612    S0/S2/q1_cmp_gt0000121
    net (fanout=1)     e 0.100    S0/S2/q1_cmp_gt00001_map7
    // some lines skipped ...
    net (fanout=3)     e 0.100    S1/w3<1>
    Tilo                 0.612    S1/S3/q1_cmp_gt0000145
    net (fanout=1)                S1/S3/q1_cmp_gt0000145/O
                                  S1/S3/q1_cmp_gt0000173
    net (fanout=6)                S1/S3/q1_cmp_gt0000
                                  S1/S3/q2<2>1
                                  u3<2>
    Tioock               0.595    R1/q3_2
    ----------------------------  ---------------------------
    Total                9.739ns  (8.439ns logic, 1.300ns route)
                                  (86.7% logic, 13.3% route)

R0/q4_3 is the starting point of the path and R1/q3_2 is the ending point.

STA on Design Example (after MAP)
In the data path from R0/q4_3 to R1/q3_2, the Tilo entries (0.612 ns each) are the logic delays of the LUTs.

STA on Design Example (after MAP)
In the same data path, the net entries are the net delays. After MAP they are only estimated (listed as e 0.100 ns per net).

STA after Place and Route
After place and route, we know the exact loading of each interconnection, so we can get a more precise timing estimate.

Analysis after Place and Route
We need roughly 10 ns more, just to move data around in the chip! This is not only the case for this example. In modern logic design, routing delay is typically as large as logic delay.

    ========================================================================
    Timing constraint: NET "clk_BUFGP/IBUFG" PERIOD = 20 ns HIGH 50%;
    30002 items analyzed, 0 timing errors detected. (0 setup errors, 0 hold errors)
    Minimum period is 19.000ns.
    ------------------------------------------------------------------------
    Slack:              1.000ns (requirement - (data path - clock path skew + uncertainty))
    Source:             R0/q1_3 (FF)
    Destination:        R1/q3_1 (FF)
    Requirement:        20.000ns
    Data Path Delay:    18.979ns (Levels of Logic = 12)
    Clock Path Skew:    -0.021ns
    Source Clock:       clk_BUFGP rising at 0.000ns
    Destination Clock:  clk_BUFGP rising at 20.000ns
    Clock Uncertainty:  0.000ns
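The slack values in the two timing reports can be checked by hand with the formula the tool prints, slack = requirement - (data path - clock path skew + uncertainty). The numbers below are the ones reported in this lecture after MAP and after place and route; only the frequency on the last line is derived here.

```latex
\begin{align*}
\text{after MAP:}   \quad \text{slack} &= 20.000 - (9.739 - 0.000 + 0.000) = 10.261\ \text{ns} \\
\text{after P\&R:}  \quad \text{slack} &= 20.000 - (18.979 - (-0.021) + 0.000) = 1.000\ \text{ns} \\
f_{clk,max} &= \frac{1}{19.000\ \text{ns}} \approx 52.6\ \text{MHz}
\end{align*}
```

Note how the small negative clock path skew adds to the data path delay here, which is why the minimum period (19.000 ns) is slightly larger than the data path delay (18.979 ns).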

Path Analysis confirms that wires are slow ..

    Data Path: R0/q1_3 to R1/q3_1
    Delay type         Delay(ns)  Logical Resource(s)
    ----------------------------  -------------------
    Tiockiq              0.442    R0/q1_3
    net (fanout=4)       2.142    w1<3>
    // some lines skipped ...
    Tilo                 0.612    S1/S3/q1_cmp_gt0000145
    net (fanout=1)       0.298    S1/S3/q1_cmp_gt0000145/O
                         0.604    S1/S3/q1_cmp_gt0000173
    net (fanout=6)       0.660    S1/S3/q1_cmp_gt0000
                         0.725    S1/S3/q2<1>1
                                  u3<1>
    Tioock               0.595    R1/q3_1
    ----------------------------  ---------------------------
    Total               18.979ns  (8.631ns logic, 10.348ns route)
                                  (45.5% logic, 54.5% route)

Summary: Static Timing Analysis
- STA uses an analytic delay model based on
  - intrinsic gate delay
  - fan-out
  - wire delay (if available)
- STA automatically checks timing constraints set by a user.
- STA is a good general checking technique, but it may stumble into false paths.
- Analyzing STA results (and improving them) requires the designer to understand the netlist, but if you want the fastest implementation possible, it may be worth it.

Dynamic Timing Analysis
Simulation is used extensively in a digital design flow for verification.
Since this simulation is a timed simulation, we may as well use intermediate design representations as simulation targets.
The lower-level synthesis blocks are still pin-compatible with the RTL top level. Hence, the same simulation testbench can be used.
Constraint checking now becomes the role of the designer: the designer needs to verify that the result of the timed simulation still corresponds to the results of the RTL simulation.
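A minimal testbench sketch for this kind of comparison is shown below. It assumes the fullsorter top level and port order from this lecture and the 20 ns clock period from the PERIOD constraint; the testbench name, the stimulus values, and the sampling offsets are assumptions chosen for illustration. Because it only touches the top-level pins, the same file could in principle drive the RTL model and the post-synthesis, post-map, or post-PAR netlists.

```verilog
`timescale 1ns / 1ps

// Hypothetical testbench sketch; not the testbench used in the lecture demo.
module fullsorter_tb;
  reg  [3:0] a1, a2, a3, a4;
  reg        clk;
  wire [3:0] q1, q2, q3, q4;

  // Same port order as the fullsorter module.
  fullsorter dut(q1, q2, q3, q4, a1, a2, a3, a4, clk);

  // 20 ns clock period, matching the PERIOD constraint.
  initial clk = 1'b0;
  always #10 clk = ~clk;

  // Log the outputs 18 ns after each rising edge (2 ns before the next one).
  always @(posedge clk)
    #18 $display("t=%0d ns  q = %d %d %d %d", $time, q1, q2, q3, q4);

  // Apply two input vectors, each held for several clock cycles.
  initial begin
    {a1, a2, a3, a4} = 16'h1234;
    repeat (3) @(posedge clk);
    #2 {a1, a2, a3, a4} = 16'h90F7;
    repeat (3) @(posedge clk);
    #19 $finish;
  end
endmodule
```

Running this once against the RTL and once against a timed netlist, then comparing the two logs, is one simple way to carry out the check described above. If the logs differ at the target clock period, some path is slower than the clock allows.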

Dynamic Timing Analysis
The testbench is applied at every stage of the flow:
RTL Verilog + Constraints                               <- Testbench
-> Synthesis (+translate) -> Post Synthesis Netlist     <- Testbench
-> Mapper                 -> Post Map Netlist           <- Testbench
-> Place and Route        -> Post P&R Netlist           <- Testbench
-> Bitstream Generation

Dynamic Timing Analysis requires simulation library
- RTL Verilog
- Post Synthesis Netlist: netlist in terms of FPGA components, no logic delay (functional simulation)
- Post Map Netlist: netlist in terms of FPGA components, logic delay included
- Post P&R Netlist: netlist in terms of FPGA components, logic delay and wire delay included

Synthesis and Simulation are integrated
[Screenshot: ISE (synthesis) and the external simulator]

What does a post-synthesis netlist look like?

    module sorter2_INST_5 (q1, q2, a1, a2);
      output [3 : 0] q1;
      output [3 : 0] q2;
      input  [3 : 0] a1;
      input  [3 : 0] a2;
      // .. some lines skipped
      defparam \q1<3>1 .INIT = 4'hE;
      LUT2 \q1<3>1 (
        .I0(a2[3]),
        .I1(a1[3]),
        .O(q1[3])
      );
      // .. some lines skipped
    endmodule

LUT2 is a component from an FPGA simulation library. The component has a parameter called INIT. In this case, the parameter captures the configuration of a LUT (lookup table). With INIT = 4'hE, the LUT contains an OR.
[Figure: OR gate with inputs a2[3] and a1[3] and output q1[3]]
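To see why INIT = 4'hE encodes an OR, the INIT value can be read as a 4-entry truth table indexed by {I1, I0}. The small module below is a sketch written for this note (the module name is hypothetical); it prints that table using the same indexing, INIT[{I1, I0}], that the library LUT2 model uses.

```verilog
// Hypothetical helper, not part of the lecture's design files.
module lut2_init_demo;
  parameter [3:0] INIT = 4'hE;  // configuration from the post-synthesis netlist
  integer i;

  initial
    for (i = 0; i < 4; i = i + 1)
      // i[1] plays the role of I1, i[0] the role of I0
      $display("I1=%b I0=%b -> O=%b", i[1], i[0], INIT[i]);
endmodule
```

With 4'hE = 4'b1110, the printed output is 0 only for I1 = I0 = 0 and 1 for the other three input combinations, which is the truth table of a 2-input OR.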

LUT2 from library

    module LUT2 (O, I0, I1);
      parameter INIT = 4'h0;
      input I0, I1;
      output O;
      reg O;
      wire [1:0] s;
      assign s = {I1, I0};
      always @(s)
        if ((s[1]^s[0] ==1) || (s[1]^s[0] ==0))
          O = INIT[s];
        else if ((INIT[0] == INIT[1]) && (INIT[2] == INIT[3]) && (INIT[0] == INIT[2]))
          O = INIT[0];
        else if ((s[1] == 0) && (INIT[0] == INIT[1]))
          O = INIT[0];
        else if ((s[1] == 1) && (INIT[2] == INIT[3]))
          O = INIT[2];
        else if ((s[0] == 0) && (INIT[0] == INIT[2]))
          O = INIT[0];
        else if ((s[0] == 1) && (INIT[1] == INIT[3]))
          O = INIT[1];
        else
          O = 1'bx;
    endmodule

What does a post-PAR netlist look like?
It contains placement information and uses library components that support timing specifications.

    module sorter2_INST_5 (a1, a2, q1, q2);
      input  [3 : 0] a1;
      input  [3 : 0] a2;
      output [3 : 0] q1;
      output [3 : 0] q2;
      // some lines skipped ..
      defparam q1_cmp_gt0000143.INIT = 16'h22B2;          // LUT configuration
      defparam q1_cmp_gt0000143.LOC = "SLICE_X25Y91";     // placement information
      X_LUT4 q1_cmp_gt0000143 (
        .ADR0(a1[1]),
        .ADR1(a2[1]),
        .ADR2(a1[0]),
        .ADR3(a2[0]),
        .O(\q1_cmp_gt0000143/O )
      );

SDF Timing Information
Timing information from Place and Route is not stored in the Post-PAR netlist, but in a separate SDF file (Standard Delay Format).
The Modelsim simulator reads this SDF file together with the Verilog netlist and back-annotates the delay information from the SDF file into the netlist.
Place and Route -> Verilog Netlist with Placement Information + SDF Data
Testbench + Verilog Netlist + SDF Data -> Modelsim Post-PAR Simulation
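One standard way to request back-annotation from within the simulation is the Verilog system task $sdf_annotate. The sketch below assumes the testbench instantiates the post-PAR netlist of fullsorter as dut; the testbench name and the SDF file name are hypothetical. Simulators typically also let you load the SDF file from the command line or GUI instead, and a tool-generated netlist may already contain an equivalent call.

```verilog
`timescale 1ns / 1ps

// Hypothetical sketch: back-annotating SDF delays from the testbench.
module fullsorter_sdf_tb;
  reg  [3:0] a1, a2, a3, a4;
  reg        clk;
  wire [3:0] q1, q2, q3, q4;

  // dut is assumed to be the post-PAR netlist version of fullsorter.
  fullsorter dut(q1, q2, q3, q4, a1, a2, a3, a4, clk);

  // Apply the delays from the SDF file to the dut instance at time 0.
  // "fullsorter_timesim.sdf" is a placeholder file name.
  initial $sdf_annotate("fullsorter_timesim.sdf", dut);

  // Clock generation, stimulus, and output logging go here,
  // unchanged from the RTL simulation.
endmodule
```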

Demonstration
(This slide shows the demo script. I will upload the sorter example on Blackboard.)
- Include testbench with sorter design
- Choices for Post-nnnnn models: Post-synthesis, Post-translate, Post-map, Post-PAR model
- Post-PAR Static Timing Analysis
  - Analyze -> against timing constraints
  - Slowest path: R0/q4 to R1/q3
  - (if time left: cross-probe feature)

Demonstration (2)
- Simulate Post-PAR model
  - Illustrate delay & cross-check with STA
- Show netlist
  - Post-synthesis: isolate one component (LUT); verify LUT location in simulation library
  - Post-PAR: compare Post-PAR component with post-synthesis

Summary: STA and Timed Simulation
- Static Timing Analysis estimates the worst possible delay in a network based on network topology and components.
- Dynamic Timing Analysis performs a timed simulation, with time delays modeled based on the intermediate synthesis results.
- Performance optimizations will typically use static timing analysis.
- Specific worst-case conditions can be easily triggered using dynamic timing analysis.