Performance Analysis (Clock Signal)

1 Performance Analysis (Clock Signal)

2 Unbalanced delays: Logic with unbalanced delays leads to inefficient use of logic. (Figure labels: long clock period, short clock period.)

3 Flip-flop-based system performance analysis

4 Flip-flop-based system model
Clock signal is perfect (no rise/fall time), with period P.
Clock event occurs on the rising edge.
Setup time s
–Time from arrival of combinational logic event to clock event
Propagation time p
–Time for value to go from flip-flop input to output (t_co?)
Worst-case combinational delay C
–Time from flip-flop output to flip-flop input

5 Clock period constraint: P >= p + C + s. (Timing diagram labels: s, p, C.)

6 Clock parameters

7 Clock with rise/fall: t_r is large because the clock wire is long and has high capacitance.

8 Rise/fall clock period constraint: P >= t_r + p + C + s. (Timing diagram labels: s, p, C, t_r.)
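
The two period constraints on slides 5 and 8 can be checked numerically. The sketch below is only an illustration: the function names and all delay values are hypothetical, and times are assumed to be in nanoseconds.

```python
def min_clock_period(p, c, s, t_r=0.0):
    """Smallest period P satisfying P >= t_r + p + C + s.

    p   : flip-flop propagation time
    c   : worst-case combinational delay C
    s   : setup time
    t_r : clock rise/fall time (0 for the ideal clock of slide 5)
    """
    return t_r + p + c + s


def max_frequency_mhz(period_ns):
    """Clock frequency in MHz for a period given in ns."""
    return 1000.0 / period_ns


# Hypothetical delays, chosen only to make the arithmetic easy to follow.
ideal = min_clock_period(p=0.5, c=8.0, s=0.5)
with_rise_fall = min_clock_period(p=0.5, c=8.0, s=0.5, t_r=1.0)

print("ideal clock: P >=", ideal, "ns")
print("with rise/fall: P >=", with_rise_fall, "ns")
print("max frequency with rise/fall:", max_frequency_mhz(with_rise_fall), "MHz")
```

A slow clock edge adds directly to the minimum period, so it lowers the maximum clock frequency even though the logic itself is unchanged.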

9 Skew: Relative delay between events. Clock skew can harm any sequential system.

10 Clock skew Clock must arrive at all memory elements in time to load data.

11 Clock skew in system (Figure: two D flip-flops, a logic block, and a delay element δ.)

12 Clock skew and qualified (gated) clocks

13 Clock skew analysis model: s_12 = δ_1 – δ_2, s_21 = δ_2 – δ_1, where δ_i is the clock arrival delay at flip-flop i. Assume δ_1 > δ_2 (s_12 > 0). (Figure label: clock φ.)

14 Skew and clock period
Assume that each flip-flop operates instantaneously.
If the clock arrives at FF1 after FF2, there is less time for the signal to propagate through the combinational logic.
Given the clock period, determine the allowable skew: P >= Δ_2 + s_12, where Δ_2 is the delay of the combinational logic on that path.
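
A small numeric sketch of the skew relations on slides 13 and 14. It assumes that the skew s_12 is the difference between the clock arrival delays at the two flip-flops and that the period must cover the combinational logic delay plus that skew. Function names and values are hypothetical; times are integers in picoseconds.

```python
def skew(delay_1, delay_2):
    """s_12: clock arrival delay at FF1 minus clock arrival delay at FF2."""
    return delay_1 - delay_2


def min_period_with_skew(logic_delay, s_12):
    """Smallest period P satisfying P >= logic_delay + s_12."""
    return logic_delay + s_12


def max_allowable_skew(period, logic_delay):
    """Largest s_12 that still satisfies P >= logic_delay + s_12."""
    return period - logic_delay


# Hypothetical numbers in picoseconds.
s_12 = skew(delay_1=300, delay_2=100)
print("skew s_12:", s_12, "ps")
print("minimum period:", min_period_with_skew(8000, s_12), "ps")
print("allowable skew at P = 10000 ps:", max_allowable_skew(10000, 8000), "ps")
```

The same subtraction read in the other direction answers the question on the slide: for a fixed period, whatever time the logic does not use is the budget left for skew.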

15 Clock distribution Often one of the hardest problems in clock design. –Fast edges. –Minimum skew.

16 Clock skew example (Figure: D flip-flops with delays labeled 10 ps, 20 ps, and 30 ps.)

17 Clock H-Tree

18 Digital Clock Manager (DCM)
A hard block in FPGAs
–Gets clock input
–Generates daughter clocks
An FPGA can have multiple DCMs
Provides clocks to
–internal circuitry
–external devices on board

19 DCM

20 DCM Functions: Jitter removal

21 DCM Functions
Frequency synthesis:
–When the clock frequency generated outside differs from the frequency needed in our FPGA, the DCM multiplies/divides it to generate daughter clocks
–Can even generate other ratios (3/4, 4/5 of original)
–In Spartan-6: n times or n/2 times (n ∈ [1,16])
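
To make the Spartan-6 rule on this slide concrete, the sketch below enumerates the ratios n and n/2 for n from 1 to 16 and applies them to an input clock. It models only the rule as stated on the slide; real devices also impose frequency-range limits that are ignored here. The function names and the 50 MHz input are hypothetical.

```python
from fractions import Fraction


def dcm_ratios(n_max=16):
    """All distinct ratios n and n/2 for n in 1..n_max, in ascending order."""
    ratios = set()
    for n in range(1, n_max + 1):
        ratios.add(Fraction(n))      # n times the input frequency
        ratios.add(Fraction(n, 2))   # n/2 times the input frequency
    return sorted(ratios)


def daughter_frequencies(f_in_mhz, n_max=16):
    """Daughter clock frequencies (MHz) reachable from f_in_mhz."""
    return [float(r * Fraction(f_in_mhz)) for r in dcm_ratios(n_max)]


freqs = daughter_frequencies(50)
print(len(freqs), "distinct frequencies between", freqs[0], "and", freqs[-1], "MHz")
```

Exact fractions are used so that ratios such as 2 and 4/2 collapse into a single entry instead of appearing twice.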

22 DCM Functions
Phase shifting:
–Some designs need phase-shifted clocks
Some DCMs allow common values
–120°, 240°: for a 3-phase clocking scheme
–90°, 180°, 270°: for a 4-phase clocking scheme
Some DCMs allow exact values to be set
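
A phase shift in degrees corresponds to a time delay of (phase / 360) × period. The sketch below applies this to the 4-phase values from the slide. The 10 ns period and the function name are hypothetical.

```python
def phase_to_delay_ns(phase_deg, period_ns):
    """Time delay equivalent to a phase shift, for a clock of the given period."""
    return (phase_deg % 360) / 360 * period_ns


period_ns = 10.0  # hypothetical 100 MHz clock
for phase in (90, 180, 270):
    print(phase, "degrees ->", phase_to_delay_ns(phase, period_ns), "ns")
```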

23 DCM Functions
Clock deskewing:
–DCM gets a special input
–DCM compares the two signals
–Adds delay to the daughter clock to align it with the main clock
Two types:
–PLL (phase-locked loop): analogue
–DLL (delay-locked loop): digital
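
One way to picture the delay-based deskewing described on this slide: if the daughter clock reaches its destination late by some distribution delay, adding enough extra delay to make the total a whole number of clock periods lines its edges up with the main clock again. The sketch below shows only that arithmetic and is not a model of any particular DCM. The function name and values are hypothetical; times are integers in picoseconds.

```python
def deskew_delay(distribution_delay, period):
    """Extra delay that makes the total delay a whole number of periods."""
    return (-distribution_delay) % period


period = 10000              # hypothetical 100 MHz clock, in ps
distribution_delay = 1300   # hypothetical clock network delay, in ps

extra = deskew_delay(distribution_delay, period)
total = distribution_delay + extra
print("added delay:", extra, "ps; total delay:", total, "ps")
```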

24 DCM Functions: Auto skew correction

25 DCM
You can enter DCM parameters in tools:
–Frequency
–Duty cycle
–Phase shift
–…
See Xilinx ISE In-Depth Tutorial, UG695 (v 14.1).

26 Case Study 16 x 16 multiplier example.

27 The FPGA design process
Xilinx ISE (Integrated Synthesis Environment)
–Translation from HDL
–Logic synthesis
–Placement and routing
–Configuration generation

28 Design experiments
Synthesize with no constraints.
Synthesize with timing constraint.
–Tighten timing constraint.
Synthesize with placement constraints.
Power:
–Many tools don’t allow us to directly specify power consumption.
–Some tools allow us to specify power as an objective.
–We may need to rewrite our hardware description for better power consumption characteristics.

29 Commercial Tools
XST “-power” option reduces dynamic power consumption.
Xilinx MAP and PAR “-power” option reduces dynamic power
–But increases runtime and decreases design performance.
Quartus-II has Power-Driven Synthesis and Place & Route.

30 Post-translation simulation model
No timing or area constraints.
HDL model in terms of FPGA primitives.
Example:
  X_LUT4 \p12_Madd__n0015_Mxor_Result_Xo 1 (
    .ADR0(x_7_IBUF),
    .ADR1(y_13_IBUF),
    .ADR2(c12[7]),
    .ADR3(row12[8]),
    .O(row13[7])
  );

31 Mapping report
Design Summary
Number of errors: 0
Number of warnings: 0
Logic Utilization:
  Number of 4 input LUTs: 501 out of 1,024 48%
Logic Distribution:
  Number of occupied Slices: 255 out of %
  Number of Slices containing only related logic: 255 out of %
  Number of Slices containing unrelated logic: 0 out of 255 0%
  *See NOTES below for an explanation of the effects of unrelated logic
Total Number 4 input LUTs: 501 out of 1,024 48%
Number of bonded IOBs: 64 out of 92 69%
Total equivalent gate count for design: 3,006
Additional JTAG gate count for IOBs: 3,072
Peak Memory Usage: 64 MB
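
The utilization percentages in the mapping report can be reproduced from the "N out of M" counts. The sketch below parses one report line and recomputes the percentage. The helper name is hypothetical, and rounding down is an assumption inferred from the two complete lines on the slide (501/1,024 and 64/92), not documented tool behavior.

```python
import re


def utilization(line):
    """Return (used, total, percent) parsed from an 'N out of M' report line."""
    match = re.search(r"([\d,]+) out of ([\d,]+)", line)
    if match is None:
        raise ValueError("no 'N out of M' pattern in: " + line)
    used = int(match.group(1).replace(",", ""))
    total = int(match.group(2).replace(",", ""))
    # Integer division rounds down, which matches the figures on the slide.
    return used, total, used * 100 // total


for report_line in (
    "Number of 4 input LUTs: 501 out of 1,024 48%",
    "Number of bonded IOBs: 64 out of 92 69%",
):
    print(utilization(report_line))
```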

32 Static timing analysis report
Timing constraint: TS_P2P = MAXDELAY FROM TIMEGRP "PADS" TO TIMEGRP "PADS" uS ;
items analyzed, 0 timing errors detected. (0 setup errors, 0 hold errors)
Maximum delay is ns
After mapping, the delays are only estimates (no information about interconnects).

33 Static timing report: delays along paths
Data Sheet report: All values displayed in nanoseconds (ns)
Pad to Pad
Source Pad | Destination Pad | Delay |
x          | p               | 5.824 |
x          | p               |       |
x          | p               |       |
x          | p               |       |

34 Routing report
Phase 1: 1975 unrouted; REAL time: 11 secs
Phase 2: 1975 unrouted; REAL time: 11 secs
Phase 3: 619 unrouted; REAL time: 12 secs
Phase 4: 619 unrouted; (0) REAL time: 12 secs
Phase 5: 619 unrouted; (0) REAL time: 12 secs
Phase 6: 619 unrouted; (0) REAL time: 12 secs
Phase 7: 0 unrouted; (0) REAL time: 12 secs
The NUMBER OF SIGNALS NOT COMPLETELY ROUTED for this design is: 0
REAL time: routing algorithm run time.

35 Static timing after routing
Timing constraint: TS_P2P = MAXDELAY FROM TIMEGRP "PADS" TO TIMEGRP "PADS" uS ;
items analyzed, 0 timing errors detected. (0 setup errors, 0 hold errors)
Maximum delay is ns (vs ns in mapping report)
The difference is due to interconnect delays.

36 Timing constraint Use timing constraint editor:

37 Post-map static timing report
Timing constraint: TS_P2P = MAXDELAY FROM TIMEGRP "PADS" TO TIMEGRP "PADS" 32 nS ;
items analyzed, 0 timing errors detected. (0 setup errors, 0 hold errors)
Maximum delay is ns.
The pad-to-pad delay hasn’t changed, since this design has limited opportunities for logic synthesis to change delays by restructuring logic.

38 Post-routing static timing report
Timing constraint: TS_P2P = MAXDELAY FROM TIMEGRP "PADS" TO TIMEGRP "PADS" 32 nS ;
items analyzed, 0 timing errors detected. (0 setup errors, 0 hold errors)
Maximum delay is ns.
Tools generally try to meet the delay goal as closely as possible to minimize area.

39 Tighter timing constraints
Tighten requirement to 25 ns.
Post-place-route timing report:
Timing constraint: TS_P2P = MAXDELAY FROM TIMEGRP "PADS" TO TIMEGRP "PADS" 25 nS ;
items analyzed, 11 timing errors detected. (11 setup errors, 0 hold errors)
Maximum delay is ns.

40 Report on a violated path
Slack: ns (requirement - data path)
Source: y (PAD)
Destination: p (PAD)
Requirement: ns
Data Path Delay: ns (Levels of Logic = 31)
Modify the logic and/or physical design to improve the delay.
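
The slack figure in this report is the requirement minus the data path delay, so a negative slack marks a violated path. The sketch below uses the 25 ns requirement from the previous slide. The 27.5 ns data path delay is hypothetical, since the reported value is not available here, and the function names are illustrative.

```python
def slack_ns(requirement_ns, data_path_delay_ns):
    """Slack = requirement - data path delay (negative means a violation)."""
    return requirement_ns - data_path_delay_ns


def meets_timing(requirement_ns, data_path_delay_ns):
    """True when the path delay fits within the requirement."""
    return slack_ns(requirement_ns, data_path_delay_ns) >= 0


requirement = 25.0       # ns, the tightened pad-to-pad constraint
data_path_delay = 27.5   # ns, hypothetical value for illustration only

print("slack:", slack_ns(requirement, data_path_delay), "ns")
print("meets timing:", meets_timing(requirement, data_path_delay))
```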

41 Power report
Power summary: I(mA) P(mW)
Total estimated power consumption:
Vccint 1.50V: 0 0
Vccaux 3.30V:
Vcco V:
Inputs: 0 0
Logic: 0 0
Outputs:
Vcco
Signals:
Quiescent Vccaux 3.30V:
Quiescent Vcco V: 1 3
Thermal summary:
Estimated junction temperature: 36C
Ambient temp: 25C
Case temp: 35C
Theta J-A: 34C/W
The thermal summary helps us determine whether we need additional cooling.
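
The thermal summary can be related to power through the usual junction-temperature estimate, T_j = T_a + θ_JA × P. The sketch below uses the ambient temperature and Theta J-A from the slide. The 0.5 W dissipation and the function names are hypothetical, and the relation is the standard textbook model rather than anything taken from the tool's documentation.

```python
def junction_temp_c(ambient_c, theta_ja_c_per_w, power_w):
    """Estimated junction temperature: T_j = T_a + theta_JA * P."""
    return ambient_c + theta_ja_c_per_w * power_w


def implied_power_w(junction_c, ambient_c, theta_ja_c_per_w):
    """Power that would produce the given junction temperature."""
    return (junction_c - ambient_c) / theta_ja_c_per_w


# Ambient temperature and Theta J-A from the slide; the power is hypothetical.
print("T_j at 0.5 W:", junction_temp_c(25.0, 34.0, 0.5), "C")

# Rough back-calculation from the rounded 36 C junction temperature on the slide.
print("implied power:", round(implied_power_w(36.0, 25.0, 34.0), 3), "W")
```

If the estimated junction temperature approaches the device's rated limit, the options are to reduce power, lower the ambient temperature, or improve Theta J-A with a heat sink or airflow.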

42 Improving area
Floorplanner window:
–Floorplanner → View/Edit Placed Design
(Figure labels: LEs; chip floorplan.)
Green rectangles: components mapped to CLBs

43 Rat’s nest wiring: If you click on a component in the design hierarchy window, its rat’s nest is shown.

44 Routing editor view: FPGA Editor → View/Edit Routed Design

45 Editing constraints
Use the constraints editor to place constraints:
–This tool allows you to constrain
1. placement of logic
2. assignment of chip I/Os to IOBs (e.g., useful for PCB design)

46 Design browser pane

47 Drag and drop constraints

48 Change the shape of constraints

49 Full set of placement constraints We place the rows of the multiplier one below the other to create the row structure of the floorplan.

50 Placement results

51 New timing report
After placement constraints:
items analyzed, 0 timing errors detected. (0 setup errors, 0 hold errors)
Maximum delay is ns.
Compares to ns for unconstrained placement.