Reconfigurable Computing - Verifying Circuits Performance! John Morris Chung-Ang University The University of Auckland ‘Iolanthe’ at 13 knots on Cockburn.

Presentation transcript:

Reconfigurable Computing - Verifying Circuits Performance! John Morris Chung-Ang University The University of Auckland ‘Iolanthe’ at 13 knots on Cockburn Sound, Western Australia

Measuring Circuit Performance
- Don’t believe the simulators!
  - Experience has shown that their predictions can be reasonably accurate …
  - … but the potential for gross error is very large
    - A large number of small values need to be summed
    - Possibility of large statistical errors
- Professional engineers always check
  - That’s what makes them professional!
- Scientists always want to be able to repeat an experiment
  - That’s a principle of the scientific method
  - Don’t accept anything as fact unless you can repeat it!
- Whatever your background or reason …
  - Measurement on an actual device is needed
  - You can use the simulator’s numbers for guidance though!

Measuring Circuit Performance
- Use the simulator’s results as a guide
- But what does it tell you?
  - It calculates propagation delays from inputs to outputs along various circuit paths
  - Simulators try to identify the longest (in time) path for you
- In a simple combinatorial block that’s fine, eg
  - a one-stage (no registers) adder
    - Should identify the carry chain in a ripple carry adder, or its equivalent in a more complex adder
  - a single-stage parallel array multiplier
    - Again, in all types of multipliers, there’s a carry chain that limits performance
- In a pipelined circuit, you want the longest path between two clocked flip-flops
  - In principle, easy for the simulator to find!
  - In practice, you may need to spend more time checking that it selected the right path!
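The longest-path search described on this slide can be sketched in a few lines. This is only an illustration, not how any particular simulator works: the netlist, node names and delays below are invented for a 3-bit ripple-carry adder, and the circuit is assumed to be a purely combinatorial, loop-free graph.

```python
from functools import lru_cache

# Hypothetical netlist: (source, destination, delay in ns). All values are made up.
EDGES = [
    ("a", "fa0", 1.0), ("b", "fa0", 1.0),
    ("fa0", "fa1", 2.0),  # carry from bit 0 into bit 1
    ("fa1", "fa2", 2.0),  # carry from bit 1 into bit 2
    ("fa0", "s0", 1.5), ("fa1", "s1", 1.5), ("fa2", "s2", 1.5),
]

def longest_path(edges):
    """Return (delay_ns, path) for the slowest path in a loop-free circuit graph."""
    successors = {}
    for src, dst, delay in edges:
        successors.setdefault(src, []).append((dst, delay))

    @lru_cache(maxsize=None)
    def worst_from(node):
        # Slowest path that starts at `node`; a node with no successors is an output.
        best_delay, best_path = 0.0, (node,)
        for nxt, delay in successors.get(node, []):
            tail_delay, tail_path = worst_from(nxt)
            if delay + tail_delay > best_delay:
                best_delay, best_path = delay + tail_delay, (node,) + tail_path
        return best_delay, best_path

    nodes = {s for s, _, _ in edges} | {d for _, d, _ in edges}
    # sorted() makes the choice between equally slow paths deterministic.
    return max((worst_from(n) for n in sorted(nodes)), key=lambda r: r[0])

delay_ns, path = longest_path(EDGES)
print(delay_ns, " -> ".join(path))
```

With these made-up delays the slowest path runs along the carry chain, which is what you would expect the tool to report for a ripple-carry adder.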

Measuring Circuit Performance
- Checking the simulator’s predictions
  - Do a sanity check!
- Using the manufacturer’s published propagation delays for individual circuit elements
  - Estimate the path delay yourself
    - Count the number of logic blocks needed for the computation
    - Will additional multiplexers be needed for steering or selection logic?
    - Are I/O buffers needed? These typically have a considerable delay (relative to other circuit elements)
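The sanity check above is just a sum of per-element delays. A rough sketch follows; the function name and every delay figure are placeholders, so substitute the values from your device’s data sheet.

```python
def estimate_path_delay_ns(n_logic_blocks, t_block_ns,
                           n_muxes=0, t_mux_ns=0.0,
                           n_io_buffers=0, t_io_ns=0.0):
    """Back-of-envelope path delay: count the elements on the path, sum their delays."""
    return (n_logic_blocks * t_block_ns
            + n_muxes * t_mux_ns
            + n_io_buffers * t_io_ns)

# Hypothetical 32-bit adder path: 32 logic blocks, 1 steering mux, 2 I/O buffers.
estimate = estimate_path_delay_ns(32, 0.5, n_muxes=1, t_mux_ns=1.0,
                                  n_io_buffers=2, t_io_ns=3.0)
print(f"estimated path delay: {estimate} ns")
```

Routing delay between blocks is deliberately left out, so an estimate like this will normally come in below the synthesizer’s figure.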

Measuring Circuit Performance
- Using the manufacturer’s published propagation delays for individual circuit elements
  - Estimate the path delay yourself
  - …
- You can use the synthesizer to help you here
  - Its count of the total number of logic blocks will be 100% accurate
  - From this, you infer the number of logic blocks in a path, eg
    - For a 32-bit adder, you can obviously start by dividing the total number of logic blocks by 32
    - Then try to estimate how many logic blocks are needed for overheads, eg multiplexers needed in a carry select adder
- For FPGAs, remember …

Measuring Circuit Performance
- Using the manufacturer’s published propagation delays for individual circuit elements
  - Estimate the path delay yourself
- For FPGAs, remember …
  1. Look-up tables (LUTs) are usually used for boolean logic
     - This means that, using Xilinx’s 9-input CLBs,
       - y <= a AND b probably takes about the same time as
       - y <= a AND b AND c AND d AND … (up to 9 inputs)
       - Beyond 9 inputs, add a considerable delay to connect to a neighbouring CLB
     - Using Altera’s 4-input logic elements,
       - y <= a AND b probably takes about the same time as
       - y <= a AND b AND c AND d (up to 4 inputs)
       - Beyond 4 inputs, add a small delay to use the fast cascade chain logic
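To see why the delay stays flat up to the block’s input count and then jumps, it helps to count LUT levels. The sketch below assumes a plain tree of k-input LUTs; it ignores cascade chains and routing, and the function name is made up for illustration.

```python
def lut_levels(n_inputs, lut_inputs):
    """Levels of k-input LUTs needed to combine n signals into one (simple tree model)."""
    if lut_inputs < 2:
        raise ValueError("a LUT needs at least 2 inputs to reduce anything")
    levels, signals = 0, n_inputs
    while signals > 1:
        signals = -(-signals // lut_inputs)  # ceiling division: LUTs used at this level
        levels += 1
    return levels

for n in (2, 4, 5, 9, 10):
    print(n, "inputs:", lut_levels(n, 4), "level(s) with 4-input blocks,",
          lut_levels(n, 9), "level(s) with 9-input blocks")
```

One level means one block delay; every extra level adds another block delay plus whatever it costs to route between blocks.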

Measuring Circuit Performance
- Using the manufacturer’s published propagation delays for individual circuit elements
  - Estimate the path delay yourself
- For FPGAs, remember …
  2. Paths between logic blocks may have large numbers of transmission gates on them!
     - As noted before, there’s a considerable advantage to being able to keep critical logic in one logic block
     - But Altera’s cascade chains attempt to mitigate the penalty for not fitting critical logic into a single logic element
     - And all manufacturers now provide for fast adder carry chains!
- This makes estimation of path delays difficult
  - Nevertheless, you should make a rough estimate!!

Measuring Circuit Performance
- Estimate the path delay yourself
  - If your estimate matches that from the synthesizer, then you’re in good shape
  - ‘Matches’ here can be interpreted liberally
    - If the synthesizer reports 50 ns and you calculate 30 ns, then this is a reasonable match
    - You probably didn’t count enough transmission gates, etc, on the connections between logic blocks!
    - You don’t need to do a very precise calculation: the synthesizer has done that for you!
    - Your aim is to ensure that you are reading the correct number from the synthesizer’s report!
- With a reasonable match (say within 50%, either way), believe the synthesizer and continue …
- With a serious mismatch
  1. Read the synthesizer’s report more carefully
     - You may be looking at the wrong figure!
  2. Check your estimate more carefully
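The "within 50%, either way" rule can be written down as a one-line check. The slide does not say which figure the 50% is taken from; the sketch below assumes it is relative to the synthesizer’s number, and the function name is invented.

```python
def reasonable_match(synth_ns, estimate_ns, tolerance=0.5):
    """True if the hand estimate is within `tolerance` (as a fraction) of the synthesizer's delay."""
    return abs(synth_ns - estimate_ns) <= tolerance * synth_ns

print(reasonable_match(50.0, 30.0))  # the slide's example: 50 ns reported, 30 ns estimated
print(reasonable_match(50.0, 5.0))   # a factor of ten out: re-read the report
```

A failed check does not say which number is wrong, only that one of them needs a second look.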

Now we believe we know how fast the circuit is …
- What does this speed mean in practice?
  - You have a longest delay of x ns
  - A synchronous (clocked) circuit can run at 1/x GHz?
  - Almost!
- Don’t forget to allow for
  1. Propagation delay in the registers
  2. Temperature
     - Circuits run slower at high T
     - Make sure that your estimate of t_pd is a good one for the highest temperature your circuit will need to withstand
     - Don’t think that this will be low! Try touching a modern high performance processor! (Make sure you have some burn cream nearby!)
     - Or simply work out that all those fans hiding that chip aren’t there for decoration!
  3. Chip-to-chip variations in fabrication …

- 32-bit adder: inputs a, b, c
- Naïve approach: test all possibilities
  - a: 4 × 10^9 (all possible 32-bit numbers)
  - b: 4 × 10^9 (ditto)
  - c: 2 (0 or 1)
  - Total: 4 × 4 × 2 × 10^18 = 3.2 × 10^19 cases
  - GHz machine at 10^9 cases/sec (optimistic!): 3.2 × 10^10 seconds, about 1,000 years
  - What about the rest of the machine? -, x, /, ^, v, >, …
- Clearly we need to be more efficient about testing!

Now we believe we know how fast the circuit is …
- What does this speed mean in practice?
  - You have a longest delay of x ns
  - A synchronous (clocked) circuit can run at 1/x GHz?
  - Almost!
- Don’t forget to allow for
  1. Propagation delay in the registers
  2. Temperature
  3. Chip-to-chip variations in fabrication
     - The gates will only be nominally 0.18 µm!
     - Some may actually be 0.15 µm and others 0.25 µm …
- A maximum clock frequency of 1/(x + Δ) GHz
  - Δ may be quite large!
- Now you’re ready to design an experiment to verify that the circuit does actually run as predicted!
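A small worked example of turning the longest delay into a safe clock rate. The slide only says that a margin must be added to x; treating temperature and chip-to-chip variation as multiplicative derating factors is an assumption made here for illustration, and all the numbers are invented.

```python
def max_clock_mhz(path_ns, register_ns, temp_derating=1.0, process_derating=1.0):
    """Maximum clock in MHz after adding register delay and derating for T and process spread."""
    period_ns = (path_ns + register_ns) * temp_derating * process_derating
    return 1000.0 / period_ns  # a period of 1 ns corresponds to 1000 MHz

naive = max_clock_mhz(8.0, 0.0)                      # 1/x with no allowances
safe = max_clock_mhz(8.0, 2.0, temp_derating=1.25)   # hypothetical register delay and derating
print(f"naive: {naive} MHz, with allowances: {safe} MHz")
```

With these placeholder values the usable clock drops from 125 MHz to 80 MHz, so the margin is far from negligible.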

A word of warning!
- Experimental design!
  - If you don’t make an estimate of what you expect to measure before starting, you will waste a lot of time doing the experiment!
- Working out the expected delay time is formally equivalent to setting out a hypothesis for the experiment
  - The simulator says the delay will be x ns, so I hypothesise (predict) that we will measure a delay of about x ns
  - This (simple) hypothesis guides your experimental design and set up!
- For example, assume you have a 150 MHz oscilloscope available …

A word of warning!
- Experimental hypothesis
  - The simulator says the delay will be x ns, so I hypothesise (predict) that we will measure a delay of about x ns
  - This (simple) hypothesis guides your experimental design and set up!
- For example, assume you have a 150 MHz oscilloscope available
  - You try to make measurements of the delay, but are surprised to find that there appears to be no delay at all!
  - Somebody then remembers to go back and read the synthesis report …
  - … which tells you to expect a 5 ns delay, one that will be difficult to measure on a slow ’scope!

Experimental Hypothesis
- The simulator says the delay will be x ns, so I hypothesise (predict) that we will measure a delay of about x ns
- This (simple) hypothesis guides your experimental design and set up!
- You now know that you have to design your experiment differently, eg
  1. Build a wider adder
     - So that the delay is long enough to measure easily
  2. Work out how to measure n repeats of the calculation
     - So that 5 × n > 20 ns (or some time that you can be certain to measure accurately!)
  3. Devise an entirely new technique
     - Which doesn’t require direct measurement of such a small delay
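Two quick calculations show why the 5 ns delay is awkward on a 150 MHz ’scope and how many repeats option 2 needs. The 0.35/bandwidth rise-time figure is a common rule of thumb rather than something from the slides, and the function names are invented.

```python
import math

def scope_rise_time_ns(bandwidth_mhz):
    """Approximate 10-90% rise time of an oscilloscope, using the 0.35/BW rule of thumb."""
    return 0.35 / bandwidth_mhz * 1000.0

def repeats_needed(delay_ns, min_measurable_ns):
    """Smallest n such that n * delay_ns is strictly greater than min_measurable_ns."""
    return math.floor(min_measurable_ns / delay_ns) + 1

print(f"150 MHz scope rise time: about {scope_rise_time_ns(150):.1f} ns")
print("repeats so that 5 ns x n > 20 ns:", repeats_needed(5, 20))
```

The scope’s own rise time comes out at roughly half the delay being measured, which is why a single 5 ns step is hard to resolve directly.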