CprE / ComS 583 Reconfigurable Computing

Slides:



Advertisements
Similar presentations
Courtesy RK Brayton (UCB) and A Kuehlmann (Cadence) 1 Logic Synthesis Sequential Synthesis.
Advertisements

Penn ESE Fall DeHon 1 ESE (ESE534): Computer Organization Day 19: March 26, 2007 Retime 1: Transformations.
Topics of Lecture Structural Model Procedures Functions Overloading.
Prof. Fateman CS 164 Lecture 221 Global Optimization Lecture 22.
CS294-6 Reconfigurable Computing Day 16 October 15, 1998 Retiming.
EDA (CS286.5b) Day 19 Covering and Retiming. “Final” Like Assignment #1 –longer –more breadth –focus since assignment #2 –…but ideas are cummulative –open.
Package with 4-valued logic Signal Attributes Assertion Data Flow description.
EDA (CS286.5b) Day 18 Retiming. Today Retiming –cycle time (clock period) –C-slow –initial states –register minimization.
Prof. Bodik CS 164 Lecture 16, Fall Global Optimization Lecture 16.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 8: February 13, 2008 Retiming.
CprE / ComS 583 Reconfigurable Computing Prof. Joseph Zambreno Department of Electrical and Computer Engineering Iowa State University Lecture #17 – Introduction.
VHDL IE- CSE. What do you understand by VHDL??  VHDL stands for VHSIC (Very High Speed Integrated Circuits) Hardware Description Language.
CprE / ComS 583 Reconfigurable Computing
7/10/2007DSD,USIT,GGSIPU1 Basic concept of Sequential Design.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 24: April 18, 2011 Covering and Retiming.
George Mason University ECE 545 – Introduction to VHDL Variables, Functions, Memory, File I/O ECE 545 Lecture 7.
CALTECH CS137 Winter DeHon CS137: Electronic Design Automation Day 7: February 3, 2002 Retiming.
ELEC692 VLSI Signal Processing Architecture Lecture 3
16/11/2006DSD,USIT,GGSIPU1 Packages The primary purpose of a package is to encapsulate elements that can be shared (globally) among two or more design.
VHDL Discussion Sequential Sytems. Memory Elements. Registers. Counters IAY 0600 Digital Systems Design Alexander Sudnitson Tallinn University of Technology.
CALTECH CS137 Spring DeHon 1 CS137: Electronic Design Automation Day 5: April 12, 2004 Covering and Retiming.
04/26/20031 ECE 551: Digital System Design & Synthesis Lecture Set : Introduction to VHDL 12.2: VHDL versus Verilog (Separate File)
VHDL Discussion Subprograms IAY 0600 Digital Systems Design Alexander Sudnitson Tallinn University of Technology 1.
EGRE 6311 LHO 04 - Subprograms, Packages, and Libraries EGRE 631 1/26/09.
CprE / ComS 583 Reconfigurable Computing Prof. Joseph Zambreno Department of Electrical and Computer Engineering Iowa State University Lecture #19 –VHDL.
ECE 448 – FPGA and ASIC Design with VHDL George Mason University ECE 448 Lab 2 Implementing Combinational Logic in VHDL.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 25: April 17, 2013 Covering and Retiming.
1 - CPRE 583 (Reconfigurable Computing): VHDL overview 1 Iowa State University (Ames) CPRE 583 Reconfigurable Computing Lecture 2: 8/26/2011 (VHDL Overview.
1 Introduction to Engineering Spring 2007 Lecture 19: Digital Tools 3.
Overview Logistics Last lecture Today HW5 due today
IAY 0600 Digital Systems Design
Finite State Machines (part 1)
Basic Language Concepts
IAY 0600 Digitaalsüsteemide disain
ECE 448 Lecture 6 Finite State Machines State Diagrams, State Tables, Algorithmic State Machine (ASM) Charts, and VHDL Code.
Behavioral Style Combinational Design with VHDL
IAY 0600 Digital Systems Design
Verification: Testbenches in Combinational Design
CS137: Electronic Design Automation
Timing Model Start Simulation Delay Update Signals Execute Processes
Behavioral Style Combinational Design with VHDL
ECE 545 Lecture 10 Advanced Testbenches.
VHDL 5 FINITE STATE MACHINES (FSM)
ECE 434 Advanced Digital System L08
RTL Style در RTL مدار ترتيبي به دو بخش (تركيبي و عناصر حافظه) تقسيم مي شود. مي توان براي هر بخش يك پروسس نوشت يا براي هر دو فقط يك پروسس نوشت. مرتضي صاحب.
IAS 0600 Digital Systems Design
ESE535: Electronic Design Automation
VHDL Discussion Subprograms
CPE 528: Lecture #5 Department of Electrical and Computer Engineering University of Alabama in Huntsville.
Modeling of Circuits with a Regular Structure
ECE 545 Lecture 9 Modeling of Circuits with a Regular Structure Aliases, Attributes, Functions, and Procedures.
VHDL Discussion Subprograms
IAS 0600 Digital Systems Design
CS184a: Computer Architecture (Structures and Organization)
Figure 8.1. The general form of a sequential circuit.
Finite State Machines (part 1)
Implementing Combinational and Sequential Logic in VHDL
ESE535: Electronic Design Automation
Instructor: Alexander Stoytchev
ECE 352 Digital System Fundamentals
Advanced Algorithms Analysis and Design
Variables variable variable_name: variable_type
Timing Analysis and Optimization of Sequential Circuits
EGR 2131 Unit 12 Synchronous Sequential Circuits
ECE 448 Lecture 6 Finite State Machines State Diagrams vs. Algorithmic State Machine (ASM) Charts.
CprE / ComS 583 Reconfigurable Computing
Sequntial-Circuit Building Blocks
4-Input Gates VHDL for Loops
EEL4712 Digital Design (VHDL Tutorial).
Presentation transcript:

CprE / ComS 583 Reconfigurable Computing Prof. Joseph Zambreno Department of Electrical and Computer Engineering Iowa State University Lecture #20 – Retiming

CprE 583 – Reconfigurable Computing Quick Points HW #4 due Thursday at 12:00pm Midterm, HW #3 graded by Wednesday Upcoming deadlines: November 16 – project status updates December 5,7 – project final presentations December 15 – project write-ups due October 31, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Recap – Variables LIBRARY ieee ; USE ieee.std_logic_1164.all ; ENTITY Numbits IS PORT ( X : IN STD_LOGIC_VECTOR(1 TO 3) ; Count : OUT INTEGER RANGE 0 TO 3) ; END Numbits ; October 31, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Variables – Example ARCHITECTURE Behavior OF Numbits IS BEGIN PROCESS(X) – count the number of bits in X equal to 1 VARIABLE Tmp: INTEGER; Tmp := 0; FOR i IN 1 TO 3 LOOP IF X(i) = ‘1’ THEN Tmp := Tmp + 1; END IF; END LOOP; Count <= Tmp; END PROCESS; END Behavior ; October 31, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Variables – Features Can only be declared within processes and subprograms (functions & procedures) Initial value can be explicitly specified in the declaration When assigned take an assigned value immediately Variable assignments represent the desired behavior, not the structure of the circuit Should be avoided, or at least used with caution in a synthesizable code October 31, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Variables vs. Signals LIBRARY IEEE; USE IEEE.STD_LOGIC_1164.all; ENTITY test_delay IS PORT( clk : IN STD_LOGIC; in1, in2 : IN STD_LOGIC; var1_out, var2_out : OUT STD_LOGIC; sig1_out : BUFFER STD_LOGIC; sig2_out : OUT STD_LOGIC ); END test_delay; October 31, 2006 CprE 583 – Reconfigurable Computing

Variables vs. Signals (cont.) ARCHITECTURE behavioral OF test_delay IS BEGIN PROCESS(clk) IS VARIABLE var1, var2: STD_LOGIC; if (rising_edge(clk)) THEN var1 := in1 AND in2; var2 := var1; sig1_out <= in1 AND in2; sig2_out <= sig1_out; END IF; var1_out <= var1; var2_out <= var2; END PROCESS; END behavioral; October 31, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Simulation Result October 31, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Assert Statements Assert is a non-synthesizable statement whose purpose is to write out messages on the screen when problems are found during simulation Depending on the severity of the problem, the simulator is instructed to continue simulation or halt Syntax: ASSERT condition [REPORT “message”] [SEVERITY severity_level ]; The message is written when the condition is FALSE Severity_level can be: Note, Warning, Error (default), or Failure October 31, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Array Attributes A’left(N) left bound of index range of dimension N of A A’right(N) right bound of index range of dimension N of A A’low(N) lower bound of index range of dimension N of A A’high(N) upper bound of index range of dimension N of A A’range(N) index range of dimension N of A A’reverse_range(N) index range of dimension N of A A’length(N) length of index range of dimension N of A A’ascending(N) true if index range of dimension N of A is an ascending range, false otherwise October 31, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Subprograms Include functions and procedures Commonly used pieces of code Can be placed in a library, and then reused and shared among various projects Use only sequential statements, the same as processes Example uses: Abstract operations that are repeatedly performed Type conversions October 31, 2006 CprE 583 – Reconfigurable Computing

Functions – Basic Features Always return a single value as a result Are called using formal and actual parameters the same way as components Never modify parameters passed to them Parameters can only be constants (including generics) and signals (including ports); Variables are not allowed; the default is a CONSTANT When passing parameters, no range specification should be included (for example no RANGE for INTEGERS, or TO/DOWNTO for STD_LOGIC_VECTOR) Are always used in some expression, and not called on their own October 31, 2006 CprE 583 – Reconfigurable Computing

Function Syntax and Example FUNCTION function_name (<parameter_list>) RETURN data_type IS [declarations] BEGIN (sequential statements) END function_name; FUNCTION f1 (a, b: INTEGER; SIGNAL c: STD_LOGIC_VECTOR) RETURN BOOLEAN IS BEGIN (sequential statements) END f1; CprE 583 – Reconfigurable Computing October 31, 2006

Procedures – Basic Features Do not return a value Are called using formal and actual parameters the same way as components May modify parameters passed to them Each parameter must have a mode: IN, OUT, INOUT Parameters can be constants (including generics), signals (including ports), and variables The default for inputs (mode in) is a constant, the default for outputs (modes out and inout) is a variable When passing parameters, range specification should be included (for example RANGE for INTEGERS, and TO/DOWNTO for STD_LOGIC_VECTOR) Procedure calls are statements on their own October 31, 2006 CprE 583 – Reconfigurable Computing

Procedure Syntax and Example PROCEDURE procedure_name (<parameter_list>) IS [declarations] BEGIN (sequential statements) END procedure_name; PROCEDURE p1 (a, b: in INTEGER; SIGNAL c: out STD_LOGIC) [declarations] BEGIN (sequential statements) END p1; CprE 583 – Reconfigurable Computing October 31, 2006

CprE 583 – Reconfigurable Computing Outline Recap Retiming Performance Analysis Transformations Optimizations Covering + Retiming October 31, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Problem Given: clocked circuit Goal: minimize clock period without changing (observable) behavior I.e. minimize maximum delay between any pair of registers Freedom: move placement of internal registers October 31, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Other Goals Minimize number of registers in circuit Achieve target cycle time Minimize number of registers while achieving target cycle time October 31, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Simple Example Path Length (L) = 4 Can we do better? CprE 583 – Reconfigurable Computing October 31, 2006

CprE 583 – Reconfigurable Computing Legal Register Moves Retiming Lag/Lead October 31, 2006 CprE 583 – Reconfigurable Computing

Canonical Graph Representation Separate arch for each path Weight edges by number of registers (weight nodes by delay through node) October 31, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Critical Path Length Critical Path: Length of longest path of zero weight nodes Compute in O(|E|) time by levelizing network: Topological sort, push path lengths forward until find register. October 31, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Retiming Lag/Lead Retiming: Assign a lag to every vertex weight(e) = weight(e) + lag(head(e))-lag(tail(e)) October 31, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Valid Retiming Retiming is valid as long as: e in graph weight(e) = weight(e) + lag(head(e))-lag(tail(e))  0 Assuming original circuit was a valid synchronous circuit, this guarantees: Non-negative register weights on all edges No traveling backward in time :-) All cycles have strictly positive register counts Propagation delay on each vertex is non-negative (assumed 1 for today) October 31, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Retiming Task Move registers  assign lags to nodes Lags define all locally legal moves Preserving non-negative edge weights (previous slide) Guarantees collection of lags remains consistent globally October 31, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Optimal Retiming There is a retiming of graph G w/ clock cycle c iff G-1/c has no cycles with negative edge weights G-  subtract a from each edge weight October 31, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing G-1/c October 31, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Intuition Must have at most c delay between every pair of registers So, count 1/c’th charge against register for every delay without out (G provides credit of 1 register every time one passed) October 31, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Compute Retiming Lag(v) = shortest path to I/O in G-1/c Compute shortest paths in O(|V||E|) Bellman-Ford also use to detect negative weight cycles when c too small October 31, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Apply to Example October 31, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Apply: Find Lags October 31, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Apply: Move Registers weight(e) = weight(e) + lag(head(e))-lag(tail(e)) October 31, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Apply: Retimed October 31, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Apply: Retimed Design October 31, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Pipelining Can use this retiming to pipeline Assume have enough (infinite supply) of registers at edge of circuit Retime them into circuit See [WeaMar03A] for details October 31, 2006 CprE 583 – Reconfigurable Computing

Cover + Retiming – Example October 31, 2006 CprE 583 – Reconfigurable Computing

Cover + Retiming – Example (cont.) October 31, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Example: Retimed October 31, 2006 CprE 583 – Reconfigurable Computing

Example: Retimed (cont.) Note: only 4 signals here (2 w/ 2 delays each) October 31, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Basic Observation Registers break up circuit, limiting coverage fragmentation prevent grouping October 31, 2006 CprE 583 – Reconfigurable Computing

Phase Ordering Problem General problem we’ve seen before E.g. placement – don’t know where connected neighbors will be if unplaced Don’t know effect/results of other mapping step In this case Don’t know delay (what can be packed into LUT) if retime first If not retime first fragmention: forced breaks at bad places October 31, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Observation #1 Retiming flops to input of (fanout free) subgraph is trivial (and always doable) Can cover ignoring flop placement Then retime LUTs to input October 31, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Fanout Problem? Can I use the same trick here? October 31, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Fanout Problem? (cont.) Cannot retime without replicating Replicating increase I/O (so cut size) October 31, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Summary Can move registers to minimize cycle time Formulate as a lag assignment to every node Optimally solve cycle time in O(|V||E|) time Can optimally solve LUT map for delay Retiming for minimum clock period Solving separately does not give optimal solution to problem October 31, 2006 CprE 583 – Reconfigurable Computing