Penn ESE535 Spring 2013 -- DeHon 1 ESE535: Electronic Design Automation Day 25: April 17, 2013 Covering and Retiming.

Slides:



Advertisements
Similar presentations
CALTECH CS137 Fall DeHon 1 CS137: Electronic Design Automation Day 19: November 21, 2005 Scheduling Introduction.
Advertisements

Courtesy RK Brayton (UCB) and A Kuehlmann (Cadence) 1 Logic Synthesis Sequential Synthesis.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 8: February 11, 2015 Scheduling Introduction.
Penn ESE370 Fall DeHon 1 ESE370: Circuit-Level Modeling, Design, and Optimization for Digital Systems Day 24: November 4, 2011 Synchronous Circuits.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 23: April 10, 2013 Statistical Static Timing Analysis.
FPGA Technology Mapping Dr. Philip Brisk Department of Computer Science and Engineering University of California, Riverside CS 223.
Penn ESE Spring DeHon 1 ESE (ESE534): Computer Organization Day 21: April 2, 2007 Time Multiplexing.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 23: April 22, 2009 Statistical Static Timing Analysis.
Penn ESE 535 Spring DeHon 1 ESE535: Electronic Design Automation Day 13: March 4, 2009 FSM Equivalence Checking.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 2: January 23, 2008 Covering.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 21: April 15, 2009 Routing 1.
EDA (CS286.5b) Day 6 Partitioning: Spectral + MinCut.
Penn ESE Fall DeHon 1 ESE (ESE534): Computer Organization Day 19: March 26, 2007 Retime 1: Transformations.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 7: February 11, 2008 Static Timing Analysis and Multi-Level Speedup.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 22: April 11, 2011 Statistical Static Timing Analysis.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 19: April 9, 2008 Routing 1.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 5: February 2, 2009 Architecture Synthesis (Provisioning, Allocation)
EDA (CS286.5b) Day 3 Clustering (LUT Map and Delay) N.B. no lecture Thursday.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 16: March 23, 2009 Covering.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 12: March 2, 2009 Sequential Optimization (FSM Encoding)
CS294-6 Reconfigurable Computing Day 15 October 13, 1998 LUT Mapping.
Penn ESE525 Spring DeHon 1 ESE535: Electronic Design Automation Day 10: February 18, 2009 Partitioning 2 (spectral, network flow)
EDA (CS286.5b) Day 19 Covering and Retiming. “Final” Like Assignment #1 –longer –more breadth –focus since assignment #2 –…but ideas are cummulative –open.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 3: January 27, 2008 Clustering (LUT Mapping, Delay) Please work preclass example.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 17: March 30, 2009 Clustering (LUT Mapping, Delay)
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 15: March 18, 2009 Static Timing Analysis and Multi-Level Speedup.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 5: February 2, 2009 Architecture Synthesis (Provisioning, Allocation)
EDA (CS286.5b) Day 18 Retiming. Today Retiming –cycle time (clock period) –C-slow –initial states –register minimization.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 8: February 13, 2008 Retiming.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 4: January 28, 2015 Partitioning (Intro, KLFM)
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 2: January 26, 2015 Covering Work preclass exercise.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 2: January 21, 2015 Heterogeneous Multicontext Computing Array Work Preclass.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 9: February 16, 2015 Scheduling Variants and Approaches.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 9: February 14, 2011 Placement (Intro, Constructive)
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 24: April 18, 2011 Covering and Retiming.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 10: February 18, 2015 Architecture Synthesis (Provisioning, Allocation)
Penn ESE525 Spring DeHon 1 ESE535: Electronic Design Automation Day 6: February 4, 2014 Partitioning 2 (spectral, network flow)
CALTECH CS137 Winter DeHon CS137: Electronic Design Automation Day 7: February 3, 2002 Retiming.
Penn ESE370 Fall DeHon 1 ESE370: Circuit-Level Modeling, Design, and Optimization for Digital Systems Day 26: October 31, 2014 Synchronous Circuits.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 24: April 22, 2015 Statistical Static Timing Analysis.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 20: April 8, 2015 Sequential Optimization (FSM Encoding)
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 23: April 20, 2015 Static Timing Analysis and Multi-Level Speedup.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 13: March 3, 2015 Routing 1.
CALTECH CS137 Winter DeHon CS137: Electronic Design Automation Day 3: January 12, 2004 Clustering (LUT Mapping, Delay)
Pipelining and Retiming
CALTECH CS137 Spring DeHon 1 CS137: Electronic Design Automation Day 5: April 12, 2004 Covering and Retiming.
Give qualifications of instructors: DAP
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 6: January 30, 2013 Partitioning (Intro, KLFM)
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 20: April 4, 2011 Static Timing Analysis and Multi-Level Speedup.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 15: March 13, 2013 High Level Synthesis II Dataflow Graph Sharing.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 5: February 2, 2015 Clustering (LUT Mapping, Delay)
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 11: February 25, 2015 Placement (Intro, Constructive)
CALTECH CS137 Fall DeHon 1 CS137: Electronic Design Automation Day 2: September 28, 2005 Covering.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 5: January 31, 2011 Scheduling Variants and Approaches.
CS137: Electronic Design Automation
CS137: Electronic Design Automation
ESE535: Electronic Design Automation
ESE535: Electronic Design Automation
ESE535: Electronic Design Automation
ESE535: Electronic Design Automation
ESE535: Electronic Design Automation
ESE535: Electronic Design Automation
CprE / ComS 583 Reconfigurable Computing
ESE535: Electronic Design Automation
ESE535: Electronic Design Automation
ESE535: Electronic Design Automation
ESE535: Electronic Design Automation
ESE535: Electronic Design Automation
CS137: Electronic Design Automation
CS137: Electronic Design Automation
Presentation transcript:

Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 25: April 17, 2013 Covering and Retiming

Penn ESE535 Spring DeHon 2 Previously Cover (map) LUTs for minimum delay –solve optimally for delay  flowmap Retiming for minimum clock period –solve optimally

Penn ESE535 Spring DeHon 3 Today Solving cover/retime separately not optimal Cover+retime Behavioral (C, MATLAB, …) RTL Gate Netlist Layout Masks Arch. Select Schedule FSM assign Two-level, Multilevel opt. Covering Retiming Placement Routing

Preclass 1 Penn ESE535 Spring DeHon 4

Preclass 2 Penn ESE535 Spring DeHon 5

6 Example Cover with 4-LUTs

Penn ESE535 Spring DeHon 7 Example

Penn ESE535 Spring DeHon 8 Example: Retimed

Penn ESE535 Spring DeHon 9 Example: Retimed Note: only 4 signals here (2 signals with 2 delays each)

Penn ESE535 Spring DeHon 10 Example 2 Cover with 4-LUTs

Penn ESE535 Spring DeHon 11 Example 2 Cycle Bound: 2

Penn ESE535 Spring DeHon 12 Example 2: retimed

Penn ESE535 Spring DeHon 13 Example 2: retimed Cycle Bound: 1

Penn ESE535 Spring DeHon 14 Basic Observation Registers break up circuit, limiting coverage –fragmentation –prevent grouping

Penn ESE535 Spring DeHon 15 Phase Ordering Problem General problem –don’t know effect of other mapping step –Have seen this many places Here –don’t know delay if retime first don’t know what can be packed into LUT –If we do not retime first fragmentation: forced breaks at bad places

Penn ESE535 Spring DeHon 16 Observation #1 Retiming flops to input of (fanout free) subgraph is trivial (and always doable) Does not change I/O into subgraph

Penn ESE535 Spring DeHon 17 Observation #1: Consequence Can cover ignoring flop placement Then retime flops to input of gates

Penn ESE535 Spring DeHon 18 Fanout Problem? Can we use the same trick here?

Penn ESE535 Spring DeHon 19 Fanout Problem? Cannot retime without replicating. Replicating increases I/O (so cut size). …but I/O is what defined feasible covers for LUTs 3 cut  5 cut

Penn ESE535 Spring DeHon 20 Observation #2 Retiming flops to input of a subgraph with fanout may change the subgraph I/O

Penn ESE535 Spring DeHon 21 Different Replication Problem What does this do to I/O?

Penn ESE535 Spring DeHon 22 Different Replication Problem

Penn ESE535 Spring DeHon 23 Different Replication Problem Can retime and cover with single 4-LUT.

Penn ESE535 Spring DeHon 24 Replication Once add registers –can’t just grab max flow and get replication (compare flowmap) Or, can’t just ignore flop placement when have reconvergent fanout through flop

Penn ESE535 Spring DeHon 25 Replication Key idea: –represent timing paths in graph –differentiating based on number of registers in path –new graph: all paths from node to output have same number of flip-flops –label nodes u d where d is flip-flops to output

Penn ESE535 Spring DeHon 26 Deal with Replication Expanded Graph: –start with target output node –for each input u d to current expanded graph grab its input edge (x  u) with weight (w(e)) add node x (d+w(e)) to graph (if necessary) add edge x (d+w(e))  u d with weight (w(e)) –continue breadth first until have enough at most k  n node depth required

Penn ESE535 Spring DeHon 27 Example b c a c0c0 i j Build expanded graph

Penn ESE535 Spring DeHon 28 Example b c a c0c0 a0a0 b1b1 i j

Penn ESE535 Spring DeHon 29 Example b c a c0c0 a0a0 b1b1 i j i0i0 c1c1 j1j1

Penn ESE535 Spring DeHon 30 Example b c a c0c0 a0a0 b1b1 i j i0i0 c1c1 j1j1 a1a1 b2b2

Penn ESE535 Spring DeHon 31 Example c0c0 a0a0 b1b1 i0i0 c1c1 j0j0 a1a1 b2b2 c0c0 a0a0 b1b1 i0i0 c1c1 j1j1 a1a1 b2b2 Retime4-LUT Cover

Penn ESE535 Spring DeHon 32 Example c0c0 a0a0 b1b1 i0i0 c1c1 j1j1 a1a1 b2b2

Penn ESE535 Spring DeHon 33 Example 2 e ac bd Build expanded graph

Penn ESE535 Spring DeHon 34 Example 2 e ac bd e0e0 c0c0 d0d0 a1a1 a0a0 b0b0 b1b1 i1i1 j1j1 i0i0 j0j0

Penn ESE535 Spring DeHon 35 Expanded Graph Expanded graph does not have fanout of different flip-flop depths from the same node. –Captures IO after register retiming Can now cover ignoring flip-flops and trivially retime.

Intuition on Solution Phase ordering problem arise form –need to capture I/O effects before covering –but also need to model delay for register movement But don’t know register movement until after covering So, break retime into two pieces 1.Expanded graph (capture I/O) 2.Actual retime (moves registers) Do expanded graph piece before cover and register movement after Penn ESE535 Spring DeHon 36

Intuition on Solution Break retime into two pieces 1.Expanded graph (capture I/O) 2.Actual retime (moves registers) Do expanded graph piece before cover and register movement after Not quite that simple since how much of expanded graph need depends on covering –So really doing just-in-time expansion in the middle of covering… Before each cover/cut computation Penn ESE535 Spring DeHon 37

Penn ESE535 Spring DeHon 38 Labeling Key idea #1: –compute distances/delay like flowmap Try collapse and compute flow cut Dynamic programming to compute min delay covers Key idea #2: –count distance from register like G-1/c graph

Penn ESE535 Spring DeHon 39 Labeling: Edge Weights To target clock period c –use graph G-1/c –paper: assign weight -c*w(e)+1 (same thing scaled by c and negated)

Penn ESE535 Spring DeHon 40 Labeling: Edge Weight Idea same idea: –will need register ever c LUT delays –credit with registers as encounter –charge a fraction (1/c) every LUT delay –know net distance at each point –if negative (delays > c*registers) cannot distribute to achieve c –otherwise labeling tells where to distribute

Penn ESE535 Spring DeHon 41 Labeling: Flow cut Label node as before (flowmap) –L(v)=min{l(u)+d|  u  v} –trivially can be L(v)-1/c == new LUT Correspond to flowmap case: L(v)+1 note min vs. max and -1/c vs. +1 due to rescaling to match retiming formulation and G- 1/c graph in this formulation, a combinational circuit of depth 4 would have L(v)=-4/c –if can put this and all L(v)’s in one LUT this can be L(v) construct and compute flow cut to test

Penn ESE535 Spring DeHon 42 LUT Map and Retime Start with outputs Cover with LUT based on cut –move flip-flops to inputs of LUT don’t have meaningful labels for covered nodes Know can do this by expanded graph construction Recursively cover inputs Use label to retime r(v)=  l(v)  -1

Penn ESE535 Spring DeHon 43 Target Clock Period c As before (retiming) –binary search to find optimal c

Example Penn ESE535 Spring DeHon 44

Penn ESE535 Spring DeHon 45 Summary Can optimally solve –LUT map for delay –retiming for minimum clock period But, solving separately does not give optimal solution to problem Can solve problems together –Account for registers on paths –Label based on register placement and (flow) cover ignoring registers –Labeling gives delay, covering, retiming

Penn ESE535 Spring DeHon 46 Today’s Big Ideas Exploit freedom Cost of decomposition –benefit of composite solution Technique: –dynamic programming –network flow

Penn ESE535 Spring DeHon 47 Admin Monday reading online HW7 final due Monday