TH EDA NTHU-CS VLSI/CAD LAB 1 Re-synthesis for Reliability Design Shih-Chieh Chang Department of Computer Science National Tsing Hua University
2 Reliability Design Logic Re-synthesis for delay variation tolerance (DAC 04) A Vectorless Estimation of Maximum Instantaneous Current for Sequential Circuits (ICCAD 04)
4 Delay Variation Problem Circuit delay is increasingly sensitive to: -process variation, -delay defects, -IR drop and crosstalk. Delay variation can lead to timing violations.
5 Pessimistic Delay Analysis Traditional solution: alleviate the delay variation problem by adding timing margin. -Unnecessary pessimism: a fabricated ASIC may run up to 40% faster [Chinnery and Keutzer]. -Adding timing margin may not be possible. Our solution: add redundancy (at an area penalty) for delay variation tolerance.
6 Delay Variation on a Gate Gates along critical paths are vulnerable to delay variation. Vulnerable gates have small slacks. [Figure: example circuit; gate delay = 1, circuit delay = 6; other labels: 2, 7]
7 Delay Tolerance and Slack A gate’s slack: the amount by which the gate’s delay can increase without violating the circuit’s delay requirement. Slack correlates with delay tolerance: -A smaller slack means a more vulnerable gate. -Increasing the slacks of gates increases delay variation tolerance.
8 Delay Tolerance on a Circuit Definition: A circuit has delay tolerance d_t if its smallest slack is d_t. [Figure: example circuit; gate delay = 1, timing requirement = 7]
9 Delay Tolerance on a Circuit Definition: A circuit has delay tolerance d_t if its smallest slack is d_t. In the example, the smallest slack is 1, so the circuit has a delay tolerance of 1.
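The definition above can be made concrete with a small slack computation. The following is a minimal sketch, not the authors' tool: the function name `slacks`, the netlist encoding, and the seven-gate example circuit are all hypothetical, chosen only so that the numbers match the slide's setting (gate delay = 1, timing requirement = 7, smallest slack = 1). Primary inputs are assumed to arrive at time 0.

```python
from collections import defaultdict

def slacks(gates, fanin, outputs, timing_requirement, gate_delay=1):
    """Return {gate: slack} for a combinational DAG.

    `gates` must be in topological order (fan-ins before fan-outs).
    """
    # Arrival time: latest fan-in arrival plus the gate's own delay.
    arrival = {}
    for g in gates:
        arrival[g] = max((arrival[f] for f in fanin.get(g, [])), default=0) + gate_delay

    # Build the fan-out relation so required times can be propagated backward.
    fanout = defaultdict(list)
    for g, sources in fanin.items():
        for f in sources:
            fanout[f].append(g)

    # Required time: latest settle time that still meets the timing requirement.
    required = {}
    for g in reversed(gates):
        limits = [required[s] - gate_delay for s in fanout[g]]
        if g in outputs:
            limits.append(timing_requirement)
        required[g] = min(limits, default=float("inf"))

    return {g: required[g] - arrival[g] for g in gates}

# Hypothetical circuit: a six-gate critical chain a..f plus a side gate x feeding f.
gates = ["a", "x", "b", "c", "d", "e", "f"]
fanin = {"b": ["a"], "c": ["b"], "d": ["c"], "e": ["d"], "f": ["e", "x"]}
result = slacks(gates, fanin, outputs={"f"}, timing_requirement=7)
delay_tolerance = min(result.values())  # smallest slack in the circuit
print(delay_tolerance)  # 1
```

In this sketch every gate on the chain has slack 1, while the side gate x has slack 5, so the circuit's delay tolerance is set by the critical path.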
10 Problem Formulation Inputs: -a circuit, and -a delay tolerance requirement d_t. Output: -a re-synthesized circuit with delay tolerance d_t.
11 Our Basic Idea Add redundant gates so that the smallest slack is increased. Now: delay tolerance of 1. Goal: delay tolerance of 2. [Figure: circuit with voting machine V]
12 Our Basic Idea The function does not change, but the smallest slack is increased to 2, so the circuit has a delay tolerance of 2. [Figure: circuit with voting machine V]
13 Steps of Our Approach Start with triple modular redundancy (TMR): three copies and a voting machine. [Figure: three copies feeding voting machine V]
14 Property of TMR (1) If any two copies are correct, the output is correct. Each wire/gate is redundant. [Figure: voting machine V; signal labels 0 1 1 and 1 1 1]
15 Property of TMR (2) The delay is NOT decided by the latest signal. [Figure: voting machine V; labels: the second arriving signal, the latest signal]
16 Property of TMR (2) If a node’s delay becomes infinite, it does not affect the final delay. Each wire/gate has infinite slack in a TMR. [Figure: voting machine V; one node with delay = infinite]
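The two TMR properties can be illustrated with a simplified timing model. This is only a sketch under stated assumptions: all three copies are assumed to compute the same correct value, so the voter output is settled as soon as two of them have arrived. The function names and the `voter_delay` parameter are hypothetical.

```python
import math

def majority(a, b, c):
    """Bitwise majority vote of three copies: correct if any two copies agree."""
    return (a & b) | (a & c) | (b & c)

def voter_arrival(copy_arrivals, voter_delay=0):
    """Settle time of the voter output under the simplified model above.

    With three identical copies, the output is decided by the second
    arriving signal, not by the latest one.
    """
    if len(copy_arrivals) != 3:
        raise ValueError("TMR uses exactly three copies")
    return sorted(copy_arrivals)[1] + voter_delay

# One faulty copy does not change the voted value.
print(majority(0, 1, 1))  # 1

# One copy with infinite delay does not change the output arrival time.
print(voter_arrival([6, 6, math.inf]))  # 6
```

Because the slowest copy never determines the result in this model, any single wire or gate can be delayed arbitrarily, which is what the slide means by infinite slack.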
17 TMR vs. Delay Tolerance TMR can tolerate delay variation because of its infinite slack. Process variation or noise may increase circuit delay by 10% - 20%, so infinite slack is over-protective. The 200% area penalty of TMR is impractical.
18 Slack Changes After Wire Removal [Figure: TMR circuit with voting machine V; gate slack = infinite; labels: 0 0]
19 Removing Redundant Wires After removing a redundant wire/gate, -the circuit function does not change, -some slacks may decrease. Objective: remove redundant wires/gates while maintaining a smallest slack of d_t.
20 Removing Wires [Figure: circuit with voting machine V]
21 Removing Wires The smallest slack is 2, which satisfies d_t = 2. [Figure: circuit with voting machine V]
22 Signal Sharing Share the functions of side-input wires.
23 Signal Sharing Share the functions of side-input wires. [Figure: circuit with voting machine V]
24 Resulting Circuit The smallest slack is 2, which satisfies d_t = 2.
25 Outline Delay variation problem Triple Modular Redundancy (TMR) Re-synthesis for delay variation tolerance Experimental results Conclusion
26 Experimental Flow Given a circuit, optimize it with script.delay and obtain the circuit’s delay. Re-synthesize the circuit using d_t = 10% of the circuit’s delay or d_t = 15% of the circuit’s delay.
27 Experimental Results [Table: column groups Circuit, Original, d_t = 10%, d_t = 15%; sub-headers: Delay, Overhead (%), Delay, Overhead (%), Delay; rows: Apex, Apex, Frg, Pair, S, S, S, S, S, S, Avg; cell values missing]
28 Statistical Analysis Compare the statistical timing of a circuit and its re-synthesized circuit. Model each gate’s delay as a probability density function, as described in [Liou DAC02]. Run Monte Carlo simulation to generate 10,000 samples for both the circuit and its re-synthesized circuit. Count the number of samples whose delay satisfies a pre-defined delay requirement. Delay requirement = 1.1 * the circuit’s delay.
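The Monte Carlo procedure above can be sketched as follows. This is an illustration only: the per-gate delay model here is an assumed Gaussian (nominal delay 1, standard deviation `sigma`, clipped at zero), which stands in for the distribution of [Liou DAC02] and is not taken from it. The function names, the netlist encoding, and the example chain circuit are hypothetical.

```python
import random

def circuit_delay(gates, fanin, delay):
    """Longest-path delay of a DAG; `gates` must be in topological order."""
    arrival = {}
    for g in gates:
        arrival[g] = max((arrival[f] for f in fanin.get(g, [])), default=0.0) + delay[g]
    return max(arrival.values())

def timing_yield(gates, fanin, nominal=1.0, sigma=0.05,
                 samples=10_000, margin=1.1, seed=0):
    """Fraction of sampled circuits whose delay meets margin * nominal delay."""
    rng = random.Random(seed)
    nominal_delay = circuit_delay(gates, fanin, {g: nominal for g in gates})
    # Tiny tolerance guards against floating-point rounding at the boundary.
    requirement = margin * nominal_delay + 1e-9
    passed = 0
    for _ in range(samples):
        # Assumed delay model: independent Gaussian per gate, clipped at 0.
        sample = {g: max(0.0, rng.gauss(nominal, sigma)) for g in gates}
        if circuit_delay(gates, fanin, sample) <= requirement:
            passed += 1
    return passed / samples

# Hypothetical six-gate chain used only to exercise the functions.
chain = ["a", "b", "c", "d", "e", "f"]
chain_fanin = {"b": ["a"], "c": ["b"], "d": ["c"], "e": ["d"], "f": ["e"]}
yield_estimate = timing_yield(chain, chain_fanin)
print(f"fraction meeting 1.1x delay requirement: {yield_estimate:.3f}")
```

Running the same function on a circuit and on its re-synthesized version, with the requirement fixed to 1.1 times the original circuit's delay, would give the pair of counts that the slide compares.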
29 Experimental Results [Table: columns Circuit, Timing requirement, Statistical analysis (Original, d_t = 10%); rows: Apex, Apex, Frg, Pair, S, S, S, S, S, S, Avg; surviving values: Avg.11.28]
30 Conclusion Re-synthesize for delay tolerance d_t. Adopt wire removal and signal sharing to reduce area overhead. The area penalty is about 21% for 10% delay tolerance.
31 Reliability Design Logic Re-synthesis for delay variation tolerance (DAC 04) A Vectorless Estimation of Maximum Instantaneous Current for Sequential Circuits (ICCAD 04)
32 Power Noise Excessively large current through the power bus may cause IR drop and electromigration (EM). Severe IR drop and EM degrade performance and reliability. Accurate estimation of the Maximum Instantaneous Current (MIC) is needed to analyze this noise.
33 Maximum Instantaneous Current Maximum Instantaneous Current (MIC): -depends on input vectors and time. [Figure: timeline t=1, t=2, t=3; labels: Maximum current=3 at time t=3; Maximum current=4 at time t=1; 0 0]
34 Previous Work Vector-dependent: -Find a vector pair -Lower bound estimation. Vector-independent: -Does not find the worst-case vectors -Upper bound estimation -iMax and PIE [H. Kriplani et al.]
35 Outline Maximum instantaneous current (MIC) problem Signal correlation problems MIC estimation based on the concept of mutually exclusive switching Experimental results & conclusion
36 Summary Identifying signal correlation is important for MIC estimation. Contribution: efficiently identify complex combinational and sequential correlations. [Figure labels: No correlation ? Correlation]
37 Combinational Correlation Signal correlation in a combinational circuit. The two transitions cannot occur simultaneously
38 Combinational Correlation Can efficiently recognize complicated combinational correlations. t=4 Cannot occur simultaneously
39 Sequential Correlation Correlation across sequential elements. (f_1, f_2) = (0, 0), (0, 1), (1, 0), (1, 1). [Figure: flip-flops f_1 and f_2 at t=0 and t=1]
40 Sequential Correlation Some (next) states are not reachable from a current state. Deriving the state transition diagram is NOT practical. We obtain sequential correlation implicitly, without needing a state transition diagram. None of the previous work can detect sequential correlation.
41 Before Exploring Signal Correlation… Decide whether a set of gates can switch simultaneously at time t_1. Goal: find necessary conditions for a gate to switch at time t_1.
42 An Example for MES Detection Mutually Exclusive Switching at t=4 ?
43 Conflicts Mutually Exclusive Switching [Figure labels: Initial values; Stable values; Switch at t =]
44 Conflicts Mutually Exclusive Switching Mutually Exclusive Switching at t=4
45 Necessary Conditions in Sequential Circuits [Figure: gate g and a flip-flop; switch at t=2]
46 Necessary Conditions in Sequential Circuits To reveal sequential correlation, we link the two circuit copies through flip-flops. [Figure: two circuit copies (initial values, stable values) linked through a flip-flop; gate g switches at t=2]
47 MIC Estimation Based on MES Use an undirected graph to represent the MES relation at time t_1. Find a set of nodes with no edge between any two of them; these gates can switch simultaneously. [Figure: MES graph at time t_1; current contribution = 1; maximum current = 3 at time t_1]
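The graph formulation above amounts to finding a maximum-weight set of pairwise non-adjacent nodes in the MES graph. The sketch below uses exhaustive search, which is only practical for tiny graphs and is not claimed to be the authors' algorithm. The function name, the gate names, and the example graph are hypothetical; the example assumes each gate contributes a current of 1.

```python
from itertools import combinations

def mic_upper_bound(current, mes_edges):
    """Largest total current over gate sets with no MES edge inside the set.

    current:   {gate: current contribution when the gate switches}
    mes_edges: pairs of gates that cannot switch simultaneously
    """
    gates = list(current)
    conflicts = {frozenset(edge) for edge in mes_edges}
    best, best_set = 0, ()
    # Brute force over all subsets: exponential, for illustration only.
    for size in range(1, len(gates) + 1):
        for subset in combinations(gates, size):
            if any(frozenset(pair) in conflicts for pair in combinations(subset, 2)):
                continue  # two gates in this subset are mutually exclusive
            total = sum(current[g] for g in subset)
            if total > best:
                best, best_set = total, subset
    return best, best_set

# Hypothetical MES graph at one time point: g1 and g2 are mutually exclusive.
current = {"g1": 1, "g2": 1, "g3": 1, "g4": 1}
bound, switching = mic_upper_bound(current, [("g1", "g2")])
print(bound)  # 3
```

Without the MES edge, all four gates would be counted and the estimate would be 4; the edge is what tightens the bound to 3.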
48 Experimental Flow Combinational and sequential MCNC ISCAS benchmarks. Upper bound estimations: iMax, PIE (1000 s_nodes), and MES. Lower bound estimations: Random simulation for 3 days.
49 Results for Combinational Circuits [Chart: iMax, PIE, and Random; labeled values: iMax=2.6, PIE=2.3, Random=0.95; iMax=2.3, PIE=1.7]
50 Results for Sequential Circuits [Chart: iMax, PIE, and Random; labeled values: iMax=3.1, PIE=2.3]
51 Upper Bound Estimation Our method derives a tighter upper bound for sequential circuits. [Chart: iMax and PIE, Avg. MIC; labeled values: iMax=2.3, PIE=1.7; iMax=3.1, PIE=2.3]
52 Lower Bound Estimation If an upper bound is close to the corresponding lower bound, both estimations are accurate. For small circuits, our upper bound results are close to the lower bound results. For large circuits, random simulation may reach only a small portion of the solution space. Ex. In s344, random simulation reaches only 57% of the 2625 reachable states.
53 Run Time iMax takes a few seconds for the largest circuit. Our run time is in general shorter than that of PIE. MIC estimation is performed only once, and our run time is reasonable for a large design. Ex. In s15850, ours = 2500 sec.; PIE = 15000 sec.
54 Conclusion A vectorless method to estimate the MIC for sequential circuits. Based on mutually exclusive switching. Experimental results on sequential circuits are encouraging.
55 Thank you!