Advanced Digital Design: Limits of Synchronous Design, by A. Steininger and M. Delvai, Vienna University of Technology
© A. Steininger & M. Delvai / TU Vienna
Recall: Previous Conclusion. The purpose of a design style is to provide information for flow control. Boolean logic alone cannot provide this information. Severe technological problems force us to question the current (synchronous) design practice. We shall focus on that. Alternatives must be evaluated very critically with respect to improvements in power, area, robustness, ease of composition, testability and performance.
Recall: What we actually need [figure: SRC feeds f(x), which feeds SNK]. When can SNK use its input? When it is valid and consistent. When can SRC apply the next input? When SNK has consumed the previous one.
Recall: Ideal Design Method. An ideal design method minimizes power consumption, minimizes circuit overhead, naturally supports composability, naturally aids testability, yields robust circuits, and yields fast circuits.
Outline: Timed Communication Model; Control Flow Conditions; Classification of Synchronous Design; Benefits of Synchronous Design; Problems with Synchronous Design; Evaluation.
Timed Communication Model
The Issue Condition. Control TRGSRC: have SRC issue the next data word such that the current one can still be safely consumed by SNK. Formal condition: $t_{invalid,x} > t_{safe,x}$; $\mu_{src} > -\Delta_{invalid}$.
The Capture Condition. Control TRGSNK: have SNK capture data only after it has become consistent. Formal condition: $t_{cons,x} > t_{snkrdy,x}$; $\mu_{snk} > -\Delta_{snktrg}$.
Our Options. We must only use consistent input vectors. How can we tell that an input vector is consistent? (1) Use TIME to mark consistent phases: synchronous approach / global time base; asynchronous / bounded delay. (2) Use CODING to add information: asynchronous / delay insensitive.
Synchronous Philosophy: "If the problem originates from the time domain, why don't we solve it in the time domain!" Process inputs only after they have become stable. Use the clock to signal these instants.
Control by Global Time
Synchronous Timing [figure: clock waveform with HI and LO levels, annotated with clock period, active clock edge, setup/hold window, and recovery from transients (clock-to-output delay, combinational delay, routing delay, …)]
The Synchronous Concept [figure: FF1 feeds f(x), which feeds FF2; both are clocked with period TClk]. TClk is dimensioned such that f(x) can settle and has safely assumed its value. "After some TIME TClk, FF2 can use f(x)'s output and at the same time FF1 can apply a new input."
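The requirement that the clock period leaves f(x) enough time to settle can be sketched as a simple delay budget. This is a minimal sketch, assuming the common setup-time budget (clock-to-output delay plus combinational delay plus routing delay plus setup time); the function names and all numbers are hypothetical, and clock skew and the hold constraint are ignored.

```python
def min_clock_period(t_clk2out, t_comb, t_route, t_setup):
    """Smallest clock period (same unit as the inputs) under the assumed budget."""
    return t_clk2out + t_comb + t_route + t_setup


def meets_timing(t_clk, t_clk2out, t_comb, t_route, t_setup):
    """True if FF2 sees a settled f(x) output at the next active clock edge."""
    return t_clk >= min_clock_period(t_clk2out, t_comb, t_route, t_setup)


# Hypothetical delays in picoseconds.
budget = min_clock_period(t_clk2out=300, t_comb=2500, t_route=700, t_setup=200)
print(budget)                                   # 3700
print(meets_timing(4000, 300, 2500, 700, 200))  # True
print(meets_timing(3500, 300, 2500, 700, 200))  # False
```

All four delays are worst-case figures that are only known after place and route, which is why the period cannot be fixed early in the design flow.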
The Implications. Clock period $T_{Clk}$ = period $\pi$: must be determined by static timing analysis. Phase $\varphi = \pi$ (!): this implies that $\mu_{src} = -(\Delta_{snktrg} + \Delta_{cons})$. Still we must guarantee $\mu_{src} > -\Delta_{invalid}$ (issue condition), therefore $\Delta_{invalid} > \Delta_{snktrg} + \Delta_{cons}$. This is not formally safe, but it works!
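Spelled out, the last step on this slide is a substitution followed by a sign flip. The sketch below writes the slide's m and D as $\mu$ and $\Delta$, which is an assumption about the original notation.

```latex
\begin{align*}
\mu_{src} &= -(\Delta_{snktrg} + \Delta_{cons})
  && \text{(implied by the chosen phase)} \\
\mu_{src} &> -\Delta_{invalid}
  && \text{(issue condition)} \\
\Rightarrow\quad -(\Delta_{snktrg} + \Delta_{cons}) &> -\Delta_{invalid} \\
\Leftrightarrow\quad \Delta_{invalid} &> \Delta_{snktrg} + \Delta_{cons}
\end{align*}
```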
Benefits of Sync. Logic. Simplicity improves productivity: design on a high level of abstraction (truth table with "previous state"); transients are irrelevant, all considered states are clearly defined; timing analysis is separate, after design; the clear distinction between data and clock simplifies timing analysis.
Benefits of Sync. Logic (2). High implementation efficiency: one single control signal for the complete system! A periodic clock is easy to generate; single-rail data coding; minimum number of transitions on the data rails; the clock also provides a time base.
Summary 1. Synchronous design does work: billions of working designs. Synchronous design is VERY efficient, both wrt. design and wrt. implementation. So everything is solved. Is it?
Recall: The Original Problem [figure: SRC feeds f(x), which feeds SNK]. When can SNK use its input? When it is valid and consistent. When can SRC apply the next input? When SNK has consumed the previous one.
Recall: What have we done? We have expressed a simple information-related condition by means of complicated timing-related parameters that we don't even know! DOES IT MATTER?
That damned traffic light: YES! It does matter.
That damned … Traffic light: number of waiting cars. Microwave oven: temperature of the food. Wiper: visibility through the windshield. Stairway light: presence of a person in the stairway.
What's wrong? Often events define important points in time. This does not mean, however, that the occurrence of the event can be related a priori to (absolute or relative) time. BUT: time is relatively easy to measure. Therefore it is often much more efficient to establish such an indirect relation than to observe the actual event (which is sometimes invisible). This starts to become annoying when the artificial relation between the actual event and the time model is too weak.
The Synchronous Approach [figure: FF1 feeds f(x), which feeds FF2; both are clocked with period TClk]. Relating flow control to time in this way is convenient and effective, but in fact the implied relation does not (naturally) exist! We need to establish this relation artificially during design (timing optimization & constraints). TClk is dimensioned such that f(x) can settle and has safely assumed its value. "After some TIME TClk, FF2 can use f(x)'s output and at the same time FF1 can apply a new input."
The annoying consequences. Need to determine the clock period: circuit functionality is technology dependent; considerable design efforts, large design loops. Need to make worst-case assumptions: necessarily pessimistic; no robustness wrt. exceeding them. Need to maintain global synchrony: clock distribution problems; power consumption problems.
Recall: Can we predict delay? After synthesis: logic depth, complexity of operation, optimization & mapping. After routing: interconnect geometry (lengths, capacitances), vias, switches. During operation: actual values, process variations, temperature, supply voltage.
Timing Analysis: not possible before the end of the design flow (large iteration loops!). [Figure: design flow with the steps Specification, Validation, Design-Entry, Behavioral Simulation, Synth. & Technol.-Mapping, Prelayout-GL-Simulation, Partitioning & Placement, Routing, Postlayout-GL-Simulation, Manufact. Test]
Timing Analysis: not possible before the end of the design flow (large iteration loops!); tight & safe estimation has become a major issue. [Figure: sync model versus reality (transients, setup/hold)]
Timing Analysis: not possible before the end of the design flow (large iteration loops!); tight & safe estimation has become a major issue; feasible with "ideal" clock net only. Original idea: avoid having to deal with transients. Current practice: timing analysis is most difficult. [Figure: flip-flops FF1, FF2, …, FFk feed FFm through combinational logic, with data delays tdly,DATA,1m, tdly,DATA,2m, …, tdly,DATA,km and clock propagation delay tPD,CLK]
Worst-Case Assumptions. Normally too pessimistic: the real chip could run faster. No tolerance when exceeded: graceful degradation would be desirable. [Figure: H(a) over a, with limit alim]
Performance Efficiency [figure with the labels real computation time, unbalanced stages, lib: worst vs. typ, crosstalk and IR drop, process variation, clock skew; values 100, 50, 20, 30, 1, 20] [Cortadella, ICCD'04]
Clock Distribution: the clock distribution network is widely spread over the chip; minimization of delay & skew is very tedious and costly.
Area Efficiency: area proportion devoted to the intended logic function; area proportion devoted to necessary flow control. Overhead: clock network 45% [Wilton, IEEE Jnl. SSC 2/2005].
Power Dissipation: the clock network consumes much energy; concurrent switching => current peaks => voltage drops; permanent switching => artificial activity; according to publications 40% (DEC, e.g.).
Power Efficiency: power for the intended function; dissipated power (total) with static part, dynamic part and control part (dynamic only*); circuit utilization. * [Duarte]
Electromagn. Interference: long clock rails are good antennas; virtually all radiated energy is concentrated in one single spectral line. [Figure: spectrum E(f) over f, with the level max / CE marked]
Composability: each and every small change in the design requires a completely new timing analysis; a switch to a new technology completely changes the timing; interoperation between IP cores on a chip requires detailed specification (and matching) of both logic function and timing behavior.
Asynchronous Inputs [figure: clock period Tclk with setup/hold decision window T0 and an asynchronous event]: probability of setup/hold violation.
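The probability named on this slide can be illustrated numerically. This is a minimal sketch under the assumption that the asynchronous event is uniformly distributed over the clock period, so the chance of falling into the decision window is simply T0 / Tclk; the function names and numbers are hypothetical.

```python
def p_window_hit(t0, t_clk):
    """Probability that one asynchronous event falls into the decision window.

    Assumes the event time is uniformly distributed over the clock period.
    """
    if t0 < 0 or t_clk <= 0:
        raise ValueError("t0 must be >= 0 and t_clk must be > 0")
    return min(1.0, t0 / t_clk)


def window_hits_per_second(f_dat, t0, t_clk):
    """Expected number of decision-window hits per second at data rate f_dat."""
    return f_dat * p_window_hit(t0, t_clk)


# Hypothetical values: 0.1 ns window, 10 ns clock period, 1 MHz event rate.
print(p_window_hit(0.1, 10.0))
print(window_hits_per_second(1e6, 0.1, 10.0))
```

Both times only need to share a unit, since only their ratio enters the result.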
Multiple Clock Domains: CLK 1 (Ref) and CLK 2 have an arbitrary "phase" relation; setup/hold violation is inevitable (fundamentally!).
Latch: Operation Model [figure: transparent phase and hold phase]
Response Time of a FF: an input transition during the decision window leads to an (unbounded) increase of the clock-to-output delay tclk2out. [Figure: CLK and D waveforms; tclk2out plotted over tclk2data, with tclk2out,nom, tsetup and thold marked]
Physical Equivalent: the ball may remain on top ("metastable") for unbounded time. A small disturbance causes the ball to fall in either direction.
Metastability Propagation [figure: inverter characteristic, uout over uin; storage element with data and clk inputs]. The inverter maps metastable inputs to metastable outputs. Therefore metastability can propagate.
Inconsistent Perception [figure: metastable signal X seen by FF A with threshold A and by FF B with threshold B]. The metastable state may be regarded as "1" by one FF and as "0" by another.
Resolution Time [figure: the asyn input is sampled by a FF whose syn output passes through comb. logic to the next FF, both clocked by clk; timing diagram with tclk2out, tcomb, tSU and tr]. Normal operation: tclk2out < tr. Upset: tclk2out > tr.
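The two cases on this slide can be expressed as a small check. This is a minimal sketch assuming that the resolution time is what remains of the clock period after the combinational delay and the setup time of the next FF, i.e. tr = Tclk - tcomb - tSU; that relation is read off the figure labels and is an assumption, as are the function names and numbers.

```python
def resolution_time(t_clk, t_comb, t_su):
    """Assumed resolution time: clock period minus logic delay and setup time."""
    return t_clk - t_comb - t_su


def is_upset(t_clk2out, t_r):
    """Upset if the (possibly prolonged) clock-to-output delay exceeds t_r."""
    return t_clk2out > t_r


# Hypothetical values in picoseconds.
t_r = resolution_time(t_clk=10000, t_comb=6000, t_su=200)
print(t_r)                  # 3800
print(is_upset(500, t_r))   # False: normal operation
print(is_upset(4000, t_r))  # True: metastability lasted too long
```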
Mean Time Between Upset: a metastable output is captured by the subsequent FF after tr. The Mean Time Between Upset (MTBU) is the expected value (statistics!) of the interval between two subsequent upsets.
Parameters. Resolution time tr: interval available for the output to settle after the active clock edge. Flip-flop parameters tc, T0: experimentally determined; the time constant tc depends on the transit frequency; T0 follows from the effective width of the decision window. Clock period of the FF: Tclk = 1/fclk. Average rate of change fdat: average data rate at the FF data input.
Derivation of formula: the upset rate combines the transition rate at the data input, the probability of a transition hitting the "decision window", and the probability of the metastable state not being resolved during tr.
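The formula itself is not preserved in this text. A commonly used form that is consistent with the three factors named on the slide is sketched below; $\tau_c$ stands for the slide's time constant tc, and whether the authors used exactly this form is an assumption.

```latex
\begin{align*}
\lambda_{upset}
  &= \underbrace{f_{dat}}_{\text{transition rate at data input}}
     \cdot
     \underbrace{\frac{T_0}{T_{clk}}}_{\text{hits decision window}}
     \cdot
     \underbrace{e^{-t_r/\tau_c}}_{\text{not resolved during } t_r}
   = f_{dat}\, f_{clk}\, T_0\, e^{-t_r/\tau_c} \\[4pt]
\mathrm{MTBU}
  &= \frac{1}{\lambda_{upset}}
   = \frac{e^{\,t_r/\tau_c}}{T_0\, f_{clk}\, f_{dat}}
\end{align*}
```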
Synchronizer. Example: cascade of n input FFs [figure: asyn input, chain of FFs clocked by clk, syn output].
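The benefit of cascading input FFs can be illustrated with the widely used exponential MTBU model. This is a minimal sketch: the formula MTBU = exp(tr / tau_c) / (T0 * fclk * fdat) is the common textbook form and is assumed here, and the rule that each additional FF contributes roughly one clock period minus the setup time to tr is a simplification. All names and numbers are hypothetical.

```python
import math


def mtbu(t_r, tau_c, t0, f_clk, f_dat):
    """Mean time between upsets in seconds (assumed exponential model)."""
    return math.exp(t_r / tau_c) / (t0 * f_clk * f_dat)


def cascade_resolution_time(n, t_clk, t_su):
    """Assumed total resolution time for a cascade of n >= 2 input FFs."""
    if n < 2:
        raise ValueError("the sketch models cascades of at least two FFs")
    return (n - 1) * (t_clk - t_su)


# Hypothetical parameters, all times in seconds and rates in Hz.
t_clk, t_su = 10e-9, 0.2e-9
tau_c, t0 = 0.5e-9, 0.1e-9
f_clk, f_dat = 1.0 / t_clk, 1e6

for n in (2, 3):
    t_r = cascade_resolution_time(n, t_clk, t_su)
    print(n, mtbu(t_r, tau_c, t0, f_clk, f_dat))
```

Because tr enters through the exponent, each extra stage multiplies the MTBU by a constant factor, at the cost of one more clock cycle of latency.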
Robustness: metastability issues; clock = single point of failure; non-redundant signal coding; no graceful degradation; timing margins help masking faults, but they are shrinking! [Figure: Fault Injection Results for SPEAR, Thesis Rahbaran]
Testability: scan test turns a sequential problem into a combinational one => hard to beat!
Conclusion. An analysis of the data transfer process allows mapping the trigger conditions for data source and sink to the time domain, yielding an "issue condition" and a "capture condition". This convenient solution is used by some design styles, in particular synchronous design. This mapping is, however, not natural. As an alternative, signal coding may be used to control the triggers of source and sink.
Conclusion. Synchronous design is extremely efficient wrt. design and testing. It establishes a relation between handshake events and time that becomes increasingly cumbersome. Weak points are inherent robustness and composability. Power efficiency, area efficiency and performance efficiency are very good in principle, but limitations in clock distribution tend to foil these benefits.