A Quasi-Delay-Insensitive Method to Overcome Transistor Variation Charlie Brej APT Group University of Manchester 22/06/2019 VLSI 2005
Overview: Synchronous Problems; Asynchronous Logic (Why? How?); Asynchronous Benefits (Delay Insensitivity, Early Output)
Problems: Communication. Communication horizon: “For a 60 nanometer process a signal can reach only 5% of the die’s length in a clock cycle” [D. Matzke, 1997]. Clock distributed using wave pipelining.
Can’t keep ramping up the clock. Intel pulls the plug on 4 GHz Pentium 4. AMD and Intel using PR-based model numbers. New ranges run at much slower clock rates. Higher concentration on parallel execution: Hyper-threading, Multiple cores.
Problems: Performance. Cycle time breakdown: Real Computation; Unbalanced Stages; Clock overheads; Clock Skew/Jitter; Transistor Variability; Timing Assumption overheads; Signal Integrity; Worst – Average case performance.
Clock! What is it good for? No arguing with the clock. 9am - 5pm. No excuses!
Bundled-Data: Request + Delay, Acknowledge. When you finish, do the next task. Flexitime.
Remove the Clock. Cycle time breakdown: Real Computation; Unbalanced Stages; Clock overheads; Clock Skew/Jitter; Transistor Variability; Timing Assumption overheads; Signal Integrity; Worst – Average case performance.
How do you know when you are finished? Synchronous: Estimate; Global timing reference. Asynchronous (bundled-data): Local delay elements. Asynchronous (delay-insensitive): When the data arrives; Intrinsic.
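As a rough illustration of the difference between estimating completion and detecting it, the sketch below compares a clocked design, which pays the estimated worst case on every cycle, with a delay-insensitive one, which proceeds when the data arrives. The delay values and the margin are hypothetical numbers chosen for illustration, not measurements, and handshake overhead is ignored.

```python
# Hypothetical per-operation delays in arbitrary time units (not measured data).
delays = [3, 5, 4, 9, 3]

# Assumed safety margin covering skew, jitter and transistor variability.
margin = 1

# Synchronous: every cycle lasts as long as the estimated worst case plus margin.
sync_total = len(delays) * (max(delays) + margin)

# Delay-insensitive: each operation is finished when its data actually arrives,
# so the total is simply the sum of the individual delays.
di_total = sum(delays)

print(f"synchronous total:       {sync_total}")
print(f"delay-insensitive total: {di_total}")
```

With these made-up numbers the clocked total is set by the single slowest operation, while the delay-insensitive total tracks the average case.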
Becoming Delay Insensitive. Dual-Rail: Two wires; 00 – NULL, 01 – Zero, 10 – One (11 – Not used). Four Phase handshake: Return to zero. Signals: R0, R1, Ack.
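The dual-rail code and four-phase handshake listed above can be sketched in software as follows. This is a minimal model, not a circuit description. It assumes each code is written as the pair (R1, R0), so that 01 means R0 is high; the slide does not state the wire order. The function names `encode`, `decode` and `four_phase` are made up for this example.

```python
from typing import List, Optional, Tuple

NULL = (0, 0)  # 00: spacer, no data present

def encode(bit: int) -> Tuple[int, int]:
    """Return the (R1, R0) pair for a bit: 01 is Zero, 10 is One."""
    return (1, 0) if bit else (0, 1)

def decode(code: Tuple[int, int]) -> Optional[int]:
    """Return the bit, or None while the pair is still NULL."""
    if code == NULL:
        return None
    if code == (1, 1):
        raise ValueError("11 is not used in dual-rail")
    return 1 if code == (1, 0) else 0

def four_phase(bit: int) -> List[Tuple[int, int, int]]:
    """List the (R1, R0, Ack) states of one return-to-zero transfer."""
    r1, r0 = encode(bit)
    return [
        (r1, r0, 0),  # 1. sender raises one rail: data is valid
        (r1, r0, 1),  # 2. receiver raises Ack
        (0, 0, 1),    # 3. sender returns both rails to NULL
        (0, 0, 0),    # 4. receiver lowers Ack, ready for the next data
    ]

for state in four_phase(1):
    print(state)
```

Because a valid code always has exactly one rail high, the receiver can tell that data has arrived from the wires themselves, without a separate timing reference.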
Delay Insensitivity. No assumptions on speed of wires or gates. Environmental effects: Heat, Voltage supply. Manufacturing defects. Thin Film Transistor. Next generation process sizes.
Early Output Logic. Dual-Rail interfaces. Output generated as early as possible. Two Early output cases: If either input is ‘0’ then the output is ‘0’.
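The early-output rule for an AND gate can be sketched over dual-rail values as below. This is an illustrative model only. It assumes each signal is an (R1, R0) pair with (0, 0) meaning the input has not arrived yet, and it leaves out the acknowledge and return-to-NULL handling a real circuit needs. The function name `early_and` is made up for this example.

```python
from typing import Tuple

Rail = Tuple[int, int]  # (R1, R0); (0, 0) is NULL, i.e. not yet arrived

def early_and(a: Rail, b: Rail) -> Rail:
    """Dual-rail AND that produces a Zero as soon as either input is Zero."""
    a1, a0 = a
    b1, b0 = b
    out0 = a0 | b0  # either input is Zero: output Zero without waiting
    out1 = a1 & b1  # output One only when both inputs have arrived as One
    return (out1, out0)

NULL, ZERO, ONE = (0, 0), (0, 1), (1, 0)

print(early_and(ZERO, NULL))  # Zero, although b has not arrived
print(early_and(NULL, ZERO))  # Zero, although a has not arrived
print(early_and(ONE, NULL))   # NULL: must wait for b
print(early_and(ONE, ONE))    # One
```

The first two calls are the two early cases: one input being ‘0’ is enough to decide the result, so the gate does not wait for the other.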
Bit-level pipelining. Forward completed parts of the result. Pace work. Don’t stall parts unless you have to.
Early Output cases
Paper contribution. Still generates results with missing inputs. Isolates late inputs. Allows next data phase.
Remove Unnecessary Computation. Cycle time breakdown: Real Computation; Unbalanced Stages; Clock overheads; Clock Skew/Jitter; Transistor Variability; Timing Assumption overheads; Signal Integrity; Worst – Average case performance; Unnecessary Computation/Delays.
Summary: Asynchronous, Delay Insensitive. Average case performance. Safe: No timing assumptions. Remove unnecessary computation.