Nano Architectures II
    Greg Snider
    QSR, Hewlett-Packard Laboratories
Today’s talk
    Living in an imperfect world
    Quick recap of Wednesday’s talk
    Transient faults
        History (von Neumann)
        Approaches (coding theory)
    Static defects
        Background (Teramac)
        Empirical studies
    Nano / micro interface
    DEMO: 4-bit nanoprocessor
    November 8, 2018
Configurable Tile
Tile Types
Mosaics
1. n-FET / resistor logic
    (figure: circuit between GND and V+ with input lines A, B, C, computing AB + C)
2. p-FET / resistor logic
    (figure: circuit between V+ and GND with input lines A, B, C, computing AB + C)
3. n-FET / p-FET logic
    (figure: circuit between +V and Ground)
One of the first papers on Nanoelectronics
    “Probabilistic Logics and the Synthesis of Reliable Organisms from Unreliable Components”
    J. von Neumann, 1955!
von Neumann’s Worry
    (figure: a circuit assembled from parts, some of them faulty)
    Static defects: initially bad, wear out
    Dynamic faults: transient, intermittent
The Solution (in order of desirability)
    1. Fewer parts
    2. Better parts
    3. Redundancy
Triple Modular Redundancy (von Neumann)
    Three copies of f (x, y) receive the same inputs x and y; a majority vote over their outputs produces z.
    Voter assumed reliable! (The voter is small.)
    Coarse-grained.
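The voting step can be sketched in C. This is a minimal illustration and not code from the talk: `f` is a stand-in function, and the three `fault` arguments are a hypothetical way to model a transient error in each copy. The voter is assumed reliable, as on the slide.

```c
#include <assert.h>

/* Bitwise majority of three words: each output bit is 1 when at least
   two of the corresponding input bits are 1. */
static unsigned majority3(unsigned a, unsigned b, unsigned c) {
    return (a & b) | (a & c) | (b & c);
}

/* Hypothetical module under protection (an assumption for this sketch). */
static unsigned f(unsigned x, unsigned y) {
    return x + y;
}

/* Run three copies of f and vote on the results. Each fault mask is
   XORed into one copy's output to model a transient fault there. */
static unsigned tmr(unsigned x, unsigned y,
                    unsigned fault0, unsigned fault1, unsigned fault2) {
    unsigned r0 = f(x, y) ^ fault0;
    unsigned r1 = f(x, y) ^ fault1;
    unsigned r2 = f(x, y) ^ fault2;
    return majority3(r0, r1, r2);
}
```

A fault in any single copy is outvoted. If two copies fail on the same bit, the vote goes wrong, which is why this scheme only masks isolated faults.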
What if voters are flaky?
    Probabilistic approach
    Each logic signal is a “fuzzy” value (0…1)
    (figure: scale from 0.0 to 1.0 with 0.5 at the midpoint; the low end is labeled “mostly false”, the high end “mostly true”, and the middle “failure!”)
Parallel Restitution (von Neumann)
    1. Replace each wire with a “bundle” (x becomes x1…x4, y becomes y1…y4, z becomes z1…z4).
    2. Replace the function with a redundant version, F.
    (figure: inside F(x, y), the bundles pass through a random permutation into copies of f (x, y), each followed by a majority vote that drives one of z1…z4)
    Each signal becomes a bundle of N signals.
    Voters can be flaky! => fine-grained.
Parallel Restitution
    Localized: replace each wire with a bundle of N wires. Replicate gates, scramble inputs, add voters.
Parallel Restitution: How does it work?
    Bundle = stochastic variable (0.0 to 1.0); its value is the fraction of wires in the HI state.
    (figure: scale from 0.0 to 1.0 divided into “false”, “failure”, and “true” regions, with tick marks at 0.0, .7, .93, and 1.0)
    Majority gates act as “stochastic amplifiers,” reducing the entropy of the computation.
    (figure: example network annotated with bundle values near 1, such as .98, .94, .95, and .99, and near 0, such as .03, .02, and .01)
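The “stochastic amplifier” effect can be seen with a little arithmetic. The sketch below assumes 3-input majority voters whose inputs are drawn independently from a bundle in which a fraction p of the wires is HI, and a voter that flips its own output with probability eps. Both the function name and the noise model are assumptions for illustration, not taken from the slides.

```c
#include <assert.h>

/* Fraction of HI wires after one layer of 3-input majority voters.
   p:   fraction of HI wires in the input bundle (0.0 to 1.0)
   eps: probability that a voter flips its own output (0.0 to 0.5)
   An ideal voter outputs HI with probability 3p^2 - 2p^3. */
static double restore(double p, double eps) {
    assert(p >= 0.0 && p <= 1.0);
    assert(eps >= 0.0 && eps <= 0.5);
    double ideal = p * p * (3.0 - 2.0 * p);
    return (1.0 - eps) * ideal + eps * (1.0 - ideal);
}
```

With ideal voters, a bundle at 0.93 moves to about 0.986 and a bundle at 0.07 moves to about 0.014, so values are pushed away from 0.5 and toward the nearest logic level. A bundle at exactly 0.5 stays there.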
Parallel Restitution. Practical?
    Wires in bundle    Probability of failure
    1                  5.0 x 10^-3
    1,000              2.7 x 10^-2
    2,000              2.6 x 10^-3
    5,000              4.0 x 10^-6
    10,000             1.6 x 10^-10
    25,000             1.2 x 10^-23
…so now what?
    Coding theory to the rescue…
        Error detecting codes
        Error correcting codes
Error Detecting Codes
    Sent (includes check bit):  1 1 0 0 1 0 1    even # 1’s
    Noisy channel
    Received (one bit error):   1 0 0 0 1 0 1    odd # 1’s: ERROR!
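A single even-parity check bit, as in the example above, can be sketched in C. The placement of the check bit in the least significant position is an assumption made here; the slide does not say which bit is the check bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns 1 if word has an odd number of 1 bits, else 0. */
static unsigned parity(uint8_t word) {
    unsigned p = 0;
    while (word) {
        p ^= 1u;
        word &= (uint8_t)(word - 1);  /* clear the lowest set bit */
    }
    return p;
}

/* Append an even-parity check bit as the least significant bit.
   data must fit in 6 bits so that the codeword fits in 7. */
static uint8_t parity_encode(uint8_t data) {
    assert(data < 64);
    return (uint8_t)((data << 1) | parity(data));
}

/* A received word with an odd number of 1s cannot be a codeword. */
static bool parity_error(uint8_t codeword) {
    return parity(codeword) != 0;
}
```

Under that assumption the slide's sent word 1100101 is the encoding of 110010, and the received word 1000101 is flagged as an error. Parity detects any odd number of bit flips but cannot say which bit flipped.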
Error Correcting Codes (ECC)
    Sent (check bits + data):     0 1 0 1 1 0 0 1 0 1
    Noisy channel (or noisy circuit!)
    Received (one bit error):     0 1 0 1 0 0 0 1 0 1
    Correction circuit output:    1 1 0 0 1 0 1
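The slide does not name the code it uses, so the sketch below swaps in the standard Hamming(7,4) code as a small, concrete example of single-error correction. Codeword positions are numbered 1 to 7 with check bits at positions 1, 2, and 4; the function names are chosen here for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Codeword positions (1-based) that carry the four data bits. */
static const unsigned data_pos[4] = {3, 5, 6, 7};

/* XOR of the positions of all 1 bits. Zero for a valid codeword;
   otherwise it is the position of a single flipped bit. */
static unsigned syndrome(uint8_t cw) {
    unsigned s = 0;
    for (unsigned pos = 1; pos <= 7; pos++)
        if (cw & (1u << (pos - 1)))
            s ^= pos;
    return s;
}

/* Encode 4 data bits into a 7-bit codeword (bit pos-1 holds position pos). */
static uint8_t hamming74_encode(uint8_t data) {
    assert(data < 16);
    uint8_t cw = 0;
    for (int i = 0; i < 4; i++)
        if (data & (1u << i))
            cw |= (uint8_t)(1u << (data_pos[i] - 1));
    /* Set check bits at positions 1, 2, 4 so the syndrome becomes zero. */
    unsigned s = syndrome(cw);
    for (unsigned p = 1; p <= 4; p <<= 1)
        if (s & p)
            cw |= (uint8_t)(1u << (p - 1));
    return cw;
}

/* Correct up to one flipped bit, then extract the 4 data bits. */
static uint8_t hamming74_decode(uint8_t cw) {
    cw &= 0x7F;
    unsigned s = syndrome(cw);
    if (s)
        cw ^= (uint8_t)(1u << (s - 1));
    uint8_t data = 0;
    for (int i = 0; i < 4; i++)
        if (cw & (1u << (data_pos[i] - 1)))
            data |= (uint8_t)(1u << i);
    return data;
}
```

Any single flipped bit, in a data or a check position, is repaired. Two flipped bits are miscorrected, so this code is only a fit when double errors are rare.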
Self-correcting circuits
    (figure: encode blocks on in 1 and in 2, function blocks f, g, and h, and a decode block on out; the region between encode and decode is labeled “Error correcting code”)
    von Neumann’s approach, BUT his error correcting code (a repetition code) was very inefficient.
    More efficient codes?
        Memories: yes
        Add, sub, shift: yes
        AND, OR: no!
Self-checking circuits
    Error detection is cheaper than correction:
        Execute machine cycle.
        If no errors: latch results, advance state machine.
        Otherwise: restart current cycle.
    Dynamic faults only.
    Non-deterministic execution time.
    Cheaper than in-circuit error correction.
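The execute / check / restart loop above can be sketched as follows. The types, the `max_attempts` bound, and the `flaky_increment` demonstration cycle are all hypothetical additions for this sketch; the slide describes only the control flow.

```c
#include <assert.h>
#include <stdbool.h>

/* Result of one machine cycle: the proposed next state and the
   checker's verdict on that cycle. */
typedef struct {
    int next_state;
    bool error;
} cycle_result;

typedef cycle_result (*cycle_fn)(int state, void *ctx);

/* Execute one machine cycle, restarting it whenever the checker reports
   an error. The new state is latched only after an error-free attempt.
   Returns the number of attempts used, or 0 if max_attempts ran out. */
static int run_cycle(cycle_fn cycle, void *ctx, int *state, int max_attempts) {
    assert(max_attempts > 0);
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        cycle_result r = cycle(*state, ctx);
        if (!r.error) {
            *state = r.next_state;  /* latch results, advance state machine */
            return attempt;
        }
        /* error detected: discard the result and restart the cycle */
    }
    return 0;
}

/* Demonstration cycle: increments the state, but reports a detected
   error on its first *ctx invocations to mimic transient faults. */
static cycle_result flaky_increment(int state, void *ctx) {
    int *faults_left = ctx;
    cycle_result r = { state + 1, false };
    if (*faults_left > 0) {
        (*faults_left)--;
        r.error = true;
    }
    return r;
}
```

The attempt count returned by `run_cycle` makes the non-deterministic execution time visible: the same cycle may take one attempt or several. A permanent defect would fail every attempt, which is why the approach covers dynamic faults only.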
Totally Self-Checking Circuits
    (figure: circuit with inputs Xd, Xc, Yd, Yc and outputs Zd, Zc)
    no fault => legal codeword output
    fault => illegal codeword output
Adder Fault Detection
    Totally self-checking adder: Zd = Xd + Yd
    (figure: an adder “+” on Xd and Yd, a “mod M” block, a unit “C” taking Xc and Yc and producing Zc, and an error checker that asks “different?”)
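One common way to build such an adder is a residue code, in which the check part of every word is its data part mod M. The sketch below assumes that scheme with M = 3; the slide shows a “mod M” block but does not give M or the internals, and the `fault` argument is a hypothetical way to inject an error into the data adder.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

enum { M = 3 };  /* check modulus: an assumption for this sketch */

/* A coded word: data part d and check part c, where c == d mod M. */
typedef struct {
    unsigned d;
    unsigned c;
} coded;

static coded encode(unsigned x) {
    coded v = { x, x % M };
    return v;
}

/* Add the data parts and the check parts in separate units. The fault
   mask is XORed into the data sum to model an error in the data adder. */
static coded checked_add(coded x, coded y, unsigned fault) {
    assert(x.d <= UINT_MAX - y.d);  /* wraparound would break the residue */
    coded z;
    z.d = (x.d + y.d) ^ fault;
    z.c = (x.c + y.c) % M;
    return z;
}

/* Checker: the output is a legal codeword only if the residues agree. */
static bool is_codeword(coded z) {
    return z.d % M == z.c;
}
```

A single flipped bit changes the sum by a power of two, which is never a multiple of 3, so it always produces an illegal codeword. Some multi-bit faults do slip through: flipping bits 1 and 2 of 17 gives 23, which has the same residue.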
Who Checks the Checker?
    Totally self-checking checkers, of course!
    (figure: totally self-checking equality checker with inputs a1, b1, a0, b0 and a 1-out-of-2 output)
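A standard building block for such checkers is the two-rail checker, whose two output wires are complementary (a 1-out-of-2 codeword) only when every input pair is complementary. The sketch below models its logic function and uses it to compare two 2-bit words by pairing each a bit with the complement of the matching b bit. Whether the slide's circuit is wired this way is an assumption, and the code shows only the input/output behavior, not the hardware property that internal faults also produce illegal outputs.

```c
#include <assert.h>
#include <stdbool.h>

/* A two-rail signal: legal (1-out-of-2) when the rails differ. */
typedef struct {
    bool t;
    bool f;
} rail2;

static bool is_valid(rail2 z) {
    return z.t != z.f;
}

/* Two-rail checker: the output is legal only if both inputs are legal. */
static rail2 two_rail_check(rail2 p, rail2 q) {
    rail2 z;
    z.t = (p.t && q.t) || (p.f && q.f);
    z.f = (p.t && q.f) || (p.f && q.t);
    return z;
}

/* Equality check of 2-bit words a and b: the pair (a_i, NOT b_i) is
   complementary exactly when a_i == b_i. */
static rail2 equal2(bool a1, bool a0, bool b1, bool b0) {
    rail2 hi = { a1, !b1 };
    rail2 lo = { a0, !b0 };
    return two_rail_check(hi, lo);
}
```

When the words match, the output is 01 or 10. Any mismatch drives both output rails to the same value, which downstream logic can treat as the error indication.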
Totally Self-Checking Networks
    (figure: function unit feeding a totally self-checking checker, which produces the error indication)
Hybrid Approach
    Self-correcting circuits where critical: nano / micro interface
    Self-checking otherwise
Static defects
    Defect: permanent structural imperfection that can be discovered by testing.
Defect Tolerance: Case Study
    Teramac (1990–1994)
    Logic simulator: 1,000,000 gates, 1 MHz, 2 hour compile time
Teramac
    864 FPGAs: “PLASMA” chip, custom design
    145,000 signals between FPGAs:
        Multi chip module traces
        Printed circuit board traces
        Flat ribbon cables
    We could not afford perfect parts!
    (figure: several of the components are marked “defective”)
Teramac Defects
    Resource      Total        Defective    % defective
    Logic cell    221,000      23,000       10.4 %
    Xbar line     4,880,000    146,000      3.0 %
    Buffer        2,420,000    37,000       1.5 %
    Interchip     145,000      13,800       9.5 %
    Total         7,670,000    220,000      2.9 %
Teramac Defect Handling
    Defects were located with tests.
    Compiler avoided defective resources.
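Crossbar Compilation
    #include <stdbool.h>

    bool function(bool a, bool b, bool c) {
        bool result = (a && b) || c;  /* AB + C */
        return result;
    }
    (figure: crossbar implementing AB + C, with input lines A, B, C and GND and V+ rails)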
But…Defects!
    Broken wires
    “Stuck open”
    “Stuck closed”
Defect Avoidance
    (figure: crossbar with V+ rail)
Resource Allocation
    (figure: a circuit graph with nodes A, B, C, D, “+” a resource graph, “=” the circuit's nodes placed onto the resources)
    Embedding problem (graph monomorphism)
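A much-simplified version of the allocation problem can be sketched in C. It assumes that column positions are fixed and only rows may be permuted, so each logical row must land on a physical row whose stuck-open junctions do not coincide with the junctions it needs. That restricted problem is a bipartite matching, solved here with a textbook augmenting-path search. It stands in for, and is far simpler than, the general embedding problem named on the slide, and it is not the compiler used in the talk. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { MAX_ROWS = 64 };

typedef struct {
    int n_logical, n_physical;
    const uint32_t *need;        /* need[i]: columns logical row i must connect */
    const uint32_t *stuck_open;  /* stuck_open[r]: dead junctions on physical row r */
    int owner[MAX_ROWS];         /* owner[r]: logical row placed on r, or -1 */
    bool seen[MAX_ROWS];         /* rows already visited in the current search */
} allocator;

/* Logical row i fits physical row r if no needed junction is stuck open. */
static bool fits(const allocator *a, int i, int r) {
    return (a->need[i] & a->stuck_open[r]) == 0;
}

/* Try to place logical row i, moving earlier placements if necessary. */
static bool try_place(allocator *a, int i) {
    for (int r = 0; r < a->n_physical; r++) {
        if (!fits(a, i, r) || a->seen[r])
            continue;
        a->seen[r] = true;
        if (a->owner[r] < 0 || try_place(a, a->owner[r])) {
            a->owner[r] = i;
            return true;
        }
    }
    return false;
}

/* Returns true and fills placement[i] with the physical row chosen for
   logical row i if every logical row can be placed. */
static bool allocate_rows(int n_logical, const uint32_t *need,
                          int n_physical, const uint32_t *stuck_open,
                          int *placement) {
    assert(n_physical <= MAX_ROWS);
    allocator a = { n_logical, n_physical, need, stuck_open, {0}, {0} };
    for (int r = 0; r < n_physical; r++)
        a.owner[r] = -1;
    for (int i = 0; i < n_logical; i++) {
        for (int r = 0; r < n_physical; r++)
            a.seen[r] = false;
        if (!try_place(&a, i))
            return false;
    }
    for (int r = 0; r < n_physical; r++)
        if (a.owner[r] >= 0)
            placement[a.owner[r]] = r;
    return true;
}
```

Spare physical rows give the search more freedom, which is the intuition behind trading extra area for tolerance of a higher defect rate.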
Questions
    How do defect rates affect the ability to allocate resources?
    What compilation strategies are best for different defect rates?
Application: written in C
    /* I, E, G, D, H are presumably board-square constants defined
       elsewhere in the application. */
    int game3Response(int moveNumber, int humanMove) {
        int response;
        if (moveNumber == 1)
            response = I;
        else if (moveNumber == 2) {
            if (humanMove == E) response = G;
            else response = E;
        } else if (moveNumber == 3) {
            if (humanMove == D) response = H;
            else response = D;
        }
        /* moves beyond the third are not handled on this slide */
        return response;
    }
Target: diode crossbar
Application compiled onto target
2-level logic
    (plots, one per crossbar size: probability of successful allocation, 0 to 1.0, 20 compiles per point, vs. defective junctions (stuck open), 0% to 30%)
    Crossbar size    Relative area
    28 x 24          1.0
    32 x 28          1.3
    40 x 38          2.3
    48 x 48          3.4
    56 x 58          4.8
    80 x 78          9.2
    96 x 96          13.7
Area = f(defect rate)
    (plot: relative area, 1 to 10, vs. defective junctions (stuck open), 0% to 30%; curve: 2-level)
multi-level logic
    (plots, one per crossbar size: probability of successful allocation, 0 to 1.0, 20 compiles per point, vs. defective junctions (stuck open), 0% to 30%)
    Crossbar size    Relative area
    24 x 32          1.0
    28 x 40          1.5
    32 x 48          2.0
    40 x 56          2.9
    48 x 64          4.0
    56 x 80          5.8
    64 x 96          8.0
    80 x 112         12
Area = f(defect rate)
    (plot: relative area, 1 to 10, vs. defective junctions (stuck open), 0% to 30%; curves: 2-level and multi-level)
What about larger circuits?
4-bit Nanoprocessor
    (plot: relative area, 1.0 to 3.0, vs. % defects, 2 to 20; curves: 4 inputs and 6 inputs)
4-bit Nanoprocessor
    More information: “CMOS-like logic in defective, nanoscale crossbars,” Snider, Kuekes, Williams, Nanotechnology 15, 881-891.
    Can download free for another week at: http://www.iop.org/EJ/abstract/0957-4484/15/8/003
Nano / Micro interface
    Want to access a large number of nanowires with a small number of microwires.
Demultiplexer Interface
Demultiplexer
    (figure: demultiplexer routing In to one of its numbered outputs, selected by address lines A0, A1, A2; examples shown for addresses 0 0 0 and 1 0 1)
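The behavior of a 3-address demultiplexer can be written down in a few lines. The sketch assumes eight outputs numbered 0 to 7 and treats A0 as the least significant address bit; neither detail is legible from the slide.

```c
#include <assert.h>
#include <stdint.h>

/* Route `in` to the output selected by the 3-bit address; every other
   output stays 0. The eight outputs are returned as a bitmask in which
   bit k is output k. */
static uint8_t demux8(unsigned in, unsigned a0, unsigned a1, unsigned a2) {
    assert(in <= 1 && a0 <= 1 && a1 <= 1 && a2 <= 1);
    unsigned address = a0 | (a1 << 1) | (a2 << 2);
    return (uint8_t)(in << address);
}
```

For example, `demux8(1, 0, 0, 0)` drives output 0 and `demux8(1, 1, 0, 1)` drives output 5. Three address lines select among eight outputs, and in general k address lines reach 2^k outputs, which is what lets a few microwires address many nanowires.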
Crossbar Demultiplexers
    …diodes, FETs, resistors can all be used
    Defects!
    (figure: crossbar with address lines A0, A1, A2, each shown twice, an In line, and numbered outputs)
Defect-tolerant Crossbar Demultiplexers
    Error correcting codes
    (figure: crossbar with address lines A0, A1, A2, each shown twice, an In line, and numbered outputs)
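The slide says only that error correcting codes are used, not how they are wired into the demultiplexer. The sketch below illustrates just the property such codes provide: the minimum Hamming distance between the addresses of any two outputs. Plain 3-bit binary addresses differ in as little as one bit, while appending a single even-parity bit (the simplest possible choice, picked here as an assumption) raises that minimum to two.

```c
#include <assert.h>
#include <limits.h>

static unsigned count_ones(unsigned v) {
    unsigned n = 0;
    while (v) {
        n += v & 1u;
        v >>= 1;
    }
    return n;
}

/* Address code for one of 8 outputs: the plain 3-bit address, or the
   address with an even-parity bit appended. */
static unsigned address_code(unsigned addr, int use_parity) {
    assert(addr < 8);
    if (!use_parity)
        return addr;
    return (addr << 1) | (count_ones(addr) & 1u);
}

/* Smallest number of differing bits between any two address codes. */
static unsigned min_distance(int use_parity) {
    unsigned best = UINT_MAX;
    for (unsigned a = 0; a < 8; a++) {
        for (unsigned b = a + 1; b < 8; b++) {
            unsigned d = count_ones(address_code(a, use_parity) ^
                                    address_code(b, use_parity));
            if (d < best)
                best = d;
        }
    }
    return best;
}
```

Here `min_distance(0)` is 1 and `min_distance(1)` is 2. Codes with more check bits push the distance higher at the cost of extra address lines.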
Key Points
    Transient errors
        Error-correcting circuits
            Difficult to do efficiently for some computations
            Necessary in certain cases
            Can handle small number of static defects
        Error-detecting circuits
            General (totally self-checking circuits)
            Reasonably efficient
    Static defects
        Locate with tests
        Avoid in compiler