Presentation Title Greg Snider QSR, Hewlett-Packard Laboratories


1 Nano Architectures II
Greg Snider, QSR, Hewlett-Packard Laboratories

2 Today’s talk
Living in an imperfect world
Quick recap of Wednesday’s talk
Transient faults
  History (von Neumann)
  Approaches (coding theory)
Static defects
  Background (Teramac)
  Empirical studies
Nano / micro interface
DEMO: 4-bit nanoprocessor
November 8, 2018

3 Configurable Tile

4 Tile Types

5 Mosaics

6 1. n-FET / resistor logic
[Circuit diagram: inputs A, B, C; rails GND and V+; output AB + C]

7 2. p-FET / resistor logic
[Circuit diagram: inputs A, B, C; rails V+ and GND; output AB + C]

8 3. n-FET / p-FET logic
[Circuit diagram: rails +V and Ground]

9 One of the first papers on Nanoelectronics
“Probabilistic Logics and the Synthesis of Reliable Organisms from Unreliable Components”
J. von Neumann

11 von Neumann’s Worry
Static Defects: initially bad, wear out
Dynamic Faults: transient, intermittent
[Diagram: a circuit built from parts, some of them faulty]

12 The Solution (in order of desirability)
1. Fewer parts
2. Better parts
3. Redundancy

16 Triple Modular Redundancy (von Neumann)
[Diagram: inputs x and y feed three copies of f (x, y); a majority vote over the three results produces z]
Voter assumed reliable!
Voter small, coarse-grained.
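
The voting scheme on this slide can be sketched in a few lines of Python. This is only an illustration under simple assumptions: outputs are single bits, and the names `majority`, `tmr`, and the `faults` argument are hypothetical, not from the talk.

```python
def majority(a, b, c):
    """Bitwise majority of three bits: 1 when at least two inputs are 1."""
    return (a & b) | (a & c) | (b & c)


def tmr(f, x, y, faults=(0, 0, 0)):
    """Evaluate three copies of f(x, y) and vote on the result.

    faults[i] == 1 models a fault that flips the output of copy i.
    The voter itself is assumed reliable, as on the slide.
    """
    outputs = [f(x, y) ^ fault for fault in faults]
    return majority(*outputs)


def and_gate(x, y):
    return x & y


# One faulty copy is outvoted; two faulty copies defeat the scheme.
print(tmr(and_gate, 1, 1, faults=(0, 1, 0)))
print(tmr(and_gate, 1, 1, faults=(1, 1, 0)))
```

A single flipped copy is masked, while two flipped copies win the vote, which is why the scheme depends on faults being rare and on the voter being trustworthy.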

18 What if voters are flaky?
Probabilistic approach
Each logic signal => “fuzzy” value (0…1)
[Scale from 0.0 to 1.0: false at 0.0, true at 1.0, then “mostly false” and “mostly true” near the two ends, with “failure!” in the middle around 0.5]

25 Parallel Restitution (von Neumann)
1. Replace each wire with “bundle”: x becomes x1 x2 x3 x4, y becomes y1 y2 y3 y4, z becomes z1 z2 z3 z4.
2. Replace function with redundant version, F(x, y): four copies of f (x, y), a random permute stage, and one majority vote per output wire (z1 to z4).
Each signal becomes a bundle of N signals.
Voters can be flaky! => fine-grained.

30 Parallel Restitution
Localized:
Replace each wire with bundle of N wires.
Replicate gates, scramble inputs, add voters.

31 Parallel Restitution
How does it work?
Bundle = stochastic variable (0.0 to 1.0), value is fraction of wires in HI state:
0.0 to .07 is false, .07 to .93 is failure, .93 to 1.0 is true.
Majority gates act as “stochastic amplifiers,” reducing entropy of computation.
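
The “stochastic amplifier” idea can be illustrated with a short Python sketch. It assumes 3-input majority gates and draws each gate's inputs at random from the bundle (with replacement), which only approximates a random permutation; `amplify`, `restore_bundle`, and `gate_error` are illustrative names, not from the talk.

```python
import random


def amplify(p):
    """Probability that a 3-input majority gate outputs HI when each
    input is independently HI with probability p: 3p^2 - 2p^3."""
    return 3 * p**2 - 2 * p**3


def restore_bundle(bundle, rng, gate_error=0.0):
    """Pass a bundle of 0/1 wires through one layer of majority gates.

    Each output wire votes over three randomly chosen input wires;
    with probability gate_error the gate itself flips its output.
    """
    n = len(bundle)
    restored = []
    for _ in range(n):
        a, b, c = (bundle[rng.randrange(n)] for _ in range(3))
        vote = 1 if a + b + c >= 2 else 0
        if rng.random() < gate_error:
            vote ^= 1  # a flaky voter
        restored.append(vote)
    return restored


rng = random.Random(0)
bundle = [1] * 900 + [0] * 100          # 90% of wires HI: "mostly true"
restored = restore_bundle(bundle, rng, gate_error=0.005)
print(sum(bundle) / len(bundle), sum(restored) / len(restored))
```

Fractions above 0.5 are pushed toward 1.0 and fractions below 0.5 toward 0.0, while 0.5 itself is a fixed point, which matches the picture of a failure region in the middle of the scale.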

32 Parallel Restitution
How does it work?
[Diagrams: example circuit annotated with bundle values. Slide 32 shows ideal values of 1; slide 33 shows noisy values such as .98, .94, .95, .99 and .03, .02, .01.]

34 Parallel Restitution. Practical?
Wires in bundle    Probability of failure
1                  5.0 x 10^-3
1,000              2.7 x 10^-2
2,000              2.6 x 10^-3
5,000              4.0 x 10^-6
10,000             1.6 x 10^-10
25,000             1.2 x 10^-23

35 …so now what?
Coding theory to the rescue…
Error detecting codes
Error correcting codes

36 Error Detecting Codes
Sent: 1 1 0 0 1 0 1, where the check bit gives the word an even # of 1’s.
Noisy channel: bit error.
Received: 1 0 0 0 1 0 1 => odd # 1’s => ERROR!
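
The parity check on this slide can be sketched in Python. It assumes the last of the seven bits is the check bit, which the slide does not state; `add_parity` and `parity_ok` are illustrative names.

```python
def add_parity(data_bits):
    """Append an even-parity check bit so the word has an even number of 1's."""
    return data_bits + [sum(data_bits) % 2]


def parity_ok(word):
    """A received word is legal only if it contains an even number of 1's."""
    return sum(word) % 2 == 0


sent = add_parity([1, 1, 0, 0, 1, 0])   # gives 1 1 0 0 1 0 1
received = list(sent)
received[1] ^= 1                         # single bit error in the channel
print(sent, parity_ok(sent))
print(received, parity_ok(received))
```

A single check bit detects any odd number of bit errors but cannot say which bit is wrong, and an even number of errors goes unnoticed.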

39 Error Correcting Codes (ECC)
[Diagram: a word carrying check bits crosses a noisy channel (or noisy circuit!), picks up a bit error, and is repaired by a correction circuit. The bits 0 1 0 appear on both sides of the channel.]
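
The slide does not name a particular code. As one concrete example of check bits plus a correction circuit, here is a sketch of the standard Hamming(7,4) code in Python; choosing this code is an assumption made for illustration, and the function names are hypothetical.

```python
def hamming74_encode(data):
    """Encode 4 data bits as a 7-bit codeword [p1, p2, d1, p4, d2, d3, d4]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4      # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4      # covers codeword positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4      # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(word):
    """Correct up to one flipped bit and return the 4 data bits."""
    c = list(word)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4   # 0 means no error detected
    if position:
        c[position - 1] ^= 1          # the "correction circuit"
    return [c[2], c[4], c[5], c[6]]


codeword = hamming74_encode([1, 0, 1, 1])
corrupted = list(codeword)
corrupted[4] ^= 1                      # bit error in the noisy channel
print(hamming74_decode(corrupted))
```

Three check bits protect four data bits against any single bit error, at the cost of a decoder that must itself be built from reliable (or protected) parts.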

42 Self-correcting circuits
[Diagram: inputs in 1 and in 2 are each encoded with an error correcting code, processed by f and g, combined by h, and decoded to out]
von Neumann’s approach, BUT his error correcting code was very inefficient (repetition code).
More efficient codes?
Memories: yes
Add, sub, shift: yes
AND, OR: no!

44 Self-checking circuits
Error detection is cheaper than correction:
Execute machine cycle.
If no errors: latch results, advance state machine.
Otherwise restart current cycle.
Dynamic faults only.
Non-deterministic execution time.
Cheaper than in-circuit error correction.
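
The execute / check / restart loop above can be sketched as follows. The example is a toy model under stated assumptions: the "machine cycle" is an 8-bit increment, the error detecting code is a single parity bit, and a transient fault flips one result bit; all names and the 30% fault rate are hypothetical.

```python
import random


def noisy_increment(value, rng, fault_rate=0.3, width=8):
    """One machine cycle: compute value + 1 with a parity check bit.

    With probability fault_rate a transient fault flips one result bit
    after the check bit has been formed.
    """
    result = (value + 1) % (1 << width)
    check = bin(result).count("1") % 2
    if rng.random() < fault_rate:
        result ^= 1 << rng.randrange(width)
    return result, check


def parity_ok(result, check):
    """Error detector: the result must agree with its parity check bit."""
    return bin(result).count("1") % 2 == check


def run_cycle(value, rng, max_retries=1000):
    """Repeat the cycle until the checker accepts, then latch the result."""
    for attempt in range(1, max_retries + 1):
        result, check = noisy_increment(value, rng)
        if parity_ok(result, check):
            return result, attempt      # latch results, advance state
    raise RuntimeError("fault persists; retrying only helps with dynamic faults")


rng = random.Random(1)
state, total_attempts = 0, 0
for _ in range(50):
    state, attempts = run_cycle(state, rng)
    total_attempts += attempts
print(state, total_attempts)
```

The final state is always correct, but the number of attempts varies from run to run, which is the non-deterministic execution time mentioned on the slide.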

45 Totally Self-Checking Circuits
[Diagram: circuit with inputs Xd, Xc, Yd, Yc and outputs Zd, Zc]
no fault => legal codeword output
fault => illegal codeword output

46 Adder Fault Detection
[Diagram: totally self-checking adder. Data path: Zd = Xd + Yd. Check path: Xc and Yc are added mod M to give Zc. Block C derives a check value from Zd, and an error checker flags an error if it differs from Zc.]
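
One common way to realize this kind of adder check is a residue code, where each check symbol is the data value mod M. The Python sketch below assumes that reading of the diagram and picks M = 3; the `fault` argument, which flips bits of the data sum, is purely illustrative.

```python
def residue(value, m=3):
    """Check symbol for a data word under a mod-m residue code."""
    return value % m


def checked_add(xd, xc, yd, yc, m=3, fault=0):
    """Add data words and check symbols separately, then compare.

    fault is XORed into the data sum to model a faulty adder.
    Returns (zd, zc, error_detected).
    """
    zd = (xd + yd) ^ fault            # data adder, possibly faulty
    zc = (xc + yc) % m                # check adder, mod m
    error = residue(zd, m) != zc      # recompute the check from zd and compare
    return zd, zc, error


x, y = 5, 9
print(checked_add(x, residue(x), y, residue(y)))
print(checked_add(x, residue(x), y, residue(y), fault=1))
```

With M = 3 every single-bit flip in the sum is caught, because a flip changes the value by a power of two and no power of two is divisible by 3.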

47 Who Checks the Checker?
Totally self-checking checkers, of course!
[Diagram: totally self-checking equality checker with inputs a and b and a 1-out-of-2 output]
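
The slide does not show the gate-level design of the checker. A textbook construction with a 1-out-of-2 output is the two-rail checker, sketched here in Python as a logic model only; treating the slide's checker as a two-rail checker fed with a and the complement of b is an assumption.

```python
def two_rail_cell(x, y):
    """Merge two rail pairs into one.

    The output pair is complementary, (0, 1) or (1, 0), exactly when
    both input pairs are complementary.
    """
    (x0, x1), (y0, y1) = x, y
    return ((x0 & y0) | (x1 & y1), (x0 & y1) | (x1 & y0))


def equality_checker(a_bits, b_bits_complemented):
    """Pair each a_i with the complement of b_i and reduce with two-rail cells.

    When a == b every pair is complementary, so the output is a legal
    1-out-of-2 codeword; any mismatch yields (0, 0) or (1, 1).
    """
    pairs = list(zip(a_bits, b_bits_complemented))
    out = pairs[0]
    for pair in pairs[1:]:
        out = two_rail_cell(out, pair)
    return out


def is_one_out_of_two(pair):
    return pair[0] != pair[1]


a = [1, 0, 1]
b = [1, 0, 1]
print(equality_checker(a, [1 - bit for bit in b]))
print(equality_checker(a, [1 - bit for bit in [1, 1, 1]]))
```

Because the output is a two-wire code rather than a single "ok" wire, a checker output stuck at 0 or 1 eventually shows up as an illegal word instead of silently reporting success.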

48 Totally Self-Checking Networks
[Diagram: function unit feeding a totally self-checking checker, which produces an error indication]

49 Hybrid Approach
Self-correcting circuits where critical:
Nano / micro interface
Self-checking otherwise

50 Static defects
Defect: Permanent structural imperfection that can be discovered by testing.

51 Defect Tolerance: Case Study
Teramac (1990-1994)
Logic Simulator:
1,000,000 gates
1 MHz
2 hour compile time

52 Teramac
864 FPGAs: “PLASMA” chip, custom design => defective
145,000 signals between FPGAs:
Multi chip module traces => defective
Printed circuit board traces => defective
Flat ribbon cables => defective
We could not afford perfect parts!

58 Teramac Defects
Resource      Total        Defective    % defective
Logic cell    221,000      23,000       10.4 %
Xbar line     4,880,000    146,000      3.0 %
Buffer        2,420,000    37,000       1.5 %
Interchip     145,000      13,800       9.5 %
Total         7,670,000    220,000      2.9 %

59 Teramac Defect Handling
Defects were located with tests.
Compiler avoided defective resources.

60 Crossbar Compilation
bool function(bool a, bool b, bool c) {
    bool result = (a && b) || c;  /* AB + C */
    return result;
}
[Diagram: crossbar with inputs A, B, C, rails GND and V+, and output AB + C]

61 But…Defects!

62 Defects
broken wires

63 Defects
“stuck open”

64 Defects
“stuck closed”

65 Defect Avoidance

66 Resource Allocation
Embedding problem (graph monomorphism)
[Diagram: circuit graph with nodes A, B, C, D + crossbar with defects = circuit placed on the working resources]
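
To make the embedding problem concrete, here is a small Python sketch that places a logical crossbar configuration onto a physical crossbar with stuck-open junctions. It is a brute-force search suitable only for tiny examples, not the compiler strategy used in the talk; `embed_on_crossbar`, `required`, and `working` are hypothetical names.

```python
from itertools import permutations


def embed_on_crossbar(required, working):
    """Find an injective row and column assignment that avoids defects.

    required: set of (logical_row, logical_col) junctions the circuit needs.
    working:  2-D list of booleans; working[r][c] is True when physical
              junction (r, c) is usable (not stuck open).
    Returns (row_map, col_map), or None if no embedding exists.
    """
    n_logical_rows = 1 + max(r for r, _ in required)
    n_logical_cols = 1 + max(c for _, c in required)
    n_rows, n_cols = len(working), len(working[0])
    for rows in permutations(range(n_rows), n_logical_rows):
        for cols in permutations(range(n_cols), n_logical_cols):
            if all(working[rows[r]][cols[c]] for r, c in required):
                return list(rows), list(cols)
    return None


required = {(0, 0), (0, 1), (1, 1)}
working = [
    [True, False, True],
    [True, True, True],
    [False, True, True],
]
print(embed_on_crossbar(required, working))
```

Spare rows and columns give the search room to route around defects, which is why the area of a successful allocation grows with the defect rate.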

68 Questions
How do defect rates affect ability to allocate resources?
What compilation strategies are best for different defect rates?

69 Application: written in C
/* D, E, G, H, I are constants not shown on the slide. */
int game3Response(int moveNumber, int humanMove) {
    int response;
    if (moveNumber == 1)
        response = I;
    else if (moveNumber == 2) {
        if (humanMove == E)
            response = G;
        else
            response = E;
    } else if (moveNumber == 3) {
        if (humanMove == D)
            response = H;
        else
            response = D;
    }
    return response;
}

70 Target: diode crossbar

71 Application compiled onto target

72 2-level logic
[Plots, one per crossbar size: probability of successful allocation (0 to 1.0, 20 compiles per point) versus defective junctions (stuck open)]
Crossbar size    Relative area
28 x 24          1.0
32 x 28          1.3
40 x 38          2.3
48 x 48          3.4
56 x 58          4.8
80 x 78          9.2
96 x 96          13.7
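
The relative areas quoted on these slides appear to be the junction count of each crossbar divided by that of the smallest one (28 x 24). A short Python sketch of that arithmetic, under that assumption; `relative_area` is an illustrative name.

```python
def relative_area(rows, cols, base=(28, 24)):
    """Crossbar area relative to the baseline crossbar, measured in junctions."""
    return (rows * cols) / (base[0] * base[1])


sizes = [(28, 24), (32, 28), (40, 38), (48, 48), (56, 58), (80, 78), (96, 96)]
for rows, cols in sizes:
    print(f"{rows} x {cols}: {relative_area(rows, cols):.2f}")
```

The computed ratios agree with the values on the slides to within about 0.1.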

79 Area = f(defect rate)
[Plot: relative area (1 to 10) versus defective junctions, stuck open (0% to 30%), for 2-level logic]

80 multi-level logic
[Plots, one per crossbar size: probability of successful allocation (0 to 1.0, 20 compiles per point) versus defective junctions (stuck open)]
Crossbar size    Relative area
24 x 32          1.0
28 x 40          1.5
32 x 48          2.0
40 x 56          2.9
48 x 64          4.0
56 x 80          5.8
64 x 96          8.0
80 x 112         12

88 Area = f(defect rate)
[Plot: relative area (1 to 10) versus defective junctions, stuck open (0% to 30%), with curves for 2-level and multi-level logic]

89 What about larger circuits?

90 4-bit Nanoprocessor
[Plot: relative area (1.0 to 3.0) versus % defects (2 to 20), with curves for 4 inputs and 6 inputs]

91 4-bit Nanoprocessor
More information:
“CMOS-like logic in defective, nanoscale crossbars,” Snider, Kuekes, Williams, Nanotechnology 15,
Can download free for another week at:

92 Nano / Micro interface
Want to access large number of nanowires with small number of microwires.

93 Demultiplexer Interface

97 Demultiplexer
[Diagram: demultiplexer with data input In, address inputs A0, A1, A2, and numbered output lines]

100 Crossbar Demultiplexers
…diodes, FETs, resistors can all be used
Defects!
[Diagram: crossbar demultiplexer with data input In, address lines A0, A1, A2 (each shown twice), and numbered output lines]
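
A logic-level Python sketch of a crossbar demultiplexer, assuming each address bit is driven as a true line and a complement line, and that output k is wired to the true line where bit i of k is 1 and to the complement line otherwise. The device physics (diodes, FETs, resistors) is abstracted into an AND over the connected lines, and `crossbar_demux` is an illustrative name.

```python
def crossbar_demux(address, n_bits, data_in=1):
    """Return the 2**n_bits output values for the given address.

    Only the addressed output carries data_in; all others stay at 0.
    """
    # Drive every address line and its complement.
    lines = {}
    for i in range(n_bits):
        bit = (address >> i) & 1
        lines[(i, 1)] = bit          # true line A_i
        lines[(i, 0)] = 1 - bit      # complement line
    outputs = []
    for k in range(2 ** n_bits):
        # Output k is selected when every line it connects to is HI.
        selected = all(lines[(i, (k >> i) & 1)] for i in range(n_bits))
        outputs.append(data_in if selected else 0)
    return outputs


print(crossbar_demux(5, 3))
```

With n address bits the interface reaches 2^n output wires, which is how a small number of microwires can address a large number of nanowires.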

103 Defect-tolerant Crossbar Demultiplexers
Error correcting codes
[Diagram: crossbar demultiplexer with data input In, address lines A0, A1, A2 (each shown twice), and numbered output lines]

105 Key Points
Transient errors
  Error-correcting circuits
    Difficult to do efficiently for some computations
    Necessary in certain cases
    Can handle small number of static defects
  Error-detecting circuits
    General (totally self-checking circuits)
    Reasonably efficient
Static defects
  Locate with tests
  Avoid in compiler
