Embedded Systems Laboratory Informatics Institute Federal University of Rio Grande do Sul Porto Alegre – RS – Brazil SRC TechCon 2005 Portland, Oregon,


1 Embedded Systems Laboratory, Informatics Institute, Federal University of Rio Grande do Sul, Porto Alegre – RS – Brazil. SRC TechCon 2005, Portland, Oregon, USA. Dealing with Multiple Simultaneous Faults in Future Technologies. Carlos A. L. Lisbôa, Erik Schüler, Luigi Carro

2 Carlos A. L. Lisbôa - SRC TechCon 2005 - October 26, 2005 - Paper # 20.4. Why Multiple Simultaneous Faults? Future technologies (2010 and beyond): very small transistors and fewer electrons to form the channel (SETs); transient pulses due to radiation attack will last longer than the propagation delays of gates; devices will be more sensitive to the effects of electromagnetic noise, neutrons and alpha particles.

3 Single Event Upset Origin [Figure: bit-value illustration of a single event upset]

4 Why Should One Study Multiple Faults? Change in paradigm: gates will behave statistically, producing correct outputs only a fraction of the time.

5 How to Deal with Multiple Faults? New paradigm: multiple simultaneous faults. New fault tolerance techniques will be required (TMR, triple modular redundancy, will no longer provide enough protection).

6 How to Deal with Multiple Faults? New paradigm: multiple simultaneous faults. New fault tolerance techniques will be required (TMR will no longer provide enough protection). How to deal with this problem? New materials and manufacturing technologies must be developed, OR new design approaches must be taken.

7 How to Deal with Multiple Faults? New paradigm: multiple simultaneous faults. New fault tolerance techniques will be required (TMR will no longer provide enough protection). How to deal with this problem? New design approaches must be taken (our bet!).

8 Research Approaches: use of stochastic operators; use of bit stream operators; ensuring voter reliability to use n-MR while dealing with multiple simultaneous faults. Next steps: 2005 - 2007 time frame.

9 Research Evolution (diagram labels, in transcript order): OK for some DSP applications; looking for more speed; Stochastic Operators; small footprint and fast; tolerant to multiple faults in n-MR solutions; Analog Voter; Bit Stream Operators; looking for tolerant converter.

10 Using Stochastic Operators. SEU-induced transient errors are of random nature.

11 Using Stochastic Operators. SEU-induced transient errors are of random nature. Stochastic operators rely on randomness to produce approximate results.

12 Using Stochastic Operators. SEU-induced transient errors are of random nature. Stochastic operators rely on randomness to produce approximate results. The injection of random faults in the input signals processed by stochastic operators did not impact the precision of the results. % errors in 1,000 additions. Stochastic adder: 0.1412 (0 faults), 0.2580 (2 faults), 0.1768 (4 faults), 0.2196 (8 faults). Conventional adder: 0.0000.

13 Using Stochastic Operators. SEU-induced transient errors are of random nature. Stochastic operators rely on randomness to produce approximate results. The injection of random faults in the input signals processed by stochastic operators did not impact the precision of the results. Several application areas (DSP) can deal with approximate values and still produce acceptable results (outputs).

14 Using Stochastic Operators. Benefit: reduced area of the operators. [Figure: stochastic multiplier circuit with bit streams 1000100110011010, 1001000100001011 and 1000000100001010] [Figure: stochastic adder circuit with labels S1, S2, S3 and Sum; extracted bit streams 01100010101, 010111011001, 01010101101, 0010100110101]
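
The three multiplier streams on slide 14 are consistent with a single AND gate: the third stream is the bitwise AND of the first two. The Python sketch below illustrates that reading. It assumes the usual unipolar convention in which a stream encodes the fraction of its bits that are 1, and it models the adder as a multiplexer that yields the scaled sum (a + b) / 2, which is a common textbook construction and not something the slides confirm. All function names are hypothetical.

```python
import random


def stream_value(bits: str) -> float:
    """Value encoded by a stream: fraction of 1 bits (assumed unipolar coding)."""
    return bits.count("1") / len(bits)


def stochastic_multiply(a: str, b: str) -> str:
    """Bitwise AND of two equal-length streams (one AND gate in hardware)."""
    if len(a) != len(b):
        raise ValueError("streams must have the same length")
    return "".join("1" if x == "1" and y == "1" else "0" for x, y in zip(a, b))


def stochastic_add(a: str, b: str, select: str) -> str:
    """Multiplexer model (assumption): take the bit from a when select is 0,
    from b when select is 1. With a select stream of value 0.5 the output
    approximates (a + b) / 2."""
    if not (len(a) == len(b) == len(select)):
        raise ValueError("streams must have the same length")
    return "".join(y if s == "1" else x for x, y, s in zip(a, b, select))


def encode(p: float, n: int, rng: random.Random) -> str:
    """Random stream of n bits, each 1 with probability p."""
    return "".join("1" if rng.random() < p else "0" for _ in range(n))


# Streams taken from the multiplier figure on slide 14.
A = "1000100110011010"
B = "1001000100001011"
product = stochastic_multiply(A, B)
print(product, stream_value(product))

# Longer random streams give a closer approximation of the exact product.
rng = random.Random(0)
x, y = encode(0.5, 4096, rng), encode(0.25, 4096, rng)
print(stream_value(stochastic_multiply(x, y)))
```

With only 16 bits the result is coarse: the slide streams encode 7/16 and 6/16, and the AND output encodes 4/16 rather than the exact 42/256. This is the approximate behaviour the slides describe.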

15 Using Stochastic Operators. How does it work? Come and see the posters! No free drinks, but the answer to this question is granted!

16 Using Bit Stream Operators. Computation principles similar to those of the stochastic adder and multiplier.

17 Using Bit Stream Operators. Computation principles similar to those of the stochastic adder and multiplier. Operators can produce bit streams which represent the exact results of the operation. Proposed multiplication algorithm: bit stream product (the count of 1's in the stream is equal to the product value). [Figure: F1 = (F1_2, F1_1, F1_0) x F2 = (F2_2, F2_1, F2_0); partial products F2_j.F1_i for i, j in {0, 1, 2}; bit stream segments b48..b33, b32..b17, b16..b5, b4..b1, b0]
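
The segment sizes on slide 17 (16, 16, 12, 4 and 1 bits, 49 in total) match a scheme in which each partial product F1_i AND F2_j is repeated 2^(i+j) times, so that the number of 1's equals F1 x F2. The Python sketch below follows that reading of the figure. It is an interpretation of the layout, not the authors' circuit, and the function name is hypothetical.

```python
def bit_stream_product(f1: int, f2: int, width: int = 3) -> str:
    """Return a bit stream whose count of 1's equals f1 * f2.

    Assumed scheme: for every pair of operand bits (i, j), the partial
    product F1_i AND F2_j is repeated 2**(i + j) times. Segments are
    emitted from the heaviest weight down to weight 0.
    """
    if not (0 <= f1 < 2 ** width and 0 <= f2 < 2 ** width):
        raise ValueError("operands must fit in `width` bits")
    segments = []
    for weight in range(2 * width - 2, -1, -1):
        for i in range(width):
            j = weight - i
            if 0 <= j < width:
                bit = ((f1 >> i) & 1) & ((f2 >> j) & 1)
                segments.append(str(bit) * (1 << weight))
    return "".join(segments)


stream = bit_stream_product(5, 6)
print(len(stream), stream.count("1"))
```

For 3-bit operands the stream has (1 + 2 + 4)^2 = 49 bits, which is also the largest possible product (7 x 7).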

18 Using Bit Stream Operators. Computation principles similar to those of the stochastic adder and multiplier. Operators can produce bit streams which represent the exact results of the operation. Redundancy is added to the bit streams in order to withstand multiple bit flips. Adding robustness to the bit stream through redundancy: each bit b48, b47, ..., b0 is repeated 8 times, plus extra bits 1 1 1 1 0 0 0 (+4); total count of 1's = 8 * product + 4.
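
A minimal Python sketch of the redundancy idea on slide 18, under two assumptions that the slide does not state: the four extra 1's are simply appended to the stream, and the receiver recovers the product by integer division of the count by 8. With those assumptions the count sits in the middle of an 8-wide window, so up to 3 bit flips in any direction leave the decoded value unchanged. Function names are hypothetical.

```python
def add_redundancy(stream: str, repeat: int = 8) -> str:
    """Repeat every bit `repeat` times and append repeat // 2 ones,
    giving count_of_ones = repeat * product + repeat // 2."""
    return "".join(bit * repeat for bit in stream) + "1" * (repeat // 2)


def decode_count(redundant: str, repeat: int = 8) -> int:
    """Recover the product from the count of 1's (assumed decoding rule)."""
    return redundant.count("1") // repeat


def flip_bits(stream: str, positions: list[int]) -> str:
    """Invert the bits at the given positions to emulate transient faults."""
    bits = list(stream)
    for p in positions:
        bits[p] = "0" if bits[p] == "1" else "1"
    return "".join(bits)


# A 49-bit product stream with six 1's, i.e. a product value of 6.
product_stream = "1" * 6 + "0" * 43
protected = add_redundancy(product_stream)
print(protected.count("1"), decode_count(protected))

# Three flipped bits still decode to the same product.
print(decode_count(flip_bits(protected, [0, 1, 2])))
print(decode_count(flip_bits(protected, [100, 101, 102])))
```

Four flips in the same direction can exceed the margin (for example, four 0-to-1 flips raise the count to 8 * product + 8), which is one reason the stream size remains an open question.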

19 Using Bit Stream Operators. Computation principles similar to those of the stochastic adder and multiplier. Operators can produce bit streams which represent the exact results of the operation. Redundancy is added to the bit streams in order to withstand multiple bit flips. Conversion of bit streams to binary coded values is delayed as much as possible, and conversion circuits must use TMR or n-MR for protection against faults.

20 Using Bit Stream Operators. Computation principles similar to those of the stochastic adder and multiplier. Operators can produce bit streams which represent the exact results of the operation. Redundancy is added to the bit streams in order to withstand multiple bit flips. Conversion of bit streams to binary coded values is delayed as much as possible, and conversion circuits must use TMR or n-MR for protection against faults. Issues to be further investigated: size of bit streams and area of the conversion circuits.

21 Using Bit Stream Operators. How does it work? Come and see the posters! No free food, but some more info on this subject will be provided!

22 What is Wrong with TMR? TMR protects only against single faults in one of the modules. [Diagram: Module 1, Module 2, Module 3: correct output; VOTER: correct output]

23 What is Wrong with TMR? TMR protects only against single faults in one of the modules. [Diagram: Module 1 and Module 3: correct output; Module 2: wrong output; VOTER]

24 What is Wrong with TMR? TMR does not protect against double faults in different modules. [Diagram: Module 1 and Module 3: wrong output; Module 2: correct output; VOTER]
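
The single-fault and double-fault cases on slides 22 to 24 can be reproduced with a bitwise 2-of-3 majority function, which is the standard digital TMR voter. The Python sketch below is an illustration with hypothetical names and values, not the circuit from the slides.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority: each output bit follows at least two inputs."""
    return (a & b) | (a & c) | (b & c)


correct = 0b1011          # fault-free module output (arbitrary example value)
faulty = correct ^ 0b0001  # same value with bit 0 flipped by a transient fault

# Single fault in module 2: the two healthy modules outvote it.
single = majority_vote(correct, faulty, correct)

# Faults in modules 1 and 3 on the same bit: the faulty value wins.
double = majority_vote(faulty, correct, faulty)

print(bin(single), bin(double))
```

The voter itself is unprotected in this model, so a fault inside `majority_vote` would corrupt the output even when all three modules are correct.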

25 What is Wrong with TMR? When a single fault occurs in the voter circuit, the voter output may be wrong. [Diagram: Module 1, Module 2, Module 3: correct output; VOTER: correct output]

26 What is Wrong with TMR? When a single fault occurs in the voter circuit, the voter output may be wrong. [Diagram: Module 1, Module 2, Module 3: correct output; VOTER: correct output?]

27 Making TMR (n-MR) more reliable. Known solutions imply area, performance and/or power penalties. Deadlock: how to protect the output generator?

28 Making TMR (n-MR) more reliable. Known solutions imply area, performance and/or power penalties. Deadlock: how to protect the output generator? Proposed solution: use TMR to cope with single faults in the modules.

29 Making TMR (n-MR) more reliable. Known solutions imply area, performance and/or power penalties. Deadlock: how to protect the output generator? Proposed solution: use TMR to cope with single faults in the modules; replace the digital voter by an analog voter that uses a comparator to generate the output.

30 Making TMR (n-MR) more reliable. Known solutions imply area, performance and/or power penalties. Deadlock: how to protect the output generator? Proposed solution: use TMR to cope with single faults in the modules; replace the digital voter by an analog voter that uses a comparator to generate the output; the comparator can tolerate some noise and still produce the correct result.

31 The Analog Voter

32 Injection of faults in the comparator (*). Minimum Area Comparator. (*) using CMOS 0.35 µm

33 Electrical Simulation: Multiple Faults (SPICE and CMOS 0.35 µm)

34 Dealing with Multiple Simultaneous Faults: n-MR. The Analog Voter with 5 Inputs (for 5-MR)

35 Dealing with Multiple Simultaneous Faults: n-MR. The Analog Voter with 5 Inputs (for 5-MR). Simulations with injection of 2 simultaneous faults also succeeded.
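
The slides do not show the voter circuit, so the Python sketch below is only a behavioural guess at how a comparator-based voter could tolerate two faulty inputs out of five. It assumes the module outputs are averaged and the average is compared against half the supply voltage. The averaging step, the 3.3 V supply and all names are assumptions, not details from the presentation.

```python
VDD = 3.3  # assumed supply voltage in volts; not stated on the slides


def analog_vote(voltages: list[float], vdd: float = VDD) -> int:
    """Behavioural model (assumption): average the module output voltages
    and compare the result against vdd / 2, as a comparator would."""
    mean = sum(voltages) / len(voltages)
    return 1 if mean > vdd / 2 else 0


# 5-MR, correct output is logic 1, two modules faulty (stuck near 0 V),
# and the healthy modules carry some noise.
noisy_two_faults = [3.0, 3.5, 3.1, 0.2, 0.0]
print(analog_vote(noisy_two_faults))

# Correct output is logic 0 with two modules faulty (stuck at VDD).
print(analog_vote([0.0, 0.0, 0.0, 3.3, 3.3]))
```

In this model the decision depends only on the comparison threshold, so small voltage disturbances on the inputs do not change the result as long as the average stays on the correct side of vdd / 2.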

36 The Analog Voter... Oops! Does this work?

37 The Analog Voter. Let's see the posters!

38 Future Work - Short Term (2005-2006): use of signal redundancy with other number representation forms, such as Sigma-Delta.

39 Future Work - Short Term (2005-2006): use of signal redundancy with other number representation forms, such as Sigma-Delta; use of the analog voter as an efficient way to implement robust n-MR circuits.

40 Future Work - Short Term (2005-2006): use of signal redundancy with other number representation forms, such as Sigma-Delta; use of the analog voter as an efficient way to implement robust n-MR circuits; investigate the application of statistical methods and neural networks to the design of fault tolerant circuits with minimum redundancy.

41 Future Work - Long Term (2006-2007): use of logic properties to develop signal redundancy with low cost.

42 Future Work - Long Term (2006-2007): use of logic properties to develop signal redundancy with low cost; apply the developed techniques to actual processors with DSP and VLIW architectures.

43 Future Work - Long Term (2006-2007): use of logic properties to develop signal redundancy with low cost; apply the developed techniques to actual processors with DSP and VLIW architectures; discuss the architectural impact of new technologies together with fault tolerance.

44 Research Evolution [Timeline: previous work (2004-2005), 2005, 2006, 2007]: Stochastic Operators, Analog Voter, Bit Stream Operators.

45 Research Evolution [Timeline: previous work (2004-2005), 2005, 2006, 2007]: Stochastic Operators, Analog Voter, Bit Stream Operators, Sigma Delta.

46 Research Evolution [Timeline: previous work (2004-2005), 2005, 2006, 2007]: Stochastic Operators, Analog Voter, Bit Stream Operators, Sigma Delta, Logic Properties.

47 Research Evolution [Timeline: previous work (2004-2005), 2005, 2006, 2007]: Stochastic Operators, Analog Voter, Bit Stream Operators, Sigma Delta, Logic Properties. Low cost redundancy.

48 Research Evolution [Timeline: previous work (2004-2005), 2005, 2006, 2007]: Stochastic Operators, Analog Voter, Bit Stream Operators, Sigma Delta, Logic Properties, DSP / VLIW. Low cost redundancy. Application to actual DSP and VLIW processors.

49 Questions? Looking forward to answering them at the poster booth (# 20.4)! No free anything, but a nice chat about these matters will be a pleasure! Contact: calisboa@inf.ufrgs.br. Thank you!

