Published by Clifton Ellis. Modified over 9 years ago.
Error Detection in Hardware
VO Hardware-Software-Codesign
Philipp Jahn
6.6.2007, Error Detection in Hardware

Error detection
- How to detect errors with hardware methods during system operation
- Conditions
  - Coverage (probability that an error is detected)
  - Latency (time between the start of an error and its detection)
  - Performance
- Slide from the lecture (VO) "Echtzeitsysteme" (Real-Time Systems), H. Kopetz
Hardware-based error detection
- Hardware redundancy
  - Passive (TMR, majority voting)
  - Active (duplication and comparison, standby)
  - Hybrid
- Information redundancy
  - Parity
  - Checksums
  - Arithmetic codes
- Time redundancy
- Watchdog timers
- Checking
  - Capability checking
  - Consistency checking
  - Control-flow checking
Information redundancy (1)
- Detection / Correction
- Hamming distance
  - X = (1001), Y = (0111)
  - d(X,Y) = 3
- SEC-DED (single-error correction, double-error detection)
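
The Hamming distance from the slide can be checked with a few lines of Python. This is only an illustrative sketch; the function name `hamming_distance` and the use of bit strings are assumptions made for the example, not part of the original slides.

```python
def hamming_distance(x: str, y: str) -> int:
    """Count the bit positions in which two equal-length code words differ."""
    if len(x) != len(y):
        raise ValueError("code words must have equal length")
    return sum(a != b for a, b in zip(x, y))


# Values taken from the slide: X = (1001), Y = (0111)
print(hamming_distance("1001", "0111"))  # 3
```

A code whose minimum distance is d can detect up to d-1 bit errors, which is why a larger distance between valid code words buys more detection capability.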
Information redundancy (2)
- Parity
  - One extra bit (even / odd)
  - Decoding circuit (set of XOR gates)
  - Routine checking in buses, memory and registers
  - Detects single-bit errors (no stuck-at faults)
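
A minimal software model of the parity scheme, assuming data words are given as lists of 0/1 bits. The names `parity_bit` and `parity_ok` are hypothetical; in hardware the same XOR reduction would be a tree of XOR gates.

```python
from functools import reduce
from operator import xor


def parity_bit(data_bits, even=True):
    """Return the extra bit that makes the total number of 1s even (or odd)."""
    p = reduce(xor, data_bits, 0)
    return p if even else p ^ 1


def parity_ok(word_bits, even=True):
    """Check a word that already includes its parity bit."""
    return reduce(xor, word_bits, 0) == (0 if even else 1)


data = [1, 0, 1, 1]
word = data + [parity_bit(data)]
print(parity_ok(word))       # True: no error

word[2] ^= 1                 # inject a single-bit error
print(parity_ok(word))       # False: detected

word[0] ^= 1                 # inject a second error
print(parity_ok(word))       # True: a double error goes undetected
```

The last case shows the limit of a single parity bit: any even number of flipped bits restores the expected parity.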
Information redundancy (3)
- Overlapping parity
- m-of-n codes
- Duplication codes
- Cyclic redundancy checks (CRC)
  - Sender and receiver agree upon a generator polynomial G(x)
  - The sender appends a checksum (n-k bits) to the end of the data frame (k bits)
  - The receiver divides the received frame by G(x); a remainder of 0 means the frame is accepted as correct
  - Simple implementation (linear feedback shift register and XOR gates)
  - Detects single-bit errors, multiple adjacent bit errors affecting fewer than n-k bits, and burst transient errors
  - Highly successful in serial transmission (communication channels: Ethernet, Token Ring)
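
The CRC procedure can be sketched as modulo-2 polynomial division on bit lists. The helper names `crc_remainder` and `crc_encode`, the example frame, and the generator x^4 + x + 1 are assumptions chosen for illustration; a hardware implementation would use a linear feedback shift register instead of this loop.

```python
def crc_remainder(bits, generator):
    """Remainder of bits divided by generator (both MSB first), modulo 2."""
    work = list(bits)
    r = len(generator) - 1              # number of checksum bits
    for i in range(len(work) - r):
        if work[i]:                     # leading 1: subtract (XOR) the generator
            for j, g in enumerate(generator):
                work[i + j] ^= g
    return work[-r:]


def crc_encode(data, generator):
    """Append the checksum so that the whole frame is divisible by the generator."""
    r = len(generator) - 1
    return data + crc_remainder(data + [0] * r, generator)


G = [1, 0, 0, 1, 1]                     # x^4 + x + 1 (example generator)
data = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]

frame = crc_encode(data, G)
print(frame[-4:])                       # [1, 1, 1, 0]
print(crc_remainder(frame, G))          # [0, 0, 0, 0]: accepted

frame[3] ^= 1                           # single-bit transmission error
print(any(crc_remainder(frame, G)))     # True: nonzero remainder, error detected
```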
Information redundancy (4)
- Checksums
Information redundancy (5)
- Arithmetic codes
  - Detect errors in arithmetic units (parity would not be preserved)
  - Separate or nonseparate
  - Examples
    - AN codes
    - Residue codes
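
The two example codes can be illustrated with small integers. This sketch assumes the check constant A = 3 and modulus 3, and models a fault as a single flipped bit in the adder output; the function names and the `fault` parameter are hypothetical.

```python
A = 3  # check constant for the AN code (assumed for this example)


def an_encode(x):
    """Nonseparate code: the data word itself is multiplied by A."""
    return A * x


def an_valid(code_word):
    """A code word is valid only if it is a multiple of A."""
    return code_word % A == 0


def residue_add(x, y, modulus=3, fault=0):
    """Separate code: the main adder and a small residue checker run side by side.

    fault is a bit mask XORed onto the adder output to model a hardware error.
    """
    total = (x + y) ^ fault
    expected_residue = (x % modulus + y % modulus) % modulus
    return total, total % modulus == expected_residue


# AN code: addition preserves the code, so the sum of two code words is valid.
s = an_encode(25) + an_encode(17)
print(an_valid(s), s // A)          # True 42
print(an_valid(s ^ 8))              # False: a flipped bit changes s by a power of two

# Residue code: the checker flags the corrupted sum.
print(residue_add(25, 17))          # (42, True)
print(residue_add(25, 17, fault=4)) # (46, False)
```

With A = 3, any single-bit flip changes the value by a power of two, which is never a multiple of 3, so every such error is caught.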
Time redundancy (1)
- Repeat the computation two or more times and compare the results (detection, or correction by majority)
- If an error is detected, the computation may be retried
- Good for detecting transient faults
- Does not protect against errors resulting from permanent faults
- No extra hardware needed, but longer processing time
- Suited to non-time-critical applications
- Alternating logic also detects permanent faults (self-checking circuits, f(x) = f'(x'), where ' denotes the complement)
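
A sketch of plain recomputation with comparison and retry. The class `FlakySquarer` is a hypothetical stand-in for a unit that suffers one transient fault; `run_with_time_redundancy` and its `max_retries` parameter are likewise assumptions for illustration.

```python
def run_with_time_redundancy(compute, x, max_retries=2):
    """Compute twice and compare; retry when the two results disagree."""
    for _ in range(max_retries + 1):
        first = compute(x)
        second = compute(x)
        if first == second:
            return first
    raise RuntimeError("results still disagree after retries")


class FlakySquarer:
    """Squares its input, but flips the lowest result bit on one chosen call."""

    def __init__(self, glitch_on_call):
        self.calls = 0
        self.glitch_on_call = glitch_on_call

    def __call__(self, x):
        self.calls += 1
        result = x * x
        if self.calls == self.glitch_on_call:
            result ^= 1          # transient fault
        return result


unit = FlakySquarer(glitch_on_call=2)
print(run_with_time_redundancy(unit, 7))  # 49, after one retry
print(unit.calls)                         # 4


def permanently_faulty(x):
    """A unit with a permanent fault returns the same wrong value every time."""
    return (x * x) ^ 1


print(run_with_time_redundancy(permanently_faulty, 7))  # 48: wrong, yet undetected
```

The last call illustrates the limitation named on the slide: both runs agree on the wrong value, so simple repetition cannot expose a permanent fault.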
Time redundancy (2)
- Handle permanent faults by encoding the second computation (the encoding must not alter the calculation)
- e.g. k-shift
  - Errors in k-1 consecutive bits of an arithmetic or logical operation are detected
  - Additional hardware (two shifters, storage register, comparator)
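
The k-shift idea can be sketched as follows: the second computation runs on operands shifted left by k bits, so a fault tied to one bit position hits a different bit of the data. The fault model (one adder output bit stuck at 0) and the names `faulty_add` and `shifted_recompute_add` are assumptions for this example.

```python
def faulty_add(a, b, stuck_bit):
    """Adder whose output bit number stuck_bit is permanently stuck at 0."""
    return (a + b) & ~(1 << stuck_bit)


def shifted_recompute_add(a, b, k, add):
    """Add normally, then add again with operands shifted by k and shift back.

    Returns (first_result, results_agree). Python integers are unbounded;
    real hardware would need k extra bits in the adder for the shifted pass.
    """
    first = add(a, b)                       # kept in the storage register
    second = add(a << k, b << k) >> k       # the two shifters
    return first, first == second           # the comparator


def good_add(a, b):
    return a + b


def bad_add(a, b):
    return faulty_add(a, b, stuck_bit=1)


print(shifted_recompute_add(5, 6, 2, good_add))  # (11, True)
print(shifted_recompute_add(5, 6, 2, bad_add))   # (9, False): permanent fault detected
```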
Watchdog timers
- Implemented in hardware (external timer) or software (process)
- If the timer expires, the system is reset or recovered
- Detect only a very specific type of error: control-flow errors
- If an error occurs but the timer is still reset, the error is not detected
- Difficult to determine the runtime
- High detection latency
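
A tick-driven software model of a watchdog timer. The class and method names are hypothetical, and the timer counts abstract ticks rather than real time so that the behaviour is easy to follow.

```python
class WatchdogTimer:
    """Counts down on every tick; the monitored program must reset it in time."""

    def __init__(self, timeout_ticks):
        self.timeout_ticks = timeout_ticks
        self.remaining = timeout_ticks
        self.expired = False

    def kick(self):
        """Called by the monitored program to signal that it is still running."""
        self.remaining = self.timeout_ticks

    def tick(self):
        """Advance time by one tick; returns True once the timer has expired."""
        if self.remaining > 0:
            self.remaining -= 1
        if self.remaining == 0:
            self.expired = True   # a real system would reset or recover here
        return self.expired


wd = WatchdogTimer(timeout_ticks=3)

for _ in range(10):               # healthy program: resets the timer every tick
    wd.tick()
    wd.kick()
print(wd.expired)                 # False

for _ in range(3):                # hung program: no more resets
    wd.tick()
print(wd.expired)                 # True
```

The detection latency is bounded below by the timeout, and a faulty program that still happens to call `kick()` is never flagged, which matches the limitations listed on the slide.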
Capability & Consistency Checking
- Capability checking limits access to objects (e.g. memory segments) to authorized users (processes)
  - Implemented in hardware (error traps) or software (firewall)
  - e.g. checking of address validity by the MMU
- Consistency checking determines whether states or results are reasonable
  - e.g. range checking, address checking, opcode checking
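
Both kinds of check reduce to simple predicates. The segment table, process names, address ranges and sensor limits below are invented for illustration only.

```python
# Hypothetical table: which address range each process is allowed to access.
SEGMENTS = {
    "task_a": range(0x1000, 0x2000),
    "task_b": range(0x2000, 0x3000),
}


def capability_check(process, address):
    """Allow the access only if the address lies in the process's own segment."""
    return address in SEGMENTS.get(process, range(0))


def range_check(value, low, high):
    """Consistency check: is the value within its physically reasonable range?"""
    return low <= value <= high


print(capability_check("task_a", 0x1800))   # True
print(capability_check("task_a", 0x2800))   # False: belongs to task_b
print(range_check(21.5, -40.0, 125.0))      # True: plausible temperature reading
print(range_check(900.0, -40.0, 125.0))     # False: unreasonable result
```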
Control-Flow Checking (1)
- Hardware scheme
  - Divide the application program into blocks
  - Each block has a single entry and exit point
  - A reference signature represents an encoding of the correct execution
  - A watchdog processor validates the application program by comparing the run-time signature with the reference signature
- 70% of transient faults lead to control-flow errors
- Limitations
  - Only suitable for processors running single programs (not multiple processes or threads)
  - Reduced coverage if transmission errors occur on the bus to the watchdog processor
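
A software sketch of signature comparison per block. The toy instruction strings, the block names, and the choice of CRC-32 (via Python's `zlib.crc32`) as the signature function are assumptions for illustration; they are not taken from the slides.

```python
import zlib


def signature(instructions):
    """Encode a sequence of instructions as one CRC-32 value."""
    return zlib.crc32("\n".join(instructions).encode())


# Toy program, already divided into single-entry, single-exit blocks.
PROGRAM = {
    "B0": ["load r1, a", "load r2, b", "add r3, r1, r2"],
    "B1": ["store r3, c", "halt"],
}

# Reference signatures, computed once before the program runs.
REFERENCE = {name: signature(block) for name, block in PROGRAM.items()}


def watchdog_check(block_name, executed_instructions):
    """Compare the run-time signature of a block with its reference signature."""
    return signature(executed_instructions) == REFERENCE[block_name]


ok_trace = ["load r1, a", "load r2, b", "add r3, r1, r2"]
bad_trace = ["load r1, a", "load r2, b", "sub r3, r1, r2"]  # corrupted instruction

print(watchdog_check("B0", ok_trace))    # True
print(watchdog_check("B0", bad_trace))   # False
```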
Control-Flow Checking (2)
- Signatured Instruction Stream (SIS)
  - Hardware: watchdog processor with cyclic code signature generator
  - Software: modified assembler and loader
- Control-flow checking using shadow processing
Summary
- Hardware detection has low error latency
- Hardware is more expensive
  - e.g. massively parallel multiprocessors
- Combine error detection mechanisms
References
- Ravishankar K. Iyer, Zbigniew Kalbarczyk - Hardware and Software Error Detection - Center for Reliable and High-Performance Computing, University of Illinois at Urbana-Champaign
- Hermann Kopetz - Real-Time Systems, Design Principles for Distributed Embedded Applications - 1997, 356 p., Hardcover, ISBN: 978-0-7923-9894-3
- Alireza Vahdatpour, Mahdi Fazeli, Seyed Ghassem Miremadi - Transient Error Detection in Embedded Systems Using Reconfigurable Components - IES, October 2006
- M. Dal Cin, W. Hohl, E. Michel, A. Pataricza - Error Detection Mechanisms for Massively Parallel Multiprocessors - IEEE Proceedings, 1993
- Evaluation of error detection coverage and fault-tolerance of digital plant protection system in nuclear power plants
- http://robotics.ee.uwa.edu.au/courses/faulttolerant/notes/FT2b.pdf
- A. Steininger, C. Scherrer - Identifying Efficient Combinations of Error Detection Mechanisms Based on Results of Fault Injection Experiments - IEEE Transactions on Computers, Vol. 51, No. 2, February 2002