Error-Detecting and Error-Correcting Codes
Motivation
- Computers occasionally make errors (data gets corrupted) due to
  - Voltage spikes
  - Cosmic particles
- Corrupt data causes incorrect behavior
Fix
- Use some bits to hold redundant information
- Data + Redundancy = Code Words
- Depending on the amount of redundancy (and the exact properties of the code) we can
  - Detect errors
  - Correct errors (automatically)
Computer System Organization GCP, DoI, AUEB
Hamming Distance
- Hamming distance (between 2 codewords): the number of bits that need to be changed (reversed) to change one codeword into the other
  - Example: if a change in 1 bit creates a new (valid) codeword, the Hamming distance is 1
- Equivalently: the number of bit positions in which the two codewords differ
  - 1111 and 1010 are 2 bits apart
  - 1111 and 0000 have distance 4
- Hamming distance of a code: the minimum Hamming distance between any two distinct codewords of the code
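Both definitions above translate directly into a few lines of Python. This is a minimal sketch: the function names `hamming_distance` and `code_distance` are illustrative choices, and codewords are assumed to be given as equal-length strings of '0' and '1'.

```python
from itertools import combinations


def hamming_distance(a: str, b: str) -> int:
    """Number of bit positions in which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("codewords must have the same length")
    return sum(x != y for x, y in zip(a, b))


def code_distance(codewords: list[str]) -> int:
    """Minimum Hamming distance over all pairs of distinct codewords."""
    return min(hamming_distance(a, b) for a, b in combinations(codewords, 2))


print(hamming_distance("1111", "1010"))  # 2
print(hamming_distance("1111", "0000"))  # 4
print(code_distance(["0000", "1111"]))   # 4
```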
Hamming Distance of 1
- Hamming distance of 1: a change in 1 bit creates a new codeword
- What happens with a change of 1 bit (1 bit in error)?
- Codewords (every 3-bit pattern is a codeword): A = 000, B = 010, C = 011, D = 001, E = 100, F = 110, G = 111, H = 101
Hamming Distance of 2
- What happens with 1 bit in error?
- What happens with 2 bits in error?
- Codewords: A = 000, B = 011, C = 110, D = 101
- Non-codewords: 001, 010, 100, 111
Hamming Distance of 3
- What happens with 1 bit in error? 2 bits in error? 3 bits in error?
- Codewords: A = 000, B = 111
- Non-codewords: 001, 010, 011, 100, 101, 110
Properties of Distance of Codes
- Codewords have m + r bits (m data bits, r check bits)
- Detecting single-bit errors
  - Code must have distance >= 2
- Detecting d single-bit errors
  - Code must have distance >= d + 1
- Correcting d single-bit errors
  - Code must have distance >= 2d + 1
  - Correcting a single-bit error: d = 1, min. distance = 3 (bits)
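The two inequalities above can be inverted to ask how many errors a code of a given distance can handle. A small sketch follows; the function names are illustrative, not from the slides.

```python
def max_detectable(distance: int) -> int:
    """Largest d satisfying distance >= d + 1."""
    return distance - 1


def max_correctable(distance: int) -> int:
    """Largest d satisfying distance >= 2*d + 1."""
    return (distance - 1) // 2


# Print detect/correct capability for code distances 1 through 4.
for dist in (1, 2, 3, 4):
    print(f"distance {dist}: detects up to {max_detectable(dist)}, "
          f"corrects up to {max_correctable(dist)}")
```

For example, a distance of 3 gives detection of up to 2 single-bit errors or correction of 1, matching the last bullet above.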
Example of Code Distance Properties
- Consider the code with only 2 codewords, 1111 and 0000
  - Distance of 4
- 1110
  - Detected as a single-bit error
  - Distance 1 from 1111
  - Correctable, since only one codeword can suffer a single-bit error and become "1110"
  - This is the 1111 codeword
- 1100
  - Detected as a 2-bit error
  - Distance 2 (e.g., from 1111)
  - Correctable?
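The example can be checked with nearest-codeword decoding: pick the codeword closest to the received word, and refuse to correct when two codewords tie. The sketch below assumes that decoding rule; `decode_nearest` is an illustrative name. It shows that 1110 decodes to 1111, while 1100 is at distance 2 from both codewords, so it can be detected but not corrected.

```python
def hamming_distance(a: str, b: str) -> int:
    """Number of bit positions in which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def decode_nearest(received: str, codewords: list[str]):
    """Return (codeword, distance); codeword is None when the nearest one is not unique."""
    ranked = sorted((hamming_distance(received, c), c) for c in codewords)
    best_dist, best_word = ranked[0]
    tie = len(ranked) > 1 and ranked[1][0] == best_dist
    return (None if tie else best_word), best_dist


CODE = ["0000", "1111"]
print(decode_nearest("1110", CODE))  # ('1111', 1)
print(decode_nearest("1100", CODE))  # (None, 2)
```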
Parity Bit Concept
- Given the word 10011011, add a "parity bit" in front so that the total number of 1's is even or odd
  - Even parity: even # of 1's: 110011011
  - Odd parity: odd # of 1's: 010011011
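A short sketch of computing and checking the parity bit. It assumes the parity bit is prepended to the word, as in the example above; the function names are illustrative.

```python
def add_parity(word: str, even: bool = True) -> str:
    """Prepend a parity bit so the total number of 1's is even (or odd)."""
    bit = word.count("1") % 2      # 1 if the word has an odd number of 1's
    if not even:
        bit ^= 1                   # odd parity uses the opposite bit
    return str(bit) + word


def parity_ok(codeword: str, even: bool = True) -> bool:
    """True if the codeword has the expected parity."""
    return codeword.count("1") % 2 == (0 if even else 1)


print(add_parity("10011011", even=True))   # 110011011
print(add_parity("10011011", even=False))  # 010011011
print(parity_ok("110011010"))              # False: one bit was flipped
```

A single parity bit gives a code of distance 2: any single-bit error is detected, but its position cannot be determined.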
Hamming's Algorithm (Illustrated in the Single Bit Correction Case)
- Bits in power-of-two positions are check bits
- Bit n is checked by the check bits whose positions appear in the decomposition of n into a sum of powers of 2: 1 + 2 + ... + 2^j = n
  - Bit 9 is checked by bits 1 and 8 (9 = 1 + 2^3)
- Bit 1 checks 1, 3, 5, 7, 9, 11 in the codeword
  - 1 + 2 = 3, 1 + 4 = 5, etc.
- Bit 2 checks 2, 3, 6, 7, 10, 11
- Bit 4 checks 4, 5, 6, 7, 12
- Bit 8 checks 8, 9, 10, 11, 12
- An 8-bit data word has a codeword of the form
    Position: 12 11 10  9  8  7  6  5  4  3  2  1
    Bit:       D  D  D  D  P  D  D  D  P  D  P  P
  - P: parity (assume even parity)
  - D: data
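The check-bit rule can be sketched in Python as follows. The slide does not say which data bit goes to which data position, so this sketch assumes the data string is written left to right into the non-power-of-two positions in ascending order (3, 5, 6, 7, 9, ...). The function names are illustrative, and the returned list holds position 1 first.

```python
def covered_positions(check_pos: int, n: int) -> list[int]:
    """Positions 1..n whose binary decomposition includes check_pos."""
    return [pos for pos in range(1, n + 1) if pos & check_pos]


def hamming_encode(data: str) -> list[int]:
    """Even-parity Hamming codeword; element i is the bit at position i + 1."""
    m = len(data)
    r = 0
    while (1 << r) < m + r + 1:      # smallest r with 2**r >= m + r + 1
        r += 1
    n = m + r
    code = [0] * (n + 1)             # index 0 unused, positions 1..n
    bits = iter(int(b) for b in data)
    for pos in range(1, n + 1):
        if pos & (pos - 1):          # not a power of two: data position (assumed order)
            code[pos] = next(bits)
    for j in range(r):
        p = 1 << j
        parity = 0
        for pos in covered_positions(p, n):
            if pos != p:
                parity ^= code[pos]
        code[p] = parity             # makes the whole group even
    return code[1:]


print(covered_positions(4, 12))      # [4, 5, 6, 7, 12]
print(hamming_encode("10011011"))    # 12 bits, positions 1 through 12
```

With 8 data bits the loop yields r = 4 check bits and a 12-bit codeword, matching the layout above.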
Example: Hamming Code (11,7)
Error!
(Figure: stored codeword vs. retrieved codeword)
Determining Bit in Error
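One common way to locate a single flipped bit, assuming check bits at power-of-two positions and even parity: recompute every parity check on the retrieved word and add up the positions of the check bits that fail. The sum is the position of the bit in error, and 0 means no error was detected. The function names and the sample codeword below are illustrative assumptions, not taken from the slides.

```python
def syndrome(code: list[int]) -> int:
    """Sum of the check-bit positions whose even-parity test fails.

    code[i] is the bit at position i + 1.
    """
    n = len(code)
    total = 0
    p = 1
    while p <= n:
        parity = 0
        for pos in range(1, n + 1):
            if pos & p:                  # position pos is covered by check bit p
                parity ^= code[pos - 1]
        if parity:                       # group has odd parity: check fails
            total += p
        p <<= 1
    return total


def correct_single_error(code: list[int]) -> tuple[list[int], int]:
    """Flip the bit named by the syndrome, if it names a valid position."""
    fixed = list(code)
    s = syndrome(fixed)
    if 1 <= s <= len(fixed):
        fixed[s - 1] ^= 1
    return fixed, s


# A valid 12-bit even-parity codeword (positions 1..12), used as the stored word.
stored = [0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1]
retrieved = list(stored)
retrieved[5 - 1] ^= 1                    # simulate an error at position 5

fixed, position = correct_single_error(retrieved)
print(position)         # 5  (checks 1 and 4 fail: 1 + 4 = 5)
print(fixed == stored)  # True
```

This only works for a single-bit error; with two or more flipped bits the syndrome can point at the wrong position.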