Error Control Coding and Applications Eric Chen Computer Science Group HKr
Internet Banking Postgiro Bankgiro OCR
OCR Number (OCR-nummer) A reference number that links your payment to the invoice. What if the OCR number is typed incorrectly? Is the error detected? E24.se reported a case (how is this possible?): business owner Thomas Hultberg entered the wrong OCR number when paying his taxes. Now he is forced to pay the 229,000 kronor a second time.
Outline of the presentation About me Errors and Error Effect Error Control Coding Personnummer, OCR Hamming code My Research and Results
About Me Education: 1978.7–1982.7 Electronics Engineering, Harbin C. U.; 1982.7–1986.4 Computer Science, SWJTU; 1987.2–1991.4 Electrical Engineering, SWJTU. Work: 1986.1–1993.4 Lecturer, Associate Prof., SWJTU; 1993.4–1994.6 Guest researcher, Linköping Uni.; 1994.7–present HKr
Information Transmission System [Figure: Information Source → Source encoding (remove redundancy) → Encoder (add redundancy; k digits in, n digits out) → Communication Channel (with Noise) → Decoder (error detection/correction; n digits in, k digits out) → Source decoding → Information Sink. The encoder side is the Transmitter, the decoder side is the Receiver.]
Introduction to Coding Theory Also called channel coding The study of methods for efficient and accurate transfer of information Detecting and correcting transmission errors
Errors in digital communications
Errors and Error Effect Bits can be lost. What is the effect of errors on programs downloaded from the Internet? On CD music? On Internet banking services? Errors must be detected and corrected!
Channel model -- BSC Binary symmetric channel p: the bit error probability
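The channel model above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `bsc` and the use of Python's `random` module are choices made here, not something taken from the slides. Each bit is flipped independently with probability p.

```python
import random

def bsc(bits, p, rng=None):
    """Pass a list of 0/1 bits through a binary symmetric channel.

    Each bit is flipped independently with probability p.
    `rng` is an optional random.Random instance (an assumption of this
    sketch, added so that runs can be made reproducible).
    """
    rng = rng or random.Random()
    return [b ^ 1 if rng.random() < p else b for b in bits]

sent = [0, 1, 0, 1, 1, 1, 0]
received = bsc(sent, p=0.1, rng=random.Random(1))
errors = sum(s != r for s, r in zip(sent, received))
print(received, "bit errors:", errors)
```

With p = 0 the output always equals the input, and with p = 1 every bit is inverted, which is a quick sanity check of the model.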
Why Error Control Coding? Bit error rate p: p = 1/100000 = 10^-5 for optical disks; p = 10^-11 for a fiber link. Some calculations: with p = 10^-6, downloading a file of length 10^7 bits gives about 10 bit errors; at a data rate of 10 Mbps (10^7 bits/s) that is about 10 bit errors every second, or 1 bit error every 0.1 s! With p = 10^-11 and a data rate of 10 Gigabits/s, there is 1 bit error every 10 seconds!
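The arithmetic behind such estimates is just the bit error rate multiplied by the number of bits. A minimal sketch, assuming independent bit errors (the function names are illustrative, not from the slides):

```python
def expected_bit_errors(p, n_bits):
    """Average number of bit errors when n_bits are sent at error rate p."""
    return p * n_bits

def seconds_per_error(p, bits_per_second):
    """Average time between bit errors at the given data rate."""
    return 1.0 / (p * bits_per_second)

# File of 10^7 bits at p = 10^-6
print(expected_bit_errors(1e-6, 1e7))
# Fiber link: p = 10^-11 at 10 Gigabits/s
print(seconds_per_error(1e-11, 10e9))
```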
Error Control Coding – Principle Add additional information (redundancy) to the data: added by the sender, checked by the receiver. k data digits are encoded into a codeword of n digits. Code rate r = k / n. [Figure: k data digits encoded as an n-digit codeword]
Application Example – Swedish personal ID 640823-3234: format yymmdd-nnnP. yymmdd: year, month, day. nnn: serial number, odd for male, even for female. P: the parity check digit, used for error detection! The OCR number uses the same technique.
Personal ID Encoding Method
position         1   2   3   4   5   6   7   8   9   10
digit            6   4   0   8   2   3   3   2   3   ?
2 × odd pos.    12   4   0   8   4   3   6   2   6
add 2-digits     3   4   0   8   4   3   6   2   6    (12 becomes 1 + 2 = 3)
sum = 3 + 4 + 0 + 8 + 4 + 3 + 6 + 2 + 6 = 36
take the last digit of the sum: 6
parity check digit = 10 – 6 = 4
result: 640823-3234
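The encoding steps above can be sketched as a short Python function. The name `check_digit` is illustrative; the final `% 10` is an addition made here so that a sum ending in 0 yields check digit 0 rather than 10, a case the slide does not show.

```python
def check_digit(first_nine):
    """Compute the check digit for the first nine digits of a personal ID.

    Non-digit characters such as '-' are ignored.
    """
    digits = [int(ch) for ch in first_nine if ch.isdigit()]
    if len(digits) != 9:
        raise ValueError("expected exactly nine digits")
    total = 0
    for position, digit in enumerate(digits, start=1):
        if position % 2 == 1:
            # Odd position: double, then add the digits of the product.
            product = 2 * digit
            total += product // 10 + product % 10
        else:
            total += digit
    # 10 minus the last digit of the sum (0 if the sum ends in 0).
    return (10 - total % 10) % 10

print(check_digit("640823-323"))  # 4
```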
Personal ID Error Detection
640823-3234 is mistyped as 460823-3234 (first two digits swapped). Is it valid?
position         1   2   3   4   5   6   7   8   9   10
digit            4   6   0   8   2   3   3   2   3   ?
2 × odd pos.     8   6   0   8   4   3   6   2   6
add 2-digits     8   6   0   8   4   3   6   2   6
sum = 8 + 6 + 0 + 8 + 4 + 3 + 6 + 2 + 6 = 43
take the last digit of the sum: 3
parity check digit = 10 – 3 = 7
This is not equal to the given check digit 4: there is an error in the number!
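The detection step can also be phrased as a single test on all ten digits: if the check digit is included in the weighted sum, a valid number gives a total ending in 0. A minimal sketch (the function name is hypothetical, and only the check digit is tested, not whether the date part is plausible):

```python
def is_valid_personal_id(number):
    """Return True if the ten digits pass the parity check."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) != 10:
        return False
    total = 0
    for position, digit in enumerate(digits, start=1):
        if position % 2 == 1:
            doubled = 2 * digit
            # Subtracting 9 equals adding the two digits of 10..18.
            total += doubled - 9 if doubled > 9 else doubled
        else:
            total += digit
    return total % 10 == 0

print(is_valid_personal_id("640823-3234"))  # True
print(is_valid_personal_id("460823-3234"))  # False
```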
Parity Check Applications The same coding methods have been used for OCR reference number Bankgironummer Organisationsnummer Reference http://www.lur.nu/OCR/generera.php
Error Correcting Code – Hamming Code Binary Hamming [7, 4] code: k = 4, n = 7. Encodes 4 data bits by adding 3 parity bits. Can correct any single error. Encoding: a b c d → a b c d x y z, where a, b, c, d are information bits and x, y, z are parity check bits.
Hamming Code: Given a, b, c, d, how to get x, y, z? Place a, b, c, d in the intersections of the circles. Label the circles x, y, z. Parity checking rule: the sum of the bits in each circle is 0 (mod 2). x = a + b + c, y = a + c + d, z = b + c + d (mod 2).
Hamming Code Example: Given a b c d = 0101, how to get x, y, z? x = 0 + 1 + 0 = 1, y = 0 + 0 + 1 = 1, z = 1 + 0 + 1 = 0 (mod 2), so 0101 → 0101 xyz and the codeword is 0101 110.
Hamming Code for Error Correction: 0101 110 is sent, 0100 110 is received. Encode the received data bits 0100: 0100 → 0100 101. Compare the computed parity 101 with the received parity 110: 101 ⊕ 110 = 011, so there is an error. Bit d must be in error, because d is the bit that affects exactly y and z. Correction: flip d to reconstruct 0101 110.
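The re-encode-and-compare procedure described above can be sketched as follows. The table `ERROR_POSITION` is derived here from the parity equations x = a + b + c, y = a + c + d, z = b + c + d; the names are illustrative and the sketch assumes at most one bit error.

```python
def parity_bits(a, b, c, d):
    """Parity bits (x, y, z) of the Hamming [7, 4] code from the slides."""
    return ((a + b + c) % 2, (a + c + d) % 2, (b + c + d) % 2)

def encode(data):
    """Encode 4 data bits a, b, c, d as the codeword a b c d x y z."""
    return list(data) + list(parity_bits(*data))

# Which of the checks (x, y, z) change when a given position is flipped,
# mapped to that position's index in the word a b c d x y z.
ERROR_POSITION = {
    (1, 1, 0): 0,  # a affects x, y
    (1, 0, 1): 1,  # b affects x, z
    (1, 1, 1): 2,  # c affects x, y, z
    (0, 1, 1): 3,  # d affects y, z
    (1, 0, 0): 4,  # x itself
    (0, 1, 0): 5,  # y itself
    (0, 0, 1): 6,  # z itself
}

def correct(received):
    """Correct at most one bit error in a received 7-bit word."""
    word = list(received)
    recomputed = parity_bits(*word[:4])
    diff = tuple(r ^ p for r, p in zip(recomputed, word[4:]))
    if diff != (0, 0, 0):
        word[ERROR_POSITION[diff]] ^= 1
    return word

print(encode([0, 1, 0, 1]))            # [0, 1, 0, 1, 1, 1, 0]
print(correct([0, 1, 0, 0, 1, 1, 0]))  # [0, 1, 0, 1, 1, 1, 0]
```

Because the seven patterns are all different and non-zero, every single-bit error points to exactly one position.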
Error Detecting Codes Only detect errors. A protocol is used to correct errors: ACK, positive acknowledgement (I got it); NAK, negative acknowledgement (sorry). Simple, reliable, high code rate. Used in data communications. [Figure: the sender transmits a codeword; the receiver replies with ACK/NAK]
Error Correcting Codes Detect and correct errors. No feedback channel required. Complicated, lower code rate (k/n). Used in storage systems (computer storage, CD, DVD) and space communications. [Figure: the sender transmits a codeword to the receiver]
Generator Matrix and Encoding Generator matrix G. Example: Hamming [7, 4] code. Encoding: (a, b, c, d) G = the codeword a b c d x y z, with x = a + b + c, y = a + c + d, z = b + c + d (mod 2).
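The matrix itself did not survive in this text version of the slide. The sketch below uses the systematic generator matrix that follows from the parity equations x = a + b + c, y = a + c + d, z = b + c + d; treat the exact layout as a reconstruction, not necessarily the matrix shown on the original slide.

```python
# Rows correspond to a, b, c, d; columns to a b c d x y z.
G = [
    [1, 0, 0, 0, 1, 1, 0],  # a contributes to x and y
    [0, 1, 0, 0, 1, 0, 1],  # b contributes to x and z
    [0, 0, 1, 0, 1, 1, 1],  # c contributes to x, y and z
    [0, 0, 0, 1, 0, 1, 1],  # d contributes to y and z
]

def encode_with_g(message):
    """Compute the row vector message * G over GF(2)."""
    return [sum(m * g for m, g in zip(message, column)) % 2
            for column in zip(*G)]

print(encode_with_g([0, 1, 0, 1]))  # [0, 1, 0, 1, 1, 1, 0]
```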
Parity Check Matrix and Decoding Parity check matrix H: H G^T = 0. Example: Hamming [7, 4] code. Syndrome s: a column vector of length n – k. Decoding a received word y: H y^T = s is the syndrome of y. If s = 0, no error is detected; otherwise, there must be errors.
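A minimal sketch of syndrome decoding for the same code. The matrices G and H below are reconstructed here from the parity equations x = a + b + c, y = a + c + d, z = b + c + d (the slide's own matrices are not present in this text), and the correction step assumes at most one bit error: a non-zero syndrome then equals the column of H at the error position.

```python
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1, 1],
    [0, 0, 0, 1, 0, 1, 1],
]
# One row per parity check; columns are a b c d x y z.
H = [
    [1, 1, 1, 0, 1, 0, 0],  # a + b + c + x = 0
    [1, 0, 1, 1, 0, 1, 0],  # a + c + d + y = 0
    [0, 1, 1, 1, 0, 0, 1],  # b + c + d + z = 0
]

def syndrome(word):
    """Compute s = H * word^T over GF(2)."""
    return [sum(h * w for h, w in zip(row, word)) % 2 for row in H]

def decode(word):
    """Flip the bit whose column of H matches a non-zero syndrome."""
    word = list(word)
    s = syndrome(word)
    if any(s):
        columns = [list(col) for col in zip(*H)]
        word[columns.index(s)] ^= 1
    return word

# Every row of G is a codeword, so H * G^T = 0.
print(all(syndrome(row) == [0, 0, 0] for row in G))  # True
print(syndrome([0, 1, 0, 0, 1, 1, 0]))               # [0, 1, 1]
print(decode([0, 1, 0, 0, 1, 1, 0]))                 # [0, 1, 0, 1, 1, 1, 0]
```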
Optimal codes Distance or d-optimal code: a linear [n, k, d]_q code is d-optimal if there does not exist an [n, k, d+1]_q code. Length or n-optimal code: a linear [n, k, d]_q code is n-optimal if there does not exist an [n–1, k, d]_q code. k-optimal code: a linear [n, k, d]_q code is k-optimal if there does not exist an [n, k+1, d]_q code.
My Research Difference triangle sets (a generalization of Golomb rulers), 1989–1995. Majority-logic decodable codes, 1988–1992. Quasi-twisted codes (e.g. a [112,13,48] code), 1989–present. Two-weight codes and graphs, 2006–present.
Difference Triangle Set
Golomb ruler: R = {0, 1, 4, 6}
Difference triangle of R:
1 4 6
3 5
2
Difference triangle sets: a generalization of Golomb rulers. A set of t Golomb rulers whose differences are all distinct, e.g. R1 = {0, 6, 11, 13}, R2 = {0, 8, 17, 18}, R3 = {0, 3, 15, 19}, with difference triangles
6 11 13     8 17 18     3 15 19
5 7         9 10        12 16
2           1           4
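The defining property, that all differences inside and across the rulers are distinct, is easy to check by machine. A small sketch with illustrative function names; the triangle rows are laid out as on the slide, with row i holding the distances from mark i to every later mark.

```python
from itertools import combinations

def difference_triangle(ruler):
    """Rows of differences: row i holds marks[j] - marks[i] for j > i."""
    marks = sorted(ruler)
    return [[later - mark for later in marks[i + 1:]]
            for i, mark in enumerate(marks[:-1])]

def is_difference_triangle_set(rulers):
    """True if all pairwise differences over all rulers are distinct."""
    diffs = [b - a
             for ruler in rulers
             for a, b in combinations(sorted(ruler), 2)]
    return len(diffs) == len(set(diffs))

print(difference_triangle([0, 1, 4, 6]))  # [[1, 4, 6], [3, 5], [2]]
print(is_difference_triangle_set(
    [[0, 6, 11, 13], [0, 8, 17, 18], [0, 3, 15, 19]]))  # True
```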
Current Research Interests Quasi-Twisted Codes Many QT codes are good or optimal Computer constructions 2-weight codes Non-zero codewords of weights w1 or w2 Related to strongly regular graphs
Computer search for QT Codes Given a cyclic weight matrix of order s, how do we select p columns such that the minimum of the row sums over the p columns is maximized, or such that the row sums take only the values w1 or w2?
Computer search for QT Codes Given a cyclic weight matrix of order s: columns 0, 1, 3 produce a QT code with minimum distance 6. The row sums take the values 6 and 8, so a two-weight code is found.
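The column-selection problem can be illustrated with a brute-force sketch. Everything specific here is an assumption: the toy first row `[4, 0, 2, 0]` is made up for illustration and is not a weight matrix from the slides, and exhaustive enumeration is only practical for small s and p, so this is not meant to represent the search procedure actually used in the research.

```python
from itertools import combinations

def circulant(first_row):
    """Cyclic matrix whose row i is first_row shifted right by i places."""
    s = len(first_row)
    return [[first_row[(j - i) % s] for j in range(s)] for i in range(s)]

def best_columns(matrix, p):
    """Try every choice of p columns; keep the one whose smallest
    row sum is largest. Returns (columns, row_sums)."""
    best = None
    for cols in combinations(range(len(matrix[0])), p):
        row_sums = [sum(row[c] for c in cols) for row in matrix]
        if best is None or min(row_sums) > min(best[1]):
            best = (cols, row_sums)
    return best

W = circulant([4, 0, 2, 0])      # toy matrix, NOT from the slides
cols, sums = best_columns(W, 2)
print(cols, sums)                # (0, 1) [4, 4, 2, 2]
print(sorted(set(sums)))         # [2, 4]: only two distinct row sums
```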
Publications Co-authored books: one textbook (VAX-11 assembly language programming); one book (combinatorial coding theory and applications). Papers: IEEE Trans. Information Theory, 8 papers; IEE Electronics Letters, 6 papers; Designs, Codes and Cryptography, 1 paper (DDD: disjoint distinct difference sets).
Online Database on Codes A web database of binary quasi-cyclic codes http://moodle.tec.hkr.se/~chen/research/codes/searchqc2.htm see also: codetables http://www.codetables.de A Web database of two-weight codes http://moodle.tec.hkr.se/~chen/research/2-weight-codes/search.php
References Personnummer http://skatteverket.se/download/18.1e6d5f87115319ffba380001857/70408.pdf http://www.e24.se/pengar24/dinekonomi/familjeekonomi/artikel_360879.e24 http://www.lur.nu/OCR/generera.php