Coding Theory and its Applications 編碼及其應用


Coding Theory and its Applications 編碼及其應用
Hung-Lin Fu (傅恆霖), Department of Applied Mathematics, National Chiao Tung University (交通大學應用數學系)

Basic Ideas
Message transmission (訊息傳遞)
Correctness and security (正確性及安全性)
Saving time and expense (省時及省經費)
The study of security is the main job of cryptography (密碼學).
Coding theory deals not only with the “correctness” of transmission but also with its “quickness”.

The Flow of Transmission
Message → Encode → Modulation → Noisy Channel → Demodulation → Decode → Original Message

Examples
Grades: A, B, C, and D.
Use the digits 0 and 1 to encode them, e.g. A: 00.
To send A, transmit 00.

Receiving
Following demodulation and decoding, we expect to receive the original message A. Unfortunately, errors may occur because of “noise”.

Probability of Errors
Let p denote the probability of sending “0” and receiving “1”. In a “symmetric channel”, sending “1” and receiving “0” has the same error probability p.
If t digits are transmitted, then the probability of making exactly s errors is $C(t,s)\,p^s(1-p)^{t-s}$.
The probability of making at least one error is $C(t,1)\,p(1-p)^{t-1} + C(t,2)\,p^2(1-p)^{t-2} + \dots + p^t = 1 - (1-p)^t$.

[Figure: symmetric channel. Each digit, 0 or 1, is received correctly with probability 1-p and flipped with probability p.]

It Happens!
Let p = 0.01. It looks small, but it is in fact very large for real-world transmission. If a million digits are transmitted in a minute, we expect about 10,000 erroneous digits per minute.
Therefore, if we use 00, 01, 10, and 11 for A, B, C, and D, then errors in transmitted words do occur! The probability that a word is received in error is $2 \times (0.01) \times (0.99) + (0.01)^2 = 0.0199$.
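The arithmetic on this slide can be checked numerically. The following is a minimal sketch, assuming errors on different digits are independent; the function names are illustrative and do not come from the slides.

```python
from math import comb

def prob_exactly(t: int, s: int, p: float) -> float:
    """Probability of exactly s errors among t transmitted digits."""
    return comb(t, s) * p**s * (1 - p) ** (t - s)

def prob_at_least_one(t: int, p: float) -> float:
    """Probability that at least one of t digits is received in error."""
    return sum(prob_exactly(t, s, p) for s in range(1, t + 1))

# Two-digit words such as 00, 01, 10, 11 sent with p = 0.01.
word_error = prob_at_least_one(2, 0.01)

# Expected number of erroneous digits among one million transmitted digits.
expected_bad_digits = 1_000_000 * 0.01

print(round(word_error, 4), expected_bad_digits)
```

The sum over s = 1, ..., t agrees with the closed form 1 - (1-p)^t, which is cheaper to evaluate for large t.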

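An Improvement
Parity check digits: append one digit so that every codeword has an even number of 1’s.
00 → 000
01 → 011
10 → 101
11 → 110
The probability of making errors “without noticing” is smaller! An error goes unnoticed only when exactly two digits are flipped (one or three flips change the parity and are detected), so this probability is $C(3,2) \times (0.01)^2 \times (0.99) = 0.000297$.
We can add more digits instead of just one.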
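Which error patterns escape a parity check can be found by brute force: an error is unnoticed exactly when the received word is a codeword different from the one sent. The sketch below enumerates all received words for the four codewords of this slide; it assumes independent bit flips, and the function name is illustrative. It finds that only double errors go unnoticed, since three flips change the parity.

```python
from itertools import product

# Two message digits followed by one parity digit (even number of 1's).
CODE = {"000", "011", "101", "110"}

def undetected_error_probability(code: set[str], p: float) -> float:
    """Average probability that a sent codeword arrives as a different codeword."""
    n = len(next(iter(code)))
    total = 0.0
    for sent in code:
        for bits in product("01", repeat=n):
            received = "".join(bits)
            flips = sum(a != b for a, b in zip(sent, received))
            # Unnoticed: something was flipped, yet the result is still a codeword.
            if flips > 0 and received in code:
                total += p**flips * (1 - p) ** (n - flips)
    return total / len(code)

print(undetected_error_probability(CODE, 0.01))
```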

Error Correction
When an error occurs, we may not know which digit is wrong, so we “ask for retransmission”. But retransmission is not always possible.

The Idea of Correcting Errors
00 → 000000
01 → 010101
10 → 101010
11 → 111111
Assume that 101110 is received. We conclude that the message sent was 101010!

Hamming Distance
A message of n digits can be expressed as an n-dimensional vector over the finite field GF(2). E.g. 010101 ↔ (0,1,0,1,0,1).
Let K = GF(2). Then $K^n$ is a set of $2^n$ vectors.

A New Metric
Let $(a_1, a_2, \dots, a_n)$ and $(b_1, b_2, \dots, b_n)$ be two vectors of $K^n$. Then the Hamming distance of the two vectors is the number of indices k, k = 1, 2, …, n, such that $a_k - b_k \neq 0$.
E.g.
d(101010,101110) = 1
d(000000,101110) = 4
d(111111,101110) = 2
d(010101,101110) = 5
Hamming distance is a “metric”!
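The definition translates directly into code. A minimal sketch, with words represented as strings of 0's and 1's (a representation chosen here for readability, not taken from the slides):

```python
def hamming_distance(u: str, v: str) -> int:
    """Number of coordinates in which the words u and v differ."""
    if len(u) != len(v):
        raise ValueError("words must have the same length")
    return sum(a != b for a, b in zip(u, v))

received = "101110"
for codeword in ("101010", "000000", "111111", "010101"):
    print(codeword, hamming_distance(codeword, received))
```

The loop reproduces the four distances listed on the slide.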

Distance and Decoding
If the distance between two words u and v of length n is d, then the probability of sending u and receiving v is $p^d(1-p)^{n-d}$.
Fact: Assume p < 1/2. If d(w,u) > d(v,u) and u is received, then v is more probable than w as the word sent.
E.g. Let 000000, 010101, 101010, and 111111 be the four possible words to send, and suppose 101110 is received. Then we choose 101010 as the word sent.

Maximum Likelihood Decoding
Let C be the code we use for transmission and u be the word received through the channel.
CMLD (Complete Maximum Likelihood Decoding): If v is a codeword for which d(v,u) is minimum over all codewords in C, then we conclude that v is the transmitted codeword, whether or not v is unique.
IMLD (Incomplete MLD): If v (as above) is not unique, then ask for retransmission.
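Both decoding rules amount to a nearest-codeword search. The sketch below is one possible reading: in complete mode a tie is broken by taking the first nearest codeword in sorted order (an arbitrary choice made here, the slide does not specify one), and in incomplete mode a tie returns None to stand for "ask for retransmission".

```python
from typing import Optional

def hamming_distance(u: str, v: str) -> int:
    return sum(a != b for a, b in zip(u, v))

def mld_decode(code: set[str], received: str, complete: bool = True) -> Optional[str]:
    """Return a codeword nearest to the received word, or None on an IMLD tie."""
    best = min(hamming_distance(v, received) for v in code)
    nearest = sorted(v for v in code if hamming_distance(v, received) == best)
    if len(nearest) == 1 or complete:
        return nearest[0]   # CMLD: decode even when the choice is not unique
    return None             # IMLD: not unique, request retransmission

CODE = {"000000", "010101", "101010", "111111"}
print(mld_decode(CODE, "101110"))

# A tie needs a code where some word is equally close to two codewords.
print(mld_decode({"00", "11"}, "01", complete=False))
```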

Linear Codes
A code of length n is a subset of $K^n$.
A linear code of length n is a linear subspace of $K^n$. (The sum of two vectors is taken coordinate by coordinate under the addition of K.) Linear Algebra!!
A linear (n,k,d)-code is a linear code of length n with dimension k and distance d, where d is the minimum distance between two distinct vectors of the code.

Weights of Codewords
Each vector of a code is called a codeword. The weight of a codeword is the number of 1’s in the codeword. E.g. wt(101011) = 4.
Proposition. The distance of a linear code is equal to the minimum weight of a non-zero codeword.
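The proposition can be checked on the four-word code used earlier (000000, 010101, 101010, 111111). This is a small sketch with illustrative helper names; it tests closure under addition, which for a nonempty binary code is equivalent to being a subspace, and compares the minimum distance with the minimum non-zero weight.

```python
from itertools import combinations

def add(u: str, v: str) -> str:
    """Coordinate-wise sum over GF(2), i.e. XOR of the digits."""
    return "".join("1" if a != b else "0" for a, b in zip(u, v))

def weight(v: str) -> int:
    """Number of 1's in the word."""
    return v.count("1")

def is_linear(code: set[str]) -> bool:
    """A nonempty binary code is linear iff it is closed under addition."""
    return bool(code) and all(add(u, v) in code for u in code for v in code)

def minimum_distance(code: set[str]) -> int:
    return min(weight(add(u, v)) for u, v in combinations(sorted(code), 2))

def minimum_weight(code: set[str]) -> int:
    return min(weight(v) for v in code if weight(v) > 0)

CODE = {"000000", "010101", "101010", "111111"}
print(is_linear(CODE), minimum_distance(CODE), minimum_weight(CODE))
```

For a linear code the two minima agree because d(u,v) = wt(u+v) and u+v is again a codeword.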

Main Theorem
Theorem. A code with distance d can detect d-1 errors and correct [(d-1)/2] errors.
Proof. Let w be the received word, with d(v,w) ≤ [(d-1)/2] for some v in the code C. Then for each y in C other than v we have d(w,y) > d(v,w). Otherwise, by the triangle inequality, d(v,y) ≤ d(v,w) + d(w,y) ≤ 2[(d-1)/2] ≤ d-1 < d, which is a contradiction. Therefore, C can correct up to [(d-1)/2] errors. The detection is easy to see: fewer than d errors cannot turn one codeword into another.

Better Codes
The length of a codeword determines the “time” of transmission.
The dimension k of a linear code gives the information rate k/n.
The distance of a code tells you how many errors can be detected (or corrected).
The n-k bits which are not information bits are parity check bits.
A(n,d) is the maximum number of words of length n such that the distance between any two words is at least d.
A code C is (n,d)-optimal if C has A(n,d) codewords. (A[n,d] is the corresponding maximum for linear codes.)

The Most Important Problem in Coding Theory
Given two positive integers n and d where d < n, determine A(n,d) and A[n,d].
$A(7,3) \le 2^7 / (1+7) = 16$ (sphere packing bound).
A(7,3) = 16 (by direct constructions).
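The bound quoted for A(7,3) is the binary sphere packing bound: balls of radius [(d-1)/2] around distinct codewords are disjoint, so their total size cannot exceed 2^n. A minimal sketch (the function name is illustrative):

```python
from math import comb

def sphere_packing_bound(n: int, d: int) -> int:
    """Upper bound on A(n, d) for binary codes."""
    t = (d - 1) // 2                                    # errors that can be corrected
    ball_size = sum(comb(n, i) for i in range(t + 1))   # words within distance t
    return 2**n // ball_size

print(sphere_packing_bound(7, 3))
```

For n = 7 and d = 3 each ball holds 1 + 7 = 8 words, giving 128 / 8 = 16.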

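Two Constructions
Use a Steiner triple system of order 7: {1,2,4}, {2,3,5}, {3,4,6}, {4,5,7}, {5,6,1}, {6,7,2}, {7,1,3}.
Each triple gives its incidence vector and the complement of that vector; together with 0000000 and 1111111 this yields 16 codewords:
{1,2,4}: 1101000 0010111
{2,3,5}: 0110100 1001011
{3,4,6}: 0011010 1100101
{4,5,7}: 0001101 1110010
{5,6,1}: 1000110 0111001
{6,7,2}: 0100011 1011100
{7,1,3}: 1010001 0101110
0000000 1111111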
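The construction from the Steiner triple system can be reproduced mechanically. The sketch below builds the incidence vector of each triple and its complement, adds the all-zero and all-one words, and checks the size and minimum distance of the result; the helper names are illustrative.

```python
from itertools import combinations

BLOCKS = [{1, 2, 4}, {2, 3, 5}, {3, 4, 6}, {4, 5, 7}, {5, 6, 1}, {6, 7, 2}, {7, 1, 3}]

def incidence_vector(block: set[int], n: int = 7) -> str:
    """Word with a 1 in position i exactly when point i lies in the block."""
    return "".join("1" if i in block else "0" for i in range(1, n + 1))

def complement(word: str) -> str:
    return "".join("1" if c == "0" else "0" for c in word)

def steiner_code(blocks: list[set[int]], n: int = 7) -> set[str]:
    words = {"0" * n, "1" * n}
    for block in blocks:
        w = incidence_vector(block, n)
        words.update({w, complement(w)})
    return words

def minimum_distance(code: set[str]) -> int:
    return min(sum(a != b for a, b in zip(u, v))
               for u, v in combinations(sorted(code), 2))

code = steiner_code(BLOCKS)
print(len(code), minimum_distance(code))
```

Two distinct triples share exactly one point, so their incidence vectors differ in 4 places, and a triple and the complement of another triple differ in 3 places, which is where the minimum distance 3 comes from.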

Parity Check Matrix
The code we plan to construct is a linear code of dimension 4. Using a 7x3 matrix H of rank 3, we conclude that the set of vectors v satisfying vH = 0 forms a linear subspace of $K^7$ of dimension 4. Again, Linear Algebra!
Let
$$H^t = \begin{pmatrix} 0 & 0 & 1 & 0 & 1 & 1 & 1 \\ 0 & 1 & 0 & 1 & 0 & 1 & 1 \\ 1 & 0 & 0 & 1 & 1 & 0 & 1 \end{pmatrix}$$
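The code defined by a parity check matrix can be listed by testing every vector of K^7 against vH = 0. The sketch below assumes that the rows of H are the seven distinct non-zero vectors of K^3 (so each column of H^t below is a different non-zero triple); the variable names are illustrative.

```python
from itertools import product

# H^t: three rows of length 7; its columns are the rows of H.
# Assumption: the columns are the 7 distinct non-zero binary triples.
H_T = [
    [0, 0, 1, 0, 1, 1, 1],
    [0, 1, 0, 1, 0, 1, 1],
    [1, 0, 0, 1, 1, 0, 1],
]

def syndrome(v: tuple[int, ...]) -> tuple[int, ...]:
    """Compute vH over GF(2) as a triple."""
    return tuple(sum(a * b for a, b in zip(v, row)) % 2 for row in H_T)

code = [v for v in product((0, 1), repeat=7) if syndrome(v) == (0, 0, 0)]
min_weight = min(sum(v) for v in code if any(v))

print(len(code), min_weight)
```

The 16 solutions form a subspace of dimension 4, and the minimum non-zero weight is 3, so a single error can be corrected.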

BCH Codes
BCH stands for Bose, Chaudhuri and Hocquenghem.
The code we just constructed is a 1-error-correcting BCH code (distance = 3).
Since no row of H is zero and no two rows (vectors) are the same, a nonzero vector v satisfying vH = 0 has weight at least 3. Hence the distance of the code is 3 (there are 3 rows which are dependent).
The rows of H can be considered as the set of all non-zero elements of $GF(2^3)$.

A Different Point of View
$K^n$ can be viewed as the set of all polynomials of degree at most n-1 with coefficients in K.
Let $R_n = K[x]/(x^n+1)$ (so $x^n = 1$). Then $R_n$ with polynomial addition and multiplication is a ring.
If f(x) is a divisor of $x^n+1$, then the set of all multiples of f(x) in $R_n$ is a linear (cyclic) code of dimension n - deg(f(x)).

Quiz
Consider $R_7$. Is $x^7+1 = (1+x)(1+x+x^3)(1+x^2+x^3)$? (Hint: 1 = -1, so $(1+x)^2 = 1 + x^2$.)
The set of all polynomials in $R_7$ which are multiples of $1+x+x^3$ forms a linear code with 16 codewords. This is “essentially the same” code as constructed above.
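The quiz can be checked by computing with polynomials over GF(2). In this sketch a polynomial is stored as an integer bit mask, with bit i holding the coefficient of x^i; that encoding and the function names are choices made here, not part of the slides.

```python
def poly_mul(a: int, b: int) -> int:
    """Product of two GF(2) polynomials stored as bit masks."""
    result = 0
    while b:
        if b & 1:
            result ^= a      # adding over GF(2) is XOR
        a <<= 1
        b >>= 1
    return result

def poly_mod(a: int, m: int) -> int:
    """Remainder of a divided by m over GF(2)."""
    deg_m = m.bit_length() - 1
    while a.bit_length() - 1 >= deg_m:
        a ^= m << (a.bit_length() - 1 - deg_m)
    return a

X7_PLUS_1 = (1 << 7) | 1     # x^7 + 1
F1 = 0b11                    # 1 + x
G = 0b1011                   # 1 + x + x^3
F3 = 0b1101                  # 1 + x^2 + x^3

factorization_holds = poly_mul(poly_mul(F1, G), F3) == X7_PLUS_1

# Multiples of G by all message polynomials of degree at most 3.
codewords = {poly_mul(m, G) for m in range(16)}

# Cyclic: multiplying a codeword by x in R_7 gives a codeword again.
is_cyclic = all(poly_mod(c << 1, X7_PLUS_1) in codewords for c in codewords)
min_weight = min(bin(c).count("1") for c in codewords if c)

print(factorization_holds, len(codewords), is_cyclic, min_weight)
```

Because deg G = 3, message polynomials of degree at most 3 give products of degree at most 6, so no reduction modulo x^7 + 1 is needed when listing the codewords.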

Reed-Solomon Codes
Instead of using K = GF(2), we use K = GF(q) where q is a prime power. (It is well known that a finite field of order q exists.) So the codewords are vectors with coordinates from GF(q). The one used in CDs takes $q = 2^8$.
An $RS(2^r,d)$-code is a linear cyclic $(2^r-1,\ 2^r-d,\ d)$-code over GF(q) generated by $(x+b^{m+1})(x+b^{m+2})\cdots(x+b^{m+d-1})$, where $q = 2^r$, m is a nonnegative integer and b is a primitive element of GF(q).
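To make the generator polynomial concrete, here is a small sketch over GF(2^3) instead of the GF(2^8) used for CDs. It assumes the field is built from the polynomial x^3 + x + 1, takes b = x (stored as the integer 2) as the primitive element, and uses m = 0; all of these are illustrative choices rather than anything fixed by the slides.

```python
R = 3
PRIM = 0b1011   # x^3 + x + 1, assumed field polynomial for GF(2^3)

def gf_mul(a: int, b: int) -> int:
    """Multiply two elements of GF(2^R) stored as bit masks."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & (1 << R):
            a ^= PRIM        # reduce as soon as the degree reaches R
        b >>= 1
    return result

def gf_pow(a: int, e: int) -> int:
    result = 1
    for _ in range(e):
        result = gf_mul(result, a)
    return result

def poly_mul(f: list[int], g: list[int]) -> list[int]:
    """Multiply polynomials with GF(2^R) coefficients, lowest degree first."""
    out = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] ^= gf_mul(fi, gj)
    return out

def poly_eval(f: list[int], x: int) -> int:
    value = 0
    for coeff in reversed(f):    # Horner's rule
        value = gf_mul(value, x) ^ coeff
    return value

def rs_generator(d: int, m: int = 0, b: int = 2) -> list[int]:
    """Product of (x + b^i) for i = m+1, ..., m+d-1."""
    g = [1]
    for i in range(m + 1, m + d):
        g = poly_mul(g, [gf_pow(b, i), 1])
    return g

g = rs_generator(3)
n, k = 2**R - 1, 2**R - 1 - (len(g) - 1)
print(g, n, k)
```

With d = 3 the generator has degree 2, so the code has length 7 and dimension 5, matching the parameters (2^r - 1, 2^r - d, d) for r = 3.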

Design of Compact Discs (Key Contributions)
1948, C.E. Shannon publishes “A Mathematical Theory of Communication”.
1950, R.W. Hamming publishes his work on error detection/correction codes.
1958, Invention of the laser.
1960, Experiments with computer music begin.

Story, Continued
1960, I.S. Reed and G. Solomon construct Reed-Solomon codes.
1969, Klaas Compaan, a Dutch physicist, comes up with the idea for the compact disc.
1970, Compaan completes a glass disc prototype and decides to use a laser.
1978, Philips releases the video disc player, and the type of laser for CD players is selected.
1980, CD standard proposed by Philips and Sony.
1982, Philips and Sony both have products ready to go.

Keep Going
1983, 30,000 CD players and 800,000 CDs sold in the U.S.
1984, Portable CD players (Sony Discman) sold.
1985, CD-ROM drives hit the computer market.
1990, 9.2 million players sold in the U.S. alone and about one billion CDs sold worldwide.
1997, DVD released. DVD players and movies hit the consumer market.
Now, we cannot live without it.

A Brief Overview
Data storage in CD format is not simple. Typically, a user pictures the "1’s" and "0’s" in the memory of the computer as being directly transferred to "pits" and "bumps" on the CD. In fact, the incoming data is first subjected to a series of coding operations. These coding operations add a number of additional parity bits to the data for error detection and correction purposes. The data is also subjected to an interleaving process.

Concealment (隱藏)
Interpolation (添寫): In this technique, some “average” is constructed using the valid data around an error. This average is then substituted for the erroneous data. Since most music (with the possible exception of heavy metal!) is continuous, this method works well for concealing relatively short errors.
Muting (消音): Muting is a last-ditch technique, as it effectively creates a brief period of silence in the audio stream. However, it is not effective to simply set all the binary digits to zero, as this produces exactly the click that we are trying to avoid! Instead, the volume is faded out (淡出) and then back in again to conceal the error.

Error-Correcting Ability
CD players use parity and interleaving techniques to minimize the effects of an error on the disc. Theoretically, the combination of parity and interleaving in a CD player can detect and correct a burst error of up to 4000 bad bits -- or a physical defect 2.47 mm long. Interpolation can conceal errors of up to 13,700 bits, or physical defects up to 8.5 mm long. (Burst-error-correcting codes.)

EFM Modulation
EFM means Eight to Fourteen Modulation and is an incredibly clever way of reducing errors. The idea is to minimize the number of 0 to 1 and 1 to 0 transitions -- thus avoiding small pits. In EFM only those combinations of bits are used in which at least two and at most ten zeros appear between consecutive ones.
E.g. 0000 1010 → (EFM) → 10010001000000.
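The run-length condition on a 14-bit EFM word can be tested directly. The sketch below reads the rule as "at least two zeros between consecutive ones, and no run of zeros longer than ten"; applying the upper limit to leading and trailing runs as well is an assumption made here, and the function name is illustrative.

```python
def satisfies_efm_run_lengths(word: str) -> bool:
    """Check the zero-run constraints on a 14-bit channel word."""
    if len(word) != 14 or set(word) - {"0", "1"}:
        raise ValueError("expected a string of 14 binary digits")
    runs = word.split("1")           # runs of zeros: leading, interior..., trailing
    if any(len(run) > 10 for run in runs):
        return False                 # too many zeros in a row
    interior = runs[1:-1]            # runs enclosed between two ones
    return all(len(run) >= 2 for run in interior)

print(satisfies_efm_run_lengths("10010001000000"))   # the slide's example word
print(satisfies_efm_run_lengths("01010000000000"))   # ones separated by a single zero
```

Keeping ones at least three positions apart is what prevents very short pits, while the upper limit keeps transitions frequent enough for the player to stay synchronized.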

[Figure 2; figure labeled "pits"; Figure 4]

Encoding
The original musical signal is a waveform in time. A sample of this waveform is taken and "digitized" into two 16-bit words, one for the left channel and one for the right channel. For example, a single sample of the musical signal might look like:
L1 = 0111 0000 1010 1000
R1 = 1100 0111 1010 1000
Six samples (six of the left and six of the right, for a total of twelve words) are taken to form a frame such as L1 R1 L2 R2 L3 R3 L4 R4 L5 R5 L6 R6.

Sound Has $2^{16}$ Levels
The frame is then encoded in the form of 8-bit words. Each 16-bit audio word turns into two 8-bit words, its left half (l) and its right half (r), such as
L1l L1r R1l R1r L2l L2r R2l R2r L3l L3r R3l R3r
L4l L4r R4l R4r L5l L5r R5l R5r L6l L6r R6l R6r
This gives a grand total of 24 8-bit words. ((L,R) produces the stereo effect, and one second has 44,100 samples.)
The even words are then delayed by two blocks and the resulting "word" scrambled. This delay and scramble is the first part of the interleaving process.
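Splitting the 16-bit words of a frame into 8-bit words is simple bit manipulation. The sketch below assumes the left half of a word as written is its high-order byte; the function names and the repeated sample values are only for illustration.

```python
def split_sample(sample: int) -> tuple[int, int]:
    """Split a 16-bit word into its left (high) and right (low) 8-bit halves."""
    if not 0 <= sample < 2**16:
        raise ValueError("sample must fit in 16 bits")
    return (sample >> 8) & 0xFF, sample & 0xFF

def frame_bytes(left: list[int], right: list[int]) -> list[int]:
    """Order a frame as L1l L1r R1l R1r L2l L2r R2l R2r ..."""
    out: list[int] = []
    for l_word, r_word in zip(left, right):
        out.extend(split_sample(l_word))
        out.extend(split_sample(r_word))
    return out

L1 = 0b0111_0000_1010_1000
R1 = 0b1100_0111_1010_1000

# Six stereo samples per frame; the same values are reused six times here.
frame = frame_bytes([L1] * 6, [R1] * 6)
frames_per_second = 44_100 // 6

print(len(frame), [format(b, "08b") for b in frame[:4]], frames_per_second)
```

At 44,100 samples per second and six samples per frame, 7,350 such frames are produced every second.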

RS Codes Show Up!
Encoded by C(227):(28,24,5)-RS: The resulting 24-byte word (remember, it includes a two-block delay -- so some symbols in this word are from blocks two blocks behind) has 4 bytes of parity added. This particular parity is called "Q" parity. Parity errors found in this part of the algorithm are called C1 errors. More on the Q parity later.
4-frame delay interleaved: Now the resulting 24 + 4Q = 28-byte word is interleaved. Each of the 28 bytes is delayed by a different period. Each period is an integral multiple of 4 blocks. So the first byte might be delayed by 4 blocks, the second by 8 blocks, the third by 12 blocks, and so on. The interleaving spreads the word over a total of 28 x 4 = 112 blocks.

Another RS Code
Encoded by C(223):(32,28,5)-RS: The resulting 28-byte words are again subjected to a parity operation. This generates four more parity bytes, called P bytes, which are placed at the end of the 28-byte data word. The word is now a total of 28 + 4 = 32 bytes long. Parity errors found in this part of the algorithm are called C2 errors.
Finally, another odd-even delay is performed -- but this time the delay is just a single block. Both the P and Q parity bits are inverted (turning the "1’s" into "0’s" and vice versa) to assist data readout during muting.

EFM
A subcode of length 8 bits is then added to the front end of the word. The subcode specifies the total number of selections on the disc, their length, and so on.
Next, the data words are converted to EFM format. EFM means Eight to Fourteen Modulation and is an incredibly clever way of reducing errors. The idea is to minimize the number of 0 to 1 and 1 to 0 transitions -- thus avoiding small pits. In EFM only those combinations of bits are used in which at least two and at most ten zeros appear between consecutive ones.

Encode the Sound
Each frame finally has a 24-bit synchronization word attached to the very front end (just for completeness, the word is 100000000001000000000010), and each 14-bit symbol is then coupled to the next by three merging bits. So the final frame (which started at 6*16*2 = 192 data bits) now contains:
1 sync word: 24 bits
1 subcode signal: 14 bits
6*2*2*14 data bits: 336 bits (each 8-bit word becomes 14 bits)
8*14 parity bits: 112 bits
34*3 merge bits: 102 bits
GRAND TOTAL: 588 bits
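The bit budget of a frame can be recomputed from its parts. This is a small sketch using the figures from this slide; the variable names are illustrative.

```python
EFM_BITS = 14      # every 8-bit word becomes a 14-bit channel word
MERGE_BITS = 3     # merging bits attached to each group

sync_bits = 24
subcode_bits = 1 * EFM_BITS
data_bits = 6 * 2 * 2 * EFM_BITS      # 6 samples x 2 channels x 2 bytes
parity_bits = 8 * EFM_BITS            # 4 Q bytes + 4 P bytes

groups = 1 + 1 + 24 + 8               # sync, subcode, data bytes, parity bytes
merge_bits = groups * MERGE_BITS

total_bits = sync_bits + subcode_bits + data_bits + parity_bits + merge_bits
audio_bits = 6 * 16 * 2               # the original audio payload of the frame

print(total_bits, audio_bits, round(audio_bits / total_bits, 3))
```

Only 192 of the 588 channel bits carry audio; the rest is spent on synchronization, subcode, parity, modulation and merging.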

Keep Fighting!!!

Final Words
Exercise more, and your body will be healthy! Study more mathematics, and your mind will be sharp!
You are lucky!