An Algorithm for Construction of Error-Correcting Symmetrical Reversible Variable Length Codes Chia-Wei Lin, Ja-Ling Wu, Jun-Cheng Chen Presented by Jun-Cheng Chen 2004/09/24
Outline Introduction Notations and preliminaries The proposed algorithm Experimental results Conclusion
Introduction to RVLC (1/3)
Variable length codes (VLC) are of prime importance in the efficient transmission of digital signals.
– High compression efficiency
Other criteria may also be important in the application environment.
– Channel bit-error resilience, maximum codeword length limitation
Reversibility of variable length codes makes instantaneous decoding possible in both the forward and backward directions.
Introduction to RVLC (2/3)
A reversible variable length code (RVLC) must satisfy both the prefix-free and the suffix-free condition for instantaneous forward and backward decoding.
Table 0: Huffman code and reversible variable length codes (RVLCs) for the symbols A to E. Columns: Symbol, Probability, C1 (Huffman code), C2 (symmetrical RVLC); last row: average code length.
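The prefix-free and suffix-free conditions can be checked mechanically. The following is a minimal Python sketch, not taken from the slides: the function names are hypothetical, `symmetrical` is the initial codeword list (1, 00, 010, 0110) that appears in the algorithm slides, and `prefix_only` is an illustrative prefix code chosen here for contrast.

```python
def is_prefix_free(codewords):
    """True if no codeword is a prefix of a different codeword."""
    return not any(a != b and b.startswith(a)
                   for a in codewords for b in codewords)


def is_suffix_free(codewords):
    """True if no codeword is a suffix of a different codeword."""
    return not any(a != b and b.endswith(a)
                   for a in codewords for b in codewords)


def is_symmetrical(codewords):
    """True if every codeword reads the same in both directions."""
    return all(c == c[::-1] for c in codewords)


def is_rvlc(codewords):
    """An RVLC must be both prefix-free and suffix-free."""
    return is_prefix_free(codewords) and is_suffix_free(codewords)


# Illustrative prefix code (an assumption, not from the slides):
# "0" is a suffix of "10", so backward decoding is not instantaneous.
prefix_only = ["0", "10", "110", "111"]

# Initial codeword list used by the proposed algorithm.
symmetrical = ["1", "00", "010", "0110"]

print(is_rvlc(prefix_only))         # False
print(is_rvlc(symmetrical))         # True
print(is_symmetrical(symmetrical))  # True
```

For a symmetrical code the two checks coincide, since reversing every codeword leaves the code unchanged.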
Introduction to RVLC (3/3)
If a bit error occurs, a VLC propagates the bit error, and the data after the bit error becomes useless.
An RVLC, however, can be decoded backward, so the data after the bit error can be recovered.
[Figure: a VLC stream and an RVLC stream, each containing a bit error]
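Bidirectional decoding can be illustrated with a small Python sketch. It is not from the slides: the symbol names a to d and all function names are assumptions, and the codebook is the symmetrical list (1, 00, 010, 0110) from the algorithm slides. The sketch only shows that an error-free stream decodes to the same symbols from either end; it does not model error detection or recovery.

```python
# Hypothetical symbol-to-codeword assignment (symbols a-d are assumptions).
CODE = {"a": "1", "b": "00", "c": "010", "d": "0110"}
DECODE = {bits: sym for sym, bits in CODE.items()}


def encode(symbols):
    """Concatenate the codewords of the given symbols."""
    return "".join(CODE[s] for s in symbols)


def decode_forward(bits):
    """Instantaneous decoding: emit a symbol as soon as a codeword matches."""
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in DECODE:
            out.append(DECODE[current])
            current = ""
    return out


def decode_backward(bits):
    """Decode from the end of the stream.

    Every codeword is symmetrical, so the same table decodes the
    reversed stream; the symbols then come out in reverse order.
    """
    return decode_forward(bits[::-1])[::-1]


message = list("abcdab")
stream = encode(message)
print(stream)                               # 1000100110100
print(decode_forward(stream) == message)    # True
print(decode_backward(stream) == message)   # True
```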
Notations and preliminaries (1/3)
n source symbols
probabilities of source symbols
n codewords
length of a codeword
– codeword sequences
– Hamming distance
Notations and preliminaries (2/3)
d_b (minimum block distance)
Note: if a VLC does not have equal-length codewords, then its minimum block distance is undefined.
In the case of RVLCs: […] if d_b is undefined, […] if d_b is defined
Minimum Block Distance
– Find the minimum Hamming distance for each level.
– Find the minimum block distance, the smallest of the per-level block distances.
[Figure: codewords at three levels with block distances 2, 3, and 4; minimum block distance: 2]
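The per-level procedure can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the function names are hypothetical, a "level" is taken to be the codeword length, and the example codeword set was chosen here so that the per-level block distances come out as 2, 3, and 4; it is not necessarily the set drawn on the slide.

```python
from collections import defaultdict
from itertools import combinations


def hamming(a, b):
    """Hamming distance between two equal-length bit strings."""
    return sum(x != y for x, y in zip(a, b))


def block_distances(codewords):
    """Minimum Hamming distance per level (codeword length).

    Levels holding fewer than two codewords are skipped, because no
    pair of equal-length codewords exists there.
    """
    by_length = defaultdict(list)
    for c in codewords:
        by_length[len(c)].append(c)
    return {
        length: min(hamming(a, b) for a, b in combinations(group, 2))
        for length, group in by_length.items()
        if len(group) >= 2
    }


def min_block_distance(codewords):
    """Smallest per-level block distance, or None if it is undefined."""
    per_level = block_distances(codewords)
    return min(per_level.values()) if per_level else None


# Illustrative set (an assumption): two symmetrical codewords per level.
example = ["00", "11", "010", "101", "0110", "1001"]
print(block_distances(example))      # {2: 2, 3: 3, 4: 4}
print(min_block_distance(example))   # 2

# No two codewords share a length, so d_b is undefined.
print(min_block_distance(["1", "00", "010", "0110"]))  # None
```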
Notations and preliminaries (3/3)
R(c, CL_d, d) is the replacement of a codeword c in the codeword list CL_d.
– children(c) is the set of all first symmetrical codewords on the paths from the codeword c to the leaf codewords.
– CL_d is a codeword list in which the constraint d_b ≥ d holds for all its codewords.
– children(c, d) is a subset of the symmetrical children of a symmetrical codeword c in which the constraint d_b ≥ d holds for all the children in the subset.
– children(c, CL_d, d) is a subset of the symmetrical children of c such that the union of the subset and CL_d is still a codeword list in which the constraint d_b ≥ d holds for all its codewords.
Example: codeword '00'
[Figure: code tree with the nodes 00, 01, 000, and 0000]
children('00') = {000}
children('00', 2) = {000}
T-List: (1, 00, 010, 0110)
children('00', T-List, 2) = {}
R('00', T-List, 2) = {0000}
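The children(...) notation can be made concrete with a short Python sketch. It is a sketch under several assumptions, not the authors' implementation: the code tree is cut off at a hypothetical `max_len` (the slides do not state the tree depth), the subset in children(c, CL, d) is built greedily in order of increasing length, and the replacement R is not modelled because the slides do not spell out how it descends when no child qualifies.

```python
def hamming(a, b):
    """Hamming distance between two equal-length bit strings."""
    return sum(x != y for x, y in zip(a, b))


def children(c, max_len):
    """First symmetrical codewords on each path below c.

    max_len is an assumed depth limit for the code tree.
    """
    found = []
    stack = [c + "1", c + "0"]
    while stack:
        w = stack.pop()
        if len(w) > max_len:
            continue
        if w == w[::-1]:
            found.append(w)        # first palindrome on this path: stop here
        else:
            stack.extend([w + "1", w + "0"])
    return sorted(found, key=lambda w: (len(w), w))


def compatible_children(c, codeword_list, d, max_len):
    """Greedy take on children(c, CL, d): keep a child only if every
    equal-length codeword already kept or in the list is at distance >= d."""
    kept = []
    for w in children(c, max_len):
        same_length = [v for v in list(codeword_list) + kept
                       if len(v) == len(w)]
        if all(hamming(w, v) >= d for v in same_length):
            kept.append(w)
    return kept


t_list = ["1", "00", "010", "0110"]
print(children("00", 3))                        # ['000']
print(compatible_children("00", t_list, 2, 3))  # [] since 000 vs 010 differ in one bit
print(children("1", 5))                         # ['11', '101', '1001', '10001']
```

With `max_len=3` the sketch agrees with the example values children('00') = {000} and children('00', T-List, 2) = {}; a larger depth limit would also reach longer palindromes such as 00100.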
The Proposed Algorithm (1/4)
Goal: The free distance of the proposed RVLC is always greater than one, which can improve the symbol error rate relative to VLCs with free distance one.
The Proposed Algorithm (2/4)
The algorithm
– Step 1: Assign the initial codeword list ("1", "00", "010", "0110", …) to the target list, T-List.
– Step 2: If any codeword c in T-List satisfies the condition of replacement, replace the codeword c with the R(c, T-List, 2) that results in the smallest average codeword length. If the number of codewords in T-List is more than n, the number of source symbols, keep the first n codewords in T-List and discard the others. (Notice that the block distance is still greater than one after the codeword replacement.)
– Step 3: Repeat Step 2 until no codeword in T-List satisfies the condition of replacement.
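Two mechanical pieces of these steps, building the initial list and merging a replacement back into T-List, can be sketched in Python. This is a partial sketch only: the function names are hypothetical, the initial list is assumed to continue as 0, k ones, 0 (consistent with the listed codewords), ties between equal-length codewords are assumed to be broken lexicographically, and the "condition of replacement" is not modelled because the slides do not define it.

```python
def initial_list(n):
    """Initial codeword list "1", "00", "010", "0110", ... with n entries.

    Assumes the pattern continues as '0' + k ones + '0'.
    """
    return ["1"] + ["0" + "1" * k + "0" for k in range(n - 1)]


def replace_and_truncate(t_list, c, replacement, n):
    """Replace codeword c by the given replacement codewords, order the
    merged list by length (ties broken lexicographically, an assumption),
    and keep the first n codewords."""
    merged = [w for w in t_list if w != c]
    merged += [w for w in replacement if w not in t_list]
    merged.sort(key=lambda w: (len(w), w))
    return merged[:n]


t_list = initial_list(6)
print(t_list)
# ['1', '00', '010', '0110', '01110', '011110']

# Replacement codewords for c = "1", taken as given here.
new_list = replace_and_truncate(t_list, "1", ["11", "101", "1001", "10001"], 6)
print(new_list)
# ['00', '11', '010', '101', '0110', '1001']
```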
The Proposed Algorithm (3/4)
Six symbols with probabilities (0.286, 0.214, 0.143, 0.143, 0.071) are given.
Table 1: An example of the proposed algorithm, where P_U(U) denotes the probability of the source symbol U. Columns: U, P_U(U), After step 1, After step 2; one row per source symbol; last row: average codeword length.
The Proposed Algorithm (4/4)
Table 2: Some temporary results of the proposed algorithm.
T-List: (1, 00, 010, 0110, 01110, …)
Iteration #1. Columns: c, R(c, T-List, 2), Merge result, Average code length after replacement.
c = 1: R(c, T-List, 2) = (11, 101, 1001, 10001, …); merge result: (11, 010, 101, 0110, 1001, 01110, 10001, …)
Of the remaining rows only the codeword 00100 survives.
Others are 3.285, 2.856, and 2.856.
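The average codeword length used to compare candidate replacements is the probability-weighted sum of the codeword lengths. The Python sketch below evaluates it for the six-symbol example. It rests on assumptions that are not in the slides: only five probabilities are listed and they sum to 0.857, so the missing value is assumed to be 0.143; the candidate list after replacing "1" is assumed to be (00, 11, 010, 101, 0110, 1001); and the function name is hypothetical.

```python
def average_length(probabilities, codewords):
    """Probability-weighted mean codeword length."""
    return sum(p * len(w) for p, w in zip(probabilities, codewords))


# ASSUMPTION: the sixth probability (0.143) is inferred so that the
# probabilities sum to 1; the slide lists only five values.
P = [0.286, 0.214, 0.143, 0.143, 0.143, 0.071]

# Initial T-List for six symbols.
before = ["1", "00", "010", "0110", "01110", "011110"]

# ASSUMPTION: first six codewords after replacing "1" by 11, 101, 1001, ...
after = ["00", "11", "010", "101", "0110", "1001"]

print(round(average_length(P, before), 3))  # 2.856
print(round(average_length(P, after), 3))   # 2.714
```

Under these assumptions the unchanged initial list gives 2.856, a value that also appears among the averages quoted for Table 2, and the candidate with "1" replaced is shorter on average.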
Experimental Results (1/3)
The test corpora are from the Canterbury Corpus file set.
Table 3 lists the symmetrical RVLCs constructed for the English alphabet set by other algorithms and by the proposed algorithm.
Table 4 lists the results of compressing the Canterbury Corpus file set with other algorithms and with the proposed algorithm.
Experimental Results (2/3) Table 3: Symmetrical RVLCs for the English alphabet set
Experimental Results (3/3) Table 4: The average codeword lengths of various symmetrical RVLCs for the Canterbury Corpus file set
Conclusion (1/2)
The optimal design of RVLCs with good distance properties and compression efficiency is still an open problem.
The proposed algorithm is suitable for any VLC-involved application, where it can enhance the error-correcting capability of critical time-constrained applications.
It should be pointed out that symmetrical RVLCs with […] are not guaranteed to be obtained by the proposed algorithm.
Conclusion (2/2)
The major contribution of the proposed algorithm is that, for the same free distance, it yields more efficient symmetrical RVLCs than other known algorithms.
[Figure: candidate codewords by level, with the number of available candidate codewords at levels 5, 6, and 7 and the MSSL (Maximum Symmetrical Suffix Length) of the candidates; one candidate is marked as not symmetrical]