 2004 SDU Uniquely Decodable Code 1.Related Notions 2.Determining UDC 3.Kraft Inequality.


 2004, 2009 SDU 2 A Resulting Problem Given a coding scheme of the source symbols, how to verify whether it is uniquely decodable or not?

 2004, 2009 SDU 3 Related Notions alphabet:  = {0, 1, …,  -1} symbol or letter: an element of alphabet  word: a sequence of symbols of finite length Code: a collection of words on a specified alphabet codeword: a word in a code message: a sequence of codewords Uniquely decodable code C: every message can be uniquely decomposed into the codewords in C  {0, 10, 01} vs {0, 10, 11}

 2004, 2009 SDU 4 Related Notions prefix and suffix: if w = ps, then p is prefix of w and s is suffix of w empty word: a word with length 0 suffix word: a non-empty word t is called a suffix word if there exist two messages C 1 C 2 …C m and C 1 ’C 2 ’…C n ’ such that  C i, C j ’ are all codewords for 1  i  m, 1  j  n, and C 1  C 1 ’,  t is the suffix of C n ’,  C 1 C 2 …C m t = C 1 ’C 2 ’…C n ’.

 2004, 2009 SDU 5 A Key Lemma for Determining UDC Lemma. A code C is uniquely decodable if and only if each suffix word is not a codeword in C. Proof.  Suppose that a suffix word t is a codeword in C, according to the definition of suffix word, there exist two messages C 1 C 2 …C m and C 1 ’C 2 ’…C n ’ such that C 1  C 1 ’ and C 1 C 2 …C m t = C 1 ’C 2 ’…C n ’. Hence, there are two ways to decompose the message C 1 ’C 2 ’…C n ’, indicating that C is not uniquely decodable. A contradiction to that C is a UDC.

 2004, 2009 SDU 6 Proof  Suppose that C is not uniquely decodable, then there exists some message which can be decomposed in more than one ways. Let  be such a message of the least length,  = C 1 C 2 …C k = C 1 ’C 2 ’…C n ’, where C i (1  i  k), C j ’ (1  j  n) are all codewords, and C 1  C 1 ’. Without loss of generality, assume that C k is a suffix of C n ’, then C k is a suffix word. A contradiction to that each suffix word is not a codeword in C.

 2004, 2009 SDU 7 UDC Verification By the key lemma If we can generate all the suffix words of a code C  If none of suffix words is a codeword in C, then C is uniquely decodable.  If some suffix words are codewords, then C is not uniquely decodable. The following determining algorithm is directly from the key lemma.

 2004, 2009 SDU 8 The Determining Algorithm UDC-Verification(C) 1 T   2 for each pair of codeword C i, C j  C (i  j) do 3 if C i = C j, then return NO. (C is not uniquely decodable) 4 if there exists a word s such that C i s = C j or C i = C j s, then T  T  {s} 5 endfor 6 for each pair of suffix word t and codeword C k do 7 if t = C k, then return NO. (C is not uniquely decodable) 8 if there exists a word s such that ts = C k or C k s = t, then T  T  {s} 9 endfor 10 return YES. (C is uniquely decodable)

 2004, 2009 SDU 9 Correctness of Algorithm Theorem. The algorithm UDC-Verification correctly verifies whether a code C is uniquely decodable or not. Proof. we should prove: (1) Each word s put into T in Step 1.2 or Step 2.2 is a suffix word. (2) If the algorithm stops at Step 3, then the algorithm computes all the suffix words of code C and ensures that they are not codewords.

 2004, 2009 SDU 10 Proof (1). The word s put in T in Step 1.2 is obviously a suffix word. We next consider the word s put into T in Step 2.2. As t is a suffix word, there exist codewords C 1, C 2,…, C m and C 1 ’, C 2 ’, …, C n ’ such that C 1  C 1 ’ and C 1 C 2 …C m t = C 1 ’C 2 ’…C n ’. If ts = C k, then C 1 C 2 …C m C k = C 1 ’C 2 ’…C n ’s, indicating s is a suffix word. If C k s = t, then C 1 C 2 …C m C k s = C 1 ’C 2 ’…C n ’, indicating s is a suffix word.

 2004, 2009 SDU 11 (2). For each suffix word t of C, let m(t) = C 1 C 2 …C m be the shortest message satisfying C 1 C 2 …C m t = C 1 ’C 2 ’…C n ’ and t is the suffix of C n ’. Prove by induction on the length of m(t) that t can be generated by the algorithm. Basic Step: |m(t)| = 1, then n = m =1, so t is generated in Step 1.2. Inductive Step: Suppose every suffix word p with |m(p)| < |m(t)| had been generated by the algorithm, we now prove that t can also be generated by the algorithm. Because t is the suffix of C n ’, we have pt = C n ’, then C 1 C 2 …C m = C 1 ’C 2 ’…C n-1 ’p. Proof

 2004, 2009 SDU 12 Proof (i). If p = C m, then C m t = C n ’, t is generated in Step 1.2. (ii). If p is suffix of C m, according to C 1 C 2 …C m = C 1 ’C 2 ’…C n-1 ’p, p is a suffix word. For |m(p)| < |m(t)|, the inductive hypothesis indicates that p had been generated by the algorithm. So when applying suffix word p and codeword C n ’ in Step 2, Step 2.2 will put t into T since pt = C n ’. (iii). If C m is a suffix of p, then C m t is suffix of C n ’, then C m t is a suffix word for C 1 C 2 …C m t = C 1 ’C 2 ’…C n ’, and |m(C m t)|  |C 1 C 2 …C m-1 |, the inductive hypothesis indicates that C m t had been generated by the algorithm. So when applying suffix word C m t and codeword C m in Step 2, Step 2.2 will put t into T for C m t = C m t. suffix word

 2004, 2009 SDU 13 Time Complexity Analysis Suppose there are n codewords in C, and the length of the longest word is l, then  Step 1: O(n 2 l) comparisons  Step 2: Number of suffix words is at most O(nl), So O(n 2 l 2 ) comparisons and O(n 2 l 2 ) insertion of suffix words into T.  Totally, O(n 2 l 2 ).

 2004, 2009 SDU 14 Property of UDC—Kraft Inequality 1.Let C = {C 1, C 2, …, C n } be a uniquely decodable code on an alphabet of cardinality , let l i = |C i | for 1  i  n, then we have 2.Conversely, if a set of integers {l 1, l 2,..., l n } satisfies the Kraft inequality, then a prefix code C = {C 1, C 2, …, C n } can be found with codeword lengths {l 1, l 2,..., l n }. Note:  prefix code C = {C 1, C 2, …, C n } means that neither C i nor C j is a prefix of the other, for each pair of codewords C i and C j (i  j). Strictly, called prefix-free code  Prefix-free code is UDC {00, 10, 11, 100, 111} vs {00, 10, 11, 010, 011} Kraft Inequality

 2004, 2009 SDU 15 Proof of Property 1 (in text book page 246):  Let m be an arbitrary positive integer, then  For each of n m messages consisting of m codewords, there is a unique corresponding term in the above formula. Let N(m, j) be the number of messages of length j and consisting of m codewords. Then  C is uniquely decodable, there are no identical messages. So N(m, j)   j, We have  So, for any positive integer m > 0, there is,  So the Kraft Inequality Holds. length of the longest codeword in C

 2004, 2009 SDU 16 Proof of Property 2 Let 1 < 2 < … < m be m integers such that {l 1, l 2, …, l n } = { 1, 2, …, m } when ignoring repeats. Let k j is the number of l i ’s that equals to j. We should prove that, there exists a prefix code C such that the number of codewords in C with length j is k j. The Kraft Inequality becomes Prove by induction that: For each 1  r  m, there exists prefix code C r such that for any 1  j  r, the number of codewords in C r with length j is k j.

 2004, 2009 SDU 17 Proof of Property 2  Basic Step: r = 1, the above inequality means k 1  - 1  1, which is k 1   1. Obviously there exist  1 different words of length 1, we can arbitrarily select k 1 of them to form C 1.  Inductive Step: Suppose that C r exists for r < m, we prove that C r+1 exist for r +1  m. From, we have, which means Among the  r+1 different words with length r+1, there are codewords with length j in C r. So we can select k r+1 different words with length r+1, and the codewords in C r are not prefix of them. So we extend C r to C r+1.

 2004, 2009 SDU 18 Thanks for attention!