Basic Concepts of Encoding: Codes, Their Efficiency and Redundancy

Cost function

How can we measure the efficiency of encoding? The ability to correct errors is a great "plus", but we have seen that a code able to correct errors is necessarily redundant. Where is the equilibrium point between redundancy and efficiency? Is it possible to compress the input information by encoding? To evaluate the efficiency of encoding, we introduce a cost function. In these terms, efficient encoding means minimization of this function.

Cost function

Let $x_1, x_2, \ldots, x_N$ be the set of messages (words) that can be transmitted through the communication system, let $p_1, p_2, \ldots, p_N$ be their "prescribed" probabilities, and let $t_1, t_2, \ldots, t_N$ be their durations. The duration of a message may be considered as its cost. Thus, the simplest cost function is the average (statistical) cost per message:

$$\bar{C} = \sum_{k=1}^{N} p_k t_k.$$

Cost function

Suppose (without loss of generality) that all symbols in all messages have identical cost. Then the average cost per message becomes proportional to the average number of symbols per message; that is, the average cost per message equals the average length of the messages:

$$\bar{n} = \sum_{k=1}^{N} p_k n_k,$$

where $n_k$ is the number of symbols in the $k$-th message.
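As a small illustration (not part of the original slides), here is a Python sketch of the average-length computation; the four-message distribution and the word lengths are assumed values chosen only for the example.

```python
# Average (statistical) cost per message: C_bar = sum_k p_k * t_k.
# With equal symbol costs this reduces to the average message length:
# n_bar = sum_k p_k * n_k.

def average_length(probabilities, lengths):
    """Average number of symbols per message."""
    return sum(p * n for p, n in zip(probabilities, lengths))

# Assumed (hypothetical) example: four messages with dyadic probabilities
# and codeword lengths 1, 2, 3, 3.
probs = [0.5, 0.25, 0.125, 0.125]
lengths = [1, 2, 3, 3]
print(average_length(probs, lengths))  # -> 1.75 symbols per message
```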

Cost function and efficiency

An increase in transmission (representation) efficiency can be obtained by a method of encoding in which the statistical distribution of the symbols reduces the average word length. The efficiency of an encoding procedure can be defined if, and only if, we know the lowest possible bound of the average word length $\bar{n}$.

Efficiency

It was shown that the lower bound for the average word length is the ratio of the entropy of the original message ensemble to $\log D$, where $D$ is the number of symbols in the encoding alphabet:

$$\bar{n} \ge \frac{H(X)}{\log D}.$$

Here $\log D$ is the maximum possible information per symbol of the encoding alphabet.
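To make the bound concrete, here is a minimal sketch that evaluates $H(X)/\log D$ for an assumed source distribution (the distribution is illustrative, not taken from the slides).

```python
from math import log2

def entropy(probabilities):
    """Shannon entropy H(X) in bits per message."""
    return -sum(p * log2(p) for p in probabilities if p > 0)

# Assumed distribution, for illustration only.
probs = [0.5, 0.25, 0.125, 0.125]
H = entropy(probs)  # 1.75 bits per message

# Lower bound on the average word length for a D-ary encoding alphabet.
for D in (2, 4):
    print(f"D = {D}: n_bar >= {H / log2(D)}")  # 1.75 for D=2, 0.875 for D=4
```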

Efficiency

The efficiency of an encoding procedure is the ratio of the average information per symbol of the encoded language to the maximum possible average information per symbol:

$$\eta = \frac{H(X)/\bar{n}}{\log D} = \frac{H(X)}{\bar{n}\,\log D}.$$

Redundancy

The redundancy of the code is defined as the complement of its efficiency:

$$R = 1 - \eta = 1 - \frac{H(X)}{\bar{n}\,\log D}.$$
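The two definitions translate directly into code. The sketch below uses the same assumed distribution as before and a hypothetical uniform 2-bit binary code; both the distribution and the code lengths are assumptions made for the example.

```python
from math import log2

def entropy(probabilities):
    return -sum(p * log2(p) for p in probabilities if p > 0)

def efficiency(probabilities, lengths, D=2):
    """eta = H(X) / (n_bar * log D)."""
    n_bar = sum(p * n for p, n in zip(probabilities, lengths))
    return entropy(probabilities) / (n_bar * log2(D))

def redundancy(probabilities, lengths, D=2):
    """R = 1 - eta."""
    return 1 - efficiency(probabilities, lengths, D)

# Assumed distribution and a hypothetical uniform 2-bit binary code.
probs = [0.5, 0.25, 0.125, 0.125]
uniform_lengths = [2, 2, 2, 2]
print(efficiency(probs, uniform_lengths))  # 0.875
print(redundancy(probs, uniform_lengths))  # 0.125
```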

Efficiency and Redundancy

Example 1. Suppose we transmit four messages that are themselves the letters of the encoding alphabet, so each message is encoded by a single symbol.

Efficiency and Redundancy

Example 2. Let us encode the same set of four messages using the binary alphabet {0, 1}. For k = 1, 2, 3, 4, let $n_k$ be the length of the k-th encoding binary vector (word), i.e., the total number of 0's and 1's in it.

Efficiency and Redundancy

The average cost (the average length) per message is $\bar{n} = \sum_{k=1}^{4} p_k n_k$. For a uniform binary code of four messages every word has length 2, so $\bar{n} = 2$. Hence, direct uniform encoding of this particular set of messages does not improve the efficiency.
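A hedged sketch of Example 2, using hypothetical message labels and the same assumed probabilities as above: whatever the probabilities, a uniform binary code for four messages always gives an average length of 2.

```python
from math import log2

# Hypothetical uniform (fixed-length) binary code for four messages.
uniform_code = {'x1': '00', 'x2': '01', 'x3': '10', 'x4': '11'}
probs = {'x1': 0.5, 'x2': 0.25, 'x3': 0.125, 'x4': 0.125}  # assumed values

n_bar = sum(probs[m] * len(word) for m, word in uniform_code.items())
H = -sum(p * log2(p) for p in probs.values())
print(n_bar)      # 2.0, independent of the probabilities
print(H / n_bar)  # 0.875: efficiency below 1, so the code is redundant
```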

Efficiency and Redundancy

Example 3. Let us use a non-uniform code for the same example. Simple logic leads to the following idea: encode a frequent message by a shorter encoding vector (word) and a less frequent message by a longer one.

Efficiency and Redundancy

Let us use non-uniform encoding vectors, assigning the shortest words to the most probable messages, and then compute the resulting average length, as sketched below.
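This is a sketch of Example 3 under assumed codewords and probabilities; the actual vectors used on the slide may differ, but the computation is the same.

```python
from math import log2

# Assumed non-uniform prefix code matched to assumed dyadic probabilities:
# the most probable message gets the shortest codeword.
prefix_code = {'x1': '0', 'x2': '10', 'x3': '110', 'x4': '111'}
probs = {'x1': 0.5, 'x2': 0.25, 'x3': 0.125, 'x4': 0.125}

H = -sum(p * log2(p) for p in probs.values())                         # 1.75 bits
n_bar = sum(probs[m] * len(word) for m, word in prefix_code.items())  # 1.75
print(n_bar, H / n_bar)  # average length 1.75, efficiency 1.0
```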

Efficiency and Redundancy

The last example shows that it is possible to design a non-uniform code with 100% efficiency. Moreover, such a code compresses the input information: the average length of an encoding vector in the uniform code was 2, while in the non-uniform code just considered it is smaller, equal to the entropy of the source (which is exactly why the efficiency is 100%).

Decipherability and Irreducibility

The code of Example 3 has the following important property: any message composed from its encoding vectors, regardless of their number and order, can always be unambiguously decoded. This property of a code is referred to as unique decipherability.

Decipherability and Irreducibility

A sufficient condition for unique decipherability of a non-uniform code is that no encoding vector (word) can be obtained from another by appending letters to its end. This property is referred to as the prefix property, or irreducibility: no encoding vector is a prefix of a different encoding vector.
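Irreducibility is easy to check mechanically; here is a minimal sketch, applied to the assumed codewords from the earlier example.

```python
def is_irreducible(codewords):
    """True if no codeword is a prefix of a different codeword
    (the prefix property / irreducibility)."""
    return not any(a != b and b.startswith(a)
                   for a in codewords for b in codewords)

print(is_irreducible(['0', '10', '110', '111']))  # True: a prefix code
print(is_irreducible(['1', '10']))                # False: '1' is a prefix of '10'
```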

Decipherability and Irreducibility

The set of irreducible codes is a subset of the set of uniquely decipherable codes. If a code is irreducible, then it is certainly uniquely decipherable, but the converse is not true. For example, the code {(1), (10)} is uniquely decipherable but not irreducible: (1) is a prefix of (10).
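A short sketch of why the code {(1), (10)} is still uniquely decipherable even though it is not irreducible: every codeword begins with 1, so in any valid encoding a new codeword starts exactly at each 1.

```python
def decode(bitstring):
    """Decode a valid concatenation of codewords from {'1', '10'}:
    split the stream immediately before every '1'."""
    words, current = [], ''
    for bit in bitstring:
        if bit == '1' and current:
            words.append(current)
            current = ''
        current += bit
    words.append(current)
    return words

print(decode('1101101'))  # ['1', '10', '1', '10', '1'] -- a unique parsing
```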