Information Theory
Linawati, Electrical Engineering Department, Udayana University

Outline: Information Source, Measuring Information, Entropy, Source Coding, Designing Codes

Information Source
An information source has four characteristics:
- The number of symbols, n
- The symbols S1, S2, ..., Sn
- The probability of occurrence of each symbol, P(S1), P(S2), ..., P(Sn)
- The correlation between successive symbols
A source is memoryless if each symbol is independent of the previous ones. A message is a stream of symbols sent from the sender to the receiver.

Examples
Ex. 1: A source that sends binary information (streams of 0s and 1s), with each symbol equally probable and no correlation between symbols, can be modeled as a memoryless source with:
- n = 2
- Symbols: 0 and 1
- Probabilities: P(0) = 1/2 and P(1) = 1/2

Measuring Information
How much information does a message carry from the sender to the receiver?
Ex. 2: Imagine a person sitting in a room. Looking out the window, she can clearly see that the sun is shining. If at this moment she receives a call from a neighbor saying "It is now daytime", does this message contain any information?
Ex. 3: A person has bought a lottery ticket. A friend calls to tell her that she has won first prize. Does this message contain any information?

Examples (continued)
Ex. 2: It does not; the message contains no information, because she is already certain that it is daytime.
Ex. 3: It does. The message contains a lot of information, because the probability of winning first prize is very small.
Conclusion: The information content of a message is inversely proportional to the probability of occurrence of that message. If a message is very probable, it carries very little information; if it is very improbable, it carries a lot of information.

Symbol Information
To measure the information contained in a message, we first measure the information contained in each symbol:
I(S) = log2(1/P(S)) bits
Here "bits" is the unit of information, which is different from "bit" as in binary digit (a 0 or 1).
Ex. 5: Find the information content of each symbol when the source is binary (sending only 0 or 1, with equal probability).
Ex. 6: Find the information content of each symbol when the source sends four symbols with probabilities P(S1) = 1/8, P(S2) = 1/8, P(S3) = 1/4, and P(S4) = 1/2.

Examples (continued)
Ex. 5: With P(0) = P(1) = 1/2, the information content of each symbol is I(0) = I(1) = log2(1/(1/2)) = 1 bit.
Ex. 6: I(S1) = log2(8) = 3 bits, I(S2) = log2(8) = 3 bits, I(S3) = log2(4) = 2 bits, I(S4) = log2(2) = 1 bit.
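As a quick check of these numbers, here is a minimal Python sketch of the self-information formula I(S) = log2(1/P(S)); the function name is only illustrative.

    import math

    def symbol_information(p):
        """Self-information of a symbol with probability p, in bits."""
        return math.log2(1 / p)

    # Ex. 6: P(S1) = 1/8, P(S2) = 1/8, P(S3) = 1/4, P(S4) = 1/2
    for name, p in [("S1", 1/8), ("S2", 1/8), ("S3", 1/4), ("S4", 1/2)]:
        print(name, symbol_information(p))   # 3.0, 3.0, 2.0, 1.0 bits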

Ex. 6 (discussion): The symbols S1 and S2 are the least probable, so at the receiver each carries more information (3 bits) than S3 or S4. The symbol S3 is less probable than S4, so S3 carries more information than S4.
In general, the following relationships hold:
- If P(Si) = P(Sj), then I(Si) = I(Sj)
- If P(Si) < P(Sj), then I(Si) > I(Sj)
- If P(Si) = 1, then I(Si) = 0

Message Information
If the message comes from a memoryless source, each symbol is independent and the probability of receiving a message made of the symbols Si, Sj, Sk, ... (where i, j, and k can repeat) is
P(message) = P(Si) P(Sj) P(Sk) ...
Then the information content carried by the message is
I(message) = log2(1/P(message)) = I(Si) + I(Sj) + I(Sk) + ...

Example
Ex. 7: An equal-probability binary source sends an 8-bit message. What is the amount of information received?
I(message) = I(first bit) + I(second bit) + ... + I(eighth bit) = 8 x 1 bit = 8 bits
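A small sketch of this additivity for a memoryless source (the helper name is illustrative):

    import math

    def message_information(symbol_probs):
        """Information of a message = sum of the self-information of its symbols."""
        return sum(math.log2(1 / p) for p in symbol_probs)

    # Ex. 7: an 8-bit message from an equal-probability binary source
    print(message_information([0.5] * 8))   # 8.0 bits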

Entropy
The entropy H of a source is the average amount of information contained in its symbols:
H(Source) = P(S1) x I(S1) + P(S2) x I(S2) + ... + P(Sn) x I(Sn)
Example: What is the entropy of an equal-probability binary source?
H(Source) = P(0) x I(0) + P(1) x I(1) = 0.5 x 1 + 0.5 x 1 = 1 bit per symbol
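The entropy formula translates directly into code; a minimal sketch (the function name is illustrative):

    import math

    def entropy(probabilities):
        """Average information per symbol: H = sum of P(Si) * log2(1/P(Si)), in bits."""
        return sum(p * math.log2(1 / p) for p in probabilities if p > 0)

    print(entropy([0.5, 0.5]))              # 1.0 bit per symbol
    print(entropy([1/8, 1/8, 1/4, 1/2]))    # 1.75 bits per symbol (the source of Ex. 6)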

Maximum Entropy
For a source with n symbols, maximum entropy is achieved only if all the probabilities are the same. The value of this maximum is
Hmax = log2(n)
In other words, the entropy of every source has an upper limit defined by H(Source) <= log2(n).

Example
What is the maximum entropy of a binary source? Hmax = log2(2) = 1 bit

Source Coding
To send a message from a source to a destination, each symbol is normally coded into a sequence of binary digits; the result is called a code word. A code is a mapping from a set of symbols into a set of code words. For example, ASCII is a mapping of a set of 128 symbols into a set of 7-bit code words:
A -> 1000001
B -> 1000010
(set of symbols -> set of binary streams)

Fixed- and Variable-Length Codes
A code can be designed with all code words of the same length (a fixed-length code) or with different lengths (a variable-length code).
Examples:
- A fixed-length code: S1 -> 00; S2 -> 01; S3 -> 10; S4 -> 11
- A variable-length code: S1 -> 0; S2 -> 10; S3 -> 11; S4 -> 110

Distinct Codes
In a distinct code, each code word is different from every other code word.
Example: S1 -> 0; S2 -> 10; S3 -> 11; S4 -> 110
Uniquely Decodable Codes
A distinct code is uniquely decodable if every encoded message can be split into code words in only one way.
Not uniquely decodable: S1 -> 0; S2 -> 1; S3 -> 00; S4 -> 10, because 0010 can be decoded as S3 S4, S3 S2 S1, or S1 S1 S4.
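To see why 0010 is ambiguous, the brute-force sketch below enumerates every way a bit string can be split into code words (the function name is illustrative):

    def all_parses(message, codebook):
        """Return every way `message` can be split into code words of `codebook`."""
        if not message:
            return [[]]
        parses = []
        for symbol, word in codebook.items():
            if message.startswith(word):
                for rest in all_parses(message[len(word):], codebook):
                    parses.append([symbol] + rest)
        return parses

    code = {"S1": "0", "S2": "1", "S3": "00", "S4": "10"}
    print(all_parses("0010", code))
    # More than one parse (e.g. S3 S4, S3 S2 S1, S1 S1 S4), so the code is not uniquely decodable.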

Instantaneous Codes
A uniquely decodable code: S1 -> 0; S2 -> 01; S3 -> 011; S4 -> 0111. A 0 uniquely marks the beginning of a code word, so the stream can always be decoded, but the decoder must wait for the next 0 before it can recognize the previous code word.
A uniquely decodable code is instantaneously decodable if no code word is the prefix of any other code word.

Examples (continued)
A code word and its prefixes (note that each code word is also a prefix of itself):
S -> 01001; prefixes: 0, 01, 010, 0100, 01001
A uniquely decodable code that is instantaneously decodable: S1 -> 0; S2 -> 10; S3 -> 110; S4 -> 111
When the receiver receives a 0, it immediately knows that it is S1; no other code word starts with 0. When the receiver receives 10, it immediately knows that it is S2; no other code word starts with 10, and so on.
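The prefix condition is easy to test mechanically; a minimal sketch:

    def is_prefix_free(codewords):
        """True iff no code word is a prefix of another, i.e. the code is instantaneous."""
        for i, a in enumerate(codewords):
            for j, b in enumerate(codewords):
                if i != j and b.startswith(a):
                    return False
        return True

    print(is_prefix_free(["0", "10", "110", "111"]))    # True: instantaneous
    print(is_prefix_free(["0", "01", "011", "0111"]))   # False: decodable but not instantaneous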

Relationship between the different types of codes: instantaneous codes are a subset of uniquely decodable codes, which are a subset of distinct codes, which are a subset of all codes.

Average Code Length
L = L(S1) x P(S1) + L(S2) x P(S2) + ...
Example: Find the average length of the code S1 -> 0; S2 -> 10; S3 -> 110; S4 -> 111 with P(S1) = 1/2, P(S2) = 1/4, P(S3) = 1/8, P(S4) = 1/8.
Solution: L = 1 x 1/2 + 2 x 1/4 + 3 x 1/8 + 3 x 1/8 = 1.75 bits
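A one-line check of this calculation (the helper name is illustrative):

    def average_length(lengths, probabilities):
        """L = sum of L(Si) * P(Si)."""
        return sum(l * p for l, p in zip(lengths, probabilities))

    # Code words 0, 10, 110, 111 have lengths 1, 2, 3, 3
    print(average_length([1, 2, 3, 3], [1/2, 1/4, 1/8, 1/8]))   # 1.75 bits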

Code Efficiency
The code efficiency η is defined as the entropy of the source divided by the average length of the code: η = H(Source) / L.
Example: Find the efficiency of the code S1 -> 0; S2 -> 10; S3 -> 110; S4 -> 111 with P(S1) = 1/2, P(S2) = 1/4, P(S3) = 1/8, P(S4) = 1/8.
Solution: H(Source) = 1/2 x 1 + 1/4 x 2 + 1/8 x 3 + 1/8 x 3 = 1.75 bits and L = 1.75 bits, so η = 1.75 / 1.75 = 100%.
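Putting entropy and average length together gives the efficiency; a small sketch (function name is illustrative):

    import math

    def efficiency(codewords, probabilities):
        """Code efficiency = H(Source) / L."""
        H = sum(p * math.log2(1 / p) for p in probabilities if p > 0)
        L = sum(len(w) * p for w, p in zip(codewords, probabilities))
        return H / L

    print(efficiency(["0", "10", "110", "111"], [1/2, 1/4, 1/8, 1/8]))   # 1.0, i.e. 100%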

Designing Codes
Two examples of instantaneous codes: the Shannon-Fano code and the Huffman code.
Shannon-Fano Code
An instantaneous variable-length encoding method in which the more probable symbols are given shorter code words and the less probable ones are given longer code words. The design builds a binary tree from the top (top-down construction) following the steps below; a code sketch of the procedure appears after the list:
1. List the symbols in descending order of probability.
2. Divide the list into two sublists of equal (or nearly equal) probability. Assign 0 to the first sublist and 1 to the second.
3. Repeat step 2 for each sublist until no further division is possible.
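The following is a minimal Python sketch of this top-down procedure. It is not taken from the slides; the function name and the tie-breaking rule (used when two splits are equally balanced) are my own choices, and different tie-breaks can produce different but equally efficient codes.

    def shannon_fano(symbols):
        """Shannon-Fano code for a list of (symbol, probability) pairs."""
        symbols = sorted(symbols, key=lambda s: s[1], reverse=True)   # step 1
        codes = {}

        def split(group, prefix):
            if len(group) == 1:
                codes[group[0][0]] = prefix or "0"    # degenerate one-symbol source gets "0"
                return
            total = sum(p for _, p in group)
            best_diff, cut = float("inf"), 1
            for i in range(1, len(group)):            # step 2: find the most balanced split
                diff = abs(2 * sum(p for _, p in group[:i]) - total)
                if diff <= best_diff:
                    best_diff, cut = diff, i
            split(group[:cut], prefix + "0")          # 0 for the first sublist
            split(group[cut:], prefix + "1")          # 1 for the second sublist

        split(symbols, "")                            # step 3: recurse until single symbols
        return codes

    source = [("S1", 0.30), ("S2", 0.20), ("S3", 0.15), ("S4", 0.10),
              ("S5", 0.10), ("S6", 0.05), ("S7", 0.05), ("S8", 0.05)]
    print(shannon_fano(source))   # average code length works out to 2.75 bits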

Example of Shannon-Fano Encoding
Find the Shannon-Fano code words for the source with P(S1) = 0.3, P(S2) = 0.2, P(S3) = 0.15, P(S4) = 0.1, P(S5) = 0.1, P(S6) = 0.05, P(S7) = 0.05, P(S8) = 0.05.
Solution: the resulting code words are listed in the table below. Because each code word is assigned to a leaf of the tree, no code word is the prefix of any other, so the code is instantaneous.
Average length and efficiency: H(Source) ≈ 2.70 bits, L = 2.75 bits, η ≈ 98%.

Example of Shannon – Fano Encoding 0.30 S2 0.20 S3 0.15 S4 0.10 S5 S6 0.05 S7 S8 1 00 01 100 101 1100 1101 1110 1111

Huffman Encoding
An instantaneous variable-length encoding method in which the more probable symbols are given shorter code words and the less probable ones are given longer code words. The design builds a binary tree from the bottom (bottom-up construction); see the sketch after this list:
1. Combine the two least probable symbols (or subtrees) into a single node whose probability is their sum.
2. Repeat step 1 until only one node remains.
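Below is a minimal heap-based Python sketch of this bottom-up procedure (the function name and data layout are my own choices). Because of ties between equal probabilities, the individual code-word lengths it produces may differ from the tree in the example that follows, but the average length is the same 2.75 bits.

    import heapq
    import itertools

    def huffman(symbols):
        """Huffman code for a list of (symbol, probability) pairs."""
        tiebreak = itertools.count()               # keeps heapq from ever comparing the dicts
        heap = [(p, next(tiebreak), {name: ""}) for name, p in symbols]
        heapq.heapify(heap)
        while len(heap) > 1:
            p1, _, left = heapq.heappop(heap)      # two least probable subtrees
            p2, _, right = heapq.heappop(heap)
            merged = {s: "0" + c for s, c in left.items()}
            merged.update({s: "1" + c for s, c in right.items()})
            heapq.heappush(heap, (p1 + p2, next(tiebreak), merged))
        return heap[0][2]

    source = [("S1", 0.30), ("S2", 0.20), ("S3", 0.15), ("S4", 0.10),
              ("S5", 0.10), ("S6", 0.05), ("S7", 0.05), ("S8", 0.05)]
    print(huffman(source))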

Example of Huffman Encoding
Find the Huffman code words for the source with P(S1) = 0.3, P(S2) = 0.2, P(S3) = 0.15, P(S4) = 0.1, P(S5) = 0.1, P(S6) = 0.05, P(S7) = 0.05, P(S8) = 0.05.
Solution: the resulting code words are listed in the table below. Because each code word is assigned to a leaf of the tree, no code word is the prefix of any other, so the code is instantaneous.
Average length and efficiency: H(Source) ≈ 2.70 bits, L = 2.75 bits, η ≈ 98%.

Example Huffman encoding 1.00 1 0.30 0.20 0.15 0.10 0.05 S1 00 S2 10 S3 010 S4 110 S5 111 S6 0110 S7 01110 S8 01111 0.60 0.40 0.3 0.15 0.20 0.10