Introduction to Information Theory


Introduction to Information Theory Hsiao-feng Francis Lu Dept. of Comm Eng. National Chung-Cheng Univ.

Father of Digital Communication The roots of modern digital communication stem from the ground-breaking paper “A Mathematical Theory of Communication” by Claude Elwood Shannon in 1948.

Model of a Digital Communication System: Information Source (message, e.g. English symbols) → Coding (encoder, e.g. English to 0,1 sequence) → Communication Channel (can have noise or distortion) → Decoding (decoder, e.g. 0,1 sequence to English) → Destination

Communication Channel Includes

And even this…

Shannon’s Definition of Communication “The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point.” “Frequently the messages have meaning” “... [which is] irrelevant to the engineering problem.”

Shannon Wants to… Shannon wants to find a way to "reliably" transmit data through the channel at the "maximal" possible rate (Information Source → Coding → Communication Channel → Decoding → Destination). For example, maximizing the speed of ADSL at your home.

And he thought about this problem for a while… He later found a solution and published it in his 1948 paper.

In his 1948 paper he built a rich theory for the problem of reliable communication, now called "Information Theory" or "The Shannon Theory" in his honor.

Shannon's Vision: Data → Source Encoding → Channel Encoding → Channel → Channel Decoding → Source Decoding → User

Example: Disk Storage. Data → Zip → Add CRC → Channel → Verify CRC → Unzip → User

In terms of Information Theory Terminology: Zip = Source Encoding = Data Compression; Unzip = Source Decoding = Data Decompression; Add CRC = Channel Encoding = Error Protection; Verify CRC = Channel Decoding = Error Correction.
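For a concrete feel of the "Add CRC / Verify CRC" step, here is a minimal Python sketch (illustrative only, not from the slides; the helper names are made up) using the standard zlib.crc32 checksum. Note that a bare CRC detects errors; correcting them requires the stronger channel codes discussed later.

```python
import zlib

def add_crc(payload: bytes) -> bytes:
    """Channel encoding (error protection): append a CRC-32 checksum."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def verify_crc(frame: bytes) -> bool:
    """Channel decoding: recompute the CRC and compare with the received one."""
    payload, received_crc = frame[:-4], frame[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == received_crc

frame = add_crc(b"hello, channel")
print(verify_crc(frame))             # True: a clean frame passes
print(verify_crc(b"x" + frame[1:]))  # False: corruption is detected
```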

Example: VCD and DVD. Movie → MPEG Encoder → RS Encoding → CD/DVD → RS Decoding → MPEG Decoder → TV. RS stands for Reed-Solomon Code.

Example: Cellular Phone. Speech Encoding → CC Encoding → Channel (GSM/CDMA) → CC Decoding → Speech Decoding. CC stands for Convolutional Code.

Example: WLAN IEEE 802.11b. Data → Zip → CC Encoding → Channel (IEEE 802.11b) → CC Decoding → Unzip → User. CC stands for Convolutional Code.

Shannon Theory The original 1948 Shannon Theory contains: Measurement of Information, Source Coding Theory, and Channel Coding Theory.

Measurement of Information Shannon's first question is "How to measure information in terms of bits?" [slide images, each asking: = ? bits]

Or Lottery!? [slide image: = ? bits]

Or this… [slide images: = ? bits]

All events are probabilistic! Using Probability Theory, Shannon showed that there is only one way to measure information in terms of number of bits, called the entropy function: H(X) = -sum over x of p(x) log2 p(x).

For example, tossing a die: the outcomes are 1, 2, 3, 4, 5, 6, each occurring with probability 1/6. The information provided by tossing the die is H = log2(6) ≈ 2.585 bits.
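As a quick check (a sketch, not part of the original slides), the entropy of a fair die can be computed directly from the formula:

```python
import math

def entropy(probs):
    """H(X) = -sum of p(x) * log2 p(x), measured in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

fair_die = [1/6] * 6
print(entropy(fair_die))   # ~2.585 bits, i.e. log2(6)
```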

Wait! It is nonsense! The number 2.585 bits is not an integer!! What do you mean?

Shannon's First Source Coding Theorem Shannon showed: "To reliably store the information generated by some random source X, you need, on average, no more and no less than H(X) bits for each outcome."

Meaning: If I toss a die 1,000,000 times and record the value from each trial: 1,3,4,6,2,5,2,4,5,2,4,5,6,1,…. In principle, I need 3 bits for storing each outcome, since 3 bits cover the values 1-8, so I need 3,000,000 bits for storing the information. Using an ASCII representation, a computer needs 8 bits = 1 byte for storing each outcome, so the resulting file has size 8,000,000 bits.

But Shannon said: You only need 2.585 bits for storing each outcome. So the file can be compressed to size 2.585 x 1,000,000 = 2,585,000 bits. The optimal compression ratio is therefore 2,585,000 / 8,000,000 = 32.31%.
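The arithmetic behind these numbers can be verified with a few lines of Python (a sketch assuming the 8-bit ASCII baseline from the previous slide; the slide rounds 2,584,963 up to 2,585,000):

```python
import math

n_tosses = 1_000_000
entropy_bits = math.log2(6)                 # ~2.585 bits per die toss
shannon_size = entropy_bits * n_tosses      # ~2,585,000 bits
ascii_size = 8 * n_tosses                   # 8,000,000 bits

print(round(shannon_size))                  # 2584963
print(f"{shannon_size / ascii_size:.2%}")   # 32.31%
```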

Let's Do Some Test!
                  File Size         Compression Ratio
No Compression    8,000,000 bits    100%
Shannon           2,585,000 bits    32.31%
Winzip            2,930,736 bits    36.63%
WinRAR            2,859,336 bits    35.74%

The Winner is… Shannon: "I had mathematically claimed my victory 50 years ago!"

Follow-up Story Later, in 1952, David Huffman (1925-1999), while a graduate student at MIT, presented a systematic method to achieve the optimal compression ratio guaranteed by Shannon. The coding technique is therefore called "Huffman code" in honor of his achievement. Huffman codes are used in nearly every application that involves the compression and transmission of digital data, such as fax machines, modems, computer networks, and high-definition television (HDTV), to name a few.
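For illustration, here is a compact sketch of the standard Huffman construction (not code from the presentation) applied to the six equally likely die outcomes; its average codeword length lands just above the entropy bound, as expected for a symbol-by-symbol code:

```python
import heapq
from itertools import count

def huffman_code(freqs):
    """Return {symbol: bitstring}, built by repeatedly merging the two lightest subtrees."""
    tiebreak = count()                        # keeps heap entries comparable
    heap = [(w, next(tiebreak), {s: ""}) for s, w in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)       # two least probable subtrees...
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))  # ...are merged
    return heap[0][2]

code = huffman_code({str(face): 1/6 for face in range(1, 7)})
avg_len = sum(len(bits) for bits in code.values()) / 6
print(code)       # two 2-bit and four 3-bit codewords
print(avg_len)    # ~2.67 bits per outcome, just above the 2.585-bit entropy
```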

How? So far we have covered the source coding part (Data → Source Encoding … Source Decoding → User), but how about the channel part (Channel Encoding → Channel → Channel Decoding)?

The Simplest Case: Computer Network Communications over a computer network, e.g. the Internet. The major channel impairment here is packet loss.

Binary Erasure Channel Impairment like "packet loss" can be viewed as erasures. Data that are erased are lost during transmission… Input 0 is received as 0 with probability 1-p and erased with probability p; likewise, input 1 is received as 1 with probability 1-p and erased with probability p. Here p is the packet loss rate in this network.
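To make the channel concrete, a tiny simulation sketch (illustrative only; the function name is made up):

```python
import random

def binary_erasure_channel(bits, p):
    """Each transmitted symbol is erased (shown as '?') with probability p."""
    return ['?' if random.random() < p else b for b in bits]

sent = [0, 1, 0, 1, 0, 0]
print(binary_erasure_channel(sent, p=0.25))   # e.g. [0, '?', 0, '?', 0, 0]
```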

and Bob goes CRAZY! Once a binary symbol is erased, it cannot be recovered… Ex: Say Alice sends 0,1,0,1,0,0 to Bob, but the network was so poor that Bob only received 0,?,0,?,0,0. So Bob asked Alice to send again; only this time he received 0,?,?,1,0,0, and Bob goes CRAZY! What can Alice do? What if Alice sends 0000,1111,0000,1111,0000,0000, repeating each transmission four times?

What Good Can This Serve? Now Alice sends 0000,1111,0000,1111,0000,0000. The only case where Bob cannot read a symbol from Alice is, for example, ????,1111,0000,1111,0000,0000, where all four copies of that symbol are erased. But this happens with probability p^4.
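A short simulation sketch (assuming the 4x repetition scheme above) confirms that a block is lost only when all four copies are erased, so the residual loss rate is roughly p^4:

```python
import random

def repeat4(bits):
    return [b for b in bits for _ in range(4)]              # 0 -> 0000, 1 -> 1111

def erase(bits, p):
    return ['?' if random.random() < p else b for b in bits]

def decode4(received):
    decoded = []
    for i in range(0, len(received), 4):
        survivors = [b for b in received[i:i + 4] if b != '?']
        decoded.append(survivors[0] if survivors else '?')   # lost only if all 4 erased
    return decoded

p, n = 0.25, 100_000
data = [random.randint(0, 1) for _ in range(n)]
lost = sum(b == '?' for b in decode4(erase(repeat4(data), p))) / n
print(lost)   # empirically close to p**4 = 0.00390625
```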

Thus, if the original network has packet loss rate p = 0.25, then by repeating each symbol 4 times the resulting system has packet loss rate p^4 = 0.00390625. But if the data rate in the original network is 8M bits per second (Alice → Bob at 8 Mbps, p = 0.25), then with repetition Alice can only transmit at 8 Mbps / 4 = 2 Mbps, now with p^4 = 0.00390625.

Shannon challenged: Is repetition the best Alice can do?

And he thinks again…

Shannon’s Channel Coding Theorem Shannon answered: “Give me a channel and I can compute a quantity called capacity, C for that channel. Then reliable communication is possible only if your data rate stays below C.”

What does Shannon mean?

Shannon means: In this example (Alice → Bob at 8 Mbps, p = 0.25), he calculated the channel capacity C = 1 - p = 0.75, and there exists a coding scheme such that Alice can transmit reliably at 8 x (1 - p) = 6 Mbps, effectively as if p = 0.
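As a quick check of the numbers (a sketch using the C = 1 - p capacity formula for the binary erasure channel, with the 8 Mbps, p = 0.25 example above):

```python
def bec_capacity(p):
    """Capacity of the binary erasure channel: C = 1 - p bits per channel use."""
    return 1 - p

p, raw_rate_mbps = 0.25, 8
print(bec_capacity(p) * raw_rate_mbps)   # 6.0 Mbps is achievable in principle
print(raw_rate_mbps / 4)                 # 2.0 Mbps with the 4x repetition code
```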

Unfortunately… I do not know exactly HOW. Neither do we…

But With 50 Years of Hard Work We have discovered a lot of good codes: Hamming codes, Convolutional codes, Concatenated codes, Low-density parity-check (LDPC) codes, Reed-Muller codes, Reed-Solomon codes, BCH codes, Finite Geometry codes, Cyclic codes, Golay codes, Goppa codes, Algebraic Geometry codes, Turbo codes, Zig-Zag codes, Accumulate codes and Product-accumulate codes, … We now come very close to the dream Shannon had 50 years ago!

Nowadays… The Source Coding Theorem has been applied to: Image Compression, Audio/Video Compression (MPEG), Data Compression, Audio Compression (MP3). The Channel Coding Theorem has been applied to: VCD/DVD (Reed-Solomon Codes), Wireless Communication (Convolutional Codes), Optical Communication (Reed-Solomon Codes), Computer Network (LT codes, Raptor Codes), Space Communication.

Shannon Theory also Enables Space Communication In 1965, Mariner 4: Frequency = 2.3 GHz (S Band), Data Rate = 8.33 bps, No Source Coding, Repetition code (2x). In 2004, Mars Exploration Rovers: Frequency = 8.4 GHz (X Band), Data Rate = 168 Kbps, 12:1 lossy ICER compression, Concatenated Code.

In 2006, the Mars Reconnaissance Orbiter communicates faster still: Frequency = 8.4 GHz (X Band), Data Rate = 12 Mbps, 2:1 lossless FELICS compression, (8920, 1/6) Turbo Code, at a distance of 2.15 x 10^8 km.

And Information Theory has been Applied to All kinds of Communications, the Stock Market, Economics, Game Theory and Gambling, Quantum Physics, Cryptography, Biology and Genetics, and many more…