Richard W. Hamming, Learning to Learn: The Art of Doing Science and Engineering. Session 10: Coding Theory I

Presentation transcript:

Richard W. Hamming, Learning to Learn: The Art of Doing Science and Engineering. Session 10: Coding Theory I

Coding Theory Overview
The problem: representation of information.
The standard model of information transmission, and its analogy to human communication.
Representation by codes: what kind of code do we want?
What is information?

Questions
What is information? Can we distinguish it from white noise? Hamming: “No.”
How do we communicate ideas? Is the idea contained in the words?
How do machines differ from humans with respect to information?

Information Theory
Developed by C. E. Shannon at Bell Labs.
Should it have been called “Communication Theory”?
Two aspects: source encoding and channel encoding.

Information Theory
Nothing is assumed about the source.
Communication over space is no different from communication over time.
Noise is different from error: in physical theories, noise is not assumed to be part of the model; error belongs to experiments and measurements.

Human Information Theory
All you know is what comes out of the channel (hearing, reading, etc.). How do you know what went in?
Design the encoder so that what went in can be inferred with high probability.
Analogy to human communication: thought – words spoken – words heard – understanding, i.e., source – encoding – decoding – sink.

Why Information Theory?
Adding one more bit of precision to equipment is expensive; it is too expensive to do everything exactly right.
Coding theory offers an alternative: escape the necessity of doing things exactly right, and allow room for error without paying for more precision.

What codes should we use?
Assume variable-length codes; block codes (all symbols the same length) are a special case.
Desired properties:
Unique decodability – each coded message represents a single source message when no noise is added.
Instantaneous decodability – each received bit need be looked at only once.

Non-unique code
Consider, for example, the four-symbol code s1 = 0, s2 = 00, s3 = 01, s4 = 11. If you receive 0011, is it s1 s1 s4 or s2 s4? The message cannot be decoded uniquely.
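A small sketch of the ambiguity in Python. The codeword assignment is the illustrative one assumed above, not necessarily the values on the original slide; the point is only that a single received string admits two parses.

```python
# Enumerate every parse of a received string under a non-prefix code.
# Codeword values are assumed for illustration.

CODE = {"s1": "0", "s2": "00", "s3": "01", "s4": "11"}

def all_parses(bits, code=CODE):
    """Return every way 'bits' splits into a sequence of codewords."""
    if bits == "":
        return [[]]
    parses = []
    for name, word in code.items():
        if bits.startswith(word):
            for rest in all_parses(bits[len(word):], code):
                parses.append([name] + rest)
    return parses

print(all_parses("0011"))
# [['s1', 's1', 's4'], ['s2', 's4']] -- two valid decodings,
# so the code is not uniquely decodable.
```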

Decoding Trees
A decoding tree represents a uniquely and instantaneously decodable code: no codeword is a prefix of another.
Practical advice: include an exit symbol so you know when the entire decoding process is done – like quotation marks in human language.
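The prefix condition can be checked mechanically; a minimal sketch, with codeword values assumed for illustration:

```python
# Check the property behind a decoding tree: no codeword is a prefix
# of any other codeword.

def is_prefix_free(codewords):
    """True if no codeword is a prefix of a different codeword."""
    for i, a in enumerate(codewords):
        for j, b in enumerate(codewords):
            if i != j and b.startswith(a):
                return False
    return True

print(is_prefix_free(["0", "10", "110", "111"]))  # True  -> instantaneously decodable
print(is_prefix_free(["0", "00", "01", "11"]))    # False -> "0" is a prefix of "00"
```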

Decoding Tree
Decoding tree for a four-symbol prefix code s1, s2, s3, s4 (tree diagram not reproduced here).
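A decoding tree can be represented directly as a nested dictionary and walked one bit at a time. The prefix code below (s1 = 0, s2 = 10, s3 = 110, s4 = 111) is an assumed example, not necessarily the one drawn on the slide:

```python
# Walk a decoding tree bit by bit: an instantaneous decoder looks at
# each received bit exactly once.

CODE = {"s1": "0", "s2": "10", "s3": "110", "s4": "111"}

def build_tree(code):
    """Build a binary trie whose leaves hold symbol names."""
    root = {}
    for name, word in code.items():
        node = root
        for bit in word:
            node = node.setdefault(bit, {})
        node["symbol"] = name
    return root

def decode(bits, tree):
    """Decode a bit string by walking the tree, restarting at each leaf."""
    out, node = [], tree
    for bit in bits:
        node = node[bit]
        if "symbol" in node:          # reached a leaf: emit and restart
            out.append(node["symbol"])
            node = tree
    return out

tree = build_tree(CODE)
print(decode("010110111", tree))      # ['s1', 's2', 's3', 's4']
```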

Non-Instantaneous Code
Consider the four-symbol code s1 = 0, s2 = 01, s3 = 011, s4 = 111. If you receive a string ending … 111, you have to count back by groups of 111 to determine whether the first symbol is 0, 01, or 011, so you look at some bits more than once. The code is uniquely decodable, but it is not instantaneous.
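One concrete way to see the difficulty: this code can still be decoded, but only by working from the end of the received string, since the reversed codewords (0, 10, 110, 111) do form a prefix code. The backward-scanning sketch below is one illustration, not a method given in the lecture:

```python
# The code s1=0, s2=01, s3=011, s4=111 is uniquely decodable but not
# instantaneous: scanned left to right, a 0 cannot be resolved until
# the run of 1s after it has been counted.  Scanned right to left,
# the reversed codewords (0, 10, 110, 111) form a prefix code, so the
# same string decodes instantaneously backwards.

CODE = {"s1": "0", "s2": "01", "s3": "011", "s4": "111"}

def decode_from_the_end(bits):
    reversed_code = {word[::-1]: name for name, word in CODE.items()}
    out, buf = [], ""
    for bit in reversed(bits):
        buf += bit
        if buf in reversed_code:      # a reversed codeword is complete
            out.append(reversed_code[buf])
            buf = ""
    return list(reversed(out))

print(decode_from_the_end("0111111"))  # 0 then six 1s  -> ['s1', 's4', 's4']
print(decode_from_the_end("011111"))   # 0 then five 1s -> ['s3', 's4']
print(decode_from_the_end("01111"))    # 0 then four 1s -> ['s2', 's4']
```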

“Goodness” of Codes
McMillan’s theorem: if a code is uniquely decodable, then an instantaneous code with the same codeword lengths exists, so there is no need to settle for a code without both properties.
Average code length is a reasonable measure of “goodness.”
Good code design takes the symbol probabilities into account.

Average Code Length
Given a source alphabet of q symbols, where symbol s_i has codeword length l_i and probability p_i, the average code length is
L = p_1 l_1 + p_2 l_2 + … + p_q l_q = Σ_i p_i l_i.
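A one-line computation; the probabilities and lengths below are assumed values for illustration:

```python
# Average code length L = sum of p_i * l_i over the source alphabet.

def average_length(probabilities, lengths):
    """Average codeword length in bits per source symbol."""
    assert abs(sum(probabilities) - 1.0) < 1e-9, "probabilities must sum to 1"
    return sum(p * l for p, l in zip(probabilities, lengths))

# Example: the prefix code 0, 10, 110, 111 with assumed probabilities.
p = [0.5, 0.25, 0.125, 0.125]
l = [1, 2, 3, 3]
print(average_length(p, l))   # 1.75 bits per symbol
```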

Kraft Inequality
For a binary instantaneous code with codeword lengths l_1, …, l_q,
K = Σ_i 2^(−l_i) ≤ 1.
The inequality holds for uniquely decodable codes as well (McMillan’s theorem). For instantaneous codes it is proved by induction on the branch lengths of the decoding tree.
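The sum is easy to compute; the codeword lengths below are assumed examples:

```python
# Kraft sum K = sum of 2**(-l_i) over the codeword lengths.
# K <= 1 for every instantaneous (and, by McMillan's theorem, every
# uniquely decodable) binary code.

def kraft_sum(lengths):
    return sum(2.0 ** -l for l in lengths)

print(kraft_sum([1, 2, 3, 3]))  # 1.0  -> the tree is full; no codeword can be shortened
print(kraft_sum([2, 2, 3, 3]))  # 0.75 -> slack; a codeword could be shortened or one added
print(kraft_sum([1, 1, 2]))     # 1.25 -> no instantaneous binary code has these lengths
```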

Proof of McMillan’s Theorem (again)
Claim: given a uniquely decodable code, the Kraft inequality holds, so an instantaneous code with the same codeword lengths exists.
Raise the Kraft sum to the n-th power: K^n = (Σ_i 2^(−l_i))^n = Σ_k N_k 2^(−k), where N_k is the number of ways n codewords can combine into a string of total length k, and k runs from n up to nl (l being the longest codeword length).
Unique decodability requires N_k ≤ 2^k, so K^n ≤ nl − n + 1.
Now prove by contradiction: assume K > 1. For some large n, the exponential K^n exceeds the linear function f(n) = nl − n + 1, a contradiction. Hence K ≤ 1.
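A numeric illustration of the contradiction step, with codeword lengths assumed for the example:

```python
# If the Kraft sum K were greater than 1, K**n (exponential in n)
# would eventually exceed n*l - n + 1 (linear in n), which unique
# decodability forbids.

lengths = [1, 1, 2]                  # K = 0.5 + 0.5 + 0.25 = 1.25 > 1
K = sum(2.0 ** -l for l in lengths)
l_max = max(lengths)

for n in (1, 5, 10, 20):
    print(n, K ** n, n * l_max - n + 1)
# By n = 20, K**n is about 86.7 while the bound is only 21, so no
# uniquely decodable code can have these codeword lengths.
```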

Intuitive Conclusions
If the Kraft sum is strictly less than one, you can add another codeword or shorten an existing one. If it is exactly one, the code cannot be tightened further – it is the best you can do.
Thus we have another measure of a good code.

What is Information?
Words don’t necessarily “contain” the idea. Evidence: Hamming explains an idea in lecture; if you go home and explain it to your spouse, you will use different words. Words are an encoding. We don’t really know what the idea behind the words is, or whether there is a single definite idea.
Is information like time? St. Augustine: I know what time is until you ask me.

Human Information Theory
Information theory assumes symbols have a consistent meaning; human words do not have a fixed meaning, and this lack of uniqueness is a challenge. Accounts of the same event vary from witness to witness.
Once you encode an idea into words, that is what you believe.
Coding theory is great for machines, not for human language.

Parting Thoughts
Return to the AI question: how much of what we do (or want done) will machines do? It is really asking, “What can we program?” In order to program, we need to understand our own ideas.
The greatest contributions of mathematicians are often definitions.