A Brief Introduction to Information Theory

CS 4953 The Hidden Art of Steganography, Summer 2004

Information theory is a branch of science that deals with the analysis of a communications system. We will study digital communications, using a file (or network protocol) as the channel. Claude Shannon published a landmark paper in 1948 that was the beginning of the branch of information theory. We are interested in communicating information from a source to a destination:

Source of Message → Encoder → Channel (where NOISE is added) → Decoder → Destination of Message

In our case, the messages will be a sequence of binary digits. Does anyone know the term for a binary digit? One detail that makes communicating difficult is noise: noise introduces uncertainty. Suppose I wish to transmit one bit of information; what are all of the possibilities?

tx 0, rx 0 - good
tx 0, rx 1 - error
tx 1, rx 0 - error
tx 1, rx 1 - good

Two of the cases above have errors; this is where probability fits into the picture. In the case of steganography, the "noise" may be due to attacks on the hiding algorithm.
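As a rough illustration of how noise corrupts transmitted bits, here is a minimal Python sketch (not part of the original slides); the function name noisy_channel and the 5% flip probability are assumptions chosen only for the example.

```python
import random

def noisy_channel(bits, p_flip=0.05, seed=1):
    """Binary symmetric channel: each bit is flipped (received in error)
    with probability p_flip, otherwise received correctly."""
    rng = random.Random(seed)
    return [b ^ 1 if rng.random() < p_flip else b for b in bits]

tx = [0, 1, 1, 0] * 250                  # 1000 transmitted bits
rx = noisy_channel(tx)                   # received bits, some corrupted
errors = sum(t != r for t, r in zip(tx, rx))
print(f"{errors} of {len(tx)} bits received in error")
```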

Claude Shannon introduced the idea of self-information. Suppose we have an event X, where Xi represents a particular outcome of the event, occurring with probability Pi. The self-information of that outcome is i(Xi) = lg(1/Pi) = −lg Pi bits, where lg denotes the base-2 logarithm. Consider flipping a fair coin; there are two equiprobable outcomes: say X0 = heads, P0 = 1/2, and X1 = tails, P1 = 1/2. The amount of self-information for any single result is 1 bit. In other words, the number of bits required to communicate the result of the event is 1 bit.
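A minimal sketch of the self-information calculation just described (the helper name self_information is mine, not from the slides); it reproduces the fair-coin result of 1 bit.

```python
import math

def self_information(p):
    """Self-information, in bits, of an outcome with probability p: lg(1/p)."""
    return math.log2(1.0 / p)

print(self_information(0.5))   # fair coin: 1.0 bit for heads (or tails)
```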

When outcomes are equally likely, there is a lot of information in the result. The higher the likelihood of a particular outcome, the less information that outcome conveys. For example, if the coin is biased such that it lands with heads up 99% of the time, there is not much information conveyed when we flip the coin and it lands on heads.

Suppose we have an event X, where Xi represents a particular outcome of the event. Consider flipping a coin, but now let's say there are 3 possible outcomes: heads (P = 0.49), tails (P = 0.49), and lands on its side (P = 0.02), a probability likely MUCH higher than in reality. Note: the total probability MUST ALWAYS add up to one. The amount of self-information for either a head or a tail is lg(1/0.49) ≈ 1.03 bits; for landing on its side it is lg(1/0.02) ≈ 5.64 bits.
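These figures can be checked numerically with a quick sketch (illustrative only, not from the original deck):

```python
import math

for outcome, p in [("heads", 0.49), ("tails", 0.49), ("side", 0.02)]:
    print(f"{outcome}: {math.log2(1.0 / p):.2f} bits")
# heads: 1.03 bits, tails: 1.03 bits, side: 5.64 bits
```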

Entropy is the measurement of the average uncertainty of information. We will skip the proofs and background that lead to the formula for entropy; it was derived from a set of required properties. Also, keep in mind that this is a simplified explanation.

H(X) = Σ Pi lg(1/Pi) = −Σ Pi lg Pi, summed over i = 0, 1, …, n−1

where H is the entropy, Pi is the probability of outcome Xi, and X is a random variable with a discrete set of possible outcomes (X0, X1, X2, …, Xn−1), n being the total number of possibilities.
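A short sketch of the entropy formula above (the helper name entropy is an assumption; the probabilities are assumed to sum to one). The `if p > 0` guard follows the usual convention that 0 * lg(1/0) is taken to be 0.

```python
import math

def entropy(probs):
    """Shannon entropy in bits: H = sum over outcomes of P * lg(1/P)."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))   # fair coin: 1.0 bit
```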

Entropy is greatest when the probabilities of the outcomes are equal. Let's consider our fair coin experiment again. The entropy H = ½ lg 2 + ½ lg 2 = 1. Since each outcome has self-information of 1 bit, the average over the 2 outcomes is (1+1)/2 = 1. Now consider a biased coin with P(H) = 0.98 and P(T) = 0.02:

H = 0.98 * lg(1/0.98) + 0.02 * lg(1/0.02)
  = 0.98 * 0.0291 + 0.02 * 5.643
  = 0.0285 + 0.1129
  = 0.1414 bits
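The fair-coin and biased-coin values above can be verified with the same formula (an illustrative sketch that repeats the entropy helper so it runs on its own):

```python
import math

def entropy(probs):
    # same helper as in the previous sketch
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))     # 1.0
print(entropy([0.98, 0.02]))   # about 0.1414
```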

In general, we must estimate the entropy. The estimate depends on our assumptions about the structure (read: pattern) of the source of information. Consider the following sequence:

1 2 3 2 3 4 5 4 5 6 7 8 9 8 9 10

Obtaining the probabilities from the sequence: there are 16 digits; 1, 6, 7, and 10 each appear once, and the rest appear twice. The entropy is H = 3.25 bits. Since there are 16 symbols, we theoretically would need 16 * 3.25 = 52 bits to transmit the information.
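The 3.25-bit estimate can be reproduced by taking probabilities from the symbol counts (an illustrative sketch; the use of collections.Counter is my choice, not from the slides):

```python
import math
from collections import Counter

seq = [1, 2, 3, 2, 3, 4, 5, 4, 5, 6, 7, 8, 9, 8, 9, 10]
counts = Counter(seq)
n = len(seq)
H = sum((c / n) * math.log2(n / c) for c in counts.values())
print(H, n * H)   # 3.25 bits per symbol, 52.0 bits total
```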

Consider the following sequence:

1 2 1 2 4 4 1 2 4 4 4 4 4 4 1 2 4 4 4 4 4 4

Obtaining the probabilities from the sequence: 1 and 2 each appear four times (4/22 each), and 4 appears fourteen times (14/22). The entropy is H = 0.447 + 0.447 + 0.415 = 1.309 bits. Since there are 22 symbols, we theoretically would need 22 * 1.309 = 28.798 (29) bits to transmit the information. However, look at the two-symbol blocks 12 and 44: 12 appears 4/11 of the time and 44 appears 7/11 of the time, so H = 0.530 + 0.415 = 0.945 bits, and we need only 11 * 0.945 = 10.395 (11) bits to transmit the information (roughly 36% of the original 29 bits, a saving of about 64%). By searching for patterns, we might be able to find a representation with even less entropy.
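Repeating the estimate over two-symbol blocks reproduces the 0.945-bit figure and the roughly 11-bit total (again only an illustrative sketch):

```python
import math
from collections import Counter

seq = [1, 2, 1, 2, 4, 4, 1, 2, 4, 4, 4, 4, 4, 4, 1, 2, 4, 4, 4, 4, 4, 4]
pairs = [(seq[i], seq[i + 1]) for i in range(0, len(seq), 2)]
counts = Counter(pairs)              # {(1, 2): 4, (4, 4): 7}
n = len(pairs)                       # 11 two-symbol blocks
H = sum((c / n) * math.log2(n / c) for c in counts.values())
print(H, n * H)   # about 0.946 bits per block, about 10.4 bits total
```

Grouping symbols into blocks is exactly the kind of pattern exploitation the slide suggests: a better model of the source's structure lowers the estimated entropy and hence the number of bits needed.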