EE465: Introduction to Digital Image Processing

Slide 1: One-Minute Survey Result
- Thank you for your responses: Kristen, Anusha, Ian, Christofer, Bernard, Greg, Michael, Shalini, Brian and Justin.
- Valentine's challenge: Min: minutes, Max: 5 hours, Ave: 2-3 hours.
- Muddiest points: regular tree grammar (CS410: Compilers or CS422: Automata); fractal geometry ("The Fractal Geometry of Nature" by Mandelbrot).
- Seeing the connection: remember the first story in Steve Jobs' speech "Staying Hungry, Staying Foolish"? In addition to Jobs and Shannon, I have two more examples: Charles Darwin and Bruce Lee.

Slide 2: Data Compression Basics
- Discrete source: information = uncertainty; quantification of uncertainty; source entropy.
- Variable length codes: motivation; prefix condition; Huffman coding algorithm.

Slide 3: Information
- What do we mean by information? "A numerical measure of the uncertainty of an experimental outcome" (Webster's Dictionary).
- How can we quantitatively measure and represent information? Shannon proposed an approach inspired by statistical mechanics.
- Let us first look at how we assess the amount of information in our daily lives using common sense.

Slide 4: Information = Uncertainty
- Zero information: the Pittsburgh Steelers won Super Bowl XL (past news, no uncertainty); Yao Ming plays for the Houston Rockets (celebrity fact, no uncertainty).
- Little information: it will be very cold in Chicago tomorrow (not much uncertainty, since it is wintertime); it is going to rain in Seattle next week (not much uncertainty, since it rains nine months a year in the Northwest).
- Large information: an earthquake is going to hit CA in July 2006 (are you sure? an unlikely event); someone has shown P=NP (Wow! Really? Who did it?).

Slide 5: Shannon's Picture on Communication (1948)
source → source encoder → channel encoder → channel → channel decoder → source decoder → destination
(the channel encoder, channel, and channel decoder together form the "super-channel" seen by the source encoder/decoder pair)
- Examples of sources: human speech, photos, text messages, computer programs, ...
- Examples of channels: storage media, telephone lines, wireless transmission, ...
The goal of communication is to move information from here to there and from now to then.

Slide 6: Source-Channel Separation Principle*
- The role of source coding (data compression): facilitate storage and transmission by eliminating source redundancy. Our goal is to remove as much source redundancy as possible by intelligently designing the source encoder/decoder.
- The role of channel coding: fight against channel errors for reliable transmission of information (the design of the channel encoder/decoder is covered in EE461). We simply assume that the super-channel achieves error-free transmission.

Slide 7: Discrete Source
- A discrete source is characterized by a discrete random variable X.
- Examples: coin flipping, P(X=H)=P(X=T)=1/2; dice tossing, P(X=k)=1/6 for k=1,...,6; playing-card drawing, P(X=S)=P(X=H)=P(X=D)=P(X=C)=1/4.
- What is the redundancy of a discrete source?

Slide 8: Two Extreme Cases
- Tossing a fair coin (Head or Tail?), P(X=H)=P(X=T)=1/2: maximum uncertainty, hence minimum (zero) redundancy; compression is impossible.
- Tossing a coin with two identical sides (HHHH... or TTTT...), P(X=H)=1, P(X=T)=0: minimum uncertainty, hence maximum redundancy; compression is trivial (one bit is enough), and the encoder/decoder reduce to simple duplication.
- Redundancy is the opposite of uncertainty.

Slide 9: Quantifying the Uncertainty of an Event
Self-information: I(p) = -log2(p), where p is the probability of the event x (e.g., x can be X=H or X=T).
Notes:
- p = 1: I(p) = 0 (the event must happen; no uncertainty).
- p → 0: I(p) → ∞ (the event is unlikely to happen; an infinite amount of uncertainty).
Intuitively, I(p) measures the amount of uncertainty associated with the event x.
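As a quick numerical illustration (a sketch, not part of the slides), the self-information of a few events with different probabilities:

    import math

    def self_information(p: float) -> float:
        """I(p) = -log2(p), in bits."""
        return -math.log2(p)

    print(self_information(1.0))    # 0.0 bits   (certain event, no uncertainty)
    print(self_information(0.5))    # 1.0 bit    (fair coin flip)
    print(self_information(1e-6))   # ~19.9 bits (a very unlikely event)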

Slide 10: Weighted Self-information
Weighted self-information: Iw(p) = p · I(p) = -p·log2(p).
Boundary values: Iw(0) = 0, Iw(1/2) = 1/2, Iw(1) = 0.
Question: which value of p maximizes Iw(p)?
As p evolves from 0 to 1, the weighted self-information first increases and then decreases.

Slide 11: Maximum of Weighted Self-information*
Iw(p) attains its maximum at p = 1/e, where Iw(1/e) = log2(e)/e ≈ 0.53 bits.
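The location of the maximum follows from one line of calculus; a brief sketch:

    \frac{d}{dp} I_w(p) = \frac{d}{dp}\bigl(-p\log_2 p\bigr) = -\log_2 p - \log_2 e = 0
    \;\Longrightarrow\; \log_2 p = -\log_2 e \;\Longrightarrow\; p = \tfrac{1}{e}

The second derivative, -1/(p ln 2) < 0, confirms that this critical point is a maximum.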

Slide 12: Quantification of Uncertainty of a Discrete Source
- A discrete source (random variable X) is a collection (set) of individual events whose probabilities sum to 1.
- To quantify the uncertainty of a discrete source, we simply sum the weighted self-information over the whole set.

Slide 13: Shannon's Source Entropy Formula
H(X) = -Σ_i p_i·log2(p_i)  (bits/sample, or bps)
where the symbol probabilities p_i = P(X=x_i) act as the weighting coefficients.
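The formula translates directly into code; a minimal sketch (probs is any list of probabilities that sum to 1):

    import math

    def source_entropy(probs) -> float:
        """Shannon entropy H(X) = -sum_i p_i * log2(p_i), in bits per sample."""
        return -sum(p * math.log2(p) for p in probs if p > 0)

    print(source_entropy([0.5, 0.5]))                # 1.0 bps   (fair coin)
    print(source_entropy([1/6] * 6))                 # ~2.585 bps (fair die)
    print(source_entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0 bps   (card suits)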

Slide 14: Source Entropy Examples
Example 1 (binary Bernoulli source): flipping a coin whose probability of heads is p (0 < p < 1):
H(X) = -p·log2(p) - (1-p)·log2(1-p)
Check the two extreme cases:
- As p goes to 0, H(X) goes to 0 bps: compression gains the most.
- As p goes to 1/2, H(X) goes to 1 bps: no compression can help.

Slide 15: Entropy of Binary Bernoulli Source
(Plot of the binary entropy function H(p) versus p: H(p) is 0 at p = 0 and p = 1 and peaks at 1 bps for p = 1/2.)
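The curve in the plot can be tabulated with a few lines of code (a sketch):

    import math

    def binary_entropy(p: float) -> float:
        """H(p) = -p*log2(p) - (1-p)*log2(1-p), in bits."""
        if p in (0.0, 1.0):
            return 0.0
        return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

    for p in (0.01, 0.1, 0.25, 0.5, 0.75, 0.9, 0.99):
        print(f"p = {p:<4}  H(p) = {binary_entropy(p):.3f} bps")
    # H(p) rises from 0, peaks at 1 bps when p = 0.5, then falls back to 0.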

Slide 16: Source Entropy Examples
Example 2 (4-way random walk): a source that emits one of the four directions N, E, S, W at each step. (The probability assignment for the four directions and the resulting entropy were given on the slide; the same four-symbol source is reused in the variable length coding example on slides 29-30.)

Slide 17: Source Entropy Examples (Cont'd)
Example 3: a jar contains the same number of blue and red balls. Each time, a ball is randomly picked from the jar and then put back. Consider the event that the k-th pick is the first time a red ball is seen; what is the probability of this event?
Prob(event) = Prob(blue in the first k-1 picks) · Prob(red in the k-th pick) = (1/2)^(k-1) · (1/2) = (1/2)^k
(a source with a geometric distribution)

Slide 18: Source Entropy Calculation
If we consider all possible events, the sum of their probabilities is one, so we can define a discrete random variable X with P(X=k) = (1/2)^k, k = 1, 2, 3, ...
Check: Σ_{k=1..∞} (1/2)^k = 1.
Entropy: H(X) = -Σ_k P(X=k)·log2 P(X=k) = Σ_{k=1..∞} k·(1/2)^k = 2 bps.
(Problem 1 in HW3 is slightly more complex than this example.)
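A quick numerical check of both the normalization and the 2-bps entropy (a sketch; the infinite sums are truncated at a large K, where the tail is negligible):

    K = 60
    probs = [0.5 ** k for k in range(1, K + 1)]                    # P(X=k) = (1/2)^k

    total = sum(probs)                                             # ≈ 1.0 (probabilities sum to one)
    entropy = sum(p * k for p, k in zip(probs, range(1, K + 1)))   # -log2(p_k) = k here
    print(total, entropy)                                          # ≈ 1.0 and ≈ 2.0 bps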

Slide 19: Properties of Source Entropy
- Nonnegative and concave.
- Achieves its maximum, log2(N) bps for an N-symbol source, when the source has a uniform distribution (i.e., P(X=k) = 1/N, k = 1, ..., N).
- Goes to zero (its minimum) as the source becomes more and more skewed (i.e., P(X=k) → 1 for some k while P(X≠k) → 0).

Slide 20: History of Entropy
- Origin: a Greek root meaning "transformation content".
- First introduced by Rudolf Clausius to study thermodynamic systems in 1862.
- Developed by Ludwig Eduard Boltzmann in the 1870s-1880s (the first serious attempt to understand nature in a statistical language).
- Borrowed by Shannon in his landmark work "A Mathematical Theory of Communication" in 1948.

Slide 21: A Little Bit of Mathematics*
- Entropy S is proportional to log P (P is the relative probability of a state).
- Consider an ideal gas of N identical particles, of which N_i are in the i-th microscopic condition (range) of position and momentum.
- Using Stirling's formula, log N! ≈ N·log N - N, and noting that p_i = N_i/N, one obtains S ∝ -Σ_i p_i·log p_i.
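Filling in the omitted algebra (a sketch; W denotes the number of microscopic arrangements consistent with the occupation numbers N_i):

    S \propto \log W
      = \log \frac{N!}{\prod_i N_i!}
      \approx (N\log N - N) - \sum_i \left(N_i\log N_i - N_i\right)
      = -\sum_i N_i \log\frac{N_i}{N}
      = -N \sum_i p_i \log p_i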

Slide 22: Entropy-related Quotes
"My greatest concern was what to call it. I thought of calling it 'information', but the word was overly used, so I decided to call it 'uncertainty'. When I discussed it with John von Neumann, he had a better idea. Von Neumann told me, 'You should call it entropy, for two reasons. In the first place your uncertainty function has been used in statistical mechanics under that name, so it already has a name. In the second place, and more important, nobody knows what entropy really is, so in a debate you will always have the advantage.'"
-- Conversation between Claude Shannon and John von Neumann regarding what name to give to the "measure of uncertainty", or attenuation, in phone-line signals (1949)

Slide 23: Other Uses of Entropy
- In biology: "the order produced within cells as they grow and divide is more than compensated for by the disorder they create in their surroundings in the course of growth and division" (A. Lehninger). Ecological entropy is a measure of biodiversity in the study of biological ecology.
- In cosmology: "black holes have the maximum possible entropy of any object of equal size" (Stephen Hawking).

Slide 24: What is the use of H(X)?
Shannon's first theorem (noiseless coding theorem): for a memoryless discrete source X, its entropy H(X) defines the minimum average code length required to noiselessly code the source.
Notes:
1. Memoryless means that the events are independently generated (e.g., the outcomes of flipping a coin N times are independent events).
2. Source redundancy can then be understood as the difference between the raw data rate and the source entropy.

Slide 25: Code Redundancy*
Average code length (practical performance): l_avg = Σ_i p_i·l_i, where l_i is the length of the codeword assigned to the i-th symbol.
Theoretical bound: the source entropy H(X).
Code redundancy: r = l_avg - H(X) ≥ 0.
Note: if we represent each symbol by q bits (fixed-length codes), then the redundancy is simply q - H(X) bps.

Slide 26: How to Achieve the Source Entropy?
discrete source X with distribution P(X) → entropy coding → binary bit stream
Note: the above entropy coding problem is based on the simplifying assumptions that the discrete source X is memoryless and that P(X) is completely known. Those assumptions often do not hold for real-world data such as images, and we will revisit them later.

Slide 27: Data Compression Basics
- Discrete source: information = uncertainty; quantification of uncertainty; source entropy.
- Variable length codes: motivation; prefix condition; Huffman coding algorithm.

Slide 28: Recall: Variable Length Codes (VLC)
- Assign a long codeword to an event with small probability.
- Assign a short codeword to an event with large probability.
It follows from the self-information formula I(p) = -log2(p) that a small-probability event contains much information and is therefore worth many bits to represent. Conversely, if some event occurs frequently, it is probably a good idea to use as few bits as possible to represent it. This observation leads to the idea of varying the code lengths based on the events' probabilities.

Slide 29: Random Walk Example
symbol k | p_k   | fixed-length codeword | variable-length codeword (e.g.)
S        | 0.5   | 00                    | 0
N        | 0.25  | 01                    | 10
W        | 0.125 | 10                    | 110
E        | 0.125 | 11                    | 111
symbol stream: S S N W S E N N N W S S S N E S S
fixed length: 2 bits/symbol; variable length: 28 bits in total → 4 bits savings achieved by VLC (redundancy eliminated)

Slide 30: Toy Example (Cont'd)
Average code length: l_avg = (total number of bits) / (total number of symbols), in bps.
- fixed-length code: l_avg = 2 bps
- variable-length code: l_avg = 0.5×1 + 0.25×2 + 0.125×3 + 0.125×3 = 1.75 bits/symbol
Source entropy: H(X) = -Σ_k p_k·log2(p_k) = 1.75 bps
For this toy source, the VLC attains the source entropy.
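A quick numerical check of the average code length and the source entropy for this toy source (a sketch; the probabilities and codeword lengths are taken from the table on slide 29):

    import math

    probs   = [0.5, 0.25, 0.125, 0.125]   # P(S), P(N), P(W), P(E)
    lengths = [1, 2, 3, 3]                 # VLC codeword lengths

    avg_len = sum(p * l for p, l in zip(probs, lengths))
    entropy = -sum(p * math.log2(p) for p in probs)

    print(avg_len, entropy)   # both 1.75 bps: this VLC reaches the entropy bound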

Slide 31: Problems with VLC
- When codewords have fixed lengths, the boundary between codewords is always identifiable.
- For codewords with variable lengths, the boundary can become ambiguous: with a badly chosen VLC for the symbols S, W, N, E, the same bit stream can be decoded either as "S S N W S E ..." or as "S S W N S E ...".

Slide 32: Uniquely Decodable Codes
- To avoid ambiguity in decoding, we need to enforce certain conditions on a VLC to make it uniquely decodable.
- Since ambiguity arises when some codeword becomes the prefix of another, it is natural to consider the prefix condition. Example: p → pr → pre → pref → prefi → prefix (where a → b means "a is a prefix of b").

Slide 33: Prefix Condition
No codeword is allowed to be the prefix of any other codeword. We will graphically illustrate this condition with the aid of a binary codeword tree.

Slide 34: Binary Codeword Tree
(Sketch of the binary codeword tree: starting from the root, the two branches at each node are labeled 0 and 1; level 1 contains 2 nodes, level 2 contains 2^2 = 4 nodes, and in general level k contains 2^k nodes, i.e., 2^k candidate codewords of length k.)

Slide 35: Prefix Condition Examples
(Two codeword assignments for the symbols W, E, S, N are compared on the binary codeword tree: codeword set 1 satisfies the prefix condition, while codeword set 2 violates it because one of its codewords is the prefix of another.)

Slide 36: How to Satisfy the Prefix Condition?
Basic rule: if a node of the binary codeword tree is used as a codeword, then none of its descendants can be used as codewords. (Example illustrated on the codeword tree.)

Slide 37: Property of Prefix Codes (Kraft's Inequality)
Kraft's inequality: Σ_i 2^(-l_i) ≤ 1, where l_i is the length of the i-th codeword (proof skipped).
Example: two candidate codes, VLC-1 and VLC-2, for the symbols W, E, S, N are checked against the inequality.
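A small sketch that checks both the Kraft sum and the prefix condition for a candidate code; the two example codes below are hypothetical stand-ins and are not necessarily the VLC-1/VLC-2 of the slide:

    def kraft_sum(code):
        """Sum of 2^(-l_i) over all codeword lengths."""
        return sum(2.0 ** -len(w) for w in code.values())

    def is_prefix_free(code):
        """True if no codeword is a prefix of another codeword."""
        words = list(code.values())
        return not any(a != b and b.startswith(a) for a in words for b in words)

    vlc1 = {'S': '0', 'N': '10', 'W': '110', 'E': '111'}   # hypothetical prefix code
    vlc2 = {'S': '0', 'N': '1', 'W': '10', 'E': '11'}      # hypothetical non-prefix code

    print(kraft_sum(vlc1), is_prefix_free(vlc1))  # 1.0  True
    print(kraft_sum(vlc2), is_prefix_free(vlc2))  # 1.5  False (violates Kraft's inequality)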

Slide 38: Two Goals of VLC Design
- Achieve the optimal code length (i.e., minimal redundancy): for an event x with probability p(x), the optimal code length is ⌈-log2 p(x)⌉, where ⌈y⌉ denotes the smallest integer not smaller than y (e.g., ⌈3.4⌉ = 4).
- Satisfy the prefix condition.
Code redundancy: r = l_avg - H(X). Unless the probabilities of the events are all powers of 2, we often have r > 0.

Slide 39: Solution
- Huffman coding (Huffman, 1952): we will cover it later while studying JPEG.
- Arithmetic coding (1980s): not covered in EE465, but in EE565 (Fall 2008).
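Although the algorithm itself is deferred to the JPEG lectures, a minimal sketch of Huffman's idea (repeatedly merge the two least probable symbol groups, prepending one more bit to each group's codewords) is shown here for reference; it is an illustration, not the course's reference implementation:

    import heapq
    from typing import Dict

    def huffman_code(probs: Dict[str, float]) -> Dict[str, str]:
        """Build a Huffman prefix code for a symbol -> probability table."""
        # Each heap entry: (group probability, tie-breaker, {symbol: partial codeword})
        heap = [(p, i, {s: ''}) for i, (s, p) in enumerate(probs.items())]
        heapq.heapify(heap)
        if len(heap) == 1:                      # degenerate single-symbol source
            return {s: '0' for s in probs}
        counter = len(heap)
        while len(heap) > 1:
            p1, _, c1 = heapq.heappop(heap)     # two least probable groups
            p2, _, c2 = heapq.heappop(heap)
            merged = {s: '0' + w for s, w in c1.items()}
            merged.update({s: '1' + w for s, w in c2.items()})
            heapq.heappush(heap, (p1 + p2, counter, merged))
            counter += 1
        return heap[0][2]

    # Example: the 4-way random-walk source; codeword lengths come out as 1, 2, 3, 3.
    print(huffman_code({'S': 0.5, 'N': 0.25, 'W': 0.125, 'E': 0.125}))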

Slide 40: Golomb Codes for Geometric Distribution
Optimal VLC for the geometric source P(X=k) = (1/2)^k, k = 1, 2, 3, ...:
k:        1    2     3      4       ...
codeword: 0    10    110    1110    ...
(the codeword for k consists of k-1 ones followed by a zero, so its length is exactly k = -log2 P(X=k) bits)
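A quick numerical check (a sketch, using the "k-1 ones followed by a zero" codewords above) that the average code length of this code equals the 2-bps entropy of the geometric source from slide 18:

    def unary_codeword(k: int) -> str:
        """Codeword for symbol k: (k-1) ones followed by a single zero."""
        return '1' * (k - 1) + '0'

    # Truncate the infinite sum at a large K; the tail is negligible.
    K = 60
    avg_len = sum((0.5 ** k) * len(unary_codeword(k)) for k in range(1, K + 1))
    print(avg_len)   # ≈ 2.0 bits/symbol, matching H(X) = 2 bps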

Slide 41: Summary of Data Compression Basics
- Shannon's source entropy formula (theory): H(X) = -Σ_i p_i·log2(p_i) bps; entropy (uncertainty) is quantified by the weighted self-information.
- VLC thumb rule (practice): long codeword for a small-probability event; short codeword for a large-probability event.