
Dr. Hadi AL Saadi Image Compression

Goal of Image Compression. Digital images require huge amounts of space for storage and large bandwidths for transmission. A 640 x 480 color image requires close to 1 MB of space. The goal of image compression is to reduce the amount of data required to represent a digital image, which reduces storage requirements and increases effective transmission rates.

Compression: Basic Algorithms. The need for compression: raw video, image and audio files can be very large.

Uncompressed audio (1 minute):

  Audio Type      44.1 kHz    22.05 kHz   11.025 kHz
  16-bit Stereo   10.1 MB     5.05 MB     2.52 MB
  16-bit Mono     5.05 MB     2.52 MB     1.26 MB
  8-bit Mono      2.52 MB     1.26 MB     630 KB

Uncompressed images:

  Image Type                       File Size
  512 x 512 Monochrome             0.25 MB
  512 x 512 8-bit colour image     0.25 MB
  512 x 512 24-bit colour image    0.75 MB

Video: can also involve a stream of audio plus video imagery. Raw video (uncompressed image frames, 512 x 512, true colour, 25 fps) takes 1125 MB per minute. HDTV takes gigabytes per minute uncompressed (1920 x 1080, true colour, 25 fps: 8.7 GB per minute). Relying on higher bandwidths is not a good option. Compression HAS TO BE part of the representation of audio, image and video formats.
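The sizes quoted above follow directly from width x height x bytes-per-pixel x frames. A small back-of-the-envelope sketch in Java (treating 1 MB as 2^20 bytes) that reproduces them:

public class RawSizes {
    // Convert a byte count to megabytes (1 MB = 1024 * 1024 bytes)
    static double megabytes(long bytes) { return bytes / (1024.0 * 1024.0); }

    public static void main(String[] args) {
        long image24  = 512L * 512 * 3;               // 512 x 512, 24-bit (true colour) image
        long videoMin = image24 * 25 * 60;            // 25 fps for one minute of raw video
        long hdtvMin  = 1920L * 1080 * 3 * 25 * 60;   // 1920 x 1080, true colour, 25 fps, one minute

        System.out.printf("512x512 24-bit image   : %.2f MB%n", megabytes(image24));        // ~0.75 MB
        System.out.printf("1 min raw 512x512 video: %.0f MB%n", megabytes(videoMin));       // ~1125 MB
        System.out.printf("1 min raw HDTV         : %.1f GB%n", megabytes(hdtvMin) / 1024); // ~8.7 GB
    }
}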

Approaches

Lossless (entropy coding): information preserving, does not lose information (i.e., the signal can be perfectly reconstructed after decompression). Produces a variable bit rate. Important for text and images sent over a network. Low compression ratios; it is not guaranteed to actually reduce the data size.

Lossy (source coding): not information preserving (i.e., the signal is not perfectly reconstructed after decompression). Important for digitized audio and video signals, because the human eye and ear cannot perceive every detail. High compression ratios; can produce any desired constant bit rate.

Trade-off: image quality vs. compression ratio.

Data ≠ Information. Data and information are not synonymous terms! Data is the means by which information is conveyed. Data compression aims to reduce the amount of data required to represent a given quantity of information while preserving as much information as possible. The same amount of information can be represented by different amounts of data, e.g.:
Ex1. Your wife, Helen, will meet you at Logan Airport in Boston at 5 minutes past 6:00 pm tomorrow night.
Ex2. Your wife will meet you at Logan Airport at 5 minutes past 6:00 pm tomorrow night.
Ex3. Helen will meet you at Logan at 6:00 pm tomorrow night.

Data Redundancy. Compression ratio: C_R = n1 / n2, where n1 and n2 denote the number of information-carrying units (e.g., bits) in the original and compressed data sets, respectively.

Data Redundancy (cont'd). Relative data redundancy: R_D = 1 - 1/C_R. Example: if C_R = 10, then R_D = 0.9, i.e., 90% of the data in the original set is redundant.
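As a minimal illustration of these two formulas (the bit counts n1 and n2 here are made-up values, not from the slides), consider the following Java sketch:

public class Redundancy {
    public static void main(String[] args) {
        double n1 = 1_048_576;   // bits in the original data set (assumed value)
        double n2 = 131_072;     // bits in the compressed data set (assumed value)

        double compressionRatio   = n1 / n2;                       // C_R = n1 / n2
        double relativeRedundancy = 1.0 - 1.0 / compressionRatio;  // R_D = 1 - 1/C_R

        // C_R = 8.00, R_D = 0.88: about 88% of the original data is redundant
        System.out.printf("C_R = %.2f, R_D = %.2f%n", compressionRatio, relativeRedundancy);
    }
}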

Types of Data Redundancy (1) Coding (2) Interpixel (3) Psychovisual Compression attempts to reduce one or more of these redundancy types.

Coding Redundancy Code: a list of symbols (letters, numbers, bits etc.) Code word: a sequence of symbols used to represent a piece of information or an event (e.g., gray levels). Code word length: number of symbols in each code word

Goal: minimize the average symbol length.

Notation: l(i) = binary length of the i-th symbol; N = number of symbols; the source emits M symbols (one every T seconds); symbol i has been emitted m(i) times.

Number of bits emitted by the source: B = Σ_{i=1..N} m(i)·l(i).
Average length per symbol: L_avg = B / M.
Average bit rate: R = L_avg / T bits per second.

Symbol probabilities: the probability P(i) of a symbol corresponds to its relative frequency, P(i) = m(i) / M. Average symbol length: L_avg = Σ_i P(i)·l(i).

Minimum Average Symbol Length. Basic idea for reducing the average symbol length: assign shorter codewords to symbols that appear more frequently, and longer codewords to symbols that appear less frequently. Theorem: the average binary length of the encoded symbols is always greater than or equal to the entropy H of the source (the average information content per symbol of the source).

Coding Redundancy. Consider an N x M image, where r_k is the k-th gray level, P(r_k) is the probability of r_k, and l(r_k) is the number of bits used to represent r_k. The average number of bits per pixel is L_avg = Σ_k l(r_k)·P(r_k).

Coding Redundancy: l(r_k) = constant length. With fixed-length coding every gray level is represented with the same number of bits, so L_avg equals that fixed length regardless of the gray-level probabilities.

Coding Redundancy: l(r_k) = variable length. Consider the probability of the gray levels: by assigning fewer bits to the more probable gray levels, the resulting L_avg can be smaller than with fixed-length coding.

How do we measure information? What is the information content of a message/image? What is the minimum amount of data that is sufficient to describe an image completely without loss of information?

Modeling Information. Information generation is assumed to be a probabilistic process. Idea: associate information with probability! A random event E with probability P(E) contains I(E) = log(1/P(E)) = -log P(E) units of information. Note: I(E) = 0 when P(E) = 1.

How much information does a pixel contain? Suppose that gray-level values are generated by a random variable; then r_k contains I(r_k) = -log2 P(r_k) units of information. The average information content of an image is the entropy H = -Σ_k P(r_k)·log2 P(r_k) units/pixel. Entropy H: the average information content per symbol of the source. Entropy reaches a maximum (H = log2 N) when all N symbols in the set are equally probable.

Entropy is small (approaching 0) when some symbols are much more likely to appear than others. The efficiency of a coding scheme is η = H / L_avg. Example: for N = 4 and the two probability distributions shown in the figure below (one of them uniform, with P_i = 0.25 for every symbol), calculate the entropy of each system.
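A minimal Java sketch of the entropy and efficiency formulas above; the uniform four-symbol distribution is the one given in the example, while the 2-bit fixed-length code used for the efficiency line is an assumption for illustration:

public class Entropy {
    // H = -sum_i P(i) * log2 P(i)
    static double entropy(double[] p) {
        double h = 0.0;
        for (double pi : p) {
            if (pi > 0) h -= pi * (Math.log(pi) / Math.log(2));
        }
        return h;
    }

    public static void main(String[] args) {
        double[] uniform = {0.25, 0.25, 0.25, 0.25};            // N = 4, all symbols equally probable
        double h    = entropy(uniform);                          // 2.0 bits/symbol
        double hMax = Math.log(uniform.length) / Math.log(2);    // maximum entropy = log2 N = 2.0

        System.out.printf("H = %.3f bits/symbol (max = %.3f)%n", h, hMax);

        double avgLength = 2.0;                                  // fixed 2-bit code for 4 symbols (assumed)
        System.out.printf("efficiency = %.2f%n", h / avgLength); // eta = H / L_avg = 1.00
    }
}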

Entropy Estimation: it is not easy to estimate H reliably!

Example 2: A statistical encoding algorithm is being considered for the transmission of a large number of long text files over a public network. Analysis of the file contents has shown that each file comprises only the six different characters M, F, Y, N, O and I, each of which occurs with a relative frequency of 0.25, 0.25, 0.125, 0.125, 0.125 and 0.125 respectively. The encoding algorithm under consideration uses the following set of codewords: M=10, F=11, Y=010, N=011, O=000, I=001. Compute:
1. the average number of bits per codeword with this algorithm;
2. the entropy of the source;
3. the minimum number of bits required assuming fixed-length codewords.

Solution:
1. L_avg = 2*(2*0.25) + 4*(3*0.125) = 2.5 bits per codeword.
2. H = -[2*(0.25·log2 0.25) + 4*(0.125·log2 0.125)] = 2.5 bits per codeword.
3. The minimum number of bits required assuming fixed-length codewords: since there are 6 characters, fixed-length codewords would require a minimum of 3 bits (2^3 = 8 combinations).
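The numbers in this solution can be checked with a short sketch using the stated frequencies and codeword lengths:

public class TextCodeExample {
    public static void main(String[] args) {
        //                      M     F     Y      N      O      I
        double[] p       = {0.25, 0.25, 0.125, 0.125, 0.125, 0.125};
        int[]    lengths = {   2,    2,     3,     3,     3,     3};  // 10, 11, 010, 011, 000, 001

        double avg = 0.0, entropy = 0.0;
        for (int i = 0; i < p.length; i++) {
            avg     += p[i] * lengths[i];                         // average bits per codeword
            entropy -= p[i] * (Math.log(p[i]) / Math.log(2));     // source entropy
        }
        System.out.printf("L_avg = %.2f bits, H = %.2f bits%n", avg, entropy);  // both 2.50
    }
}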

Categories of Compression Techniques

  Entropy coding:  Run-length coding, Huffman coding, Arithmetic coding
  Source coding:   Prediction (DPCM, DM), Transformation (FFT, DCT), Vector Quantization
  Hybrid coding:   JPEG, MPEG, H.261

Lossless Compression

Lossless Compression (Entropy Coding) Methods

Lossless coding techniques (entropy coding):
  Repetitive sequence encoding: Run-Length Encoding
  Statistical encoding: Huffman, Arithmetic, LZW
  Lossless predictive coding: DPCM
  Bitplane Encoding

Huffman Coding (coding redundancy). The Huffman code assignment procedure is based on a binary tree structure. Huffman coding algorithm, a bottom-up approach:
1. Initialization: put all symbols on a list sorted according to their frequency counts.
2. Repeat until the list has only one symbol left:
   (1) From the list, pick the two symbols with the lowest frequency counts. Form a Huffman subtree that has these two symbols as child nodes and create a parent node.
   (2) Assign the sum of the children's frequency counts to the parent and insert it into the list such that the order is maintained.
   (3) Delete the children from the list.
3. Assign a codeword for each leaf based on the path from the root.

Codeword Formation: follow the path from the root of the tree down to the leaf for each symbol, and accumulate the labels of all the branches.

Example 1: Consider a source with the following symbol probabilities:

  Symbol       A      B      C      D
  Probability  0.5    0.3    0.12   0.08

Building the tree bottom-up: P(CD) = 0.12 + 0.08 = 0.2, P(BCD) = 0.3 + 0.2 = 0.5, P(ABCD) = 0.5 + 0.5 = 1.

code(A) = 1, code(B) = 01, code(C) = 001, code(D) = 000. Average length of this code = 0.5*1 + 0.3*2 + 0.12*3 + 0.08*3 = 1.7 bits/symbol.
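A compact Java sketch of the bottom-up Huffman procedure above, applied to the four symbols of Example 1. The 0/1 labels assigned to the branches are an arbitrary choice, so the printed bit patterns may differ from the slide's codewords, but the codeword lengths (1, 2, 3, 3 bits) and hence the 1.7 bits/symbol average are the same:

import java.util.*;

public class HuffmanSketch {
    static class Node implements Comparable<Node> {
        double p; String symbol; Node left, right;
        Node(double p, String symbol) { this.p = p; this.symbol = symbol; }
        Node(Node l, Node r) { this.p = l.p + r.p; this.left = l; this.right = r; }
        public int compareTo(Node o) { return Double.compare(p, o.p); }
    }

    // Walk the tree and accumulate branch labels: left = "0", right = "1" (arbitrary convention)
    static void assign(Node n, String code, Map<String, String> out) {
        if (n.symbol != null) { out.put(n.symbol, code.isEmpty() ? "0" : code); return; }
        assign(n.left, code + "0", out);
        assign(n.right, code + "1", out);
    }

    public static void main(String[] args) {
        PriorityQueue<Node> pq = new PriorityQueue<>();
        pq.add(new Node(0.50, "A"));
        pq.add(new Node(0.30, "B"));
        pq.add(new Node(0.12, "C"));
        pq.add(new Node(0.08, "D"));

        // Repeatedly merge the two least probable nodes until a single tree remains
        while (pq.size() > 1) pq.add(new Node(pq.poll(), pq.poll()));

        Map<String, String> codes = new TreeMap<>();
        assign(pq.poll(), "", codes);
        System.out.println(codes);   // codeword lengths: A=1, B=2, C=3, D=3 bits
    }
}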

Huffman Coding: example table listing each symbol X, its probability, its codeword, and the codeword length.

Example: A series of messages is to be transferred between two computers over a PSTN network. The messages comprise just the characters A through H. Analysis has shown that each character occurs with the following probability: A and B = 0.25; C and D = 0.14; E, F, G and H = 0.055.
1. Derive the minimum average number of bits per character.
2. Use Huffman coding to derive the codeword set, and show it is a minimum set by constructing the corresponding Huffman code tree.
3. Derive the average number of bits per character for your codeword set and compare this with fixed-length binary codewords.
Ans:
1. Entropy H = -[2*(0.25·log2 0.25) + 2*(0.14·log2 0.14) + 4*(0.055·log2 0.055)] ≈ 2.71 bits per codeword, which is the minimum average number of bits per character.
2. Shown in the figure below.

Entropy encoding: (a) codeword generation; (b) Huffman code tree.

Example: N = 8 symbols (a, b, c, d, e, f, g and h) with P(a) = 0.01, P(b) = 0.02, P(c) = 0.05, P(d) = 0.09, P(e) = 0.18, P(f) = P(g) = 0.2, P(h) = 0.25. Find:
1. The average length per symbol before coding.
2. The entropy.
3. The average length per symbol after Huffman coding.
4. The efficiency of the code.
Solution:
1. Before coding, a fixed-length code for 8 symbols needs log2 8 = 3 bits/symbol.
2. Entropy H = -Σ P(i)·log2 P(i) ≈ 2.58 bits/symbol.
3. Average length per symbol with Huffman coding = 0.25*2 + 0.2*2 + 0.2*2 + 0.18*3 + 0.09*4 + 0.05*5 + 0.02*6 + 0.01*6 = 2.63 bits/symbol.

4. Efficiency of the code = H / L_avg = 2.58 / 2.63 ≈ 98%.

  Symbol  Probability  Fixed-length codeword  Huffman codeword length
  h       0.25         3 bits                 2 bits
  g       0.20         3 bits                 2 bits
  f       0.20         3 bits                 2 bits
  e       0.18         3 bits                 3 bits
  d       0.09         3 bits                 4 bits
  c       0.05         3 bits                 5 bits
  b       0.02         3 bits                 6 bits
  a       0.01         3 bits                 6 bits

H/W: Assume the files shown on the slide.
1. Find the size of each file using fixed-length binary codewords.
2. Use Huffman coding to derive the codeword set.
3. Find the size of each file using the Huffman codeword set.

Run-length coding (RLC) (interpixel redundancy). Used to reduce the size of a repeating string of characters (i.e., runs):
  1 1 1 1 1 0 0 0 0 0 0 1  →  (1,5) (0,6) (1,1)
  a a a b b b b b b c c    →  (a,3) (b,6) (c,2)
Encodes a run of symbols into two bytes: (symbol, count). RLC can compress any type of data but cannot achieve high compression ratios compared to other compression methods.
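A minimal run-length encoder sketch in Java for the (symbol, count) scheme above; the textual pair output is just for illustration, a real coder would pack each pair into two bytes:

import java.util.*;

public class RunLength {
    // Encode a string into (symbol, count) pairs, e.g. "aaabbbbbbcc" -> (a,3)(b,6)(c,2)
    static List<String> encode(String s) {
        List<String> runs = new ArrayList<>();
        int i = 0;
        while (i < s.length()) {
            int j = i;
            while (j < s.length() && s.charAt(j) == s.charAt(i)) j++;   // extend the current run
            runs.add("(" + s.charAt(i) + "," + (j - i) + ")");
            i = j;
        }
        return runs;
    }

    public static void main(String[] args) {
        System.out.println(encode("aaabbbbbbcc"));    // [(a,3), (b,6), (c,2)]
        System.out.println(encode("111110000001"));  // [(1,5), (0,6), (1,1)]
    }
}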

Differential Pulse Code Modulation (DPCM) Coding (interpixel redundancy). A predictive coding approach: each pixel value (except at the boundaries) is predicted from its neighbors (e.g., as a linear combination) to obtain a predicted image. The difference between the original and predicted images yields a differential or residual image, which has a much smaller dynamic range of pixel values. The differential image is then encoded using Huffman coding.

A simple DPCM transform coding. For a 2x2 block of monochrome pixels

  A  B        A     B-A
  C  D   →    C-A   D-A

the encoding procedure may be described by the following steps:
1. Take the top-left pixel as the base value for the block, pixel A.
2. Calculate the three other transformed values by taking the difference between the respective pixels and pixel A, i.e., B-A, C-A, D-A.
3. Store the base pixel and the differences as the values of the transform.

Example: Consider a 2x2 image block with pixel values A = 120, B = 130, C = 125, D = 120. Then we get: X0 = 120, X1 = 10, X2 = 5, X3 = 0. We can then compress these values by using fewer bits to represent the small differences.
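A sketch of this 2x2 DPCM block transform and its inverse, using the block values of the example (the flat int[4] layout {A, B, C, D} is an assumption for illustration):

public class SimpleDpcm {
    // Forward transform: {A, B, C, D} -> {A, B-A, C-A, D-A}
    static int[] forward(int[] block) {
        int a = block[0];
        return new int[] { a, block[1] - a, block[2] - a, block[3] - a };
    }

    // Inverse transform: recover the original block from the base value and the differences
    static int[] inverse(int[] coded) {
        int a = coded[0];
        return new int[] { a, coded[1] + a, coded[2] + a, coded[3] + a };
    }

    public static void main(String[] args) {
        int[] block = {120, 130, 125, 120};            // A, B, C, D
        int[] coded = forward(block);                  // {120, 10, 5, 0}: the differences need far fewer bits
        System.out.println(java.util.Arrays.toString(coded));
        System.out.println(java.util.Arrays.toString(inverse(coded)));  // original block recovered exactly
    }
}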

Lempel-Ziv-Welch (LZW) Coding (interpixel redundancy). Requires no a priori knowledge of the pixel probability distribution. Assigns fixed-length code words to variable-length sequences. Included in the GIF, TIFF and PDF file formats.

ALGORITHM - LZW Compression

BEGIN
  s = next input character;
  while not EOF
  {
    c = next input character;
    if s + c exists in the dictionary
      s = s + c;
    else
    {
      output the code for s;
      add string s + c to the dictionary with a new code;
      s = c;
    }
  }
  output the code for s;
END

Example: LZW compression for the string "ABABBABCABABBA". Let's start with a very simple dictionary (also referred to as a "string table"), initially containing only 3 characters, with codes as follows:

  code  string
  1     A
  2     B
  3     C

Now if the input string is "ABABBABCABABBA", the LZW compression algorithm works as follows: at each step the currently recognized sequence s is extended by the next character c; when s + c is not yet in the dictionary, the code for s is output and s + c is added to the dictionary as a new entry.

The output codes are: 1 2 4 5 2 3 4 6 1. Instead of sending 14 characters, only 9 codes need to be sent (compression ratio = 14/9 ≈ 1.56).
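A runnable Java version of the LZW compression pseudocode above, initialized with the three-entry dictionary assumed in this example (A=1, B=2, C=3):

import java.util.*;

public class LzwEncode {
    static List<Integer> compress(String input, Map<String, Integer> dict) {
        List<Integer> output = new ArrayList<>();
        int nextCode = dict.size() + 1;               // next free dictionary code
        String s = "";                                // currently recognized sequence
        for (char ch : input.toCharArray()) {
            String c = String.valueOf(ch);
            if (dict.containsKey(s + c)) {
                s = s + c;                            // keep extending the recognized sequence
            } else {
                output.add(dict.get(s));              // output the code for s
                dict.put(s + c, nextCode++);          // add s + c to the dictionary with a new code
                s = c;
            }
        }
        if (!s.isEmpty()) output.add(dict.get(s));    // output the code for the final sequence
        return output;
    }

    public static void main(String[] args) {
        Map<String, Integer> dict = new HashMap<>();
        dict.put("A", 1); dict.put("B", 2); dict.put("C", 3);
        System.out.println(compress("ABABBABCABABBA", dict));
        // prints [1, 2, 4, 5, 2, 3, 4, 6, 1]: 9 codes for 14 characters
    }
}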

Example 2: Consider the following 4x4, 8-bit image (shown on the slide). A 512-word dictionary with the following starting content is assumed: dictionary locations 0 through 255 contain the gray-level values 0 through 255, and locations 256 through 511 are initially unused. Table 1 (on the slide) traces the encoding, listing the currently recognized sequence s, the pixel being processed c, the encoded output, and the dictionary code and entry added at each step.

The image is encoded by processing its pixels in a left-to-right, top-to-bottom manner; each successive gray-level value is concatenated with a variable (column 1 of Table 1) called the currently recognized sequence. At the conclusion of this coding, the dictionary contains new code words beyond the initial 256, and the LZW algorithm has successfully identified several repeating gray-level sequences. The encoded output is obtained by reading the third column from top to bottom. The resulting compression ratio is 1.42:1.

LZW Decompression

LZW Decompression Example: decoding the transmitted code sequence (1 2 4 5 2 3 4 6 1 from the earlier example) rebuilds the same dictionary on the fly and recovers the original string ABABBABCABABBA.
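A matching decoder sketch (again assuming the initial codes A=1, B=2, C=3). The conditional handles the special LZW case where a received code refers to an entry the decoder has not finished building yet:

import java.util.*;

public class LzwDecode {
    static String decompress(List<Integer> codes, Map<Integer, String> dict) {
        int nextCode = dict.size() + 1;
        StringBuilder out = new StringBuilder();
        String prev = dict.get(codes.get(0));
        out.append(prev);
        for (int i = 1; i < codes.size(); i++) {
            int code = codes.get(i);
            // Unknown code: it must be prev + first character of prev
            String entry = dict.containsKey(code) ? dict.get(code) : prev + prev.charAt(0);
            out.append(entry);
            dict.put(nextCode++, prev + entry.charAt(0));   // mirror the encoder's dictionary growth
            prev = entry;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<Integer, String> dict = new HashMap<>();
        dict.put(1, "A"); dict.put(2, "B"); dict.put(3, "C");
        System.out.println(decompress(Arrays.asList(1, 2, 4, 5, 2, 3, 4, 6, 1), dict));
        // prints ABABBABCABABBA: the original 14-character string
    }
}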

Exercises: Use LZW to trace the encoding of the string ABRACADABRA. Write a Java program that encodes a given string using LZW. Write a Java program that decodes a given set of encoded codewords using LZW.