CS 414 – Multimedia Systems Design, Lecture 6 – Basics of Compression (Part 1). Klara Nahrstedt, Spring 2010.


Administrative: MP1 is posted. Discussion meeting today, Monday, February 1, at 7pm, 1105 SC.

Outline
- Final video concepts
- Need for compression and classification of compression algorithms
- Basic coding concepts: fixed-length and variable-length coding, compression ratio, entropy
- RLE compression (entropy coding)
- Huffman compression (statistical entropy coding)

Reading: Media Coding and Content Processing, Steinmetz and Nahrstedt, Prentice Hall, 2002
- Video concepts – Chapter 5 and lecture notes
- Data compression – Chapter 7; basic coding concepts – Sections and lecture notes

Video Concepts
- Spatial resolution; temporal resolution (flicker effect)
- Color theory: RGB, hue scale, color coding (YUV, YIQ, ...)
- Viewing distance
- Different TV formats (NTSC, PAL, HDTV)
- Pixel aspect ratio; aspect ratio
- SMPTE time codes

HDTV
- Digital Television Broadcast (DTB) system
- Roughly twice as many horizontal columns and vertical lines as traditional TV
- Resolution: 1920x1080 (1080p) – standard HDTV
- Frame rate: 50 or 60 frames per second

HDTV (cont.)
- Interlaced and/or progressive formats: conventional TVs use interlaced formats; computer displays (LCDs) use progressive scanning
- MPEG-2 compressed streams; in Europe (Germany) – MPEG-4 compressed streams

Pixel Aspect Ratio (source: Wikipedia)
- Computer graphics parameter: the mathematical ratio of a pixel's horizontal length to its vertical height
- Used mainly in digital video editing software to properly scale and render video into a new format

Aspect Ratio and Refresh Rate (source: Wikipedia)
Aspect ratio:
- Conventional TV is 4:3 (1.33)
- HDTV is 16:9 (1.78)
- Cinema uses 1.85:1 or 2.35:1
Frame rate:
- NTSC is 60Hz interlaced (actually 59.94Hz)
- PAL/SECAM is 50Hz interlaced
- Cinema is 24Hz non-interlaced

SMPTE Time Codes
The Society of Motion Picture and Television Engineers defines time codes for video: HH:MM:SS:FF
- 01:12:59:16 represents the number of pictures corresponding to 1 hour, 12 minutes, 59 seconds, 16 frames
- At 30 fps, 59 seconds represent 59*30 frames, 12 minutes represent 12*60*30 frames, and 1 hour represents 1*60*60*30 frames
For NTSC, SMPTE uses a 30 fps drop-frame code:
- increment as if using 30 fps, when really NTSC has only 29.97 fps
- defines rules for dropping frame numbers to remove the difference error
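The frame arithmetic above can be sketched in a few lines. This is a minimal illustration with a hypothetical helper name, assuming a plain non-drop-frame 30 fps count as in the slide's example:

```python
# Convert an HH:MM:SS:FF SMPTE time code to a total frame count.
# Assumes a simple non-drop-frame count (NTSC drop-frame rules omitted).
def smpte_to_frames(timecode: str, fps: int = 30) -> int:
    hh, mm, ss, ff = (int(part) for part in timecode.split(":"))
    return ((hh * 60 + mm) * 60 + ss) * fps + ff

print(smpte_to_frames("01:12:59:16"))  # 1*60*60*30 + 12*60*30 + 59*30 + 16 = 131386
```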

Need for Compression
Uncompressed audio:
- 8 KHz, 8 bit → 8 KB per second → 30 MB per hour
- 44.1 KHz, 16 bit → 88.2 KB per second → 317.5 MB per hour
- A 100 GB disk holds about 315 hours of CD-quality music
Uncompressed video:
- 640x480 resolution, 8-bit color, 24 fps → 7.37 MB per second → 26.5 GB per hour
- 640x480 resolution, 24-bit (3 bytes) color, 30 fps → 27.6 MB per second → 99.5 GB per hour
- A 100 GB disk holds about 1 hour of high-quality video
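The video figures above follow from simple multiplication; a short sketch (with a hypothetical helper name) reproduces them:

```python
# Bytes per second of uncompressed video: width x height x bytes/pixel x fps.
def raw_video_rate(width, height, bytes_per_pixel, fps):
    return width * height * bytes_per_pixel * fps

rate = raw_video_rate(640, 480, 3, 30)   # 27,648,000 bytes/s, i.e. about 27.6 MB/s
per_hour_gb = rate * 3600 / 1e9          # about 99.5 GB per hour
print(rate, round(per_hour_gb, 1))
```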

Broad Classification
Entropy coding (statistical):
- lossless; independent of data characteristics
- e.g., RLE, Huffman, LZW, arithmetic coding
Source coding:
- lossy; may consider semantics of the data; depends on characteristics of the data
- e.g., DCT, DPCM, ADPCM, color model transform
Hybrid coding (used by most multimedia systems):
- combines entropy coding with source coding
- e.g., JPEG-2000, H.264, MPEG-2, MPEG-4

Data Compression
A branch of information theory:
- minimize the amount of information to be transmitted
Transform a sequence of characters into a new string of bits:
- same information content
- length as short as possible

Concepts
Coding (the code) maps source messages from an alphabet (A) into code words (B). A source message (symbol) is the basic unit into which a string is partitioned; it can be a single letter or a string of letters.
Example: "aa bbb cccc ddddd eeeeee fffffffgggggggg"
- A = {a, b, c, d, e, f, g, space}
- B = {0, 1}

Taxonomy of Codes
- Block-block: source messages and code words of fixed length; e.g., ASCII
- Block-variable: source messages fixed, code words variable; e.g., Huffman coding
- Variable-block: source messages variable, code words fixed; e.g., RLE, LZW
- Variable-variable: source messages variable, code words variable; e.g., arithmetic coding

Example of Block-Block Coding
"aa bbb cccc ddddd eeeeee fffffffgggggggg" requires 120 bits

Symbol   Code word
a        000
b        001
c        010
d        011
e        100
f        101
g        110
space    111
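A one-line check of the bit count: the quoted message has 35 letters plus 5 spaces, i.e. 40 fixed-length source symbols at 3 bits each.

```python
# The block-block example: every symbol (letters and spaces) costs 3 bits.
message = "aa bbb cccc ddddd eeeeee fffffffgggggggg"
total_bits = len(message) * 3
print(total_bits)  # 120
```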

Example of Variable-Variable Coding
"aa bbb cccc ddddd eeeeee fffffffgggggggg" requires 30 bits (don't forget the spaces)

Symbol     Code word
aa         0
bbb        1
cccc       10
ddddd      11
eeeeee     100
fffffff    101
gggggggg   110
space      111
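A small check of the 30-bit total, assuming the message is parsed into the table's source messages (spaces included):

```python
# Variable-variable code from the table above.
table = {"aa": "0", "bbb": "1", "cccc": "10", "ddddd": "11",
         "eeeeee": "100", "fffffff": "101", "gggggggg": "110", " ": "111"}
# The message split into its source messages: 7 symbol groups, 5 spaces.
parts = ["aa", " ", "bbb", " ", "cccc", " ", "ddddd", " ",
         "eeeeee", " ", "fffffff", "gggggggg"]
total_bits = sum(len(table[p]) for p in parts)
print(total_bits)  # 15 symbol bits + 5*3 space bits = 30
```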

Concepts (cont.)
A code is:
- distinct if each code word can be distinguished from every other (the mapping is one-to-one)
- uniquely decodable if every code word is identifiable when immersed in a sequence of code words
- e.g., with the previous table, the message 11 could be decoded as either ddddd or bbbbbb, so that code is not uniquely decodable

Static Codes
Mapping is fixed before transmission:
- a message is represented by the same codeword every time it appears in the ensemble
- Huffman coding is an example
Better for independent sequences:
- probabilities of symbol occurrences must be known in advance

Dynamic Codes
Mapping changes over time (also referred to as adaptive coding)
Attempts to exploit locality of reference:
- periodic, frequent occurrences of messages
- dynamic Huffman is an example
Hybrids?
- build a set of codes, select based on input

Traditional Evaluation Criteria
- Algorithm complexity (running time)
- Amount of compression: redundancy, compression ratio
How to measure?

Measure of Information
Consider symbols s_i and the probability of occurrence of each symbol, p(s_i).
In the case of fixed-length coding, the smallest number of bits per symbol needed is
- L ≥ ⌈log2 N⌉ bits per symbol, for N distinct symbols
- Example: a message with 5 distinct symbols needs 3 bits (⌈log2 5⌉ = 3)
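The fixed-length bound can be computed directly (a minimal sketch with a hypothetical helper name):

```python
# Smallest whole number of bits that can distinguish n_symbols values.
from math import ceil, log2

def fixed_length_bits(n_symbols: int) -> int:
    return ceil(log2(n_symbols))

print(fixed_length_bits(5))  # 3
```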

Variable-Length Coding – Entropy
What is the minimum number of bits per symbol?
Answer: Shannon's result – the theoretical minimum average number of bits per code word is known as the entropy (H):
- H = -Σ_i p(s_i) log2 p(s_i)

Entropy Example
Alphabet = {A, B}
- p(A) = 0.4; p(B) = 0.6
Compute entropy (H):
- H = -0.4*log2(0.4) - 0.6*log2(0.6) ≈ 0.97 bits
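The same computation as a short sketch (hypothetical `entropy` helper):

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits per symbol: H = -sum(p * log2(p))."""
    return -sum(p * log2(p) for p in probs if p > 0)

print(round(entropy([0.4, 0.6]), 2))  # 0.97
```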

Compression Ratio
Compare the average message length and the average codeword length:
- e.g., average L(message) / average L(codeword)
Example:
- {aa, bbb, cccc, ddddd, eeeeee, fffffff, gggggggg}
- Average message length is 5
- If we use the code words from the variable-variable example, {0, 1, 10, 11, 100, 101, 110}, the average codeword length is 15/7 ≈ 2.14 bits
- Compression ratio: 5/2.14 ≈ 2.33
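The ratio can be reproduced directly from the two lists:

```python
# Average source-message length vs. average codeword length.
messages = ["aa", "bbb", "cccc", "ddddd", "eeeeee", "fffffff", "gggggggg"]
codewords = ["0", "1", "10", "11", "100", "101", "110"]

avg_msg = sum(map(len, messages)) / len(messages)      # 35/7 = 5.0
avg_code = sum(map(len, codewords)) / len(codewords)   # 15/7, about 2.14
print(round(avg_msg / avg_code, 2))                    # 2.33
```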

Symmetry
Symmetric compression:
- requires the same time for encoding and decoding
- used for live-mode applications (e.g., teleconferencing)
Asymmetric compression:
- compression performed once, when enough time is available; decompression performed frequently, must be fast
- used for retrieval-mode applications (e.g., an interactive CD-ROM)

Entropy Coding Algorithms (Content-Dependent Coding)
Run-Length Encoding (RLE):
- replaces a sequence of identical consecutive bytes with the number of occurrences
- the number of occurrences is indicated by a special flag (e.g., !)
- Example: abcccccccccdeffffggg (20 bytes) → abc!9def!4ggg (13 bytes)
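A minimal sketch of the convention in the example (hypothetical `rle_encode` helper): a run of four or more identical characters becomes char!count, while shorter runs are copied verbatim. Runs longer than nine, which would need more than one count digit, are ignored here for brevity.

```python
from itertools import groupby

def rle_encode(text: str) -> str:
    out = []
    for char, group in groupby(text):
        run = len(list(group))
        # Flagging pays off only for runs of 4+: "f!4" (3 bytes) vs "ffff" (4).
        out.append(f"{char}!{run}" if run >= 4 else char * run)
    return "".join(out)

print(rle_encode("abcccccccccdeffffggg"))  # abc!9def!4ggg
```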

Variations of RLE (Zero-Suppression Technique)
- Assumes that only one symbol appears often (blank)
- Replace a sequence of blanks by an M-byte and a byte with the number of blanks in the sequence
- Example: M3, M4, M14, ...
- Other definitions are possible, e.g., M4 = 8 blanks, M5 = 16 blanks, M4M5 = 24 blanks

Huffman Encoding
Statistical encoding. To determine the Huffman code, it is useful to construct a binary tree:
- leaves are the characters to be encoded
- nodes carry the occurrence probabilities of the characters belonging to their subtree
Example: What does the Huffman code look like for symbols with statistical occurrence probabilities P(A) = 8/20, P(B) = 3/20, P(C) = 7/20, P(D) = 2/20?

Huffman Encoding (Example)
Step 1: Sort all symbols according to their probabilities, from smallest to largest (left to right); these are the leaves of the Huffman tree:
- P(C) = 0.09, P(E) = 0.11, P(D) = 0.13, P(A) = 0.16, P(B) = 0.51

Huffman Encoding (Example)
Step 2: Build a binary tree from the leaves up. Policy: always connect the two smallest nodes together (e.g., P(CE) and P(DA) both had probabilities smaller than P(B), hence those two were connected first):
- P(CE) = 0.20, P(DA) = 0.29, P(CEDA) = 0.49, P(CEDAB) = 1

Huffman Encoding (Example)
Step 3: Label the left branches of the tree with 0 and the right branches with 1.

Huffman Encoding (Example)
Step 4: Read off the Huffman code for each symbol from the root down:
- Symbol A = 011
- Symbol B = 1
- Symbol C = 000
- Symbol D = 010
- Symbol E = 001
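Steps 1–4 can be sketched with a min-heap (hypothetical `huffman_code` helper): always merge the two least-probable nodes, then label left branches 0 and right branches 1. Ties here are broken by insertion order, so codes for other inputs may differ from a hand-built tree while remaining optimal.

```python
import heapq

def huffman_code(probs):
    # Heap entries: (probability, tie-breaker, symbol or (left, right) subtree).
    heap = [(p, i, sym) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, left = heapq.heappop(heap)   # two smallest nodes (Step 2)
        p2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (p1 + p2, counter, (left, right)))
        counter += 1
    code = {}
    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")     # left branch gets 0 (Step 3)
            walk(node[1], prefix + "1")     # right branch gets 1
        else:
            code[node] = prefix or "0"
    walk(heap[0][2], "")
    return code

print(huffman_code({"C": 0.09, "E": 0.11, "D": 0.13, "A": 0.16, "B": 0.51}))
# {'C': '000', 'E': '001', 'D': '010', 'A': '011', 'B': '1'}
```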

Summary
Compression algorithms are of great importance when processing and transmitting audio, images, and video.