Data Compression Michael J. Watts

Lecture Outline
- What is compression?
- Huffman encoding
- Lempel-Ziv encoding
- Fraunhofer compression
- JPEG compression

What is Compression?
- Reduction of the length of the symbols used to represent data
  - Symbols can be anything
- Based on redundancy
  - Information theory
- Lossless vs lossy
- Degree of compression depends on the amount of redundancy
  - Random data compression?
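The link between redundancy and compressibility can be made concrete with Shannon entropy, the information-theory measure of how many bits per symbol are needed on average. The sketch below is illustrative only; the function name `entropy_bits_per_symbol` is an assumption, not something from the lecture.

```python
import math
from collections import Counter

def entropy_bits_per_symbol(data):
    """Shannon entropy of the symbol frequencies in `data`, in bits per symbol."""
    counts = Counter(data)
    total = len(data)
    # Each symbol with probability p = c / total contributes p * log2(1 / p).
    return sum((c / total) * math.log2(total / c) for c in counts.values())

# A fully redundant string needs no information per symbol,
# while four equally likely symbols need two bits each.
print(entropy_bits_per_symbol("aaaa"))
print(entropy_bits_per_symbol("aabb"))
print(entropy_bits_per_symbol("abcd"))
```

Low entropy means high redundancy and therefore more room for compression; data whose symbols are all equally likely and independent leaves little or nothing to remove.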

What is Compression?
- Represent symbols with codewords
  - One symbol == one codeword
  - Length of codeword < length of symbol (on average)
- Many algorithms exist
  - Huffman
  - Lempel-Ziv
  - Fraunhofer (MP3)

Huffman Encoding
- Based on probability (frequency) of symbols
- Assigns codewords as bit strings
- Builds a binary tree based on probability
  - Frequent / high probability symbols near the top
  - Infrequent / low probability symbols lower down
  - Branches labelled with 0 and 1
- Frequent symbols get short bit strings
- Infrequent symbols get longer bit strings

Huffman Encoding
- Algorithm
  - Arrange symbols in order of decreasing probability
  - Combine the two symbols with the lowest probability
    - Assign a new name, add their probabilities
  - Rebuild the list
  - Combine the next two lowest symbols
  - Repeat until there is one symbol, with a probability of one
  - Build a binary tree
  - Form codewords by tracing down the tree to the symbol
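The steps above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: it uses raw symbol counts instead of probabilities (the resulting tree is the same), keeps the candidate list in a heap instead of re-sorting it, and the name `huffman_codes` is an assumption.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Return a dict mapping each symbol in `text` to its Huffman bit string."""
    freq = Counter(text)
    if not freq:
        return {}
    # Heap entries are (weight, tie_breaker, tree). A tree is either a single
    # symbol or a (left, right) pair. The unique tie_breaker keeps Python from
    # ever comparing two trees directly.
    heap = [(w, i, sym) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:
        # Only one distinct symbol: give it a one-bit codeword.
        return {heap[0][2]: "0"}
    next_id = len(heap)
    while len(heap) > 1:
        # Combine the two lowest-weight entries and add their weights.
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next_id, (t1, t2)))
        next_id += 1
    codes = {}

    def walk(tree, prefix):
        # Trace down the tree: left branch is 0, right branch is 1.
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix

    walk(heap[0][2], "")
    return codes

codes = huffman_codes("abracadabra")
encoded = "".join(codes[ch] for ch in "abracadabra")
print(codes, len(encoded), "bits")
```

For `"abracadabra"` the most frequent symbol, `a`, ends up directly below the root and so receives a one-bit codeword, while the rarer symbols receive longer ones.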

Huffman Encoding
- Decompression is the opposite of compression
  - Follow the bit sequence down the tree to a terminal node
- The encoder needs to see the entire data set first
  - Must know the symbol probabilities
- Also combined with other techniques
  - MP3 encoding
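Decoding can be sketched as below. Because no codeword is a prefix of another, collecting bits until they match a codeword is equivalent to following branches down the tree to a terminal node. The code table `CODES` and the function name `huffman_decode` are hypothetical and chosen only for illustration.

```python
def huffman_decode(bits, codes):
    """Decode a string of '0'/'1' characters using a prefix-free code table."""
    inverse = {code: sym for sym, code in codes.items()}
    out = []
    current = ""
    for bit in bits:
        current += bit
        # Reaching a known codeword corresponds to reaching a terminal node.
        if current in inverse:
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("bit string ends in the middle of a codeword")
    return "".join(out)

# Hypothetical code table: 'a' is assumed to be the most frequent symbol.
CODES = {"a": "0", "b": "10", "c": "11"}
encoded = "".join(CODES[ch] for ch in "abca")
print(encoded, "->", huffman_decode(encoded, CODES))
```

Note that the decoder needs the same code table (or the symbol frequencies used to build it) as the encoder, which is why it usually has to be stored or transmitted alongside the compressed data.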

Lempel-Ziv
- Dictionary-based encoder
- Lossless
- Replaces phrases with tokens
  - Tokens are smaller than the phrases
- Converts an input stream to a compressed output stream
- Doesn't need to see the entire data set
  - Dictionary is built as it moves through the stream

Lempel-Ziv
- Separates the input stream into tokens
- Each token represents the shortest phrase that has not been seen before
- Tokens are numbered
- Tokens can contain (refer to) earlier tokens
- Compression is inefficient at the start
- Better later
  - Larger dictionary
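The description above matches the LZ78 member of the Lempel-Ziv family, and a minimal sketch under that assumption follows. Each token is a pair (number of an earlier phrase, next character); the function names are illustrative, and a real encoder would also pack the tokens into bits.

```python
def lz78_compress(text):
    """Split `text` into LZ78 tokens of the form (earlier phrase number, next char)."""
    dictionary = {"": 0}  # phrase -> phrase number; 0 is the empty phrase
    tokens = []
    phrase = ""
    for ch in text:
        if phrase + ch in dictionary:
            # Still a known phrase: keep extending it.
            phrase += ch
        else:
            # Shortest phrase not seen before: emit a token and remember it.
            tokens.append((dictionary[phrase], ch))
            dictionary[phrase + ch] = len(dictionary)
            phrase = ""
    if phrase:
        # Input ended inside a known phrase; emit it with no extra character.
        tokens.append((dictionary[phrase], ""))
    return tokens

def lz78_decompress(tokens):
    """Rebuild the text, reconstructing the same dictionary on the fly."""
    phrases = [""]
    out = []
    for index, ch in tokens:
        phrase = phrases[index] + ch
        phrases.append(phrase)
        out.append(phrase)
    return "".join(out)

tokens = lz78_compress("abababab")
print(tokens)
print(lz78_decompress(tokens))
```

The first tokens each cover a single character, which is why compression is poor at the start; later tokens refer to longer and longer phrases as the dictionary grows.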

Fraunhofer / MP3 Encoding
- Lossy algorithm
- Compression of audio data
- Perceptual encoding
  - Psycho-acoustic model
  - Throws out what can't be heard
    - Frequency bands
    - Masking effects
- Output is further compressed
  - Uses Huffman encoding

MP3 Encoding
- Only usable on audio
- Part of the MPEG standard (MPEG-1 Audio Layer III)
  - The MPEG family is also used for DVD video
- Rather widely used by itself

JPEG Compression
- Joint Photographic Experts Group
- Used to compress images
- Compresses groups of pixels (8×8 blocks)
- Uses a Discrete Cosine Transform
  - DCT
- Quantises the results of the DCT
  - Lossy!
- Further compression is applied
  - Lossless entropy coding, e.g. Huffman
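The transform-then-quantise idea can be sketched in one dimension. This is a simplification under stated assumptions: JPEG itself applies a two-dimensional DCT to 8×8 blocks and divides by a table of step sizes, whereas the sketch uses a single row of eight samples and one hypothetical step size of 16. All function names are illustrative.

```python
import math

def dct(samples):
    """Orthonormal DCT-II of a list of samples."""
    n = len(samples)
    out = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        out.append(scale * sum(x * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                               for i, x in enumerate(samples)))
    return out

def idct(coeffs):
    """Inverse of dct(): rebuild the samples from the coefficients."""
    n = len(coeffs)
    return [sum(math.sqrt((1 if k == 0 else 2) / n) * c
                * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for k, c in enumerate(coeffs))
            for i in range(n)]

def quantise(coeffs, step):
    """Round each coefficient to a multiple of `step`; this is the lossy part."""
    return [round(c / step) for c in coeffs]

def dequantise(levels, step):
    return [q * step for q in levels]

row = [100] * 8                      # a perfectly flat row of pixels
levels = quantise(dct(row), 16)      # only the first (average) term survives
restored = idct(dequantise(levels, 16))
print(levels)
print([round(v, 2) for v in restored])
```

Smooth image regions concentrate their energy in a few low-frequency coefficients, so after quantisation most levels are zero and the following lossless stage has long runs of zeros to exploit. The restored samples are close to, but not exactly, the originals.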

Summary
- Compression is an application of information theory
- Encodes long strings as shorter strings
- All schemes depend on redundant / repetitive data
- Common gotchas
  - A compression algorithm cannot usefully compress its own output
  - Cannot easily compress completely random data
    - No / little redundancy
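Both gotchas can be observed with Python's standard `zlib` module, which implements DEFLATE (a Lempel-Ziv variant combined with Huffman coding). The sketch below is illustrative only; the word list and the seed are arbitrary choices, and exact sizes depend on the zlib version, so none are quoted here.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable
words = ["data", "compression", "symbol", "codeword",
         "tree", "token", "stream", "bit"]

# Redundant input: text built from a small, repeating vocabulary.
text = " ".join(rng.choice(words) for _ in range(5000)).encode("ascii")
# Non-redundant input: pseudo-random bytes of the same length.
noise = bytes(rng.getrandbits(8) for _ in range(len(text)))

once = zlib.compress(text)
twice = zlib.compress(once)          # compressing the compressor's own output
noise_packed = zlib.compress(noise)

print("text:            ", len(text))
print("compressed once: ", len(once))
print("compressed twice:", len(twice))
print("noise:           ", len(noise))
print("noise compressed:", len(noise_packed))
```

The first pass removes most of the redundancy, so the second pass has almost nothing left to work with, and the random bytes gain nothing from compression at all.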