File Compression Techniques Alex Robertson

Outline: History, Lossless vs Lossy, Basics, Huffman Coding, Getting Advanced, Lossy Explained, Limitations, Future

History: where this all started. The problem! In the 1940s came Shannon-Fano coding. Properties: different codes have different numbers of bits; codes for symbols with low probabilities have more bits, and codes for symbols with high probabilities have fewer bits; though the codes are of different bit lengths, they can be uniquely decoded.

Lossless vs Lossy. Lossless (e.g., DEFLATE): every little detail of the data is important. Lossy (e.g., JPEG, MP3): some data can be lost without being noticed.

Understanding the Basics. Properties: different codes have different numbers of bits; codes for symbols with low probabilities have more bits, and codes for symbols with high probabilities have fewer bits; though the codes are of different bit lengths, they can be uniquely decoded. Encode “SATA” with S = 10, A = 0, T = 11.
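A minimal sketch of that encoding step (the code table and the message come from the slide; the function name is just illustrative):

```python
# Variable-length code from the slide: the frequent symbol A gets the shortest code.
CODE = {"S": "10", "A": "0", "T": "11"}

def encode(message, code=CODE):
    """Concatenate the codeword of each symbol."""
    return "".join(code[ch] for ch in message)

print(encode("SATA"))   # '100110' -> 6 bits, versus 4 * 7 = 28 bits in 7-bit ASCII
```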

Prefix Rule. Suppose instead S = 01, A = 0, T = 00. Then the bits for “SATA” (010000) could also be read as “SAAAA” or “STT”, so the code is ambiguous. No code can be the prefix of another code: if 0 is a code, no other code can start with 0.
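A quick way to check the prefix rule (a hypothetical helper, not from the slides): compare every pair of codewords and reject the table if one starts with another.

```python
def is_prefix_free(code):
    """True if no codeword is a prefix of another codeword."""
    words = list(code.values())
    return not any(
        i != j and words[j].startswith(words[i])
        for i in range(len(words))
        for j in range(len(words))
    )

print(is_prefix_free({"S": "10", "A": "0", "T": "11"}))  # True  -> decodes uniquely
print(is_prefix_free({"S": "01", "A": "0", "T": "00"}))  # False -> 'A' (0) prefixes S and T
```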

Make a Tree. Create a tree from the codes A = 010, B = 11, C = 00, D = 10, R = 011.
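One way to picture this (a sketch assuming 0 means “go left” and 1 means “go right”; the dict-based trie is just for illustration):

```python
def build_tree(code):
    """Turn a prefix-free code table into a binary trie: follow 0 for one
    child, 1 for the other, and store the symbol at the leaf."""
    root = {}
    for symbol, bits in code.items():
        node = root
        for b in bits:
            node = node.setdefault(b, {})
        node["symbol"] = symbol
    return root

def decode(bits, tree):
    """Walk the trie bit by bit; every time a leaf is reached, emit its symbol."""
    out, node = [], tree
    for b in bits:
        node = node[b]
        if "symbol" in node:
            out.append(node["symbol"])
            node = tree
    return "".join(out)

TREE = build_tree({"A": "010", "B": "11", "C": "00", "D": "10", "R": "011"})
print(decode("11010011", TREE))   # BAR
```

Walking the tree bit by bit is exactly why a prefix-free code decodes uniquely.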

Decode with A = 010, B = 11, C = 00, D = 10, R = 011. This code is decodable, but it violates the property that codes for symbols with low probabilities should have more bits and codes for symbols with high probabilities fewer bits (in “ABRACADABRA”, the most frequent symbol, A, gets a 3-bit code while B gets only 2).

Huffman Coding. Create a tree to encode “ABRACADABRA”. First determine the symbol frequencies, then: 1. Locate the two least frequent “nodes”. 2. Create a parent node from those two nodes and give it a weight equal to the sum of the two child nodes' frequencies. 3. Give one of the child nodes the 0 bit and the other the 1 bit. 4. Repeat the above steps until only one node is left. (A sketch of these steps in code follows.)
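A compact, heap-based sketch of those four steps (the variable names are mine, and the exact 0/1 assignment, and therefore the exact codewords, depends on how ties are broken):

```python
import heapq
from collections import Counter
from itertools import count

def huffman_codes(text):
    """Build a Huffman code table by repeatedly merging the two least
    frequent nodes, as in steps 1-4 above."""
    tie = count()   # tie-breaker so the heap never has to compare dicts
    heap = [(freq, next(tie), {sym: ""}) for sym, freq in Counter(text).items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)       # step 1: two least frequent nodes
        w2, _, right = heapq.heappop(heap)
        parent = {s: "0" + c for s, c in left.items()}         # step 3: 0 bit for one child,
        parent.update({s: "1" + c for s, c in right.items()})  #         1 bit for the other
        heapq.heappush(heap, (w1 + w2, next(tie), parent))     # step 2: weight = sum of children
    return heap[0][2]                           # step 4: stop when one node is left

codes = huffman_codes("ABRACADABRA")
encoded = "".join(codes[ch] for ch in "ABRACADABRA")
print(codes)          # e.g. {'A': '0', 'B': '110', ...} (exact bits vary with tie-breaking)
print(len(encoded))   # 23
```

However the ties fall, the total for “ABRACADABRA” comes out to 23 bits, which is the number the next slides compare against 7-bit ASCII.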

Does it work? Re-encode “ABRACADABRA” with the Huffman codes and count the bits.

It Works! The Huffman encoding of “ABRACADABRA” is only 23 bits, whereas plain “ABRACADABRA” = 11 characters * 7 bits each = 77 bits. But…

It Works… With Issues. The header must include the probability (frequency) table, and Huffman is not the best in certain cases. Example: ‘A’ repeated 100 times. Huffman only reduces this to 100 bits, 1 bit per symbol, not counting the header.

Moving Forward: the Arithmetic Method. There is no specific code per symbol; instead the output is a single, continuously refined floating-point number. Example:

“BILL GATES”

Character   Probability   Range
SPACE       1/10          0.0 <= r < 0.1
A           1/10          0.1 <= r < 0.2
B           1/10          0.2 <= r < 0.3
E           1/10          0.3 <= r < 0.4
G           1/10          0.4 <= r < 0.5
I           1/10          0.5 <= r < 0.6
L           2/10          0.6 <= r < 0.8
S           1/10          0.8 <= r < 0.9
T           1/10          0.9 <= r < 1.0
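A toy sketch of the interval-narrowing idea using the table above (exact fractions stand in for the scaled-integer arithmetic a real coder would use; the names are illustrative):

```python
from fractions import Fraction as F

# Ranges from the table above: symbol -> (low, high), each of width = probability.
RANGES = {
    " ": (F(0, 10), F(1, 10)), "A": (F(1, 10), F(2, 10)), "B": (F(2, 10), F(3, 10)),
    "E": (F(3, 10), F(4, 10)), "G": (F(4, 10), F(5, 10)), "I": (F(5, 10), F(6, 10)),
    "L": (F(6, 10), F(8, 10)), "S": (F(8, 10), F(9, 10)), "T": (F(9, 10), F(10, 10)),
}

def arithmetic_encode(message):
    """Narrow the interval [low, high) once per symbol; any number inside
    the final interval identifies the whole message."""
    low, high = F(0), F(1)
    for ch in message:
        width = high - low
        sym_low, sym_high = RANGES[ch]
        low, high = low + width * sym_low, low + width * sym_high
    return low, high

low, high = arithmetic_encode("BILL GATES")
print(float(low))   # 0.2572167752 -- one number encodes the entire string
```

Decoding runs the same table in reverse: find which range the number falls into, emit that symbol, rescale, and repeat.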

Dictionary-Based. Implemented in the late 1970s (the Lempel-Ziv family). Uses previously seen text as a dictionary. Compare “the quick brown fox jumped over the lazy dog” (almost no repetition to exploit) with “I bought a Mississippi Banana in Mississippi.” (the second “Mississippi” can point back to the first). A toy sketch follows.
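A deliberately simplified, LZ77-style sketch of the idea (the function and its triple format are illustrative, not the exact scheme from the slide):

```python
def lz77_sketch(text, window=255):
    """Emit (offset, length, next_char) triples, where offset/length point
    back into text we have already seen -- the 'dictionary'."""
    out, i = [], 0
    while i < len(text):
        best_len, best_off = 0, 0
        for j in range(max(0, i - window), i):           # scan the sliding window
            length = 0
            while (j + length < i and i + length < len(text)
                   and text[j + length] == text[i + length]):
                length += 1
            if length > best_len:
                best_len, best_off = length, i - j
        nxt = text[i + best_len] if i + best_len < len(text) else ""
        out.append((best_off, best_len, nxt))
        i += best_len + 1
    return out

tokens = lz77_sketch("I bought a Mississippi Banana in Mississippi.")
print(tokens)   # the second "Mississippi" comes out as a single back-reference
```

The fox sentence gives this scheme almost nothing to point back to, which is exactly why dictionary methods shine on repetitive text like the Mississippi example.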

Lossy Compression. [Slide figures: a lossy formula vs. a lossless formula, and an audio example, “My Sound!”]

Mathematical Limitations: Claude E. Shannon.
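Presumably this refers to Shannon's source-coding limit: on average, no lossless code can spend fewer bits per symbol than the entropy of the source, H(X) = -\sum_i p_i \log_2 p_i, so once a file's statistical redundancy has been squeezed out, no further lossless compression is possible.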

Example: DEFLATE (used by zip and gzip; it combines dictionary-based compression with Huffman coding).
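For a quick hands-on check, Python's standard zlib module wraps DEFLATE (the repeated test string here is just an arbitrary example):

```python
import zlib

data = b"ABRACADABRA " * 100
packed = zlib.compress(data)              # zlib's stream format wraps DEFLATE
print(len(data), "->", len(packed))       # repetitive input shrinks dramatically
assert zlib.decompress(packed) == data    # lossless: every byte comes back
```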

Future. Hardware is getting better, but the underlying theory stays the same.

Thank You. Questions?