COMP261 Lecture 21 Data Compression.


Data/Text Compression
Files containing text, sound, video, etc. can be huge!
Can we reduce the time needed to transmit them and the space needed to store them?
E.g. text files are hugely redundant: we use 8 bits (or more) to store each character.
Compression: reducing the memory required to store some information.
Diagram: original text/image/sound → compress → compressed text/image/sound
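As a rough illustration of the redundancy claim, the sketch below counts how many distinct characters a short sample actually uses and how many bits a fixed-width code would need for them. The sample sentence and the idea of a fixed-width recoding are assumptions chosen for illustration; they are not a scheme proposed in the lecture.

```python
import math

# Sample text (an assumption for illustration).
text = "a contrived text containing riveting contrasting"

# How many different characters actually occur?
distinct = len(set(text))

# Bits needed per character if each distinct character got a fixed-width code.
bits_needed = math.ceil(math.log2(distinct))

size_at_8_bits = len(text) * 8
size_fixed_width = len(text) * bits_needed

print(distinct, bits_needed)             # 14 4
print(size_at_8_bits, size_fixed_width)  # 384 192
```

Even before looking for repeated patterns, half of the 8 bits per character carry no information for this sample.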

Lossless v. Lossy Data Compression
Data compression may be:
  Lossless: can retrieve the original data exactly. Important for text and some numerical data.
  Lossy: can't retrieve the original data exactly. Acceptable in some contexts, e.g. MP3 compresses sound files.
Lossless compression is only possible if there is redundancy in the original.
Compression identifies and removes some of the redundant elements, e.g. repeated patterns:
  If there are lots of repeated characters, replace each run by a count and the character.
  Construct a dictionary and replace words by indexes into it.
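The "count and character" idea is usually called run-length encoding. A minimal sketch, assuming the encoded form is a list of (count, character) pairs; the function names `rle_encode` and `rle_decode` are hypothetical:

```python
from itertools import groupby

def rle_encode(text):
    """Collapse each run of identical characters into a (count, character) pair."""
    return [(len(list(run)), ch) for ch, run in groupby(text)]

def rle_decode(pairs):
    """Expand (count, character) pairs back into the original string."""
    return "".join(ch * count for count, ch in pairs)

pairs = rle_encode("aaaabbbcca")
print(pairs)              # [(4, 'a'), (3, 'b'), (2, 'c'), (1, 'a')]
print(rle_decode(pairs))  # aaaabbbcca
```

The round trip returns the input exactly, which is what makes the scheme lossless. It only saves space when runs are long; text with few repeated characters would grow instead.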

Lempel-Ziv
Lossless compression.
LZ77 = simple compression using repeated patterns; the basis for many later, more sophisticated compression schemes.
Key idea: if you find a repeated pattern, replace the later occurrences by a link [offset back, length] to an earlier one:
  Original:   a contrived text containing riveting contrasting
  Compressed: a contrived text[15,5]ain[2,2]g [22,4]t[9,3][35,6]as[12,4]

Lempel-Ziv
Text:
  a contrived text containing riveting contrasting t
→ Triples:
  [0,0,a][0,0, ][0,0,c][0,0,o][0,0,n][0,0,t][0,0,r][0,0,i][0,0,v][0,0,e][0,0,d][10,1,t]
  [4,1,x][3,1, ][15,4,a][15,1,n][2,2,g][11,1,r][22,3,t][9,4,c][35,4,a][0,0,s][12,5,t]
Link form of the same text (without the trailing " t"):
  a contrived text[15,5]ain[2,2]g [22,4]t[9,3][35,6]as[12,4]

Lempel-Ziv 77
Outputs a string of tuples: tuple = [offset, length, nextCharacter], or [0,0,character].
Moves a cursor through the text one character at a time; the cursor points at the next character to be encoded.
Drags a "sliding window" behind the cursor; searches for matches only in this sliding window.
Expands a lookahead buffer from the cursor; this is the string it tries to match in the sliding window.
Searches for a match for the longest possible lookahead; stops expanding when there isn't a match.
Inserts a triple of match point, length, and next character.

Lempel-Ziv 77 Algorithm
cursor ← 0
windowSize ← 100 // some suitable size
while cursor < text.size
    lookahead ← 0
    prevMatch ← 0
    loop
        // text[cursor .. cursor+lookahead] is read as the lookahead characters starting at cursor
        // (end index excluded), so the first test, on the empty string, always succeeds
        match ← stringMatch( text[cursor .. cursor+lookahead],
                             text[(cursor < windowSize) ? 0 : cursor-windowSize .. cursor-1] )
        if match succeeded then
            prevMatch ← match
            lookahead++
        else
            output( [suitable value for prevMatch, lookahead - 1, text[cursor+lookahead - 1]] )
            cursor ← cursor + lookahead
            break
Notes: cursor - windowSize should never point before 0; cursor+lookahead mustn't go past the end of the text.
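The algorithm on this slide can be sketched in Python as follows. This is one possible reading rather than the lecture's reference implementation: the function name `lz77_compress` is hypothetical, `str.rfind` stands in for `stringMatch` (so ties go to the most recent occurrence in the window), and a match is always cut short enough to leave one character for the nextCharacter slot.

```python
def lz77_compress(text, window_size=100):
    """Encode text as a list of (offset, length, next_char) triples."""
    triples = []
    cursor = 0
    while cursor < len(text):
        # The sliding window never starts before index 0.
        window_start = max(0, cursor - window_size)
        best_offset, best_length = 0, 0
        length = 1
        # Stop one character early so that next_char always exists.
        while cursor + length < len(text):
            # Look for the lookahead string entirely inside the window.
            pos = text.rfind(text[cursor:cursor + length], window_start, cursor)
            if pos == -1:
                break
            best_offset, best_length = cursor - pos, length
            length += 1
        triples.append((best_offset, best_length, text[cursor + best_length]))
        cursor += best_length + 1
    return triples

triples = lz77_compress("a contrived text containing riveting contrasting t")
print(triples[11:15])  # [(10, 1, 't'), (4, 1, 'x'), (3, 1, ' '), (15, 4, 'a')]
```

On the sample sentence with its trailing " t", this produces the same 23 triples as the worked example. Each search rescans the window, so the sketch is quadratic in the worst case; it is meant to show the logic, not to be fast.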

Decompression
Decode each tuple in turn.
Example:
  [0,0,a][0,0, ][0,0,c][0,0,o][0,0,n][0,0,t][0,0,r][0,0,i][0,0,v][0,0,e][0,0,d][10,1,t]
  [4,1,x][3,1, ][15,4,a][15,1,n][2,2,g][11,1,r][22,3,t][9,4,c][35,4,a][0,0,s][12,5,t]
  → a contrived text containing riveting contrasting t
cursor ← 0
for each tuple
    [0, 0, ch] →
        output[cursor++] ← ch
    [offset, length, ch] →
        for j = 0 to length-1
            output[cursor++] ← output[cursor-offset]
        output[cursor++] ← ch
That's awful coding!!!
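A minimal Python sketch of the decoding loop, assuming the triples arrive as (offset, length, ch) tuples; the function name `lz77_decompress` is hypothetical. It computes the copy position once per tuple instead of combining `cursor++` with `cursor-offset` in a single statement.

```python
def lz77_decompress(triples):
    """Rebuild the text from a list of (offset, length, ch) triples."""
    output = []
    for offset, length, ch in triples:
        # Position of the earlier occurrence, counted back from the cursor.
        start = len(output) - offset
        # Copy one character at a time; a [0, 0, ch] tuple copies nothing.
        for j in range(length):
            output.append(output[start + j])
        output.append(ch)
    return "".join(output)

slide_triples = [
    (0, 0, 'a'), (0, 0, ' '), (0, 0, 'c'), (0, 0, 'o'), (0, 0, 'n'), (0, 0, 't'),
    (0, 0, 'r'), (0, 0, 'i'), (0, 0, 'v'), (0, 0, 'e'), (0, 0, 'd'), (10, 1, 't'),
    (4, 1, 'x'), (3, 1, ' '), (15, 4, 'a'), (15, 1, 'n'), (2, 2, 'g'), (11, 1, 'r'),
    (22, 3, 't'), (9, 4, 'c'), (35, 4, 'a'), (0, 0, 's'), (12, 5, 't'),
]
print(lz77_decompress(slide_triples))
# a contrived text containing riveting contrasting t
```

Treating [0, 0, ch] as the zero-length case of the general tuple removes the need for two branches.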