Lempel-Ziv-Welch (LZW) Compression Algorithm

Slides:



Advertisements
Similar presentations
Multimedia Data Introduction to Lossless Data Compression Dr Mike Spann Electronic, Electrical and.
Advertisements

Source Coding Data Compression A.J. Han Vinck. DATA COMPRESSION NO LOSS of information and exact reproduction (low compression ratio 1:4) general problem.
CSCI 3280 Tutorial 6. Outline  Theory part of LZW  Tree representation of LZW  Table representation of LZW.
Greedy Algorithms (Huffman Coding)
Lempel-Ziv-Welch (LZW) Compression Algorithm
Algorithms for Data Compression
Lecture 6 Source Coding and Compression Dr.-Ing. Khaled Shawky Hassan
Data Compression Michael J. Watts
Lossless Compression - II Hao Jiang Computer Science Department Sept. 18, 2007.
Algorithm Programming Some Topics in Compression Bar-Ilan University תשס"ח by Moshe Fresko.
Lempel-Ziv Compression Techniques Classification of Lossless Compression techniques Introduction to Lempel-Ziv Encoding: LZ77 & LZ78 LZ78 Encoding Algorithm.
Lempel-Ziv Compression Techniques
A Data Compression Algorithm: Huffman Compression
Algorithm Programming Some Topics in Compression
Lempel-Ziv-Welch (LZW) Compression Algorithm
Lempel-Ziv Compression Techniques
Data Compression Basics & Huffman Coding
Lossless Compression Multimedia Systems (Module 2 Lesson 3)
Data Compression Algorithms for Energy-Constrained Devices in Delay Tolerant Networks Christopher M. Sadler and Margaret Martonosi In: Proc. of the 4th.
Compression Algorithms
Noiseless Coding. Introduction Noiseless Coding Compression without distortion Basic Concept Symbols with lower probabilities are represented by the binary.
Source Coding-Compression
Information and Coding Theory Heuristic data compression codes. Lempel- Ziv encoding. Burrows-Wheeler transform. Juris Viksna, 2015.
Page 110/6/2015 CSE 40373/60373: Multimedia Systems So far  Audio (scalar values with time), image (2-D data) and video (2-D with time)  Higher fidelity.
Fundamental Structures of Computer Science Feb. 24, 2005 Ananda Guna Lempel-Ziv Compression.
Lecture 29. Data Compression Algorithms 1. Commonly, algorithms are analyzed on the base probability factor such as average case in linear search. Amortized.
Fundamental Structures of Computer Science March 23, 2006 Ananda Guna Lempel-Ziv Compression.
Fundamental Data Structures and Algorithms Aleks Nanevski February 10, 2004 based on a lecture by Peter Lee LZW Compression.
Survey on Improving Dynamic Web Performance Guide:- Dr. G. ShanmungaSundaram (M.Tech, Ph.D), Assistant Professor, Dept of IT, SMVEC. Aswini. S M.Tech CSE.
EEM 480 Lecture 11 Hashing and Dictionaries. Symbol Table Symbol tables are used by compilers to keep track of information about variables functions class.
Cryptographic Attacks on Scrambled LZ-Compression and Arithmetic Coding By: RAJBIR SINGH BIKRAM KAHLON.
Multimedia Data Introduction to Lossless Data Compression Dr Sandra I. Woolley Electronic, Electrical.
The LZ family LZ77 LZ78 LZR LZSS LZB LZH – used by zip and unzip
Lossless Compression CIS 465 Multimedia. Compression Compression: the process of coding that will effectively reduce the total number of bits needed to.
Fundamental Data Structures and Algorithms Margaret Reid-Miller 24 February 2005 LZW Compression.
Huffman Codes Juan A. Rodriguez CS 326 5/13/2003.
Multimedia – Data Compression
Lecture 7 Source Coding and Compression Dr.-Ing. Khaled Shawky Hassan
Lempel-Ziv methods.
Lempel-Ziv-Welch Compression
Page 1KUT Graduate Course Data Compression Jun-Ki Min.
Lampel ZIV (LZ) code The Lempel-Ziv algorithm is a variable-to-fixed length code Basically, there are two versions of the algorithm LZ77 and LZ78 are the.
LZW (Lempel-Ziv-welch) compression method The LZW method to compress data is an evolution of the method originally created by Abraham Lempel and Jacob.
15-853Page :Algorithms in the Real World Data Compression III Lempel-Ziv algorithms Burrows-Wheeler Introduction to Lossy Compression.
CS 1501: Algorithm Implementation
Data Compression Michael J. Watts
CSE 589 Applied Algorithms Spring 1999
3.3 Fundamentals of data representation
Data Coding Run Length Coding
Data Compression.
Digital Image Processing Lecture 20: Image Compression May 16, 2005
Increasing Information per Bit
Information and Coding Theory
Lempel-Ziv-Welch (LZW) Compression Algorithm
Applied Algorithmics - week7
Lempel-Ziv Compression Techniques
Information of the LO Subject: Information Theory
Lempel-Ziv-Welch (LZW) Compression Algorithm
Why Compress? To reduce the volume of data to be transmitted (text, fax, images) To reduce the bandwidth required for transmission and to reduce storage.
Data Compression Reduce the size of data.
Lempel-Ziv Compression Techniques
Chapter 11 Data Compression
COMS 161 Introduction to Computing
COMS 161 Introduction to Computing
File Compression Even though disks have gotten bigger, we are still running short on disk space A common technique is to compress files so that they take.
Huffman Coding Link Huffman Coding How will the following be decoded: ‘e’
Table 3. Decompression process using LZW
CPS 296.3:Algorithms in the Real World
High-capacity Reversible Data-hiding for LZW Codes
Lempel-Ziv-Welch (LZW) Compression Algorithm
Presentation transcript:

Lempel-Ziv-Welch (LZW) Compression Algorithm Introduction to the LZW Algorithm Example 1: Encoding using LZW Example 2: Decoding using LZW LZW: Concluding Notes

Introduction to LZW As mentioned earlier, static coding schemes require some knowledge about the data before encoding takes place. Universal coding schemes, like LZW, do not require advance knowledge and can build such knowledge on-the-fly. LZW is the foremost technique for general purpose data compression due to its simplicity and versatility. It is the basis of many PC utilities that claim to “double the capacity of your hard drive” LZW compression uses a code table, with 4096 as a common choice for the number of table entries.

Introduction to LZW (cont'd) Codes 0-255 in the code table are always assigned to represent single bytes from the input file. When encoding begins the code table contains only the first 256 entries, with the remainder of the table being blanks. Compression is achieved by using codes 256 through 4095 to represent sequences of bytes. As the encoding continues, LZW identifies repeated sequences in the data, and adds them to the code table. Decoding is achieved by taking each code from the compressed file, and translating it through the code table to find what character or characters it represents.

LZW Encoding Algorithm 1 Initialize table with single character strings 2 P = first input character 3 WHILE not end of input stream 4 C = next input character 5 IF P + C is in the string table 6 P = P + C 7 ELSE 8   output the code for P 9 add P + C to the string table 10 P = C 11 END WHILE 12 output code for P

Example 1: Compression using LZW Example 1: Use the LZW algorithm to compress the string BABAABAAA

Example 1: LZW Compression Step 1 BABAABAAA P=A C=empty STRING TABLE ENCODER OUTPUT string codeword representing output code BA 256 B 66

Example 1: LZW Compression Step 2 BABAABAAA P=B C=empty STRING TABLE ENCODER OUTPUT string codeword representing output code BA 256 B 66 AB 257 A 65

Example 1: LZW Compression Step 3 BABAABAAA P=A C=empty STRING TABLE ENCODER OUTPUT string codeword representing output code BA 256 B 66 AB 257 A 65 BAA 258

Example 1: LZW Compression Step 4 BABAABAAA P=A C=empty STRING TABLE ENCODER OUTPUT string codeword representing output code BA 256 B 66 AB 257 A 65 BAA 258 ABA 259

Example 1: LZW Compression Step 5 BABAABAAA P=A C=A STRING TABLE ENCODER OUTPUT string codeword representing output code BA 256 B 66 AB 257 A 65 BAA 258 ABA 259 AA 260

Example 1: LZW Compression Step 6 BABAABAAA P=AA C=empty STRING TABLE ENCODER OUTPUT string codeword representing output code BA 256 B 66 AB 257 A 65 BAA 258 ABA 259 AA 260

LZW Decompression The LZW decompressor creates the same string table during decompression. It starts with the first 256 table entries initialized to single characters. The string table is updated for each character in the input stream, except the first one. Decoding achieved by reading codes and translating them through the code table being built.

LZW Decompression Algorithm 1 Initialize table with single character strings 2 OLD = first input code 3 output translation of OLD 4 WHILE not end of input stream 5 NEW = next input code 6  IF NEW is not in the string table 7 S = translation of OLD 8   S = S + C 9 ELSE 10  S = translation of NEW 11 output S 12   C = first character of S 13   OLD + C to the string table 14 OLD = NEW 15 END WHILE

Example 2: LZW Decompression 1 Example 2: Use LZW to decompress the output sequence of Example 1: <66><65><256><257><65><260>.

Example 2: LZW Decompression Step 1 <66><65><256><257><65><260> Old = 65 S = A New = 66 C = A STRING TABLE ENCODER OUTPUT string codeword B BA 256 A

Example 2: LZW Decompression Step 2 <66><65><256><257><65><260> Old = 256 S = BA New = 256 C = B STRING TABLE ENCODER OUTPUT string codeword B BA 256 A AB 257

Example 2: LZW Decompression Step 3 <66><65><256><257><65><260> Old = 257 S = AB New = 257 C = A STRING TABLE ENCODER OUTPUT string codeword B BA 256 A AB 257 BAA 258

Example 2: LZW Decompression Step 4 <66><65><256><257><65><260> Old = 65 S = A New = 65 C = A STRING TABLE ENCODER OUTPUT string codeword B BA 256 A AB 257 BAA 258 ABA 259

Example 2: LZW Decompression Step 5 <66><65><256><257><65><260> Old = 260 S = AA New = 260 C = A STRING TABLE ENCODER OUTPUT string codeword B BA 256 A AB 257 BAA 258 ABA 259 AA 260

LZW: Some Notes This algorithm compresses repetitive sequences of data well. Since the codewords are 12 bits, any single encoded character will expand the data size rather than reduce it. In this example, 72 bits are represented with 72 bits of data. After a reasonable string table is built, compression improves dramatically. Advantages of LZW over Huffman: LZW requires no prior information about the input data stream. LZW can compress the input stream in one single pass. Another advantage of LZW its simplicity, allowing fast execution.

LZW: Limitations What happens when the dictionary gets too large (i.e., when all the 4096 locations have been used)? Here are some options usually implemented: Simply forget about adding any more entries and use the table as is. Throw the dictionary away when it reaches a certain size. Throw the dictionary away when it is no longer effective at compression. Clear entries 256-4095 and start building the dictionary again. Some clever schemes rebuild a string table from the last N input characters.

Exercises Why did we say on Slide 15 that the codeword NEW = 65 is in the string table? Review that slide and answer this question. Use LZW to trace encoding the string ABRACADABRA. Write a program that encodes a given string using LZW. Write a program that decodes a given set of encoded codewords using LZW.