Published by Aubrey Thomas. Modified over 8 years ago.
1
UTILITIES Group 3 Xin Li Soma Reddy
2
Data Compression To reduce the size of files stored on disk and to increase the effective rate of transmission by modems.
3
A Standard coding scheme
4
File Compression
Compression
– Reducing the number of bits required for data representation.
Two phases
– The encoding phase (compressing)
– The decoding phase (uncompressing)
Strategy
– Ensure that the most frequent characters have the shortest representation.
5
A Binary Trie A left branch represents 0 and a right branch represents 1. The path to a node indicates its representation.
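The rule that the path to a node gives its representation can be sketched in Java as follows. This is only an illustration: the names `TrieCodes`, `TrieNode`, and `codesOf` are hypothetical, and the three-character tree is made up rather than taken from the slides.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: derives each character's bit string from its path in the trie.
class TrieCodes {
    // Appends '0' when following a left branch and '1' when following a right branch.
    static void collectCodes(TrieNode node, String path, Map<Character, String> out) {
        if (node == null) return;
        if (node.isLeaf()) { out.put(node.symbol, path); return; }
        collectCodes(node.left, path + "0", out);
        collectCodes(node.right, path + "1", out);
    }

    static Map<Character, String> codesOf(TrieNode root) {
        Map<Character, String> out = new TreeMap<>();
        collectCodes(root, "", out);
        return out;
    }

    public static void main(String[] args) {
        // Made-up tree: 'a' is the root's left child; 'b' and 'c' sit one level deeper.
        TrieNode root = new TrieNode(new TrieNode('a'),
                                     new TrieNode(new TrieNode('b'), new TrieNode('c')));
        System.out.println(codesOf(root)); // prints {a=0, b=10, c=11}
    }
}

// Hypothetical node type: a leaf holds a character, an internal node holds two children.
class TrieNode {
    final char symbol;
    final TrieNode left;   // followed on bit 0
    final TrieNode right;  // followed on bit 1

    TrieNode(char symbol) { this(symbol, null, null); }
    TrieNode(TrieNode left, TrieNode right) { this('\0', left, right); }
    private TrieNode(char symbol, TrieNode left, TrieNode right) {
        this.symbol = symbol; this.left = left; this.right = right;
    }
    boolean isLeaf() { return left == null && right == null; }
}
```

Characters stored nearer the root get shorter bit strings, which is why the compression strategy places the most frequent characters there.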
6
Representation of the original code by a tree
7
A Slightly Better Tree
8
A Full Tree All nodes either are leaves or have two children.
9
A Prefix Code No character code is a prefix of another character code. Guaranteed if the characters are only in leaves. Can be decoded unambiguously.
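Why a prefix code decodes unambiguously can be seen in a small Java sketch. The class `PrefixDecoder` and the three-entry code table are assumptions made for illustration; bits are modelled as the characters '0' and '1' for readability.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical decoder for a prefix code given as a bit-string-to-character table.
class PrefixDecoder {
    // Reads bits left to right and emits a character as soon as the bits read so far
    // match a code. Because no code is a prefix of another, that first match is the
    // only possible one, so no look-ahead or backtracking is needed.
    static String decode(String bits, Map<String, Character> codeToChar) {
        StringBuilder out = new StringBuilder();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < bits.length(); i++) {
            current.append(bits.charAt(i));
            Character c = codeToChar.get(current.toString());
            if (c != null) {
                out.append(c);
                current.setLength(0); // start collecting the next code
            }
        }
        if (current.length() != 0) {
            throw new IllegalArgumentException("Trailing bits do not form a complete code");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, Character> table = new HashMap<>();
        table.put("0", 'a');
        table.put("10", 'b');
        table.put("11", 'c');
        System.out.println(decode("010110", table)); // prints abca
    }
}
```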
10
An Optimal Prefix Code Tree
11
Optimal Prefix Code
12
Huffman’s Algorithm Constructs an optimal prefix code. The weight of a tree is the sum of the frequencies of its leaves. Works by repeatedly merging the two minimum weight trees.
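The merging step can be sketched with a priority queue. This is a minimal Java illustration under stated assumptions, not the presentation's HuffmanTree class: the names `HuffmanSketch`, `Node`, `build`, and `encodedBits` are hypothetical, and the frequencies are made up.

```java
import java.util.Map;
import java.util.PriorityQueue;
import java.util.TreeMap;

class HuffmanSketch {
    // Hypothetical node type: the weight of a tree is the sum of its leaf frequencies.
    static final class Node implements Comparable<Node> {
        final int weight;
        final char symbol;
        final Node left, right;

        Node(char symbol, int weight) {
            this.symbol = symbol; this.weight = weight; this.left = null; this.right = null;
        }
        Node(Node left, Node right) {
            this.symbol = '\0'; this.weight = left.weight + right.weight;
            this.left = left; this.right = right;
        }
        boolean isLeaf() { return left == null; }
        public int compareTo(Node other) { return Integer.compare(weight, other.weight); }
    }

    // Starts with one single-node tree per character, then repeatedly merges the two
    // minimum-weight trees until one tree remains. Returns null for an empty table.
    static Node build(Map<Character, Integer> freq) {
        PriorityQueue<Node> forest = new PriorityQueue<>();
        for (Map.Entry<Character, Integer> e : freq.entrySet()) {
            forest.add(new Node(e.getKey(), e.getValue()));
        }
        while (forest.size() > 1) {
            Node a = forest.poll();
            Node b = forest.poll();
            forest.add(new Node(a, b));
        }
        return forest.poll();
    }

    // Total bits needed to encode the input: frequency times depth, summed over leaves.
    static int encodedBits(Node n, int depth) {
        if (n.isLeaf()) return n.weight * depth;
        return encodedBits(n.left, depth + 1) + encodedBits(n.right, depth + 1);
    }

    public static void main(String[] args) {
        Map<Character, Integer> freq = new TreeMap<>();
        freq.put('a', 5); freq.put('b', 2); freq.put('c', 1); freq.put('d', 1);
        Node root = build(freq);
        System.out.println(root.weight);          // prints 9
        System.out.println(encodedBits(root, 0)); // prints 15
    }
}
```

For these made-up frequencies a fixed two-bit code would need 18 bits for the nine characters, against 15 for the merged tree.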
13
Initial Stage of Huffman’s Algorithm
14
Huffman’s Algorithm After the First Merge
15
Huffman’s Algorithm After the Second Merge
16
Huffman’s Algorithm After the Third Merge
17
Huffman’s Algorithm After the Fourth Merge
18
Huffman’s Algorithm After the Fifth Merge
19
Huffman’s Algorithm After the Final Merge
20
Implementation
BitInputStream Class
BitOutputStream Class
CharCounter Class
HuffmanTree Class
Hzip Class
HZIPInputStream Class
HZIPOutputStream Class
21
BitInputStream Class
Wraps an InputStream and provides bit-at-a-time input.
Main methods:
readBit: reads one bit as a 0 or 1
getBit: gets an individual bit in an 8-bit byte
close: closes the underlying stream
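Bit-at-a-time input over a byte stream might look like the following Java sketch. It is not the presentation's BitInputStream: the class name `BitReaderSketch`, the helper `bitsOf`, the least-significant-bit-first order, and the use of -1 for end of stream are all assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class BitReaderSketch {
    private static final int BITS_PER_BYTE = 8;
    private final InputStream in;
    private int buffer;                    // the byte currently being consumed
    private int bufferPos = BITS_PER_BYTE; // forces a byte read on the first call

    BitReaderSketch(InputStream in) { this.in = in; }

    // Returns bit 'pos' of an 8-bit value (assumed order: 0 = least significant).
    static int getBit(int pack, int pos) { return (pack >> pos) & 1; }

    // Returns the next bit as 0 or 1, or -1 once the wrapped stream is exhausted.
    int readBit() throws IOException {
        if (bufferPos == BITS_PER_BYTE) {
            buffer = in.read();
            if (buffer == -1) return -1;
            bufferPos = 0;
        }
        return getBit(buffer, bufferPos++);
    }

    void close() throws IOException { in.close(); }

    // Convenience for the demo: all bits of 'data' as a string of 0s and 1s.
    static String bitsOf(byte[] data) {
        BitReaderSketch reader = new BitReaderSketch(new ByteArrayInputStream(data));
        StringBuilder sb = new StringBuilder();
        try {
            for (int bit = reader.readBit(); bit != -1; bit = reader.readBit()) sb.append(bit);
            reader.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(bitsOf(new byte[] { 5 })); // prints 10100000
    }
}
```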
22
BitOutputStream Class
Wraps an OutputStream and provides bit-at-a-time output.
Main methods:
writeBit: writes one bit (0 or 1)
writeBits: writes an array of bits
setBit: sets an individual bit in an 8-bit byte
flush: flushes buffered bits
close: closes the underlying stream
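The output side can be sketched the same way: bits accumulate in a one-byte buffer that is written out when full. The class name `BitWriterSketch`, the helper `pack`, the least-significant-bit-first order, and the zero padding of a final partial byte are assumptions, not details confirmed by the slides.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

class BitWriterSketch {
    private static final int BITS_PER_BYTE = 8;
    private final OutputStream out;
    private int buffer;    // bits collected so far for the current byte
    private int bufferPos; // number of bits already placed in 'buffer'

    BitWriterSketch(OutputStream out) { this.out = out; }

    // Returns 'pack' with bit 'pos' set to 'val' (assumed order: 0 = least significant).
    static int setBit(int pack, int pos, int val) {
        return val == 1 ? (pack | (1 << pos)) : (pack & ~(1 << pos));
    }

    void writeBit(int val) throws IOException {
        buffer = setBit(buffer, bufferPos++, val);
        if (bufferPos == BITS_PER_BYTE) flush();
    }

    void writeBits(int[] vals) throws IOException {
        for (int v : vals) writeBit(v);
    }

    // Writes the buffered bits as one byte; unused high bits stay 0.
    void flush() throws IOException {
        if (bufferPos == 0) return;
        out.write(buffer);
        buffer = 0;
        bufferPos = 0;
    }

    void close() throws IOException { flush(); out.close(); }

    // Convenience for the demo: packs an array of bits into bytes.
    static byte[] pack(int[] bits) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        BitWriterSketch writer = new BitWriterSketch(sink);
        try {
            writer.writeBits(bits);
            writer.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(pack(new int[] { 1, 0, 1 }))); // prints [5]
    }
}
```

Because a partial last byte is padded, a reader cannot tell padding from data by itself; a real format needs some way to mark where the encoded data ends.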
23
CharCounter Class
Maintains character counts.
Main methods:
getCount: returns the number of occurrences of a character
setCount: sets the number of occurrences of a character
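A character counter of this kind can be sketched as an array indexed by byte value. The class name `CharCounterSketch`, the factory `of`, and the table size of 256 are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class CharCounterSketch {
    private static final int DIFF_BYTES = 256; // assumed: one slot per possible byte value
    private final int[] counts = new int[DIFF_BYTES];

    CharCounterSketch() { }

    // Counts every byte read from 'in' until end of stream.
    CharCounterSketch(InputStream in) throws IOException {
        int ch;
        while ((ch = in.read()) != -1) counts[ch]++;
    }

    int getCount(int ch) { return counts[ch & 0xff]; }

    void setCount(int ch, int count) { counts[ch & 0xff] = count; }

    // Convenience for the demo: counts the bytes of an in-memory array.
    static CharCounterSketch of(byte[] data) {
        try {
            return new CharCounterSketch(new ByteArrayInputStream(data));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        CharCounterSketch cc = of("banana".getBytes(StandardCharsets.US_ASCII));
        System.out.println(cc.getCount('a')); // prints 3
        System.out.println(cc.getCount('n')); // prints 2
    }
}
```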
24
HuffmanTree Class (cont)
Manipulates Huffman coding trees.
Main methods:
getCode: obtains the code of a given character
getChar: obtains the character for a given code
createTree: constructs the Huffman coding tree
25
HuffmanTree Class
Main methods:
writeEncodingTable: writes an encoding table to an output stream
readEncodingTable: reads the encoding table from an input stream
26
Hzip Class
Main methods:
compress: compresses a file; the output name adds “.huf” to the filename
uncompress: uncompresses a file; the output name adds “.uc” to the filename
main
27
HZIPInputStream Class
Contains an uncompression wrapper.
Main method:
read: returns an uncompressed byte from the wrapped input stream
28
HZIPOutputStream Class
Contains a compression wrapper.
Writes to HZIPOutputStream are compressed and sent to the output stream being wrapped.
No writing is actually done until close.
Main method:
close
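The "no writing until close" behaviour can be sketched with an in-memory buffer. This Java example shows only the deferral pattern and performs no Huffman coding: the class name `DeferredOutputSketch` and its helper are hypothetical, and a four-byte length header stands in for the encoding table that can only be produced once all input has been seen.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

class DeferredOutputSketch extends OutputStream {
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();
    private final DataOutputStream dout;
    private boolean closed;

    DeferredOutputSketch(OutputStream out) { dout = new DataOutputStream(out); }

    // Only remembers the byte; nothing reaches the wrapped stream yet.
    @Override
    public void write(int b) { pending.write(b); }

    // All real output happens here, once the complete input is known.
    @Override
    public void close() throws IOException {
        if (closed) return;
        closed = true;
        byte[] data = pending.toByteArray();
        dout.writeInt(data.length); // stand-in for the encoding table
        dout.write(data);           // stand-in for the compressed bits
        dout.close();
    }

    // Convenience for the demo: size of the wrapped stream before and after close.
    static int[] sizesBeforeAndAfterClose(byte[] data) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        DeferredOutputSketch out = new DeferredOutputSketch(sink);
        try {
            out.write(data);
            int before = sink.size();
            out.close();
            return new int[] { before, sink.size() };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sizesBeforeAndAfterClose(new byte[] { 1, 2, 3 })));
        // prints [0, 7]
    }
}
```

Deferring the work is natural for Huffman coding, since the character counts, and therefore the codes, depend on the entire input.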
29
Programming Project Part 1 Storing the character counts in the encoding table gives the uncompression algorithm the ability to perform extra consistency checks. Code is added to verify that the result of the uncompression has the same character counts as the encoding table claimed.
30
Part 1 Implementation (cont)
Add several public methods.
In the HZIPInputStream class:
    public HuffmanTree getTree() { return codeTree; }
In the HuffmanTree class:
    public CharCounter getCharCounter() { return theCounts; }
31
Part 1 Implementation
In the Hzip class, uncompress method:
    HuffmanTree tree = hzin.getTree();
    CharCounter newcc1 = tree.getCharCounter();   // counts claimed by the encoding table
    CharCounter newcc2 = new CharCounter(in);     // counts recomputed from the stream 'in'
    for (int i = 0; i < BitUtils.DIFF_BYTES; i++) {
        if (newcc2.getCount(i) != newcc1.getCount(i)) {
            System.out.println("There is an error in the uncompressing process.");
            File file1 = new File(inFile);
            file1.delete();
            break;   // one mismatch is enough; avoids repeating the message
        }
    }
32
Part 2 Check the size of the resulting compressed file and abort if the size is larger than or equal to the original.
33
Part 2 Implementation
In the Hzip class, compress method:
    File originFile = new File(inFile);
    File compreFile = new File(compressedFile);
    if (originFile.length() < compreFile.length()) {
        System.out.println("The size of the resulting compressed file is larger than the original.");
        compreFile.delete();
        return;
    } else if (originFile.length() == compreFile.length()) {
        System.out.println("The size of the resulting compressed file is equal to the original.");
        compreFile.delete();
        return;
    }
34
Run Example
To compress a text file whose size is six bytes:
C:\>set path=c:/j2sdk1.4.1_01/bin
C:\>javac Hzip.java
C:\>javac HZIPInputStream.java
C:\>javac HZIPOutputStream.java
C:\>java Hzip -c file1.txt
The size of the resulting compressed file is larger than the original.
C:\>
35
Conclusion Text compression is an important technique that allows us to increase both effective disk capacity and effective modem speed. It is an area of active research. Huffman’s algorithm typically achieves compression of 25% on text files.