© 2005 Pearson Education, Inc., Upper Saddle River, NJ. All rights reserved. Data Structures for Java William H. Ford William R. Topp Chapter 23 Bit Arrays and File Compression Bret Ford © 2005, Prentice Hall
Bit Arrays
Applications such as compiler code generation and compression algorithms create data that includes specific sequences of bits.
Bit Arrays (continued)
Bit Operations
Java binary bit handling operators |, &, and ^ act on pairs of bits and return the new value. The unary operator ~ inverts the bits of its operand.
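As a small illustration of these operators, here is a minimal sketch. The class name BitOpsDemo and the operand values are chosen only for this example and are not from the slides.

```java
// Minimal demo of Java's bitwise operators on int values.
// BitOpsDemo and its operands are hypothetical, for illustration only.
class BitOpsDemo {
    static int or(int a, int b)  { return a | b; }   // 1 where either bit is 1
    static int and(int a, int b) { return a & b; }   // 1 where both bits are 1
    static int xor(int a, int b) { return a ^ b; }   // 1 where the bits differ
    static int not(int a)        { return ~a; }      // inverts every bit

    public static void main(String[] args) {
        int a = 0b1100, b = 0b1010;
        System.out.println(Integer.toBinaryString(or(a, b)));   // 1110
        System.out.println(Integer.toBinaryString(and(a, b)));  // 1000
        System.out.println(Integer.toBinaryString(xor(a, b)));  // 110
        System.out.println(not(a));                             // -13
    }
}
```

Note that ~ acts on all 32 bits of an int, which is why ~12 prints as the negative value -13.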
Bit Arrays (continued)
Operator << shifts values to the left. Operators >> and >>> shift values to the right using signed or unsigned arithmetic, respectively. Assume x and y are 32-bit integers.
Examples: x << 2, x >> 3, x >>> 3 (the bit patterns appeared in the original figure).
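The difference between the signed and unsigned right shift shows up only for negative values. The following sketch uses a hypothetical class ShiftDemo and the operand -16, neither of which comes from the slides.

```java
// Minimal demo of the three Java shift operators on a 32-bit int.
// ShiftDemo and the value -16 are hypothetical, for illustration only.
class ShiftDemo {
    static int left(int x, int n)          { return x << n; }   // fills with 0 on the right
    static int signedRight(int x, int n)   { return x >> n; }   // copies the sign bit in from the left
    static int unsignedRight(int x, int n) { return x >>> n; }  // fills with 0 on the left

    public static void main(String[] args) {
        int x = -16;                               // bit pattern 0xFFFFFFF0
        System.out.println(left(x, 2));            // -64
        System.out.println(signedRight(x, 3));     // -2
        System.out.println(unsignedRight(x, 3));   // 536870910
    }
}
```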
BitArray Class
The BitArray class lets programmers use bit operations at a higher level than the "down and dirty" use of the Java bit operators.
BitArray Class (continued)
public class BitArray {
    // number of bits in the bit array
    private int numberOfBits;
    // number of byte values used for the bit array
    private int byteArraySize;
    // the array itself
    private byte[] member;

    // constructor; create bit array of numBits
    // bits having value 0
    public BitArray(int numBits) { ... }

    // constructor; let n = b.length; creates a
    // bit array whose bits are initialized
    // as follows:
    //    bit 0:   b[0]
    //    bit 1:   b[1]
    //    ...
    //    bit n-1: b[n-1]
    public BitArray(int[] b) { ... }
    ...
}
BitArray Class (continued)
BitArray class methods:
Conversion from primitive type:
    public void assignChar(char c)
    public void assignInt(int n)
Bit access and update:
    public int bit(int i)
    public void set(int i)
Bit operators:
    public BitArray and(BitArray x)
    public BitArray or(BitArray x)
    public BitArray xor(BitArray x)
    public BitArray not()
    public BitArray shiftLeft(int n)
    public BitArray shiftRight(int n)
    public BitArray shiftUnsigned(int n)
Input/Output:
    public void read(DataInputStream istr, int numBits)
    public void write(DataOutputStream ostr)
    public String toString()
Miscellaneous:
    public void clear()
    public void clear(int i)
    public boolean equals(Object x)
    public int size()
BitArray Class (continued)
int[] a = {1, 0, 1, 1, 0, 0}, b = {1, 0, 0, 0, 1, 0};
BitArray x = new BitArray(a), y = new BitArray(b),
         z = new BitArray(a.length);

y.set(0);
y.set(4);
x.clear(2);
z = x.or(y);
z = x.and(y);
z = x.xor(y);
z = x.not();
z = x.shiftLeft(2);
z = x.shiftSignedRight(2);
z = x.shiftUnsignedRight(2);
z.assignInt(31);
System.out.println(z);
z.assignChar('a');
System.out.println(z);
BitArray Class Implementation
The BitArray class stores the bits in a byte array. Methods map bit numbers into the correct bit in the array.
BitArray Class Implementation (continued)
The private method arrayIndex() determines the array element to which bit i belongs.
// determine the index of the array element
// containing bit i
private int arrayIndex(int i) {
    return i / 8;
}
BitArray Class Implementation (continued)
After locating the correct array index, apply the method bitMask(), which returns a byte value containing a 1 in the bit position representing i and a 0 everywhere else. This value, called a mask, can be used to set or clear the bit.
BitArray Class Implementation (continued)
// bit i is represented by a bit in
// member[arrayIndex(i)]; return a byte
// value with a 1 in the position that
// represents bit i
private byte bitMask(int i) {
    // use & to find the remainder after
    // dividing by 8; remainder 0 puts a
    // 1 in the left-most bit and 7 puts
    // a 1 in the right-most bit
    return (byte)(1 << (7 - (i & 7)));
}
BitArray Constructors
There are two constructors that create a BitArray object. One creates an empty bit array of a specified size; the second constructor initializes the bit array by using a Java integer array of 0 and 1 values.
Constructor (creates empty bit array)
// constructor; create bit array of numBits bits
// each having value 0
public BitArray(int numBits) {
    numberOfBits = numBits;

    // number of bytes needed to hold
    // numberOfBits elements
    byteArraySize = (numberOfBits + 7) / 8;

    // initialize the array with all bytes 0
    member = new byte[byteArraySize];
    for (int i = 0; i < member.length; i++)
        member[i] = 0;
}
BitArray Operators
Implement each BitArray operator by creating an object tmp and initializing its byte array with the result of the bitwise operation on the operands. Return the object as the value of the operator.
BitArray Operator or()
// bitwise OR
public BitArray or(BitArray x) {
    int i;

    // the bit arrays must have the same size
    if (numberOfBits != x.numberOfBits)
        throw new IllegalArgumentException(
            "BitArray |: bit arrays are " +
            "not the same size");

    // form the bitwise OR in tmp
    BitArray tmp = new BitArray(numberOfBits);

    // each member element of tmp is the bitwise
    // OR of the current object and x
    for (i = 0; i < byteArraySize; i++)
        tmp.member[i] = (byte)(member[i] | x.member[i]);

    // return the bitwise OR
    return tmp;
}
Bit Access and Modification Methods
Use arrayIndex() and bitMask() to access and modify an individual bit in a BitArray.
// return value of bit i
public int bit(int i) {
    // is i in range 0 to numberOfBits-1 ?
    if (i < 0 || i >= numberOfBits)
        throw new IndexOutOfBoundsException(
            "BitArray bit(): bit out of range");

    // return the bit corresponding to i
    if ((member[arrayIndex(i)] & bitMask(i)) != 0)
        return 1;
    else
        return 0;
}
Bit Access and Modification Methods (concluded)
// clear bit i
public void clear(int i) {
    // is i in range 0 to numberOfBits-1 ?
    if (i < 0 || i >= numberOfBits)
        throw new IndexOutOfBoundsException(
            "BitArray clear(): bit out of range");

    // clear the bit corresponding to i; note
    // that ~bitMask(i) has a 0 in the bit
    // we are interested in and a 1 in all others
    member[arrayIndex(i)] &= ~bitMask(i);
}
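The slides list set(int i) but do not show its body. By analogy with clear(), a plausible version ORs the mask into the byte. The sketch below is a cut-down stand-in named TinyBitArray (a hypothetical class, not the book's BitArray) that reuses the arrayIndex() and bitMask() logic shown above.

```java
// Hypothetical, cut-down bit array used only to illustrate set(), clear(), and bit().
class TinyBitArray {
    private final int numberOfBits;
    private final byte[] member;

    TinyBitArray(int numBits) {
        numberOfBits = numBits;
        member = new byte[(numBits + 7) / 8];   // Java zero-fills new arrays
    }

    // index of the byte that holds bit i
    private int arrayIndex(int i) { return i / 8; }

    // byte with a single 1 in the position of bit i (bit 0 is left-most)
    private byte bitMask(int i) { return (byte) (1 << (7 - (i & 7))); }

    private void checkRange(int i) {
        if (i < 0 || i >= numberOfBits)
            throw new IndexOutOfBoundsException("bit out of range: " + i);
    }

    // assumed implementation: OR the mask in to turn the bit on
    void set(int i)   { checkRange(i); member[arrayIndex(i)] |= bitMask(i); }

    // AND with the inverted mask to turn the bit off
    void clear(int i) { checkRange(i); member[arrayIndex(i)] &= ~bitMask(i); }

    int bit(int i) {
        checkRange(i);
        return (member[arrayIndex(i)] & bitMask(i)) != 0 ? 1 : 0;
    }

    public static void main(String[] args) {
        TinyBitArray t = new TinyBitArray(10);
        t.set(0);
        t.set(9);
        System.out.println(t.bit(0) + " " + t.bit(1) + " " + t.bit(9));  // 1 0 1
        t.clear(0);
        System.out.println(t.bit(0));                                    // 0
    }
}
```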
Binary Files
File types are text files and binary files. Java deals with files by creating a byte stream that connects the file and the application.
Binary files can be handled with DataInputStream and DataOutputStream classes.
Binary Files (continued)
A data input stream lets an application read primitive Java data types from an underlying input stream in a machine-independent way.
A data output stream lets an application write primitive Java data types to an output stream in a portable way. An application can then use a data input stream to read the data back in.
Program 23.1
import java.io.*;

public class Program23_1 {
    public static void main(String[] args) throws IOException {
        int intVal = 100;
        short shortVal = 1500;
        long longVal = L;   // (digits of this literal are not recoverable)
        byte[] buf = {3, 5, 2, 7, 15, 100, 127, 55};

        // create a DataOutputStream that writes to
        // the file "data.dat" in the local directory
        DataOutputStream fout = null;
        // use to input data from "data.dat"
        DataInputStream fin = null;

        try {
            fout = new DataOutputStream(
                       new FileOutputStream("data.dat"));
        } catch (FileNotFoundException fnfe) {
            System.err.println("Cannot create \"data.dat\"");
            System.exit(1);
        }

        // write each variable and the array to the file
        fout.writeInt(intVal);
        fout.writeShort(shortVal);
        fout.writeLong(longVal);
        fout.write(buf, 0, buf.length);

        // close the stream and open it
        // as a DataInputStream
        fout.close();
        try {
            fin = new DataInputStream(
                      new FileInputStream("data.dat"));
        } catch (FileNotFoundException fnfe) {
            System.err.println("Failure to open " + "\"data.dat\"");
            System.exit(1);
        }

        // input the int, short, and long from the file
        System.out.println("int: " + fin.readInt());
        System.out.println("short: " + fin.readShort());
        System.out.println("long: " + fin.readLong());

        // input the byte array that was written
        // to the file; the number of bytes in the
        // array is the number of bytes remaining
        // unread in the file
        byte[] b = new byte[fin.available()];
        System.out.print("byte array: ");

        // input the array
        fin.read(b);

        // output the bytes
        for (int i = 0; i < b.length; i++)
            System.out.print(b[i] + " ");
        System.out.println();

        // close the stream
        fin.close();
    }
}
Program 23.1 (Run)
int: 100
short: 1500
long:
byte array:
File Compression
Lossless compression loses no data and is used for data backup.
Lossy compression is used for applications like sound and video compression and causes minor loss of data.
File Compression (continued)
The compression ratio is the ratio of the number of bits in the original data to the number of bits in the compressed image. For instance, if a data file contains 500,000 bytes and the compressed data contains 100,000 bytes, the compression ratio is 5:1.
Huffman Compression
Huffman compression relies on counting the number of occurrences of each 8-bit byte in the data and generating a sequence of optimal binary codes called prefix codes.
The Huffman algorithm is an example of a greedy algorithm. A greedy algorithm makes an optimal choice at each local step in the hope of creating an optimal solution to the entire problem.
Huffman Compression (continued)
The algorithm generates a table that contains the frequency of occurrence of each byte in the file. Using these frequencies, the algorithm assigns each byte a string of bits known as its bit code and writes the bit code to the compressed image in place of the original byte.
Compression occurs if each 8-bit char in a file is replaced by a shorter bit sequence.
Huffman Compression (continued)
                              a     b     c     d     e     f
Frequency (in thousands)     16     4     8     6    20     3
Fixed-length code word        3 bits for each character
Compression ratio = 456,000 / 171,000 = 2.67 (57,000 characters at 8 bits each versus 3 bits each)
Huffman Compression (continued)
Use a binary tree to represent bit codes. A left edge is a 0 and a right edge is a 1. Each interior node specifies a frequency count, and each leaf node holds a character and its frequency.
Huffman Compression (continued)
Each data byte occurs only in a leaf node, so no bit code is the beginning of another bit code. Such codes are called prefix codes.
A full binary tree is one in which each interior node has two children.
By converting the tree to a full tree, we can generate better bit codes for our example.
Huffman Compression (continued)
Compression ratio = 456,000 / 148,000 = 3.08
Huffman Compression (continued)
To compress a file, replace each char by its prefix code. To uncompress, follow the bit code bit by bit from the root of the tree to the corresponding character. Write the character to the uncompressed file.
Good compression involves choosing an optimal tree. It can be shown that the optimal bit codes for a file are always represented by a full tree.
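The compress and uncompress steps can be sketched with a toy prefix code. Everything in the following example is an assumption made for illustration: the class PrefixCodeDemo, the three-symbol code table (a = 0, b = 10, c = 11), and the use of a String of '0' and '1' characters in place of real bits.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo: encode with a prefix-code table, decode by walking a code tree.
class PrefixCodeDemo {
    static final class Node {
        Node left, right;
        char symbol;
        boolean isLeaf() { return left == null && right == null; }
    }

    // toy code table chosen for this example only
    static Map<Character, String> sampleCodes() {
        Map<Character, String> codes = new LinkedHashMap<>();
        codes.put('a', "0");
        codes.put('b', "10");
        codes.put('c', "11");
        return codes;
    }

    // build the tree: a 0 follows the left edge, a 1 follows the right edge
    static Node buildTree(Map<Character, String> codes) {
        Node root = new Node();
        for (Map.Entry<Character, String> e : codes.entrySet()) {
            Node cur = root;
            for (char bit : e.getValue().toCharArray()) {
                if (bit == '0') {
                    if (cur.left == null) cur.left = new Node();
                    cur = cur.left;
                } else {
                    if (cur.right == null) cur.right = new Node();
                    cur = cur.right;
                }
            }
            cur.symbol = e.getKey();
        }
        return root;
    }

    // compress: replace each char by its prefix code
    static String encode(String text, Map<Character, String> codes) {
        StringBuilder bits = new StringBuilder();
        for (char c : text.toCharArray()) bits.append(codes.get(c));
        return bits.toString();
    }

    // uncompress: follow the bits from the root; at a leaf, emit the char and restart
    static String decode(String bits, Node root) {
        StringBuilder out = new StringBuilder();
        Node cur = root;
        for (char bit : bits.toCharArray()) {
            cur = (bit == '0') ? cur.left : cur.right;
            if (cur.isLeaf()) {
                out.append(cur.symbol);
                cur = root;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<Character, String> codes = sampleCodes();
        String bits = encode("abca", codes);
        System.out.println(bits);                             // 010110
        System.out.println(decode(bits, buildTree(codes)));   // abca
    }
}
```

Because no code is a prefix of another, the decoder never needs a separator between codes.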
Huffman Compression (continued)
For each byte b in the original file, let f(b) be the frequency of the byte and d(b) be the depth of the leaf node containing b. The depth of the node is also the number of bits in the bit code for b. The cost of the tree is the number of bits necessary to compress the file:
Cost = sum of f(b) * d(b) over all bytes b that occur in the file
Huffman Compression (continued)
A Huffman tree generates the minimum number of bits in the compressed image. It generates optimal prefix codes.
Building a Huffman Tree
For each of the n distinct bytes in a file, assign the byte and its frequency to a tree node, and insert the node into a minimum priority queue ordered by frequency.
Building a Huffman Tree (continued)
Remove two elements, x and y, from the priority queue, and attach them as children of a node whose frequency is the sum of the frequencies of its children. Insert the resulting node into the priority queue.
In a loop, perform this action n-1 times. Each loop iteration creates one of the n-1 interior nodes of the full tree.
Building a Huffman Tree (continued)
Because the minimum priority queue removes the lowest frequencies first, the least frequently occurring characters end up deepest in the tree and have longer bit codes, and the more frequently occurring chars have shorter bit codes.
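The construction loop can be sketched with java.util.PriorityQueue. This is a minimal illustration, not the book's HCompress code: the class HuffmanSketch, its Node type, and the linked-node representation are assumptions (the book stores the tree in an array). The frequencies are the a through f values used in this chapter's example.

```java
import java.util.Map;
import java.util.PriorityQueue;
import java.util.TreeMap;

// Hypothetical sketch of Huffman tree construction with a minimum priority queue.
class HuffmanSketch {
    static final class Node implements Comparable<Node> {
        final char symbol;
        final int freq;
        final Node left, right;

        // leaf node: a symbol and its frequency
        Node(char symbol, int freq) {
            this.symbol = symbol; this.freq = freq;
            this.left = null; this.right = null;
        }
        // interior node: frequency is the sum of the children's frequencies
        Node(Node left, Node right) {
            this.symbol = '\0'; this.freq = left.freq + right.freq;
            this.left = left; this.right = right;
        }
        boolean isLeaf() { return left == null && right == null; }

        @Override
        public int compareTo(Node other) { return Integer.compare(freq, other.freq); }
    }

    static Node build(char[] symbols, int[] freqs) {
        PriorityQueue<Node> pq = new PriorityQueue<>();
        for (int i = 0; i < symbols.length; i++)
            pq.add(new Node(symbols[i], freqs[i]));
        // n-1 iterations; each one creates one interior node
        while (pq.size() > 1) {
            Node x = pq.poll();
            Node y = pq.poll();
            pq.add(new Node(x, y));
        }
        return pq.poll();
    }

    // the depth of a leaf is the length of its bit code
    static void collectDepths(Node n, int depth, Map<Character, Integer> out) {
        if (n.isLeaf()) {
            out.put(n.symbol, depth);
            return;
        }
        collectDepths(n.left, depth + 1, out);
        collectDepths(n.right, depth + 1, out);
    }

    static Map<Character, Integer> codeLengths(char[] symbols, int[] freqs) {
        Map<Character, Integer> out = new TreeMap<>();
        collectDepths(build(symbols, freqs), 0, out);
        return out;
    }

    public static void main(String[] args) {
        char[] symbols = {'a', 'b', 'c', 'd', 'e', 'f'};
        int[] freqs = {16, 4, 8, 6, 20, 3};   // in thousands
        System.out.println(codeLengths(symbols, freqs));
        // {a=2, b=4, c=2, d=3, e=2, f=4}
    }
}
```

The rarest symbols, b and f, are merged first and so sit four levels down, while a, c, and e get 2-bit codes.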
Building a Huffman Tree (continued)
For the Huffman tree, the compressed file contains (16(2) + 4(4) + 8(2) + 6(3) + 20(2) + 3(4)) x 1000 = 134,000 bits, which corresponds to a compression ratio of 3.4.
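The arithmetic above is the cost of the tree, the sum of frequency times depth over all leaves. A minimal sketch follows; the class TreeCostSketch and its method names are hypothetical, and the original size assumes 8 bits for each of the 57,000 characters.

```java
// Hypothetical helper that evaluates tree cost and compression ratio.
class TreeCostSketch {
    // cost = sum of freq[i] * depth[i] over all leaves
    static long cost(int[] freq, int[] depth) {
        long total = 0;
        for (int i = 0; i < freq.length; i++)
            total += (long) freq[i] * depth[i];
        return total;
    }

    static double compressionRatio(long originalBits, long compressedBits) {
        return (double) originalBits / compressedBits;
    }

    public static void main(String[] args) {
        int[] freq  = {16000, 4000, 8000, 6000, 20000, 3000};   // a..f
        int[] depth = {2, 4, 2, 3, 2, 4};                       // bits per code

        long compressedBits = cost(freq, depth);
        long originalBits = 57000L * 8;   // total characters times 8 bits

        System.out.println(compressedBits);                     // 134000
        System.out.printf("%.2f%n",
            compressionRatio(originalBits, compressedBits));    // 3.40
    }
}
```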
Implementing Huffman Compression
The implementation of Huffman compression uses a priority queue, bit arrays, inheritance, and binary files.
The class HCompress does the Huffman compression and writes progress messages to a text area.
Implementing Huffman Compression (continued)
HCompress has a constructor that takes a file name as an argument, along with a reference to a JTextArea object. It opens the source file and creates a binary output file by adding the extension ".huf" to the name.
The public method compress() executes the compression steps.
Implementing Huffman Compression (continued)
The methods compressionRatio() and size() provide some of the internal parameters of the compression process. size() gives the number of nodes in the Huffman tree.
The method displayTree() displays the resulting Huffman tree.
Implementing Huffman Compression (continued)
After creating an HCompress object, call the method compress() that writes a compressed image to the output file. Messages output to the text area trace the progress of the compression. The method displayTree() outputs the Huffman tree in vertical format. Use it only for small trees.
Example of Huffman Compression
JTextArea textArea = new JTextArea(30, 80);
...
HCompress hc = new HCompress("demo.dat", textArea);

hc.compress();
if (hc.size() <= 11)
    textArea.append(hc.displayTree());

// output the compression ratio
textArea.append("The compression ratio = " +
                hc.compressionRatio() + "\n\n");
Example of Huffman Compression (continued)
Output:
Frequency analysis...
   File size: characters
   Number of unique characters: 6
Building the Huffman tree...
   Number of nodes in Huffman tree: 11
Generating the Huffman codes...
Tree has 11 entries. Root index = 10
Index  Sym  Freq  Parent  Left  Right  NBits  Bits
(rows for the leaves a, b, c, d, e, f and the five interior nodes; values shown in the original figure)
Generating the compressed file
The compression ratio is
[Figure: Huffman tree]
Summary of compress()
Call freqAnalysis()
    Read the file and tabulate the number of occurrences of each byte.
    Compute the size of the file, to support the computation of the compression ratio.
Summary of compress() (continued)
Call buildTree()
    Construct the Huffman tree for the file in an array.
Call generateCodes()
    For each leaf node, follow the path to the root and determine the bit code for the byte. In the process, determine the cost of the tree, which is the total number of code bits generated.
Summary of compress() (continued)
Write the 16-bit size of the Huffman tree to the compressed file.
Write the Huffman tree to the compressed file.
Write the total number of bits in the bit codes to the compressed file.
Call writeCompressedData()
    Read the source file again. For each byte, write its bit code to the compressed file.
Summary of compress() (concluded)
From the actions of compress(), we see that the format of the compressed file is as follows:
    tree size (16 bits) | Huffman tree nodes | total number of code bits | bit codes
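The first step of compress(), frequency analysis, can be sketched as a table with one counter for each of the 256 possible byte values. The class FreqAnalysisSketch is hypothetical, and for brevity it counts bytes already held in memory rather than reading a file as freqAnalysis() does.

```java
// Hypothetical sketch of byte frequency analysis over an in-memory buffer.
class FreqAnalysisSketch {
    // freq[v] is the number of times the unsigned byte value v occurs
    static int[] countBytes(byte[] data) {
        int[] freq = new int[256];
        for (byte b : data)
            freq[b & 0xFF]++;   // mask so negative bytes index 128..255
        return freq;
    }

    // number of distinct byte values, i.e. the number of leaves in the Huffman tree
    static int uniqueCount(int[] freq) {
        int n = 0;
        for (int f : freq)
            if (f > 0) n++;
        return n;
    }

    public static void main(String[] args) {
        byte[] data = "abracadabra".getBytes(java.nio.charset.StandardCharsets.US_ASCII);
        int[] freq = countBytes(data);
        System.out.println(freq['a']);          // 5
        System.out.println(freq['b']);          // 2
        System.out.println(uniqueCount(freq));  // 5
    }
}
```

A full tree with n leaves has n-1 interior nodes, so the number of unique bytes also fixes the tree size at 2n-1, which matches the 6 unique characters and 11 nodes reported in the example run.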
Building the Huffman Tree
The DiskHuffNode class contains the data and the location of children. Its subclass HuffNode contains the remaining attributes required by the Huffman compression implementation.
Building the Huffman Tree (continued)
The HCompress method buildTree() executes the Huffman algorithm to build the tree.
Building the Huffman Tree (continued)
The HCompress method generateCodes() determines the bit codes.
To output the bit codes, the method writeCompressedData() declares a BitArray object, compressedData, whose bit size is the cost of the Huffman tree.
Upon the conclusion of input, writeCompressedData() calls the write() method of the BitArray class to output the bits to the compressed file.
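The slides do not show generateCodes(). One way to follow the path from a leaf to the root is sketched below using parallel parent and left-child index arrays. The class GenerateCodesSketch and the array layout are assumptions made for this example (leaves a through f at indices 0 to 5, interior nodes at 6 to 9, root at index 10), and which child is left or right at each interior node is chosen arbitrarily, so the exact bit patterns may differ from the book's while the code lengths match.

```java
// Hypothetical sketch: derive a bit code by climbing from a leaf to the root.
class GenerateCodesSketch {
    static final int NIL = -1;

    // parent index of each node; the root (index 10) has no parent
    static final int[] PARENT = {9, 6, 8, 7, 9, 6, 7, 8, 10, 10, NIL};
    // left child index of each node; leaves have none
    static final int[] LEFT = {NIL, NIL, NIL, NIL, NIL, NIL, 5, 3, 2, 0, 8};

    static String codeFor(int leaf) {
        StringBuilder bits = new StringBuilder();
        int child = leaf;
        int p = PARENT[leaf];
        while (p != NIL) {
            // a left edge contributes 0, a right edge contributes 1
            bits.append(LEFT[p] == child ? '0' : '1');
            child = p;
            p = PARENT[p];
        }
        // bits were collected leaf-to-root; the code reads root-to-leaf
        return bits.reverse().toString();
    }

    public static void main(String[] args) {
        String names = "abcdef";
        for (int i = 0; i < names.length(); i++)
            System.out.println(names.charAt(i) + ": " + codeFor(i));
        // a: 10, b: 0111, c: 00, d: 010, e: 11, f: 0110
    }
}
```

Summing frequency times code length over the six leaves gives the tree cost, which is the bit size that writeCompressedData() uses for its BitArray.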
writeCompressedData()
// reread the source file and write the
// Huffman codes specified by the Huffman
// tree to the stream dest
private void writeCompressedData() throws IOException {
    // bit array that will contain the Huffman codes
    // for the compressed file
    BitArray compressedData = new BitArray(totalBits);
    int bitPos, i, j;
    int b;

    // close the source file and reopen it
    source.close();
    source = new DataInputStream(new FileInputStream(fname));

    // bitPos is used to put bits into compressedData
    bitPos = 0;

    // re-read the source file and generate the Huffman
    // codes in compressedData
    while (true) {
        try {
            // try to input a byte
            b = source.readUnsignedByte();
        } catch (EOFException eofex) {
            // we are at end-of-file
            break;
        }

        // index of the tree node containing b
        i = charLoc[b];

        // put the bit code for tree[i].b into
        // the bit array
        for (j = 0; j < tree[i].numberOfBits; j++) {
            // only need to call set() if
            // tree[i].bits.bit(j) is 1
            if (tree[i].bits.bit(j) == 1)
                compressedData.set(bitPos);
            // always advance bitPos
            bitPos++;
        }
    }

    // write the bit codes to the output file
    compressedData.write(dest);
}
Implementing Huffman Decompression
The class HDecompress performs Huffman decompression. The public method decompress() decodes the file.
Implementing Huffman Decompression (continued)
The HDecompress method decompress() sequences through the bits of the compressed image, tracing paths from the root node to leaf nodes, and writes the byte found at each leaf to the uncompressed file.
© 2005 Pearson Education, Inc., Upper Saddle River, NJ. All rights reserved.decompress() // decompress the file public void decompress() throws IOException { int i, bitPos; // treeSize and totalBits are read from // the compressed file short treeSize; int totalBits; int decompressedFileSize = 0; textArea.append("Decompressing... \n"); // input the Huffman tree size treeSize = source.readShort(); // treeSize DiskHuffNode nodes are read from // the compressed file into the tree DiskHuffNode[] tree = new DiskHuffNode[treeSize];
decompress() (continued)

        // input the tree
        for (i = 0; i < treeSize; i++) {
            tree[i] = new DiskHuffNode();
            tree[i].read(source);
        }

        // input the number of bits of Huffman code
        totalBits = source.readInt();

        // allocate a 1-bit bit array, whose contents
        // we immediately replace by the bits in the
        // compressed file
        BitArray bits = new BitArray(1);

        // read totalBits number of binary bits from
        // the compressed file into bits
        bits.read(source, totalBits);
decompress() (continued)

        // restore the original file by using the
        // Huffman codes to traverse the tree and
        // write out the corresponding characters
        bitPos = 0;
        while (bitPos < totalBits) {
            // root of the tree is at index treeSize-1
            i = treeSize - 1;

            // follow the bits until we arrive at a leaf node
            while (tree[i].left != HuffNode.NIL) {
                // if bit is 0, go left; otherwise, go right
                if (bits.bit(bitPos) == 0)
                    i = tree[i].left;
                else
                    i = tree[i].right;

                // we have used the current bit; move to
                // the next one
                bitPos++;
            }
decompress() (concluded)

            // we are at a leaf node; output the
            // character to the file
            dest.writeByte(tree[i].b);
            decompressedFileSize++;
        }

        textArea.append("Decompressed file " + decompressedFileName +
                        " (" + decompressedFileSize + ") characters\n");

        // close the two streams
        source.close();
        dest.close();
        filesOpen = false;
    }
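The root-to-leaf walk in decompress() can be exercised on a small hand-built tree. The sketch below is illustrative only: the Node class is a hypothetical stand-in for DiskHuffNode, NIL is assumed to be -1, and an int[] of 0s and 1s replaces the BitArray. As in the slides, the root is taken to be the last node in the array.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Sketch only: Node, NIL, and the int[] bit source are assumptions made
// for illustration; they are not the book's DiskHuffNode or BitArray.
class DecodeSketch {
    static final int NIL = -1;

    static final class Node {
        final int b;       // byte value (meaningful only at a leaf)
        final int left;    // index of the left child, or NIL
        final int right;   // index of the right child, or NIL

        Node(int b, int left, int right) {
            this.b = b;
            this.left = left;
            this.right = right;
        }
    }

    // Decodes totalBits bits; bits[k] is the k-th bit (0 or 1).
    static byte[] decode(Node[] tree, int[] bits, int totalBits) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int bitPos = 0;
        while (bitPos < totalBits) {
            int i = tree.length - 1;            // start at the root
            while (tree[i].left != NIL) {       // descend until a leaf
                i = (bits[bitPos] == 0) ? tree[i].left : tree[i].right;
                bitPos++;                       // this bit is consumed
            }
            out.write(tree[i].b);               // emit the leaf's byte
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Tree for the codes a = 0, b = 10, c = 11
        Node[] tree = {
            new Node('a', NIL, NIL),   // index 0
            new Node('b', NIL, NIL),   // index 1
            new Node('c', NIL, NIL),   // index 2
            new Node(0, 1, 2),         // index 3: internal node
            new Node(0, 0, 3)          // index 4: root
        };
        int[] bits = {0, 1, 0, 1, 1, 0};   // a, b, c, a
        byte[] decoded = decode(tree, bits, bits.length);
        System.out.println(new String(decoded, StandardCharsets.US_ASCII));   // prints "abca"
    }
}
```

Because no code is a prefix of another, the walk never needs a separator between codes: reaching a leaf is the only signal that a code has ended.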
Program 23-2

The application Program23_2.java in Chapter 23 of the software supplement is a GUI application that uses the Huffman algorithms. The figure provides a snapshot of the running application.
Serialization

A persistent object can exist apart from the executing program and can be stored in a file.

Serialization involves storing objects in an external file and retrieving them from it.

The classes ObjectOutputStream and ObjectInputStream are used for serialization.
Serialization (continued)

    // the stream oos uses a FileOutputStream that is attached to
    // file "storeFile" for storage of an object
    ObjectOutputStream oos =
        new ObjectOutputStream(new FileOutputStream("storeFile"));
    ...
    oos.writeObject(anObject);   // write anObject to file "storeFile"

Assume anObject is an instance of a class that implements the Serializable interface.
Serialization (continued)

Deserializing an Object.

    // the stream ois uses a FileInputStream that is attached to
    // file "storeFile" to retrieve an object
    ObjectInputStream ois =
        new ObjectInputStream(new FileInputStream("storeFile"));
    ClassName recallObj;

    // retrieve from "storeFile"
    recallObj = (ClassName)ois.readObject();
Class SerializableClass

    import ds.time.Time24;

    public class SerializableClass implements java.io.Serializable {
        public int n;
        public String str;
        public Time24 t;
        public Integer[] list = new Integer[4];
        transient public Time24 currentTime;
Class SerializableClass (concluded)

        public SerializableClass(int n, String str, Time24 t,
                                 Time24 currentTime) {
            this.n = n;
            this.str = str;
            this.t = t;
            for (int i = 0; i < list.length; i++)
                list[i] = Integer.valueOf(i + 1);
            this.currentTime = currentTime;
        }
    }
Program 23.3

    import ds.time.Time24;
    import ds.util.Arrays;
    import java.io.*;

    public class Program23_3 {
        public static void main(String[] args) throws Exception {
            // objects used for serialization
            SerializableClass obj, recallObj;

            // object stream connected to file
            // "storeFile.dat" for output
            ObjectOutputStream oos = new ObjectOutputStream(
                new FileOutputStream("storeFile.dat"));
Program 23.3 (continued)

            // initial object with runtime
            // update of 45 minutes
            obj = new SerializableClass(45, "Shooting star",
                      new Time24(9,30), new Time24(7, 10));
            obj.t.addTime(45);

            // output object info before copy to the file
            System.out.println("Serialized object:");
            System.out.println(" Integer: " + obj.n +
                " String: " + obj.str + " Time: " + obj.t +
                "\n Current time: " + obj.currentTime +
                " List: " + Arrays.toString(obj.list));

            // send object and close down the output stream
            oos.writeObject(obj);
            oos.flush();
            oos.close();
Program 23.3 (concluded)

            // object stream connected to file
            // "storeFile.dat" for input
            ObjectInputStream ois = new ObjectInputStream(
                new FileInputStream("storeFile.dat"));

            // reconstruct object and allocate new currentTime
            recallObj = (SerializableClass)ois.readObject();
            recallObj.currentTime = new Time24(15, 45);

            // output object after recall from the file
            System.out.println("Deserialized object:");
            System.out.println(" Integer: " + recallObj.n +
                " String: " + recallObj.str +
                " Time: " + recallObj.t + '\n' +
                " Current time: " + recallObj.currentTime +
                " List: " + Arrays.toString(recallObj.list));

            // close down the input stream
            ois.close();
        }
    }
Program 23.3 (Run)

    Serialized object:
     Integer: 45 String: Shooting star Time: 10:15
     Current time: 7:10 List: [1, 2, 3, 4]
    Deserialized object:
     Integer: 45 String: Shooting star Time: 10:15
     Current time: 15:45 List: [1, 2, 3, 4]
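Program 23.3 assigns a new currentTime after readObject() because a transient field is not written to the stream. The sketch below shows that effect without the ds.time package. The Record class and its fields are hypothetical, and the object is serialized to an in-memory byte array rather than to a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Sketch only: Record is an illustrative class, not part of the book's code.
class TransientSketch {

    static class Record implements Serializable {
        private static final long serialVersionUID = 1L;

        int n;
        String str;
        transient String note;   // skipped by default serialization

        Record(int n, String str, String note) {
            this.n = n;
            this.str = str;
            this.note = note;
        }
    }

    // Serializes r to a byte array and reads a copy back.
    static Record roundTrip(Record r) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(buf)) {
                oos.writeObject(r);
            }
            try (ObjectInputStream ois = new ObjectInputStream(
                     new ByteArrayInputStream(buf.toByteArray()))) {
                return (Record) ois.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Record copy = roundTrip(new Record(45, "Shooting star", "scratch"));
        // the transient field comes back as null
        System.out.println(copy.n + " " + copy.str + " " + copy.note);
        // prints "45 Shooting star null"
    }
}
```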
Custom Serialization

In some cases a programmer needs to customize the write/read process. For instance, with collection objects, elements are dynamically generated and stored with some kind of ordering. The deserialization process must retrieve the elements and then rebuild the underlying storage structure for the collection. The ArrayList class is a good example.
Custom Serialization (continued)

For an ArrayList collection, write out only the listSize elements stored at the front of the array listArr.
Custom Serialization (continued)

For custom serialization, the programmer must implement the private methods writeObject() and readObject().
ArrayList writeObject()

    private void writeObject(ObjectOutputStream out)
            throws java.io.IOException {
        // write out element count
        out.defaultWriteObject();

        // write out the ArrayList capacity
        out.writeInt(listArr.length);

        // write the first listSize elements of listArr
        for (int i = 0; i < listSize; i++)
            out.writeObject(listArr[i]);
    }
ArrayList readObject()

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        // read in list size
        in.defaultReadObject();

        // read in array length and allocate the array
        int listCapacity = in.readInt();
        listArr = (T[]) new Object[listCapacity];

        // read listSize elements into listArr
        for (int i = 0; i < listSize; i++)
            listArr[i] = (T)in.readObject();
    }
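The two methods above can be tested end to end on a cut-down list class. The following is a sketch under stated assumptions, not the book's ArrayList: MiniList and its helper methods are hypothetical, and listArr is assumed to be declared transient so that defaultWriteObject() writes only listSize.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;

// Sketch only: MiniList is an illustrative class; the transient modifier
// on listArr is an assumption about how the real class is declared.
class MiniList<T> implements Serializable {
    private static final long serialVersionUID = 1L;

    private int listSize = 0;
    private transient T[] listArr;   // written by hand in writeObject()

    @SuppressWarnings("unchecked")
    MiniList(int capacity) {
        listArr = (T[]) new Object[capacity];
    }

    void add(T item) {
        if (listSize == listArr.length)   // grow when the array is full
            listArr = Arrays.copyOf(listArr, Math.max(1, 2 * listArr.length));
        listArr[listSize++] = item;
    }

    T get(int i) {
        if (i < 0 || i >= listSize)
            throw new IndexOutOfBoundsException("index " + i);
        return listArr[i];
    }

    int size() { return listSize; }

    int capacity() { return listArr.length; }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();            // writes listSize
        out.writeInt(listArr.length);        // writes the capacity
        for (int i = 0; i < listSize; i++)   // writes only the used slots
            out.writeObject(listArr[i]);
    }

    @SuppressWarnings("unchecked")
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();              // restores listSize
        listArr = (T[]) new Object[in.readInt()];
        for (int i = 0; i < listSize; i++)
            listArr[i] = (T) in.readObject();
    }

    // Serializes list to a byte array and reads a copy back.
    @SuppressWarnings("unchecked")
    static <T> MiniList<T> roundTrip(MiniList<T> list) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(buf)) {
                oos.writeObject(list);
            }
            try (ObjectInputStream ois = new ObjectInputStream(
                     new ByteArrayInputStream(buf.toByteArray()))) {
                return (MiniList<T>) ois.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        MiniList<String> list = new MiniList<>(8);
        list.add("a");
        list.add("b");
        list.add("c");
        MiniList<String> copy = roundTrip(list);
        System.out.println(copy.size() + " " + copy.capacity() + " "
            + copy.get(0) + copy.get(1) + copy.get(2));   // prints "3 8 abc"
    }
}
```

The reads in readObject() must mirror the writes in writeObject() in the same order: default fields first, then the capacity, then the elements.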