GP-ZIP Genetic Programming File Compression By: Dj Gerena.

1 GP-ZIP Genetic Programming File Compression By: Dj Gerena

2 Road Map: Compression Refresher; Problems With Generic Compression; What GP-Zip Is; What GP-Zip Does; Conclusion

3 Purpose: Use fewer bits to represent data; reduce file size on disk; increase transmission speed

4 Lossy vs. Lossless. Lossy: removes unnecessary bits (similar colors in a photograph); cannot recreate the original file. Lossless: does not delete bits; able to rebuild the file to its original state.

5 (Very) Basic Lossless Compression Algorithm: Look for common words; create a dictionary. Example: "When I do good, I feel good. When I do bad, I feel bad. That's my religion." (17 words). Dictionary (count- word): 2- When, 4- I, 2- do, 2- good, 2- feel, 2- bad, 1- That's, 1- my, 1- religion (9 words).
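
The dictionary idea on this slide can be sketched in a few lines of Python. This is only an illustration, not the method GP-Zip uses: the function names are made up for the example, and punctuation is stripped from the sentence so that splitting on spaces is enough.

```python
from collections import Counter

def build_dictionary(text):
    """Split text into words, count them, and give each distinct word an integer code."""
    words = text.split()
    counts = Counter(words)
    # Most frequent words get the smallest codes.
    dictionary = {word: code for code, (word, _) in enumerate(counts.most_common())}
    return words, counts, dictionary

def encode(words, dictionary):
    """Replace every word by its code."""
    return [dictionary[w] for w in words]

def decode(codes, dictionary):
    """Rebuild the original word sequence from the codes."""
    reverse = {code: word for word, code in dictionary.items()}
    return " ".join(reverse[c] for c in codes)

# Punctuation removed for simplicity (an assumption of this sketch).
sentence = "When I do good I feel good When I do bad I feel bad That's my religion"
words, counts, dictionary = build_dictionary(sentence)
codes = encode(words, dictionary)
restored = decode(codes, dictionary)

print(len(words), "words,", len(dictionary), "dictionary entries")
print(codes)
```

The 17 words become 17 small integers plus a 9-entry dictionary; the saving only pays off when words repeat often enough to outweigh the cost of storing the dictionary.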

6 Problems With Generic Compression: No algorithm can reduce every file (no free lunch); cannot guarantee never to increase file size; time may depend on the type of file
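
A short sketch of why no lossless algorithm can shrink every file. The first part is the counting argument: there are more n-bit inputs than there are strictly shorter outputs, so at least two inputs would have to share an output. The second part uses Python's standard `zlib` module purely as a stand-in for a "generic" compressor; the test data and the seed are arbitrary choices for the example.

```python
import random
import zlib

def count_strings(n_bits):
    """Return (number of n-bit inputs, number of strictly shorter bit strings)."""
    inputs = 2 ** n_bits
    shorter_outputs = sum(2 ** k for k in range(n_bits))  # lengths 0 .. n-1
    return inputs, shorter_outputs

inputs, shorter = count_strings(16)
print(inputs, "inputs but only", shorter, "shorter outputs")

# Pseudo-random bytes have no structure for the compressor to exploit.
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(10_000))
packed_noise = zlib.compress(noise)

# Highly repetitive data compresses well.
repetitive = b"good " * 2_000
packed_repetitive = zlib.compress(repetitive)

print(len(noise), "->", len(packed_noise))
print(len(repetitive), "->", len(packed_repetitive))
```

The same compressor that shrinks the repetitive input makes the random input slightly larger, which is the behavior the slide warns about.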

7 GP-Zip: Genetic programming; best application: heterogeneous collections of data, i.e., large unsorted folders of data; developed by students of the University of Essex, Wivenhoe, United Kingdom; in progress since 2008

8 Optimizes compression: Breaks file into 5 KB blocks; analyzes file type; block passed to proper compression method "waiting area"; program will "predict" best compression method; "waiting area" can be 1600 B to 1 MB (increasing by 1600 B); cannot exceed original file size.
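
The block-splitting step can be sketched as follows. This is a minimal illustration that assumes 1 KB = 1024 bytes and simply lets the last block be shorter; `split_into_blocks` and `BLOCK_SIZE` are names invented for the example, and the "waiting area" logic is not modeled.

```python
BLOCK_SIZE = 5 * 1024  # 5 KB, assuming 1 KB = 1024 bytes

def split_into_blocks(data, block_size=BLOCK_SIZE):
    """Cut data into consecutive blocks; the final block may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

# 12,800 bytes of sample data (arbitrary content for the example).
data = bytes(range(256)) * 50
blocks = split_into_blocks(data)

print([len(b) for b in blocks])
```

Joining the blocks back together in order reproduces the input exactly, so the split itself loses nothing.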

9 Actual Compression: Utilizes other compression methods: Arithmetic Coding (AC); Lempel-Ziv-Welch (LZW); Prediction by Partial Matching (PPMD); Run Length Encoding (RLE); Boolean Minimization
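
Of the methods listed, Run Length Encoding is the simplest to show. Below is a minimal sketch of one common byte-oriented variant, where each run is stored as a (count, value) pair with the count capped at 255; the exact RLE format used inside GP-Zip is not given on the slides, so this layout is an assumption.

```python
def rle_encode(data: bytes) -> bytes:
    """Encode data as repeated (run_length, byte_value) pairs, run_length <= 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)

def rle_decode(encoded: bytes) -> bytes:
    """Expand (run_length, byte_value) pairs back into the original bytes."""
    out = bytearray()
    for j in range(0, len(encoded), 2):
        count, value = encoded[j], encoded[j + 1]
        out += bytes([value]) * count
    return bytes(out)

sample = b"a" * 300 + b"xyz"
packed = rle_encode(sample)
print(len(sample), "->", len(packed))
```

Long runs shrink dramatically, while data without runs doubles in size, which is one reason a per-block choice of method can help.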

10 Predicting Type: Attempts to predict the compression ratio for each type of compression. Byte Frequency Distribution (BFD): a histogram of the number of appearances of each character over the total number of characters. Byte-series: treats bytes as integers and applies a non-linear function to detect similarities between data types.
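
A byte frequency distribution as described here can be computed directly. The sketch below returns 256 relative frequencies, one per possible byte value; the function name is made up for the example, and the byte-series analyzer is left out because the slides do not specify its non-linear function.

```python
from collections import Counter

def byte_frequency_distribution(block):
    """Return a 256-entry list: occurrences of each byte value / total bytes."""
    if not block:
        return [0.0] * 256
    counts = Counter(block)  # iterating over bytes yields integers 0..255
    total = len(block)
    return [counts.get(value, 0) / total for value in range(256)]

bfd = byte_frequency_distribution(b"aab")
print(bfd[ord("a")], bfd[ord("b")])
```

Text, images, and already-compressed data tend to give visibly different histograms, which is what makes the BFD usable as a feature for predicting how a block will compress.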

11 Predicting Type: Decision tree; leaves represent classifications; branches represent conjunctions of features; if certain features are present, proceed to leaf

12 Predicting Type: Output of the tree is the estimated compression ratio; can use either or both analyzer results to calculate the final estimate
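
The prediction step can be illustrated with a toy example. Everything here is hypothetical: the two features, the thresholds, and the leaf values are hand-picked for the sketch, whereas in GP-Zip the trees are evolved by genetic programming. The ratio is assumed to mean compressed size divided by original size, so smaller is better.

```python
def features(block):
    """Two simple, made-up features of a block of bytes."""
    distinct = len(set(block))
    repeats = sum(1 for a, b in zip(block, block[1:]) if a == b)
    repeat_fraction = repeats / max(len(block) - 1, 1)
    return {"distinct": distinct, "repeat_fraction": repeat_fraction}

def estimate_rle(f):
    """Hand-written stand-in for a tree; each return value is a leaf."""
    if f["repeat_fraction"] > 0.5:
        return 0.2  # long runs: RLE expected to do well
    return 1.5      # few runs: RLE expected to expand the block

def estimate_lzw(f):
    """Hand-written stand-in for a tree; thresholds are illustrative only."""
    if f["distinct"] < 64:
        return 0.5
    return 0.9

ESTIMATORS = {"RLE": estimate_rle, "LZW": estimate_lzw}

def predict_best_method(block):
    """Estimate a ratio per method and pick the smallest estimate."""
    f = features(block)
    estimates = {name: tree(f) for name, tree in ESTIMATORS.items()}
    best = min(estimates, key=estimates.get)
    return best, estimates

print(predict_best_method(b"\x00" * 100))
print(predict_best_method(b"abcabcabc" * 10))
```

The point of predicting rather than trying every method is speed: the block is analyzed once, and only the method with the best estimate is actually run.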

13 Files and Blocks

14 Finalizing: A header is created after breaking down files; it is the blueprint for decompression. Header will "glue" some blocks: [PPMD][PPMD][LZW][LZW][LZW] becomes [PPMD][LZW]. After all bits and blocks are analyzed, all blocks are compressed as labeled, and all compressed blocks are wrapped with the header.
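
The "gluing" of adjacent blocks that share a label can be sketched with `itertools.groupby`. The header layout used here, a list of (method, number of blocks) pairs, is an assumption made for the example; the slides do not describe the actual header format.

```python
from itertools import groupby

def glue_labels(labels):
    """Merge consecutive identical labels into (method, block_count) entries."""
    return [(method, sum(1 for _ in run)) for method, run in groupby(labels)]

def expand_header(header):
    """Recover the per-block labels from the glued header."""
    return [method for method, count in header for _ in range(count)]

labels = ["PPMD", "PPMD", "LZW", "LZW", "LZW"]
header = glue_labels(labels)
print(header)
```

Gluing keeps the header small and lets each compression method work on one longer segment instead of several short ones.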

15 Conclusion: Inefficiency of Standard Compression; What Compression Is; What GP-Zip Does; How GP-Zip Works
