Published by Rosemary Conley. Modified over 8 years ago.
1
GP-ZIP: Genetic Programming File Compression
By: Dj Gerena
2
Road Map
Compression Refresher
Problems With Generic Compression
What GP-Zip Is
What GP-Zip Does
Conclusion
3
Purpose
Use fewer bits to represent data
Reduce file size on disk
Increase transmission speed
4
Lossy vs. Lossless
Lossy
  Removes bits judged unnecessary
  Example: similar colors in a photograph
  Cannot recreate the original file
Lossless
  Does not delete bits
  Able to rebuild the file to its original state
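As a small illustration of the lossless property, the sketch below uses Python's standard `zlib` module. It is chosen only as a convenient stand-in and is not one of the methods named in this presentation. The helper name `lossless_round_trip` is made up for the example.

```python
import zlib

def lossless_round_trip(data):
    """Compress and then decompress; a lossless codec returns the input unchanged."""
    compressed = zlib.compress(data)
    return zlib.decompress(compressed)

# Repetitive sample text, so the compressed form is also smaller than the input.
text = b"When I do good, I feel good. " * 20

# Every original byte is recovered, which is what "lossless" means.
assert lossless_round_trip(text) == text
```

A lossy codec would fail this equality check by design, since it discards information it considers unnecessary.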
5
(Very) Basic Lossless Compression Algorithm
Look for common words
Create a dictionary
Example: "When I do good, I feel good. When I do bad, I feel bad. That's my religion." (17 words)
Dictionary with word counts (9 words):
  2 - When    2 - bad
  4 - I       1 - That's
  2 - do      1 - my
  2 - good    1 - religion
  2 - feel
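The word-dictionary idea on this slide can be sketched as follows. This is a minimal illustration, not the scheme any real compressor uses: the function names are invented for the example, words are indexed in order of first appearance, and punctuation is dropped during tokenizing, so the round trip recovers the word sequence rather than the exact text.

```python
import re
from collections import Counter

SENTENCE = ("When I do good, I feel good. When I do bad, I feel bad. "
            "That's my religion.")

def tokenize(text):
    """Split text into words, keeping apostrophes and dropping punctuation."""
    return re.findall(r"[A-Za-z']+", text)

def build_dictionary(words):
    """Give each distinct word an index, in order of first appearance."""
    dictionary = []
    index_of = {}
    for word in words:
        if word not in index_of:
            index_of[word] = len(dictionary)
            dictionary.append(word)
    return dictionary, index_of

def encode(words, index_of):
    """Replace every word with its dictionary index."""
    return [index_of[word] for word in words]

def decode(codes, dictionary):
    """Look every index back up to recover the word sequence."""
    return [dictionary[code] for code in codes]

words = tokenize(SENTENCE)
dictionary, index_of = build_dictionary(words)
codes = encode(words, index_of)
counts = Counter(words)

print(len(words), len(dictionary))  # 17 9
```

The saving comes from storing each repeated word once in the dictionary and referring to it by a short index afterwards.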
6
Problems With Generic Compression
No algorithm can reduce every file
No free lunch
Cannot guarantee that file size never increases
Time may depend on the type of file
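The claim that no algorithm can reduce every file follows from a counting argument: there are more n-bit inputs than there are shorter outputs. The sketch below checks that count and then, as an informal demonstration, runs Python's standard `zlib` (a stand-in chosen for convenience, not a method from this presentation) on pseudo-random and on repetitive data. The helper name and the sample sizes are assumptions made for the example.

```python
import random
import zlib

def count_shorter_bitstrings(n):
    """Number of bit strings of length 0 to n-1: 2**0 + ... + 2**(n-1)."""
    return 2**n - 1

# There are 2**n inputs of n bits but only 2**n - 1 strictly shorter outputs,
# so a lossless scheme cannot map every input to a shorter one.
n = 16
assert count_shorter_bitstrings(n) < 2**n

# Seeded pseudo-random bytes have no structure for the compressor to exploit.
rng = random.Random(0)
random_data = bytes(rng.getrandbits(8) for _ in range(4096))

# Highly repetitive bytes are the opposite extreme.
repetitive_data = b"abcd" * 1024

print(len(random_data), len(zlib.compress(random_data)))
print(len(repetitive_data), len(zlib.compress(repetitive_data)))
```

On the random input the output is expected to come out slightly larger than the input because of format overhead, while the repetitive input shrinks to a small fraction of its size.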
7
GP-Zip
Genetic Programming
Best application: heterogeneous collections of data
  i.e. large unsorted folders of data
Developed by students of the University of Essex, Wivenhoe, United Kingdom
In progress since 2008
8
Optimizes Compression
Breaks the file into 5 KB blocks
Analyzes the data type of each block
Each block is passed to the proper compression method's “waiting area”
The program will “predict” the best compression method
A “waiting area” can be 1600 B to 1 MB (increasing in steps of 1600 B)
Cannot exceed the original file size
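The blocking step and the range of "waiting area" sizes on this slide might look like the following sketch. The function names are hypothetical, 5 KB is taken as 5 × 1024 bytes, and 1 MB is taken as 1,000,000 bytes. These are assumptions for illustration, not details confirmed by the presentation.

```python
def split_into_blocks(data, block_size=5 * 1024):
    """Split data into consecutive blocks; the last block may be shorter."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def candidate_waiting_area_sizes(file_size, step=1600, maximum=1_000_000):
    """List sizes step, 2*step, ... capped by both the maximum and the file size."""
    limit = min(maximum, file_size)
    return list(range(step, limit + 1, step))

data = bytes(12_000)
blocks = split_into_blocks(data)
print([len(block) for block in blocks])    # [5120, 5120, 1760]
print(candidate_waiting_area_sizes(5000))  # [1600, 3200, 4800]
```

Joining the blocks back together in order reproduces the original data, which is what lets each block be compressed independently.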
9
Actual Compression
Utilizes other compression methods:
  Arithmetic Coding (AC)
  Lempel-Ziv-Welch (LZW)
  Prediction by Partial Matching (PPMD)
  Run-Length Encoding (RLE)
  Boolean Minimization
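Of the methods listed, run-length encoding is the simplest to show in a few lines. The sketch below is a textbook form of RLE that stores (byte value, run length) pairs. It is offered only to illustrate the idea and is not claimed to be the variant GP-Zip uses.

```python
def rle_encode(data):
    """Return a list of (byte_value, run_length) pairs for the input bytes."""
    runs = []
    for value in data:
        if runs and runs[-1][0] == value:
            # Same byte as the previous one: extend the current run.
            runs[-1] = (value, runs[-1][1] + 1)
        else:
            # Different byte: start a new run of length 1.
            runs.append((value, 1))
    return runs

def rle_decode(runs):
    """Expand (byte_value, run_length) pairs back into the original bytes."""
    return b"".join(bytes([value]) * length for value, length in runs)

print(rle_encode(b"aaabcc"))  # [(97, 3), (98, 1), (99, 2)]
```

RLE pays off only when the data contains long runs. On data without runs it produces more pairs than there were bytes, which is one reason a per-block choice of method is attractive.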
10
Predicting Type
Attempt to predict the compression ratio for each type of compression
Byte Frequency Distribution (BFD)
  A histogram of the number of appearances of each character over the total number of characters
Byte-series
  Treats bytes as integers and applies a non-linear function to detect similarities between data types
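A byte frequency distribution as described here can be computed directly. The sketch below returns one relative frequency per possible byte value. The function name is made up for the example, and the byte-series analyzer is not sketched because the slide does not specify its non-linear function.

```python
from collections import Counter

def byte_frequency_distribution(block):
    """Return 256 relative frequencies, one for each possible byte value."""
    if not block:
        return [0.0] * 256
    counts = Counter(block)
    total = len(block)
    return [counts.get(value, 0) / total for value in range(256)]

bfd = byte_frequency_distribution(b"aab")
print(bfd[ord("a")], bfd[ord("b")])
```

Text, images, and already-compressed data tend to produce visibly different histograms, which is what makes the distribution usable as a feature.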
11
Predicting Type
Decision tree
  Leaves represent classifications
  Branches represent conjunctions of features
  If certain features are present, proceed to the corresponding leaf
12
Predicting Type
Output of the tree is the estimated compression ratio
Can use either or both analyzer results to calculate the final estimate
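To make the decision-tree idea concrete, here is a toy, hand-written tree that maps two simple block features to an estimated compression ratio (compressed size divided by original size) for each method. Everything specific in it is hypothetical: the two features, the thresholds, the returned ratios, and the function names were invented for illustration. None of it is taken from GP-Zip.

```python
import math
from collections import Counter

def byte_entropy(block):
    """Shannon entropy of the byte histogram, in bits per byte (0 to 8)."""
    if not block:
        return 0.0
    total = len(block)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(block).values())

def repeat_fraction(block):
    """Fraction of bytes that are equal to the byte just before them."""
    if len(block) < 2:
        return 0.0
    repeats = sum(1 for a, b in zip(block, block[1:]) if a == b)
    return repeats / (len(block) - 1)

def estimate_ratio(block, method):
    """Toy tree: each path of tests leads to a leaf holding a made-up ratio."""
    entropy = byte_entropy(block)
    repeats = repeat_fraction(block)
    if method == "RLE":
        if repeats > 0.5:      # hypothetical threshold
            return 0.3
        return 1.1             # RLE tends to expand data without runs
    if method == "PPMD":
        if entropy < 5.0:      # hypothetical threshold
            return 0.4
        if entropy < 7.5:      # hypothetical threshold
            return 0.8
        return 1.0
    raise ValueError("unknown method: " + method)

def pick_method(block, methods=("RLE", "PPMD")):
    """Choose the method with the lowest estimated ratio for this block."""
    return min(methods, key=lambda m: estimate_ratio(block, m))

print(pick_method(b"\x00" * 1000))         # RLE
print(pick_method(bytes(range(256)) * 4))  # PPMD
```

The point of estimating instead of measuring is speed: the block is analyzed once, and no candidate compressor has to be run just to compare results.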
13
Files and Blocks
14
Finalizing
A header is created after breaking down files
  Blueprint for decompression
  Header will “glue” some blocks
  [PPMD][PPMD][LZW][LZW][LZW] -> [PPMD][LZW]
After all bits and blocks are analyzed
  All blocks are compressed as labeled
  All compressed blocks are wrapped with the header
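The "gluing" of adjacent blocks that share a label, as in the [PPMD][PPMD][LZW][LZW][LZW] example, can be sketched with `itertools.groupby`. Storing a block count next to each label is an assumption made here so that the grouping can be undone. The slide does not say what the real header records.

```python
from itertools import groupby

def glue_blocks(labels):
    """Collapse runs of adjacent identical labels into (label, count) pairs."""
    return [(label, sum(1 for _ in run)) for label, run in groupby(labels)]

def unglue_blocks(header):
    """Expand (label, count) pairs back into one label per block."""
    return [label for label, count in header for _ in range(count)]

labels = ["PPMD", "PPMD", "LZW", "LZW", "LZW"]
header = glue_blocks(labels)
print(header)  # [('PPMD', 2), ('LZW', 3)]
```

Only adjacent blocks are merged, so a sequence such as LZW, PPMD, LZW stays as three separate entries.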
15
Conclusion
Inefficiency of Standard Compression
What Compression Is
What GP-Zip Does
How GP-Zip Works
16
Works Cited
"Evolutionary Synthesis of Lossless Compression Algorithms with GP-zip3," in Proceedings of the IEEE World Congress on Computational Intelligence, IEEE, 2008.
Ahmed Kattan and Riccardo Poli, "Evolutionary Lossless Compression with GP-ZIP*," in Proceedings of the 10th Annual Conference on Genetic and Evolutionary Computation, Atlanta, Georgia, USA, 2008, pp. 1211-1218.
Ahmed Kattan and Riccardo Poli, "Evolutionary Lossless Compression with GP-ZIP," in Proceedings of the IEEE World Congress on Computational Intelligence, IEEE, 2008.