Published by Moses Ryan. Modified over 9 years ago.
1
gFPC: A Self-Tuning Compression Algorithm
Martin Burtscher (1) and Paruj Ratanaworabhan (2)
(1) The University of Texas at Austin
(2) Kasetsart University
2
Introduction
- Many compression algorithms are parameterizable
- Some parameters allow straightforward trade-offs
  - E.g., compression ratio vs. speed
  - Controlled via command line
- Other parameters provide no obvious trade-off
  - Best value is input dependent and changes dynamically
  - E.g., hash function in a predictor
  - Typically hardcoded
3
Contribution
- Self-tuning approach to optimize parameters
  - Automatic, on-line, and genetic-algorithm-based
  - Slower compression but higher compression ratio
- gFPC algorithm for IEEE 754 double-precision data
  - Compresses linear streams of FP values
  - Lossless single-pass algorithm
  - Repeatedly self-tunes 4 hash-table parameters
4
FPC Algorithm [DCC’07]
- Make two predictions
- Select closer value
- XOR with true value
- Count leading zero bytes
- Encode value
- Update predictors
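The "select, XOR, count leading zero bytes" steps can be sketched in C as follows. This is a minimal illustration, not the FPC source: the helper names (`double_bits`, `leading_zero_bytes`, `residual`) are hypothetical, and "closer" is assumed here to mean the prediction whose XOR with the true value is numerically smaller, which guarantees at least as many leading zero bytes.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a double as its 64-bit IEEE 754 bit pattern. */
static uint64_t double_bits(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

/* Number of leading zero bytes (0..8) in a 64-bit value. */
static int leading_zero_bytes(uint64_t x)
{
    int n = 0;
    while (n < 8 && (x >> 56) == 0) {
        x <<= 8;
        n++;
    }
    return n;
}

/*
 * XOR the true value with both predictions and keep the smaller result
 * (assumed meaning of "closer"). *used_dfcm is set to 1 if the DFCM
 * prediction was chosen, 0 if the FCM prediction was chosen.
 */
static uint64_t residual(uint64_t true_value, uint64_t fcm_pred,
                         uint64_t dfcm_pred, int *used_dfcm)
{
    uint64_t xf = true_value ^ fcm_pred;
    uint64_t xd = true_value ^ dfcm_pred;
    *used_dfcm = xd < xf;
    return *used_dfcm ? xd : xf;
}
```

A good prediction leaves a residual with many leading zero bytes, so only the remaining low-order bytes plus a small header (which predictor, how many zero bytes) need to be stored.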
5
Hash Function Parameters
- Two predictors
  - FCM predicts values, DFCM predicts differences

    fcm_prediction = fcm[fcm_hash];  // prediction: read hash table entry
    fcm[fcm_hash] = true_value;      // update: write hash table entry
    fcm_hash = ((fcm_hash << lshift) ^ (true_value >> rshift)) & (table_size - 1);

- Two parameters each
  - lshift for aging
  - rshift for eliminating random bits
- 802,816 possibilities with 256 kB table_size
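To make the roles of the two shift parameters concrete, here is a small C sketch of an FCM predictor with a configurable hash. It assumes the hash combines the left-shifted old hash with the right-shifted true value via XOR, as in FPC; the `fcm_t` type and the `fcm_*` function names are hypothetical and not taken from the gFPC source.

```c
#include <stdint.h>
#include <stdlib.h>

/* FCM predictor state (illustrative layout, not the gFPC layout). */
typedef struct {
    uint64_t *table;      /* hash table of previously seen values */
    uint64_t  table_size; /* number of entries; must be a power of two */
    uint64_t  hash;       /* current index into the table */
    unsigned  lshift;     /* ages older history out of the hash */
    unsigned  rshift;     /* drops noisy low-order mantissa bits */
} fcm_t;

/* Returns 0 on success, -1 on invalid arguments or allocation failure. */
static int fcm_init(fcm_t *p, uint64_t table_size,
                    unsigned lshift, unsigned rshift)
{
    if (table_size == 0 || (table_size & (table_size - 1)) != 0)
        return -1;
    if (lshift > 63 || rshift > 63)
        return -1;
    p->table = calloc(table_size, sizeof *p->table);
    if (p->table == NULL)
        return -1;
    p->table_size = table_size;
    p->hash = 0;
    p->lshift = lshift;
    p->rshift = rshift;
    return 0;
}

/* Prediction: read the hash table entry selected by the current hash. */
static uint64_t fcm_predict(const fcm_t *p)
{
    return p->table[p->hash];
}

/* Update: store the true value, then fold it into the hash. */
static void fcm_update(fcm_t *p, uint64_t true_value)
{
    p->table[p->hash] = true_value;
    p->hash = ((p->hash << p->lshift) ^ (true_value >> p->rshift))
              & (p->table_size - 1);
}

static void fcm_free(fcm_t *p)
{
    free(p->table);
    p->table = NULL;
}
```

A larger lshift pushes old values out of the hash sooner, while a larger rshift ignores more low-order bits of each value; which combination predicts best depends on the data, which is why gFPC tunes them at run time.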
6
Genetic Self-Tuning
- Compress blocks with several sets of parameters
  - Start with FPC and otherwise random sets
- Create new sets for next data block
  - Keep best set of parameters
  - Evolve remaining sets
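One generation of this scheme might look like the C sketch below. Only "keep the best set, evolve the rest" comes from the slide; the crossover and mutation operators, the parameter ranges, the xorshift random number generator, and all names (`params_t`, `best_index`, `next_generation`) are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define POP_SIZE 4  /* population size, chosen here for illustration */

/* One candidate: the four hash-table parameters being tuned. */
typedef struct {
    unsigned fcm_lshift, fcm_rshift;
    unsigned dfcm_lshift, dfcm_rshift;
} params_t;

/* xorshift32 pseudo-random generator; *state must be nonzero. */
static uint32_t next_rand(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Index of the set that produced the smallest compressed block. */
static size_t best_index(const size_t sizes[POP_SIZE])
{
    size_t best = 0;
    for (size_t i = 1; i < POP_SIZE; i++)
        if (sizes[i] < sizes[best])
            best = i;
    return best;
}

/*
 * Build the population for the next block in place.
 * sizes[i] is the compressed size obtained with pop[i] on the last block.
 * max_lshift (>= 1) bounds the lshift values; rshift is kept in 0..63.
 */
static void next_generation(params_t pop[POP_SIZE],
                            const size_t sizes[POP_SIZE],
                            unsigned max_lshift, uint32_t *rng)
{
    params_t old[POP_SIZE];
    for (size_t i = 0; i < POP_SIZE; i++)
        old[i] = pop[i];

    const params_t best = old[best_index(sizes)];
    pop[0] = best;  /* keep the best set unchanged */

    for (size_t i = 1; i < POP_SIZE; i++) {
        params_t mate = old[next_rand(rng) % POP_SIZE];
        params_t child = best;

        /* Crossover (assumed): take each predictor's pair from either parent. */
        if (next_rand(rng) & 1) {
            child.fcm_lshift = mate.fcm_lshift;
            child.fcm_rshift = mate.fcm_rshift;
        }
        if (next_rand(rng) & 1) {
            child.dfcm_lshift = mate.dfcm_lshift;
            child.dfcm_rshift = mate.dfcm_rshift;
        }

        /* Mutation (assumed): re-randomize one of the four parameters. */
        switch (next_rand(rng) % 4) {
        case 0:  child.fcm_lshift  = 1 + next_rand(rng) % max_lshift; break;
        case 1:  child.fcm_rshift  = next_rand(rng) % 64;             break;
        case 2:  child.dfcm_lshift = 1 + next_rand(rng) % max_lshift; break;
        default: child.dfcm_rshift = next_rand(rng) % 64;             break;
        }
        pop[i] = child;
    }
}
```

Because the best set always survives, the parameters used for the next block can never be worse on the previous block than the previous winner, while the mutated children let the compressor follow changes in the data.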
7
Related Work
- Genetic algorithms (GAs) for evolving programs
  - Program output approximates original data
- GAs for evolving compressor parameters off-line
  - Rate distortion
  - Vector quantization
  - Fractal codes
  - Dictionary n-grams
  - Best compressor for each block
- We use on-line GA: faster, adapts dynamically
8
Evaluation Method
- System
  - Sun Fire X2270 Server, Ubuntu Linux 8.06
  - 2.93 GHz 64-bit Intel Xeon 5570 (Nehalem) processor
- Datasets
  - Linear streams of real-world data (18 to 277 MB)
  - 4 observations: error, info, spitzer, temp
  - 4 simulations: brain, comet, control, plasma
  - 5 MPI messages: bt, lu, sp, sppm, sweep3d
9
Population Size
- Affects
  - Compression speed
  - Compression ratio
- Result
  - Population size of 4 performs within 0.5% of maximum
  - (Population size of 1 corresponds to FPC)
10
Block Size
- Affects
  - Reconfiguration frequency
  - Compression ratio
- Result
  - 512 kB blocks good
  - Medium sizes best
  - Warm-up versus adaptivity tradeoff
11
Compression Ratio Comparison
- FPC_size and FPC_all
  - Use off-line GA and LS to find best parameters for each size (and input)
- Results
  - FPC is 5% worse
  - FPC_size: no input adaptivity
  - FPC_all: (mostly) better
  - gFPC is retroactive (but can adapt on-the-fly)
  - gFPC is 317 times faster
12
Self-Tuning Benefit
- Rarely worse, mostly better (up to 72%)
- Relative to FPC, which was tuned for these inputs
  - Benefit is likely higher on other inputs
13
Throughput on Xeon System
- Compression is slower with larger population size
- Small compression overhead due to self-tuning
- Decompression is faster due to better compression
14
Summary
- Self-tuning approach
  - Based on on-line genetic algorithm
  - Repeatedly tunes 4 hash-table parameters in gFPC
  - Applicable to other compressors
- Results
  - Higher compression ratio, lower compression speed
  - gFPC compresses at 1 Gb/s, decompresses at 7 Gb/s
- C source code of gFPC is freely available
  - http://users.ices.utexas.edu/~burtscher/research/gFPC/