gFPC: A Self-Tuning Compression Algorithm
Martin Burtscher (The University of Texas at Austin) and Paruj Ratanaworabhan (Kasetsart University)


1 gFPC: A Self-Tuning Compression Algorithm
- Martin Burtscher (1) and Paruj Ratanaworabhan (2)
- (1) The University of Texas at Austin
- (2) Kasetsart University

2 Introduction
- Many compression algorithms are parameterizable
- Some parameters allow straightforward trade-offs
  - E.g., compression ratio vs. speed
  - Controlled via command line
- Other parameters provide no obvious trade-off
  - Best value is input dependent and changes dynamically
  - E.g., hash function in a predictor
  - Typically hardcoded

3 Contribution
- Self-tuning approach to optimize parameters
  - Automatic, on-line, and genetic-algorithm-based
  - Slower compression but higher compression ratio
- gFPC algorithm for IEEE 754 double-precision data
  - Compresses linear streams of FP values
  - Lossless single-pass algorithm
  - Repeatedly self-tunes 4 hash-table parameters

4 FPC Algorithm [DCC’07]
- Make two predictions
- Select closer value
- XOR with true value
- Count leading zero bytes
- Encode value
- Update predictors
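The selection, XOR, and leading-zero-byte steps listed above can be sketched in C as follows. This is a minimal illustration, not the released gFPC source: the names `leading_zero_bytes`, `fpc_residual`, and `fpc_select` are hypothetical, and "closer" is assumed to mean the prediction whose XOR residual is numerically smaller (and therefore has at least as many leading zero bytes). Doubles are treated as raw 64-bit patterns.

```c
#include <assert.h>
#include <stdint.h>

/* Number of leading zero bytes (0..8) in a 64-bit XOR residual. */
static int leading_zero_bytes(uint64_t x)
{
    int n = 0;
    while (n < 8 && (x >> 56) == 0) {  /* top byte is zero */
        x <<= 8;
        n++;
    }
    return n;
}

/* Result of the select/XOR/count steps for one value (hypothetical type). */
typedef struct {
    uint64_t residual;    /* true value XOR chosen prediction */
    int      zero_bytes;  /* leading zero bytes that need not be stored */
    int      used_second; /* 1 if the second prediction was chosen */
} fpc_residual;

/* XOR the true value with both predictions and keep the smaller residual.
 * Assumption: a smaller residual is what "closer" means here. */
static fpc_residual fpc_select(uint64_t true_value, uint64_t pred1, uint64_t pred2)
{
    uint64_t r1 = true_value ^ pred1;
    uint64_t r2 = true_value ^ pred2;
    fpc_residual out;

    out.used_second = (r2 < r1);
    out.residual    = out.used_second ? r2 : r1;
    out.zero_bytes  = leading_zero_bytes(out.residual);
    assert(out.zero_bytes >= 0 && out.zero_bytes <= 8);
    return out;
}
```

An encoder built on this sketch would then emit a selector bit, the zero-byte count, and only the remaining nonzero bytes of the residual; the exact bit layout is not given on the slide and is left out here.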

5 Hash Function Parameters
- Two predictors
  - FCM predicts values, DFCM predicts differences

    fcm_prediction = fcm[fcm_hash];  // prediction: read hash table entry
    fcm[fcm_hash] = true_value;      // update: write hash table entry
    fcm_hash = ((fcm_hash << lshift) ^ (true_value >> rshift)) & (table_size - 1);

- Two parameters each
  - lshift for aging
  - rshift for eliminating random bits
- 802,816 possibilities with 256 kB table_size
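To make the role of the two shift parameters concrete, here is a small C sketch of both predictors with `lshift` and `rshift` as run-time fields. It is an illustration under stated assumptions rather than the gFPC implementation: the type and function names are hypothetical, `TABLE_SIZE` is an arbitrary small power of two, and the DFCM is assumed to store the difference between consecutive values and add it back to the last value when predicting.

```c
#include <assert.h>
#include <stdint.h>

#define TABLE_SIZE 4096u  /* illustrative entry count; must be a power of two */

/* FCM: predicts the next value from a hash of recent values. */
typedef struct {
    uint64_t table[TABLE_SIZE];
    uint64_t hash;
    unsigned lshift;  /* ages older history out of the hash */
    unsigned rshift;  /* drops low-order (random) bits of the value */
} fcm_t;

static uint64_t fcm_predict(const fcm_t *p)
{
    return p->table[p->hash];
}

static void fcm_update(fcm_t *p, uint64_t true_value)
{
    assert(p->lshift < 64 && p->rshift < 64);  /* keep shifts well defined */
    p->table[p->hash] = true_value;
    p->hash = ((p->hash << p->lshift) ^ (true_value >> p->rshift)) & (TABLE_SIZE - 1);
}

/* DFCM: same idea, but the table holds differences between consecutive values. */
typedef struct {
    uint64_t table[TABLE_SIZE];
    uint64_t hash;
    uint64_t last_value;
    unsigned lshift;
    unsigned rshift;
} dfcm_t;

static uint64_t dfcm_predict(const dfcm_t *p)
{
    return p->table[p->hash] + p->last_value;
}

static void dfcm_update(dfcm_t *p, uint64_t true_value)
{
    assert(p->lshift < 64 && p->rshift < 64);
    uint64_t stride = true_value - p->last_value;  /* wraps modulo 2^64 */
    p->table[p->hash] = stride;
    p->hash = ((p->hash << p->lshift) ^ (stride >> p->rshift)) & (TABLE_SIZE - 1);
    p->last_value = true_value;
}
```

With zero-initialized predictors, feeding the DFCM the values 10 and 20 leaves a stored stride of 10, so the next prediction is 30. Because the shifts are ordinary fields, a tuner can change them between blocks, provided the decompressor applies the same changes at the same points.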

6 Genetic Self-Tuning
- Compress blocks with several sets of parameters
  - Start with FPC and otherwise random sets
- Create new sets for next data block
  - Keep best set of parameters
  - Evolve remaining sets
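One generation of the scheme above might look like the following C sketch. The slide does not say which genetic operators gFPC uses, so everything specific here is an assumption: uniform crossover with the best set, a single random mutation per remaining set, the `MAX_SHIFT` bound, the small linear congruential generator, and all names (`param_set`, `evolve_population`, `lcg_next`). The four genes stand for the two shift amounts of each of the two predictors.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_PARAMS 4   /* FCM lshift/rshift and DFCM lshift/rshift */
#define MAX_SHIFT  63u /* hypothetical upper bound for a shift amount */

typedef struct {
    unsigned gene[NUM_PARAMS];
} param_set;

/* Tiny deterministic pseudo-random generator so the sketch is self-contained. */
static uint32_t lcg_next(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state >> 16;
}

/* compressed_size[i] is the size obtained by compressing the current block
 * with pop[i]. The best set is moved to pop[0] and kept unchanged; every
 * other set is recombined with it and then mutated in one gene. */
static void evolve_population(param_set *pop, const size_t *compressed_size,
                              int n, uint32_t *rng)
{
    assert(n >= 1);

    int best = 0;
    for (int i = 1; i < n; i++)
        if (compressed_size[i] < compressed_size[best])
            best = i;

    /* Keep the best set of parameters in slot 0. */
    param_set tmp = pop[0];
    pop[0] = pop[best];
    pop[best] = tmp;

    /* Evolve the remaining sets. */
    for (int i = 1; i < n; i++) {
        for (int g = 0; g < NUM_PARAMS; g++)
            if (lcg_next(rng) & 1u)               /* crossover with the best set */
                pop[i].gene[g] = pop[0].gene[g];
        int m = (int)(lcg_next(rng) % NUM_PARAMS); /* mutate one gene */
        pop[i].gene[m] = lcg_next(rng) % (MAX_SHIFT + 1u);
    }
}
```

A caller would initialize one slot with FPC's fixed settings and the rest randomly, compress each block once per set, and call `evolve_population` before moving on to the next block.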

7 Related Work
- Genetic algorithms (GAs) for evolving programs
  - Program output approximates original data
- GAs for evolving compressor parameters off-line
  - Rate distortion
  - Vector quantization
  - Fractal codes
  - Dictionary n-grams
  - Best compressor for each block
- We use on-line GA: faster, adapts dynamically

8 Evaluation Method
- System
  - Sun Fire X2270 Server, Ubuntu Linux 8.06
  - 2.93 GHz 64-bit Intel Xeon 5570 (Nehalem) processor
- Datasets
  - Linear streams of real-world data (18 – 277 MB)
  - 4 observations: error, info, spitzer, temp
  - 4 simulations: brain, comet, control, plasma
  - 5 MPI messages: bt, lu, sp, sppm, sweep3d

9 Population Size
- Affects
  - Compression speed
  - Compression ratio
- Result
  - Population size of 4 performs within 0.5% of maximum
  - (A population size of 1 corresponds to FPC)

10 Block Size
- Affects
  - Reconfiguration frequency
  - Compression ratio
- Result
  - 512 kB blocks good
  - Medium sizes best
  - Warm-up versus adaptivity tradeoff

11 Compression Ratio Comparison
- FPC_size and FPC_all
  - Use off-line GA and LS to find best parameters for each size (and input)
- Results
  - FPC is 5% worse
  - FPC_size: no input adaptivity
  - FPC_all (mostly) better
  - gFPC is retroactive (but can adapt on-the-fly)
  - gFPC is 317 times faster

12 Self-Tuning Benefit
- Rarely worse, mostly better (up to 72%)
- Relative to FPC, which was tuned for these inputs
- Benefit is likely higher on other inputs

13 Throughput on Xeon System
- Compression is slower with larger population size
- Small compression overhead due to self-tuning
- Decompression is faster due to better compression

14 Summary
- Self-tuning approach
  - Based on on-line genetic algorithm
  - Repeatedly tunes 4 hash-table parameters in gFPC
  - Applicable to other compressors
- Results
  - Higher compression ratio, lower compression speed
  - gFPC compresses at 1 Gb/s, decompresses at 7 Gb/s
- C source code of gFPC is freely available
  http://users.ices.utexas.edu/~burtscher/research/gFPC/

