No Bit Left Behind: The Limits of Heap Data Compression


1 No Bit Left Behind: The Limits of Heap Data Compression
Jennifer B. Sartor*, Martin Hirzel†, Kathryn S. McKinley*
*U Texas at Austin, †IBM Watson
ISMM 2008

2 Current State
Managed languages ubiquitous
Embedded devices
Multicore
[Diagram: four CPUs, each with L1 and L2]
Need memory efficiency!

3 Memory Efficiency of Managed Languages
COST
8-94% information content in heap in 37 benchmarks [Mitchell & Sevitsky, OOPSLA 07]
Boxed objects
Trailing zeros in arrays
Redundant objects
Extra bit-width
Data structure back-bones
bzip2 86%
OPPORTUNITY
Memory layout abstraction
(Location + size) != identity

4 Related Work
Ananian & Rinard. LCTES 03 Dom value field hash
Appel & Goncalves. Tech Report 93 Eql obj sharing, Const field elide, Bit-width reduction
Chen, Kandemir & Irwin. VEE 05 Dom value field elide
Chen, et al. OOPSLA 03 Zero compr, Trail zero trim
Cooprider & Regehr. PLDI 07 Value set indirection
Marinov & O’Callahan. OOPSLA 03 Eql obj sharing
Stephenson, Babb & Amarasinghe. PLDI 00 Const field elide, Bit-width reduction
Titzer, et al. PLDI 07
Zilles. ISMM 07 Bit-width reduction

5 Limit Study
Quantitatively compare heap data compression
Surveyed literature
Savings equations
Methodology for evaluation
Apples-to-apples comparison
Future work: implementation
Hybrid techniques
58%
Findings: array & hybrid compression

6 Hybrid Array Compression
[Figure: array contents x0001 x0058 x0004 x0000 x0001 x0058 x0004 x0000]
Redundancy
Equal array sharing

7 Equal Object Sharing
Marinov & O’Callahan. OOPSLA 03; Appel & Goncalves. Tech Report 93
Two objects are equal if both
Same class & all fields have same value
Strictly-equal: pointer fields identical
Deep: objects' pointer targets are equal
JVM stores only 1 copy in hashtable
14%
Class C, N objects, D distinct; save:
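A minimal Python sketch of the strictly-equal variant described on this slide: objects are keyed by class and field values, and only one copy per key is kept. The function name, the `size_bytes` table, and the savings estimate of (N - D) times the instance size are assumptions for illustration; the slide's own savings equation did not survive in this transcript.

```python
from collections import defaultdict

def equal_sharing_savings(objects, size_bytes):
    """Estimate bytes saved by keeping one copy of each group of equal objects.

    objects: iterable of (class_name, fields) pairs, where fields is a tuple
             of field values (pointer fields compared by identity, i.e. the
             strictly-equal variant).
    size_bytes: dict mapping class_name -> per-instance size (assumed fixed).
    """
    total = defaultdict(int)     # N: number of objects per class
    distinct = defaultdict(set)  # the D distinct field tuples per class
    for class_name, fields in objects:
        total[class_name] += 1
        distinct[class_name].add(fields)
    # Assumed savings model: (N - D) * instance size, summed over classes.
    return sum((total[c] - len(distinct[c])) * size_bytes[c] for c in total)

# Hypothetical heap contents, for illustration only.
heap = [
    ("Point", (1, 2)), ("Point", (1, 2)), ("Point", (3, 4)),
    ("Color", (255,)), ("Color", (255,)),
]
saved = equal_sharing_savings(heap, {"Point": 16, "Color": 12})
print(saved)
```

Deep equality would additionally compare the objects that pointer fields refer to, which this sketch does not attempt.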

8 Hybrid Array Compression
[Figure: array contents x0001 x0058 x0004 x0000 x0001 x0058 x0004 x0000]
Redundancy
Equal array sharing
Value set indirection
Dictionary: x0001 x0058 x0004 x0000
1 2 3

9 Value Set Indirection & Caching
Cooprider & Regehr / Titzer, et al. PLDI 07
For object field or array elements with large range of values
Dictionary (or cache) of 256 most frequent values, instance stores small 1 byte indices
If > 256 values, 255 in dictionary, 256th says to store rest (M) in hashtable w/ objectID
14%
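The dictionary scheme on this slide can be sketched in Python as follows. This is an illustration under assumptions, not the authors' implementation: the function names are hypothetical, and the overflow table is keyed by element position as a stand-in for the objectID the slide mentions.

```python
from collections import Counter

def value_set_indirection(values, dict_size=256):
    """Encode values as 1-byte indices into a dictionary of frequent values.

    Returns (dictionary, indices, overflow). With more than dict_size
    distinct values, the last index is reserved as an escape code and the
    real value is kept in `overflow`, keyed by element position.
    """
    counts = Counter(values)
    if len(counts) <= dict_size:
        dictionary = [v for v, _ in counts.most_common()]
        escape = None
    else:
        dictionary = [v for v, _ in counts.most_common(dict_size - 1)]
        escape = dict_size - 1
    index_of = {v: i for i, v in enumerate(dictionary)}
    indices = bytearray()
    overflow = {}
    for pos, v in enumerate(values):
        if v in index_of:
            indices.append(index_of[v])
        else:
            indices.append(escape)  # escape code: look the value up in overflow
            overflow[pos] = v
    return dictionary, indices, overflow

def decode(dictionary, indices, overflow):
    """Rebuild the original values from the encoded form."""
    return [overflow[pos] if pos in overflow else dictionary[i]
            for pos, i in enumerate(indices)]

values = [0x0001, 0x0058, 0x0004, 0x0000] * 2
dictionary, indices, overflow = value_set_indirection(values)
```

For the eight-element example, every value fits in the dictionary, so each element is stored as a single byte index and the overflow table stays empty.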

10 Hybrid Array Compression 2
x00A0 x0073 x0002 x0001 x0101 x0000
Remove zeros
Trim trailing zeros
x00A0 x0073 x0002 x0001 x0101 8 5
Bit width reduce
x0A0 x073 x002 x001 x101 8 5
Zero compress
x0A x73 x2 x001 x101 8 5
xAF 8 5
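The first two steps on this slide, trailing zero trimming and bit-width reduction, can be sketched in Python as below. The helper names are hypothetical, elements are assumed to be non-negative integers, and the sketch only computes the trimmed array and the minimum bit width; it does not pack the bits.

```python
def trim_trailing_zeros(array):
    """Return (prefix, original_length) with trailing zero elements removed."""
    end = len(array)
    while end > 0 and array[end - 1] == 0:
        end -= 1
    return array[:end], len(array)

def required_bits(array):
    """Smallest bit width that can hold every (non-negative) element."""
    return max((v.bit_length() for v in array), default=0)

original = [0x00A0, 0x0073, 0x0002, 0x0001, 0x0101, 0x0000]
trimmed, length = trim_trailing_zeros(original)
bits = required_bits(trimmed)

# Recovery: pad the trimmed array back to its recorded length.
restored = trimmed + [0] * (length - len(trimmed))
```

Storing the original length alongside the trimmed data is what makes recovery possible; the widest element here, x0101, needs 9 bits, which fits in the three hex digits shown on the slide.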

11 Zero-based Object Compression
Chen, et al. OOPSLA 03
Remove bytes that are entirely zero
Per object bit-map: 1 bit per byte
Store only non-zero bytes
Savings: 45%
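A small Python sketch of the bitmap scheme described on this slide: one bit per byte marks whether the byte is non-zero, and only the non-zero bytes are stored. The function names and the bit ordering within each bitmap byte are assumptions for illustration.

```python
def zero_compress(data: bytes):
    """Return (bitmap, nonzero), where bitmap holds 1 bit per input byte."""
    bitmap = bytearray((len(data) + 7) // 8)
    nonzero = bytearray()
    for i, b in enumerate(data):
        if b != 0:
            bitmap[i // 8] |= 1 << (i % 8)  # mark byte i as present
            nonzero.append(b)
    return bytes(bitmap), bytes(nonzero)

def zero_decompress(bitmap: bytes, nonzero: bytes, length: int) -> bytes:
    """Rebuild the original bytes; unmarked positions are zero."""
    out = bytearray(length)
    it = iter(nonzero)
    for i in range(length):
        if bitmap[i // 8] & (1 << (i % 8)):
            out[i] = next(it)
    return bytes(out)

obj = bytes([0x00, 0xA0, 0x00, 0x73, 0x00, 0x02, 0x00, 0x01])
bitmap, nonzero = zero_compress(obj)
compressed_size = len(bitmap) + len(nonzero)
```

In this example the 8-byte object shrinks to 5 bytes (1 bitmap byte plus 4 non-zero bytes). An object with no zero bytes would grow by the size of its bitmap, so the scheme only pays off when zeros are common.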

12 Hybrid Array Compression 2
x00A0 x0073 x0002 x0001 x0101 x0000
Remove zeros
Trim trailing zeros
x00A0 x0073 x0002 x0001 x0101 8 5
Bit width reduce
x0A0 x073 x002 x001 x101 8 5
Zero compress
x0A x73 x2 x001 x101 8 5
xAF

13 Methodology
[Diagram labels: Garbage Collection, Program run, Heap dump series, snapshot, Analysis representation, t -> Model 1 – Model n -> s, Limit savings]

14 Experimental Details
Jikes Research Virtual Machine
Java-in-Java
DaCapo benchmarks + pseudojbb
20-25 heap snapshots per benchmark
MarkSweep with 2x min heap
Analysis
Per class
Objects and arrays separated
JVM+app vs application (separated in paper)
Per heap snapshot, and over all snapshots
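The evaluation loop implied by the methodology (apply each savings model to each heap snapshot, then report per snapshot and over all snapshots) might look like the Python sketch below. Everything here is an assumption for illustration: snapshots are modeled as lists of byte strings, a model is any function returning bytes saved, and the overall figure is total bytes saved divided by total bytes, which may differ from the aggregation used in the paper.

```python
def savings_report(snapshots, models):
    """Apply each model to each snapshot.

    snapshots: list of snapshots; each snapshot is a list of bytes objects.
    models: dict mapping name -> function(snapshot) -> bytes saved.
    Returns {name: (per_snapshot_fractions, overall_fraction)}.
    """
    sizes = [sum(len(o) for o in snap) for snap in snapshots]
    total_size = sum(sizes)
    report = {}
    for name, model in models.items():
        saved = [model(snap) for snap in snapshots]
        per_snapshot = [s / n if n else 0.0 for s, n in zip(saved, sizes)]
        overall = sum(saved) / total_size if total_size else 0.0
        report[name] = (per_snapshot, overall)
    return report

def zero_byte_model(snapshot):
    """Toy model: every zero byte counts as saved (ignores bitmap overhead)."""
    return sum(o.count(0) for o in snapshot)

# Two hypothetical snapshots of a tiny heap.
snaps = [[b"\x00\x01\x00\x02"], [b"\x01\x02", b"\x00\x00"]]
report = savings_report(snaps, {"zero bytes": zero_byte_model})
```

Because every model sees the same snapshots and is measured in the same unit, the resulting fractions are directly comparable across techniques.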

15 Technique Class Array GC/Run
Lempel-Ziv compression X GC
Strictly-equal object sharing Obj Type
Deep-equal object sharing
Zero-based object compression Inst
Trailing zero array trimming
Bit-width reduction Fld
Dominant-value field hashing
Lazy invariant computation
Value set indirection
Value set caching
Constant field elision Run
Dominant-value field elision

16 [Figure; labels: Value Indirection & Cache, Deep Equal Sharing, Zero Compression, Hybrid Compression]

17 Stability of Savings
fop: snapshots over time

18 Conclusions
Limit study: apples-to-apples comparison of heap data compression techniques
Potential to reduce memory inefficiencies in managed languages
Arrays
Hybrids
Future: save space
Challenge: efficient detection & recovery
Thank you!

