No Bit Left Behind: The Limits of Heap Data Compression

Department of Computer Sciences. ISMM. No Bit Left Behind: The Limits of Heap Data Compression. Jennifer B. Sartor*, Martin Hirzel†, Kathryn S. McKinley*
Presentation transcript:

No Bit Left Behind: The Limits of Heap Data Compression
Jennifer B. Sartor*, Martin Hirzel†, Kathryn S. McKinley*
*U Texas at Austin, †IBM Watson

Current State
  Managed languages ubiquitous
  Embedded devices
  Multicore
  [Diagram: CPUs, each with L1 and L2 caches]
  Need memory efficiency!

Memory Efficiency of Managed Languages
  COST: 8-94% information content in heap in 37 benchmarks. [Mitchell & Sevitsky, OOPSLA 07]
    Boxed objects
    Trailing zeros in arrays
    Redundant objects
    Extra bit-width
    Data structure back-bones
  bzip2 86%
  OPPORTUNITY: Memory layout abstraction
    (Location + size) != identity

Related Work Ananian & Rinard. LCTES 03 Equal obj sharing Appel & Goncalves. Tech Report 93 Dom value field hash, const field elide, Bit-width Chen, Kandemir & Irwin. VEE 05 Dom value field elide Chen, et al. OOPSLA 03 Zero compr, Trail zero trim Cooprider & Regehr. PLDI 07 Value set indirection Marinov & O’Callahan. OOPSLA 03 Eql obj sharing Stephenson, Babb & Amarasinghe. PLDI 00 Const field elide, Bit-width reduction Titzer, et al. PLDI 07 Zilles. ISMM 07 Bit-width reduction

Limit Study
  Quantitatively compare heap data compression
  Surveyed literature
  Savings equations
  Methodology for evaluation
  Apples-to-apples comparison
  Future work: implementation
  Hybrid techniques
  58%
  Findings: array & hybrid compression

Compression Example
  Redundancy: x0001 x0058 x0004 x0000 x0001 x0058
  Equal array sharing

Equal Object Sharing
  Marinov & O’Callahan. OOPSLA 03; Appel & Goncalves. Tech Report 93
  Two objects are equal if both have the same class & all fields have the same value
  Strictly-equal: pointer fields identical
  Deep: objects' pointer targets are equal
  JVM stores only 1 copy in hashtable
  Class C, N objects, D distinct; save:
  14%
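
The strictly-equal case above can be sketched in a few lines of Python. This is an illustrative model only, not the JVM mechanism from the cited work: objects are assumed to be (class name, tuple of field values) pairs, with pointer fields represented by the integer id of their target, and `count_shareable` is a hypothetical helper name. The savings formula on the slide did not survive in this transcript, so the sketch only reports N and D per class.

```python
from collections import defaultdict

def strictly_equal_key(obj):
    """Hash key for strict equality: same class and same field values.
    Pointer fields are assumed to be stored as the id of their target,
    so identical pointers compare equal."""
    class_name, fields = obj
    return (class_name, tuple(fields))

def count_shareable(objects):
    """Return {class_name: (N, D)} where N is the number of objects of
    that class and D is the number of distinct ones after sharing."""
    totals = defaultdict(int)
    distinct = defaultdict(set)
    for obj in objects:
        class_name = obj[0]
        totals[class_name] += 1
        distinct[class_name].add(strictly_equal_key(obj))
    return {c: (totals[c], len(distinct[c])) for c in totals}
```

For a heap with three `Point` objects, two of which hold the same field values, the function reports N = 3 and D = 2 for that class, meaning one copy could be dropped if the objects were shared.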

Compression Example
  Redundancy: x0001 x0058 x0004 x0000
  Equal array sharing
  Value set indirection
  Dictionary: x0001 x0058 x0004 x0000; 1 2 3

Value Set Indirection & Caching
  Cooprider, Regehr ’07 / Titzer, Palsberg ’07
  For object field or array elements with large range of values
  Dictionary: 256 distinct values, instance stores small 1-byte indices
  If > 256 values: 255 in dictionary, 256th says to store rest (M) in hashtable w/ objectID
  14%
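
A minimal Python sketch of the dictionary scheme described above, under stated assumptions: the dictionary keeps the most frequent values (the slide does not say how the 255 are chosen), index 255 is used as the escape marker when there are more than 256 distinct values, and the position in the input list stands in for the objectID. The names `encode`, `decode` and `ESCAPE` are hypothetical.

```python
from collections import Counter

ESCAPE = 255  # assumed escape index: "value lives in the overflow table"

def encode(values):
    """Return (dictionary, indices, overflow) for a list of hashable values.
    indices holds one byte per value; overflow maps object id -> value."""
    counts = Counter(values)
    # Up to 256 distinct values fit directly; otherwise reserve index 255.
    limit = 256 if len(counts) <= 256 else 255
    dictionary = [v for v, _ in counts.most_common(limit)]
    index_of = {v: i for i, v in enumerate(dictionary)}
    indices = bytearray()
    overflow = {}
    for object_id, v in enumerate(values):
        if v in index_of:
            indices.append(index_of[v])
        else:
            indices.append(ESCAPE)
            overflow[object_id] = v
    return dictionary, indices, overflow

def decode(dictionary, indices, overflow):
    """Rebuild the original values from the three encoded parts."""
    return [overflow[i] if i in overflow else dictionary[idx]
            for i, idx in enumerate(indices)]
```

Each instance then costs one byte plus, for the rare values, one overflow entry; whether that is a net saving depends on the original field width and on how many values (M) fall outside the dictionary.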

Compression Example
  x00A0 x0073 x0002 x0001 x0101 x0000 x00A0 x0073
  Trim trailing zeros: x00A0 x0073 x0002 x0001 x0101 8 5
  Bit width reduce: x0A0 x073 x002 x001 x101 8 5
  Zero compress: x0A x73 x2 x001 x101 8 5 xAF 8 5 10101111
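
The first two steps of the example, trailing zero trimming and bit-width reduction, can be sketched as follows. This is a hedged illustration rather than the method of the cited papers: the function names are hypothetical, elements are assumed to be non-negative integers, and the width computed is the minimum number of bits, which may be narrower than the width shown on the slide.

```python
def trim_trailing_zeros(array):
    """Return (original_length, stored_prefix) with trailing zeros dropped.
    Keeping the original length lets a reader restore the zeros later."""
    end = len(array)
    while end > 0 and array[end - 1] == 0:
        end -= 1
    return len(array), array[:end]

def required_bit_width(array):
    """Smallest number of bits that can hold every non-negative element."""
    return max((v.bit_length() for v in array), default=0)

def restore(original_length, stored_prefix):
    """Undo trimming by padding the stored prefix with zeros."""
    return stored_prefix + [0] * (original_length - len(stored_prefix))
```

Assuming the example array ends in three zero elements (an assumption consistent with the "8 5" annotation), `trim_trailing_zeros` on the eight 16-bit elements keeps the first five, and `required_bit_width` on those five reports 9 bits, because the largest element is x0101.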

Zero-based Object Compression
  Chen, et al. OOPSLA 03
  Remove bytes that are entirely zero
  Per object bit-map: 1 bit per byte
  Store only non-zero bytes
  Savings: 45%
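
The bitmap scheme above can be sketched in Python as follows. This is a minimal model, not the implementation from Chen et al.: the object is assumed to be a plain byte string, bits are assigned least-significant-first within each bitmap byte (an arbitrary choice), and all function names are hypothetical.

```python
def zero_compress(data: bytes):
    """Return (bitmap, nonzero). The bitmap has one bit per input byte,
    set when that byte is non-zero; nonzero holds only those bytes."""
    bitmap = bytearray((len(data) + 7) // 8)
    nonzero = bytearray()
    for i, b in enumerate(data):
        if b:
            bitmap[i // 8] |= 1 << (i % 8)
            nonzero.append(b)
    return bytes(bitmap), bytes(nonzero)

def zero_decompress(bitmap: bytes, nonzero: bytes, length: int) -> bytes:
    """Rebuild the original bytes from the bitmap and the non-zero bytes."""
    out = bytearray(length)
    remaining = iter(nonzero)
    for i in range(length):
        if bitmap[i // 8] & (1 << (i % 8)):
            out[i] = next(remaining)
    return bytes(out)

def saved_bytes(data: bytes) -> int:
    """Original size minus compressed size (bitmap + non-zero bytes)."""
    bitmap, nonzero = zero_compress(data)
    return len(data) - (len(bitmap) + len(nonzero))
```

The bitmap costs one extra byte for every eight bytes of object data, so the scheme only pays off when more than one byte in eight is zero; `saved_bytes` can be negative for dense objects.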

Methodology
  [Diagram labels: Program run, Garbage Collection, Heap dump series, Analysis representation, Model 1 to Model n, Limit savings, snapshot]

Experimental Details
  Jikes Research Virtual Machine
    Java-in-Java
  DaCapo benchmarks + pseudojbb
  20-25 heap snapshots per benchmark
  MarkSweep with 2x min heap
  Analysis
    Per class
    Objects and arrays separated
    JVM+app vs application (separated in paper)
    Per heap snapshot, and over all snapshots

Techniques (table columns: Technique, Class, Array, GC/Run)
  Lempel-Ziv compression X GC
  Strictly-equal object sharing Obj Type
  Deep-equal object sharing
  Zero-based object compression Inst
  Trailing zero array trimming
  Bit-width reduction Fld
  Dominant-value field hashing
  Lazy invariant computation
  Value set indirection
  Value set caching
  Constant field elision Run
  Dominant-value field elision

Savings (average over all benchmarks)

Stability of Savings
  fop: snapshots over time

Conclusions
  Limit study compares heap data compression techniques apples-to-apples
  Potential to reduce memory inefficiencies in managed languages
    Arrays
    Hybrids
  Future: save space
  Challenge: efficient detection & recovery
  Thank you!

Value Indirection & Cache, Deep Equal Sharing, Zero Compression, Hybrid Compression