A REAL-TIME GARBAGE COLLECTOR WITH LOW OVERHEAD AND CONSISTENT UTILIZATION David F. Bacon, Perry Cheng, and V.T. Rajan IBM T.J. Watson Research Center.

Presentation transcript:

A REAL-TIME GARBAGE COLLECTOR WITH LOW OVERHEAD AND CONSISTENT UTILIZATION
David F. Bacon, Perry Cheng, and V.T. Rajan
IBM T.J. Watson Research Center
Presented by Srilakshmi Swati Pendyala

Outline
- Motivation
- Introduction & Previous Works
- Overview of the Proposed Garbage Collector
- Example of the Collection Process
- Scheduling: Time-Based vs. Work-Based
- Experimental Results
- Conclusion

Motivation
- Real-time systems are growing in importance
  - ATMs, PDAs, web servers, points of sale, etc.
- Constraints for real-time systems:
  - Hard constraints for continuous performance (low pause times)
  - Memory constraints (less memory in embedded systems)
  - Other constraints?
- Need for a real-time garbage collector with low memory usage

Garbage Collection in Real-time Systems
- Maximum pause time < required response time
- CPU utilization sufficient to accomplish the task
  - Measured with minimum mutator utilization (MMU)
- Memory requirement < resource limit
  - Important constraint in embedded systems
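To make the minimum mutator utilization measure concrete, here is a minimal sketch that computes it over a discretized execution trace. The tick-based string encoding (`M` for a mutator tick, `C` for a collector tick) and the function name `mmu` are assumptions made for illustration; they do not come from the slides, and a real measurement would use continuous time.

```python
def mmu(timeline, window):
    """Minimum mutator utilization over every window of `window` ticks.

    timeline: string where 'M' is a mutator tick and 'C' a collector tick
              (an assumed encoding, used only for this sketch).
    """
    if not 0 < window <= len(timeline):
        raise ValueError("window must be between 1 and len(timeline)")
    # Mutator ticks in the first window, then slide one tick at a time.
    run = timeline[:window].count("M")
    worst = run
    for i in range(window, len(timeline)):
        run += (timeline[i] == "M") - (timeline[i - window] == "M")
        worst = min(worst, run)
    return worst / window


even = "MMC" * 4                 # collector work spread out evenly
bursty = "M" * 8 + "C" * 4       # same total work, done in one burst
print(mmu(even, 3), mmu(bursty, 3))
```

Both traces give the mutator 8 of 12 ticks overall, but only the evenly spread trace keeps a non-zero utilization in every 3-tick window, which is the property a real-time task depends on.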

Problems with Previous Works
- Fragmentation
  - Early work (Baker's Treadmill) handles a single object size
    - Not suitable for modern languages
  - Fragmentation is not a major problem for a family of C and C++ benchmarks (Johnstone's paper)
    - Not valid for long-running programs (web servers, embedded systems, etc.)
  - Use of a single (large) block size
    - Increases memory requirements
    - Leads to internal fragmentation

Problems with Previous Works
- High space overhead
  - Copying algorithms are used to avoid fragmentation
  - This leads to high space overhead
- Uneven mutator utilization
  - Mutator utilization: the fraction of processor time devoted to mutator execution
  - Several copying algorithms suffer from poor or uneven mutator utilization
  - Long low-utilization periods leave the mutator unsuitable for real-time applications
- Inability to handle large data structures
  - When collecting a subset of the heap at a time, large structures generated by adversarial mutators force unbounded work


Components and Concepts in Proposed GC
- Segregated free list allocator
  - Geometric size progression limits internal fragmentation
- Mostly non-copying
  - Objects are usually not moved
- Defragmentation
  - Moves objects to a new page when a page is fragmented due to GC
- Read barrier: to-space invariant [Brooks]
  - New techniques with only 4% overhead
- Incremental mark-sweep collector
  - Mark phase fixes stale pointers
- Arraylets: bound fragmentation, large object ops
- Time-based scheduling

Segregated Free List Allocator
- Heap divided into fixed-size pages
- Each page divided into fixed-size blocks
- Objects allocated in the smallest block that fits
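The three points on this slide can be sketched as a toy allocator. This is an illustration only, not the collector's implementation: `PAGE_SIZE`, `BLOCK_SIZES`, and the class and method names are all assumed values and names chosen for the example.

```python
from bisect import bisect_left

PAGE_SIZE = 1024                      # assumed page size in bytes
BLOCK_SIZES = [16, 32, 64, 128, 256]  # assumed size classes, ascending


class SegregatedAllocator:
    def __init__(self):
        # One free list of (page_id, offset) pairs per size class.
        self.free_lists = {size: [] for size in BLOCK_SIZES}
        self.pages = []  # pages[i] is the block size page i was carved into

    def _add_page(self, block_size):
        """Carve a fresh page into equal blocks of a single size class."""
        page_id = len(self.pages)
        self.pages.append(block_size)
        for offset in range(0, PAGE_SIZE - block_size + 1, block_size):
            self.free_lists[block_size].append((page_id, offset))

    def allocate(self, nbytes):
        """Return (page_id, offset, block_size) for the smallest block that fits."""
        idx = bisect_left(BLOCK_SIZES, nbytes)
        if idx == len(BLOCK_SIZES):
            raise ValueError("object larger than the largest size class")
        block_size = BLOCK_SIZES[idx]
        if not self.free_lists[block_size]:
            self._add_page(block_size)  # pages are taken only on demand
        page_id, offset = self.free_lists[block_size].pop()
        return page_id, offset, block_size

    def free(self, page_id, offset):
        """Return a block to the free list of its page's size class."""
        self.free_lists[self.pages[page_id]].append((page_id, offset))
```

A 20-byte request lands in a 32-byte block here, so the gap between adjacent size classes determines how much of a block can be wasted.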

Limiting Internal Fragmentation
- Choose page size $P$ and block sizes $s_k$ such that
  - $s_k = s_{k-1}(1 + \rho)$
- How small should $s_0$ and $\rho$ be?
  - $s_0$ ~ minimum block size
  - $\rho$ ~ sufficiently small to avoid internal fragmentation
- Too small a $\rho$ leads to too many pages and hence wasted space, but this should be acceptable for long-running processes
- Too large a $\rho$ leads to internal fragmentation
- Memory for a page should be allocated only when there is at least one object in that page
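The trade-off in choosing the growth factor can be explored with a short sketch that generates the geometric size progression and measures the worst-case fraction of a block left unused. Rounding each size up to a whole byte is an assumption of this sketch, as are the function names; the slide gives only the recurrence.

```python
import math


def size_classes(s0, rho, max_size):
    """Block sizes following s_k = s_{k-1} * (1 + rho), rounded up to whole bytes."""
    sizes = [s0]
    while sizes[-1] < max_size:
        grown = math.ceil(sizes[-1] * (1 + rho))
        sizes.append(max(sizes[-1] + 1, grown))  # always make progress
    return sizes


def worst_case_waste(sizes):
    """Largest wasted fraction of a block.

    The worst case for class b is an object one byte larger than the
    previous class a, which wastes b - (a + 1) bytes of a b-byte block.
    """
    return max((b - (a + 1)) / b for a, b in zip(sizes, sizes[1:]))


fine = size_classes(16, 1 / 8, 1024)
coarse = size_classes(16, 1.0, 1024)
print(len(fine), worst_case_waste(fine))
print(len(coarse), worst_case_waste(coarse))
```

With these definitions the wasted fraction stays below rho / (1 + rho), so a smaller growth factor tightens the bound at the cost of more size classes, each of which needs its own pages.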

Defragmentation
- When do we move objects?
  - At the end of the sweep phase, when there are not enough free pages for the mutator to execute, that is, when there is fragmentation
- Usually, programs exhibit locality of size
  - Dead objects are re-used quickly
- Defragment either when
  - Dead objects are not re-used for a GC cycle
  - Free pages fall below the limit for performing a GC
- In practice: we move 2-3% of data traced
  - Major improvement over a copying collector
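One way to picture how few objects need to move is a planning pass over the pages of a single size class. The policy below, evacuating the emptiest pages into the free blocks of the fullest ones, is an assumption made for this sketch; the slide states when defragmentation happens, not how pages are selected, and the function name and return shape are hypothetical.

```python
def plan_defragmentation(live_counts, blocks_per_page):
    """Plan block moves within one size class.

    live_counts[p] is the number of live blocks on page p.
    Returns (moves, freed_pages), where each move is
    (source_page, destination_page, number_of_blocks).
    """
    order = sorted(range(len(live_counts)), key=lambda p: live_counts[p])
    live = list(live_counts)
    moves = []
    lo, hi = 0, len(order) - 1
    while lo < hi:
        src, dst = order[lo], order[hi]
        if live[src] == 0:          # source fully evacuated (or already empty)
            lo += 1
            continue
        room = blocks_per_page - live[dst]
        if room == 0:               # destination is full
            hi -= 1
            continue
        n = min(room, live[src])
        moves.append((src, dst, n))
        live[src] -= n
        live[dst] += n
    freed = [p for p in range(len(live)) if live[p] == 0 and live_counts[p] > 0]
    return moves, freed


moves, freed = plan_defragmentation([1, 7, 2, 6], blocks_per_page=8)
print(moves, freed)
```

In this example three of sixteen live blocks move and two of four pages become free, which is the kind of saving over copying everything that the slide points to.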

Read Barrier: To-space Invariant
- Problem: the collector moves objects (defragmentation) and is finely interleaved with the mutator
- Solution: read barrier ensures consistency
  - Each object contains a forwarding pointer [Brooks]
  - Read barrier unconditionally forwards all pointers
  - Mutator never sees old versions of objects
- Does the read barrier affect mutator utilization?

[Figure: from-space and to-space, before and after object A is moved to A′; objects X, Y, and Z are also shown.]
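The forwarding-pointer idea can be sketched in a few lines. This is a simplified model rather than the collector's barrier: the `Obj` class, the `fields` dictionary, and the helper names are assumptions for illustration, and the extra indirection is written out explicitly where a real system would have the compiler emit it.

```python
class Obj:
    def __init__(self, **fields):
        self.forward = self          # forwarding pointer; refers to itself until moved
        self.fields = dict(fields)


def read_barrier(ref):
    """Unconditionally follow the forwarding pointer (no test, one indirection)."""
    return ref.forward


def load(ref, name):
    """Mutator field read, always through the barrier."""
    return read_barrier(ref).fields[name]


def store(ref, name, value):
    """Mutator field write, always through the barrier."""
    read_barrier(ref).fields[name] = value


def move(obj):
    """Collector side: copy the object and redirect the old copy to the new one."""
    new = Obj(**obj.fields)
    obj.forward = new
    return new


a = Obj(x=1)
stale = a              # a reference the mutator obtained before the move
a_new = move(a)
store(stale, "x", 5)   # lands in the new copy despite the stale reference
print(load(stale, "x"), a_new.fields["x"])
```

Because every access goes through `forward`, a stale reference and a fresh one always observe the same copy, which is what lets the collector move objects while the mutator keeps running.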

Read Barrier Optimization
- Previous studies: % overhead [Zorn, Nielsen]
- Several optimizations applied to the read barrier reduced its overhead to <10% using eager read barriers
- "Eager" read barrier preferred over "lazy" read barrier

Incremental Mark-Sweep
- Mark/sweep finely interleaved with mutator
- Write barrier: snapshot-at-the-beginning [Yuasa]
  - Ensures no lost objects
  - Must treat objects in write buffer as roots
- Read barrier ensures consistency
  - Marker always traces correct object
- With barriers, interleaving is simple
- Do the problems inherent to mark-sweep also apply here?
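A small model shows why the write barrier prevents lost objects. It is a sketch under stated assumptions: `Node`, `IncrementalMarker`, the `budget` parameter, and the dictionary-based fields are invented for the example, and details such as allocating new objects already marked are left out.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.refs = {}        # field name -> Node or None
        self.marked = False


class IncrementalMarker:
    def __init__(self, roots):
        self.active = True
        self.work = list(roots)   # mark stack
        self.write_buffer = []    # overwritten pointers logged by the barrier

    def write(self, obj, field, new_target):
        """Mutator pointer store with a snapshot-at-the-beginning barrier."""
        old = obj.refs.get(field)
        if self.active and old is not None:
            self.write_buffer.append(old)   # keep the snapshot reachable
        obj.refs[field] = new_target

    def step(self, budget):
        """Mark up to `budget` objects; return True once marking is complete."""
        while budget > 0:
            if not self.work:
                if not self.write_buffer:
                    self.active = False
                    return True
                # Objects in the write buffer are treated as extra roots.
                self.work.extend(self.write_buffer)
                self.write_buffer.clear()
            node = self.work.pop()
            if node.marked:
                continue
            node.marked = True
            budget -= 1
            self.work.extend(t for t in node.refs.values() if t is not None)
        return False


a, b, c = Node("a"), Node("b"), Node("c")
a.refs["next"] = b
b.refs["next"] = c
marker = IncrementalMarker([a])
marker.step(1)                    # a is marked, b is pending
marker.write(a, "extra", c)       # mutator hides c behind the marked object a
marker.write(b, "next", None)     # and removes the only unscanned path to c
while not marker.step(1):
    pass
print(a.marked, b.marked, c.marked)
```

Without the logging in `write`, the marker would scan `b`, find no pointer to `c`, and finish with `c` unmarked even though it is still reachable from `a`.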

Pointer Fix-up During Mark
- When can a moved object be freed?
  - When there are no more pointers to it
- Mark phase updates pointers
  - Redirects forwarded pointers as it marks them
- Object moved in collection n can be freed:
  - At the end of the mark phase of collection n+1

[Figure: from-space and to-space, with pointers to A redirected to A′; objects X, Y, and Z are also shown.]
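The fix-up can be folded into an ordinary marking loop, as in this sketch. The `Obj` class, `evacuate`, and `mark_and_fix` are hypothetical names for illustration, and marking is shown as one non-incremental pass for brevity.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.forward = self   # forwarding pointer; refers to itself until moved
        self.refs = {}        # field name -> Obj or None
        self.marked = False


def evacuate(obj):
    """Collection n: copy obj and leave a forwarding pointer in the old copy."""
    new = Obj(obj.name + "'")
    new.refs = dict(obj.refs)
    obj.forward = new
    return new


def mark_and_fix(roots):
    """Collection n+1: mark from the roots, redirecting each pointer as it is traced.

    Returns the roots with any forwarded entries already redirected.
    """
    roots = [r.forward for r in roots]
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj.marked:
            continue
        obj.marked = True
        for field, target in list(obj.refs.items()):
            if target is None:
                continue
            current = target.forward      # follow the forwarding pointer
            obj.refs[field] = current     # fix the stale pointer in place
            stack.append(current)
    return roots


x, y, a = Obj("x"), Obj("y"), Obj("a")
x.refs["p"] = a
y.refs["p"] = a
a_new = evacuate(a)
mark_and_fix([x, y])
print(x.refs["p"].name, y.refs["p"].name, a.marked)
```

After the pass no traced object refers to the old copy and it is left unmarked, so the following sweep can reclaim it.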

Arraylets
- Large arrays create problems
  - Fragment memory space
  - Cannot be moved in a short, bounded time
- Solution: break large arrays into arraylets
  - Access via indirection; move one arraylet at a time

[Figure: array A split into arraylets A1, A2, and A3.]
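The indirection can be sketched as an array whose storage is a spine of fixed-size chunks. `ARRAYLET_SIZE`, the class name, and `move_arraylet` are assumptions for this example; the slide does not give sizes or an interface.

```python
ARRAYLET_SIZE = 4  # assumed number of elements per arraylet


class ArrayletArray:
    def __init__(self, length, fill=0):
        self.length = length
        chunks = -(-length // ARRAYLET_SIZE)  # ceiling division
        # The spine holds one reference per arraylet; the last one may be short.
        self.spine = [
            [fill] * min(ARRAYLET_SIZE, length - i * ARRAYLET_SIZE)
            for i in range(chunks)
        ]

    def _locate(self, index):
        if not 0 <= index < self.length:
            raise IndexError(index)
        return divmod(index, ARRAYLET_SIZE)   # (arraylet number, offset)

    def __getitem__(self, index):
        chunk, offset = self._locate(index)
        return self.spine[chunk][offset]

    def __setitem__(self, index, value):
        chunk, offset = self._locate(index)
        self.spine[chunk][offset] = value

    def move_arraylet(self, chunk):
        """Relocate one arraylet: copy at most ARRAYLET_SIZE elements, then update the spine."""
        self.spine[chunk] = list(self.spine[chunk])


arr = ArrayletArray(10)
for i in range(10):
    arr[i] = i * i
arr.move_arraylet(1)
print([arr[i] for i in range(10)])
```

Each move copies a bounded number of elements regardless of the array's total length, which is what keeps the collector's work per increment bounded.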


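[Figure: heap (one size only) and stack. Program start.]

[Figure: heap and stack; blocks are free or allocated. Program is allocating.]

[Figure: heap and stack; blocks are free or unmarked. GC starts.]

[Figure: heap and stack; blocks are free, unmarked, or marked or allocated. Program allocating and GC marking.]

[Figure: heap and stack; blocks are free, unmarked, or marked or allocated. Sweeping away blocks.]

[Figure: heap and stack; blocks are free, allocated, or evacuated. GC moving objects and installing redirection.]

[Figure: heap and stack; blocks are free, unmarked, evacuated, or marked or allocated. 2nd GC starts tracing and redirection fixup.]

[Figure: heap and stack; blocks are free or allocated. 2nd GC complete.]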


Scheduling the Collector
- Scheduling issues
  - Bad CPU utilization and space usage
  - Loose program and collector coupling
- Time-based
  - Trigger the collector to run for $C_T$ seconds whenever the mutator runs for $Q_T$ seconds
- Work-based
  - Trigger the collector to perform $C_W$ units of collection work whenever the mutator allocates $Q_W$ bytes
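The difference between the two triggers shows up clearly on a toy timeline. This sketch uses discrete ticks, treats collection work as a number of ticks, and invents the function names and the `M`/`C` encoding; none of that comes from the slides.

```python
def time_based(total_ticks, q_t, c_t):
    """Alternate q_t mutator ticks with c_t collector ticks."""
    if q_t <= 0 or c_t < 0:
        raise ValueError("q_t must be positive and c_t non-negative")
    period = "M" * q_t + "C" * c_t
    repeats = -(-total_ticks // len(period))  # ceiling division
    return (period * repeats)[:total_ticks]


def work_based(alloc_per_tick, q_w, c_w):
    """Run c_w collector ticks for every q_w bytes the mutator allocates."""
    timeline = []
    owed = 0
    for nbytes in alloc_per_tick:
        timeline.append("M")
        owed += nbytes
        while owed >= q_w:
            timeline.append("C" * c_w)
            owed -= q_w
    return "".join(timeline)


def longest_pause(timeline):
    """Length of the longest run of consecutive collector ticks."""
    return max(len(run) for run in timeline.split("M"))


burst = [0, 0, 0, 0, 64, 0, 0, 0]       # one allocation burst of 64 bytes
print(time_based(9, q_t=2, c_t=1))
print(work_based(burst, q_w=16, c_w=2))
```

Under the time-based schedule the longest collector run is one tick no matter what the mutator allocates, while the burst makes the work-based schedule stall the mutator for eight consecutive ticks.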

Scheduling
- Time-based
  - Very predictable mutator utilization
  - Memory allocation does not need to be monitored
- Work-based
  - Uneven mutator utilization due to bursty allocation
  - Memory allocation rates need to be monitored to make sure real-time performance is obtained
- Why is time-based scheduling better in terms of mutator utilization? (Shown analytically and experimentally in the paper)


[Figure: Pause time distribution for javac (time-based vs. work-based); annotation: 12 ms.]

[Figure: Utilization vs. time for javac (time-based vs. work-based); annotation: 0.45.]

[Figure: Minimum mutator utilization for javac (time-based vs. work-based).]

[Figure: Space usage for javac (time-based vs. work-based).]

Conclusions
- The Metronome provides true real-time GC
- First collector to do so without major sacrifice
  - Short pauses (4 ms)
  - High MMU during collection (50%)
  - Low memory consumption (2x max live)
- Critical features
  - Time-based scheduling
  - Hybrid, mostly non-copying approach
  - Integration with the compiler