A Parallel, Real-Time Garbage Collector
Authors: Perry Cheng, Guy E. Blelloch
Presenter: Jun Tao
Outline
– Introduction
– Background and definitions
– Theoretical algorithm
– Extended algorithm
– Evaluation
– Conclusion
Introduction
– First garbage collectors: non-incremental, non-parallel
– Recent collectors: incremental, concurrent, and parallel
Introduction
A scalably parallel and real-time collector
– All aspects of the collector are incremental
– Parallel: supports an arbitrary number of application and collector threads
– Tight theoretical bounds on the pause time (for any application) and on total memory usage
– Asymptotically, but not practically, efficient
Introduction
The extended collector algorithm
– Works with generations
– Increases the granularity of the incremental steps
– Handles global variables separately
– Delays the copy on a write
– Reduces the synchronization cost of copying small objects
– Parallelizes the processing of large objects
– Reduces double allocation during collection
– Allows program stacks
Background and Definitions
A semispace stop-copy collector
– Divides heap memory into two equally sized regions: from-space and to-space
– When from-space is full, suspends the mutator and copies the reachable objects into to-space
– Updates root values and reverses the roles of from-space and to-space (a sketch of the copy step follows below)
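A minimal sketch of the copy step, assuming a hypothetical object layout in which the first word holds the field count until it is overwritten by a forwarding mark (the paper's actual tag encoding differs):

    #include <stddef.h>
    #include <string.h>

    /* Hypothetical layout: a size word followed by that many pointer
     * fields. After copying, the size word becomes the FORWARDED
     * sentinel and fields[0] points at the to-space replica. */
    #define FORWARDED ((size_t)-1)

    typedef struct Obj {
        size_t nfields;
        struct Obj *fields[];
    } Obj;

    static char *to_next;   /* allocation frontier in to-space */

    /* Copy one object into to-space, leaving a forwarding pointer so
     * later references find the replica instead of copying it again. */
    static Obj *forward(Obj *o) {
        if (o->nfields == FORWARDED)
            return o->fields[0];
        size_t bytes = sizeof(Obj) + o->nfields * sizeof(Obj *);
        Obj *copy = (Obj *)to_next;
        to_next += bytes;
        memcpy(copy, o, bytes);
        o->nfields = FORWARDED;
        o->fields[0] = copy;
        return copy;
    }

Collection starts by forwarding the roots and then scans to-space left to right, forwarding every field of every copied object until the scan pointer catches up with the allocation frontier.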
Background and Definitions
Types of Garbage Collectors
Background and Definitions
Types of Garbage Collectors (continued)
Background and Definitions
Real-time collector metrics
– Maximum pause time
– Utilization: the fraction of time during which the mutator executes
– Minimum mutator utilization (MMU)
  A function of the window size: the minimum utilization over all windows of that size (formalized below)
  Equals 0 when the window size is at most the maximum pause time
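Formally, with the utilization of a window defined as the fraction of that window during which the mutator runs, the minimum mutator utilization is (the notation here follows common usage rather than the slide):

\[
\mathrm{MMU}(w) \;=\; \min_{t}\; \frac{\text{mutator time in } [t,\, t+w]}{w}
\]

A nonzero MMU at window size w guarantees that the mutator makes progress in every interval of length w, which is strictly stronger than a bound on the maximum pause time alone.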
Theoretical Algorithm
A parallel, incremental, and concurrent collector
– Based on Cheney's simple copying collector
– All objects are stored in a shared global pool of memory
– Relies on two atomic instructions, FetchAndAdd and CompareAndSwap (sketched below)
– The collector interfaces with the application at three points: allocating space for a new object, initializing the fields of a new object, and modifying a field of an existing object
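A sketch of the two primitives using GCC/Clang atomic builtins; the paper assumes hardware equivalents, and the wrapper names are illustrative:

    #include <stdint.h>

    /* Atomically add n to *p and return the old value. Used, e.g.,
     * to claim a region of to-space or of the shared stack. */
    static inline intptr_t fetch_and_add(volatile intptr_t *p, intptr_t n) {
        return __atomic_fetch_add(p, n, __ATOMIC_SEQ_CST);
    }

    /* Atomically replace *p with desired only if it still equals
     * expected; returns nonzero on success. Used to gain exclusive
     * access to an object being copied. */
    static inline int compare_and_swap(volatile intptr_t *p,
                                       intptr_t expected, intptr_t desired) {
        return __atomic_compare_exchange_n(p, &expected, desired, 0,
                                           __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    }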
Theoretical Algorithm
Scalable parallelism
– The collector must maintain the set of gray objects
– Cheney's technique: keep them in contiguous locations in to-space
  Pros: simple
  Cons: restricts the traversal order to breadth-first; difficult to implement in a parallel setting
Theoretical Algorithm
Scalable parallelism (continued)
– Explicitly managed local stacks instead
  Each processor maintains a local stack of gray objects
  A shared stack of gray objects is kept as well
  Gray objects are periodically transferred between the local and shared stacks, which avoids idleness
– Pushes (or pops) can proceed in parallel: each processor reserves a target region on the shared stack before transferring (sketched below)
– Pushes and pops are never concurrent: room synchronization
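A sketch of a parallel transfer to the shared stack: one FetchAndAdd reserves a block of slots, after which the copy into the reserved region needs no further synchronization. All names and sizes are illustrative:

    #include <stdint.h>

    #define SHARED_CAP (1 << 20)

    typedef struct {
        void *slots[SHARED_CAP];
        volatile intptr_t top;    /* next free slot */
    } SharedStack;

    static void push_shared(SharedStack *s, void **local, int n) {
        /* Reserve n slots; [base, base + n) now belongs to us alone. */
        intptr_t base = __atomic_fetch_add(&s->top, (intptr_t)n,
                                           __ATOMIC_SEQ_CST);
        for (int i = 0; i < n; i++)
            s->slots[base + i] = local[i];
    }

Pops work symmetrically by subtracting from top, which is why pushes and pops must be confined to separate phases: the rooms of the following slides.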
Theoretical Algorithm
Scalable parallelism (continued)
– A white object must not be copied twice
  Atomic instructions give one collector thread exclusive access to the object
  Copy-copy synchronization (sketched below)
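A sketch of copy-copy synchronization: a collector thread claims a white object by swinging its header to a BUSY mark with CompareAndSwap, and the loser of the race waits for the winner to publish the forwarding pointer. The header encoding and helper names are illustrative, not the paper's exact scheme:

    #include <stdint.h>
    #include <stddef.h>
    #include <string.h>

    /* Header encoding (illustrative):
     *   count << 1      white object (low bit clear)
     *   BUSY            copy in progress
     *   (ptr | 1)       forwarded (low bit set)                 */
    #define BUSY    ((intptr_t)-1)
    #define FWD_TAG ((intptr_t)1)

    typedef struct Obj {
        volatile intptr_t header;
        struct Obj *fields[];
    } Obj;

    extern char *to_alloc(size_t bytes);  /* hypothetical to-space bump allocator */

    static Obj *claim_and_copy(Obj *o) {
        for (;;) {
            intptr_t h = o->header;
            if (h == BUSY)                      /* copy in flight: spin */
                continue;
            if (h & FWD_TAG)                    /* already copied */
                return (Obj *)(h & ~FWD_TAG);
            /* Try to claim; failure means another thread won the race. */
            if (__atomic_compare_exchange_n(&o->header, &h, BUSY, 0,
                                            __ATOMIC_SEQ_CST,
                                            __ATOMIC_SEQ_CST)) {
                size_t n = (size_t)(h >> 1);
                Obj *copy = (Obj *)to_alloc(sizeof(Obj) + n * sizeof(Obj *));
                copy->header = h;
                memcpy(copy->fields, (void *)o->fields, n * sizeof(Obj *));
                o->header = (intptr_t)copy | FWD_TAG;  /* publish replica */
                return copy;
            }
        }
    }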
Theoretical Algorithm
Incremental and replicating collection
– Baker's incremental collector: copy k units of data for every unit of data allocated, which bounds the pause time
  The mutator may only see copied objects in to-space, so a read barrier is needed
– A modification avoids the read barrier
  The mutator sees only the original objects in from-space, so a write barrier is needed instead (sketched below)
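A sketch of the resulting write barrier, reusing the header encoding from the previous sketch: the mutator writes to the from-space original as usual and mirrors the write into the replica if one already exists, so the replica never goes stale:

    /* Mutator write barrier for replicating collection (illustrative). */
    static void write_field(Obj *o, size_t i, Obj *value) {
        o->fields[i] = value;                /* normal from-space write */
        intptr_t h = o->header;
        if (h != BUSY && (h & FWD_TAG)) {    /* replica exists: mirror */
            Obj *replica = (Obj *)(h & ~FWD_TAG);
            replica->fields[i] = value;
        }
        /* If h == BUSY, the copy is in flight; the write must be logged
         * and applied later (the copy-write synchronization of the
         * next slide). */
    }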
Theoretical Algorithm
Concurrency
– The program and the collector execute simultaneously
  The program manipulates the primary memory graph; the collector manipulates the replica graph
– Copy-write synchronization is needed
  Replica objects must be updated to match the mutator's writes
  To avoid race conditions, objects being copied are marked, and the mutator's updates to the replica are delayed
– Write-write synchronization is needed
  Prohibits different mutator threads from modifying the same memory location concurrently
Theoretical Algorithm
Space and time bounds
– Time bound on each memory operation: ck
  c: a constant
  k: the number of words collected per word allocated
– Space bound: 2(R(1 + 1.5/k) + N + 5PD) ≈ 2R(1 + 1.5/k)
  R: reachable space
  N: maximum object count
  P: number of processors on a P-way multiprocessor
  D: maximum depth of the memory graph
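Plugging k = 2 into the slide's approximate bound gives a concrete trade-off:

\[
2R\left(1 + \frac{1.5}{2}\right) = 3.5R
\]

That is, copying two words per word allocated keeps total memory within 3.5 times the reachable space (the approximation drops the N and 5PD terms), while each memory operation costs at most 2c time.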
Extended Algorithm
Globals, stacks, and stacklets
– Globals
  Naively updated when collection ends, but there can be arbitrarily many globals, giving an unbounded pause time
  Instead, globals are replicated like other heap objects: every global has two locations, and a single flag selects between them for all globals
– Stacks and stacklets
  Stacks are divided into fixed-size stacklets (sketched below)
  At most one stacklet is active, so the others can be replicated safely
  Stacklets also bound the wasted space per stack
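A sketch of a stacklet representation under these assumptions: a thread's stack is a chain of fixed-size chunks, and only the topmost chunk may be mutated, so the collector can replicate the inactive ones concurrently. Sizes and names are illustrative:

    #include <stdint.h>

    #define STACKLET_WORDS 1024

    typedef struct Stacklet {
        struct Stacklet *prev;            /* next-older stacklet in the chain */
        int active;                       /* nonzero only for the top chunk */
        intptr_t slots[STACKLET_WORDS];   /* frame data */
    } Stacklet;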
Extended Algorithm
Granularity
– Block allocation and free initialization
  Avoids calling FetchAndAdd for every memory allocation
  Each processor maintains a local pool in from-space, and another local pool in to-space while the collector is on
  A single FetchAndAdd is used only when allocating a local pool (sketched below)
– Write barrier
  Avoids updating copied objects on every write
  Records a triple in a write log and defers the update
  The collector is invoked when the write log is full, eliminating frequent context switches
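A sketch of block allocation: each processor grabs a whole pool from the shared frontier with one FetchAndAdd and then bump-allocates from it with no synchronization at all. Names and sizes are illustrative:

    #include <stdint.h>
    #include <stddef.h>

    #define POOL_BYTES (32 * 1024)

    static volatile intptr_t frontier;             /* shared space frontier */

    typedef struct { char *next, *limit; } Pool;   /* per-processor pool */

    static void *alloc(Pool *p, size_t bytes) {
        if (p->next + bytes > p->limit) {          /* pool exhausted: refill */
            p->next  = (char *)__atomic_fetch_add(&frontier, POOL_BYTES,
                                                  __ATOMIC_SEQ_CST);
            p->limit = p->next + POOL_BYTES;
        }
        void *obj = p->next;                       /* synchronization-free bump */
        p->next += bytes;
        return obj;
    }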
Extended Algorithm
Small and large objects
– Original algorithm: one field at a time
  Requires reinterpreting the tag word and transferring the object to and from the local stack for each field
– Extended algorithm
  Small objects: locked down and copied all at once
  Large objects: divided into segments and copied one segment at a time (sketched below)
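A sketch of large-object segmentation, reusing the earlier object layout: the gray set holds (object, segment index) work units, so several collector threads can copy different segments of the same large object in parallel. The segment size is illustrative:

    #define SEGMENT_WORDS 512

    typedef struct { Obj *obj; size_t seg; } WorkUnit;

    /* Copy just one segment of a large object into its replica. */
    static void copy_segment(WorkUnit w, Obj *replica) {
        size_t n     = (size_t)(replica->header >> 1);  /* total field count */
        size_t first = w.seg * SEGMENT_WORDS;
        size_t last  = first + SEGMENT_WORDS;
        if (last > n) last = n;
        for (size_t i = first; i < last; i++)
            replica->fields[i] = w.obj->fields[i];
    }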
Extended Algorithm
Algorithmic modifications
– Reducing double allocation
  During collection each new object is allocated twice: once by the mutator and once by the collector
  The second allocation is deferred
– Rooms and better rooms
  A push room and a pop room; only one room can be non-empty
  Rooms: enter the pop room, fetch work and perform it, transition to the push room, and push new gray objects back onto the shared stack
  But graying objects is time-consuming, so threads wait to enter the push room
Extended Algorithm
Algorithmic modifications (continued)
– Rooms and better rooms (continued)
  Better rooms: leave the pop room right after fetching work from the shared stack (a sketch of room synchronization follows below)
  Emptiness of the shared stack is detected by maintaining a borrow counter
– Generational collection
  A nursery and a tenured space
  A minor collection is triggered when the nursery is full; a major collection when the tenured space is full
  Tenured references must not be modified during collection, so each mutable pointer holds two fields: one for the mutator to use, the other for the collector to update
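A sketch of room synchronization under these assumptions: threads in the pop room may pop from the shared stack and threads in the push room may push, and the two rooms are never occupied at the same time. This spin-based version is an illustration, not the paper's exact protocol:

    #include <stdint.h>

    typedef struct {
        volatile int occupants[2];   /* threads currently in each room */
        volatile int open;           /* which room admits new entrants */
    } Rooms;

    static void enter_room(Rooms *r, int room) {
        for (;;) {
            if (r->open != room)
                continue;
            __atomic_fetch_add(&r->occupants[room], 1, __ATOMIC_SEQ_CST);
            if (r->open == room)     /* still open: entry is valid */
                return;
            /* The room closed while we were entering: back out, retry. */
            __atomic_fetch_add(&r->occupants[room], -1, __ATOMIC_SEQ_CST);
        }
    }

    static void exit_room(Rooms *r, int room) {
        /* The last thread out opens the other room. */
        if (__atomic_fetch_add(&r->occupants[room], -1, __ATOMIC_SEQ_CST) == 1)
            r->open = 1 - room;
    }

The better-rooms refinement lets a thread leave the pop room as soon as it has fetched its work, shrinking the window during which pushers must wait.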
Evaluation
Conclusion
– Implements a scalably parallel, concurrent, real-time garbage collector
– Thread synchronization is minimized