Garbage Collectors: Continuously and Compacting
By Lyndon Meadow


Presentation Contents
- Why do we want Garbage Collectors (GCs)?
- What is a Garbage Collector (GC)?
- What are our Basic Collector Models?
- Our Unique and Modern Problem
- Solution: Continuously Compacting Real-Time GCs
- What it means to be Stopless
- Bring me my CoCo
- Playing Chicken
- Picking Clovers
- Endgame of Modern GCs

Why do we want Garbage Collectors (GCs)? Primarily, we want reliable and secure software that is quick and cheap to develop. Specifically, we want the rapid development of applications with:
- High responsiveness
- The ability to meet short deadlines

What is a Garbage Collector (GC)? In object-oriented languages, objects may be freely created. We'd like some way to keep track of them, such as a root set of pointers, and we must detect when an object is no longer referenced by the program. "Garbage": objects that are no longer accessible, and hence need collection to free memory and avoid exhausting the heap.
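The reachability idea above can be sketched as a toy mark phase. This is a minimal illustration, not any production collector: the heap is assumed to be a mapping from object id to the ids it references, and the names are purely illustrative.

```python
def mark_reachable(heap, roots):
    """Toy tracing: return the set of object ids reachable from the root set."""
    marked = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj in marked:
            continue
        marked.add(obj)
        stack.extend(heap.get(obj, []))  # follow outgoing references
    return marked

# Objects outside the marked set are "garbage" and may be reclaimed;
# note "d" references itself but is unreachable from the roots.
heap = {"a": ["b"], "b": [], "c": ["a"], "d": ["d"]}
live = mark_reachable(heap, roots=["a"])
```

Anything not in `live` (here `c` and `d`) is garbage, even if it is referenced by another dead object.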

What is just a Compacting Solution? We want a contiguous space in memory for optimal new-object creation. Dead objects are deleted, and all living objects are then moved into a contiguous region of the heap:
- A first pass marks dead objects
- Further passes calculate new addresses, move live objects, and update references
Handle pools might be used as a method of pointer consolidation, but they add a level of indirection to each object access. This is a "Stop the World" collector: it pauses program execution while the GC runs.
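The sliding step described above can be sketched in a few lines. This is a simplified toy, assuming the heap is a flat list of slots and that a separate pass (not shown) rewrites pointers through the forwarding table:

```python
def compact(heap, live):
    """Toy sliding compaction: slide live objects to the front of the heap
    and return a forwarding table mapping old address -> new address."""
    forwarding = {}
    new_heap = []
    for old_addr, obj in enumerate(heap):
        if obj in live:                      # compute target and move
            forwarding[old_addr] = len(new_heap)
            new_heap.append(obj)
    # a further pass would rewrite every reference via `forwarding`
    return new_heap, forwarding

heap = ["a", "b", "c", "d"]                  # addresses 0..3; "b", "d" are dead
new_heap, fwd = compact(heap, live={"a", "c"})
```

After compaction the live objects occupy one contiguous prefix, so allocation can proceed by simply bumping a pointer at the end.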

Compacting GC Model

What is just a Concurrent Solution? We want to avoid execution delays from GC operation:
- Useful when only limited resources/time are available
- Useful when the system has a spare CPU thread to dedicate
It must solve the problem of a partially collected heap in a state of flux, using a form of synchronization to avoid mis-referencing marked objects. Write barriers are the standard choice for protection:
- Check for when a pointer in an already-marked object is overwritten
- On occurrence, re-mark the object and put it back in the object queue
But memory isn't defragmented!
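The re-marking write barrier described above can be sketched as follows. This is a conceptual illustration of the idea, not any specific collector's barrier; the class and field names are invented for the example.

```python
class ConcurrentMarker:
    """Toy write barrier: if the program overwrites a pointer field in an
    already-scanned object, re-grey that object so the collector rescans it."""

    def __init__(self):
        self.marked = set()   # objects the collector has finished scanning
        self.queue = []       # grey objects still waiting to be scanned

    def write_barrier(self, heap, obj, field, new_ref):
        if obj in self.marked:            # pointer in a marked object overwritten
            self.marked.discard(obj)      # re-mark it and put it back
            self.queue.append(obj)        # in the object queue
        heap[obj][field] = new_ref        # perform the actual store

marker = ConcurrentMarker()
heap = {"a": {"f": "b"}}
marker.marked.add("a")                    # collector already scanned "a"
marker.write_barrier(heap, "a", "f", "c") # program mutates "a" afterwards
```

Without the barrier, the newly stored reference to `"c"` could be missed by the collector and reclaimed while still live.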

Concurrent GC Model

Our Unique and Modern Problem We desire a system that both collects dead objects without significant delays and makes the heap contiguous for the creation of new live objects, specifically for modern multithreaded applications that must operate in real time.

Solution: Continuously Compacting Real-Time GCs, i.e. Stopless
- An on-the-fly, lock-free mark-sweep collector that reclaims dead objects concurrently
- A concurrent (partial) compaction mechanism that supports parallel, lock-free multithreaded applications
- Additionally, barrier costs are considered in further optimizations, as are those of alternative Stopless implementations

Basics of Stopless GCs: "Lock-Free" Will some thread complete an operation after the system as a whole has run a finite number of steps? Yes: this extends the typical single-threaded requirement that a thread completes an operation after some constant number of steps, guaranteeing system-wide progress even under contention.
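The lock-free progress guarantee can be illustrated with a compare-and-swap retry loop. This is a sketch: the `AtomicCell` class below only simulates a hardware CAS, and the point is the shape of the loop, in which a thread may retry without bound, but every failed CAS means some other thread's operation just succeeded, so the system as a whole always makes progress.

```python
class AtomicCell:
    """Toy single-word cell with compare-and-swap (atomic on real hardware)."""

    def __init__(self, value):
        self.value = value

    def cas(self, expected, new):
        if self.value == expected:
            self.value = new
            return True
        return False

def lock_free_increment(cell):
    while True:                        # unbounded retries for *this* thread...
        old = cell.value
        if cell.cas(old, old + 1):     # ...but each failure implies another
            return old                 # thread made progress: lock-free

cell = AtomicCell(41)
old = lock_free_increment(cell)
```

Note this is lock-free but not wait-free: no single thread has a bounded step count, which is why Chicken's wait-free write path (later in the presentation) is a strictly stronger property.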

Basics of Stopless GCs: "Real Time" In the strictest terms, this means robustness to worst-case, almost-zero-probability events such as:
- Worst-case memory fragmentation
- Unassured collector termination
- Etc.
Stopless is soft real-time: in other words, we let some entropy into our system for hoped-for performance gains in most usage cases. In the worst case, the program fails in the middle of operation and hits bare metal.

Basics of Stopless: "Concurrent Compaction" Shifting threads between multiple copies of an object can usually be tolerated under most memory-coherence models. However, this isn't the case for real time, which requires atomic operations, creating ordering constraints that don't allow much reordering. The CoCo system is the first practical realization of these goals.

Guide:
- Stopless: stock model
- STW: Stop the World
- CPP: manual C++ memory management

Bringing the CoCo CoCo moves objects in the heap concurrently with program execution. It allows parallel threads, coordinated via atomic operations on shared memory, while supporting program semantics and lock-freedom guarantees, and it satisfies the Linearizability and Sequential Consistency memory models. Basically, it uses a wide object to transition from the original object to the copied object.
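The wide-object transition can be sketched as per-field status tracking. This is a heavily simplified illustration of the idea, not the exact CoCo protocol: during a move, each field carries a status saying which copy currently holds the up-to-date value, so threads always read and write the right location. The status transitions would use CAS in a real system; all names here are invented for the example.

```python
class WideField:
    def __init__(self):
        self.status = "original"      # "original" -> "wide" -> "copy"
        self.value = None             # only meaningful while status == "wide"

class MovingObject:
    """Toy object mid-move: original copy, wide intermediate, new copy."""

    def __init__(self, fields):
        self.original = dict(fields)
        self.wide = {f: WideField() for f in fields}
        self.copy = {}

    def read(self, f):
        w = self.wide[f]              # per-field check: where does f live now?
        if w.status == "original":
            return self.original[f]
        if w.status == "wide":
            return w.value
        return self.copy[f]

    def collector_copy_field(self, f):
        # collector moves one field: original -> wide -> copy (CAS steps in reality)
        w = self.wide[f]
        w.value = self.original[f]
        w.status = "wide"
        self.copy[f] = w.value
        w.status = "copy"

obj = MovingObject({"x": 1})
before = obj.read("x")                # still served from the original
obj.collector_copy_field("x")
after = obj.read("x")                 # now served from the copy
```

The cost of this scheme is visible in the read path: every access branches on the field's status, which is exactly the per-field overhead Chicken later avoids.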

Wide Object Model

A Solution is Found, but the Problems Continue Up to this point we have achieved faster responsiveness for real-time systems that require deadlines at the microsecond level, and even obtained powerful scalability for larger programs. But is it possible to reduce time even further, below the newfound average of 10 microseconds? What may be adjusted in the compaction algorithm?

What does Failure Mean? With the Stopless implementation we have already accepted some softening of the real-time requirements. We ignore problems that can always happen, such as hardware failures, memory corruption, page faults, software bugs, or even just odd cache behaviors. The failure distribution isn't known, so the relaxed attitude is justified.

Playing Chicken Chicken assumes that a race between the program and the GC won't occur, and will, at some cost, gracefully abort a move if such a conflict does occur. It maintains two invariants: all threads agree on whether an object has been copied, and all threads always use the correct copy of an object to load and store values.

Benefits of Being a Chicken Objects are copied as a whole, letting all program threads switch to the new location at once. This avoids the sluggish per-field read checks that Stopless and Clover use. Writing is a wait-free operation:
- If the object is fully copied, write to to-space
- If it is not tagged for copying, write to from-space
- Else, Chicken must abort the copy using compare-and-swap

Benefits of Being a Chicken, Continued A soft handshake occurs before copying starts: each object that is meant to be copied is tagged first, and the collector waits until all threads acknowledge the compaction phase. The copying process is wait-free, and objects are flipped from from-space to to-space individually:
- If a program thread writes to an object before it is flipped, the tag is cleared and the object is not copied
- Once flipped, the object is copied and program threads read/write to-space
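The write path and the abort-on-conflict behavior described above can be sketched as follows. This is an illustrative toy, not Chicken's actual implementation: the state machine is reduced to three states, the CAS is simulated, and all names are invented.

```python
NOT_TAGGED, TAGGED, COPIED = "not_tagged", "tagged", "copied"

class ChickenObject:
    def __init__(self, fields):
        self.state = NOT_TAGGED
        self.from_space = dict(fields)
        self.to_space = None

def cas_state(obj, expected, new):
    if obj.state == expected:                  # atomic on real hardware
        obj.state = new
        return True
    return False

def program_write(obj, field, value):
    """Wait-free write: at most one CAS, never a retry loop."""
    if obj.state == COPIED:
        obj.to_space[field] = value            # fully copied: write to-space
    elif obj.state == NOT_TAGGED:
        obj.from_space[field] = value          # not being copied: from-space
    else:
        cas_state(obj, TAGGED, NOT_TAGGED)     # abort the copy ("play chicken")
        obj.from_space[field] = value

def collector_copy(obj):
    if obj.state != TAGGED:
        return False                           # move was aborted by a writer
    obj.to_space = dict(obj.from_space)        # copy the object as a whole
    return cas_state(obj, TAGGED, COPIED)      # flip from-space -> to-space

obj = ChickenObject({"x": 1})
obj.state = TAGGED                             # collector tagged it for copying
program_write(obj, "x", 2)                     # racing write aborts the move
```

Because the program thread wins every race, its writes never wait; the collector simply gives up on moving that particular object this cycle.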

Picking Clovers Clover doesn't block data races between collector and program threads. If a value v is chosen at random from a space V of 2^l values, then the probability of v being stored by any given store is 2^-l. If the forbidden value is actually written by the program, the locks that were left open need to be re-taken.

Benefits of Picking Clovers Just like Chicken, the simpler implementation reduces complexity. Clover allocates a random value alpha to mark moved slots in the heap, one that the program will almost certainly never write itself; the odds of failure are essentially negligible on most modern 64-bit architectures. Similar to both Chicken and Stopless, soft handshakes are also used.
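The forbidden-value scheme can be sketched as follows. This is an illustrative toy, not Clover's actual implementation: a field whose from-space slot holds the randomly chosen alpha means "this field has moved, read it from to-space instead". All names are invented, and the stalling path for a program write of alpha itself is omitted.

```python
import random

ALPHA = random.getrandbits(64)   # probability any given store equals it: 2**-64

def clover_read(from_space, to_space, field):
    """Per-field read check: alpha in from-space redirects to to-space."""
    v = from_space[field]
    return to_space[field] if v == ALPHA else v

def collector_move_field(from_space, to_space, field):
    to_space[field] = from_space[field]
    from_space[field] = ALPHA    # atomically mark the old slot as moved

from_space, to_space = {"x": 7}, {}
collector_move_field(from_space, to_space, "x")
```

A program write that happens to produce the value alpha would collide with this encoding, which is exactly the astronomically unlikely case where Clover must stall the write until compaction completes.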

Stopless/Chicken/Clover: Heap Access
- Chicken: all heap accesses are wait-free, except compare-and-swaps, which are still lock-free
- Clover: a heap write might stall until compaction completes
- Stopless: provides lock-free writes, with branches for reads

Stopless/Chicken/Clover: Aborting
- Chicken: aborts eagerly, since it has the strictest requirements
- Clover: never aborts unless the user requests it
- Stopless: aborts when multiple mutator threads write into the same field simultaneously as compaction occurs

Stopless/Chicken/Clover: Termination
- Chicken: guaranteed to complete, but may not copy all objects
- Clover: doesn't have a guarantee, but can be modified to include one
- Stopless: never guarantees termination

Distributions Of Transaction Times

Endgame of Modern GCs Several implementations had crushed IBM's Java collector as of 2008, and several new implementations and further research are still being developed. Notably, C4 introduced a generational aspect to this process. Efficiency and response times are approaching those of native binaries.