Theory of Compilation (236360), Erez Petrank. Lecture 9: Runtime (part 2); object-oriented issues.

Runtime Environment: code generated by the compiler to handle services that the programmer does not implement directly, for example file handling, memory management, synchronization (creating threads, implementing locks, etc.), and the runtime stack (activation records). We discussed activation records and will next present an introduction to memory management.

Dynamic Memory Management: Introduction. There is an entire course about this topic: “Algorithms for Dynamic Memory Management”.

Static and Dynamic Variables. Variables defined in a method are allocated on the runtime stack, as explained in the first part of this lecture. Sometimes there is a need for allocation during the run, e.g., when managing a linked list whose size is not predetermined; this is dynamic allocation. In C, malloc allocates space and free tells the system that the program will not use this space anymore:

ptr = malloc(256);   /* allocate 256 bytes */
/* ... use ptr ... */
free(ptr);

Dynamic Memory Allocation. In Java, new allocates an object of a given class, e.g., President obama = new President(); But there is no instruction for manually deleting the object: it is automatically reclaimed by a garbage collector when the program “does not need it” anymore.

Course c = new Course(236360);
c.room = "TAUB 2";      // "class" is a reserved word, so the field is renamed here
Faculty.add(c);

Manual vs. Automatic Memory Management. Manual memory management lets the programmer decide when objects are deleted; a memory manager in which a garbage collector deletes objects is called automatic. Manual memory management creates severe debugging problems: memory leaks and dangling pointers. In large projects, where objects are shared between various components, it is sometimes difficult to tell when an object is not needed anymore. This was considered THE big debugging problem of the 80’s. What is the main debugging problem today?

Automatic Memory Reclamation. When the system “knows” that an object will not be used anymore, it reclaims its space. Telling whether an object will be used after a given line of code is undecidable; therefore, a conservative approximation is used: an object is reclaimed when the program has “no way of accessing it”. Formally, an object is reclaimed when it is unreachable by any path of pointers from the “root” pointers, to which the program has direct access: local variables, pointers on the stack, global (class) pointers, JNI pointers, etc. It is also possible to use code analysis to be more accurate sometimes.

What’s good about automatic “garbage collection”? (© Erez Petrank) Software engineering: relieves users of the book-keeping burden; stronger reliability, fewer bugs, faster debugging; code is more understandable and reliable (less interaction between modules). Security (Java): the program never gets a pointer to “play with”.

Importance. Memory is the bottleneck in modern computation: time and energy (and space). Optimal allocation, even if all accesses are known in advance to the allocator, is NP-complete (even to approximate). Memory management must be done right for a program to run efficiently, and must be done right to ensure reliability.

GC and languages. Sometimes GC is built in (LISP, Java, C#): the user cannot free an object. Sometimes it is an added feature (C, C++): the user can choose to free objects or not, and the collector frees all objects not freed by the user. Most modern languages are supported by garbage collection.

Most modern languages rely on GC. Source: “The Garbage Collection Handbook” by Richard Jones, Antony Hosking, and Eliot Moss.

What’s bad about automatic “garbage collection”? It has a cost: old LISP systems spent about 40% of the time on collection; in today’s Java programs (if the collection is done “right”) the cost is 5–15%. GC is considered a major factor determining program efficiency. Techniques have evolved since the 60’s; we will only survey basic techniques.

Garbage Collection Efficiency: overall collection time (percentage of running time); pauses in the program run; space overhead; cache locality (efficiency and energy).

Three classical algorithms: reference counting; mark and sweep (and mark-compact); copying. The last two are also called tracing algorithms because they go over (trace) all reachable objects.

Reference counting [Collins 1960]. Recall that we would like to know whether an object is reachable from the roots. Associate a reference-count field with each object: how many pointers reference this object. When nothing points to an object, it can be deleted. Very simple; used in many systems.

Basic Reference Counting. Each object has an RC field; new objects get o.RC := 1. When a pointer p that points to o1 is modified to point to o2, we execute: o1.RC--, o2.RC++. If o1.RC == 0 then: delete o1; decrement o.RC for every “child” o of o1; recursively delete objects whose RC is decremented to 0.
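The pointer-update rule above can be sketched as a write barrier in C. This is a minimal illustration, assuming a toy Obj type with a single pointer field; the names (Obj, obj_new, obj_release, assign) are invented for the example and are not part of any real collector:

```c
#include <stdlib.h>

typedef struct Obj {
    int rc;                  /* reference count */
    struct Obj *child;       /* a single pointer field, for simplicity */
} Obj;

/* New objects start with RC = 1 (the creating reference). */
Obj *obj_new(void) {
    Obj *o = malloc(sizeof(Obj));
    o->rc = 1;
    o->child = NULL;
    return o;
}

/* Decrement; on reaching 0, decrement the children and free recursively. */
void obj_release(Obj *o) {
    if (o != NULL && --o->rc == 0) {
        obj_release(o->child);
        free(o);
    }
}

/* The write barrier for "p pointed to o1, now points to o2":
   increment the new target before releasing the old one, so the
   code is also correct when o1 and o2 are the same object. */
void assign(Obj **slot, Obj *o2) {
    Obj *o1 = *slot;
    if (o2 != NULL)
        o2->rc++;
    *slot = o2;
    obj_release(o1);
}
```

Note the ordering: incrementing o2 before releasing o1 avoids prematurely freeing an object that is being assigned to itself.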

A Problem: Cycles. The reference counting algorithm does not reclaim cycles! Solution 1: ignore cycles; they do not appear frequently in modern programs. Solution 2: run a tracing algorithm (which can reclaim cycles) infrequently. Solution 3: designated algorithms for cycle collection. Another problem for the naïve algorithm: it requires a lot of synchronization in parallel programs; advanced versions solve that.

The Mark-and-Sweep Algorithm [McCarthy 1960]. Mark phase: start from the roots and traverse all objects reachable by a path of pointers, marking all traversed objects. Sweep phase: go over all objects in the heap and reclaim the objects that are not marked.

The Mark-Sweep algorithm: traverse the live objects (starting from the roots, e.g., registers) and mark them black; white objects can be reclaimed. (Note: the coloring diagram on the slide is not the heap data structure itself.)

Triggering: garbage collection is triggered by allocation.

New(A) =
  if free_list is empty
    mark_sweep()
  if free_list is empty
    return (“out-of-memory”)
  pointer = allocate(A)
  return (pointer)

Basic Algorithm.

mark_sweep() =
  for Ptr in Roots
    mark(Ptr)
  sweep()

mark(Obj) =
  if mark_bit(Obj) == unmarked
    mark_bit(Obj) = marked
    for C in Children(Obj)
      mark(C)

sweep() =
  p = Heap_bottom
  while (p < Heap_top)
    if (mark_bit(p) == unmarked)
      free(p)
    else
      mark_bit(p) = unmarked
    p = p + size(p)
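The pseudocode can be made concrete with a toy C heap: a fixed array of nodes, each with two pointer fields. Everything here (Node, alloc_node, the roots array) is an illustrative sketch, not a real runtime:

```c
#include <stddef.h>

#define HEAP_SIZE 8

typedef struct Node {
    int in_use;                  /* is this slot allocated? */
    int marked;                  /* the mark bit */
    struct Node *left, *right;   /* the object's pointer fields */
} Node;

static Node heap[HEAP_SIZE];     /* the whole heap, zero-initialized */

Node *alloc_node(void) {
    for (int i = 0; i < HEAP_SIZE; i++) {
        if (!heap[i].in_use) {
            heap[i].in_use = 1;
            heap[i].marked = 0;
            heap[i].left = heap[i].right = NULL;
            return &heap[i];
        }
    }
    return NULL;   /* a real allocator would trigger a collection here */
}

/* Mark phase: depth-first traversal from one root. */
void mark(Node *n) {
    if (n == NULL || n->marked)
        return;                  /* termination: each object marked once */
    n->marked = 1;
    mark(n->left);
    mark(n->right);
}

/* Sweep phase: scan the whole heap, reclaim unmarked objects,
   and clear mark bits for the next collection. Returns #freed. */
int sweep(void) {
    int freed = 0;
    for (int i = 0; i < HEAP_SIZE; i++) {
        if (heap[i].in_use && !heap[i].marked) {
            heap[i].in_use = 0;
            freed++;
        }
        heap[i].marked = 0;
    }
    return freed;
}

int mark_sweep(Node **roots, int nroots) {
    for (int i = 0; i < nroots; i++)
        mark(roots[i]);
    return sweep();
}
```

As in the complexity discussion below, mark is proportional to the live objects while sweep scans the entire heap.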

Properties of Mark & Sweep. The most popular method today (in a more advanced form). Simple. Does not move objects, so the heap may fragment. Complexity: the mark phase is proportional to the number of live objects (the dominant phase); the sweep phase is proportional to the heap size. Termination: each pointer is traversed once. Various engineering tricks are used to improve performance.

Mark-Compact. During the run, objects are allocated and reclaimed, and gradually the heap gets fragmented. When space is too fragmented to allocate, a compaction algorithm is used: move all live objects to the beginning of the heap and update all pointers to reference the new locations. Compaction is considered very costly, so we usually attempt to run it infrequently, or only partially.

An Example: The Compressor. A simplistic presentation of the Compressor: go over the heap and compute, for each live object, where it moves to (the address that is the sum of the live space before it in the heap), saving the new locations in a separate table. Then go over the heap again and, for each object, move it to its new location and update all its pointers. Why can’t we do it all in a single heap pass? (In the full algorithm: a succinct table, a quickly executed first pass, and parallelization.)
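The first pass (computing target addresses as the sum of the live space before each object) is easy to sketch. The ObjInfo table below stands in for the Compressor’s separate side table; the names are invented for the example:

```c
typedef struct {
    int live;       /* did the mark phase reach this object? */
    int size;       /* object size in bytes */
    int new_addr;   /* output: the object's offset after compaction */
} ObjInfo;

/* Pass 1: each live object moves to the sum of the sizes of the
   live objects that precede it in the heap. */
void compute_new_addrs(ObjInfo *objs, int n) {
    int free_ptr = 0;            /* next free offset at the heap bottom */
    for (int i = 0; i < n; i++) {
        if (objs[i].live) {
            objs[i].new_addr = free_ptr;
            free_ptr += objs[i].size;
        }
    }
}
```

The second pass then moves each object to its new_addr and rewrites every pointer through the table. One reason a single pass is hard: updating a pointer requires the new address of its target, which may appear later in the heap and would not have been computed yet.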

Mark-Compact. Important parameters of a compaction algorithm: does it keep the order of objects? Does it use extra space for compactor data structures? How many heap passes does it make? Can it run in parallel on a multi-processor? We do not elaborate in this intro.

Copying garbage collection. The heap is partitioned into two parts. Part 1 takes all allocations; Part 2 is reserved. During GC, the collector traces all reachable objects and copies them to the reserved part. After copying, the parts’ roles are reversed: allocation activity goes to Part 2, which was previously reserved, and Part 1, which was active, is reserved until the next collection.
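The copy step can be sketched with forwarding pointers: the first time an object is copied, its old copy records the new address, so shared objects are copied only once. Everything here (Obj, to_space, gc_copy) is a toy illustration under that assumption:

```c
#include <stddef.h>

#define SEMI 16

typedef struct Obj {
    struct Obj *fwd;     /* forwarding pointer: NULL until copied */
    struct Obj *child;   /* single pointer field, for simplicity */
    int value;
} Obj;

static Obj to_space[SEMI];   /* the reserved part */
static int to_top = 0;       /* allocation pointer in to-space */

/* Copy the object graph reachable from o into to-space. */
Obj *gc_copy(Obj *o) {
    if (o == NULL)
        return NULL;
    if (o->fwd != NULL)
        return o->fwd;             /* already copied: reuse the copy */
    Obj *n = &to_space[to_top++];  /* reserve the new location */
    n->fwd = NULL;
    n->value = o->value;
    o->fwd = n;                    /* install the forwarding address
                                      first, so sharing and cycles
                                      terminate */
    n->child = gc_copy(o->child);  /* copy reachable children */
    return n;
}
```

Each root is then updated to gc_copy(root), and the old part is reclaimed wholesale.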

Copying garbage collection (diagram): Part I holds objects A, B, C, D, E; the roots point into Part I; Part II is reserved and empty.

The collection copies the reachable objects (A and C in the diagram) from Part I to Part II.

The roots are updated to the new copies of A and C in Part II; Part I is reclaimed.

Properties of Copying Collection. Compaction for free. Major disadvantage: half of the heap is not used. The collector “touches” only the live objects, which is good when most objects are dead; since usually most new objects are dead, there are methods that use a small space for young objects and collect that space using copying garbage collection.

A very simplistic comparison:

                  Copying             Mark & Sweep                  Reference Counting
Complexity        Live objects        Size of heap (live objects    Pointer updates + dead objects
                                      dominate the mark phase)
Space overhead    Half heap wasted    Bit/object + stack for DFS    Count/object + stack for DFS
Compaction        For free            Additional work               Additional work
Pause time        Long                Long                          Mostly short
More issues                                                         Cycle collection

Modern Memory Management. Considers standard program properties. Handles parallelism: either stop the program and collect in parallel on all available processors, or run the collection concurrently with the program. Cache consciousness. Real-time guarantees.

Some terms to be remembered: heap, objects; allocate, free (deallocate, delete, reclaim); reachable, live, dead, unreachable; roots; reference counting, mark and sweep, copying, compaction, tracing algorithms; fragmentation.

Recap. Lexical analysis: regular expressions identify tokens (“words”). Syntax analysis: context-free grammars identify the structure of the program (“sentences”). Contextual (semantic) analysis: type checking defined via typing judgements, which can be encoded via attribute grammars; syntax-directed translation. Intermediate representation: many possible IRs; generation of intermediate representation; 3AC; backpatching. Runtime: services that are always present: function calls, memory management, threads, etc.

OO Issues

Representing Data at Runtime. Source-language types: int, boolean, string, object types. Target-language types: single bytes, integers, address representation. The compiler should map source types to some combination of target types, i.e., implement source types using target types.

Basic Types: int, boolean, string, void. Arithmetic operations: addition, subtraction, multiplication, division, remainder. These can be mapped directly to target-language types and operations.

Pointer Types. Represent addresses of source-language data structures; usually implemented as an unsigned integer. Pointer dereferencing retrieves the pointed-to value and may produce an error: null pointer dereference. When is this error triggered?

Object Types. An object is a record with built-in methods and some additional features. Basic operations: field selection (read/write): computing the address of the field, then dereferencing that address; copying: copying a whole block (not Java) or field-by-field copying; method invocation: identifying the method to be called and calling it. How does this look at runtime?

Object Types.

class Foo {
  int x;
  int y;
  void rise() {…}
  void shine() {…}
}

Runtime memory layout for an object of class Foo: DispatchVectorPtr, then the fields x and y. The methods rise and shine are compile-time information.

Field Selection. With the layout above (DispatchVectorPtr at offset 0, x at offset 4 on a 32-bit target), the code

Foo f;
int q;
q = f.x;

compiles to:

MOV f, %EBX         # base pointer
MOV 4(%EBX), %EAX   # field offset from the base pointer
MOV %EAX, q
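The same offset arithmetic can be checked from C with offsetof, assuming a struct that mirrors the object layout (dispatch-vector pointer first, then the fields in order); FooLayout is an invented name for the illustration:

```c
#include <stddef.h>

/* Mirrors the runtime layout of a Foo object. */
typedef struct {
    void *dv_ptr;   /* DispatchVectorPtr at offset 0 */
    int x;          /* at offset sizeof(void *): 4 on the 32-bit target above */
    int y;
} FooLayout;

/* The "field offset from base pointer" the compiler emits is offsetof. */
size_t offset_of_x(void) { return offsetof(FooLayout, x); }
size_t offset_of_y(void) { return offsetof(FooLayout, y); }
```

On a 32-bit target, offset_of_x() is 4, matching the MOV 4(%EBX) above; on other targets the same expression yields the right offset for that ABI.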

Object Types: Inheritance.

class Foo { int x; int y; void rise() {…} void shine() {…} }
class Bar extends Foo { int z; void twinkle() {…} }

Runtime memory layout for an object of class Bar: DispatchVectorPtr, x, y, z (the Foo prefix of the layout is kept, with z appended). Compile-time information: rise, shine, twinkle.

Object Types: Polymorphism.

class Foo { … void rise() {…} void shine() {…} }
class Bar extends Foo { … }
class Main { void main() { Foo f = new Bar(); f.rise(); } }

Runtime memory layout for an object of class Bar: DVPtr, x, y, z. Here f holds a pointer to the Bar object, and a pointer to a Bar is also a valid pointer to the Foo inside Bar.

Static & Dynamic Binding. Which “rise” is main() using? Static binding: f is of type Foo and therefore it always refers to Foo’s rise. Dynamic binding: f points to a Bar object now, so it refers to Bar’s rise.

class Foo { … void rise() {…} void shine() {…} }
class Bar extends Foo { void rise() {…} }
class Main { void main() { Foo f = new Bar(); f.rise(); } }

Typically, Dynamic Binding is Used. Finding the right method implementation at runtime, according to the object’s type, using the Dispatch Vector (a.k.a. Dispatch Table).

class Foo { … void rise() {…} void shine() {…} }
class Bar extends Foo { void rise() {…} }
class Main { void main() { Foo f = new Bar(); f.rise(); } }

Dispatch Vectors in Depth. The vector contains the addresses of methods, indexed by method-id number; a method signature has the same id number in all subclasses (rise = 0 and shine = 1 in both Foo and Bar).

class Foo { … void rise() {…} void shine() {…} }
class Bar extends Foo { void rise() {…} }
class Main { void main() { Foo f = new Bar(); f.rise(); } }

In the diagram, f points to a Bar object (layout: DVPtr, x, y, z); its DVPtr points to Bar’s dispatch vector, whose rise entry points to Bar’s method code, so the call uses Bar’s dispatch table.

Dispatch Vectors in Depth.

class Main { void main() { Foo f = new Foo(); f.rise(); } }

Now f points to a Foo object (layout: DVPtr, x, y); its DVPtr points to Foo’s dispatch vector, so the same call site uses Foo’s rise via Foo’s dispatch table.

Representing dispatch tables.

class A { void rise() {…} void shine() {…} static void foo() {…} }
class B extends A { void rise() {…} void shine() {…} void twinkle() {…} }

# data section
.data
.align 4
_DV_A:
  .long _A_rise
  .long _A_shine
_DV_B:
  .long _B_rise
  .long _B_shine
  .long _B_twinkle

(The static method foo is not dispatched dynamically and does not appear in the vectors.)
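In C, a dispatch table is simply a per-class struct of function pointers, with each object holding a pointer to its class’s table. The sketch below mirrors classes A and B (rise in slot 0, shine in slot 1, twinkle appended in B); the integer return codes are arbitrary markers added for testing:

```c
typedef struct A A;

/* Dispatch-vector layout shared by A and its subclasses:
   slot 0 = rise, slot 1 = shine. */
typedef struct {
    int (*rise)(A *self);
    int (*shine)(A *self);
} DV_A;

/* B's vector keeps A's slot indices and appends twinkle. */
typedef struct {
    DV_A base;
    int (*twinkle)(A *self);
} DV_B;

struct A { const DV_A *dv; };     /* every object: DVPtr first */
typedef struct { A base; } B;

static int A_rise(A *self)    { (void)self; return 1; }
static int A_shine(A *self)   { (void)self; return 2; }
static int B_rise(A *self)    { (void)self; return 10; }  /* overrides slot 0 */
static int B_shine(A *self)   { (void)self; return 20; }
static int B_twinkle(A *self) { (void)self; return 30; }

static const DV_A dv_a = { A_rise, A_shine };
static const DV_B dv_b = { { B_rise, B_shine }, B_twinkle };

/* Dynamic dispatch: load the object's table, index the slot, call. */
int call_rise(A *obj) { return obj->dv->rise(obj); }
```

The call site call_rise is identical for A and B objects; only the table the DVPtr points to differs, which is exactly what the _DV_A/_DV_B data-section listing encodes.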

Multiple Inheritance.

class C { field c1; field c2; void m1() {…} void m2() {…} }
class D { field d1; void m3() {…} void m4() {…} }
class E extends C, D { field e1; void m2() {…} void m4() {…} void m5() {…} }

E-object layout: the C part first (DVPtr, c1, c2), then the D part (DVPtr, d1), then e1. Dispatch-vector entries: m1_C_C, m2_C_E for the C part; m3_D_D, m4_D_E for the D part; and m5_E_E for E’s new method. Pointer conversions:

supertyping:
  convert_ptr_to_E_to_ptr_to_C(e) = e
  convert_ptr_to_E_to_ptr_to_D(e) = e + sizeof(class C)
subtyping:
  convert_ptr_to_C_to_ptr_to_E(e) = e
  convert_ptr_to_D_to_ptr_to_E(e) = e - sizeof(class C)
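These pointer adjustments can be sketched in C by laying out E as its C part followed by its D part. Here offsetof plays the role of the slide’s sizeof(class C) (the two coincide when there is no padding); all type names are illustrative:

```c
#include <stddef.h>

typedef struct { int c1, c2; } Cpart;
typedef struct { int d1; } Dpart;

/* E-object layout: C part first, then D part, then E's own fields. */
typedef struct {
    Cpart c;
    Dpart d;
    int e1;
} Eobj;

/* Supertyping: E* to C* is the identity; E* to D* skips the C part. */
Cpart *e_to_c(Eobj *e) { return &e->c; }
Dpart *e_to_d(Eobj *e) { return &e->d; }

/* Subtyping: undo the adjustment to recover the whole E object. */
Eobj *c_to_e(Cpart *c) { return (Eobj *)c; }
Eobj *d_to_e(Dpart *d) {
    return (Eobj *)((char *)d - offsetof(Eobj, d));
}
```

The D-part conversion is the only one that moves the pointer, which is why compilers must insert this adjustment at upcasts, downcasts, and calls through the D-part dispatch vector.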

Runtime checks. Generate code for checking attempted illegal operations: null pointer check, array bounds check, array allocation size check, division by zero, etc. If a check fails, jump to error-handler code that prints a message and gracefully exits the program.
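The checks a compiler emits inline can be sketched as C helpers. Here fail() stands in for the single shared error handler; the helper names are invented for the sketch:

```c
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for the single generated error handler:
   print a message and exit gracefully. */
static void fail(const char *msg) {
    printf("%s\n", msg);
    exit(1);
}

/* Null pointer check before a dereference. */
int checked_load(const int *p) {
    if (p == NULL)
        fail("null pointer dereference");
    return *p;
}

/* Array bounds check: 0 <= i < length. */
int checked_index(const int *a, int length, int i) {
    if (i < 0 || i >= length)
        fail("array index out of bounds");
    return a[i];
}

/* Array allocation size check: the size must be positive. */
int *checked_new_array(int size) {
    if (size <= 0)
        fail("illegal array size");
    return calloc((size_t)size, sizeof(int));
}
```

In generated code each comparison becomes a cmp/jump to a shared label, as the assembly listings that follow show.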

Null pointer check:

  # null pointer check
  cmp $0, %eax
  je labelNPE

labelNPE:
  push $strNPE   # error message
  call __println
  push $1        # error code
  call __exit

A single generated handler serves the entire program.

Array bounds check:

  # array bounds check
  mov -4(%eax), %ebx   # ebx = length
  mov $0, %ecx         # ecx = index
  cmp %ecx, %ebx
  jle labelABE         # ebx <= ecx ?
  cmp $0, %ecx
  jl labelABE          # ecx < 0 ?

labelABE:
  push $strABE   # error message
  call __println
  push $1        # error code
  call __exit

A single generated handler serves the entire program.

Array allocation size check:

  # array size check
  cmp $0, %eax   # eax == array size
  jle labelASE   # eax <= 0 ?

labelASE:
  push $strASE   # error message
  call __println
  push $1        # error code
  call __exit

A single generated handler serves the entire program.