Memory Management (Chapter 11)



• Memory management: the process of binding values to (logical) memory locations.
• The memory accessible to a program is its address space, represented as a set of values {0, 1, …, n}.
  ◦ The numbers represent memory locations.
  ◦ These are logical addresses; they do not usually correspond to physical addresses at runtime.
• The exact organization of the address space depends on the operating system and the programming language being used.

• Runtime memory management is an important part of program meaning.
  ◦ The language run-time system creates and deletes stack frames and dynamically allocated heap objects, in cooperation with the operating system.
• Whether done automatically (as in Java or Python) or partially by the programmer (as in C/C++), dynamic memory management is an important part of programming language design.

• Static: storage requirements are known prior to run time; lifetime is the entire program execution.
• Run-time stack: memory associated with active functions.
  ◦ Structured as stack frames (activation records).
• Heap: dynamically allocated storage; the least organized and most dynamic storage area.

• Static memory is the simplest type of memory to manage.
• It consists of anything that can be completely determined at compile time; e.g., global variables, constants (perhaps), code.
• Characteristics:
  ◦ Storage requirements are known prior to execution.
  ◦ The size of the static storage area is constant throughout execution.

• The stack is a contiguous memory region that grows and shrinks as a program runs.
• Its purpose: to support method calls.
• It grows (storage is allocated) when an activation record (or stack frame) is pushed on the stack at the time a method is called (activated).
• It shrinks when the method completes and its storage is deallocated.

• In block-structured languages such as C or C++, a new activation record will be created if variables are declared in an enclosed block.
• The activation record at the top of the stack represents the current local scope.
  ◦ The static pointer points to the enclosing block, which represents the non-local scope.

• Non-local data is usually global data, but if there are nested blocks, the static pointer of an inner block points to the outer enclosing block or method.
• Static pointers in the activation records are the run-time equivalent of the stack of symbol tables.
  ◦ If a variable is non-local, trace the static pointer chain to find its location on the stack.
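The static pointer chain described above can be sketched in a few lines of Python. This is only an illustrative model: the `Frame` class, the `lookup` function, and the variable names are hypothetical and not taken from the slides or the textbook.

```python
class Frame:
    """Hypothetical activation record: local variables plus a static pointer."""

    def __init__(self, name, static_link=None):
        self.name = name
        self.locals = {}
        self.static_link = static_link  # frame of the enclosing block, or None


def lookup(frame, var):
    """Trace the static pointer chain until some frame declares `var`."""
    hops = 0
    while frame is not None:
        if var in frame.locals:
            return frame.locals[var], frame.name, hops
        frame = frame.static_link
        hops += 1
    raise NameError(var)


# Global scope <- function f <- inner block (top of the stack)
globals_frame = Frame("globals")
globals_frame.locals["g"] = 10
func_frame = Frame("f", static_link=globals_frame)
func_frame.locals["x"] = 1
block_frame = Frame("inner block", static_link=func_frame)
block_frame.locals["y"] = 2

local_hit = lookup(block_frame, "y")     # found in the current frame
nonlocal_hit = lookup(block_frame, "g")  # found two links up the chain
```

A local variable is found with zero hops; a global one is found only after following every static pointer back to the outermost frame.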

• The stack frame has storage for local variables, parameters, and return linkage.
• The size and structure of a stack frame are known at compile time, but its actual contents and time of allocation are unknown until runtime.
• How is variable lifetime connected to the structure of the stack?

• Heap objects are allocated and deallocated dynamically as the program runs (not associated with a specific event such as function entry or exit).
• The kind of data found on the heap depends on the language.
  ◦ Strings, dynamic arrays, objects, and linked structures are typically located here.
  ◦ Java and C/C++ have different policies.
• Heap data is accessed from a variable on the stack: either a pointer or a reference variable.

• Special operations (e.g., malloc, new) may be needed to allocate heap storage.
• When a program deallocates storage (free, delete), the space is returned to the heap to be re-used.
• Space is allocated in variable-sized blocks, so deallocation may leave “holes” in the heap (fragmentation).
  ◦ Compare this to deallocation of stack storage.

• Some languages (e.g., C, C++) leave heap storage deallocation to the programmer.
  ◦ delete
• Others (e.g., Java, Perl, Python, list-processing languages) employ garbage collection to reclaim unused heap space.

The Structure of Run-Time Memory (Figure 11.1). The stack and the heap grow towards each other as program events require.

• The following relation must hold: 0 ≤ a ≤ h ≤ n (a = top of the stack, h = first heap address, n = last address).
• In other words, if the stack top bumps into the heap, or if the beginning of the heap is greater than the end of memory, there are problems!

• For simplicity, we assume that memory words in the heap have one of three states:
  ◦ Unused: not allocated to a program yet.
  ◦ Undef: allocated, but not yet assigned a value by the program.
  ◦ Contains some actual value.

• new returns the start address of a block of k words of unused heap storage and changes the state of the words from unused to undef.
  ◦ Here k is the number of words of storage needed; e.g., suppose a Java class Point has data members x, y, z, which are floats.
  ◦ If floats require 4 bytes of storage, then Point firstCoord = new Point() calls for (at least) 3 × 4 = 12 bytes to be allocated and initialized to some predetermined state.
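The 3 × 4 arithmetic above can be checked with Python's standard `struct` module, used here only as a calculator for packed sizes. The names `FLOAT_BYTES` and `POINT_BYTES` are made up for this sketch, and the result says nothing about how a real JVM lays out a Point object, which typically adds a per-object header (presumably why the slide says "at least").

```python
import struct

# "<f" is one 4-byte IEEE single-precision float in standard (unpadded) layout.
FLOAT_BYTES = struct.calcsize("<f")

# Three floats x, y, z packed back to back.
POINT_BYTES = struct.calcsize("<3f")

print(FLOAT_BYTES, POINT_BYTES)
```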

• Heap overflow occurs when a call to new occurs and the heap does not have a contiguous block of k unused words.
• So new either fails, in the case of heap overflow, or returns a pointer to the new block.

• delete returns a block of storage to the heap.
• The status of the returned words is reset to unused, and they are available to be allocated in response to a future new call.
• One cause of heap overflow is a failure on the part of the program to return unused storage.
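The word-state model of new and delete can be simulated directly. The following is a minimal sketch under stated assumptions: the `Heap` class, its `block_size` bookkeeping table, and the linear search for a free run are all illustrative choices, not the textbook's definitions.

```python
UNUSED, UNDEF = "unused", "undef"


class Heap:
    """Toy heap in which every word is 'unused', 'undef', or holds a value."""

    def __init__(self, size):
        self.words = [UNUSED] * size
        self.block_size = {}  # start address -> k (hypothetical bookkeeping)

    def new(self, k):
        """Return the start of k contiguous unused words, or None on heap overflow."""
        run = 0
        for addr, state in enumerate(self.words):
            run = run + 1 if state == UNUSED else 0
            if run == k:
                start = addr - k + 1
                self.words[start:addr + 1] = [UNDEF] * k  # unused -> undef
                self.block_size[start] = k
                return start
        return None

    def delete(self, start):
        """Return a block to the heap: its words become unused again."""
        k = self.block_size.pop(start)
        self.words[start:start + k] = [UNUSED] * k


heap = Heap(8)
a = heap.new(5)   # succeeds: words 0..4 become undef
b = heap.new(5)   # heap overflow: only 3 contiguous unused words remain
heap.delete(a)    # words 0..4 return to unused
c = heap.new(5)   # succeeds again, reusing the freed block
```

The failed second request illustrates the last bullet above: storage that is never returned is one route to heap overflow.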

The New (5) Heap Allocation Function Call: Before and After (Figure 11.2). A before-and-after view of the heap. The “after” view shows the effect of an operation requesting a size-5 block. (Note the difference between “undef” and “unused”.) Deallocation reverses the process.

• Heap space isn’t necessarily allocated and deallocated from one end (like the stack), because the memory is not allocated and deallocated in a predictable (first-in, first-out or last-in, first-out) order.
• As a result, the location of the specific memory cells depends on what is available at the time of the request.

• The memory manager can adopt either a first-fit or a best-fit policy.
• Free list = a list of all the free space on the heap: 4 bytes, 32 bytes, 1024 bytes, 16 bytes, …
• A request for 14 bytes could be satisfied:
  ◦ First-fit: from the 32-byte block.
  ◦ Best-fit: from the 16-byte block.
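The two policies can be compared on the free list from the slide. This sketch models the free list as a plain list of block sizes; the function names `first_fit` and `best_fit` are illustrative, and a real allocator would also track addresses and split the chosen block.

```python
def first_fit(free_list, request):
    """Return the size of the first free block large enough for the request."""
    for size in free_list:
        if size >= request:
            return size
    return None  # no block is large enough


def best_fit(free_list, request):
    """Return the size of the smallest free block large enough for the request."""
    candidates = [size for size in free_list if size >= request]
    return min(candidates) if candidates else None


free_list = [4, 32, 1024, 16]  # block sizes in bytes, in list order

print(first_fit(free_list, 14))  # the 32-byte block
print(best_fit(free_list, 14))   # the 16-byte block
```

First-fit stops at the first adequate block and so searches less; best-fit scans the whole list but leaves a smaller leftover fragment.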

• The view of a process address space as a contiguous set of bytes consisting of static, stack, and heap storage is a view of the logical (virtual) address space.
• The physical address space is managed by the operating system, and may not resemble this view at all.

• The language is responsible for assigning logical memory.
• The OS is responsible for mapping logical memory to physical memory and deciding how much physical memory a program can have at a time.
• A compiler’s logical addresses are relative to the start of the program.
  ◦ Logical addresses = virtual addresses.

• The value of a pointer variable is an address.
• Memory that is accessed through a pointer is typically dynamically allocated in the heap.
• The pointer variable is on the stack and holds the address of the object pointed to, which is in the heap.
• Java doesn’t have explicit pointers, but reference types are represented by their addresses and their storage is also allocated on the heap.
  ◦ The reference is on the stack.

• In addition to simple variables (ints, floats, etc.), most imperative languages support structured data types.
  ◦ Arrays: “[finite] ordered sequences of values that all share the same type”
  ◦ Records (structs): “finite collections of values that have different types”
  ◦ Lists: a list data structure has different features depending on the language, but in general it is an ordered sequence of values, possibly of different types, possibly accessible through indexes or, in the case of linked lists, only by list traversals using pointers or references.

• In Java, arrays are always allocated dynamically from heap memory.
• In some other languages (e.g., C++, C):
  ◦ Globally defined arrays: static memory.
  ◦ Local (to a function) arrays: stack storage.
  ◦ Dynamically allocated arrays: heap storage.
• Dynamically allocated arrays also have storage on the stack: a reference (pointer) to the heap block that holds the array.

• Typical Java array declarations:
  ◦ int[] arr = new int[5];
  ◦ float[][] arr1 = new float[10][5];
  ◦ Object[] arr2 = new Object[100];
• Typical C/C++ array declarations:
  ◦ int arr[5];
  ◦ float arr1[10][15];
  ◦ int *intPtr; intPtr = new int[5]; (new is C++ only)

• Consider the declaration int A[n];
• Since the array size isn’t known at compile time, storage for the array can’t be allocated in static storage or on the run-time stack.
• The stack contains the dope vector for the array, including a pointer to its base address, and the heap holds the array values, in contiguous locations. See Figure 11.3, page

• The dope vector has the information needed to interpret array references:
  ◦ Array base address (a pointer to heap data).
  ◦ Array size (number of elements); for multi-dimensional arrays, the size of each dimension.
  ◦ Element type (which indicates the amount of storage required for each element).
• For dynamically allocated arrays, this information must be available at runtime.

• The Meaning Rule on page 266 describes the semantics of a 1-d array declaration ad in Clite.
• There are 4 parts to the rule:
  ◦ Compute addr(ad[0]) = new(ad.size).
  ◦ Push addr(ad[0]) onto the stack.
  ◦ Push ad.size onto the stack (for bounds checking).
  ◦ Push ad.type onto the stack (for type checking).

Allocation of Stack and Heap Space for Array A (Figure 11.3)

Meaning Rule 11.3: The meaning of an array assignment as is (assuming an element size of 1 word):
1. Compute addr(ad[ar.index]) = addr(ad[0]) + (ar.index - 1).
2. If addr(ad[0]) ≤ addr(ad[ar.index]) < addr(ad[0]) + ad.size, then assign the value of as.source to addr(ad[ar.index]) (the target).
3. Otherwise, signal an index-out-of-range error.

The assignment A[5] = 3 changes the value at heap address addr(A[0]) + 4 to 3, since ar.index = 5 and addr(A[5]) = addr(A[0]) + 4. This assumes that the size of an int is one word.
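The bounds-checked assignment can be sketched with a toy dope vector. Everything named here (`DopeVector`, `assign`, the 16-word `heap` list, and the base address 3) is a hypothetical stand-in chosen for illustration; the address arithmetic follows the slide's worked example, in which index 5 maps to offset 4.

```python
from dataclasses import dataclass


@dataclass
class DopeVector:
    base: int        # addr(ad[0]): start of the array's heap block
    size: int        # ad.size: number of elements, used for bounds checking
    elem_type: type  # ad.type: element type, used for type checking


heap = [None] * 16   # toy heap, one word per element


def assign(ad, index, value):
    """Store `value` at the address computed from the dope vector, or fail."""
    addr = ad.base + (index - 1)              # step 1: compute the target address
    if ad.base <= addr < ad.base + ad.size:   # step 2: bounds check
        heap[addr] = value
        return addr
    raise IndexError("index out of range")    # step 3


A = DopeVector(base=3, size=10, elem_type=int)  # assumed result of new(10)
addr = assign(A, 5, 3)                          # A[5] = 3
```

With the base at address 3, the assignment lands at address 7, that is, addr(A[0]) + 4. An index of 11 computes address 13, which fails the bounds check.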

• In addition to dynamically allocated arrays, C/C++ support:
  ◦ static (globally defined) arrays
  ◦ fixed stack-dynamic arrays
• Arrays declared in functions are allocated storage on the stack, just like other local variables.
• Index range and element type are fixed at compile time.

• The increasing popularity of OO programming has meant more emphasis on heap storage management.
• Active objects: can be accessed through a pointer or reference located on the stack.
• Inactive objects: blocks that cannot be accessed; no reference exists.
• (Accessible and inaccessible may be more descriptive.)

• Garbage: any block of heap memory that cannot be accessed by the program (there is no stack pointer to the block) but that the runtime system thinks is still in use.
• Garbage is created in several ways:
  ◦ A function ends without returning the space allocated to a local array or other dynamic variable. The pointer (dope vector) is gone.
  ◦ A node is deleted from a linked data structure, but isn’t freed.
  ◦ …

• A second type of problem can occur when a program assigns more than one pointer to a block of heap memory.
• The block may be deleted and one of the pointers set to null, but the other pointers still exist.
• If the runtime system reassigns the memory to another object, the original pointers pose a danger.

• A dangling pointer (or dangling reference, or widow) is a pointer (reference) that still contains the address of heap space that has been deallocated (returned to the free list).
• An orphan (garbage) is a block of allocated heap memory that is no longer accessible through any pointer.
• A memory leak is a gradual loss of available memory due to the creation of garbage.

Consider this code:

    class node {
        int value;
        node next;
    }
    ...
    node p, q;       // p & q are on the stack but point to the heap
    p = new node();
    q = new node();
    ...
    q = p;
    delete(p);

• The statement q = p; creates a memory leak.
• The node originally pointed to by q is no longer accessible: it’s an orphan (garbage).
• Now, add the statement delete(p);
• The pointer p is correctly set to null, but q is now a dangling pointer (or widow).

Creating Widows and Orphans: A Simple Example (Figure 11.4). (a): after new(p); new(q); (b): after q = p; (c): after delete(p). q still points to a location in the heap, which could be allocated to another request in the future. The node originally pointed to by q is now garbage.
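Python has no delete for heap blocks, so the widow-and-orphan example can only be imitated with a simulated heap. In this sketch the heap is a dictionary from made-up integer addresses to nodes; `new_node`, `delete`, and `orphan_addr` are hypothetical helpers that exist only so the example can inspect what happened.

```python
heap = {}            # address -> node: a stand-in for allocated heap blocks
next_addr = [100]    # next address to hand out (arbitrary starting value)


def new_node(value):
    """Allocate a node and return its simulated address."""
    addr = next_addr[0]
    next_addr[0] += 1
    heap[addr] = {"value": value, "next": None}
    return addr


def delete(addr):
    """Return the block at `addr` to the (simulated) free storage."""
    del heap[addr]


p = new_node(1)
q = new_node(2)
orphan_addr = q      # remembered only so the example can inspect it later

q = p                # the node at orphan_addr is now unreachable: an orphan
delete(p)
p = None             # p is correctly nulled, but q still holds the old address

q_is_dangling = q is not None and q not in heap
orphan_still_allocated = orphan_addr in heap
```

After these steps q holds an address that is no longer allocated (a widow), while the block at orphan_addr stays allocated with nothing pointing to it (a leak).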

Example: A = 5, then A = A * 2.5, then A = “cat”.
• Variables contain references to data values.
• Python may allocate new storage with each assignment, and it handles memory management automatically: it creates new objects and stores them in memory, and it runs garbage collection algorithms to reclaim any inaccessible memory locations.
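The rebinding sequence from this slide can be run as-is in Python. The `history` list is an addition for this sketch; it records the type and value that the name `a` refers to after each assignment.

```python
history = []

a = 5
history.append((type(a).__name__, a))

a = a * 2.5          # a now refers to a new float object
history.append((type(a).__name__, a))

a = "cat"            # a now refers to a string object
history.append((type(a).__name__, a))

print(history)
```

The name `a` is rebound to a differently typed object each time; no declaration or explicit allocation is needed, and the earlier objects become candidates for reclamation once nothing refers to them.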

• Memory management in programming languages binds (logical) addresses to instructions and data.
  ◦ The memory accessible to a program is its address space, represented as a set of values {0, 1, …, n}.
• Three types of storage:
  ◦ Static
  ◦ Stack
  ◦ Heap
• Stack and heap are managed by the runtime system.
  ◦ Stack: structured; contains activation records.
  ◦ Heap: less structured; holds dynamically allocated variables.

• Problems with heap storage:
  ◦ Memory leaks (garbage): failure to free storage when pointers (references) are reassigned.
  ◦ Dangling pointers: storage is freed, but references to the storage still exist.
• Two schools of thought:
  ◦ The programmer takes care of memory management.
  ◦ The language’s runtime system takes care of it.
  ◦ Memory management = allocating memory, and deallocating it when the program no longer needs it.
• C/C++: programmer allocates and deallocates.
• Java: language handles deallocation and some allocation.
• Python: language does it all.

• In garbage collection, all inaccessible blocks of storage are identified and returned to the free list.
• The heap may also be compacted at this time: allocated space is compressed into one end of the heap, leaving all free space in a large block at the other end.

• C and C++ leave it to the programmer: if an unused block of storage isn’t explicitly freed by the program, it becomes garbage.
  ◦ You can get C++ garbage collectors, but they aren’t standard.
• Java, Python, Perl (and other scripting languages) are examples of languages with garbage collection.
  ◦ Python, etc. also allocate automatically: no need for “new” statements.
• Garbage collection was pioneered by languages like Lisp, which constantly creates and destroys linked lists.

• There are three major approaches to automating the process:
  ◦ Reference counting, mark-sweep, copy collection.
• All have the same basic format: determine the heap nodes that are accessible (directly or indirectly) and get rid of everything else.
  ◦ A node is directly accessible if a global or stack variable points to it (has a reference to it).
  ◦ A node is indirectly accessible if it can be reached through a chain of pointers that originates on the stack or in global memory.

• Reference counting: initially, the heap is structured as a linked list (free list) of nodes.
• Each node has a reference count field; initially 0.
• When a block is allocated, it’s removed from the free list and its reference count is set to 1.
  ◦ The address of the block is assigned to a pointer or reference variable on the stack.

• When another pointer is assigned to the block, the reference count is incremented.
• When a pointer is freed (or re-assigned), the reference count of the block it points to is decremented.
• When a block’s count goes back to zero, return it to the free list.
  ◦ If it points to any other nodes, reduce their reference counts by one.

(Figure notes) Reference count for each node would be 1. Reference count for the top node would be 2; for the bottom node it would be 0.

Simple Example Using Reference Counts. (Diagram: P and Q both reference the first node of a three-node list; the reference counts are 2, 1, 1 and the last node’s next field is null.)
• “P = null” reduces the reference count of the first node to 1.
• “Q = null” reduces the reference count of the first node to 0, which triggers the reduction of the reference count in node 2 to 0, recursively reduces the reference count in node 3 to 0, and then returns all three nodes to the free list.
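The cascade in this example can be reproduced with a small simulation. The `Node` class, `add_ref`, `release`, and the node names n1 to n3 are hypothetical; the free list here just records the names of freed nodes, in order.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.ref_count = 0
        self.next = None


free_list = []  # names of nodes returned to the free list, in order


def add_ref(node):
    """Record one more pointer to `node` and return it for assignment."""
    if node is not None:
        node.ref_count += 1
    return node


def release(node):
    """Drop one reference; free the node and cascade when the count hits 0."""
    while node is not None:
        node.ref_count -= 1
        if node.ref_count > 0:
            return
        free_list.append(node.name)
        successor, node.next = node.next, None
        node = successor  # the freed node no longer points at its successor


# Build the three-node list with P and Q both referencing the first node.
n1, n2, n3 = Node("n1"), Node("n2"), Node("n3")
n2.next = add_ref(n3)
n1.next = add_ref(n2)
P = add_ref(n1)
Q = add_ref(n1)
counts_before = [n1.ref_count, n2.ref_count, n3.ref_count]

release(P)
P = None
count_after_p = n1.ref_count
freed_after_p = list(free_list)

release(Q)
Q = None
```

Setting P to null only lowers the first count; setting Q to null drives it to zero and the loop in `release` walks down the list, freeing each node in turn.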

Node Structure and Example Heap for Reference Counting (Figure 11.5). There’s a block at the bottom with ref. count = 0. What does this represent? What would happen if delete is performed on p and q?

Node Structure and Example Heap for Reference Counting (Figure 11.5). Suppose the instruction p→next = null is executed. The nodes on the right then form an unreachable circular list.

• Advantage of reference counting: the algorithm is performed whenever there is an assignment or other heap action, so overhead is distributed over the program’s lifetime.
• Disadvantages:
  ◦ Can’t detect inaccessible circular lists.
  ◦ Some extra overhead due to reference counts (storage and time).
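The circular-list weakness can be observed in CPython, whose memory manager combines reference counting with a separate cycle detector exposed through the standard `gc` module. This sketch is specific to CPython and may behave differently on other Python implementations; the `Node` class and `probe` are illustrative names.

```python
import gc
import weakref


class Node:
    """Plain object that can hold a `next` reference."""


gc.disable()                  # keep the cycle detector from running on its own
a, b = Node(), Node()
a.next, b.next = b, a         # two-node circular list
probe = weakref.ref(a)        # observes the node without keeping it alive

del a, b                      # no variable reaches the cycle any more
alive_after_del = probe() is not None      # counts never reached zero

gc.collect()                  # run the cycle detector explicitly
alive_after_collect = probe() is not None
gc.enable()
```

Each node in the cycle keeps the other's count at 1, so reference counting alone never frees them; only the explicit collection pass reclaims the pair.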

• Mark-sweep runs when the heap is full (the free list is empty or cannot satisfy a request).
• Two-pass process:
  ◦ Pass 1: All active references on the stack are followed and the blocks they point to are marked (using a special mark bit set to 1).
  ◦ Pass 2: The entire heap is swept, looking for unmarked blocks, which are then returned to the free list. At the same time, the mark bits are turned off (set to 0).

Mark(R):    // R is a stack reference
    If (R.MB == 0) {
        R.MB = 1;
        If (R.next != null)
            Mark(R.next);   // recurse only from newly marked nodes, so cycles terminate
    }

All reachable nodes are marked. Starts in the stack, moves to the heap.

Sweep():
    i = h;              // h = first heap address
    While (i <= n) {
        if (i.MB == 0)
            free(i);    // add node i to free list
        else
            i.MB = 0;
        i++;
    }

Operates only on the heap.
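The two passes can be made runnable with a small Python model. This is a sketch under assumptions: the heap is a list of `Node` objects with a single `next` field, roots are given as a list, and the names a through e are invented for the example. The mark pass is written as a loop rather than a recursion, and it stops at already-marked nodes so that cycles terminate.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.mark = 0      # the mark bit (MB)
        self.next = None


def mark(node):
    """Pass 1: follow the chain from a root, marking each node once."""
    while node is not None and node.mark == 0:
        node.mark = 1
        node = node.next


def sweep(heap):
    """Pass 2: free unmarked nodes and reset the mark bit of the rest."""
    freed = []
    for node in heap:
        if node.mark == 0:
            freed.append(node.name)   # would be added to the free list
        else:
            node.mark = 0
    return freed


a, b, c, d, e = (Node(n) for n in "abcde")
heap = [a, b, c, d, e]
a.next, b.next = b, a     # reachable cycle
c.next, d.next = d, c     # unreachable cycle
roots = [a]               # the only stack reference; e is isolated garbage

for root in roots:
    mark(root)
marked = [n.name for n in heap if n.mark == 1]
freed = sweep(heap)
```

Unlike reference counting, this pass reclaims the unreachable cycle (c, d) as well as the isolated node e.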

Node Structure and Example for Mark-Sweep Algorithm (Figure 11.6). Before the mark-sweep algorithm begins.

Heap after Pass I of Mark-Sweep (Figure 11.7). After the first (mark) pass, accessible nodes are marked; others aren’t.

Heap after Pass II of Mark-Sweep (Figure 11.8). After the second (sweep) pass, all inaccessible nodes are linked into a free list and all accessible nodes have their mark bits returned to 0.

• Advantages of mark-sweep:
  ◦ It may never run (it only runs when the heap is full).
  ◦ It finds and frees all unused memory blocks.
• Disadvantage: it is very intensive when it does run. Long, unpredictable delays are unacceptable for some applications.

• Copy collection: like Mark & Sweep (M&S), it runs when the heap is full.
• Possibly faster than M&S because it only makes one pass through the heap, but the copying part slows it down.
• No reference count or mark bit needed.
• The heap is divided into two halves, from_space and to_space.

• While garbage collection isn’t needed:
  ◦ from_space contains allocated nodes and nodes on the free list.
  ◦ to_space is not used.
• When from_space is full, “flip” the two spaces, and pack all accessible nodes in the old from_space into the new from_space. Any left-over space is the free space.
  ◦ (A node is accessible if it can be reached from the stack, or from another accessible node in the heap.)

Initial Heap Organization for Copy Collection (Figure 11.9). [Figure not available]

Result of a Copy Collection Activation (figure): after “flipping” and repacking into the former to_space, the accessible nodes are packed, orphans are returned to the free_list, and the two halves reverse roles.

• When an active object is copied to the to_space, update any references contained in the objects (addresses have changed).
• When copying is completed, the new to_space contains only active objects, and they are tightly packed into the space.
• Consequently, the heap is automatically compacted (defragmented).
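The copy-and-update step can be sketched with list indices standing in for addresses. This is a simplified model, not the textbook's algorithm verbatim: `copy_collect`, the `forwarding` table, and the `origin` list are names introduced here, each node has a single `next` field, and the scan over to_space is one possible traversal order.

```python
def copy_collect(from_space, roots):
    """Copy every node reachable from `roots` into a fresh, packed to_space.

    Addresses are list indices. Returns (to_space, new_roots).
    """
    to_space = []
    origin = []        # origin[i] = old address of the node now at to_space[i]
    forwarding = {}    # old address -> new address

    def copy(addr):
        if addr is None:
            return None
        if addr not in forwarding:          # first visit: copy the node
            forwarding[addr] = len(to_space)
            to_space.append({"value": from_space[addr]["value"], "next": None})
            origin.append(addr)
        return forwarding[addr]

    new_roots = [copy(r) for r in roots]
    scan = 0
    while scan < len(to_space):             # fix up references in copied nodes
        old_next = from_space[origin[scan]]["next"]
        to_space[scan]["next"] = copy(old_next)
        scan += 1
    return to_space, new_roots


from_space = [
    {"value": "A", "next": 2},
    {"value": "garbage1", "next": 3},
    {"value": "B", "next": 0},
    {"value": "garbage2", "next": 1},
]
to_space, new_roots = copy_collect(from_space, roots=[2])
```

Only A and B survive, packed at the start of to_space, and both the root and the `next` fields now hold the new addresses. The two garbage nodes are never visited, so the cost depends on the amount of live data rather than on the heap size.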

• Automatic compaction is the main advantage of this method when compared to mark-and-sweep.
• Disadvantages:
  ◦ All active objects must be copied: this may take a lot of time (not necessarily as much as the two-pass algorithm).
  ◦ Requires twice as much space for the heap.

• If r, the ratio of active heap blocks to heap size, is significantly less than 1/2 (that is, the active blocks occupy well under half the heap), copy collection is more efficient.
  ◦ Efficiency = amount of memory reclaimed per unit of time.
• As r approaches 1/2, mark-sweep becomes more efficient.
• Based on a study reported in a paper by Jones and Lins,

• Different languages and implementations will probably use some variation or combination of the above strategies.
• Java runs garbage collection as a background process when demand on the system is low, hoping that the heap will never be full.
• Java also allows programmers to explicitly request garbage collection, without waiting for the system to do it automatically.
• Functional languages (Lisp, Scheme, …) also have built-in garbage collectors.
• C/C++ do not.

• Some commercial applications divide nodes into categories according to how long they’ve been in memory.
  ◦ The assumption is that long-resident nodes are likely to be permanent: don’t examine them.
  ◦ New nodes are less likely to be permanent: consider them first.
  ◦ There may be several “aging” levels.