L6: Malloc Lab Writing a Dynamic Storage Allocator October 30, 2006

15-213 “The course that gives CMU its Zip!”
Topics: Memory Allocator (Heap)
Reminders: L6: Malloc Lab due Nov 10, 2006
Section A (Donnie H Kim), recitation8.ppt (some slides from lecture notes)

L6: Malloc Lab
Things that matter in this lab:
- Performance goals
  - Maximizing throughput
  - Maximizing memory utilization
- Implementation issues (design space)
  - Free block organization
  - Placement policy
  - Splitting
  - Coalescing
- And some advice

Some useful background

So what is memory allocation?
Process address space, from high addresses to low:
- Kernel virtual memory (memory invisible to user code)
- Stack (%esp)
- Memory-mapped region for shared libraries
- Run-time heap (via malloc), bounded by the “brk” pointer
- Uninitialized data (.bss)
- Initialized data (.data)
- Program text (.text)
Allocators request additional heap memory from the operating system using the sbrk function.

Malloc Package
#include <stdlib.h>
void *malloc(size_t size)
- If successful: returns a pointer to a memory block of at least size bytes, (typically) aligned to an 8-byte boundary. If size == 0, it may return NULL (the result is implementation-defined in standard C).
- If unsuccessful: returns NULL (0) and sets errno.
void free(void *p)
- Returns the block pointed at by p to the pool of available memory.
- p must come from a previous call to malloc or realloc.
void *realloc(void *p, size_t size)
- Changes the size of block p and returns a pointer to the new block.
- Contents of the new block are unchanged up to the minimum of the old and new sizes.
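As a small illustration of the interface above, the following sketch allocates a block, grows it with realloc, and frees it. The function name malloc_realloc_demo is hypothetical and not part of the lab handout.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical demo: allocate, grow with realloc, then free.
 * Returns 1 if the old contents survive the resize, 0 on allocation failure. */
int malloc_realloc_demo(void)
{
    char *p = malloc(8);
    if (p == NULL)
        return 0;
    strcpy(p, "abc");               /* 4 bytes including the terminator */

    /* Keep the old pointer until realloc succeeds: on failure realloc
     * returns NULL and the original block is still allocated. */
    char *q = realloc(p, 64);
    if (q == NULL) {
        free(p);
        return 0;
    }
    assert(strcmp(q, "abc") == 0);  /* contents preserved up to min(old, new) */
    int ok = (q[0] == 'a');
    free(q);                        /* free q, not p: the block may have moved */
    return ok;
}
```

Assigning the result of realloc to a second pointer avoids leaking the original block when the call fails.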

Allocation Examples
    p1 = malloc(4)
    p2 = malloc(5)
    p3 = malloc(6)
    free(p2)
    p4 = malloc(2)

Performance goals
- Maximizing throughput (temporal)
  - Defined as the number of requests completed per unit time
- Maximizing memory utilization (spatial)
  - Defined as the ratio of the requested memory size to the actual memory size used
- There is a tension between maximizing throughput and utilization!
  - Find an appropriate balance between the two goals!
Keep this in mind; we will come back to these issues.

Implementation Issues
- Free block organization
  - How do we keep track of the free blocks?
  - How do we know how much memory to free given just a pointer?
- Placement policy
  - How do we choose an appropriate free block?
- Splitting
  - What do we do with the extra space when allocating a structure that is smaller than the free block it is placed in?
- Coalescing
  - How do we reinsert a freed block?
(Figure: a block p0, then free(p0), then p1 = malloc(1))

Implementation Issues 1: Free Block Organization
- Identifying which blocks are free and which are allocated
- Available design choices for managing free blocks
  - Implicit list
  - Explicit list
  - Segregated list
- Header and footer organization
  - Stores information about the block (size, allocated, freed)

Keeping Track of Free Blocks
- Method 1: Implicit list using lengths; links all blocks
- Method 2: Explicit list among the free blocks, using pointers within the free blocks
- Method 3: Segregated free list
  - Different free lists for different size classes
- Method 4: Blocks sorted by size
  - Can use a balanced tree (e.g. a red-black tree) with pointers within each free block, and the length used as a key
(Figure: example heaps with block sizes 5, 4, 6, 2)
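Method 3 needs a rule that maps a block size to a size class. One possible rule, sketched here under assumptions that are not from the slides (eight classes, power-of-two upper bounds starting at 16 bytes, and the name size_class):

```c
#include <assert.h>
#include <stddef.h>

#define NUM_CLASSES 8   /* assumption: number of segregated lists */

/* Maps a block size in bytes to a size-class index.
 * Class 0 holds sizes up to 16, class 1 up to 32, class 2 up to 64, and so
 * on; the last class holds everything larger. */
int size_class(size_t size)
{
    int cls = 0;
    size_t limit = 16;

    assert(size > 0);
    while (cls < NUM_CLASSES - 1 && size > limit) {
        limit <<= 1;
        cls++;
    }
    return cls;
}
```

How the classes are drawn is a tuning decision; a different spacing may suit some trace files better.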

Free Block Organization: Block with Header
Format of allocated and free blocks (1-word header):
    [ size | a ] [ payload ] [ optional padding ]
- a = 1: allocated block; a = 0: free block
- size: block size
- payload: application data (allocated blocks only)

Free Block Organization: Block with Header and Footer
Format of allocated and free blocks:
    [ header: size | a ] [ payload and padding ] [ boundary tag (footer): size | a ]
- a = 1: allocated block; a = 0: free block
- size: total block size
- payload: application data (allocated blocks only)
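The size and the allocated bit can share one word because block sizes are multiples of 8, which leaves the low three bits free. A minimal sketch, assuming 4-byte header and footer words; the helper names pack, block_size and is_allocated are illustrative:

```c
#include <assert.h>
#include <stdint.h>

#define ALLOC_BIT 0x1u      /* low bit: 1 = allocated, 0 = free */
#define SIZE_MASK (~0x7u)   /* sizes are multiples of 8, so the low 3 bits are spare */

/* Combines a block size and an allocated flag into one header/footer word. */
uint32_t pack(uint32_t size, uint32_t alloc)
{
    assert((size & 0x7u) == 0);
    return size | (alloc & ALLOC_BIT);
}

/* Extracts the block size from a header/footer word. */
uint32_t block_size(uint32_t word)
{
    return word & SIZE_MASK;
}

/* Extracts the allocated flag (0 or 1) from a header/footer word. */
uint32_t is_allocated(uint32_t word)
{
    return word & ALLOC_BIT;
}
```

The same word is written at both ends of the block, so a neighbor can read the footer without knowing where the block starts.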

Implementation Issues 2: Placement Policy
“Placement policy” choices:
- First fit: search the free list from the beginning and choose the first free block that fits
- Next fit: start the search where the previous search left off
- Best fit: examine every free block and choose the smallest one that fits
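The difference between first fit and best fit can be sketched over a plain array of free block sizes. This is a simplification for illustration only: a real allocator walks its free list rather than an array, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* First fit: index of the first free block that is large enough, or -1. */
int first_fit(const size_t *free_sizes, int n, size_t request)
{
    assert(free_sizes != NULL || n == 0);
    for (int i = 0; i < n; i++) {
        if (free_sizes[i] >= request)
            return i;
    }
    return -1;
}

/* Best fit: index of the smallest free block that is large enough, or -1.
 * Unlike first fit, this always scans every block. */
int best_fit(const size_t *free_sizes, int n, size_t request)
{
    int best = -1;

    assert(free_sizes != NULL || n == 0);
    for (int i = 0; i < n; i++) {
        if (free_sizes[i] >= request &&
            (best < 0 || free_sizes[i] < free_sizes[best]))
            best = i;
    }
    return best;
}
```

With free blocks of 32, 16, 64 and 24 bytes and a 20-byte request, first fit stops at the 32-byte block, while best fit scans the whole list and picks the 24-byte block.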

Implementation Issues 3: Splitting
“Splitting” design choices:
- Using the entire free block
  - Simple, fast
  - Introduces internal fragmentation (a good placement policy might reduce this)
- Splitting
  - Split the free block into two parts when the second part can be used for other requests (reduces internal fragmentation)
(Figure: p1 = malloc(1))
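The split decision comes down to whether the leftover piece is big enough to be a block in its own right. A sketch, assuming a 16-byte minimum block (4-byte header, 4-byte footer, 8-byte payload); the name split_remainder is hypothetical:

```c
#include <assert.h>
#include <stddef.h>

#define MIN_BLOCK 16u   /* assumption: header + footer + smallest aligned payload */

/* Given a free block of free_size bytes and an adjusted request of asize
 * bytes, returns the size of the new free block created by splitting, or 0
 * if the remainder is too small and the whole block should be handed out. */
size_t split_remainder(size_t free_size, size_t asize)
{
    assert(asize <= free_size);
    size_t rest = free_size - asize;
    return (rest >= MIN_BLOCK) ? rest : 0;
}
```

Raising the threshold above the minimum block size trades some internal fragmentation for fewer tiny free blocks.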

Implementation Issues 4: Coalescing
- False fragmentation
  - Free memory chopped into small, unusable free blocks
  - Coalesce adjacent free blocks to get a bigger free block
- Coalescing: a policy decision about when to perform it
  - Immediate coalescing: merge any adjacent free blocks each time a block is freed
  - Deferred coalescing: merge free blocks some time later, e.g. when an allocation request fails
- “Bidirectional immediate coalescing”, proposed by Donald Knuth, should be good enough for this lab
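Bidirectional immediate coalescing with boundary tags can be sketched on a simulated heap of 4-byte words. Everything here is an illustrative assumption rather than the lab's code: the array-based heap, the word-index addressing, and the names set_block, coalesce and coalesce_demo.

```c
#include <assert.h>
#include <stdint.h>

#define WSIZE 4u                                   /* bytes per header/footer word */
#define PACK(size, alloc) ((uint32_t)(size) | (uint32_t)(alloc))
#define SIZE(word)  ((word) & ~0x7u)
#define ALLOC(word) ((word) & 0x1u)

/* Writes matching header and footer words for the block whose header is at
 * word index h. size is the total block size in bytes. */
void set_block(uint32_t *heap, uint32_t h, uint32_t size, uint32_t alloc)
{
    heap[h] = PACK(size, alloc);
    heap[h + size / WSIZE - 1] = PACK(size, alloc);
}

/* Merges the free block at header index h with any free neighbors and
 * returns the header index of the merged block. Relies on an allocated
 * prologue before the first block and an allocated epilogue after the last,
 * so heap[h - 1] and the next header always exist. */
uint32_t coalesce(uint32_t *heap, uint32_t h)
{
    uint32_t size = SIZE(heap[h]);
    uint32_t next = h + size / WSIZE;            /* header of the next block */
    uint32_t prev_alloc = ALLOC(heap[h - 1]);    /* footer of the previous block */
    uint32_t next_alloc = ALLOC(heap[next]);

    assert(!ALLOC(heap[h]));
    if (!next_alloc)
        size += SIZE(heap[next]);                /* absorb the next block */
    if (!prev_alloc) {
        uint32_t prev_size = SIZE(heap[h - 1]);
        h -= prev_size / WSIZE;                  /* merged block starts at prev */
        size += prev_size;
    }
    set_block(heap, h, size, 0);
    return h;
}

/* Builds prologue | free A (16) | free B (16) | free C (24) | epilogue,
 * coalesces around B, and returns the merged block size in bytes. */
uint32_t coalesce_demo(void)
{
    uint32_t heap[17];

    set_block(heap, 0, 8, 1);     /* prologue: words 0..1 */
    set_block(heap, 2, 16, 0);    /* free block A: words 2..5 */
    set_block(heap, 6, 16, 0);    /* just-freed block B: words 6..9 */
    set_block(heap, 10, 24, 0);   /* free block C: words 10..15 */
    heap[16] = PACK(0, 1);        /* epilogue header */

    uint32_t merged = coalesce(heap, 6);
    assert(merged == 2);          /* merged block starts where A started */
    return SIZE(heap[merged]);
}
```

The footer of the previous block is what makes the backward merge a constant-time operation; without it the allocator would have to walk the heap from the start.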

Performance goals
- Maximizing throughput (temporal)
  - Defined as the number of requests completed per unit time
- Maximizing memory utilization (spatial)
  - Defined as the ratio of the requested memory size to the actual memory size used
- There is a tension between maximizing throughput and utilization!
  - Find an appropriate balance between the two goals!

Performance goal (1) - Throughput
Throughput is mostly determined by the time spent searching for a free block. How you keep track of your free blocks affects search time.
- Naïve allocator: never frees a block, just extends the heap when a new block is needed. Throughput is extremely high, but…?
- Implicit free list: the allocator traverses the set of free blocks indirectly, by walking all of the blocks in the heap. Definitely slow.
- Explicit free list: the allocator traverses the set of free blocks directly, visiting only the free blocks in the heap.
- Segregated free list: the allocator traverses one particular free list to find an appropriate free block.

Performance goal (2) – Memory Utilization
Poor memory utilization is caused by fragmentation, which comes in two forms: internal and external.
- Internal fragmentation
  - Based on previous requests
  - Causes: the allocator imposes a minimum block size (depending on its choice of block format); alignment requirements must be satisfied
- External fragmentation
  - Based on future requests
  - Aggregate free memory is enough, but no single free block is large enough to handle the request

Internal Fragmentation
- For a given block, internal fragmentation is the difference between the block size and the payload size.
- Caused by the overhead of maintaining heap data structures, padding for alignment purposes, or explicit policy decisions (e.g., not to split the block).
- Depends only on the pattern of previous requests, and thus is easy to measure.
(Figure: a block with the payload in the middle and internal fragmentation on either side)
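The definition above can be worked through numerically. The sketch below assumes 8-byte alignment, 8 bytes of header and footer overhead, and a 16-byte minimum block; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define ALIGNMENT 8u    /* payload alignment required by the lab */
#define OVERHEAD  8u    /* assumption: 4-byte header + 4-byte footer */
#define MIN_BLOCK 16u   /* assumption: smallest block the allocator hands out */

/* Rounds a requested payload size up to a full block size: add the header
 * and footer overhead, then round up to the next multiple of ALIGNMENT. */
size_t adjust_size(size_t payload)
{
    assert(payload > 0);
    size_t asize = (payload + OVERHEAD + (ALIGNMENT - 1)) & ~(size_t)(ALIGNMENT - 1);
    return asize < MIN_BLOCK ? MIN_BLOCK : asize;
}

/* Internal fragmentation of one block: block size minus payload size. */
size_t internal_fragmentation(size_t payload)
{
    return adjust_size(payload) - payload;
}
```

Under these assumptions a 1-byte request occupies a 16-byte block, so 15 of its 16 bytes are internal fragmentation.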

External Fragmentation
Occurs when there is enough aggregate heap memory, but no single free block is large enough.
    p1 = malloc(4)
    p2 = malloc(5)
    p3 = malloc(6)
    free(p2)
    p4 = malloc(6)    oops!
External fragmentation depends on the pattern of future requests, and thus is difficult to measure.

The Malloc Lab

Assumptions
Assumptions made in Malloc Lab:
- The standard C library malloc always returns a payload pointer that is aligned to 8 bytes, and so should yours
- 64-bit architecture
  - Pointers are 8 bytes long!
  - size_t is now 8 bytes (unsigned long)
  - But requested sizes will fit in 4 bytes (less than 2^32 bytes)
  - You may use 4-byte headers and footers and get away with it
(Figure legend: free word, allocated word; the example shows an allocated block of 4 words and a free block of 2 words)

Porting to 64-bit Machine
Porting the code in your CS:APP textbook to 64-bit:
    sizeof(long) == 4   // 32-bit
    sizeof(long) == 8   // 64-bit
The only significant difference is in the definitions of the GET and PUT macros.
Changes (to keep our 32-bit headers and footers):
    #define GET(p)      (*(size_t *)(p))                  // 32 bits
    #define GET(p)      (*(unsigned int *)(p))            // 64 bits
    #define PUT(p, val) (*(size_t *)(p) = (val))          // 32 bits
    #define PUT(p, val) (*(unsigned int *)(p) = (val))    // 64 bits
The check on the return value of mem_sbrk appears in two forms:
    if ((long)(bp = mem_sbrk(size)) < 0)
    if ((int)(bp = mem_sbrk(size)) < 0)
On a 64-bit machine the cast must go through long, because casting an 8-byte pointer to int truncates it.

Using MACROS – why?
    #include <stdio.h>

    #define GET8(p)      (*(unsigned long *)(p))
    #define PUT8(p, val) (*(unsigned long *)(p) = (unsigned long)(val))

    void test(void *p, void *pval)
    {
        unsigned long *newpval;

        /* Reading and writing pointers the hard way */
        *(unsigned long *)p = (unsigned long)pval;
        newpval = (unsigned long *)(*(unsigned long *)p);
        printf("pval=%p newpval=%p\n", pval, (void *)newpval);

        /* Reading and writing pointers the easy way */
        PUT8(p, pval);
        newpval = (unsigned long *)GET8(p);
    }

    int main(void)
    {
        char *pval = (char *)0x99;
        char buf[128];

        test(&buf[0], pval);
        return 0;
    }

Approach Advice
- Start with the implicit list implementation in your textbook, and understand every detail of it.
- When you finish your implicit list, start thinking about your heap checker.
  - The more time you spend on this, the more time you will save later.
- Go on and start implementing an explicit list with several placement policies.
  - Modularize, and save each of your placement policies for comparison.
- When you finish your explicit list, you will want to add more checks to your heap checker. Do this right away.
- When you feel your explicit list is robust, move on to the segregated free list.
  - We are looking for a good segregated free list implementation.
- You can go further by trying other schemes such as balanced trees, but a solid segregated free list implementation is good enough for full credit.
  - You can also try some tweaks based on the given trace files.

Heap Checker (10 pts): Basic Checks
Guidelines (5/10 pts)
Check the heap (while working on the implicit list):
- Check the epilogue and prologue blocks
- Check each block’s address alignment (8 bytes)
- Check heap boundaries
- Check each block’s header and footer
  - Size (minimum size, alignment)
  - prev/next allocate/free bit consistency (explicit list)
  - Header and footer matching each other
- Check your coalescing
  - All blocks are coalesced correctly (no two consecutive free blocks in the heap)

Heap Checker (10 pts): Free List Checks
Guidelines (5/10 pts)
Check the free list (while working on the explicit free list):
- All next/prev pointers are consistent (if A’s next pointer points to B, B’s prev pointer should point to A)
- All free list pointers point between mem_heap_lo() and mem_heap_hi()
- Count the free blocks by iterating over every block, count them again by traversing the free list by pointers, and see if the counts match
- Adding more checks as you wish is recommended
Check the segregated free list (while working on the segregated free list):
- All blocks in each list bucket fall within the bucket’s size range
- Be creative
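The next/prev consistency check can be sketched as follows. The struct layout, the NULL-terminated list, and the function names are assumptions for illustration; a real free list stores these pointers inside the free blocks themselves.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical free-list node: the two pointers kept in a free block. */
struct free_block {
    struct free_block *prev;
    struct free_block *next;
};

/* Walks a NULL-terminated doubly linked free list from head.
 * Returns the number of blocks if every prev pointer matches the block
 * actually visited before it, or -1 at the first inconsistency. */
int check_free_list(const struct free_block *head)
{
    int count = 0;
    const struct free_block *prev = NULL;

    for (const struct free_block *b = head; b != NULL; b = b->next) {
        if (b->prev != prev)
            return -1;
        prev = b;
        count++;
    }
    return count;
}

/* Builds a three-block list, optionally corrupts one back-link, and
 * returns the checker's verdict. */
int free_list_demo(int corrupt)
{
    struct free_block a, b, c;

    a.prev = NULL; a.next = &b;
    b.prev = &a;   b.next = &c;
    c.prev = &b;   c.next = NULL;
    if (corrupt)
        c.prev = &a;    /* break the back-link on purpose */
    return check_free_list(&a);
}
```

The returned count can then be compared against the number of free blocks found by walking the whole heap, which is the count check from the list above.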

Style (10 pts)
It will be some of the most difficult and sophisticated code you have written so far in your career.
Things we are looking for:
- Explain your high-level design at the front of your code (2 pts)
- Each function should be preceded by a header comment (2 pts)
- Comment properly inside each function (2 pts)
- Decompose into functions and use as few global variables as possible (2 pts)
- Use macros, inline functions, and the C preprocessor wisely (2 pts)
Please try to write clean code that is readable and self-explanatory!
- For you
- For your teaching staff
- And for world peace

Debugging Techniques
Guidelines for debugging:
- Testing your code intensively, even when it seems to work, is good programming practice. Try to learn the process from this lab.
- You can print out all the information and monitor it
  - Do this when you are just getting started
  - Do this when the trace file is small
- You can also print error messages only when something is wrong
  - Printing and monitoring everything becomes painful when trace files are huge
  - Just print errors

Debugging Tips
Guidelines for using mdriver’s options:
- Use the ./mdriver -c <file> option to run a particular trace file just once, checking correctness only
  - Plain ./mdriver runs your allocator multiple times to estimate its throughput using the k-best measurement scheme (if you are interested, refer to ch 9 and the mdriver source code)
- Use the ./mdriver -v <level> option to set the verbosity level
- It is sometimes useful to have layers of debugging depth
  - You can also use #define, #ifdef, #if
- Make sure to turn all checking routines off completely when measuring performance, because they do affect performance
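Layers of debugging depth can be built with the preprocessor so that disabled layers compile to nothing. This is one possible arrangement; the macro names DEBUG_LEVEL, DBG_ERROR and DBG_TRACE and the meaning of each level are assumptions, not part of the lab handout.

```c
#include <assert.h>
#include <stdio.h>

#ifndef DEBUG_LEVEL
#define DEBUG_LEVEL 0   /* assumption: 0 = off, 1 = errors only, 2 = trace everything */
#endif

#if DEBUG_LEVEL >= 1
#define DBG_ERROR(...) fprintf(stderr, __VA_ARGS__)
#else
#define DBG_ERROR(...) ((void)0)    /* compiles away: no run-time cost */
#endif

#if DEBUG_LEVEL >= 2
#define DBG_TRACE(...) fprintf(stderr, __VA_ARGS__)
#else
#define DBG_TRACE(...) ((void)0)
#endif

/* Reports which debug level this build was compiled with. */
int debug_layers(void)
{
    DBG_TRACE("debug_layers called\n");
    return DEBUG_LEVEL;
}
```

Building with the compiler option -DDEBUG_LEVEL=2 turns on tracing; leaving the macro undefined keeps all debug output out of the timed build.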

More Hints?
Going further (beyond a solid segregated list):
- Before trying this, make sure your allocator is doing what you intended, using your heap and free list checkers
- If you think you have implemented a solid segregated free list, try focusing on the trace files that give you the worst performance results

More Hints?
Some possible points of attack:
- In malloc(), you have to adjust the requested size to meet alignment and minimum block size requirements
  - It turns out that how you adjust the size affects performance on some trace files
- Sometimes it is better to force your allocator to avoid splitting the free block, by using a block larger than the requested size
  - This will obviously increase internal fragmentation, but it can also increase throughput by avoiding repeated splitting and coalescing
- How much will you extend your heap when you have to extend it?
- How do you classify each free list?

Questions?