1 Smart Memory for Smart Phones Chris Clack University College London

Presentation transcript:

1 Smart Memory for Smart Phones Chris Clack University College London

2 Outline: Target Architecture; Problems; Focus on Fragmentation; Results from UT; A fast allocator (not embedded); Doug Lea’s Allocator; Can We Do Better?; Overheads; Results

3 Target Architecture: Small hand-held integrated phone/PDA devices. Soft real-time, “open box”, constrained applications heap. Competitive pressure for more, more flexible, and better (larger) applications.

4 Problems (1) [Figure: heap of live and free blocks below TOP, with a free fragment marked.] To compact: copy when nearly full. Memory overhead. Compaction delay.

5 Problems (2) [Figure: heap of live and free blocks below TOP.] To compact: do sliding compaction when nearly full. Compaction delay.

6 Problems (3) [Figure: heap of live and free blocks, with a FREE LIST.] To compact: do sliding compaction when allocation fails. Compaction delay.

7 Focus on Fragmentation: What happens in real programs? Great paper by Mark Johnstone and Paul Wilson (UT): “The Memory Fragmentation Problem: Solved?”, M. Johnstone & P. Wilson, 1997. Fragmentation experiments using real programs running on real data.

8 [Table columns: Max live at any time; Max Kb at any time; Average lifetime of an allocated byte.]

9 RESULTS No difference within experimental error

10 MEASURE OF FRAGMENTATION [Figure with measures #3 and #4 marked.] E.g. %frag #4 = (value_at_3 – value_at_2) * 100 / value_at_2
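As a worked instance of the formula on this slide, here is a minimal Python sketch. The function and parameter names are illustrative assumptions, and the numbers are made up; value_at_3 and value_at_2 refer to points on the slide's figure, which is not reproduced in this transcript.

```python
def percent_fragmentation(value_at_3, value_at_2):
    """Percentage by which value_at_3 exceeds value_at_2 (the slide's %frag formula)."""
    if value_at_2 <= 0:
        raise ValueError("value_at_2 must be positive")
    return (value_at_3 - value_at_2) * 100 / value_at_2


# Hypothetical figures: 105,000 bytes measured at point 3 against
# 100,000 bytes measured at point 2.
print(percent_fragmentation(105_000, 100_000))  # 5.0
```

With these invented inputs the result is 5%, the same order of magnitude as the fragmentation quoted in the results slides.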

11 No difference within experimental error

12 Johnstone & Wilson’s conclusion: The best free-list management policy in terms of fragmentation behaviour on real programs is BEST-FIT (Knuth notwithstanding).

13 A Fast Best-Fit Allocator. IMPLICATION: use best-fit allocation and we (maybe?) won’t ever need to compact. At least, compaction delays will be minimized. BUT: best-fit allocation is S-L-O-W. Worst case: have to scan the entire free list. Let’s look at a widely-used best-fit allocator: Doug Lea’s malloc, (arguably) the fastest best-fit allocator.

14 [Figure: a block with a boundary tag at each end.] Boundary tags are used for coalescing.
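The role of boundary tags in coalescing can be sketched as follows. This is a toy model under stated assumptions, not Doug Lea's actual layout: the heap is a Python list of words, every block stores the same tag in its first and last word, and the tag is +size for a free block and -size for an allocated one. The names write_tags and free_block are hypothetical.

```python
def write_tags(heap, start, size, free):
    """Write identical boundary tags at both ends of a block (assumed layout)."""
    tag = size if free else -size
    heap[start] = tag
    heap[start + size - 1] = tag


def free_block(heap, start):
    """Free the allocated block at `start`, merging with free neighbours."""
    size = -heap[start]  # allocated blocks carry a negative tag
    if size <= 0:
        raise ValueError("block is not allocated")
    # Following block: its header sits just past our last word.
    nxt = start + size
    if nxt < len(heap) and heap[nxt] > 0:
        size += heap[nxt]
    # Preceding block: its trailing tag is the word just before our header.
    if start > 0 and heap[start - 1] > 0:
        prev_size = heap[start - 1]
        start -= prev_size
        size += prev_size
    write_tags(heap, start, size, free=True)
    return start, size


heap = [0] * 12
write_tags(heap, 0, 4, free=True)
write_tags(heap, 4, 4, free=False)
write_tags(heap, 8, 4, free=True)
print(free_block(heap, 4))  # (0, 12): all three blocks merge into one
```

Both neighbours are found in constant time because the tags sit at known offsets, which is the point of keeping a tag at each end of every block.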

15 [Figure: exact-fit bins; fixed-width bins, sorted by size; W.] Sorting costs time. Worst case: all free blocks in one bin, which reduces to an O(n) search.

16 Can we do better? Support boundary tags and coalescing. Simple Idea (1) (of 4): the probability of fragmentation triggering compaction depends on the RANGE of allocatable block sizes. Very large block allocations are more likely to fail due to fragmentation. Very small free blocks create fragmentation. (NB: if all blocks are the same size, fragmentation is zero!)

17 Restrict the range of allocatable sizes and create an exact-fit table: lb, lb+1, lb+2, lb+3, …, ub-2, ub-1, ub. No need to sort. Worst case: O(n) search for the next highest occupied bin.

18 Old idea: use an occupancy bitmap. If (ub-lb) = 31, the bitmap is just one word. To search/allocate: read bitmap; AND with mask; find highest set bit; maybe modify bit and write. [Figure: exact-fit table lb, lb+1, lb+2, lb+3, …, ub-1, ub.]
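The search/allocate steps on this slide (read bitmap, AND with mask, find highest set bit, maybe modify bit and write) can be sketched in Python as below. Everything here is an illustrative assumption: the class name ExactFitTable, the choice to map size lb to the most significant bit so that "highest set bit" means "smallest adequate size", and the use of Python lists for the per-size chains. Splitting of oversized blocks is left out.

```python
WORD_BITS = 32


class ExactFitTable:
    """One bin per size in [lb, lb+31], plus a one-word occupancy bitmap."""

    def __init__(self, lb):
        self.lb = lb
        self.ub = lb + WORD_BITS - 1
        self.bins = [[] for _ in range(WORD_BITS)]
        self.bitmap = 0

    def _bit(self, size):
        # Assumed convention: size lb is the most significant bit.
        return WORD_BITS - 1 - (size - self.lb)

    def free(self, size, addr):
        if not self.lb <= size <= self.ub:
            raise ValueError("size outside the table's range")
        self.bins[size - self.lb].append(addr)
        self.bitmap |= 1 << self._bit(size)

    def malloc(self, size):
        if not self.lb <= size <= self.ub:
            raise ValueError("size outside the table's range")
        mask = (1 << (self._bit(size) + 1)) - 1   # keep bins for sizes >= size
        candidates = self.bitmap & mask           # read bitmap, AND with mask
        if candidates == 0:
            return None                           # no adequate free block
        bit = candidates.bit_length() - 1         # find highest set bit
        found = self.lb + (WORD_BITS - 1 - bit)
        chain = self.bins[found - self.lb]
        addr = chain.pop()                        # most recently freed first (LIFO)
        if not chain:
            self.bitmap &= ~(1 << bit)            # modify bit and write
        return found, addr


table = ExactFitTable(lb=2)
table.free(5, addr=100)
table.free(9, addr=200)
print(table.malloc(4))  # (5, 100): the smallest free size that is >= 4
```

No sorting or list scanning is involved: one masked read of the bitmap identifies the best-fitting occupied bin.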

19 Problem: what if the range is very large? E.g. Nikhil wants to allocate blocks that vary from 2 words to 2^12 words: 2^12 different block sizes. Worst case = linear search of 128 bitmap words (128 reads + …). Two solutions: (a) use more efficient bitmapping; (b) use an unconstrained hybrid scheme (see later).

20 More efficient bitmapping. Simple Idea (2): use a bitmap tree. Requires 133 words (the figure given under Static overheads). Worst case requires 5 reads, 3 tests for zero, 3 masks, 3 finds of greatest set bit, and 3 modify-and-writes. Generally: O(log_32((ub-lb)/32)). (Depends what you are counting … but it is fast!) Ten times faster than any other scheme we know.
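One way to read the bitmap tree is as a stack of bitmaps in which each bit of a parent word records whether the corresponding child word is non-zero. The Python sketch below is an interpretation under that assumption, not the authors' code; BitmapTree and its method names are hypothetical, and index 0 is mapped to the most significant bit of a word. For 2^12 sizes it builds 128 + 4 + 1 = 133 words, and a search that climbs to the root and back down touches leaf, middle, root, middle, leaf: five reads.

```python
WORD_BITS = 32


def first_set_at_or_after(word, pos):
    """Position (0 = most significant bit) of the first set bit at or after pos."""
    if pos >= WORD_BITS:
        return None
    masked = word & ((1 << (WORD_BITS - pos)) - 1)
    if masked == 0:
        return None
    return WORD_BITS - masked.bit_length()


class BitmapTree:
    def __init__(self, n_bits):
        self.levels = []                     # levels[0] = leaves, levels[-1] = root
        words = -(-n_bits // WORD_BITS)      # ceiling division
        while True:
            self.levels.append([0] * words)
            if words == 1:
                break
            words = -(-words // WORD_BITS)

    def words(self):
        return sum(len(level) for level in self.levels)

    def set(self, i):
        for level in self.levels:
            i, pos = divmod(i, WORD_BITS)    # i becomes the word index
            level[i] |= 1 << (WORD_BITS - 1 - pos)

    def clear(self, i):
        for level in self.levels:
            i, pos = divmod(i, WORD_BITS)
            level[i] &= ~(1 << (WORD_BITS - 1 - pos))
            if level[i] != 0:
                break                        # word still occupied: parents unchanged

    def find_next(self, i):
        """Smallest set index >= i, or None."""
        depth = 0
        while depth < len(self.levels):      # climb until some word has a hit
            w, pos = divmod(i, WORD_BITS)
            if w >= len(self.levels[depth]):
                return None
            hit = first_set_at_or_after(self.levels[depth][w], pos)
            if hit is not None:
                i = w * WORD_BITS + hit
                break
            i = w + 1                        # in the parent, look past this word
            depth += 1
        else:
            return None
        while depth > 0:                     # descend to the leftmost set leaf bit
            depth -= 1
            i = i * WORD_BITS + first_set_at_or_after(self.levels[depth][i], 0)
        return i


tree = BitmapTree(2 ** 12)
print(tree.words())        # 133
tree.set(4000)
print(tree.find_next(3))   # 4000
```

The number of words read grows with the number of levels rather than with the number of sizes, which is where the logarithm to base 32 comes from.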

21 LIFO/FIFO? Simple Idea (3): although J&W found no difference between LIFO/FIFO/AO best fit, this might be different for embedded apps. So far, we can only do LIFO. We can achieve FIFO if we double-link ALL free blocks into one big chain. Drawback: free now takes as long as malloc (but still O(log_32((ub-lb)/32))).

22 [Figure: bitmap tree over the exact-fit table lb, lb+1, lb+2, lb+3, …, ub-2, ub-1, ub.] Freed blocks are placed at the heads of chains. If the requested size is not available: for LIFO, search the bitmap tree to the right; for FIFO, search the bitmap tree to the left, then follow the link to the next highest free block.

23 Simple Idea (4): we can trivially also support worst-fit by adding a pointer that always refers to the biggest block. And this is where we put our wilderness block! We have no data on the fragmentation behaviour of worst-fit. If it turns out to be similar to best-fit, it would be preferable because we would have O(1) alloc and O(log_32((ub-lb)/32)) free.

24 [Figure: bitmap tree over the exact-fit table lb, lb+1, lb+2, lb+3, …, ub-2, ub-1, ub; labels: max, W.]

25 Overheads: dynamic per-block overhead. Depends on (ub-lb); can be very small. Example (total 32 bits per live block): a 16-bit signed int for the size and availability of the current block, plus a 16-bit signed int for the size and availability of the previous block. Could optimize for live-block overhead: 1 bit in the header, with free blocks also holding their size at the end of the block. But if blocks are 4-byte aligned and there is ANY overhead per block, we can’t do better than this! Free blocks additionally need to hold two pointers, so minimum block size = header + 2 pointers.
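The 32-bit example header can be sketched with Python's struct module. The encoding is an assumption made for illustration: the slide says only that each 16-bit signed int carries size and availability, so here the sign is taken to mean "free" when positive and "in use" when negative. The names pack_header and unpack_header are hypothetical.

```python
import struct

# Two little-endian signed 16-bit fields: 32 bits in total.
HEADER = struct.Struct("<hh")


def pack_header(cur_size, cur_free, prev_size, prev_free):
    """Encode sizes (1..32767) with the sign as the availability flag (assumed)."""
    cur = cur_size if cur_free else -cur_size
    prev = prev_size if prev_free else -prev_size
    return HEADER.pack(cur, prev)


def unpack_header(raw):
    cur, prev = HEADER.unpack(raw)
    return abs(cur), cur > 0, abs(prev), prev > 0


raw = pack_header(cur_size=12, cur_free=False, prev_size=300, prev_free=True)
print(len(raw))            # 4 bytes, i.e. 32 bits per block
print(unpack_header(raw))  # (12, False, 300, True)
```

Under this encoding the largest representable size is 32767 units, which is one reason the per-block overhead depends on (ub-lb).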

26 Static overheads: Code. A few registers (e.g. max). Data structures: bitmap tree, 133 words; table, (ub-lb) words. NOTE: if (ub-lb) equals the heap size, then the table is the size of the heap (the same overhead as semi-space)! So we don’t want to use this scheme for large size ranges; instead, use a hybrid.

27 Hybrid scheme. Most-used range of block sizes: use the bitmap tree and exact-fit bins as described. Bigger block sizes: these are all kept on the double-linked chain above the biggest exact-fit block. Can use fixed-width bins like Lea, together with a separate bitmap tree. We lose the worst-case property of the primary scheme.

28 RESULTS Re-run Johnstone and Wilson’s tests, using our allocator on their trace files

29 Test 1 [Chart series: memory required by gmalloc; memory required by new allocator; memory requested by the program.] Memory requirement halved! Roughly 5% fragmentation?

30 Test 2 [Chart series: memory required by gmalloc; memory required by new allocator; memory requested by the program.]

31 Test 3 [Chart series: memory required by gmalloc; memory required by new allocator; memory requested by the program.]

32 Test 4 [Chart series: memory required by gmalloc; memory required by new allocator; memory requested by the program.] Memory requirements consistently halved! Fragmentation consistently ~5% (?)

33 Status: Currently working with Symbian to conduct malloc-replacement trials using real smartphone applications.