Presentation is loading. Please wait.

Presentation is loading. Please wait.

Precise Garbage Collection for C

Similar presentations


Presentation on theme: "Precise Garbage Collection for C"— Presentation transcript:

1 Precise Garbage Collection for C
Jon Rafkind* Adam Wick+ John Regehr* Matthew Flatt* * University of Utah + Galois, Inc.

2 C Motivation Used to implement important programs Web servers
Operating systems Virtual machines Memory management is difficult to get right Not enough insights C – used for large programs Automatic memory management avoids classes of errors Popular gc is conservative collection

3 Conservative GC Scans for pointer values
Works well for weakly typed languages Objects are stored in memory as sequences of bytes Unobtrusive to the original program Conservative collection scans memory regions for pointer values Good for C because it is weakly typed, not known what the layout is of binary sections But because it scans memory it may end up keeping an object live incorrectly Works for many C programs

4 Liveness void * void * void * Registers

5 Long running programs have possibility of liveness errors
Boehm 02 void * void * void * Registers Use a skull for dead Why is the object dead but kept alive? Long running programs have possibility of liveness errors

6 Machine state outside the realm of C
Liveness inaccuracy Machine state outside the realm of C Registers Non-pointer values as pointers void * x 0xba324100 long y 0xbd100000

7 Liveness errors have non-zero chance of occurring
Mitigating liveness Atomic blocks (memory regions with no pointers) Custom tracers Liveness errors have non-zero chance of occurring

8 Memory leaks Heap size (mb) Iterations thread 1 thread 2 thread 3
void * void * void * Iterations Heap size (mb)

9 Precise GC void * void * void * Root

10 Liveness is apparent at the source level
Precise GC Liveness is apparent at the source level struct painter * p = ...; p->brush = ...; p->color = ...; p->brush = 0;

11 Precise GC ZSnes Emulator Linux
Successfully added a precise garbage collector to long running C programs PLT Scheme ZSnes Emulator Nano text editor Linux

12 Precise All live objects must be reachable through pointers starting from roots Run-time type of objects must match their static allocation type

13 Source to source transformation Precise for C Globals Stack
All live objects must be reachable through pointers starting from roots Source to source transformation Globals Stack Run-time type of objects must match their static allocation type Dont make it transparent Type information

14 Insight C programs provide enough type information to precisely distinguish pointers from non-pointers

15 Transformation Register global variables as roots
Dynamically register stack variables Rewrite allocations with run-time information Add traversal routines for structures

16 Precise for C Globals Stack Type information
All live objects must be reachable through pointers starting from roots Globals Stack Run-time type of objects must match their static allocation type Type information

17 Roots int * counter; int * maker(int * value){ int * x = malloc(10);
*x = *value; *counter += 1; return x; } Talk about the variables

18 Shadow stack int * counter; int * maker(int * value){
Henderson 02 int * counter; int * maker(int * value){ int * x = malloc(10); *x = *value; *counter += 1; return x; } void * frame[3]; frame[0] = POINTER; frame[1] = &x; frame[2] = &value; GC_set_stack(frame);

19 Shadow stack int * counter; int * maker(int * value){
Henderson 02 int * counter; int * maker(int * value){ int * x = GC_malloc(10); *x = *value; *counter += 1; return x; } void * frame[3]; frame[0] = POINTER; frame[1] = &x; frame[2] = &value; GC_set_stack(frame); GC_set_stack(last_stack);

20 Shadow stack Garbage collection Roots Globals Stack Frame Stack Frame

21 Gzip: 80% non-allocating
Stack optimizations Call graph analysis detects functions that do not need shadow stacks Potentially allocates Does not allocate Don't need shadow stacks Need shadow stacks Gzip: 80% non-allocating H264: 40% non-allocating

22 Precise for C Globals Stack Type information
All live objects must be reachable through pointers starting from roots Globals Stack Run-time type of objects must match their static allocation type Type information

23 Allocation rules var = malloc(<expr>);
Heuristic used to statically detect run-time type var = malloc(<expr>); sizeof(t) – allocate single t structure sizeof(t) * e – allocate array of t structures sizeof(t*) * e – allocate pointer array e – allocate atomic block

24 Transforming allocations
var = malloc(sizeof(struct cat)); sizeof(t) – allocate single t structure sizeof(t) * e – allocate array of t structures sizeof(t*) * e – allocate pointer array e – allocate atomic block var = GC_malloc(gc_tag_struct_cat, sizeof(struct cat));

25 Generate traversal functions
struct cat { int * paws; char * nose; }; void gc_struct_cat_mark(struct cat * c){ GC_mark(c->paws); GC_mark(c->nose); } void gc_struct_cat_repair(struct cat * c){ GC_repair(&c->paws); GC_repair(&c->nose); } Used during mark phase Generate traversal functions Updates references to moved objects

26 Result of transformation
The transformed source program will operate semantically equivalent to what it did before and is capable of supporting a moving collector

27 C Support Internal pointers Int * x = obj->x; Pointer arithmetic
char * p = ...; while (*p){ p++; }

28 Active variant is tracked
Unions Active variant is tracked union object { int value; void * info; }; union object o; o.value = 1; GC_autotag(&o, 0); o.info = get(); GC_autotag(&o, 1);

29 External libraries Libraries that store pointers
Annotate objects that should not be moved by the collector Memory allocated by libraries Safe to use because the GC will ignore it

30 Unsupported C Ruby Perl Pointer obfuscation Constructing pointers
int * p = malloc(20); p - = 10; p[10] = 2; int * x = (int*) 0x323429; GC cannot traverse the pointer Liveness issues Ruby Perl Pointers as integers Arrays of open sized arrays Explain code more long make(){ return malloc(); } struct foo{ char rest[0]; }; GC cannot find pointer GC doesn't know how large individual elements are

31 GUI that verifies allocation heuristic and traversal functions
Tool support GUI that verifies allocation heuristic and traversal functions Initial experiments were unclear, eventually learned the system was right

32 Two tools that can be used on C programs
Tool support Two tools that can be used on C programs ~2500 line Ocaml program based on CIL (Necula 2002) Capable of parsing the Linux kernel source Fully automatic

33 Precise vs Conservative memory runtime Source transformation overhead
Benchmarks Precise vs Conservative memory runtime Source transformation overhead

34 Precise is bounded Heap size (mb) Iterations Repeated executions
of drscheme Heap size (mb) Iterations

35 PLT Scheme Performance
Benchmark: CPStack What is cpstack Duration (ms)

36 PLT Scheme Performance
Benchmark: Takl Duration (ms)

37 GC overhead Spec 2k benchmarks Slowdown % Benchmark

38 Linux kernel Test the limits of precise for C
Longest running program in a system Harsh C environment

39 Linux kernel: Implementation
Transformed 74,975 lines of C code Ext3 and ipv4 85 manual changes GC is non-moving / non-generational Why dont use non-generational/non-moving Interface to non-transformed modules

40 Linux kernel: Stack Over 4k
Stack limit reached with additional shadow stack information Over 4k Explain each bullet more

41 Linux kernel: Memory regions
Separate memory regions kmalloc Physically contiguous vmalloc Virtually contiguous Explain each bullet more GC only returns physically contiguous memory

42 Linux kernel: heuristics
Allocation heuristic broken X = kmalloc(sizeof(...)) Y = kmem_cache_alloc(cache) Explain each bullet more Type of Y is cannot be discerned

43 Linux kernel: finalizers
RCU (Read-Copy-Update) is a finalizer mechanism in Linux CPU 1 CPU 3 CPU 2 CPU 4

44 Performance dbench filesystem benchmark GC every second GC Normal
How often gc versus iteration

45 Performance dd on disk filesystem dd on memory filesystem

46 Conclusions Precise garbage collection is a practical for C because most C programs embed enough information required to extract the required properties Long running programs benefit from precise garbage collection Say the insight Thank you

47 Precise Garbage Collection for C
Jon Rafkind Adam Wick John Regehr Matthew Flatt Get rid of pictures University of utah

48

49 Precise Garbage Collection for C
C is a weakly typed language Magpie Written a tool that makes C programs use precise garbage collection Applied to PLT Scheme, Linux kernel, benchmarks

50 Garbage Collection Techniques
Precise collection Conservative collection

51 Garbage Collection Techniques
Precise collection Conservative collection

52 Garbage Collection Techniques
Precise collection Conservative collection int void* void* void* void*

53 Linked lists Lists of threads and stacks create unreclaimable chains

54 Long running programs Web servers, OS's, etc can experience unstable memory usage with conservative collectors

55 Background Core of Drscheme written in C
10+ years of managing Boehm collector

56 Conservative collection
Certain classes of programs are susceptible to memory leaks

57 ZSNes emulator – Super Nintendo emulator
Other programs ZSNes emulator – Super Nintendo emulator Nano – text editor Unbounded memory with conservative, stable with precise

58 Main idea Source to source transformation of C code can embed information to let precise garbage collection work: Invariants: All pointers are known Run-time type of objects is the same as their static allocation type

59 Precise Registers are not scanned for pointers
All live pointers in the system are known stack Type information Object Shapes Roots

60 Source transformation to retain this information
Precise Source transformation to retain this information stack Type information Object Shapes Roots

61 Allocation rules var = malloc(<expr>);
sizeof(t) – allocate single t structure sizeof(t) * e – allocate array of t structures sizeof(t*) * e – allocate pointer array e – allocate atomic block

62 Copying hybrid-compacting collector
Traverse objects: Marking Updating pointers

63 Generate traversal functions
struct cat{ int * paws; char * nose; }; void gc_struct_cat_mark(struct cat * c){ GC_mark(c->paws); GC_mark(c->nose); } void gc_struct_cat_repair(struct cat * c){ GC_repair(&c->paws); GC_repair(&c->nose); } Generate traversal functions

64 Active variant is tracked
Unions Active variant is tracked union object{ int value; void * info; }; union object o; o.value = 1; GC_autotag(&o, 0); o.info = get(); GC_autotag(&o, 1);

65 Unsupported idioms Pointer obfuscation Pointers as integers
int * p = malloc(20); p - = 10; p[10] = 2; int make(){ return malloc(); } Constructing pointers Arrays of open sized arrays Int * x = (int*) 0x323429; struct foo{ char rest[0]; };

66 Mzscheme Performance Benchmark: CPStack Conservative GC Precise GC

67 Mzscheme Performance Benchmark: Takl Conservative precise

68 Spec Performance Factor Benchmark

69 Two tools that can be used on C programs
Tool support Two tools that can be used on C programs

70 Two tools that can be used on C programs
Tool support Two tools that can be used on C programs Based on Cil (Necula 2002) Parses C and GCC extensions Fully automatic

71 Too large to fully transform Subsystems: Ext3 IPV4
Linux Too large to fully transform Subsystems: Ext3 IPV4

72 Linux User Mode Linux stack is limited to 4k – GC shadow stack went over this limit Shadow stack

73 Linux Manual changes RCU (Read-Copy-Update) finalizers
Important structures allocated with vmalloc Custom allocators (kmem_cache_alloc)

74 Performance GC can cause high latencies

75 Performance GC overhead

76 Conclusions Precise garbage collection is a practical alternative for C Long running programs benefit from precise garbage collection

77 Mzscheme Performance CPStack Takl


Download ppt "Precise Garbage Collection for C"

Similar presentations


Ads by Google