1
May 9, 2001, OSQ Retreat
Run-Time Type Checking for Pointers and Arrays in C
Wes Weimer, George Necula, Scott McPeak, S.P. Rahul, Raymond To
2
What are we doing?
- Add run-time checks to C programs
- Catch pointer and array errors
- Minimal user effort
  - More effort yields more performance
- Make C "feel" as safe as Java
3
Motivation
- 50% of software errors are due to pointers
- 50% of security errors are due to buffer overruns
- Such errors are often hard to reproduce
- It is difficult to locate the true source of errors
4
Overview
- Motivation and System Goals
- Checkable Errors
- Run-Time Representation
- Static Analysis
- Preliminary Results
- Future Work
5
Goals
- Support existing C code
  - Compatibility with external libraries
  - Handle GCC/MSVC source, Makefiles
- Efficiency: 50% overhead rather than 1000%
  - Research: 5x, Purify 10x, BoundsChecker 150x
- Default: many checks
  - Reduce by static analysis and/or user annotations
6
Checkable Errors
- Array and pointer bounds checks
  - Well-understood
- Dereferencing a non-pointer (or NULL)
  - Complicated by casts and unions
- Pointer arithmetic outside of object bounds
  - Not always caught by Purify, etc.
- Freeing non-pointers, using freed memory
7
Required Information
- Checks require information about pointers
  - Length, base, capabilities, etc.
- Can be stored in a global table
  - High table-lookup overhead: 500%
- Can be stored with each pointer
    struct { Foo *p; Foo *base; Foo *end; } SafeFoo
  - Library compatibility is tricky
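The "stored with each pointer" option can be sketched as a small C type with a checked indexing helper. This is only an illustration under stated assumptions: the `Foo` payload, the `typedef` form of `SafeFoo`, and the helper names `safefoo_make` and `safefoo_index` are hypothetical and do not come from the talk.

```c
#include <stddef.h>

/* Hypothetical element type; the slide only names it Foo. */
typedef struct { int value; } Foo;

/* A pointer that carries the bounds of the object it points into. */
typedef struct {
    Foo *p;     /* current pointer, kept within [base, end] */
    Foo *base;  /* first valid element */
    Foo *end;   /* one past the last valid element */
} SafeFoo;

/* Build a bounded pointer over an array of n elements. */
static SafeFoo safefoo_make(Foo *array, size_t n) {
    SafeFoo s = { array, array, array + n };
    return s;
}

/* Checked equivalent of &s.p[i]: returns NULL instead of an
   out-of-bounds address, so no invalid pointer is ever formed. */
static Foo *safefoo_index(SafeFoo s, ptrdiff_t i) {
    if (s.p == NULL) return NULL;
    ptrdiff_t lo = s.base - s.p;  /* most negative legal offset */
    ptrdiff_t hi = s.end - s.p;   /* one past the largest legal offset */
    if (i < lo || i >= hi) return NULL;
    return s.p + i;
}
```

Because `SafeFoo` is three words wide instead of one, it cannot be handed directly to a library that expects a plain `Foo *`, which is one way to read the "library compatibility is tricky" remark.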
8
More is Needed: Tags
- Must keep track of which locations are valid pointers
- Use per-object tags (like in GC)
    int **X; int *Y;
    *Y = 55;             // OK
    X = Y;
    printf("%d", **X);   // CRASH!
9
Run-Time Representation
- Associate with each object in memory:
  - Base (lower bound), End (upper bound)
  - Tags (bitfield, 1 bit per word: is it a valid pointer?)
- Check bounds on every access, check tags on pointer reads, set tags on every write
- Example: struct { int x; int *y; } *p;
[Diagram: layout of the object p points to, showing base, fields x and y, end, and tag bits 0 and 1]
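A minimal sketch of the tag discipline, assuming a fixed-size object of eight words so that one tag byte is enough. The `TaggedObject` type and the `obj_*` helpers are hypothetical names for illustration, not the system's actual run-time library.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define OBJ_WORDS 8  /* hypothetical object size in words; 8 so the tags fit in one byte */

typedef struct {
    uintptr_t words[OBJ_WORDS]; /* payload; base = &words[0], end = &words[OBJ_WORDS] */
    uint8_t   tags;             /* bit i is 1 when words[i] holds a valid pointer */
} TaggedObject;

/* Scalar write: bounds check, store, clear the tag for that word. */
static bool obj_write_scalar(TaggedObject *o, size_t i, uintptr_t v) {
    if (i >= OBJ_WORDS) return false;
    o->words[i] = v;
    o->tags &= (uint8_t)~(1u << i);
    return true;
}

/* Pointer write: bounds check, store, set the tag for that word. */
static bool obj_write_ptr(TaggedObject *o, size_t i, void *p) {
    if (i >= OBJ_WORDS) return false;
    o->words[i] = (uintptr_t)p;
    o->tags |= (uint8_t)(1u << i);
    return true;
}

/* Pointer read: bounds check, then refuse words not tagged as pointers. */
static bool obj_read_ptr(const TaggedObject *o, size_t i, void **out) {
    if (i >= OBJ_WORDS) return false;
    if (!(o->tags & (1u << i))) return false;
    *out = (void *)o->words[i];
    return true;
}
```

With this scheme, storing the integer 55 into a word and later reading that word as a pointer is rejected by the tag check rather than crashing on dereference.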
10
Kinds of Pointers
- Many pointers only move forward (no casts)
  - Notably C strings: for (; *p; p++) if (*p == 'c') …
  - Such "forward" pointers need only an end bound
- Many pointers are not involved in evil casts
  - But may use pointer arithmetic: arrays
  - Such "index" pointers need not carry tags
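The string loop on this slide can be sketched with a "forward" pointer that carries only an end bound. The `FwdPtr` type and `count_char` function are hypothetical names chosen for this example.

```c
#include <stddef.h>

/* A forward-only pointer: it never moves backwards, so no base is needed. */
typedef struct {
    const char *p;    /* current position */
    const char *end;  /* one past the last byte of the buffer */
} FwdPtr;

/* Count occurrences of c, stopping at the terminator or at the end bound,
   whichever comes first. */
static size_t count_char(FwdPtr s, char c) {
    size_t n = 0;
    for (; s.p < s.end && *s.p; s.p++) {
        if (*s.p == c) n++;
    }
    return n;
}
```

The `s.p < s.end` test is the only added check; it keeps the loop inside the buffer even when the terminating NUL is missing.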
11
Kinds of Pointers
- Many pointers are completely "safe"
  - No evil casts, no arithmetic, etc.
  - e.g., FILE *fin = fopen("input", "r");
- These can be represented without any extra information (just a NULL check when used)
- These cases yield better performance!
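For a "safe" pointer the only check left is the NULL test, which might look like the following sketch. The helper name `safe_read_int` is hypothetical; a real tool would presumably insert the test inline at each use.

```c
#include <stdbool.h>
#include <stddef.h>

/* A "safe" pointer is an ordinary one-word C pointer.
   The only run-time check before a read is the NULL test. */
static bool safe_read_int(const int *p, int *out) {
    if (p == NULL) return false;  /* the single remaining check */
    *out = *p;
    return true;
}
```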
12
Physical Subtyping
- Define a formal notion of representation equality and subtyping for casts
- Keep pointers and scalars separate!
- Intuition:
    struct {char a[4];} = struct {int x;}
    struct {char a[4];} ≠ struct {int *x;}
    struct {int a; int b;} ≤ struct {int a;}
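The width-subtyping intuition (a struct with extra trailing fields can be viewed as its prefix) can be checked in plain C11. This sketch assumes named struct types `A` and `AB` and a helper `prefix_of`, none of which appear in the talk.

```c
#include <stddef.h>
#include <string.h>

struct A  { int a; };
struct AB { int a; int b; };

/* AB begins with everything A has, at the same offset and at least the same width. */
_Static_assert(offsetof(struct AB, a) == offsetof(struct A, a),
               "prefix field must line up");
_Static_assert(sizeof(struct AB) >= sizeof(struct A),
               "subtype must be at least as wide");

/* View the A-shaped prefix of an AB. memcpy is used so the example does not
   depend on aliasing rules for casts between unrelated struct types. */
static struct A prefix_of(const struct AB *x) {
    struct A out;
    memcpy(&out, x, sizeof out);
    return out;
}
```

The reverse direction has no such guarantee: an `A` is narrower than an `AB`, so reading field `b` through it would run past the end of the object.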
13
Extended Type System
Simplified C types:
  $\tau ::= \mathtt{int} \mid \tau\ \mathtt{ref}\ q \mid \tau_1 \times \tau_2 \mid \tau_1 + \tau_2$   (types)
  $q ::= \mathtt{safe} \mid \mathtt{string} \mid \mathtt{seq} \mid \mathtt{wild}$   (qualifiers)
- safe = one word: standard C pointer
- seq = three words: pointer, base, end
- wild = two words: pointer, base (end and tags are kept with the object)
14
Type System (continued)
- In $\tau\ \mathtt{ref}\ \mathtt{wild}$, $\tau$ must contain only wild pointers
- May cast between safe and seq
- wild may only be cast to or from wild
- Physical equivalence:
  - $\mathtt{short} \times \mathtt{short} = \mathtt{int}$
  - $a \times (b + c) = (a \times b) + (a \times c)$
- Width subtyping: $\tau_1 \times \tau_2 \le \tau_1$
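The qualifier part of these cast rules can be written as a small predicate. This is a sketch that looks only at the qualifiers and ignores the constraints on the pointed-to types; the `Qualifier` enum and `qualifier_cast_allowed` are hypothetical names, and the `string` qualifier is left out because the slides do not state its cast rules.

```c
#include <stdbool.h>

typedef enum { Q_SAFE, Q_SEQ, Q_WILD } Qualifier;

/* wild may only be cast to or from wild; safe and seq may be cast to each other. */
static bool qualifier_cast_allowed(Qualifier from, Qualifier to) {
    if (from == Q_WILD || to == Q_WILD) {
        return from == to;
    }
    return true;
}
```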
15
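Some Typing Rules
Address of field:
\[
\frac{\Gamma \vdash \&e : (\tau_1 \times \tau_2)\ \mathtt{ref}\ q_1 \qquad q_2 = (\text{if } q_1 = \mathtt{wild} \text{ then } \mathtt{wild} \text{ else } \mathtt{safe})}
     {\Gamma \vdash \&e.L : \tau_1\ \mathtt{ref}\ q_2}
\]
Pointer arithmetic:
\[
\frac{\Gamma \vdash e_1 : \tau\ \mathtt{ref}\ q \qquad q \neq \mathtt{safe} \qquad \Gamma \vdash e_2 : \mathtt{int}}
     {\Gamma \vdash e_1 + e_2 : \tau\ \mathtt{ref}\ q}
\]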
16
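Handling Casts
Cast between pointers:
\[
\frac{\Gamma \vdash e : \tau_1\ \mathtt{ref}\ q_1 \qquad \tau_1\ \mathtt{ref}\ q_1 \le \tau_2\ \mathtt{ref}\ q_2}
     {\Gamma \vdash (\tau_2\ \mathtt{ref}\ q_2)\,e : \tau_2\ \mathtt{ref}\ q_2}
\]
When is $\tau_1\ \mathtt{ref}\ q_1 \le \tau_2\ \mathtt{ref}\ q_2$?
| Initial $q_1$ | Final $q_2$ | Constraint |
|---|---|---|
| safe | safe | $\tau_1 = \tau_2$ |
| seq | safe | $\exists k.\ \tau_1[k] \le \tau_2$ |
| seq | seq | $\exists j,k.\ \tau_1[j] = \tau_2[k]$ |
| wild | wild | None |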
17
Static Analysis & Inference
- For every pointer in the program
  - Try to infer the fastest safe representation
- This is like eliminating classes of run-time checks we know will never fail
- Can be formulated as constraint-solving
  - Apply subtyping rules to casts to get constraints
  - O(E) where E is the number of casts/assignments (flow insensitive)
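One way to picture the constraint-solving step is a flow-insensitive pass over the casts and assignments. The sketch below is a simplification under assumptions, not the talk's algorithm: pointer variables are numbered, each `Edge` is one cast or assignment, `bad_cast` marks a cast that fails the subtyping rules, and a plain fixed-point loop stands in for the O(E) solver the slide mentions. All names here are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Ordered from cheapest to most expensive representation. */
typedef enum { SAFE = 0, SEQ = 1, WILD = 2 } Kind;

/* One cast or assignment between two pointer variables (by index). */
typedef struct {
    int  from;
    int  to;
    bool bad_cast;  /* true if the cast violates the subtyping rules */
} Edge;

/* kinds[i] must start as SAFE, or SEQ if pointer i is used in arithmetic.
   Every edge index must be a valid index into kinds. */
static void infer_kinds(Kind *kinds, const Edge *edges, size_t nedges) {
    /* Step 1: a bad cast forces both of its endpoints to WILD. */
    for (size_t i = 0; i < nedges; i++) {
        if (edges[i].bad_cast) {
            kinds[edges[i].from] = WILD;
            kinds[edges[i].to] = WILD;
        }
    }
    /* Step 2: wild may only flow to or from wild, so spread WILD along
       every edge until nothing changes. */
    bool changed = true;
    while (changed) {
        changed = false;
        for (size_t i = 0; i < nedges; i++) {
            Kind *a = &kinds[edges[i].from];
            Kind *b = &kinds[edges[i].to];
            if ((*a == WILD) != (*b == WILD)) {
                *a = WILD;
                *b = WILD;
                changed = true;
            }
        }
    }
}
```

Pointers that are never connected to a bad cast keep their initial SAFE or SEQ kind, which is what lets the corresponding run-time checks be dropped.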
18
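Preliminary Results
| Benchmark | Default Overhead | Reduced Overhead | Check Overhead | Overhead (GC) | Reduced Overhead (GC) |
|---|---|---|---|---|---|
| hashtest | 218% | 100% | 3% | 222% | 221% |
| rbtest | 128% | 2% | 1% | 138% | 4% |
Rows with fewer values than columns, listed in source order (column positions are not recoverable from the transcript):
- compress*: 36%, 0%, 37%
- barnes_hut: 108%, 37%, 109%
- mod_layout: 0%, N/A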
19
Future Work
- Encode type information at run-time
  - More expressive casts with low overhead
- More complete handling of function pointers
- Handle C polymorphism
  - Uses void*, requires vast overhead
- Efficient memory management
  - GC (or something else) takes free() as a hint
20
Conclusion
- Can add efficient run-time checks to C
  - Check bounds, valid pointers, frees, etc.
- Static analysis is fast and useful
- Can support existing C code
  - Whole programs are considered
  - Safe pointers and wrappers for libraries
- Default to many checks, infer them away