The Interface Definition Language for Fail-Safe C Kohei Suenaga, Yutaka Oiwa, Eijiro Sumii, Akinori Yonezawa University of Tokyo
International Symposium on Software Security
In this presentation: We introduce the IDL for Fail-Safe C. With our IDL, we can easily generate wrappers for external functions and safely interface Fail-Safe C with them. Our approach can also be applied to other safe languages.
Background
Fail-Safe C: a safe implementation of C. Translates C sources to fail-safe ones. Inserts safety checks such as boundary checks. Ensures safety by focusing on the types of objects. Prevents programs from performing unsafe operations.
Problems of Fail-Safe C: cooperation with external functions. Data representation problem: Fail-Safe C uses its own data representation, so it cannot call external functions directly. Safety problem: many external functions require preconditions for safety.
Solution: prepare a wrapper for each function. The wrapper checks preconditions, converts representation, … We want to generate such wrappers automatically.
Approach: Interface Definition Language (IDL). Describe preconditions and behavior of external functions with the IDL. [Figure: Interface Definition → IDL processor → Wrappers]
[Figure: main calls wrapper(…), which calls the external memcpy(…). The wrapper checks preconditions and converts arguments before the call, and converts the return value after it.]
Outline of the Presentation: The safety that Fail-Safe C and our IDL guarantee. Internal data representation of Fail-Safe C. Wrappers’ behavior. Experiment. Related work. Future work.
The Safety Fail-Safe C Guarantees: If a program attempts to perform an operation whose behavior is undefined, Fail-Safe C aborts the program before the operation is performed. Not fully formal, but sufficient for our aim.
The Safety Fail-Safe C Guarantees: void strcpy(char *s1, char *s2) { while (*s1++ = *s2++) ; } Usual C compiler: an out-of-bound access may occur here, and the result of an out-of-bound access is undefined.
The Safety Fail-Safe C Guarantees: void strcpy(char *s1, char *s2) { while (*s1++ = *s2++) ; } Fail-Safe C compiler: when the program attempts to perform an out-of-bound access, the program is aborted.
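The abort-before-access behavior can be pictured with a hand-written check. The following is only a minimal sketch: `bounded_buf`, `checked_at`, and `checked_strcpy` are hypothetical names introduced here, and the code Fail-Safe C actually generates is not shown in these slides.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical bounded buffer: a base address plus the block size.
   This only illustrates the idea of checking before accessing. */
typedef struct {
    char  *base;
    size_t size;
} bounded_buf;

/* Return the address of element i, aborting before any
   out-of-bound access could take place. */
static char *checked_at(bounded_buf b, size_t i) {
    if (i >= b.size) {
        fprintf(stderr, "out-of-bound access at offset %zu\n", i);
        abort();
    }
    return &b.base[i];
}

/* strcpy with a bounds check in front of every read and write. */
static void checked_strcpy(bounded_buf dst, bounded_buf src) {
    size_t i = 0;
    for (;;) {
        char c = *checked_at(src, i);
        *checked_at(dst, i) = c;
        if (c == '\0')
            break;
        i++;
    }
}
```

If the destination is too small for the source string, `checked_at` aborts at the first offset past the end instead of letting the write happen.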
The Safety Fail-Safe C Guarantees (again): If a program attempts to perform an operation whose behavior is undefined, Fail-Safe C aborts the program before the operation is performed. Not fully formal, but sufficient for our aim.
The Safety Our IDL Guarantees: Two assumptions. (1) Fail-Safe C does not contain bugs, so the safety of Fail-Safe C holds just before wrappers are called. (2) Interface definitions correctly reflect the implementation of external functions. If these two assumptions hold, our IDL guarantees that Fail-Safe C’s safety holds after external functions return.
Internal Data Representation of Fail-Safe C: Fail-Safe C differs from usual C in the representation of memory blocks, pointers, and integers.
Internal Data Representation of Fail-Safe C: Every memory block has a header. [Figure: a memory block consisting of a header (TypeInfo, size) followed by Data]
Internal Data Representation of Fail-Safe C: Every pointer is represented in 2 words. [Figure: a pointer as (base, offset) referring to a memory block with TypeInfo, size, and Data]
Internal Data Representation of Fail-Safe C: Every integer is also represented in 2 words (pointers may be cast to integers). [Figure: a cast from a pointer (base, offset) to an integer]
Summary of Data Representation: Pointers are represented in 2 words. Integers are represented in 2 words. Memory blocks have metadata, and their contents have Fail-Safe C’s representation. Wrappers have to convert the representation of arguments.
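The representation summarized above might be modeled roughly as follows. This is a sketch under assumptions: the field order, the type names, the convention that a plain integer has base 0, and the rule that the 1-word value is base plus offset are illustrative choices made here, not details confirmed by the slides.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header placed in front of every memory block. */
typedef struct {
    const void *type_info;  /* type information for the block */
    size_t      size;       /* size of the data area in bytes */
} block_header;

/* Pointers and integers share one 2-word layout, so a pointer can be
   cast to an integer without losing its base. */
typedef struct {
    uintptr_t base;    /* address of the block's data area (assumed 0 for a plain integer) */
    uintptr_t offset;  /* offset within the block, or the integer value */
} fat_value;

typedef fat_value fat_ptr;
typedef fat_value fat_int;

/* Encode a 1-word integer in the 2-word representation. */
static fat_int fat_int_from_word(uintptr_t v) {
    fat_int r = { 0, v };
    return r;
}

/* Casting a pointer to an integer keeps both words. */
static fat_int ptr_to_int(fat_ptr p) {
    return p;
}

/* Decode to the 1-word value an external C function expects
   (assumption: the value is base + offset). */
static uintptr_t fat_to_word(fat_value v) {
    return v.base + v.offset;
}
```

Under these assumptions, decoding an integer obtained from a pointer yields the same address an ordinary C compiler would produce for the cast.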
Behavior of the Wrapper of memcpy: char *memcpy(char *s1, char *s2, int n);
Behavior of the Wrapper of memcpy: char *memcpy(char *s1, char *s2, int n); Preconditions: 1. n > 0 2. s1 != NULL, s2 != NULL 3. The first n bytes of each memory block have to be accessible 4. The n bytes from s1 and the n bytes from s2 must not overlap
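The preconditions listed for memcpy can be written as a single check. This is a minimal sketch: `checked_ptr` and the helper names are hypothetical, and it assumes the wrapper knows the base, size, and current offset of each block.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of a pointer argument as seen by a wrapper. */
typedef struct {
    char  *base;    /* start of the memory block, NULL for a null pointer */
    size_t size;    /* size of the memory block in bytes */
    size_t offset;  /* current position within the block */
} checked_ptr;

/* Non-null, and the n bytes starting at the offset lie inside the block. */
static bool can_access(checked_ptr p, size_t n) {
    return p.base != NULL && p.offset <= p.size && n <= p.size - p.offset;
}

/* The two n-byte ranges do not overlap. */
static bool no_overlap(checked_ptr a, checked_ptr b, size_t n) {
    uintptr_t pa = (uintptr_t)a.base + a.offset;
    uintptr_t pb = (uintptr_t)b.base + b.offset;
    return pa + n <= pb || pb + n <= pa;
}

/* All preconditions from the slide, checked in order. */
static bool memcpy_preconditions_hold(checked_ptr s1, checked_ptr s2, int n) {
    if (n <= 0)
        return false;
    size_t len = (size_t)n;
    return can_access(s1, len) && can_access(s2, len) && no_overlap(s1, s2, len);
}
```

A wrapper would abort the program when this function returns false, before the external memcpy is ever called.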
Behavior of the Wrapper of memcpy: char *memcpy(char *s1, char *s2, int n); Converting representation (before the call), 2-word repr. → 1-word repr.: 1. Allocates a new memory block in C’s image 2. Copies the contents to it, converting the representation
Behavior of the Wrapper of memcpy: char *memcpy(char *s1, char *s2, int n); Converting representation (after the call returns): 1. Writes back updates to the memory block 2. Deallocates the newly allocated memory block. Encodes the return value, maintaining its distance from s1.
Behavior of Wrappers: 1. Precondition checking 2. Decoding and allocation: integers are converted to the 1-word representation; for each pointer-type argument, a memory block is allocated and the contents are copied to it 3. Call: safe if the assumptions stated earlier hold
Behavior of Wrappers: 4. Encoding and deallocation: converts the return value to Fail-Safe C’s representation; reflects updates of passed memory blocks; reflects updates of global variables (Fail-Safe C allocates two regions for each global variable); deallocates memory blocks allocated in the wrapper
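The four phases can be followed in a small hand-written wrapper. This is a sketch only: `incr_all` is a hypothetical stand-in for an external function, `fat_int` and `fat_block` are illustrative types, and the wrappers produced by the IDL processor are not shown in these slides.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical 2-word integer, as in Fail-Safe C's representation. */
typedef struct {
    uintptr_t base;
    uintptr_t offset;
} fat_int;

/* Hypothetical memory block holding integers in the 2-word representation. */
typedef struct {
    fat_int *data;
    size_t   count;  /* number of elements */
} fat_block;

/* Stand-in for an external function compiled by a usual C compiler. */
static void incr_all(long *a, size_t n) {
    for (size_t i = 0; i < n; i++)
        a[i] += 1;
}

static void wrapper_incr_all(fat_block blk, size_t offset, size_t n) {
    /* 1. Precondition checking: abort before anything unsafe happens. */
    if (blk.data == NULL || n == 0 || offset > blk.count || n > blk.count - offset)
        abort();

    /* 2. Decoding and allocation: build a block in usual C's image. */
    long *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        abort();
    for (size_t i = 0; i < n; i++)
        tmp[i] = (long)(blk.data[offset + i].base + blk.data[offset + i].offset);

    /* 3. Call the external function on the 1-word copy. */
    incr_all(tmp, n);

    /* 4. Encoding and deallocation: write updates back, then free the copy.
       The results are stored as plain integers (base 0) in this sketch. */
    for (size_t i = 0; i < n; i++) {
        blk.data[offset + i].base = 0;
        blk.data[offset + i].offset = (uintptr_t)tmp[i];
    }
    free(tmp);
}
```

The temporary block is what makes phase 2 expensive: every call allocates and copies, even when the external function touches little of the data.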
An Example of Interface Definition: Add supplemental information to C’s declaration in the form of attributes.
  [points_in(s1)]
  char *memcpy([never_null, can_access(0, n-1), write(true, 0, n-1)] char *s1,
               [never_null, can_access(0, n-1)] const char *s2,
               int n)
  [precond(n > 0), no_overlap(s1, 0, n-1, s2, 0, n-1)];
Experiments: Measured overhead with four micro-benchmarks. In each benchmark, we compared programs compiled with Fail-Safe C that call external functions through wrappers against programs compiled with gcc that call the external functions directly. Environment: UltraSPARC-II 400 MHz CPU, 13.0 GB RAM
Benchmark 1: succ. Takes one integer, adds 1 to it, and returns it. Measured the time spent calling this function 10^7 times as an external function. Gives information about the overhead of converting integers.
Benchmark 2: arraysucc. Takes an array of 10^7 characters. Measured the time spent calling this function once as an external function. Gives information about the overhead of converting pointer-type arguments.
Benchmark 3: cp. File-copying program that uses the open, read, and write system calls. Measured the time spent copying a 100K-byte file. Gives information about the overhead in fairly practical programs.
Benchmark 4: echo. A simple echo server that uses the socket, bind, listen, accept, recv, and send system calls (with a 1K-byte buffer). Measured the time spent sending and receiving 100K bytes of data between two machines connected with 100BASE-T. Gives information about the overhead in the presence of network delay.
Overall Result: The overhead of memory allocation is large. [Table: execution time (msec); columns succ, arraysucc, cp, echo; rows With wrappers, Without wrappers, Overhead (%); values not preserved]
Breakdown of arraysucc: Time spent in each phase of arraysucc. [Table: columns Pre, Decode, Call, Encode, Total; rows Time (msec), Proportion (%); values not preserved] More than half of the wrapper’s execution time is spent in Decoding and Allocation.
Breakdown of cp: Execution time of each phase of the read and write wrappers. [Table: columns Pre, Decode, Call, Encode, Total; rows read, write; values not preserved] The Call phase takes most of the time because of file access. Within the wrappers’ own execution time, however, the Decoding and Allocation phase takes a large share.
Overall Result (again): [Table: execution time (msec); columns succ, arraysucc, cp, echo; rows With wrappers, Without wrappers, Overhead (%); values not preserved] The overhead of echo is very small.
Breakdown of echo: Time spent in each phase of the recv and send system calls. [Table: columns Pre/Decode, Call, Encode, Total; rows recv, send; only the value 1405 in the send row is preserved] In the presence of network delay, the wrappers’ overhead is relatively small.
Discussion: The overhead of Decoding and Allocation is dominant. To reduce this overhead: omit copying the contents of a memory block if the block is not read; omit allocating a new memory block if the block has the same image as usual C’s.
Related Work: CamlIDL [Leroy 01], H/Direct [Finne et al. 98]. IDLs for OCaml and Haskell. The syntax of our IDL is based on CamlIDL. They pay less attention to safety: an external function call is not always safe if preconditions do not hold.
Related Work: CCured [Necula et al. 02, Condit et al. 03]. Analyses pointer usage statically and removes unnecessary safety checks. Two ways to call external functions: (1) Tell the compiler which functions are external. This cannot check preconditions and cannot deal with memory blocks allocated in external functions. (2) Provide a wrapper for each external function. The wrappers have to be written manually.
Future Work: Implementing optimizations. Applying our approach to existing programs, as soon as the implementation of Fail-Safe C is completed. Target: sendmail
Conclusion: We designed an IDL for Fail-Safe C to call external functions safely.
Fin
Internal Data Representation of Fail-Safe C: TypeInfo contains the type name and handler methods. [Figure: a memory block with size and Data, whose TypeInfo holds the type name, read_word(), and write_word()] Handler methods: functions that access memory according to the memory block’s type.
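Handler methods can be pictured as function pointers stored with the type information. The sketch below is illustrative only: the slide names `read_word()` and `write_word()`, but the signatures, the `block` struct, and the `raw_words` type are assumptions made here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type information: a type name plus handler methods. */
typedef struct {
    const char *type_name;
    uintptr_t (*read_word)(const void *data, size_t offset);
    void      (*write_word)(void *data, size_t offset, uintptr_t value);
} type_info;

/* Hypothetical memory block: metadata followed by a data area. */
typedef struct {
    const type_info *info;
    size_t           size;  /* size of the data area in bytes */
    void            *data;
} block;

/* Handlers for a block whose contents are plain machine words. */
static uintptr_t raw_read_word(const void *data, size_t offset) {
    uintptr_t w;
    memcpy(&w, (const char *)data + offset, sizeof w);
    return w;
}

static void raw_write_word(void *data, size_t offset, uintptr_t value) {
    memcpy((char *)data + offset, &value, sizeof value);
}

static const type_info raw_words = { "raw_words", raw_read_word, raw_write_word };

/* Abort unless a whole word fits at the given byte offset. */
static void check_word_access(const block *b, size_t offset) {
    if (b->size < sizeof(uintptr_t) || offset > b->size - sizeof(uintptr_t))
        abort();
}

/* Accesses dispatch through the block's own handlers. */
static uintptr_t block_read_word(const block *b, size_t offset) {
    check_word_access(b, offset);
    return b->info->read_word(b->data, offset);
}

static void block_write_word(block *b, size_t offset, uintptr_t value) {
    check_word_access(b, offset);
    b->info->write_word(b->data, offset, value);
}
```

Because the caller goes through `info`, a block of a different type could supply handlers that convert between representations without changing the calling code.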
Why don’t we need to check postconditions?