CSSV: Towards a Realistic Tool for Statically Detecting All Buffer Overflows in C Nurit Dor, Michael Rodeh, Mooly Sagiv PLDI’2003 DAEDALUS project.

Slides:



Advertisements
Similar presentations
Dataflow Analysis for Datarace-Free Programs (ESOP 11) Arnab De Joint work with Deepak DSouza and Rupesh Nasre Indian Institute of Science, Bangalore.
Advertisements

Modular and Verified Automatic Program Repair Francesco Logozzo, Thomas Ball RiSE - Microsoft Research Redmond.
Automated Verification with HIP and SLEEK Asankhaya Sharma.
Interprocedural Shape Analysis for Recursive Programs Noam Rinetzky Mooly Sagiv.
Abstraction and Modular Reasoning for the Verification of Software Corina Pasareanu NASA Ames Research Center.
1 Symbolic Execution for Model Checking and Testing Corina Păsăreanu (Kestrel) Joint work with Sarfraz Khurshid (MIT) and Willem Visser (RIACS)
1 Lecture 05 – Pointer Analyses & CSSV Eran Yahav.
CSSV: Towards a Realistic Tool for Statically Detecting All Buffer Overflows in C Nurit Dor (TAU), Michael Rodeh (IBM Research Haifa), Mooly Sagiv (TAU)
Next Section: Pointer Analysis Outline: –What is pointer analysis –Intraprocedural pointer analysis –Interprocedural pointer analysis (Wilson & Lam) –Unification.
Program analysis Mooly Sagiv html://
Purity Analysis : Abstract Interpretation Formulation Ravichandhran Madhavan, G. Ramalingam, Kapil Vaswani Microsoft Research, India.
4/30/2008Prof. Hilfinger CS 164 Lecture 391 Language Security Lecture 39 (from notes by G. Necula)
An Overview on Static Program Analysis Mooly Sagiv Tel Aviv University Textbook: Principles of.
Program analysis Mooly Sagiv html://
Assume/Guarantee Reasoning using Abstract Interpretation Nurit Dor Tom Reps Greta Yorsh Mooly Sagiv.
Overview of program analysis Mooly Sagiv html://
Detecting Memory Errors using Compile Time Techniques Nurit Dor Mooly Sagiv Tel-Aviv University.
Overview of program analysis Mooly Sagiv html://
Symbolic Path Simulation in Path-Sensitive Dataflow Analysis Hari Hampapuram Jason Yue Yang Manuvir Das Center for Software Excellence (CSE) Microsoft.
Statically Detecting Likely Buffer Overflow Vulnerabilities David Larochelle David Evans University of Virginia Department of Computer Science Supported.
An Overview on Static Program Analysis Mooly Sagiv.
Alleviating False Alarm Problem of Static Buffer Overflow Analysis Youil Kim
Security Exploiting Overflows. Introduction r See the following link for more info: operating-systems-and-applications-in-
CUTE: A Concolic Unit Testing Engine for C Technical Report Koushik SenDarko MarinovGul Agha University of Illinois Urbana-Champaign.
Chapter 6 Buffer Overflow. Buffer Overflow occurs when the program overwrites data outside the bounds of allocated memory It was one of the first exploited.
Introduction CS 3358 Data Structures. What is Computer Science? Computer Science is the study of algorithms, including their  Formal and mathematical.
1 Testing, Abstraction, Theorem Proving: Better Together! Greta Yorsh joint work with Thomas Ball and Mooly Sagiv.
CSC-682 Cryptography & Computer Security Sound and Precise Analysis of Web Applications for Injection Vulnerabilities Pompi Rotaru Based on an article.
Slide 1 Vitaly Shmatikov CS 380S Static Defenses against Memory Corruption.
Computer Science Detecting Memory Access Errors via Illegal Write Monitoring Ongoing Research by Emre Can Sezer.
Aditya V. Nori, Sriram K. Rajamani Microsoft Research India.
Inferring Specifications to Detect Errors in Code Mana Taghdiri Presented by: Robert Seater MIT Computer Science & AI Lab.
Extended Static Checking for Java  ESC/Java finds common errors in Java programs: null dereferences, array index bounds errors, type cast errors, race.
Advanced Computer Architecture Lab University of Michigan USENIX Security ’03 Slide 1 High Coverage Detection of Input-Related Security Faults Eric Larson.
David Evans These slides: Introduction to Static Analysis.
CS Midterm Study Guide Fall General topics Definitions and rules Technical names of things Syntax of C++ constructs Meaning of C++ constructs.
Symbolically Computing Most-Precise Abstract Operations for Shape Analysis Greta Yorsh Thomas Reps Mooly Sagiv Tel Aviv University University of Wisconsin.
Overflow Examples 01/13/2012. ACKNOWLEDGEMENTS These slides where compiled from the Malware and Software Vulnerabilities class taught by Dr Cliff Zou.
Page 1 5/2/2007  Kestrel Technology LLC A Tutorial on Abstract Interpretation as the Theoretical Foundation of CodeHawk  Arnaud Venet Kestrel Technology.
Static Analysis of Memory Errors Mooly Sagiv Tel Aviv University.
1 Splint: A Static Memory Leakage tool Presented By: Krishna Balasubramanian.
Lecture 6 C++ Programming Arne Kutzner Hanyang University / Seoul Korea.
A Tool for Pro-active Defense Against the Buffer Overrun Attack D. Bruschi, E. Rosti, R. Banfi Presented By: Warshavsky Alex.
CSV 889: Concurrent Software Verification Subodh Sharma Indian Institute of Technology Delhi Scalable Symbolic Execution: KLEE.
Protecting C Programs from Attacks via Invalid Pointer Dereferences Suan Hsi Yong, Susan Horwitz University of Wisconsin – Madison.
CSSV – C String Static Verifier Nurit Dor Michael Rodeh Mooly Sagiv Greta Yorsh Tel-Aviv University
CUTE: A Concolic Unit Testing Engine for C Koushik SenDarko MarinovGul Agha University of Illinois Urbana-Champaign.
Slides by Kent Seamons and Tim van der Horst Last Updated: Nov 11, 2011.
/ PSWLAB Evidence-Based Analysis and Inferring Preconditions for Bug Detection By D. Brand, M. Buss, V. C. Sreedhar published in ICSM 2007.
MOPS: an Infrastructure for Examining Security Properties of Software Authors Hao Chen and David Wagner Appears in ACM Conference on Computer and Communications.
Chapter 4 Static Analysis. Summary (1) Building a model of the program:  Lexical analysis  Parsing  Abstract syntax  Semantic Analysis  Tracking.
Putting Static Analysis to Work for Verification A Case Study Tal Lev-Ami Thomas Reps Mooly Sagiv Reinhard Wilhelm.
University of Virginia Computer Science Extensible Lightweight Static Checking David Evans On the I/O.
Language-Based Security: Overview of Types Deepak Garg Foundations of Security and Privacy October 27, 2009.
Secure Coding Rules for C++ Copyright © 2016 Curt Hill
Types for Programs and Proofs
C Basics.
Secure Coding Rules for C++ Copyright © Curt Hill
Symbolic Implementation of the Best Transformer
High Coverage Detection of Input-Related Security Faults
CS 465 Buffer Overflow Slides by Kent Seamons and Tim van der Horst
AdaCore Technologies for Cyber Security
All You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been afraid to ask) Edward J. Schwartz, Thanassis.
Covering CWE with Programming Languages and Tools
Chien-Chung Shen CIS/UD
C++ Pointers and Strings
CUTE: A Concolic Unit Testing Engine for C
Annotation-Assisted Lightweight Static Checking
Ras Bodik WF 11-12:30 slides adapted from Mooly Sagiv
SOFTWARE ENGINEERING INSTITUTE
Presentation transcript:

CSSV: Towards a Realistic Tool for Statically Detecting All Buffer Overflows in C Nurit Dor, Michael Rodeh, Mooly Sagiv PLDI’2003 DAEDALUS project

/* from web2c [strpascal.c] */ void foo(char *s) { while ( *s != ‘ ‘ ) s++; *s = 0; } Vulnerabilities of C programs Null dereference Dereference to unallocated storage Out of bound pointer arithmeticOut of bound update

Is it common? General belief – yes! FUZZ study –Test reliability by random input –Tens of applications on 9 different UNIX systems –18% – 23% hang or crash CERT advisory –Up to 50% of attacks are due to buffer overflow COMMON AND DANGEROUS

CSSV’s Goals Efficient conservative static checking algorithm –Verify the absence of buffer overflow --- not just finding bugs –All C constructs Pointer arithmetic, casting, dynamic memory, … –Real programs –Minimum false alarms

Complicated Example /* from web2c [fixwrites.c] */ #define BUFSIZ 1024 charbuf[BUFSIZ]; char insert_long(char *cp) { char temp[BUFSIZ]; … for (i = 0; &buf[i] < cp ; ++i) temp[i] = buf[i]; strcpy(&temp[i],”(long)”); strcpy(&temp[i+6],cp); … cp buf (long) temp

Complicated Example /* from web2c [fixwrites.c] */ #define BUFSIZ 1024 charbuf[BUFSIZ]; char insert_long(char *cp) { char temp[BUFSIZ]; … for (i = 0; &buf[i] < cp ; ++i) temp[i] = buf[i]; strcpy(&temp[i],”(long)”); strcpy(&temp[i+6],cp); … cp buf ( l o n g ) temp Cleanness is potentially violated: 7 + offset (cp)  BUFSIZ

Complicated Example /* from web2c [fixwrites.c] */ #define BUFSIZ 1024 charbuf[BUFSIZ]; char insert_long(char *cp) { char temp[BUFSIZ]; … for (i = 0; &buf[i] < cp ; ++i) temp[i] = buf[i]; strcpy(&temp[i],”(long)”); strcpy(&temp[i+6],cp); … cp buf (long) temp Cleanness is potentially violated: offset(cp)+7 +len(cp)  BUFSIZ 7 + offset (cp) < BUFSIZ

Verifying Absence of Buffer Overflow is non-trivial void safe_cat(char *dst, int size, char *src ) { if ( size > strlen(src) + strlen(dst) ) { dst = dst + strlen(dst); strcpy(dst, src); } {string(src)  alloc(dst) > len(src)} {string(src)  string(dst)  alloc(dst+len(dst)) > len(src)} string(src)  string(dst)  (size > len(src)+len(dst))  alloc(dst+len(dst)) > len(src))

Can this be done for real programs? Complex linear relationships Pointer arithmetic Loops Procedures Use Polyhedra[CH78] Points-to-analysis Widening Procedure contracts Very few false alarms!

C String Static Verifier Detects string violations –Buffer overflow (update beyond bounds) –Unsafe pointer arithmetic –References beyond null termination –Unsafe library calls Handles full C –Multi-level pointers, pointer arithmetic, structures, casting, … Applied to real programs –Public domain software –C code from Airbus

Operational Semantics p 1 =alloc(m) p 2 = p 1 + i p 3 = *p 2 p 1 0x x i 0x p 2 0x x x x p 3 0x m 4 4 undef 0x x x x x Shadow memory base size 0x

Domain Construction Given an abstract domains D 1, D 2, …, D k Construct a “composite domain” c(D 1, D 2, …, D k ) Examples: –Cartesian Abstraction More later

CSSV ’ s Abstraction Ignore exact location Track base addresses i p1p1 p2p2 p3p3 heap 1 p 1 0x x i 0x p 2 0x x x x p 3 0x m 4 4 undef 0x x x x x Shadow memory base size 0x Abstract locations

CSSV ’ s Abstraction Track sizes i p1p1 p2p2 p3p3 heap 1 p 1 0x x i 0x p 2 0x x x x p 3 0x m 4 4 undef 0x x x x x Shadow memory base size 0x Abstract locations m

CSSV ’ s Abstraction Track pointers from one base to another (may) i p1p1 p2p2 p3p3 heap 1 p 1 0x x i 0x p 2 0x x x x p 3 0x m 4 4 undef 0x x x x x Shadow memory base size 0x Abstract locations m

Pointer Validation How can we validate pointer arithmetic? Track offsets from origin Track numeric values p 2 = p 1 + i i p1p1 p2p2 p3p3 heap m 0 8 =8

Numeric values are unknown Track integer relationships p 2 = p 1 + i i p1p1 p2p2 p3p3 heap m p 1.offset p 1.offset + i p 2.offset = p 1.offset + i

Validation Pointer arithmetic p 2 = p 1 + i *p 1.size  p 1.offset + i Pointer dereference p 3 = *p 2 *p 2.size  p 2.offset

The null-termination byte Many expressions involve the ‘ \0 ’ byte strcpy(dst, src) Track the existence of null-termination Track the index of the first one

Abstract Transformers Defines the effect of statements on the abstract representation p 1 =alloc(m) p 2 = p 1 + i i p2p2 p3p3 p1p heap 1 p 1.offset = 0 m p 1.offset + i

Abstract Transformers Unknown value p 3 = *p 2 p 3 =0 *p 2.is_nullt  *p2.len == p2.offset p 3 = unknownotherwise

Overly Conservative Representing infeasible concrete states Infeasible pointer aliases Infeasible integer variables

char* strcpy(char* dst, char *src) requires mod ensures Procedure Calls – Contracts ( string(src)  alloc(dst) > len(src) ) len(dst), is_nullt(dst) ( len(dst) = =  return = = )

Advantages of Procedure Contracts Modular analysis –[Not all the code is available] –Enables more expensive analyses User control of the verification –Detect errors at point of logical error –Improve the precision of the analysis –Check additional properties Beyond ANSI-C

Specification and Soundness All errors are detected Violation of procedure ’ s precondition –Call Violation of procedure's postcondition –Return Violation of statement ’ s precondition –… a[i] …

char* strcpy(char* dst, char *src) requires mod ensures Procedure Calls – Contracts ( string(src)  alloc(dst) > len(src) ) len(dst), is_nullt(dst) ( len(dst) = =  return = = )

safe_cat’s contract void safe_cat(char* dst, int size, char* src) requires mod ensures ( string(src)  string(dst) alloc(dst) == size ) ( len(dst) <= e +  len(dst) >= ) dst

Specification – insert_long() /* insert_long.c */ #include "insert_long.h" char buf[BUFSIZ]; char * insert_long (char *cp) { char temp[BUFSIZ]; int i; for (i=0; &buf[i] < cp; ++i){ temp[i] = buf[i]; } strcpy (&temp[i],"(long)"); strcpy (&temp[i + 6], cp); strcpy (buf, temp); return cp + 6; } char * insert_long(char *cp) requires( string(cp)  buf  cp < buf + BUFSIZ ) mod cp.len ensures ( len(cp) = = + 6  return_value = = cp + 6 ; )

Complicated Example /* from web2c [fixwrites.c] */ #define BUFSIZ 1024 charbuf[BUFSIZ]; char insert_long(char *cp) { char temp[BUFSIZ]; … for (i = 0; &buf[i] < cp ; ++i) temp[i] = buf[i]; strcpy(&temp[i],”(long)”); strcpy(&temp[i+6],cp); … cp buf ( l o n g ) temp Cleanness is potentially violated: 7 + offset (cp)  BUFSIZ

Complicated Example /* from web2c [fixwrites.c] */ #define BUFSIZ 1024 charbuf[BUFSIZ]; char insert_long(char *cp) { char temp[BUFSIZ]; … for (i = 0; &buf[i] < cp ; ++i) temp[i] = buf[i]; strcpy(&temp[i],”(long)”); strcpy(&temp[i+6],cp); … cp buf (long) temp Cleanness is potentially violated: offset(cp)+7 +len(cp)  BUFSIZ 7 + offset (cp) < BUFSIZ

CSSV – Technical overview C files Procedure ’ s Pointer info Pointer Analysis C2IP Integer Proc Integer Analysis Potential Error Messages Procedure name C files Contracts

Used Software ASToolKit [Microsoft] –LLVM, Soot Core C [TAU - Greta Yorsh] –CIL [Berkeley, LLVM] GOLF [Microsoft - Manuvir Das] New Polka [Inria - Bertrand Jeannet] –Apron

CSSV Static Analysis 1.Inline contracts Expose behavior of called procedures 2.Pointer analysis (global) Find relationship between base addresses Project into procedures 3.Integer analysis Compute offset information

Preliminary results (web2C) FAerrorsspace (Mb) time (sec) coreC line line Proc insert_long fprintf_pascal_string space_terminate external_file_name join remove_newline null_terminate

Preliminary results (EADS/RTC_Si) FAerrorsspace (Mb) time (sec) coreC line line Proc FiltrerCarNonImp SkipLine StoreIntInBuffer

CSSV: Summary Semantics –Safety checking –Full C –Enables abstractions Contract language –String behavior –Omit pointer aliasing Procedural points-to –Scalable –Improve precision Static analysis –Tracks important string properties –Utilizes integer analysis

Related Projects SAL Microsoft Splint: David Evans Sage: Microsoft Brian Hacket static analysis, ICSE’2006 Vinod Ganapathy: CCS’2013

Conclusion  Ambitious sound analyses  Very few false alarms  Scaling is an issue –Use staged analyses –Use modular analysis –Use encapsulation