A Path Sensitive Type System for Resource Usage Verification of C like languages (APLAS05). Korea Advanced Institute of Science and Technology. Hyun-Goo Kang, Youil Kim, Taisook Han, Hwansoo Han

Presentation transcript:

Slide 1 (APLAS05): A Path Sensitive Type System for Resource Usage Verification of C like languages
Korea Advanced Institute of Science and Technology
Hyun-Goo Kang, Youil Kim, Taisook Han, Hwansoo Han

Slide 2: Outline
• Problem & Goal
• Type System
• Conclusion

Slide 3: Resource Usage Protocol
• A program should use resources in a valid way.
• Such a protocol is usually specified by a correct sequence of actions on the resource, which is recognizable by a finite state machine.
• Examples
  - A file should be opened before being written.
  - A memory cell should not be accessed after deallocation.
  - An acquired lock should be released eventually.
  - ...
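
The file example above can be read as a small finite state machine. The following is only an illustrative sketch: the state and action names (`CLOSED`, `"open"`, `run_protocol`, and so on) are hypothetical and do not come from the slides.

```python
# Hypothetical encoding of the "file should be opened before being written"
# protocol as a finite state machine. All names here are assumptions.
CLOSED, OPENED, ERROR = "closed", "opened", "error"

# Valid transitions; any (state, action) pair not listed leads to ERROR.
TRANSITIONS = {
    (CLOSED, "open"): OPENED,
    (OPENED, "write"): OPENED,
    (OPENED, "close"): CLOSED,
}


def run_protocol(actions, state=CLOSED):
    """Feed a sequence of actions to the machine and return the final state."""
    for action in actions:
        state = TRANSITIONS.get((state, action), ERROR)
        if state == ERROR:
            break
    return state


def is_valid_usage(actions):
    """A usage is accepted here if it never errs and leaves the file closed."""
    return run_protocol(actions) == CLOSED
```

For instance, `is_valid_usage(["open", "write", "close"])` holds, while a lone `"write"` drives the machine into the error state.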

Slide 4: Example

[Program 1]
    main() {
        FILE* fp = fopen("f", "w");
        fprintf(fp, "x");
        fclose(fp);
    }

[Program 2]
    main() {
        FILE* fp = fopen("f", "w");
        if (fp) {
            fprintf(fp, "x");
            fclose(fp);
        }
    }

When a program analyzer assumes that fopen always opens the specified file:
• Program 1: it misses the bug (fp is used without checking whether fopen succeeded).
• Program 2: it raises a false alarm.
Path sensitivity is essential!

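Slide 5: A Path Sensitive Specification in FA
States: Closed, Opened, Error
Transitions:
• Closed, fopen {ret>0}: Opened
• Closed, fopen {ret<=0}: Closed
• Closed, read/write/close: Error
• Opened, read/write: Opened
• Opened, close: Closed
• Opened, fopen: Error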
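
What makes this specification path sensitive is that the effect of fopen depends on its return value. A minimal sketch of such a guarded automaton follows, assuming a trace is a list of (action, return value) pairs; the function names `step` and `run` are hypothetical.

```python
# Sketch of a finite automaton whose fopen transition is guarded by the
# return value. The trace encoding and function names are assumptions.
CLOSED, OPENED, ERROR = "Closed", "Opened", "Error"


def step(state, action, ret=None):
    """Return the next state; `ret` is only consulted for fopen."""
    if state == ERROR:
        return ERROR
    if action == "fopen":
        if state != CLOSED:
            return ERROR
        return OPENED if ret > 0 else CLOSED
    if action in ("read", "write"):
        return OPENED if state == OPENED else ERROR
    if action == "close":
        return CLOSED if state == OPENED else ERROR
    raise ValueError(f"unknown action: {action}")


def run(trace):
    """Run a list of (action, ret) pairs starting from Closed."""
    state = CLOSED
    for action, ret in trace:
        state = step(state, action, ret)
    return state
```

Running the failing path of an unchecked fopen, `run([("fopen", 0), ("write", None), ("close", None)])`, ends in `Error`, whereas the failing path of a guarded program, `run([("fopen", 0)])`, stays in `Closed`.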

Slide 6: Related Works
• Path insensitive verification: actions in the finite automata specification are limited to syntactically identifiable sets
  - Resource Usage Analysis (Igarashi & Kobayashi)
  - Vault (DeLine & Fahndrich)
• Path sensitive but whole program analysis
  - SLAM (Ball et al., MSR), BLAST (Henzinger et al., UCB)
  - ESP (Das et al., MSR)
• Path sensitive and modular, but unsound
  - Saturn (Yichen Xie & Alex Aiken, Stanford)

Slide 7: Our Goal is
• To design a path sensitive resource usage analysis
• To design it as a modular analysis for modular specification/verification and scalability
• To design it as an automatic and sound analysis

Slide 8: Observations
• Path sensitivity is essential.
• Values that identify paths are mainly constants and limited to some simple integer values.
• A pointer to a file-like resource is normally used just as a reference.
• Intraprocedural aliasing of resources is common, but interprocedural aliasing of resources is infrequent.
• Resource allocation rarely appears within loops. Even if it does, every resource allocated in the loop should be deallocated or should have the same specification.

Slide 9: Selected Abstraction
• Domain abstraction
  - Resource states are traced at the concrete level (no abstraction, finite).
  - Values that identify paths are traced with a constant propagation lattice.
• Join at merge point
  - If resource contexts from different paths are different, then we collect (union) them as a set.
  - Otherwise we do a normal join over our lattice type ($\sqcup$).
• Resource identification
  - Resources are identified by allocation points. All resources allocated at the same program point should satisfy the same resource usage specification.
• Tracing resources
  - Alias information is traced in a path sensitive way within the function body, under the assumption of no interprocedural alias.
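
The join policy at merge points can be sketched as follows. This is a simplified illustration, not the formulation from the talk: a path is assumed to be a (value, resource context) pair, the resource context is assumed to be a hashable set of (resource, state) facts, and `merge_paths` is a hypothetical name.

```python
# Sketch of a resource-sensitive join: paths with different resource contexts
# are kept apart, paths with the same context have their values joined in a
# flat constant propagation lattice. Representation choices are assumptions.
TOP, BOT = "TOP", "BOT"


def join_const(a, b):
    """Join in the flat constant lattice: BOT below constants below TOP."""
    if a == BOT:
        return b
    if b == BOT:
        return a
    return a if a == b else TOP


def merge_paths(paths_a, paths_b):
    """Merge two sets of (value, resource_context) paths at a join point."""
    merged = {}
    for value, ctx in list(paths_a) + list(paths_b):
        merged[ctx] = join_const(merged.get(ctx, BOT), value)
    return {(value, ctx) for ctx, value in merged.items()}
```

With `opened = frozenset({("r1", "O")})` and `closed = frozenset({("r1", "C")})`, merging `{(1, opened)}` with `{(0, closed)}` keeps both paths, while merging `{(1, opened)}` with `{(2, opened)}` collapses to `{(TOP, opened)}`.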

Slide 10: Outline
• Problem & Goal
• Type System
• Future work / Conclusion

Slide 11: Our Type System
• Type ≈ lattice element instrumented with type variables
• Basically a subtype system (bounded polymorphism)
• We add flow and path sensitivity.

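Slide 12: Domain Design (Basic Types)
[Diagram: four lattices]
• Sign (value): $\bot$, M, Z, P, MZ, MP, ZP, $\top$
• Resource id: $\bot$, $r_1, \ldots, r_n$, NR, RC, $\top$
• Allocation state: $\bot$, AL, NA, $\top$
• Resource state: $\bot$, C, O, $\top$
Ordering under an assumption set $A$:
• $A \vdash X_1 \sqsubseteq X_2$ if $X_1 \sqsubseteq X_2 \in \mathit{Bas}$ or $X_1 \sqsubseteq X_2 \in A$
• Examples: $\emptyset \vdash P \sqsubseteq MP$ and $\{\alpha \sqsubseteq P\} \vdash \alpha \sqsubseteq MP$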
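
One way to experiment with the sign lattice is to model each element as a set of base signs, so that the order is set inclusion and the join is set union. This encoding, the reading of M/Z/P as minus/zero/plus, and the `entails` helper for bounded type variables are assumptions made for illustration only.

```python
# Sketch: sign lattice elements as sets over the base signs M, Z, P
# (assumed to mean minus, zero, plus). Order = inclusion, join = union.
SIGN = {
    "BOT": frozenset(),
    "M": frozenset("M"),
    "Z": frozenset("Z"),
    "P": frozenset("P"),
    "MZ": frozenset("MZ"),
    "MP": frozenset("MP"),
    "ZP": frozenset("ZP"),
    "TOP": frozenset("MZP"),
}
NAME = {elems: name for name, elems in SIGN.items()}


def leq(x, y):
    """x is below y in the sign lattice."""
    return SIGN[x] <= SIGN[y]


def join(x, y):
    """Least upper bound of two sign lattice elements."""
    return NAME[SIGN[x] | SIGN[y]]


def entails(assumptions, x, y):
    """Check x below y, where x may be a type variable bounded in `assumptions`."""
    bound = assumptions.get(x, x)
    return leq(bound, y)
```

Under this encoding `entails({}, "P", "MP")` and `entails({"alpha": "P"}, "alpha", "MP")` both hold, mirroring the two example judgements on the slide.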

Slide 13: Domain Design (Resource Heap)
• A natural definition of a resource heap would be
  - resource id $\to$ (allocation state, resource state)
• But we are interested only in the resources related to the function being inferred.
  - constrained heap, e.g. $\{\}$, $\{(r_1, AL, O)\}$, $\{(r_1, AL, O), (r_2, NA, C)\}$
  - heap update history, e.g. $H \cdot [r_1 \mapsto (AL, O)]$ and $H \cdot [r_1 \mapsto (NA, C)] \cdot [r_1 \mapsto (AL, O)]$
[Diagram: constrained heaps related by $\sqsupseteq$, with $\mathit{concretize}(\cdot)$ mapping $\{(r_1, AL, O)\}$ to $\{h \mid h(r_1) = \mathit{open}\}$]

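Slide 14: Domain Design (Set of Paths)
• An input path ($A$)
  - a set of constraints over all type variables (input partition), e.g. $\{\alpha_1 \sqsubseteq P,\ \rho_1 \sqsubseteq RC,\ \eta \sqsubseteq \{(\rho_1, AL, O)\}\}$
  - order is defined as: $\vdash A_1 \sqsubseteq A_2$ if $A_1 \vdash A_2$
• Output paths ($\Delta$)
  - set of outputs: $\{(v_1, \Gamma_1, H_1), \ldots, (v_n, \Gamma_n, H_n)\}$
  - order is defined as: $A \vdash \Delta_1 \sqsubseteq \Delta_2$ if $\forall (v, \Gamma, H) \in \Delta_1.\ \exists (v', \Gamma', H') \in \Delta_2.\ A \vdash v \sqsubseteq v' \wedge A \vdash \Gamma \sqsubseteq \Gamma' \wedge A \vdash H \sqsubseteq H'$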
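
The order on output paths has a "for every output on the left there exists a covering output on the right" shape. A small sketch of that check follows; it assumes each output is a three-component tuple and takes the component orders as parameters, dropping the assumption set for simplicity. The name `paths_leq` is hypothetical.

```python
# Sketch of the forall/exists order on sets of output paths. Component
# orders are passed in as functions; the assumption set is omitted here.
def paths_leq(outputs1, outputs2, leq_value, leq_env, leq_heap):
    """Every output in outputs1 must be covered by some output in outputs2."""
    return all(
        any(
            leq_value(v1, v2) and leq_env(g1, g2) and leq_heap(h1, h2)
            for (v2, g2, h2) in outputs2
        )
        for (v1, g1, h1) in outputs1
    )


def subset(a, b):
    """Stand-in order for values: sign sets ordered by inclusion."""
    return a <= b


def same(a, b):
    """Stand-in order for environments and heaps: plain equality."""
    return a == b
```

For example, with `small = [(frozenset("P"), "env", "heap")]` and `big = [(frozenset("ZP"), "env", "heap"), (frozenset("Z"), "env", "heap")]`, the call `paths_leq(small, big, subset, same, same)` holds, while the reverse direction does not.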

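Slide 15: Input Path Partitioning / Merging
[Diagram: for the test x > 0 with $\Gamma(x) = (\rho, \alpha)$, the input path $\{\alpha \sqsubseteq \top,\ \rho \sqsubseteq \top,\ \eta \sqsubseteq \{\}\}$ is partitioned into $\{\alpha \sqsubseteq P,\ \rho \sqsubseteq \top,\ \eta \sqsubseteq \{\}\}$ and $\{\alpha \sqsubseteq MZ,\ \rho \sqsubseteq \top,\ \eta \sqsubseteq \{\}\}$. The two partitions yield output paths $\Delta_1$ and $\Delta_2$; merging them restores $\{\alpha \sqsubseteq \top,\ \rho \sqsubseteq \top,\ \eta \sqsubseteq \{\}\}$ with output $\Delta_1 \sqcup \Delta_2$.]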

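Slide 16: close x
With $\Gamma(x) = (\rho, \alpha)$, the input path $\{\alpha \sqsubseteq \top,\ \rho \sqsubseteq \top,\ \eta \sqsubseteq \{\}\}$ is partitioned into:
• $\alpha \sqsubseteq \top,\ \rho \sqsubseteq RC,\ \eta \sqsubseteq \{(\rho, AL, O)\}$: output $\{(Z, \Gamma, \eta \cdot [\rho \mapsto (AL, C)])\}$
• $\alpha \sqsubseteq \top,\ \rho \sqsubseteq RC,\ \eta \sqsubseteq \{(\rho, AL, C)\}$: error "not opened"
• $\alpha \sqsubseteq \top,\ \rho \sqsubseteq RC,\ \eta \sqsubseteq \{(\rho, NA, \top)\}$: error "not allocated"
• $\alpha \sqsubseteq \top,\ \rho \sqsubseteq NR,\ \eta \sqsubseteq \{\}$: error "not resource"
Typing rule:
$$\frac{\Gamma(x) = (R, D) \qquad A \vdash R \sqsubseteq RC \qquad A \vdash H \sqsubseteq \{(R, AL, O)\}}{A, \Gamma, H \vdash \mathtt{close}\ x : \{(Z, \Gamma, H \cdot [R \mapsto (AL, C)])\}}$$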

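Slide 17: open x
With $\Gamma(x) = (\rho, \alpha)$, the input path $\{\alpha \sqsubseteq \top,\ \rho \sqsubseteq \top,\ \eta \sqsubseteq \{\}\}$ is partitioned into:
• $\alpha \sqsubseteq \top,\ \rho \sqsubseteq RC,\ \eta \sqsubseteq \{(\rho, AL, C)\}$: outputs $(P, \Gamma, \eta \cdot [\rho \mapsto (AL, O)])$ and $(Z, \Gamma, \eta)$
• $\alpha \sqsubseteq \top,\ \rho \sqsubseteq RC,\ \eta \sqsubseteq \{(\rho, AL, O)\}$: error "not closed"
• $\alpha \sqsubseteq \top,\ \rho \sqsubseteq RC,\ \eta \sqsubseteq \{(\rho, NA, \top)\}$: error "not allocated"
• $\alpha \sqsubseteq \top,\ \rho \sqsubseteq NR,\ \eta \sqsubseteq \{\}$: error "not resource"
Typing rule:
$$\frac{\Gamma(x) = (R, D) \qquad A \vdash R \sqsubseteq RC \qquad A \vdash H \sqsubseteq \{(R, AL, C)\}}{A, \Gamma, H \vdash \mathtt{open}\ x : \{(P, \Gamma, H \cdot [R \mapsto (AL, O)]),\ (Z, \Gamma, H)\}}$$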
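
The case analysis for close and open can be mimicked on a concrete heap. The sketch below is a deliberate simplification: the heap is assumed to be a dictionary from resource ids to (allocation state, resource state), "not resource" is modelled as the id being absent from that dictionary, and the names `type_open`, `type_close`, and `ResourceError` are hypothetical.

```python
# Sketch of the open/close case analysis on a concrete heap. Each function
# returns a list of (sign of result, updated heap) outputs or raises an
# error with the message used on the slides. Names are assumptions.
AL, NA = "AL", "NA"   # allocated / not allocated
O, C = "O", "C"       # opened / closed


class ResourceError(Exception):
    pass


def check_resource(heap, r):
    """Return the resource state of r, or raise if r is unusable."""
    if r not in heap:
        raise ResourceError("not resource")
    alloc, state = heap[r]
    if alloc != AL:
        raise ResourceError("not allocated")
    return state


def type_close(heap, r):
    """close: requires an opened resource, yields one output path."""
    if check_resource(heap, r) != O:
        raise ResourceError("not opened")
    return [("Z", {**heap, r: (AL, C)})]


def type_open(heap, r):
    """open: requires a closed resource, yields a success and a failure path."""
    if check_resource(heap, r) != C:
        raise ResourceError("not closed")
    return [("P", {**heap, r: (AL, O)}), ("Z", dict(heap))]
```

Note that `type_open` returns two outputs: one with a positive result and the resource opened, and one with a zero result and the heap unchanged. Keeping both is what lets a later test on the result separate the two paths.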

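Slide 18: Domain Design (Function Type)
• A set of input path ($A$) / output paths ($\Delta$) pairs:
  - $\forall \alpha, \rho, \eta.\ \{(A_1, \Delta_1), \ldots, (A_n, \Delta_n)\}$
  - order $A \vdash ts_1 \sqsubseteq ts_2$ is defined as:
    $\forall (A_2, \Delta_2) \in ts_2.\ \exists (A_1, \Delta_1) \in ts_1.\ \vdash A_2 \sqsubseteq A_1 \wedge A_2 \vdash \Delta_1 \sqsubseteq \Delta_2$
    $\forall (A_1, \Delta_1) \in ts_1.\ \exists (A_2, \Delta_2) \in ts_2.\ \vdash A_1 \sqsubseteq A_2 \wedge A_1 \vdash \Delta_1 \sqsubseteq \Delta_2$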

Slide 19: Typing Example
Statements of the example function, in the order they appear in the diagram: f(x), x = open x, x > 0, use x, close x, f(x)
Types of the resource operations:
• open x : $\forall \alpha, \rho, \eta.\ \{\rho \sqsubseteq RC,\ \eta \sqsubseteq \{(\rho, AL, C)\}\} \to \{((\rho, P),\ \eta \cdot [\rho \mapsto (AL, O)]),\ ((\rho, Z),\ \eta)\}$
• close x : $\forall \alpha, \rho, \eta.\ \{\rho \sqsubseteq RC,\ \eta \sqsubseteq \{(\rho, AL, O)\}\} \to \{((NR, Z),\ \eta \cdot [\rho \mapsto (AL, C)])\}$
• use x : $\forall \alpha, \rho, \eta.\ \{\rho \sqsubseteq RC,\ \eta \sqsubseteq \{(\rho, AL, O)\}\} \to \{((NR, Z),\ \eta)\}$
[Diagram: the control flow of f annotated at each program point with input constraints (such as $\alpha \sqsubseteq P,\ \rho \sqsubseteq RC,\ \eta \sqsubseteq \{(\rho, AL, O)\}$ and $\alpha \sqsubseteq MZ,\ \rho \sqsubseteq RC,\ \eta \sqsubseteq \{(\rho, AL, C)\}$), environments (such as $[x : (\rho, P)]$, $[x : (\rho, Z)]$, $[x : (\rho, ZP)]$), and heap update histories (such as $\eta \cdot [\rho \mapsto (AL, O)]$ and $\eta \cdot [\rho \mapsto (AL, C)]$); the annotations for the recursive call f(x) reach a fixpoint.]

Slide 20: Soundness
• Theorem 1 [Correctness of Type System] If a configuration C is typed, then C is finished or it takes a step without a type error.
  - Two main lemmas: subject reduction & progress
• Theorem 2 [Correctness of Algorithm] If $\mathcal{I}(A, \Gamma, H, e) = \{(A_1, \Delta_1), \ldots, (A_n, \Delta_n)\}$, then $A_i, \Gamma, H \vdash e : \Delta_i$.

Slide 21: Implementation
• We have implemented a prototype and experimented with it on some C programs.
• The prototype extends the algorithm in the paper:
  - Partitions input constraints more lazily.
  - Handles global variables and heap storage.
  - Detects resource leaks.

Slide 22: Ongoing and future work
• Type based dynamic allocation
• Multiple error messages
• Resource type based slicing
• Modular pointer analysis specialized for this problem
• Specification language

Slide 23: Conclusion
• We formalized a sound path-sensitive analysis for resource usage protocols.
• Our analysis is modular; the analysis summarizes each function as a type scheme, without using any user annotations.
• In the paper, we also showed how to handle dynamic resource allocation and aliases.

Slide 24: Thank You

Slide 25: Demo


Slide 27: Related Works
• Path insensitive verification: actions in the finite automata specification are limited to syntactically identifiable sets
  - Resource Usage Analysis (Igarashi & Kobayashi)
  - Vault (DeLine & Fahndrich)
• Path sensitive but whole program analysis
  - SLAM (Ball et al., MSR), BLAST (Henzinger et al., UCB)
    - C2BP, then model check
  - ESP (Das et al., MSR)
    - Ideas of selective join
    - Lighter-weight than SLAM/BLAST, but still a whole program analysis
• Path sensitive and modular, but unsound
  - Saturn (Yichen Xie & Alex Aiken, Stanford)
    - Program constructs $\to$ bit-level boolean constraints (equations)
    - Inference $\to$ SAT solving
    - Unsound: assumption of no alias between arguments, finite loop unrolling
    - Blind summary: not symbolic (their optimization: slicing the query-dependent part after whole equation generation)

Slide 28: Ongoing and future work
• Type based dynamic allocation
  - $\eta \sqsubseteq \{(r_i, NA, X)\} \to \eta \cdot [\ ][r_i \mapsto (AL, Y)]$, where $r_i$ is the program point of alloc$_i$
  - $\eta \sqsubseteq \{(\rho, NA, X)\} \xrightarrow{\mathtt{alloc}(\{\rho\})} \eta \cdot [\ ][\rho \mapsto (AL, Y)]$, where $\rho$ is now the program point of the allocator function (instantiated)
• Multiple error messages
  - Better error recovery algorithm to remove multiple false alarms caused by one bug
• Resource type based slicing
  - In the GCC package of the SPEC95 benchmark, there is a function that opens 15 files concurrently ($2^{15}$ paths), but if we slice it based on the FILE* type, then we can safely reduce the complexity of inference to $2 \times 15$.
• Pointer / structure / array
  - Modular pointer analysis specialized for this problem
• Specification language
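
The arithmetic behind the slicing claim can be checked directly. The sketch below assumes each file open contributes two outcomes (success or failure) and that slicing lets each file be analysed on its own; the function names are hypothetical.

```python
# Sketch of the path-count arithmetic: analysing n independent two-outcome
# opens together multiplies the outcomes, analysing them one resource at a
# time only adds them. The two-outcome assumption is ours.
def joint_paths(n_files, outcomes=2):
    """Number of paths when all resources are tracked together."""
    return outcomes ** n_files


def sliced_paths(n_files, outcomes=2):
    """Number of paths when each resource is analysed in its own slice."""
    return outcomes * n_files
```

For 15 files this gives `joint_paths(15) == 32768` against `sliced_paths(15) == 30`.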

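Slide 29: Alias
[Diagram: the input path $\{\alpha \sqsubseteq \top,\ \rho \sqsubseteq \top,\ \eta \sqsubseteq \{\}\}$ is partitioned on the test x > 0, with $\Gamma(x) = (\rho, \alpha)$, into $\alpha \sqsubseteq P$ and $\alpha \sqsubseteq MZ$. One path ends with environment $[fp : (\rho_1, \alpha_1)]$ and the other with $[fp : (\rho_2, \alpha_2)]$.]
The two paths can not be combined: $\rho_1 \mathrel{\#} \rho_2$ by the no interprocedural alias assumption.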

Slide 30: The Resource Language

Slide 31: Dynamic Semantics

Slide 32: Types

Slide 33: Typing Rule (Resource API): output path generator

Slide 34: Resource Path Sensitive Join

Slide 35: Typing Rule (Branch): input path generator

Slide 36: Typing Rule (Func Abstr / App): input/output path generator

Slide 37: Typing Rule (others)

Slide 38: Retrospection (what's hard)
• To be modular
  - Managing/inferring the $\alpha$, $\rho$, $\eta$ part in a sound/symbolic way is complex.
• To be a lazy input path (constraint) partitioning algorithm
  - The assumption sets do not form a complete Boolean lattice (we do not have an exact complement $A^c$).