An Overview of the Saturn Project


The Three-Way Trade-Off
Precision
– Modeling programs accurately enough to be useful
Scalability
– Saying anything at all about large programs
Human Effort
– How much work must the user do?
– Either giving specifications, or interpreting results
Slide annotations: "Today’s focus"; "Not so much about this..."

Precision
Intraprocedural analysis with minimal abstraction.
Primary abstraction is done at function boundaries.
[Figure: the function int f(int x) {... }, its formula F_f, the abstractions A(F_g) and A(F_h), and the collection [A(F_f), A(F_g), A(F_h)].]

Scalability
Design constraint: SAT formula size ~ function size
Analyze one function at a time
Parallel implementation
– Server sends functions to clients to analyze
– Typically use cores to analyze Linux

Summaries
Abstract at function boundaries
– Compute a summary for function’s behavior
Summaries should be small
– Ideally linear in the size of the function’s interface
Summaries are our primary form of abstraction
– Saturn delays abstraction to function boundaries
Slogan: Analysis design is summary design!

Expressiveness
Analyses written in Calypso
Logic programs
– Express traversals of the program
– E.g., backwards/forwards propagation
Constraints
– For when we don’t know traversal order
Written ~40,000 lines of Calypso code

Availability
An open source project
– BSD license
All Calypso code available for published experiments
saturn.stanford.edu

People
Brian Hackett, Alex Aiken, Suhabe Bugrara, Isil Dillig, Thomas Dillig, Peter Hawkins, Yichen Xie (past)

Outline
Saturn overview
An example analysis
– Intraprocedural
– Interprocedural
What else can you do?
Survey of results

Saturn Architecture
[Figure: architecture diagram with the components C Program, C Frontend, C Syntax Databases, Calypso Interpreter, Calypso analyses, Constraint Solvers, Summary Databases, Summary Reports, and UI.]

Parsing and C Frontend
Source Code -> Build Interceptor -> Preprocessed Source Code -> CIL frontend -> Abstract Syntax Tree Databases
Other possible frontends

Calypso
General purpose logic programming language
– Pure
– Prolog-like syntax
Bottom-up evaluation
– Magic sets transformation
Also a (minor) moon of Saturn
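As a rough illustration of what bottom-up evaluation means, here is a minimal Python sketch (not Calypso, and without the magic sets transformation): rules are applied to the known facts over and over until nothing new can be derived. The predicates edge and reach, and the function names used as nodes, are hypothetical and chosen only for this example.

```python
def bottom_up(facts, rules):
    """Naive bottom-up evaluation: apply all rules until no new fact appears."""
    known = set(facts)
    while True:
        derived = set()
        for rule in rules:
            derived |= rule(known)
        if derived <= known:      # fixpoint reached: nothing new was derived
            return known
        known |= derived


def reach_base(known):
    # reach(X, Y) :- edge(X, Y).
    return {("reach", x, y) for (p, x, y) in known if p == "edge"}


def reach_step(known):
    # reach(X, Z) :- reach(X, Y), edge(Y, Z).
    return {("reach", x, z)
            for (p, x, y) in known if p == "reach"
            for (q, y2, z) in known if q == "edge" and y2 == y}


# Hypothetical call edges: main calls f, f calls g, g calls h.
edges = {("edge", "main", "f"), ("edge", "f", "g"), ("edge", "g", "h")}
closure = bottom_up(edges, [reach_base, reach_step])
```

Evaluation starts from the stored facts rather than from a query, which is what distinguishes it from Prolog-style top-down resolution.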

Helpful Features
Strong static type and mode checking
Permanent data (sessions)
– Stored as Berkeley DB databases
– Sessions are just a named bundle of predicates
Support for unit-at-a-time analysis

Extensible Interpreter
[Figure: the Logic Program Interpreter with its extension packages: SAT Solver (#sat predicate, …), LP Solver, DOT graph package, UI package.]

Scalability
Interpreter is not very efficient
– OK, it’s slow
But can run distributed analyses
– CPUs
Scalability is more important than raw speed
– Can run intensive analyses of the entire Linux kernel (>6 MLOC) in a few hours.

Cluster Architecture
[Figure: cluster diagram with the labels Master Node, Worker Node 1 through Worker Node 100, Calypso DB (twice), and Databases.]

Job Scheduling
Job = a function body
Dynamically track dependencies between jobs
Rerun jobs if new dependencies found
Optimistic concurrency control
Iterate to fixpoint for circular dependencies
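The rerun-until-fixpoint idea can be sketched with a sequential worklist in Python. This is only an illustration under simplifying assumptions: dependencies are given up front instead of being discovered dynamically, there is no concurrency, and the "summary" is a hypothetical one (the set of locks a function may touch). All names are invented for the example.

```python
from collections import deque


def analyze_to_fixpoint(calls, local_facts):
    """calls: function -> list of callees; local_facts: function -> set of facts."""
    summaries = {fn: frozenset() for fn in calls}
    callers = {fn: set() for fn in calls}
    for fn, callees in calls.items():
        for callee in callees:
            callers[callee].add(fn)

    queue = deque(calls)          # every function body is one job
    while queue:
        fn = queue.popleft()
        new = set(local_facts[fn])
        for callee in calls[fn]:
            new |= summaries[callee]
        new = frozenset(new)
        if new != summaries[fn]:
            summaries[fn] = new
            # The summary changed, so jobs that depend on it must rerun.
            for caller in callers[fn]:
                if caller not in queue:
                    queue.append(caller)
    return summaries


# f and g call each other (a circular dependency); g also calls h.
calls = {"f": ["g"], "g": ["f", "h"], "h": []}
local_facts = {"f": {"A"}, "g": {"B"}, "h": {"C"}}
summaries = analyze_to_fixpoint(calls, local_facts)
```

Because summaries only ever grow and the set of facts is finite, the loop terminates even for the mutually recursive pair f and g.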

Calypso Analyses
[Figure: layered diagram with the labels Constraint Solvers, Calypso Analyses, Alias Analysis, Function Pointer Analysis, C Syntax Predicates, CFG Construction, Memory Model, NULL checker, and Typestate verifier.]

The Paradigmatic Locking Analysis
Check that a thread does not:
– acquire the same lock twice
– release the same lock twice
Otherwise the application may deadlock or crash.

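Specification
[Figure: finite state machine over the states unlocked, locked, and error.]
unlocked --lock--> locked
locked --unlock--> unlocked
locked --lock--> error
unlocked --unlock--> error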
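The specification can be written down as a transition table. The Python sketch below is a stand-in for illustration only; the names TRANSITIONS, step, and run are hypothetical, and treating error as an absorbing state is an assumption that the slide does not state.

```python
# (current state, operation) -> next state, following the lock specification.
TRANSITIONS = {
    ("unlocked", "lock"): "locked",
    ("locked", "unlock"): "unlocked",
    ("locked", "lock"): "error",        # acquiring the same lock twice
    ("unlocked", "unlock"): "error",    # releasing the same lock twice
}


def step(state, op):
    """Advance the lock state machine by one operation."""
    if state == "error":                # assumption: error is absorbing
        return "error"
    return TRANSITIONS[(state, op)]


def run(state, ops):
    """Apply a sequence of lock/unlock operations starting from state."""
    for op in ops:
        state = step(state, op)
    return state
```

For instance, run("unlocked", ["lock", "lock"]) ends in the error state, which is the double-acquire case the analysis is meant to catch.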

Basic Setup
We assume
– one locking function lock(l)
– one unlocking function unlock(l).
We analyze one function at a time
– produce locking summary describing the FSM transitions associated with a given lock.

An Example Function & Summary
    f(..., lock *L,...) { lock(L); ... unlock(L); }
Summary for L:
    unlocked -> unlocked
    locked -> error
Summaries are input state -> output state
– The net effect of the function on the lock
Summary size is independent of function size
– Bounded by the square of the number of states

Lock States
    type lockstate ::= locked | unlocked | error.
Predicates to describe lock states on nodes and edges of the CFG:
    predicate node_state(P:pp,L:t_trace,S:lockstate,G:g_guard).
    predicate edge_state(P:pp,L:t_trace,S:lockstate,G:g_guard).
Program point pp is a unique id for each point in the program
Trace t_trace is a unique name for a memory location
Guard g_guard is a boolean constraint

The Intraprocedural Analysis
1. Initialize lock states at function entry
2. Join operator:
– Combine edges to produce successor’s node_state
3. Transfer functions for every primitive:
– assignments
– tests
– function calls

Initializing a Lock
Use fresh boolean variable β
Interpretation:
– β is true ⇒ L is locked
– ¬β is true ⇒ L is unlocked
Enforces that L cannot be both locked and unlocked simultaneously

Notation
(lock, state, guard) attached to program point P: at P, the lock is in state if guard is true.

Initialization Rules
    node_state(P0,L,locked,LG) :-
        entry(P0), is_lock(L), fresh_variable(L, LG).
    node_state(P0,L,unlocked,UG) :-
        entry(P0), node_state(P0,L,locked,LG), #not(LG, UG).
fresh_variable allocates a new boolean variable associated with lock L.
[Figure: f(..., lock *L,...) {... } with the states (L, locked, LG) and (L, unlocked, UG) at entry point P0.]

The Intraprocedural Analysis
1. Initialize lock states at function entry
2. Join operator:
– Combine edges to produce successor’s node_state
3. Transfer functions for every primitive:
– assignments
– tests
– function calls

Joins
    node_state(P,L,S,G) :-
        edge_state(P,L,S,_),
        \/edge_state(P,L,S,EG):#or_all(EG,G).
Note: There is no abstraction in the join...
[Figure: after if (…), the incoming edge states (L, locked, F1) and (L, locked, F2) join to (L, locked, F1 ∨ F2).]
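The join can be pictured as grouping incoming edge facts by (lock, state) and taking the disjunction of their guards, with no loss of information. The Python sketch below is a stand-in for the Calypso rule, not a translation of it; the tuple encoding of guards and the names join and evaluate are assumptions made for this example.

```python
from collections import defaultdict


def join(edge_states):
    """Merge (lock, state, guard) facts from incoming edges by OR-ing guards."""
    grouped = defaultdict(list)
    for lock, state, guard in edge_states:
        grouped[(lock, state)].append(guard)
    return {key: ("or", *guards) if len(guards) > 1 else guards[0]
            for key, guards in grouped.items()}


def evaluate(guard, env):
    """Evaluate a guard (a variable name or a nested tuple) under env."""
    if isinstance(guard, str):
        return env[guard]
    op, *args = guard
    if op == "or":
        return any(evaluate(a, env) for a in args)
    if op == "and":
        return all(evaluate(a, env) for a in args)
    if op == "not":
        return not evaluate(args[0], env)
    raise ValueError(f"unknown operator: {op}")


incoming = [("L", "locked", "F1"), ("L", "locked", "F2"), ("L", "unlocked", "F3")]
joined = join(incoming)
```

The joined guard for (L, locked) is the formula F1 ∨ F2 itself, so both branch conditions remain available to later steps.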

The Intraprocedural Analysis
1. Initialize lock states at function entry
2. Join operator:
– Combine edges to produce successor’s node_state
3. Transfer functions for every primitive:
– assignments
– function calls
– etc.

Assignments
Assignments do not affect lock state:
    edge_state(P1,L,S,G) :-
        assign(P0,P1,_), node_state(P0,L,S,G).
[Figure: the assignment X = E; between program points P0 and P1, with the state (L, S, G) carried across unchanged.]

Interprocedural Analysis Basics
Function summaries are the building blocks of interprocedural analysis.
Generating a function summary requires:
– Predicates encoding relevant facts
– A session to store these predicates.

Interprocedural Analysis Outline
1. Generating function summaries
2. Using function summaries
– How do we retrieve the summary of a callee?
– How do we map facts associated with a callee to the namespace of the currently analyzed function?

Summary Declaration
    session sum_locking(FN:string) containing[lock_trans].
    predicate lock_trans(L: t_trace, S0: lockstate, S1: lockstate).
Declares a persistent database sum_locking (keyed by function name) holding lock_trans facts

Summary Generation: Primitives
Summaries for lock and unlock:
    sum_locking("lock")->lock_trans(*arg0,locked,error) :-.
    sum_locking("lock")->lock_trans(*arg0,unlocked,locked) :-.
    sum_locking("unlock")->lock_trans(*arg0,unlocked,error) :-.
    sum_locking("unlock")->lock_trans(*arg0,locked,unlocked) :-.
*arg0 is the memory location modified by lock and unlock
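Since each summary is a relation from input state to output state, the net effect of calling lock and then unlock can be illustrated as relation composition. The Python sketch below is not how Saturn computes summaries (it derives them from guarded node states); it only checks that the two primitive summaries compose to the summary shown earlier for the example function f. The rule that error maps to error is an assumption added for the example.

```python
# Primitive summaries as sets of (input state, output state) pairs.
LOCK = {("unlocked", "locked"), ("locked", "error")}
UNLOCK = {("locked", "unlocked"), ("unlocked", "error")}


def compose(first, second):
    """Net effect of running `first` and then `second` on one lock."""
    # Assumption: once in the error state, a lock stays there.
    second_total = second | {("error", "error")}
    return {(a, c)
            for (a, b) in first
            for (b2, c) in second_total
            if b == b2}


lock_then_unlock = compose(LOCK, UNLOCK)
```

With three states a summary can never hold more than nine pairs, which is the "square of the number of states" bound.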

Summary Generation: Other Functions
    sum_locking(F)->lock_trans(L, S0, S1) :-
        current_function(F),
        entry(P0), node_state(P0, L, S0, G0),
        exit(P1), node_state(P1, L, S1, G1),
        #and(G0, G1, G), guard_satisfiable(G).
[Figure: F(..., lock *L,...) {... } with the state (L, S0, G0) at entry point P0 and (L, S1, G1) at exit point P1. If SAT(G0 ∧ G1), then the summary for F records S0 → S1.]
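The rule keeps a transition S0 → S1 only when the entry guard and the exit guard can hold together. The Python sketch below illustrates this with brute-force enumeration of truth assignments in place of a real SAT solver; guards are plain Python functions, and the variable name b and the helper names are hypothetical. The entry and exit states are those of a function that calls lock(L) and then unlock(L).

```python
from itertools import product


def satisfiable(guard, variables):
    """Brute-force check: does any truth assignment make guard true?"""
    return any(guard(dict(zip(variables, values)))
               for values in product([False, True], repeat=len(variables)))


def summarize(entry_states, exit_states, variables):
    """Keep (S0, S1) whenever the conjunction G0 and G1 is satisfiable."""
    return {(s0, s1)
            for s0, g0 in entry_states
            for s1, g1 in exit_states
            if satisfiable(lambda env: g0(env) and g1(env), variables)}


# Entry: the lock is locked iff the fresh variable b is true.
entry = [("locked", lambda env: env["b"]),
         ("unlocked", lambda env: not env["b"])]
# Exit after lock(L); unlock(L): unlocked if it started unlocked, error otherwise.
exit_ = [("unlocked", lambda env: not env["b"]),
         ("error", lambda env: env["b"])]

summary = summarize(entry, exit_, ["b"])
```

Enumeration is exponential in the number of variables, so it only serves to make the satisfiability condition concrete on a one-variable example.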

Summary Application Rule
    call_transfer(I, L, S0, S1, G) :-
        direct_call(I, F), call(P0, _, I),
        sum_locking(F)->lock_trans(CL, S0, S1),
        instantiate(s_call{I}, P0, CL, L, G).
[Figure: G(...) { F(...) } with callee summary F: S0 → S1; the state (L, S0, G) at P0 before the call becomes (L, S1, G) after it.]
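Applying a summary at a call site amounts to renaming the callee's lock into the caller's namespace and rewriting each matching state. The Python sketch below simplifies the rule considerably: the binding from callee traces to caller locks is an unconditional dictionary (in the Calypso rule, instantiate also produces a guard), and guards are opaque strings. The names apply_summary, binding, and the lock m are hypothetical.

```python
def apply_summary(pre_states, callee_summary, binding):
    """Rewrite caller facts (lock, state, guard) across a call.

    callee_summary: set of (callee_trace, S0, S1) transitions.
    binding: callee trace name -> caller lock name.
    """
    post = []
    for trace, s0, s1 in callee_summary:
        lock = binding[trace]
        for pre_lock, state, guard in pre_states:
            if pre_lock == lock and state == s0:
                # The guard is carried over unchanged in this simplified model.
                post.append((lock, s1, guard))
    return post


# Summary of a callee F that locks and then unlocks its first argument.
summary_f = {("*arg0", "unlocked", "unlocked"), ("*arg0", "locked", "error")}
# Caller facts just before the call F(&m).
pre = [("m", "locked", "b"), ("m", "unlocked", "not b")]
post = apply_summary(pre, summary_f, {"*arg0": "m"})
```

After the call, m is in the error state exactly under the guard that had it locked beforehand.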

Applications
Bug finding
Verification
Software Understanding

Saturn Bug Finding
Early work
– Locking: Scalable Error Detection using Boolean Satisfiability. POPL 2005
– Memory leaks: Context- and Path-Sensitive Memory Leak Detection. FSE 2005
– Scripting languages: Static Detection of Security Vulnerabilities in Scripting Languages. 15th USENIX Security Symposium, 2006
Recent work
– Inconsistency Checking: Static Error Detection Using Semantic Inconsistency Inference. PLDI

Examples: Null pointer dereferences
[Table: columns Application, KLOC, Warnings, Bugs, False Alarms, FA Rate; rows Openssl-0.9.8b, Samba, Openssh-4.3p, Pine, Mplayer-1.0pre, Sendmail, Linux, Total. The numeric cells did not survive extraction.]

Lessons Learned
Saturn-based tools improve bug-finding
– Multiple times more bugs than previous results
– Lower false positive rate
Why?
– “Sounder” than previous bug finding tools: bit-level modeling, handling casts, aliasing, etc.
– Precise: fully intraprocedurally path-sensitive, partially interprocedurally path-sensitive

Lessons Learned (Cont.)
Design of function summary is key to scalability and precision
Summary-based analysis only looks at the relevant parts of the heap for a given function
Programmers write functions with simple interfaces

Saturn Verification
Unchecked user pointer dereferences
– Important OS security property
– Also called “probing” or “user/kernel pointers”
Precision requirements
– Context-sensitive
– Flow-sensitive
– Field-sensitive
– Intraprocedurally path-sensitive

Current Results for Linux
MLOC with 91,543 functions
Verified 616 / 627 system call arguments
– 98.2%
– 11 false alarms
Verified 851,686 / 852,092 dereferences
– 99.95%
– 406 false alarms
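The percentages and false-alarm counts follow directly from the verified and total counts on the slide. A small Python check of that arithmetic (the helper name verified_rate is only illustrative):

```python
def verified_rate(verified, total):
    """Return (percentage verified, number of false alarms)."""
    return 100.0 * verified / total, total - verified


syscall_pct, syscall_alarms = verified_rate(616, 627)
deref_pct, deref_alarms = verified_rate(851_686, 852_092)
```

Here the false alarms are simply the items that could not be verified, so each count is the total minus the verified number.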

Preliminary Lessons Learned
Bug finders can be sloppy: ignore functions or points-to edges that inhibit scalability or precision
Soundness is substantially more difficult than finding bugs
Lightweight, sparsely placed annotations
– Have programmers add some information
– Makes verification tractable
– Only 22 annotations needed for user pointer analysis

Saturn for Software Understanding
A program analysis is a code search engine
Generic question: Do programmers ever do X?
– Write an analysis to find out
– Run it on lots of code
– Classify the results
– Write a paper...

Examples
Aliasing is used in very stylized ways, at least in C
– Cursors into data structures
– Parent/child pointers
– And 7 other idioms
– Reference: How is Aliasing Used in Systems Software? FSE 2006
Do programmers take the address of function ptrs?
– Answer: Almost never.
– Allows simpler analysis of function pointers

Other Things We’ve Thought About
Shape analysis
– We notice the lack of shape information
Interprocedural path-sensitivity
– Needed for some common programming patterns
Proving correctness of Saturn analyses

Related Work
Lots
– All bug finding and verification tools of the last 10 years
Particularly, though
– Systems using logic programming (bddbddb)
– ESP
– Metal
– CQual
– Blast

saturn.stanford.edu