SATURN: An Overview Shrawan Kumar


1 SATURN: An Overview Shrawan Kumar Shrawan.kumar@tcs.com

2 21 February 2016 Topics
What is SATURN?
SATURN Framework
SATURN working and its Spec Language
Examples of Analysis through SATURN
Discussion on Scalability and Precision

3 Motivating Example
void my_func(int i, int j, int k, int b)
{
    int a, x;
    a = i*j;
    if (b>a)
        if (a>0)
            x = k / b; // Is there a division by zero?
    …
}
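
The question on this slide can be explored with a small brute-force check. The sketch below is an illustration in Python and is not how Saturn works: Saturn would pose the question to a SAT solver over bit-level encodings, whereas this code enumerates a small input range. The range and the function name are assumptions made for the example.

```python
from itertools import product


def division_reachable_with_zero_divisor(values):
    """Return True if some input reaches `x = k / b` with b == 0."""
    for i, j, b in product(values, repeat=3):
        a = i * j
        # Path condition of the division: (b > a) and (a > 0).
        # A division by zero additionally needs b == 0.
        if b > a and a > 0 and b == 0:
            return True
    return False


# Hypothetical small search space; k is omitted because it does not
# appear in the path condition.
small_range = range(-8, 9)
result = division_reachable_with_zero_divisor(small_range)
print(result)
```

No such input exists: the path condition b > a together with a > 0 forces b to be at least 2, so the conjunction with b == 0 is unsatisfiable.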

4 What is SATURN?
SATisfiability-based failURe aNalysis
Combines static program analysis and model checking
It is an error detection framework, not a verification framework
Intra-procedurally path sensitive
Supports a summary-based approach for inter-procedural analysis
Stores all information in BDB (Berkeley DB) files
Has a very rich (but low-level) specification language for analysis specification
Makes use of SAT solvers: MiniSat and zChaff

5 What it offers
A model of the program
A rule-based specification language to express analyses
A facility to form first-order logic formulae in terms of program variables
Satisfiability checking of first-order logic formulae
Retrieval of a set of variable assignments that satisfies a formula

6 Base Program Model
Program IR
–As a combination of AST and CFG
Basic information maintained at each program point
–A guard as a first-order logic formula
–Memory locations pointed to by a pointer variable
–Value of every integral data item in symbolic form

7 Program Representation Model
AST of program
–Containing information about each
    Function
    Variable (local, global, parameter)
    Struct and Field
    User-defined Type
    Expression
    Statement
–All structural information, e.g. parent-child relationship
–Entry / exit points of a function
CFG of program
–Maintained for each function
–Edges represent computation
–Nodes represent program points
–Relationship with AST
    A relationship is maintained among an AST element and the program points before and after it

8 Model representation: building blocks
Integral variables are represented as n-bit signed or unsigned integers
–n and signedness are determined from the variable type
–Every bit in the representation is modeled as a boolean expression
A pair of mappings, called the Environment, is associated with each program point as follows
–VARS → VALUES
–PTRS → 2^GLS, where GLS is GUARD-SET × 2^LOC-SET
A guard G associated with a program point P has the following meaning
–Control may reach P only if guard G may hold
If a pointer Q maps to GLS1 and <G, LOCSET1> belongs to GLS1 then
–Q may point to any of the locations of LOCSET1 provided G holds
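
To make the bit-level view of integral variables concrete, here is a minimal sketch in Python. It assumes two's complement encoding and uses concrete Python booleans where Saturn would hold symbolic boolean expressions; the helper names `to_bits`, `from_bits` and `add_bits` are hypothetical and not part of Saturn.

```python
def to_bits(value, n):
    """Encode an integer as n bits, least significant bit first."""
    return [bool((value >> k) & 1) for k in range(n)]


def from_bits(bits, signed):
    """Decode a bit list; the top bit is the sign bit when signed is True."""
    value = sum(1 << k for k, bit in enumerate(bits) if bit)
    if signed and bits[-1]:
        value -= 1 << len(bits)
    return value


def add_bits(x, y):
    """Ripple-carry addition built only from boolean operations."""
    carry = False
    out = []
    for a, b in zip(x, y):
        out.append(a ^ b ^ carry)
        carry = (a and b) or (carry and (a ^ b))
    return out  # the final carry is dropped, so the sum wraps at n bits


# 100 + 100 on 8 bits: the same bit pattern read as signed or unsigned char.
total = add_bits(to_bits(100, 8), to_bits(100, 8))
signed_sum = from_bits(total, signed=True)
unsigned_sum = from_bits(total, signed=False)
print(signed_sum, unsigned_sum)
```

Because every output bit is a boolean function of the input bits, the same construction carries over when the inputs are unknown and each bit is a formula handed to a SAT solver.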

9 Example
main()
{
    signed char i, *p;
    unsigned char j, k;
    //P1
    if (i < 10)
    //P2
    { j=10; p = &i; }
    //P3
    else
    //P4
    { j=20; p = &k; }
    //P5
    //P6
    k=j;
    //P7
}

10 Example
main()
{
    signed char i, *p;
    unsigned char j, k;
    //P1
    if (i < 10)
    //P2
    { i=20; p = &i; }
    //P3
    else
    //P4
    { j=20; p = &k; }
    //P5
    //P6
    k=j;
    //P7
}
[Figure: control-flow graph of the example, with nodes p1 to p7 and edges labelled i<10, !(i<10), i=20, p=&i, j=20, p=&k and k=j]

11 Example
Guards
–P1: true
–P2, P3: (i < 10)
–P4, P5: (!(i < 10))
–P6, P7: true
Environment
–P1,P2,P4 : U, k  U>
–P3 : }
–P5 : }
–P6:, }
–P7:, }
Where
A is (i<10) AND U
B is (!i<10) or U
C is (!i<10) AND U
D is (i<10) or U
[Figure: control-flow graph of the example, with nodes p1 to p7]
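
The pointer part of the environment can be illustrated on the if/else example above. The sketch below is a simplification in Python: guards are ordinary Python predicates over the value of i at P1 rather than symbolic formulae, and the names `join` and `may_point_to` are hypothetical.

```python
def join(gls_a, gls_b):
    """Merge two guarded location sets at a control-flow join."""
    return gls_a + gls_b


def may_point_to(gls, i):
    """Locations the pointer may refer to for a concrete initial value of i."""
    locations = set()
    for guard, locset in gls:
        if guard(i):
            locations |= locset
    return locations


# Guarded location sets for p at the end of each branch (P3 and P5).
p_at_p3 = [(lambda i: i < 10, {"i"})]
p_at_p5 = [(lambda i: not (i < 10), {"k"})]

# At the join point P6, p keeps both alternatives, each under its own guard.
p_at_p6 = join(p_at_p3, p_at_p5)

print(may_point_to(p_at_p6, 5))
print(may_point_to(p_at_p6, 12))
```

Keeping the guard next to each location set is what preserves path sensitivity across the join: nothing is merged into a single unconditional "p may point to i or k".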

12 Memory location modeling
Every memory location is represented by a location trace
A location trace is made up from
–Root
    Variable
    Global Variable
    Local Variable
    Formal parameter
    Return value of a function
–Field access
–De-referencing
There are ways to get parts of a location trace and compose them
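
One way to picture a location trace is as a small recursive data structure. The following Python sketch is an assumption about shape only; the class names `Root`, `Field` and `Deref` and the helpers `render` and `root_of` are hypothetical and do not come from Saturn.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Root:
    kind: str  # e.g. "global", "local", "formal", "return"
    name: str


@dataclass(frozen=True)
class Field:
    base: "Trace"
    name: str


@dataclass(frozen=True)
class Deref:
    base: "Trace"


Trace = Union[Root, Field, Deref]


def render(trace):
    """Print a trace in C-like syntax."""
    if isinstance(trace, Root):
        return trace.name
    if isinstance(trace, Field):
        return render(trace.base) + "." + trace.name
    return "(*" + render(trace.base) + ")"


def root_of(trace):
    """Strip field accesses and dereferences to recover the root."""
    while not isinstance(trace, Root):
        trace = trace.base
    return trace


# The location reached by dereferencing local p and selecting field next.
trace = Field(Deref(Root("local", "p")), "next")
print(render(trace))
```

Decomposition (`root_of`) and composition (wrapping a trace in `Field` or `Deref`) correspond to the "get parts and compose them" operations mentioned on the slide.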

13 Information representation
As a set of facts, which are instances of parameterised predicates
Example:
–g_guard(G, P) is a parameterised predicate where G is a guard and P is a program point
–To be interpreted as: the guard at program point P is G
–For a given program, there will be multiple instances of this predicate, one for each program point
–For every such instance, a fact will be stored
Facts are stored in a Berkeley database (BDB) for efficient storage/retrieval
All the information about the program model is stored as a set of such facts in one or more databases
Information from many built-in analyses is also stored

14 Saturn Tool chain
[Figure: tool chain diagram with the components C Program, C Front end, IR database, Analysis Specs (CLP), CLP interpreter, Constraint solvers, Summary databases and Summary/Error reports]

15 Analysis specification
Analysis is done over a database of facts
During analysis, more facts may get added to the database
Every analysis specification is a set of rules
Each rule is a list of goals goal1, goal2, …, goaln where the last goal must cause addition of some information to the database
A basic goal is of the form: pred_name(arg1, arg2, …, argn)
–Each arg may be bound to some value or it may be a free variable
Rules are checked for their success / failure
Checking of a rule proceeds from left to right for as long as the goals continue to succeed


18 Example
predicate num(N:int).
+num(1), +num(2), +num(3), +num(4), +num(5), +num(6), +num(7), +num(8), +num(9), +num(10),
+num(11), +num(12), +num(13), +num(14), +num(15), +num(16), +num(17), +num(18), +num(19), +num(20).
predicate multiple(A:int, B:int, C:int).
num(X), num(Y), num(Z), Z=X*Y, X\=1, Y\=1, +multiple(Z, X, Y).
predicate square(P:int).
multiple(Z,X,Y), X=Y, +square(Z).
predicate prime(P:int).
num(X), ~multiple(X,_,_), +prime(X).
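
To see which facts these rules derive, the same computation can be mirrored with Python set comprehensions. This is only a sketch of the rules' meaning, not of how the CLP interpreter evaluates them; the variable names are chosen to match the predicates.

```python
# Facts added by the +num(...) goals.
num = set(range(1, 21))

# multiple(Z, X, Y): Z = X * Y with X and Y both different from 1,
# and all three values present as num facts.
multiple = {
    (x * y, x, y)
    for x in num
    for y in num
    if x != 1 and y != 1 and (x * y) in num
}

# square(Z): some multiple fact has equal factors.
square = {z for (z, x, y) in multiple if x == y}

# prime(X): no multiple fact has X as its first argument (negated goal).
has_multiple_fact = {z for (z, _, _) in multiple}
prime = {x for x in num if x not in has_multiple_fact}

print(sorted(square))
print(sorted(prime))
```

Running the rules this way shows a detail that is easy to miss: since 1 has no `multiple` fact either, the rule as written also derives `prime(1)`.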

19 Saturn Spec Language
Saturn provides a specification language, CALYPSO, to express analyses
It is rule based and in some ways similar to Prolog
Most of the built-in analyses provided by Saturn are written in CALYPSO itself
Parameterised predicates are the basic unit of abstraction
Types of parameters supported are:
–Primitive types: boolean, int, float and string
–list[T], representing a list of values of type T
–The IR object types, available as built-in types
–Vector of bits, program point and location trace are some other examples of built-in types
–User-defined types
    Can be defined as enumerated types, aggregate types, and compositions of these and other primitive types

20 Predicate & Fact
A predicate denotes the type of a fact
Declared as Pred_name(arg1:type1, arg2:type2, arg3:type3, …, argn:typen)
Every predicate is given a meaning and used with that meaning consistently
Example: predicate reaches(FN:string, P:pp, TR:t_trace, A:c_instr, G:g_guard).
–In function FN, the definition of the variable with location trace TR, assigned through statement A, is in effect at program point P if G holds
A fact is an instance of a predicate
Many instances (facts) of the same predicate, with different argument values, may exist in the database

21 Goal
A goal is used to:
–Query the existence of matching facts for a predicate
–Check whether a boolean expression (guard) is satisfiable
–Add a new fact for a predicate
A goal succeeds or fails
A basic goal is of the form:
–Pred_name(arg1, arg2, arg3, …, argn)
–+Pred_name(arg1, arg2, arg3, …, argn)
Goals can be composed through negation, disjunction and conjunction to get new goals
Basic goal satisfaction
–When a goal is used to add a fact, it always succeeds
–By matching facts from the database, e.g. guard(P, G)
    Free arguments are bound to the corresponding values of the matching fact stored in the DB
–By invoking the constraint solver, e.g. guard_sat(G)
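
The matching of a goal against stored facts, with free arguments picking up values, can be sketched in a few lines of Python. This is an illustrative toy and not the CLP interpreter: the `Var` class, the `match` and `solve` helpers, and the `succ` predicate (a made-up "program point B follows A" relation) are all assumptions for the example.

```python
class Var:
    """A free variable in a goal, identified by name."""

    def __init__(self, name):
        self.name = name


def match(args, fact, bindings):
    """Match goal arguments against one fact; return new bindings or None."""
    new = dict(bindings)
    for arg, value in zip(args, fact):
        if isinstance(arg, Var):
            if arg.name in new:
                if new[arg.name] != value:
                    return None  # already bound to a different value
            else:
                new[arg.name] = value  # bind the free variable
        elif arg != value:
            return None  # constant argument does not match
    return new


def solve(goals, facts, bindings=None):
    """Evaluate a conjunction of goals left to right, yielding bindings."""
    bindings = {} if bindings is None else bindings
    if not goals:
        yield bindings
        return
    (pred, args), rest = goals[0], goals[1:]
    for fact in facts.get(pred, ()):
        extended = match(args, fact, bindings)
        if extended is not None:
            yield from solve(rest, facts, extended)


facts = {
    "guard": {("P1", "true"), ("P2", "i<10"), ("P3", "i<10")},
    "succ": {("P1", "P2"), ("P2", "P3")},  # hypothetical predicate
}

# guard(P, "i<10"): which program points carry the guard i<10?
points = sorted(b["P"] for b in solve([("guard", (Var("P"), "i<10"))], facts))

# succ(A, B), guard(A, G), guard(B, G): consecutive points with equal guards.
same_guard = list(solve(
    [("succ", (Var("A"), Var("B"))),
     ("guard", (Var("A"), Var("G"))),
     ("guard", (Var("B"), Var("G")))],
    facts,
))
print(points, same_guard)
```

In the second query, G is free in the second goal and already bound in the third, which is why the pair (P1, P2) is rejected: once guard(P1, G) binds G to "true", guard(P2, "true") finds no matching fact.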

22 Rule
A rule consists of a goal. An analysis spec consists of multiple rules
Every rule is checked independently
Checking a rule involves testing the success or failure of its goal
A goal consisting of a conjunction of sub-goals is evaluated from left to right. The goal succeeds if all sub-goals succeed.
A disjunction of sub-goals succeeds if any of the sub-goals succeeds
A rule is checked repeatedly, for as long as a new combination of values for the free variables of its predicates can be found
A set of rules is checked repeatedly until no more facts are added

23 Rule - example
predicate preaches(FN:string, P:pp, TR:t_trace, A:c_instr, G:g_guard).
predicate reaches(FN:string, P:pp, TR:t_trace, A:c_instr, G:g_guard).
cil_curfn(F), iset(PE, PX, ASN), guard(PE, GE),
    cil_instr_set(ASN, LHS, _), lval(PE, LHS, TR, GL),
    #and(GE, GL, FG), guard_sat(FG),
    +preaches(F, PX, TR, ASN, FG).

24 Analysis control
Top-down or bottom-up traversal
How to handle loops
–Three options
    Keeping loops as they are
    –Analyses which work on an acyclic CFG will not work
    Converting them into a conditional statement (no looping)
    –Will be unsafe but fast to analyse
    Converting them into a tail-recursive function
    –Will be safe but some analyses may not terminate
Setting priority of different analyses (to improve efficiency)
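
The difference between the last two loop options can be shown on a tiny function. The sketch below is written in Python for illustration; how Saturn performs these transformations on its IR is not shown in the slides, and the three function names are hypothetical.

```python
def loop_version(n):
    """Original code: sums 0 + 1 + ... + (n - 1) with a loop."""
    total, i = 0, 0
    while i < n:
        total += i
        i += 1
    return total


def conditional_version(n):
    """Loop replaced by a conditional: the body runs at most once.

    Cheap to analyse, but behaviours that need two or more
    iterations are lost, so results can be unsafe.
    """
    total, i = 0, 0
    if i < n:
        total += i
        i += 1
    return total


def tail_recursive_version(n, i=0, total=0):
    """Loop replaced by a tail-recursive function: same behaviour,
    and the call graph now carries the cycle instead of the CFG."""
    if i < n:
        return tail_recursive_version(n, i + 1, total + i)
    return total


print(loop_version(5), conditional_version(5), tail_recursive_version(5))
```

The tail-recursive form keeps every behaviour of the loop, which is why it is the safe option, while an analysis that chases the recursion without a bound may fail to terminate.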

25 Concept of session
The facts created are stored in databases identified by a session id
The session (database) may be partitioned through parameters coming from IR entities
Facts added in an inactive part of the current session are not available in the current analysis cycle
Therefore analysis can be staged, with the facts added in each stage going into a new database (session)
Facts in one session can be queried/added from another session's analysis by explicit qualification
When facts for a predicate are added but not queried in the same analysis, it is better to add them to a separate session
Useful for inter-procedural analysis and staged analysis

26 Example Analysis: Identifying Recursive Functions
import "/usr/local/clpa/analysis/base/cilbase.clp".
predicate calls(F:string, G:string).
analyze session_name("cil_body").
session callee_caller() containing [calls].
cil_curfn(F), dircall(_, CN), +callee_caller()->calls(F, CN).
predicate calls(F:string, G:string).
session callee_caller() containing [calls].
analyze session_name("callee_caller").
calls(F,G), calls(G,H), +calls(F,H).
calls(F,F), +recursive(F).
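
The second stage of this analysis is a transitive closure computed to a fixpoint. As a rough Python sketch of that logic (not of the CLP interpreter's evaluation strategy), with a made-up call graph and a hypothetical function name:

```python
def recursive_functions(direct_calls):
    """Close the calls relation transitively, then pick out self-callers."""
    calls = set(direct_calls)
    changed = True
    while changed:  # repeat until no new fact is added
        changed = False
        # Rule: calls(F,G), calls(G,H), +calls(F,H).
        for (f, g) in list(calls):
            for (g2, h) in list(calls):
                if g == g2 and (f, h) not in calls:
                    calls.add((f, h))
                    changed = True
    # Rule: calls(F,F), +recursive(F).
    return {f for (f, g) in calls if f == g}


# Hypothetical direct-call facts, as the first stage might record them.
direct = {
    ("main", "a"),
    ("a", "b"),
    ("b", "a"),   # a and b are mutually recursive
    ("c", "c"),   # c is directly recursive
    ("main", "d"),
}
found = recursive_functions(direct)
print(sorted(found))
```

The loop terminates because the set of possible (caller, callee) pairs is finite and facts are only ever added, which mirrors the "checked repeatedly till no more facts are added" behaviour described for rule sets.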

27 Inter-procedural analysis
Suitable summary information is conceptualised
Summary computed for each function
–At its exit / entry
–At the call site
Use summary information of the callee to get appropriate facts in the caller
Use summary information at the call site to get initial information at function entry

28 Example: Reaching definitions
Intra-procedural analysis
–Compute the definitions reaching a point
–While doing so, use summary information for function calls
Inter-procedural: summary at function exit
–What definitions are reaching unconditionally
–What definitions are reaching conditionally

29 Saturn: Scalability, precision, soundness
Intra-procedural
Path sensitive
Inter-procedural (summary based)
Use of BDBs
Inter-procedural results may be less precise than intra-procedural ones

30 Scalability
Already tried on the Linux kernel, which is a few million lines of code
A code base of close to one million lines took 4 hours of analysis for the memory leak checker
The Linux kernel (4.8 MLOC) took 19 hours of analysis time for semaphore lock checking
A limit can be set on the maximum time spent analysing a single function
–Allows for partial analysis of complex functions. It may be unsound but it will come out with some results

31 References
Yichen Xie and Alex Aiken, "Saturn: A Scalable Framework for Error Detection using Boolean Satisfiability"
http://saturn.stanford.edu

32 Thank You

