Copyright © 2016 Curt Hill Static Code Analysis What it is and does.


1 Copyright © 2016 Curt Hill Static Code Analysis What it is and does.

2 Introduction
A static code analyzer is a program that examines source code or the resulting binary
It attempts to find potential problems in the code
Static means the checking is done without running the program
–The alternative is a dynamic analyzer
Such a tool is handy because of two factors:
–Programmers
–Compilers

3 Programmers
Writing a bug-free program is nearly impossible
The complexity is too great
Thus programmers introduce errors:
–In the initial writing
–In a subsequent revision

4 Compilers
Compilers are mostly interested in parsing and generating code
–They do extensive static analysis
–The errors they detect mostly follow from their goal of generating object code
There are types of errors that they cannot detect, or cannot detect perfectly
–Generally, if they cannot always find an error, they never look for it

5 Compiler Static Analysis
One common form of static analysis that a compiler does is identifying blocks in order to optimize register usage
Suppose that this code is seen:
    x = x * 2;
On a general register machine the following actions need to be generated:
–x is loaded into a register
–The register is multiplied by 2
–The result is stored back in x
–That is 3 machine language operations

6 Example
Consider the following code:
    if(a>3)
        a = a*2+b;
    b = b / 5;
Since a is likely loaded into a register by the if, there is no need for a further load for the assignment
–No code between the test and the assignment could reload the register
However, no assumption can be made about b being in a register
–The statement b = b / 5 is reached both when the assignment ran and when it was skipped

7 In Contrast
Suppose this code:
    int a, b, c;
    cin >> b;
    c = 2*a / b;
Generally the compiler has no reason to complain that a is used before it is initialized
–Detecting this does not help with its code generation process
–Some compilers do complain, but not all
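
A tool that does complain about this can find the problem with a simple use-before-assignment pass. The following is a minimal sketch for straight-line code only (no branches, loops, or calls). The `Statement` struct and the `usedBeforeAssigned` function are hypothetical names invented for illustration, not part of any real compiler or analyzer.

```cpp
#include <set>
#include <string>
#include <vector>

// One simplified statement: the variables it reads and the variable it writes.
// (Hypothetical representation; a real analyzer works on a parse tree.)
struct Statement {
    std::vector<std::string> reads;
    std::string writes;  // empty if the statement assigns nothing
};

// Returns the variables that are read before any earlier statement assigned them.
// Handles straight-line code only: no branches, loops, or function calls.
std::set<std::string> usedBeforeAssigned(const std::vector<Statement>& program) {
    std::set<std::string> assigned;
    std::set<std::string> flagged;
    for (const Statement& s : program) {
        // Check the reads first: "x = x * 2" reads x before it writes x.
        for (const std::string& name : s.reads) {
            if (assigned.count(name) == 0) {
                flagged.insert(name);
            }
        }
        if (!s.writes.empty()) {
            assigned.insert(s.writes);
        }
    }
    return flagged;
}
```

Modeling `cin >> b;` as a statement that writes `b`, and `c = 2*a / b;` as one that reads `a` and `b` and writes `c`, the pass flags only `a`. Once branches are allowed the question gets harder, which is one reason not every compiler attempts it.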

8 Language Levels
FORTRAN did not require variable declarations
When a variable name was used, the compiler determined its type from the first letter of the name
The problem is that a misspelled variable name was not detected at compile time but only showed up at run time
C requires declarations
–This turns a run-time error into a compile-time error
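
The first-letter rule is easy to state in code. In classic FORTRAN, an undeclared name starting with I through N is INTEGER and any other name is REAL. The sketch below assumes the name starts with a letter; `implicitType` is a hypothetical helper written for this illustration.

```cpp
#include <cctype>
#include <string>

// Classic FORTRAN implicit typing: a name starting with I..N is INTEGER,
// any other starting letter is REAL. Assumes the name begins with a letter.
std::string implicitType(const std::string& name) {
    if (name.empty()) {
        return "INVALID";
    }
    const char first = static_cast<char>(
        std::toupper(static_cast<unsigned char>(name[0])));
    return (first >= 'I' && first <= 'N') ? "INTEGER" : "REAL";
}
```

Under this rule a misspelling such as `TOTL` for `TOTAL` is not an error. It silently becomes a new REAL variable, and the mistake surfaces only as a wrong result at run time.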

9 Other Errors
Unfortunately, not all run-time errors can be eliminated by good language design
–The uninitialized variable is an example
Some errors cannot be found until run time
–There is no way static analysis can find them
The most famous is the halting problem
–It has an elegant proof as well

10 The Halting Problem
All we want to know is whether a particular program will terminate
–We do not care if it does what it should
–We only want to know if there is an infinite loop
Moreover, we want to know this without actually running the program
–The run may be extremely long even if the program does halt

11 Setup
What we would like is a function
–Boolean Halt(program)
The function takes a program as a parameter
–Usually source code, but it could be object code
It produces a Boolean as a result
–True means the program stops
–False means it runs in an infinite loop
Alas, it is provably impossible to write such a function

12 Proof By Contradiction
Assume that Halt does exist
I now write a program that looks like this:
    if(Halt(x))
        while (true) cout << "You lose";
    else
        cout << "You still lose";
Finally I feed this program into Halt as x
–If Halt says it will stop, it runs an infinite loop
–If Halt says it will loop forever, it prints one message and stops
–Either way Halt gives the wrong answer, so Halt cannot exist
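
The contradiction can be checked by enumerating both answers that Halt could give about the program. This sketch does not implement Halt (none can exist); it only models the program's behavior as a value instead of actually looping. `Behavior`, `contraryProgram`, and `haltWasRight` are hypothetical names used for this illustration.

```cpp
// The two things a program can do, as far as the halting problem cares.
enum class Behavior { Halts, LoopsForever };

// Models the slide's program: haltSaysItStops is the answer a hypothetical
// Halt function gives when handed this very program.
Behavior contraryProgram(bool haltSaysItStops) {
    // if (Halt(x)) loop forever; else print one message and stop.
    return haltSaysItStops ? Behavior::LoopsForever : Behavior::Halts;
}

// True when Halt's verdict matches what the program actually does.
bool haltWasRight(bool haltSaysItStops) {
    const Behavior actual = contraryProgram(haltSaysItStops);
    return haltSaysItStops == (actual == Behavior::Halts);
}
```

`haltWasRight` is false for both `true` and `false`, so no answer Halt gives about this program can be correct.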

13 Results
The generalization of this result is that there are many things you cannot tell about a program without running it
–If we could always tell what a program did without running it, why would we run it?
Most often, if a compiler cannot detect an error all the time, it does not try to detect it at all

14 In Contrast
So is it hopeless?
–Not at all
Clearly if we see code like:
    for (int j=0; j>0;j--)…
we could complain that the loop body will never run
Similarly:
    while(k<m) m *= 2;
is also a problem
–For positive m the body only moves m further above k, so the condition never becomes false (short of overflow)
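
The first case needs no deep reasoning: when the loop's starting value and bound are both constants, an analyzer can evaluate the condition once. The following is a minimal sketch of that single check, not a description of how any particular tool works. `Relation`, `holds`, and `bodyNeverRuns` are hypothetical names.

```cpp
// The comparison used in the loop condition.
enum class Relation { Less, LessEqual, Greater, GreaterEqual };

// Evaluates "value <relation> bound".
bool holds(int value, Relation relation, int bound) {
    switch (relation) {
        case Relation::Less:         return value < bound;
        case Relation::LessEqual:    return value <= bound;
        case Relation::Greater:      return value > bound;
        case Relation::GreaterEqual: return value >= bound;
    }
    return false;  // unreachable for valid enum values
}

// True when "for (int j = init; j <relation> bound; ...)" can never enter
// its body because the condition is already false for the constant init.
bool bodyNeverRuns(int init, Relation relation, int bound) {
    return !holds(init, relation, bound);
}
```

For `for (int j=0; j>0; j--)` the call `bodyNeverRuns(0, Relation::Greater, 0)` returns true, which is grounds for a warning. The check says nothing when either value is unknown until run time.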

15 Static Code Analyzer
Static code analyzers are not perfect
–Nobody said they were
They may miss certain errors
–Even errors of a type they are looking for
They may flag things as errors that are not
–These are the false positives
Still, if they find any bugs then they are worth using
–What they find, we do not have to find

16 PFORT
The earliest of these seems to be PFORT, the Portable FORTRAN verifier
Published in 1974
It checked parameter passing and COMMON block usage in FORTRAN programs
It was intended to make it easier to move a FORTRAN program from one machine to another

17 LINT
Another early one is LINT
–Not an acronym
It finds problems in C files:
–Uninitialized variables
–Division by zero
–Constant conditions
–Calculations whose result is likely to be outside the range of values representable in the type used
First released outside of Bell Labs in about 1979

18 Languages
Static analyzers must be specific to the language they analyze
–They are typically built around a parser for that language
Most production languages have these
Consider some examples

19 Examples
Language        Applications
Ada             CodePeer, Fluctuat
C, C++          BLAST, Cppcheck, cpplint, Clang
Java            Checkstyle, JArchitect, SourceMeter
JavaScript      ESLint, JSCS
PHP             RIPS
PowerBuilder    PB Code Analyzer
Python          Pylint

20 Overlap
A static analyzer is not a glorified compiler
–They do have much in common
Both must have a scanner and a parser
They are typically looking for different things
A static analyzer may report that a reference parameter is never changed
–A compiler does not care
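
As an illustration of the reference-parameter finding, the sketch below shows a function an analyzer might flag and the form it would likely suggest. Both functions are hypothetical examples written for this note, and the exact wording of such a report varies by tool.

```cpp
#include <vector>

// Before: the non-const reference suggests the function may modify `values`,
// but it never does. The compiler accepts this without comment.
int sumBefore(std::vector<int>& values) {
    int total = 0;
    for (int v : values) {
        total += v;
    }
    return total;
}

// After: const documents and enforces that the argument is read-only,
// and lets callers pass const objects and temporaries.
int sumAfter(const std::vector<int>& values) {
    int total = 0;
    for (int v : values) {
        total += v;
    }
    return total;
}
```

Both versions generate essentially the same code, which is why the compiler has no stake in the difference. The analyzer's suggestion is about making the interface say what the function really does.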

21 What can be checked?
Memory leaks
–new without delete
–Other resource leaks as well, such as Windows handles
Uninitialized variables or memory
–Using something before initialization
Using NULL pointers
–Similar to an uninitialized variable
Range issues
–Assigning an int constant that is too large for a short
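
The "new without delete" case is the easiest of these to picture. The sketch below makes a leak observable with a live-object counter and shows one common fix, handing ownership to `std::unique_ptr`. `Resource`, `leaky`, and `safe` are hypothetical names for illustration only.

```cpp
#include <memory>

// Counts live Resource objects so that a leak is observable.
struct Resource {
    static int live;
    Resource() { ++live; }
    ~Resource() { --live; }
};
int Resource::live = 0;

// Leaks: new without a matching delete. This is the pattern an analyzer flags.
void leaky() {
    Resource* r = new Resource;
    (void)r;  // the pointer is dropped here and the object is never deleted
}

// No leak: std::unique_ptr deletes the object when r goes out of scope.
void safe() {
    std::unique_ptr<Resource> r(new Resource);
}
```

After a call to `safe()` the counter is back where it started, while each call to `leaky()` leaves it one higher. An analyzer can report the second pattern without running the program, because no path through `leaky` reaches a `delete`.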

22 Finally
We typically use these tools to check whole projects
They can help us find problems that could also be found by debugging
–Debugging is much more expensive
Neither technique will find all the bugs

