Evaluating Static Analysis Tools Dr. Paul E. Black


1 Evaluating Static Analysis Tools Dr. Paul E. Black paul.black@nist.gov http://samate.nist.gov/

2 Static and Dynamic Analysis Complement Each Other
Static Analysis
–Examines code
–Handles unfinished code
–Can find backdoors, e.g., full access for user name “JoshuaCaleb”
–Potentially complete
Dynamic Analysis
–Runs code
–Code not needed, e.g., embedded systems
–Has few(er) assumptions
–Covers end-to-end or system tests

3 Different Static Analyzers Are Used For Different Purposes
–To check for intellectual property violations
–By developers, to decide if anything needs to be fixed (and to learn better practices)
–By auditors or reviewers, to decide if the code is good enough for use

4 Dimensions of Static Analysis
[Figure: three dimensions of static analysis]
Properties: General (implicit), Application (explicit)
Code: Source, Byte code, Binary
Level of Rigor: Syntactic, Heuristic, Analytic, Formal
Analysis can look for general or application-specific properties.
Analysis can be on source code, byte code, or binary.
The level of rigor can vary from syntactic to fully formal.

5 SATE 2008 Overview
Static Analysis Tool Exposition (SATE) goals:
–Enable empirical research based on large test sets
–Encourage improvement of tools
–Speed adoption of tools by objectively demonstrating their use on real software
NOT to choose the “best” tool
Co-funded by NIST and DHS, Nat’l Cyber Security Division
Participants:
–Aspect Security ASC
–Checkmarx CxSuite
–Flawfinder
–Fortify SCA
–Grammatech CodeSonar
–HP DevInspect
–SofCheck Inspector for Java
–UMD FindBugs
–Veracode SecurityReview

6 SATE 2008 Events
Telecons, etc. to come up with procedures and goals
We chose 6 C & Java programs with security implications and gave them to tool makers (15 Feb)
Tool makers ran tools and returned reports (29 Feb)
We analyzed reports and (tried to) find “ground truth” (15 Apr)
We expected a few thousand warnings; we got over 48,000.
Critique and update rounds with some tool makers (13 May)
Everyone shared observations at a workshop (12 June)
We released our final report and all data 30 June 2009
http://samate.nist.gov/index.php/SATE.html

7 SATE 2008: There’s No Such Thing as “One Weakness”
Only 1/8 to 1/3 of weaknesses are simple.
The notion breaks down when
–weakness classes are related and
–data or control flows are intermingled.
Even “location” is nebulous.

8 How Weakness Classes Relate
[Figure: three kinds of relationships between weakness classes]
Hierarchy: Improper Input Validation CWE-20, Cross-Site Scripting CWE-79, Command Injection CWE-77
Chains: Validate-Before-Canonicalize CWE-180, Relative Path Traversal CWE-23
lang = %2e./%2e./%2e/etc/passwd%00
Composites: Symlink Following CWE-61, Container Errors CWE-216, Race Conditions CWE-362, Predictability CWE-340, Permissions CWE-275
from “Chains and Composites”, Steve Christey, MITRE http://cwe.mitre.org/data/reports/chains_and_composites.html
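The chain on this slide (CWE-180 leading to CWE-23) can be illustrated with a minimal Python sketch. The `lang` value is the one shown on the slide; the validator `looks_safe` and both wrapper functions are hypothetical names introduced only for illustration, and URL decoding stands in for whatever canonicalization the real application performs.

```python
from urllib.parse import unquote

# Request parameter value taken from the slide.
LANG = "%2e./%2e./%2e/etc/passwd%00"


def looks_safe(value: str) -> bool:
    """Hypothetical validator: reject any value with a '..' path segment."""
    return ".." not in value.split("/")


def validate_then_decode(raw: str) -> str:
    """Flawed order (CWE-180): validate the raw form, then canonicalize."""
    if not looks_safe(raw):
        raise ValueError("rejected")
    return unquote(raw)


def decode_then_validate(raw: str) -> str:
    """Safer order: canonicalize first, then validate the decoded form."""
    decoded = unquote(raw)
    if not looks_safe(decoded):
        raise ValueError("rejected")
    return decoded
```

With the flawed order, the encoded value passes validation and decodes to a relative path traversal (CWE-23); with the safer order, the same value is rejected. A tool could reasonably report either link of the chain, which is one reason counting "one weakness" is hard.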

9 Intermingled Flow: 2 sources, 2 sinks, 4 paths
[Figure: code excerpt marking free line 1503, free line 2644, use line 808, use line 819]
How many weakness sites?
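The counting question on this slide can be made concrete with a small Python sketch. The line numbers come from the slide; treating the two frees as sources and the two uses as sinks is an assumption about the figure, and the three counting conventions are only examples of how a tool or analyst might count.

```python
from itertools import product

# Line numbers from the slide. Assumption: frees are the sources,
# uses are the sinks.
frees = [1503, 2644]
uses = [808, 819]

# Every source can reach every sink, so each pair is a distinct path.
paths = list(product(frees, uses))

# Three plausible answers to "how many weakness sites?"
counts = {
    "by source": len(frees),
    "by sink": len(uses),
    "by path": len(paths),
}
```

Depending on the convention, the same code yields 2 or 4 weaknesses, so two tools can both be right and still disagree on the count.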

10 Other Observations
Tools can’t catch everything: cleartext transmission, unimplemented features, improper access control, …
Tools catch real problems: XSS, buffer overflow, cross-site request forgery
–13 of SANS Top 25 (21 with related CWEs)
Tools reported some 200 different kinds of weaknesses
–Buffer errors still very frequent in C
–Many XSS errors in Java
“Raw” report rates vary by 3x depending on code
Tools are even more helpful when “tuned”
Coding without security in mind leaves MANY weaknesses

11 Current Source Code Security Analyzers Have Little Overlap
[Figure: chart of hits by number of reporting tools: 2 tools, 3 tools, 4 tools, all 5 tools]
Non-overlap: Hits reported by one tool and no others (84%)
Overlap: Hits reported by more than one tool (16%)
from MITRE
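One way to compute overlap figures like these is sketched below in Python. The tool names and hits are hypothetical, and matching hits by exact (file, line, weakness) triples is a simplifying assumption; real reports rarely line up this cleanly.

```python
from collections import Counter

# Hypothetical data: each tool reports a set of (file, line, weakness) hits.
reports = {
    "tool_a": {("a.c", 10, "CWE-120"), ("a.c", 42, "CWE-476")},
    "tool_b": {("a.c", 10, "CWE-120"), ("b.c", 7, "CWE-79")},
    "tool_c": {("c.c", 99, "CWE-416")},
}


def overlap_fractions(reports):
    """Return (non_overlap, overlap) as fractions of distinct hits."""
    # Count how many tools reported each distinct hit.
    tools_per_hit = Counter(hit for hits in reports.values() for hit in hits)
    total = len(tools_per_hit)
    if total == 0:
        return 0.0, 0.0
    single = sum(1 for n in tools_per_hit.values() if n == 1)
    return single / total, (total - single) / total
```

For the sample data, three of the four distinct hits come from a single tool, so the non-overlap fraction is 0.75 and the overlap fraction is 0.25.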

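12 Precision & Recall Scoring
[Figure: scoring chart with two axes, each running from 0 to 100. One axis runs from “No True Positives” to “All True Positives” (finds mostly flaws); the other runs from “Misses Everything” to “Reports Everything” (finds more flaws). “Better” points toward “The Perfect Tool”.]
The Perfect Tool: finds all flaws and finds only flaws
from DoD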
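The two scores can be computed with the standard definitions, sketched here in Python on the 0 to 100 scale the chart uses. The function name and the convention of returning 0 when a denominator is zero are assumptions made for this sketch.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) as percentages.

    tp: reported warnings that are real flaws
    fp: reported warnings that are not flaws
    fn: real flaws the tool did not report
    """
    # Precision: what share of the reports are real flaws ("finds mostly flaws").
    precision = 100.0 * tp / (tp + fp) if tp + fp else 0.0
    # Recall: what share of the real flaws were reported ("finds more flaws").
    recall = 100.0 * tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

For example, a tool with 30 true positives, 10 false positives, and 20 missed flaws scores 75 on precision and 60 on recall; the perfect tool scores 100 on both.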

13 Tool A
[Figure: precision and recall chart, both axes from 0 to 100 (“No True Positives” to “All True Positives”, “Misses Everything” to “Reports Everything”), with one point per flaw type]
Flaw types plotted: Uninitialized variable use, Null pointer dereference, Improper return value use, Use after free, TOCTOU, Memory leak, Buffer overflow, Tainted data/Unvalidated user input, All flaw types
from DoD

14 Tool B
[Figure: precision and recall chart, both axes from 0 to 100 (“No True Positives” to “All True Positives”, “Misses Everything” to “Reports Everything”), with one point per flaw type]
Flaw types plotted: Uninitialized variable use, Null pointer dereference, Improper return value use, Use after free, TOCTOU, Memory leak, Buffer overflow, Tainted data/Unvalidated user input, Command injection, Format string vulnerability, All flaw types
from DoD

15 Best Tool
[Figure: precision and recall chart, both axes from 0 to 100 (“No True Positives” to “All True Positives”, “Misses Everything” to “Reports Everything”), with one point per flaw type]
Flaw types plotted: Uninitialized variable use, Improper return value use, Use after free, TOCTOU, Memory leak, Buffer overflow, Tainted data/Unvalidated user input, Command injection, Format string vulnerability, Null pointer dereference
from DoD

16 Tools Useful in Quality “Plains”
Tools alone are not enough to achieve the highest “peaks” of quality.
In the “plains” of typical quality, tools can help.
If code is adrift in a “sea” of chaos, train developers.
[Photo: Tararua mountains and the Horowhenua region, New Zealand. Swazi Apparel Limited, www.swazi.co.nz, used with permission]

17 Tips on Tool Evaluation
Start with many examples covering code complexities and weaknesses
–SAMATE Reference Dataset (SRD) http://samate.nist.gov/SRD
–Many cases from MIT: Lippmann, Zitser, Leek, Kratkiewicz
Add some of your typical code.
Look for
–Weakness types (CWEs) reported
–Code complexities handled
–Traces, explanations, and other analyst support
–Integration and machine-readable reports
–Ability to write rules and ignore “known good” code
False alarm ratio (fp/tp) is a poor measure.
Report density (r/kLoc) is probably better.
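The two measures named at the end of this slide can be written out as a short Python sketch. The function names are hypothetical, and reading r as the number of reports and kLoc as thousands of lines of code is an interpretation of the slide's abbreviations.

```python
def false_alarm_ratio(fp: int, tp: int):
    """fp/tp: false positives per true positive; undefined when tp is 0."""
    return fp / tp if tp else None


def report_density(reports: int, lines_of_code: int) -> float:
    """r/kLoc: reports per thousand lines of code."""
    return reports / (lines_of_code / 1000)
```

For instance, 480 reports on 120,000 lines of code give a density of 4 reports per kLoc. Note that the false alarm ratio needs every report classified as true or false, while report density needs only the raw report count and the code size.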

