Using Programmer-Written Compiler Extensions to Catch Security Holes
Authors: Ken Ashcraft and Dawson Engler
Presented by: Hong Chen
CS590F, 2/7/2007
Outline
- Motivation
- Example: Range Checker
- Solution Details
- Belief Inference
- Analysis Issues
- Enforcing Obscure Rules
- Evaluation
- Discussion
Motivation (1)
- Problem
  - Find security holes (security rule violations) in the source code of system software
- Security rules
  - Sanitize untrusted input before using it
  - Do not release sensitive data to unauthorized users
- Observation
  - Many rules are poorly understood and erratically obeyed
- Approach
  - Use static analysis to check whether security rules are obeyed
Motivation (2)
- Program analysis: intuition
  - Tool
  - Security rules: domain-specific, system-specific, high-level
- Metacompilation
  - Make it easy for programmers to add rules
Example: Range Checker (1)
- Security rule
  - Integers supplied by untrustworthy sources should be range-checked before being used for dangerous operations
Range Check (2)
- The checker needs to identify
  - Untrustworthy sources that generate data
  - Checks that must be done to sanitize the data
  - Trusting sinks that must be protected
- Untrustworthy sources
  - System calls (sys_*)
  - Routines that copy data from user space (copy_from_user, copyin)
  - Data from the network
Range Check (3)
- Sanitizing data
  - Signed integers: lower- and upper-bound checks
  - Unsigned integers: upper-bound check
  - Tricky: integer overflow
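The sanitizing rule on this slide can be sketched as a small function. This is only an illustration in Python, not the checker's actual code; the function name and its parameters are hypothetical.

```python
def missing_bound_checks(is_signed, lower_checked, upper_checked):
    """Return the bound checks still needed before the value is safe to use.

    Hypothetical helper: every integer needs an upper-bound check, and a
    signed integer also needs a lower-bound check.
    """
    missing = []
    if not upper_checked:
        missing.append("upper")
    if is_signed and not lower_checked:
        missing.append("lower")
    return missing
```

A signed value that has only been compared against an upper bound still reports a missing lower-bound check, while the same history on an unsigned value reports nothing.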
Range Check (4)
- Trusting sinks
  - Array index
  - Loop bound
  - Copying/allocation routines
- Potentially 3 x 3 x 3 = 27 types of security holes!
Implementation (1)
Implementation (2)
- State machine representation
  - Metal: a high-level state-machine language
  - Compiler extension linked into xgcc
  - States can be global or bound to expressions
- How does it work?
  - “After xgcc translates each input function into its internal representation, the checker is applied down every possible execution path in that function”
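To make the state-machine idea concrete, here is a minimal sketch in Python of a range checker run along a single execution path. It is not metal syntax and not the xgcc implementation; the event names ("source", "check_upper", "check_lower", "sink") and the function `check_path` are assumptions made for illustration. State is bound to each variable, mirroring "states bound to expressions".

```python
def check_path(path, signed_vars=frozenset()):
    """Walk one execution path and report tainted values that reach a sink.

    path: list of (event, variable) pairs in execution order.
    Returns a list of (variable, missing_bounds) errors.
    """
    state = {}   # variable -> set of bound checks still missing
    errors = []
    for event, var in path:
        if event == "source":
            # Value arrives from an untrustworthy source: it becomes tainted.
            state[var] = {"upper", "lower"} if var in signed_vars else {"upper"}
        elif event in ("check_upper", "check_lower") and var in state:
            state[var].discard(event.split("_")[1])
            if not state[var]:
                del state[var]   # fully checked: back to the clean state
        elif event == "sink" and var in state:
            errors.append((var, sorted(state[var])))
    return errors
```

For example, a signed `n` that is read from the user, compared only against an upper bound, and then used as an array index is reported with a missing lower bound. A real checker would run this transition function down every path of every function rather than over one hand-written event list.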
Implementation (3)
Advantages
- Propagate the knowledge of one programmer to many
  - Security rules are subtle
- Find difficult-to-observe errors
- Catch errors without running the code
  - Many errors are found in drivers
- Lightweight
About the checker
- Ad hoc knowledge (security rules)
- Effective (the range checker finds 100+ errors in Linux)
- False negatives
- False positives
Belief Inference
- Traditional checkers: hardwired knowledge
- MC: use code behavior to infer checking properties
- Inference
  - Untrustworthy sources
  - Trusting sinks
  - Network data
Deriving Untrustworthy Sources
- Challenges
  - There are many untrustworthy sources
  - They are difficult to analyze
- Use inference
  - Untrustworthy input is often used in stylized ways
Deriving Trusting Sinks
- Normal checking sequence
  - (1) OS reads data from an unsafe source
  - (2) Check the data
  - (3) Pass it to a trusting sink
- What if (3) is missing? Something may be wrong…
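One way to read this slide: data that is read and checked but never reaches a known sink suggests the list of sinks is incomplete. The sketch below, in Python, flags the routines such data is passed to as candidate sinks. The event encoding and the function name are hypothetical, and this is an interpretation of the slide rather than the paper's algorithm.

```python
def candidate_sinks(path, known_sinks):
    """Return {variable: callees} for data that was read from an unsafe
    source and checked, but never passed to a known trusting sink.

    path: list of (event, variable, callee) triples; callee is None
    unless the event is "call".
    """
    read, checked, reached = set(), set(), set()
    passed_to = {}
    for event, var, callee in path:
        if event == "read":
            read.add(var)
        elif event == "check" and var in read:
            checked.add(var)
        elif event == "call" and var in checked:
            if callee in known_sinks:
                reached.add(var)          # step (3) is present
            else:
                passed_to.setdefault(var, set()).add(callee)
    # Step (3) is missing for these variables: their callees are suspects.
    return {v: passed_to.get(v, set()) for v in checked - reached}
```

In this sketch, a checked `len` that is only ever passed to an unknown routine returns that routine as a candidate sink for a person to review.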
Network Data
- Challenge
  - Network data is not trustworthy
  - sk_buff holds network data
  - Incoming or outgoing?
- Candidates
  - If the fields are read more often than written, the structure is incoming
  - If the checker sees the allocation of the structure, it’s outgoing
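The two candidate heuristics can be combined in a few lines. This Python sketch is an assumption-laden illustration: the function name, the "unknown" fallback, and the choice to test allocation first are not from the slides.

```python
def classify_packet_buffer(field_reads, field_writes, allocation_seen):
    """Guess whether a packet structure holds incoming or outgoing data.

    Assumed ordering: a visible allocation is treated as the stronger
    signal, then the read/write ratio is consulted.
    """
    if allocation_seen:
        return "outgoing"    # the code built this packet itself
    if field_reads > field_writes:
        return "incoming"    # mostly read: likely received from the network
    return "unknown"
```

Only buffers classified as incoming would then be treated as untrustworthy sources.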
Analysis – Transitive Tainting
- Allow tainted variables to transitively taint other variables
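Transitive tainting along one path can be sketched as follows. The representation of assignments as (destination, sources) pairs and the function name are hypothetical, and this simplified version only adds taint; it never clears it when a variable is overwritten with a clean value.

```python
def propagate_taint(assignments, initially_tainted):
    """Propagate taint through assignments in execution order.

    assignments: list of (dst, [src, ...]) pairs, meaning dst is computed
    from the listed source variables.
    """
    tainted = set(initially_tainted)
    for dst, srcs in assignments:
        if any(src in tainted for src in srcs):
            tainted.add(dst)   # dst inherits taint from any tainted operand
    return tainted
```

With `user_len` tainted, `len = user_len` followed by `idx = len + base` taints both `len` and `idx`, while `k = base` stays clean.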
Analysis – Inter-procedural Analysis (1)
- The user provides only the “base” unsafe sources and trusting sinks
- Automatically compute all procedures that transitively produce or consume data
- Two-pass process
  - First pass: emit a call graph, compute the transitive set of functions, and store the calculated sources and sinks in text files
  - Second pass: at call sites, taint variables / report errors
- Special case: function pointers
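The two passes can be sketched in Python as below. This is a simplified model, not the tool's implementation: the call graph is reduced to "which callees' results does each function return", the text-file step is skipped, function pointers are ignored, and all names are hypothetical.

```python
def transitive_sources(returns_result_of, base_sources):
    """Pass 1: compute every function that transitively produces unsafe data.

    returns_result_of: dict mapping a function to the callees whose
    results it hands back to its own caller.
    """
    sources = set(base_sources)
    changed = True
    while changed:                 # iterate to a fixed point
        changed = False
        for fn, callees in returns_result_of.items():
            if fn not in sources and any(c in sources for c in callees):
                sources.add(fn)
                changed = True
    return sources


def tainted_at_call_sites(call_sites, sources):
    """Pass 2: taint each variable assigned from a call to a source.

    call_sites: list of (assigned_variable, callee) pairs.
    """
    return {var for var, callee in call_sites if callee in sources}
```

Starting from the single base source `copy_from_user`, a wrapper `get_len` and its caller `parse_hdr` both become sources, so a variable assigned from `parse_hdr()` is tainted in the second pass. Sinks can be computed the same way in the opposite direction.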
Analysis – Inter-procedural Analysis (2)
Analysis – False Positives
- Checker design
  - First write simple checkers
  - Then eliminate false positives
- Common false positives
  - “Fancy” bound checks
  - Taint granularity
  - Subroutine checks bounds
Analysis – False Negatives
- First of all, false negatives are expected…
- Potential improvements
  - Comparison with the correct value
  - Other information flow channels (tainted value stored in a data structure)
  - Information lost during inter-procedural analysis
  - Only local inference
Enforcing Obscure Rules (1)
- The length-field copy attack
- Signed integers must be checked against both a lower and an upper bound
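The reason for the rule can be simulated in Python. The sketch assumes a 32-bit unsigned size type and a 64-byte buffer; both values and all function names are illustrative assumptions, not details from the slides.

```python
BUF_SIZE = 64  # assumed destination buffer size


def as_unsigned(n, bits=32):
    """Model the implicit signed-to-unsigned conversion at a copy routine."""
    return n & ((1 << bits) - 1)


def upper_only_check(length):
    """Buggy check: a negative length passes."""
    return length <= BUF_SIZE


def full_check(length):
    """Correct check for a signed length: both bounds."""
    return 0 <= length <= BUF_SIZE
```

An attacker-supplied length of -1 passes `upper_only_check`, and `as_unsigned(-1)` is then 4294967295, so a copy routine taking an unsigned size would copy far past the buffer. `full_check(-1)` rejects it.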
Enforcing Obscure Rules (2)
- Integer overflow
  - Fixed-size arithmetic
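Fixed-size arithmetic can defeat a bounds check that looks correct. The Python sketch below models 32-bit unsigned addition with a mask; the width, the variable names, and the particular "safe" formulation are assumptions chosen for illustration.

```python
MASK32 = 0xFFFFFFFF


def add_u32(a, b):
    """32-bit unsigned addition: the result wraps around on overflow."""
    return (a + b) & MASK32


def naive_check(offset, length, size):
    """Buggy: offset + length can wrap to a small value and pass."""
    return add_u32(offset, length) <= size


def safe_check(offset, length, size):
    """Avoids the addition entirely, so nothing can wrap."""
    return offset <= size and length <= size - offset
```

With `offset = 0xFFFFFFF0`, `length = 0x20`, and `size = 0x100`, the sum wraps to `0x10`, so `naive_check` accepts the request while `safe_check` rejects it.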
Evaluation (1) – Errors Overview
- Severe errors are as common as minor ones
Evaluation (2) – Errors Overview
- Most bugs are local
- Low false positive rate
Evaluation (3) – Results Validation
- Linux (2.4.5 – )
  - Post errors to Linux Kernel
  - Count unique errors
  - Many resulted in kernel patches
  - False result – kernel developers will explain why
  - Minor bugs – fixes may introduce the possibility of new bugs
- OpenBSD (2.9)
  - Submitted to a local BSD hacker
  - All errors resulted in kernel patches
- Total kernel patches: 50+
Discussion (1)
- Core techniques
  - Static analysis – state machine model, although implementation details are not given (see details)
  - Make ad hoc knowledge powerful
  - Belief inference (saves the effort of specifying everything)
  - Extract information from source code presentation
Discussion (2)
- How to do better?
  - Static analysis (other models/tools?)
  - Finding errors (combine with dynamic analysis?)
- Other applications
  - Finding bugs (past work)
  - …
Thank you
Reference
Ken Ashcraft and Dawson Engler. Using Programmer-Written Compiler Extensions to Catch Security Holes. In Proceedings of the IEEE Symposium on Security and Privacy, 2002.