Saner: Composing Static and Dynamic Analysis to Validate Sanitization in Web Applications. Davide Balzarotti, Marco Cova, Vika Felmetsger, Nenad Jovanovic, Engin Kirda, Christopher Kruegel, Giovanni Vigna. 2008 IEEE Symposium on Security and Privacy.
OUTLINE 1. Introduction 2. Motivation 3. Approach 4. Evaluation 5. Conclusions
1. Introduction: A report published by Symantec in March states that, out of the 2,526 vulnerabilities, 66% affected web applications. OWASP’s Top Ten Project lists unvalidated input as the number one cause of vulnerabilities in web applications.
1. Introduction (cont.) A particular type of input validation is sanitization. Existing analyses typically assume that if a sanitization operation is performed on all paths from sources (the application’s inputs) to sinks (security-relevant operations), then the application is secure.
1. Introduction (cont.) The authors present a novel approach that combines static and dynamic analysis techniques to analyze the correctness of the sanitization process. The approach is implemented in Saner, a prototype that analyzes PHP applications.
2. Motivation: Input Validation and Sanitization. Sensitive sinks; SQL injection vulnerability; XSS vulnerability; two options when such invalid values are found.
2. Motivation (cont.) Static Analysis and Proper Sanitization: The input sanitization depends on the type of sink that consumes the input. Specifying all sanitization operations a priori is difficult.
2. Motivation (cont.) Current static analysis systems typically disregard the use of custom sanitization routines. What is needed is a technique that can handle custom sanitization routines and properly track the effect of functions that manipulate and modify program input.
3. Approach: The static analysis component is based on Pixy, an open-source web vulnerability scanner. The goal of the dynamic phase is to examine all the program paths from input sources to sensitive sinks that the static analysis has identified as suspicious.
3. Approach (cont.) Sanitization-Aware Static Analysis; Testing Sanitization Routines
3. Approach (cont.) Basic String Automata: Finite automata are used as acceptors. That is, they decide whether a string value belongs to a certain language.
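A minimal sketch of an automaton used as an acceptor, written in Python for illustration since the slides contain no code. The `DFA` class, the two-class alphabet abstraction, and the example language (strings that contain no `<` character) are assumptions made for this example and are not taken from Saner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DFA:
    """Deterministic finite automaton (hypothetical helper, not Saner's data structure)."""
    start: str
    accepting: frozenset
    transitions: dict  # maps (state, symbol_class) -> next state

    def classify(self, ch: str) -> str:
        # Collapse the alphabet into two classes to keep the transition table small.
        return "lt" if ch == "<" else "other"

    def accepts(self, text: str) -> bool:
        state = self.start
        for ch in text:
            state = self.transitions[(state, self.classify(ch))]
        return state in self.accepting


# Language: all strings that contain no '<' character.
NO_ANGLE = DFA(
    start="clean",
    accepting=frozenset({"clean"}),
    transitions={
        ("clean", "other"): "clean",
        ("clean", "lt"): "dirty",
        ("dirty", "other"): "dirty",
        ("dirty", "lt"): "dirty",
    },
)
```

Calling `NO_ANGLE.accepts(value)` answers the membership question for one concrete string; the static analysis reasons about whole sets of strings by manipulating such automata rather than individual values.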
3. Approach (cont.) Dependence Graphs: Dependence analysis is a data flow analysis that computes a dependence graph for every program point and each variable.
3. Approach (cont.) Computing Automata
3. Approach (cont.) Cyclic Dependence Graphs: Strongly connected components (SCCs) in the dependence graph are replaced with special SCC nodes.
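The SCC replacement step can be sketched as a graph condensation. The following Python sketch uses Tarjan's algorithm, which is one standard way to find SCCs; the slides do not say which algorithm Saner uses, and the function names, the adjacency-dict representation, and the `SCC(...)` labels are hypothetical.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm; graph maps node -> list of successor nodes."""
    index, low, on_stack = {}, {}, set()
    stack, components = [], []

    def visit(node):
        index[node] = low[node] = len(index)
        stack.append(node)
        on_stack.add(node)
        for succ in graph.get(node, []):
            if succ not in index:
                visit(succ)
                low[node] = min(low[node], low[succ])
            elif succ in on_stack:
                low[node] = min(low[node], index[succ])
        if low[node] == index[node]:
            # node is the root of a component: pop its members off the stack.
            component = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.add(member)
                if member == node:
                    break
            components.append(frozenset(component))

    for node in graph:
        if node not in index:
            visit(node)
    return components


def condense(graph):
    """Replace every multi-node SCC with one special node, yielding an acyclic graph."""
    owner = {}
    for component in strongly_connected_components(graph):
        if len(component) > 1:
            label = "SCC(" + ",".join(sorted(component)) + ")"
        else:
            label = next(iter(component))
        for member in component:
            owner[member] = label
    condensed = {label: set() for label in owner.values()}
    for node, successors in graph.items():
        for succ in successors:
            if owner[node] != owner[succ]:
                condensed[owner[node]].add(owner[succ])
    return condensed


# Example: b and c depend on each other (for instance, through a loop).
example = {"a": ["b"], "b": ["c"], "c": ["b", "sink"], "sink": []}
```

Here `condense(example)` collapses the cycle between `b` and `c` into a single node, so later passes can process the graph in topological order.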
3. Approach (cont.) Discussion: the use of string literals; the concatenation of two strings; the use of a built-in function. Saner does not handle the manipulation of strings through indexing.
3. Approach (cont.) Precise Function Modeling: Saner introduces precise modeling of string-modifying functions (such as str_replace) and of replacement functions that use regular expressions (ereg_replace and preg_replace).
3. Approach (cont.) Vulnerability Detection Through Intersection: The automaton that represents the sink’s input is intersected with an automaton that encodes the set of undesired strings. A non-empty intersection means that undesired strings can reach the sink.
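The intersection check can be sketched with the textbook product construction over two deterministic automata, followed by a breadth-first search for an accepted string. This is an illustration in Python under simplifying assumptions (a two-symbol alphabet, complete transition tables); the dict layout and the name `intersection_witness` are hypothetical and do not describe Saner's implementation.

```python
from collections import deque


def intersection_witness(dfa_a, dfa_b, alphabet):
    """Return one string accepted by both DFAs, or None if the intersection is empty.

    Each DFA is a dict with keys 'start', 'accepting' (a set) and
    'delta' (a dict mapping (state, symbol) -> state).
    """
    start = (dfa_a["start"], dfa_b["start"])
    queue = deque([(start, "")])
    seen = {start}
    while queue:
        (state_a, state_b), prefix = queue.popleft()
        if state_a in dfa_a["accepting"] and state_b in dfa_b["accepting"]:
            return prefix
        for symbol in alphabet:
            nxt = (dfa_a["delta"][(state_a, symbol)], dfa_b["delta"][(state_b, symbol)])
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, prefix + symbol))
    return None


ALPHABET = ["<", "x"]

# Undesired strings: anything that contains '<'.
undesired = {
    "start": "q0", "accepting": {"q1"},
    "delta": {("q0", "<"): "q1", ("q0", "x"): "q0", ("q1", "<"): "q1", ("q1", "x"): "q1"},
}
# Sink input after a sanitizer that removes every '<'.
sanitized = {
    "start": "s0", "accepting": {"s0"},
    "delta": {("s0", "x"): "s0", ("s0", "<"): "dead", ("dead", "x"): "dead", ("dead", "<"): "dead"},
}
# Sink input with no sanitization at all: every string is possible.
unsanitized = {
    "start": "u0", "accepting": {"u0"},
    "delta": {("u0", "x"): "u0", ("u0", "<"): "u0"},
}
```

With these definitions, `intersection_witness(sanitized, undesired, ALPHABET)` finds no common string, while the unsanitized automaton yields a concrete witness that could serve as a starting point for an attack input.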
3. Approach (cont.) Implicit Taint Propagation: False positives. Strings that are statically embedded into the application by the programmer are replaced by the empty string. Saner checks whether the second parameter of str_replace is tainted.
3. Approach (cont.) Providing Information to Dynamic Analysis: The detection of routines that perform insufficient custom sanitization. The information is extracted from the dependence graphs that static analysis uses internally.
3. Approach (cont.) Sanitization-Aware Static Analysis; Testing Sanitization Routines
3. Approach (cont.) Dynamic Analysis: The goal is to test the effectiveness of the sanitization routines. A vulnerability may be exploited only if the application is in a certain, well-defined state.
3. Approach (cont.) Extracting the Sanitization Graph: The sanitization graph is a subgraph of the interprocedural dataflow graph of the application.
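One plausible way to carve such a subgraph out of a dataflow graph is to keep only the nodes that lie on some path from a given source to a given sink. The Python sketch below does this by intersecting forward reachability from the source with backward reachability from the sink. It is an assumption-laden illustration: the slides do not specify Saner's extraction procedure, and the node names and function names are hypothetical.

```python
def reachable(graph, start):
    """Return all nodes reachable from start; graph maps node -> list of successors."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for succ in graph.get(node, ()):
            if succ not in seen:
                seen.add(succ)
                stack.append(succ)
    return seen


def extract_subgraph(graph, source, sink):
    """Keep only nodes and edges that lie on a path from source to sink."""
    reverse = {}
    for node, successors in graph.items():
        for succ in successors:
            reverse.setdefault(succ, []).append(node)
    keep = reachable(graph, source) & reachable(reverse, sink)
    return {node: [s for s in graph.get(node, ()) if s in keep] for node in keep}


# Toy dataflow graph: user input flows through strip_tags into echo,
# and separately into a logging call that is irrelevant for this sink.
dataflow = {
    "$_GET": ["strip_tags", "log"],
    "strip_tags": ["echo"],
    "config": ["echo"],
    "log": [],
    "echo": [],
}
```

For the toy graph, `extract_subgraph(dataflow, "$_GET", "echo")` drops `log` and `config`, leaving only the chain that the dynamic phase would need to exercise.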
3. Approach (cont.) Testing the Effectiveness of the Sanitization Routines: infeasible paths; a large number of test cases; oracle functions.
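The roles of test cases and oracle functions can be sketched as a small harness: feed attack strings through a sanitization routine and let an oracle decide whether the output is still dangerous. Everything here is hypothetical and written in Python for illustration: the sanitizer `naive_sanitize`, the oracle `xss_oracle`, and the three-entry input list stand in for the much larger test sets and real PHP routines the slide refers to.

```python
def naive_sanitize(value: str) -> str:
    """Hypothetical custom sanitizer: removes '<script>' in a single pass."""
    return value.replace("<script>", "")


def xss_oracle(output: str) -> bool:
    """Hypothetical oracle: flags output that still contains an opening script tag."""
    return "<script>" in output.lower()


ATTACK_STRINGS = [
    "<script>alert(1)</script>",
    "<scr<script>ipt>alert(1)</script>",  # nested tag survives one removal pass
    "plain text",
]


def failing_inputs(sanitizer, oracle, inputs):
    """Return the inputs whose sanitized form is still flagged by the oracle."""
    return [value for value in inputs if oracle(sanitizer(value))]
```

Running `failing_inputs(naive_sanitize, xss_oracle, ATTACK_STRINGS)` reports only the nested-tag input: removing the inner `<script>` once reassembles a new `<script>` from the surrounding fragments.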
4. Evaluation: Saner was evaluated on five popular, publicly available PHP applications that contain custom sanitization routines.
4. Evaluation (cont.) Discussion of Sanitization Errors: The sanitization can contain programming errors. The sanitization process can be bug-free but insufficient.
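The second case, sanitization that is bug-free but insufficient, can be illustrated with a sanitizer that does exactly what it promises yet does not fit the sink. In this Python sketch (using the standard `sqlite3` module in place of a PHP database call), the hypothetical `escape_quotes` routine correctly doubles single quotes, but the value is placed in an unquoted numeric context where quotes are not needed for an injection. The table, data, and function names are invented for the example and do not come from the evaluated applications.

```python
import sqlite3


def escape_quotes(value: str) -> str:
    """Hypothetical sanitizer: doubles single quotes, correct for quoted SQL strings."""
    return value.replace("'", "''")


def lookup(conn, user_id: str):
    # The sanitized value lands in an unquoted numeric context,
    # so escaping quotes does not constrain what the attacker can inject.
    query = "SELECT name FROM users WHERE id = " + escape_quotes(user_id)
    return sorted(row[0] for row in conn.execute(query))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])
```

A benign call such as `lookup(conn, "1")` returns a single row, whereas `lookup(conn, "1 OR 1=1")` returns every row even though the input contains no quote for the sanitizer to escape.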
4. Evaluation (cont.) Jetbox [figure]
4. Evaluation (cont.) PBLGuestbook [figure: malicious code]
4. Evaluation (cont.) Discussion of Effectiveness and Efficiency: The combination of static and dynamic techniques proved to be effective.
5. Conclusion: Web applications perform mission-critical tasks and handle sensitive information. Saner is a novel approach to the evaluation of the sanitization process in web applications. It identified novel vulnerabilities that stem from incorrect or incomplete sanitization.