Finding Bugs in Dynamic Web Applications
Shay Artzi, Adam Kiezun, Julian Dolby, Frank Tip, Danny Dig, Amit Paradkar, Michael D. Ernst


1 Finding Bugs in Dynamic Web Applications
Shay Artzi, Adam Kiezun, Julian Dolby, Frank Tip, Danny Dig, Amit Paradkar, Michael D. Ernst
Presented by: Christopher Hamilton

2 Introduction
Webscript crashes and malformed dynamically generated Web pages impact the usability of Web applications
Current tools for Web-page validation cannot handle the dynamically generated pages on today’s Internet

3 The Problem
Bad scripts create syntactically malformed HTML
– Less portable across browsers and new browser versions
– HTML may fail to display on some executions
– Browsers’ attempts to correct the HTML → crashes and security vulnerabilities
– Important information may be discarded
– Search engines have trouble indexing the pages correctly

4 More Problems
Dynamic Web page testing challenges
– HTML validation tools only test static pages
Developer must perform
– Static testing
– Dynamic testing

5 Previous Work
Dynamic test-generation tools (DART, CUTE, EXE)
– Execute the application on concrete inputs
– Create additional inputs by solving symbolic constraints derived from control paths
– Not practical for Web applications

6 The Authors’ Goals
Present an automated technique for finding faults that manifest as Web application crashes or malformed HTML
Identify the minimal part of the input responsible for triggering a failure
Use an oracle to detect specification violations in the application’s output

7 Apollo at a Glance
On each execution:
– Combines concrete and symbolic execution with constraint solving
– The program is monitored to record path constraints capturing the outcome of control-flow predicates
– An oracle determines whether a fatal failure or malformed HTML occurs
– New inputs are created automatically and iteratively to explore different execution paths

8 PHP Scripting Language
Widely used in Web development
– Network interactions
– Database
– HTTP processing
Object oriented
– Classes, interfaces, dynamically dispatched methods
– Similar to Java
Scripting
– Dynamic typing & eval

    <?php

    make_header(); // print HTML header

    // Make the $page variable easy to use //
    if(!isset($_GET['page'])) $page = 0;
    else $page = $_GET['page'];

    // Bring up the report cards and stop processing //
    if($_GET['page2']==1337) {
      require('printReportCards.php');
      die(); // terminate the PHP program
    }

    // Validate and log the user into the system //
    if($_GET["login"] == 1) validateLogin();

    switch ($page)
    {
      case 0: require('login.php'); break;
      case 1: require('TeacherMain.php'); break;
      case 2: require('StudentMain.php'); break;
      default: die("Incorrect page number. Please verify.");
    }

    make_footer(); // print HTML footer
    ...
    function validateLogin() {
      if(!isset($_GET['username'])) {
        echo " username must be supplied. \n";
        return;
      }
      $username = $_GET['username'];
      $password = $_GET['password'];
      if($username=="john" && $password=="theTeacher")
        $page=1;
      else if($username=="john" && $password=="theStudent")
        $page=2;
      else echo " Login error. Please try again \n";
    }

    function make_header() { // print HTML header
      print("
      <!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01//EN\"
      \"http://www.w3.org/TR/html4/strict.dtd\">

      Class Management
      ");
    }

    function make_footer() { // close HTML elements opened by header()
      print("

      ");
    }
    ?>

9 Failures in PHP Scripts
Execution failures
– Missing an included file
– Wrong MySQL query
– Uncaught exceptions
Malformed HTML
– Generated HTML page is not syntactically correct according to an HTML validation tool
Faults in the example program
– 'printReportCards.php' is missing
– make_footer() is not executed in certain situations → unclosed HTML tag
– Generates an illegal tag

10 Failure-Finding in PHP Applications
Concolic testing: execute the application on an initial input, then on additional inputs obtained by solving constraints derived from the exercised control-flow paths
Extensions
– Validate the correctness of the generated output
– PHP constructs such as isset, empty, and require call for constraints that do not arise in other object-oriented languages
– Use a pre-specified set of values for database authentication
– Simulate each user input by transforming the code

11 Transformation of Code
For each page (h) that contains N buttons
– Add an additional input parameter p to the PHP program, with values ranging from 1 to N
– Insert a switch statement that includes the appropriate PHP source file, depending on p
Required modifications are minimal → performed by hand

12 The Failure Detection Algorithm
parameters: Program P, oracle O
result: Bug reports B; B : setOf(⟨failure, setOf(pathConstraint), setOf(input)⟩)

    P′ ≔ simulateUserInput(P);
    B ≔ ∅;
    pcQueue ≔ emptyQueue();
    enqueue(pcQueue, emptyPathConstraint());
    while not empty(pcQueue) and not timeExpired() do
        pathConstraint ≔ dequeue(pcQueue);
        input ≔ solve(pathConstraint);
        if input ≠ ⊥ then
            output ≔ executeConcrete(P′, input);
            failures ≔ getFailures(O, output);
            foreach f in failures do
                merge ⟨f, pathConstraint, input⟩ into B;
            c1 ∧ ... ∧ cn ≔ executeSymbolic(P′, input);
            foreach i = 1,...,n do
                newPC ≔ c1 ∧ ... ∧ ci−1 ∧ ¬ci;
                enqueue(pcQueue, newPC);
    return B;

A solution, if it exists, to such an alternative path constraint corresponds to an input that will execute the program along a prefix of the original execution path, and then take the opposite branch.
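The loop above can be sketched in Python on a toy scale. This is only an illustration under stated assumptions, not Apollo's implementation: the constraint encoding `(parameter, op, constant)`, the helpers `solve`, `toy_program` and `toy_oracle`, the `<bogus>` tag, the `max_runs` budget standing in for `timeExpired()`, and the `seen` set used for de-duplication are all hypothetical.

```python
import re
from collections import deque

# Hypothetical constraint encoding: (parameter, op, constant), op is "==" or "!=".
def negate(c):
    param, op, const = c
    return (param, "!=" if op == "==" else "==", const)

def solve(path_constraint):
    """Return an input dict satisfying every conjunct, or None if unsatisfiable."""
    inp, forbidden = {}, {}
    for param, op, const in path_constraint:
        if op == "==":
            if inp.get(param, const) != const:
                return None              # two different required values
            inp[param] = const
        else:
            forbidden.setdefault(param, set()).add(const)
    for param, consts in forbidden.items():
        if param in inp:
            if inp[param] in consts:
                return None              # required and forbidden at once
        elif 0 in consts:                # unset parameters read as 0 in this toy
            inp[param] = next(v for v in range(1, len(consts) + 2) if v not in consts)
    return inp

def toy_program(inp):
    """Stand-in for concrete + symbolic execution: returns (output, path constraint)."""
    pc, out = [], ["<html>"]
    def branch(param, const):
        taken = inp.get(param, 0) == const
        pc.append((param, "==" if taken else "!=", const))
        return taken
    if branch("page2", 1337):
        return "".join(out), pc          # "dies" before the footer is printed
    if branch("login", 1):
        out.append("<bogus>username must be supplied</bogus>")
    out.append("</html>")
    return "".join(out), pc

def toy_oracle(output):
    """Stand-in for the HTML validator: returns a list of failure descriptions."""
    failures = [f"illegal tag <{t}>" for t in re.findall(r"<([a-z0-9]+)>", output)
                if t not in {"html", "p", "h2"}]
    if not output.endswith("</html>"):
        failures.append("unclosed <html>")
    return failures

def find_failures(program, oracle, max_runs=100):
    bugs = {}                            # failure -> (path constraints, inputs)
    pc_queue, seen, runs = deque([()]), {()}, 0
    while pc_queue and runs < max_runs:  # max_runs plays the role of timeExpired()
        path_constraint = pc_queue.popleft()
        inp = solve(path_constraint)
        if inp is None:
            continue
        runs += 1
        output, conjuncts = program(inp)
        for f in oracle(output):
            pcs, inputs = bugs.setdefault(f, (set(), set()))
            pcs.add(path_constraint)
            inputs.add(tuple(sorted(inp.items())))
        for i in range(len(conjuncts)):  # keep a prefix, negate the next conjunct
            new_pc = tuple(conjuncts[:i]) + (negate(conjuncts[i]),)
            if new_pc not in seen:       # de-duplication is an addition of this sketch
                seen.add(new_pc)
                pc_queue.append(new_pc)
    return bugs

bugs = find_failures(toy_program, toy_oracle)
```

On this toy program the loop reaches both seeded failures: the input `page2 = 1337` for the missing footer and `login = 1` for the illegal tag.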

13 Example: Execution 1 (Expose Third Fault)
Execution of the example program
– !isset($_GET['page']) is true: sets page = 0
– The page2 and login conditions are false: execution goes to case 0 of the switch statement
– HTML validation tool determines that the output is illegal
Path constraint of this execution: NotSet(page) ∧ page2 ≠ 1337 ∧ login ≠ 1
New path constraints produced by the failure detection algorithm (keep a prefix, negate the next conjunct)
– NotSet(page) ∧ page2 ≠ 1337 ∧ login = 1
– NotSet(page) ∧ page2 = 1337
– Set(page)

14 Example: Execution 2 (The Opposite Path)
For path constraint: NotSet(page) ∧ page2 ≠ 1337 ∧ login = 1
– Constraint solver may produce page2 ← 0; login ← 1
– The login condition is now true
– HTML validation tool discovers the failure and generates a bug report → added to the output set of bug reports

15 Minimization on Path Constraints
Eliminates irrelevant constraints
A solution for a shorter path constraint is a smaller input
Does not guarantee that the returned path constraint is the shortest one that exposes the failure
– Simple, fast, and effective in practice
Differs from input minimization: operates on the path constraint that exposes the failure instead of on the input
– Handles multiple path constraints that lead to the same failure

16 Minimization Example
The HTML malformation from the previous example could have been reached along different execution paths
– NotSet(page) ∧ page2 ≠ 1337 ∧ login = 1
– Set(page) ∧ page = 0 ∧ page2 ≠ 1337 ∧ login = 1
Conjuncts common to both path constraints: page2 ≠ 1337 ∧ login = 1
Removing the irrelevant conjunct page2 ≠ 1337 leaves: login = 1
Minimized input: (login ← 1)
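The two steps of this example (intersect the failing path constraints, then drop conjuncts one at a time while the failure still reproduces) can be sketched as follows. The tuple encoding `(op, parameter, constant)`, the toy `solve`, the `exposes_failure` callback, and the fallback to the shortest original constraint are assumptions made for illustration; a real run would re-execute the program and consult the oracle.

```python
def solve(path_constraint):
    """Toy solver: honor equalities only. Leaving a parameter unset satisfies
    NotSet and '!=' conjuncts; Set(x) without an equality on x is not handled."""
    return {param: const for (op, param, const) in path_constraint if op == "=="}

def minimize(path_constraints, exposes_failure):
    first, *rest = path_constraints
    # Step 1: keep only the conjuncts shared by every failing path constraint.
    current = [c for c in first if all(c in pc for pc in rest)]
    # Step 2: drop conjuncts one at a time while the failure still reproduces.
    for c in list(current):
        shorter = [d for d in current if d != c]
        if exposes_failure(solve(shorter)):
            current = shorter
    # Fallback (an assumption of this sketch): if the result no longer exposes
    # the failure, return the shortest original path constraint instead.
    if not exposes_failure(solve(current)):
        return min(path_constraints, key=len)
    return current

pc_a = [("NotSet", "page", None), ("!=", "page2", 1337), ("==", "login", 1)]
pc_b = [("Set", "page", None), ("==", "page", 0),
        ("!=", "page2", 1337), ("==", "login", 1)]

def exposes_failure(inp):
    # Hypothetical stand-in for "run the program on inp and ask the oracle".
    return inp.get("login") == 1

minimal = minimize([pc_a, pc_b], exposes_failure)
minimal_input = solve(minimal)
```

As on the slide, the result is not guaranteed to be the shortest failing constraint; the greedy pass only removes conjuncts whose removal keeps the failure.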

17 Apollo
User Input Simulator
Executor
Bug Finder
– Oracle
– Bug Report Repository
– Input Minimizer
Input Generator
– Symbolic Driver
– Constraint Solver
– Value Generator

18 User Input Simulator
Performs a transformation of the program that models the user input.

19 Executor: Shadow Interpreter
Shadow Interpreter: a PHP interpreter modified to record path constraints and positional information
– A symbolic variable is associated with each value
– At branching points, extends the initially empty path constraint with a conjunct corresponding to the branch taken in the execution
– Records conditions for PHP-specific comparison operations (isset, empty, etc.), which can only be applied to one variable
Concrete values: influence control flow during execution
Symbolic values: record control-flow decisions at branching points
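A rough Python sketch of the recording idea: concrete values decide which branch runs, while each branch on an input appends one conjunct to an initially empty path constraint. The class `ShadowInputs`, its methods, and `run_prefix` are hypothetical names for illustration; Apollo modifies the PHP interpreter itself rather than wrapping inputs like this.

```python
class ShadowInputs:
    """Stands in for PHP's $_GET: concrete values drive control flow, and every
    branch on an input appends one conjunct to the path constraint."""

    def __init__(self, params):
        self.params = dict(params)
        self.path_constraint = []        # initially empty, extended at each branch

    def isset(self, name):
        outcome = name in self.params
        self.path_constraint.append(("Set" if outcome else "NotSet", name, None))
        return outcome

    def equals(self, name, const):
        outcome = self.params.get(name) == const
        self.path_constraint.append(("==" if outcome else "!=", name, const))
        return outcome


def run_prefix(get):
    """Mirrors the first three branches of the PHP example program."""
    page = 0 if not get.isset("page") else get.params["page"]
    if get.equals("page2", 1337):
        return "report cards"
    if get.equals("login", 1):
        return "validate login"
    return f"page {page}"


first = ShadowInputs({})                          # no parameters supplied
first_result = run_prefix(first)

second = ShadowInputs({"page2": 0, "login": 1})   # input for the opposite branch
second_result = run_prefix(second)
```

With no parameters the recorded constraint is NotSet(page), page2 ≠ 1337, login ≠ 1; supplying `login = 1` flips only the last conjunct.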

20 Executor: Database Manager
Database Manager
– (Re)initializes the database used by a PHP application; restores the database before each execution
– Supplies additional information about username/password pairs

21 Bug Finder
Bug report = path constraint + input inducing the failure
Failure = type of failure + corresponding message + PHP statement generating the bad HTML
Oracle: HTML validation tool (WDG and W3C)
Input Minimizer: uses the path-constraint minimization algorithm
– Executes the program multiple times with multiple inputs that satisfy multiple constraints
– Attempts to find the shortest path constraint resulting in the same failure characteristic
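To make the oracle's role concrete, here is a deliberately tiny stand-in written with Python's standard `html.parser`. It only flags unknown tags and unbalanced elements; the real WDG and W3C validators check far more. The tag whitelist `KNOWN_TAGS`, the class `TinyOracle`, and the function `get_failures` are assumptions of this sketch.

```python
from html.parser import HTMLParser

KNOWN_TAGS = {"html", "head", "title", "body", "p", "h2"}   # illustrative whitelist

class TinyOracle(HTMLParser):
    """Collects failure messages for unknown tags and mismatched end tags."""

    def __init__(self):
        super().__init__()
        self.stack = []          # currently open elements
        self.failures = []

    def handle_starttag(self, tag, attrs):
        if tag not in KNOWN_TAGS:
            self.failures.append(f"illegal tag <{tag}>")
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.failures.append(f"unexpected </{tag}>")

def get_failures(html):
    """Return the list of failures found in one generated page."""
    oracle = TinyOracle()
    oracle.feed(html)
    oracle.close()
    # Anything still open at the end of the page was never closed.
    return oracle.failures + [f"unclosed <{t}>" for t in reversed(oracle.stack)]

good = get_failures("<html><body><p>ok</p></body></html>")
bad = get_failures("<html><body><bogus>oops</bogus>")
```

A page cut short before its footer shows up here as unclosed `<body>` and `<html>` elements, the kind of malformation attributed to a skipped make_footer().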

22 Input Generator
Symbolic Driver: implements the combined concrete and symbolic failure detection algorithm
– Selects the next input (coverage heuristic)
– Creates additional inputs from each execution
Constraint Solver: implements lightweight symbolic execution
– Constraints are equalities or inequalities, solved with the Choco constraint solver
– Unconstrained parameters: random generation and constant mining
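One plausible reading of "random generation and constant mining" for unconstrained parameters is sketched below: literals are harvested from the program text, and a value is drawn from that pool unless an inequality excludes it. The regular expressions, the helper names `mine_constants` and `pick_value`, and the random fallback range are assumptions, not Apollo's actual heuristics.

```python
import random
import re

def mine_constants(php_source):
    """Collect integer and quoted-string literals from the program text."""
    ints = {int(m) for m in re.findall(r"(?<![\w$])\d+(?![\w.])", php_source)}
    strs = set(re.findall(r"""["']([^"']*)["']""", php_source))
    return sorted(ints), sorted(strs)

def pick_value(excluded, pool, rng):
    """Value for a parameter with no equality constraint: prefer a mined constant
    not excluded by an inequality, otherwise fall back to a random integer."""
    candidates = [v for v in pool if v not in excluded]
    if candidates:
        return rng.choice(candidates)
    while True:
        v = rng.randint(0, 10**6)
        if v not in excluded:
            return v

source = """if($_GET['page2']==1337) { die(); } if($username=="john") { $page=1; }"""
ints, strs = mine_constants(source)
rng = random.Random(0)                   # fixed seed so runs are repeatable
value = pick_value({1337}, ints, rng)    # page2 != 1337 excludes 1337
```

Mined constants matter because comparisons such as `== 1337` or `== "john"` are unlikely to be hit by purely random values.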

23 Evaluation
How many faults can Apollo find, and of what varieties?
How effective is the fault localization technique compared to alternative approaches, in terms of the number and severity of discovered faults and the line coverage achieved?
How effective is minimization in reducing the size of input-parameter constraints and failure-inducing inputs?

24 Experimentation

    <?php echo " WebChess ".$Version."Login"; ?> <p><p> Nick: Password: <p><p>

Program      #files   LOC     PHP LOC   #downloads
faqforge     19       1712    734       14164
webchess     24       4718    2226      32352
schoolmate   63       8181    4263      4466
phpsysinfo   73       16634   7745      492217
total        179      31245   14968     543199

25 Generation Strategies
Compared to two other approaches
– Halfond and Orso (Randomized)
  Values chosen from constants appearing in the program source and from default values
  Difficult: parameters’ names and types are not apparent
  Infers names and types from dynamic traces
– Minamide’s static analysis
Apollo’s test input generation was discussed previously

26 Methodology
10-minute runs on each program
– Generation of hundreds of inputs
Ran both the Apollo and Randomized test input generation strategies
WDG offline HTML validation tool
Coverage = number of executed lines / total lines with executable PHP code in the application
– Denominator: total number of lines with PHP opcodes

27 Results Classification
Execution crash: PHP interpreter terminates with an exception
Execution error: PHP interpreter emits a warning visible in the generated HTML
Execution warning: PHP interpreter emits a warning invisible in the HTML output
HTML error: program generates HTML for which the validation tool produces an error report
HTML warning: program generates HTML for which the validation tool produces a warning report

28 Results Analysis
Apollo
– Average line coverage: 58.0%
– Faults found on subject apps: 214
Randomized
– Average line coverage: 15.0%
– Faults found on subject apps: 59
Faults found include
– Attempts to load two missing files
– Database-related faults
– Unset time zone
– Faults that resulted in malformed HTML

29 Results Analysis: Compared to Static Analysis
Minamide’s tool
– Approximates the string output of the program with a context-free grammar
– Able to discover unclosed tags: intersects the grammar with a regular expression of matched pairs of delimiters
– Covers phpwims and timeclock (Web-based)
Apollo is more effective and efficient
– 2.7 times as many HTML validation faults
– 83 additional execution faults
– More scalable

30 Results Analysis: Effects of Constraint Minimization
                              Path constraints          Inputs
Program      Success rate %   Orig. size   Reduction    Orig. size   Reduction
faqforge     64               22.3         0.22         9.3          0.31
webchess     91               23.4         0.19         10.9         0.40
schoolmate   51               22.9         0.38         11.5         0.58
phpsysinfo   82               24.3         0.18         17.5         0.26
Reduces the size of inputs by up to a factor of 0.18 for more than 50% of faults

31 Threats to Validity and Limitations
Threats to validity
– Construct: Is malformed HTML a defect? Does line coverage indicate quality? Minimization of path constraints?
– Internal: Are the faults real, unseeded, and previously unknown?
– External: Do the results generalize beyond the subject programs? Are they reproducible?
Limitations
– Simulating inputs based on static information: false positives…
– Limited tracking in native methods: implemented in C, so the mapping from input to output is hard to track
– Limited sources of input parameters: only inputs from global arrays
– Running as a stand-alone application: Web server integration is limited

32 Future Work
Handle simulated user input dynamically
Create an external language to model dependencies between inputs and outputs
– Increases line coverage when executing native methods
Web server integration

33 Related Work

34 Conclusion
Detection of run-time errors
– HTML validation tool as oracle
PHP-specific issues
– Simulation of interactive user input that occurs when HTML elements are activated
Automated analysis to minimize the size of failure-inducing inputs
Apollo was run on 4 open-source programs
– Over 50% line coverage
– 214 faults over these applications
– Minimized inputs 5.3 times smaller than non-minimized inputs

