Finding Bugs in Dynamic Web Applications Shay Artzi, Adam Kiezun, Julian Dolby, Frank Tip, Danny Dig, Amit Paradkar, Michael D. Ernst Proceedings: ISSTA.


1 Finding Bugs in Dynamic Web Applications
Shay Artzi, Adam Kiezun, Julian Dolby, Frank Tip, Danny Dig, Amit Paradkar, Michael D. Ernst
Proceedings: ISSTA '08 (International Symposium on Software Testing and Analysis)

2 Presented by: Md. Monjurul Hasan
CSE 6329 Special Topics in Advanced Software Engineering

3 Dynamic Web Application
Generates pages (HTML content) on the fly
Content varies with the user and user-specified criteria
Obtained by server-side programming
It is fair to say that all large, well-known web applications are dynamic web applications
Source: Dynamic Web Application Development using PHP and MySQL, by Simon Stobart and David Parsons

4 Web Threats
Web script crashes and malformed dynamically generated Web pages impact the usability of Web applications
Current tools for Web-page validation cannot handle dynamically generated pages

5 Web Script Crash
Missing included file
Call to undefined method
Wrong database query
Uncaught exceptions

6 Malformed HTML
HTML that does not conform to the WDG (Web Design Group) or W3C (World Wide Web Consortium) standard
  – Not using tags defined by the W3C (e.g...etc.)
  – Not maintaining the structure (e.g... )
  – Not using a proper opening tag and matching closing tag
  – etc.
Web scripting languages can generate HTML

7 The Problem
Bad scripts create syntactically malformed HTML
  – Partially displayable or non-displayable HTML
  – The browser's attempt to correct the page can lead to crashes
  – Slower HTML rendering
  – Important information is discarded
  – Trouble indexing the correct pages for search engines

8 More Problems
Dynamic web page testing challenges
  – HTML validation tools only test static pages
  – They cannot fully capture behavior, since not all of the code's functionality is visible in the resulting HTML
  – No automatic validator exists for scripting languages that dynamically generate HTML pages
  – HTML Kit validates every generated page but requires manual generation of the inputs that lead to displaying those pages

9 What this paper presents…
An automated technique for finding faults manifested as Web script crashes or malformed HTML
  – Extends dynamic test generation to scripting languages
Identifies the minimal part of the input responsible for triggering failures
Uses an oracle to determine whether the generated HTML is well formed
Creates a tool, Apollo, that implements all of these in the context of PHP

10 Why PHP?
Widely used in Web development
  – Network interactions
  – Database
  – HTTP processing
Object-oriented scripting
21 million domains [1] (75%) are powered by PHP, including large websites like Wikipedia, WordPress, Facebook, Digg, etc.
[1] Source: Netcraft, April 2007

11 Example: program SchoolMate.php
  – Allows school administrators to manage classes and users, teachers to manage assignments and grades, and students to access their information
Typical URL: schoolmate.php?page=1&page2=100&login=1&username=user&password=password

12
 1 <?php
 2
 3 make_header(); // print HTML header
 4
 5 // Make the $page variable easy to use //
 6 if(!isset($_GET['page'])) $page = 0;
 7 else $page = $_GET['page'];
 8
 9 // Bring up the report cards and stop processing //
10 if($_GET['page2']==1337) {
11   require('printReportCards.php');
12   die(); // terminate the PHP program
13 }
14
15 // Validate and log the user into the system //
16 if($_GET["login"] == 1) validateLogin();
17
18 switch ($page)
19 {
20   case 0: require('login.php'); break;
21   case 1: require('TeacherMain.php'); break;
22   case 2: require('StudentMain.php'); break;
23   default: die("Incorrect page number. Please verify.");
24 }
25
26 make_footer(); // print HTML footer
27 ...
27 function validateLogin() {
28   if(!isset($_GET['username'])) {
29     echo " username must be supplied. \n";
30     return;
31   }
32   $username = $_GET['username'];
33   $password = $_GET['password'];
34   if($username=="john" && $password=="theTeacher")
35     $page=1;
36   else if($username=="john" && $password=="theStudent")
37     $page=2;
38   else echo " Login error. Please try again \n";
39 }
40
41 function make_header() { // print HTML header
42   print("
43 <!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01//EN\"
44 \"http://www.w3.org/TR/html4/strict.dtd\">
45
46 Class Management
47 ");
48 }
49
50 function make_footer() { // close HTML elements opened by header()
51   print("
52
53 ");
54 }
55 ?>

13 Faults in the example program (annotations on the listing from slide 12)
  – Line 11: 'printReportCards.php' is missing
  – make_footer() is not executed in certain situations (for example after the die() calls on lines 12 and 23), so HTML tags are left unclosed
  – The program generates an illegal tag
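The unclosed-tag fault can be illustrated with a small balance check. This is only a sketch in Python, not the WDG or W3C validator the slides refer to: the function name find_tag_problems and the sample header and footer markup are hypothetical (the real markup was lost from the listing), and the check covers tag nesting only, not full HTML 4.01 conformance.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML 4.01.
VOID_TAGS = {"area", "base", "br", "col", "hr", "img", "input", "link", "meta", "param"}


class _TagBalanceChecker(HTMLParser):
    """Keeps a stack of open tags and records closing tags that do not match."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.problems.append(f"unexpected </{tag}>")


def find_tag_problems(html):
    """Return a list of nesting problems; an empty list means the tags balance."""
    checker = _TagBalanceChecker()
    checker.feed(html)
    checker.close()
    return checker.problems + [f"unclosed <{t}>" for t in reversed(checker.stack)]


# Hypothetical markup standing in for make_header() and make_footer().
HEADER = "<html><head><title>Class Management</title></head><body>"
FOOTER = "</body></html>"
```

Under these assumptions, find_tag_problems(HEADER + "Incorrect page number." ) reports the body and html elements as unclosed, which mirrors what happens when die() ends the script before make_footer() runs.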

14 Failures in PHP programs
Targets two types of failures
  – Execution failures: Web script crashes
  – HTML failures: malformed HTML

15 Failure-Finding in PHP Applications
Concolic testing
  – Dynamic test generation technique
Execute the application
  1. Initially on an empty input
  2. Then on additional inputs, obtained by solving constraints that are derived from control-flow paths
Extensions
  – Validate the correctness of program output by using an oracle
  – Handle isset, empty, require, etc., which require generating constraints that are absent in other object-oriented languages
  – Use a pre-specified set of values for database authentication
  – Simulate each user input by transforming the source code

16 Transformation of Code
Interactive HTML pages with buttons and menus
For each page (h) that contains N buttons
  – Add an additional input parameter p to the PHP program, with values ranging from 1 to N
  – Insert a switch statement that includes the appropriate PHP source file, depending on p

17 An example
<?php /* Simulated User Input */
switch ($_GET["_btn"]) {
  case 1: require_once("mainmenu.php"); break;
  case 2: require_once("newuser.php"); break;
}
?>
<?php echo " Webchess ".$Version." login"; ?>
Nick: Password:
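The transformation on slides 16 and 17 can be sketched as a small generator. This is an illustration in Python, not Apollo's implementation: the function name simulated_input_prelude is hypothetical, and the list of target files is assumed to have been collected from the page's buttons by some earlier step.

```python
def simulated_input_prelude(targets, param="_btn"):
    """Return PHP source with one switch case per button target.

    targets: PHP files reached by the page's N buttons, in button order.
    param:   name of the extra GET parameter (values 1..N).
    """
    lines = ["<?php /* Simulated User Input */", f'switch ($_GET["{param}"]) {{']
    for number, php_file in enumerate(targets, start=1):
        lines.append(f'  case {number}: require_once("{php_file}"); break;')
    lines += ["}", "?>"]
    return "\n".join(lines)


if __name__ == "__main__":
    # Reproduces the shape of the example on slide 17.
    print(simulated_input_prelude(["mainmenu.php", "newuser.php"]))
```

Prepending the generated text to a page lets the test generator treat a button press as one more input parameter, so choosing a button becomes a constraint the solver can satisfy.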

18 The Failure Detection Algorithm
parameters: Program P, oracle O
result: Bug reports B; B : setOf(⟨failure, setOf(pathConstraint), setOf(input)⟩)

P′ ≔ simulateUserInput(P);
B ≔ empty;
pcQueue ≔ emptyQueue();
enqueue(pcQueue, emptyPathConstraint());
while not empty(pcQueue) and not timeExpired() do
    pathConstraint ≔ dequeue(pcQueue);
    input ≔ solve(pathConstraint);
    if input ≠ ⊥ then
        output ≔ executeConcrete(P′, input);
        failures ≔ getFailures(O, output);
        foreach f in failures do
            merge ⟨f, pathConstraint, input⟩ into B;
        c1 ∧ ... ∧ cn ≔ executeSymbolic(P′, input);
        foreach i = 1,...,n do
            newPC ≔ c1 ∧ ... ∧ ci−1 ∧ ¬ci;
            enqueue(pcQueue, newPC);
return B;
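The loop above can be tried on a toy model. The following Python sketch is not Apollo: run is a hand-written stand-in for the SchoolMate script that returns both the output and the path constraint in one call, solve is a naive solver for four constraint kinds, the markup and the bogus tag are hypothetical, the iteration budget replaces timeExpired(), and the seen set (which stops the same path constraint from being queued twice) is an addition of this sketch.

```python
from collections import deque

HEADER, FOOTER = "<html><body>", "</body></html>"   # hypothetical markup
FLIP = {"set": "notset", "notset": "set", "eq": "ne", "ne": "eq"}


def run(inputs):
    """Toy stand-in for the PHP script: returns (output, path constraint)."""
    trace = [("page", "set" if "page" in inputs else "notset", None)]
    if inputs.get("page2") == 1337:
        trace.append(("page2", "eq", 1337))
        return HEADER + "report cards", trace        # die(): footer never printed
    trace.append(("page2", "ne", 1337))
    body = "login page"
    if inputs.get("login") == 1:
        trace.append(("login", "eq", 1))
        body = "<bogus>username must be supplied</bogus>"   # stands in for an illegal tag
    else:
        trace.append(("login", "ne", 1))
    return HEADER + body + FOOTER, trace


def oracle(output):
    """Toy oracle: returns one failure name per problem it recognises."""
    failures = []
    if not output.endswith(FOOTER):
        failures.append("missing footer")
    if "<bogus>" in output:
        failures.append("illegal tag")
    return failures


def solve(path_constraint):
    """Naive solver: a dict of GET parameters, or None if unsatisfiable."""
    required, absent, equal, unequal = set(), set(), {}, {}
    for param, op, value in path_constraint:
        if op == "set":
            required.add(param)
        elif op == "notset":
            absent.add(param)
        elif op == "eq":
            if equal.setdefault(param, value) != value:
                return None
        else:
            unequal.setdefault(param, set()).add(value)
    if absent & (required | set(equal)):
        return None
    inputs = {}
    for param in required | set(equal) | set(unequal):
        forbidden = unequal.get(param, set())
        if param in absent:
            continue                      # an absent parameter differs from every value
        if param in equal:
            if equal[param] in forbidden:
                return None
            inputs[param] = equal[param]
        else:
            inputs[param] = next(v for v in range(len(forbidden) + 1) if v not in forbidden)
    return inputs


def find_failures(budget=100):
    bugs = {}                             # failure -> path constraints and inputs
    queue, seen = deque([()]), {()}
    while queue and budget > 0:           # budget replaces timeExpired()
        budget -= 1
        pc = queue.popleft()
        inputs = solve(pc)
        if inputs is None:
            continue
        output, trace = run(inputs)       # concrete and symbolic run in one call
        for failure in oracle(output):
            report = bugs.setdefault(failure, {"pcs": set(), "inputs": []})
            report["pcs"].add(pc)
            report["inputs"].append(inputs)
        for i, (param, op, value) in enumerate(trace):
            new_pc = tuple(trace[:i]) + ((param, FLIP[op], value),)
            if new_pc not in seen:        # de-duplication is an addition of this sketch
                seen.add(new_pc)
                queue.append(new_pc)
    return bugs
```

Calling find_failures() on this toy explores every path of run and reports both the missing footer and the illegal tag, together with the path constraints and inputs that exposed them.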

19 Example: Execution 1 (Expose Third Fault)
The first iteration of the algorithm from slide 18 runs the program from slide 12 on the empty input
  – Line 6: the condition is true, so page is set to 0
  – Lines 10 and 16: the conditions are false
  – The switch on line 18 goes to line 20 (case 0)
  – The HTML validation tool determines that the output is legal
Path constraint of this execution: NotSet(page) ∧ page2 ≠ 1337 ∧ login ≠ 1
New path constraints added to the queue
  – NotSet(page) ∧ page2 ≠ 1337 ∧ login = 1
  – NotSet(page) ∧ page2 = 1337
  – Set(page)

20 Example: Execution 2 (The Opposite Path)
Path constraint: NotSet(page) ∧ page2 ≠ 1337 ∧ login = 1
  – The constraint solver may return page2 ← 0; login ← 1
  – The condition on line 16 of the listing from slide 12 is now true, so validateLogin() is called
  – The HTML validation tool discovers a failure and generates a bug report, which is added to the output set of bug reports

21 Minimization on Path Constraints
Finds a shorter path constraint for a given bug report
Eliminates irrelevant constraints
  – Better assists the programmer in locating the fault
A solution for a shorter path constraint is often a smaller input
Does not guarantee that the returned path constraint is the shortest one that exposes the failure

22 Minimization Example
The HTML malformation from the previous example could have been reached from different execution paths
  – NotSet(page) ∧ page2 ≠ 1337 ∧ login = 1
  – Set(page) ∧ page = 0 ∧ page2 ≠ 1337 ∧ login = 1
Intersection of the two path constraints: page2 ≠ 1337 ∧ login = 1
Conjuncts tested one at a time: page2 ≠ 1337, login = 1
Minimized path constraint: login = 1 (login ← 1)

23 Path Constraint Minimization Algorithm
parameters: Program P, oracle O, bug report b
result: Short path constraint that exposes b.failure

c1 ∧ ... ∧ cn ≔ intersect(b.pathConstraints);
pc ≔ true;
foreach i = 1,...,n do
    pci ≔ c1 ∧ ... ∧ ci−1 ∧ ci+1 ∧ ... ∧ cn;
    input ≔ solve(pci);
    if input ≠ ⊥ then
        output ≔ executeConcrete(P, input);
        failures ≔ getFailures(O, output);
        if b.failure ∉ failures then
            pc ≔ pc ∧ ci;
input_pc ≔ solve(pc);
if input_pc ≠ ⊥ then
    output_pc ≔ executeConcrete(P, input_pc);
    failures_pc ≔ getFailures(O, output_pc);
    if b.failure ∈ failures_pc then
        return pc;
return shortest(b.pathConstraints);
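The minimization steps can be sketched in Python and checked against the example from slide 22. Everything here is illustrative: minimize, demo_solve and demo_failures are hypothetical names, conjuncts are modelled as (parameter, operator, value) tuples, the demo solver is deliberately naive (it assumes the conjuncts it receives do not conflict), and demo_failures stands in for running the program and asking the oracle.

```python
def minimize(path_constraints, failure, solve, failures_of):
    """Return a short list of conjuncts that still exposes `failure`.

    path_constraints: the path constraints stored in one bug report.
    solve:            conjuncts -> input dict, or None if unsatisfiable.
    failures_of:      input dict -> set of failures observed for that input.
    """
    first, rest = path_constraints[0], path_constraints[1:]
    common = [c for c in first if all(c in pc for pc in rest)]   # intersect
    kept = []
    for i, conjunct in enumerate(common):
        without = common[:i] + common[i + 1:]
        inputs = solve(without)
        if inputs is not None and failure not in failures_of(inputs):
            kept.append(conjunct)        # dropping it loses the failure, so keep it
    inputs = solve(kept)
    if inputs is not None and failure in failures_of(inputs):
        return kept
    return list(min(path_constraints, key=len))   # fall back to the shortest one


def demo_solve(conjuncts):
    """Naive solver for the demo; never reports unsatisfiability."""
    inputs = {}
    for param, op, value in conjuncts:
        if op == "eq":
            inputs[param] = value
        elif op == "ne":
            inputs[param] = value + 1    # any value other than `value`
        elif op == "set":
            inputs.setdefault(param, 0)
    return inputs


def demo_failures(inputs):
    """Stand-in for execute + oracle: the fault needs login = 1 and page2 != 1337."""
    if inputs.get("login") == 1 and inputs.get("page2") != 1337:
        return {"illegal tag"}
    return set()


REPORT = [
    (("page", "notset", None), ("page2", "ne", 1337), ("login", "eq", 1)),
    (("page", "set", None), ("page", "eq", 0), ("page2", "ne", 1337), ("login", "eq", 1)),
]
```

With these assumptions, minimize(REPORT, "illegal tag", demo_solve, demo_failures) keeps only the login = 1 conjunct, matching the result on slide 22.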

24 Apollo
User Input Simulator
Executor
Bug Finder
  – Oracle
  – Bug Report Repository
  – Input Minimizer
Input Generator
  – Symbolic Driver
  – Constraint Solver
  – Value Generator

25 Apollo

26 Executor: Shadow Interpreter
Shadow Interpreter
  – Modified Zend PHP interpreter 5.2.2 that records path constraints and information associated with output
  – Performs symbolic execution along with concrete execution
  – Records conditions for PHP-specific comparison operations such as isset and empty

27 Executor: Database Manager
Database Manager
  – (Re)initializes the database used by a PHP application and restores it before each execution
  – Supplies additional information about username/password pairs
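Restoring the database before each execution can be sketched with a snapshot that is replayed into a fresh database. This Python example uses the standard sqlite3 module purely as a stand-in; the slides do not say how Apollo's Database Manager works internally, and take_snapshot, restore and the users table are hypothetical.

```python
import sqlite3


def take_snapshot(conn):
    """Serialize the schema and data of `conn` as one SQL script."""
    return "\n".join(conn.iterdump())


def restore(snapshot):
    """Build a fresh in-memory database from a snapshot script."""
    fresh = sqlite3.connect(":memory:")
    fresh.executescript(snapshot)
    return fresh


def make_seed_database():
    """Hypothetical initial state with one known username/password pair."""
    seed = sqlite3.connect(":memory:")
    seed.execute("CREATE TABLE users (username TEXT, password TEXT)")
    seed.execute("INSERT INTO users VALUES ('john', 'theTeacher')")
    seed.commit()
    return seed
```

Each test execution would then call restore on the same snapshot, so changes made by one run (for example deleting a user) cannot influence the next run.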

28 Bug Finder
Bug Report = Failure + Path constraint + Input inducing failure
Failure = Type of failure + Corresponding message + PHP statement generating bad HTML
Oracle
  – HTML validation tool (WDG and W3C)
Input Minimizer
  – Uses the path constraint minimization algorithm

29 Input Generator
Symbolic Driver
  – Generates new path constraints and selects the next path constraint
Constraint Solver
  – Computes an assignment of values to input parameters that satisfies a given path constraint
  – Choco constraint solver
Value Generator
  – Generates values for parameters
  – Combines random value generation and constant values mined from source code
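The Value Generator's mix of random values and mined constants can be sketched as follows. This Python example is an assumption-laden illustration: mine_constants uses simple regular expressions rather than a PHP parser, and the 50/50 mixing probability in generate_value is a made-up default, since the slides do not say how Apollo weights the two sources.

```python
import random
import re


def mine_constants(php_source):
    """Collect integer and string literals from PHP source text (regex-based)."""
    numbers = {int(n) for n in re.findall(r"\b\d+\b", php_source)}
    strings = set(re.findall(r'"([^"\n]*)"', php_source))
    strings |= set(re.findall(r"'([^'\n]*)'", php_source))
    return numbers, strings


def generate_value(constants, rng, p_constant=0.5):
    """Pick a mined constant with probability p_constant, else a random integer."""
    pool = sorted(constants, key=repr)    # sorted so a seeded rng is reproducible
    if pool and rng.random() < p_constant:
        return rng.choice(pool)
    return rng.randint(0, 9999)


SAMPLE = """
if($_GET['page2']==1337) { die(); }
if($username=="john" && $password=="theTeacher") $page=1;
"""
```

On SAMPLE, the mined constants include 1337, "john" and "theTeacher", which are values that purely random generation would be very unlikely to produce.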

30 Experimentation
Program      # files   LOC     PHP LOC   # downloads
faqforge     19        1712    734       14164
webchess     24        4718    2226      32352
schoolmate   63        8181    4263      4466
phpsysinfo   73        16634   7745      492217
total        179       31245   14968     543199

faqforge = Tool for creating and managing documents
webchess = Online chess game
schoolmate = PHP/MySQL solution for administering schools
phpsysinfo = Displays system info

31 Generation Strategies
Compared to two other approaches
  – Halfond and Orso (Randomized)
      Random values assigned to the parameters
      Proposed for JavaScript
  – Minamide's static analysis
      Approximates the string output of the program with a context-free grammar
      Discovers malformed HTML faults
Apollo's test input generation, as previously discussed

32 Methodology
10-minute runs on each program
  – Generation of hundreds of inputs
Ran both the Apollo and the Randomized test input generation strategies
WDG offline HTML validation tool

33 Results Classification
Execution crash: PHP interpreter terminates with an exception
Execution error: PHP interpreter emits a warning visible in the generated HTML
Execution warning: PHP interpreter emits a warning invisible in the HTML output
HTML error: program generates HTML for which the validation tool produces an error report
HTML warning: program generates HTML for which the validation tool produces a warning report

34 Results Analysis
Apollo
  – Average line coverage: 58.0%
  – Faults found on subject apps: 214
Randomized
  – Average line coverage: 15.0%
  – Faults found on subject apps: 59
Faults found include
  – Attempts to load two missing files
  – Database-related faults
  – Unset time zone
  – Faults that resulted in malformed HTML
Line coverage = Number of executed lines / Total lines with executable PHP code in the application
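The line coverage definition translates directly into code. A minimal Python sketch, assuming the executed and executable lines are available as collections of line identifiers; the function name line_coverage is hypothetical and the slides do not describe how Apollo gathers these sets.

```python
def line_coverage(executed_lines, executable_lines):
    """Executed lines divided by all lines with executable PHP code."""
    executable = set(executable_lines)
    if not executable:
        raise ValueError("the application has no executable lines")
    # Lines reported as executed but not counted as executable are ignored.
    return len(set(executed_lines) & executable) / len(executable)
```

For example, an application with four executable lines of which three were executed has a line coverage of 0.75.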

35 Results Analysis
Apollo vs. Randomized
  – 58% line coverage vs. 15.2% line coverage
  – 214 faults vs. 59 faults
Apollo vs. Minamide's tool
  – 2.7 times more HTML validation faults (120 vs. 45)
  – 83 additional execution faults
  – 104 faults (10 minutes) vs. 14 faults (126 minutes)
Apollo is more effective and efficient than both

36 Results Analysis: Path Constraint Minimization
                              Path Constraints         Inputs
Program      Success rate %   Orig. size   Reduction   Orig. size   Reduction
faqforge     64               22.3         0.22        9.3          0.31
webchess     91               23.4         0.19        10.9         0.40
schoolmate   51               22.9         0.38        11.5         0.58
phpsysinfo   82               24.3         0.18        17.5         0.26

Reduces the size of inputs by up to a factor of 0.18 for more than 50% of faults
Success rate: percentage of faults whose exposing input was minimized
Orig. size: average size of the original path constraints (# of conjuncts) and inputs (# of key-value pairs)
Reduction columns: ratio of minimized to un-minimized size. The lower the ratio, the more successful the minimization
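To make the reduction ratios concrete, the table can be turned into rough minimized sizes. This Python sketch simply multiplies each average original size by its ratio; that is only an estimate, under the assumption that the ratio can be applied to the average, and the names MINIMIZATION and estimated_minimized_sizes are hypothetical.

```python
# program: (orig. path-constraint size, reduction, orig. input size, reduction)
MINIMIZATION = {
    "faqforge": (22.3, 0.22, 9.3, 0.31),
    "webchess": (23.4, 0.19, 10.9, 0.40),
    "schoolmate": (22.9, 0.38, 11.5, 0.58),
    "phpsysinfo": (24.3, 0.18, 17.5, 0.26),
}


def estimated_minimized_sizes(program):
    """Return (conjuncts, key-value pairs) after minimization, as a rough estimate."""
    pc_size, pc_ratio, input_size, input_ratio = MINIMIZATION[program]
    return pc_size * pc_ratio, input_size * input_ratio
```

For faqforge this gives roughly 4.9 conjuncts and 2.9 key-value pairs, down from 22.3 and 9.3.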

37 Limitations
User inputs are simulated statically
JavaScript code in the generated HTML is not tracked
Limited line coverage for native C methods
Limited sources of input parameters
  – Only inputs from the global arrays (_POST, _GET and _REQUEST)

38 Thank you
