Precise Interface Identification to Improve Testing and Analysis of Web Applications William G.J. Halfond, Saswat Anand, and Alessandro Orso Georgia Institute of Technology

Presentation transcript:

Precise Interface Identification to Improve Testing and Analysis of Web Applications William G.J. Halfond, Saswat Anand, and Alessandro Orso Georgia Institute of Technology

Example Web Application
[Figure: End Users, Web Server, and Web Application (getQuote.jsp, buyPolicy.jsp); arrows labeled "Initial Visit" and "Quote Information"]

Interface Identification
public void write(File outfile, String buffer, int length)
(callouts on the signature: Parameter names, Grouping of parameters, Domain information)
1. Names of parameters
2. Grouping of parameters
3. Domain information

Example Web Application
public void service(HttpRequest req)
 1. String aValue = req.getIP("action")
 2. if (aValue.equals("checkeligibility"))
 3.     int userAge = getNumIP("age")
 4.     if (userAge < 16)
 5.         displayErrorMsg("Too young.")
 6.     else
 7.         displayQuotePage()
 8. if (aValue.equals("doquote"))
 9.     String nValue = req.getIP("name")
10.     String carType = req.getIP("type")
11.     int carYear = getNumIP("year")
12.     calculateQuote(carType, carYear)
…
public int getNumIP(String name)
 1. String value = getIP(name)
 2. int param = Integer.parse(value)
 3. return param
1. Names of parameters
2. Grouping of parameters
3. Domain information
Parameter Names: action, age, name, type, year
Groupings of Parameters:
    action
    action, age
    action, name, type, year
Interface Domain Constraints:
    action = "checkeligibility" ∧ integer(age) ∧ age < 16
    action = "checkeligibility" ∧ integer(age) ∧ age ≥ 16

Previous Approaches: Interface Identification
Dynamic
    Spider: Web spider crawls pages of application
        Limitation: No guarantee of completeness
Static
    DFW [1]: Identify parameter names via static analysis
        Limitation: Only identifies parameter names
    WAM DF [2]: Uses iterative data-flow analysis
        Limitation: Assumes all paths feasible
Code fragments shown on the slide:
    (action, age, name, type, year)
    1. String aValue = req.getIP("action")
    2. if (aValue.equals("checkeligibility"))
    …
    8. if (aValue.equals("doquote"))
    4. if (userAge < 16)
    5.     displayErrorMsg("Too young.")
    6. else
    7.     displayQuotePage()
1. Deng, Frankl, Wang, SEN
2. Halfond and Orso, FSE 2007

Our Approach
Statically identify interfaces by using symbolic execution to model input parameters and domain-constraining operations.
1. Program transformation
2. Symbolic execution
3. Interface identification

1 – Program Transformation
1. Introduce symbolic values
2. Replace domain-constraining operations
Example: the operation value ← getIP(name) is replaced by
    s ← new SymbolicValue()
    s.assignName(name)
    SymbolicState.add(s, value)
    return s
Operations replaced:
1. Accessing an input parameter
2. Conversion to numeric type
3. String comparison
4. Arithmetic constraints
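The replacement of getIP described on this slide can be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the classes SymbolicValue, SymbolicState, and TransformedRequest, and the extra targetVariable argument, are hypothetical stand-ins written for this sketch and are not the classes of the WAM SE implementation. The slide's SymbolicState.add is modeled here as an instance method.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a symbolic input value (not the tool's real class).
class SymbolicValue {
    private String name;

    void assignName(String name) {
        this.name = name;
    }

    String name() {
        return name;
    }

    @Override
    public String toString() {
        return "s_" + name;
    }
}

// Hypothetical symbolic state: records which program variable holds each symbol.
class SymbolicState {
    private final Map<SymbolicValue, String> bindings = new LinkedHashMap<>();

    void add(SymbolicValue s, String variable) {
        bindings.put(s, variable);
    }

    Map<SymbolicValue, String> bindings() {
        return bindings;
    }
}

// Hypothetical transformed request object used in place of the real one.
class TransformedRequest {
    final SymbolicState state = new SymbolicState();

    // Replacement for getIP(name): instead of reading a concrete string from
    // the request, create a fresh symbol named after the parameter and record
    // the variable it is assigned to.
    SymbolicValue getIP(String name, String targetVariable) {
        SymbolicValue s = new SymbolicValue();
        s.assignName(name);
        state.add(s, targetVariable);
        return s;
    }
}
```

Calling getIP("action", "aValue") on a TransformedRequest yields a symbol that prints as s_action and is bound to aValue in the state, which mirrors the s_action → aValue entries shown on the later slides.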

2 – Symbolic Execution
Symbolically execute the transformed web application, tracking path conditions and symbolic state.
[Figure: Transformed Web Application (getQuote.jsp, buyPolicy.jsp) is fed to Symbolic Execution, which produces Path Conditions and Symbolic States]
Path Conditions: c1 ∧ c2 ∧ c3; c3 ∧ c4 ∧ c5
Symbolic States: s_action → aValue; s_year → carYear

2 – Access Input Parameters
1. String aValue = req.getIP("action")
Before: (PC, SS)
After: (PC, SS[s_action → aValue])
PC = Path Condition
SS = Symbolic State

2 – String Comparison
1. String aValue = req.getIP("action")
2. if (aValue.equals("checkeligibility"))
…
8. if (aValue.equals("doquote"))
Before line 2: (PC, SS[s_action → aValue])
TRUE branch: (PC ∧ s_action = "checkeligibility", SS[s_action → aValue])
FALSE branch: (PC ∧ s_action ≠ "checkeligibility", SS[s_action → aValue])
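The fork at a string comparison can be sketched as below. This is a minimal illustration, assuming that a path condition is kept as a plain conjunction of constraint strings; the PathCondition class and its forkOnEquals method are hypothetical names introduced for this sketch, and "&&" stands in for ∧.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical immutable path condition: a conjunction of constraints.
class PathCondition {
    private final List<String> conjuncts;

    PathCondition() {
        this(new ArrayList<>());
    }

    private PathCondition(List<String> conjuncts) {
        this.conjuncts = conjuncts;
    }

    // Returns a new path condition with one more conjunct; this one is unchanged.
    PathCondition and(String constraint) {
        List<String> next = new ArrayList<>(conjuncts);
        next.add(constraint);
        return new PathCondition(next);
    }

    // Models symbol.equals(literal) on a symbolic string: index 0 is the
    // condition for the TRUE branch, index 1 the condition for the FALSE branch.
    PathCondition[] forkOnEquals(String symbol, String literal) {
        return new PathCondition[] {
            and(symbol + " = \"" + literal + "\""),
            and(symbol + " != \"" + literal + "\"")
        };
    }

    @Override
    public String toString() {
        return conjuncts.isEmpty() ? "true" : String.join(" && ", conjuncts);
    }
}
```

Because each fork returns fresh objects, the two successors of line 2 can be explored independently without one branch's constraints leaking into the other.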

3 – Interface Identification
1. String aValue = req.getIP("action")
2. if (aValue.equals("checkeligibility"))
3.     int userAge = getNumIP("age")
4.     if (userAge < 16)
5.         displayErrorMsg("Too young.")
6.     else
7.         displayQuotePage()
…
PC1: s_action = "checkeligibility" ∧ integer(s_age) ∧ s_age < 16
PC2: s_action = "checkeligibility" ∧ integer(s_age) ∧ s_age ≥ 16
SS: [s_action → aValue, s_age → userAge]
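One way to read this step is that each explored path contributes the set of parameters it accessed plus its path condition, and paths with the same parameter set describe the same interface. The sketch below groups paths that way. The grouping rule itself is an assumption made for illustration, and PathSummary and InterfaceIdentifier are hypothetical names; the slides do not spell out the exact procedure.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical summary of one explored path.
class PathSummary {
    final List<String> accessed;   // names of the symbolic values created on the path
    final String pathCondition;    // conjunction of constraints collected on the path

    PathSummary(List<String> accessed, String pathCondition) {
        this.accessed = accessed;
        this.pathCondition = pathCondition;
    }
}

class InterfaceIdentifier {
    // Groups path conditions by the set of parameters accessed on the path.
    // Each key is one candidate interface; its value lists the domain
    // constraints (one path condition per path) observed for that interface.
    static Map<Set<String>, List<String>> identify(List<PathSummary> paths) {
        Map<Set<String>, List<String>> interfaces = new LinkedHashMap<>();
        for (PathSummary p : paths) {
            Set<String> key = new LinkedHashSet<>(p.accessed);
            interfaces.computeIfAbsent(key, k -> new ArrayList<>())
                      .add(p.pathCondition);
        }
        return interfaces;
    }
}
```

For the slide's example, the two paths through lines 2 to 7 both access action and age, so under this reading they would be reported as one interface carrying the two domain constraints PC1 and PC2.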

Empirical Evaluation
Research Questions (RQ):
1. Efficiency: Is the new approach efficient in terms of its analysis time requirements?
2. Precision: Is the new approach more precise than previous approaches?
3. Usefulness: Does the new approach improve the performance of quality assurance techniques?

Implementation: WAM SE
Written in Java for Java Enterprise Edition (JEE) based web applications
Implementation Modules
1. TRANSFORM
    Customized JEE libraries
    Stinger for analysis and automated transformation
2. SE ENGINE
    Symbolic execution engine built on JavaPathFinder
    Constraint solver is YICES
3. PC ANALYSIS

Implementation: Other Approaches
Dynamic
    Spider: Web spider crawls pages of application
        OWASP Web Scarab Project
Static
    DFW [1]: Identify parameter names via static analysis
        Reimplementation of the author-provided code
    WAM DF [2]: Uses iterative data-flow analysis
        Implementation from previous work
1. Deng, Frankl, Wang, SEN
2. Halfond and Orso, FSE 2007.

Subject Applications
Columns: Subject, LOC, Classes, Servlets
Bookstore: 19,
Classifieds: 10,70218
Employee Directory: 5,
Events: 7,
Subjects available online from GotoCode.com

RQ1: Efficiency
[Chart comparing WAM SE, WAM DF, DFW, and Spider]
1. Large number of infeasible paths in subjects
2. Low number of constraints per parameter
3. Web applications highly modular

RQ2: Precision
[Chart comparing WAM SE and WAM DF]
On average, 80% of WAM DF interfaces were spurious

RQ3: Usefulness
Measure improvement of three quality assurance techniques:
a) Invocation Verification
b) Penetration Testing
c) Test Input Generation

RQ3a – Invocation Verification
Verification of invocations for subject Bookstore
Approach: False Positives, False Negatives
WAM DF: 0%, 50%
Spider: 39%, 0%
WAM SE: 0%
[Figure: Web Application with getQuote.jsp and buyPolicy.jsp; an invocation marked X]

RQ3b – Penetration Testing
[Chart comparing WAM SE, WAM DF, DFW, and Spider]
Number of vulnerabilities: 2X – 6X higher for WAM SE

RQ3c – Test Input Generation
[Charts: % Stmt. Coverage, % Branch Coverage, and # Command Forms for WAM SE, WAM DF, DFW, and Spider]
Branch coverage increase: 3%-67%
Statement coverage increase: 3%-25%
Command form increase: 651%-1,577%

RQ3c – Test Suite Size
[Chart comparing WAM SE, WAM DF, DFW, and Spider]
Test suite decrease in size: 4X – 10X
RQ3c results:
1. Higher coverage for measured metrics
2. Smaller average test suite

Summary of Results
Developed interface identification technique for web applications based on symbolic execution.
Empirical evaluation:
    Similar analysis time to other techniques
    More precise than current techniques
    Improves quality assurance techniques

2 – Conversion to Numeric Type
3. int userAge = getNumIP("age")
…
public int getNumIP(String name)
1. String value = getIP(name)
2. int param = Integer.parse(value)
3. return param
Before: (PC, SS[s_action → aValue])
After: (PC ∧ integer(s_age), SS[s_age → userAge, s_action → aValue])

2 – Arithmetic Constraints
3. int userAge = getNumIP("age")
4. if (userAge < 16)
Before line 4: (PC ∧ integer(s_age), SS[s_age → userAge])
TRUE branch: (PC ∧ integer(s_age) ∧ s_age < 16, SS[s_age → userAge])
FALSE branch: (PC ∧ integer(s_age) ∧ s_age ≥ 16, SS[s_age → userAge])
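The numeric conversion and the arithmetic branch can be traced together with a small sketch. It assumes, for illustration only, that path conditions are lists of constraint strings; AgeBranchSketch and its methods are hypothetical names, not part of the tool.

```java
import java.util.ArrayList;
import java.util.List;

class AgeBranchSketch {
    // Returns a copy of the path condition extended with one more constraint.
    static List<String> extend(List<String> pathCondition, String constraint) {
        List<String> next = new ArrayList<>(pathCondition);
        next.add(constraint);
        return next;
    }

    // Traces lines 3 and 4 of the example and returns the path conditions
    // for the TRUE branch (index 0) and the FALSE branch (index 1).
    static List<List<String>> explore() {
        // Line 3: getNumIP("age") converts the input to a number, which
        // constrains the symbol to be an integer.
        List<String> pc = extend(new ArrayList<>(), "integer(s_age)");

        // Line 4: if (userAge < 16) forks the path; each successor gets
        // the branch condition or its negation.
        List<List<String>> successors = new ArrayList<>();
        successors.add(extend(pc, "s_age < 16"));
        successors.add(extend(pc, "s_age >= 16"));
        return successors;
    }
}
```

In a full symbolic executor a constraint solver would be asked whether each successor is satisfiable before exploring it; the sketch omits that check.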