Automatic Detection of Previously-Unseen Application States for Deployment Environment Testing and Analysis Chris Murphy, Moses Vaughan, Waseem Ilahi,

Automatic Detection of Previously-Unseen Application States for Deployment Environment Testing and Analysis Chris Murphy, Moses Vaughan, Waseem Ilahi, Gail Kaiser Columbia University

2 Chris Murphy, Columbia University Overview
- Deployment environment testing and analysis can find defects not found prior to release
- Approaches potentially suffer from high overhead
- We seek to increase efficiency by running tests/analysis only in previously-unseen application states

3 Chris Murphy, Columbia University In Vivo Testing [Murphy et al. ICST’09]
- Automatically conduct tests as the software is running in the deployment environment
  - Unit tests, integration tests, etc.
  - “Runtime assertions with side effects”
- Looks for defects in previously untested application states

4 Chris Murphy, Columbia University In Vivo test
- Function f is about to be executed with input x in state S
- Create a sandbox for the test
- Outside the sandbox: execute f(x); the program continues
- In the sandbox: execute INVtest_f(x) in state S; report violations

5 Chris Murphy, Columbia University Research Question
- Can the approach be made more efficient by only running tests and performing analysis in application states that the program has not seen before?
- Yes, assuming we can quickly determine whether the current state has not previously been seen

6 Chris Murphy, Columbia University Broader Impact
- Other approaches may also use data about deployment environment execution
  - Monitoring: Anomaly detection
  - Profiling: Path coverage, line coverage, etc.
  - Fault localization
- All of these may benefit if analysis is done only in previously-unseen states

7 Chris Murphy, Columbia University Analysis
- Overhead of standard approach
  - $T_{\text{standard}} = N \cdot t_i$
- Overhead in previously unseen states
  - $T_{\text{unseen}} = N \cdot \mathrm{DSP} \cdot (t_d + t_u + t_i)$
- Overhead in previously seen states
  - $T_{\text{seen}} = N \cdot (1 - \mathrm{DSP}) \cdot t_d$
- Find DSP so that $T_{\text{unseen}} + T_{\text{seen}} \le T_{\text{standard}}$
- Symbols:
  - $N$: number of tests
  - $t_i$: time per test
  - $\mathrm{DSP}$: Distinct State Percentage
  - $t_d$: time to determine if state has been seen before
  - $t_u$: time to update list of seen states
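
The overhead formulas on this slide can be tried out numerically. The following is a minimal sketch, assuming DSP is expressed as a fraction between 0 and 1; the function names and the timing values are hypothetical and chosen only for illustration.

```python
def t_standard(n, t_i):
    """Overhead when every one of the n calls runs the test or analysis."""
    return n * t_i


def t_selective(n, dsp, t_i, t_d, t_u):
    """Overhead when the test runs only in previously unseen states.

    dsp is the Distinct State Percentage as a fraction in [0, 1].
    """
    unseen = n * dsp * (t_d + t_u + t_i)  # check, record the state, then test
    seen = n * (1 - dsp) * t_d            # check only
    return unseen + seen


if __name__ == "__main__":
    # Hypothetical timings (arbitrary units), not measurements from the talk.
    n, t_i, t_d, t_u = 1000, 10.0, 0.5, 0.5
    for dsp in (0.0, 0.5, 1.0):
        print(dsp, t_standard(n, t_i), t_selective(n, dsp, t_i, t_d, t_u))
```

With these made-up numbers the selective approach is cheaper when no or half of the states are distinct, and slightly more expensive when every state is distinct, because the check and update costs are then paid on every call.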

8 Chris Murphy, Columbia University Analysis Results
- To be more efficient, we need:
  - $\mathrm{DSP} \le (t_i - t_d) / (t_i + t_u)$
- If $t_d$ and $t_u$ are much less than $t_i$, the right side of the inequality approaches 1
- This means that even if nearly all states are distinct, this new approach will still be more efficient
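
The bound follows from the overhead formulas on the preceding Analysis slide. A short derivation using the same symbols is sketched below, assuming N > 0 and t_i + t_u > 0; the numeric values in the closing comment are hypothetical.

```latex
\begin{align*}
T_{\text{unseen}} + T_{\text{seen}} &\le T_{\text{standard}} \\
N \cdot \mathrm{DSP} \cdot (t_d + t_u + t_i) + N \cdot (1 - \mathrm{DSP}) \cdot t_d &\le N \cdot t_i \\
\mathrm{DSP} \cdot (t_u + t_i) + t_d &\le t_i \\
\mathrm{DSP} &\le \frac{t_i - t_d}{t_i + t_u}
\end{align*}
% Hypothetical example: t_i = 10 and t_d = t_u = 0.5 give
% DSP <= 9.5 / 10.5, which is roughly 0.905.
```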

9 Chris Murphy, Columbia University Implementation Issues
- How do we define the “state”?
- How do we represent the state?
- How can we quickly determine whether a state has previously been seen?
- Does the previous analysis hold in the real world?

10 Chris Murphy, Columbia University Defining “State”
- We define “state” as “the values of all variables that are in scope at a given execution point”
- For the purposes of In Vivo Testing, this can be further refined to “the values of all variables on which a function depends that are in scope at the start of the function execution”

11 Chris Murphy, Columbia University Example
- We can statically determine that function f1 depends on:
  - parameters p1, p2, p3
  - global variables a and b
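
The code listing for this example did not survive in the transcript. As an illustration only, a function with the same dependencies might look like the sketch below; the body of f1, the values of the globals, the extra global c, and the helper state_of_f1 are all assumptions, not the authors' code.

```python
# Hypothetical globals; the values are made up for illustration.
a = 4
b = 3
c = 99  # never read by f1, so it is not part of f1's state


def f1(p1, p2, p3):
    """Hypothetical function that depends on p1, p2, p3 and globals a, b."""
    return (p1 + p2) * a - p3 * b


def state_of_f1(p1, p2, p3):
    """Map each variable f1 depends on to its value at function entry."""
    return {"p1": p1, "p2": p2, "p3": p3, "a": a, "b": b}
```

Under the refined definition of state, two calls to f1 are in the same state exactly when these five values agree, regardless of what c holds.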

12 Chris Murphy, Columbia University Representing States
- Given our definition, the “state” is simply a map between variable names and values
- We want to avoid element-wise comparison
- We want to avoid false positives and false negatives
- Example states: {a = 4, b = 3, …}, {a = 2, b = 5, …}, {a = 1, b = 8, …}, {a = 2, b = 7, …}

13 Chris Murphy, Columbia University Cantor Function
- Goal: Give each state a distinct value
- Hashing function that assigns a distinct number to a pair of numbers [Royden 1988]
  - $f(k_1, k_2) = \frac{1}{2}(k_1 + k_2)(k_1 + k_2 + 1) + k_2$
- Can be used recursively over a set of numerical values
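
The pairing formula and its recursive use can be sketched in a few lines. This is a minimal illustration, assuming all values are non-negative integers; the names cantor_pair and state_key are hypothetical, and folding left to right is one possible reading of "used recursively".

```python
from functools import reduce


def cantor_pair(k1, k2):
    """Cantor pairing: map a pair of non-negative integers to one integer."""
    if k1 < 0 or k2 < 0:
        raise ValueError("defined here only for non-negative integers")
    # (k1 + k2)(k1 + k2 + 1) is always even, so integer division is exact.
    return (k1 + k2) * (k1 + k2 + 1) // 2 + k2


def state_key(values):
    """Fold the pairing function over a non-empty sequence of values."""
    return reduce(cantor_pair, values)


if __name__ == "__main__":
    print(state_key([4, 3]))     # state a = 4, b = 3
    print(state_key([2, 5]))     # state a = 2, b = 5
    print(state_key([1, 2, 3]))  # three variables: pair(pair(1, 2), 3)
```

Distinctness holds for sequences of the same length, which fits a function whose set of tracked variables is fixed. The key grows roughly quadratically with each pairing step, so a fixed-width integer type would overflow quickly; Python integers are unbounded, which hides that limit in this sketch.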

14 Chris Murphy, Columbia University Tracking Execution States
- Even if each state has its own representation, how can we quickly determine whether a state has been seen before?
  - Hashtable: O(n) in worst case
  - Bloom filter: O(1), but allows for false positives
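
To make the Bloom filter trade-off concrete, here is a minimal sketch, not taken from the talk. The class name, the filter size, the number of hash functions, and the use of SHA-256 to derive bit positions are all assumptions for illustration.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter over integer keys (illustrative sizes only)."""

    def __init__(self, num_bits=64, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        # True may be a false positive; False is always correct.
        return all((self.bits >> pos) & 1 for pos in self._positions(key))
```

A false positive here would mean that an unseen state is reported as seen, so its test would be skipped. That is why a constant-time structure with no false negatives is still not enough for this purpose.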

15 Chris Murphy, Columbia University Judy Array
- Scalable, space efficient, speed efficient
  - Developed by Hewlett-Packard in 2001
- Highly-optimized 256-ary prefix tree data structure
- Lookups are $O(\log_{256} n)$
- Now we can quickly detect whether a state has already been seen
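
The core idea of a 256-ary prefix tree can be sketched as follows. This is not the Judy library and omits the node compression that makes Judy arrays space efficient; the class name ByteTrieSet and the 8-byte key width are assumptions for illustration.

```python
class ByteTrieSet:
    """Set of non-negative integers stored in a 256-ary prefix tree."""

    KEY_BYTES = 8  # assumption: every key fits in 64 bits

    def __init__(self):
        self.root = {}

    def add(self, key):
        """Insert key and return True if it was not present before."""
        node = self.root
        created = False
        # Each byte of the key selects one of up to 256 children.
        for byte in key.to_bytes(self.KEY_BYTES, "big"):
            if byte not in node:
                node[byte] = {}
                created = True
            node = node[byte]
        return created

    def __contains__(self, key):
        node = self.root
        for byte in key.to_bytes(self.KEY_BYTES, "big"):
            node = node.get(byte)
            if node is None:
                return False
        return True
```

Because every key has the same fixed width, a lookup walks at most KEY_BYTES levels no matter how many keys are stored, and it never reports a key that was not inserted. Keys that are negative or wider than 8 bytes raise OverflowError from int.to_bytes in this sketch.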

16 Chris Murphy, Columbia University Automated Process
1. Statically analyze the source code to determine which parts of the state the function depends on
2. Create code that uses the Cantor function to represent the state and a Judy Array to determine whether it has already been seen
3. Generate instrumentation as normal, with a call to the function created in Step 2
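
The three steps can be pictured with a small hand-written stand-in for the generated instrumentation. Everything in this sketch is hypothetical: the function f, its dependence on the global scale, and the test in_vivo_test_f are invented for illustration, a Python set stands in for the Judy Array, and the test runs inline rather than in a sandbox.

```python
from functools import reduce


def cantor_pair(k1, k2):
    """Cantor pairing of two non-negative integers."""
    return (k1 + k2) * (k1 + k2 + 1) // 2 + k2


seen_states = set()  # stand-in for the Judy Array
tests_run = []       # records the states in which the test actually ran

scale = 2  # hypothetical global; step 1 would find that f depends on it


def f_body(x):
    """Original, uninstrumented function."""
    return x * scale


def in_vivo_test_f(x):
    """Hypothetical in vivo test for f, run inline here for simplicity."""
    tests_run.append((x, scale))
    assert f_body(x) == x * scale


def f(x):
    """Instrumented entry point: run the test only in unseen states."""
    key = reduce(cantor_pair, (x, scale))  # step 2: represent the state
    if key not in seen_states:             # step 2: has it been seen?
        seen_states.add(key)
        in_vivo_test_f(x)                  # step 3: normal instrumentation
    return f_body(x)


if __name__ == "__main__":
    f(3)
    f(3)  # same state as the first call, so the test is skipped
    f(4)
    print(tests_run)
```

Calling f three times with two distinct states runs the test twice, which is the saving the approach aims for.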

17 Chris Murphy, Columbia University Evaluation
- In practice, is running the instrumentation only sometimes (i.e., only in previously unseen states) really more efficient than always running it?
- Target: In Vivo Testing implementation in C
- Sieve of Eratosthenes program run with 100 inputs, using varying percentages of distinct states, ranging from:
  - 0%, i.e. all values are the same
  - 100%, i.e. all values are different

18 Chris Murphy, Columbia University Results

19 Chris Murphy, Columbia University Limitations & Future Work
- Memory cost
- Upper bound of Cantor function
- Representation of complex objects
- Coordinating globally-unseen states


21 Chris Murphy, Columbia University Related Work
- Reducing overhead of runtime monitoring
  - Static analysis to remove unnecessary instrumentation [Yong & Horwitz, 2005]
  - Fast cases vs. slow cases [Liblit et al., 2003]
- State representation for anomaly detection
  - [Baah et al., 2006]
  - [Hangal & Lam, 2002]
