Efficient Regression Tests for Database Application Systems Florian Haftmann, i-TV-T AG Donald Kossmann, ETH Zurich + i-TV-T AG Alexander Kreutz, i-TV-T AG

Conclusions
1. Testing is a Database Problem
– managing state
– logical and physical data independence
2. Testing is a Problem
– no vendor admits it
– grep for "Testing" in SIGMOD et al.
– ask your students
– We love to write code; we hate testing!

Outline
– Background & Motivation
– Execution Strategies
– Ordering Algorithms
– Experiments
– Future Work

Regression Tests
Goal: Reduce the Cost of Change Requests
– reduce the cost of tests (automate testing)
– reduce the probability of emergencies
– customers do their own tests (and changes)
Approach:
– "test programs"
– record correct behavior before the change
– execute the test programs after the change
– report differences in behavior
Lit.: Beck, Gamma: Test Infected. Programmers love writing tests. (JUnit)

Research Challenges
Test Run Generation (in progress)
– automatic (robot), teach-in, monitoring, declarative specification
Test Database Generation (in progress)
Test Run, DB Management and Evolution (unsolved)
Execution Strategies (solved), Incremental (unsolved)
Computation and visualization of Δ (solved)
Quality parameters (in progress)
– functionality (solved)
– performance (in progress)
– availability, concurrency, security (unsolved)
Cost Model, Test Economy (unsolved)

Demo

CVS repository; contains traces structured by group in a directory tree

Showing Differences

What is the Problem?
The application is stateful; answers depend on state.
Need to control state – phases of test execution:
– Setup: bring the application into the right state (precondition)
– Exec: execute the test requests (compute diffs)
– Report: generate a summary of the diffs
– Cleanup: bring the application back into the base state
Demo: nobody specified Setup (precondition)

Solution
Generic Setup and Cleanup
– a "test database" defines the base state of the application
– reset test database = Setup for all tests
– NOP = Cleanup for all tests
Test engineers only implement Exec
(Report is also generic for all tests.)

Regression Test Approaches
Traditional (JUnit, IBM Rational, WinRunner, …)
– Setup must be implemented by test engineers
– Assumption: most applications are stateless (no DB)
  (60 abstracts; 1 abstract with the word "database")
Information Systems (HTTrace)
– Setup is provided as part of the test infrastructure
– Assumption: most applications are stateful (DB)
→ avoid manual work to control state!

DB Regression Tests
– Background & Motivation
– Execution Strategies
– Ordering Algorithms
– Experiments
– Conclusion

Definitions
Test Database D: instance of a database schema
Request Q: a pair of functions
– a : {D} → answer
– d : {D} → {D}
Test Run T: a sequence of requests T = ⟨Q1, …, Qn⟩ with
– a : {D} → answers
– d : {D} → {D}, d(D) = dn(dn−1(… d1(D)))
Schedule S: a sequence of test runs S = ⟨T1, …, Tm⟩
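
These definitions map directly onto code. Below is a minimal Python sketch of the model (names and types are illustrative, not taken from the authors' tool): a request pairs an answer function a with a state-transition function d, and a test run composes its requests in sequence.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

DB = Dict[str, Any]  # a test database instance, modeled here as a dict

@dataclass
class Request:
    a: Callable[[DB], Any]  # a : {D} -> answer
    d: Callable[[DB], DB]   # d : {D} -> {D}

@dataclass
class TestRun:
    requests: List[Request]  # T = <Q1, ..., Qn>

    def run(self, db: DB) -> List[Any]:
        """Apply the requests in order, so the database evolves as
        d(D) = dn(dn-1(... d1(D))); returns the sequence of answers."""
        answers = []
        for q in self.requests:
            answers.append(q.a(db))
            db = q.d(db)
        return answers

# A schedule S is simply a sequence of test runs: S = [T1, ..., Tm].
```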

Failed Test Run (strict): there exists a request Q in T and a database state D such that Δ(ao, an) ≠ 0 or do(D) ≠ dn(D)
– To, Qo: behavior of the test run / request before the change
– Tn, Qn: behavior of the test run / request after the change
Failed Test Run (relaxed): for a given D, there exists a request Q in T such that Δ(ao, an) ≠ 0
Note: Error messages of the application are answers; apply the Δ function to error messages, too.
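
As a sketch of how the relaxed criterion can be checked, the Δ function is modeled below as plain inequality on answers (a simplification; as noted above, error messages count as answers too):

```python
def delta(a_old, a_new):
    """Δ(ao, an): None if the answers match, the differing pair otherwise."""
    return None if a_old == a_new else (a_old, a_new)

def failed_relaxed(old_answers, new_answers):
    """Relaxed failure: there exists a request with Δ(ao, an) ≠ 0."""
    return any(delta(a_o, a_n) is not None
               for a_o, a_n in zip(old_answers, new_answers))
```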

Definitions (ctd.)
False Negative: a test run that fails although the new version of the application behaves like the old version.
False Positive: a test run that does not fail although the new version of the application does not behave like the old version.

Teach-In (DB)
[diagram: a test engineer / test generation tool drives the application (old version O) on test database D through the test tool; the recorded traces are stored in the repository]

Execute Tests (DB)
[diagram: the test tool replays the repository's traces against the application (new version N) on test database D and compares recorded and new answers (ao, an)]

False Negative
[diagram: as before, but the application (new version N) runs on a state dni(D) left behind by earlier test runs instead of on D, so answers differ although the new version behaves like the old one]

Problem Statement
Execute test runs such that
– there are no false positives
– there are no false negatives
– the extra work to control state is affordable
Unfortunately, this is too much!
Possible Strategies
– avoid false negatives
– resolve false negatives
Constraints
– avoidance or resolution is automatic and cheap
– test runs can be added and removed at any time

Strategy 1: Fixed Order
Approach: Avoid False Negatives
– execute test runs always in the same order
– (each test run always starts on the same DB instance)
Assessment
– one failed/broken test run kills the whole rest
  → disaster if it is not possible to fix the test run
– test engineers cannot add test runs concurrently
– breaks logical data independence
– can use existing test infrastructure

Strategy 2: No Updates
Approach: Avoid False Negatives (Manually)
– write test runs that do not change the test database
– (mathematically: d(D) = D for all test runs)
Assessment
– high burden on the test engineer
  → very careful about which test runs to define
  → very difficult to resolve false negatives
– precludes automatic test run generation
– breaks logical data independence
– sometimes impossible (no compensating action)
– can use existing test infrastructure

Strategy 3: Reset Always
Approach: Avoid False Negatives (Automatically)
– reset D before executing each test run
– schedules: R T1 R T2 R T3 … R Tn
How to reset a database?
– add a software layer that logs all changes (impractical)
– use the database recovery mechanism (very expensive)
– reload the database files into the file system (expensive)
Assessment
– everything is automatic
– easy to extend the test infrastructure
– expensive regression tests: restart server, lose cache, I/O
– (10,000 test runs take about 20 days just for resets)
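
A minimal sketch of Reset Always, assuming a reset_db() hook that restores the base state D and an execute(t) that returns True iff the test run behaved as recorded (both hypothetical names):

```python
def run_reset_always(test_runs, reset_db, execute):
    """Schedule R T1 R T2 ... R Tn: reset before every test run."""
    failures = []
    for t in test_runs:
        reset_db()  # R: restore the base state D
        if not execute(t):
            failures.append(t)
    return failures
```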

Strategy 4: Optimistic
Motivation: Avoid unnecessary resets
– T1 tests the master data module, T2 tests the forecasting module
– why reset the database before executing T2?
Approach: Resolve False Negatives (Automatically)
– reset D when a test run fails, then repeat that test run
– schedules: R T1 T2 T3 R T3 … Tn
Assessment
– everything is automatic
– easy to extend the test infrastructure
– reset only when necessary
– some test runs are executed twice
– (false positives - avoidable with random permutations)
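
Under the same assumed hooks, a sketch of the Optimistic strategy: a failure is reported only if the test run also fails on a freshly reset database, which resolves state-induced false negatives at the price of executing some runs twice:

```python
def run_optimistic(test_runs, reset_db, execute):
    """Schedule R T1 T2 T3 ...; on a failure: reset, repeat the run."""
    failures = []
    reset_db()
    for t in test_runs:
        if not execute(t):
            reset_db()              # resolve a potential false negative
            if not execute(t):
                failures.append(t)  # fails on fresh state: genuine
    return failures
```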

Strategy 5: Optimistic++
Motivation: Remember failures, avoid double execution
– schedule Opt: R T1 T2 T3 R T3 … Tn
– schedule Opt++: R T1 T2 R T3 … Tn
Assessment
– everything is automatic
– easy to extend the test infrastructure
– reset only when necessary
– (keeps additional statistics)
– (false positives - avoidable with random permutations)
Clear winner among all execution strategies!!!
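
Optimistic++ adds one statistic on top of Optimistic: the set of test runs known to need a fresh database. A sketch under the same assumptions, where needs_reset persists across regression-test iterations:

```python
def run_optimistic_pp(test_runs, reset_db, execute, needs_reset=None):
    """Like Optimistic, but reset proactively before known conflicts."""
    needs_reset = set() if needs_reset is None else needs_reset
    failures = []
    reset_db()
    for t in test_runs:
        if t in needs_reset:
            reset_db()               # known conflict: reset up front
            if not execute(t):
                failures.append(t)
        elif not execute(t):
            reset_db()
            if not execute(t):
                failures.append(t)   # genuine failure
            else:
                needs_reset.add(t)   # remember for the next iteration
    return failures
```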

DB Regression Tests
– Background & Motivation
– Execution Strategies
– Ordering Algorithms
– Experiments
– Conclusion

Motivating Example
T1: insert a new PurchaseOrder
T2: generate a report – count PurchaseOrders
Schedule A (Opt): T1 before T2 → R T1 T2 R T2
Schedule B (Opt): T2 before T1 → R T2 T1
Ordering test runs matters!

Conflicts
⟨s⟩: a sequence of test runs; t: a test run
⟨s⟩ → t if and only if
– R s t: no failure in s, t fails
– R s R t: no failure in s, t does not fail
Simplified model: ⟨s⟩ is a single test run
– does not capture all conflicts
– results in sub-optimal schedules

Conflict Management
[diagram: test runs T1 … T5 with recorded conflicts, e.g. ⟨…⟩ → T4 and ⟨…⟩ → T5]

Learning Conflicts
E.g., Opt produces the following schedule:
R T1 T2 R T2 T3 T4 R T4 T5 T6 R T6
Add the following conflicts
– ⟨T1⟩ → T2
– ⟨T2 T3⟩ → T4
– ⟨T4 T5⟩ → T6
New conflicts override existing conflicts
– e.g., ⟨T1⟩ → T2 supersedes an older conflict ⟨…⟩ → T2
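
A sketch of this bookkeeping: walk the executed schedule, track the test runs since the last reset, and record ⟨runs since reset⟩ → t whenever t fails; a newer conflict for the same t simply replaces the older one. The schedule encoding below is made up for illustration:

```python
def learn_conflicts(schedule):
    """schedule: 'R' marks a reset, otherwise (test_run, passed) pairs.
    Returns {failed_run: tuple of runs executed since the last reset}."""
    conflicts = {}
    since_reset = []
    for event in schedule:
        if event == 'R':
            since_reset = []
        else:
            t, passed = event
            if passed:
                since_reset.append(t)
            else:
                conflicts[t] = tuple(since_reset)  # <...> -> t
    return conflicts

# The schedule above yields:
# learn_conflicts(['R', ('T1', True), ('T2', False), 'R', ('T2', True),
#                  ('T3', True), ('T4', False), 'R', ('T4', True),
#                  ('T5', True), ('T6', False), 'R', ('T6', True)])
# == {'T2': ('T1',), 'T4': ('T2', 'T3'), 'T6': ('T4', 'T5')}
```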

Problem Statement
Problem 1: Given a set of conflicts, what is the best ordering of test runs (minimizing the number of resets)?
Problem 2: Quickly learn the relevant conflicts and find an acceptable schedule!
Heuristics solve both problems at once!

Slice Heuristics
Slice:
– a sequence of test runs without a conflict
Approach:
– reorder slices after each iteration
– form new slices after each iteration
– record conflicts
Convergence:
– stop reordering if there is no improvement

Example (ctd.)
Iteration 1: use a random order: T1 T2 T3 T4 T5
R T1 T2 T3 R T3 T4 T5 R T5
Three slices: ⟨T1 T2⟩, ⟨T3 T4⟩, ⟨T5⟩
Conflicts: ⟨T1 T2⟩ → T3, ⟨T3 T4⟩ → T5
Iteration 2: reorder slices: T5 T3 T4 T1 T2
R T5 T3 T4 T1 T2 R T2
Two slices: ⟨T5 T3 T4 T1⟩, ⟨T2⟩
Conflicts: ⟨T1 T2⟩ → T3, ⟨T3 T4⟩ → T5, ⟨T5 T3 T4 T1⟩ → T2
Iteration 3: reorder slices: T2 T5 T3 T4 T1
R T2 T5 T3 T4 T1 (no failures, a single reset)

Slice: Example II
Iteration 1: use a random order: T1 T2 T3
R T1 T2 R T2 T3 R T3
Three slices: ⟨T1⟩, ⟨T2⟩, ⟨T3⟩
Conflicts: ⟨T1⟩ → T2, ⟨T2⟩ → T3
Iteration 2: reorder slices: T3 T2 T1
R T3 T2 T1 R T1
Two slices: ⟨T3 T2⟩, ⟨T1⟩
Conflicts: ⟨T1⟩ → T2, ⟨T2⟩ → T3, ⟨T3 T2⟩ → T1
Iteration 3: no reordering possible, apply Opt++:
R T3 T2 R T1

Convergence Criterion
Move slice ⟨s2⟩ before slice ⟨s1⟩ only if there is no known conflict ⟨s⟩ → t with s contained in s2 and t contained in s1.
The slice heuristics converge if no more reorderings are possible according to this criterion.
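
One way to code the criterion plus one reordering pass is sketched below; it reproduces the examples above, though it is a simplified reading and the actual heuristics may differ in detail. Slices are tuples of test runs, and conflicts are stored as produced by learn_conflicts:

```python
def may_move_before(s2, s1, conflicts):
    """Convergence criterion: s2 may move before s1 only if no known
    conflict <s> -> t has s contained in s2 and t contained in s1."""
    for t, s in conflicts.items():
        if t in s1 and set(s) <= set(s2):
            return False
    return True

def reorder_slices(slices, conflicts):
    """One iteration: bubble each slice forward while permitted."""
    order = []
    for s in slices:
        pos = len(order)
        while pos > 0 and may_move_before(s, order[pos - 1], conflicts):
            pos -= 1
        order.insert(pos, s)
    return order

# Example II, iteration 2:
# reorder_slices([('T1',), ('T2',), ('T3',)],
#                {'T2': ('T1',), 'T3': ('T2',)})
# == [('T3',), ('T2',), ('T1',)]
```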

Slice is sub-optimal
True conflicts: ⟨T1 T2⟩ → T3, ⟨T3⟩ → T1
Optimal schedule: R T1 T3 T2
Applying slice with initial order T1 T2 T3:
R T1 T2 T3 R T3
Two slices: ⟨T1 T2⟩, ⟨T3⟩; Conflicts: ⟨T1 T2⟩ → T3
Iteration 2: reorder slices: T3 T1 T2
R T3 T1 R T1 T2
Two slices: ⟨T3⟩, ⟨T1 T2⟩; Conflicts: ⟨T1 T2⟩ → T3, ⟨T3⟩ → T1
Iteration 3: no reordering, the algorithm converges (two resets instead of the optimal one)

Slice Summary
– extends the Opt, Opt++ execution strategies
– strictly better than Opt++
– #resets decreases monotonically
– converges very quickly (good!)
– sub-optimal schedules when it converges (bad!)
Possible extensions
– relaxed convergence criterion (bad!)
– merge slices (bad!)

Graph-based Heuristics
Use the simplified conflict model: Tx → Ty
Conflicts form a graph: nodes are test runs
Apply a graph reduction algorithm
– MinFanOut: runs with the lowest fan-out first
– MinWFanOut: weigh edges with probabilities
– MaxDiff: maximum fan-in minus fan-out first
– MaxWDiff: weighted fan-in minus weighted fan-out
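
As an illustration, a sketch of the MaxDiff reduction under the simplified model, with edges (Tx, Ty) meaning Tx → Ty (running Tx first makes Ty fail): repeatedly schedule the remaining run with the largest fan-in minus fan-out, so runs that others break go early and runs that break others go late. The weighted variants would attach learned conflict probabilities to the edges.

```python
def max_diff_order(test_runs, edges):
    """edges: set of (tx, ty) pairs for conflicts tx -> ty."""
    remaining = set(test_runs)
    order = []
    while remaining:
        def score(t):
            fan_in = sum(1 for x, y in edges if y == t and x in remaining)
            fan_out = sum(1 for x, y in edges if x == t and y in remaining)
            return fan_in - fan_out
        order.append(max(remaining, key=score))  # highest fan-in - fan-out
        remaining.remove(order[-1])
    return order

# E.g., with edges {('T1', 'T2')} (T1 breaks T2), T2 is scheduled first:
# max_diff_order(['T1', 'T2'], {('T1', 'T2')}) == ['T2', 'T1']
```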

Graph-based Heuristics
– extend the Opt, Opt++ execution strategies
– no monotonicity
– slower convergence
– sub-optimal schedules
– many variants conceivable

DB Regression Tests
– Background & Motivation
– Execution Strategies
– Ordering Algorithms
– Experiments
– Conclusion

Experimental Set-Up
Real-world
– Lever Faberge Europe (€5 bln. in revenue)
– BTell (i-TV-T) + SAP R/3 application
– 63 test runs, 448 requests, 117 MB database
– Sun E450: 4 CPUs, 1 GB memory, Solaris 8
Simulation
– synthetic test runs
– vary the number of test runs, vary the number of conflicts
– vary the distribution of conflicts: Uniform, Zipf

Real World
[table: Approach vs. run time (min), resets (R), iterations, and conflicts for the Reset, Opt, Opt++, Slice, and MaxWDiff approaches]

Simulation
[figures: results for the synthetic workloads]

DB Regression Tests
– Background & Motivation
– Execution Strategies
– Ordering Algorithms
– Experiments
– Conclusion

Practical approach to execute DB tests
– good enough for Unilever on i-TV-T, SAP apps
– resets are very rare, false positives non-existent
– decision: 10,000 test runs, 100 GB data by 12/2005
Theory incomplete
– NP-hard? How much conflict information do you need?
– Will verification be viable in the foreseeable future?
Future Work: solve the remaining problems
– concurrency testing, test run evolution, …

Research Challenges
Test Run Generation (in progress)
– automatic (robot), teach-in, monitoring, declarative specification
Test Database Generation (in progress)
Test Run, DB Management and Evolution (unsolved)
Execution Strategies (solved), Incremental (unsolved)
Computation and visualization of Δ (solved)
Quality parameters (in progress)
– functionality (solved)
– performance (in progress)
– availability, concurrency, security (unsolved)
Cost Model, Test Economy (unsolved)

Thank you!