Generating Program Inputs for Database Application Testing
Kai Pan, Xintao Wu (University of North Carolina at Charlotte)
Tao Xie (North Carolina State University)
26th IEEE/ACM International Conference on Automated Software Engineering, Nov 11, 2011, Lawrence, Kansas
Background
Functional testing: test generation covers program inputs and database states.
An Example
Program inputs, database
Motivation
Benefits of using an existing database state
- Represents real-world objects’ characteristics, helping detect faults that could cause failures in real-world settings
- Reduces the cost of generating new database records
Dynamic Symbolic Execution (DSE)
- Execute the program both concretely and symbolically (also called concolic testing)
- Collect constraints along the executed path as the path condition
- Negate part of the path condition and solve the new path condition to lead to a new path
- DSE tools exist for various programming languages, e.g., Pex for .NET from Microsoft Research
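The loop described above can be sketched in a few lines of Python. This is a toy illustration only, not how Pex works internally: the program under test, the `branch` helper, and the brute-force `solve` function are all assumptions made for this example, and `solve` stands in for a real constraint solver by searching a small integer domain.

```python
def program(x, path):
    """Toy program under test; each branch appends (predicate, outcome) to path."""
    def branch(pred):
        taken = pred(x)
        path.append((pred, taken))
        return taken

    if branch(lambda v: v > 10):
        if branch(lambda v: v % 7 == 0):
            return "A"
        return "B"
    return "C"


def solve(constraints, domain=range(-100, 101)):
    """Stand-in for a constraint solver: brute-force search over a small domain."""
    for candidate in domain:
        if all(pred(candidate) == wanted for pred, wanted in constraints):
            return candidate
    return None


def explore(seed=0):
    """Run the program, negate one branch of the path condition, and repeat."""
    worklist, seen, outcomes = [seed], set(), {}
    while worklist:
        x = worklist.pop()
        path = []
        result = program(x, path)
        signature = tuple(taken for _, taken in path)
        if signature in seen:
            continue  # this path was already explored
        seen.add(signature)
        outcomes[result] = x
        # Negate each branch in turn, keeping the prefix before it unchanged.
        for i, (pred, taken) in enumerate(path):
            new_condition = path[:i] + [(pred, not taken)]
            new_input = solve(new_condition)
            if new_input is not None:
                worklist.append(new_input)
    return outcomes


outcomes = explore()
```

Starting from the seed input, `explore` reaches all three return values of the toy program by flipping one branch at a time.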
Motivation
Path Condition:
C1: Query construction constraints
C2: Query/DB constraints
C3: Result manipulation constraints
C1 ^ C2 ^ C3
A hard part
13
Motivation 13 How to derive high-covering program input values based on a given database state?
14
Outline Background Approach Evaluation Conclusion and future work 14
15
SQL query forms Fundamental structure: SELECT, FROM, WHERE, GROUP BY, and HAVING clauses. SELECT select-list FROM from-list WHERE qualification (GROUP BY grouping-list) (HAVING group-qualification) 15
SQL query forms (cont’d)
Nested query: a query with another query embedded within it.
A nested query can be unnested into an equivalent single-level canonical query by applying transformation rules.
A nested query:
SELECT S.sname
FROM Sailors S
WHERE EXISTS ( SELECT * FROM Reserves R WHERE R.bid=103 AND R.sid=S.sid)
Its canonical form:
SELECT S.sname
FROM Sailors S, Reserves R
WHERE R.sid=S.sid AND R.bid=103
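As a quick check of the unnesting idea, the sketch below runs the nested query and its canonical form against a small in-memory SQLite database. The table contents are made up for illustration. Note that the join form returns one row per matching reservation, so the two queries agree row for row only when each sailor has at most one reservation for boat 103, as in this data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT);
    CREATE TABLE Reserves (sid INTEGER, bid INTEGER);
    -- Hypothetical sample rows, not from the slides.
    INSERT INTO Sailors VALUES (1, 'Dustin'), (2, 'Lubber'), (3, 'Horatio');
    INSERT INTO Reserves VALUES (1, 103), (2, 104), (3, 103);
""")

nested = """
    SELECT S.sname FROM Sailors S
    WHERE EXISTS (SELECT * FROM Reserves R
                  WHERE R.bid=103 AND R.sid=S.sid)
"""
canonical = """
    SELECT S.sname FROM Sailors S, Reserves R
    WHERE R.sid=S.sid AND R.bid=103
"""

# Sort both result sets so the comparison does not depend on row order.
nested_rows = sorted(conn.execute(nested).fetchall())
canonical_rows = sorted(conn.execute(canonical).fetchall())
```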
SQL query forms of focus
WHERE clause consisting of a disjunction of conjunctions:
SELECT C1, C2, ..., Ch
FROM from-list
WHERE (A11 AND ... AND A1n) OR ... OR (Am1 AND ... AND Amn)
Illustrative example
Apply DSE on the existing database
Step 1: DSE chooses “type=0, zip=0”.
Executed query Q1:
SELECT C.SSN, C.income, M.balance
FROM customer C, mortgage M
WHERE M.year=15 AND C.zipcode=1 AND C.SSN=M.SSN
Execution of Q1 returns zero records, so the loop body is not covered.

Step 2: DSE flips “type == 0” to “type != 0” and chooses “type=1, zip=0”.
Executed query Q2:
SELECT C.SSN, C.income, M.balance
FROM customer C, mortgage M
WHERE M.year=30 AND C.zipcode=1 AND C.SSN=M.SSN
Execution of Q2 returns zero records, so the loop body is not covered.

However, an input like “type=0, zip=27694” executes query Q3:
SELECT C.SSN, C.income, M.balance
FROM customer C, mortgage M
WHERE M.year=15 AND C.zipcode=27695 AND C.SSN=M.SSN
Execution of Q3 returns one record {C.SSN = 001, C.income = 50000, M.balance = 20000}, covering Line14=true and Line18=false.

Furthermore, an input like “type=0, zip=28222” executes query Q4:
SELECT C.SSN, C.income, M.balance
FROM customer C, mortgage M
WHERE M.year=15 AND C.zipcode=28223 AND C.SSN=M.SSN
Execution of Q4 returns one record {C.SSN = 002, C.income = 150000, M.balance = 30000}. As a result, Line14=true and Line18=true are covered.
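The example program itself appears only as an image on the slides, so the following Python sketch is a reconstruction under assumptions: the function name, the coverage bookkeeping, and the table schema are hypothetical, while the branch on `type`, the relation `fzip = zip + 1`, the query text, the `diff > 100000` check, and the two records come from the slides.

```python
import sqlite3


def make_db():
    """Hypothetical database state holding the two records quoted on the slides."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (SSN TEXT, zipcode INTEGER, income INTEGER);
        CREATE TABLE mortgage (SSN TEXT, year INTEGER, balance INTEGER);
        INSERT INTO customer VALUES ('001', 27695, 50000), ('002', 28223, 150000);
        INSERT INTO mortgage VALUES ('001', 15, 20000), ('002', 15, 30000);
    """)
    return conn


def program_under_test(conn, type_, zip_):
    """Returns the set of branch outcomes covered for the given inputs."""
    covered = set()
    if type_ == 0:  # Line04
        year = 15
        covered.add("Line04=true")
    else:
        year = 30
        covered.add("Line04=false")
    fzip = zip_ + 1
    rows = conn.execute(
        "SELECT C.SSN, C.income, M.balance FROM customer C, mortgage M "
        "WHERE M.year=? AND C.zipcode=? AND C.SSN=M.SSN",
        (year, fzip),
    ).fetchall()
    for _ssn, income, balance in rows:  # Line14: loop body runs once per record
        covered.add("Line14=true")
        diff = income - 1.5 * balance
        if diff > 100000:  # Line18
            covered.add("Line18=true")
        else:
            covered.add("Line18=false")
    return covered


conn = make_db()
q1_coverage = program_under_test(conn, 0, 0)      # corresponds to Q1
q3_coverage = program_under_test(conn, 0, 27694)  # corresponds to Q3
q4_coverage = program_under_test(conn, 0, 28222)  # corresponds to Q4
```

With inputs such as (0, 0) or (1, 0) the query returns no rows and the loop body is never entered, which is the situation DSE alone runs into in Steps 1 and 2.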
Assist DSE to generate program inputs
How to derive high-covering program input values based on a given database state?

Our idea: construct auxiliary queries
Auxiliary query:
SELECT C.zipcode
FROM customer C, mortgage M
WHERE M.year=15 AND C.SSN=M.SSN
For example, the result set includes “fzip=27695”. From “fzip=zip+1”, we derive “zip=27694”, which covers Line14=true and Line18=false.
The auxiliary query acts like a “constraint solver” for program constraints + DB state constraints.
Approach
- Collect query construction constraints on program variables used in the executed queries from the program code
- Collect result manipulation constraints that compare against record values in the query’s result set (such as “if (diff>100000)”)
Construct auxiliary queries
For path “Line04=true, Line14=true”, construct the abstract query, in which the condition on ‘fzip’ is our target:
SELECT C.SSN, C.income, M.balance
FROM customer C, mortgage M
WHERE M.year=15 AND C.zipcode=‘fzip’ AND C.SSN=M.SSN
Construct the auxiliary query:
SELECT C.zipcode
FROM customer C, mortgage M
WHERE M.year=15 AND C.SSN=M.SSN
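The construction step above can be sketched as a small string transformation. This is a simplified illustration, not the authors' algorithm: the function name `build_auxiliary_query` and the tuple representation of conjuncts are assumptions, and only equality-style conjuncts in a single conjunction are handled. Conjuncts that compare a column with a program variable are dropped from the WHERE clause, and the compared column moves into the SELECT list.

```python
def build_auxiliary_query(from_list, conjuncts, program_vars):
    """Builds an auxiliary query from an abstract query's WHERE conjuncts.

    conjuncts: list of (column, operator, operand) tuples.
    program_vars: names of operands that stand for unknown program variables.
    """
    select_columns = []
    kept = []
    for column, op, operand in conjuncts:
        if operand in program_vars:
            # The column's stored values are candidates for the program variable.
            select_columns.append(column)
        else:
            kept.append(f"{column}{op}{operand}")
    return (
        f"SELECT {', '.join(select_columns)} "
        f"FROM {from_list} "
        f"WHERE {' AND '.join(kept)}"
    )


auxiliary = build_auxiliary_query(
    "customer C, mortgage M",
    [("M.year", "=", "15"), ("C.zipcode", "=", "fzip"), ("C.SSN", "=", "M.SSN")],
    {"fzip"},
)
```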
Generate program input values
Run the auxiliary query:
SELECT C.zipcode
FROM customer C, mortgage M
WHERE M.year=15 AND C.SSN=M.SSN
fzip: 27695 or 28223
zip: 27694 or 28222
Input combinations: type: 0 or !0 × zip: 27694 or 28222
“type=0, zip=27694” covers Line04=true, Line14=true, but Line18=false.
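A minimal sketch of this step, assuming a hypothetical SQLite schema that holds the two records quoted on the slides: run the auxiliary query, invert `fzip = zip + 1` to recover candidate `zip` values, and pair them with the values of `type`. Using 1 as the representative for “!0” is an assumption of this example.

```python
import sqlite3
from itertools import product

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (SSN TEXT, zipcode INTEGER, income INTEGER);
    CREATE TABLE mortgage (SSN TEXT, year INTEGER, balance INTEGER);
    -- Hypothetical database state matching the records shown on the slides.
    INSERT INTO customer VALUES ('001', 27695, 50000), ('002', 28223, 150000);
    INSERT INTO mortgage VALUES ('001', 15, 20000), ('002', 15, 30000);
""")

AUXILIARY_QUERY = (
    "SELECT C.zipcode FROM customer C, mortgage M "
    "WHERE M.year=15 AND C.SSN=M.SSN"
)

# Values of fzip for which the original query would return at least one record.
fzip_values = sorted({row[0] for row in conn.execute(AUXILIARY_QUERY)})

# The program computes fzip = zip + 1, so invert that relation.
zip_values = [fzip - 1 for fzip in fzip_values]

# Candidate (type, zip) inputs; 1 stands in for "any value != 0".
input_combinations = list(product([0, 1], zip_values))
```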
Approach (cont’d)
Not enough! Program variables in branch conditions after executing the query may be data-dependent on returned record values.
How to cover the Line18 true branch?
To cover path Line04=true, Line14=true, Line18=true, we need to extend the previous auxiliary query.
Construct auxiliary queries
We extend the WHERE clause:
SELECT C.zipcode
FROM customer C, mortgage M
WHERE M.year=15 AND C.SSN=M.SSN
AND C.income - 1.5 * M.balance > 100000

Generate program input values
Running this extended auxiliary query gives fzip=28223, so zip=28222.
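The extended query can be checked against the two records quoted earlier. The schema below is a hypothetical stand-in for the real database; the query text and the relation `fzip = zip + 1` come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (SSN TEXT, zipcode INTEGER, income INTEGER);
    CREATE TABLE mortgage (SSN TEXT, year INTEGER, balance INTEGER);
    -- Hypothetical database state matching the records shown on the slides.
    INSERT INTO customer VALUES ('001', 27695, 50000), ('002', 28223, 150000);
    INSERT INTO mortgage VALUES ('001', 15, 20000), ('002', 15, 30000);
""")

EXTENDED_AUXILIARY_QUERY = """
    SELECT C.zipcode
    FROM customer C, mortgage M
    WHERE M.year=15 AND C.SSN=M.SSN
      AND C.income - 1.5 * M.balance > 100000
"""

# Only records that also satisfy the Line18 condition survive the extra conjunct.
fzip_values = [row[0] for row in conn.execute(EXTENDED_AUXILIARY_QUERY)]
zip_values = [fzip - 1 for fzip in fzip_values]
```

Record 001 is filtered out because 50000 - 1.5 * 20000 = 20000, while record 002 passes because 150000 - 1.5 * 30000 = 105000.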
Other issues (aggregate calculation)
Aggregate calculations involve multiple records. Extend the auxiliary query with GROUP BY and HAVING clauses:
SELECT C.zipcode, sum(M.balance)
FROM customer C, mortgage M
WHERE M.year=15 AND C.SSN=M.SSN
AND C.income - 1.5 * M.balance > 100000
GROUP BY C.zipcode
HAVING sum(M.balance) > 500000

Other issues (cardinality constraints)
SELECT C.zipcode
FROM customer C, mortgage M
WHERE M.year=15 AND C.SSN=M.SSN
AND C.income - 1.5 * M.balance > 100000
GROUP BY C.zipcode
HAVING COUNT(*) >= 3
Use a special DSE technique for dealing with input-dependent loops: P. Godefroid and D. Luchaup. Automatic partial loop summarization in dynamic test generation. In ISSTA 2011.
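Both GROUP BY/HAVING extensions can be tried on a small database. The rows below are invented for this sketch (the slides do not show data for these cases) and are chosen so that one zipcode has three qualifying records whose balances sum to more than 500000.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (SSN TEXT, zipcode INTEGER, income INTEGER);
    CREATE TABLE mortgage (SSN TEXT, year INTEGER, balance INTEGER);
    -- Invented rows: three high-income customers share zipcode 28223.
    INSERT INTO customer VALUES
        ('001', 27695, 50000), ('002', 28223, 900000),
        ('003', 28223, 900000), ('004', 28223, 900000);
    INSERT INTO mortgage VALUES
        ('001', 15, 20000), ('002', 15, 200000),
        ('003', 15, 200000), ('004', 15, 200000);
""")

AGGREGATE_QUERY = """
    SELECT C.zipcode, sum(M.balance)
    FROM customer C, mortgage M
    WHERE M.year=15 AND C.SSN=M.SSN
      AND C.income - 1.5 * M.balance > 100000
    GROUP BY C.zipcode
    HAVING sum(M.balance) > 500000
"""

CARDINALITY_QUERY = """
    SELECT C.zipcode
    FROM customer C, mortgage M
    WHERE M.year=15 AND C.SSN=M.SSN
      AND C.income - 1.5 * M.balance > 100000
    GROUP BY C.zipcode
    HAVING COUNT(*) >= 3
"""

aggregate_rows = conn.execute(AGGREGATE_QUERY).fetchall()
cardinality_rows = conn.execute(CARDINALITY_QUERY).fetchall()
```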
Research questions
RQ1 (Effectiveness): What is the percentage increase in code coverage by the program inputs generated by Pex with our approach’s assistance?
RQ2 (Cost): What is the cost of our approach’s assistance?

Evaluation subjects
Two open source database applications:
- RiskIt: 4.3K LOC; database: 13 tables, 57 attributes, and >1.2 million records; 17 DB-interacting methods selected for testing
- UnixUsage: 2.8K LOC; database: 8 tables, 31 attributes, and >0.25 million records; 28 DB-interacting methods selected for testing

Evaluation setup
Measurement for test generation:
- Effectiveness: code coverage
- Cost: number of runs/paths, execution time
Procedure:
- Run Pex without our approach’s assistance
- Perform our algorithms to generate additional test inputs
Evaluation results: RiskIt
Higher code coverage
Low additional cost
Pex (only) timeout: 120 seconds. Even given longer time, no new coverage was observed for Pex (only).
Evaluation results: UnixUsage
Summary of evaluation results
RQ1: Effectiveness
- RiskIt: 26% higher block coverage over Pex only
- UnixUsage: 35% higher block coverage over Pex only
RQ2: Cost
- RiskIt: #runs/paths: 131 more over 1135 (Pex); execution time: 517 secs more over 1781 (Pex)
- UnixUsage: #runs/paths: 93 more over 1197 (Pex); execution time: 580 secs more over 1718 (Pex)
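To put the cost figures in relative terms, the short calculation below divides each additional amount by the corresponding Pex-only baseline. It takes the numbers on the slide at face value and assumes that “X more over Y (Pex)” means X additional units on top of a Pex-only total of Y; the dictionary layout and function name are just for this sketch.

```python
# (additional amount with the approach, Pex-only baseline), as listed on the slide.
results = {
    "RiskIt": {"runs": (131, 1135), "seconds": (517, 1781)},
    "UnixUsage": {"runs": (93, 1197), "seconds": (580, 1718)},
}


def relative_overhead(extra, baseline):
    """Additional cost as a percentage of the baseline, rounded to one decimal."""
    return round(100.0 * extra / baseline, 1)


overhead = {
    app: {metric: relative_overhead(*pair) for metric, pair in metrics.items()}
    for app, metrics in results.items()
}
```

Under that reading, the additional runs amount to roughly 8 to 12 percent of the Pex-only runs, and the additional execution time to roughly 29 to 34 percent.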
Conclusion
- A new approach that formulates auxiliary queries to bridge the gap between program constraints and DB constraints. It acts like a “constraint solver” for program constraints + DB constraints.
- Empirical evaluations on 2 open source DB apps: our approach can assist DSE to generate program inputs effectively, achieving higher code coverage with low additional cost.
Future Work
- Construct auxiliary queries directly from embedded complex queries (e.g., nested queries), rather than from their transformed normal forms.
- Handle complex program contexts such as multiple queries.
Acknowledgment: This work was supported in part by the U.S. National Science Foundation under CCF-0915059 for Kai Pan and Xintao Wu, and under CCF-0915400 for Tao Xie.
Thank you! Questions?
Related Work
All previous related work addresses a different problem: constructing both program inputs and database states (from scratch).
- M. Emmi, R. Majumdar, and K. Sen. Dynamic test input generation for database applications. In ISSTA, 2007.
- K. Taneja, Y. Zhang, and T. Xie. MODA: Automated test generation for database applications via mock objects. In ASE, 2010.