Tao Xie, North Carolina State University
In collaboration with Nikolai Tillmann, Peli de Halleux, and Wolfram Schulte at Microsoft Research, Suresh Thummalapenta, and students at NCSU ASE

- Software testing is important
  - Software errors cost the U.S. economy about $59.5 billion each year (0.6% of GDP) [NIST 02]
  - Improving testing infrastructure could save about one third of that cost [NIST 02]
- Software testing is costly
  - It can account for as much as half of the total cost of software development [Beizer 90]
- Automated testing reduces manual testing effort
  - Test execution: JUnit, NUnit, xUnit, etc.
  - Test generation: Pex, AgitarOne, Parasoft Jtest, etc.
  - Test-behavior checking: Pex, AgitarOne, Parasoft Jtest, etc.

Program + Test inputs -> Outputs =? Expected Outputs (Test Oracles)

- Test Generation
  - Generating high-quality test inputs (e.g., achieving high code coverage)
- Test Oracles
  - Specifying high-quality test oracles (e.g., guarding against various faults)

- Human
  - Expensive, incomplete, …
- Brute force
  - Pairwise, predefined data, etc.
- Random
  - Cheap, fast
  - "It passed a thousand tests" feeling
- Dynamic symbolic execution: Pex, CUTE, EXE
  - Automated white-box
  - Not random: constraint solving

Code to generate inputs for:

    void CoverMe(int[] a)
    {
        if (a == null) return;
        if (a.Length > 0)
            if (a[0] == 1234567890)
                throw new Exception("bug");
    }

Loop: choose next path (negate one condition) -> solve -> execute & monitor.

Constraints to solve                        | Data    | Observed constraints
-                                           | null    | a==null
a!=null                                     | {}      | a!=null && !(a.Length>0)
a!=null && a.Length>0                       | {0}     | a!=null && a.Length>0 && a[0]!=1234567890
a!=null && a.Length>0 && a[0]==1234567890   | {123…}  | a!=null && a.Length>0 && a[0]==1234567890

Execution tree branch nodes: a==null, a.Length>0, a[0]==123…

Done: there is no path left.
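The four inputs in the table can be replayed against a Java port of CoverMe to see that each one reaches a different path. This is only an illustrative sketch: the class name `CoverMeDemo`, the helper `pathOf`, and the returned path labels are assumptions made here, and the constraint solving that Pex performs to derive the inputs is not shown.

```java
class CoverMeDemo {
    // Java port of the slide's C# method. It returns a label for the path
    // taken (an addition for this sketch) and throws an unchecked exception
    // where the original throws Exception("bug").
    static String coverMe(int[] a) {
        if (a == null) return "a==null";
        if (a.length > 0) {
            if (a[0] == 1234567890) throw new IllegalStateException("bug");
            return "a!=null && a.length>0 && a[0]!=1234567890";
        }
        return "a!=null && !(a.length>0)";
    }

    // Runs one input and reports the observed path condition.
    static String pathOf(int[] a) {
        try {
            return coverMe(a);
        } catch (IllegalStateException e) {
            return "a!=null && a.length>0 && a[0]==1234567890 (bug)";
        }
    }

    public static void main(String[] args) {
        // The inputs from the table, in the order they are generated.
        int[][] inputs = { null, {}, {0}, {1234567890} };
        for (int[] in : inputs) {
            System.out.println(pathOf(in));
        }
    }
}
```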

- Loops
  - Fitnex [Xie et al. DSN 09]
- Generic API functions, e.g., RegEx matching IsMatch(s1,regex1)
  - Reggae [Li et al. ASE 09-sp]
- Method sequences
  - MSeqGen [Thummalapenta et al. ESEC/FSE 09]
- Environments, e.g., file systems, network, db, …
  - Parameterized Mock Objects [Marri et al. AST 09]

Opportunities
- Regression testing [Taneja et al. ICSE 09-nier]
- Developer guidance (cooperative developer testing) [Xiao et al. ICSE 11]

Applications
- Test network app at Army division at Fort Hood, Texas
- Test DB app of hand-held medical assistant device at FDA

Download counts (20 months, Feb. 2008 - Oct. 2009):
- Academic: 17,366
- Devlabs: 13,022
- Total: 30,388

257,766 clicks on "Ask Pex!" since summer 2010

There are decision procedures for individual path conditions, but…
- The number of potential paths grows exponentially with the number of branches
- Without guidance, the same loop might be unfolded forever

Fitnex search strategy [Xie et al. DSN 09]

    public bool TestLoop(int x, int[] y)
    {
        if (x == 90)
        {
            for (int i = 0; i < y.Length; i++)
                if (y[i] == 15)
                    x++;
            if (x == 110)
                return true;
        }
        return false;
    }

Test input: TestLoop(0, {0})
Path condition: !(x == 90)
  ↓
New path condition: (x == 90)
  ↓
New test input: TestLoop(90, {0})

Test input: TestLoop(90, {0})
Path condition: (x == 90) && !(y[0] == 15)
  ↓
New path condition: (x == 90) && (y[0] == 15)
  ↓
New test input: TestLoop(90, {15})

Test input: TestLoop(90, {15})
Path condition: (x == 90) && (y[0] == 15) && !(x+1 == 110)
  ↓
New path condition: (x == 90) && (y[0] == 15) && (x+1 == 110)
  ↓
New test input: no solution!?

Test input: TestLoop(90, {15})
Path condition: (x == 90) && (y[0] == 15) && (0 < y.Length) && !(1 < y.Length) && !(x+1 == 110)
  ↓
New path condition: (x == 90) && (y[0] == 15) && (0 < y.Length) && (1 < y.Length)
  -> expand array size

Test input: TestLoop(90, {15})

We can have infinite paths (in both length and number)!
- Manual analysis: at least 20 loop iterations are needed to cover the target branch
- Exploring all paths up to 20 loop iterations is practically infeasible: 2^20 paths

Test input: TestLoop(90, {15, 15})

Key observations: with respect to the coverage target,
- not all paths are equally promising for flipping nodes
- not all nodes are equally promising to flip

Our solution:
- Prefer to flip nodes on the most promising path
- Prefer to flip the most promising nodes on a path
- Use a fitness function as a proxy for "promising"

- A fitness function (FF) computes a fitness value (the distance between the current state and the goal state)
- The search tries to minimize the fitness value [Tracey et al. 98, Liu et al. 05, …]

Fitness function for the target branch of TestLoop: |110 - x|

Fitness function: |110 - x|

(x, y)                        | Fitness value
(90, {0})                     | 20
(90, {15})                    | 19
(90, {15, 0})                 | 19
(90, {15, 15})                | 18
(90, {15, 15, 0})             | 18
(90, {15, 15, 15})            | 17
(90, {15, 15, 15, 0})         | 17
(90, {15, 15, 15, 15})        | 16
(90, {15, 15, 15, 15, 0})     | 16
(90, {15, 15, 15, 15, 15})    | 15
…

Give preference to flipping a node in paths with better fitness values.
We still need to address which node to flip on those paths …
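The fitness values in the table can be reproduced with a small Java port of TestLoop that evaluates |110 - x| at the point where the target branch `x == 110` is reached. This is a sketch, not Pex's instrumentation: the class name `FitnessDemo` is made up, and returning `Integer.MAX_VALUE` when the target branch is not reached at all is an assumption of this example.

```java
import java.util.Arrays;

class FitnessDemo {
    // Fitness with respect to the target branch `x == 110`:
    // |110 - x|, evaluated where the branch condition is checked.
    static int fitness(int x, int[] y) {
        if (x != 90) {
            // Assumption: treat "branch never reached" as the worst fitness.
            return Integer.MAX_VALUE;
        }
        for (int i = 0; i < y.length; i++) {
            if (y[i] == 15) x++;
        }
        return Math.abs(110 - x);
    }

    public static void main(String[] args) {
        int[][] ys = { {0}, {15}, {15, 0}, {15, 15}, {15, 15, 0}, {15, 15, 15} };
        for (int[] y : ys) {
            System.out.println("(90, " + Arrays.toString(y) + ") -> " + fitness(90, y));
        }
    }
}
```

An array of twenty 15s drives the fitness to 0, which matches the manual analysis that at least 20 loop iterations are needed.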

Fitness function: |110 - x|

Branch b1: i < y.Length
Branch b2: i >= y.Length
Branch b3: y[i] == 15
Branch b4: y[i] != 15

(x, y)                        | Reached by | Fitness value
(90, {0})                     |            | 20
(90, {15})                    | flip b4    | 19
(90, {15, 0})                 | flip b2    | 19
(90, {15, 15})                | flip b4    | 18
(90, {15, 15, 0})             | flip b2    | 18
(90, {15, 15, 15})            | flip b4    | 17
(90, {15, 15, 15, 0})         | flip b2    | 17
(90, {15, 15, 15, 15})        | flip b4    | 16
(90, {15, 15, 15, 15, 0})     | flip b2    | 16
(90, {15, 15, 15, 15, 15})    | flip b4    | 15
…

- Flipping a branch node of b4 (b3) gives an average fitness gain (loss) of 1 (-1)
- Flipping a branch node of b2 (b1) gives an average fitness gain (loss) of 0 (0)

- Fitness gains:
  - FGain(b) := F(p) - F(p')
  - FGain(b') := F(p') - F(p)
  - Here p and p' are paths through the same branch node n, where p takes branch b and p' takes branch b'; F(p) is the fitness value of p and F(p') is the fitness value of p'
- Compute the average fitness gain for each program branch over time

- Pex maintains a global search frontier
  - All discovered branch nodes are added to the frontier
  - The frontier may choose the next branch node to flip
  - Fully explored branch nodes are removed from the frontier
- Pex has a default search frontier
  - It tries to create diversity across different coverage criteria
- Frontiers can be combined in a fair round-robin scheme

We implemented a new search frontier, "Fitnex":
- Nodes to flip are prioritized by their composite fitness value F(p_n) - FGain(b_n), where
  - p_n is the path of node n
  - b_n is the explored outgoing branch of n
- Fitnex always picks the node with the lowest composite fitness value to flip
- To avoid local optima or biases, the fitness-guided strategy is combined with Pex's other search strategies
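The prioritization rule can be sketched in a few lines of Java. Everything here is illustrative rather than Pex's implementation: the class `FitnexFrontierSketch`, its method names, and the choice to treat a branch with no recorded flips as having an average gain of 0 are assumptions. The sketch records fitness gains per branch using the FGain definitions from the slides and computes the composite value F(p_n) - FGain(b_n).

```java
import java.util.HashMap;
import java.util.Map;

class FitnexFrontierSketch {
    // branch id -> {sum of observed gains, number of observations}
    static final Map<String, double[]> gains = new HashMap<>();

    // One flip at a node: path p took branch b, the new path p' takes b'.
    // FGain(b) = F(p) - F(p'), FGain(b') = F(p') - F(p).
    static void recordFlip(String b, String bPrime, int fP, int fPPrime) {
        add(b, fP - fPPrime);
        add(bPrime, fPPrime - fP);
    }

    private static void add(String branch, int gain) {
        double[] s = gains.computeIfAbsent(branch, k -> new double[2]);
        s[0] += gain;
        s[1] += 1;
    }

    // Average fitness gain for a branch; 0 if never observed (assumption).
    static double avgGain(String branch) {
        double[] s = gains.get(branch);
        return s == null ? 0.0 : s[0] / s[1];
    }

    // Composite fitness of a node: F(p_n) - FGain(b_n). Lower is preferred.
    static double composite(int pathFitness, String exploredBranch) {
        return pathFitness - avgGain(exploredBranch);
    }

    public static void main(String[] args) {
        recordFlip("b4", "b3", 20, 19); // (90,{0})  -> (90,{15})
        recordFlip("b2", "b1", 19, 19); // (90,{15}) -> (90,{15,0})
        // Two candidate nodes on a path with fitness 19:
        System.out.println("node with explored b4: " + composite(19, "b4"));
        System.out.println("node with explored b2: " + composite(19, "b2"));
    }
}
```

With these two recorded flips, a node whose explored branch is b4 gets the lower composite value, so it would be flipped first, in line with the observation that flipping b4 is what reduces |110 - x|.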

A collection of micro-benchmark programs routinely used by the Pex developers to evaluate Pex's performance, extracted from real, complex C# programs. They range from string matching such as

    if (value.StartsWith("Hello") && value.EndsWith("World!") && value.Contains(" ")) { … }

to a small parser for a Pascal-like language, where the target is to create a legal program.

- Pex with the Fitnex strategy
- Pex without the Fitnex strategy
  - Pex's previous default strategy
- Random
  - A strategy where branch nodes to flip are chosen randomly in the already explored execution tree
- Iterative Deepening
  - A strategy where breadth-first search is performed over the execution tree

Metric: number of runs/iterations required to cover the target
- Pex w/o Fitnex: average improvement of a factor of 1.9 over Random
- Pex w/ Fitnex: average improvement of a factor of 5.2 over Random

Program + Test inputs -> Outputs =? Expected Outputs (Test Oracles)

Division of Labor
- Test Generation
  - Test inputs for PUTs (parameterized unit tests) generated by tools (e.g., Pex)
  - Fitnex: guided exploration of paths [DSN 09]
  - MSeqGen: exploiting real-usage sequences [ESEC/FSE 09]
- Test Oracles
  - Assertions in PUTs specified by developers

Motivation: New Trends in Development

Build applications from scratch
- Exponential increase in libraries or frameworks
  - Proprietary, e.g., .NET or Java SDK
  - Open source, e.g., Eclipse
- Sourceforge.net hosts nearly 230,000 projects with two million users

API: Application Programming Interface

1. J. Hammond. What developers think, 2010. http://www.drdobbs.com/architect/222301141/
2. Black Duck's web page with Koders usage information, 2010. http://corp.koders.com/about/

Major Problems

- Programmers face difficulties in using APIs
  - Lack of documentation [1]
  - Outdated documentation [2]
  - Complexity [3]: the .NET library provides nearly 10,000 classes

Libraries or Frameworks … Use APIs [1,2]

1. Jan Bosch, Peter Molin, Michael Mattsson, and PerOlof Bengtsson. Object-oriented framework-based software development: problems and experiences. ACM Comput. Surv., 2000.
2. Timothy C. Lethbridge, Janice Singer, and Andrew Forward. How software engineers use documentation: The state of the practice. IEEE Software, 2003.
3. D. Kirk, M. Roper, and M. Wood. Identifying and addressing problems in object-oriented framework reuse. Journal of Empirical Soft. Eng., 12(3):243-274, 2007.

Consequences

- Programmers spend more effort in understanding APIs [1]
  - Reducing productivity
- Programmers introduce defects while using APIs [2]
  - Reducing quality

1. Martin P. Robillard. What makes APIs hard to learn? Answers from developers. IEEE Software, 2009.
2. Dawson Engler, David Yu Chen, Seth Hallem, Andy Chou, and Benjamin Chelf. Bugs as deviant behavior: A general approach to inferring errors in systems code. SOSP 2001.

Example Programming Task

- Task: how to parse code in a dirty editor of the Eclipse IDE?
  - A dirty editor is represented by IEditorPart
  - Parsing code requires ICompilationUnit
- Query: IEditorPart -> ICompilationUnit
- An example solution:

      IEditorPart iep = ...
      IEditorInput editorInp = iep.getEditorInput();
      IWorkingCopyManager wcm = JavaUI.getWorkingCopyManager();
      ICompilationUnit icu = wcm.getWorkingCopy(editorInp);

- Challenges:
  - Needs instances of IEditorInput and IWorkingCopyManager
  - Needs to invoke a static method of JavaUI

Defects

- A code example from the open source project HSqlDB
- Defect: no rollback is done when an SQLException occurs (missing connection.rollback())
- Requires specification: FCc1 FCc2 FCc3 => FCe1
  - FCc1 -> Connection conn = OracleDataSource.getConnection()
  - FCc2 -> Statement stmt = Connection.createStatement()
  - FCc3 -> stmt.executeUpdate()
  - FCe1 -> conn.rollback()

Solution: WebMiner Framework

Phases:
- SEARCH: collect SE data
- ANALYZE: resolve object types, generate candidates
- MINE: adopt/adapt/develop mining algorithm
- APPLY: postprocess/apply mining results

Observation: source code that already reuses APIs can be leveraged to improve both software productivity and quality

Key Idea of WebMiner Framework

Figure: applications 1, 2, 3, … that use a library yield pattern candidates; mining the candidates yields frequent patterns, which support productivity and, through defect-detection techniques, quality (defects).

Observation: frequent patterns are more likely to represent API usage specifications

Search Phase

- Leverages code search engines such as Google code search
- Collects relevant code examples for the APIs under analysis
- Addresses a major issue of "limited data points"

Search Phase: Limited Data Points

- Previous approaches: mine patterns from a fixed set of code repositories (1, 2, …, N)
  - Often lack sufficient relevant data points (e.g., API call sites)
  - Missing patterns: affecting productivity
  - Missing related defects: affecting quality
- WebMiner framework: searches open source code on the web (e.g., Eclipse, Linux, …) through a code search engine, then mines patterns

Analyze Phase

- Analyzes collected code examples
- Generates pattern candidates
- Addresses a major issue of partial and non-compilable code examples

Analyze Phase

- Challenge: collected code examples are partial and non-compilable
- Solution: partial-program analysis
  - Uses heuristics based on simple language semantics
- Advantages:
  - Does not require code to be compilable
  - Highly scalable (96 MLOC analyzed in about 2 hours on a 3.0 GHz Xeon processor with 4 GB RAM)

Partial-Program Analysis Heuristics

- Example 1:

      QueueConnection connect;
      QueueSession session = connect.createQueueSession(false, int)

- How to get the return type of the createQueueSession method?
  - There is no access to the method declaration
  - The return type can be inferred from the type of the variable "session"
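The idea behind Example 1 can be illustrated with a deliberately simplified Java sketch. It is an assumption-laden stand-in, not the actual partial-program analysis: it uses a regular expression over a single statement, and the class `ReturnTypeHeuristic` and its method are hypothetical names introduced here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReturnTypeHeuristic {
    // Matches "<Type> <var> = <receiver>.<method>(" at the start of a statement.
    private static final Pattern DECL_ASSIGN = Pattern.compile(
        "^\\s*([A-Za-z_][\\w.]*)\\s+[A-Za-z_]\\w*\\s*=\\s*[A-Za-z_]\\w*\\.([A-Za-z_]\\w*)\\s*\\(");

    // Returns "method -> Type" when the declared variable type reveals the
    // return type of the called method, or null when the heuristic does not apply.
    static String inferReturnType(String statement) {
        Matcher m = DECL_ASSIGN.matcher(statement);
        return m.find() ? m.group(2) + " -> " + m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(inferReturnType(
            "QueueSession session = connect.createQueueSession(false, int);"));
    }
}
```

A statement without a declared variable, such as a bare `return connect.createQueueSession(...)`, is not matched by this pattern and needs a different heuristic.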

Mine and Apply Phases

- Specific to the SE task under analysis
- Mines pattern candidates to identify frequent patterns
- Suggests patterns to programmers or uses patterns to detect defects

Approaches based on WebMiner

- Productivity
  - PARSEWeb [ASE 07]: addresses queries of the form "Source -> Destination"
  - SpotWeb [ASE 08]: helps identify where to start reusing a library
- Quality
  - Static verification
    - CAR-Miner [ICSE 09]: detects exception-handling related defects
    - Alattin [ASE 09]: detects missing condition checks around API method calls
  - Dynamic testing
    - MSeqGen [ESEC/FSE 09]: mines static traces and assists white-box test-generation approaches
    - DyGen [TAP 10]: mines dynamic traces and generates regression tests
- Industrial impact: DyGen was used at Microsoft Research to generate a regression test suite (~500,000 tests) for two core libraries of the .NET framework

Alattin: Motivation

- Problem: programming rules are often not well documented
- General solution:
  - Mine frequent patterns across a large number of data points (e.g., code examples)
  - Use frequent patterns as programming rules to detect defects

Challenges addressed by Alattin

- Limited data points
  - Existing approaches mine specifications from a few code bases
  - They miss specifications due to a lack of sufficient data points
- Existing approaches produce a large number of false positives

Large Number of False Positives

- A major observation:
  - Programmers often write code in different ways to achieve the same task
  - Some ways are more frequent than others
- Figure: patterns are mined from the frequent ways; the infrequent ways are then reported as violations of the mined patterns

Example: java.util.Iterator.next()

Code Sample 1:

    PrintEntries1(ArrayList entries) {
        …
        Iterator it = entries.iterator();
        if (it.hasNext()) {
            String last = (String) it.next();
        }
        …
    }

Code Sample 2:

    PrintEntries2(ArrayList entries) {
        …
        if (entries.size() > 0) {
            Iterator it = entries.iterator();
            String last = (String) it.next();
        }
        …
    }

java.util.Iterator.next() throws NoSuchElementException when invoked on a list without any elements.

Example: java.util.Iterator.next()

Among 1243 code examples:
- Code Sample 1 (check on Iterator.hasNext): 1218 / 1243
- Code Sample 2 (check on ArrayList.size): 6 / 1243

Mined pattern from existing approaches: "boolean check on return of Iterator.hasNext before Iterator.next"

Example: java.util.Iterator.next()

- Requires more general patterns, e.g., P1 or P2
  - P1: boolean check on return of Iterator.hasNext before Iterator.next
  - P2: boolean check on return of ArrayList.size before Iterator.next
- Motivates the requirement for new pattern formats
  - Beyond single or conjunctive (Λ) patterns
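A small runnable Java sketch makes the point concrete: both guards (P1 and P2) protect the call to `Iterator.next()`, while the unguarded call throws on an empty list. The class and method names are illustrative only and do not come from the mined code examples.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class IteratorGuardDemo {
    // P1: boolean check on the return of Iterator.hasNext before Iterator.next.
    static String firstViaHasNext(List<String> entries) {
        Iterator<String> it = entries.iterator();
        return it.hasNext() ? it.next() : null;
    }

    // P2: check on the return of size() before Iterator.next.
    static String firstViaSize(List<String> entries) {
        if (entries.size() > 0) {
            return entries.iterator().next();
        }
        return null;
    }

    // Neglected condition: no check before Iterator.next.
    static String firstUnguarded(List<String> entries) {
        return entries.iterator().next();
    }

    public static void main(String[] args) {
        List<String> empty = new ArrayList<>();
        System.out.println(firstViaHasNext(empty));
        System.out.println(firstViaSize(empty));
        try {
            firstUnguarded(empty);
        } catch (NoSuchElementException e) {
            System.out.println("unguarded call threw NoSuchElementException");
        }
    }
}
```

A checker that only knows P1 would flag `firstViaSize` as a violation even though it is safe, which is the false-positive problem that alternative patterns are meant to address.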

Alattin Approach

- Proposes four pattern formats and mining algorithms
  - Conjunctive: And (Λ)
  - Disjunctive: Or (V)
  - Exclusive-disjunctive: Xor (⊕)
  - Combo
- Attempts to mine patterns that capture nearly complete behaviour
  - Hypothesis: this helps reduce both false positives and false negatives among detected violations
- Applies mined patterns to detect neglected conditions

Neglected Conditions

- Neglected conditions refer to
  - Missing conditions that check the arguments or receiver of the API call before the API call
  - Missing conditions that check the return or receiver of the API call after the API call
- One of the primary reasons for many fatal issues
  - Security or buffer-overflow vulnerabilities
  - 66% (109/167) of bug fixes applied in Mozilla Firefox are due to neglected conditions [1]

1. Ray-Yaung Chang, Andy Podgurski, and Jiong Yang. Finding what's not there: A new approach to revealing neglected conditions in software. ISSTA 2007.

Alattin: Example

- Requires a specification for the "Iterator.next" method
- A code example from the open source project Columba:

      String removeDoubleEntries(Matcher matcher) {
          …
          ArrayList entries = new ArrayList();
          while (matcher.find())
              entries.add(matcher.group());
          Iterator it = entries.iterator();
          String last = (String) it.next();
          …
      }

Search and Analyze Code Examples

    object evaluate(object val) {
        …
    01: if(val != null && val instanceof Collection) {
    02:     Collection coll = (Collection) val;
    03:     Iterator iter = coll.iterator();
    04:     if(!coll.IsEmpty()) {
    05:         for(; iter.hasNext();) {
    06:             object obj = iter.next();
    07:             if(obj instanceof Node) {
    08:                 Node node = (Node) obj;
    09:             }
    10:         }
    11:     }
    12: }
    }

Control-flow graph: Start, nodes 01 to 08, End, with control and non-control edges. At the Iterator.next call the receiver is iter and the return value is obj; the preceding check is the boolean-check on the return of Iterator.hasNext().

Pattern candidate (PC) for Iterator.next with preceding and succeeding condition checks:

PC: 1 2 3 4 5
1: instance-check on receiver of Collection.iterator
2: null-check on receiver of Collection.iterator
3: boolean-check on return of Collection.IsEmpty
4: boolean-check on return of Iterator.hasNext
5: instance-check on return of Iterator.next

Mine Patterns

- An example input database (ISD) with all pattern candidates
- Alattin mines patterns in four formats:
  - And (Λ)
  - Or (V)
  - Xor (⊕)
  - Combo

1. J. Han and M. Kamber. Data mining: concepts and techniques. Morgan Kaufmann Publishers Inc., 2000.

ImMiner Algorithm

- Uses frequent-itemset mining [Burdick et al. ICDE 01] iteratively
- An input database with the following APIs for Iterator.next()

Tables: input database; mapping of IDs to APIs

ImMiner Algorithm: Frequent Alternatives

Input database -> frequent itemset mining (min_sup 0.5) -> frequent item: 1

P1: boolean-check on the return of Iterator.hasNext() before Iterator.next()

ImMiner: Infrequent Alternatives of P1

- Split the input database into two databases: positive (PSD) and negative (NSD)
- Mine patterns that are frequent in NSD and infrequent in PSD
  - Reason: only such patterns serve as alternatives for P1
- Alternative pattern P2: "const check on the return of ArrayList.size() before Iterator.next()"
- Alattin applies the ImMiner algorithm to detect neglected conditions
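The split-and-mine step can be sketched in Java under strong simplifying assumptions: transactions are sets of integer condition IDs, only single items are considered instead of running a full frequent-itemset miner, and the database below is hypothetical (the slide's actual tables did not survive extraction). The class `ImMinerSketch` and its methods are names invented for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class ImMinerSketch {
    // Convenience constructor for one pattern candidate (a set of condition IDs).
    static Set<Integer> tx(Integer... ids) {
        return new HashSet<>(Arrays.asList(ids));
    }

    // Fraction of transactions in db that contain the item.
    static double support(List<Set<Integer>> db, int item) {
        if (db.isEmpty()) return 0.0;
        long hits = db.stream().filter(t -> t.contains(item)).count();
        return (double) hits / db.size();
    }

    // Items that are frequent in NSD (transactions without frequentItem)
    // and infrequent in PSD (transactions with frequentItem).
    static List<Integer> alternatives(List<Set<Integer>> db, int frequentItem, double minSup) {
        List<Set<Integer>> psd = new ArrayList<>();
        List<Set<Integer>> nsd = new ArrayList<>();
        for (Set<Integer> t : db) {
            (t.contains(frequentItem) ? psd : nsd).add(t);
        }
        TreeSet<Integer> candidates = new TreeSet<>();
        for (Set<Integer> t : nsd) candidates.addAll(t);
        List<Integer> result = new ArrayList<>();
        for (int item : candidates) {
            if (support(nsd, item) >= minSup && support(psd, item) < minSup) {
                result.add(item);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical data: 1 = hasNext check, 2 = size check, 3 = some other check.
        List<Set<Integer>> db = Arrays.asList(
            tx(1), tx(1, 3), tx(1), tx(1, 3), tx(2), tx(2, 3));
        System.out.println("alternatives of item 1: " + alternatives(db, 1, 0.5));
    }
}
```

In this toy database item 3 is frequent in both halves, so it is not an alternative to item 1; only item 2, which appears exclusively where item 1 is absent, qualifies.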

Evaluation

- Objective: how do pattern formats affect the number of false negatives and false positives among detected neglected conditions?
- Setup:
  - Mined patterns in all four formats for three Java default API libraries
  - Applied the patterns to four subject applications to detect neglected conditions
  - Ignored violations from patterns including one alternative
- Baseline for false negatives:
  - The set of all distinct defects detected by all four pattern formats

Results: False negatives

- And patterns have a high percentage of false negatives
  - They often miss many alternatives in patterns that help detect new defects
- Or patterns have a higher number of false negatives compared to Xor & Combo patterns
- Xor & Combo patterns: Xor patterns have a low number of false negatives

Results: False positives

- Reduction of false positives with respect to And patterns:
  - Or patterns: 54%
  - Xor patterns: 10.2%
  - Combo patterns: 32.4%
- Summary:
  - And patterns: produce a high % of false negatives and false positives
  - Xor patterns: reduce false negatives, but produce a high % of false positives
  - Or patterns: reduce false positives, but produce a high % of false negatives
  - Combo patterns: perform reasonably well

Conclusion

- http://research.microsoft.com/pex
- http://www.pexforfun.com/
- http://pexase.codeplex.com/
- https://sites.google.com/site/asergrp/

Thank You

Questions?

Partial-Program Analysis Heuristics

- Example 2:

      public QueueSession test()
      {
          ...
          return connect.createQueueSession(false, int);
      }

- How to get the return type of the createQueueSession method?
  - There is no access to the method declaration
  - The return type can be inferred from the return type of the enclosing method declaration

Apriori Principle

- If an itemset is frequent, then all its subsets should also be frequent [1]
- An itemset is considered frequent if support(itemset) >= min_sup (min_sup: 0.4)
- Figure: itemset lattice over items 1 to 4, from Φ through the single items, the pairs (1 Λ 2, 1 Λ 3, 1 Λ 4, 2 Λ 3, 2 Λ 4, 3 Λ 4), and the triples (1 Λ 2 Λ 3, 1 Λ 2 Λ 4, 1 Λ 3 Λ 4, 2 Λ 3 Λ 4) to 1 Λ 2 Λ 3 Λ 4; supports shown include 0.67, 0.33, and 0.5 for single items, 1 Λ 3 (0.17), and 1 V 3 (0.83)
- Challenge: Or, Xor, and Combo patterns do not follow the Apriori principle

1. J. Han and M. Kamber. Data mining: concepts and techniques. Morgan Kaufmann Publishers Inc., 2000.
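Why Or patterns escape Apriori-style pruning can be checked on a toy example. The six-transaction database below is hypothetical; it was chosen so that item 3 has support 0.33, 1 Λ 3 has support 0.17, and 1 V 3 has support 0.83, the values visible on the slide. The class `AprioriOrSketch` is an illustrative name, not part of Alattin.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

class AprioriOrSketch {
    static Set<Integer> tx(Integer... ids) {
        return new HashSet<>(Arrays.asList(ids));
    }

    // Hypothetical database of pattern candidates (sets of condition IDs).
    static final List<Set<Integer>> DB = Arrays.asList(
        tx(1, 4), tx(1, 4), tx(1), tx(1, 3, 4), tx(2, 3), tx(2));

    // Fraction of transactions for which the pattern holds.
    static double support(Predicate<Set<Integer>> holds) {
        return (double) DB.stream().filter(holds).count() / DB.size();
    }

    public static void main(String[] args) {
        double minSup = 0.4;
        double s3 = support(t -> t.contains(3));
        double sAnd = support(t -> t.contains(1) && t.contains(3));
        double sOr = support(t -> t.contains(1) || t.contains(3));
        System.out.printf("3: %.2f, 1 AND 3: %.2f, 1 OR 3: %.2f%n", s3, sAnd, sOr);
        // Item 3 alone is infrequent, yet the Or pattern containing it is
        // frequent, so pruning supersets of infrequent items would lose it.
        System.out.println("3 frequent? " + (s3 >= minSup)
            + ", 1 OR 3 frequent? " + (sOr >= minSup));
    }
}
```

For conjunctive patterns, adding an item can only lower the support, which is what makes Apriori pruning sound; for disjunctive patterns, adding an item can only raise it.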

Mining Xor Patterns

min_sup: 0.4 (patterns with support < 0.4 are marked in the figure)

- Single items: 1 (0.67), 2 (0.33), 3 (0.33), 4 (0.5)
- Pairs: 1 ⊕ 2 (1.0), 1 ⊕ 3 (0.60), 1 ⊕ 4 (0.17), 2 ⊕ 3 (0.33), 2 ⊕ 4 (0.83), 3 ⊕ 4 (0.5)
- Triple: 1 ⊕ 2 ⊕ 4 (0.5)

- Uses a greedy approach based on the following property (Violated property 1)
- Uses mined patterns to detect neglected conditions statically

Threats to Validity

- Threats to external validity
  - One code search engine
  - Three Java libraries
  - Four subject applications
  - Mitigation: more experiments on wider types of subjects using other code search engines
- Threats to internal validity
  - Faults in our prototype
  - Errors in our inspection
  - Mitigation: inspected available specifications and also call sites in source code

Future Work

- Three major directions
  - Mining unstructured software engineering data
  - Moving from syntactic to semantic analysis
  - Combining static verification and dynamic testing

Mining Software Engineering Data (slide from T. Xie, Mining Program Source Code)

- Software engineering data: code bases, change history, program states, structural entities, bug reports/NL, …
- Data mining techniques: classification, association/patterns, clustering, …
- Software engineering tasks helped by data mining: programming, defect detection, testing, debugging, maintenance, …
- Limited data scope: extended code bases to open source code on the web via search
- Limited expressiveness: new mining techniques

Overview of Mining SE Data (slide from T. Xie, Mining Program Source Code)

Figure: publications from 1999 to 2007 at venues including ASE, ICSE, FSE, PLDI, POPL, OSDI, OOPSLA, KDD, ISSTA, and SOSP, grouped by kind of software engineering data (code bases, change history, program states, structural entities, bug reports/NL, …).

Partial-Program Analysis Heuristics

- Example 3:

Alattin: Pattern Examples

Alattin: Partial Rule Example

- API name: java.util.regex.Matcher:start()
- P1 or P2 or P3
  - P1: boolean check on return of Matcher.matches before Matcher.start
  - P2: gen-equality check on return of Matcher.start(int) before Matcher.start
  - P3: boolean check on Matcher.find before Matcher.start

Alattin Approach

Input: the application under analysis; the classes and methods it reuses are extracted.

- Phase 1: issue queries and collect relevant code samples from open source projects on the web, e.g., "lang:java java.util.Iterator next"
- Phase 2: generate pattern candidates
- Phase 3: mine frequent patterns
- Phase 4: detect neglected conditions statically (violations of the mined patterns)

Phase 2: Generate Pattern Candidates

Generate pattern candidates for Iterator.next (node 6; receiver: iter, return: obj):
- Perform a backward traversal to identify dominant control nodes and condition checks: "boolean-check on the return of Iterator.hasNext"
- Perform a forward traversal to identify post-dominant control nodes and condition checks: "instance-check on obj"
- Exploit program dependencies to associate condition checks with method calls

Control-flow graph: Start, nodes 01 to 08, End.

Phase 2: Generate Pattern Candidates

- Pattern candidate (PC) for Iterator.next with preceding and succeeding condition checks

PC: 1 2 3 4 5
1: instance-check on receiver of Collection.iterator
2: null-check on receiver of Collection.iterator
3: boolean-check on return of Collection.IsEmpty
4: boolean-check on return of Iterator.hasNext
5: instance-check on return of Iterator.next

Alattin Results

- Classification of mined rules

Results: False positives

- Xor patterns have a high percentage of false positives
- Or & Combo patterns also have higher percentages of false positives compared to And patterns
- Manual analysis: the majority of false positives are related to API methods without any And pattern
  - FPAnd: API methods with at least one pattern in the And pattern format
  - FPWithOutAnd: API methods without patterns in the And pattern format

PARSEWeb Results [ASE 07]

- Programming queries

CAR-Miner Results [ICSE 09]

- Comparison with WN-Miner [Weimer and Necula TACAS 05]
  - Found 224 new rules
- Two major factors:
  - Sequence association rules
  - Increase in the data scope

CAR-Miner Results [ICSE 09]

CAR-Miner Results [ICSE 09]

- Additional rules due to data scope extension

Mining Unstructured SE Data

- The dissertation primarily focuses on source code, a form of structured data
- Various other forms of data are available on the web
  - Developer forums, technical blogs, manuals
- WebMiner framework: searching code repositories (1, 2, …, N), then mining patterns
- WebMiner++ framework: searching SE artifacts (1, 2, …, N) with NLP, then mining patterns

Mining Unstructured SE Data

- Empirical study
  - Kinds of unstructured SE data available on the web
  - SE tasks that can be addressed from the data
- Fundamental questions:
  - How to analyze and understand unstructured SE data?
  - How to index analyzed data and search for relevant data?
  - How to accept inputs from programmers?
  - How to mine searched data?

Future Work: Direction 3

- Advantage: scalability
- Disadvantage: false warnings
- Example specification: boolean-check on return of Iterator.hasNext before Iterator.next
- Our solution: set up test targets and dynamically confirm violations

Future Work: Direction 3

- Example specification: boolean-check on return of Iterator.hasNext before Iterator.next

89 Future Work: Direction 3  Combine D4D and MSeqGen  Generate test input that confirms violation  Example
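
The example specification (boolean-check on the return of hasNext before next) can be illustrated with a small sketch. JavaStyleIterator, FirstUnchecked, and FirstOrFallback are hypothetical names introduced here only to mirror the Java API in C#; they are not part of Alattin, D4D, or MSeqGen. An empty collection is the kind of test input that would dynamically confirm a violation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Java-style iterator, used only to mirror the mined specification.
public sealed class JavaStyleIterator<T>
{
    private readonly IReadOnlyList<T> items;
    private int index;

    public JavaStyleIterator(IReadOnlyList<T> items) { this.items = items; }

    public bool HasNext() => index < items.Count;

    public T Next()
    {
        if (!HasNext()) throw new InvalidOperationException("no such element");
        return items[index++];
    }
}

public static class IteratorUsage
{
    // Violates the specification: Next is called without checking HasNext.
    public static T FirstUnchecked<T>(JavaStyleIterator<T> it) => it.Next();

    // Follows the specification: boolean-check on HasNext before Next.
    public static T FirstOrFallback<T>(JavaStyleIterator<T> it, T fallback) =>
        it.HasNext() ? it.Next() : fallback;
}
```

Calling FirstUnchecked on an iterator over an empty array throws, which confirms the violation; the same input passes through FirstOrFallback without an exception.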

Mining Specifications (Usage/Implementation)
- Implementation
  - Last paper in 2005
  - Permissive interfaces
  - Boils down to model checking, where state explosion is the big problem
  - Applied only on data structures or single classes
- Usage
  - Last paper in 2010
  - Same specification can be mined in a cost-effective fashion
  - Inherent limitation: existing code bases are required
  - Applied [Gabel and Su ICSE 2010] on Eclipse

- http://research.microsoft.com/en-us/projects/contracts/
- Library to state preconditions, postconditions, invariants
- Supported by two tools:
  - Static Checker
  - Rewriter: turns Code Contracts into runtime checks
- Pex analyses the runtime checks
  - Contracts act as test oracle
- Pex may find counterexamples for contracts
- Missing contracts may be suggested

Class invariant specification:

    public class ArrayList {
        private Object[] _items;
        private int _size;
        ...
        [ContractInvariantMethod] // attribute comes with Contracts
        protected void Invariant() {
            Contract.Invariant(this._items != null);
            Contract.Invariant(this._size >= 0);
            Contract.Invariant(this._items.Length >= this._size);
        }
    }

- Unit test: while it is debatable what a ‘unit’ is, a ‘unit’ should be small.
- Integration test: exercises large portions of a system.
- Observation: integration tests are often “sold” as unit tests
- White-box test generation does not scale well to integration test scenarios.
- Possible solution: introduce abstraction layers, and mock components not under test

AppendFormat(null, "{0} {1}!", "Hello", "World"); → "Hello World!"

.NET implementation:

    public StringBuilder AppendFormat(
            IFormatProvider provider, char[] chars, params object[] args) {
        if (chars == null || args == null)
            throw new ArgumentNullException(…);
        int pos = 0;
        int len = chars.Length;
        char ch = '\x0';
        ICustomFormatter cf = null;
        if (provider != null)
            cf = (ICustomFormatter)provider.GetFormat(typeof(ICustomFormatter));
        …

- Introduce a mock class which implements the interface.
- Write assertions over expected inputs, provide concrete outputs

    public class MFormatProvider : IFormatProvider {
        public object GetFormat(Type formatType) {
            Assert.IsTrue(formatType != null);
            return new MCustomFormatter();
        }
    }

- Problems:
  - Costly to write detailed behavior by example
  - How many and which mock objects do we need to write?

- Introduce a mock class which implements the interface.
- Let an oracle provide the behavior of the mock methods.

    public class MFormatProvider : IFormatProvider {
        public object GetFormat(Type formatType) {
            …
            object o = call.ChooseResult<object>();
            return o;
        }
    }

- Result: relevant result values can be generated by a white-box test input generation tool, just as other test inputs can be generated!

- Chosen values can be shaped by assumptions

    public class MFormatProvider : IFormatProvider {
        public object GetFormat(Type formatType) {
            …
            object o = call.ChooseResult<object>();
            PexAssume.IsTrue(o is ICustomFormatter);
            return o;
        }
    }

- (Note: assertions and assumptions are “reversed” when compared to parameterized unit tests.)

- Choices to build parameterized models

    class PFileSystem : IFileSystem {
        // cached choices
        PexChosenIndexedValue<string,string> files;
        string ReadFile(string name) {
            var content = this.files[name];
            if (content == null)
                throw new FileNotFoundException();
            return content;
        }
    }
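
The idea of a parameterized model can be sketched without Pex. In the sketch below, a plain dictionary passed to the constructor plays the role of the cached choices; with Pex, the tool would pick those values itself. IFileSystem as declared here, ChosenFileSystem, and ConfigLoader are hypothetical names for illustration and are not taken from the slides' code base.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical interface standing in for the IFileSystem on the slide.
public interface IFileSystem
{
    string ReadFile(string name);
}

// Parameterized model: file contents are "choices" supplied from outside.
public sealed class ChosenFileSystem : IFileSystem
{
    private readonly IDictionary<string, string> files;

    public ChosenFileSystem(IDictionary<string, string> files)
    {
        this.files = files ?? throw new ArgumentNullException(nameof(files));
    }

    public string ReadFile(string name)
    {
        // A missing entry or a null entry models "file does not exist".
        if (!files.TryGetValue(name, out var content) || content == null)
            throw new FileNotFoundException();
        return content;
    }
}

public static class ConfigLoader
{
    // Hypothetical code under test: falls back to a default when the file is absent.
    public static string LoadOrDefault(IFileSystem fs, string name, string fallback)
    {
        try { return fs.ReadFile(name); }
        catch (FileNotFoundException) { return fallback; }
    }
}
```

Because the model's behavior is just data, a test can cover both the "file exists" and "file missing" paths of ConfigLoader by varying the dictionary, which is the same separation of concerns that parameterized unit tests give for ordinary inputs.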

MSeqGen Evaluation
- Subjects:
  - QuickGraph
  - Facebook
- Research questions:
  - RQ1: Can our approach assist Randoop (random testing tool) in achieving higher code coverage?
  - RQ2: Can our approach assist Pex (DSE-based testing tool) in achieving higher code coverage?

RQ1: Assisting Randoop

RQ2: Assisting Pex
- Legend:
  - #c: number of classes
  - P: branch coverage achieved by Pex
  - P + M: branch coverage achieved by Pex and MSeqGen

    void PexAssume.IsTrue(bool c) {
        if (!c) throw new AssumptionViolationException();
    }
    void PexAssert.IsTrue(bool c) {
        if (!c) throw new AssertionViolationException();
    }

- Assumptions and assertions induce branches
- Executions which cause assumption violations are ignored, not reported as errors or test cases
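
How a runner might treat the two exception kinds differently can be sketched as follows. This is a minimal illustration, not Pex's implementation: the exception classes are redeclared locally, and Check, Outcome, MiniRunner, and HalfIsNotLarger are hypothetical names.

```csharp
using System;

public sealed class AssumptionViolationException : Exception { }
public sealed class AssertionViolationException : Exception { }

public static class Check
{
    public static void Assume(bool c) { if (!c) throw new AssumptionViolationException(); }
    public static void Assert(bool c) { if (!c) throw new AssertionViolationException(); }
}

public enum Outcome { Passed, Failed, Ignored }

public static class MiniRunner
{
    // Runs one parameterized test on one input and classifies the execution.
    public static Outcome Run<T>(Action<T> test, T input)
    {
        try { test(input); return Outcome.Passed; }
        catch (AssumptionViolationException) { return Outcome.Ignored; } // precondition not met
        catch (AssertionViolationException) { return Outcome.Failed; }   // genuine failure
    }

    // Hypothetical parameterized test: for non-negative x, x / 2 never exceeds x.
    public static void HalfIsNotLarger(int x)
    {
        Check.Assume(x >= 0);
        Check.Assert(x / 2 <= x);
    }
}
```

With input -3 the assertion alone would fail (-3 / 2 is -1 in C#, which is larger than -3), but the assumption filters that input out first, so the run is ignored instead of being reported as an error.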

- How to test this code? (Actual code from .NET base class libraries)

    [PexClass, TestClass]
    [PexAllowedException(typeof(ArgumentNullException))]
    [PexAllowedException(typeof(ArgumentException))]
    [PexAllowedException(typeof(FormatException))]
    [PexAllowedException(typeof(BadImageFormatException))]
    [PexAllowedException(typeof(IOException))]
    [PexAllowedException(typeof(NotSupportedException))]
    public partial class ResourceReaderTest {
        [PexMethod]
        public unsafe void ReadEntries(byte[] data) {
            PexAssume.IsTrue(data != null);
            fixed (byte* p = data)
            using (var stream = new UnmanagedMemoryStream(p, data.Length)) {
                var reader = new ResourceReader(stream);
                foreach (var entry in reader) { /* reading entries */ }
            }
        }
    }

- Exploration of constructor/mutator method sequences
- Testing with class invariants

- Write class invariant as boolean-valued parameterless method
  - Refers to private fields
  - Must be placed in implementation code
- Exploration of valid states by setting public/private fields
  - May include states that are not reachable

Class invariant specification:

    public class ArrayList {
        private Object[] _items;
        private int _size;
        ...
        [ContractInvariantMethod] // attribute comes with Contracts
        protected void Invariant() {
            Contract.Invariant(this._items != null);
            Contract.Invariant(this._size >= 0);
            Contract.Invariant(this._items.Length >= this._size);
        }
    }

PUT:

    [PexMethod]
    public void ArrayListTest(ArrayList al, object o) {
        int len = al.Count;
        al.Add(o);
        PexAssert.IsTrue(al[len] == o);
    }

Generated Test:

    [TestMethod]
    public void Add01() {
        object[] os = new object[0];
        // create raw instance
        ArrayList arrayList = PexInvariant.CreateInstance<ArrayList>();
        // set private field via reflection
        PexInvariant.SetField<object[]>(arrayList, "_items", os);
        PexInvariant.SetField<int>(arrayList, "_size", 0);
        // invoke invariant method via reflection
        PexInvariant.CheckInvariant(arrayList);
        // call to PUT
        ArrayListTest(arrayList, null);
    }
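
The steps in the generated test (create a raw instance, set private fields via reflection, check the invariant) can be approximated with standard .NET reflection. The sketch below does not use PexInvariant; TinyList and RawState are hypothetical names, and it assumes a runtime that provides RuntimeHelpers.GetUninitializedObject (for example .NET Core 2.0 or later).

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

// Hypothetical miniature of the ArrayList on the slides.
public class TinyList
{
    private object[] _items = new object[4];
    private int _size;

    public int Count => _size;

    // Invariant written as a boolean-valued parameterless method.
    private bool Invariant() =>
        _items != null && _size >= 0 && _items.Length >= _size;
}

public static class RawState
{
    const BindingFlags Private = BindingFlags.Instance | BindingFlags.NonPublic;

    // Builds an instance without running a constructor, sets the two private
    // fields, and reports whether the invariant accepts the resulting state.
    public static bool TryCreate(object[] items, int size, out TinyList list)
    {
        list = (TinyList)RuntimeHelpers.GetUninitializedObject(typeof(TinyList));
        typeof(TinyList).GetField("_items", Private).SetValue(list, items);
        typeof(TinyList).GetField("_size", Private).SetValue(list, size);
        return (bool)typeof(TinyList).GetMethod("Invariant", Private).Invoke(list, null);
    }
}
```

States such as a null _items array or a _size larger than the array are rejected by the invariant, so only valid states would be handed to a parameterized unit test. As the slides note, a state accepted this way may still be one that no public method sequence can reach.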

- Developer testing
  - http://www.developertesting.com/
  - Kent Beck’s 2004 talk on “Future of Developer Testing”: http://www.itconversations.com/shows/detail301.html
- This talk focuses on tool automation in developer testing (e.g., unit testing)
  - Not system testing etc. conducted by testers

    void TestAdd() {
        int item = 3;
        var list = new List<int>();
        list.Add(item);
        Assert.AreEqual(1, list.Count);
    }

- Three essential ingredients:
  - Data
  - Method sequence
  - Assertions

    void TestAdd(List<int> list, int item) {
        Assume.IsTrue(list != null);
        var count = list.Count;
        list.Add(item);
        Assert.AreEqual(count + 1, list.Count);
    }

- Parameterized Unit Test = Unit Test with Parameters
- Separation of concerns
  - Data is generated by a tool
  - Developer can focus on functional specification
[Tillmann&Schulte ESEC/FSE 05]
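
To make the separation of concerns concrete, the sketch below runs the TestAdd specification over several inputs. The inputs are hand-picked stand-ins for what a tool such as Pex would generate, and ListPut and RunAll are hypothetical names; plain exceptions replace the Assume/Assert calls so that the block has no test-framework dependency.

```csharp
using System;
using System.Collections.Generic;

public static class ListPut
{
    // The parameterized unit test, written without a test framework.
    // Returns false when the assumption does not hold (the run is ignored).
    public static bool TestAdd(List<int> list, int item)
    {
        if (list == null) return false;          // Assume.IsTrue(list != null)
        int count = list.Count;
        list.Add(item);
        if (list.Count != count + 1)             // Assert.AreEqual(count + 1, list.Count)
            throw new InvalidOperationException("Add did not grow the list by one");
        return true;
    }

    // Hand-picked inputs standing in for tool-generated data.
    public static int RunAll()
    {
        var lists = new List<List<int>> { null, new List<int>(), new List<int> { 1, 2, 3 } };
        var items = new[] { 0, 3, int.MinValue };
        int executed = 0;
        foreach (var list in lists)
            foreach (var item in items)
                // Copy each list so every run starts from the same state.
                if (TestAdd(list == null ? null : new List<int>(list), item))
                    executed++;
        return executed;
    }
}
```

The specification is written once; only the data changes. Runs with a null list are filtered by the assumption and do not count as executed tests.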

- A Parameterized Unit Test can be read as a universally quantified, conditional axiom.

    void TestReadWrite(Res r, string name, string data) {
        Assume.IsTrue(r != null && name != null && data != null);
        r.WriteResource(name, data);
        Assert.AreEqual(r.ReadResource(name), data);
    }

∀ string name, string data, Res r:
    r ≠ null ∧ name ≠ null ∧ data ≠ null ⇒
    equals(ReadResource(WriteResource(r, name, data).state, name), data)

Parameterized Unit Tests (PUTs) commonly supported by various test frameworks
- .NET: supported by .NET test frameworks
  - http://www.mbunit.com/
  - http://www.nunit.org/
  - …
- Java: supported by JUnit 4.X
  - http://www.junit.org/

Generating test inputs for PUTs supported by tools
- .NET: supported by Microsoft Research Pex
  - http://research.microsoft.com/Pex/
- Java: supported by Agitar AgitarOne
  - http://www.agitar.com/

- Pex normally uses public methods to configure non-public object fields
- Heuristics built in to deal with common types
- User can help if needed

    void (Foo foo) {
        if (foo.Value == 123) throw …

    [PexFactoryMethod]
    Foo Create(Bar bar) {
        return new Foo(bar);
    }

- A graph example from QuickGraph library

    interface IGraph {
        /* Adds given vertex to the graph */
        void AddVertex(IVertex v);
        /* Creates a new vertex and adds it to the graph */
        IVertex AddVertex();
        /* Adds an edge to the graph. Both vertices should already exist in the graph */
        IEdge AddEdge(IVertex v1, IVertex v2);
    }

- Desired object state for reaching targets 1 and 2: graph object should contain vertices and edges

    class SortAlgorithm {
        IGraph graph;
        public SortAlgorithm(IGraph graph) {
            this.graph = graph;
        }
        public void Compute(IVertex s) {
            foreach (IVertex u in graph.Vertices) {
                // Target 1
            }
            foreach (IEdge e in graph.Edges) {
                // Target 2
            }
        }
    }

[Figure label: method sequence]

Applying Randoop, a random testing approach that constructs test inputs by randomly selecting method calls

Example sequence generated by Randoop:

    VertexAndEdgeProvider v0 = new VertexAndEdgeProvider();
    Boolean v1 = false;
    BidirectionalGraph v2 = new BidirectionalGraph((IVertexAndEdgeProvider)v0, v1);
    IVertex v3 = v2.AddVertex();
    IVertex v4 = v0.ProvideVertex();
    IEdge v15 = v2.AddEdge(v3, v4);

- v4 is not in the graph, so the edge cannot be added to the graph
- Achieved 31.82% (7 of 22) branch coverage
- Reason for low coverage: not able to generate a graph with vertices and edges

- Mine sequences from existing code bases
- Reuse mined sequences for achieving desired object states

A mined sequence from an existing code base:

    VertexAndEdgeProvider v0;
    bool bVal;
    IGraph ag = new AdjacencyGraph(v0, bVal);
    IVertex source = ag.AddVertex();
    IVertex target = ag.AddVertex();
    IVertex vertex3 = ag.AddVertex();
    IEdge edg1 = ag.AddEdge(source, target);
    IEdge edg2 = ag.AddEdge(target, vertex3);
    IEdge edg3 = ag.AddEdge(source, vertex3);

- Graph object includes both vertices and edges
- Use mined sequences to assist Randoop and Pex
- Both Randoop and Pex achieved 86.40% (19 of 22) branch coverage with assistance from MSeqGen
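
The difference between the two sequences can be reproduced on a toy graph. MiniGraph and Sequences below are hypothetical and deliberately much simpler than QuickGraph (vertices are plain integers); the sketch only shows why a sequence whose vertices all come from AddVertex reaches a state with edges, while one that uses a vertex created elsewhere does not.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical miniature graph; this is not the QuickGraph API.
public sealed class MiniGraph
{
    private readonly HashSet<int> vertices = new HashSet<int>();
    private readonly List<(int, int)> edges = new List<(int, int)>();
    private int nextId;

    public int VertexCount => vertices.Count;
    public int EdgeCount => edges.Count;

    // Creates a new vertex and adds it to the graph.
    public int AddVertex() { int v = nextId++; vertices.Add(v); return v; }

    // Both endpoints must already be in the graph.
    public void AddEdge(int v1, int v2)
    {
        if (!vertices.Contains(v1) || !vertices.Contains(v2))
            throw new ArgumentException("vertex not in graph");
        edges.Add((v1, v2));
    }
}

public static class Sequences
{
    // Random-style sequence: the second vertex was never added to the graph.
    public static bool RandomStyle()
    {
        var g = new MiniGraph();
        int v3 = g.AddVertex();
        int v4 = 99; // created outside the graph
        try { g.AddEdge(v3, v4); return true; }
        catch (ArgumentException) { return false; }
    }

    // Mined-style sequence: all vertices come from AddVertex, so edges succeed.
    public static MiniGraph MinedStyle()
    {
        var g = new MiniGraph();
        int source = g.AddVertex(), target = g.AddVertex(), vertex3 = g.AddVertex();
        g.AddEdge(source, target);
        g.AddEdge(target, vertex3);
        g.AddEdge(source, vertex3);
        return g;
    }
}
```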

- Existing code bases are often large and complete analysis is expensive
  - Search and analyze only relevant portions
- Concrete values in mined sequences may be different from desired values
  - Replace concrete values with symbolic values and use dynamic symbolic execution
- Extracted sequences individually may not be sufficient to achieve desired object states
  - Combine extracted sequences to generate new sequences

Problem: Existing code bases are often large and complete analysis is expensive
Solution:
- Use keyword search for identifying relevant method bodies using target classes
- Analyze only those relevant method bodies

Target classes:
    System.Collections.Hashtable
    QuickGraph.Algorithms.TSAlgorithm
Keywords: Hashtable, TSAlgorithm

Short names of target classes are used as keywords
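
The keyword step can be sketched as a simple text filter. This is an assumption-laden illustration, not MSeqGen's actual search: KeywordSearch, ShortName, and RelevantBodies are hypothetical names, method bodies are represented as plain strings, and matching is a naive substring test.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KeywordSearch
{
    // "System.Collections.Hashtable" -> "Hashtable"
    public static string ShortName(string fullyQualifiedClass)
    {
        int dot = fullyQualifiedClass.LastIndexOf('.');
        return dot < 0 ? fullyQualifiedClass : fullyQualifiedClass.Substring(dot + 1);
    }

    // Keeps only the method bodies that mention at least one keyword.
    public static List<string> RelevantBodies(
        IEnumerable<string> methodBodies, IEnumerable<string> targetClasses)
    {
        var keywords = targetClasses.Select(c => ShortName(c)).ToList();
        return methodBodies
            .Where(body => keywords.Any(k => body.Contains(k)))
            .ToList();
    }
}
```

A substring match over-approximates (it would also keep a body that mentions an unrelated type whose name contains a keyword), which is acceptable for a pre-filter whose purpose is to shrink the set of bodies that the more expensive analysis has to visit.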

Problem: Concrete values in mined sequences are different from desired values to achieve target states
Solution: Generalize sequences by replacing concrete values with symbolic values

Method under test:

    class A {
        int f1 { set; get; }
        int f2 { set; get; }
        void CoverMe() {
            if (f1 != 10) return;
            if (f2 > 25) throw new Exception("bug");
        }
    }

Mined sequence for A:

    A obj = new A();
    obj.setF1(14);
    obj.setF2(-10);
    obj.CoverMe();

The sequence cannot help in exposing the bug, since the desired values are f1 = 10 and f2 > 25

Replace concrete values 14 and -10 with symbolic values X1 and X2

Mined sequence for A:

    A obj = new A();
    obj.setF1(14);
    obj.setF2(-10);
    obj.CoverMe();

Generalized sequence for A:

    int x1 = *, x2 = *;
    A obj = new A();
    obj.setF1(x1);
    obj.setF2(x2);
    obj.CoverMe();

- Use DSE for generating desired values for X1 and X2
- DSE explores the CoverMe method and generates desired values (X1 = 10 and X2 = 35)
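
The generalized sequence can be written as a method whose parameters take the place of the concrete values. In the sketch below, an exhaustive search over a small range stands in for DSE; real DSE would solve the path constraints f1 == 10 and f2 > 25 instead of enumerating. The property names F1 and F2 and the names GeneralizedSequence, Run, and FindBugInputs are assumptions made for this illustration.

```csharp
using System;

public class A
{
    public int F1 { get; set; }
    public int F2 { get; set; }

    public void CoverMe()
    {
        if (F1 != 10) return;
        if (F2 > 25) throw new Exception("bug");
    }
}

public static class GeneralizedSequence
{
    // The mined sequence with its concrete values replaced by parameters x1, x2.
    // Returns true when the sequence reaches the "bug" branch.
    public static bool Run(int x1, int x2)
    {
        var obj = new A();
        obj.F1 = x1;
        obj.F2 = x2;
        try { obj.CoverMe(); return false; }
        catch (Exception e) when (e.Message == "bug") { return true; }
    }

    // Stand-in for DSE: exhaustive search over a small range of values.
    public static (int x1, int x2)? FindBugInputs(int lo, int hi)
    {
        for (int x1 = lo; x1 <= hi; x1++)
            for (int x2 = lo; x2 <= hi; x2++)
                if (Run(x1, x2)) return (x1, x2);
        return null;
    }
}
```

The original concrete values (14, -10) do not reach the bug, while (10, 35) from the slide does. The enumeration happens to find a different pair that satisfies the same constraints, which illustrates that any solution of the path condition is an acceptable test input.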

- Randoop
  - Without assistance from MSeqGen: achieved 32% branch coverage
  - With assistance from MSeqGen: achieved 86% branch coverage
  - In evaluation, helps Randoop achieve 8.7% (maximum 20%) higher branch coverage
- Pex
  - Without assistance from MSeqGen: achieved 45% branch coverage
  - With assistance from MSeqGen: achieved 86% branch coverage
  - In evaluation, helps Pex achieve 17.4% (maximum 22.5%) higher branch coverage

- Write assertions and Pex will try to break them
- Without assertions, Pex can only find violations of runtime contracts causing NullReferenceException, IndexOutOfRangeException, etc.
- Assertions leveraged in product and test code
- Pex can leverage Code Contracts
