Balancing Trade-Offs in Test-Suite Reduction. August Shi, Alex Gyori, Milos Gligoric, Andrey Zaytsev, and Darko Marinov. FSE 22, Hong Kong, 11/18/2014. NSF Grant Nos. CNS-0958199, CCF-1012759, CCF-1439957.
Testing Can Be Slow. [Diagram: Code Under Test exercised by test0, test1, test2, test3, …, testN-1.] EMPH: lots of tests -> slow development!
Speed Up by Removing Tests. [Diagram: Code Under Test with tests …, testN-1, testN.]
Test-Suite Reduction. [Diagram: Code Under Test exercised by test1, test3, …, testN.] In other words, build a reduced test suite that the test engineer finds acceptable: smaller than the full test suite, yet just as good. The reduced test suite has fewer tests than the full test suite but is representative of the full test suite on this code version.
Test-Suite Reduction and Changes. [Diagram: full test suite (…, testN) and reduced test suite (test1, test3, …, testN) on Version i and, after changes, on Version i+1.] The same reduced test suite is run on future versions. EMPH: is the reduced test suite still representative of the full test suite on future versions?
How does evolution affect reduced test suites? EMPH: we still cannot judge evolution’s effect!
Test-Suite Reduction. T = Tests, S = Statements. [Coverage matrix: tests T1-T5 against statements S1-S5; X marks a statement covered by a test.] EMPH: previous work evaluated reduction on a single version only; we evaluate multiple versions (later).
Test-Suite Reduction. T = Tests, S = Statements. [Coverage matrix: tests T1-T5 against statements S1-S5; X marks a statement covered by a test.] Reduced Test Suite R1 = {T3, T5}, obtained by Statement Adequate Reduction (SAR).
Test-Suite Reduction: metrics for R1. T = Tests, S = Statements, M = Mutants. [Matrices: tests T1-T5 against statements and against mutants M1-M6; X marks a covered statement or a killed mutant.]
$\mathit{SizRed} = \frac{|T| - |R1|}{|T|} \times 100 = \frac{5 - 2}{5} \times 100 = 60\%$
$\mathit{StmtLoss} = \frac{|stmt(T)| - |stmt(R1)|}{|stmt(T)|} \times 100 = 0\%$
$\mathit{MutLoss} = \frac{|mut(T)| - |mut(R1)|}{|mut(T)|} \times 100 = \frac{6 - 4}{6} \times 100 = 33\%$
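The three metrics can be sketched in a few lines of Python. The coverage and kill matrices from the slide did not survive extraction, so the `COVERS` and `KILLS` tables below are hypothetical data, invented only so that the totals agree with the slide (5 tests, 5 statements, 6 mutants); the function names are likewise illustrative.

```python
# Hypothetical matrices (NOT the slide's actual data): which statements each
# test covers and which mutants each test kills.
COVERS = {
    "T1": {"S1", "S2", "S3"},
    "T2": {"S1"},
    "T3": {"S1", "S2", "S3"},
    "T4": {"S4"},
    "T5": {"S4", "S5"},
}
KILLS = {
    "T1": {"M1", "M2", "M3"},
    "T2": {"M1"},
    "T3": {"M1", "M3"},
    "T4": {"M6"},
    "T5": {"M4", "M5"},
}


def satisfied(suite, req_map):
    """Union of the requirements (statements or mutants) satisfied by a suite."""
    result = set()
    for test in suite:
        result |= req_map[test]
    return result


def size_reduction(full, reduced):
    """SizRed: percentage of tests removed from the full suite."""
    return 100 * (len(full) - len(reduced)) / len(full)


def loss(full, reduced, req_map):
    """StmtLoss or MutLoss, depending on which requirement map is passed."""
    full_reqs = satisfied(full, req_map)
    reduced_reqs = satisfied(reduced, req_map)
    return 100 * (len(full_reqs) - len(reduced_reqs)) / len(full_reqs)


FULL = set(COVERS)
R1 = {"T3", "T5"}        # statement-adequate on this hypothetical data
R2 = {"T1", "T4", "T5"}  # mutant-adequate on this hypothetical data

for name, suite in (("R1", R1), ("R2", R2)):
    print(name,
          size_reduction(FULL, suite),
          loss(FULL, suite, COVERS),
          round(loss(FULL, suite, KILLS)))
```

Passing the statement map or the mutant map to the same `loss` function reflects that StmtLoss and MutLoss share one formula and differ only in the kind of requirement counted.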
Test-Suite Reduction. T = Tests, S = Statements, M = Mutants. [Matrices: tests T1-T5 against statements S1-S5 and mutants M1-M6; X marks a covered statement or a killed mutant.] Reduced Test Suite R2 = {T1, T4, T5}, obtained by Mutant Adequate Reduction (MAR).
Test-Suite Reduction: metrics for R2. T = Tests, S = Statements, M = Mutants. [Matrices: tests T1-T5 against statements and against mutants M1-M6; X marks a covered statement or a killed mutant.]
$\mathit{SizRed} = \frac{|T| - |R2|}{|T|} \times 100 = \frac{5 - 3}{5} \times 100 = 40\%$
$\mathit{StmtLoss} = \frac{|stmt(T)| - |stmt(R2)|}{|stmt(T)|} \times 100 = \frac{5 - 5}{5} \times 100 = 0\%$
$\mathit{MutLoss} = \frac{|mut(T)| - |mut(R2)|}{|mut(T)|} \times 100 = 0\%$
Choosing a Test Suite (Single Version Evaluation)
Suite   SizRed   StmtLoss   MutLoss
R1      60%      0%         33%
R2      40%      0%         0%
T       (full test suite, the baseline)
How does evolution affect reduced test suites? (Example) EMPH: we still cannot judge evolution’s effect!
How does evolution affect reduced test suites? [Coverage matrices: tests T1-T5 on Version i and on Version i+1 (statements S1'-S6').] For instance, the code underneath changes and statement coverage is shuffled. The full test suite covers all 6 statements of Version i+1, but the reduced test suite does not cover all of them: it misses 2 compared to the full test suite. EMPH: this is loss we were originally unaware of, and current metrics cannot show loss due to evolution! How do we measure the change in loss?
Evolution-Aware Metrics: concerned with the performance of a reduced test suite from an earlier version on a later version. [Coverage matrices: tests T1-T5 on Version i and on Version i+1 (statements S1'-S6').] Notes: the full test suite covers 6 statements on the new version; the reduced test suite covers 4 statements.
$\mathit{StmtLoss}_i^{i} = \frac{|stmt_i(T_i)| - |stmt_i(R_i)|}{|stmt_i(T_i)|} \times 100 = 0\%$
$\mathit{StmtLoss}_i^{i+1} = \frac{|stmt_{i+1}(T_i)| - |stmt_{i+1}(R_i)|}{|stmt_{i+1}(T_i)|} \times 100 = \frac{5 - 3}{5} \times 100 = 40\%$
$\mathit{REC}_i^{i+1} = \mathit{StmtLoss}_i^{i+1} - \mathit{StmtLoss}_i^{i} = 40\% - 0\% = 40pp$ (percentage points)
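As a rough illustration, REC between two versions can be computed by evaluating the same reduced suite against the requirement data of each version. The two coverage maps below are hypothetical (the slide's matrices are not recoverable) and were chosen only so that the result matches the 40pp in the formula; `rec` and `loss` are illustrative names, and tests are matched across versions by name.

```python
def loss(full, reduced, req_map):
    """Percentage of requirements satisfied by `full` but not by `reduced`."""
    def reqs(suite):
        # Tests missing from a version simply contribute no requirements.
        return set().union(*(req_map.get(t, set()) for t in suite))
    full_reqs = reqs(full)
    return 100 * (len(full_reqs) - len(reqs(reduced))) / len(full_reqs)


def rec(full_i, reduced_i, req_map_i, req_map_j):
    """REC from version i to version j, in percentage points.

    The suites are fixed at version i; only the requirement data changes.
    """
    return loss(full_i, reduced_i, req_map_j) - loss(full_i, reduced_i, req_map_i)


# Hypothetical statement coverage on version i and version i+1.
V_I = {
    "T1": {"S1", "S2"},
    "T2": {"S3"},
    "T3": {"S1", "S2", "S3"},
    "T4": {"S4"},
    "T5": {"S4", "S5"},
}
V_NEXT = {
    "T1": {"S1", "S2"},
    "T2": {"S3"},
    "T3": {"S1", "S2"},
    "T4": {"S4"},
    "T5": {"S5"},
}

T_I = set(V_I)
R_I = {"T3", "T5"}  # statement-adequate on V_I

print(rec(T_I, R_I, V_I, V_NEXT))
```

A positive REC means the reduced suite lost more, relative to the full suite, on the later version than it did at the point of reduction.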
Evolution-Aware Metrics (mutants). [Kill matrices: tests T1-T5 on Version i and on Version i+1 (mutants M1'-M6').]
$\mathit{MutLoss}_i^{i} = 33\%$
$\mathit{MutLoss}_i^{i+1} = \frac{|mut_{i+1}(T_i)| - |mut_{i+1}(R_i)|}{|mut_{i+1}(T_i)|} \times 100 = \frac{6 - 3}{6} \times 100 = 50\%$
$\mathit{REC}_i^{i+1} = \mathit{MutLoss}_i^{i+1} - \mathit{MutLoss}_i^{i} = 50\% - 33\% = 17pp$
Evolution-Aware Metrics across multiple versions. [Coverage matrices: tests T1-T5 on Version i (S1-S5), Version i+1 (S1'-S6'), and Version i+2 (S1''-S4'').] Requirements can be collected for multiple versions, and REC is computed pairwise.
$\mathit{REC}_d = \{ \mathit{REC}_i^{j} \mid \text{for all } i \text{ and } j \text{ such that } d = j - i \}$
Reducing at Version i gives REC_1 = 40pp and, even further away, REC_2 = 25pp. The reduction point can also start at a later version, which gives more data for certain distances: REC_1 = {40pp, 25pp}, REC_2 = {25pp}.
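Grouping pairwise REC values by distance d = j - i can be sketched as below. The three coverage maps and the per-version reduced suites are hypothetical inputs, constructed only so that the output reproduces REC_1 = {40pp, 25pp} and REC_2 = {25pp}; `rec_by_distance` is an illustrative name.

```python
from collections import defaultdict


def loss(full, reduced, req_map):
    """Percentage of requirements satisfied by `full` but not by `reduced`."""
    def reqs(suite):
        return set().union(*(req_map.get(t, set()) for t in suite))
    full_reqs = reqs(full)
    return 100 * (len(full_reqs) - len(reqs(reduced))) / len(full_reqs)


def rec_by_distance(versions, reduced):
    """Collect REC_i^j for every pair i < j, keyed by the distance d = j - i.

    versions[k] maps each test name to its requirements at version k;
    reduced[k] is the suite obtained by reducing at version k.
    """
    result = defaultdict(list)
    for i in range(len(versions)):
        full_i = set(versions[i])
        base = loss(full_i, reduced[i], versions[i])
        for j in range(i + 1, len(versions)):
            result[j - i].append(loss(full_i, reduced[i], versions[j]) - base)
    return dict(result)


# Hypothetical statement coverage for three consecutive versions.
VERSIONS = [
    {"T1": {"S1", "S2"}, "T2": {"S3"}, "T3": {"S1", "S2", "S3"},
     "T4": {"S4"}, "T5": {"S4", "S5"}},
    {"T1": {"S1", "S2"}, "T2": {"S3"}, "T3": {"S1", "S2"},
     "T4": {"S4"}, "T5": {"S5"}},
    {"T1": {"S1"}, "T2": {"S1"}, "T3": {"S1", "S2"},
     "T4": {"S3"}, "T5": {"S4"}},
]
# Hypothetical statement-adequate reduced suites, one per reduction point.
REDUCED = [
    {"T3", "T5"},
    {"T1", "T2", "T4", "T5"},
    {"T3", "T4", "T5"},
]

print(rec_by_distance(VERSIONS, REDUCED))
```

With n versions there are n - d pairs at distance d, which is why shorter distances accumulate more data points than longer ones.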
How does evolution affect reduced test suites? (Evaluation) We are equipped to answer… We want to see how real projects’ reduced test suites behave
Evaluation: Implementation. Use PIT to collect statement coverage and mutants killed by tests on each version. Use the Greedy heuristic to perform test-suite reduction. Use Statement Adequate Reduction and Mutant Adequate Reduction.
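The slides do not show the implementation, so the following is only a minimal sketch of the classic greedy set-cover heuristic that the name Greedy usually refers to. It assumes the per-test requirements (covered statements or killed mutants) have already been collected into a dictionary; the data, the function name, and the alphabetical tie-breaking are all assumptions made for illustration.

```python
def greedy_reduce(req_map):
    """Pick tests until every requirement satisfied by the full suite is kept.

    req_map maps each test name to the set of requirements it satisfies.
    At each step the test satisfying the most still-unsatisfied requirements
    is chosen; ties go to the alphabetically first test (an assumption).
    """
    uncovered = set().union(*req_map.values())
    selected = []
    while uncovered:
        best = max(sorted(req_map), key=lambda t: len(req_map[t] & uncovered))
        selected.append(best)
        uncovered -= req_map[best]
    return selected


# Hypothetical statement coverage; passing killed mutants instead of covered
# statements would give a mutant-adequate reduction.
COVERS = {
    "T1": {"S1", "S2", "S3"},
    "T2": {"S1"},
    "T3": {"S1", "S2", "S3"},
    "T4": {"S4"},
    "T5": {"S4", "S5"},
}

print(greedy_reduce(COVERS))
```

The result is adequate by construction, since the loop ends only when no requirement remains unsatisfied, but it is not guaranteed to be the smallest adequate suite.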
Evaluation: Projects. Let's see how evolution affects reduced test suites on some projects: 18 open-source projects from GitHub, covering a variety of applications. Distance between versions = 30 Git commits. Up to 10 versions per project (not all projects have 10; some are not as mature). The median number of tests across versions is reported (this might not be all tests, due to limitations of the tool). [Table of projects not recovered.]
Statement Adequate Reduction (SAR): $\mathit{REC}_d = \{ \mathit{REC}_i^{j} \mid \text{for all } i \text{ and } j \text{ such that } d = j - i \}$. [Plot: REC_d values, aggregated over all projects; axes and colors are explained in the talk.] The median REC for SAR is around 0. EMPH: quality does not drop much!
Mutant Adequate Reduction (MAR): $\mathit{REC}_d = \{ \mathit{REC}_i^{j} \mid \text{for all } i \text{ and } j \text{ such that } d = j - i \}$. [Plot: REC_d values.] The median REC for MAR is around 0; quality does not drop much.
How does evolution affect reduced test suites? (Answer) EMPH: we can now answer question!
Answer (based on our evaluation over 18 projects): the quality of the reduced test suite relative to the quality of the full test suite remains about the same during evolution.
We further evaluated… Different test-suite reduction algorithms: four other algorithms in addition to Greedy. Inadequate test-suite reduction: reduction that does not preserve all statements covered or mutants killed. …you can find more details in our paper.
Threats to Validity. Reduced test suites are evaluated with the same metrics (statement coverage and killed mutants) used for reduction. Tests are tracked by name. Versions are 30 commits apart.
Related Work. Improving test-suite reduction: Hao et al. [ICSE 2012]; Lin and Huang [IST 2009]; Yoo and Harman [ISSTA 2007]; Jeffrey and Gupta [TSE 2007]; Black et al. [ICSE 2004]; Offutt et al. [ICTCS 1995]. Effect of software evolution on coverage: Elbaum et al. [ICSM 2001].
Conclusions. Q: How does software evolution affect test-suite reduction? We introduced new evolution-aware metrics (REC) and performed the largest evaluation of test-suite reduction, covering different metrics and algorithms as well as inadequate test-suite reduction. A: Reduced test suites do not reduce in quality relative to the full test suite over time. If a reduced test suite is acceptable for this version, then it is likely acceptable in future versions. August Shi: awshi2@illinois.edu http://mir.cs.illinois.edu/evolred/
BACKUP
Project Filters. Uses the Maven build system. Has > 100 commits in its history. HEAD at the time of the experiments could be successfully run through PIT. Has > 4 versions in its history that work with PIT.
Evaluation: Basic Test-Suite Reduction
Different Algorithms? Different algorithms can behave differently under evolution. [Diagram: coverage matrices for tests T1-T5 on Version i and Version j, with the reduced test suite chosen by each algorithm.] Algorithms: Greedy, HGS, GRE, ILP, GE.
Evaluation: Other Algorithms. The other 4 algorithms produce results very similar to Greedy. Difference in size reduction: at most 5.26pp across all algorithms. Difference in MutLoss for SAR algorithms: at most 7.15pp. Difference in StmtLoss for MAR algorithms: at most 4.15pp. Difference in REC_d for any distance d: at most 0.33pp for MutLoss and 0.67pp for StmtLoss.
Inadequate Reduction? [Diagram: coverage matrices for tests T1-T5 on Version i and Version j.] Inadequate reduction is an option if the budget is very tight. How does an inadequately reduced test suite compare over time?
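The slides do not say how the inadequately reduced suites are produced. One possible way, shown here purely as an assumption, is to stop a greedy selection early once a target fraction of the requirements is satisfied. The `target_fraction` parameter, the function name, the data, and the tie-breaking rule are all hypothetical.

```python
def greedy_reduce_inadequate(req_map, target_fraction):
    """Greedy selection that stops once `target_fraction` of all requirements
    satisfied by the full suite are satisfied by the selected tests.

    This is an illustrative stopping rule, not necessarily the one used in
    the evaluation. Ties go to the alphabetically first test.
    """
    all_reqs = set().union(*req_map.values())
    needed = target_fraction * len(all_reqs)
    covered = set()
    selected = []
    while len(covered) < needed:
        best = max(sorted(req_map), key=lambda t: len(req_map[t] - covered))
        gain = req_map[best] - covered
        if not gain:  # nothing left to gain (e.g. target_fraction > 1)
            break
        selected.append(best)
        covered |= gain
    return selected


# Hypothetical statement coverage.
COVERS = {
    "T1": {"S1", "S2", "S3"},
    "T2": {"S1"},
    "T3": {"S1", "S2", "S3"},
    "T4": {"S4"},
    "T5": {"S4", "S5"},
}

print(greedy_reduce_inadequate(COVERS, 0.5))  # tight budget: fewer tests kept
print(greedy_reduce_inadequate(COVERS, 1.0))  # behaves like adequate reduction
```

A suite produced this way starts with a nonzero loss at the reduction point, so REC measures how much that loss changes on later versions rather than how far it departs from zero.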
Evaluation: Inadequate Reduction Statement Inadequate Reduction (SIR) Mutant Inadequate Reduction (MIR)
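Evaluation: SIR Evolution. $\mathit{REC}_d = \{ \mathit{REC}_i^{j} \mid \text{for all } i \text{ and } j \text{ such that } d = j - i \}$. [Plot: REC_d values, results for Greedy; axes and colors are explained in the talk.] Trends are stable, and the same holds for all requirements used in reduction. These results are for Greedy; the other algorithms show similar trends.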
Evaluation: MIR Evolution. $\mathit{REC}_d = \{ \mathit{REC}_i^{j} \mid \text{for all } i \text{ and } j \text{ such that } d = j - i \}$. [Plot: REC_d values, results for Greedy.]
Code Change