Detection of Rename Local Variable Refactoring Instances in Commit History. Matin Mansouri. Advisor: Dr. Nikolaos Tsantalis
Refactoring
The process of changing a software system in such a way that it does not alter the external behavior of the code, yet improves its internal structure. Refactoring is a very common activity. A study at Microsoft revealed that 22% of developers initiate refactorings because of poor readability, and 11% because of poor maintainability [KZN14]. Moreover, 43% of developers mentioned that they actually perceived better code readability, and around 30% perceived improved code maintainability after refactoring. According to this study, developers spend about 10% of their time each month working on refactoring.
A Frequent Activity: Refactoring is done regularly [Murphy-Hill et al, TSE2012]. Overall, about 16% of all changes are refactorings [Xing et al, ICSM2006]. Developers consider refactoring an integral part of their programming [Ge et al, ICSE2014]. 10% of development time is spent on refactoring [Kim et al, TSE2014].
A Beneficial Activity: Microsoft study [Kim et al, TSE2014]. Benefits observed from refactoring: 30% improved maintainability; 43% improved readability; 27% easier feature addition, improved extensibility, and fewer bugs. When refactoring is performed: 22% because of poor readability; 13% because of code duplication; 11% because of poor maintainability.
However… Refactorings slow down code review [Ge et al, VLHCC2017]. Global refactorings can easily lead to merge conflicts [Dig et al, TSE2008]. 80% of the breaking changes to APIs are caused by refactorings [Dig et al, ICSM2005]. Refactorings negatively affect the process of accurately locating bug-introducing changes [Costa et al, TSE2016].
But these problems can be mitigated if we can identify the applied refactorings.
Refactoring Detection to the Rescue: Refactorings slow down code review [Ge et al, VLHCC2017] -> refactoring-aware code review. Global changes can easily lead to merge conflicts [Dig et al, TSE2008] -> refactoring-aware code merging. 80% of the breaking changes to APIs are caused by refactorings [Dig et al, ICSM2005] -> refactoring-aware client adaptation.
Refactoring Detection to the Rescue: Refactorings negatively affect the process of accurately locating bug-introducing changes [Costa et al, TSE2016] -> improving the accuracy of the SZZ algorithm. In addition, empirical studies showed contradicting results regarding the effect of refactoring on the code [Soares et al, JSS2013] -> a precise tool for accurate studies.
Refactoring Detection: The process of identifying a set of refactorings that have been applied between two revisions of a software system
Refactoring Detection Approaches: RefactoringCrawler [Dig et al, ECOOP2006], Ref-Finder [Prete et al, ICSM2010], RefDiff [Silva et al, MSR2017], RefactoringMiner [Tsantalis et al, ICSE2018], Renaming detection [Malpohl et al, ASE2000], REPENT [Arnaoudova et al, TSE2014].
However… These approaches are incomplete. Not all refactoring types are supported, e.g., Rename Local Variable refactoring.
Why Rename Local Variable? Rename is the most frequently applied refactoring (up to 74%) [Murphy-Hill et al, TSE2012]. Rename Local Variable is the second most applied rename refactoring (~25%) [Arnaoudova et al, TSE2014]. 21% of developers perform rename refactorings every day [Arnaoudova et al, TSE2014]. It is as important as other refactorings.
Refactoring Detection Approaches: RefactoringCrawler [Dig et al, ECOOP2006], Ref-Finder [Prete et al, ICSM2010], RefDiff [Silva et al, MSR2017], RefactoringMiner [Tsantalis et al, ICSE2018], Renaming detection [Malpohl et al, ASE2000], REPENT [Arnaoudova et al, TSE2014]. Each of these either has no support for Rename Local Variable or is not available.
Rename Local Variable Detection is Difficult: Extract/inline method. [Figure: A) before, B) after]
Rename Local Variable Detection is Difficult: Textual diff approach limitation

public void invoke(...) {
    ...
    ClusterMessage imsg = manager.requestCompleted(invalidIds[i]);
    if (imsg != null) cluster.send(imsg);
    if (session != null) id = session.getIdInternal();
    if (id == null) return;
    ClusterMessage msg = manager.requestCompleted(id);
    if (msg == null) return;
    cluster.send(msg);
}

protected void sendSessionReplicationMessage(...) {
    String id = session.getIdInternal();
    if (id != null) {
        ClusterMessage msg = manager.requestCompleted(id);
        if (msg != null) cluster.send(msg);
    }
}

protected void sendInvalidSessions(...) {
    ClusterMessage imsg = manager.requestCompleted(invalidIds[i]);
    if (imsg != null) cluster.send(imsg);
}
Rename Local Variable Detection is Difficult Merge and split variable
Approach
Approach: We build on RefactoringMiner for statement matching and for its support of other refactoring types (e.g., extract/inline/rename method). On top of it we add Rename Local Variable detection rules and statement matching post-processing.
RefactoringMiner Statement Matching: outMessage -> response

Old revision:
public dnsMessage sendAXFR(dnsMessage inMessage) {
    Socket s;
    ...
    dnsMessage outMessage;
    s.getOutputStream().write(out);
    outMessage = new dnsMessage();
    while (true) {
        outMessage.addRecord(dns.ANSWER, r);
    }
    return outMessage;
}

New revision:
public dnsMessage sendAXFR(dnsMessage query) {
    Socket s;
    ...
    dnsMessage response;
    s.getOutputStream().write(out);
    response = new dnsMessage();
    while (true) {
        response.addRecord(dns.ANSWER, r);
    }
    return response;
}
Rename Local Variable Rules: There exists a replacement r that replaces the local variable l in the old revision with l’ in the new revision. In the sendAXFR example, r replaces outMessage with response.
Rename Local Variable Rules: l should belong to the declared local variables of ma (the method in the old revision), and l’ should belong to the declared local variables of ma' (the matched method in the new revision). l should not exist in the new revision of the code, and l’ should not exist in the old revision of the code. In the sendAXFR example, outMessage is declared only in the old revision and response only in the new revision.
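The declaration and non-existence conditions above can be sketched as a simple set check. This is a minimal illustration, not RefactoringMiner's actual API: the class name `RenameRuleCheck`, the method `isRenameCandidate`, and the use of per-method sets of declared variable names as a stand-in for "exists in the revision" are all assumptions made for this example.

```java
import java.util.Set;

// Hypothetical sketch of the declaration rules; names are illustrative only.
final class RenameRuleCheck {

    /**
     * Returns true when oldName -> newName satisfies the rules:
     * oldName is declared in the old method, newName in the new method,
     * oldName no longer exists in the new revision, and newName did not
     * exist in the old revision. Existence is approximated here by the
     * sets of declared local variable names (an assumption).
     */
    static boolean isRenameCandidate(String oldName,
                                     String newName,
                                     Set<String> declaredInOld,
                                     Set<String> declaredInNew) {
        boolean declaredBefore = declaredInOld.contains(oldName);
        boolean declaredAfter = declaredInNew.contains(newName);
        boolean oldNameGone = !declaredInNew.contains(oldName);
        boolean newNameIsNew = !declaredInOld.contains(newName);
        return declaredBefore && declaredAfter && oldNameGone && newNameIsNew;
    }
}
```

With the sendAXFR example, `outMessage -> response` passes the check, while a pair such as `s -> response` fails because `s` still exists in the new revision.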
Rename Local Variable Rules: l should not appear in the body of the methods extracted from ma, as detected by RefactoringMiner. [Figure: A) before, B) after]
Rename Local Variable Rules: A local variable cannot be renamed to two local variables. Similarly, two local variables cannot be renamed to one local variable.
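The one-to-one constraint can be sketched as a filter over candidate rename pairs. This is an illustrative sketch only: the class `OneToOneFilter`, the representation of a candidate as a two-element `String[]` of {old name, new name}, and the assumption that the candidate list contains no duplicate pairs are all hypothetical choices for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: drop candidates that would split or merge variables.
final class OneToOneFilter {

    /**
     * Keeps only the pairs {old, new} whose old name and new name each
     * occur in exactly one candidate pair. Assumes the input holds
     * distinct pairs.
     */
    static List<String[]> keepOneToOne(List<String[]> candidates) {
        Map<String, Integer> oldCount = new HashMap<>();
        Map<String, Integer> newCount = new HashMap<>();
        for (String[] c : candidates) {
            oldCount.merge(c[0], 1, Integer::sum);
            newCount.merge(c[1], 1, Integer::sum);
        }
        List<String[]> kept = new ArrayList<>();
        for (String[] c : candidates) {
            // A count above 1 means a split (old side) or a merge (new side).
            if (oldCount.get(c[0]) == 1 && newCount.get(c[1]) == 1) {
                kept.add(c);
            }
        }
        return kept;
    }
}
```

For instance, if `response` is a candidate for both `errorMessage` and `message`, both pairs are discarded, while an unrelated pair such as `outMessage -> response` is kept.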
Statement Matching Post-Processing: Introducing scope

Old revision:
public static void main(String args[]) {
    if (args.get(0) == 0) {
        String response = args.get(1);
        saveError(response);
    } else {
        saveResponse(response);
    }
}

New revision:
public static void main(String args[]) {
    if (args.get(0) == 0) {
        String errorMessage = args.get(1);
        saveError(errorMessage);
    } else {
        String message = args.get(1);
        saveResponse(message);
    }
}
RefactoringMiner Matches Statements in a Greedy Way
Statement Matching Post-Processing Find an optimal match
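The difference between a greedy and an optimal match can be illustrated on a small similarity matrix. The sketch below is a generic illustration under stated assumptions, not the algorithm used in RefactoringMiner or in this work: the class `MatchingSketch`, the square matrix `sim` (where `sim[i][j]` is a hypothetical similarity between old statement i and new statement j), the row-by-row greedy order, and the brute-force optimal search are all choices made for this example.

```java
// Hypothetical sketch: greedy versus exhaustive one-to-one statement matching.
final class MatchingSketch {

    /** Greedy: each row, in order, takes the best column still free. */
    static double greedyTotal(double[][] sim) {
        int n = sim.length;
        boolean[] used = new boolean[n];
        double total = 0;
        for (int i = 0; i < n; i++) {
            int best = -1;
            for (int j = 0; j < n; j++) {
                if (!used[j] && (best < 0 || sim[i][j] > sim[i][best])) {
                    best = j;
                }
            }
            used[best] = true;
            total += sim[i][best];
        }
        return total;
    }

    /** Optimal: tries every assignment (only feasible for small inputs). */
    static double optimalTotal(double[][] sim) {
        return search(sim, 0, new boolean[sim.length]);
    }

    private static double search(double[][] sim, int row, boolean[] used) {
        if (row == sim.length) {
            return 0;
        }
        double best = Double.NEGATIVE_INFINITY;
        for (int j = 0; j < sim.length; j++) {
            if (used[j]) {
                continue;
            }
            used[j] = true;
            best = Math.max(best, sim[row][j] + search(sim, row + 1, used));
            used[j] = false;
        }
        return best;
    }
}
```

With `sim = {{0.9, 0.8}, {0.85, 0.1}}`, the greedy pass commits the first statement to its locally best partner and leaves the second with a poor one, whereas the exhaustive search finds the assignment with the higher total similarity. A production implementation would likely replace the brute-force search with a polynomial-time assignment algorithm.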
Evaluation
Evaluation Research Questions: RQ1: How accurate is our technique in detecting instances of Rename Local Variable refactorings? RQ2: How does our technique perform compared to REPENT? RQ3: What is the efficiency of our technique in terms of the time taken for detecting instances of Rename Local Variable refactorings?
Oracle Construction: We used the existing oracle provided by [Arnaoudova et al, TSE2014].

Software   Revisions Period   Files    Total File Revisions   KLOC    Detected   Validated
dnsjava    1998-2011          365      1,415                  9-35    144        32
Tomcat     1999-2006          12,205   46,498                 5-315   397        180
RefBenchmark
RefBenchmark Applications: unify the results of different tools; compute agreement using two or more tools; evaluate the results of the tools against an oracle.
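Computing agreement between two tools amounts to set operations once their results are unified into a common form. The sketch below is an assumption-laden illustration, not RefBenchmark's implementation: the class `AgreementSketch` and the string key format for a refactoring instance (for example `"sendAXFR:outMessage->response"`) are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: agreement between two tools' unified results.
final class AgreementSketch {

    /** Instances reported by both tools. */
    static Set<String> agreed(Set<String> toolA, Set<String> toolB) {
        Set<String> result = new HashSet<>(toolA);
        result.retainAll(toolB);
        return result;
    }

    /** Instances reported by toolA but not by toolB. */
    static Set<String> onlyIn(Set<String> toolA, Set<String> toolB) {
        Set<String> result = new HashSet<>(toolA);
        result.removeAll(toolB);
        return result;
    }
}
```

Instances in the agreed set and in each tool-only set can then be validated against an oracle separately.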
Extended Oracle Using RefBenchmark

                                     RefactoringMiner   REPENT
Software   #Refactorings   Agreed    TP      FP         TP      FP
Tomcat     396             268       76      55         52      77
dnsjava    128             93        19      9          17      34
Total      524             361       95      64         69      111

Notes: Three instances validated by REPENT could not be found. dnsjava: one instance agreed on by both tools is a false positive. Validation tallies: dnsjava: 66 (Laleh), 20 (Davood) + 32 (REPENT) = 110; Tomcat: 180 (REPENT) + 40 (Laleh) + 10 (Davood) = 230.
RQ1-Our Approach is Accurate

Software   #Refactorings   TP    FP   FN   Recall   Precision
Tomcat     396             344   55   52   86.9%    86.2%
dnsjava    128             111   10   17   86.7%    91.7%
Total      524             455   65   69   86.8%    87.5%
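The recall and precision values in this table follow from the standard definitions, recall = TP / (TP + FN) and precision = TP / (TP + FP). A minimal sketch, with the class name `Metrics` chosen only for illustration:

```java
// Minimal sketch of the standard precision and recall formulas.
final class Metrics {

    /** Fraction of reported instances that are correct. */
    static double precision(int tp, int fp) {
        return tp / (double) (tp + fp);
    }

    /** Fraction of oracle instances that were reported. */
    static double recall(int tp, int fn) {
        return tp / (double) (tp + fn);
    }
}
```

For the Total row, `Metrics.recall(455, 69)` is 455/524 and `Metrics.precision(455, 65)` is 455/520, which round to the reported 86.8% and 87.5%.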
RQ2-Comparison with REPENT

                           RefactoringMiner       REPENT
Software   #Refactorings   Recall   Precision     Recall   Precision
Tomcat     396             86.9%    86.2%         80.8%    80.6%
dnsjava    128             86.7%    91.7%         85.2%    75.7%
Total      524             86.8%    87.5%         81.8%    79.3%
RQ3-Execution Time (ms) [Figure: distribution of execution time for each commit]
Conclusion: Provided a comprehensive Rename Local Variable oracle. The oracle is extended by 96 refactoring instances. All detected instances are validated manually. Added Rename Local Variable detection to RefactoringMiner. Outperforms the state-of-the-art approach (REPENT). Introduced a refactoring results agreement platform (RefBenchmark).