Cross Language Clone Analysis Team 2 February 3, 2011
◦ Parsing/CodeDOM
◦ Clone Analysis
◦ Customer Meeting
◦ GUI Implementation
◦ Testing
◦ Current Status
◦ Path Forward
Allen Tucker, Patricia Bradford, Greg Rodgers, Ashley Chafin
Parsing and conversion to CodeDOM
Received the grammar and included it in the project.
One parser engine == three languages
Significant work remains to get all statement variations into CodeDOM.
◦ Estimated 35% complete
◦ Ongoing parsing work will not impede analysis work.
The Algorithm
3 Types of Clones (Definition of Similarity):
◦ Type 1: An exact copy without modifications (except for whitespace and comments)
◦ Type 2: A syntactically identical copy
    Only variable, type, or function identifiers have been changed
◦ Type 3: A copy with further modifications
    Statements have been changed, reordered, added, or removed
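The Type 1 definition can be illustrated with a small sketch. This is only an assumption-laden example in Python that works on raw C-style text, not the project's CodeDOM-based implementation. The names `tokens` and `is_type1_clone` and both regular expressions are hypothetical, and the comment pattern does not handle comment markers inside string literals.

```python
import re

# Hypothetical patterns for C-style source text (assumption, not from the project).
_COMMENT = re.compile(r"//[^\n]*|/\*.*?\*/", re.DOTALL)
_TOKEN = re.compile(r"[A-Za-z_]\w*|\d+(?:\.\d+)?|\S")


def tokens(source: str) -> list:
    """Strip comments, then split into identifiers, numbers and single symbols."""
    without_comments = _COMMENT.sub(" ", source)
    return _TOKEN.findall(without_comments)


def is_type1_clone(a: str, b: str) -> bool:
    """Two fragments are Type 1 clones if they differ only in whitespace and comments."""
    return tokens(a) == tokens(b)


if __name__ == "__main__":
    first = "int sum(int a, int b) { return a + b; } // add"
    second = "int sum(int a,int b)\n{\n  /* add */ return a+b;\n}"
    renamed = "int add(int x, int y) { return x + y; }"
    print(is_type1_clone(first, second))   # layout and comments differ only
    print(is_type1_clone(first, renamed))  # identifiers differ, so not Type 1
```

Comparing token lists rather than raw strings keeps `int x` and `intx` distinct while still ignoring layout.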
Code Base
  → CodeDOM Conversion: Use GOLD Parser for conversion
Processed Code
  → Transformation: Transform the CodeDOM elements into a sequence of tokens
Transformed Code
  → Match Detection: Run comparison algorithm on transformed code
Clones
  → Formatting: Clone pair/class locations of the transformed code are mapped to the original code base by line numbers and file location
Clone Pairs/Classes
  → Filtering: Clones are extracted from the source, visualized and manually analyzed to filter out false positives
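The formatting step, which maps a match found in the transformed code back to the original files, can be sketched as below. This is a minimal Python illustration under stated assumptions: each token remembers the file and line it came from, and a match is reported as a half-open range of token indices. The `Token` type and the functions `tokenize_with_lines` and `locate` are hypothetical names, not part of the project.

```python
import re
from typing import List, NamedTuple, Tuple


class Token(NamedTuple):
    text: str   # token text as it appears in the source
    file: str   # file the token came from
    line: int   # 1-based line number in that file


# Hypothetical token pattern: identifiers, integers, or any single symbol.
_TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def tokenize_with_lines(file: str, source: str) -> List[Token]:
    """Tokenize line by line so every token keeps its original location."""
    result = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for text in _TOKEN.findall(line):
            result.append(Token(text, file, line_no))
    return result


def locate(tokens: List[Token], start: int, end: int) -> Tuple[str, int, int]:
    """Map the half-open token range [start, end) to (file, first line, last line)."""
    if not 0 <= start < end <= len(tokens):
        raise ValueError("token range out of bounds")
    return tokens[start].file, tokens[start].line, tokens[end - 1].line


if __name__ == "__main__":
    source = "int a;\nint b;\nreturn a + b;\n"
    toks = tokenize_with_lines("demo.cpp", source)
    print(locate(toks, 3, 8))
```

Carrying the location on each token means the match detector can work purely on token indices and leave line lookup to this final step.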
Convert source code to CodeDOM
Transform the CodeDOM syntax to a sequence of tokens
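The idea of the transformation can be sketched in Python. The project transforms CodeDOM elements; the example below instead works directly on source text, which is an assumption made to keep it self-contained. The `KEYWORDS` set, the token pattern and the function `abstract_tokens` are hypothetical. Identifiers and literals are replaced by the placeholder `$p`, while keywords and punctuation are kept.

```python
import re

# Deliberately small, illustrative keyword set (assumption, not the project's list).
KEYWORDS = {"for", "if", "else", "while", "return"}

# Identifiers, numbers, simple double-quoted strings, or any single symbol.
_TOKEN = re.compile(r'[A-Za-z_]\w*|\d+(?:\.\d+)?|"[^"\n]*"|\S')


def abstract_tokens(source: str) -> str:
    """Replace identifiers and literals with $p; keep keywords and punctuation."""
    out = []
    for tok in _TOKEN.findall(source):
        first = tok[0]
        if tok in KEYWORDS:
            out.append(tok)
        elif first.isalnum() or first == "_" or first == '"':
            out.append("$p")
        else:
            out.append(tok)
    return "".join(out)


if __name__ == "__main__":
    original = "for (i = 0; i != n; ++i) { total = total + i; }"
    renamed = "for (k = 1; k != limit; ++k) { acc = acc + k; }"
    print(abstract_tokens(original))
    print(abstract_tokens(original) == abstract_tokens(renamed))
```

Because every identifier and literal collapses to the same placeholder, two fragments that differ only in naming produce identical token strings.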
$p$p($p$p&$p){$p$p=$p;$p$p=$p.$p();for(; $p!=$p. $p();++$p){$p<<$p<<$p<<*$p<<$p;++$p;}}
$p$p($p$p&$p){$p$p=$p;$p$p=$p.$p();for(; $p!=$p. $p();++$p){$p $p $p<<$p;++$p;}}
Levenshtein Distance
◦ Minimum number of edits needed to transform one string into the other
    Insertion
    Deletion
    Substitution
Customer Meeting on 1/27/11
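The Levenshtein distance described above can be computed with the standard dynamic-programming recurrence. The Python sketch below compares strings character by character; whether the project compares characters or whole tokens is not stated on the slides, so that choice is an assumption, as is the function name `levenshtein`.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop to save memory
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
    print(levenshtein("$p<<$p<<$p;", "$p<<$p;"))
```

A small distance relative to the length of the transformed strings suggests the two fragments are candidates for a Type 3 clone pair; any threshold for that decision would have to be chosen separately.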
We met with the customer on 1/27/11. We discussed the following:
◦ Updated release plan
    We moved up the project management user story.
◦ Current user stories (in design/development) – 5
    Source Code Load & Translate
    Source Code Analyze
    Code Clone Highlights
    Auto-Navigate
    Project Management
We discussed the following (cont.):
◦ Use cases – 8
    Load New Source Code Project
    Parse Source Code
    Translate to CodeDOM
    Associate Source Code
    Load Project
    Save & Delete Project
    Name & Rename Project
    View & Edit Project Properties
We discussed the following (cont.):
◦ UML Models – 6
    We had to carry over some UML models created last semester into our current UML model collection.
◦ Sketches – 4
    Created sketches for the project management user story.
◦ Functional Tests – 3
    Created functional tests for the project management user story.
Sketches / Demo
UI Development Complete
Partially complete
UI Development Complete
Under Construction
Application Menu Options
◦ Should be added to Frame Layout design
Property Pages
◦ Project
◦ File Group?
◦ Source Files
Probable Changes in next iteration
◦ XP Process and Single/Multiple Projects
White Box and Black Box Testing
White Box Testing:
◦ Unit Testing
Black Box Testing:
◦ Production Rule Testing
    Allows us to test the robustness of our engine because we can force rule production errors.
    Regression Testing
    Automated
◦ Functional Testing
Current Test Count: 33
Added tests to cover existing code
All tests are passing…
◦ “Happy Path Tests”
◦ Will begin off-nominal tests
Where we currently stand
Where we stand…
Source Code Load & Translate -
◦ C++ -
◦ C# -
◦ Java -
◦ Associate -
Source Code Analyze -
◦ Dr. Kraft’s tool -
◦ Type 1 clones –
◦ Type 2 clones –
◦ Type 3 clones –
Where we stand… (cont.)
Project Management –
◦ Remove “demo” GUI – 100%
◦ Sketches for visual design – 40%
◦ GUI Rework –
Testing -
◦ Baseline unit tests – 100%
◦ Update unit tests for this iteration -
As of Feb 3, 2011 SLOC:
◦ CS666_Client = 2137 lines
◦ CS666_Core = 2695 lines
◦ CS666_Console = 138 lines
◦ CS666_CppParser = 155 lines
◦ CS666_CsParser = 3265 lines
◦ CS666_JavaParser = 3388 lines
◦ CS666_LanguageSupport = 84 lines
◦ CS666_UnitTests = 944 lines
Total = 12806 lines (sum of the figures above, including unit tests)
- Used lcounter.exe to count SLOC
Path Forward for the next iteration
Schedule
Next Iteration
Below is a list of the tasks for our next iteration:
◦ Parsing/CodeDOM
    C++ parsing
    Complete Java conversion to CodeDOM
◦ Clone Analysis
    Detecting Type 1 clones
◦ GUI
    Project management
    Displaying source code
    Sketches for visual design
Next Iteration (cont.)
◦ Documentation
    User Stories, Use Cases, UML Models, Sketches
        Project management
        Displaying source code
        Displaying CodeDOM
        Displaying Type 1 clones detected
    Functional Tests
    Update schedule
◦ Testing
    Unit tests
    Execute functional tests