Semi-Automated Software Restructuring


1 Semi-Automated Software Restructuring
By Santosh K Singh Kesar
Advisor: Dr. James Fawcett
Master's Thesis
Dept. of Electrical Engineering and Computer Science, Syracuse University
October 8, 2008

2 Long-Term Research Goals
Attempt to answer the question: Is it possible to reliably improve the structure of large, complex software? If so, can that be automated? If so, find appropriate means to implement a process for such improvement.

3 Specific Goals of this Research
Find ways to reduce the size of large functions and methods by turning them into a composition of smaller functions and methods with the same behavior. Automate that process. Evaluate the results.

4 Software Restructuring
Extracting new functions and methods from existing source code functions and methods.
Semi-Automated Source Code Restructuring
  Maintains the same external behavior of the restructured source code.
  New files of restructured source code are written in a different location from the original source code.

5 Restructuring vs. Refactoring
Both code restructuring and refactoring are concerned with improving logical structure.
Refactoring is a largely manual process with broader scope.
Restructuring is automatic, but user-guided.
Refactoring has traditionally been applied to managed source code:
  Java in Eclipse
  C# in Visual Studio
Our restructuring works with native languages: C and C++.

6 Is Badly Structured Code Likely?
Is there a need for the results of this research? Do experienced researchers and professional developers often create badly structured code?

7 Imaging Research Code
File                     Function Name             Number of lines
Weights_calculation.cpp  WeightTwoQuadrantsFactor  280
                         FactorsTwoRays            206
                         AreaWeightFactor          223
                         W_Calculate               377
                         Main                      191
                         WeightBottomFactor        164
Mlr800fs.c               Emsid2_new                724
                         (name missing)            851
                         Ect                       3608
                         emsid3_new                749
                         emsid4                    516

8 GKGFX Library, Mozilla 1.4.1
Figure legend: the smallest disk is a file; lines are dependencies. A number indicates the size of a strong component, in this case 60 mutually dependent files.

9 Restructuring Process
Analysis
  Find feasible regions for function extraction
Selection
  Select from feasible regions code segments that require few parameters to be passed as function arguments
Code generation

10 Analysis
Lexical analysis
  Tokenize input stream
  Group into analysis sequences
Parsing
  Recognize key grammatical elements
  Store for later use
  Deeper analysis of functions

11 Lexical Analysis
Tokenize
  Remove comments
  Eliminate whitespace
  Recognize key punctuators
Form semi-expressions
  Sequences of tokens appropriate for parsing
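
The two lexical steps on this slide can be sketched roughly as follows. This is a minimal illustration, not the thesis's Tokenizer or Semi-Expressions modules: the names `tokenize` and `semiExpressions` are hypothetical, string literals and multi-character operators are not handled, and the rule that a semi-expression ends at `;`, `{`, or `}` is an assumption made for the sketch.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical sketch: split C/C++ source into tokens, dropping whitespace
// and comments and treating each punctuator as its own one-character token.
std::vector<std::string> tokenize(const std::string& src)
{
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < src.size())
    {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        // Skip "// ..." comments up to the end of the line.
        if (c == '/' && i + 1 < src.size() && src[i + 1] == '/')
        {
            while (i < src.size() && src[i] != '\n') ++i;
            continue;
        }
        // Skip "/* ... */" comments.
        if (c == '/' && i + 1 < src.size() && src[i + 1] == '*')
        {
            std::size_t end = src.find("*/", i + 2);
            i = (end == std::string::npos) ? src.size() : end + 2;
            continue;
        }
        // Identifiers, keywords, and numbers each form one token.
        if (std::isalnum(c) || c == '_')
        {
            std::size_t start = i;
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_'))
                ++i;
            tokens.push_back(src.substr(start, i - start));
            continue;
        }
        // Anything else is a single-character punctuator.
        tokens.push_back(std::string(1, src[i]));
        ++i;
    }
    return tokens;
}

// Group tokens into semi-expressions.
// Assumption: a semi-expression ends at ';', '{', or '}'.
std::vector<std::vector<std::string>> semiExpressions(
    const std::vector<std::string>& tokens)
{
    std::vector<std::vector<std::string>> result;
    std::vector<std::string> current;
    for (const std::string& tok : tokens)
    {
        current.push_back(tok);
        if (tok == ";" || tok == "{" || tok == "}")
        {
            result.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) result.push_back(current);
    return result;
}
```

Feeding the output of `tokenize` into `semiExpressions` yields token sequences that each end at a statement or scope boundary, which is a convenient unit for a parser that only recognizes a small subset of the grammar.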

12 Our Lexical Analysis Tools
Tokenizer Sample output from Tokenizer Module

13 Our Lexical Analysis Tools
Semi-Expressions Sample output from Semi-Expressions Module

14 Parsing
Recognize key grammatical elements
  A very small subset of language grammar
    Function definitions
    Method definitions
    Data declarations
    Data manipulations
Build parse tree
  Use tree elements to support code generation

15 Top Level Structure of Parse Tree

16 Different types of Nodes in Parse Tree
RootObj, FunObj, ClassObj, ScopeObj, DataObject

17 Building of First Three Levels
Building Parse Tree
Level 0: Root
Level 1: Union 1, Global Function 1, Class 1, Global data 1, Global Function 2
Level 2: Member Function 1, Member data 1
Building of First Three Levels
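
The three levels on this slide can be sketched as a small tree structure. This is only an illustration under stated assumptions: the single `Node` struct, the `NodeKind` enum, and the functions `levels` and `buildFirstThreeLevels` are hypothetical and do not reproduce the thesis's RootObj, FunObj, ClassObj, ScopeObj, or DataObject classes. Attaching the member nodes to Class 1 is assumed, and Union 1 is left out because the slides list no union node type.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node kinds, loosely mirroring RootObj, FunObj, ClassObj,
// ScopeObj, and DataObject from the slides.
enum class NodeKind { Root, Function, Class, Scope, Data };

struct Node
{
    NodeKind kind;
    std::string name;
    std::vector<std::unique_ptr<Node>> children;

    Node(NodeKind k, std::string n) : kind(k), name(std::move(n)) {}

    // Append a child and return a pointer to it so callers can nest further.
    Node* add(NodeKind k, const std::string& n)
    {
        children.push_back(std::make_unique<Node>(k, n));
        return children.back().get();
    }
};

// Number of levels in the tree rooted at node (a lone root has one level).
int levels(const Node& node)
{
    int deepest = 0;
    for (const auto& child : node.children)
    {
        int d = levels(*child);
        if (d > deepest) deepest = d;
    }
    return 1 + deepest;
}

// Build the levels shown on the slide (Union 1 omitted, see lead-in).
std::unique_ptr<Node> buildFirstThreeLevels()
{
    auto root = std::make_unique<Node>(NodeKind::Root, "Root");  // Level 0
    root->add(NodeKind::Function, "Global Function 1");          // Level 1
    Node* cls = root->add(NodeKind::Class, "Class 1");
    root->add(NodeKind::Data, "Global data 1");
    root->add(NodeKind::Function, "Global Function 2");
    cls->add(NodeKind::Function, "Member Function 1");           // Level 2
    cls->add(NodeKind::Data, "Member data 1");
    return root;
}
```

Owning children through `std::unique_ptr` keeps the containment relationship explicit: destroying the root releases the whole tree.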

18 Containment Diagram of Parse Tree
Diagram nodes: Root; Global Function 1; Class 1; Global Function 2; Try; Catch; Member Function 1; collection of local DataObjects of member; collection of local DataObjects of scope.
Top Level Containment diagram of Parse Tree

19 Criterion #1 – Line Numbers
Example:
    void source_code(int param)
    {
        int _value = param;
        std::string str = "test";
        ...... // Source code removed for brevity
        try
        {
            param++;
            if(param>5)
                param--;
        }
        catch(std::exception& ex)
        {
            std::cout<<"Exception!";
            exit(1);
        }
    }
If the source code in this section fits within the maximum line count, it satisfies criterion #1 and is thus identified as a candidate 'feasible region'.
Criterion #1: the feasible region's maximum number of lines (a command-line argument).
Line number criterion for Feasible Regions
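
The line-count check described here amounts to a simple comparison. The sketch below assumes a region is identified by inclusive first and last line numbers; `Region`, `lineCount`, `satisfiesLineCriterion`, and `feasibleByLines` are hypothetical names, not taken from the thesis code.

```cpp
#include <vector>

// Hypothetical model of a candidate region: inclusive source line numbers.
struct Region
{
    int firstLine;
    int lastLine;
};

// Number of source lines the region spans.
int lineCount(const Region& r)
{
    return r.lastLine - r.firstLine + 1;
}

// Criterion #1: the region is non-empty and no longer than maxLines,
// where maxLines would come from the command line.
bool satisfiesLineCriterion(const Region& r, int maxLines)
{
    int n = lineCount(r);
    return n >= 1 && n <= maxLines;
}

// Keep only the candidates that pass criterion #1.
std::vector<Region> feasibleByLines(const std::vector<Region>& candidates,
                                    int maxLines)
{
    std::vector<Region> kept;
    for (const Region& r : candidates)
        if (satisfiesLineCriterion(r, maxLines))
            kept.push_back(r);
    return kept;
}
```

With a maximum of 20 lines, for instance, a region spanning lines 30 to 37 would pass, while one spanning lines 10 to 40 would not.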

20 Top down approach
    34  int _value = param;
    35  std::string str = "test";
    36  str = str + " string";
    37  param = ++_value;
Constant bottom pointer; moving top pointer downwards.
Top down approach for determining parameters
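
One way to picture the top-down approach is sketched below. It assumes each source line has already been reduced to the sets of variables it declares and uses; the `Line` struct and the functions `requiredParameters`, `topDownSearch`, and `slideExample` are hypothetical. The sketch ignores variables that are declared inside the region and needed after it, so it is not a complete account of the analysis in the thesis.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical summary of one source line.
struct Line
{
    std::set<std::string> declares;  // variables declared on this line
    std::set<std::string> uses;      // variables read or written on this line
};

// Variables a region [top, bottom] would need as parameters: those used
// inside the region without an earlier declaration inside the region.
std::set<std::string> requiredParameters(const std::vector<Line>& lines,
                                         std::size_t top, std::size_t bottom)
{
    std::set<std::string> declared, params;
    for (std::size_t i = top; i <= bottom && i < lines.size(); ++i)
    {
        for (const auto& v : lines[i].uses)
            if (declared.count(v) == 0) params.insert(v);
        for (const auto& v : lines[i].declares)
            declared.insert(v);
    }
    return params;
}

// Keep bottom fixed and move top downwards until the region needs at most
// maxParams parameters. Returns the chosen top index, or -1 if none fits.
int topDownSearch(const std::vector<Line>& lines, std::size_t start,
                  std::size_t bottom, std::size_t maxParams)
{
    for (std::size_t top = start; top <= bottom; ++top)
        if (requiredParameters(lines, top, bottom).size() <= maxParams)
            return static_cast<int>(top);
    return -1;
}

// The four lines from the slide (34 to 37), indexed 0 to 3.
std::vector<Line> slideExample()
{
    std::vector<Line> lines(4);
    lines[0].declares.insert("_value");  // int _value = param;
    lines[0].uses.insert("param");
    lines[1].declares.insert("str");     // std::string str = "test";
    lines[2].uses.insert("str");         // str = str + " string";
    lines[3].uses.insert("param");       // param = ++_value;
    lines[3].uses.insert("_value");
    return lines;
}
```

For the slide's four lines, the full region needs only `param`, whereas starting one line later would also require `_value`, which illustrates why the position of the top pointer affects the parameter count.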

21 Class Diagram of Parsers
Class Diagram of Parsers using Utility Class

22 Class Diagram of ICRNode
Class Diagram of ICRNode Interface

23 Class Diagram of RootObj

24 Association of DataObjects
Class Relationship diagram of Parse tree Objects

25 Representing Node Types
Class Diagram of Different Node Types

26 Class Diagram of DataObject

27 Hypothetical view of Hierarchy Stack
Stack entries: Root, class, Function, try, catch, for, if.
Top of stack: represents the current scope (stack top pointer).
Hypothetical view of Hierarchy Stack
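
The idea of a stack whose top entry identifies the current scope can be sketched as below. The class name `HierarchyStack` and its member functions are hypothetical and only illustrate the concept; the thesis implementation may differ.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical scope stack: the top entry names the scope being parsed.
class HierarchyStack
{
public:
    // The stack always starts with the Root scope.
    HierarchyStack() { scopes_.push_back("Root"); }

    // Enter a nested scope, for example on an opening brace.
    void enter(const std::string& scope) { scopes_.push_back(scope); }

    // Leave the current scope; the Root entry is never popped.
    void leave()
    {
        if (scopes_.size() > 1) scopes_.pop_back();
    }

    // Name of the current (innermost) scope.
    const std::string& current() const { return scopes_.back(); }

    // Number of scopes on the stack, including Root.
    std::size_t depth() const { return scopes_.size(); }

private:
    std::vector<std::string> scopes_;
};
```

A parser built this way would call `enter` when a scope opens and `leave` when it closes, so `current` answers which scope a declaration or statement belongs to.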

28 Class Diagram of TempContainer
Figure 3.16 – Class Diagram of TempContainer

29 Class Diagram – feasibleRegions and newFunctions
Class Relationship diagram of feasibleRegions and newFunctions

30 Class Diagram – FunctionParser and fileManager
Class Relationship diagram of FunctionParser and fileManager

31 Restructuring in multiple passes
Original length of 'testFun' function: 120

        Maximum number   Maximum number   Number of lines in
        of parameters    of lines         host method: testFun
Pass 1  3                20               105
Pass 2                                    97
Pass 3                                    92
Pass 4                                    79

32 Restructuring Functions
Original Source code
    void setRootValues()
    {
        try
        {
            std::string inFile = getInputFile();
            Directory dir;
            Scanner scanr;
            scanr.doRecursiveScan(inFile);
            dir.RestoreFirstDirectory();
            if(dir.dirContainIncludes())
                scanr.setFileIncludes();
            std::vector<std::string> _files = getCompleteFiles();
            if(_files.size() > 0)
            {
                RootObj* root = new RootObj();
                std::string _type = root->_typename();
                if(_type == "")
                    _type = "pRoot";
                root->displayRootStats();
            }
        }
        catch(std::exception& ex)
        {
            std::cout<< ex.what() <<std::endl;
        }
    }

33 Restructured Code
Extracted Function
    void setRootValues_1()
    {
        std::string inFile = getInputFile();
        Directory dir;
        Scanner scanr;
        scanr.doRecursiveScan(inFile);
        dir.RestoreFirstDirectory();
        if(dir.dirContainIncludes())
            scanr.setFileIncludes();
    }
Host function after extraction
    void setRootValues()
    {
        try
        {
            setRootValues_1();
            std::vector<std::string> _files = getCompleteFiles();
            if(_files.size() > 0)
            {
                RootObj* root = new RootObj();
                std::string _type = root->_typename();
                if(_type == "")
                    _type = "pRoot";
                root->displayRootStats();
            }
        }
        catch(std::exception& ex)
        {
            std::cout<< ex.what() <<std::endl;
        }
    }

34 Contributions
Semi-Automated Software Restructuring
  Type analysis parser for host language
  Representing source code structure as a parse tree
  Identification of feasible regions
  Demonstration with working code
Future Work:
  Further optimization can be achieved.
  Semantic cues may help make sensible functions.
  Other things to think about, like extracting objects.

35 Changes to Thesis document
Removed references to SMIRG.
Re-formatted to match university regulations.

36 Demonstration
Simple code that shows:
  Parsing functions and methods
  Manages header and implementation files correctly

37 End of Presentation
Questions?

