1
Semi-Automated Software Restructuring
By Santosh K Singh Kesar
Advisor: Dr. James Fawcett
Master’s Thesis
Dept. of Electrical Engineering and Computer Science, Syracuse University
October 8, 2008
2
Long-Term Research Goals
Attempt to answer the question:
  Is it possible to reliably improve the structure of large, complex software?
  If so, can that be automated?
  If so, find appropriate means to implement a process for such improvement.
3
Specific Goals of this Research
Find ways to reduce the size of large functions and methods by turning them into a composition of smaller functions and methods with the same behavior.
Automate that process.
Evaluate the results.
4
Software Restructuring
Extracting new functions and methods from existing source code functions and methods.
Semi-Automated Source Code Restructuring
  Maintains the same external behavior in the restructured source code.
  New files of restructured source code are written to a different location from the original source code.
5
Restructuring vs. Refactoring
Both code restructuring and refactoring are concerned with improving logical structure.
Refactoring is a largely manual process with broader scope.
Restructuring is automatic, but user-guided.
Refactoring has traditionally been applied to managed source code:
  Java in Eclipse
  C# in Visual Studio
Our restructuring works with native languages: C and C++
6
Is Badly Structured Code Likely?
Is there a need for the results of this research? Do experienced researchers and professional developers often create badly structured code?
7
Imaging Research Code
File                     Function Name             Number of lines
Weights_calculation.cpp  WeightTwoQuadrantsFactor  280
                         FactorsTwoRays            206
                         AreaWeightFactor          223
                         W_Calculate               377
                         Main                      191
                         WeightBottomFactor        164
Mlr800fs.c               Emsid2_new                724
                         (name not recoverable)    851
                         Ect                       3608
                         emsid3_new                749
                         emsid4                    516
8
GKGFX Library, Mozilla 1.4.1
[Figure: file dependency graph. The smallest disk is a file; lines are dependencies. A number indicates the size of a strong component, in this case 60 mutually dependent files.]
9
Restructuring Process
Analysis
  Find feasible regions for function extraction
Selection
  Select from feasible regions code segments that require few parameters to be passed as function arguments
Code generation
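The selection step above can be sketched as a filter over candidate regions. This is only an illustration under stated assumptions: the `Region` struct, its fields, the `maxParams` limit, and the function `selectRegions` are hypothetical names, not the thesis code, and ordering the survivors by fewest parameters is an assumed preference.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of one feasible region found during analysis.
struct Region {
    std::size_t firstLine;                // first source line of the region
    std::size_t lastLine;                 // last source line of the region
    std::vector<std::string> parameters;  // names that must be passed as arguments
};

// Keep the regions that need at most maxParams arguments, and list the
// regions needing the fewest arguments first (assumed preference).
std::vector<Region> selectRegions(std::vector<Region> candidates,
                                  std::size_t maxParams) {
    candidates.erase(
        std::remove_if(candidates.begin(), candidates.end(),
                       [maxParams](const Region& r) {
                           return r.parameters.size() > maxParams;
                       }),
        candidates.end());
    std::stable_sort(candidates.begin(), candidates.end(),
                     [](const Region& a, const Region& b) {
                         return a.parameters.size() < b.parameters.size();
                     });
    return candidates;
}
```

Passing the candidates by value keeps the caller's list untouched, so the same analysis result could be filtered again with a different limit.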
10
Analysis
Lexical analysis
  Tokenize input stream
  Group into analysis sequences
Parsing
  Recognize key grammatical elements
  Store for later use
Deeper analysis of functions
11
Lexical Analysis
Tokenize
  Remove comments
  Eliminate whitespace
  Recognize key punctuators
Form semi-expressions
  Sequences of tokens appropriate for parsing
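The two lexical steps above might look roughly like the following. It is a simplified sketch, not the presentation's Tokenizer or Semi-Expressions modules: the function names are hypothetical, string and character literals are not handled, every punctuator is a single character, and ending a semi-expression at `;`, `{` or `}` is an assumption about what makes a sequence "appropriate for parsing".

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Split source text into tokens: runs of letters, digits and '_', plus
// single-character punctuators. Whitespace and comments are discarded.
// Simplification: string and character literals are not handled.
std::vector<std::string> tokenize(const std::string& src) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        const unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {
            ++i;                                            // eliminate whitespace
        } else if (src.compare(i, 2, "//") == 0) {
            while (i < src.size() && src[i] != '\n') ++i;   // line comment
        } else if (src.compare(i, 2, "/*") == 0) {
            const std::size_t end = src.find("*/", i + 2);  // block comment
            i = (end == std::string::npos) ? src.size() : end + 2;
        } else if (std::isalnum(c) || c == '_') {
            std::size_t j = i;
            while (j < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[j])) ||
                    src[j] == '_')) {
                ++j;
            }
            tokens.push_back(src.substr(i, j - i));
            i = j;
        } else {
            tokens.push_back(std::string(1, src[i]));       // punctuator
            ++i;
        }
    }
    return tokens;
}

// Group tokens into semi-expressions. Assumption: a group ends at ';',
// '{' or '}'. Trailing tokens without a terminator form a final group.
std::vector<std::vector<std::string>> semiExpressions(
    const std::vector<std::string>& tokens) {
    std::vector<std::vector<std::string>> groups;
    std::vector<std::string> current;
    for (const std::string& t : tokens) {
        current.push_back(t);
        if (t == ";" || t == "{" || t == "}") {
            groups.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) groups.push_back(current);
    return groups;
}
```

Feeding the output of `tokenize` into `semiExpressions` yields one token group per statement or scope boundary, which is the unit a parser would then inspect.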
12
Our Lexical Analysis Tools
Tokenizer Sample output from Tokenizer Module
13
Our Lexical Analysis Tools
Semi-Expressions Sample output from Semi-Expressions Module
14
Parsing
Recognize key grammatical elements
  A very small subset of language grammar
    Function definitions
    Method definitions
    Data declarations
    Data manipulations
Build parse tree
Use tree elements to support code generation
15
Top Level Structure of Parse Tree
16
Different types of Nodes in Parse Tree
RootObj, FunObj, ClassObj, ScopeObj, DataObject
17
Building of First Three Levels
[Figure: Building Parse Tree. Level 0: Root. Level 1: Union 1, Global Function 1, Class 1, Global data 1, Global Function 2. Level 2: Member Function 1, Member data 1.]
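A tree with these levels could be represented along the following lines. This is a hedged sketch rather than the thesis design: `Node`, `NodeKind`, `add` and `countKind` are hypothetical names that only loosely mirror the node types named in the slides, and the actual class hierarchy is not reproduced here.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Node kinds loosely mirroring the node types named in the slides.
enum class NodeKind { Root, Function, Class, Scope, Data };

// Hypothetical tree node that owns its children.
struct Node {
    NodeKind kind;
    std::string name;
    std::vector<std::unique_ptr<Node>> children;

    Node(NodeKind k, std::string n) : kind(k), name(std::move(n)) {}

    // Append a child and return a non-owning pointer to it.
    Node* add(NodeKind k, const std::string& n) {
        children.push_back(std::make_unique<Node>(k, n));
        return children.back().get();
    }
};

// Count the nodes of a given kind in the subtree rooted at `node`.
std::size_t countKind(const Node& node, NodeKind kind) {
    std::size_t total = (node.kind == kind) ? 1 : 0;
    for (const auto& child : node.children) {
        total += countKind(*child, kind);
    }
    return total;
}
```

Global functions, classes and global data would be added directly under the root, and member functions and member data under their class node, which gives the first three levels.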
18
Containment Diagram of Parse Tree
[Figure: top-level containment diagram of the parse tree. Labels: Root; Global Function 1; Class 1; Global Function 2; Try; Catch; Member Function 1; collection of local DataObjects of member; collection of local DataObjects of scope.]
19
Criterion #1 – Line Numbers
Example:
    void source_code(int param)
    {
        int _value = param;
        std::string str = "test";
        ...... // Source code removed for brevity
        try
        {
            param++;
            if(param>5)
                param--;
        }
        catch(std::exception& ex)
        {
            std::cout<<"Exception!";
            exit(1);
        }
    }
If the source code in this section falls within the maximum line count, it satisfies criterion #1 and is thus identified as a candidate ‘feasible region’.
Criterion #1: the feasible region’s maximum number of lines (a command-line argument)
Line number criterion for feasible regions
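One way to apply the line-count criterion is to match braces and keep the scopes that are short enough. The sketch below assumes that a candidate region is a brace-delimited scope and that its span is counted from the line with `{` to the line with `}`; neither detail is stated in the slides. `ScopeRange` and `scopesWithinLimit` are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// A brace-delimited scope, identified by 1-based line numbers.
struct ScopeRange {
    std::size_t openLine;   // line holding the opening '{'
    std::size_t closeLine;  // line holding the matching '}'
};

// Report every scope whose span (open line to close line, inclusive) is
// at most maxLines. Scopes are reported in the order they are closed.
// Simplification: braces inside comments or string literals are counted.
std::vector<ScopeRange> scopesWithinLimit(const std::vector<std::string>& lines,
                                          std::size_t maxLines) {
    std::vector<ScopeRange> result;
    std::vector<std::size_t> openStack;  // line numbers of unmatched '{'
    for (std::size_t i = 0; i < lines.size(); ++i) {
        const std::size_t lineNo = i + 1;
        for (char c : lines[i]) {
            if (c == '{') {
                openStack.push_back(lineNo);
            } else if (c == '}' && !openStack.empty()) {
                const std::size_t open = openStack.back();
                openStack.pop_back();
                if (lineNo - open + 1 <= maxLines) {
                    result.push_back(ScopeRange{open, lineNo});
                }
            }
        }
    }
    return result;
}
```

With a limit of 5 lines, a function laid out like the example would yield its try and catch scopes as candidates, while the enclosing function body would be rejected as too long.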
20
Top down approach
    int _value = param;
    std::string str = "test";
    str = str + " string";
    param = ++_value;
    …
[Figure annotations: source lines 34 to 37; constant bottom pointer; top pointer moving downwards]
Top down approach for determining parameters
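The slide names only a constant bottom pointer and a top pointer that moves downwards. A possible reading is sketched below, with several assumptions: each statement is summarized by the names it declares and uses, a name used in the region before being declared there counts as a parameter, and the search stops at the first top position that stays within a parameter limit. `Statement`, `requiredParameters` and `findTop` are hypothetical names, and values that must flow back out of the region are ignored.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical summary of one statement.
struct Statement {
    std::set<std::string> declares;  // names introduced by the statement
    std::set<std::string> uses;      // names read or written by it
};

// Names used in statements [top, bottom] that are not declared earlier
// inside that same range; these would have to be passed as arguments.
std::set<std::string> requiredParameters(const std::vector<Statement>& body,
                                         std::size_t top, std::size_t bottom) {
    std::set<std::string> declared;
    std::set<std::string> params;
    for (std::size_t i = top; i <= bottom && i < body.size(); ++i) {
        for (const std::string& name : body[i].uses) {
            if (declared.count(name) == 0) params.insert(name);
        }
        declared.insert(body[i].declares.begin(), body[i].declares.end());
    }
    return params;
}

// Keep `bottom` fixed and move `top` downwards until the region needs at
// most maxParams parameters. Returns the chosen top index, or bottom + 1
// when no start position satisfies the limit.
std::size_t findTop(const std::vector<Statement>& body, std::size_t top,
                    std::size_t bottom, std::size_t maxParams) {
    for (std::size_t t = top; t <= bottom; ++t) {
        if (requiredParameters(body, t, bottom).size() <= maxParams) return t;
    }
    return bottom + 1;
}
```

For the four statements shown, a region covering all of them needs only `param`, whereas a region starting at the third statement needs `str`, `param` and `_value`, because their declarations fall outside it.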
21
Class Diagram of Parsers
Class Diagram of Parsers using Utility Class
22
Class Diagram of ICRNode
Class Diagram of ICRNode Interface
23
Class Diagram of RootObj
24
Association of DataObjects
Class Relationship diagram of Parse tree Objects
25
Representing Node Types
Class Diagram of Different Node Types
26
Class Diagram of DataObject
27
Hypothetical view of Hierarchy Stack
[Figure: hypothetical view of the hierarchy stack. Stack entries: Root, class, Function, try, catch, for, if. The stack top pointer marks the top of the stack, which represents the current scope.]
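A stack of this kind can be sketched in a few lines. The class below is an illustration only: `ScopeStack` and its member functions are hypothetical names, and keeping the root entry permanently at the bottom is an assumption.

```cpp
#include <string>
#include <vector>

// Hypothetical scope stack: the top entry names the scope being parsed.
class ScopeStack {
public:
    ScopeStack() : scopes_{"Root"} {}

    // Called when the parser enters a new scope (class, function, try, ...).
    void enter(const std::string& scopeName) { scopes_.push_back(scopeName); }

    // Called when a scope closes; the root entry is never popped.
    void leave() {
        if (scopes_.size() > 1) scopes_.pop_back();
    }

    // The current scope is whatever sits on top of the stack.
    const std::string& current() const { return scopes_.back(); }

    // Path from the root to the current scope, separated by '/'.
    std::string path() const {
        std::string result;
        for (const std::string& s : scopes_) {
            if (!result.empty()) result += '/';
            result += s;
        }
        return result;
    }

private:
    std::vector<std::string> scopes_;
};
```

A parser would call `enter` on each opening scope and `leave` on each closing brace, so `current()` always answers the question of where a newly seen declaration belongs.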
28
Class Diagram of TempContainer
Figure 3.16 – Class Diagram of TempContainer
29
Class Diagram – feasibleRegions and newFunctions
Class Relationship diagram of feasibleRegions and newFunctions
30
Class Diagram – FunctionParser and fileManager
Class Relationship diagram of FunctionParser and fileManager
31
Restructuring in multiple passes
Original length of ‘testFun’ function: 120
        Maximum number of parameters  Maximum number of lines  Number of lines in host method: testFun
Pass 1  3                             20                       105
Pass 2                                                         97
Pass 3                                                         92
Pass 4                                                         79
32
Restructuring Functions
Original Source code
    void setRootValues()
    {
        try
        {
            std::string inFile = getInputFile();
            Directory dir;
            Scanner scanr;
            scanr.doRecursiveScan(inFile);
            dir.RestoreFirstDirectory();
            if(dir.dirContainIncludes())
                scanr.setFileIncludes();
            std::vector<std::string> _files = getCompleteFiles();
            if(_files.size() > 0)
            {
                RootObj* root = new RootObj();
                std::string _type = root->_typename();
                if(_type == "")
                    _type = "pRoot";
                root->displayRootStats();
            }
        }
        catch(std::exception& ex)
        {
            std::cout<< ex.what() <<std::endl;
        }
    }
33
Restructured Code
Extracted Function
    void setRootValues_1()
    {
        std::string inFile = getInputFile();
        Directory dir;
        Scanner scanr;
        scanr.doRecursiveScan(inFile);
        dir.RestoreFirstDirectory();
        if(dir.dirContainIncludes())
            scanr.setFileIncludes();
    }

    void setRootValues()
    {
        try
        {
            setRootValues_1();
            std::vector<std::string> _files = getCompleteFiles();
            if(_files.size() > 0)
            {
                RootObj* root = new RootObj();
                std::string _type = root->_typename();
                if(_type == "")
                    _type = "pRoot";
                root->displayRootStats();
            }
        }
        catch(std::exception& ex)
        {
            std::cout<< ex.what() <<std::endl;
        }
    }
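The code generation step for a case like this one, where the extracted region needs no parameters and returns nothing, might be sketched as follows. The `Extraction` struct and `extractRegion` function are hypothetical; only the `hostName_N` naming pattern is taken from the example, and tying the suffix to a pass number is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Result of one extraction: the new function's text and the host body
// with the extracted statements replaced by a call.
struct Extraction {
    std::string newFunction;
    std::vector<std::string> hostBody;
};

// Move statements [first, last] of `body` into a new parameterless void
// function named hostName_<passNumber>. Handles only regions that need
// no arguments and produce no return value.
Extraction extractRegion(const std::string& hostName,
                         const std::vector<std::string>& body,
                         std::size_t first, std::size_t last, int passNumber) {
    assert(first <= last && last < body.size());
    const std::string newName = hostName + "_" + std::to_string(passNumber);
    const auto firstIt = body.begin() + static_cast<std::ptrdiff_t>(first);
    const auto afterIt = body.begin() + static_cast<std::ptrdiff_t>(last + 1);

    Extraction out;
    out.newFunction = "void " + newName + "()\n{\n";
    for (auto it = firstIt; it != afterIt; ++it) {
        out.newFunction += "    " + *it + "\n";
    }
    out.newFunction += "}\n";

    out.hostBody.assign(body.begin(), firstIt);     // statements before the region
    out.hostBody.push_back(newName + "();");        // call replacing the region
    out.hostBody.insert(out.hostBody.end(), afterIt, body.end());
    return out;
}
```

Returning both pieces as text fits the approach of writing restructured files to a separate location, since neither the original body nor the original file is modified.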
34
Contributions
Semi-Automated Software Restructuring
  Type Analysis parser for host language
  Representing source code structure as parse tree
  Identification of feasible regions
  Demonstration with working code
Future Work:
  Further optimization can be achieved.
  Semantic cues may help make sensible functions.
  Other things to think about, like extracting objects.
35
Changes to Thesis document
Removed references to SMIRG
Re-formatted to match university regulations.
36
Demonstration
Simple code that shows:
  Parsing functions and methods
  Manages header and implementation files correctly.
37
End of Presentation
Questions?