Published by Posy Shaw. Modified over 9 years ago.
1 ECE 453 – CS 447 – SE 465: Software Testing & Quality Assurance
Instructor: Kostas Kontogiannis
2 Overview
– Software Reverse Engineering
– Definitions
– Program Understanding
– Plan Recognition
3 Reverse Engineering
The term “reverse engineering” is derived from hardware development:
– in hardware, it is the process of discovering how a competitor’s system works.
– in software engineering, it is the process of discovering how your own system works.
Software systems become difficult to understand and maintain because their size and complexity grow continuously over time.
4 Reverse Engineering
Reverse engineering is usually applied to large legacy systems:
– to make them easier to understand and maintain
– to increase their potential for continued evolution.
Often, the most fundamental reason for reverse engineering is structural re-documentation: the structure of the system is derived, and some of the design architecture is recaptured.
5 Reverse Engineering Terminology
Design recovery
– is a subset of reverse engineering in which domain knowledge, external information, and deduction or fuzzy reasoning are added to the observations of the subject system to identify meaningful higher-level abstractions.
– recreates design abstractions from a combination of code, existing design documentation (if available), personal experience, and general knowledge about problem and application domains.
Redocumentation
– is the creation or revision of a semantically equivalent representation within the same relative abstraction level.
– is the simplest and oldest form of reverse engineering, and can be considered an unintrusive, weak form of restructuring.
6 Forward and reverse engineering can be illustrated as:
Forward Engineering: Specifications → Design → Code → Behavior
Reverse Engineering: Behavior → Code → Design → Specifications
7 Reverse Engineering
Two distinct phases:
1. Identify the system’s components and any dependencies among them.
2. A discovery phase, which tends to be highly interactive and may involve:
– constructing the hierarchical subsystem components based on cohesion and coupling principles,
– reconstructing design and requirements specifications to provide a ‘domain model’, and
– matching the model to the code.
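The first phase (components and their dependencies) can be sketched for a Python code base as follows. This is only an illustration under stated assumptions: the three modules in SOURCES are hypothetical, and import statements are taken as the only kind of dependency.

```python
import ast

def module_dependencies(sources):
    """Map each module name to the set of other known modules it imports."""
    deps = {}
    for name, code in sources.items():
        imported = set()
        for node in ast.walk(ast.parse(code)):
            if isinstance(node, ast.Import):
                imported.update(alias.name for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module:
                imported.add(node.module)
        # Keep only dependencies on components inside the subject system.
        deps[name] = {m for m in imported if m in sources and m != name}
    return deps

# Hypothetical three-module system, given as source text.
SOURCES = {
    "billing": "import tax\nfrom ledger import post\n",
    "tax": "import math\n",
    "ledger": "import tax\n",
}
DEPS = module_dependencies(SOURCES)
```

Here DEPS records that billing depends on tax and ledger, and ledger depends on tax; the import of math is dropped because it lies outside the subject system.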
8 Reverse Engineering
Reverse engineering tends to be heavily influenced by the amount of domain knowledge available, which:
– limits the degree of automation that is possible
– limits the level of abstraction obtained.
Uncovering entities allows them to be classified and their shared properties and relationship attributes to be determined.
9 Aggregation can be applied to determine the part-of relationship between a composite and its constituents.
Generalization and specialization allow an element to be related to a more general or a more specific element.
Grouping can be applied to form a set of elements, together with their necessary relationships, into a context.
10 Reverse Engineering
Various activities are performed during reverse engineering:
– gathering (identifying) the software artifacts, usually obtained from: specification/design documents, the code, any related documentation, application knowledge, and syntactic pattern matching to identify program (functional) ‘units’.
11 Reverse Engineering
– creating the repository of information: filter out immaterial information while selecting relevant information.
– constructing the abstraction layers at the structural, functional and application levels.
– semantic and behavioral matching may need to be performed during this process.
12 Reverse Engineering
The software system can be reasoned about in many different views:
– structural view: based on structure charts, call graphs (unit interaction), module and subsystem graphs, various metrics and organizational views [many can be constructed with CASE tools].
– functional view: can usually be obtained from the design, specification and requirements documents.
– behavioral views: conceptual, temporal, process, domain and user-interaction views.
13 Program Understanding
A program is understood:
– when it is possible to explain the program, its structure, its behavior, its effects in its operational context, and its relationships to its operational domain.
For large legacy systems, the program understanding phase is rather difficult:
14 Program Understanding
Human-oriented concepts are generally decoupled from the formal patterns of their algorithms:
– they involve an arbitrary semantic mapping from operations on numbers and data to computational intentions based on domain concepts.
Automatic program analysis is usually quite limited in knowledge acquisition and in the concept matching process.
15 Program Understanding
Some general tool directions include:
– parsing and text analysis
– flow analysis (call graphs, control flow graphs, etc.)
– complexity analysis and anomaly detection (complexity measures, dataflow analysis, etc.)
– program segmentation (different slicing techniques to isolate behavior).
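As one illustration of flow analysis, a small call graph can be extracted with Python's standard ast module. This is a minimal sketch, not a full tool: it only looks at top-level functions and at calls made through a plain name, and the SAMPLE program is made up for the example.

```python
import ast

def call_graph(code):
    """Return {function name: set of names it calls directly}."""
    graph = {}
    for node in ast.parse(code).body:
        if isinstance(node, ast.FunctionDef):
            calls = set()
            for sub in ast.walk(node):
                # Only calls of the form name(...) are recorded.
                if isinstance(sub, ast.Call) and isinstance(sub.func, ast.Name):
                    calls.add(sub.func.id)
            graph[node.name] = calls
    return graph

# Hypothetical subject program.
SAMPLE = """\
def net(amount):
    return amount - tax(amount)

def tax(amount):
    return round(amount * rate())

def rate():
    return 0.13
"""
GRAPH = call_graph(SAMPLE)
```

GRAPH shows that net calls tax, tax calls round and rate, and rate calls nothing. Method calls such as obj.f() are deliberately ignored here to keep the sketch short.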
16 Program Segmentation
Try to isolate areas of the implementation so that program understanding can be constrained to the segments that potentially implement the program behavior under consideration.
Program slicing techniques can be applied at the source code level to isolate or highlight different behavioral properties of the program.
17 Program Segmentation
Condition-based slice: in many cases, programs are structured along conditional tests.
– for example, we could find all program slices for which a condition such as phone_off-hook or phone_ringing is true.
18 Program Segmentation
In an accounting program, the tax paid and the collection method may depend on the province and tax_payable. Thus it may be desirable to locate areas of the code that are reachable under the globally specified condition that province=Ontario and tax=payable.
In this case the user specifies the logical expression and, optionally, a slicing range (where to start and end slicing). All reachable flow paths for which the logical expression is true are then found for examination.
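A condition-based slice of this kind can be sketched in Python. The sketch makes strong simplifying assumptions: the user's expression is given as fixed variable values in env, those variables are never reassigned, only if statements are followed, and the SAMPLE program is hypothetical.

```python
import ast

def condition_slice(code, env):
    """Return line numbers of statements reachable when the values in env hold."""
    kept = []

    def visit(stmts):
        for stmt in stmts:
            kept.append(stmt.lineno)
            if not isinstance(stmt, ast.If):
                continue
            names = {n.id for n in ast.walk(stmt.test) if isinstance(n, ast.Name)}
            if names <= env.keys():
                # The test is fully decided by the user's condition.
                test = compile(ast.Expression(body=stmt.test), "<cond>", "eval")
                visit(stmt.body if eval(test, {}, dict(env)) else stmt.orelse)
            else:
                # Undecided test: both branches remain reachable.
                visit(stmt.body)
                visit(stmt.orelse)

    visit(ast.parse(code).body)
    return kept

# Hypothetical accounting fragment.
SAMPLE = """\
if province == "Ontario":
    rate = 0.13
else:
    rate = 0.05
if tax == "payable":
    due = amount * rate
else:
    due = 0
report(due)
"""
SLICE = condition_slice(SAMPLE, {"province": "Ontario", "tax": "payable"})
```

With province=Ontario and tax=payable, SLICE is [1, 2, 5, 6, 9]: the two else branches are excluded. A slicing range could be added by filtering the returned line numbers.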
19 Program Segmentation
Forward slice: many functions base computations on the values of the input variables.
– Given a variable and the slicing range, it is possible to determine all statements that can potentially be affected by that variable.
– Similar to using dataflow techniques.
– This process tends to be recursive in nature, since every variable on the left-hand side of an included statement is in turn used as a slicing variable.
20 Program Segmentation
Backward slice: basically, the classical interpretation of a slice:
– those statements that can affect the value of a variable (produce some result).
– This process is also recursive in nature.
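A backward slice can be sketched by walking the statements in reverse. As before, this assumes straight-line assignments only, and SAMPLE is a hypothetical fragment.

```python
import ast

def backward_slice(code, var):
    """Return line numbers of assignments that can affect the final value of `var`."""
    relevant = {var}
    lines = []
    for stmt in reversed(ast.parse(code).body):
        if not isinstance(stmt, ast.Assign):
            continue
        used = {n.id for n in ast.walk(stmt.value) if isinstance(n, ast.Name)}
        targets = {n.id for t in stmt.targets
                   for n in ast.walk(t) if isinstance(n, ast.Name)}
        if targets & relevant:
            lines.append(stmt.lineno)
            # The defined variables are explained; their inputs become relevant.
            relevant = (relevant - targets) | used
    return sorted(lines)

# Hypothetical payroll fragment.
SAMPLE = """\
gross = hours * wage
bonus = 100
tax = gross * 0.2
net = gross - tax + bonus
label = "pay"
"""
BACKWARD = backward_slice(SAMPLE, "tax")
```

BACKWARD is [1, 3]: tax is computed from gross on line 3, and gross is computed on line 1. Slicing on net instead would pull in lines 1 to 4.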
21 Program Segmentation
Event-based slices:
– for a given input event to the system, obtain the program segment that can be executed when the event occurs.
– A slice can be obtained for input or output events (for an output event, the program segment(s) that could potentially be executed to generate it).
– This is often useful in object-oriented implementations, where it provides a list of the objects and methods involved.
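One way to approximate an event-based slice is reachability in a call graph, starting from the handlers registered for the event. The sketch below assumes that the handler table and call graph have already been recovered; all names in HANDLERS and CALLS are hypothetical.

```python
def event_slice(handlers, calls, event):
    """Return the functions that may execute when `event` occurs."""
    pending = list(handlers.get(event, []))
    reached = []
    while pending:
        fn = pending.pop()
        if fn not in reached:
            reached.append(fn)
            # Follow the calls made by this function.
            pending.extend(calls.get(fn, []))
    return sorted(reached)

# Hypothetical telephone system: event -> handlers, function -> callees.
HANDLERS = {
    "phone_off_hook": ["on_off_hook"],
    "phone_ringing": ["on_ring"],
}
CALLS = {
    "on_off_hook": ["play_dial_tone", "start_timer"],
    "on_ring": ["show_caller", "start_timer"],
    "start_timer": ["log"],
}
OFF_HOOK = event_slice(HANDLERS, CALLS, "phone_off_hook")
```

OFF_HOOK lists log, on_off_hook, play_dial_tone and start_timer. For an object-oriented system the same traversal could be run over method names qualified by class.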
22 Recognizing Plans
We define a cliché as a frequently occurring pattern found in programs (e.g. an algorithm, some domain-specific pattern or data structure).
We define a plan as a representation of a cliché
– e.g. using flow graphs, source code templates, and sets of logical constraints.
– An understanding problem may then be to locate the clichés using plans.
23 Recognizing Plans
The plans can be viewed as describing design elements using common implementation patterns.
– Thus the program contains a design element when a portion of its code matches one of the implementation patterns.
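As an illustration, a plan for the "accumulate in a loop" cliché can be written as a syntactic template with two constraints: a variable is initialised to a constant, and a later for loop adds to it. This is a minimal sketch of that single hypothetical plan, checked only at module level, not a general plan recognizer.

```python
import ast

def find_accumulations(code):
    """Return (variable, loop line) pairs that match the accumulation plan."""
    matches = []
    initialised = set()
    for stmt in ast.parse(code).body:
        if isinstance(stmt, ast.Assign) and isinstance(stmt.value, ast.Constant):
            # Constraint 1: the variable starts from a constant.
            initialised.update(t.id for t in stmt.targets
                               if isinstance(t, ast.Name))
        elif isinstance(stmt, ast.For):
            for sub in stmt.body:
                # Constraint 2: the loop body adds to that variable.
                if (isinstance(sub, ast.AugAssign)
                        and isinstance(sub.op, ast.Add)
                        and isinstance(sub.target, ast.Name)
                        and sub.target.id in initialised):
                    matches.append((sub.target.id, stmt.lineno))
    return matches

# Hypothetical subject program.
SAMPLE = """\
total = 0
for item in items:
    total += item.price
count = len(items)
"""
MATCHES = find_accumulations(SAMPLE)
```

MATCHES is [("total", 2)], so the design element "running total" is reported for the loop on line 2.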
24 Program Understanding Strategies (with plans)
Top-down:
– begin with knowledge about the goals the program should achieve,
– determine which plans can achieve these goals, and
– attempt to associate these plans with the actual program code.
– this process requires matching rules or constraints to determine how the code achieves the various subgoals within a plan, and difference rules to recognize how it differs from the code expected by the plan.
25 – This requires detailed advance knowledge of the goals of the program, which in many cases may not be obtainable.
– It is difficult to perform partial understanding, since a program fragment is only ‘understood’ when it is connected to a top-level program goal.
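The top-down strategy can be sketched as a lookup from a goal to its candidate plans, followed by a check of each plan's components against features found in the code. Everything here is a hypothetical illustration: the goal, the plan names and the feature labels are assumptions, and the "missing" list stands in for what difference rules would report.

```python
# Hypothetical knowledge base: goals -> plans, plans -> required code features.
GOAL_PLANS = {"compute_total": ["sum_list", "reduce_call"]}
PLAN_PARTS = {
    "sum_list": {"init_zero", "loop", "add_assign"},
    "reduce_call": {"call_reduce"},
}

def explain_goal(goal, code_features):
    """For each plan that could achieve `goal`, list the parts missing from the code."""
    report = {}
    for plan in GOAL_PLANS.get(goal, []):
        # An empty list means every component of the plan was matched.
        report[plan] = sorted(PLAN_PARTS[plan] - code_features)
    return report

REPORT = explain_goal("compute_total", {"init_zero", "loop", "add_assign", "print"})
```

REPORT shows sum_list fully matched and reduce_call missing call_reduce. Note that an unknown goal yields an empty report, which reflects the limitation above: without advance knowledge of the goal, nothing is understood.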
26 Bottom-up: starts at the code level, determines which plans might have this code as a component, and attempts to infer higher-level goals from these plans.
– Continue until the programmer’s actual goals are recognized or the understander runs out of candidate plans to match against the goals.
– Tends to suffer from a potential combinatorial explosion of possible paths, since each code segment could be part of a large number of plans.
– This is possibly the greatest limitation of the approach, since it restricts the length and complexity of the programs it can be applied to.
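The bottom-up search and its growth can be sketched with a small plan library. The library, plan names and fragment labels are hypothetical, and search_space is only a crude upper bound (one candidate plan chosen per fragment), used here to show how the number of combinations multiplies.

```python
from math import prod

# Hypothetical plan library: plan -> component code fragments.
PLAN_LIBRARY = {
    "sum_list": {"init_zero", "loop", "add_assign"},
    "count_matches": {"init_zero", "loop", "compare", "add_assign"},
    "linear_search": {"loop", "compare", "early_exit"},
}

def candidate_plans(fragment, library):
    """Plans that might have this code fragment as a component."""
    return sorted(p for p, parts in library.items() if fragment in parts)

def recognised_plans(fragments, library):
    """Plans whose components are all present among the fragments."""
    found = set(fragments)
    return sorted(p for p, parts in library.items() if parts <= found)

def search_space(fragments, library):
    """Number of ways to pick one candidate plan per fragment."""
    return prod(len(candidate_plans(f, library)) for f in fragments)

FRAGMENTS = ["init_zero", "loop", "add_assign"]
FOUND = recognised_plans(FRAGMENTS, PLAN_LIBRARY)
SPACE = search_space(FRAGMENTS, PLAN_LIBRARY)
```

Even with three fragments and three plans, SPACE is 2 × 3 × 2 = 12 combinations, while FOUND contains only sum_list. Larger libraries and longer programs make this product grow quickly.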
27 Automated program understanders have in the past kept the search space manageable by either
– restricting the top-down searches using a limited number of plans, or
– performing bottom-up searches using a library containing a limited number of mostly domain-independent plans.
But understanding real-world software requires a bottom-up search and a reasonably large library.
– These programs are naturally described in terms of domain-specific objects and operations;
– thus we need to recognize both the plans that carry out these operations and the plans that represent the objects being manipulated.