Architecture Recovery

Presentation transcript:

Architecture Recovery Simin Wang Advisor: Prof. Liguo Huang Southern Methodist University siminw@smu.edu

Aspects of Architecture: Design decisions are made and unmade over a system’s lifetime. At any time t, a system has only one architecture. The prescriptive architecture (PA) captures the design decisions made prior to system construction: the architecture as designed. The descriptive architecture (DA) describes how the system has actually been built: the architecture as implemented.

Aspects of Architecture: Software decay. Drift is the introduction of design decisions into a system that are not encompassed or implied by its architectural design. Erosion is the introduction of design decisions into a system that violate its architectural design. Architectural decay can exist both in the design and in the code. A software smell is a commonly made design or implementation decision that negatively impacts a system’s lifecycle properties. It is not a bug, because it does not break the system; it is a manifestation of technical debt.

Architecture Recovery: The process of determining a system’s architecture from its implementation-level artifacts (source code, executable files, Java .class files, etc.). The output is an architectural view: a structured arrangement of a system’s implementation-level artifacts under a set of criteria, or a higher-level representation.

Why Recover Architecture? Research Maintenance Evaluation Metrics and Issues Resource (work) allocation

Evaluation

How to Recover? Walk around, look, measure? What to recover from? Humans: unavailable, hold differing ideas, or are afraid to tell the truth. Documentation: not always followed. Code: reliable.

Methods: ACDC, ARC, WCA, LIMBO, Bunch, ZBR

Clustering vs. Hierarchical Clustering: Clustering is the process of forming groups of items or entities such that entities within a group are similar to one another and different from those in other groups. A hierarchical clustering method produces a classification in which small clusters of very similar entities are nested within larger clusters of less closely related entities.

ACDC: Algorithm for Comprehension-Driven Clustering. Recovers components using patterns: Source File Pattern; Directory Structure Pattern; Body-header Pattern (.c and .h files in C); Leaf Collection Pattern (drivers); Support Library Pattern; Central Dispatcher Pattern; Subgraph Dominator Pattern. The Subgraph Dominator Pattern is defined on a graph G = (V, E) with a dominator node n0 and a dominated set N = {ni | i = 1..m}: there is a path from n0 to every ni, and for any node v such that a path P exists from v to some ni, either n0 ∈ P or v ∈ N.

ACDC Stage 1: Skeleton construction. Create a skeleton of the final decomposition by identifying subsystems using a pattern-driven approach Stage 2: Orphan Adoption. Deal with the problem of maintaining a system’s decomposition as the system evolves

ACDC – Skeleton Construction
1. Source file clusters.
2. Body-header conglomeration.
3. Leaf collection and support library identification.
4. Ordered and limited subgraph domination. ACDC first disregards any files with an out-degree larger than 20. It then goes through all the nodes and examines whether each qualifies as the dominator node of a subsystem following the subgraph dominator pattern. If a non-empty dominated set is discovered, ACDC creates a subsystem containing both the dominator node and the dominated set. The name of this subsystem is the name of the dominator node plus the suffix “.ss”. ACDC organizes the obtained subsystems so that the containment hierarchy is a tree. Finally, any files that were disregarded earlier are considered again.
5. Creation of “Support.ss”. Any files that were identified as candidates for the support library pattern in step 3 are assigned to this subsystem, unless they were already assigned to some subsystem during step 4.
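The subgraph dominator check can be illustrated with a small Python sketch. This is a simplified reading of the pattern, not ACDC's implementation: the function names (`reachable`, `dominated_set`, `candidate_subsystems`) are hypothetical, the dependency graph is assumed to be a dictionary mapping each file to the files it depends on, and the ordering, size limits, and tree nesting of the real algorithm are left out. Only the out-degree cutoff of 20 and the ".ss" naming come from the slides.

```python
def reachable(graph, starts, blocked=frozenset()):
    """Return every node reachable from `starts` without entering `blocked`."""
    seen = set()
    stack = [node for node in starts if node not in blocked]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        for successor in graph.get(node, ()):
            if successor not in blocked and successor not in seen:
                stack.append(successor)
    return seen


def dominated_set(graph, n0):
    """Nodes below n0 that no outside node can reach without passing through n0."""
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    below = reachable(graph, [n0]) - {n0}
    outside = nodes - below - {n0}
    # Anything reachable from the outside while n0 is blocked is not dominated.
    leaked = reachable(graph, outside, blocked={n0})
    return below - leaked


def candidate_subsystems(graph, max_out_degree=20):
    """Flat list of candidate subsystems (assumption: no ordering or nesting)."""
    subsystems = {}
    for node in sorted(graph):
        if len(graph[node]) > max_out_degree:
            continue  # files with a large out-degree are disregarded
        members = dominated_set(graph, node)
        if members:
            subsystems[node + ".ss"] = {node} | members
    return subsystems
```

In a graph where `main` depends on `a` and `b`, but `b` is also used by an unrelated `util` file, only `a` ends up in the dominated set of `main`, because `b` can be reached without going through `main`.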

ACDC – Orphan Adoption: Non-clustered files are the orphans. ACDC attempts to place each newly introduced resource (called an orphan) in the subsystem that seems most appropriate.
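The slides do not say how "most appropriate" is decided. One plausible heuristic, shown below purely as an assumption, is to place the orphan in the subsystem with which it shares the most dependencies. The function name `adopt_orphan` and the input shapes are hypothetical.

```python
from collections import Counter


def adopt_orphan(orphan, dependencies, membership):
    """Pick a subsystem for `orphan` by counting its dependencies per subsystem.

    dependencies: iterable of (source, target) file pairs.
    membership:   dict mapping already-clustered files to subsystem names.
    Returns None when the orphan touches no clustered file.
    """
    votes = Counter()
    for source, target in dependencies:
        if source == orphan and target in membership:
            votes[membership[target]] += 1
        elif target == orphan and source in membership:
            votes[membership[source]] += 1
    if not votes:
        return None
    best = max(votes.values())
    # Break ties alphabetically so the result is deterministic.
    return min(name for name, count in votes.items() if count == best)
```

For example, an orphan that calls two files in `net.ss` and is called by one file in `ui.ss` would be adopted by `net.ss` under this heuristic.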

ARC Architecture Recovery using Concerns Recovers concerns of implementation-level entities and uses a hierarchical clustering technique to obtain architectural elements. Compute similarity measures between concerns and identify which concerns appear in a single implementation-level entity.

ARC: ARC represents a software system as a set of documents. A document can have different topics, which are the concerns in ARC. A topic z is a multinomial probability distribution over words w. A document d is represented as a multinomial probability distribution over topics z. Each implementation-level entity is treated as a document whose document-topic distribution is its feature vector. Hierarchical clustering is performed by computing similarities between entities using the Jensen-Shannon divergence, which allows similarities between document-topic distributions to be computed.
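As an illustration of the similarity step, the Jensen-Shannon divergence between two document-topic distributions can be computed as follows. This is a minimal sketch using the standard definition with base-2 logarithms; the function names are illustrative, and the topic distributions are assumed to be equal-length lists of probabilities that each sum to 1.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, and within [0, 1] for base-2 logs."""
    midpoint = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, midpoint) + 0.5 * kl_divergence(q, midpoint)
```

A divergence of 0 means the two entities have identical concern distributions; a divergence of 1 means they share no concerns at all, so lower values indicate more similar entities.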

WCA: Weighted Combined Algorithm. Measures the inter-cluster distance between software entities and merges them into clusters based on this distance. Two measures are proposed for the inter-cluster distance: Unbiased Ellenberg (UE) and Unbiased Ellenberg-NM (UENM).

WCA: Begins by placing each implementation-level entity in its own cluster, where a cluster represents an architectural component. Computes the pair-wise similarity between all the clusters and then combines the two most similar clusters into a new cluster. This step is repeated until all elements have been clustered or the desired number of clusters is obtained. When two clusters are merged by WCA, a new feature vector is formed by combining the feature vectors of the two clusters.
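The merge loop described above can be sketched in Python. Two parts are assumptions rather than WCA's definitions: the similarity function is a weighted Jaccard measure standing in for UE/UENM, whose formulas the slides do not give, and the combined feature vector is taken to be a size-weighted average of the two merged vectors. All names are illustrative.

```python
def similarity(u, v):
    """Weighted Jaccard similarity (a stand-in for UE/UENM, not WCA's measure)."""
    features = set(u) | set(v)
    shared = sum(min(u.get(f, 0.0), v.get(f, 0.0)) for f in features)
    total = sum(max(u.get(f, 0.0), v.get(f, 0.0)) for f in features)
    return shared / total if total else 0.0


def combine(u, size_u, v, size_v):
    """Assumed combination rule: average the vectors, weighted by cluster size."""
    n = size_u + size_v
    return {f: (u.get(f, 0.0) * size_u + v.get(f, 0.0) * size_v) / n
            for f in set(u) | set(v)}


def agglomerate(entities, target_clusters):
    """Merge the two most similar clusters until `target_clusters` remain."""
    # Every entity starts in its own cluster: (member names, feature vector).
    clusters = [([name], dict(vec)) for name, vec in sorted(entities.items())]
    while len(clusters) > max(target_clusters, 1):
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                score = similarity(clusters[i][1], clusters[j][1])
                if best is None or score > best[0]:
                    best = (score, i, j)
        _, i, j = best
        (members_i, vec_i), (members_j, vec_j) = clusters[i], clusters[j]
        merged = (members_i + members_j,
                  combine(vec_i, len(members_i), vec_j, len(members_j)))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return [sorted(members) for members, _ in clusters]
```

Feature vectors here are dictionaries mapping a feature name (for example, a called function or an included header) to a weight, so entities that use the same features end up in the same cluster first.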

LIMBO: A hierarchical clustering algorithm that aims to make the Information Bottleneck algorithm scalable for large data sets. Uses a mechanism called Summary Artifacts (SA) to reduce the computations needed while minimizing accuracy loss. Uses the Information Loss (IL) measure to compute similarities between entities.
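The slides do not define the Information Loss measure. In the agglomerative Information Bottleneck setting, the cost of merging two clusters is commonly written as the combined cluster probability times a weighted Jensen-Shannon divergence between their feature distributions. The sketch below follows that form as an assumption about what IL computes; the function name and parameters are illustrative, and both weights must be positive.

```python
import math


def information_loss(p1, weight1, p2, weight2):
    """Merge cost of two clusters under the assumed Information Bottleneck form.

    p1, p2:           feature distributions of the two clusters (each sums to 1).
    weight1, weight2: cluster probabilities, e.g. cluster size / total entities.
    """
    total = weight1 + weight2
    pi1, pi2 = weight1 / total, weight2 / total
    merged = [pi1 * a + pi2 * b for a, b in zip(p1, p2)]

    def kl(p, q):
        # KL divergence in bits; terms with p_i = 0 contribute 0.
        return sum(x * math.log2(x / y) for x, y in zip(p, q) if x > 0)

    return total * (pi1 * kl(p1, merged) + pi2 * kl(p2, merged))
```

Under this reading, merging two clusters with identical distributions loses no information, so a small IL value marks a good merge candidate.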

Bunch: Transforms the architecture recovery problem into an optimization problem. An optimization function called Modularization Quality (MQ) represents the quality of a recovered architecture. Uses hill-climbing and genetic algorithms to find a partition that maximizes MQ.
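A minimal hill-climbing sketch of this idea is shown below. The MQ formula used here (a sum of per-cluster factors, 2·intra / (2·intra + inter)) is an assumption based on a commonly cited variant from the Bunch literature, since the slides do not give the formula. The search itself is a simple single-node-move climber and not Bunch's actual implementation; all names are illustrative.

```python
from collections import defaultdict


def modularization_quality(edges, assignment):
    """Sum of cluster factors: 2*intra / (2*intra + inter) per cluster (assumed form)."""
    intra = defaultdict(int)
    inter = defaultdict(int)
    for source, target in edges:
        cs, ct = assignment[source], assignment[target]
        if cs == ct:
            intra[cs] += 1
        else:
            # An inter-cluster edge penalizes both clusters it touches.
            inter[cs] += 1
            inter[ct] += 1
    return sum(2 * intra[c] / (2 * intra[c] + inter[c]) for c in intra if intra[c] > 0)


def hill_climb(edges, assignment):
    """Move one node at a time to another existing cluster while MQ improves."""
    current = dict(assignment)
    best = modularization_quality(edges, current)
    improved = True
    while improved:
        improved = False
        for node in sorted(current):
            for target in sorted(set(current.values())):
                if target == current[node]:
                    continue
                candidate = dict(current)
                candidate[node] = target
                score = modularization_quality(edges, candidate)
                if score > best + 1e-12:
                    current, best, improved = candidate, score, True
    return current, best
```

Starting from a partition in which every file is its own cluster, the climber groups tightly coupled files together. Because hill-climbing only accepts improving moves, it can stop at a local maximum, which is one reason a genetic algorithm is offered as an alternative search strategy.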

ZBR: Zone Based Recovery. Based on the natural-language semantics of identifiers found in the source code. Demonstrated accuracy in recovering Java package structure, but struggled with memory issues when dealing with larger systems.

References
Garcia, Joshua, Igor Ivkovic, and Nenad Medvidovic. "A comparative analysis of software architecture recovery techniques." Automated Software Engineering (ASE), 2013 IEEE/ACM 28th International Conference on. IEEE, 2013.
Tzerpos, Vassilios, and Richard C. Holt. "ACDC: an algorithm for comprehension-driven clustering." Reverse Engineering, 2000. Proceedings. Seventh Working Conference on. IEEE, 2000.
Maqbool, Onaiza, and Haroon Babri. "Hierarchical clustering for software architecture recovery." IEEE Transactions on Software Engineering 33.11 (2007).
Lutellier, Thibaud, et al. "Comparing software architecture recovery techniques using accurate dependencies." Software Engineering (ICSE), 2015 IEEE/ACM 37th IEEE International Conference on. Vol. 2. IEEE, 2015.