Mining Logical Clones in Software: Revealing High-Level Business & Programming Rules Wenyi Qian 1, Xin Peng 1, Zhenchang Xing 2, Stan Jarzabek 3, Wenyun.


Mining Logical Clones in Software: Revealing High-Level Business & Programming Rules
Wenyi Qian 1, Xin Peng 1, Zhenchang Xing 2, Stan Jarzabek 3, Wenyun Zhao 1
1 Fudan University, China
2 Nanyang Technological University, Singapore
3 National University of Singapore, Singapore

Logical Clones –May not be well documented –Reveal high-level rules

Logical Clones Logical clones consist of: –Similar methods –Similar code fragments –Similar entity classes –Persistent data objects

Logical Clones Today’s techniques for clone/similarity detection: –Simple clone (text, token, AST…) –Structural clone (simple clone) –Similar design structures (similarity metrics, machine learning) They are not enough to detect high-level clones: –They lack high-level information –They need pre-defined templates, such as specific design patterns

Approach Overview (figure: input, abstraction, output)

Program Model Methods & functional clusters Entity classes Code clones Persistent data objects

Program Model Methods & functional clusters –Semantic clustering
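The slide does not say how the semantic clustering is computed. As a rough illustration only, one common reading is to group methods whose identifiers share vocabulary. The sketch below is an assumption, not the authors' algorithm: the helper names (`terms`, `jaccard`, `cluster_methods`) and the threshold are hypothetical, and the sample strings are labels borrowed from the Mining Process slides.

```python
import re


def terms(identifier):
    """Split a camelCase or snake_case identifier into lowercase terms."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", identifier)
    return {p.lower() for p in parts}


def jaccard(a, b):
    """Jaccard similarity of two term sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_methods(names, threshold=0.3):
    """Greedy clustering (hypothetical): a name joins the first cluster
    whose first member is similar enough, otherwise it starts a new one."""
    clusters = []
    for name in names:
        for cluster in clusters:
            if jaccard(terms(name), terms(cluster[0])) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


# Sample labels taken from the slides, used here only as strings.
groups = cluster_methods(["PosPayCheck", "PosPayGiftCard", "PosClearPayment"])
```

A real implementation would likely also use comments, parameter names, or topic models; identifier terms are just the simplest signal to demonstrate.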

Program Model Entity classes –Encapsulating information with getter/setter
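To make "encapsulating information with getter/setter" concrete, here is a minimal heuristic sketch. It assumes an entity class can be recognized from its method names alone; the function names and the 0.8 cut-off are hypothetical and are not taken from the presentation.

```python
def is_accessor(method_name):
    """True for names shaped like getX, setX or isX (Java-style accessors)."""
    for prefix in ("get", "set", "is"):
        rest = method_name[len(prefix):]
        if method_name.startswith(prefix) and rest[:1].isupper():
            return True
    return False


def accessor_ratio(method_names):
    """Fraction of a class's methods that look like accessors."""
    if not method_names:
        return 0.0
    return sum(is_accessor(m) for m in method_names) / len(method_names)


def looks_like_entity_class(method_names, min_ratio=0.8):
    """Hypothetical rule: mostly accessors means the class mainly holds data."""
    return accessor_ratio(method_names) >= min_ratio
```

For example, a class exposing only `getId`, `setId`, `getName` and `setName` passes the check, while one dominated by behavior such as `processPay` does not.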

Program Model Code clones –Simple clones in different methods

Program Model Persistent data objects –Data tables in DB or data entries in files
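The four element kinds of the program model can be pictured as a small data structure. This is only a sketch under assumptions: the class and field names are hypothetical, and the slides do not describe how MiLoCo actually represents the model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Method:
    name: str
    functional_cluster: str  # label produced by semantic clustering


@dataclass(frozen=True)
class EntityClass:
    name: str  # class that encapsulates information behind getters/setters


@dataclass(frozen=True)
class CodeClone:
    clone_id: str
    method_names: tuple  # the different methods that contain the fragment


@dataclass(frozen=True)
class PersistentDataObject:
    name: str
    kind: str  # "table" (in a DB) or "file entry"


@dataclass
class ProgramModel:
    methods: list = field(default_factory=list)
    entity_classes: list = field(default_factory=list)
    code_clones: list = field(default_factory=list)
    persistent_data: list = field(default_factory=list)

    def summary(self):
        """Count the elements of each kind."""
        return {
            "methods": len(self.methods),
            "entity_classes": len(self.entity_classes),
            "code_clones": len(self.code_clones),
            "persistent_data": len(self.persistent_data),
        }


def example_model():
    """Build a tiny model; every name below is made up for illustration."""
    model = ProgramModel()
    model.methods.append(Method("payByCheck", "payment"))
    model.methods.append(Method("payByGiftCard", "payment"))
    model.entity_classes.append(EntityClass("Payment"))
    model.code_clones.append(CodeClone("c1", ("payByCheck", "payByGiftCard")))
    model.persistent_data.append(PersistentDataObject("PAYMENT", "table"))
    return model
```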

Mining Process (figure, repeated across animation steps; node labels: PosScreen, processPay, PosPayCheck, PosPayGiftCard, PosClearPayment)

Tool: MiLoCo

Case Study Project: Opentaps –14,351 classes & interfaces –253,743 methods 1690 logical clones mined –at least 3 nodes & 2 instances
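The reporting threshold on this slide (at least 3 nodes and 2 instances) amounts to a simple filter over candidate clones. The sketch below assumes a hypothetical `CandidateClone` record; how the tool represents candidates internally is not stated in the slides.

```python
from dataclasses import dataclass


@dataclass
class CandidateClone:
    nodes: tuple      # abstract elements that make up the recurring pattern
    instances: list   # concrete occurrences of the pattern in the code base


def keep_reportable(candidates, min_nodes=3, min_instances=2):
    """Keep candidates that meet both thresholds named on the slide."""
    return [
        c for c in candidates
        if len(c.nodes) >= min_nodes and len(c.instances) >= min_instances
    ]


# Hypothetical candidates: only the first satisfies both thresholds.
candidates = [
    CandidateClone(nodes=("a", "b", "c"), instances=["i1", "i2"]),
    CandidateClone(nodes=("a", "b"), instances=["i1", "i2", "i3"]),
    CandidateClone(nodes=("a", "b", "c", "d"), instances=["i1"]),
]
reportable = keep_reportable(candidates)
```

Raising either threshold trades recall for patterns that are larger or more widely repeated.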

Case Study

Categories of Logical Clones Categories of Mined Logical Clones (manual work) –Programming Convention (37%) –Design Structure (24%) –Business Task (23%) –Business Process (16%)

Categories of Logical Clones Programming Convention –Similar ways to implement similar functions

Categories of Logical Clones Design Structure –Similar interaction structures

Categories of Logical Clones Business Task –Similar ways to implement similar business tasks

Categories of Logical Clones Business Process –Similar business processes or sub-processes

Human Study 5 senior graduate students, 2 questions: –Helpful for program understanding? YES –Helpful for reuse/evolution? YES

Discussion –Helpful for reuse, even without knowledge of code details –Developers with good domain knowledge will make better use of logical clones –Integrating MiLoCo with IDEs would make logical clones more useful

Conclusion –The concept of logical clones –The approach for mining logical clones –The tool: MiLoCo –A case study showing that logical clones are helpful in software understanding, reuse and maintenance

Thanks for your attention!