Department of Computer Science, Graduate School of Information Science & Technology, Osaka University Mining Coding Patterns to Detect Crosscutting Concerns.

Presentation transcript:

Department of Computer Science, Graduate School of Information Science & Technology, Osaka University
Mining Coding Patterns to Detect Crosscutting Concerns in Java Programs
Takashi Ishio, Hironori Date, Tatsuya Miyake, Katsuro Inoue
Osaka University

Overview
What is a coding pattern?
  - A frequent code fragment that appears in programs
  - "Not modularized code" == crosscutting concerns?
Sequential pattern mining for Java source code
  - Apply the PrefixSpan algorithm to normalized source code
Coding patterns in 6 Java programs
  - JHotDraw, jEdit, Tomcat, Azureus, ANTLR, SableCC
  - The "Consistent Behavior" crosscutting concern sort

Coding Pattern
A coding pattern is a frequent code fragment that implements a particular kind of functionality.

Original fragment:
while (iter.hasNext()) {
    Item item = (Item)iter.next();
    buf.append(item.toString());
}

Copy-and-paste produces an identical fragment:
while (iter.hasNext()) {
    Item item = (Item)iter.next();
    buf.append(item.toString());
}

Reusing the loop structure produces a similar fragment:
while (iter.hasNext()) {
    Item data = (Item)iter.next();
    buf.add(process(data));
    buf.add(data.getItem());
}

Another Example
In jEdit, methods that modify a text buffer must check that the buffer is editable.
  - If the buffer is not editable, the methods execute beep() instead of the original behavior.

JEditBuffer.java
public void undo(…) {
    if (undoMgr == null) return;
    if (!isEditable()) {
        Toolkit.getDefaultToolkit().beep();
        return;
    }
    // undo an action
}

TextArea.java
public void insertEnterAndIndent() {
    if (!isEditable()) {
        getToolkit().beep();
    } else {
        try {
            // insert "\n" and indent
        } …
    }
}

Maintenance Problem
Coding patterns make maintenance difficult.
  - Some patterns reflect implicit design in a program.
  - Patterns are duplicated code. Developers have to edit all instances of a pattern consistently.
Duplicated code is also known as a code clone.
  - Even efficient code-clone detection tools (e.g. CCFinder) have difficulty detecting small code fragments. Some fragments may also have been modified after being copied and pasted.

Pattern Mining for Source Code
Pipeline: Java source code (methods) -> Normalization -> Sequence Database -> Mining -> Patterns -> Classification -> Pattern Groups (an XML format)
  - Normalization: each method is translated into a sequence of method calls and control elements, e.g. A.m1 IF B.m2 LOOP A.m2 END-LOOP END-IF.
  - Mining: we use PrefixSpan, a sequential pattern mining algorithm.
  - Classification: we classify similar patterns into a group.

Normalization Rules for Java source code
A normalized sequence consists of:
  - Method call elements: the method signature without the class name
  - LOOP/END-LOOP elements
  - IF/ELSE/END-IF elements
The statement-to-element mapping is maintained.

Java source code:
while (iter.hasNext()) {
    Item item = (Item)iter.next();
    if (item.isActive())
        item.foo();
}

Normalized sequence:
hasNext(): boolean
LOOP
next(): Object
isActive(): boolean
IF
foo(): void
END-IF
hasNext(): boolean
END-LOOP

Pattern Mining with PrefixSpan [Pei, 2004]
Parameter: min_support = 2

Sequence database:
a c d
a b c
c b a
a a b

#instances of length-1 patterns: a : 4, b : 3, c : 3, d : 1

Extract suffix sequences of "a": c d / b c / a b
#instances of length-2 patterns with prefix "a": a : 1, b : 2, c : 2, d : 1
Every other candidate extension shown on the slide has support 1 and is discarded.

The result (pattern : support): a b : 2, a c : 2
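The prefix-projection idea on this slide can be sketched in a few lines of Java. This is a minimal illustration, not the authors' tool: it assumes every sequence element is a single item (a string), and the class and method names (`PrefixSpanSketch`, `mine`, `grow`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class PrefixSpanSketch {
    /** Returns every sequential pattern whose support is at least minSupport. */
    static Map<List<String>, Integer> mine(List<List<String>> db, int minSupport) {
        Map<List<String>, Integer> result = new LinkedHashMap<>();
        grow(new ArrayList<String>(), db, minSupport, result);
        return result;
    }

    private static void grow(List<String> prefix, List<List<String>> projected,
                             int minSupport, Map<List<String>, Integer> result) {
        // Support of an item = number of projected sequences that contain it.
        Map<String, Integer> support = new TreeMap<>();
        for (List<String> seq : projected) {
            for (String item : new HashSet<String>(seq)) {
                support.merge(item, 1, Integer::sum);
            }
        }
        for (Map.Entry<String, Integer> e : support.entrySet()) {
            if (e.getValue() < minSupport) {
                continue; // infrequent items cannot extend the prefix
            }
            List<String> pattern = new ArrayList<>(prefix);
            pattern.add(e.getKey());
            result.put(pattern, e.getValue());
            // Projection: keep the suffix after the first occurrence of the item.
            List<List<String>> next = new ArrayList<>();
            for (List<String> seq : projected) {
                int i = seq.indexOf(e.getKey());
                if (i >= 0) {
                    next.add(seq.subList(i + 1, seq.size()));
                }
            }
            grow(pattern, next, minSupport, result);
        }
    }
}
```

Calling `mine` on the four sequences a c d, a b c, c b a, a a b with `minSupport = 2` reports the length-1 patterns a, b and c together with the two length-2 patterns a b and a c.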

Filtering Patterns
A constraint for control statements: if a pattern includes a LOOP/IF element, the pattern must include its corresponding element generated from the same control statement.

Well-formed patterns:
  - hasNext() LOOP next() hasNext() END-LOOP
  - next() hasNext()
Malformed patterns:
  - hasNext() LOOP next() hasNext() (END-LOOP is missing!)
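A simplified version of this filter can be written as a stack-based balance check. The sketch below is an assumption-laden approximation: it only checks that LOOP/END-LOOP and IF/ELSE/END-IF elements nest properly by name, whereas the slide requires the paired elements to come from the same control statement, which needs the statement-to-element mapping. The class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

final class PatternFilterSketch {
    /** Returns true when every control element in the pattern has its partner. */
    static boolean isWellFormed(List<String> pattern) {
        Deque<String> open = new ArrayDeque<>();
        for (String element : pattern) {
            switch (element) {
                case "LOOP":
                case "IF":
                    open.push(element); // remember the opening element
                    break;
                case "ELSE":
                    if (!"IF".equals(open.peek())) return false; // ELSE outside an IF
                    break;
                case "END-LOOP":
                    if (!"LOOP".equals(open.poll())) return false;
                    break;
                case "END-IF":
                    if (!"IF".equals(open.poll())) return false;
                    break;
                default:
                    break; // method call elements are unconstrained
            }
        }
        return open.isEmpty(); // an unclosed LOOP or IF makes the pattern malformed
    }
}
```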

Classifying Patterns into Groups
A pattern implies various sub-patterns.
  - [A, B, C, D] implies [A, B, C], [A, C, D], …
We classify such sub-patterns into a single pattern group.
  - If an instance of a pattern overlaps with an instance of another pattern, the patterns are classified into the same group.
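One way to realize the overlap rule is a union-find over pattern ids. The sketch below makes several assumptions that the slides do not state: an instance is identified by its enclosing method plus the positions of the elements it covers, two instances overlap when they share a position in the same method, and grouping is transitive. All names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class PatternGroupingSketch {
    /** One occurrence of a pattern: the enclosing method and the element positions it covers. */
    static final class Instance {
        final int patternId;
        final String method;
        final Set<Integer> positions;

        Instance(int patternId, String method, Integer... positions) {
            this.patternId = patternId;
            this.method = method;
            this.positions = new HashSet<>(Arrays.asList(positions));
        }
    }

    /** Returns, for each pattern id, a representative id of the group it belongs to. */
    static int[] group(int patternCount, List<Instance> instances) {
        int[] parent = new int[patternCount];
        for (int i = 0; i < patternCount; i++) {
            parent[i] = i;
        }
        for (int i = 0; i < instances.size(); i++) {
            for (int j = i + 1; j < instances.size(); j++) {
                Instance a = instances.get(i);
                Instance b = instances.get(j);
                boolean overlap = a.method.equals(b.method)
                        && !Collections.disjoint(a.positions, b.positions);
                if (overlap && a.patternId != b.patternId) {
                    parent[find(parent, a.patternId)] = find(parent, b.patternId);
                }
            }
        }
        int[] groups = new int[patternCount];
        for (int i = 0; i < patternCount; i++) {
            groups[i] = find(parent, i);
        }
        return groups;
    }

    private static int find(int[] parent, int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]]; // path halving
            x = parent[x];
        }
        return x;
    }
}
```

The pairwise comparison is quadratic in the number of instances, which is fine for an illustration but would need an index per method for large programs.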

A screenshot of our tool
  - Pattern Mining Configuration
  - The resultant patterns
  - Class Hierarchy View highlights classes involving a pattern.
  - Source Code View indicates elements of a pattern.

Case Study: 6 Java programs
Table columns: Name, Version, Size (LOC), #Pattern, #Group
Programs: JHotDraw, jEdit (version 4.3pre10, size 168,… LOC), Azureus, Tomcat, ANTLR, SableCC (version 3.2, size 35,… LOC)
Patterns counted: #instances ≧ 10, #elements ≧ 4

Patterns in the programs
Manually analyzed 5 frequent pattern groups for each program.
  - Excluded pattern groups comprising only JDK methods, because most of them are well-known patterns for manipulating collections and strings.
17 groups (about 55%) are related to some functionality in the applications.
  - The others are implementation-level patterns, e.g. a null check: if (getView() != null) getView().get…

Pattern Categorization
17 pattern groups classified into five categories:

ID | Pattern description | #pattern groups
1 | A boolean method to insert an additional action | 3
2 | A boolean method to change the behavior of multiple methods | 3
3 | A pair of set-up and clean-up | 3
4 | Exception handling: every instance is included in a try-catch statement | 3
5 | Other patterns | 5

Category 1: Boolean method inserting an action
Logging patterns in Tomcat and Azureus
[ isDebugEnabled(), IF, debug(String), END-IF ] (in Tomcat)

BasicAuthenticator.java
public boolean authenticate(…) … {
    Principal principal = …
    if (principal != null) {
        if (log.isDebugEnabled())
            log.debug("Already authenticated " + …);
        …
        return (true);
    }
    …
}

304 instances in Tomcat, 119 instances in Azureus.
Logging is a well-known crosscutting concern, but it is hard to modularize because the messages vary.

Category 2: Boolean method to switch the behavior of multiple methods
jEdit has 34 instances of the pattern.
  - Preventing a user from editing a read-only buffer
[ isEditable, IF, beep, END-IF ]

JEditBuffer.java
public void undo(…) {
    if (undoMgr == null) return;
    if (!isEditable()) {
        Toolkit.getDefaultToolkit().beep();
        return;
    }
    // undo an action
}

Around advice in AspectJ may replace the pattern.
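The slide points to AspectJ around advice. As a rough plain-Java analogue (not AspectJ, and not jEdit's actual API), the check could be centralized in one helper that either runs the editing action or runs a rejection handler such as a beep. `EditGuardSketch`, `Buffer` and `ifEditable` are hypothetical names introduced only for this sketch.

```java
final class EditGuardSketch {
    /** Hypothetical stand-in for a text buffer that may be read-only. */
    interface Buffer {
        boolean isEditable();
    }

    /**
     * Runs the action only when the buffer is editable; otherwise runs
     * onRejected (for example, a beep). Returns whether the action ran.
     */
    static boolean ifEditable(Buffer buffer, Runnable onRejected, Runnable action) {
        if (!buffer.isEditable()) {
            onRejected.run();
            return false;
        }
        action.run();
        return true;
    }
}
```

Unlike around advice, every editing method still has to call the helper explicitly, so the concern is shortened at each call site rather than removed from it.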

Category 3: set-up and clean-up
Synchronization in Azureus

DHTControlImpl.java
try {
    activity_mon.enter();
    listeners.addListener( l );
    for (int i=0;i<activities.size();i++){
        listeners.dispatch(…);
    }
} finally {
    activity_mon.exit();
}

151 instances use AEMonitor.enter and AEMonitor.exit.
[ enter, LOOP, END-LOOP, exit ]
Template Method or before/after advice might be applicable.
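One lightweight way to factor out such a set-up/clean-up pair in plain Java is a helper that wraps a caller-supplied body, in the spirit of the Template Method suggestion. This is a sketch under assumptions: `Monitor` is a hypothetical interface standing in for a class like AEMonitor, and `withMonitor` is not part of Azureus.

```java
final class MonitorTemplateSketch {
    /** Hypothetical stand-in for a monitor with paired enter/exit calls. */
    interface Monitor {
        void enter();
        void exit();
    }

    /** Runs body between enter() and exit(); exit() runs even if body throws. */
    static void withMonitor(Monitor monitor, Runnable body) {
        try {
            monitor.enter(); // set-up, placed inside try as in the slide's fragment
            body.run();
        } finally {
            monitor.exit();  // clean-up
        }
    }
}
```

A call site then supplies only the loop body, for example `withMonitor(mon, () -> dispatchAll())`, where `dispatchAll` is again a placeholder.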

Category 4: Exception Handling
"Report Error" pattern in the ANTLR parser

ANTLRParser.java
void rewrite_block() throws … {
    try {
        lp = LT(1);
        …
        match(LPAREN);
        …
    } catch (RecognitionException ex) {
        reportError(ex);
        recover(ex,_tokenSet_34);
    }
    returnAST = rewrite_block_AST;
}

[ LT, match, reportError, recover ]
38 instances in parser classes

Other Patterns
Duplicated code in ANTLR's test cases
[ setErrorListener, newTool, setCodeGenerator, genRecognizer ]

TestAttributes.java
public void testArguments() … {
    …
    ErrorManager.setErrorListener(equeue);
    Grammar g = new Grammar(…);
    Tool antlr = newTool();
    CodeGenerator generator = new …
    g.setCodeGenerator(generator);
    generator.genRecognizer();
    … // extract a string from AST
    assertEquals(expecting, found);
}

107 instances in ANTLR test cases
The pattern configures the parser for each test case.

Discussion
Coding patterns belong to the "Consistent Behavior" crosscutting concern sort [Marin, 2007].
Maintenance support using coding patterns:
  - Refactoring the patterns (AO-Refactoring)
  - Consistently editing the instances (Fluid AOP)
  - Documenting the patterns (SoQueT)
Only frequent patterns are investigated.

Conclusion
Sequential pattern mining to detect coding patterns in Java programs.
  - Applied the PrefixSpan algorithm to normalized source code.
  - Frequent coding patterns implement "Consistent Behavior" in programs.
Future Work
  - Support try-catch and synchronized statements
  - Analyze more patterns with software metrics
  - Compare coding patterns with code clones


Normalization for method calls
Rule: Extract only a method signature without its class name.
  Item item = (Item)iter.next();
  -> next(): Object
Rule: Two or more method calls are sorted by their control flow and location in the source code.
  int p = max(getX(), getY());
  -> getX(): int, getY(): int, max(int, int): int

Normalization for if statements
Rule:
  if (COND) T; else F;
  -> COND IF T ELSE F END-IF
Example: both of the following fragments are normalized to getX(): int IF … END-IF
  1. int x = getX(); if (a == x) …
  2. if (a == getX()) …

Normalization for loop statements
Rule:
  for (INIT; COND; INC) BODY;
  -> INIT COND LOOP BODY INC COND END-LOOP
Rule:
  while (COND) BODY;
  -> COND LOOP BODY COND END-LOOP

Example: the following two loops are normalized to the same sequence.

Iterator it = list.iterator();
while (it.hasNext()) {
    Item item = (Item)it.next();
    item.doSomething();
}

for (Iterator it = list.iterator(); it.hasNext(); ) {
    Item item = (Item)it.next();
    item.doSomething();
}

Normalized sequence:
iterator(): Iterator
hasNext(): boolean
LOOP
next(): Object
doSomething(): void
hasNext(): boolean
END-LOOP
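The if and loop rules can be illustrated with a tiny hand-built syntax tree. The sketch below is not the authors' normalizer: it assumes method calls have already been reduced to signature strings, it skips parsing entirely, and every name (`NormalizeSketch`, `Node`, `call`, `seq`, `whileLoop`, `forLoop`, `ifElse`) is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class NormalizeSketch {
    /** A program fragment that can append its normalized elements to a sequence. */
    interface Node {
        void emit(List<String> out);
    }

    /** A method call, given as its signature without the class name. */
    static Node call(String signature) {
        return out -> out.add(signature);
    }

    /** Several fragments in control-flow order. */
    static Node seq(Node... nodes) {
        return out -> {
            for (Node n : nodes) n.emit(out);
        };
    }

    /** while (COND) BODY  ->  COND LOOP BODY COND END-LOOP */
    static Node whileLoop(Node cond, Node body) {
        return out -> {
            cond.emit(out);
            out.add("LOOP");
            body.emit(out);
            cond.emit(out);
            out.add("END-LOOP");
        };
    }

    /** for (INIT; COND; INC) BODY  ->  INIT COND LOOP BODY INC COND END-LOOP */
    static Node forLoop(Node init, Node cond, Node inc, Node body) {
        return out -> {
            init.emit(out);
            cond.emit(out);
            out.add("LOOP");
            body.emit(out);
            inc.emit(out);
            cond.emit(out);
            out.add("END-LOOP");
        };
    }

    /** if (COND) T; else F;  ->  COND IF T ELSE F END-IF (ELSE is skipped when elsePart is null) */
    static Node ifElse(Node cond, Node thenPart, Node elsePart) {
        return out -> {
            cond.emit(out);
            out.add("IF");
            thenPart.emit(out);
            if (elsePart != null) {
                out.add("ELSE");
                elsePart.emit(out);
            }
            out.add("END-IF");
        };
    }

    static List<String> normalize(Node root) {
        List<String> out = new ArrayList<>();
        root.emit(out);
        return out;
    }
}
```

Building the while loop as `seq(call("iterator(): Iterator"), whileLoop(hasNext, body))` and the for loop as `forLoop(call("iterator(): Iterator"), hasNext, seq(), body)` yields the same element sequence, which is the point of the example on this slide.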

The List of "App" Patterns

Pattern | Sup
jEdit: Open and close "scope" before and after visiting a node, respectively | 55
jEdit: Alert if a read-only buffer is to be edited | 34
Azureus: Enter and exit a monitor before and after a loop, respectively | 151
Azureus: Logging if enabled | 119
Tomcat: Logging if debugging mode | 304
Tomcat: Execute a function in privileged mode if a protection mode is enabled | 46
ANTLR: Create code generators for unit testing | 107
ANTLR: Parse AST and report an error if necessary | 38
ANTLR: Visit tree nodes to output a text | 29

"Execute an action in privileged mode" pattern
Tomcat has 46 instances of the pattern.
[ isPackageProtectionEnabled, IF, doPrivileged, ELSE, END-IF ]

Summarization of the pattern groups
Select the top three frequent tokens in a pattern group to summarize the pattern.

Undo Pattern in JHotDraw 5.2b1:
setUndoActivity()
createUndoActivity()
getUndoActivity()
setAffectedFigures()

Tokens in the pattern: activity:3 undo:3 set:2 affected:1 create:1 figures:1 get:1
Summary: set Undo Activity
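The token counting behind this summary can be sketched as follows. The slides do not say how identifiers are split or how ties are ordered, so this sketch assumes camelCase splitting, lower-casing, and alphabetical tie-breaking; `PatternSummarySketch`, `countTokens` and `topTokens` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

final class PatternSummarySketch {
    /** Splits camelCase method names into lower-case tokens and counts them. */
    static Map<String, Integer> countTokens(List<String> methodNames) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : methodNames) {
            // Split between a lower-case letter or digit and the following upper-case letter.
            for (String token : name.split("(?<=[a-z0-9])(?=[A-Z])")) {
                counts.merge(token.toLowerCase(Locale.ROOT), 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Returns the k most frequent tokens; ties are broken alphabetically (an assumption). */
    static List<String> topTokens(List<String> methodNames, int k) {
        List<Map.Entry<String, Integer>> entries =
                new ArrayList<>(countTokens(methodNames).entrySet());
        entries.sort((x, y) -> {
            int byCount = Integer.compare(y.getValue(), x.getValue());
            return byCount != 0 ? byCount : x.getKey().compareTo(y.getKey());
        });
        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(k, entries.size()); i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }
}
```

For the four JHotDraw method names on this slide, `topTokens(names, 3)` selects activity, undo and set, the same three tokens as the slide's summary, though in frequency order rather than in the slide's display order.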