Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University Investigation.

Presentation transcript:

Investigation of Coding Patterns over Version History
Hironori Date, Takashi Ishio, Katsuro Inoue
Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University, Japan

Coding Patterns
- A coding pattern is a frequent sequence of call elements and control elements.
  - Call elements: method call elements and constructor call elements
  - Control elements: IF, END-IF, LOOP, END-LOOP, etc.
- A coding pattern implements a particular kind of concern that is spread around the source code.
[Figure: example from JHotDraw Ver. 5.4b1]
Slide date: 2012/10/26

Previous Research [1]
- Extracted coding patterns from 5 applications
- Coding pattern types
  - API usage patterns
  - Application-specific patterns
- Coding patterns are candidates for reusable code.
[1] T. Ishio, H. Date, T. Miyake, and K. Inoue, "Mining coding patterns to detect crosscutting concerns in Java programs," in Proceedings of the 15th Working Conference on Reverse Engineering, 2008, pp. 123–132.

Previous Research [1] (continued)
- Among similar patterns, which ones are easier to reuse?
- Assumption: stable patterns are reusable.

Research Question
RQ: Are coding patterns generally stable over the version history?
To answer this question:
1. Extract coding patterns from multiple versions of applications.
2. Investigate the life-span of coding patterns.
Life-span: the number of versions in which the identical pattern is found.

Outline of Experiment
- Mining coding patterns (using Fung)
  1. Normalization of source code
  2. Sequential pattern mining for each version
- Tracking coding patterns
  - Compute the life-span of each pattern
[Figure: the source code (.java) of Ver. 1, Ver. 2, ..., Ver. N is mined into coding patterns (.xml) for each version; tracking then produces the table below.]
Pattern  | Ver. 1 | Ver. 2 | … | Ver. N | Life-span
Pat. 1   | 3      | 4      | … | 3      | 6
Pat. 2   | 0      | 0      | … | 2      | 4
…        | …      | …      | … | …      | …
Pat. M   | 3      | 0      | … | 2      | 3

Normalization in Pattern Mining
- Translate each method into a sequence of
  - call elements
  - control elements
- Normalize control elements (Table I)
Source file:
    public class A {
        void a() {
            int i = x + y;
            callA();
            callB();
        }
        void b() {
            if (cond()) {
                callA();
                callB();
            }
        }
    }
[Figure: normalization turns the source file into a sequence database with one sequence for A.a() and one for A.b().]
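The translation of a method into a sequence can be sketched as follows. This is only an illustration under stated assumptions, not Fung's implementation: the toy statement model (`Stmt`, `call`, `other`, `ifStmt`) is hypothetical, and placing the calls of the condition before the IF element is an assumption, since Table I is not reproduced in this transcript.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class NormalizationSketch {

    /** A statement in a toy method body (hypothetical model, not Fung's parser). */
    interface Stmt {
        void emit(List<String> out);
    }

    /** A method or constructor call; it becomes one call element. */
    static Stmt call(String name) {
        return out -> out.add(name);
    }

    /** A statement without calls (e.g. "int i = x + y;"); it contributes no element. */
    static Stmt other() {
        return out -> { };
    }

    /** if (condition) { body }: assumed order is condition calls, IF, body, END-IF. */
    static Stmt ifStmt(List<Stmt> condition, List<Stmt> body) {
        return out -> {
            for (Stmt s : condition) s.emit(out);
            out.add("IF");
            for (Stmt s : body) s.emit(out);
            out.add("END-IF");
        };
    }

    /** Flattens a method body into its sequence of call and control elements. */
    static List<String> normalize(List<Stmt> methodBody) {
        List<String> out = new ArrayList<>();
        for (Stmt s : methodBody) s.emit(out);
        return out;
    }

    public static void main(String[] args) {
        // Method bodies of class A from the slide.
        List<Stmt> a = Arrays.asList(other(), call("callA()"), call("callB()"));
        List<Stmt> b = Arrays.asList(
                ifStmt(Arrays.asList(call("cond()")),
                       Arrays.asList(call("callA()"), call("callB()"))));
        System.out.println("A.a(): " + normalize(a)); // [callA(), callB()]
        System.out.println("A.b(): " + normalize(b)); // [cond(), IF, callA(), callB(), END-IF]
    }
}
```

Under these assumptions the declaration `int i = x + y;` leaves no trace in the sequence, so A.a() and A.b() share the subsequence callA(), callB().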

Sequential Pattern Mining
- Pipeline: source file -> normalization -> sequence database (A.a(), A.b()) -> sequential pattern mining -> coding pattern
- Parameters
  - Minimum length: 2 (threshold on the number of elements in a pattern)
  - Minimum support: 2 (threshold on the number of pattern instances)
[Figure: the coding pattern mined from class A has one instance in a() and one in b().]
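How the two parameters act on a sequence database can be sketched with a deliberately naive miner. This is an assumption-laden illustration, not the algorithm used by Fung: it grows candidate patterns one element at a time, counts at most one instance per sequence, and allows gaps between matched elements. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class PatternMinerSketch {

    /** True if pattern occurs in sequence as a subsequence (gaps allowed). */
    static boolean occursIn(List<String> sequence, List<String> pattern) {
        int matched = 0;
        for (String element : sequence) {
            if (matched < pattern.size() && element.equals(pattern.get(matched))) {
                matched++;
            }
        }
        return matched == pattern.size();
    }

    /** Number of sequences that contain the pattern (one instance per sequence). */
    static int support(List<List<String>> database, List<String> pattern) {
        int count = 0;
        for (List<String> sequence : database) {
            if (occursIn(sequence, pattern)) count++;
        }
        return count;
    }

    /** Returns every pattern with at least minLength elements and minSupport instances. */
    static Map<List<String>, Integer> mine(List<List<String>> database, int minLength, int minSupport) {
        if (minSupport < 1) throw new IllegalArgumentException("minSupport must be at least 1");
        Set<String> alphabet = new LinkedHashSet<>();
        for (List<String> sequence : database) alphabet.addAll(sequence);
        Map<List<String>, Integer> patterns = new LinkedHashMap<>();
        grow(database, new ArrayList<String>(), alphabet, minLength, minSupport, patterns);
        return patterns;
    }

    private static void grow(List<List<String>> database, List<String> prefix, Set<String> alphabet,
                             int minLength, int minSupport, Map<List<String>, Integer> patterns) {
        for (String element : alphabet) {
            List<String> candidate = new ArrayList<>(prefix);
            candidate.add(element);
            int candidateSupport = support(database, candidate);
            // Extending a pattern can never raise its support, so infrequent candidates are pruned.
            if (candidateSupport < minSupport) continue;
            if (candidate.size() >= minLength) patterns.put(candidate, candidateSupport);
            grow(database, candidate, alphabet, minLength, minSupport, patterns);
        }
    }

    public static void main(String[] args) {
        // Assumed sequences for A.a() and A.b() of the slide's class A.
        List<List<String>> database = Arrays.asList(
                Arrays.asList("callA()", "callB()"),
                Arrays.asList("cond()", "IF", "callA()", "callB()", "END-IF"));
        System.out.println(mine(database, 2, 2)); // {[callA(), callB()]=2}
    }
}
```

With minimum length 2 and minimum support 2, the only surviving pattern in this toy database is callA() followed by callB(), found once in each method.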

Identical Patterns Between Versions
- Two patterns are identical when their pattern sequences match exactly.
- The number of instances is not taken into account.
[Figure: pattern instances in Ver. X and Ver. Y (methods a(), b(), and c() of classes A, B, and C); one pair of patterns is labeled NOT Identical and another pair is labeled Identical.]

Tracking Coding Patterns
1. List all coding patterns from all versions.
2. Look up the number of pattern instances in each version.
3. Compute the life-span.
[Figure: the coding patterns (.xml) of Ver. 1, Ver. 2, and Ver. 3 are collected into a table with one row per pattern and columns Ver. 1, Ver. 2, Ver. 3, and Life-span.]

Tracking Coding Patterns (examples)
[Figure: a pattern with 2 instances in Ver. 1 (classes A and B) and 3 instances each in Ver. 2 and Ver. 3 (classes A, B, and C); its table row reads 2, 3, 3 with life-span 3.]
[Figure: a pattern with 2 instances (classes A and B) that is marked Not Found in another version.]
[Figure: a pattern with 2, 3, and 4 instances across Ver. 1 to Ver. 3 (classes A, B, C, and D).]
[Figure: a pattern with 2 instances (classes A and B) in two versions that is marked Not Found in the remaining version.]
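The three tracking steps can be sketched in a few lines. The sketch assumes each version has already been mined into a map from pattern sequence to instance count; the class name, the method name, and the second pattern (`open()`, `close()`) are hypothetical. Patterns are compared by exact sequence equality, and instance counts are ignored when computing the life-span, as the slides describe.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class LifeSpanSketch {

    /**
     * versions holds one map per version: pattern sequence -> number of instances.
     * The result maps every pattern to the number of versions in which it was found.
     */
    static Map<List<String>, Integer> lifeSpans(List<Map<List<String>, Integer>> versions) {
        Map<List<String>, Integer> spans = new LinkedHashMap<>();
        for (Map<List<String>, Integer> version : versions) {
            // List.equals gives the exact-match comparison; instance counts are not consulted.
            for (List<String> pattern : version.keySet()) {
                spans.merge(pattern, 1, Integer::sum);
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        List<String> p = Arrays.asList("callA()", "callB()");
        List<String> q = Arrays.asList("open()", "close()"); // hypothetical second pattern

        Map<List<String>, Integer> v1 = new LinkedHashMap<>();
        v1.put(p, 2);
        v1.put(q, 2);
        Map<List<String>, Integer> v2 = new LinkedHashMap<>();
        v2.put(p, 3); // q is not found in Ver. 2
        Map<List<String>, Integer> v3 = new LinkedHashMap<>();
        v3.put(p, 3);
        v3.put(q, 2);

        // p: found in 3 versions; q: found in 2 versions.
        System.out.println(lifeSpans(Arrays.asList(v1, v2, v3)));
    }
}
```

The pattern `p` mirrors the slide example with 2, 3, and 3 instances and a life-span of 3; a pattern that is missing from one of three versions gets a life-span of 2 under the definition given earlier.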

Experiments
- Target applications: source archives of release versions, downloaded from the project web sites
  - dnsjava, version 0.1 to (51 versions)
  - JmDNS, version 0.2 to (20 versions)
- Pattern mining parameters
  - Minimum length: 2 (threshold on the number of elements in a pattern sequence)
  - Minimum support: 2 (threshold on the number of pattern instances)

Result of Experiment
- LOC and the number of patterns: Figure 2 and Figure 3
- Distribution of life-span: Figure 4 and Figure 5
- Distribution of life-span and pattern length: Figure 6 and Figure 7
- Sample code of the patterns with the longest life-span: picked from Table III and Table IV

LOC and the Number of Patterns in dnsjava (Figure 2)
- 51 versions
- 5,084 LOC to 33,330 LOC
- 512 to 4,405 patterns (in a single version)
- 17,284 patterns in total (no duplication)
- Correlation coefficient (LOC & #Pattern):
[Figure 2: LOC and #Pattern plotted against version.]

LOC and the Number of Patterns in JmDNS (Figure 3)
- 20 versions
- 3,408 LOC to 17,252 LOC
- 237 to 2,419 patterns (in a single version)
- 8,625 patterns in total (no duplication)
- Correlation coefficient (LOC & #Pattern):
[Figure 3: LOC and #Pattern plotted against version.]
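A correlation coefficient between LOC and the number of patterns per version could be computed as sketched below. The slides do not say which coefficient was used; the Pearson coefficient is assumed here, and the input values in `main` are made-up toy numbers, not the measured data.

```java
final class CorrelationSketch {

    /** Pearson correlation coefficient of two equally long series. */
    static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two series of equal length >= 2");
        }
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double covariance = 0.0;
        double varianceX = 0.0;
        double varianceY = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            covariance += dx * dy;
            varianceX += dx * dx;
            varianceY += dy * dy;
        }
        // The 1/n factors cancel, so the raw sums can be used directly.
        return covariance / Math.sqrt(varianceX * varianceY);
    }

    public static void main(String[] args) {
        // Hypothetical per-version values, for illustration only.
        double[] loc = {1000, 2000, 3000, 4000};
        double[] patterns = {100, 210, 290, 405};
        System.out.println(pearson(loc, patterns));
    }
}
```

A value close to 1 would indicate that the number of patterns grows roughly linearly with code size.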

Life-span of Patterns in dnsjava (Figure 4)
- Total: 17,284 patterns in 51 versions
- Median life-span: 3
- 14 patterns appear in all versions (Table III)
[Figure 4: histogram of frequency against life-span, with regions labeled Stable Pattern and Unstable Pattern.]

Life-span of Patterns in JmDNS (Figure 5)
- Total: 8,625 patterns in 20 versions
- Median life-span: 2
- 21 patterns appear in all versions (Table IV)
[Figure 5: histogram of frequency against life-span, with regions labeled Stable Pattern and Unstable Pattern.]

Life-span of Patterns
- dnsjava (51 versions): half of the coding patterns disappeared within 3 versions (the median is 3).
- JmDNS (20 versions): half of the coding patterns disappeared within 2 versions (the median is 2).
- The life-span of coding patterns tends to be short.
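The median used in this summary can be illustrated with a short sketch. The life-span values in `main` are hypothetical toy data chosen to show how a few long-lived patterns leave the median low; the class and method names are assumptions.

```java
import java.util.Arrays;

final class MedianLifeSpanSketch {

    /** Median of the given life-spans; the mean of the two middle values if the count is even. */
    static double median(int[] lifeSpans) {
        if (lifeSpans.length == 0) {
            throw new IllegalArgumentException("no life-spans given");
        }
        int[] sorted = lifeSpans.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int middle = sorted.length / 2;
        if (sorted.length % 2 == 1) {
            return sorted[middle];
        }
        return (sorted[middle - 1] + sorted[middle]) / 2.0;
    }

    public static void main(String[] args) {
        // Hypothetical life-spans of seven patterns over 51 versions.
        int[] lifeSpans = {1, 1, 2, 3, 3, 5, 51};
        System.out.println(median(lifeSpans)); // 3.0
    }
}
```

In this toy data one pattern survives all 51 versions, yet the median stays at 3 because most patterns vanish after a few versions, which is the shape of distribution the slides report.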

Life-span and Pattern Length: dnsjava (Figure 6)
- Coding patterns with a short life-span include a small number of elements.
- Coding patterns with a long life-span have a short pattern length.
- Coding patterns that include a large number of elements survive only a short period.
[Figure 6: distribution of life-span and pattern length, with a region labeled No Patterns.]

Life-span and Pattern Length: JmDNS (Figure 7)
- Coding patterns with a long life-span have a short pattern length.
- Many patterns with a short life-span include a small number of elements.
- Coding patterns that include a large number of elements survive only a short period.
[Figure 7: distribution of life-span and pattern length, with a region labeled No Patterns.]

Stable Patterns in dnsjava

Stable Pattern in dnsjava: application-specific pattern
    public SetResponse addMessage(Message in) {
        boolean isAuth = in.getHeader().getFlag(Flags.AA);
        Record question = in.getQuestion();
        Name qname;
        Name curname;
        int qtype;
        int qclass;
        int cred;
        int rcode = in.getHeader().getRcode();
        boolean haveAnswer = false;
        ...
    }
Source: org.xbill.DNS.Cache (ver )
5 instances in ver

Stable Pattern in dnsjava: object generation pattern
    private void findResolvConf(String file) {
        InputStream in = null;
        try {
            in = new FileInputStream(file);
        } catch (FileNotFoundException e) {
            return;
        }
        InputStreamReader isr = new InputStreamReader(in);
        BufferedReader br = new BufferedReader(isr);
        ...
    }
Source: org.xbill.DNS.spi.ResolverConfig (ver )
Pattern: <java.io.InputStreamReader.<init>(java.io.InputStream), java.io.BufferedReader.<init>(java.io.Reader)>
5 instances in ver

Stable Pattern in dnsjava: iteration-related idiom
    protected DNSJavaNameService() {
        ...
        if (nameServers != null) {
            StringTokenizer st = new StringTokenizer(nameServers, ",");
            String[] servers = new String[st.countTokens()];
            int n = 0;
            while (st.hasMoreTokens())
                servers[n++] = st.nextToken();
            try {
                Resolver res = new ExtendedResolver(servers);
                Lookup.setDefaultResolver(res);
            } catch (UnknownHostException e) {
                ...
            }
            ...
        }
    }
Source: org.xbill.DNS.spi.DNSJavaNameService (ver )
6 instances in ver

Stable Patterns in JmDNS

Stable Pattern in JmDNS: multi-thread idiom with the synchronized keyword
    public synchronized String getPropertyString(String name) {
        byte data[] = this.getProperties().get(name);
        if (data == null) {
            return null;
        }
        if (data == NO_VALUE) {
            return "true";
        }
        return readUTF(data, 0, data.length);
    }
Source: javax.jmdns.impl.ServiceInfoImpl (ver )
2 instances in ver. 3.4.1

Answer to the Research Question
RQ: Are coding patterns generally stable over the version history?
- Coding patterns with a short life-span account for a large part of all patterns.
- Few coding patterns have a long life-span.
Answer: No, coding patterns are NOT generally stable.

Conclusion
- Investigation of the stability of coding patterns across versions
  - Method
    - Extract coding patterns from multiple versions of the code
    - Compute the life-span
  - Target
    - dnsjava (51 versions)
    - JmDNS (20 versions)
- Result
  - Coding patterns are not generally stable.
  - Coding patterns may not be suitable for reuse.
- Future work
  - Further investigation with more applications