Aspect Mining
Jin Huang
Huazhong University of Science & Technology, China
(Under Construction)
Outline Introduction Example Application Related work Recent research Future work Useful Information
Crosscutting Concerns (CCs)
  Types of crosscutting concerns
    Homogeneous CCs: similar pattern
    Heterogeneous CCs: different pattern
  CCs cause comprehension and maintenance problems
C Example: Homogeneous CCs

  Debug concern (#ifdef NUTDEBUG) and system-specific concern (#if defined(__arm__) ...):

    void *HeapAlloc(size_t size)
    {
    #ifdef NUTDEBUG
        ... //some code
    #endif
        NODE **fpp = 0;
        //code removed
    #if defined(__arm__) || defined(__m68k__) || defined(__H8300H__) || …
        while ((size & 0x03) != 0)
            size++;
    #endif
        if (size >= available) {
    #ifdef NUTDEBUG
            ... //some code
    #endif
            return 0;
        }

  Synchronization concern (read_lock / read_unlock):

    int is_orphaned_pgrp(int pgrp)
    {
        int retval;
        …
        read_lock(&tasklist_lock);
        …
        retval = will_become_orphaned_pgrp(pgrp, NULL);
        …
        read_unlock(&tasklist_lock);
        …
        return retval;
    }
Separating Concerns
  Is there a solution?
  Is it AOP?
Prefetching - Heterogeneous CCs
  Prefetching
    Prefetching preloads files from disk into memory.
    It is an OS mechanism for improving performance.
    ['Checking system rules using system-specific, programmer-written compiler extensions' (OSDI, 2000)]
  Execution paths for prefetching
    Random-access path
    Sequential-access path
Example 2: Prefetching
Application: Spring AOP
  Spring: an open-source application framework
    Inversion of Control container
    Aspect-oriented programming framework
    Transaction management
  Transaction management
    Works with a number of transaction management APIs
    Provides a simpler API for programmatic transaction management than the JTA APIs
  Paper: “Bringing Advanced Transaction Management Capabilities to Spring Applications”
    http:// ring/jta_spring_article.pdf
Why Aspect Mining?
  Can AOP do better than OOP?
Outline Introduction Related work Recent research Future work Useful Information
Fan-in Analysis
  An aspect mining approach that identifies CCs as methods that are called from many different call sites
  ['Identifying Crosscutting Concerns Using Fan-In Analysis' (WCRE 2004, Marius Marin)]
  Limitation: it considers only the fan-in values of methods, so it cannot find the patterns of complex crosscutting concerns.
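The idea of fan-in analysis can be sketched in a few lines. The block below is a minimal illustration, not the tool from the cited paper: it assumes the call graph is already available as (caller, callee) pairs, and the method names, the `CALLS` list, and the threshold value are hypothetical.

```python
from collections import defaultdict

def fan_in(calls):
    """Return {callee: number of distinct callers} for (caller, callee) pairs."""
    callers = defaultdict(set)
    for caller, callee in calls:
        if caller != callee:          # ignore direct recursion
            callers[callee].add(caller)
    return {callee: len(srcs) for callee, srcs in callers.items()}

def candidates(calls, threshold):
    """Methods whose fan-in is at least `threshold`, highest fan-in first."""
    ranked = sorted(fan_in(calls).items(), key=lambda kv: (-kv[1], kv[0]))
    return [(method, n) for method, n in ranked if n >= threshold]

# Hypothetical call graph, for illustration only.
CALLS = [
    ("Order.save", "Logger.log"),
    ("Order.cancel", "Logger.log"),
    ("User.login", "Logger.log"),
    ("User.login", "Auth.check"),
    ("Order.save", "Db.write"),
    ("Order.cancel", "Db.write"),
]

if __name__ == "__main__":
    print(candidates(CALLS, threshold=2))
```

Because only a single number per method is used, two methods with the same fan-in are indistinguishable, which is the limitation noted on the slide.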
Random Walk Model
  Motivation
    Inspired by Google's PageRank algorithm
  Contribution
    The paper is the first to adopt a Markov model for computing popularity and significance values of elements in the coupling graphs.
    Structure-based mining approach: considers the coupling graphs of programs
  ‘Efficiently mining crosscutting concerns through random walks’ (Charles Zhang, AOSD 2007)
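To make the PageRank motivation concrete, here is a generic PageRank-style power iteration over a small coupling graph. It is a sketch of the underlying random-walk idea only, not the popularity/significance model of the cited paper; the graph `COUPLING`, the damping factor, and the function name are assumptions made for illustration.

```python
def random_walk_rank(graph, damping=0.85, iterations=100):
    """Stationary visit probabilities of a random walk with teleportation.

    `graph` maps each element to the list of elements it is coupled to.
    """
    nodes = sorted(set(graph) | {v for targets in graph.values() for v in targets})
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        nxt = {v: (1.0 - damping) / n for v in nodes}
        for u in nodes:
            targets = graph.get(u, [])
            if targets:
                share = damping * rank[u] / len(targets)
                for v in targets:
                    nxt[v] += share
            else:
                # Dangling element: spread its probability mass uniformly.
                share = damping * rank[u] / n
                for v in nodes:
                    nxt[v] += share
        rank = nxt
    return rank

# Hypothetical coupling graph: an edge u -> v means "u uses v".
COUPLING = {
    "Order": ["Logger", "Db"],
    "User": ["Logger"],
    "Db": ["Logger"],
    "Logger": [],
}

if __name__ == "__main__":
    for name, score in sorted(random_walk_rank(COUPLING).items(), key=lambda kv: -kv[1]):
        print(f"{name}: {score:.3f}")
```

Elements that many others depend on, directly or transitively, collect the most probability mass, so the ranking reflects graph structure rather than a raw call count.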
Clustering Approach
  Motivation
    Information retrieval: clustering
  Contribution
    Vector space model: a new model for aspect mining
    A clustering approach is adopted for identifying CCs
  ‘Aspect Mining using a Vector-Space Model Based Clustering Approach’ (G. S. Moldovan and G. Serban, LATE, 2007)
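A small sketch of the vector-space idea: each method becomes a numeric vector, and a clustering algorithm groups similar vectors. Plain k-means is used here as a stand-in and may differ from the algorithm in the cited paper; the two features (fan-in, number of calling classes), the `FEATURES` data, and the deterministic initialisation are assumptions for illustration.

```python
def _dist2(p, q):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points serve as initial centers."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest center for every point.
        labels = [min(range(k), key=lambda c: _dist2(p, centers[c])) for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def cluster_methods(features, k=2):
    """Map each method name to a cluster index."""
    names = list(features)
    labels = kmeans([features[name] for name in names], k)
    return dict(zip(names, labels))

# Hypothetical method vectors: (fan-in, number of calling classes).
FEATURES = {
    "Logger.log": (12, 7),
    "Order.total": (1, 1),
    "Tx.begin": (10, 6),
    "User.name": (2, 1),
    "Cart.add": (1, 1),
}

if __name__ == "__main__":
    print(cluster_methods(FEATURES))
```

In this toy data the widely called methods end up in one cluster and the local ones in the other; the cluster of widely called methods is where CC candidates would be looked for.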
Program Analyses for Aspect Mining
  Program analysis (PA) framework
    Static analyses: points-to analysis, escape analysis, and dependence analysis
    Analysis tools for Java: Soot, Indus
  Aspect mining through program analyses
    Clone detection: mines homogeneous CCs
    Dependence analysis: provides dependencies for aspect mining
Outline Introduction Related work Recent research Two States Model Algorithm Selection Model Experiment Conclusion Future work Useful Information
Two-State Model (1/2)
  Information retrieval algorithm
    HITS algorithm: ‘Authoritative sources in a hyperlinked environment’, Jon Kleinberg
  Two-state model
    Scatter: probability of being crosscutting logic
    Centralization: probability of being core logic
  Interaction of the two states
Algorithm (1/2)
  For each node q:
    a_q: scatter value for vertex q
    h_q: centralization value for vertex q
  Computation model
    Probability distribution
    Iterative computation: t → t+1
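The slide's own equations are not recoverable from this text. As a point of reference only, the standard HITS updates are written below in the slide's notation, under the assumption that the scatter value a_q plays the role of the authority score and the centralization value h_q the role of the hub score. E denotes the edge set of the program graph, and the normalisers Z_a and Z_h are chosen so that each vector sums to 1, i.e. forms a probability distribution.

```latex
\begin{align}
a_q^{(t+1)} &= \frac{1}{Z_a^{(t+1)}} \sum_{p \,:\, (p,q) \in E} h_p^{(t)}, \\
h_q^{(t+1)} &= \frac{1}{Z_h^{(t+1)}} \sum_{r \,:\, (q,r) \in E} a_r^{(t+1)}.
\end{align}
```

Read this way, an element gains scatter when elements with high centralization refer to it, and gains centralization when it refers to elements with high scatter, which is one way to interpret the interaction of the two states.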
Algorithm (2/2)
  Matrix form of the previous equation
  Equation (8) converges by the properties of stochastic matrices.
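The iteration can be sketched as a HITS-style power iteration. This is not the author's implementation: it assumes that scatter corresponds to the HITS authority score and centralization to the hub score, that edges are (caller, callee) pairs, and that both vectors are renormalised to sum to 1 after every step. The function name and the `CALLS` data are hypothetical.

```python
def _normalize(values):
    """Scale a score dict so its values sum to 1 (unchanged if all zero)."""
    total = sum(values.values())
    return {k: v / total for k, v in values.items()} if total else values

def two_state_scores(edges, iterations=50):
    """Return (scatter, centralization) dicts for a list of (caller, callee) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    scatter = {n: 1.0 / len(nodes) for n in nodes}
    central = dict(scatter)
    for _ in range(iterations):
        # Scatter of q: sum of centralization over elements that refer to q.
        new_scatter = {n: 0.0 for n in nodes}
        for caller, callee in edges:
            new_scatter[callee] += central[caller]
        scatter = _normalize(new_scatter)
        # Centralization of q: sum of (updated) scatter over elements q refers to.
        new_central = {n: 0.0 for n in nodes}
        for caller, callee in edges:
            new_central[caller] += scatter[callee]
        central = _normalize(new_central)
    return scatter, central

# Hypothetical call graph, for illustration only.
CALLS = [
    ("Order.save", "Logger.log"),
    ("Order.cancel", "Logger.log"),
    ("User.login", "Logger.log"),
    ("User.login", "Auth.check"),
    ("Order.save", "Db.write"),
    ("Order.cancel", "Db.write"),
]

if __name__ == "__main__":
    scatter, central = two_state_scores(CALLS)
    print(max(scatter, key=scatter.get), max(central, key=central.get))
```

A fixed iteration count keeps the sketch short; a real implementation would more likely stop once the change between successive vectors falls below a tolerance.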
Selection Model Implementation and Integration Frequency out-degree
Experimental Setting
  Cases: Prevayler, JHotDraw, and HSQLDB
  Metrics
    Precision: threshold set to 0.4
    Recall: threshold set to 0.5
  Comparison with the fan-in and PageRank algorithms
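For readers unfamiliar with the two metrics, the sketch below shows how precision and recall could be computed for a ranked list of candidates. The slide does not say how its thresholds are applied, so treating the threshold as a cutoff on candidate scores is an assumption, and `SCORES` and `KNOWN_CCS` are made-up data, not results from the experiments.

```python
def precision_recall(scores, known_ccs, threshold):
    """Precision and recall of the elements scoring at least `threshold`."""
    reported = {name for name, score in scores.items() if score >= threshold}
    hits = reported & known_ccs
    precision = len(hits) / len(reported) if reported else 0.0
    recall = len(hits) / len(known_ccs) if known_ccs else 0.0
    return precision, recall

# Hypothetical mining scores and a hypothetical set of known CCs.
SCORES = {
    "Logger.log": 0.9,
    "Tx.begin": 0.6,
    "Order.total": 0.45,
    "User.name": 0.1,
}
KNOWN_CCS = {"Logger.log", "Tx.begin", "Cache.flush"}

if __name__ == "__main__":
    for cutoff in (0.4, 0.5):
        p, r = precision_recall(SCORES, KNOWN_CCS, cutoff)
        print(f"threshold={cutoff}: precision={p:.2f}, recall={r:.2f}")
```

Raising the cutoff typically trades recall for precision, which is why the threshold has to be reported alongside the numbers.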
Results for Prevayler
Results for JHotDraw
Results for HSQLDB
Advice for Aspect Refactoring
  Graph AGAC
    AGAC is generated with our model.
  Grouping crosscutting concerns
    Mine association rules from AGAC
    Group CCs from the rules
Conclusion
  We apply a two-state model to aspect mining.
    The model is based on the scatter and centralization states of program elements.
    We design an algorithm to compute the “scatter” and “centralization” states.
    With the two-state model, we generate advice for aspect refactoring.
  ‘Aspect Mining through Link Analysis’, Jin Huang, FCST 2010.
Outline Introduction Related work Recent research Future work Structural Aspect Mining Clustering Useful Information
Structural Aspect Mining (1/2)
  Disadvantage of existing aspect mining methods
    Too simple to find structural information for aspect refactoring
  Example: the Observer pattern may cause crosscutting concerns for ‘updating’.
  Refactoring
Structural Aspect Mining (2/2)
  Related topics
    Interactions of structural aspects
      'Analyzing Interactions of Structural Aspects', Benoit Kessler, Eric Tanter. International Workshop on Aspects, Dependencies and Interactions (ADI)
    Aspect dependencies and interactions
      'AspectOptima: A Case Study on Aspect Dependencies and Interactions', Jorg Kienzle, Ekwa Duala-Ekoko, Samuel Gelineau, 2009.
Clustering
  Vector Space Model
    OO metrics are suggested.
    'Aspect Mining Using Self-Organizing Maps With Method Level Dynamic Software Metrics as Input Vectors', Sayyed Garba Maisikeli
  EM Clustering
    A model-based clustering approach
    Problems: center identification, number of clusters
Outline Introduction Related work Recent research Future work Useful Information
Useful Information
  AOP vs. OOP: html
  Mining Software Engineering Data:
  Program Analysis
    Indus:
    Soot:
  Information Retrieval
    Link Analysis: 05/CS583-link-analysis.ppt
    EM Clustering: maximization_algorithm