Timna: A Framework for Automatically Combining Aspect Mining Analyses. David Shepherd, Jeffrey Palm, Lori Pollock, Mark Chu-Carroll.

Presentation transcript:

1 Timna: A Framework for Automatically Combining Aspect Mining Analyses
David Shepherd (1), Jeffrey Palm (2), Lori Pollock (1), Mark Chu-Carroll (3)
(1) University of Delaware, (2) Northeastern University, (3) IBM

2 Introduction - What is AOP?
Aspect Oriented Programming
Language Support for Crosscutting Concerns (CCCs)
[Figure, from aspectj.com: each rectangle represents a source file; red lines represent source code lines implementing concept A]

3 Introduction - A Closer Look at AOP
Benefits of Refactoring into AOP
Increased "ilities"
–readability
–maintainability
–extensibility
Crosscutting concerns (CCCs) are explicit
    aspect DisplayUpdating {
        after(): call(void Line.setP1(Point)) || call(void Line.setP2(Point)) {
            Display.update();
        }
    }
    class Line {
        private Point p1, p2;
        Point getP1() { return p1; }
        Point getP2() { return p2; }
        void setP1(Point p1) { this.p1 = p1; }
        void setP2(Point p2) { this.p2 = p2; }
    }
[Figures from aspectj.com; callouts: "Display.update();" and "Responsibility of AOP Community"]

4 Working Assumptions
1. AOP can provide benefits in modularity for some problems
2. Crosscutting Concerns are dangerous; awareness is essential

5 Targeted Problem: Mining
Legacy Applications
–refactor into AOP
Aspect Mining is the process of finding these candidates
[Figures generated by the AJDT]

6 Targeted Problem: Mining
Applications written with AOP still have problems...
3 different programmers implement 1 concept

7 State of the Art
Researchers currently create a single analysis to perform aspect mining
Examples
–Fan-in Analysis [Marin et al, WCRE 04]
–Code Clone Analysis [Shepherd et al, SERP 05]
–Dynamic Analysis [Breu et al, ASE 04]
–...
Little work has been done on combining analyses

8 State of the Art - Fan-In [Marin et al, WCRE 04]
checkPermission: Good Candidate for Refactoring
    public void credit(float amount) {
        AccessController.checkPermission(
            new BankingPermission("accountOperation"));
        _balance = _balance + amount;
    }
    public void debit(float amount) throws InsufficientBalanceException {
        AccessController.checkPermission(
            new BankingPermission("accountOperation"));
        if (_balance < amount) {
            throw new InsufficientBalanceException("Insufficient total balance");
        } else {
            _balance = _balance - amount;
        }
    }
next: Bad Candidate for Refactoring
    public void checkOut(SpecialList items) {
        SpecialIterator it = items.iterator();
        while (it.hasNext())
            checkOut(it.next());
    }
    public void markItems(SpecialList items) {
        SpecialIterator it = items.iterator();
        while (it.hasNext())
            ((Item) it.next()).mark();
    }
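As a rough illustration of the metric on this slide, the sketch below computes fan-in as the number of distinct callers of each method. The edge list is transcribed by hand from the two examples above, and the names `FanInSketch`, `fanIn`, and `slideExampleEdges` are hypothetical; this is not the implementation from [Marin et al, WCRE 04].

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical sketch: fan-in of a method = number of distinct methods that call it. */
final class FanInSketch {

    /** Each edge is a {caller, callee} pair; the call graph itself is assumed to be given. */
    static Map<String, Integer> fanIn(String[][] callEdges) {
        Map<String, Set<String>> callersByCallee = new TreeMap<>();
        for (String[] edge : callEdges) {
            // A set of callers means repeated calls from one method count once.
            callersByCallee.computeIfAbsent(edge[1], k -> new HashSet<>()).add(edge[0]);
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, Set<String>> entry : callersByCallee.entrySet()) {
            result.put(entry.getKey(), entry.getValue().size());
        }
        return result;
    }

    /** Call edges read off the slide's two examples (hand-transcribed, illustrative only). */
    static String[][] slideExampleEdges() {
        return new String[][] {
            {"credit", "AccessController.checkPermission"},
            {"debit", "AccessController.checkPermission"},
            {"checkOut", "SpecialIterator.next"},
            {"markItems", "SpecialIterator.next"},
        };
    }

    public static void main(String[] args) {
        // Prints each callee with its fan-in, in alphabetical order of callee name.
        fanIn(slideExampleEdges()).forEach((callee, count) ->
            System.out.println(callee + " fan-in = " + count));
    }
}
```

In this tiny example `checkPermission` and `next` receive the same fan-in, which hints at why a single metric may not separate the good candidate from the bad one.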

9 State of the Art - Clone Detection [Shepherd et al, SERP 05]
Conventional Transaction Management
    public void debit(float amount) throws InsufficientBalanceException {
        UserTransaction ut =...;
        try {
            ut.begin();
            ... business logic ...
            ut.commit();
        } catch (Exception ex) {
            ut.rollback();
            // rethrow after logging and wrapping
        }
    }
    public void credit(float amount) {
        UserTransaction ut =...;
        try {
            ut.begin();
            ... business logic ...
            ut.commit();
        } catch (Exception ex) {
            ut.rollback();
            // rethrow after logging and wrapping
        }
    }

10 Remaining Challenges
Combining Analyses
–if (code clone & fan-in high), more likely to be a refactoring candidate
Running a large number of analyses
–Methods with void return types
–Getters and setters
–...
Our framework (Timna) combines analyses to make a decision
We invented several new analyses and use 11 analyses in total
Humans do this during manual mining

11 Key Insight
Currently, humans are the best miners. What is their process?
1. Manual Inspection
2. Learn to identify candidates in a specific system
3. Generalize to other systems
4. Apply in other systems

12 Automated Approach
1. Create Training Data
   1. Manually Tag Known Program
   2. Automatically Run Individual Mining Analyses
2. Learn
   Output: set of rules that classify methods (boolean or categorical)
3. Classify Unknown Programs
   Output: refactoring candidates

13 Approach - Learning (Create Training Data)
Steps: 1. Create Training Data  2. Learn  3. Apply
[Diagram: Known Program -> Manual Tagging -> Classification Table -> Mining Analyses (Fan-in, No Parameters, Code Clone Pairings) -> Augmented Classification Table -> Machine Learning (Learn Rules) -> Classification Rules]
Classification Table: Method A, Class 2; Method B, Class 1; Method C, Class 1; Method D, Class 3
Augmented Classification Table: Method A, Attributes, Class 2; Method B, Attributes, Class 1; Method C, Attributes, Class 1; Method D, Attributes, Class 3
Example table after manual tagging:
Method Identifier   Attributes    Classification
toolDone            {?, ?, ?}     refactor
setTool             {?, ?, ?}     don't refactor
exit                {?, ?, ?}     don't refactor
Start               {?, ?, ?}     refactor

14 Approach - Learning (Create Training Data, Learn Rules)
Steps: 1. Create Training Data  2. Learn  3. Apply
[Diagram: Program -> Manual Tagging -> Classification Table -> Mining Analyses (Fan-in, No Parameters, Code Clone Pairings) -> Augmented Classification Table -> Machine Learning -> Classification Rules]
Classification Table: Method A, Class 2; Method B, Class 1; Method C, Class 1; Method D, Class 3
Augmented Classification Table: Method A, Attributes, Class 2; Method B, Attributes, Class 1; Method C, Attributes, Class 1; Method D, Attributes, Class 3
Example table after running the mining analyses:
Method Identifier   Attributes       Classification
toolDone            {6, false, 3}    refactor
setTool             {1, true, 1}     don't refactor
exit                {0, false, 2}    don't refactor
Start               {4, false, 2}    refactor
Classification Rules:
1. If (Fan-in > 5 and is-Void = true), then (refactor)
2. If (true), then (don't refactor)
Final Result: Only output to Classifying Phase
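To make the Learn step more concrete, here is a minimal sketch of a one-attribute threshold learner (a decision stump) trained on the example rows above. It is a deliberately simple stand-in: the slides do not say which learning algorithm Timna uses, the names `StumpLearnerSketch` and `learnFanInThreshold` are hypothetical, and the sketch assumes the first value in each attribute tuple is fan-in.

```java
/** Hypothetical stand-in learner: picks a fan-in threshold from tagged training rows. */
final class StumpLearnerSketch {

    /**
     * Returns the threshold t, chosen among the observed fan-in values, for which
     * the rule "fan-in > t => refactor" classifies the most training rows correctly.
     * Ties keep the candidate seen first.
     */
    static int learnFanInThreshold(int[] fanIn, boolean[] refactor) {
        if (fanIn.length == 0 || fanIn.length != refactor.length) {
            throw new IllegalArgumentException("need one tag per method and at least one method");
        }
        int bestThreshold = fanIn[0];
        int bestCorrect = -1;
        for (int candidate : fanIn) {
            int correct = 0;
            for (int i = 0; i < fanIn.length; i++) {
                boolean predictedRefactor = fanIn[i] > candidate;
                if (predictedRefactor == refactor[i]) {
                    correct++;
                }
            }
            if (correct > bestCorrect) {
                bestCorrect = correct;
                bestThreshold = candidate;
            }
        }
        return bestThreshold;
    }

    public static void main(String[] args) {
        // Rows from the example table: toolDone, setTool, exit, Start.
        int[] fanIn = {6, 1, 0, 4};
        boolean[] refactor = {true, false, false, true};
        System.out.println("If (Fan-in > " + learnFanInThreshold(fanIn, refactor) + "), then (refactor)");
    }
}
```

On the four example rows this stump settles on fan-in > 1, which differs from the slide's illustrative rule; a real learner would weigh all attributes, not just one.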

15 Approach - Classifying (Classify Unknown Program)
[Diagram: Unknown Program -> Classification Table -> Mining Analyses (Fan-in, No Parameters, Code Clone Pairings) -> Augmented Classification Table -> Classifier (using the Classification Rules) -> Completed Classification Table]
Classification Table: Method A; Method B; Method C; Method D
Augmented Classification Table: Method A, Attributes; Method B, Attributes; Method C, Attributes; Method D, Attributes
Completed Classification Table: Method A, Attributes, Class 2; Method B, Attributes, Class 1; Method C, Attributes, Class 1; Method D, Attributes, Class 3
Example, Classification Table:
Method Identifier   Attributes       Classification
showPrompt          {}
takeOrder           {}
sendMessage         {}
end                 {}
Example, Augmented Classification Table:
Method Identifier   Attributes       Classification
showPrompt          {6, false, 3}
takeOrder           {1, true, 1}
sendMessage         {0, false, 2}
end                 {4, false, 2}
Example, Completed Classification Table:
Method Identifier   Attributes       Classification
showPrompt          {6, false, 3}    refactor
takeOrder           {1, true, 1}     don't refactor
sendMessage         {0, false, 2}    don't refactor
end                 {4, false, 2}    refactor
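The classifying phase amounts to evaluating the learned rules, in order, against each method's attributes. The sketch below encodes the two rules shown on the Learn Rules slide as a first-match-wins list. The types `MethodAttributes` and `Rule` and the `isVoid` inputs are assumptions made for illustration, since the slides do not show how rules are represented or list is-Void values for these methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch of applying an ordered rule list to one method's attributes. */
final class RuleListSketch {

    /** Only the two attributes used by the slide's rule are modelled here. */
    static final class MethodAttributes {
        final int fanIn;
        final boolean isVoid;

        MethodAttributes(int fanIn, boolean isVoid) {
            this.fanIn = fanIn;
            this.isVoid = isVoid;
        }
    }

    /** A condition paired with the classification it assigns when the condition holds. */
    static final class Rule {
        final Predicate<MethodAttributes> condition;
        final String classification;

        Rule(Predicate<MethodAttributes> condition, String classification) {
            this.condition = condition;
            this.classification = classification;
        }
    }

    /** The two rules from the Learn Rules slide, in their original order. */
    static List<Rule> slideRules() {
        List<Rule> rules = new ArrayList<>();
        rules.add(new Rule(m -> m.fanIn > 5 && m.isVoid, "refactor"));
        rules.add(new Rule(m -> true, "don't refactor")); // default rule
        return rules;
    }

    /** Returns the classification of the first rule whose condition matches. */
    static String classify(List<Rule> rules, MethodAttributes method) {
        for (Rule rule : rules) {
            if (rule.condition.test(method)) {
                return rule.classification;
            }
        }
        throw new IllegalStateException("rule list has no default rule");
    }

    public static void main(String[] args) {
        // Attribute values below are made up for the demonstration.
        System.out.println(classify(slideRules(), new MethodAttributes(6, true)));
        System.out.println(classify(slideRules(), new MethodAttributes(1, true)));
    }
}
```

Keeping the rules as an ordered list mirrors the numbered form on the slide, where the final always-true rule acts as the default.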

16 Evaluation Questions
1. Does combining analyses increase precision and recall?
2. Are generated rules effective on other programs?
3. Does categorical tagging increase performance?
4. What is the (time) overhead?
5. Can rules help direct research and evaluate new analyses?
Tagged in two different ways:
–Boolean: either refactor or don't
–Categorical: either don't refactor or a reason (category) why to refactor

17 Experimental Setup
Subject Programs
–Training Program (JHotDraw, 11K LOC)
–Testing Program (PetStore, 9K LOC)
Steps
1. Train
2. Test on Training Program
3. Test on Testing Program
Metrics
–Precision and Recall
–Time

18 Experimental Results
[Chart; series labels: "Timna: Cat", "Timna: Bool"]
Program not tagged, so can't calculate recall
Precision = (number of good candidates returned) / (number of candidates returned)
Recall = (number of good candidates returned) / (number of actual good candidates)
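The two metrics can be computed directly from the set of returned candidates and the set of manually tagged good candidates. The sketch below uses the standard definitions, in which the numerator of both ratios is the number of good candidates returned. The method names in the example are hypothetical and the numbers are not results from the study.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of the precision and recall metrics over sets of method names. */
final class PrecisionRecallSketch {

    /** Number of returned candidates that are also tagged as good. */
    static int goodReturned(Set<String> returned, Set<String> actualGood) {
        Set<String> overlap = new HashSet<>(returned);
        overlap.retainAll(actualGood);
        return overlap.size();
    }

    /** Good candidates returned divided by all candidates returned. */
    static double precision(Set<String> returned, Set<String> actualGood) {
        if (returned.isEmpty()) {
            return 0.0; // convention chosen here for an empty result set
        }
        return (double) goodReturned(returned, actualGood) / returned.size();
    }

    /** Good candidates returned divided by all actual good candidates. */
    static double recall(Set<String> returned, Set<String> actualGood) {
        if (actualGood.isEmpty()) {
            return 0.0; // convention chosen here when nothing is tagged as good
        }
        return (double) goodReturned(returned, actualGood) / actualGood.size();
    }

    /** Small helper to build a set from names. */
    static Set<String> setOf(String... names) {
        return new HashSet<>(Arrays.asList(names));
    }

    public static void main(String[] args) {
        Set<String> returned = setOf("toolDone", "setTool", "Start");
        Set<String> actualGood = setOf("toolDone", "Start", "undo", "redo");
        System.out.println("precision = " + precision(returned, actualGood));
        System.out.println("recall = " + recall(returned, actualGood));
    }
}
```

Recall needs the full set of actual good candidates, which is why it cannot be calculated for a program that has not been tagged.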

19 Experimental Results
1. Does combining analyses increase precision and recall?
[Chart; series labels: "Timna: Cat", "Timna: Bool"]
Why is Fan-In performing poorly? Single analyses work well for specific cases, but fail to find all aspects.
In this case, combining analyses does increase precision and recall.

20 Experimental Results
2. Are generated rules effective on other programs?
[Chart; series labels: "Timna: Cat", "Timna: Bool"]
In this case, the rules effectively mine from the testing program.

21 Experimental Results
3. Does categorical tagging increase performance?
[Chart; series labels: "Timna: Cat", "Timna: Bool"]
In this case, the categorical tagging and the boolean tagging perform similarly.

22 Experimental Results
4. What is the (time) overhead?
[Table; column headers: "Analyze", "Learn"; labels: "Timna: Cat", "Timna: Bool"; values as transcribed: 6.24s, 1.88s, --, 5m28s, 2m04s]
Callouts: "Can be done incrementally, at each compile/edit, or overnight" (shown twice); "Only done once"
From these results, we believe Timna could be integrated into an IDE without degrading response time.

23 Experimental Results
5. Can rules help direct research and evaluate new analyses?
–[WARE 05] elaborates on use in evaluating new analyses
–If an analysis does not appear in the rules, it is providing no new information
–Human readable rules can help define style

24 Contributions
Technique to combine mining analyses to automatically identify refactoring candidates
Demonstrated how to apply machine learning to learn good AOP style from canonical examples
–generate human readable rules
Invented several (7) novel mining analyses during our initial use of Timna
Showed experimental evidence that combining analyses can improve performance

25 Possible Application
Implicit Rule: Always create UndoActivity before "executing" undoable
Implicit Rule: Check damage after changing drawing
Possible Rule: Clear Selection before changing drawing
"Of course! Whenever I change something in the drawing, I should check to see if I damaged the drawing. Moving this concept to an aspect can eliminate a lot of similar calls from my OOP code."
Provide hints, shaded by level of confidence
AOSD-style programming with reduced costs and burdens on the developer

26 Future Work
Examine other aspect categorization
–"Sorts" (Marin et al, ICSM 05)
Extend with additional analyses
–NLP-based analyses
Apply to more unknown programs