1 Using Natural Language Program Analysis to Locate and Understand Action-Oriented Concerns
David Shepherd, Zachary P. Fry, Emily Hill, Lori Pollock, and K. Vijay-Shanker

2 Motivation
60-90% of software costs are spent on reading and navigating code for maintenance* (fixing bugs, adding features, etc.). Why do maintenance tasks require so much comprehension overhead?
Recently I encountered research that changed how I think about software maintenance, and the key fact that influenced me was the figure above. I found it both surprising and validating. It spurred me to dig deeper into this problem and ask: what is happening during maintenance? To answer this question, let's look at a concrete situation.
*[Erlikh] Leveraging Legacy System Dollars for E-Business

3 Motivating Example - Problem
Concerns are often crosscutting.
Notes: Map of thought to code, with a question mark in between (how do we get there?). Code = Package Explorer in Eclipse.
[Figure axis label: Time]

4 Motivating Example - Problem
Notes: Make sure people understand what the bug reports are. Really want to make this point?
[Figure axis label: Time]

5 Motivating Example - Problem
Notes: Make sure people understand what the bug reports are. Really want to make this point?
[Figure axis label: Time]

6 Motivating Example - Problem
Problems involving the same concern occur multiple times.
Lost effort / duplicate work.
Notes: Make sure people understand what the bug reports are. Really want to make this point?
[Figure axis label: Time]

7 Key Challenge: Concern Location
Find, collect, and understand all source code related to a particular concept.
Concern location is the foundation of many software maintenance tasks: Fix Concern, Upgrade Concern, Adapt Concern, “Copy” Concern.

8 Motivating Example - Current Strategies
Return large result sets.
Return irrelevant results.
Return hard-to-interpret result sets.

9 The Find-Concept Approach
1. More effective search
2. Improved search terms
3. Understandable results
[Diagram: a concept is turned into a concrete query and passed to Find-Concept, which uses an NL source code model built from the natural language information in the source code, offers recommendations, and produces a result graph of methods (Method a through Method e).]
Notes: Overview of approach with DFA.

10 Example Use of Find-Concept
Target Concern: Automatically finish a word
Concern Location Task: Find code for the concept "automatically finish a word"

11 Example Use of Find-Concept
[Diagram: query states of the Find-Concept process: Abstract Initial-Query; Concrete Initial-Query; Verb-Query and Direct-Object-Query (DO) Initialized; Expanded Verb-Query and Direct-Object-Query; List of Methods; Result Graph; Concern Comprehension.]
Notes: Show the query being inputted, DFA to the right. Expand query to different forms of same word.

12 Forming an Abstract Query
Abstract initial query: Finish the word automatically
[Diagram: Find-Concept query states, as on slide 11.]

13 Forming a Concrete Query
[Diagram: Find-Concept query states, as on slide 11.]

14 Queries Initialized
[Diagram: Find-Concept query states, as on slide 11.]

15 Expand Verb Query
[Diagram: Find-Concept query states, as on slide 11.]

16 Examine Verb Query Recommendations
[Screenshot labels: DO; ordered list of verb recommendations; stemmed verb; summary of reasons.]
Notes: Insert thesaurus info.

17 Examine Verb Query Recommendations
[Screenshot label: DO.]
Natural language and program knowledge are used to make recommendations.
Notes: Insert thesaurus info.

18 Expand DO Query
[Diagram: query states of the Find-Concept process: Abstract Initial-Query; Concrete Initial-Query; Verb-Query and Direct-Object-Query (DO) Initialized; Expanded Verb-Query and Direct-Object-Query; List of Methods; Result Graph; Concern Comprehension.]

19 Examine DO Recommendations
Notes: Insert thesaurus info.

20 Examine Intermediate Feedback
[Diagram: query states of the Find-Concept process: Abstract Initial-Query; Concrete Initial-Query; Verb-Query and Direct-Object-Query (DO) Initialized; Expanded Verb-Query and Direct-Object-Query; List of Methods; Result Graph; Concern Comprehension.]

21 Examine Intermediate Feedback
[Diagram: the same Find-Concept query states, with Expanded Verb-Query and Direct-Object-Query and Result Graph highlighted.]

22 Examine Result Graph
Gain understanding of concern.
Answer common development questions, like:
Where is this concern’s trigger-point?
Where are completions determined?

23 Discussion: Example Use of Find-Concept
Expanded queries: added verb "complete"; added direct object (DO) "completions".
Effective search: verb "complete", not text “complete”.
Understandable results: overview of concern; answer common questions.
Notes: Show graphic for each step. Word -> words. Search result for “complet”. Result graph vs Google list?

24 Underlying Program Analysis
NL Source Code Representation [AOSD 06]
Action (verb) is key to crosscutting concerns (CCCs).
To precisely identify actions, the direct object is needed.
Representation: Action-Oriented Identifier Graph (AOIG)
Word Recommendation
Stemmed/Rooted: complete, completing
Synonym: finish, complete
Co-location: completeWord()
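The three recommendation sources listed above (stemmed forms, synonyms, co-location) can be illustrated with a minimal Python sketch. This is not the Find-Concept implementation: the `SYNONYMS` table, the suffix-stripping `crude_stem`, the "first word is the verb" rule, and ranking by number of reasons are all assumptions made for illustration, and co-location is interpreted here as "a verb that appears with the query's direct object in a method name".

```python
import re

# Hypothetical stand-in for a thesaurus; a real tool would consult a lexical resource.
SYNONYMS = {"finish": {"complete", "end"}}


def split_identifier(name):
    """Split a camelCase identifier into lowercase words."""
    return [w.lower() for w in re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", name)]


def crude_stem(word):
    """Strip a few common English suffixes; a rough stand-in for a real stemmer."""
    for suffix in ("ions", "ion", "ing", "ed", "es", "e", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def recommend_verbs(query_verb, query_do, method_names):
    """Return (verb, reasons) pairs, verbs with more supporting reasons first."""
    reasons = {}
    for name in method_names:
        words = split_identifier(name)
        if not words:
            continue
        # Simplifying assumption: the first word is the verb, the rest is the direct object.
        verb, do_words = words[0], words[1:]
        if verb == query_verb:
            continue  # already part of the query
        found = set()
        if crude_stem(verb) == crude_stem(query_verb):
            found.add("stem")
        if verb in SYNONYMS.get(query_verb, set()):
            found.add("synonym")
        if any(crude_stem(w) == crude_stem(query_do) for w in do_words):
            found.add("co-location")
        if found:
            reasons.setdefault(verb, set()).update(found)
    # Assumed ranking: more reasons first, ties broken alphabetically.
    return sorted(
        ((verb, sorted(found)) for verb, found in reasons.items()),
        key=lambda item: (-len(item[1]), item[0]),
    )
```

For the query verb "finish" and direct object "word", calling `recommend_verbs("finish", "word", ["completeWord", "paintWord", "endSession"])` ranks "complete" first, because it is both a listed synonym and co-located with "word". Each recommendation carries its list of reasons, in the spirit of the "summary of reasons" shown in the tool.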

25 Experimental Evaluation
Research Questions
Which search tool is most effective at forming and executing a query?
Which search tool requires the least human subject effort to form an effective query?
Methodology
Create 9 concern location tasks with answers.
Ask developers to locate concerns using one of three tools.

26 Experimental Setup
Tools: Find-Concept, GES, Lexical Search (ELex)
Subject apps (LOC > 20K): JBidWatcher, IReport, JavaHMO, Jajuk
Search tasks: from bug reports
Human subjects: 13 developers, 5 grad students
Notes: Quick explanation of basic methodology.

27 Dependent Variables and Measures
Search results: precision (quality), recall (completeness), F-measure (quality and completeness)
User effort: elapsed time, effort metric
Notes: Intuitive definitions (example from Google).
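As a small worked illustration of the three search-result measures, the Python sketch below computes them for one query. The slides do not say which F-measure variant was used; the balanced F1 (harmonic mean of precision and recall) is assumed here, and the method names in the usage note are made up.

```python
def precision_recall_f(returned, relevant):
    """Compute precision, recall, and balanced F-measure for one search result set."""
    returned, relevant = set(returned), set(relevant)
    hits = len(returned & relevant)
    # Precision: fraction of returned methods that belong to the concern (quality).
    precision = hits / len(returned) if returned else 0.0
    # Recall: fraction of the concern's methods that were returned (completeness).
    recall = hits / len(relevant) if relevant else 0.0
    # Assumed variant: F1, the harmonic mean of precision and recall.
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure
```

For example, if a tool returns four methods and two of them are among the three methods that really implement the concern, precision is 2/4, recall is 2/3, and the F-measure is 4/7.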

28 F-Measure: Find-Concept Superior
In 5/9 tasks.
Query expansions helped: “add textfield” expanded to include “create” and “construct”; “save auctions” expanded to “save file”.
Precise search: verb “play” when text “play” occurs in over 500 methods; DO “auction” when text “auction” occurs in many files.
Notes: Could add pictures for this.
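The difference between matching the text "play" and matching the verb "play" can be sketched in a few lines of Python. This is only an illustration: the method names are invented, and treating the first word of a camelCase method name as its verb is a simplifying assumption, not how Find-Concept extracts verbs.

```python
import re


def split_identifier(name):
    """Split a camelCase identifier into lowercase words."""
    return [w.lower() for w in re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", name)]


def lexical_search(term, method_names):
    """Plain text search: any method whose name contains the term."""
    return [m for m in method_names if term.lower() in m.lower()]


def verb_search(verb, method_names):
    """Verb search: only methods whose leading word (assumed to be the verb) matches."""
    hits = []
    for m in method_names:
        words = split_identifier(m)
        if words and words[0] == verb.lower():
            hits.append(m)
    return hits
```

Given the hypothetical names `playFile`, `displayPlaylist`, `getPlayer`, `playNextTrack`, and `isPlaying`, the lexical search for "play" returns all five, while the verb search returns only `playFile` and `playNextTrack`.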

29 F-Measure: Find-Concept Equivalent
In 2/9 tasks.
Cases: all tools failed; expanding the query did not help.

30 F-Measure: Find-Concept Inferior
In 2/9 tasks.
Extraction maturity: gets <compile>, needs <compile report>.
Extraction limitations: URLs.
Most issues are easy to address.
Notes: Maybe insert some examples?

31 Overall Results
Across all tasks
Effectiveness: FC > ELex with statistical significance; FC >= GES on 7/9 tasks; FC is more consistent than GES.
Effort: FC = ELex = GES.
FC is more consistent and more effective in the experimental study without requiring more effort.

32 Conclusions
FC addresses fundamental problems with code search: effective search terms (not irrelevant results); precise search (not large result sets); understandable results (not a list).
Experimental evaluation: FC performed more consistently and more effectively than other tools.
NL techniques can be easily improved.
Natural language shows promise for improving software search.

34 Overview of Advantages
Searches a high-level representation.
Assists in query expansion.
Automatically constructs and displays the result graph.

35 Approach: Process Overview
Formulate a query.
Expand the query.
Search the AOIG and inspect the result graph.
Notes: Show the feature being used, DFA to the right. Show the user generating the abstract query. Got pics for this.

36 Approach: Overview
[Diagram: states of the process: Abstract Initial-Query; Concrete Initial-Query; Verb-Query and Direct-Object-Query Initialized; Expanded Verb-Query and Direct-Object-Query; List of Methods; Result Graph; Concern Comprehension.]
Notes: Show the query being inputted, DFA to the right. Expand query to different forms of same word.

37 Motivating Example - Problem
Problems involving the same concern occur multiple times.
Notes: Bug reports from Jajuk showing the same problem several times (playing a file).

38 Motivating Example - Difficulty
The concern is crosscutting.
Notes: Show by example: crosscuttingness of play (show play in concern mapper); imprecision of other searches (play but not song?); understandability issues (how are all of the methods related? show without links).

39 Motivating Example - Difficulty
Irrelevant Results

40 Motivating Example - Difficulty
Who calls whom? Results are difficult to understand.

59 F-Measure: Find-Concept Equivalent
In 2/9 tasks.
Cases where basic search is good enough.

60 Results: Find-Concept Inferior
In 2/9 tasks.
Extraction maturity: IReportCompiler.run()
Extraction limitations
These issues are easy to address.
Notes: Maybe insert some examples?
