1
Memories of Bug Fixes
Sunghun Kim, Kai Pan, and E. James Whitehead Jr., University of California, Santa Cruz
Presented by Gleneesha Johnson, CMSC 838P, Fall 2006
2
Source Code Repositories
Collection of code changes
Most long-lived projects have one
Typically used to store histories and make backups
Knowledge contained within hasn’t been fully leveraged
–Previous development experience
–Changes that fix bugs are particularly interesting
3
Horizontal Bug Finding Techniques
Applicable across all projects
Based on pre-defined bug patterns, theorem proving, and model checking
Examples: ESC/Java, FindBugs, JLint, PMD
4
Vertical Bug Finding Techniques
Aim for project-specific bugs
Projects differ in requirements, business logic, and semantics
BugMem
–Learns bug patterns over time
5
Bug Fix Memories
Project-specific bug and fix knowledge base
–Developed by analyzing the history of bug fixes
–Can be used to detect bugs and suggest fixes
6
Building Bug Fix Memories
Extract source code, change logs, and deltas
–Kenyon
File change
–Contains a list of region pairs that show differences between two file versions
–Regions are called hunks and consist of source code lines
Identify changes where a bug was fixed
–Keyword search
–Search for references to bug reports
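The keyword search and bug-report reference search above can be sketched as follows. This is a minimal illustration, not the tool's actual heuristic: the keyword list, the reference pattern, and the function names `is_bug_fix` and `referenced_bug_ids` are all assumptions made for this example.

```python
import re

# Hypothetical keyword list and bug-report reference pattern; a real project
# would tune both to its own change-log conventions.
FIX_KEYWORDS = re.compile(r"\b(fix(e[sd])?|bugs?|patch(ed)?)\b", re.IGNORECASE)
BUG_REFERENCE = re.compile(r"(?:#|\bbug\s+|\bissue\s+)(\d+)", re.IGNORECASE)


def is_bug_fix(log_message):
    """Return True if a change log message looks like it describes a bug fix."""
    return bool(FIX_KEYWORDS.search(log_message)
                or BUG_REFERENCE.search(log_message))


def referenced_bug_ids(log_message):
    """Return the bug report numbers mentioned in a change log message."""
    return BUG_REFERENCE.findall(log_message)
```

A message such as "Fixed NPE in parser, see #4211" would be flagged as a fix, while "Add new export dialog" would not.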
7
Hunks
Hunk pair (HP) → deleted hunk (DH), treated as the bug hunk (BH), and added hunk (AH), treated as the fix hunk (FH)
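To make the hunk pair idea concrete, here is a small sketch that derives (deleted hunk, added hunk) pairs from two versions of a file. It uses Python's standard `difflib` as a stand-in for whatever diff engine the original tooling relies on; the function name `hunk_pairs` is an assumption for this example.

```python
import difflib


def hunk_pairs(old_lines, new_lines):
    """Return (deleted_hunk, added_hunk) pairs between two file versions.

    Each hunk is a list of source lines. A pure insertion yields an empty
    deleted hunk, and a pure deletion yields an empty added hunk.
    """
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # keep only the regions that differ
            pairs.append((old_lines[i1:i2], new_lines[j1:j2]))
    return pairs


old = ["int n = list.size();", "if (n > 0) {", "  use(list.get(n));", "}"]
new = ["int n = list.size();", "if (n > 0) {", "  use(list.get(n - 1));", "}"]
pairs = hunk_pairs(old, new)
```

Here `pairs` holds a single hunk pair: the off-by-one `list.get(n)` line as the bug hunk and the corrected `list.get(n - 1)` line as the fix hunk.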
8
Extracting Components From Hunks
Extracts syntax patterns, called components, from hunks
Parsing – extracts syntax components
Normalizing – generalizes syntax to match similar code
Filtering – eliminates noise
–Information filtering
–Diff filtering
9
Raw Component Extraction (Parsing)
Preprocessing
Basic syntax lines
–Differentiate between composite statements (if, for, while) and simple statements (method call, assignment, etc.)
–All multi-line simple statements are placed on a single line
–Conditional predicates of if, for, while, etc. all lie on a single line
Raw components
–Based on the abstract syntax tree of a basic syntax line
–4 kinds: static Java call, Java call, user-defined call, and non-call
10
Normalization
If the variable type of a raw component is known, normalize the variable to its type, e.g., foo.flag → Foo.flag
Generate two components for raw components that contain numeric, boolean, char, or string literals
–One with and one without normalization for the literal
–e.g., i = 1 → int = 1 & int = int
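The two normalization steps above can be illustrated with a small text-based sketch. It is a simplification under stated assumptions: it works on the statement text rather than on an abstract syntax tree, it handles only integer literals, and the names `normalize` and `var_types` are hypothetical.

```python
import re

IDENTIFIER = re.compile(r"\b[A-Za-z_]\w*\b")
INT_LITERAL = re.compile(r"\b\d+\b")


def normalize(statement, var_types):
    """Return the components for one statement, least normalized first.

    var_types maps variable names to their declared types. Only integer
    literals are handled in this sketch.
    """
    # Replace every variable whose type is known with that type.
    typed = IDENTIFIER.sub(
        lambda m: var_types.get(m.group(), m.group()), statement)
    components = [typed]
    # If the statement contains a literal, add a second component in which
    # the literal is replaced by its type as well.
    literal_free = INT_LITERAL.sub("int", typed)
    if literal_free != typed:
        components.append(literal_free)
    return components
```

With these assumptions, `normalize("i = 1", {"i": "int"})` yields `int = 1` and `int = int`, and `normalize("foo.flag", {"foo": "Foo"})` yields `Foo.flag`. The position in the returned list can serve as a simple normalization level.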
11
Normalization Continued
For a method call, actual parameters are normalized to the type of the parameter
Normalization level
–Indicates a component’s degree of normalization relative to the other components extracted from the same raw component
–Level increases with degree of normalization
12
Examples
13
Information Filtering
Information value – indicates how much unique information a component carries
–Determined by summing the information values of its elements
Information value threshold – used to filter out components with little unique information
–Two is used as the threshold in the paper
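The sum-and-threshold idea can be sketched as below. The per-element weights are purely hypothetical, since the slide does not list the values assigned to each kind of element, and dropping components that score below the threshold (rather than at or below it) is likewise an assumption. Only the threshold of two comes from the slide.

```python
# Hypothetical per-element weights, chosen only for illustration.
ELEMENT_VALUES = {"type": 0, "operator": 0, "literal": 1, "name": 1, "call": 2}
THRESHOLD = 2  # the threshold stated on the slide


def information_value(elements):
    """Sum the values of a component's elements, given as (kind, text) pairs."""
    return sum(ELEMENT_VALUES[kind] for kind, _text in elements)


def keep_component(elements, threshold=THRESHOLD):
    """Assumption: a component scoring below the threshold is filtered out."""
    return information_value(elements) >= threshold
```

Under these weights a fully normalized component such as `int = int` scores zero and is dropped, while a component that names a specific call is kept.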
14
Diff Filtering
Removes code unchanged between bug and fix hunks
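A minimal sketch of diff filtering, assuming each hunk has already been reduced to a list of component strings. The function name `diff_filter` is hypothetical.

```python
def diff_filter(bug_components, fix_components):
    """Drop components that appear in both the bug hunk and the fix hunk."""
    common = set(bug_components) & set(fix_components)
    bug_only = [c for c in bug_components if c not in common]
    fix_only = [c for c in fix_components if c not in common]
    return bug_only, fix_only
```

What remains on each side is the part of the code that the fix actually changed.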
15
Searching Memories
Bugs are found by searching for matching patterns in bug hunks
Suggestions are made by returning the code in the corresponding fix hunks
Several options for component searching
–Adjust the degree of matching and the omission of very common components
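The search step could look roughly like the sketch below. It assumes the memory is a list of (bug hunk components, fix hunk) entries and models the two options from the slide as a `min_overlap` ratio and a `stop_components` set. These names and the overlap measure are assumptions for illustration, not the tool's actual options.

```python
def search_memory(memory, code_components, min_overlap=1.0,
                  stop_components=frozenset()):
    """Return the fix hunks whose bug-hunk components match the given code.

    memory          -- list of (bug_components, fix_hunk) entries
    min_overlap     -- fraction of a bug hunk's components that must be present
    stop_components -- very common components ignored on both sides
    """
    code = set(code_components) - stop_components
    suggestions = []
    for bug_components, fix_hunk in memory:
        pattern = set(bug_components) - stop_components
        if not pattern:
            continue  # nothing distinctive left to match on
        overlap = len(pattern & code) / len(pattern)
        if overlap >= min_overlap:
            suggestions.append(fix_hunk)
    return suggestions
```

Lowering `min_overlap` trades precision for recall: more stored bug hunks match, so more candidate fixes are suggested.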
16
Options
17
Evaluation
18
Evaluation Continued
Determine half and full hit rates
–Half hit – bug previously seen
–Full hit – bug and fix previously seen
Five open source projects
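One way to compute the two rates is to replay the bug-fix history in order and check each hunk pair against what has been seen so far. The sketch below makes several assumptions: patterns are compared by exact equality, each pair counts at most once with a full hit taking precedence over a half hit, and the function name `hit_rates` is hypothetical.

```python
def hit_rates(history):
    """Return (half_hit_rate, full_hit_rate) for a chronological history.

    history is a list of (bug_pattern, fix_pattern) pairs, oldest first.
    """
    seen_bugs, seen_pairs = set(), set()
    half = full = 0
    for bug, fix in history:
        if (bug, fix) in seen_pairs:
            full += 1   # both the bug and its fix were seen before
        elif bug in seen_bugs:
            half += 1   # the bug was seen before, with a different fix
        seen_bugs.add(bug)
        seen_pairs.add((bug, fix))
    total = len(history)
    return (half / total, full / total) if total else (0.0, 0.0)
```

For a history of four pairs in which the second repeats an earlier bug with a new fix and the third repeats an earlier bug and fix exactly, both rates come out to one in four.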
19
Evaluation Continued
22
Comparison With PMD
23
Limitations
Missing memories
Not applicable in the initial stage of a project
Doesn’t catch cross-file relationships
24
Uses
Bug finding tool
–BugMem
–Can be integrated into Eclipse
Code example repository
–Useful to developers new to a project
25
The End!
26
Questions?