Speaker: Liu Shuchang Osaka University

1 Extraction of Evolution History from Software Source Code Using Linear Counting
Speaker: Liu Shuchang, Osaka University

2 Background
Daily software development: copy existing code
Product variant: copy, then edit
Software evolution

3 Evolution History Example
[Figure omitted; label: only source code]

4 Introduction
Evolution History Recovery: from product variants, using only source code
Evolution Tree: vertex = variant; edge = derived relation (most similar pair); key = product similarity
Previous Study: diff based (file-to-file similarity); time needed (worst case: 2 days)
Linear Counting Algorithm: estimating instead of calculating

5 Linear Counting Algorithm
Cardinality: 11
Zero bits: 2
Bitmap Size: 8
Estimate: -8 × ln(2/8) ≈ 11.09
[Figure omitted; caption: An example of the Linear Counting Algorithm]
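The estimate on this slide can be reproduced with a short Python sketch. It is an illustration only, not the speaker's implementation: the function names `estimate_from_zero_count` and `linear_count` are hypothetical, and SHA-1 from the standard library stands in for whatever hash function the study used.

```python
import hashlib
import math


def estimate_from_zero_count(zeros, bitmap_size):
    """Linear Counting estimate: -m * ln(z / m) for m bits, z of them still zero."""
    if zeros == 0:
        raise ValueError("bitmap is full; use a larger bitmap_size")
    return -bitmap_size * math.log(zeros / bitmap_size)


def linear_count(items, bitmap_size):
    """Estimate how many distinct items there are without storing them."""
    bitmap = [False] * bitmap_size
    for item in items:
        # Assumption: SHA-1 is only a stand-in; any roughly uniform hash would do.
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        bitmap[int.from_bytes(digest[:8], "big") % bitmap_size] = True
    return estimate_from_zero_count(bitmap.count(False), bitmap_size)


# The slide's example: 8-bit bitmap with 2 zero bits.
print(round(estimate_from_zero_count(2, 8), 2))
```

The estimate degrades as the bitmap fills up, which is presumably why the bitmap size is treated as a tuning factor later in the talk.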

6 Estimate Product Similarity
Initialization: Multiset A → hash function → Bit Map A; Multiset B → hash function → Bit Map B
Bitwise operator: Bit Map A, Bit Map B → Bit Map A∩B, Bit Map A∪B
Similarity: Jaccard Index = |A∩B| / |A∪B|, estimated as LC(A∩B) / LC(A∪B) (continued division)
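A minimal Python sketch of this estimate follows, under stated assumptions. The union bitmap is the bitwise OR of the two bitmaps, as on the slide. For the intersection, the sketch swaps in inclusion-exclusion (|A| + |B| - |A∪B|) rather than applying Linear Counting to a combined A∩B bitmap, so it is not necessarily the speaker's exact procedure. All names (`build_bitmap`, `lc_estimate`, `estimate_jaccard`) are hypothetical, SHA-1 stands in for the hash function, and repeated items collapse to one bit, so multisets are treated as sets here.

```python
import hashlib
import math


def build_bitmap(items, bitmap_size):
    """Hash every item into a bitmap stored as a single Python int."""
    bitmap = 0
    for item in items:
        # Assumption: SHA-1 is a stand-in for the hash function used in the study.
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        bitmap |= 1 << (int.from_bytes(digest[:8], "big") % bitmap_size)
    return bitmap


def lc_estimate(bitmap, bitmap_size):
    """Linear Counting estimate from the number of zero bits."""
    zeros = bitmap_size - bin(bitmap).count("1")
    if zeros == 0:
        raise ValueError("bitmap is full; use a larger bitmap_size")
    return -bitmap_size * math.log(zeros / bitmap_size)


def estimate_jaccard(items_a, items_b, bitmap_size=1 << 16):
    """Estimate |A∩B| / |A∪B| from two bitmaps."""
    bm_a = build_bitmap(items_a, bitmap_size)
    bm_b = build_bitmap(items_b, bitmap_size)
    size_a = lc_estimate(bm_a, bitmap_size)
    size_b = lc_estimate(bm_b, bitmap_size)
    size_union = lc_estimate(bm_a | bm_b, bitmap_size)  # bitwise OR = union bitmap
    if size_union == 0:
        return 0.0  # both inputs empty: report no shared content
    # Inclusion-exclusion, clamped because the estimates carry some noise.
    size_inter = max(0.0, size_a + size_b - size_union)
    return size_inter / size_union


variant_a = [f"line {i}" for i in range(1000)]
variant_b = [f"line {i}" for i in range(500, 1500)]
print(estimate_jaccard(variant_a, variant_b))  # true Jaccard index is 1/3
```

Only the bitmaps need to be kept per variant, so each pairwise comparison costs a few bitwise operations and bit counts instead of a file-to-file diff.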

7 Process Flow
Variant A (Source Code) → Initialization → Initial Multiset A
Variant B (Source Code) → Initialization → Initial Multiset B
Initialization options: 1. n-gram modeling; 2. each line of the code
Linear Counting Algorithm → Jaccard Index |A∩B| / |A∪B| for each pair (A, B), (A, C), (A, D), …
Prim's Algorithm (the most similar pair) → Evolution Tree
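The last step of this flow, turning pairwise similarities into a tree with Prim's algorithm, might look like the Python sketch below. It assumes the similarities are already available as a symmetric matrix and builds a maximum spanning tree, so every variant is attached through its most similar already-connected variant. The function name `build_evolution_tree`, the choice of vertex 0 as the starting point, and the sample matrix are illustrative assumptions, not details from the talk.

```python
def build_evolution_tree(similarity):
    """Prim's algorithm for a maximum spanning tree.

    `similarity` is a symmetric n x n matrix of pairwise similarities.
    Returns a list of (from_vertex, to_vertex, similarity) edges.
    """
    n = len(similarity)
    if n == 0:
        return []
    in_tree = [False] * n
    in_tree[0] = True                  # assumption: start from variant 0
    best_sim = list(similarity[0])     # best known link into the tree per vertex
    best_from = [0] * n                # tree vertex providing that link
    edges = []
    for _ in range(n - 1):
        # Pick the outside vertex most similar to something already in the tree.
        v = max((i for i in range(n) if not in_tree[i]), key=lambda i: best_sim[i])
        edges.append((best_from[v], v, best_sim[v]))
        in_tree[v] = True
        # The new vertex may offer better links to the remaining vertices.
        for w in range(n):
            if not in_tree[w] and similarity[v][w] > best_sim[w]:
                best_sim[w] = similarity[v][w]
                best_from[w] = v
    return edges


# Made-up similarities for four variants, for illustration only.
sample = [
    [1.0, 0.9, 0.5, 0.2],
    [0.9, 1.0, 0.8, 0.3],
    [0.5, 0.8, 1.0, 0.7],
    [0.2, 0.3, 0.7, 1.0],
]
print(build_evolution_tree(sample))
# [(0, 1, 0.9), (1, 2, 0.8), (2, 3, 0.7)]
```

The edges are undirected here; deciding which variant in a pair is the ancestor would need information beyond the similarity values.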

8 Research Data
[Slide content omitted; caption: A description of datasets we dealt with]

9 Final Result of dataset5
[Figures omitted; captions: The Evolution Tree we extracted (the Best Configuration); Existing actual evolution history]

10 Analysis on Bitmap Size
[Slide content omitted; caption: Part of the experiment results of dataset5]

11 Best Configuration
Main Factors
N-gram Modeling: no (each line of code)
Bitmap Size: 128,000,000 bits
Hashing Function: MurmurHash3
Results
Proper Edges: 86.5% (on average)
Time: 10 s to 5 min

12 Contributions and Future Work
Contributions: extract an ideal Evolution Tree efficiently; influence of various factors; best configuration; faster and showed better accuracy
Future Work: larger datasets; other programming languages; solve the remaining problems
