Program Comprehension through Dynamic Analysis
Visualization, evaluation, and a survey
Bas Cornelissen (et al.), Delft University of Technology
IPA Herfstdagen, Nunspeet, The Netherlands, November 26, 2008
Context

Software maintenance
– e.g., feature requests, debugging
– requires understanding of the program at hand
– up to 70% of maintenance effort is spent on the comprehension process

Goal: support the program comprehension process
Definitions

Program comprehension
"A person understands a program when he or she is able to
– explain the program, its structure, its behavior, its effects on its operational context, and its relationships to its application domain
– in terms that are qualitatively different from the tokens used to construct the source code of the program."
Definitions (cont’d)

Dynamic analysis: the analysis of the properties of a running software system
– pipeline: unknown system (e.g., open source) → instrumentation (e.g., using AspectJ) → execution scenario → (too) much data

Advantages
– preciseness
– goal-orientation

Limitations
– incompleteness
– scenario-dependence
– scalability issues
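The instrumentation step can be as lightweight as a tracing aspect woven into the subject system. Below is a minimal sketch in AspectJ's annotation style; the aspect name, the pointcut, and the textual output format are assumptions made for illustration, not the tooling behind this work.

```java
import org.aspectj.lang.JoinPoint;
import org.aspectj.lang.annotation.After;
import org.aspectj.lang.annotation.Aspect;
import org.aspectj.lang.annotation.Before;

@Aspect
public class TracingAspect {

    // Current call depth; a single-threaded scenario is assumed for this sketch.
    private int depth = 0;

    // Record every method execution in the instrumented system,
    // excluding the aspect itself to avoid recursion.
    @Before("execution(* *(..)) && !within(TracingAspect)")
    public void enter(JoinPoint jp) {
        System.out.printf("%d ENTER %s%n", depth++, jp.getSignature().toLongString());
    }

    @After("execution(* *(..)) && !within(TracingAspect)")
    public void exit(JoinPoint jp) {
        System.out.printf("%d EXIT  %s%n", --depth, jp.getSignature().toLongString());
    }
}
```

Weaving such an aspect into the system and running an execution scenario yields the raw event trace that the remaining steps operate on; this is also where the "(too) much data" problem originates.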
Outline
1. Literature survey
2. Visualization I: UML sequence diagrams
3. Comparing reduction techniques
4. Visualization II: Extravis
5. Current work: Human factor
6. Concluding remarks
Literature survey
Why a literature survey?

Numerous papers and subfields
– last decade: many papers annually

Need for a broad overview
– keep track of current and past developments
– identify future directions

Existing surveys (4) do not suffice
– scopes restricted
– approaches not systematic
– collective outcomes difficult to structure
Characterizing the literature

Four facets
– Activity: what is being performed/contributed? (e.g., architecture reconstruction)
– Target: to which languages/platforms is the approach applicable? (e.g., web applications)
– Method: which methods are used in conducting the activity? (e.g., formal concept analysis)
– Evaluation: how is the approach validated? (e.g., industrial study)
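Purely as an illustration of how such a characterization can be encoded, the sketch below models the four facets as a small Java data structure; every type name and the listed attribute values are assumptions based on the examples above, not the complete attribute framework.

```java
import java.util.Set;

// Hypothetical encoding of the four-facet framework; the enum values shown are
// only the example attributes mentioned on the slide, not the full framework.
enum Activity { SURVEY, ARCHITECTURE_RECONSTRUCTION, FEATURE_LOCATION, TRACE_ANALYSIS }
enum Target { PLAIN_JAVA, WEB_APPLICATION, DISTRIBUTED, LEGACY }
enum Method { FORMAL_CONCEPT_ANALYSIS, VISUALIZATION, SLICING, PATTERN_DETECTION }
enum Evaluation { INDUSTRIAL_STUDY, CONTROLLED_EXPERIMENT, OPEN_SOURCE_CASE_STUDY, COMPARISON }

// Each surveyed article is characterized by the attributes it exhibits in each facet.
record ArticleCharacterization(String title,
                               Set<Activity> activities,
                               Set<Target> targets,
                               Set<Method> methods,
                               Set<Evaluation> evaluations) { }
```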
Attribute framework

Characterization

Attribute frequencies
Survey results

Least common activities
– surveys, architecture reconstruction

Least common target systems
– multithreaded, distributed, legacy, web

Least common evaluations
– industrial studies, controlled experiments, comparisons
Visualization I: Sequence Diagrams
UML sequence diagrams

Goal
– visualize test case executions as sequence diagrams
– provides insight into functionalities
– accurate, up-to-date documentation

Method
1. instrument the system and its test suite
2. execute the test suite
3. abstract from "irrelevant" details
4. visualize as sequence diagrams
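To give an impression of the visualization step (step 4), the sketch below renders an already-abstracted list of call events as PlantUML sequence-diagram text. The Call record, the class names in the example trace, and the choice of PlantUML output are illustrative assumptions, not the tool chain behind this work.

```java
import java.util.List;

// Hypothetical call event: caller class, callee class, and invoked method.
record Call(String caller, String callee, String method) { }

public class SequenceDiagramWriter {

    // Emits PlantUML text: one arrow per (possibly abstracted) call event.
    static String toPlantUml(List<Call> calls) {
        StringBuilder sb = new StringBuilder("@startuml\n");
        for (Call c : calls) {
            sb.append(c.caller()).append(" -> ").append(c.callee())
              .append(" : ").append(c.method()).append("\n");
        }
        return sb.append("@enduml\n").toString();
    }

    public static void main(String[] args) {
        // Tiny illustrative trace; class names are made up, not from JPacman or Checkstyle.
        List<Call> trace = List.of(
                new Call("Engine", "Board", "getSprite"),
                new Call("Board", "Cell", "occupant"));
        System.out.println(toPlantUml(trace));
    }
}
```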
Evaluation

JPacman
– small program for educational purposes
– 3 KLOC
– 25 classes

Task: change requests
– addition of "undo" functionality
– addition of "multi-level" functionality
Evaluation (cont’d)

Checkstyle
– code validation tool
– 57 KLOC
– 275 classes

Task: addition of a new check
– which types of checks exist?
– what is the difference in terms of implementation?
Results

Sequence diagrams are easily readable
– intuitive due to chronological ordering

Sequence diagrams aid in program comprehension
– they support maintenance tasks

Proper reductions/abstractions are difficult
– reduce 10,000 events to 100 events, but at what cost?
Results (cont’d)

Reduction techniques: issues
– which one is "best"?
  – which are most likely to lead to significant reductions?
  – which are the fastest?
  – which actually abstract from irrelevant details?
Comparing reduction techniques
Trace reduction techniques

Input 1: a large execution trace
– up to millions of events

Input 2: a maximum output size
– e.g., 100 events for visualization through UML sequence diagrams

Output: a reduced trace
– was the reduction successful?
– how fast was the reduction performed?
– has relevant data been preserved?
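All techniques compared in the remainder of this section share the contract sketched below; the TraceReduction and Event names are illustrative and not taken from the talk.

```java
import java.util.List;

// Minimal event representation: the invoked method and its stack depth.
record Event(String method, int depth) { }

// Common contract: given a (potentially huge) event trace and a target size,
// return a reduced trace of at most that size.
interface TraceReduction {
    List<Event> reduce(List<Event> trace, int maxOutputSize);
}
```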
Example technique

Stack depth limitation [metrics-based filtering]
– requires two passes over the trace:
  1. determine the depth frequencies and, given the maximum output size (threshold), the maximum depth to retain
  2. discard all events above that maximum depth

Example: a trace of 200,000 events, threshold 50,000 events
– depth frequencies: depth 0: 28,450; depth 1: 13,902; depth 2: 58,444; depth 3: 29,933; depth 4: 10,004; ...
– depths 0-1 together hold 42,352 events, which fits the threshold, so the reduced trace keeps the 42,352 events at depth 1 or less
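A minimal sketch of this two-pass scheme, assuming each event carries its stack depth (the Event record and all names are hypothetical):

```java
import java.util.List;
import java.util.stream.Collectors;

public class StackDepthLimitation {

    record Event(String method, int depth) { }

    static List<Event> reduce(List<Event> trace, int maxOutputSize) {
        // Pass 1: count the events at each stack depth.
        int maxDepth = trace.stream().mapToInt(Event::depth).max().orElse(0);
        long[] perDepth = new long[maxDepth + 1];
        for (Event e : trace) {
            perDepth[e.depth()]++;
        }
        // Pick the deepest cut-off whose cumulative event count still fits the threshold.
        long cumulative = 0;
        int cutOff = -1;  // -1 means even depth 0 exceeds the threshold
        for (int d = 0; d <= maxDepth; d++) {
            if (cumulative + perDepth[d] > maxOutputSize) {
                break;
            }
            cumulative += perDepth[d];
            cutOff = d;
        }
        // Pass 2: discard all events above (deeper than) the chosen maximum depth.
        final int limit = cutOff;
        return trace.stream()
                    .filter(e -> e.depth() <= limit)
                    .collect(Collectors.toList());
    }
}
```

With the depth frequencies from the slide (28,450 events at depth 0 and 13,902 at depth 1) and a threshold of 50,000, the cut-off becomes depth 1 and 42,352 events remain.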
How can we compare the techniques?

Use
– a common context
– common evaluation criteria
– a common test set

This ensures a fair comparison.
Approach

Assessment methodology
1. Context: need for high-level knowledge
2. Criteria: reduction success rate; performance; information preservation
3. Metrics: output size; time spent; preservation percentage per type
4. Test set: five open source systems, one industrial system
5. Application: apply the reductions using thresholds from 1,000 through 1,000,000
6. Interpretation: compare the results side by side
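One way to picture how such measurements could be gathered for a single technique and threshold is sketched below; the Event type is again hypothetical, and the preservation metric shown (the fraction of distinct methods that survive) is only a crude stand-in for the per-type preservation percentage used in the methodology.

```java
import java.util.List;
import java.util.function.BiFunction;

public class ReductionAssessment {

    record Event(String method, int depth) { }
    record Measurement(int outputSize, long millis, double preservedFraction) { }

    // Applies one reduction technique at one threshold and gathers the three metrics.
    static Measurement assess(BiFunction<List<Event>, Integer, List<Event>> technique,
                              List<Event> trace, int threshold) {
        long start = System.currentTimeMillis();
        List<Event> reduced = technique.apply(trace, threshold);
        long elapsed = System.currentTimeMillis() - start;

        // Information preservation, crudely approximated as the fraction of
        // distinct methods from the original trace that survive the reduction.
        long before = trace.stream().map(Event::method).distinct().count();
        long after = reduced.stream().map(Event::method).distinct().count();
        double preserved = before == 0 ? 1.0 : (double) after / before;

        return new Measurement(reduced.size(), elapsed, preserved);
    }
}
```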
Techniques under assessment
– Subsequence summarization [summarization]
– Stack depth limitation [metrics-based]
– Language-based filtering [filtering]
– Sampling [ad hoc]
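Sampling, the ad hoc entry in this list, shows how simple a reduction can be: keep every k-th event, with k chosen so that the output fits the threshold. A sketch, again with an illustrative Event type:

```java
import java.util.ArrayList;
import java.util.List;

public class Sampling {

    record Event(String method, int depth) { }

    static List<Event> reduce(List<Event> trace, int maxOutputSize) {
        if (trace.size() <= maxOutputSize) {
            return trace;
        }
        // Smallest sampling step that brings the trace under the threshold (ceiling division).
        int step = (trace.size() + maxOutputSize - 1) / maxOutputSize;
        List<Event> reduced = new ArrayList<>();
        for (int i = 0; i < trace.size(); i += step) {
            reduced.add(trace.get(i));
        }
        return reduced;
    }
}
```

For the earlier example of 200,000 events and a threshold of 50,000, the step becomes 4 and exactly 50,000 events are kept, regardless of their relevance.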
Assessment summary

Reduction success rate
– subsequence summarization: o; stack depth limitation: o; language-based filtering: --; sampling: +

Performance
– subsequence summarization: --; stack depth limitation: o; language-based filtering: o

Information preservation
– subsequence summarization: +; stack depth limitation: o; language-based filtering: o; sampling: --
Visualization II: Extravis
Extravis

Execution Trace Visualizer
– a collaboration with TU/e

Goal
– program comprehension through trace visualization
  – trace exploration, feature location, ...
– address scalability issues
  – millions of events; sequence diagrams are not adequate at this scale
Evaluation: Cromod

Industrial system
– regulates greenhouse conditions
– 51 KLOC
– 145 classes

Trace
– 270,000 events

Task
– analysis of fan-in/fan-out characteristics
Evaluation: Cromod (cont’d)
Evaluation: JHotDraw

Medium-size open source application
– Java framework for graphics editing
– 73 KLOC
– 344 classes

Trace
– 180,000 events

Task
– feature location, i.e., relate functionality to source code or a trace fragment
Evaluation: JHotDraw (cont’d)
Evaluation: Checkstyle

Medium-size open source system
– code validation tool
– 73 KLOC
– 344 classes
– trace: 200,000 events

Task
– formulate a hypothesis: "a typical scenario comprises four main phases"
  – initialization; AST construction; AST traversal; termination
– validate the hypothesis through trace analysis
Evaluation: Checkstyle (cont’d)
Current work: Human factor
Motivation

Need for controlled experiments in general
– measure the impact of (novel) visualizations

Need for empirical validation of Extravis in particular
– only anecdotal evidence thus far

Goal: measure the usefulness of Extravis in software maintenance
– does the runtime information from Extravis help?
Experimental design

Series of maintenance tasks
– from high level to low level
– e.g., overview, refactoring, detailed understanding

Experimental group
– ±10 subjects
– Eclipse IDE + Extravis

Control group
– ±10 subjects
– Eclipse IDE
Concluding remarks
Concluding remarks

Program comprehension: an important subject
– makes software maintenance more efficient

Difficult to evaluate and compare
– due to the human factor

Many future directions
– several of which have been addressed by this research
Want to participate in the controlled experiment?

Prerequisites
– at least two persons
– knowledge of Java
– (some) experience with Eclipse
– no implementation knowledge of Checkstyle
– two hours to spare between December 1 and 19

Contact me
– during lunch, or
– through email: s.g.m.cornelissen@tudelft.nl