Dror Feitelson The Hebrew University of Jerusalem


Eye Tracking and Program Comprehension

“The ratio of time spent reading vs. writing is well over 10:1. We are constantly reading old code as part of the effort to write new code.”
Robert C. Martin (Uncle Bob), Clean Code
Photo: Tim-bezhashvyly

Consequences:
Program comprehension is important
Program comprehension is a major factor in software development costs
But…
Program comprehension is hard

Why is software hard to understand?

On the Menu: Experiments Definitions Results Methodology Ideas Code comprehension Eye tracking

Code Complexity
The property of code that makes it hard to comprehend
  Using goto, pointers, recursion
  Code smells
  Just having a lot of it
Identified and measured by complexity metrics
  Static analysis
Overly complex code = warning that code is “bad” = danger of bugs

McCabe Cyclomatic Complexity (MCC)
Given a strongly connected directed graph G with n nodes, e edges, and p connected components, the cyclomatic number of the graph is
  v(G) = e – n + p
Calculate this on a module’s control flow graph
Shortcut: in a planar (structured) graph, count branch points + 1
Suggestion: keep MCC ≤ 10
Strongly connected: can get from any node to any other node; in code, need to connect the exit node back to the entry node
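The formula and the shortcut can be checked against each other on a small example. The following is a minimal sketch, assuming the control flow graph is given as an edge list; the function name and the example graph are hypothetical and not taken from the talk.

```python
def cyclomatic_number(num_nodes, edges, num_components=1):
    """Return v(G) = e - n + p for a strongly connected directed graph."""
    return len(edges) - num_nodes + num_components


# Hypothetical control flow graph of a function with one `if` and one `while`:
# 0 entry -> 1 if-test; 1 -> 2 (then) or 3 (else); both -> 4 while-test;
# 4 -> 5 loop body -> 4; 4 -> 6 exit.
cfg_edges = [(0, 1), (1, 2), (1, 3), (2, 4), (3, 4), (4, 5), (5, 4), (4, 6)]

# Connect the exit node back to the entry node so that the graph is
# strongly connected, as the definition requires.
strongly_connected = cfg_edges + [(6, 0)]
mcc = cyclomatic_number(num_nodes=7, edges=strongly_connected)

# Shortcut: count branch points (nodes with more than one successor) + 1.
successors = {}
for src, dst in cfg_edges:
    successors.setdefault(src, set()).add(dst)
branch_points = sum(1 for targets in successors.values() if len(targets) > 1)
shortcut = branch_points + 1

print(mcc, shortcut)
```

Both computations agree on this graph because it has two branch points (the `if` test and the `while` test).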

BUT…
Is it reasonable to give the same weight to all branching constructs?
  while loop
  for loop
  if-then-else
  case in switch
Is control flow the only thing that matters?
It’s so 1970s…

We Can Do Better
Measure the effect of different code elements
  Express the same functionality in different ways
  Conduct controlled experiments
  Compare the results
    Time to comprehend
    Probability of making an error
S. Ajami, Y. Woodbridge, and D. G. Feitelson, “Syntax, predicates, idioms --- what really affects code complexity?”. Empirical Softw. Eng. 24(1), pp. 287-328, Feb 2019
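The two outcome measures named on this slide can be computed per code variant from trial records. This is a minimal sketch, assuming each trial is recorded as a (variant, seconds, answered correctly) tuple; the record layout, the variant names, and the numbers are hypothetical illustration data, not results from the cited paper.

```python
from statistics import mean


def summarize(trials):
    """Group trials by code variant and report mean time and error rate."""
    by_variant = {}
    for variant, seconds, correct in trials:
        by_variant.setdefault(variant, []).append((seconds, correct))
    return {
        variant: {
            # Time to comprehend: mean over all trials of this variant.
            "mean_time": mean(seconds for seconds, _ in results),
            # Probability of making an error: fraction of wrong answers.
            "error_rate": sum(1 for _, ok in results if not ok) / len(results),
        }
        for variant, results in by_variant.items()
    }


# Hypothetical data: two variants expressing the same functionality.
trials = [
    ("count_up", 10.0, True),
    ("count_up", 14.0, True),
    ("count_down", 15.0, True),
    ("count_down", 21.0, False),
]
summary = summarize(trials)
print(summary)
```

A real analysis would also need a statistical test and many more subjects; this sketch only shows how the two metrics are derived from raw trials.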

Example 1: Loop Idioms
Variations on for (i = 0; i < n; i++)
Loops counting down take more time
Abnormal versions cause more errors
Code complexity depends on expectations!

Complexity and Expectations
“Code complexity depends on expectations”
Is this really a form of “complexity”?
Actually, misleading code causes more errors
Comprehension time as metric for complexity
Error rate as metric for complexity

Example 2: Code Regularity
Regularity = the same structures repeated time after time
Hypothesis: figuring out the basic repeated element makes it easier to understand additional instances
Code complexity depends on context!

Example 2: Code Regularity
Same style of controlled experiment
Additional metric: gaze pattern
  Distribution of fixations
  Total time of fixations
Measure time devoted to repeated elements
A. Jbara and D. G. Feitelson, “How programmers read regular code: A controlled experiment using eye tracking”. Empirical Softw. Eng. 22(3), pp. 1440-1477, Jun 2017
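Measuring the time devoted to each repeated element amounts to summing fixation durations per region of the code. The following is a minimal sketch, assuming fixations have already been mapped to source line numbers; the data layout, region names, and values are hypothetical, not the format used in the cited study.

```python
from collections import defaultdict


def fixation_time_per_region(fixations, regions):
    """Sum fixation counts and durations for each named region.

    fixations: iterable of (line_number, duration_ms)
    regions: dict mapping name -> (first_line, last_line), inclusive
    """
    totals = defaultdict(lambda: {"count": 0, "total_ms": 0})
    for line, duration_ms in fixations:
        for name, (first, last) in regions.items():
            if first <= line <= last:
                totals[name]["count"] += 1
                totals[name]["total_ms"] += duration_ms
                break  # regions are assumed not to overlap
    return dict(totals)


# Hypothetical layout: three repeated instances of the same structure.
regions = {"instance_1": (1, 5), "instance_2": (6, 10), "instance_3": (11, 15)}
# Hypothetical fixations; the one on line 20 falls outside every region.
fixations = [(2, 300), (4, 250), (7, 200), (8, 120), (12, 150), (20, 90)]

summary = fixation_time_per_region(fixations, regions)
print(summary)
```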

[Figure: regular version vs. non-regular version. Image processing code: replace each pixel with median of 3x3 neighbors; avg of 4-5 subjects]

Exponential model: time reduced by 40% in each additional instance
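One way to read this model is that each repeated instance takes 60% of the time spent on the previous one. The sketch below works under that assumption; the function name and the 10-second starting value are hypothetical, and only the 40% reduction comes from the slide.

```python
def predicted_time(first_instance_time, instance_number, reduction=0.40):
    """Time on the k-th instance: t_k = t_1 * (1 - reduction) ** (k - 1)."""
    if instance_number < 1:
        raise ValueError("instance_number starts at 1")
    return first_instance_time * (1.0 - reduction) ** (instance_number - 1)


# Hypothetical: 10 seconds spent on the first instance of the repeated element.
times = [predicted_time(10.0, k) for k in range(1, 5)]
print([round(t, 2) for t in times])
```

Under this reading, the time on the fourth instance is already about a fifth of the time on the first.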

Results So Far
We can perform detailed measurements
  Empirically quantify code complexity
  Code complexity ≠ code “misleadingness”
  Prefer time over errors as metric
We can use eye tracking data to parameterize models
  e.g. “effective complexity” based on instance number

However…
The model that comprehension depends on code complexity is too simplistic
Comprehension more realistically depends on the combination of 3 factors:
  The code’s complexity
  The developer’s skill and domain knowledge
  The circumstances (fatigue, priming)

[Diagram: three factors, each on a scale from easy to hard]
Code complexity: technical property of code
Developer skill and transient conditions: properties of HUMANS
Eye tracking can help
But need careful consideration of code and task

code

Single Function Code
Used in most code comprehension studies
Used in most eye tracking studies
Fits on one page / screen
Somewhat artificial situation
When do you consider a function in isolation?

Eye Tracking Results So Far
Non-linear order of reading
  Unlike mostly linear order for normal text
Use of recurring patterns
  Linear scan of body of code (general overview)
  Jump back (to declarations?)
  Jump forward (to return value?)
  Hover (focus on a certain statement)
Huge differences between subjects
Single function – replicated results
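The recurring patterns listed above can be illustrated by labeling the transitions between consecutively fixated lines. This is a minimal sketch under simplifying assumptions: the input is a sequence of fixated line numbers, a move to the next line counts as linear, and any other move counts as a jump. The labels and thresholds are hypothetical and are not the coding scheme used in the studies.

```python
def classify_transitions(line_sequence):
    """Label each move between consecutive fixated lines."""
    labels = []
    for prev, cur in zip(line_sequence, line_sequence[1:]):
        step = cur - prev
        if step == 0:
            labels.append("hover")         # staying on the same line
        elif step == 1:
            labels.append("linear")        # reading the next line in order
        elif step > 1:
            labels.append("jump_forward")  # e.g. towards a return value
        else:
            labels.append("jump_back")     # e.g. back to a declaration
    return labels


# Hypothetical gaze sequence over a short function.
sequence = [1, 2, 3, 3, 1, 4, 9]
labels = classify_transitions(sequence)
print(labels)
```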

EMIP 2013 Data Two subjects given different tasks

EMIP 2013 Data 1st subject

Different people reading the same code

Full System
What does it mean to comprehend a system?
More about structure than statements
  The components comprising the system
  Their interactions
  Data flow
Largely detached from the code itself
Related to the system’s architecture
Eye tracking not very relevant
O. Levy and D. G. Feitelson, “Understanding large-scale software -- A hierarchical view”. In 27th Intl. Conf. Program Comprehension, May 2019

In Between
More realistic developer workflow
  Functions in the context of classes and packages
  Follow control flow from one function to another
Using an IDE
  Sequence of jumps between modules
  What is read in each one
  Also other tools (issue tracking, Google)
Eye tracking tools just beginning to emerge
  iTrace by Maletic and Sharif
  Enabler of new data collection

task

What Does “Reading” Mean?
How do you read
  A WhatsApp conversation
  A novel
  A newspaper story
  A poem
  A scientific paper
  Code
Do you read every word, or just skim most of it?
Do you fully understand it?

Levels of Understanding
How do I use this function?
  Black box / information hiding
  Very common (libraries etc.)
How does this function work?
  White box
  Superficial (interpretation): given arguments, what does it print?
  Deep (comprehension): how does it compute its output?

Interpretation vs. Comprehension
Interpretation: follow control flow and interpret commands in sequence
  Technical process
Comprehension: articulate the essence of what the code does
  Deep understanding

Comprehension can be easier than interpretation!
    int sum = 0;
    for (int i = 1; i <= 100; i++) {
        sum += i;
    }
    print(sum);
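The contrast can be made concrete by computing the same result in both ways. This is a minimal sketch in Python rather than the slide's C-like notation; the function names are hypothetical. Interpreting the loop means tracing 100 iterations, whereas comprehending it means recognizing that it sums the integers from 1 to 100, which has the closed form n(n+1)/2.

```python
def sum_by_interpretation(n):
    """Trace the loop step by step, as a reader interpreting the code would."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def sum_by_comprehension(n):
    """Use the insight that the loop sums 1..n, which equals n * (n + 1) / 2."""
    return n * (n + 1) // 2


print(sum_by_interpretation(100), sum_by_comprehension(100))
```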

Performing a task is proof of a certain level of comprehension
You need different levels of comprehension for different tasks

Answer a trivia question
  E.g. does the code include a variable “fooBar”?
  Need to scan the code
  Eye tracking will probably reflect scanning
  Does not reflect understanding

What does this code print?
  Need to trace and interpret the code
  Eye tracking will reflect tracing of control flow
  Perhaps with shortcuts (e.g. don’t repeat loops)

How do I use this code?
  Treat code as black box
  Depends on API and usage examples
  Eye tracking will probably show focus on signature

I need to fix a bug in the code
  Simple case: just trace the code, e.g. to find a null pointer reference
  Hard case: need to really understand it
  Same as other hard scenarios

Assess quality of the code
  As in the context of a code review
  Do you need to really understand it?
  Can you just scan to search for problem indicators?
  Eye tracking may reflect linear scan

Modify, fix, or document
  Need to really understand the code (white box)
  Eye tracking will reflect the effort
  Scan the code, jump forward, jump back
  Different patterns for different people

people

In addition to the code and the task, how code is read may depend on WHO is reading it

Gender
Do women read code differently from men?
Few studies
Some differences
  Women perhaps more methodical, e.g. start with problem statement
  Men tend to jump ahead

Novice vs. Expert
Do expert developers read code differently from novices?
Some studies
Some differences
  Experts identify and focus on the important parts
  Experts read code less linearly

The BIG Question for Eye Tracking: What are they actually doing?

Humans
Developers’ goal is to “get the job done”
They want to do this efficiently
So they may actually try to avoid comprehension, because comprehension is hard
Eye tracking results may reflect activity to circumvent comprehension rather than activity to achieve comprehension

EMIP Coding Scheme for Strategies
  AttentionToDetail
  DataFlow
  Debugging
  LinearHorizontal and LinearVertical
  Deductive
  DesignAtOnce
  FlowCycle
  Inductive
  InterproceduralControlFlow
  IntraproceduralControlFlow
  StrayGlance
  TestHypothesis
  Touchstone
  Trial&Error
  Wandering
“It is a capital mistake to theorize before one has data.” (Sherlock Holmes)

Collecting Lots of Data
Needed for small effects and many confounding factors
  Identify common vs. uncommon cases
  Analyze effects and interactions
  Increase confidence via reproducibility
EMIP dataset example
Is this “Big Data”?

Wrapping Up
Goal: scientific basis for software engineering
Subgoal: understand program comprehension
Means: controlled experiments
Prerequisites: define tasks, metrics, variables
Tools: eye tracking
Observation: we know so little
Consequence: there’s so much to do!