An experimental evaluation of continuous testing during development
David Saff
PAG group meeting, October 1, 2003



Overview Continuous testing automatically runs tests in the background to provide feedback as developers code. Previous work suggested that continuous testing would speed development. We performed a user study to evaluate this hypothesis. Result: developers were helped, and not distracted.

Outline Introduction Tools Built Experimental Design Quantitative Results Qualitative Results Conclusion

Continuous Testing Continuous testing uses excess cycles on a developer's workstation to continuously run regression tests in the background as the developer edits code. The feedback loop: the developer changes code, the system is notified about the changes, the system runs the tests, and the system notifies the developer about errors.
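The feedback loop on this slide can be sketched as a small polling loop. This is an illustrative Python sketch with hypothetical names (the actual tool was an Emacs/Java implementation, not this code); the collaborators are injected as callables so the loop can be exercised without a real editor:

```python
import time

def continuous_test_loop(get_mtime, run_suite, notify, poll_interval=0.0, max_iterations=3):
    """Poll for source changes; on each change, rerun the suite in the
    background and report failures. Hypothetical sketch, not the real tool."""
    last_seen = None
    for _ in range(max_iterations):  # a real tool would loop until the editor exits
        mtime = get_mtime()          # "system notified about changes"
        if mtime != last_seen:       # developer changed code
            last_seen = mtime
            failures = run_suite()   # "system runs tests"
            notify(failures)         # "system notifies about errors"
        if poll_interval:
            time.sleep(poll_interval)
```

Injecting `get_mtime`, `run_suite`, and `notify` keeps the loop itself free of editor- and test-framework-specific details.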

Previous Work Monitored two single-developer software projects. A model of developer behavior interpreted the results and predicted the effect of changes on wasted time: –Time waiting for tests to complete –Extra time tracking down and fixing regression errors

Previous Work: Findings Delays in notification about regression errors correlate with delays in fixing these errors. Therefore, quicker notification should lead to quicker fixes. Predicted improvement: 10-15%.

Introduction: Basic Experiment Controlled human experiment with 22 subjects. Each subject performed two unrelated development tasks. Continuous testing has no significant effect on time worked. Continuous testing has a significant effect on success in completing the task. Developers enjoy using continuous testing, and find it helpful, not distracting.

SECTION: Tools Built

JUnit wrapper A wrapper around JUnit takes the test suite as input and produces results. It reorders tests, times individual tests, remembers results, outputs failures immediately, distinguishes regressions from unimplemented tests, and reorders and filters the result text.
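The wrapper's key piece of bookkeeping can be illustrated with a short sketch (hypothetical names, not the actual wrapper code): a failing test that passed on an earlier run is a regression; a failing test that has never passed is counted as unimplemented, and regressions are reported first because they are the most urgent:

```python
def classify_results(results, previously_passing):
    """Split test outcomes the way the wrapper's report does: a failure of a
    test that passed earlier is a regression; a failure of a test that has
    never passed is an unimplemented test. Illustrative sketch only."""
    regressions, unimplemented, passing = [], [], []
    for name, passed in sorted(results.items()):
        if passed:
            passing.append(name)
        elif name in previously_passing:
            regressions.append(name)
        else:
            unimplemented.append(name)
    # Reorder: regressions first, then unimplemented tests, then passes.
    return regressions, unimplemented, passing
```

Remembering the set of previously passing tests between runs is what lets the tool distinguish "you broke something" from "you haven't written this yet".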

Emacs plug-in On file save, or after a 15-second pause with unsaved changes, run a “shadow” compile and test. Display results in the modeline: –“Compilation Errors” –“Unimpl Tests: 45” –“Regression Errors: 2” Clicking on the modeline brings up a description of the indicated errors.
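The trigger policy (run on save, or after a 15-second pause with untested changes) amounts to a small predicate. A sketch under assumed names, not the Emacs Lisp implementation:

```python
def should_run_tests(now, last_edit, last_run, just_saved, pause=15.0):
    """Decide whether to start a shadow compile-and-test: immediately on
    save, or once the developer has been idle for `pause` seconds with
    changes not yet tested. All times are in seconds."""
    if just_saved:
        return True
    idle_long_enough = (now - last_edit) >= pause
    changes_untested = last_run < last_edit
    return idle_long_enough and changes_untested
```

The `changes_untested` check prevents rerunning the suite repeatedly while the developer simply sits idle.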

Modeline screenshots

Error buffer screenshot

Shadow directory The developer’s code directory is “shadowed” in a hidden directory. The shadow directory has the state the code would have if the developer saved and compiled right now. Compilation and test results are filtered to appear as if they occurred in the developer’s code directory.
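The two halves of this scheme, mirroring the tree (with unsaved buffers overlaid) and rewriting paths in tool output, can be sketched as follows. This is an illustrative sketch with hypothetical names, not the tool's actual implementation:

```python
import os
import shutil

def sync_shadow(source_root, shadow_root, unsaved_buffers):
    """Mirror the code directory into a hidden shadow directory, then
    overlay unsaved editor buffers (rel_path -> text), so the shadow holds
    the state the code would have if the developer saved right now."""
    if os.path.exists(shadow_root):
        shutil.rmtree(shadow_root)
    shutil.copytree(source_root, shadow_root)
    for rel_path, text in unsaved_buffers.items():
        with open(os.path.join(shadow_root, rel_path), "w") as f:
            f.write(text)

def unshadow_output(tool_output, shadow_root, source_root):
    """Filter compiler/test output so messages appear to refer to the
    developer's own directory rather than the hidden shadow."""
    return tool_output.replace(shadow_root, source_root)
```

A real tool would copy incrementally rather than rebuilding the shadow from scratch on every change; the full copy here just keeps the sketch short.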

Monitoring Developers who agree to the study have a monitoring plug-in installed at the same time as the continuous testing plug-in. Sent to a central server: –Changes to the source in Emacs (saved or unsaved) –Changes to the source on the file system –Manual test runs –Emacs session stops/starts

SECTION: Experimental Design

Participants Students in MIT’s Laboratory in Software Engineering class: 107 total students. –34 agreed to participate: 14 excluded for logistical reasons, 20 successfully monitored –73 non-participants Treatment assignment (averaged over both tasks): –25% (6): no tools –25% (5): compilation notification only –50% (9): compilation and test error notification

Demographics: Experience (1) Years of experience (mean): –programming: 2.8 –using Emacs: 1.3 –using Java: 0.4 –using an IDE: 0.2 A relatively inexperienced group of participants.

Demographics: Experience (2) Usual environment: Unix 29%, Windows 38%, both 33%.

Problem Sets Participants completed (PS1) a poker game and (PS2) a graphing polynomial calculator. PS1 / PS2: –participants: 22 / 17 –written lines of code: –written methods: 18 / 31 –time worked (hours):

Test Suites Students were provided with test suites written by course staff. Passing the tests was worth 75% of the grade. PS1 / PS2: –tests: 49 / 82 –initial failing tests: 45 / 46 –running time (secs): 3 / 2 –compilation time (secs): 1.4

Test Suites: Usage –Waited until the end to test: participants 31%, non-participants 51% –Tested regularly throughout: participants 69%, non-participants 49% Test frequency (minutes) for those who tested regularly (participants / non-participants): mean 20 / 18, min 7 / 3, max 40 / 60.

Sources of data Quantitative: –Monitored state changes –Student submissions –Grades from TAs Qualitative: –Questionnaire from all students –E-mail feedback from some students –Interviews and e-mail from staff

SECTION: Quantitative Results

Success Variables –time worked: see next slide –grade: as assigned by TAs –errors: number of tests that the student submission failed –correct: true if the student solution passed all tests

More variables: where students spent their time All time measurements used time worked, at a five-minute resolution (a timeline marked at five-minute intervals :00, :05, :10, …, with an x at each interval containing a source edit). Some selected time measurements: –Total time worked –Ignorance time: between introducing an error and becoming aware of it –Fix time: between becoming aware of an error and fixing it
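Assuming each five-minute interval containing at least one source edit counts toward time worked (one plausible reading of the timeline above, not necessarily the study's exact instrumentation), the measure can be sketched as:

```python
def time_worked(edit_timestamps, resolution=300):
    """Bucket edit timestamps (seconds) at five-minute resolution and count
    each bucket containing at least one edit as five minutes worked.
    Illustrative sketch under the stated assumption."""
    buckets = {t // resolution for t in edit_timestamps}
    return len(buckets) * resolution
```

For example, edits at 0s, 100s, and 700s fall into two five-minute buckets, giving ten minutes of time worked.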

Treatment predicts correctness (p <.03): –No tool: N=11, 27% correct –Continuous compilation: N=10, 50% correct –Continuous testing: N=18, 78% correct

Other predictions Problem set predicts time worked. On PS1 only, years of Java experience predicts correctness. No other interesting predictions were found at the p <.05 level. No effect on working time was seen –student time budgets may have had an effect.

Ignorance and fix time Ignorance time and fix time are correlated, confirming previous result. Chart shown for the single participant with the most regression errors

Errors over time Participants with no tools make progress faster at the beginning, then taper off; may never complete. Participants with automatic tools make steadier progress.

SECTION: Qualitative Results

Multiple-choice impressions (all answers given on a scale of +3 = strongly agree to -3 = strongly disagree; continuous compilation N=20, continuous testing N=13): –The reported errors often surprised me –I discovered problems more quickly –I completed the assignment faster –I was distracted by the tool –I enjoyed using the tool –The tool changed the way I worked 1.7

Questions about future use I would use the tool… (Yes / No) –…for the rest of 6.170: 94% / 6% –…for my own programming: 80% / 20% I would recommend the tool to others: 90% / 10%

How did working habits change? “I got a small part of my code working before moving on to the next section, rather than trying to debug everything at the end.” “It was easier to see my errors when they were only with one method at a time.” “The constant testing made me look for a quick fix rather than examine the code to see what was at the heart of the problem.”

Positive feedback “Once I finally figured out how it worked, I got even lazier and never manually ran the test cases myself anymore.” Head TA: “the continuous testing worked well for students. Students used the output constantly, and they also seemed to have a great handle on the overall environment.”

Pedagogical usefulness Several students mentioned that continuous testing was most useful when: –Code was well-modularized –Specs and tests were written before development These are important goals of the class.

Negative comments “Since I had already been writing extensive Java code for a year using emacs and an xterm, it simply got in the way of my work instead of helping me. I suppose that, if I did not already have a set way of doing my coding, continuous testing could have been more useful.” Some didn’t understand the modeline, or how shadowing worked.

Suggestions for improvement More flexibility in configuration More information about failures Smarter timing of feedback Implementation issues –JUnit wrapper filtered JUnit output, which was confusing. –Infinite loops led to no output. –Irreproducible failures to run. –Performance not acceptable on all machines.

SECTION: Conclusion

Threats to validity Participants were relatively inexperienced: –2.8 years programming experience –Only 0.4 years with Java –67% were not familiar with regression testing –We can’t predict what effect more experience would have had This was almost a worst-case scenario for continuous testing: –Testing was easy –Regression errors were unlikely

Future Work We can’t repeat the experiment: continuous testing helps, and we ethically can’t deny it to some students. Case studies in industry Extend to bigger test suites: –Integrate with Delta Debugging (Zeller) –Better test prioritization –Test factoring: making small tests from big ones.

Conclusion Continuous testing has a significant effect on developer success in completing a programming task. Continuous testing does not significantly affect time worked. Most developers enjoy using continuous testing, and find it helpful.

The End Thanks to: –Michael Ernst –6.170 staff –participants

Introduction: Previous Work: Findings Finding 2: Continuous testing is more effective at reducing wasted time than: –changing test frequency –reordering tests Finding 3: Continuous testing reduces total development time by 10 to 15%.

Reasons cited for not participating (students could choose as many reasons as they wished): –Don't use Emacs: 45% –Don't use Athena: 29% –Didn't want the hassle: 60% –Feared work would be hindered: 44% –Privacy concerns: 7% Other IDEs cited, in order of popularity: Eclipse, text editors (vi, pico, EditPlus2), Sun ONE Studio, JBuilder.

Variables that predicted participation Students with more Java experience were less likely to participate: they already had work habits they didn’t want to change. Students with more experience compiling programs in Emacs were more likely to participate. We used a control group within the set of voluntary participants, so results were not skewed.

Demographics: Experience (1) Years of experience (mean / min / max): –programming: –using Java: –using Emacs: –using an IDE:

Problem Sets Participants completed several classes in a skeleton implementation of (PS1) a poker game and (PS2) a graphing polynomial calculator. PS1 / PS2: –participants: 22 / 17 –total lines of code: –skeleton lines of code: –written lines of code: –written classes: 4 / 2 –written methods: 18 / 31 –time worked (hours):

Test Suites Students were provided with test suites written by course staff. Passing the tests was worth 75% of the grade. PS1 / PS2: –tests: 49 / 82 –initial failing tests: 45 / 46 –lines of code: –running time (secs): 3 / 2 –compilation time (secs): 1.4