Generalization through a series of replicated experiments on maintainability
Erik Arisholm

14/11/2005

An ideal controlled experiment …
– Relevant research question!
– Statistical generalization from a sufficiently large, representative sample of systems, tasks, settings and subjects to a well-defined and relevant target population of systems, tasks, settings and subjects (statistical power and external validity)
– Well-defined independent and dependent variables that fully and reliably measure the concepts under study (construct validity and external validity)
– Equivalent treatment groups (e.g., random assignment and avoiding post-assignment drop-outs) to ensure internal validity
– Supported by theory that can explain the observed effects for all possible factor level combinations
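The point about equivalent treatment groups can be illustrated with a minimal sketch of random assignment. This is not the procedure used in the experiments; the function name `assign_to_groups`, the subject identifiers and the seed are hypothetical, and only the two treatment labels (centralized, delegated) come from the slides.

```python
import random


def assign_to_groups(subjects, treatments, seed=None):
    """Randomly assign subjects to treatments with near-equal group sizes.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)      # local generator so the draw is reproducible
    shuffled = list(subjects)      # copy, so the caller's list is not modified
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    # Deal the shuffled subjects out round-robin, like dealing cards.
    for i, subject in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(subject)
    return groups


if __name__ == "__main__":
    subjects = [f"S{i:02d}" for i in range(1, 13)]   # made-up subject IDs
    groups = assign_to_groups(subjects, ["centralized", "delegated"], seed=42)
    for treatment, members in groups.items():
        print(treatment, sorted(members))
```

Shuffling first and then dealing round-robin keeps the group sizes within one of each other, which simple per-subject coin flips would not guarantee.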

Experiments on Software Maintenance
A series of five controlled experiments (which can be considered as one quasi-experiment) in which the subjects
– 295 junior/intermediate/senior Java consultants from Norway, Sweden and the UK, and
– 273 undergraduate/graduate students from Norway and Canada
performed maintenance tasks on two alternative designs of the same Java system to assess the effects of (combinations of)
– control style (centralized vs delegated),
– maintenance task order (easy vs difficult task first),
– documentation (UML vs no UML), and
– development process (pair programming vs individual)
on software maintainability (change effort and correctness)

Centralized vs Delegated Control Style
The Delegated Control Style:
– Rebecca Wirfs-Brock: A delegated control style ideally has clusters of well defined responsibilities distributed among a number of objects. To me, a delegated control architecture feels like object design at its best…
– Alistair Cockburn: [The delegated coffee-machine design] is, I am happy to see, robust with respect to change, and it is a much more reasonable "model of the world."
The Centralized Control Style:
– Rebecca Wirfs-Brock: A centralized control style is characterized by single points of control interacting with many simple objects. To me, centralized control feels like a "procedural solution" cloaked in objects…
– Alistair Cockburn: Any oversight in the "mainframe" object (even a typo!) [in the centralized coffee-machine design] means potential damage to many modules, with endless testing and unpredictable bugs.

The individual experiments
Exp 1: The Original Experiment with students and pen-and-paper tasks (Fall 1999)
– Effect of Centralized (bad) vs Delegated (good) Control Style
– Arisholm, Sjøberg & Jørgensen, "Assessing the Changeability of two Object-Oriented Design Alternatives - a Controlled Experiment," Empirical Software Engineering, vol. 6, no. 3.
– 36 undergraduate students and 12 graduate students
Exp 2: The Control-style experiment with professional Java developers and Java tools (Fall Spring 2002)
– Effect of Centralized vs Delegated Control Style for Categories of Developer
– Arisholm & Sjøberg, "Evaluating the Effect of a Delegated versus Centralized Control Style on the Maintainability of Object-Oriented Software," IEEE Transactions on Software Engineering, vol. 30, no. 8.
– 99 professionals, 59 students
Exp 3: UML experiments (Spring Fall 2004)
– Effect of UML (vs No UML) for the Delegated Control Style
– Arisholm, Briand, Hove & Labiche, "The Impact of UML Documentation on Software Maintenance: An Experimental Evaluation," Simula Technical Report, submitted to IEEE Transactions on Software Engineering.
– 20 students from UiO (Spring 2003) + 78 students from Carleton Univ., Canada (Fall 2004)
Exp 4: Task Order Experiment (Fall 2001-Spring 2005)
– Effect of Task Order and Centralized vs Delegated Control Style
– Arisholm, Wang & Syrstad, "The impact of Task Order on Maintainability", in preparation
– Easy task first: 59 students from Exp 2
– Difficult task first: 66 students from NTNU (Spring 2005)
Exp 5: Pair programming experiment (Fall 2003-Spring 2005)
– Effect of Pair Programming (vs individual programming) and Delegated vs Centralized Control Style
– Gallis, Arisholm, Dybå & Sjøberg, "Evaluating Pair Programming among Professional Java Developers", in preparation
– 196 professional programmers in Norway, Sweden, and UK (98 pairs)
– 99 individuals (from Exp 2)

A quasi-experiment of increasing scope

Summary of experimental results
Performing change tasks on a delegated control style requires, on average, more time and results in more defects than on a centralized control style, in particular for novices (undergraduate students and junior consultants)
– Only seniors seem to have the necessary skills to benefit from the more "elegant" delegated control style.
– Explanation: Unlike experts, novices mentally trace the code in order to understand it. This tracing effort is more difficult in a delegated control style
– Results are consistent for pen-and-paper tasks versus using Java tools
– Results are consistent when performing the most difficult task first
Two ways to decrease the cognitive complexity of the delegated control style (for novices in particular)
– Extensive UML documentation
– Pair programming
Uncertain whether the results are valid for
– other systems and tasks (external validity)
– alternative ways of measuring maintainability (construct validity)

Dealing with non-equivalent groups (to compare results across experiments)
Necessary to have a common pre-test to adjust for skill differences between groups (using ANCOVA)*
*T.D. Cook and D.T. Campbell (1979), Quasi-Experimentation: Design & Analysis Issues for Field Settings, Houghton Mifflin Company.
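As a rough illustration of the pre-test adjustment idea, the sketch below computes covariate-adjusted group means, the adjustment step at the core of ANCOVA. It is not the analysis used in the experiments. The function names and all data values are made up, it assumes the pre-test/outcome slope is the same in every group, and it omits the significance test a full ANCOVA would include.

```python
from statistics import mean


def pooled_within_slope(groups):
    """Pooled within-group regression slope of outcome on pre-test score."""
    sxy = 0.0
    sxx = 0.0
    for pre, post in groups.values():
        mx, my = mean(pre), mean(post)
        sxy += sum((x - mx) * (y - my) for x, y in zip(pre, post))
        sxx += sum((x - mx) ** 2 for x in pre)
    return sxy / sxx


def adjusted_means(groups):
    """Outcome means adjusted to the grand mean of the pre-test scores."""
    slope = pooled_within_slope(groups)
    grand_pre = mean(x for pre, _ in groups.values() for x in pre)
    return {
        name: mean(post) - slope * (mean(pre) - grand_pre)
        for name, (pre, post) in groups.items()
    }


if __name__ == "__main__":
    # Hypothetical data: (pre-test scores, outcome scores) per group.
    # Group "B" is more skilled on the pre-test, which inflates its raw outcome.
    groups = {
        "A": ([1.0, 2.0, 3.0], [12.0, 14.0, 16.0]),
        "B": ([3.0, 4.0, 5.0], [19.0, 21.0, 23.0]),
    }
    raw = {name: mean(post) for name, (_, post) in groups.items()}
    print("raw means:     ", raw)                     # A: 14.0, B: 21.0
    print("adjusted means:", adjusted_means(groups))  # A: 16.0, B: 19.0
```

In this made-up data the raw difference between groups is 7, but after adjusting both groups to the same pre-test level the difference shrinks to 3, because part of the raw gap was due to the skill difference rather than the treatment.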

Effect of Pen & Paper vs Tools (Exp 1 + Exp 2)

Effect of Expertise (Exp 2)

Effect of UML (Exp 3)

Effect of Task Order (Exp 2 + Exp 4)

Effect of Pair Programming (Exp 2 + Exp 5)