Experimentation in Computer Science (Part 1)
Outline
- Empirical Strategies
- Measurement
- Experiment Process
Empirical Strategies: Research Paradigms
- Qualitative: study objects in their natural setting; interpret phenomena based on people's explanations; discover causes noticed by the subjects themselves
- Quantitative: compare two or more groups or treatments; identify or quantify cause-effect relationships
- Qualitative and quantitative research are complementary: quantitative research uses measures to assess the effects of a treatment, whereas qualitative research uses beliefs and understanding to explain why treatments differ
Empirical Strategies: Types of Investigations
- Survey: often retrospective, conducted through interviews or questionnaires
- Case study: observational, often in situ for an ongoing project
- Experiment: controlled study, typically in a laboratory setting, with manipulation of variables
- Quasi-experiment: an experiment lacking randomization
- Experiments are quantitative; surveys and case studies can be quantitative or qualitative
Empirical Strategies: Surveys
- Purpose: descriptive (assert characteristics), explanatory (assess why), or exploratory (pre-study)
- Process: select variables, select a sample, collect data, analyze and generalize
- Example: survey 10% of the web application users in a community on their opinions of a new web technology, to infer the overall opinion across all developers in the community and to understand why they use or do not use this technology (see the sketch below)
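As a minimal sketch of the sampling step in this example (the user list, sampling fraction, and seed are illustrative assumptions, not prescribed values):

```python
import random

def draw_survey_sample(users, fraction=0.10, seed=42):
    """Draw a simple random sample of users to receive the questionnaire.

    `users` is the sampling frame (all users in the community); the
    fraction and seed are illustrative choices.
    """
    rng = random.Random(seed)          # fixed seed so the sample is reproducible
    k = max(1, round(len(users) * fraction))
    return rng.sample(users, k)        # sample without replacement

# Hypothetical usage: survey 10% of 500 registered users.
community = [f"user_{i}" for i in range(500)]
respondents = draw_survey_sample(community)
print(len(respondents))  # -> 50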
Empirical Strategies: Surveys
- Data collection methods:
  - Questionnaires: cheaper distribution and execution
  - Interviews: higher response rate; remove possible ambiguity
- Resource: E. Babbie, Survey Research Methods, Wadsworth, 1990
Empirical Strategies: Case Studies
- Study a phenomenon in a specific time and space
- Applicable to dynamic or larger studies: long-term evaluations, industrial evaluations
- Can be used to compare approaches (e.g., on a project and a sister project); not an experiment, because subjects and objects are not randomly selected
- Easier to plan than controlled experiments
Empirical Strategies: Case Studies
- Harder to control, hence less useful for asserting causality
- Example: observe the application of a web application development method within an industrial context, using two different development methodologies, over a long portion of the lifecycles of two development projects
- Resource: R. K. Yin, Case Study Research: Design and Methods, Sage Publications, 1994
Empirical Strategies: Experiments
- The most controlled form of study
- Manipulate independent variables and view the effects on dependent variables
- Controlled environment; reproducible
- Randomization over subjects and objects
- Often involve a baseline (control group)
- Resource: Wohlin et al., Experimentation in Software Engineering, Kluwer, 2nd ed.
Empirical Strategies: Experiments
- Uses: confirm or reject theories; confirm conventional wisdom and test preconceptions; explore relationships and interactions; evaluate the accuracy of models; validate measures
- Example: randomly assign a set of web app testers from company C to two groups, one using testing technique A and one using testing technique B; ask them to apply these techniques to a randomly selected set of C's web apps containing seeded faults, and measure fault-detection effectiveness (a sketch of the assignment step appears below)
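A minimal sketch of the random-assignment step in this example; the tester names and group sizes are hypothetical, and in a real experiment they come from the design:

```python
import random

def assign_two_groups(subjects, seed=1):
    """Randomly split subjects into two equal-sized treatment groups
    (technique A vs. technique B). Group labels are illustrative."""
    rng = random.Random(seed)    # fixed seed so the assignment is reproducible
    shuffled = subjects[:]       # copy so the original list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"technique_A": shuffled[:half], "technique_B": shuffled[half:]}

testers = [f"tester_{i}" for i in range(20)]   # hypothetical subject pool
groups = assign_two_groups(testers)
print(len(groups["technique_A"]), len(groups["technique_B"]))  # -> 10 10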
Empirical Strategies: Quasi-Experiments
- The same as an experiment, except lacking randomization over subjects or objects
- Drawback: reduced generality
- Example: a large set of test suites created by a specific technique, and a set of equivalently sized randomly generated test suites, are applied to a single large web app that your research group has developed (see the sketch below)
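A sketch of how the equivalently sized random suites in this example might be built, assuming a hypothetical pool of candidate test inputs; matching sizes controls for suite size as a confounding factor:

```python
import random

def random_suite(test_pool, size, seed):
    """Sample a random test suite of a given size from a pool of candidate
    test inputs, matching the size of a technique-built suite."""
    rng = random.Random(seed)
    return rng.sample(test_pool, size)

# Hypothetical: one same-sized random counterpart per technique-built suite.
pool = [f"test_{i}" for i in range(1000)]
technique_suites = [["test_1", "test_7", "test_42"]]  # placeholder data
random_suites = [random_suite(pool, len(s), seed=i)
                 for i, s in enumerate(technique_suites)]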
Empirical Strategies: Comparison
Empirical Strategies: Factors in Selecting a Strategy
- Type of research question
- Level of control required
- Time and space of the target of study
- Resources available
- All types of studies are valuable
Experimentation in Software Engineering: Outline
- Empirical Strategies
- Measurement
- Experiment Process
Measurement: Terminology
- Measurement: a mapping from the empirical world to the formal, relational world
- Measure: a number or symbol assigned to an entity (an object in the world) by this mapping, in order to characterize and manipulate an attribute
- Valid measure: a measure that captures the necessary properties of the attribute and characterizes it properly in mathematical terms
- Example: mapping people's heights to centimeters is a measurement; it is valid for the attribute "tallness" because the ordering of the numbers preserves the empirical "taller than" relation
Measurement: Failing to Capture Properties
- Goal: measure programmer productivity
- Usual approach: code size or coding effort
  - Size = LOC
  - Effort = person-months
- Problems with LOC: LOC reflects only part of the output, but we want to measure production (output) as delivered benefit (see the sketch after this list)
- Problems with effort: person-months are not equivalent across employers, or even across developers; full-time vs. part-time work
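A small illustration of the LOC problem: the two functions below deliver exactly the same benefit, yet a LOC-based productivity measure would score their authors very differently.

```python
# Two functionally equivalent implementations: identical delivered
# benefit, very different LOC, so a LOC-based productivity measure
# would rank their authors differently.

def total_verbose(values):
    total = 0
    for v in values:
        total = total + v
    return total

def total_concise(values):
    return sum(values)

assert total_verbose([1, 2, 3]) == total_concise([1, 2, 3]) == 6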
Measurement: Objective vs. Subjective
- An objective measure involves no subjective judgment about the measurement value
- A subjective measure does involve subjective judgment; it depends on both the object and the viewpoint from which the measurement is taken
- Examples: objective measures include lines of code and delivery date; subjective measures include personnel skill and usability
Measurement: Direct vs. Indirect
- A direct measure of an attribute does not involve the measurement of other attributes
- An indirect (derived) measure does involve the measurement of other attributes; often, only indirect measures are available
- Examples: a direct measure is the number of defects; an indirect measure is programmer productivity = LOC / effort (see the sketch below)
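As a minimal sketch of how an indirect measure is derived from other measures (the numbers below are made up for illustration):

```python
def productivity_loc_per_month(loc, person_months):
    """Indirect measure: derived from two other measures (size in LOC and
    effort in person-months), so it inherits the validity problems of
    both inputs."""
    return loc / person_months

# Worked example: 6000 LOC over 4 person-months -> 1500 LOC/person-month.
print(productivity_loc_per_month(6000, 4))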
Experimentation in Software Engineering: Outline
- Empirical Strategies
- Measurement
- Experiment Process
Experiment Process: Motivation
Properly designed experiments provide:
- Control of subjects, objects, and instrumentation, allowing us to draw more general conclusions, or conclusions about causality
- The ability to use statistical analyses (hypothesis testing; see the sketch below)
- Support for replication
Achieving a proper design requires us to define a process for experimentation
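A minimal sketch of the hypothesis-testing step, assuming per-subject fault-detection scores for two treatment groups; the data is invented, and the choice of an independent two-sample t-test assumes roughly normal scores:

```python
from scipy import stats

# Hypothetical fault-detection effectiveness (% of seeded faults found)
# per subject in each treatment group.
technique_A = [62, 71, 58, 66, 74, 69]
technique_B = [55, 60, 52, 63, 57, 59]

# Two-sample t-test of H0: mean effectiveness is equal across treatments.
t_stat, p_value = stats.ttest_ind(technique_A, technique_B)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
# Reject H0 at alpha = 0.05 if p_value < 0.05.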
Experiment Process: Overview
[Diagram] At the theory level, a hypothesis relates a cause construct to an effect construct (the cause-effect construct). At the observation level, the experiment relates a treatment to an outcome (the treatment-outcome construct); the treatment corresponds to the independent variable and the outcome to the dependent variable. The experiment objective connects the constructs to the treatment and outcome, and the experiment operation yields the observations.
Experiment Process: Variables, Factors
- Variable: an entity that can change and take on different values
- Independent variable(s): variables that are manipulated and controlled; those actually varied in the experiment are also known as factors
- Dependent variable(s): variables on which we want to observe the effects of changes in the independent variables
- Independent variables other than the chosen factors must be kept at fixed levels and carefully controlled, so that they do not confound the results
Experiment Process: Treatments
- Treatment: one particular value of a factor (the independent variable being manipulated)
- In the experiment design, treatments of the factor are applied while the other independent variables are held at fixed levels
Experiment Process: Objects and Subjects
- Treatments are assigned to subjects and objects
- Subjects: the people to whom treatments are applied or, more often, who apply the treatments
- Objects: the artifacts to which treatments are applied, or that are manipulated by people
- Tests (or trials): a combination of treatments with subjects and/or objects (see the sketch below)
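One way to picture this terminology is to enumerate trials as treatment/subject/object combinations; all names below are hypothetical:

```python
from itertools import product

treatments = ["technique_A", "technique_B"]   # values of the factor
subjects   = ["tester_1", "tester_2"]         # people applying treatments
objects    = ["web_app_1", "web_app_2"]       # artifacts treated

# Each trial is one (treatment, subject, object) combination.
trials = list(product(treatments, subjects, objects))
print(len(trials))   # -> 8 trials in a full factorial layout
```

In a real design, the experiment typically runs only a subset of these combinations, with the assignment of treatments to subjects and objects randomized.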