1 Experimentation in Computer Science – Part 3

2 Experimentation in Software Engineering: Outline
• Empirical Strategies
• Measurement
• Experiment Process (continued)

3 Experiment Process: Phases
Experiment Idea → Experiment Definition → Experiment Planning → Experiment Operation → Analysis & Interpretation → Presentation & Package → Conclusions

4 Experiment Planning: Overview
Planning steps (within the overall process Experiment Definition → Experiment Planning → Experiment Operation):
Context Selection → Hypothesis Formulation → Variables Selection → Selection of Subjects → Experiment Design → Instrumentation → Validity Evaluation

Experiment Planning: Instrumentation
• Instrumentation types:
  - Objects (e.g., specs, code)
  - Guidelines (e.g., process descriptions, checklists, tutorial documents)
  - Measurement instruments (surveys, forms, automated data collection tools)
• Overall goal of instrumentation: provide the means for performing and monitoring the experiment without affecting its control (instrumentation must not affect outcomes)

Experiment Planning: Validity Evaluation
• Threats to external validity concern the ability to generalize results outside the experimental setting
• Threats to internal validity concern the ability to conclude that a causal effect exists between independent and dependent variables
• Threats to construct validity concern the extent to which variables and measures accurately reflect the constructs under study
• Threats to conclusion validity concern issues that affect our ability to draw accurate statistical conclusions

Experiment Planning: Process and Threats Related
[Figure: at the theory level, a cause construct is related to an effect construct via the cause-effect construct (the hypothesis); at the observation level, a treatment is related to an outcome via the treatment-outcome relationship. The treatment operationalizes the cause construct as the independent variable; the outcome operationalizes the effect construct as the dependent variable.]

Experiment Planning: Process and Threats Related
[Same figure, annotated with the four threat types: construct validity on the links between constructs and their operationalizations (cause construct/treatment, effect construct/outcome); internal validity on the causal treatment → outcome link; conclusion validity on the treatment-outcome relationship; external validity on generalizing the observed results back to the theory.]

Experiment Planning: Threats to External Validity
• Population: subject population is not representative of the population we wish to generalize to
• Place: experimental setting or materials are not representative of the setting we wish to generalize to
• Time: experiment is conducted at a time that affects results
Reduce external validity threats in a given experiment by making the environment as realistic as possible; however, reality is not homogeneous, so it is important to report the environment's characteristics. Reduce external validity threats in the long term through replication.

Experiment Planning: Threats to Internal Validity
• Instrumentation: measurement tools report inaccurately or affect results
• Selection: groups selected are not equivalent
• Learning: subjects learn over the course of the experiment, altering later results
• Mortality: subjects drop out of the experiment
• Social effects: e.g., control group resents treatment group (demoralization or rivalry)
Reduce internal validity threats through careful experiment design.

Experiment Planning: Threats to Construct Validity
• Inadequate preoperational explication of constructs: the theory is not clear enough (e.g., what does "better" mean?)
• Mono-operation or mono-method bias: using a single independent variable, case, subject, treatment, or measure may under-represent constructs
• Levels of constructs: using incorrect levels of constructs may confound the presence of a construct with its level
• Interaction of testing and treatment: testing itself makes subjects sensitive to the treatment; the test becomes part of the treatment
• Social effects: experimenter expectancy, evaluation apprehension, hypothesis guessing
Reduce construct validity threats through careful design and replication.

Experiment Planning: Threats to Conclusion Validity
• Low statistical power: increases the risk of failing to reject a false null hypothesis
• Violated assumptions of statistical tests: many tests carry assumptions, e.g., normally distributed and independent samples
• Fishing: searching for a specific result makes analyses non-independent, and researchers may bias results by seeking particular outcomes
• Reliability of measures: if measuring the same thing twice does not give the same result, the measures are not reliable
Reduce conclusion validity threats through careful design and, where needed, consultation with statistical experts.
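Low statistical power is the one conclusion-validity threat that can be quantified before the experiment runs. Below is a minimal sketch of an a priori power analysis, assuming the statsmodels package and an expected medium effect size (both are assumptions, not part of the original slides):

```python
# A priori power analysis for an independent-samples t-test.
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(
    effect_size=0.5,  # expected standardized mean difference (assumed "medium")
    alpha=0.05,       # significance level
    power=0.8,        # desired P(reject H0 | H0 is false)
)
print(f"subjects needed per group: {n_per_group:.0f}")  # roughly 64
```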

Experiment Planning: Priorities Among Validity Threats
• Decreasing some types of threats may increase others (e.g., using CS students as subjects increases group size and reduces heterogeneity, which aids conclusion validity but reduces external validity)
• Tradeoffs depend on the type of study:
  - Theory testing is more interested in internal and construct validity than external validity
  - Applied experimentation is more interested in external and possibly conclusion validity

14 Experiment Process: Phases
Experiment Idea → Experiment Definition → Experiment Planning → Experiment Operation → Analysis & Interpretation → Presentation & Package → Conclusions

15 Experiment Operation: Overview
• Experiment operation: carrying out the actual experiment and collecting data
• Three phases:
  - Preparation
  - Execution
  - Data validation

16 Experiment Operation: Preparation
• Locate participants
• Offer inducements to obtain participants
• Obtain participant consent, and IRB approval where required
• Consider confidentiality (maintain it, and inform participants about it)
• Avoid deception where it affects participants; if deception is used, reveal it afterward and explain why it was necessary (beware validity tradeoffs: providing information is ethically good but may affect results)
• Prepare instrumentation:
  - Objects, guidelines, tools, forms
  - Use pilot studies and walkthroughs to reduce threats

17 Experiment Operation: Execution
• Execution might take place over a small set of specified occasions or across a long time span
• Data collection takes place: subjects or interviewers fill out forms, tools collect metrics
• Consider interaction between the experiment and its environment; e.g., if the experiment is performed in vivo, watch for confounding effects (the experiment process altering behavior)

18 Experiment Operation: Data Validation
• Verify that data has been collected correctly
• Verify that data is reasonable
• Consider whether outliers exist and should be removed (removal must have good reasons)
• Verify that the experiment was conducted as intended
• Post-experiment questionnaires can assess whether subjects understood instructions
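To make the "collected correctly" and "reasonable" checks concrete, here is a minimal sketch using pandas; the file name and column names are hypothetical:

```python
# Mechanical data-validation checks before analysis.
import pandas as pd

df = pd.read_csv("experiment_results.csv")  # hypothetical data file

# Collected correctly: no missing outcomes, one row per subject
assert df["outcome"].notna().all(), "missing outcome values"
assert not df["subject_id"].duplicated().any(), "duplicate subjects"

# Reasonable: flag implausible task times for manual inspection
suspicious = df[(df["time_minutes"] <= 0) | (df["time_minutes"] > 480)]
print(f"{len(suspicious)} suspicious rows to inspect before analysis")
```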

19 Experiment Process: Phases
Experiment Idea → Experiment Definition → Experiment Planning → Experiment Operation → Analysis & Interpretation → Presentation & Package → Conclusions

20 Analysis and Interpretation: Overview
Quantitative interpretation can include:
• Descriptive statistics: describe and graphically present the data set; used before hypothesis testing to better understand the data and identify outliers
• Data set reduction: locate and possibly remove anomalous data points
• Hypothesis testing: apply statistical tests to determine whether the null hypothesis can be rejected
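As an illustration of the descriptive-statistics step, a minimal pandas sketch (the file and column names are hypothetical):

```python
# First look at the data, per treatment group, before any testing.
import pandas as pd

df = pd.read_csv("experiment_results.csv")  # hypothetical data file
print(df.groupby("treatment")["defects_found"].describe())
# count, mean, std, min, quartiles, and max per group help spot
# skew and candidate outliers early
```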

21 Analysis and Interpretation: Visualizing Data Sets
• Graphs are an effective way to provide an overview of a data set
• Basic graph types for use in visualization:
  - Scatter plots
  - Box plots
  - Line plots
  - Bar charts
  - Cumulative bar charts
  - Pie charts
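A minimal matplotlib sketch of two of these graph types, using synthetic data:

```python
# Box plots for distribution overviews, a scatter plot for spotting
# trends and outliers. All data below is synthetic.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
group_a = rng.normal(10, 2, 30)            # e.g., outcome under treatment A
group_b = rng.normal(12, 2, 30)            # e.g., outcome under treatment B
size = rng.uniform(100, 1000, 30)          # e.g., program size
defects = 0.01 * size + rng.normal(0, 2, 30)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
ax1.boxplot([group_a, group_b])            # box plots
ax1.set_xticks([1, 2])
ax1.set_xticklabels(["A", "B"])
ax2.scatter(size, defects)                 # scatter plot
ax2.set_xlabel("size")
ax2.set_ylabel("defects")
plt.tight_layout()
plt.show()
```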

22 Analysis and Interpretation: Data Set Reduction
• Hypothesis testing techniques depend on the quality of the data set; data set reduction improves quality by removing anomalous data (outliers)
• Outliers may be removed, but only for good reasons, e.g., they represent rare events not likely to occur again
• Scatter plots can help find outliers
• Statistical tests can estimate the probability that points are outliers
• Highly redundant data can be hard to analyze; factor analysis and principal component analysis can identify orthogonal factors with which to replace redundant ones
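One common statistical screening criterion, shown here as a sketch on synthetic data, is the 1.5 × IQR rule (the slides do not prescribe a specific test, so this is one option among several):

```python
# Flag points outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] as outlier candidates.
import numpy as np

data = np.array([22, 25, 24, 23, 26, 25, 24, 71, 23, 25])  # 71 looks odd
q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = data[(data < low) | (data > high)]
print("candidate outliers:", outliers)
# Candidates should only be removed for a documented reason,
# e.g., a rare event not expected to recur.
```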

23 Analysis and Interpretation: Hypothesis Testing
• Hypothesis testing: can we reject H0?
  - If statistical tests say we can't, we draw no conclusions
  - If tests say we can, H0 is rejected at a given significance α
• α = P(type I error) = P(reject H0 | H0 is true)
• We also calculate the p-value: the lowest significance level at which we can reject H0
• Typically α is 0.05; to claim significance, the p-value must be < α
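A minimal sketch of this decision rule with scipy, on synthetic data:

```python
# H0: the two treatments have equal mean outcomes.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
treatment_1 = rng.normal(10.0, 2.0, 30)
treatment_2 = rng.normal(11.5, 2.0, 30)

alpha = 0.05
t_stat, p_value = stats.ttest_ind(treatment_1, treatment_2)
if p_value < alpha:
    print(f"p = {p_value:.4f} < {alpha}: reject H0 at significance {alpha}")
else:
    print(f"p = {p_value:.4f} >= {alpha}: cannot reject H0, no conclusion")
```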

24 Analysis and Interpretation: Statistical Tests per Design

| Design                                             | Parametric     | Non-parametric         |
|----------------------------------------------------|----------------|------------------------|
| One factor, one treatment                          |                | Chi-2, Binomial test   |
| One factor, two treatments, completely randomized  | t-test, F-test | Mann-Whitney, Chi-2    |
| One factor, two treatments, paired comparison      | Paired t-test  | Wilcoxon, Sign test    |
| One factor, more than two treatments               | ANOVA          | Kruskal-Wallis, Chi-2  |
| More than one factor                               | ANOVA          |                        |
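For illustration, several rows of the table map directly onto scipy.stats functions; a sketch on synthetic data:

```python
# Each call returns a test statistic and a p-value.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
a, b, c = (rng.normal(m, 2, 20) for m in (10, 11, 12))

stats.ttest_ind(a, b)     # two treatments, completely randomized (parametric)
stats.mannwhitneyu(a, b)  # ... its non-parametric counterpart
stats.ttest_rel(a, b)     # two treatments, paired comparison (parametric)
stats.wilcoxon(a, b)      # ... its non-parametric counterpart
stats.f_oneway(a, b, c)   # more than two treatments (one-way ANOVA)
stats.kruskal(a, b, c)    # ... its non-parametric counterpart
```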

25 Analysis and Interpretation: Statistical Tests
• It is important to choose the right test; the type of data must be appropriate:
  - Are data items paired or not?
  - Is the data normally distributed or not?
  - Are the data sets completely independent or not?
• Take a statistics course, see texts such as Montgomery, consult with statisticians, use statistical packages
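A minimal sketch of letting a normality check guide the parametric vs. non-parametric choice; the Shapiro-Wilk test used here is one common option, not something the slides mandate:

```python
# Choose t-test vs. Mann-Whitney based on a normality check.
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
a = rng.normal(10, 2, 30)
b = rng.exponential(10, 30)   # clearly non-normal

def normal_enough(x, alpha=0.05):
    # Shapiro-Wilk: small p-value suggests non-normality
    return stats.shapiro(x).pvalue >= alpha

if normal_enough(a) and normal_enough(b):
    print("t-test:", stats.ttest_ind(a, b))
else:
    print("Mann-Whitney:", stats.mannwhitneyu(a, b))
```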

26 Analysis and Interpretation: Statistical vs. Practical Significance
• Statistical significance does not imply practical importance. E.g., if T1 is shown with statistical significance to be 1% more effective than T2, it must still be decided whether 1% matters
• Lack of statistical significance does not imply lack of practical importance. The fact that H0 cannot be rejected at level α does not mean that H0 is true, and results of high practical importance may justify accepting a lower confidence (a higher α)
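A minimal sketch of the first point: with large samples, a practically tiny difference can be highly significant, so an effect-size measure such as Cohen's d (an addition here, not part of the slides) helps separate the two notions:

```python
# Statistically significant, yet a small effect.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
t1 = rng.normal(100.0, 10.0, 5000)   # huge samples make tiny
t2 = rng.normal(101.0, 10.0, 5000)   # differences "significant"

p = stats.ttest_ind(t1, t2).pvalue
pooled_sd = np.sqrt((t1.var(ddof=1) + t2.var(ddof=1)) / 2)
d = (t2.mean() - t1.mean()) / pooled_sd
print(f"p = {p:.2e} (significant), Cohen's d = {d:.2f} (small effect)")
# Whether a ~1% improvement matters is a practical, not statistical, question.
```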

27 Experiment Process: Phases
Experiment Idea → Experiment Definition → Experiment Planning → Experiment Operation → Analysis & Interpretation → Presentation & Package → Conclusions

28 Presentation: An Outline for an Experiment Report
1. Introduction, Motivation
2. Background, Prior Work
3. Empirical Study
   3.0 Research Questions
   3.1 Objects of Analysis (participants, objects)
   3.2 Variables and Measures (independent variables, dependent variables, other factors)
   3.3 Experiment Setup (setup details, operational details)
   3.4 Analysis Strategy
   3.5 Threats to Validity
   3.6 Data and Analysis
4. Interpretation
5. Conclusions

Presentation Issues
• Supporting replicability
• What to say and what not to say? How much to say?
• Describing design decisions