Published by Alexia Newton. Modified over 9 years ago.
1
Experimentation in Computer Science – Part 3
2
Experimentation in Software Engineering: Outline
- Empirical Strategies
- Measurement
- Experiment Process (Continued)
3
Experiment Process: Phases
Experiment Idea -> Experiment Process (Experiment Definition -> Experiment Planning -> Experiment Operation -> Analysis & Interpretation -> Presentation & Package) -> Conclusions
4
Experiment Planning: Overview
Experiment Definition -> Experiment Planning -> Experiment Operation
Planning steps:
- Context Selection
- Hypothesis Formulation
- Variables Selection
- Selection of Subjects
- Experiment Design
- Instrumentation
- Validity Evaluation
5
Experiment Planning: Instrumentation
Instrumentation types:
- Objects (e.g., specs, code)
- Guidelines (e.g., process descriptions, checklists, tutorial documents)
- Measurement instruments (surveys, forms, automated data collection tools)
Overall goal of instrumentation: facilitate performing the experiment without affecting control (instrumentation must not affect outcomes)
6
Experiment Planning: Validity Evaluation
- Threats to external validity concern the ability to generalize results outside the experimental setting
- Threats to internal validity concern the ability to conclude that a causal effect exists between independent and dependent variables
- Threats to construct validity concern the extent to which variables and measures accurately reflect the constructs under study
- Threats to conclusion validity concern issues that affect our ability to draw accurate statistical conclusions
7
Experiment Planning: Process and Threats Related
[Diagram]
- Theory (hypothesis): cause construct -> effect construct (cause-effect construct)
- Observation: treatment -> outcome (treatment-outcome construct)
- Treatment corresponds to the independent variable; outcome corresponds to the dependent variable
8
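Experiment Planning: Process and Threats Related
[Diagram, with validity threats added]
- Theory (hypothesis): cause construct -> effect construct (cause-effect construct)
- Observation: treatment -> outcome (treatment-outcome construct)
- Treatment corresponds to the independent variable; outcome corresponds to the dependent variable
- Validity threats annotated on the diagram: external, construct, internal, conclusion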
9
Experiment Planning: Threats to External Validity
- Population: subject population not representative of the population we wish to generalize to
- Place: experimental setting or materials not representative of the setting we wish to generalize to
- Time: experiment is conducted at a time that affects results
Reduce external validity threats in a given experiment by making the environment as realistic as possible; however, reality is not homogeneous, so it is important to report environment characteristics.
Reduce external validity threats long-term through replication.
10
Experiment Planning: Threats to Internal Validity
- Instrumentation: measurement tools report inaccurately or affect results
- Selection: groups selected are not equivalent
- Learning: subjects learn over the course of the experiment, altering later results
- Mortality: subjects drop out of the experiment
- Social effects: e.g., control group resents treatment group (demoralization or rivalry)
Reduce internal threats through careful experiment design.
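One common design measure against the selection threat (non-equivalent groups) is random assignment of subjects to groups. The following is a minimal sketch under stated assumptions, not a method prescribed by the slides: the subject IDs, the group names, and the `assign_groups` helper are all hypothetical.

```python
import random

def assign_groups(subjects, groups=("control", "treatment"), seed=0):
    """Randomly assign subjects to groups of (nearly) equal size."""
    rng = random.Random(seed)  # fixed seed so the assignment can be reproduced
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    assignment = {g: [] for g in groups}
    # Deal the shuffled subjects out to the groups in round-robin order.
    for i, subject in enumerate(shuffled):
        assignment[groups[i % len(groups)]].append(subject)
    return assignment

subjects = [f"S{i:02d}" for i in range(1, 11)]  # hypothetical subject IDs
assignment = assign_groups(subjects)
for group, members in assignment.items():
    print(group, sorted(members))
```

Recording the seed alongside the results lets a replication reconstruct exactly who was in which group.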
11
Experiment Planning: Threats to Construct Validity
- Inadequate preoperational explication of constructs: theory isn’t clear enough (e.g., what is “better”)
- Mono-operation or mono-method bias: using a single independent variable, case, subject, treatment, or measure may under-represent constructs
- Levels of constructs: using incorrect levels of constructs may confound presence of a construct with its level
- Interaction of testing and treatment: testing itself makes subjects sensitive to treatment; test is part of treatment
- Social effects: experimenter expectancy, evaluation apprehension, hypothesis guessing
Reduce construct threats through careful design and replication.
12
Experiment Planning: Threats to Conclusion Validity
- Low statistical power: increases risk of being unable to reject a false null hypothesis
- Violated assumptions of statistical tests: some tests have assumptions, e.g., about normally distributed and independent samples
- Fishing: searching for a specific result causes analyses to not be independent, and researchers may influence results by seeking specific outcomes
- Reliability of measures: if you can’t measure the result twice with equal outcomes, measures aren’t reliable
Reduce conclusion validity threats through careful design, and perhaps through consultation with statistical experts.
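To illustrate the low-power threat, the sketch below estimates the power of a two-sided, two-sample comparison of means using a normal approximation. This is an illustrative assumption, not a procedure from the slides: the standardized effect size of 0.5 and the group sizes are hypothetical, and an exact calculation would use the t distribution.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample test (normal approximation).

    effect_size is the standardized difference between group means.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)            # critical value for the chosen alpha
    shift = effect_size * sqrt(n_per_group / 2)  # expected value of the test statistic
    # Probability that the statistic falls in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

for n in (10, 20, 50, 100):  # hypothetical group sizes
    print(n, round(approx_power(0.5, n), 2))
```

Running such a calculation during planning shows how many subjects are needed before a false null hypothesis is likely to be rejected.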
13
Experiment Planning: Priorities Among Validity Threats
Decreasing some types of threats may cause others to increase. (E.g., using CS students increases group size, reduces heterogeneity, aids conclusion validity, reduces external validity.)
Tradeoffs need to be considered for the type of study:
- Theory testing is more interested in internal and construct validity than external
- Applied experimentation is more interested in external and possibly conclusion validity
14
Experiment Process: Phases
Experiment Idea -> Experiment Process (Experiment Definition -> Experiment Planning -> Experiment Operation -> Analysis & Interpretation -> Presentation & Package) -> Conclusions
15
Experiment Operation: Overview
Experiment operation: carrying out the actual experiment and collecting data
Three phases:
- Preparation
- Execution
- Data validation
16
Experiment Operation: Preparation
- Locate participants
- Offer inducements to obtain participants
- Obtain participant consent, maybe also IRB approval
- Consider confidentiality (maintain it, inform participants about it)
- Avoid deception where it affects participants; if it is used, reveal it later and explain why it was necessary (beware validity tradeoffs; providing information is good but may affect results)
- Prepare instrumentation: objects, guidelines, tools, forms
- Use pilot studies and walkthroughs to reduce threats
17
Experiment Operation: Execution
- Execution might take place over a small set of specified occasions, or across a long time span
- Data collection takes place: subjects or interviewers fill out forms, tools collect metrics
- Consider interaction between experiment and environment, e.g., if the experiment is being performed in vivo, watch for confounding effects (experiment process altering behavior)
18
Experiment Operation: Data Validation
- Verify that data has been collected correctly
- Verify that data is reasonable
- Consider whether outliers exist and should be removed (must be for good reasons)
- Verify that the experiment was conducted as intended
- Post-experiment questionnaires can assess whether subjects understood instructions
19
Experiment Process: Phases
Experiment Idea -> Experiment Process (Experiment Definition -> Experiment Planning -> Experiment Operation -> Analysis & Interpretation -> Presentation & Package) -> Conclusions
20
Analysis and Interpretation: Overview
Quantitative interpretation can include:
- Descriptive statistics: describe and graphically present the data set; used before hypothesis testing to better understand data and identify outliers
- Data set reduction: locate and possibly remove anomalous data points
- Hypothesis testing: apply statistical tests to determine whether the null hypothesis can be rejected
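As a rough illustration of the descriptive-statistics step, the sketch below summarizes a small data set with Python's standard `statistics` module. The `describe` helper and the defect counts are hypothetical and serve only to show the kind of summary one might inspect before hypothesis testing.

```python
import statistics

def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # first and third quartiles
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "q1": q1,
        "q3": q3,
        "max": max(values),
    }

defects_found = [12, 15, 11, 14, 13, 16, 12, 45]  # hypothetical measurements
summary = describe(defects_found)
for name, value in summary.items():
    print(f"{name}: {value}")
```

A mean that sits well above the median, as it does for this made-up data, is a hint that a few extreme values deserve a closer look.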
21
Analysis and Interpretation: Visualizing Data Sets
Graphs are an effective way to provide an overview of a data set
Basic graph types for use in visualization:
- Scatter plots
- Box plots
- Line plots
- Bar charts
- Cumulative bar charts
- Pie charts
22
Analysis and Interpretation: Data Set Reduction
- Hypothesis testing techniques depend on the quality of the data set; data set reduction improves data set quality by removing anomalous data (outliers)
- Outliers can be removed, but only for good reasons, such as that they represent rare events not likely to occur again
- Scatter plots can help find outliers
- Statistical tests can determine probabilities that points are outliers
- Data with too much redundancy is not easily analyzed; factor analysis and principal components analysis can identify orthogonal factors with which to replace redundant factors
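One simple way to flag candidate outliers is the 1.5 x IQR rule (Tukey's fences). The slides do not name this rule, so treat it as a swapped-in illustration; the `iqr_outliers` helper and the task times are hypothetical. The sketch only flags points, since the decision to remove one still needs a documented reason.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying outside the fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

task_minutes = [31, 29, 35, 33, 30, 32, 34, 95]  # hypothetical task times
flagged = iqr_outliers(task_minutes)
print("candidate outliers:", flagged)
```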
23
Analysis and Interpretation: Hypothesis Testing
Hypothesis testing: can we reject H0?
- If statistical tests say we can’t, we draw no conclusions
- If tests say we can, we reject H0 at a given significance level α, where α = P(type-I error) = P(reject H0 | H0 is true)
- We also calculate the p-value: the lowest significance level at which we can reject H0
- Typically, α is 0.05; to claim significance, the p-value must be < α
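The relationship between the p-value and α can be sketched with an exact permutation test on the difference in group means. This test is chosen here only because it needs no statistical library; it is not one of the tests the slides list, and the two samples and the `permutation_p_value` helper are hypothetical.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test on the difference in means.

    Under H0 the group labels are exchangeable, so every way of splitting
    the pooled data into groups of the original sizes is equally likely.
    """
    pooled = list(group_a) + list(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = total = 0
    for idx in combinations(range(len(pooled)), len(group_a)):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total

alpha = 0.05
t1 = [12, 14, 15, 16, 18]  # hypothetical outcomes under treatment 1
t2 = [20, 21, 22, 24, 25]  # hypothetical outcomes under treatment 2
p = permutation_p_value(t1, t2)
print(f"p = {p:.4f}; reject H0 at alpha = {alpha}: {p < alpha}")
```

Enumerating every split is only practical for small samples; larger experiments would sample random permutations or use one of the standard tests.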
24
Analysis and Interpretation: Statistical Tests per Design
Design                                            | Parametric     | Non-parametric
One factor, one treatment                         | -              | Chi-2, Binomial test
One factor, two treatments, completely randomized | t-test, F-test | Mann-Whitney, Chi-2
One factor, two treatments, paired comparison     | Paired t-test  | Wilcoxon, Sign test
One factor, more than two treatments              | ANOVA          | Kruskal-Wallis, Chi-2
More than one factor                              | ANOVA          | -
25
Analysis and Interpretation: Statistical Tests
Important to choose the right test; the type of data must be appropriate:
- Are data items paired or not?
- Is data normally distributed or not?
- Are data sets completely independent or not?
Take a stats course, see texts such as Montgomery, consult with statisticians, use statistical packages
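The questions above, combined with the table of tests per design, can be written as a small lookup. This is only a sketch: the `candidate_tests` function and its parameters are hypothetical, and the single `parametric` flag is a simplification standing in for a judgment that parametric assumptions (such as normality) hold. An empty list means the table lists no test for that case.

```python
def candidate_tests(n_factors, n_treatments, paired=False, parametric=True):
    """Return the tests the per-design table lists for a given design."""
    if n_factors > 1:
        return ["ANOVA"] if parametric else []
    if n_treatments == 1:
        return [] if parametric else ["Chi-2", "Binomial test"]
    if n_treatments == 2:
        if paired:
            return ["Paired t-test"] if parametric else ["Wilcoxon", "Sign test"]
        return ["t-test", "F-test"] if parametric else ["Mann-Whitney", "Chi-2"]
    # One factor, more than two treatments (the table does not split on pairing).
    return ["ANOVA"] if parametric else ["Kruskal-Wallis", "Chi-2"]

print(candidate_tests(1, 2, paired=True, parametric=False))
print(candidate_tests(1, 3))
```

Such a lookup only narrows the options; checking each test's assumptions against the actual data remains a separate step.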
26
Analysis and Interpretation: Statistical vs Practical Significance
- Statistical significance does not imply practical importance. E.g., if T1 is shown with statistical significance to be 1% more effective than T2, it must still be decided whether 1% matters
- Lack of statistical significance does not imply lack of practical importance. The fact that H0 cannot be rejected at level α does not mean that H0 is true, and results of high practical importance may justify using a less stringent significance level
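Practical importance is usually judged from the size of the effect rather than from the p-value. The sketch below reports a relative difference and Cohen's d, a standardized effect size that the slides do not mention and that is used here purely as an illustrative assumption; the two samples are hypothetical and constructed so that T1 is 1% higher on average than T2.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(a, b):
    """Standardized mean difference using the pooled sample variance."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)

def relative_difference(a, b):
    """Difference in means as a fraction of the mean of b."""
    return (mean(a) - mean(b)) / mean(b)

t1 = [101, 102, 100, 103, 99]  # hypothetical effectiveness scores for T1
t2 = [100, 101, 99, 102, 98]   # hypothetical effectiveness scores for T2
d = cohens_d(t1, t2)
rel = relative_difference(t1, t2)
print(f"relative difference: {rel:.1%}, Cohen's d: {d:.2f}")
```

Whether a difference of this size matters depends on the cost and context of adopting the treatment, which no statistic can decide.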
27
Experiment Process: Phases
Experiment Idea -> Experiment Process (Experiment Definition -> Experiment Planning -> Experiment Operation -> Analysis & Interpretation -> Presentation & Package) -> Conclusions
28
Presentation: An Outline for an Experiment Report
1. Introduction, Motivation
2. Background, Prior Work
3. Empirical Study
   3.0 Research Questions
   3.1 Objects of analysis
       3.1.1 participants
       3.1.2 objects
   3.2 Variables and measures
       3.2.1 independent variables
       3.2.2 dependent variables
       3.2.3 other factors
   3.3 Experiment setup
       3.3.1 setup details
       3.3.2 operational details
   3.4 Analysis strategy
   3.5 Threats to validity
   3.6 Data and analysis
4. Interpretation
5. Conclusions
29
Presentation Issues
- Supporting replicability
- What to say and what not to say?
- How much to say?
- Describing design decisions