Internal Validity, Construct Validity, External Validity (in the context of a research study, i.e., not measurement validity)

Internal validity is generally relevant only to studies of causal relationships, which require:
◦ Temporal precedence
◦ Correlation
◦ No plausible alternative explanation
Key question: can the outcome be attributed to causes other than the designed intervention?
◦ If so, internal validity likely needs to be tightened up.

Threats to Internal Validity
◦ Single group threats
◦ Multiple group threats
◦ Social threats to internal validity

Imagine an educational program evaluated with two different testing regimens. In the first, an intervention is followed by a post-test. In the second, a pre-test, the intervention, and a post-test are used. What are the single group threats for this design?

Single Group Threats
◦ History (an outside event occurred at the same time as the treatment)
◦ Maturation (participants would have changed over time anyway)
◦ Testing (the act of testing itself induced an effect)
◦ Instrumentation (the measurement instrument changed between tests)
◦ Mortality (attrition of study participants)
◦ Regression (regression to the mean; see the sketch below)
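The regression threat in particular is easy to misread as a program effect. Below is a minimal simulation sketch (Python, with invented numbers, not data from any real study): a group selected for extreme pre-test scores appears to "improve" on the post-test even though no intervention is applied.

```python
import numpy as np

rng = np.random.default_rng(0)

# True ability plus independent measurement noise on each testing occasion.
true_ability = rng.normal(50, 10, size=10_000)
pre_test = true_ability + rng.normal(0, 5, size=10_000)
post_test = true_ability + rng.normal(0, 5, size=10_000)  # no intervention at all

# A "remedial" group chosen because it scored poorly on the pre-test.
selected = pre_test < 40

print(f"selected group, pre-test mean:  {pre_test[selected].mean():.1f}")
print(f"selected group, post-test mean: {post_test[selected].mean():.1f}")
# The post-test mean drifts back toward 50 because part of each extreme
# pre-test score was noise: regression to the mean, not a treatment effect.
```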

Suppose the previous study had used multiple groups instead of a single group.
Multiple group threats are variations on the single group threats with selection bias added. If the added second group is a control, for instance, it must be selected in a way that makes it fully comparable to the first group, i.e., by random assignment (sketched below).
If participants cannot be randomly assigned, the result is a quasi-experimental design.
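A minimal sketch of random assignment (Python; the participant roster is hypothetical):

```python
import random

participants = [f"P{i:02d}" for i in range(1, 21)]  # hypothetical roster of 20 people

random.seed(42)              # fixed seed so the assignment can be reproduced
random.shuffle(participants)

half = len(participants) // 2
treatment, control = participants[:half], participants[half:]
print("treatment:", treatment)
print("control:  ", control)
```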

Social threats to internal validity apply especially in the social sciences, because people do not react simply to stimuli:
◦ Diffusion (people in the treatment and comparison groups talk to one another, spreading the treatment)
◦ Compensatory rivalry (comparison groups know what is happening and compete harder)
◦ Resentful demoralization (same as above, but with the opposite sign: the comparison group gives up)
◦ Compensatory equalization (researchers or others give extra resources to equalize the groups)

External validity: are the results valid for other persons, in other places, and at other times?
◦ Do they generalize?
Two related topics: types of generalization, and threats to external validity.

Approaches to generalization
◦ Sampling model: try to make certain that your study groups are a random sample of the population to which you wish to generalize.
◦ “Proximal similarity” model: measure or stratify the sample on the things you cannot randomize, and generalize to contexts most similar to those studied (a stratification sketch follows below).
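A sketch of the stratification idea (Python; the population list and strata are invented): draw the same fraction from every stratum of a variable you cannot randomize, so the sample mirrors the population on that variable.

```python
import random
from collections import defaultdict

random.seed(1)

# Invented population: (participant id, stratum you cannot randomize, e.g. region).
population = [(i, random.choice(["urban", "suburban", "rural"])) for i in range(1000)]

def stratified_sample(pop, key, fraction):
    """Sample the same fraction from every stratum."""
    strata = defaultdict(list)
    for unit in pop:
        strata[key(unit)].append(unit)
    sample = []
    for members in strata.values():
        k = max(1, round(fraction * len(members)))
        sample.extend(random.sample(members, k))
    return sample

sample = stratified_sample(population, key=lambda unit: unit[1], fraction=0.05)
print(len(sample), "units sampled, proportional to each stratum")
```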

Threats to external validity
◦ People
◦ Places
◦ Times

Construct validity: an assessment of how well ideas or theories are translated into actual programs.
A mapping of concrete activities onto theoretical constructs.

Formal articulations of construct validity:
◦ Nomological network (Cronbach and Meehl, 1955): researchers establish a theoretical framework of what to measure, an empirical framework of how to measure it, and the linkages between the two.
◦ Multitrait-multimethod matrix (Campbell and Fiske, 1959): measures of convergent concepts should show higher correlations; measures of divergent concepts, lower correlations (see the sketch below).
◦ Pattern matching (Trochim, 1985): linking a theoretical pattern with an operational pattern.
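A minimal sketch of the multitrait-multimethod logic (Python; the two traits, the two measurement methods, and all scores are simulated, not Campbell and Fiske's data): two methods measuring the same trait should correlate more strongly with each other than with measures of a different trait.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500

# Two latent traits, each measured by two different methods, plus method noise.
trait_a = rng.normal(size=n)
trait_b = rng.normal(size=n)

a_by_survey = trait_a + rng.normal(0, 0.5, n)
a_by_observer = trait_a + rng.normal(0, 0.5, n)
b_by_survey = trait_b + rng.normal(0, 0.5, n)
b_by_observer = trait_b + rng.normal(0, 0.5, n)

measures = np.vstack([a_by_survey, a_by_observer, b_by_survey, b_by_observer])
corr = np.corrcoef(measures)

print("convergent  (A by survey vs A by observer):", round(corr[0, 1], 2))  # high
print("discriminant (A by survey vs B by survey): ", round(corr[0, 2], 2))  # near zero
```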

Threats to Construct Validity
◦ Poorly defined constructs
◦ Mono-operation bias: the construct is larger than the single program / treatment you devised.
◦ Mono-method bias: the construct is larger than the limited set of measurements you devised.
◦ Test and treatment interaction: measurement changes the treatment group.
◦ Other threats generally fall under “labeling” threats: a construct is essentially a metaphor, and if it is not precisely articulated, different persons can hold different meanings.

Social Threats to Construct Validity
◦ Hypothesis guessing: participants guess at the purpose of your study and attempt to game it.
◦ Evaluation apprehension: if apprehension causes participants to do poorly (or to pose as doing well), the apprehension becomes a confounding factor.
◦ Researcher expectancies: the researcher's own expectations confound the outcome.
  - Hawthorne effect: people change their behavior when they know they are being observed.
  - Rosenthal effect: researcher expectations can change outcomes even when subjects are uninformed.

The authors see methodology as intellectual infrastructure.
They believe that rapid change in computer science leaves methodology outdated.
Three key claims:
◦ The workloads used need to be appropriate
◦ The experimental design needs to be appropriate
◦ The analysis needs to be rigorous

For this paper, the authors focus on Java.
◦ Java incorporates modern language features (type safety, automatic memory management, secure execution).
◦ The authors believe these features make earlier benchmarking practice untenable because of:
  - Tradeoffs due to garbage collection, where heap size becomes a control variable
  - Non-determinism due to adaptive optimization and sampling technologies
  - System warm-up from dynamic class loading and just-in-time compilation

The authors created a suite of benchmarks (DaCapo) suitable for research, consisting of open source applications.
DaCapo validates the diversity of its workloads by collecting a variety of metrics and then applying principal component analysis (PCA); a sketch of this kind of check follows below.
The authors point to “cherry picking” research by Perez, showing that reducing the diversity of measures increases ambiguous and incorrect conclusions.
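A sketch of this kind of diversity check (Python with scikit-learn; the metrics matrix, benchmark names, and number of components are all hypothetical, not the actual DaCapo data): project each benchmark's measured characteristics onto principal components and look at how spread out the benchmarks are; tightly clustered benchmarks add little diversity.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical matrix: one row per benchmark, one column per measured
# characteristic (e.g. allocation rate, live heap size, code size, ...).
rng = np.random.default_rng(7)
metrics = rng.normal(size=(12, 8))
benchmarks = [f"bench{i:02d}" for i in range(12)]

scaled = StandardScaler().fit_transform(metrics)  # PCA is sensitive to scale
pca = PCA(n_components=2)
coords = pca.fit_transform(scaled)

print("variance explained by the first two components:",
      pca.explained_variance_ratio_.round(2))
for name, (x, y) in zip(benchmarks, coords):
    print(f"{name}: ({x:+.2f}, {y:+.2f})")  # clustered points suggest redundancy
```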

In their results, the authors show four ways to evaluate garbage collection; any single measure can be “gamed” to produce a desired result.
Classic comparisons (Fortran / C / C++): control for host platform and language runtime.
New comparisons (managed languages such as Java): control for host platform, language runtime, heap size, nondeterminism, and warm-up; a sketch of such a controlled setup follows below.
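A sketch of what controlling for heap size and nondeterminism can look like from a driver script's point of view (Python; the jar name, workload name, and minimum heap value are placeholders, not the actual DaCapo command line): fix the heap at several multiples of a minimum size and repeat each configuration enough times to see the run-to-run variation.

```python
import statistics
import subprocess
import time

# Placeholders -- substitute the real benchmark jar, workload name, and flags.
BENCH_ARGS = ["-jar", "benchmark.jar", "some-workload"]
MIN_HEAP_MB = 64                   # assumed minimum heap for this workload
HEAP_MULTIPLIERS = [1.5, 2, 3, 4]  # heap size treated as an explicit control variable
RUNS_PER_CONFIG = 10               # repetition exposes run-to-run nondeterminism

def time_one_run(heap_mb):
    """Time one complete JVM execution with a fixed heap size."""
    start = time.perf_counter()
    subprocess.run(["java", f"-Xms{heap_mb}m", f"-Xmx{heap_mb}m", *BENCH_ARGS],
                   check=True, capture_output=True)
    return time.perf_counter() - start

for mult in HEAP_MULTIPLIERS:
    heap_mb = int(MIN_HEAP_MB * mult)
    times = [time_one_run(heap_mb) for _ in range(RUNS_PER_CONFIG)]
    print(f"{mult}x heap ({heap_mb} MB): mean {statistics.mean(times):.2f}s, "
          f"stdev {statistics.stdev(times):.2f}s")
# In-process warm-up (class loading, JIT) is a separate concern: the harness
# itself has to repeat the workload inside one JVM and report first vs. later
# iterations; this script only times whole-JVM executions.
```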

To obtain meaningful conclusions from noisy measurements, data must be collected across repeated runs and aggregated; current practice sometimes lacks statistical rigor (see the sketch below).
Presenting all the results from the suite, rather than a single summary number, reduces “cherry picking”.
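One common way to add that rigor is to report a confidence interval instead of a single best number. A minimal sketch (Python; the per-run times are invented):

```python
import math
import statistics

# Invented per-run execution times (seconds) for one benchmark configuration.
times = [2.31, 2.28, 2.40, 2.35, 2.29, 2.44, 2.33, 2.30, 2.38, 2.27]

n = len(times)
mean = statistics.mean(times)
sem = statistics.stdev(times) / math.sqrt(n)  # standard error of the mean

# 95% confidence interval using the t distribution (df = n - 1 = 9).
t_95 = 2.262                                  # t critical value for df = 9
half_width = t_95 * sem

print(f"mean {mean:.3f}s, 95% CI [{mean - half_width:.3f}, {mean + half_width:.3f}]")
# Reporting the interval (and every benchmark, not just a favorable one)
# makes overlapping results visibly inconclusive instead of cherry picked.
```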