The following lecture has been approved for University Undergraduate Students. This lecture may contain information, ideas, concepts and discursive anecdotes that may be thought provoking and challenging. It is not intended for the content or delivery to cause offence. Any issues raised in the lecture may require the viewer to engage in further thought, insight, reflection or critical evaluation.

Validity of Research: Threats to Validity
Dr. Craig Jackson, Senior Lecturer in Health Psychology
School of Health and Policy Studies, Faculty of Health & Community Care, University of Central England

Validity
An important consideration. Example project: access to 300 workers.
1. Workers' ability is assessed
2. Workers attend a 1-week training course
3. Workers' ability is assessed again
A classic within-subjects design (pre-post test design).

Design Concept - Between-subjects method
300 subjects are randomised: 150 to a control group and 150 to an intervention group. Ability is assessed in both groups and the mean scores of the control and intervention results are compared.

Design Concept - Within-subjects method (better)
All 300 subjects serve as both the control and the treatment group: assess ability #1, attend the training course, then assess ability #2.
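The two design concepts can be made concrete with a small simulation. A minimal Python sketch, assuming normally distributed ability scores and a fixed training effect; the sample sizes follow the example project, but all other numbers are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(42)
n, training_effect = 300, 5.0   # illustrative values, not from the lecture

# Between-subjects: randomise the 300 workers into two groups of 150,
# train only the intervention group, then compare the group means.
ability = rng.normal(60, 10, size=n)
order = rng.permutation(n)
control, intervention = order[:150], order[150:]
post_control = ability[control] + rng.normal(0, 3, 150)
post_intervention = ability[intervention] + training_effect + rng.normal(0, 3, 150)
print("Between-subjects difference:",
      round(post_intervention.mean() - post_control.mean(), 1))

# Within-subjects (pre-post): every worker is tested before and after the
# course, so each acts as their own control.
pre = ability + rng.normal(0, 3, n)                      # test #1
post = ability + training_effect + rng.normal(0, 3, n)   # test #2
print("Within-subjects mean gain:", round((post - pre).mean(), 1))
```

The within-subjects version needs no separate control group, which is why the lecture calls it "better" for this project; the threats below explain what that convenience costs.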

Threats to within-subjects designs
We observe an increase after the training course: a gain from test #1 to test #2 scores. The student concludes that the outcome (improvement) is due to the training. Could this be wrong? Here are some threats to internal validity that critics (examiners) might raise, and some plausible alternative explanations for the observed effects.

History threats
Some "historical" event caused the increase, not the training: TV and other media with elementary intellectual content (Sesame Street, Countdown, Tomorrow's World, Open University). The event can be mundane or extraordinary: a "specific event / chain of events" (British Journal of Psychiatry (2000) 177: 469-472).

Maturation threats
"Age is the key to wisdom": improvement would occur even without any training course, because we are measuring natural maturation / growth of understanding. Effects occur up to a certain limit, and differential maturation is possible. Similar to a "history threat"?

Testing threats
Specific to pre-post test designs: taking a test can itself increase knowledge. Taking test #1 may teach participants, or prime them, making them ready for the training in a way they otherwise would not be. Compare Heisenberg's Uncertainty Principle (1927).

Instrumentation threats
Specific to pre-post test designs: "making the goals bigger". Because taking the same test twice can increase knowledge, studies often do not use the same test twice (avoiding testing threats). But perhaps the two versions of the test are not really equivalent, so the instrument causes the changes, not the training course.

Instrumentation threats (further)
Specific to pre-post test designs, and especially likely with human "instruments" such as observations or clinical assessment. Three factors: observers fatigue over time; observers improve over time; different observers are used.

Mortality threats
"Mortality" is metaphorical: dropping out of the study. An obvious problem? Especially when drop-out is non-trivial. N = 300 take test #1; n = 50 drop out after taking test #1; n = 250 remain and take test #2. What if the drop-outs were low scorers on test #1 (self-esteem)?

Mortality threats (further)
The mean gain from test #1 to test #2, using all of the scores available on each occasion, includes the 50 low test #1 scorers (the soon-to-be drop-outs) in the test #1 mean:
Test #1 (n = 300): mean score 60.5 (± 9.7)
Test #2 (n = 250): mean score 81.6 (± 8.9)
Problem: the potential low scorers drop out of test #2, inflating the mean test #2 score over what it would have been if the poor scorers had taken it.
Solution: compare mean test #1 and test #2 scores only for those workers who stayed in the whole study (n = 250)? No - such a sub-sample would certainly not be representative of the original sample.
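The inflation can be illustrated with a short simulation. A minimal Python sketch, assuming (purely for illustration) that the 50 dropouts are the lowest test #1 scorers and that every worker would have gained roughly the same amount; none of these numbers are taken from the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)

test1 = rng.normal(60.5, 9.7, 300)        # all 300 workers sit test #1
gain = rng.normal(15, 5, 300)             # hypothetical true gain from training
test2_all = test1 + gain                  # test #2 scores if nobody dropped out

# Hypothetical: the 50 lowest test #1 scorers drop out before test #2.
stayers = np.argsort(test1)[50:]
test2_observed = test2_all[stayers]       # only 250 scores are actually observed

print("Mean test #1 (n = 300):          ", round(test1.mean(), 1))
print("Mean test #2 observed (n = 250): ", round(test2_observed.mean(), 1))
print("Mean test #2 if nobody had left: ", round(test2_all.mean(), 1))
# The observed test #2 mean is inflated because the low scorers are missing,
# so comparing it with the full-sample test #1 mean overstates the gain.
```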

Mortality threats (further)
The degree of this threat can be gauged by comparison: compare the drop-out group (n = 50) with the non-drop-out group (n = 250), e.g. using test #1 scores and demographic data, especially age and sex. If there are no major differences between the groups, it is reasonable to assume that mortality occurred across the entire sample and was not biasing the results. This depends greatly on the size of the mortality N.
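One way to make that comparison concrete is an independent-samples test on the baseline scores. A minimal Python sketch, again using invented data in which the lowest 50 scorers are the ones who leave (a deliberately biased dropout scenario):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
test1 = rng.normal(60.5, 9.7, 300)             # test #1 scores, all 300 workers
order = np.argsort(test1)
dropouts, stayers = order[:50], order[50:]     # hypothetical: lowest 50 leave

t, p = stats.ttest_ind(test1[dropouts], test1[stayers], equal_var=False)
print(f"Test #1, dropouts vs stayers: t = {t:.2f}, p = {p:.4f}")
# A large, significant difference suggests dropout was selective and may be
# biasing the results; no clear difference supports the assumption that
# mortality occurred evenly across the sample.
```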

Regression threats
Things can only get better - things can only get worse. Known as the "regression artefact" or "regression to the mean", this is a purely statistical phenomenon. It occurs whenever there is a non-random sample from a population and two measures that are imperfectly correlated: test #1 and test #2 scores will not be perfectly correlated with each other.

Regression threats
Few measurements stay exactly the same - confusing? For example: if a training program only includes people who are in the lowest 10% of the class on test #1, what are the chances that they would constitute exactly the lowest 10% on test #2? Not very likely! Most of them would score low on the post-test, but they are unlikely to be the lowest 10% twice. Having been the lowest 10% on test #1, they can't get any lower than the lowest: they can only go up from there, relative to the larger population from which they were selected.
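Regression to the mean is easy to reproduce by simulation. A minimal Python sketch, assuming each test score is a stable true ability plus independent measurement error; all parameters are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

true_ability = rng.normal(50, 10, n)
test1 = true_ability + rng.normal(0, 8, n)   # two imperfectly correlated measures:
test2 = true_ability + rng.normal(0, 8, n)   # same ability, independent errors

# Select the lowest 10% on test #1, as the hypothetical programme would.
lowest = test1 <= np.quantile(test1, 0.10)

print("Lowest 10% on test #1, mean test #1:", round(test1[lowest].mean(), 1))
print("Same people,           mean test #2:", round(test2[lowest].mean(), 1))
# With no intervention at all, their test #2 mean moves back toward the
# population mean of 50: a purely statistical artefact.
```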

Summary of single-group threats
History threats; maturation threats; testing threats; instrumentation threats; mortality threats; regression threats.

Multiple Group threats
Comparison of 2 different methods: training to help factory workers live a healthy lifestyle. Example of an MSc project, where the student has access to 300 workers:
1. Workers' lifestyle is assessed (test #1)
2. 50% of workers attend a 1-week healthy lifestyle program; the other 50% are shown healthy lifestyle software
3. Workers' lifestyle is assessed again (test #2)

Design Concept - Between- and Within-subjects method
300 subjects are randomised into a training course group (n = 150) and a software group (n = 150). Both groups complete lifestyle assessment test #1; the course group then attends the training course while the software group is trained on the software; both groups complete lifestyle assessment test #2.

Factory workers and Healthy Lifestyle Training
[Graph: Healthy Lifestyle Score (HLS) at test #1 and test #2 for the software and training course groups.] What does the graph show?

Selection comparability threats
What if there is an overall change from test #1 to test #2, and the level of change differs between the two groups? The student concludes that the outcome is due to the different styles of program. How could this be wrong? The key validity issue is the degree to which the groups are comparable before the study. If the groups are comparable and the only difference between them is the program, post-test differences can be attributed to the program - a big "IF".

Selection comparability threats
If the groups are not comparable to begin with, how much of the change can be attributed to the training programs, and how much to the initial differences between the groups? This is the central multiple-group threat to internal validity: a selection bias or selection threat. Selection threat: "any factor other than the program that leads to post-test differences between groups".
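Whether the groups were comparable before the programme can be checked directly on the baseline (test #1) scores. A minimal Python sketch, with invented scores standing in for the two groups of 150 workers:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Hypothetical baseline (test #1) healthy-lifestyle scores for the two groups.
course_baseline = rng.normal(55, 10, 150)
software_baseline = rng.normal(55, 10, 150)

t, p = stats.ttest_ind(course_baseline, software_baseline, equal_var=False)
print(f"Baseline comparability: t = {t:.2f}, p = {p:.3f}")
# Under random assignment the groups should differ only by chance; a marked
# baseline difference signals a selection threat, and post-test differences
# could then reflect initial differences rather than the programme.
```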

Selection History threats
Any other event that occurs between test #1 and test #2 that the 2 groups experience differently. The "selection" part: the groups differ in some way. The "history" part: the way the groups differ is in their reactions to, or experiences of, "historical" events. E.g. the groups may differ in their reading habits: perhaps the training course group read about "health" more frequently than those in the software group. A higher test #2 score for the training course group then doesn't indicate the effect of lifestyle training... it's really an effect of the two groups differentially experiencing a relevant event (reading, TV).

Selection Instrumentation threats
Any differential change in the test used for each group from pre-course to post-course, i.e. the test may change differently for the two groups. Especially likely with observers: differential changes between the groups.

Selection Mortality threats
Arise when there is differential (non-random) dropout between the two groups from test #1 to test #2. Different types of workers might drop out of each group, or more may drop out of one group than the other, possibly based on how they were selected. Observed differences in results might then be due to the different types of dropouts - the selection-mortality - and not to the different training programs. If the selection into groups was not random, a bias will often exist.

Selection Regression threats
Occur when there are different rates of regression to the mean in the two groups. This might happen if one group scores more extremely on test #1 than the other group - bias again. Perhaps the software group is getting a disproportionate number of low-ability workers (factory managers think they need the "new" tutoring); managers don't understand the need for 'comparable' program and comparison groups! Since the software group has more extreme low scorers at test #1, its mean will regress (increase) a greater distance toward the overall population mean at test #2, and the software group will appear to "gain" more than the training course group.
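This differential regression can also be simulated. A minimal Python sketch, assuming (hypothetically) that managers send the lowest-scoring half of workers to the software and that neither programme has any real effect:

```python
import numpy as np

rng = np.random.default_rng(3)

ability = rng.normal(50, 10, 300)
test1 = ability + rng.normal(0, 8, 300)   # noisy baseline measurement
test2 = ability + rng.normal(0, 8, 300)   # noisy follow-up, no programme effect

# Non-random allocation: managers send the 150 lowest test #1 scorers to the
# "new" software, and the rest to the training course.
order = np.argsort(test1)
groups = {"software": order[:150], "course": order[150:]}

for name, idx in groups.items():
    gain = test2[idx].mean() - test1[idx].mean()
    print(f"{name:8s} test #1: {test1[idx].mean():5.1f}"
          f"  test #2: {test2[idx].mean():5.1f}  apparent gain: {gain:+5.1f}")
# With no real effect at all, the lower-scoring software group regresses
# further toward the population mean and so appears to "gain" more.
```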