Class 4 Experimental Studies: Validity Issues Reliability of Instruments Chapters 7 Spring 2017.


Dependent Variables: Continuous
Psychometric Properties: Reliability, Validity

Is reliability a property of:
The psychological instrument (e.g., depression)
Scores on the psychological instrument

Reliability
The extent to which scores show true variance in attributes within or between participants, as opposed to error variance
Observed score variance = true variance + error variance
More items of good quality = higher reliability
One-item scales have very low internal reliability, estimated at around r = .25
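
The decomposition on this slide can be written out under the usual classical test theory assumptions. The symbols below (X for the observed score, T for the true score, E for error, and the reliability coefficient written as a ratio) are notation assumed here for illustration and do not appear in the slides.

```latex
X = T + E, \qquad
\sigma_X^2 = \sigma_T^2 + \sigma_E^2
\quad (\text{assuming } T \text{ and } E \text{ are uncorrelated})

\rho_{XX'} = \frac{\sigma_T^2}{\sigma_X^2}
           = \frac{\sigma_T^2}{\sigma_T^2 + \sigma_E^2}
```

Read this way, a reliability of .70 says that 70% of the observed score variance is true variance and 30% is error variance.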

Sources of Measurement Error
Specific error: something unique about the instrument that differs from what the researcher intended (e.g., social desirability, reading level, idioms)
Transient error: some temporary factor that affects the measurement (e.g., order of instruments, historical events, noise while observations occur, tiredness, inattention)

Types of Measures
Observational Measures
Self-Report Paper-and-Pencil Measures
Content tests: right/wrong answers
Likert scales (3 to 7 response options)

Types of Reliability
Inter-Scorer Agreement (observational measures)
Test/Re-Test
Alternate Forms (achievement)
Internal Consistency: how items correlate with each other

Internal Consistency Reliability
Split-Half
Kuder-Richardson: dichotomous items (right/wrong; yes/no)
Cronbach's alpha (α): average of all possible split-half reliability coefficients
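
A minimal sketch of how Cronbach's alpha is commonly computed from an item-by-respondent score table, using the standard variance-based formula rather than literally averaging split-half coefficients. The function name `cronbach_alpha` and the toy data are hypothetical and not taken from the slides.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance (n - 1) of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    # alpha = k / (k - 1) * (1 - sum of item variances / total-score variance)
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: 4 respondents answering 2 items.
toy_scores = [[1, 2], [2, 1], [3, 4], [4, 3]]
print(round(cronbach_alpha(toy_scores), 2))  # 0.75
```

Applied to items scored 0/1, the same formula gives the Kuder-Richardson (KR-20) coefficient, provided the same variance estimator is used for the items and for the totals.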

Internet Addiction Measure
12-item measure: rx = .70 to rx = .95 among college students
Expected reliability estimates among the adolescent sample
Three-item measure: rx = .40 to rx = .55

Internet Addiction: Q 2a
12-item measure: internal consistency reliability coefficients rx = .70 to rx = .95 among college students
70% to 95% of variability in respondents’ scores is due to ______ and the rest is due to ______.

Reliability Estimates
Extent to which variability is due to true variation versus error
Cronbach's alpha = .70: 70% of variation in scores is due to true differences in internet addiction and 30% is due to error
Cronbach's alpha = .95: 95% of variation in scores is due to true differences in internet addiction and 5% is due to error

12-item measure: rx = .70 to rx = .95 among college-age samples in the U.S.
Expected reliability estimates among any adolescent sample:
At least .70
Between .70 and .95
Unknown

Reliability
Refers to scores with specific ______ and ______.
It’s a property of ______, not of the instruments.

Reliability
Refers to scores with specific populations and conditions.
It’s a property of scores, not of the instrument per se.

More accurate?
The internal consistency of the Internet Addiction Scale (IAS) has ranged from .70 to .95.
The internal consistency of scores on the Internet Addiction Scale (IAS) with college students in the U.S. has ranged from .70 to .95.

Wellbeing Measure: Q 2c
Three-item measure: rx = .40 to rx = .45.
To improve the scores’ reliability, just add 5 or 6 items:
True
False
Not sure

Internet Addiction Measure: Q 2c
Three-item measure: rx = .40 to rx = .45
New items increase reliability only if they are of good quality
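
One common way to reason about this question is the Spearman-Brown prophecy formula, which the slides do not name. It is used here as an assumed illustration. It predicts the reliability of a lengthened scale only if the added items are parallel to the existing ones, which is the "good quality" condition above. The function names are hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when a scale is lengthened by length_factor.

    Assumes the added items are parallel to (as good as) the existing ones.
    """
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


def length_factor_needed(reliability, target):
    """How many times longer the scale must be to reach the target reliability."""
    return (target * (1 - reliability)) / (reliability * (1 - target))


# Three-item measure lengthened to nine items (6 added, so 3 times as long).
for r in (0.40, 0.45):
    print(r, round(spearman_brown(r, 9 / 3), 2))

# Lengthening needed to move from .40 to .70 (about 3.5 times as long).
print(round(length_factor_needed(0.40, 0.70), 1))
```

Under these assumptions, adding six equally good items moves a reliability of .40 to roughly .67 and a reliability of .45 to roughly .71. Weaker items would yield less than the formula predicts.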

Reliability and Correlation
In correlational research, how does the reliability of two scores (e.g., depression and self-esteem) affect the probability that the observed correlation coefficient between scores on the two variables approximates the “true” correlation coefficient in the population?

Internal Reliability and Correlation
Depression 1: Cronbach's α = .45
Depression 2: Cronbach's α = .90
Same sample: which r (dep-se) below will be larger?
DEP1 (α = .45) correlated with Self-Esteem
DEP2 (α = .90) correlated with Self-Esteem

Validity of Measures
Construct Validity
Factor structure: latent constructs
Convergent and Discriminant Validity: correlation with similar/dissimilar measures
Predictive Validity
Correlation among different constructs based on expected relations
Cross-sectional or longitudinal

Reliability vs. Validity
The observed correlation coefficient will be smaller and less accurate with the less reliable measure
Correlations between constructs are attenuated by the (internal) unreliability of the measures
The reliability of a measure puts a ceiling on its validity
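
The attenuation point can be sketched with the classical formula: observed r = true r × sqrt(reliability of X × reliability of Y). The true correlation of -.50 and the self-esteem reliability of .80 below are hypothetical values chosen for illustration. Only the two depression alphas (.45 and .90) come from the slides.

```python
from math import sqrt


def attenuated_r(true_r, rel_x, rel_y):
    """Expected observed correlation given the true correlation and both reliabilities."""
    return true_r * sqrt(rel_x * rel_y)


TRUE_R = -0.50          # hypothetical true depression/self-esteem correlation
REL_SELF_ESTEEM = 0.80  # hypothetical reliability of the self-esteem scores

r_dep1 = attenuated_r(TRUE_R, 0.45, REL_SELF_ESTEEM)  # DEP1, alpha = .45
r_dep2 = attenuated_r(TRUE_R, 0.90, REL_SELF_ESTEEM)  # DEP2, alpha = .90
print(round(r_dep1, 2), round(r_dep2, 2))

# Ceiling: the largest observed correlation possible when the true r is 1.0.
print(round(attenuated_r(1.0, 0.45, REL_SELF_ESTEEM), 2))
```

With these assumed values the less reliable depression scores give an expected correlation of about -.30, against about -.42 for the more reliable scores, and no observed correlation with DEP1 could exceed about .60 in absolute value.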

Validity of Experimental Designs
Do inferences from an outcome study's results reflect how things actually are in the population?
Does the manipulation (treatment) cause the observed outcome, or do other reasons explain the findings?
Threats to Experimental Validity

Threats to Statistical Conclusion Validity
Are the observed relations among IV (manipulation) and DV (outcome) variables accurate?
1. Power
2. Unreliability of measures
3. Unreliability of treatment implementation
4. Extraneous variables
5. Heterogeneity of participants

Threats to Statistical Conclusion Validity
Are the observed relations among variables accurate?
1. Power
   Not very large N: only 18-21 families per treatment (-)
   Experimental design (+)
   Two of the treatments clearly delineated (+)
   No power analyses reported (-)
2. Unreliability of Measures
   Survey outcome measures are well known (+)
   High internal consistency for most measures (+)
3. Unreliability of Treatment Implementation
   PCIT and GANA had manuals (+); integrity was coded from videotapes (+) using checklists in manual (+); therapists trained/supervised for experimental treatments (+)(-)
4. Extraneous Variance in Experimental Setting
   It seems all treatments were conducted in the same agency (+)
   Not clear timing of treatments or history effects
5. Heterogeneity of Participants
   Random assignment (+)
   Low levels of exclusion criteria (-)
   Caregivers mostly female; similar demographics (+)
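
To give a feel for why 18-21 families per treatment counts against power, here is a rough sketch using a normal approximation for a two-group mean comparison. It is not the analysis from the study discussed in the slides, which had three conditions. The effect size d = 0.5 and the function name are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_groups(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means.

    Normal approximation; ignores the negligible opposite-tail rejection region.
    d is the standardized mean difference (Cohen's d).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    noncentrality = d * sqrt(n_per_group / 2)
    return 1 - z.cdf(z_crit - noncentrality)


# Hypothetical medium effect (d = 0.5) at about 20 versus 64 families per group.
for n in (20, 64):
    print(n, round(approx_power_two_groups(0.5, n), 2))
```

Under these assumptions, about 20 participants per group gives power of roughly .35 for a medium effect, well below the conventional .80, which is reached at about 64 per group.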

Threats to Internal Validity
Can we conclude that there is a causal relation between the IV and the DV?
Selection to Treatment Groups
History
Attrition
Testing Effects

Threats to Internal Validity
Can we conclude that there is a causal relation between the IV and the DV?
Selection to Treatment Groups
   Clear inclusion/exclusion criteria (+); limited exclusion criteria (-)
   Used randomization to treatment groups (+)
   Similar across Tx groups in demographic characteristics (+)
History
   Therapy appeared to occur for everyone at once (+)
Attrition
   Attrition rates of 43% GANA, 32% PCIT, 56% TAU: NS, however relatively high (-)
   Number of sessions attended was similar for experimental groups (PCIT 13.2 and GANA 13.92) and 10 for TAU (+/-)
Testing Effects
   Appears testing only took place at pre- and post-test (+)

Threats to Construct Validity
To what extent do variables (DV & IV) capture desired constructs?
1. Mono-Operation Bias
2. Mono-Method Bias
3. Reactivity to Exp. Conditions
4. Experimenter Expectancies

Threats to Construct Validity
To what extent do variables (DV & IV) capture desired constructs?
1. Mono-Operation Bias
   Used extensive number of self-report measures of child behaviors and parents' affective states (+)
   ECBI- Early Childhood Inventory; Parenting Practices Scale; Parenting Stress Index; Parenting Distress Scale; Parent Child Dysfunction Interaction Scale; Difficult Child Scale (+)
2. Mono-Method Bias
   Both self-report and observational measures (+)
3. Reactivity to Exp. Conditions
   Possible for TAU participants to determine they were not in the experimental conditions and to give less value to it (-)
4. Experimenter Expectancies
   Researchers and therapists in 3 conditions knew treatment they were delivering (-)
   Hawthorne effect for both GANA and PCIT (-)

Threats to External Validity
Can we generalize observed relations across persons, settings, and times?
1. Person-Units
2. Treatments
3. Outcome Measures
4. Settings

Threats to External Validity
Can we generalize observed relations across persons, settings, and times?
1. Person-Units
   Not highly selected sample (+); seems representative of community mental health clinic clients (+)
   Generalizable to primarily Hispanic, first-generation, lower SES women caregivers and their children with behavior (+)
2. Treatments
   Not known: Did same therapists deliver PCIT and GANA? Did participants talk with each other?
3. Outcome Measures
   They used two types of measures: interview based and self-report (+)
   Statistically significant findings were not consistent across measures (+)
4. Settings
   Empirical question: therapy took place at a community mental health clinic (+)
   Will results be similar w/o experimental controls and trainings?

Instruments
Description of Measure
   Instrument name
   Acronym
   Authors
   Key references
   Brief description of construct(s)
   Type of measure (e.g., self-report)
   Number of items
   Example of items
   Item response options
   Factors or subscales
   Scoring options and direction
Validity Estimates
   Convergent/Discriminant Validity
   Validation sample(s)
Reliability Estimates
   Cronbach's alpha coefficient
   Previous and current studies
   Test/Re-Test