Class 4 Experimental Studies: Validity Issues Reliability of Instruments Chapters 7 Spring 2017.


1 Class 4 Experimental Studies: Validity Issues Reliability of Instruments
Chapters 7 Spring 2017

2 Dependent Variables: Continuous
Psychometric Properties Reliability Validity

3 Is reliability a property of:
The psychological instrument (e.g. depression)?
Scores on the psychological instrument?

4 Reliability
The extent to which scores show true variance in attributes within or between participants, as opposed to error variance
Observed score variance = true variance + error variance
More items of good quality = higher reliability
One-item scales have very low internal reliability; estimated around r = .25
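As a minimal sketch of the variance decomposition on this slide, the snippet below computes reliability as the share of observed-score variance that is true-score variance. The function name and the tiny data set are hypothetical, chosen so that the errors are uncorrelated with the true scores; in real data the true scores are never observed directly.

```python
from statistics import pvariance

def reliability(true_scores, errors):
    """Share of observed-score variance that is true-score variance."""
    observed = [t + e for t, e in zip(true_scores, errors)]
    return pvariance(true_scores) / pvariance(observed)

# Hypothetical scores: true variance = 25, error variance = 4,
# and the errors are uncorrelated with the true scores.
true_scores = [10, 10, 20, 20]
errors = [-2, 2, -2, 2]

print(round(reliability(true_scores, errors), 3))
```

With these made-up numbers the observed variance is 25 + 4 = 29, so the reliability is 25/29, roughly .86.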

5 Sources of Measurement Error
Specific error: something unique about the instrument that differs from what the researcher intended (e.g. social desirability; reading level; idioms)
Transient error: some temporary factor that affects the measurement (e.g. order of instruments; historical events; noise while observations occur; tiredness; inattention)

6 Types of Measures
Observational Measures
Self-Report Paper-and-Pencil Measures
  Content tests: right/wrong answers
  Likert scales (3 to 7 response options)

7 Types of Reliability
Inter-Scorer Agreement: observational measures
Test/Re-Test
Alternate Forms (achievement)
Internal Consistency: how items correlate with each other

8 Internal Consistency Reliability
Split-Half
Kuder-Richardson: dichotomous items (right/wrong; yes/no)
Cronbach's alpha (α): average of all possible split-half reliability coefficients
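The slide describes Cronbach's alpha conceptually. A common computational form, assumed here rather than taken from the slides, is alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The sketch below applies it to small hypothetical response matrices; the function name and the data are illustrative only.

```python
from statistics import variance

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (same items for everyone)."""
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Hypothetical data: three items that rank respondents identically.
consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
# Hypothetical data: two items that are unrelated to each other.
unrelated = [[1, 1], [1, 2], [2, 1], [2, 2]]

print(round(cronbach_alpha(consistent), 2))
print(round(cronbach_alpha(unrelated), 2))
```

In the first data set the items agree perfectly and alpha reaches its maximum of 1; in the second the items share no variance and alpha is 0.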

9 Internet Addiction Measure
12-item measure: rx = .70 to rx = .95 among college students
Expected reliability estimates among the adolescent sample
Three-item measure: rx = .40 to rx = .55

10 Internet Addiction: Q 2a
12-item measure: internal consistency reliability coefficients rx = .70 to rx = .95 among college students
70% to 95% of variability in respondents’ scores is due to ______ and the rest is due to ______.

11 Reliability Estimates
Extent to which variability is due to true variation versus error
Cronbach alpha = .70: 70% of variation in scores is due to true differences in internet addiction & 30% is due to error
Cronbach alpha = .95: 95% of variation in scores is due to true differences in internet addiction & 5% is due to error

12 12-item measure: rx = .70 to rx = .95 among college-age samples in the U.S.
Expected reliability estimates among any adolescent sample:
At least .70
Between .70 and .95
Unknown

13 Reliability Refers to scores with specific ______ and ______.
It’s a property of ______ not of the instruments.

14 Reliability Refers to scores with specific populations and conditions.
It’s a property of scores not of the instrument per se.

15 More accurate?
The internal consistency of the Internet Addiction Scale (IAS) has ranged from .70 to .95.
The internal consistency of scores on the Internet Addiction Scale (IAS) with college students in the U.S. has ranged from .70 to .95.

16 Wellbeing Measure: Q 2c
Three-item measure: rx = .40 to rx = .45.
To improve the scores’ reliability just add 5 or 6 items:
True
False
Not sure

17 Internet Addiction Measure: Q 2c
Three-item measure: rx=.40 to rx=.45 – New items increase reliability only if they are of good quality
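One way to put numbers on this point is the Spearman-Brown prophecy formula, which is not on the slides and is used here as an assumption. It predicts the reliability of a lengthened scale only if the added items are as good as the existing ones (parallel items), which is exactly the "good quality" caveat above. The function name is hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability after multiplying test length by length_factor.

    Assumes the added items are parallel to (as good as) the existing ones.
    """
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)

# A three-item measure with reliability .40, lengthened to nine items
# (three times as long) with items of comparable quality.
print(round(spearman_brown(0.40, 3), 2))
```

Under that assumption, tripling the three-item measure would raise the estimate from .40 to about .67. Poorly written items violate the assumption, so the gain can be smaller or absent.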

18 Reliability and Correlation:
In correlational research, how does the reliability of two scores (e.g. depression and self-esteem) affect the probability that the observed correlation coefficient between scores on the two variables approximates the “true” correlation coefficient in the population?

19 Internal Reliability and Correlation
Depression 1: Cronbach α = .45
Depression 2: Cronbach α = .90
Same sample: which correlation between depression and self-esteem (r dep-se) below will be larger?
DEP1 (α = .45) correlated with Self-Esteem
DEP2 (α = .90) correlated with Self-Esteem

20 Validity of Measures
Construct Validity
  Factor structure: latent constructs
  Convergent and Discriminant Validity: correlation with similar/dissimilar measures
Predictive Validity
  Correlation among different constructs based on expected relations
  Cross-sectional or longitudinal

21 Reliability vs. Validity
The observed correlation coefficient will be smaller and less accurate with the less reliable measure
Correlations between constructs are attenuated by the (internal) unreliability of the measures
The reliability of a measure puts a ceiling on its validity
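The attenuation idea can be sketched with the classical relation r_observed = r_true × sqrt(rxx × ryy), which is assumed here and not stated on the slides. The alphas of .45 and .90 come from the depression example; the true correlation of -.60 and the self-esteem reliability of .80 are hypothetical values chosen only for illustration.

```python
from math import sqrt

def attenuated_r(true_r, rel_x, rel_y):
    """Expected observed correlation given the reliabilities of both scores."""
    return true_r * sqrt(rel_x * rel_y)

def validity_ceiling(rel_x):
    """Largest correlation a score can show with a perfectly reliable criterion."""
    return sqrt(rel_x)

TRUE_R = -0.60        # hypothetical true depression/self-esteem correlation
SELF_ESTEEM_REL = 0.80  # hypothetical reliability of the self-esteem scores

for label, alpha in [("DEP1", 0.45), ("DEP2", 0.90)]:
    r_obs = attenuated_r(TRUE_R, alpha, SELF_ESTEEM_REL)
    print(label, round(r_obs, 2), round(validity_ceiling(alpha), 2))
```

With these assumed values the less reliable depression score would show an observed correlation of about -.36, against about -.51 for the more reliable one, even though the underlying relation is the same.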

22 Validity Of Experimental Designs
Do inferences from an outcome study's results reflect how things actually are in the population?
Does the manipulation (treatment) cause the observed outcome, or do other reasons explain the findings?
Threats to Experimental Validity

23 Threats to Statistical Conclusion Validity
Are the observed relations among IV (Manipulation) and DV (Outcome) variables accurate?
1. Power
2. Unreliability of measures
3. Unreliability of treatment implementation
4. Extraneous variables
5. Heterogeneity of participants

24 Threats to Statistical Conclusion Validity
Are the observed relations among variables accurate?
1. Power
   Not very large N (-); only fam. per treatment
   Experimental design (+)
   Two of the treatments clearly delineated (+)
   No power analyses reported (-)
2. Unreliability of Measures
   Survey outcome measures are well-known (+)
   High internal consistency for most measures (+)
3. Unreliability of Treatment Implementation
   PCIT and GANA had manuals (+); integrity was coded from videotapes (+) using checklists in manual (+); therapists trained/supervised for exp treatments (+)(-)
4. Extraneous Variance in Experimental Setting
   It seems all treatments were conducted in the same agency (+)
   Not clear timing of treatments or history effects
5. Heterogeneity of Participants
   Random assignment (+)
   Low levels of exclusion criteria (-)
   Caregivers mostly female. Similar demographics (+)

25 Threats to Internal Validity
Can we conclude that there is a causal relation between the IV and the DV?
Selection to Treatment Groups
History
Attrition
Testing Effects

26 Threats to Internal Validity
Can we conclude that there is a causal relation between the IV and the DV?
Selection to Treatment Groups
   Clear inclusion/exclusion criteria (+); limited exclusion criteria (-)
   Used randomization to treatment groups (+)
   Similar across Tx groups in demographic characteristics (+)
History
   Therapy appeared to occur for everyone at once (+)
Attrition
   Attrition rates of 43% GANA; 32% PCIT; 56% TAU: NS, however relatively high (-)
   # of sessions attended was similar for Exp groups PCIT 13.2 and GANA and 10 for TAU (+/-)
Testing Effects
   Appears testing only took place at pre- and post-test (+)

27 Threats to Construct Validity
To what extent do variables (DV & IV) capture the desired constructs?
1. Mono-Operation Bias
2. Mono-Method Bias
3. Reactivity to Exp. Conditions
4. Experimenter Expectancies

28 Threats to Construct Validity
To what extent do variables (DV & IV) capture the desired constructs?
1. Mono-Operation Bias
   Used extensive # of self-report measures of child behaviors and parents' affective states (+)
   ECBI (Early Childhood Inventory); Parenting Practices Scale; Parenting Stress Index; Parenting Distress Scale; Parent Child Dysfunction Interaction Scale; Difficult Child Scale (+)
2. Mono-Method Bias
   Both self-report and observational measures (+)
3. Reactivity to Exp. Conditions
   Possible for TAU participants to determine they were not in the experimental conditions and to give less value to their treatment (-)
4. Experimenter Expectancies
   Researchers and therapists in 3 conditions knew the treatment they were delivering (-)
   Hawthorne effect for both GANA and PCIT (-)

29 Threats to External Validity
Can we generalize observed relations across persons, settings and times?
1. Person-Units
2. Treatments
3. Outcome Measures
4. Settings

30 Threats to External Validity
Can we generalize observed relations across persons, settings and times?
1. Person-Units
   Not highly selected sample (+); seems representative of community mental health clinic clients (+)
   Generalizable to primarily Hispanic, first-generation, lower SES women caregivers and their children with behavior (+)
2. Treatments
   Not known: Did the same therapists deliver PCIT and GANA? Did participants talk with each other?
3. Outcome Measures
   They used two types of measures: interview-based and self-report (+)
   Statistically significant findings were not consistent across measures (+)
4. Settings
   Empirical question: therapy took place at a community mental health clinic (+)
   Will results be similar w/o experimental controls and trainings?

31 Instruments
Description of Measure
   Instrument name
   Acronym
   Authors
   Key References
   Brief description of construct(s)
   Type of measure (e.g. self-report)
   Number of items
   Example of items
   Item response options
   Factors or subscales
   Scoring options and direction
Validity Estimates
   Convergent/Discriminant Validity
   Validation Sample(s)
Reliability Estimates
   Cronbach’s alpha coefficient
   Previous and Current studies
   Test/Re-Test

