Sample Power No reading, class notes only


1 Sample Power No reading, class notes only
Last part of this lecture: Black pp

2 Sample Power
Probability that a sample of a given size will allow us to correctly reject the null hypothesis.

                              Null hypothesis is true      Null hypothesis is false
                              (no difference)              (there is a difference)
Null hypothesis is accepted   Correct decision             Type II error (wrong decision)
                              (95% probability)            (20% probability)
Null hypothesis is rejected   Type I error (wrong          Correct decision = sample power
                              decision, 5% probability)    (80% probability)

3 Summary
Power is the probability of correctly rejecting the null hypothesis.
Power is related to the mean of the population that the sample is actually coming from.
Power is related to the SEM (SD and sample size): the larger the sample size, the greater the power.
Power is related to the chosen level of significance (usually p = .05).

4 Example sample size calculation
I have two groups that I wish to compare. The mean of group 1 is 50; the mean of group 2 is 55. The SD of both groups is 10. What should my sample size be so that I can conclude that the difference is statistically significant? The probability of correctly detecting this difference should be high (usually 80%). SOLUTION:
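One way to sketch this calculation is the standard normal-approximation formula for a two-sample comparison (an assumption here; the course may use tables or software such as G*Power instead). The function name and its defaults are my own choices:

```python
from statistics import NormalDist
import math

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided two-sample test
    of standardized mean difference d (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_beta = z.inv_cdf(power)           # quantile corresponding to desired power
    return math.ceil(2 * ((z_alpha + z_beta) / d) ** 2)

d = (55 - 50) / 10  # standardized difference = 0.5
print(n_per_group(d))  # 63 per group under the normal approximation
```

Exact t-based calculations give a slightly larger answer (about 64 per group); the normal approximation is a lower bound in small samples.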

5 Power Analysis Example
Exercise 1: I wish to estimate the behavior problem scores of children age who had school disciplinary problems. This mean is estimated to be I wish to be able to say that this sample of children could not have been generated by a normative population (mean = 100, SD = 15). What should my sample size be?

6 A useful concept for Power Analysis: Effect Size
A standardized way to measure the effect of one variable on another (X → Y). If X changed by 1 SD, by how many SDs would Y change? How strong is this effect?
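For two groups, the standardized difference is usually expressed as Cohen's d, the mean difference divided by the pooled SD. A minimal sketch, reusing the earlier example (means 50 and 55, SD 10; the group sizes of 30 are illustrative assumptions):

```python
import math

def cohens_d(mean1, mean2, sd1, sd2, n1, n2):
    # Pooled standard deviation across the two groups
    pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    return (mean2 - mean1) / pooled_sd

print(cohens_d(50, 55, 10, 10, 30, 30))  # 0.5: a medium-sized effect
```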

7 Using effect size: Exercise 2
Hypothesis: Children who experienced highly authoritarian parenting are expected to be less empathetic in their dating relationships than children who did not experience high levels of authoritarian parenting because …. What should your sample size be? What is the IV and what is the DV? What would be your expectation about the size of the effect: small, medium, or large? How does that translate to the difference between two groups?

8 Effect Size
0.2 or less: Small effect
0.3-0.4: Moderate effect
0.5-0.6: Large effect
0.7 or greater: Very large effect (almost impossible in Social Sciences)

9 Power Analysis Example
Exercise 3: I wish to compare the mean behavior problem scores of children age who have intact or disrupted families. I expect the score in disrupted families to be higher, and to me, this score should be at least 0.3 SD higher than in intact families to be meaningful. What should my sample size be?

10 Power Analysis Example
Exercise 4: I would like to test whether students with high self-esteem make more autonomous decisions in choosing their colleges than students with low self-esteem. I expect that the difference between the low and high self-esteem groups will be small (0.2 SD). What should my sample size be, so that I can detect this small difference with statistical significance at the p = 0.05 level with 85% power?
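A sketch of this calculation under the normal approximation (an assumption; exact t-based software gives a slightly larger n), with d = 0.2, alpha = 0.05, and 85% power:

```python
from statistics import NormalDist
import math

def n_per_group(d, alpha=0.05, power=0.85):
    # Normal-approximation sample size per group, two-sided test
    z = NormalDist()
    return math.ceil(2 * ((z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) / d) ** 2)

print(n_per_group(0.2))  # roughly 449 per group: small effects demand large samples
```

Note how the required n grows with the inverse square of d: halving the effect size from 0.5 to about 0.2 multiplies the needed sample several-fold.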

11 Power Analysis Example
Exercise 5 I would like to estimate the effects of authoritarian parenting on the level of religious prejudice of adolescents. I expect that this correlation will be about What should my sample size be, so that I can detect this modest correlation with statistical significance at p=0.05 level with 85% power?

12 Power Analysis FINAL Example
Exercise 6 [YOUR PARAGRAPH ON SAMPLE POWER] This research is about the effects of interparental violence on the level of intimacy in the dating relationships of college students. It is expected that there is a moderate negative correlation between the level of overt interparental conflict and ability of the college students to maintain intimacy in their dating relationships. Previous studies found this correlation to be around 0.25 (Smith, 1999; Miller, 2001). What should the sample size be, so that this correlation can be detected with statistical significance at p=0.05 level with 80% power?
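One common way to size a correlation study is the Fisher r-to-z approximation (an assumption here; the course may rely on power tables or software instead). Using the r = 0.25 from the exercise, alpha = 0.05, and 80% power:

```python
from statistics import NormalDist
import math

def n_for_correlation(r, alpha=0.05, power=0.80):
    # Fisher's r-to-z transformation stabilizes the variance of r
    fisher_z = math.atanh(r)
    z = NormalDist()
    n = ((z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) / fisher_z) ** 2
    return math.ceil(n) + 3  # the +3 comes from the variance of Fisher's z, 1/(n-3)

print(n_for_correlation(0.25))  # about 124 subjects
```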

13 MEASUREMENT Black ,

14 Principles of Measurement
Types of variables: level of measurement (nominal, ordinal, interval/ratio).
Types of instruments:
Factual data instruments
Attitudinal instruments
Observational instruments
Tests

15 Validity
Validity as a set of interrelated attributes of measurement:
Construct validity
Criterion validity
Predictive validity
Content validity
Face validity

16 Construct Validity
Definition: The degree of consistency between the construct and its operational definition. How do we measure it? Use theoretical, conceptual, and logical arguments. Example: Relationship satisfaction: the definition of the construct versus its operational definition.

17 Criterion Validity Definition: The suitability of a measurement instrument for classifying individuals based on a trait. How do we evaluate criterion validity? Compare the scores of individuals who are known to differ on the trait (according to an independent source). Example: Depressive affect scale Independent classification of subjects into depressed and not-depressed groups (e.g., by a clinician)

18 Predictive Validity
Definition: The suitability of a measurement instrument for predicting outcomes. How do we evaluate predictive validity? Compare the scores of individuals to their future scores or future performance. Example: OSS; obtain the freshman GPAs of students.

19 Content Validity Definition: The adequacy of a measurement instrument for representing the construct in its entirety. How do we evaluate content validity? Theoretical consideration of all dimensions of the construct, and verification that all dimensions are considered by the operationalization. Example: Relationship satisfaction What aspects of the relationship?

20 Face Validity Definition: The perception of the subjects that the instrument is a measurement of the construct of interest. How do we measure it? Talk-through sessions with subjects. Example: Behavior problems Aggressive behaviors Bullying Destroying toys Example: Relationship satisfaction

21 Reliability
Definition: Consistency between two measurements:
The measure applied twice on two occasions
Two halves of the same instrument
The measure administered by two different assessors

22 Reliability
For any given measure, there are two sources of variability:
True variability (variance of the "true" score)
Variability due to measurement problems or the situation (variance of the "error" in measurement)
Together, these make up the variance of the total score.
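This variance decomposition implies the classical reliability coefficient: the proportion of total-score variance that is true-score variance. The numbers below are purely illustrative:

```python
def reliability(var_true, var_error):
    # Reliability = true-score variance / (true-score variance + error variance)
    var_total = var_true + var_error
    return var_true / var_total

# Assumed example: 80 units of true variance, 20 units of error variance
print(reliability(80, 20))  # 0.8
```

A perfectly reliable measure (no error variance) would yield 1.0; as measurement error grows, the coefficient shrinks toward 0.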

23 Factors affecting reliability and validity
Enough questions to elicit the needed information
Quality of wording
Time needed / time allowed to respond
Enough heterogeneity in subjects

