Sample Power. No reading, class notes only. Last part of this lecture: Black, pp. 380-385.

Sample Power: the probability that a sample of a given size will allow us to correctly reject the null hypothesis.

If the null hypothesis is true (no difference):
Accepting the null hypothesis is the correct decision (95% probability).
Rejecting the null hypothesis is a wrong decision, a Type I error (5% probability).

If the null hypothesis is false (there is a difference):
Accepting the null hypothesis is a wrong decision, a Type II error (20% probability).
Rejecting the null hypothesis is the correct decision, the sample power (80% probability).

Summary: Power is the probability of correctly rejecting the null hypothesis. Power is related to the mean of the population that the sample is actually drawn from. Power is related to the SEM (SD and sample size): the larger the sample size, the greater the power. Power is related to the chosen level of significance (usually p=.05).
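
To illustrate the sample-size point, here is a minimal sketch (not from the lecture) that computes the power of a two-sample t-test in Python with statsmodels, assuming a standardized effect of 0.5 SD and the conventional two-sided test at alpha = .05:

```python
# Sketch: power of a two-sample t-test grows with sample size.
# Assumes effect size d = 0.5 and alpha = .05, two-sided (conventional choices).
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for n_per_group in (20, 40, 64, 100):
    power = analysis.power(effect_size=0.5, nobs1=n_per_group, alpha=0.05)
    print(f"n = {n_per_group:>3} per group -> power = {power:.2f}")
# Power climbs from roughly .34 at n = 20 to roughly .80 at n = 64 per group.
```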

Example sample size calculation. I have two groups that I wish to compare. The mean of group 1 is 50, the mean of group 2 is 55, and the SD of both groups is 10. What should my sample size be so that I can conclude that the difference is statistically significant? The probability of correctly detecting this difference should be high (usually 80%). SOLUTION: http://www.stat.uiowa.edu/~rlenth/Power/
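
A sketch of this calculation in Python (the means 50 and 55 with SD 10 give a standardized effect of 0.5 SD; alpha = .05 two-sided and 80% power are the assumed conventions):

```python
# Sketch: required sample size per group for the example above.
# Effect size d = (55 - 50) / 10 = 0.5; alpha = .05 two-sided; power = .80 (assumed conventions).
from statsmodels.stats.power import TTestIndPower

d = (55 - 50) / 10
n_per_group = TTestIndPower().solve_power(effect_size=d, alpha=0.05, power=0.80)
print(round(n_per_group))  # roughly 64 participants per group
```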

Power Analysis Example Exercise 1: I wish to estimate the behavior problem scores of children aged 12-14 who had school disciplinary problems. This mean is estimated to be 115. I wish to be able to say that this sample of children could not have been generated by a normative population (mean = 100, SD = 15). What should my sample size be?
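
A sketch for this one-sample case (the standardized effect is (115 - 100) / 15 = 1 SD; alpha = .05 two-sided and 80% power are again assumed):

```python
# Sketch: required n to distinguish the sample mean from the normative mean of 100.
# Effect size = (115 - 100) / 15 = 1.0; alpha = .05 two-sided; power = .80 (assumed).
from statsmodels.stats.power import TTestPower

d = (115 - 100) / 15
n = TTestPower().solve_power(effect_size=d, alpha=0.05, power=0.80)
print(round(n))  # roughly 10 children
```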

A useful concept for power analysis: effect size. A standardized way to measure the effect of one variable on another (X → Y). If X changed by 1 SD, by how many SDs would Y change? How strong is this effect?
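
For the two-group comparisons in the exercises that follow, the effect size is simply the difference between the group means expressed in SD units (Cohen's d is the usual name for this; naming it is my addition, not the slide's):

```python
# Sketch: standardized mean difference, i.e., the group difference in SD units.
def effect_size(mean1, mean2, sd):
    """Return (mean1 - mean2) / sd, the difference between group means in SD units."""
    return (mean1 - mean2) / sd

print(effect_size(55, 50, 10))  # 0.5 for the earlier two-group example
```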

Using effect size: Exercise 2. Hypothesis: Children who experienced highly authoritarian parenting are expected to be less empathetic in their dating relationships than children who did not experience high levels of authoritarian parenting, because …. What should your sample size be? What is the IV and what is the DV? What would be your expectation about the size of the effect: small, medium, or large? How does that translate into a difference between the two groups?

Effect Size
0.2 or less: small effect
0.3-0.4: moderate effect
0.5-0.6: large effect
0.7 or greater: very large effect (almost impossible in the social sciences)

Power Analysis Example Exercise 3: I wish to compare the mean behavior problem scores of children aged 12-14 from intact and disrupted families. I expect the score in disrupted families to be higher, and to me this score should be at least 0.3 SD higher than in the intact families to be meaningful. What should my sample size be?

Power Analysis Example Exercise 4: I would like to test whether students with high self-esteem make more autonomous decisions in choosing their colleges than students with low self-esteem. I expect that the difference between the low and high self-esteem groups will be small (0.2 SD). What should my sample size be, so that I can detect this small difference with statistical significance at the p=0.05 level with 85% power?
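
A sketch of the two-sample calculations for Exercises 3 and 4 (Exercise 3 is assumed to use the conventional 80% power, since the slide does not state it, and both are run at the two-sided default even though the hypotheses are directional):

```python
# Sketch: required n per group for Exercises 3 and 4 (independent-samples t-test).
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for label, d, power in (("Exercise 3", 0.3, 0.80), ("Exercise 4", 0.2, 0.85)):
    n = analysis.solve_power(effect_size=d, alpha=0.05, power=power)
    print(f"{label}: about {round(n)} per group")
# Small effects demand large samples: d = 0.2 at 85% power needs several hundred per group.
```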

Power Analysis Example Exercise 5 I would like to estimate the effects of authoritarian parenting on the level of religious prejudice of adolescents. I expect that this correlation will be about 0.2. What should my sample size be, so that I can detect this modest correlation with statistical significance at p=0.05 level with 85% power?

Power Analysis FINAL Example Exercise 6 [YOUR PARAGRAPH ON SAMPLE POWER]: This research is about the effects of interparental violence on the level of intimacy in the dating relationships of college students. It is expected that there is a moderate negative correlation between the level of overt interparental conflict and the ability of college students to maintain intimacy in their dating relationships. Previous studies found this correlation to be around 0.25 (Smith, 1999; Miller, 2001). What should the sample size be, so that this correlation can be detected with statistical significance at the p=0.05 level with 80% power?
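
For the correlation cases in Exercises 5 and 6, one common hand calculation uses the Fisher z approximation; the sketch below assumes that method (the course may instead use the Lenth applet linked earlier), and only the magnitude of the correlation matters for the sample-size question:

```python
# Sketch: approximate n needed to detect a correlation of size r (two-sided test),
# using the Fisher z approximation (an assumed method, not specified in the slides).
from math import atanh, ceil
from scipy.stats import norm

def n_for_correlation(r, alpha=0.05, power=0.80):
    z_alpha = norm.ppf(1 - alpha / 2)   # critical value for the two-sided test
    z_power = norm.ppf(power)           # quantile corresponding to the desired power
    return ceil(((z_alpha + z_power) / atanh(abs(r))) ** 2 + 3)

print(n_for_correlation(0.20, power=0.85))  # Exercise 5: roughly 220 participants
print(n_for_correlation(0.25, power=0.80))  # Exercise 6: roughly 125 participants
```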

MEASUREMENT. Black, pp. 188-199 and 201-207.

Principles of Measurement
Types of variables: level of measurement (nominal, ordinal, interval/ratio).
Types of instruments: factual data instruments, attitudinal instruments, observational instruments, tests.

Validity
Validity as a set of interrelated attributes of measurement: construct validity, criterion validity, predictive validity, content validity, face validity.

Construct Validity. Definition: the degree of consistency between the construct and its operational definition. How do we measure it? Use theoretical, conceptual, and logical arguments. Example: relationship satisfaction (the definition of the construct versus its operational definition).

Criterion Validity. Definition: the suitability of a measurement instrument for classifying individuals based on a trait. How do we evaluate criterion validity? Compare the scores of individuals who are known to differ on the trait (according to an independent source). Example: a depressive affect scale compared against an independent classification of subjects into depressed and not-depressed groups (e.g., by a clinician).

Predictive Validity. Definition: the suitability of a measurement instrument for predicting outcomes. How do we evaluate predictive validity? Compare the scores of individuals to their future scores or future performance. Example: OSS scores; obtain the freshman GPAs of the students.

Content Validity. Definition: the adequacy of a measurement instrument for representing the construct in its entirety. How do we evaluate content validity? Theoretical consideration of all dimensions of the construct, and verification that all dimensions are covered by the operationalization. Example: relationship satisfaction (which aspects of the relationship are included?).

Face Validity. Definition: the perception of the subjects that the instrument is a measurement of the construct of interest. How do we measure it? Talk-through sessions with subjects. Example: behavior problems (aggressive behaviors, bullying, destroying toys). Example: relationship satisfaction.

Reliability. Definition: consistency between two measurements, for example the same measure applied on two occasions, two halves of the same instrument, or the measure administered by two different assessors.

Reliability. For any given measure, there are two sources of variability: true variability, and variability due to measurement problems or the situation. Correspondingly, the variance of the total score is the sum of the variance of the "true" score and the variance of the "error" in measurement.
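
In classical test theory notation (a standard formulation; the slide only names the components), reliability is the proportion of total score variance that is true-score variance:

```latex
X = T + E, \qquad \sigma^2_X = \sigma^2_T + \sigma^2_E, \qquad
\text{reliability} = \frac{\sigma^2_T}{\sigma^2_X} = \frac{\sigma^2_T}{\sigma^2_T + \sigma^2_E}
```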

Factors affecting reliability and validity: enough questions to elicit the needed information, the quality of wording, the time needed versus the time allowed to respond, and enough heterogeneity in the subjects.