Power and p-values Benjamin Neale March 10th, 2016

Slides:



Advertisements
Similar presentations
Probability models- the Normal especially.
Advertisements

1 COMM 301: Empirical Research in Communication Lecture 15 – Hypothesis Testing Kwan M Lee.
Business 205. Review Sampling Continuous Random Variables Central Limit Theorem Z-test.
Chapter Seventeen HYPOTHESIS TESTING
Power and Sample Size + Principles of Simulation Benjamin Neale March 4 th, 2010 International Twin Workshop, Boulder, CO.
DATA ANALYSIS I MKT525. Plan of analysis What decision must be made? What are research objectives? What do you have to know to reach those objectives?
Cal State Northridge  320 Ainsworth Sampling Distributions and Hypothesis Testing.
Chapter 6 Hypotheses texts. Central Limit Theorem Hypotheses and statistics are dependent upon this theorem.
Hypothesis Testing: Type II Error and Power.
Hypothesis Testing:.
Sampling Distributions and Hypothesis Testing. 2 Major Points An example An example Sampling distribution Sampling distribution Hypothesis testing Hypothesis.
- Interfering factors in the comparison of two sample means using unpaired samples may inflate the pooled estimate of variance of test results. - It is.
Go to Index Analysis of Means Farrokh Alemi, Ph.D. Kashif Haqqi M.D.
Power & Sample Size and Principles of Simulation Michael C Neale HGEN 691 Sept Based on Benjamin Neale March 2014 International Twin Workshop,
Sample size determination Nick Barrowman, PhD Senior Statistician Clinical Research Unit, CHEO Research Institute March 29, 2010.
1 Power and Sample Size in Testing One Mean. 2 Type I & Type II Error Type I Error: reject the null hypothesis when it is true. The probability of a Type.
Elementary Statistical Methods André L. Souza, Ph.D. The University of Alabama Lecture 22 Statistical Power.
Population: a data set representing the entire entity of interest - What is a population? Sample: a data set representing a portion of a population Population.
Introduction to Inferential Statistics Statistical analyses are initially divided into: Descriptive Statistics or Inferential Statistics. Descriptive Statistics.
Power and Sample Size Boulder 2004 Benjamin Neale Shaun Purcell.
Statistical Inference for the Mean Objectives: (Chapter 9, DeCoursey) -To understand the terms: Null Hypothesis, Rejection Region, and Type I and II errors.
Math 4030 – 9a Introduction to Hypothesis Testing
Power Simulation Ascertainment Benjamin Neale March 6 th, 2014 International Twin Workshop, Boulder, CO Denotes practical Denotes He-man.
Education 793 Class Notes Inference and Hypothesis Testing Using the Normal Distribution 8 October 2003.
Statistical Inference for the Mean Objectives: (Chapter 8&9, DeCoursey) -To understand the terms variance and standard error of a sample mean, Null Hypothesis,
Statistical principles: the normal distribution and methods of testing Or, “Explaining the arrangement of things”
INTRODUCTION TO TESTING OF HYPOTHESIS INTRODUCTION TO TESTING OF HYPOTHESIS SHWETA MOGRE.
Methods of Presenting and Interpreting Information Class 9.
TESTING STATISTICAL HYPOTHESES
Module 10 Hypothesis Tests for One Population Mean
Logic of Hypothesis Testing
Statistical Inference
Chapter 7 Exploring Measures of Variability
Part Four ANALYSIS AND PRESENTATION OF DATA
INF397C Introduction to Research in Information Studies Spring, Day 12
Unit 3 Hypothesis.
Size of a hypothesis test
Inference and Tests of Hypotheses
Tests of Significance The reasoning of significance tests
Chapter 21 More About Tests.
PCB 3043L - General Ecology Data Analysis.
Understanding Results
Hypothesis Testing and Confidence Intervals (Part 1): Using the Standard Normal Lecture 8 Justin Kern October 10 and 12, 2017.
Distribution of the Sample Means
Statistical inference: distribution, hypothesis testing
Inference Key Questions
Hypothesis Testing: Hypotheses
Statistics in Applied Science and Technology
Introduction to Inferential Statistics
CONCEPTS OF HYPOTHESIS TESTING
Two-sided p-values (1.4) and Theory-based approaches (1.5)
Chapter 9 Hypothesis Testing.
9 Tests of Hypotheses for a Single Sample CHAPTER OUTLINE
Chapter 9: Hypothesis Tests Based on a Single Sample
Basic analysis Process the data validation editing coding data entry
Interval Estimation and Hypothesis Testing
Chapter 3 Probability Sampling Theory Hypothesis Testing.
Psych 231: Research Methods in Psychology
Psych 231: Research Methods in Psychology
Statistics II: An Overview of Statistics
Power and Sample Size I HAVE THE POWER!!! Boulder 2006 Benjamin Neale.
Psych 231: Research Methods in Psychology
What determines Sex Ratio in Mammals?
Psych 231: Research Methods in Psychology
CS639: Data Management for Data Science
Psych 231: Research Methods in Psychology
Chapter 9: Significance Testing
Is this quarter fair?. Is this quarter fair? Is this quarter fair? How could you determine this? You assume that flipping the coin a large number of.
Type I and Type II Errors
Statistical Inference for the Mean: t-test
Presentation transcript:

Power and p-values Benjamin Neale March 10th, 2016 Denotes He-man Denotes practical Power and p-values Benjamin Neale March 10th, 2016 International Twin Workshop, Boulder, CO

p-values are in the press

What we’ve been teaching

What exactly is a p-value? A way to ask if the data are consistent with a null model What exactly is a p-value?

The baseline model for comparison, usually no effect [e. g The baseline model for comparison, usually no effect [e.g. no heritability] What’s a null model?

Whose fault is it anyway? Distrust of his aunt’s claims of being able to discriminate between milk in first or tea in first Whose fault is it anyway?

alternative vs null some effect no effect Hypothesis testing

Possible scenarios Statistics Truth Reject H0 Fail to reject H0 H0 is true a 1-a Ha is true 1-b b Truth a=type 1 error rate b=type 2 error rate 1-b=statistical power Possible scenarios

Standard Case  Sampling distribution if HA were true Sampling distribution if H0 were true alpha 0.05 POWER: 1 -    T Non-centrality parameter

Increased effect size  Sampling distribution if HA were true Sampling distribution if H0 were true alpha 0.05 POWER: 1 -  ↑   T Non-centrality parameter

Standard Case  Sampling distribution if HA were true Sampling distribution if H0 were true alpha 0.05 POWER: 1 -    T Non-centrality parameter

More conservative α  Sampling distribution if HA were true Sampling distribution if H0 were true alpha 0.01 POWER: 1 -  ↓   T Non-centrality parameter

Less conservative α  Sampling distribution if HA were true Sampling distribution if H0 were true alpha 0.10 POWER: 1 -  ↑   T Non-centrality parameter

Standard Case  Sampling distribution if HA were true Sampling distribution if H0 were true alpha 0.05 POWER: 1 -    T Non-centrality parameter

Increased sample size  Sampling distribution if HA were true Sampling distribution if H0 were true alpha 0.05 POWER: 1 -  ↑   T Non-centrality parameter

Increased sample size Sample size scales linearly with chi square NCP Sampling distribution if HA were true Sampling distribution if H0 were true alpha 0.05 POWER: 1 -  ↑ Sample size scales linearly with chi square NCP   T Non-centrality parameter

Nonsignificant result Statistical Analysis Rejection of H0 Non-rejection of H0 Type I error at rate  Nonsignificant result (1- ) H0 true Truth Type II error at rate  Significant result (1-) HA true

What is power? Definitions of power The probability that the test will reject the null hypothesis if the alternative hypothesis is true The chance the your statistical test will yield a significant result when the effect you are testing exists What is power?

Practical 1 We are going to simulate a normal distribution using R We can do this with a single line of code, but let’s break it up Practical 1

Simulation functions R has functions for many distributions Normal, χ2, gamma, beta (others) Let’s start by looking at the random normal function: rnorm() Simulation functions

In R: ?rnorm rnorm Documentation

Standard deviation of distribution rnorm(n, mean = 0, sd = 1) Function name Mean of distribution with default value Number of observations to simulate Standard deviation of distribution with default value rnorm syntax

R script: Norm_dist_sim.R This script will plot 4 samples from the normal distribution Look for changes in shape Thoughts? R script: Norm_dist_sim.R

One I made earlier

Concepts Sampling variance We saw that the ‘normal’ distribution from 100 observations looks stranger than for 1,000,000 observations Where else may this sampling variance happen? How certain are we that we have created a good distribution? Concepts

Rather than just simulating the normal distribution, let’s simulate what our estimate of a mean looks like as a function of sample size We will run the R script mean_estimate_sim.R Mean estimation

R script: mean_estimate_sim.R This script will plot 4 samples from the normal distribution Look for changes in shape Thoughts? R script: mean_estimate_sim.R

One I made earlier

We see an inverse relationship between sample size and the variance of the estimate This variability in the estimate can be calculated from theory SEx = s/√n SEx is the standard error, s is the sample standard deviation, and n is the sample size Standard Error

The sampling variability in my estimate affects my ability to declare a parameter as significant (or significantly different) Key Concept 1

Power definition again The probability that the test will reject the null hypothesis if the alternative hypothesis is true Power definition again

Why being picky can be good and bad Ascertainment Why being picky can be good and bad

Bivariate plot for actors in Hollywood

Bivariate plot for actors who “made it” in Hollywood

Bivariate plot for actors who “made it” in Hollywood P<2e-16 Bivariate plot for actors who “made it” in Hollywood

Again – you’re meant to say something I’m waiting… What happened?

Ascertainment Bias in your parameter estimates Bias is a difference between the “true value” and the estimated value Can apply across a range of scenarios Bias estimates of means, variances, covariances, betas etc. Ascertainment

When might we want to ascertain? For testing means, ascertainment increases power For characterizing variance:covariance structure, ascertainment can lead to bias When might we want to ascertain?

When might we want to ascertain? For testing means, ascertainment increases power For characterizing variance:covariance structure, ascertainment can lead to bias When might we want to ascertain?

Practical! Power calculations using NCP We create the model specifying our effect sizes We then simulate data empirical = T means that the simulated data matches the specifications [within some error] The chi square can then be used to generate power Practical!