Sample size calculations

Sample size calculations
Marie-Pierre Sylvestre
mp.sylvestre@epimgh.mcgill.ca
Material adapted from http://www.sgul.ac.uk/depts/phs/guide/size.htm
June 2007

Why bother?
Sample size calculations are important to ensure that estimates are obtained with the required precision or confidence. For example:
a prevalence of 10% from a sample of size 20 has a 95% CI of 1% to 31%;
a prevalence of 10% from a sample of size 400 has a 95% CI of 7% to 13%.
In studies concerned with detecting an effect, sample size calculations are important to ensure that, if an effect deemed to be clinically or biologically important exists, there is a high chance of it being detected, i.e. that the analysis will be statistically significant. If the sample is too small, then even if large differences are observed, it will be impossible to show that these are due to anything more than sampling variation.
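The effect of sample size on precision can be checked numerically. The sketch below assumes the slide's intervals are 'exact' (Clopper-Pearson) binomial intervals, which the slides do not state; the function names are illustrative only. It should give intervals close to the two quoted above, up to rounding.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def exact_ci(k, n, conf=0.95):
    """Clopper-Pearson ('exact') interval for k events among n subjects."""
    tail = (1 - conf) / 2

    def root(f):
        # Bisection on [0, 1] for a function that decreases from + to -.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if f(mid) > 0:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower limit: the p at which P(X >= k) equals the tail area.
    lower = 0.0 if k == 0 else root(lambda p: tail - (1 - binom_cdf(k - 1, n, p)))
    # Upper limit: the p at which P(X <= k) equals the tail area.
    upper = 1.0 if k == n else root(lambda p: binom_cdf(k, n, p) - tail)
    return lower, upper


# 10% prevalence observed in samples of 20 and of 400
for k, n in [(2, 20), (40, 400)]:
    lo, hi = exact_ci(k, n)
    print(f"{k}/{n}: 95% CI {lo:.1%} to {hi:.1%}")
```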

Some terminology
Significance level: the cut-off point for the p-value, below which the null hypothesis will be rejected and it will be concluded that there is evidence of an effect. Typically set at 5%.
One-sided and two-sided tests of significance: two-sided tests should be used unless there is a very good reason for doing otherwise.
Power: the probability that the null hypothesis will be correctly rejected, i.e. rejected when there is indeed a real difference or association. It can also be thought of as "100 minus the percentage chance of missing a real effect", so the higher the power, the lower the chance of missing a real effect. Power is typically set at 80% or 90%, but not below 80%.
Effect size of clinical importance: the smallest difference between the group means or proportions (or the odds ratio/relative risk closest to unity) which would be considered clinically or biologically important. The sample size should be set so that, if such a difference exists, it is very likely that a statistically significant result will be obtained.
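How significance level, power and effect size combine can be seen in a standard textbook approximation that is not part of the slides. It assumes a two-sided comparison of two group means, equal group sizes, a common standard deviation and a normal approximation. Under those assumptions the sample size per group is roughly:

```latex
n \approx \frac{2\,\sigma^{2}\,\bigl(z_{1-\alpha/2} + z_{1-\beta}\bigr)^{2}}{\Delta^{2}}
```

Here \alpha is the significance level, 1-\beta is the power, \sigma is the standard deviation and \Delta is the smallest difference of clinical importance. With \alpha = 0.05 and 80% power, (1.96 + 0.84)^2 is about 7.85, so detecting a difference of half a standard deviation needs about 63 subjects per group.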

Example (1) Estimating a single proportion
Scenario: The prevalence of dysfunctional breathing amongst asthma patients being treated in general practice is to be assessed using a postal questionnaire survey (Thomas et al. 2001).
Required information:
Primary outcome variable = presence/absence of dysfunctional breathing
'Best guess' of expected percentage (proportion) = 30% (0.30)
Desired width of 95% confidence interval = 10% (i.e. +/- 5%, or 25% to 35%)
The formula for the sample size for estimation of a proportion is
n = 15.4 * p * (1 - p) / W^2
where
n = the required sample size
p = the expected proportion, here 0.30
W = the width of the confidence interval, here 0.10
To get a feel

Example (2) Estimating a single proportion
Here we have:
n = 15.4 * 0.30 * 0.70 / 0.10^2 = 324
"A sample of 324 patients with asthma will be required to obtain a 95% confidence interval of +/- 5% around a prevalence estimate of 30%. To allow for an expected 70% response rate to the questionnaire, a total of 480 questionnaires will be delivered."
Note: The formula presented above is based on 'normal approximation methods' and should not be applied when estimating percentages which are close to 0% or 100%. In these circumstances 'exact methods' should be used.
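The calculation above can be reproduced in a few lines. The constant 15.4 is consistent with 4 * 1.96^2 (about 15.37), which follows from setting the full width W equal to 2 * 1.96 * sqrt(p * (1 - p) / n) and solving for n. That derivation is an inference from the normal approximation, not something the slides spell out. The function names below are illustrative only.

```python
from math import ceil
from statistics import NormalDist


def n_slide_formula(p, width):
    """Sample size from the slide's formula n = 15.4 * p * (1 - p) / W^2."""
    return ceil(15.4 * p * (1 - p) / width**2)


def n_for_proportion(p, width, conf=0.95):
    """Same calculation with the unrounded normal quantile, for any confidence level."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # about 1.96 for 95%
    return ceil(4 * z**2 * p * (1 - p) / width**2)


# Asthma example: expected proportion 0.30, total CI width 0.10
print(n_slide_formula(0.30, 0.10))
print(n_for_proportion(0.30, 0.10))
```

Because 15.4 is a rounded-up constant, the two functions can differ by a subject or so; the slide's version is the slightly more conservative one.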

Prevalence/Proportion
http://www.cs.uiowa.edu/~rlenth/Power/
Then, test for 1 proportion

Cohort studies
Epi Info: http://www.cdc.gov/epiinfo/

Unmatched case-controls
http://stat.ubc.ca/~rollin/stats/ssize/caco.html
http://calculators.stat.ucla.edu/powercalc/binomial/case-control/index.php

Clinical Trials
http://hedwig.mgh.harvard.edu/sample_size/quan_measur/assoc_quant.html

Simple Survival Analysis and Regression
PS: http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/PowerSampleSize

Which variables should be included in the sample size calculation?
The sample size calculation should relate to the study's primary outcome variable. If the study has secondary outcome variables which are also considered important (as is often the case), the sample size should also be sufficient for the analyses of these variables.

Allowing for response rates and other losses to the sample
The sample size calculation should relate to the final, achieved sample. Therefore, the initial numbers approached in the study may need to be increased in accordance with the expected response rate, loss to follow-up, lack of compliance, and any other predicted reasons for loss of subjects. The link between the initial numbers approached and the final achieved sample size should be made explicit.
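One way to make that link explicit is to divide the required final sample by the expected fraction of subjects retained at each stage. The sketch below uses the 324 patients and 70% response rate from the asthma example; the 90% follow-up rate is a hypothetical value added for illustration, as is the function name.

```python
from math import ceil


def initial_sample(n_final, *retention_rates):
    """Number to approach so that about n_final subjects remain.

    retention_rates are the expected fractions kept at each stage,
    e.g. response rate, follow-up rate, compliance rate.
    """
    expected_fraction = 1.0
    for rate in retention_rates:
        if not 0 < rate <= 1:
            raise ValueError("each rate must be in (0, 1]")
        expected_fraction *= rate
    return ceil(n_final / expected_fraction)


# 324 completed questionnaires needed, 70% expected response rate
print(initial_sample(324, 0.70))
# Hypothetical: the same study with an additional 90% follow-up rate
print(initial_sample(324, 0.70, 0.90))
```

The first call gives the minimum number of questionnaires implied by a 70% response rate; the 480 quoted in the example is above that minimum.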

Consistency with study aims and statistical analysis
If the aim is to demonstrate that a new drug is superior to an existing one then it is important that the sample size is sufficient to detect a clinically important difference between the two treatments. However, sometimes the aim is to demonstrate that two drugs are equally effective. This type of trial is called an equivalence trial or a 'negative' trial. The sample size required to demonstrate equivalence will be larger than that required to demonstrate a difference.
The sample size calculation should also be consistent with the study's proposed method of analysis, since both the sample size and the analysis depend on the design of the study.

Pitfalls to avoid (1)
"The throughput of the clinic is around 50 patients a year, of whom 10% may refuse to take part in the study. Therefore over the 2 years of the study, the sample size will be 90 patients."
Although most studies need to balance feasibility with study power, the sample size should not be decided on the basis of the number of available patients alone. Where the number of available patients is a known limiting factor, sample size calculations should still be provided, to indicate either:
the power which the study will have to detect the desired difference of clinical importance, or
the difference which will be detected when the desired power is applied.
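For the first of these options, the power achieved with a fixed number of patients can be approximated as follows. This is a minimal sketch, assuming the 90 patients are split into two equal groups of 45, the outcome is continuous, and a normal approximation to a two-sided two-sample test is acceptable. None of these details come from the slides, and the standardized differences tried are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_two_means(n_per_group, d, alpha=0.05):
    """Approximate power to detect a difference of d standard deviations.

    Two-sided test, equal group sizes, normal approximation; the tiny
    chance of rejecting in the wrong direction is ignored.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    return z.cdf(d * sqrt(n_per_group / 2) - z_crit)


# 90 available patients, assumed to be split 45 / 45
for d in (0.3, 0.5, 0.8):
    print(f"difference of {d} SD: power {power_two_means(45, d):.0%}")
```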

Pitfalls to avoid (2)
"Sample sizes are not provided because there is no prior information on which to base them."
Where prior information on standard deviations is unavailable, sample size calculations can be given in very general terms, i.e. by giving the size of difference that may be detected in terms of a number of standard deviations.
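A calculation in these general terms might look like the sketch below, which reports the smallest detectable difference in standard deviation units for a given group size. It assumes a two-sided comparison of two equal-sized groups and a normal approximation; the group sizes listed and the function name are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def detectable_difference_sd(n_per_group, power=0.80, alpha=0.05):
    """Smallest difference, in standard deviations, detectable with the given power."""
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) * sqrt(2 / n_per_group)


# No prior estimate of the standard deviation is needed for this table
for n in (25, 45, 100, 200):
    print(f"{n} per group: about {detectable_difference_sd(n):.2f} SD")
```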

Pitfalls to avoid (3)
"A previous study in this area recruited 150 subjects and found highly significant results (p=0.014), and therefore a similar sample size should be sufficient here."
Previous studies may have been 'lucky' to find significant results, due to random sampling variation. Calculations of sample size specific to the present, proposed study should be provided, including details of:
power
significance level
primary outcome variable
effect size of clinical importance for this variable
standard deviation (if a continuous variable)
sample size in each group (if comparing groups)

References
http://www.sgul.ac.uk/depts/phs/guide/size.htm