Effect of Sample Size on Research Outcomes


Prabhaker Mishra (1), A. Kaul (2), CM Pandey (3), Uttam Singh (4)
(1) Assistant Professor, (2) Additional Professor, (3) Professor & Head, (4) Professor
(1, 3, 4) Dept. of Biostatistics & Health Informatics; (2) Dept. of Nephrology
Sanjay Gandhi Postgraduate Institute of Medical Sciences, Lucknow, India

Introduction : A good research study is one that is well designed and leads to valid and meaningful outcomes. The outcome of a research study is considered statistically significant (null hypothesis rejected) when the p value is below the chosen level of significance, conventionally 5% (0.05). Various statistical methods are used to calculate the p value, and each of them needs some minimum number of individuals (sample size) under the given conditions. Without an adequate sample size, the result may be attributable to chance.

Introduction : The power of the study is another important factor; it applies to comparative studies only. For a comparative study, power should usually not be below 80%. To achieve at least 80% power, some minimum sample size is required in each group for a given difference.

Introduction : The relative error / margin of error plays a major role in deciding the sample size (under some conditions). To obtain a smaller error, we need a larger sample size. The present study discusses the effect of sample size on research outcomes / p values.

Materials and Methods : In the present exploratory study, retrospective data on 25 consecutive female renal failure patients with induced pregnancy were collected from the Department of Nephrology, SGPGIMS, Lucknow. The patients had visited a single unit of SGPGIMS during 2012-17. Data used : The following variables were collected for this study. 1. Renal outcome (recovery / no recovery). 2. Systolic blood pressure (SBP) in mmHg, diastolic blood pressure (DBP) in mmHg, hemoglobin (Hb) in g/dl, serum creatinine in mg/dl, and hospital stay in days.

Materials and Methods : For analysis, hospital stay in days was further categorized into two groups (≤15 days and ≥16 days). Statistical Analysis : Continuous variables are presented as mean ± standard deviation and categorical data as frequency and percentage. To compare means / proportions between two independent groups (recovery / no recovery), the independent samples t test / chi-square test was used.

Materials and Methods : Statistical Analysis (continued) : To test the linear relationship between two continuous variables, the Pearson correlation coefficient was calculated. Data were analyzed using the Statistical Package for the Social Sciences, version 23 (SPSS-23, IBM, Chicago, USA). A p value <0.05 was considered statistically significant.

Materials and Methods : Sample size : Sample sizes were estimated to compare means between the two groups of patients (recovery vs no recovery) for systolic blood pressure (SBP) and diastolic blood pressure (DBP), and to compare proportions of recovery between the hospital stay groups (≤15 days, ≥16 days). Sample size estimation was done using the software Power Analysis and Sample Size, version 2008 (PASS-2008). Details of the sample size for each variable are given below.

Estimated Sample Size for SBP
Sample size : Group sample sizes of 24 and 62 produce a two-sided 95% confidence interval with a margin of error of 7.76 for the difference in mean SBP when the mean ± SD of group 1 and group 2 are 134.75 ± 17.34 and 142.65 ± 10.91 respectively.
Sample size : Group sample sizes of 46 and 119 achieve 81% power to detect a difference of 7.9 between the groups when the group 1 and group 2 mean SBP are 134.75 and 142.65, with estimated group standard deviations of 17.3 and 10.9 and a significance level (alpha) of 0.05, using a two-sided two-sample t-test.
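The power-based calculation above can be approximated by hand. The sketch below uses the textbook normal-approximation formula for two means with unequal allocation; it is not the algorithm PASS-2008 uses (PASS works with the t distribution), so the result is expected to land near, but not exactly on, 46 and 119. The function name and the `ratio` parameter (n2 / n1, taken here as 119 / 46) are illustrative assumptions.

```python
import math
from statistics import NormalDist


def n_two_means(sd1, sd2, diff, ratio, alpha=0.05, power=0.80):
    """Approximate group sizes for a two-sided comparison of two means.

    Normal approximation (assumption, not the PASS-2008 method).
    ratio is n2 / n1; returns (n1, n2), both rounded up.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_beta = z.inv_cdf(power)           # quantile matching the target power
    n1 = (sd1 ** 2 + sd2 ** 2 / ratio) * (z_alpha + z_beta) ** 2 / diff ** 2
    n1 = math.ceil(n1)
    return n1, math.ceil(n1 * ratio)


# SDs and difference as quoted on the slide; allocation ratio assumed to be 119 / 46.
n1, n2 = n_two_means(sd1=17.3, sd2=10.9, diff=7.9, ratio=119 / 46)
print(f"group 1: {n1}, group 2: {n2}")
```

Because the normal approximation ignores the extra uncertainty of estimating the SD, it tends to give slightly smaller sizes than a t-based calculation.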

Estimated Sample Size for DBP
Sample size : Group sample sizes of 49 and 127 produce a two-sided 95% confidence interval with a margin of error of 2.58 for the difference in mean DBP when the mean ± SD of group 1 and group 2 are 82.75 ± 7.85 and 85.35 ± 7.32 respectively.
Sample size : Group sample sizes of 97 and 250 achieve 80% power to detect a difference of 2.6 between the groups when the group 1 and group 2 mean DBP are 82.75 and 85.35, with estimated standard deviations of 7.85 and 7.32 respectively and a significance level (alpha) of 0.05, using a two-sided two-sample t-test.

Estimated Sample Size for Hospital Stay
Sample size : Group 1 and group 2 sample sizes of 14 and 36 produce a two-sided 95% confidence interval for the difference in proportions with a width equal to 0.57 (95% CI = 0.001 - 0.58) when the sample 1 and sample 2 proportions are 0.46 and 0.17 respectively.
Sample size : Group sample sizes of 28 and 72 achieve 80% power to detect a difference of 0.295 between the groups when the group 1 and group 2 proportions are 0.46 and 0.16, with a significance level (alpha) of 0.05, using a two-proportions z test for independent samples.
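A comparable hand calculation exists for two proportions. The sketch below uses the standard pooled/unpooled normal-approximation formula with unequal allocation; which variant PASS-2008 applies is not stated in the slides, so this is an assumption and the answer may differ from 28 and 72 by a few patients. The inputs 0.462 and 0.167 are the unrounded proportions reported in the Discussion (difference 0.295), and the allocation ratio is taken as 72 / 28.

```python
import math
from statistics import NormalDist


def n_two_proportions(p1, p2, ratio, alpha=0.05, power=0.80):
    """Approximate group sizes for a two-sided two-proportions z test.

    Normal approximation with a pooled variance under H0 (assumption).
    ratio is n2 / n1; returns (n1, n2), both rounded up.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    p_bar = (p1 + ratio * p2) / (1 + ratio)  # pooled proportion under H0
    term_h0 = z_alpha * math.sqrt(p_bar * (1 - p_bar) * (1 + 1 / ratio))
    term_h1 = z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2) / ratio)
    n1 = math.ceil((term_h0 + term_h1) ** 2 / (p1 - p2) ** 2)
    return n1, math.ceil(n1 * ratio)


n1, n2 = n_two_proportions(p1=0.462, p2=0.167, ratio=72 / 28)
print(f"group 1: {n1}, group 2: {n2}")
```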

Estimated Sample Size for Correlation
Sample size (SBP & serum creatinine) : A sample size of 27 produces a two-sided 95% confidence interval with a width equal to 0.661 (95% CI = 0.007 - 0.668) when the sample correlation is 0.386.
Sample size (DBP & serum creatinine) : A sample size of 1890 produces a two-sided 95% confidence interval with a width equal to 0.090 (95% CI = 0.001 - 0.091) when the sample correlation is 0.046.
Sample size (Hb & serum creatinine) : A sample size of 198 produces a two-sided 95% confidence interval with a width equal to 0.274 (95% CI = 0.001 - 0.274) when the sample correlation is 0.140.
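The intervals quoted for the correlations are consistent with the usual Fisher z-transformation interval. A minimal sketch, assuming that method (the slides do not name the one PASS-2008 applied) and an illustrative function name:

```python
import math
from statistics import NormalDist


def correlation_ci(r, n, level=0.95):
    """Confidence interval for a Pearson correlation via Fisher's z transform."""
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    z = math.atanh(r)                  # Fisher transform of r
    half = z_crit / math.sqrt(n - 3)   # standard error of z is 1 / sqrt(n - 3)
    return math.tanh(z - half), math.tanh(z + half)


for r, n in [(0.386, 27), (0.046, 1890), (0.140, 198)]:
    lo, hi = correlation_ci(r, n)
    print(f"r = {r}, n = {n}: 95% CI = {lo:.3f} to {hi:.3f}, width = {hi - lo:.3f}")
```

Each of these sample sizes is roughly the smallest one at which the lower limit stays above zero, which is why a weaker correlation calls for a much larger sample.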

Steps used in this study : To show the effect of sample size on the test of significance: The p value was first calculated at the study sample size of 25, where the result was not significant. To check whether a larger sample size, at the same difference, could yield a significant p value, the sample size was increased to 2 times, 3 times, ... the original and the p value was recalculated each time. This process was repeated until a significant p value was obtained, and the trend in the p values is discussed.
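The procedure can be imitated from summary statistics alone. The sketch below holds the SBP means and SDs fixed, multiplies both group sizes by 1, 2, 3, ... and recomputes a pooled-variance two-sample t test each time. Several things are assumptions made for illustration: the group sizes (7 recovered, 18 not recovered, read from the descriptive statistics), the assignment of 134.75 ± 17.34 to the recovered group, the equal-variance form of the t test, and the helper names. The p value is obtained by numerically integrating the t density so that only the standard library is needed.

```python
import math


def t_two_sided_p(t, df, steps=4000):
    """Two-sided p value for Student's t, via Simpson integration of the density."""
    t = abs(t)
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = t / steps                       # steps is even, as Simpson's rule needs
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central = total * h / 3             # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central)


def pooled_t_p(m1, s1, n1, m2, s2, n2):
    """p value of an equal-variance two-sample t test from summary statistics."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return t_two_sided_p((m1 - m2) / se, df)


# Assumed base group sizes: 7 and 18 (total 25); means and SDs kept unchanged.
for k in (1, 2, 3, 4):
    p = pooled_t_p(134.75, 17.34, 7 * k, 142.65, 10.91, 18 * k)
    print(f"total N = {25 * k}: p = {p:.3f}")
```

Under these assumptions the p value falls as the multiplier grows even though the difference between the means never changes.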

Results

Descriptive Statistics:
Total number of study patients: 25
Recovery status: Recovery (7, 28%), No recovery (18, 72%)
Mean SBP (mmHg): 140.12 ± 13.46
Mean DBP (mmHg): 83.90 ± 7.62
Mean hospital stay (days): 20.60 ± 13.49
Serum creatinine (mg/dl): 6.96 ± 2.04

Comparison of Mean SBP Between Outcomes (134.75 ± 17.34 vs 142.65 ± 10.91) : Although the mean SBP values stay the same and only the sample size increases in each comparison, the p value decreases continuously; that is, statistical significance increases.

Comparison of Mean DBP Between Outcomes (82.75 ± 7.85 vs 85.35 ± 7.32) : Although the mean DBP values stay the same and only the sample size increases in each comparison, the p value decreases continuously; that is, statistical significance increases.

Association Between Hospital Stay and Recovery (Change in Significance Level at Different Sample Sizes) : Recovery proportions by hospital stay (≥16 days vs ≤15 days). Although the recovered vs non-recovered proportions stay the same and only the sample size increases in each comparison, the p value decreases continuously.

Correlation between Variables (Change in Significance Level at Different Sample Sizes)
Correlation with serum creatinine (calculated sample size: SBP = 27, DBP = 1890, Hb = 198):

N       SBP                DBP                Hb
25      0.386 (p=0.057)    0.046 (p=0.828)    -0.140 (p=0.505)
50      0.386 (p=0.006)    0.046 (p=0.753)    -0.140 (p=0.333)
100     0.386 (p=0.001)    0.046 (p=0.697)    -0.140 (p=0.232)
150     0.386 (p<0.001)    0.046 (p=0.613)    -0.140 (p=0.120)
200                        0.046 (p=0.520)    -0.140 (p=0.048)
500                        0.046 (p=0.308)    -0.140 (p=0.002)
2000                       0.046 (p=0.041)    -0.140 (p<0.001)

p<0.05 is significant. Although the correlation coefficients stay the same and only the sample size increases in each computation, the p value decreases continuously; that is, statistical significance increases.
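The same trend can be checked for a correlation coefficient, since its test statistic depends only on r and N: t = r × √(N − 2) / √(1 − r²) with N − 2 degrees of freedom. The sketch below applies that standard formula with the standard library only (the t tail area is integrated numerically); the helper names are illustrative. It is not a re-run of the SPSS analysis, so individual p values may not match every cell of the table.

```python
import math


def t_two_sided_p(t, df, steps=4000):
    """Two-sided p value for Student's t, via Simpson integration of the density."""
    t = abs(t)
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = t / steps                       # steps is even, as Simpson's rule needs
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return max(0.0, 1.0 - 2.0 * (total * h / 3))


def correlation_p(r, n):
    """Two-sided p value for H0: rho = 0, given sample correlation r and size n."""
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
    return t_two_sided_p(t, n - 2)


for n in (25, 50, 100, 200, 500, 2000):
    cells = ", ".join(f"r = {r:+.3f}: p = {correlation_p(r, n):.3f}"
                      for r in (0.386, 0.046, -0.140))
    print(f"N = {n}: {cells}")
```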

Discussion & Conclusions : Sample size is an important factor in the level of significance. The same mean difference is not significant at a small sample size but is significant at a larger one. Between recovered and not recovered patients, the same mean difference in SBP (134.75 vs 142.65) was not significant at a sample size of 25 (p>0.05) but was significant at a sample size of 75 (p<0.05). Similarly, the same mean difference in DBP (82.75 vs 85.35) was not significant at a sample size of 25 (p>0.05) but was significant at a sample size of 150 (p<0.05).

Discussion & Conclusions : For hospital stay (≥16 days vs ≤15 days), the difference in recovery proportions (46.2% vs 16.7%) was not significant at a sample size of 25 (p>0.05) but was significant at a sample size of 50 (p<0.05). A similar result was observed for the correlations. With a small sample size, the standard error is larger [standard error = standard deviation / √(sample size)], and as a result the confidence interval (mean ± Z × standard error) is wider. With a larger sample size, the standard error is smaller and the confidence interval is narrower, so its limits lie closer to the point estimate.
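A short numerical illustration of the standard error formula, using the overall SBP summary from the descriptive statistics (140.12 ± 13.46) and hypothetical sample sizes. The function name is illustrative, and the interval uses the normal quantile Z, as in the formula quoted in the text:

```python
import math
from statistics import NormalDist


def mean_ci(mean, sd, n, level=0.95):
    """Standard error and z-based confidence interval for a mean."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    se = sd / math.sqrt(n)              # standard error = SD / sqrt(n)
    return se, mean - z * se, mean + z * se


for n in (25, 50, 100, 200):
    se, lo, hi = mean_ci(140.12, 13.46, n)
    print(f"n = {n}: SE = {se:.2f}, 95% CI = {lo:.2f} to {hi:.2f}")
```

Because the standard error shrinks with the square root of n, halving the width of the interval takes four times as many patients.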

Discussion & Conclusions : To obtain narrow confidence limits around the point estimate, we need a larger sample size. Larger samples increase the chance of finding a significant difference because they reflect the population mean more reliably. With an appropriate sample size, when the detected difference ≥ the assumed difference, the result comes out as significant.

Discussion & Conclusions : Because the calculated sample size is only an estimate, the result may become significant at around this size or above it. When power is taken into account, the required sample size increases. It is recommended to calculate the sample size for a study and to recruit more than the calculated sample size, so that the result can reach significance and be generalized.

Limitation of this study : No sampling method can ensure that, when samples of size n, 2n, 3n, ... are drawn from a population (size N), the mean ± SD remains the same in each sample. To overcome this problem, the present study multiplied the same sample several times (i.e. 50, 75, 100, ... etc.). The sample sizes 50, 75, 100, ... are therefore hypothetical and act as mirrors of the study sample of size 25.

THANKS