Overview and Common Pitfalls in Statistics and How to Avoid Them


Overview and Common Pitfalls in Statistics and How to Avoid Them Bandit Thinkhamrop, PhD.(Statistics) Dept. of Biostatistics & Demography Faculty of Public Health Khon Kaen University

Roles of Statistics in Research: Begin at a clear destination. What is the conclusion? Concluded based on what? Could it be wrong? Can the data be wrong? Can it be wrong due to data analysis?

Statistics: Quantify the effect and its error. Magnitude of effect: Parameter estimation [95%CI]; Hypothesis testing [P-value]. Quantify errors for further judgments.

P-value vs. 95%CI (1) An example of a study with dichotomous outcome A study compared cure rate between Drug A and Drug B Setting: Drug A = Alternative treatment Drug B = Conventional treatment Results: Drug A: n1 = 50, Pa = 80% Drug B: n2 = 50, Pb = 50% Pa-Pb = 30% (95%CI: 26% to 34%; P=0.001)
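The arithmetic behind an estimate of this kind can be sketched as follows. This is a minimal illustration, not the method used for the slide: the function name `diff_in_proportions` is hypothetical, the interval is a simple Wald interval, and the p-value comes from a pooled two-sided z-test. With n = 50 per group, this approach gives an interval of about 12% to 48%, which is wider than the illustrative interval quoted on the slide.

```python
from math import sqrt, erfc

def diff_in_proportions(p_a, n_a, p_b, n_b, z=1.96):
    """Difference in two proportions with a Wald 95% CI and a z-test p-value.

    Hypothetical helper for illustration; assumes independent groups and
    samples large enough for the normal approximation.
    """
    diff = p_a - p_b
    # Unpooled standard error, used for the confidence interval
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    lower, upper = diff - z * se, diff + z * se
    # Pooled standard error under the null hypothesis of no difference
    pooled = (p_a * n_a + p_b * n_b) / (n_a + n_b)
    se0 = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    # Two-sided p-value from the standard normal distribution
    p_value = erfc(abs(diff) / se0 / sqrt(2))
    return diff, lower, upper, p_value

# Figures from the slide: Drug A 80% of 50, Drug B 50% of 50
diff, lower, upper, p_value = diff_in_proportions(0.80, 50, 0.50, 50)
print(f"Pa-Pb = {diff:.0%} (95%CI: {lower:.1%} to {upper:.1%}; P = {p_value:.4f})")
```

Reporting the interval alongside the p-value shows both the size of the effect and how precisely it was estimated.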

P-value vs. 95%CI (2) Pa > Pb Pb > Pa Pa-Pb = 30% (95%CI: 26% to 34%; P< 0.05)

P-value vs. 95%CI (3) Adapted from: Armitage, P. and Berry, G. Statistical methods in medical research. 3rd edition. Blackwell Scientific Publications, Oxford. 1994. page 99

Tips #6 (b) P-value vs. 95%CI (4) Adapted from: Armitage, P. and Berry, G. Statistical methods in medical research. 3rd edition. Blackwell Scientific Publications, Oxford. 1994. page 99 There was a statistically significant difference between the two groups.

Tips #6 (b) P-value vs. 95%CI (5) Adapted from: Armitage, P. and Berry, G. Statistical methods in medical research. 3rd edition. Blackwell Scientific Publications, Oxford. 1994. page 99 There was no statistically significant difference between the two groups.

P-value vs. 95%CI (4) Safe tips: Always report the 95%CI with the p-value; do NOT report the p-value alone. Always interpret based on the lower or upper limit of the confidence interval; the p-value is optional. Never interpret a p-value > 0.05 as an indication of no difference or no association; only the CI can provide this message.

1. Over-reliance on p-value Example: Significant findings: p-value < 0.05; Non-significant findings: p-value > 0.05

Diff walking = 1.3, 95%CI: 0.04 to 2.62; Diff walking + counting numbers = 3.0, 95%CI: 0.17 to 5.87; Diff walking + reciting months = 3.9, 95%CI: 1.16 to 6.62

1. Over-reliance on p-value (cont.) Example: significant findings, p-value < 0.05 Tips to avoid it: Always report the magnitude of effect and its 95%CI. Always interpret the findings based on the magnitude of effect, comparing either the lower or the upper boundary of the CI against the minimum meaningful level.
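One way to apply this tip is sketched here. The helper `interpret_ci` and the minimum meaningful level of 1.0 are assumptions made purely for illustration; the slides do not state a threshold, and in practice it has to come from clinical judgment. The sketch assumes a larger difference is the desirable direction.

```python
def interpret_ci(lower, upper, meaningful):
    """Compare a 95% CI for a difference with a minimum meaningful level.

    Hypothetical helper: assumes a positive difference is beneficial and
    that `meaningful` is the smallest difference that matters in practice.
    """
    if lower >= meaningful:
        return "meaningful effect"      # whole interval at or above the threshold
    if upper < meaningful:
        return "no meaningful effect"   # whole interval below the threshold
    return "inconclusive"               # interval straddles the threshold

# The three differences with their 95% CIs, as reported in the example
findings = {
    "walking": (0.04, 2.62),
    "walking + counting numbers": (0.17, 5.87),
    "walking + reciting months": (1.16, 6.62),
}
MEANINGFUL = 1.0  # assumed threshold, for illustration only

results = {name: interpret_ci(lo, hi, MEANINGFUL) for name, (lo, hi) in findings.items()}
for name, verdict in results.items():
    print(f"{name}: {verdict}")
```

All three intervals exclude zero, so all three are statistically significant, yet under the assumed threshold only the last one is clearly meaningful. That distinction is invisible when only p-values are reported.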

2. Test for baseline comparisons
Factors                   Group A (n=20)   Group B        P-value
Age (years)               39.0 ± 0.2       39.5 ± 0.5     <0.001
Male, n (%)               2 (10.0%)        6 (30.0%)      0.114
Weight (kg)               60 ± 52          30 ± 55        0.084
Height (cm)               160 ± 100        130 ± 99       0.346
SBP at baseline (mmHg)    135 ± 5          130 ± 8        0.023
VAS (pain) at baseline    5 ± 5            9 ± 8          0.067
Values are mean ± SD unless indicated otherwise.

Test for baseline

2. Test for baseline comparisons (cont.) Compare all variables that could be related to the association between the exposure and the study outcome. Whether there is an imbalance is a matter of clinical judgment; no statistical test is needed. The magnitude of the difference matters, NOT the p-value. Whether a difference is large or small is a clinical judgment. If the variable is not highly correlated with the study outcome, it can be ignored even if the difference is large. If in doubt, use multivariable analysis with all imbalanced variables included in the model.

3. No magnitude of effect presented Example: See various examples in the class Tips to avoid it: If you can’t count it, it doesn’t exist… Tribe, 1971, p.1360, 1361-2. Always quantify magnitude of effect Always provide the confidence interval of the effect that is the primary objective of the study

Factors affecting birth weight Number Mean Mean Diff 95%CI P-value 1. Complete ANC Yes No xxx xx.x xx.x – xx.x 0.xxx 2. Education of mother Primary school or lower Secondary school College or higher 3. Mother's age (years) Less than 20 20 – 45 45 or older

Factors affecting low birth weight Number % LBW OR 95%CI P-value 1. Complete ANC Yes No xxx xx.x 1 x.xx x.xx – x.xx 0.xxx 2. Education of mother Primary school or lower Secondary school College or higher 3. Mother's age (years) Less than 20 20 – 45 45 or older

4. Applied inappropriate methods of analysis Example: Inconsistent with the type of data; did not handle dependency among observations; did not account for the sampling design; did not handle missing data well; did not account for confounding effects; did not investigate interaction effects. Tips to avoid it: Base the analysis on the objective and design of the study.

5. Described methods of analysis inappropriately Example: Too general Tips to avoid it: Specific and replicable

6. Presented the results inappropriately Example: See various examples in the class and some examples as follows: Sex (OR = 3.5) Age (OR = 1.5) Marital status (OR = 2.0) Tips to avoid it: Always quantify the magnitude of effect. Always provide the confidence interval of the effect that is the primary objective of the study.

Repeated measure ANOVA

Logistic regression

Student’s t-test

Correlation coefficient

ANOVA and t-test

Regression Model

Regression model

Concluded based on sample statistics NOT on population parameter

Within or between group

7. Sample size unjustified Example: Simplified methods might be misleading Tips to avoid it:

8. Interpret a confidence interval inappropriately Example: Width -> wide vs narrow interval; Crosses the null value -> significant vs non-significant. Tips to avoid it: Compare the magnitude of either the lower or the upper boundary of the interval with the meaningful level, then make a judgment.

9. Categorized a continuous variable inappropriately Example: Continuous -> categorical; Numerical count -> categorical; Survival outcome -> categorical. Tips to avoid it: Base the decision on the research question. Keep the intrinsic type of the variable; categorization can be done for exploratory purposes. Base the decision on clinical judgment.

10. Handle the primary outcome inappropriately Example: Interchange among the following: Continuous outcome Categorical outcome Numerical count Survival outcome Tips to avoid it: Based on the research question Based on clinical judgments

11. Before-after design Example: Possible approaches: Post measurement only Change score Fraction Post measurement adjusted for baseline Tips to avoid it: Based on the research question Preferably - post measurement adjusted for baseline
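The difference between a change-score comparison and a baseline-adjusted comparison can be sketched with plain arithmetic. The data here are invented solely for illustration, and the helper names are hypothetical. The adjusted estimate uses the standard ANCOVA result for two groups with a common slope: the difference in post means minus the pooled within-group slope times the difference in baseline means.

```python
from statistics import mean

def pooled_within_slope(groups):
    """Pooled within-group slope of post on baseline (common-slope ANCOVA)."""
    sxy = sxx = 0.0
    for baseline, post in groups:
        xb, yb = mean(baseline), mean(post)
        sxy += sum((x - xb) * (y - yb) for x, y in zip(baseline, post))
        sxx += sum((x - xb) ** 2 for x in baseline)
    return sxy / sxx

def adjusted_difference(control, treated):
    """Post-measurement difference (treated - control) adjusted for baseline."""
    slope = pooled_within_slope([control, treated])
    post_diff = mean(treated[1]) - mean(control[1])
    baseline_diff = mean(treated[0]) - mean(control[0])
    return post_diff - slope * baseline_diff

def change_score_difference(control, treated):
    """Difference in mean change from baseline (treated - control)."""
    change_t = mean(treated[1]) - mean(treated[0])
    change_c = mean(control[1]) - mean(control[0])
    return change_t - change_c

# Hypothetical data: (baseline values, post values) for each group
control = ([2.0, 3.0, 4.0, 5.0], [2.5, 3.5, 4.0, 5.5])
treated = ([3.0, 4.0, 5.0, 6.0], [5.0, 5.5, 7.0, 7.5])

post_only = mean(treated[1]) - mean(control[1])
change = change_score_difference(control, treated)
adjusted = adjusted_difference(control, treated)
print(f"post only: {post_only:.3f}, change score: {change:.3f}, adjusted: {adjusted:.3f}")
```

The three approaches give different answers whenever the groups differ at baseline, which is why the choice should follow from the research question. A full analysis would also need a standard error and confidence interval for the adjusted difference, which this sketch omits.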

Between or within group comparisons?

Suggested format of presentation
Time   Group 1 (n=25)   Group 2      Group 3      Diff 2-1 (95%CI, p-value)*   Diff 3-1 (95%CI, p-value)*
Pre    1.5 ± 1.0        1.8 ± 1.2    2.5 ± 2.0    NA                           NA
Post   1.7 ± 1.0        [missing]    3.5 ± 3.0    0.8 (0.2-1.6), P=0.01        1.8 (1.2-5.6), P=0.03
Late   1.6 ± 1.0        2.9 ± 1.5    4.5 ± 2.0    1.3 (0.9-5.3)                2.9 (1.7-8.5)
* Mean difference adjusted for baseline using ANCOVA

12. Jump to a non-parametric test without thorough exploration of the distribution of the data Example: “Since the sample size is small, we decided to use a non-parametric test.” Tips to avoid it: Raw data could be better than a p-value obtained from a non-parametric test. A small sample cannot be corrected by non-parametric statistics; in fact, we have INSUFFICIENT evidence to allow any valid conclusions!

13. Row total, column total, or grand total fixed?
Example: Row total fixed -> Cohort study (row percentages)
Sex      Disease       Normal        Total
Male     8 (80%)       2 (20%)       10 (100%)
Female   12 (24%)      38 (76%)      50 (100%)
Total    20 (33.3%)    40 (66.7%)    60 (100%)
Column total fixed -> Case-control study (column percentages)
Sex      Disease       Normal        Total
Male     8 (40%)       2 (5%)        10 (16.7%)
Female   12 (60%)      38 (95%)      50 (83.3%)
Total    20 (100%)     40 (100%)     60 (100%)
Tips to avoid it: Choose the percentages based on the study design.
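The two sets of percentages can be reproduced from the same four counts. This is a small sketch with hypothetical function names; it also computes the odds ratio, which is the same whichever margin is treated as fixed.

```python
# Counts from the example: (disease, normal) for each sex
table = {"Male": (8, 2), "Female": (12, 38)}

def row_percentages(table):
    """Percentages within each row (row totals fixed, as in a cohort study)."""
    return {
        sex: tuple(100 * count / sum(counts) for count in counts)
        for sex, counts in table.items()
    }

def column_percentages(table):
    """Percentages within each column (column totals fixed, as in a case-control study)."""
    col_totals = [sum(col) for col in zip(*table.values())]
    return {
        sex: tuple(100 * count / total for count, total in zip(counts, col_totals))
        for sex, counts in table.items()
    }

def odds_ratio(table):
    """Cross-product odds ratio for a 2x2 table (first row vs second row)."""
    (a, b), (c, d) = table.values()
    return (a * d) / (b * c)

rows = row_percentages(table)
cols = column_percentages(table)
print("Row %:", rows)
print("Column %:", cols)
print(f"Odds ratio (male vs female): {odds_ratio(table):.2f}")
```

The row percentages describe the risk of disease within each sex, while the column percentages describe the sex distribution within diseased and normal subjects, so only the set that matches the sampling scheme has a direct interpretation.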

14. Concluded based on opinion or too general, not on the main findings or specific to the study results Example: “Effective prevention strategies should be formulated. Health education should be provided.” Tips to avoid it: Logically link from the main finding that is the primary research question. Specific to what was found in the study

Mistakes are our teachers. Q & A