Medical Statistics: Users’ Manual Arash Etemadi, MD PhD Department of Epidemiology and Biostatistics, School of Public Health, Tehran University of Medical.

Presentation transcript:

Medical Statistics: Users’ Manual Arash Etemadi, MD PhD Department of Epidemiology and Biostatistics, School of Public Health, Tehran University of Medical Sciences

Why does good evidence from research fail to get into practice?
– 75% cannot understand the statistics
– 70% cannot critically appraise a research paper
Dunn V, Crichton C, Williams K, Roe B, Seers K. Using Research for Practice: A UK Experience of the Barriers Scale.

Do we have to build the car we drive?

Why is statistics necessary?
– 58% of the population had GERD
– Mean age of the respondents was 25 ± 8
– 25% of women and 50% of men lied about their age
– Doctors live longer than normal people. (4 in each group!?)

Why is statistics necessary?
Descriptive statistics
– 58% of the population had GERD
– Mean age of the respondents was 25 ± 8
Inferential statistics
– 25% of women and 50% of men lied about their age
– Doctors live longer than normal people. (4 in each group!?)

Descriptive statistics Point estimates: Mean, median, mode, relative frequency Distribution: Standard deviation

Inferential statistics: exploring associations and differences

Differences Continuous variables (blood pressure, age): vs Categorical variables (proportion of blind people): 10% vs. 2%

Measures of Association
– Relative Risk (RR)
– Odds Ratio (OR)
– Absolute risk reduction
– Relative risk reduction
– Number needed to treat
[Table: treatment (Medical, CABG) by outcome (dead, alive, total); cell values not recoverable]

Number Needed to Treat (NNT) (very trendy but tricky):
– only defined for a specific intervention!
– only defined for a specific outcome!
– e.g. Pravastatin™ 40 mg nocte x 10 years, in a 65-year-old male ex-smoker with high BP and diabetes, to reduce MI or death.
NNT is the inverse of the Absolute Risk Reduction, i.e. NNT = 1/ARR
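The measures of association listed above can all be derived from the same two-by-two table. The sketch below is only an illustration: the function name and the counts are hypothetical (the slide's own Medical vs. CABG numbers are not available), and "events" means the adverse outcome, such as death.

```python
def association_measures(events_control, n_control, events_treated, n_treated):
    """Compute common measures of association for a two-arm comparison.

    All arguments are counts; "events" are the adverse outcome (e.g. deaths).
    """
    risk_control = events_control / n_control
    risk_treated = events_treated / n_treated
    odds_control = events_control / (n_control - events_control)
    odds_treated = events_treated / (n_treated - events_treated)
    arr = risk_control - risk_treated          # absolute risk reduction
    return {
        "RR": risk_treated / risk_control,     # relative risk
        "OR": odds_treated / odds_control,     # odds ratio
        "ARR": arr,
        "RRR": arr / risk_control,             # relative risk reduction
        "NNT": 1 / arr,                        # number needed to treat
    }


# Hypothetical counts (not from the slides): 40 deaths among 200 patients in
# the comparison arm, 20 deaths among 200 patients in the treated arm.
measures = association_measures(40, 200, 20, 200)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

With these made-up counts the risk falls from 20% to 10%, so ARR = 0.10 and NNT = 1/0.10 = 10. The NNT is undefined when the two risks are equal (ARR = 0), which this sketch does not guard against.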

Measures of Association
Linear Correlation
– Conditions
– r and p, CI
Regression
– Univariate
– Multiple Regression
– Logistic Regression
– Cox Proportional Hazard Model
Do they mean causation?

Associations may be due to:
– Chance (random error): statistics are used to reduce it by appropriate design of the study, and to estimate the probability that the observed results are due to chance
– Bias (systematic error): must be considered in the design of the study
– Confounding: can be dealt with during both the design and the analysis of the study
– True association

Dealing with chance error
During design of study
– Sample size
– Power
During analysis (statistical measures of chance)
– Test of statistical significance (P value)
– Confidence intervals
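As an illustration of how sample size and power interact at the design stage, the sketch below uses the common normal-approximation formula for comparing two proportions. The formula choice, the function name and the defaults (two-sided alpha of 0.05, power of 80%) are assumptions for this example, not something the slides specify. The proportions 10% and 2% are borrowed from the earlier "Differences" slide.

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Approximate subjects needed per group to detect p1 vs. p2.

    Uses n = (z_alpha/2 + z_beta)^2 * [p1(1-p1) + p2(1-p2)] / (p1 - p2)^2,
    a normal approximation for a two-sided test with equal group sizes.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


# Detecting 10% vs. 2% with 80% power at a two-sided alpha of 0.05
n_per_group = sample_size_two_proportions(0.10, 0.02)
print(n_per_group)
```

Asking for more power, or for a smaller difference to be detectable, raises the required number per group. Exact methods or continuity corrections would give somewhat different figures.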

Statistical measures of chance I (Test of statistical significance)
                          Association in reality
Observed association      Yes               No
Yes                       Correct           Type I error
No                        Type II error     Correct

The p-value in a nutshell
Could the result have occurred by chance? [Scale from p = 0 (the result is unlikely to be due to chance) to p = 1 (the result is likely to be due to chance)]
– p < 0.05: a statistically significant result
– p = 0.05, or 1 in 20: result fairly unlikely to be due to chance
– p > 0.05: not a statistically significant result
– p = 0.5, or 1 in 2: result quite likely to be due to chance

Significantitis
Significantitis is the plague of our time.
» A. Etemadi, 21st century epidemiologist
– The drug reduced blood pressure by 1 mmHg (p< )
– Although we showed that half of the newborns could be saved by this method, our results were good-for-nothing (p=0.06)
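One reason a clinically trivial effect can still look "significant" is that the p-value depends on sample size as well as effect size. The sketch below assumes a simple two-sample z-test with a known standard deviation. The 10 mmHg SD and the group sizes are hypothetical values chosen only to make the point.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p(mean_difference, sd, n_per_group):
    """Two-sided p-value for a difference in means (z-test, equal groups)."""
    standard_error = sd * sqrt(2 / n_per_group)
    z = mean_difference / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


# The same 1 mmHg difference, with an assumed SD of 10 mmHg
p_small_trial = two_sided_p(1.0, 10.0, 50)      # 50 patients per group
p_large_trial = two_sided_p(1.0, 10.0, 5000)    # 5000 patients per group
print(f"n=50:   p = {p_small_trial:.3f}")
print(f"n=5000: p = {p_large_trial:.2g}")
```

The effect is identical in both runs. Only the number of patients changes, which is why the size of the effect has to be judged separately from its p-value.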

Confidence Interval (CI): the range within which the true size of effect (never exactly known) lies, with a given degree of assurance (usually 95%)

The ACE inhibitor group had a 5% (95% CI: 1-9) higher survival.
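A statement of this kind can be produced from the survival counts in each group. The sketch below uses the simple Wald (normal-approximation) interval for a difference between two proportions. The counts are hypothetical and are chosen only to give a difference of about 5%. They are not the data behind the slide.

```python
from math import sqrt
from statistics import NormalDist


def risk_difference_ci(events_a, n_a, events_b, n_b, level=0.95):
    """Difference in proportions (a minus b) with a Wald confidence interval."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    diff = p_a - p_b
    standard_error = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return diff, diff - z * standard_error, diff + z * standard_error


# Hypothetical: 850 of 1000 survive on treatment, 800 of 1000 on control
diff, low, high = risk_difference_ci(850, 1000, 800, 1000)
print(f"difference = {diff:.1%} (95% CI: {low:.1%} to {high:.1%})")
```

Because the interval excludes zero, the difference would also be statistically significant at the 5% level. The interval additionally shows how small or large the true benefit might plausibly be.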

Associations may be due to:
– Chance (random error): statistics are used to reduce it by appropriate design of the study, and to estimate the probability that the observed results are due to chance
– Bias (systematic error): must be considered in the design of the study
– Confounding: can be dealt with during both the design and the analysis of the study
– True association

Types of Bias
– Selection bias: identification of individual subjects for inclusion in the study on the basis of either exposure or disease status depends in some way on the other axis of interest
– Observation (information) bias: results from systematic differences in the way data on exposure or outcome are obtained from the various study groups

Associations may be due to:
– Chance (random error): statistics are used to reduce it by appropriate design of the study, and to estimate the probability that the observed results are due to chance
– Bias (systematic error): must be considered in the design of the study
– Confounding: can be dealt with during both the design and the analysis of the study
– True association

Confounding [diagram: coffee, pancreatic cancer, smoking]

Confounding [diagram: smoking, coffee, pancreatic cancer]

Confounding [diagram: confounder, possible cause, effect]
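Confounding can be handled at the analysis stage by stratifying on the confounder. The sketch below contrasts a crude odds ratio with a Mantel-Haenszel odds ratio pooled across smoking strata. The counts are invented so that coffee has no effect within either stratum. They only illustrate the coffee, smoking and pancreatic cancer example and come from no real study.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    """
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio across 2x2 tables (a, b, c, d)."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical case-control counts; exposure = coffee drinking
strata = {
    "smokers": (80, 40, 20, 10),
    "non-smokers": (10, 20, 40, 80),
}

# Collapsing the strata gives the crude (unadjusted) table
crude_table = tuple(sum(cells) for cells in zip(*strata.values()))
crude_or = odds_ratio(*crude_table)
adjusted_or = mantel_haenszel_or(strata.values())

print(f"crude OR    = {crude_or:.2f}")
print(f"adjusted OR = {adjusted_or:.2f}")
```

In these made-up data smokers drink more coffee and have more cancer, so the crude odds ratio is 2.25 even though the odds ratio is 1.00 within each smoking stratum and after pooling.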

Associations may be due to:
– Chance (random error): statistics are used to reduce it by appropriate design of the study, and to estimate the probability that the observed results are due to chance
– Bias (systematic error): must be considered in the design of the study
– Confounding: can be dealt with during both the design and the analysis of the study
– True association

9/3/2015 DETERMINATION OF CAUSATION The general QUESTION: Is there a cause and effect relationship between the presence of factor X and the development of disease Y? One way of determining causation is personal experience by directly observing a sequence of events. How do elevators work?

Nature of Evidence:
1. Replication of findings – consistent in populations
2. Strength of association – significant high risk
3. Temporal sequence – exposure precedes disease

Nature of Evidence:
4. Dose-response – higher dose exposure, higher risk
5. Biologic credibility – exposure linked to pathogenesis
6. Consideration of alternative explanations – the extent to which other explanations have been considered

Nature of Evidence:
7. Cessation of exposure (dynamics) – removal of exposure reduces risk
8. Specificity – specific exposure is associated with only one disease
9. Experimental evidence

H. pylori
– Temporal relationship: 11% of chronic gastritis patients go on to develop duodenal ulcers over a 10-year period
– Strength: H. pylori is found in at least 90% of patients with duodenal ulcer
– Dose response: density of H. pylori is higher in patients with duodenal ulcer than in patients without
– Consistency: association has been replicated in other studies

H. pylori
– Biologic plausibility: originally there was no biologic plausibility; then H. pylori binding sites were found; now H. pylori is known to induce inflammation
– Specificity: prevalence of H. pylori in patients with duodenal ulcers is 90% to 100%

First question to ask: Are there any statistics at all?
– Are they necessary?

Are baseline differences explored and adjusted for?
Collins et al. Stat Med; 1987

                 Center 1               Center 2
Nodal status     Treatment   Control    Treatment   Control
0                61%         64%        22%         50%
1-3              28%         28%        31%         35%
4+               11%         7%         42%         14%
N/A              0           1%         5%          1%

What is the appropriate test? –Scales Nominal Ordinal Interval Ratio

Normal Distribution

Copyright ©1997 BMJ Publishing Group Ltd. Greenhalgh, T. BMJ 1997;315: Skewed curve

Parametric versus non-parametric tests Transformation

Jekyl-Frankenestein-Tarkovsky test of variances for unequal modes

Subgroup analysis?
– Analyses showed that the drug was especially effective in women above 35 who were unable to say supercalifragilisticexpialidocious.
– We divided the study population according to sex, then each group was divided into 10 age groups, and each age group was subdivided according to educational background and whether they were left-handed or right-handed.

Scenario

Subgroup analysis

Paired analysis?

Ten ways to cheat on statistical tests when writing up results
– Throw all your data into a computer and report as significant any relation where P<0.05
– If baseline differences between the groups favour the intervention group, remember not to adjust for them
– Do not test your data to see if they are normally distributed. If you do, you might get stuck with non-parametric tests, which aren't as much fun
– Ignore all withdrawals (drop outs) and non-responders, so the analysis only concerns subjects who fully complied with treatment

– Always assume that you can plot one set of data against another and calculate an "r value" (Pearson correlation coefficient), and assume that a "significant" r value proves causation
– If outliers (points which lie a long way from the others on your graph) are messing up your calculations, just rub them out. But if outliers are helping your case, even if they seem to be spurious results, leave them in
– If the confidence intervals of your result overlap zero difference between the groups, leave them out of your report. Better still, mention them briefly in the text but don't draw them in on the graph, and ignore them when drawing your conclusions

– If the difference between two groups becomes significant four and a half months into a six month trial, stop the trial and start writing up. Alternatively, if at six months the results are "nearly significant," extend the trial for another three weeks
– If your results prove uninteresting, ask the computer to go back and see if any particular subgroups behaved differently. You might find that your intervention worked after all in Chinese women aged
– If analysing your data the way you plan to does not give the result you wanted, run the figures through a selection of other tests
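Several of the tricks above (testing every relation, trawling subgroups, trying a selection of tests) work because each extra test is another chance of a false positive. The sketch below assumes independent tests, each run at the same alpha. That is a simplification, since real subgroup analyses are usually correlated. The Bonferroni threshold is shown as one common, conservative remedy and is not something the slides prescribe.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive among independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold that keeps the overall rate near alpha."""
    return alpha / n_tests


fwer_20 = familywise_error_rate(20)
for n_tests in (1, 5, 20, 40):
    print(f"{n_tests:>2} tests: "
          f"P(at least one 'significant' result) = {familywise_error_rate(n_tests):.2f}, "
          f"Bonferroni threshold = {bonferroni_threshold(n_tests):.4f}")
```

With 20 independent tests and no real effects, the chance of at least one P<0.05 is roughly 64%, so a "significant" subgroup found by trawling is weak evidence on its own.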

Statistical Tests
Columns give the type of data; rows give the goal.

Goal | Measurement (from Gaussian Population) | Rank, Score, or Measurement (from Non-Gaussian Population) | Binomial (Two Possible Outcomes) | Survival Time
Describe one group | Mean, SD | Median, interquartile range | Proportion | Kaplan Meier survival curve
Compare one group to a hypothetical value | One-sample t test | Wilcoxon test | Chi-square or Binomial test** |
Compare two unpaired groups | Unpaired t test | Mann-Whitney test | Fisher's test (chi-square for large samples) | Log-rank test or Mantel-Haenszel*
Compare two paired groups | Paired t test | Wilcoxon test | McNemar's test | Conditional proportional hazards regression*

Statistical Tests

Goal | Measurement (from Gaussian Population) | Rank, Score, or Measurement (from Non-Gaussian Population) | Binomial (Two Possible Outcomes) | Survival Time
Compare three or more unmatched groups | One-way ANOVA | Kruskal-Wallis test | Chi-square test | Cox proportional hazard regression**
Compare three or more matched groups | Repeated-measures ANOVA | Friedman test | Cochran's Q** | Conditional proportional hazards regression**
Quantify association between two variables | Pearson correlation | Spearman correlation | Contingency coefficients** |
Predict value from another measured variable | Simple linear regression or Nonlinear regression | Nonparametric regression** | Simple logistic regression* | Cox proportional hazard regression*
Predict value from several measured or binomial variables | Multiple linear regression* or Multiple nonlinear regression** | | Multiple logistic regression* | Cox proportional hazard regression*