Treatment population vs. Control population. Examples: Drug vs. Placebo, Drugs vs. Surgery, New Tx vs. Standard Tx. Let X = change in cholesterol level (mg/dL)

Presentation transcript:

Examples: Drug vs. Placebo, Drugs vs. Surgery, New Tx vs. Standard Tx
Let X = change in cholesterol level (mg/dL).
Patients satisfying inclusion criteria are RANDOMIZED into a Treatment Arm and a Control Arm (RANDOM SAMPLES).
Possible expected distributions of X: Treatment population vs. Control population, relative to 0.
End of Study: T-test, F-test (ANOVA). Is the experiment significant?
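As a rough illustration of the end-of-study comparison between the two arms, the sketch below computes a pooled-variance two-sample t statistic using only the Python standard library. The function name `two_sample_t` and the data values are hypothetical and do not come from the slides. The pooled form assumes roughly normal data with equal variances in both arms.

```python
import math
import statistics


def two_sample_t(treatment, control):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    n1, n2 = len(treatment), len(control)
    mean1, mean2 = statistics.mean(treatment), statistics.mean(control)
    var1, var2 = statistics.variance(treatment), statistics.variance(control)
    # Weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, n1 + n2 - 2


# Hypothetical changes in cholesterol (mg/dL) for each arm.
treatment = [-30, -25, -35, -20, -40]
control = [-5, 0, -10, 5, -15]
t_stat, df = two_sample_t(treatment, control)
print(t_stat, df)
```

The statistic would then be compared against a t distribution with `df` degrees of freedom to decide significance.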

Examples: Drug vs. Placebo, Drugs vs. Surgery, New Tx vs. Standard Tx
Let X = change in cholesterol level (mg/dL) from baseline, on same patients.
Patients satisfying inclusion criteria are measured in a Pre-Tx Arm and a Post-Tx Arm (PAIRED SAMPLES).
Distributions of X: Post-Tx population vs. Pre-Tx population, relative to 0.
End of Study: Paired T-test, ANOVA F-test ("repeated measures"). Is the experiment significant?
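The paired design can be sketched the same way: because each patient supplies both a pre- and a post-treatment value, the test works on the within-patient differences. The function name `paired_t` and the measurements below are hypothetical, and the sketch assumes the differences are approximately normal.

```python
import math
import statistics


def paired_t(pre, post):
    """Paired t statistic on the differences (post - pre), with its df."""
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1


# Hypothetical cholesterol levels (mg/dL) on the same five patients.
pre = [240, 250, 230, 260, 245]
post = [230, 235, 225, 240, 235]
t_stat, df = paired_t(pre, post)
print(t_stat, df)
```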

Examples: Drug vs. Placebo, Drugs vs. Surgery, New Tx vs. Standard Tx
Let T = survival time (months).
S(t) = P(T > t) is the survival probability, plotted from 1 down to 0 against T.
Kaplan-Meier estimates of the population survival curves: S1(t) Treatment, S2(t) Control.
Is the AUC difference significant?
End of Study: Log-Rank Test, Cox Proportional Hazards Model.
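A minimal sketch of the Kaplan-Meier (product-limit) estimate of S(t) for a single group follows. The function name `kaplan_meier` and the survival times are hypothetical. The sketch assumes right-censoring only and treats subjects censored at time t as still at risk at t, which is the usual convention.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    events[i] is True if a death was observed at times[i],
    False if the subject was censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose recorded time equals t.
        while i < len(data) and data[i][0] == t:
            deaths += int(data[i][1])
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical survival times in months; False marks a censored subject.
times = [2, 3, 3, 5, 8]
events = [True, True, False, True, False]
curve = kaplan_meier(times, events)
print(curve)
```

Running this once per arm would give the two step curves that the Log-Rank Test compares.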

Case-Control studies / Cohort studies

Observational study designs test for a statistically significant association between a disease D and exposure E to a potential risk (or protective) factor, measured via "odds ratio," "relative risk," etc. Example: Lung cancer / Smoking.

Case-Control studies
  PRESENT: D+ vs. D–. PAST: E+ vs. E–?
  relatively easy and inexpensive
  subject to faulty records, "recall bias"

Cohort studies
  PRESENT: E+ vs. E–. FUTURE: D+ vs. D–?
  measures direct effect of E on D
  expensive, extremely lengthy… Example: Framingham, MA study

Diagram labels: cases, controls, reference group.

Both types of study yield a 2 × 2 "contingency table" for binary variables D and E:

          D+       D–       Total
  E+      a        b        a + b
  E–      c        d        c + d
  Total   a + c    b + d    n

where a, b, c, d are the observed counts of individuals in each cell.

End of Study: Chi-squared Test; McNemar Test (for paired case-control study designs).
H0: No association between D and E.
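To make the 2 × 2 table concrete, the sketch below computes the odds ratio, the relative risk, and the Pearson chi-squared statistic (no continuity correction) from the four cell counts a, b, c, d. The function name `two_by_two` and the counts are hypothetical. It assumes every cell is nonzero, and the p-value relies on the fact that a chi-squared variable with 1 degree of freedom is a squared standard normal. Relative risk is normally reported only for cohort designs.

```python
import math


def two_by_two(a, b, c, d):
    """Odds ratio, relative risk, chi-squared statistic and p-value (1 df).

    Rows are E+ (a, b) and E- (c, d); columns are D+ and D-.
    """
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    relative_risk = (a / (a + b)) / (c / (c + d))
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of chi-squared with 1 df: P(Z^2 > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, relative_risk, chi2, p_value


# Hypothetical counts: 20 of 100 exposed and 10 of 100 unexposed have D.
odds_ratio, relative_risk, chi2, p_value = two_by_two(20, 80, 10, 90)
print(odds_ratio, relative_risk, chi2, p_value)
```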

As seen, testing for association between categorical variables, such as disease D and exposure E, can generally be done via a Chi-squared Test. But what if the two variables, say X and Y, are numerical measurements? Furthermore, if sample data do suggest that an association exists, what is the nature of that association, and how can it be quantified, or modeled via Y = f(X)?

Scatterplot of Y against X (JAMA. 2003;290:).
The Correlation Coefficient r measures the strength of linear association between X and Y: positive linear correlation or negative linear correlation.


For this example, r = –0.387 (weak, negative linear correlation).
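The sample correlation coefficient can be computed directly from its definition, as in the sketch below. The function name `pearson_r` and the data points are hypothetical and are not the data behind the r = –0.387 example.

```python
import math


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient between xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired measurements with a strong negative trend.
xs = [1, 2, 3, 4, 5]
ys = [5, 3, 4, 2, 1]
r = pearson_r(xs, ys)
print(r)
```

A value near –1 or +1 indicates a strong linear association; a value near 0 indicates little linear association.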

Regression Methods
Simple Linear Regression gives the "best" line that fits the data: we want the unique line that minimizes the sum of the squared residuals.

Simple Linear Regression gives the "least squares" regression line. For this example: Y = … – … X (p = .0055).
It can also be shown that the proportion of total variability in the data that is accounted for by the line is equal to r², which in this case = (–0.387)² ≈ 0.15 (15%)... very small.
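The least-squares line and r² have closed forms in terms of the sums of squares, sketched below. The function name `least_squares` and the data are hypothetical. The slope is Sxy / Sxx, the intercept makes the line pass through the point of means, and r² equals Sxy² / (Sxx · Syy).

```python
def least_squares(xs, ys):
    """Return (intercept, slope, r_squared) of the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Proportion of total variability in ys accounted for by the line.
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared


# Hypothetical data, not the data from the slides.
xs = [1, 2, 3, 4, 5]
ys = [5, 3, 4, 2, 1]
intercept, slope, r_squared = least_squares(xs, ys)
print(intercept, slope, r_squared)
```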

Numerical (Quantitative), e.g., $ Annual Income

[Figure: two population distributions of X, with means μ1, μ2 and standard deviations σ1, σ2; Sample 1 and Sample 2]

2 POPULATIONS: H0: μ1 = μ2

  Independent, e.g., RCT
    Normally distributed? (Q-Q plots, Shapiro-Wilk, Anderson-Darling, others…)
      Yes: Equivariance? (F-test, Bartlett, others…)
        Yes: 2-sample T (w/ pooling)
        No: 2-sample T (w/o pooling), i.e., Satterthwaite / Welch "Approximate" T
      No: "Nonparametric Tests": Wilcoxon Rank Sum (aka Mann-Whitney U)

  Paired (Matched), e.g., Pre- vs. Post-
    Normally distributed?
      Yes: Paired T
      No: "Nonparametric Tests": Sign Test, Wilcoxon Signed Rank

≥ 2 POPULATIONS:

  Independent
    ANOVA F-test, Regression Methods
    Nonparametric: Kruskal-Wallis
    Various modifications

  Paired (Matched)
    ANOVA F-test (w/ "repeated measures" or "blocking")
    Nonparametric: Friedman, Kendall's W, others…
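For the "w/o pooling" branch of the flowchart, the sketch below computes the Welch "approximate" t statistic together with the Satterthwaite degrees of freedom. The function name `welch_t` and the samples are hypothetical, and the sketch assumes two independent, roughly normal samples with possibly unequal variances.

```python
import math
import statistics


def welch_t(x, y):
    """Welch t statistic and Satterthwaite approximate degrees of freedom."""
    n1, n2 = len(x), len(y)
    # Each term is a sample variance divided by its sample size.
    v1 = statistics.variance(x) / n1
    v2 = statistics.variance(y) / n2
    t_stat = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df


# Hypothetical samples with clearly different spreads.
x = [10, 12, 14, 16, 18]
y = [1, 2, 3]
t_stat, df = welch_t(x, y)
print(t_stat, df)
```

The degrees of freedom are generally not an integer, which is why the slide calls this T test "approximate".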

Categorical (Qualitative), e.g., Income Level: Low, Mid, High

≥ 2 CATEGORIES per each of two variables: r × c contingency table
H0: "There is no association between (the categories of) I and (the categories of) J."

Chi-squared Tests
  Test of Independence (1 population, 2 categorical variables)
  Test of Homogeneity (2 populations, 1 categorical variable)
  "Goodness-of-Fit" Test (1 population, 1 categorical variable)

Modifications
  McNemar Test for paired 2 × 2 categorical data, to control for "confounding variables," e.g., case-control studies
  Fisher's Exact Test for small "expected values" (< 5), to avoid possible "spurious significance"
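When expected counts are small, Fisher's Exact Test replaces the chi-squared approximation with exact hypergeometric probabilities. The sketch below is one common two-sided version: it sums the probabilities of all tables with the same margins that are no more likely than the observed one. The function name `fisher_exact_two_sided` and the counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k,
        # given fixed row and column totals.
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        prob(k)
        for k in range(lowest, highest + 1)
        if prob(k) <= p_observed * (1 + 1e-9)
    )


# Hypothetical small table in which every expected count is below 5.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(p_value)
```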

Introduction to Basic Statistical Methods
Part 1: Statistics in a Nutshell
Part 2: Overview of Biostatistics: "Which Test Do I Use??"
UWHC Scholarly Forum, May 21, 2014
Ismor Fischer, Ph.D., UW Dept of Statistics

Sincere thanks to… Judith Payne, Heidi Miller, Samantha Goodrich, Troy Lawrence, and YOU!
All slides posted at