Marshall University School of Medicine Department of Biochemistry and Microbiology BMS 617 Lecture 8 – Comparing Proportions Marshall University Genomics.

Presentation transcript:

Marshall University School of Medicine Department of Biochemistry and Microbiology BMS 617 Lecture 8 – Comparing Proportions Marshall University Genomics Core Facility

Types of independent and dependent variables
Last lecture examined t-tests
A (two-class) t-test is applicable when
  – The dependent (outcome) variable is a continuous, interval variable
  – The independent (input) variable is a nominal (or ordinal) variable with two possible values
For example, in the GRHL2 comparison, the independent variable was “Basal type” (with values “Basal A” and “Basal B”) and the dependent variable was “log2 expression”

Independent and dependent variables that are both nominal
Another class of tests involves independent and dependent variables that are both nominal
Very common in clinical studies
  – Independent: treated vs not treated
  – Dependent: disease vs no disease

Contingency Tables
In such experiments, data is usually presented in a contingency table
Shows how the value of the dependent variable is contingent on the independent variable
Aim is to compare the proportions between the two groups: is A/(A+B) different to C/(C+D)?

| | Disease | No disease | Total |
|---|---|---|---|
| Exposed to risk or treated | A | B | A+B |
| Not exposed to risk, or not treated | C | D | C+D |
| Total | A+C | B+D | N=A+B+C+D |

Types of study
There are four different study designs that lead to data presented in contingency tables:
  – Cross-sectional studies
    Sample is selected at random from the population. Sample is then divided into two groups depending on prior treatment or exposure to risk factor. Disease prevalence is compared in each group.
  – Prospective (longitudinal) studies
    Two groups are selected: one with exposure to risk factor (or treated) and one without. Groups are then followed over time to see how many develop disease in each group.
  – Experimental studies
    Samples are selected and divided randomly into two groups. One group receives treatment (or is exposed to risk); the other does not. Incidence of disease is compared in each group.
  – Case-control studies
    Two groups of samples are selected: one with the disease (cases) and one without (controls). Each group is examined to see how many were treated or exposed to risk prior to the study.

Example experimental study
Study from Frye et al (1996, NEJM)
  – Compared two treatments for coronary artery disease: CABG and PTCA
1829 patients in study, randomly assigned to CABG or PTCA
Outcome is 5-year survival

| | Survived 5 years | Did not survive 5 years | Total |
|---|---|---|---|
| CABG | 542 | 372 | 914 |
| PTCA | 537 | 378 | 915 |
| Total | 1079 | 750 | 1829 |

Calculations for Frye data
The risk for the CABG group is 372/914=40.7%
The risk for the PTCA group is 378/915=41.3%
The relative risk (for the PTCA group compared to the CABG group) is 41.3%/40.7%=1.01
  – The PTCA group is 1.01 times as likely not to survive 5 years as the CABG group
  – The 95% confidence interval for the relative risk is [...] to [...]
There is little difference between the risk in these groups
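The risk and relative risk arithmetic for the Frye data can be reproduced in a few lines. The sketch below is illustrative only: the function name `relative_risk` is hypothetical, and the confidence interval uses the common log-based (Katz) approximation, which is an assumption and may not be the method used to produce the interval on the slide.

```python
import math

def relative_risk(events_a, total_a, events_b, total_b, z=1.96):
    """Return (risk_a, risk_b, rr, ci_low, ci_high), where rr = risk_a / risk_b."""
    risk_a = events_a / total_a
    risk_b = events_b / total_b
    rr = risk_a / risk_b
    # Standard error of log(RR), log (Katz) method: an assumed choice of CI method.
    se_log = math.sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    ci_low = math.exp(math.log(rr) - z * se_log)
    ci_high = math.exp(math.log(rr) + z * se_log)
    return risk_a, risk_b, rr, ci_low, ci_high

# PTCA relative to CABG; the "event" is not surviving 5 years.
risk_ptca, risk_cabg, rr, ci_low, ci_high = relative_risk(378, 915, 372, 914)
print(f"PTCA risk {risk_ptca:.1%}, CABG risk {risk_cabg:.1%}, relative risk {rr:.2f}")
print(f"approximate 95% CI for the relative risk: {ci_low:.2f} to {ci_high:.2f}")
```

An interval that contains 1 is consistent with the conclusion that there is little difference between the two groups.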

Frye data for diabetic patients
Frye et al. also examined a subgroup of their patients who had diabetes
  – Still an experimental study
    Controlled which patients received which treatment, observed outcome

| Diabetic patients | Survived 5 years | Did not survive 5 years | Total |
|---|---|---|---|
| CABG | 93 | 87 | 180 |
| PTCA | 69 | 104 | 173 |
| Total | 162 | 191 | 353 |

Risk and relative risk for diabetic patients
Risk for CABG group is 87/180=48.3%
Risk for PTCA group is 104/173=60.1%
Relative risk for PTCA group (relative to CABG group) is 60.1%/48.3%=1.244
  – For diabetic patients, the risk of dying within 5 years if you receive PTCA is 1.244 times the risk of dying within 5 years if you receive CABG
  – 95% confidence interval is [...] to [...]

Another example: AZT and HIV
Cooper et al. (via Motulsky):
  – Study of the effectiveness of AZT in preventing HIV from developing into AIDS
  – Studied 936 patients, randomly treated either with AZT or with a placebo
  – After three years, compared the proportion for whom the disease had progressed to AIDS

| | Disease progressed | No progression | Total |
|---|---|---|---|
| AZT | 76 | 399 | 475 |
| Placebo | 129 | 332 | 461 |
| Total | 205 | 731 | 936 |

Risk and relative risk for AZT
We can perform the same analyses as before:
Risk for AZT group is 76/475=16%
Risk for placebo group is 129/461=28%
Relative risk for AZT group is 16%/28%=0.57
The AZT group is 0.57 times as likely to experience disease progression in three years as the placebo group
95% confidence interval for relative risk is [...] to [...]

The attributable risk
The difference between the two incidence rates is called the attributable risk:
Attributable risk = 28%-16% = 12%
  – A risk of 12% (of the disease progressing in three years) is attributable to not taking AZT
  – 95% confidence interval for attributable risk is from 6.68% to 17.28%
Attributable risk is an intuitively useful value when studying risk factors

Number Needed to Treat (NNT)
Since the attributable risk is 12%, we can interpret this as:
  – Of those who didn’t receive AZT, 12% (about 1 in 8) progressed to the full disease in 3 years because they didn’t receive AZT
    Another 16% also progressed to the disease, but they would have done so anyway…
Another way to look at this is that, for every 8 people or so who are treated, one is prevented from progressing to the full disease
The Number Needed to Treat (NNT) is the reciprocal of the attributable risk
  – NNT = 1/0.1198 = 8.35
  – On average, 8.35 people need to be treated with AZT to prevent 1 from progressing to the full disease in 3 years
  – The 95% confidence interval is computed from the 95% CI for the attributable risk:
  – 1/0.1728 to 1/0.0668, or 5.79 to 14.97
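As a check on the arithmetic, the attributable risk and NNT can be computed directly from the AZT counts. This is a minimal sketch: the variable names are illustrative, and the confidence limits for the attributable risk are taken as given from the slide rather than recomputed.

```python
# Counts from the AZT study: progressed / total in each arm.
risk_placebo = 129 / 461
risk_azt = 76 / 475

# Attributable risk: difference between the two incidence rates.
attributable_risk = risk_placebo - risk_azt

# Number needed to treat: reciprocal of the attributable risk.
nnt = 1 / attributable_risk

# 95% CI for the attributable risk as quoted on the slide (not recomputed here).
ar_low, ar_high = 0.0668, 0.1728
# Taking reciprocals swaps the ends of the interval.
nnt_low, nnt_high = 1 / ar_high, 1 / ar_low

print(f"attributable risk = {attributable_risk:.4f}")
print(f"NNT = {nnt:.2f} (95% CI {nnt_low:.2f} to {nnt_high:.2f})")
```

The unrounded attributable risk is slightly under 12%, which is why the NNT comes out at about 8.35 rather than 1/0.12.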

Significance tests for contingency tables
The confidence intervals for relative risk and/or attributable risk provide plenty of information about the differences between proportions in the contingency table
If needed, a p-value can also be provided
The p-value is associated with a null hypothesis
  – The proportion of “positive” outcomes is independent of the treatment group

Fisher’s Exact Test and Chi-squared tests
The best statistical test for a contingency table is Fisher’s Exact Test
For large numbers (very large numbers), this test is computationally prohibitive
In this case, a Chi-squared test can be used as an approximation
  – Historically, Chi-squared tests were always used, but increased computing power makes this unnecessary

p-value for the CABG-PTCA study
For the Frye et al. study, the null hypothesis is: The 5-year survival rate is the same for those treated with PTCA as for those treated with CABG
Using Fisher’s Exact test for these data, we get p=0.8122
  – Assuming there is no difference in the survival rates between those treated with PTCA and those treated with CABG, there is an 81.22% chance of seeing a difference at least as big as the one observed

p-value for Frye’s data with diabetic patients
For the restriction to diabetic patients, the p-value, by Fisher’s exact test, is 0.0325
Assuming the survival rate for diabetic patients is the same for those treated with PTCA as for those treated with CABG, there is a 3.25% chance of seeing a difference at least as large as the one observed
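For readers who want to see what Fisher’s exact test actually computes, here is a standard-library sketch. It assumes the usual two-sided definition (sum the hypergeometric probabilities of every table, with the same margins, that is no more probable than the observed one); the function name `fisher_exact_two_sided` is hypothetical, and statistical packages may differ in small details, so the results should be close to, but not necessarily identical with, the figures quoted in the slides (about 81% and 3.25%).

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(k):
        # Hypergeometric probability of k in the top-left cell with margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_observed = prob(a)
    total = 0.0
    for k in range(max(0, col1 - row2), min(row1, col1) + 1):
        p_k = prob(k)
        # Count tables at most as probable as the observed one (small tolerance for rounding).
        if p_k <= p_observed * (1 + 1e-7):
            total += p_k
    return min(1.0, total)

# Rows: CABG, PTCA. Columns: did not survive 5 years, survived 5 years.
p_all = fisher_exact_two_sided(372, 914 - 372, 378, 915 - 378)
p_diabetic = fisher_exact_two_sided(87, 180 - 87, 104, 173 - 104)
print(f"all patients: p = {p_all:.4f}")
print(f"diabetic patients: p = {p_diabetic:.4f}")
```

The exact integer binomial coefficients keep the calculation simple at these sample sizes; for much larger tables a library routine would be the more practical choice.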

Case-control studies
In a case-control study, the investigator selects two groups of subjects:
  – One group with the disease (or outcome of interest)
  – One group without
Compare this to a prospective or experimental study, where the investigator controls the groups based on the independent variable (treatment or risk factor)
In a case-control study, the investigator then looks back within each group to see how many were exposed to the risk factor or treatment
  – Sometimes called a “retrospective study”

Example: cholera vaccine
Example (Lucas et al. via Motulsky)
Performed a case-control study to measure the effectiveness of a vaccine for cholera
The scientifically ideal experiment is to recruit subjects, randomly give half the vaccine and half a placebo, and follow them to see how many in each group develop cholera
  – Study would take many years
  – Unethical if you believe the vaccine works
Instead, investigators recruited a group of 43 subjects who had contracted cholera and 172 who had not, and compared how many had been vaccinated in each group

Lucas et al study

| | Cases (cholera) | Controls |
|---|---|---|
| Vaccinated | 10 | 94 |
| Not vaccinated | 33 | 78 |
| Total | 43 | 172 |

Note that in this study, investigators control the column totals
In the previous examples, investigators control the row totals
This makes an important difference to the interpretation of calculations

Relative risk is meaningless in case-control studies
It makes no sense to compute the risk or relative risk in case-control studies
The risk is the number affected in each group divided by the total in each group
  – In a case-control study, this is determined merely by the choice of the investigator as to how many subjects to place in each group

Odds ratios
Results of a case-control study are summarized as an odds ratio
  – In our example, for the cholera group, the odds of being vaccinated are the number vaccinated divided by the number not vaccinated: 10/33 = 0.303
  – The odds of being vaccinated for the controls are 94/78 = 1.205
The odds ratio is the ratio of the odds: 0.303/1.205 = 0.25
The odds of having been vaccinated for a cholera victim are 0.25 times the odds of having been vaccinated for a control
  – The 95% confidence interval for this odds ratio is [...] to [...]
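The odds ratio calculation for the cholera data can be sketched as follows. The function name `odds_ratio` is hypothetical, and the confidence interval uses the log-based (Woolf) approximation, which is an assumption and may not match the method behind the interval on the slide.

```python
import math

def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls, z=1.96):
    """Return (odds_cases, odds_controls, ratio, ci_low, ci_high) for a 2x2 case-control table."""
    odds_cases = exposed_cases / unexposed_cases
    odds_controls = exposed_controls / unexposed_controls
    ratio = odds_cases / odds_controls
    # Standard error of log(OR), Woolf method: an assumed choice of CI method.
    se_log = math.sqrt(
        1 / exposed_cases + 1 / unexposed_cases
        + 1 / exposed_controls + 1 / unexposed_controls
    )
    ci_low = math.exp(math.log(ratio) - z * se_log)
    ci_high = math.exp(math.log(ratio) + z * se_log)
    return odds_cases, odds_controls, ratio, ci_low, ci_high

# "Exposed" here means vaccinated: 10 of 43 cases and 94 of 172 controls.
odds_cases, odds_controls, ratio, ci_low, ci_high = odds_ratio(10, 33, 94, 78)
print(f"odds (cases) {odds_cases:.3f}, odds (controls) {odds_controls:.3f}")
print(f"odds ratio {ratio:.3f}, approximate 95% CI {ci_low:.2f} to {ci_high:.2f}")
```

Unlike the risk, the odds ratio does not depend on how many cases and controls the investigator chose to recruit, which is why it is the summary of choice for this design.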

Odds ratio and relative risk
If the disease (or other outcome) is rare, then the odds ratio is an approximation to the relative risk
  – Rare means less than about 10% of the population
So, if we assume cholera is rare in this population, vaccinated individuals have about 25% of the risk of getting cholera that unvaccinated individuals have

Statistical test for case-control studies
The statistical test used for case-control studies is also Fisher’s exact test
  – The null hypothesis in our example is that the proportion who received the vaccine is the same for those with cholera as for those without
  – Fisher’s exact test gives p=0.0003 in this case
  – So if there is no difference in the proportion who received the vaccine between those with and those without cholera, the chance of seeing data showing at least as strong a relationship between the two due to sampling alone would be 0.0003 (or 0.03%)

Fisher’s exact test and Chi-squared tests
The best test to use to produce a p-value for a contingency table is Fisher’s exact test
Because this is a computationally intensive test, historically Chi-squared tests were used as an approximation
  – A “standard” chi-squared test will give a lower p-value than is accurate
    Potentially much lower if the sample size is small
  – A correction is available for the chi-squared test, called the “Yates continuity correction”, which generally gives a higher p-value than is correct
Use Fisher’s exact test
  – If you are forced to use the Chi-squared test, use the Yates continuity correction
    For decent sample sizes it makes little difference
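To illustrate the effect of the Yates correction, the sketch below computes the chi-squared statistic for a 2×2 table with and without the correction, using the diabetic subgroup counts (87/180 and 104/173 not surviving). The function name `chi_squared_2x2` is hypothetical; the p-value uses the identity that, for one degree of freedom, the upper tail of the chi-squared distribution equals erfc(sqrt(x/2)).

```python
import math

def chi_squared_2x2(a, b, c, d, yates=False):
    """Chi-squared statistic and p-value (1 degree of freedom) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Yates continuity correction: shrink the discrepancy by n/2 (never below zero).
        diff = max(0.0, diff - n / 2)
    stat = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-squared variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Rows: CABG, PTCA. Columns: did not survive 5 years, survived 5 years.
stat_plain, p_plain = chi_squared_2x2(87, 93, 104, 69)
stat_yates, p_yates = chi_squared_2x2(87, 93, 104, 69, yates=True)
print(f"uncorrected: chi2 = {stat_plain:.2f}, p = {p_plain:.4f}")
print(f"Yates:       chi2 = {stat_yates:.2f}, p = {p_yates:.4f}")
```

For these data the uncorrected p-value falls below, and the Yates-corrected p-value slightly above, the 0.0325 reported from Fisher’s exact test, matching the pattern described on the slide.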