According to scientists, too much coffee may cause... 1986 --phobias, --panic attacks 1990 --heart attacks, --stress, --osteoporosis 1991 -underweight.

Presentation transcript:

According to scientists, too much coffee may cause...
1986: phobias, panic attacks
1990: heart attacks, stress, osteoporosis
1991: underweight babies, hypertension
higher cholesterol
1993, 08: miscarriages
intensified stress
delayed conception

But scientists say coffee also may help prevent...
asthma
colon and rectal cancer
Type II Diabetes (*6 cups per day!)
2006: alcohol-induced liver damage
2007: skin cancer

Coffee Chronicles BY MELISSA AUGUST, ANN MARIE BONARDI, VAL CASTRONOVO, MATTHEW JOE'S BLOWS
Last week researchers reported that coffee might help prevent Parkinson's disease. So is the caffeine bean good for you or not? Over the years, studies haven't exactly been clear:

Medical Studies. The General Idea… Evaluate whether a risk factor (or preventative factor) increases (or decreases) your risk for an outcome (usually disease, death, or an intermediary to disease). Exposure → Disease?

Association test between categorical variables

General 2x2 Table

                   Exposure (E)     No Exposure (~E)    Marginal probability of disease
Disease (D)        a                b                   (a+b)/T = P(D)
No Disease (~D)    c                d                   (c+d)/T = P(~D)

Marginal probability of exposure: (a+c)/T = P(E), (b+d)/T = P(~E)
T = a+b+c+d is the total sample size (N).

Risk Ratio (Relative Risk). The risk ratio is used to compare the risk for two groups. A risk ratio of 1 means there is no difference between the groups.

Coronary calcification is a process in which the interior lining of the coronary arteries develops a layer of hard substance known as plaque. Excessive amounts of cholesterol, fat, and waste material become calcified in arteries that have been weakened or damaged due to smoking, high blood pressure, diabetes, or a generally unhealthy diet. Coronary calcification restricts blood flow, presenting the risk of chronic chest pain, heart attacks, and eventual heart failure. Are depression and coronary calcification associated?

Difference of proportions Z-test: [2x2 table: coronary calcification (above vs. at or below a cutoff) by depression (any vs. none); cell counts not recoverable]

Or, use relative risk (risk ratio): compare the risk of depression in each group. [2x2 table: coronary calcification (above vs. at or below a cutoff) by depression (any vs. none); cell counts not recoverable] Interpretation: those with coronary calcification are 35% more likely to have depression (not significant). See how to get this in R.
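
The two comparisons above (difference of proportions and risk ratio) can be sketched in a few lines of Python. The cell counts below are an assumption: they are illustrative values chosen to agree with the margins that survive elsewhere in the slides (539, 81 and 1920), not counts copied from the original table. The variable names are also hypothetical.

```python
from math import sqrt, erf

# Illustrative counts (an assumption, not taken from the slide).
dep_high, n_high = 28, 539    # depressed / total in the higher-calcification group
dep_low, n_low = 53, 1381     # depressed / total in the lower-calcification group

risk_high = dep_high / n_high
risk_low = dep_low / n_low
risk_ratio = risk_high / risk_low

# Two-sample Z-test for a difference in proportions, using the pooled proportion.
pooled = (dep_high + dep_low) / (n_high + n_low)
se = sqrt(pooled * (1 - pooled) * (1 / n_high + 1 / n_low))
z = (risk_high - risk_low) / se


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(x / sqrt(2)))


# Two-sided p-value.
p_value = 2 * (1 - normal_cdf(abs(z)))

print(f"risk ratio = {risk_ratio:.2f}, Z = {z:.2f}, two-sided p = {p_value:.2f}")
```

With these assumed counts the risk ratio comes out near 1.35, in line with the "35% more likely" interpretation.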

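Or, use chi-square test: compare the observed counts with the counts expected under independence. [Tables of observed and expected counts: coronary calcification (above vs. at or below a cutoff) by depression (any vs. none); cell values not recoverable] Each expected count is (row total × column total) / N, e.g. 539*81/1920 = 22.7.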

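Chi-square test: Note: 1.77 = Z² (Z ≈ 1.33). For a 2x2 table, the chi-square statistic (1 degree of freedom) is the square of the Z statistic from the difference of proportions test.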
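
As a rough sketch of the chi-square calculation, the Python below builds expected counts from the row and column totals and sums (observed - expected)² / expected. The observed counts are an assumption: illustrative values chosen to match the margins 539, 81 and 1920 mentioned in the expected-count formula, not the study's actual table.

```python
# Illustrative observed counts (an assumption).
# Rows: higher / lower coronary calcification; columns: any depression / none.
observed = [[28, 511],
            [53, 1328]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: row total * column total / N.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-square statistic: sum over all cells.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

print(f"expected[0][0] = {expected[0][0]:.1f}, chi-square = {chi_square:.2f}")
```

The same loop works unchanged for a bigger RxC table, since nothing in it assumes two rows or two columns.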

Chi-square test also works for bigger contingency tables (RxC):

[Table: coronary calcification (rows) by depression status (columns: no depression, sub-threshold depressive symptoms, clinical depressive disorder); cell counts not recoverable]

Observed vs. expected: each expected count is (row total × column total) / 1920, using the column totals 1839 (no depression) and 45 (sub-threshold depressive symptoms). [Observed and expected cell values and the per-cell chi-square contributions are not recoverable.]

Chi-square test:

Cause and effect? Atherosclerosis and depression in the elderly. Possible links between them: biological changes? Lack of exercise? Poor eating?

Confounding? Advancing age may be related to both atherosclerosis and depression in the elderly. Possible links between them: biological changes? Lack of exercise? Poor eating?

Cross-Sectional Studies
Advantages:
– cheap and easy
– generalizable
– good for characteristics that (generally) don’t change, like genes or gender
Disadvantages:
– difficult to determine cause and effect
– problematic for rare diseases and exposures

2. Cohort studies: Sample on exposure status and track disease development (for rare exposures)  Marginal probabilities (and rates) of developing disease for exposure groups are valid.

Example: The Framingham Heart Study  The Framingham Heart Study was established in 1948, when 5209 residents of Framingham, Mass, aged 28 to 62 years, were enrolled in a prospective epidemiologic cohort study.  Health and lifestyle factors were measured (blood pressure, weight, exercise, etc.).  Interim cardiovascular events were ascertained from medical histories, physical examinations, ECGs, and review of interim medical record.

Example 2: Johns Hopkins Precursors Study (medical students 1948 through 1964). From the Johns Hopkins Magazine website (URL above).

Cohort Studies. [Diagram: Target population → disease-free cohort → Exposed and Not Exposed groups, each followed over TIME to Disease or Disease-free]

The Risk Ratio, or Relative Risk (RR)

                   Exposure (E)     No Exposure (~E)
Disease (D)        a                b
No Disease (~D)    c                d
Total              a+c              b+d

Risk to the exposed: a/(a+c). Risk to the unexposed: b/(b+d).

$$RR = \frac{a/(a+c)}{b/(b+d)}$$

Hypothetical Data [2x2 table: High Systolic BP vs. Normal BP by Congestive Heart Failure vs. No CHF; cell counts not recoverable]

Case-Control Studies Sample on disease status and ask retrospectively about exposures (for rare diseases)  Marginal probabilities of exposure for cases and controls are valid. Doesn’t require knowledge of the absolute risks of disease For rare diseases, can approximate relative risk

Case-Control Studies. [Diagram: from the Target population, select Disease (Cases) and No Disease (Controls); within each group, look back to see who was Exposed in the past and who was Not Exposed]

Example: the AIDS epidemic in the early 1980’s. Early case-control studies among AIDS cases and matched controls indicated that AIDS was transmitted by sexual contact or blood products. In 1982, an early case-control study matched AIDS cases to controls and found a positive association between amyl nitrites (“poppers”) and AIDS, with an odds ratio of 8.6 (Marmor et al. 1982). This is an example of confounding.

Case-Control Studies Examples  Case-control studies identified associations between lip cancer and pipe smoking (Broders 1920), breast cancer and reproductive history (Lane-Claypon 1926) and between oral cancer and pipe smoking (Lombard and Doering 1928). All rare diseases.  Case-control studies identified an association between smoking and lung cancer in the 1950’s.

Case-control example A study of the relation between body mass index and the incidence of age-related macular degeneration (Moeini et al. Br. J. Ophthalmol, 2005). Methods: Researchers compared 50 Iranian patients with confirmed age-related macular degeneration and 80 control subjects with respect to BMI, smoking habits, hypertension, and diabetes. The researchers were specifically interested in the relationship of BMI to age-related macular degeneration.

Corresponding 2x2 Table [Overweight vs. Normal by ARMD (row total 50) vs. No ARMD (row total 80); cell counts not recoverable] What is the risk ratio here? Tricky: There is no risk ratio, because we cannot calculate the risk of disease!!

The odds ratio… We cannot calculate a risk ratio from a case-control study. BUT, we can calculate a measure called the odds ratio…

Odds vs. Risk

If the risk is…     Then the odds are…
½ (50%)             1:1
¾ (75%)             3:1
1/10 (10%)          1:9
1/100 (1%)          1:99

Note: The odds are always higher than the corresponding probability when the probability is strictly between 0 and 100% (at 100% the odds are undefined).
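
The conversion behind the table is odds = risk / (1 - risk). A minimal Python sketch follows; the function names are hypothetical, and exact fractions are used so the odds print as ratios.

```python
from fractions import Fraction


def risk_to_odds(risk):
    """Odds = risk / (1 - risk); undefined when risk equals 1."""
    return risk / (1 - risk)


def odds_to_risk(odds):
    """Inverse conversion: risk = odds / (1 + odds)."""
    return odds / (1 + odds)


for risk in (Fraction(1, 2), Fraction(3, 4), Fraction(1, 10), Fraction(1, 100)):
    odds = risk_to_odds(risk)
    print(f"risk {risk} -> odds {odds.numerator}:{odds.denominator}")
```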

The Odds Ratio (OR)

The proportions of cases and controls are set by the investigator; therefore, they do not represent the risk (probability) of developing disease.

                   Exposure (E)     No Exposure (~E)
Disease (D)        a                b                   a+b = cases
No Disease (~D)    c                d                   c+d = controls

Odds of exposure in the cases: a/b. Odds of exposure in the controls: c/d.

$$OR = \frac{a/b}{c/d} = \frac{ad}{bc}$$

The Odds Ratio (OR)

                   Exposure (E)     No Exposure (~E)
Disease (D)        a                b
No Disease (~D)    c                d

$$OR = \frac{\text{odds of exposure for the cases}}{\text{odds of exposure for the controls}} = \frac{a/b}{c/d} = \frac{ad}{bc} = \frac{a/c}{b/d} = \frac{\text{odds of disease for the exposed}}{\text{odds of disease for the unexposed}}$$
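
To see that the two ways of forming the odds ratio agree, here is a small Python sketch. The cell counts are hypothetical, and the function names are made up for illustration; the letters follow the table layout (a, b in the disease row; c, d in the no-disease row).

```python
from math import isclose


def or_from_exposure_odds(a, b, c, d):
    """Odds of exposure in the cases (a/b) over odds of exposure in the controls (c/d)."""
    return (a / b) / (c / d)


def or_from_disease_odds(a, b, c, d):
    """Odds of disease in the exposed (a/c) over odds of disease in the unexposed (b/d)."""
    return (a / c) / (b / d)


def or_cross_product(a, b, c, d):
    """Cross-product form: ad / bc."""
    return (a * d) / (b * c)


# Hypothetical counts: 30 exposed cases, 70 unexposed cases,
# 10 exposed controls, 90 unexposed controls.
cells = (30, 70, 10, 90)
print(or_from_exposure_odds(*cells), or_from_disease_odds(*cells), or_cross_product(*cells))
```

All three functions return the same value (up to floating-point rounding), which is why a case-control study can estimate the odds ratio of disease even though it samples on disease status.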

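Proof via Bayes’ Rule (optional)

$$\frac{\text{odds of exposure in the cases}}{\text{odds of exposure in the controls}} = \frac{P(E \mid D)/P(\sim E \mid D)}{P(E \mid \sim D)/P(\sim E \mid \sim D)}$$

Apply Bayes’ Rule to each term, e.g. $P(E \mid D) = P(D \mid E)P(E)/P(D)$. The factors $P(E)$, $P(\sim E)$, $P(D)$ and $P(\sim D)$ cancel, leaving

$$\frac{P(D \mid E)/P(\sim D \mid E)}{P(D \mid \sim E)/P(\sim D \mid \sim E)} = \frac{\text{odds of disease in the exposed}}{\text{odds of disease in the unexposed}}$$

What we want!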

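The Odds Ratio (OR)

            Overweight   Normal
ARMD        a            b
No ARMD     c            d

$$OR = \frac{\text{odds of overweight for the cases}}{\text{odds of overweight for the controls}} = \frac{a/b}{c/d} = \frac{ad}{bc} = \frac{a/c}{b/d} = \frac{\text{odds of ARMD for the overweight}}{\text{odds of ARMD for the normal weight}}$$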

The Odds Ratio (OR) [2x2 table: Overweight vs. Normal by ARMD vs. No ARMD; cell counts not recoverable] Can be interpreted as: overweight people have a 43% decrease in their ODDS of age-related macular degeneration, i.e. OR = 0.57 (not statistically significant here).

The odds ratio is a good approximation of the risk ratio if the disease is rare. If the disease is rare (affecting <10% of the population), then:

$$OR \approx RR$$

WHY? If the disease is rare, the probability of it NOT happening is close to 1, and the odds is close to the risk. E.g., a risk of 1/100 corresponds to odds of 1:99, or about 0.0101.

Summary of statistical tests for contingency tables

Table size     Test or measures of association
2x2            risk ratio (cohort or cross-sectional studies)
               odds ratio (case-control studies)
               Chi-square
               difference in proportions
               Fisher’s Exact test (cell size less than 5)
RxC            Chi-square
               Fisher’s Exact test (expected cell size <5)

Fisher’s Exact Test

Who is Fisher? Ronald Aylmer Fisher (17 February 1890 – 29 July 1962) was an English statistician, evolutionary biologist, geneticist, and eugenicist. Fisher is known as one of the chief architects of the neo-Darwinian synthesis, for his important contributions to statistics, including the analysis of variance (ANOVA), method of maximum likelihood, fiducial inference, and the derivation of various sampling distributions, and for being one of the three principal founders of population genetics. Anders Hald called him "a genius who almost single-handedly created the foundations for modern statistical science", while Richard Dawkins named him "the greatest biologist since Darwin".

Fisher’s “Tea-tasting experiment” Claim: Fisher’s colleague, Dr. Muriel Bristol, claimed that, when drinking tea, she could distinguish whether milk or tea was added to the cup first. To test her claim, Fisher designed an experiment in which she tasted 8 cups of tea (4 cups had milk poured first, 4 had tea poured first). Null hypothesis: Dr. Bristol’s guessing abilities are no better than chance. Alternative hypotheses: Right-tail: She guesses right more than expected by chance. Left-tail: She guesses wrong more than expected by chance.

Fisher’s “Tea-tasting experiment” Experimental Results:

                   Guess poured first
Poured first       Milk     Tea     Total
Milk               3        1       4
Tea                1        3       4

Fisher’s Exact Test Step 1: Identify tables that are as extreme or more extreme than what actually happened. Here she identified 3 out of 4 of the milk-poured-first teas correctly. Is that good luck or real talent? The only way she could have done better is if she identified 4 of 4 correctly.

Observed table:
                   Guess poured first
Poured first       Milk     Tea     Total
Milk               3        1       4
Tea                1        3       4

More extreme table:
                   Guess poured first
Poured first       Milk     Tea     Total
Milk               4        0       4
Tea                0        4       4

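Fisher’s Exact Test Step 2: Calculate the probability of the tables (assuming fixed marginals).

Observed table (3, 1 / 1, 3):

$$P = \frac{\binom{4}{3}\binom{4}{1}}{\binom{8}{4}} = \frac{16}{70} = .229$$

More extreme table (4, 0 / 0, 4):

$$P = \frac{\binom{4}{4}\binom{4}{0}}{\binom{8}{4}} = \frac{1}{70} = .014$$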

Step 3: To get the left-tail and right-tail p-values, consider the probability mass function of X, where X = the number of correct identifications of the cups with milk poured first.
“Right-hand tail probability”: p = .243
“Left-hand tail probability” (testing the alternative hypothesis that she’s systematically wrong): p = .986
R also gives a “two-sided p-value”, which is calculated by adding up all probabilities in the distribution that are less than or equal to the probability of the observed table (“equal or more extreme”). Here: .229 + .014 + .229 + .014 = .4857
See R code in file 2by2table.R on my website.
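
The three p-values can be reproduced from the hypergeometric probabilities of each possible table. The Python below is a minimal sketch using only the standard library; it is not the 2by2table.R code mentioned on the slide, and the function and variable names are hypothetical.

```python
from math import comb


def table_probability(k, milk_first=4, tea_first=4, guessed_milk=4):
    """P(X = k): k of the milk-first cups are correctly called 'milk',
    with all row and column totals held fixed (hypergeometric)."""
    total = milk_first + tea_first
    return comb(milk_first, k) * comb(tea_first, guessed_milk - k) / comb(total, guessed_milk)


# Probability mass function over X = 0..4 correct milk-first identifications.
pmf = {k: table_probability(k) for k in range(5)}
observed = 3

right_tail = sum(p for k, p in pmf.items() if k >= observed)
left_tail = sum(p for k, p in pmf.items() if k <= observed)
# Two-sided: add every table whose probability is no larger than the observed one
# (the small tolerance guards against floating-point ties).
two_sided = sum(p for p in pmf.values() if p <= pmf[observed] + 1e-12)

print(f"right tail = {right_tail:.3f}, left tail = {left_tail:.3f}, two-sided = {two_sided:.4f}")
```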


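The rare disease assumption

$$OR = \frac{P(D \mid E)/P(\sim D \mid E)}{P(D \mid \sim E)/P(\sim D \mid \sim E)} \approx \frac{P(D \mid E)}{P(D \mid \sim E)} = RR$$

When a disease is rare: $P(\sim D) = 1 - P(D) \approx 1$, so both $P(\sim D \mid E)$ and $P(\sim D \mid \sim E)$ are approximately 1.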

The odds ratio vs. the risk ratio [Figure: positions of the odds ratio and the risk ratio relative to 1.0 (null), shown for a Rare Outcome and for a Common Outcome]

When is the OR a good approximation of the RR? General Rule of Thumb: “OR is a good approximation as long as the probability of the outcome in the unexposed is less than 10%.” Prevalence of age-related macular degeneration is about 6.5% in people over 40 in the US (according to a 2011 estimate). So, the OR is a reasonable approximation of the RR.
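
A quick numerical check of the rule of thumb, as a Python sketch. The risks below are hypothetical and chosen so that the risk ratio is 2 in both scenarios; only the baseline risk differs.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)


def rr_and_or(risk_exposed, risk_unexposed):
    """Return (risk ratio, odds ratio) for a pair of risks."""
    rr = risk_exposed / risk_unexposed
    oratio = odds(risk_exposed) / odds(risk_unexposed)
    return rr, oratio


# Hypothetical risks: rare outcome (1% in the unexposed) vs. common outcome (30%).
rare_rr, rare_or = rr_and_or(0.02, 0.01)
common_rr, common_or = rr_and_or(0.60, 0.30)

print(f"rare outcome:   RR = {rare_rr:.2f}, OR = {rare_or:.2f}")
print(f"common outcome: RR = {common_rr:.2f}, OR = {common_or:.2f}")
```

In the rare scenario the odds ratio is close to the risk ratio; in the common scenario it is noticeably farther from 1.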

Advantages/Limitations: Case-control studies
Advantages:
– Cheap and fast
– Efficient for rare diseases
Disadvantages:
– Getting comparable controls is often tricky
– Temporality is a problem (did risk factor cause disease or disease cause risk factor?)
– Recall bias

Inferences about the odds ratio…

Properties of the OR (simulation) (50 cases/50 controls/20% exposed) If the Odds Ratio=1.0 then with 50 cases and 50 controls, of whom 20% are exposed, this is the expected variability of the sample OR  note the right skew
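
A simulation along these lines can be sketched in Python. This is an assumption about how such a simulation might be set up, not the code behind the slide: exposure is drawn independently with probability 0.20 for each of 50 cases and 50 controls (so the true odds ratio is 1.0), and tables with an empty cell are skipped because the sample OR is undefined there. The function name, seed and number of repetitions are arbitrary.

```python
import random
from statistics import mean, median


def simulate_sample_ors(n_sims=5000, n_cases=50, n_controls=50, p_exposed=0.20, seed=1):
    """Simulate sample odds ratios when the true OR is 1.0."""
    rng = random.Random(seed)
    sample_ors = []
    for _ in range(n_sims):
        a = sum(rng.random() < p_exposed for _ in range(n_cases))     # exposed cases
        c = sum(rng.random() < p_exposed for _ in range(n_controls))  # exposed controls
        b, d = n_cases - a, n_controls - c                            # unexposed cases / controls
        if 0 in (a, b, c, d):
            continue  # sample OR undefined when any cell is empty
        sample_ors.append((a * d) / (b * c))
    return sample_ors


sample_ors = simulate_sample_ors()
print(f"median = {median(sample_ors):.2f}, mean = {mean(sample_ors):.2f}, max = {max(sample_ors):.2f}")
```

A mean above the median is one simple sign of the right skew; taking logs of the simulated values gives a much more symmetric distribution.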

Properties of the lnOR

$$\text{Standard deviation} = \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}}$$

Hypothetical Data [2x2 table: Amyl Nitrite Use vs. No Amyl Nitrite by AIDS vs. Does not have AIDS. AIDS row: 20, 10; counts for the second row not recoverable] Note that the size of the smallest 2x2 cell determines the magnitude of the variance.

When can the OR mislead?

Example: Does dementia predict death? Dementia: The leading predictor of death in a defined elderly population. Neurology 2004; 62: Among patients with dementia: 291/355 (82%) died Among patients without dementia: 947/4328 (22%) died

Dementia study. Authors report OR = 16.23 (95% CI: 12.27, 21.48). But the RR = 3.72. Fortunately, they do not dwell on the OR, but it could mislead if not interpreted correctly…
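
The gap between the two measures can be checked directly from the counts quoted on the previous slide (291/355 deaths with dementia, 947/4328 without). The Python below is a sketch with hypothetical variable names; the confidence interval uses the usual large-sample formula on the log odds ratio, which is assumed here to be what the authors used.

```python
from math import exp, log, sqrt

# Counts from the dementia example.
died_dementia, total_dementia = 291, 355
died_no_dementia, total_no_dementia = 947, 4328

survived_dementia = total_dementia - died_dementia
survived_no_dementia = total_no_dementia - died_no_dementia

risk_ratio = (died_dementia / total_dementia) / (died_no_dementia / total_no_dementia)
odds_ratio = (died_dementia * survived_no_dementia) / (survived_dementia * died_no_dementia)

# Standard error of ln(OR): square root of the sum of reciprocal cell counts.
se_ln_or = sqrt(
    1 / died_dementia + 1 / survived_dementia
    + 1 / died_no_dementia + 1 / survived_no_dementia
)
ci_low = exp(log(odds_ratio) - 1.96 * se_ln_or)
ci_high = exp(log(odds_ratio) + 1.96 * se_ln_or)

print(f"RR = {risk_ratio:.2f}, OR = {odds_ratio:.2f} (95% CI {ci_low:.2f}, {ci_high:.2f})")
```

The interval from this formula agrees with the reported (12.27, 21.48). The unrounded risk ratio comes out slightly above the 3.72 shown on the slide, which appears to be based on the rounded percentages 82% and 22%.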

Better to give OR or RR? From an RCT (prospective!) of a new diet drug, the authors showed the following table: Odds Ratios for losing at least 5kg were: 4.0 (low dose vs. placebo) 20.9 (medium dose vs. placebo) 31.5 (high dose vs. placebo)

Better to give OR or RR? Corresponding RRs are: 59%/29%=2 (low dose vs. placebo) 87%/29%=3 (medium dose vs. placebo) 91%/29%=3 (high dose vs. placebo)