7 Regression & Correlation: Rates Basic Medical Statistics Course October 2010 W. Heemsbergen.


7 Regression & Correlation: Rates Basic Medical Statistics Course October 2010 W. Heemsbergen

Event rate
Event rate: the rate at which the event occurs per subject per period of time.
Rate = number of events occurring / cumulative units of time*
* In clinical research: person-years (the total number of years of follow-up for all individuals).
Time should be counted only while information about possible events is available and the subject is at risk.
One count per subject (e.g. onset of cancer) or several counts (e.g. bloody nose) are possible.

Event rate
Incidence rate: number of new cases per time period.
Mortality rate: number of deaths per time period.
A small rate is usually re-expressed, for instance as the rate per 1000 person-years.
Subject 1: 10 years, event (+)
Subject 2: 6 years, no event (-)
Subject 3: 5 years, no event (-)
Subject 4: 1 year, event (+)
2 events / 22 years: event rate = 0.09 per person-year (or about 90 per 1000 p-y).
If we are only interested in first events (e.g. diagnosis of breast cancer), the follow-up must cease at the time point of the (first) event.
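The calculation on this slide can be sketched in a few lines of Python. This is a minimal illustration, reading the slide's follow-up data as four subjects with 10, 6, 5 and 1 years of follow-up, of whom the first and last had an event; the function name `event_rate` and the record layout are assumptions made for the example.

```python
# Follow-up records as read from the slide: (years of follow-up, event observed?)
follow_up = [(10, True), (6, False), (5, False), (1, True)]

def event_rate(records):
    """Return the number of events per person-year for (years, event) records."""
    events = sum(1 for _, had_event in records if had_event)
    person_years = sum(years for years, _ in records)
    return events / person_years

rate = event_rate(follow_up)
print(f"{rate:.2f} per person-year")                 # 0.09 per person-year
print(f"{rate * 1000:.0f} per 1000 person-years")    # 91 per 1000 person-years
```

The unrounded rate gives about 91 per 1000 person-years; the slide's figure of 90 follows from rounding the rate to 0.09 first.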

Relative rate
Relative rate (rate ratio, incidence rate ratio) = rate in the exposed group / rate in the unexposed group
A relative rate equal to 1 indicates that the rate is the same in the two groups.
A relative rate > 1 indicates that the rate is higher in the exposed group.
A relative rate < 1 indicates that the rate is lower in the exposed group.
In most situations in cancer research, a relative rate is interpreted in the same way as the relative risk and the odds ratio.

Standardization
A comparison between 2 rates can be misleading or inadequate. Comparing the (crude) mortality rate (number of deaths per 1000 person-years) of 2 countries is misleading when country A has a relatively young population and country B a relatively old population (e.g. an African vs. a European country).
Solution 1: age-specific death rates (a rate calculated for each age category).
Solution 2: standardization of the mortality rate, using a standard population.
Solution 3: recalculate (adjust) the rate of population A, using the age structure of population B.
Standardized mortality/death rate: a standard population with a fixed age structure is introduced. The mortality of any population is then adjusted for discrepancies in age structure between the standard and the specific population.
Factors often used in standardization: calendar year, age, gender, ethnicity.
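A small numerical sketch may help to show why crude rates mislead and what standardization to a standard population (solution 2) does. All numbers below are hypothetical and chosen only for illustration: two populations with identical age-specific death rates but different age structures, and an invented standard population. The function name `weighted_rate` is likewise an assumption.

```python
# Hypothetical age-specific death rates (deaths per 1000 person-years).
# Both populations have the SAME age-specific rates.
rates_a = {"0-39": 1.0, "40-64": 5.0, "65+": 40.0}
rates_b = {"0-39": 1.0, "40-64": 5.0, "65+": 40.0}

# Hypothetical age structures (person-years per age category).
structure_a = {"0-39": 700, "40-64": 250, "65+": 50}    # young population
structure_b = {"0-39": 400, "40-64": 350, "65+": 250}   # old population
standard = {"0-39": 500, "40-64": 300, "65+": 200}      # standard population

def weighted_rate(rates, structure):
    """Average the age-specific rates, weighted by the given age structure."""
    total = sum(structure.values())
    return sum(rates[group] * structure[group] for group in rates) / total

crude_a = weighted_rate(rates_a, structure_a)   # each population's own structure
crude_b = weighted_rate(rates_b, structure_b)
std_a = weighted_rate(rates_a, standard)        # common standard structure
std_b = weighted_rate(rates_b, standard)

print(f"crude:        A = {crude_a:.2f}, B = {crude_b:.2f} per 1000 p-y")
print(f"standardized: A = {std_a:.2f}, B = {std_b:.2f} per 1000 p-y")
```

With these invented numbers the crude rates differ by a factor of about three, while the standardized rates are equal, because the only difference between the two populations is their age structure.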

Rate vs. Risk
Rate: total number of events / person-years of follow-up.
Risk: total number of events / number of individuals exposed (a probability between 0 and 1 for first events).
Risk is calculated for a certain interval of time and may differ for longer or shorter intervals.
When the follow-up differs from person to person, rates are preferred.

Example
Immediate risk of suicide and cardiovascular death after a prostate cancer diagnosis.
BACKGROUND: Receiving a cancer diagnosis is a stressful event that may increase risks of suicide and cardiovascular death, especially soon after diagnosis.
METHODS: We conducted a cohort study of 342,497 patients diagnosed with prostate cancer from January 1, 1979, through December 31, 2004, in the Surveillance, Epidemiology, and End Results Program. Follow-up started from the date of prostate cancer diagnosis to the end of first 12 calendar months after diagnosis. The relative risks of suicide and cardiovascular death were calculated as standardized mortality ratios (SMRs) comparing corresponding incidences among prostate cancer patients with those of the general US male population, with adjustment for age, calendar period, and state of residence. We compared risks in the first year and months after a prostate cancer diagnosis. The analyses were further stratified by calendar period at diagnosis, tumor characteristics, and other variables.
J Natl Cancer Inst. 2010;102:

Example
RESULTS: During follow-up, 148 men died of suicide (mortality rate = 0.5 per 1000 person-years) and 6845 died of cardiovascular diseases (mortality rate = 21.8 per 1000 person-years). Patients with prostate cancer were at increased risk of suicide during the first year (SMR = 1.4, 95% confidence interval [CI] = 1.2 to 1.6), especially during the first 3 months (SMR = 1.9, 95% CI = 1.4 to 2.6), after diagnosis. The elevated risk was apparent in pre-prostate-specific antigen (PSA) and peri-PSA eras but not since PSA testing has been widespread. The risk of cardiovascular death was slightly elevated during the first year (SMR = 1.09, 95% CI = 1.06 to 1.12), with the highest risk in the first month (SMR = 2.05, 95% CI = 1.89 to 2.22), after diagnosis. The first-month risk was statistically significantly elevated during the entire study period.
CONCLUSION: A diagnosis of prostate cancer may increase the immediate risks of suicide and cardiovascular death.

Question
SMR (standardized mortality ratio) = 1.09 (cardiovascular death).
A group of men is diagnosed with prostate cancer. Based on statistics of the general male population, the baseline risk of cardiovascular death (without a prostate cancer diagnosis) is 0.8% for the coming year.
How many men in this group are expected to die from cardiovascular disease in the coming year?
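One way to approach this question is to multiply the number of deaths expected at the baseline risk by the SMR. The sketch below assumes a group size of 1000 men, which is hypothetical (the slide does not state a group size); the function name `expected_deaths` is also an assumption for the example.

```python
def expected_deaths(n_men, baseline_risk, smr):
    """Expected deaths = group size x baseline risk x standardized mortality ratio."""
    return n_men * baseline_risk * smr

n_men = 1000           # hypothetical group size, not given on the slide
baseline_risk = 0.008  # 0.8% risk of cardiovascular death in the coming year
smr = 1.09             # SMR for cardiovascular death in the first year

without_diagnosis = expected_deaths(n_men, baseline_risk, 1.0)
with_diagnosis = expected_deaths(n_men, baseline_risk, smr)

print(f"expected without diagnosis: {without_diagnosis:.2f}")  # 8.00
print(f"expected with diagnosis:    {with_diagnosis:.2f}")     # 8.72
```

Under this assumption the SMR of 1.09 corresponds to fewer than one additional cardiovascular death per 1000 men in the first year.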

7 Regression & Correlation: Logistic regression Basic Medical Statistics Course October 2010 W. Heemsbergen

(Binary) Logistic Regression
We have collected data on N individuals. We are interested in disease A, which is present in part of the subjects:
- Which (risk) factors are predictive of / associated with the disease?
- What is the probability that a subject with a certain risk profile has the disease or will develop the disease?
Example: the development of mucositis of the lower alimentary tract after chemotherapy in cancer patients.
- What are the risk factors predictive for mucositis after chemotherapy?
- What is the probability of developing mucositis after chemotherapy, given an individual risk profile?
- Potential risk factors: age, weight, renal functioning, type and duration of chemotherapy, ...

(Binary) Logistic Regression
Logistic regression is similar to linear regression. It is used when the outcome of interest (the dependent variable) is not continuous but binary (e.g. cancer yes/no).
A patient with a certain risk profile (the independent factors) has a probability of developing the outcome: risk factor 1, risk factor 2, ... (the covariates) result in a probability (between 0 and 1). The outcome itself, however, will always be either present (1) or not present (0).
probability(D=1|z) = exp(z) / (1 + exp(z))
Set of covariate values: x1 .. xk; regression coefficients: b1 .. bk
z = a + b1*x1 + b2*x2 + ... + bk*xk
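The two formulas on this slide can be sketched as follows. The intercept, coefficients and covariate values are hypothetical and serve only to show the mechanics; the function names are assumptions made for the example.

```python
import math

def linear_predictor(a, coefficients, covariates):
    """z = a + b1*x1 + b2*x2 + ... + bk*xk"""
    return a + sum(b * x for b, x in zip(coefficients, covariates))

def probability(z):
    """probability(D=1|z) = exp(z) / (1 + exp(z))"""
    return math.exp(z) / (1 + math.exp(z))

# Hypothetical model: intercept a, coefficients b1 (per unit of x1) and b2 (x2 = 0/1).
a = -3.0
coefficients = [0.05, 1.2]
patient = [40, 1]   # hypothetical covariate values x1 = 40, x2 = 1

z = linear_predictor(a, coefficients, patient)
p = probability(z)
print(f"z = {z:.2f}, probability = {p:.3f}")   # z = 0.20, probability = 0.550
```

Whatever the value of z, the resulting probability lies between 0 and 1, and z = 0 corresponds to a probability of 0.5.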

Example
[Table: patient number (patnr), mean lung radiation dose (Gy), pneumonitis]

Linear regression: PRED = * MLD
Logistic regression: PROB(D=1) = exp( * MLD) / (1 + exp( * MLD))
(MLD: mean lung dose.)
Exp(B) is the odds ratio for a unit increase (odds: P/(1-P)).
[Figure panels: Logistic Regression, Linear Regression]

What is an Odds (Ratio)?
Are obese patients more at risk of developing diabetes? What is the odds ratio (OR)?
[Table: obesity (yes/no) against diabetes (yes/no) for the individual patients, with the corresponding 2x2 table]
Odds = p/(1-p) = proportion with disease / (1 - proportion with disease)
Odds obese = 0.75/0.25 = 3
Odds not obese = 0.25/0.75 = 0.33
OR = 0.33 / 3 = 0.11, or OR = 3 / 0.33 = 9 (preferred: ratio exposed/unexposed)
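The odds and the odds ratio on this slide can be reproduced with a short sketch, using the proportions with diabetes given there (0.75 among obese and 0.25 among non-obese patients). The function name `odds` is an assumption for the example.

```python
def odds(p):
    """Odds = p / (1 - p), with p the proportion with the disease."""
    return p / (1 - p)

p_obese = 0.75       # proportion with diabetes among obese patients
p_not_obese = 0.25   # proportion with diabetes among non-obese patients

odds_obese = odds(p_obese)
odds_not_obese = odds(p_not_obese)

# Preferred direction: exposed (obese) divided by unexposed (not obese).
or_exposed_vs_unexposed = odds_obese / odds_not_obese
or_unexposed_vs_exposed = odds_not_obese / odds_obese

print(f"odds obese = {odds_obese:.2f}, odds not obese = {odds_not_obese:.2f}")
print(f"OR = {or_exposed_vs_unexposed:.2f} (or {or_unexposed_vs_exposed:.2f} the other way round)")
```

The two odds ratios are each other's reciprocal, so they carry the same information; only the direction of the comparison differs.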

Variable types
The potential predictive factors of interest can be continuous, categorical, ordinal, or binary. How should these different types be handled in logistic regression?
Categorical: in the logistic regression procedure, categorical data have to be indicated as categorical, and a reference category has to be chosen. For each other category, the regression coefficient is then calculated relative to this reference. It is therefore advised to use the largest category as the reference.
Continuous: a variable that is not indicated as categorical is assumed to be continuous: the same regression coefficient is estimated for each increase of one unit. A normal distribution is not a prerequisite.
Ordinal: an ordinal variable can be put in the model as a continuous variable. One should, however, always be aware of the underlying assumptions in the model.
Binary: for a binary predictive variable it is not necessary to choose between the two. However, the "reference" will be the lowest value when the variable is entered as continuous, and possibly the highest value when it is indicated as categorical (depending on the chosen reference category).

Example: categorical / continuous
Coding: Obese: 1 = yes, 2 = no (or 0 = no). Diabetes: 1 = present, 0 = not present.
[Table: coded obesity and diabetes values for the individual patients, with the corresponding 2x2 table]
Odds obese = 0.75/0.25 = 3
Odds not obese = 0.25/0.75 = 0.33
OR = 0.33 / 3 = 0.11, or OR = 3 / 0.33 = 9 (preferred)

Example: categorical / continuous
[Output panels for three ways of entering the variable:]
- Obese, 1 = yes, 2 = no, continuous
- Obese, 1 = yes, 0 = no, continuous
- Obese, 1 = yes, 2 = no, categorical (reference = last)
Risk factors present/not present: code as 1 and 0 and enter as a continuous variable.
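The effect of the coding on Exp(B) can be illustrated without fitting software. The sketch below relies on the fact that a logistic model with an intercept and a single two-valued predictor reproduces the observed proportions exactly, so the coefficient equals the difference in log odds per unit of the code. The proportions are those of the obesity example (0.75 and 0.25); the function names are assumptions.

```python
import math

def logit(p):
    """Log odds of a proportion p."""
    return math.log(p / (1 - p))

def exp_b(proportion_by_code):
    """Exp(B) per unit increase for a two-valued predictor entered as continuous.

    proportion_by_code maps each numeric code to the proportion with the disease.
    """
    (x_low, p_low), (x_high, p_high) = sorted(proportion_by_code.items())
    b = (logit(p_high) - logit(p_low)) / (x_high - x_low)
    return math.exp(b)

# Obese coded 1 = yes, 2 = no: a unit increase means going from obese to not obese.
coded_1_2 = exp_b({1: 0.75, 2: 0.25})
# Obese coded 1 = yes, 0 = no: a unit increase means going from not obese to obese.
coded_1_0 = exp_b({1: 0.75, 0: 0.25})

print(f"1 = yes, 2 = no: Exp(B) = {coded_1_2:.2f}")   # 0.11
print(f"1 = yes, 0 = no: Exp(B) = {coded_1_0:.2f}")   # 9.00
```

Coding present/not present as 1 and 0 therefore yields the preferred odds ratio (exposed vs. unexposed) directly, which matches the result of entering the 1/2 coding as categorical with the last category as reference.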

Example: rectal bleeding
Int J Radiat Oncol Biol Phys 2004; 59:
Dosimetric factors predictive for moderate/severe rectal bleeding after RT for prostate cancer.

Example: esophagus toxicity
Radiat Oncol 2005; 75: 157.
Aim: to correlate acute esophageal toxicity with dosimetric and clinical parameters for patients treated with radiotherapy (RT) alone or with chemo-radiotherapy (CRT).
probability(D=1) = exp(z) / (1 + exp(z)) can be rewritten as probability(D=1) = 1 / (1 + exp(-z)).
[Figure: axis label "volume of esophagus"]
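The rewriting step follows from multiplying numerator and denominator by exp(-z). Written out as a short derivation, with P(D=1) standing for probability(D=1):

```latex
\[
P(D=1) \;=\; \frac{e^{z}}{1+e^{z}}
\;=\; \frac{e^{z}\,e^{-z}}{\left(1+e^{z}\right)e^{-z}}
\;=\; \frac{1}{e^{-z}+1}
\;=\; \frac{1}{1+e^{-z}}
\]
```

Both forms give the same probability for every value of z; the second is often the more convenient one to compute.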

Question
What is the odds ratio for V35, and for concurrent chemo-RT?