REGRESSION MODEL FITTING & IDENTIFICATION OF PROGNOSTIC FACTORS BISMA FAROOQI.


Objective
• Focus on parametric regression models for survival data
• Build an understanding of the commonly used parametric regression methods
• Link the survival time of an individual to covariates using a specified probability distribution within a regression setting

What is Prognosis?
Prognosis is the prediction of an individual patient's future with respect to the duration, course, and outcome of a disease. It plays an important role in medical practice, but it is often difficult to sort out which characteristics of a patient (also called explanatory variables) are most closely related to it. A statistical analysis is therefore needed to prepare a compact summary of the data that can reveal these relationships.

PRELIMINARY EXAMINATION OF DATA
The Categories of Dependent Variables:
The response variable in prognostic studies or clinical trials can be dichotomous, polychotomous, or continuous.
• Dichotomous dependent variables: response or nonresponse, life or death, and presence or absence of a given disease.
• Polychotomous dependent variables: different grades of symptoms (e.g., no evidence of disease, minor symptom, major symptom) and scores of psychiatric reactions (e.g., feeling well, tolerable, depressed, or very depressed).
• Continuous dependent variables: length of survival from start of treatment or length of remission, both measured on a numerical scale by a continuous range of values.

The Categories of Independent Variables:
A prognostic variable (or independent variable) may be either numerical or nonnumeric.
• Numerical prognostic variables:
  - discrete, such as the number of previous strokes
  - continuous, such as age
  Continuous variables can be made discrete by grouping patients into subcategories (e.g., four age subgroups: <20, 20-39, 40-59, and ≥60).
• Nonnumeric prognostic variables:
  - unordered (e.g., race or diagnosis)
  - ordered (e.g., severity of disease may be primary, local, or metastatic)
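As a small illustration of making a continuous variable discrete, the sketch below maps age to four ordered subgroups. The function name `age_group` is hypothetical, and the cut points assume the first and last subgroups mean "under 20" and "60 and over".

```python
def age_group(age):
    """Map a continuous age (in years) to one of four ordered subgroups.

    Assumed cut points: under 20, 20-39, 40-59, 60 and over.
    """
    if age < 20:
        return "<20"
    if age < 40:
        return "20-39"
    if age < 60:
        return "40-59"
    return ">=60"


# Illustrative ages only, not data from the presentation.
ages = [17, 20, 39.5, 40, 59, 60, 83]
groups = [age_group(a) for a in ages]
print(groups)
```

Grouping trades some information for simplicity, so the choice of cut points should be made before looking at the outcome.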

Steps in Data Examination:
Before any statistical computation, the data need to be examined carefully. The preliminary examination usually involves the following steps:
• Obtain correlation coefficients between variables to detect significantly correlated variables. When deciding which of a set of highly correlated variables to delete, retain any variable that other studies have shown to have prognostic value.
• Use dummy variables for qualitative prognostic variables. For example, with cell types A, B, and O, let the dummy variable x1 take the value 1 for cell type A and 0 otherwise, and let x2 take the value 1 for cell type B and 0 otherwise. For two categories (e.g., sex), only one dummy variable is needed: x is 1 for a male and 0 for a female.
• Apply a transformation (such as the logarithm) to the prognostic variables where it gives a better description of the data.

• Remove from the multivariate analysis any prognostic factors that have little or no effect on the dependent variable.
• Deal with missing data. The approach depends on the proportion of data that is missing:
  - observations with missing data may be dropped if they make up a relatively small proportion of the sample
  - for a nominal or categorical independent variable, individuals with missing information can be treated as a separate group
  - for quantitatively measured variables (e.g., age), the mean of the available values can be used for a missing value. This principle can also be applied to nominal data.
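Two of the steps above, dummy coding and mean substitution for missing values, can be sketched in a few lines. The helper names `dummy_code` and `impute_mean` are hypothetical, the values are made up for illustration, and cell type O is assumed to be the reference category.

```python
from statistics import mean


def dummy_code(cell_type):
    """Return (x1, x2) for cell types A, B, O, with O as the reference."""
    x1 = 1 if cell_type == "A" else 0
    x2 = 1 if cell_type == "B" else 0
    return (x1, x2)


def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


# Illustrative values only.
cell_types = ["A", "B", "O", "A"]
coded = [dummy_code(c) for c in cell_types]

ages = [50, None, 62, 44]
ages_filled = impute_mean(ages)

print(coded)
print(ages_filled)
```

Mean substitution is a convenience rather than a good estimate of the missing value, so it is best reserved for cases where only a small share of the values is missing.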

GENERAL STRUCTURE OF PARAMETRIC REGRESSION MODELS

Commonly Used Parametric Models:
The most commonly used parametric models are:
• Exponential
• Weibull
• Lognormal
• Log-logistic
• Gamma
• Gompertz
The first two are included in our discussion.
The distributions generally involve two parameters: the scale parameter (λ) and the shape parameter (γ). The shape is assumed constant across individuals.
Maximum likelihood estimation (MLE) is used to obtain the parameter estimates. The Newton-Raphson iterative procedure is applied when the likelihood equations have no closed-form solution.
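The contrast between a closed-form MLE and an iterative one can be sketched without covariates. The code below assumes the parametrization S(t) = exp(-(λt)^γ) with right censoring, which the slides do not spell out. The function names and the small data set are hypothetical. The exponential estimate has a closed form, while the Weibull shape is found by Newton-Raphson on the profile score for γ.

```python
import math


def exponential_mle(times, events):
    """Closed-form MLE of the exponential rate: failures / total time at risk."""
    return sum(events) / sum(times)


def weibull_mle(times, events, gamma=1.0, tol=1e-10, max_iter=100):
    """Newton-Raphson MLE for S(t) = exp(-(lam * t) ** gamma).

    events[i] is 1 for an observed failure and 0 for a censored time.
    Returns (lam, gamma, score), where score is the profile score at the solution.
    """
    d = sum(events)
    logs = [math.log(t) for t in times]
    sum_event_logs = sum(e * lt for e, lt in zip(events, logs))

    def score_and_slope(g):
        w = [t ** g for t in times]
        s0 = sum(w)
        s1 = sum(wi * lt for wi, lt in zip(w, logs))
        s2 = sum(wi * lt * lt for wi, lt in zip(w, logs))
        # Profile score for gamma after substituting lam**gamma = d / s0.
        score = d / g + sum_event_logs - d * s1 / s0
        slope = -d / g ** 2 - d * (s2 * s0 - s1 ** 2) / s0 ** 2
        return score, slope

    for _ in range(max_iter):
        score, slope = score_and_slope(gamma)
        step = score / slope
        while gamma - step <= 0:  # keep the shape parameter positive
            step /= 2
        gamma -= step
        if abs(step) < tol:
            break

    theta = d / sum(t ** gamma for t in times)  # theta = lam ** gamma
    lam = theta ** (1 / gamma)
    return lam, gamma, score_and_slope(gamma)[0]


# Made-up survival times (event = 1, censored = 0), for illustration only.
times = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 8.0, 10.0]
events = [1, 1, 1, 0, 1, 1, 0, 1]

lam_exp = exponential_mle(times, events)
lam_hat, gamma_hat, final_score = weibull_mle(times, events)
print(lam_exp, lam_hat, gamma_hat)
```

Starting the iteration at γ = 1 means the first step begins from the exponential model, which is a natural default when nothing else is known about the shape.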

Likelihood Inference of Regression Models

Hypothesis Testing

Exponential Model
• The exponential distribution is a useful form of the survival distribution when the hazard function (the instantaneous failure rate) is constant and does not depend on time. In that case the graph (for example, a plot of log[-log S(t)] against log t) is approximately a straight line with slope 1.
• In the biomedical field, a constant hazard function is usually unrealistic, so this model often does not apply.
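For reference, the constant-hazard property and the straight-line check can be written out as follows. This is a sketch assuming the usual one-parameter form with rate λ > 0, which the slide does not state explicitly.

```latex
h(t) = \lambda, \qquad
S(t) = \exp(-\lambda t), \qquad
f(t) = \lambda \exp(-\lambda t), \qquad t \ge 0,\ \lambda > 0

\log\{-\log S(t)\} = \log\lambda + \log t
```

The second line shows why a plot of log[-log S(t)] against log t has slope 1 under the exponential model.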

Practical Approach

Weibull Model
• The hazard function changes with time. The graph is still approximately a straight line, but the slope is not 1.
• The hazard function increases monotonically when the shape parameter γ > 1.
• The hazard function decreases monotonically when γ < 1.
• The model reduces to the exponential regression model when γ = 1.
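The three cases for γ can be checked numerically. The sketch below assumes the parametrization h(t) = λγ(λt)^(γ-1) and S(t) = exp(-(λt)^γ). The function names and parameter values are hypothetical choices for illustration.

```python
import math


def weibull_hazard(t, lam, gamma):
    """Hazard h(t) = lam * gamma * (lam * t) ** (gamma - 1)."""
    return lam * gamma * (lam * t) ** (gamma - 1)


def weibull_survival(t, lam, gamma):
    """Survival function S(t) = exp(-(lam * t) ** gamma)."""
    return math.exp(-((lam * t) ** gamma))


def loglog_slope(t1, t2, lam, gamma):
    """Slope of log(-log S(t)) against log t between two time points."""
    y1 = math.log(-math.log(weibull_survival(t1, lam, gamma)))
    y2 = math.log(-math.log(weibull_survival(t2, lam, gamma)))
    return (y2 - y1) / (math.log(t2) - math.log(t1))


time_points = [0.5, 1.0, 2.0, 4.0]
increasing = [weibull_hazard(t, 0.1, 2.0) for t in time_points]  # gamma > 1
decreasing = [weibull_hazard(t, 0.1, 0.5) for t in time_points]  # gamma < 1
constant = [weibull_hazard(t, 0.1, 1.0) for t in time_points]    # gamma = 1

print(increasing, decreasing, constant)
print(loglog_slope(0.5, 4.0, 0.1, 2.0))
```

Under this parametrization the slope of log[-log S(t)] against log t equals γ, which is why the straight-line plot has slope 1 only in the exponential case.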

The exponential hazard function is constant, whereas the Weibull hazard function is monotonic: increasing when γ > 1 and decreasing when γ < 1.

THANK YOU