Event History Analysis: Introduction Sociology 229 Class 3 Copyright © 2010 by Evan Schofer Do not copy or distribute without permission.



Agenda
Introduce EHA
– Motivation
– Limitations of alternative approaches

Regression and EHA: Examples
Medical research on drug efficacy
Question #1: Do patients given larger doses of a drug have lower cholesterol?
Approach: OLS regression. If its assumptions are met, OLS is appropriate.
– Independent variable = dosage (“level” of drug)
– Dependent variable = cholesterol (“level”)

Regression Example: Cholesterol
The relationship between the level of X and the level of Y is modeled as a linear function: Y = a + bX + e
[Figure: cholesterol level (Y) plotted against drug dosage in mg (X)]
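As a rough illustration of the linear model Y = a + bX + e, the sketch below estimates a and b by ordinary least squares. The dosage and cholesterol values are made up for the example and the function name `ols_fit` is hypothetical; nothing here comes from the original slides.

```python
# Hypothetical data: dosage in mg and cholesterol level for six patients.
dosage = [10, 20, 30, 40, 50, 60]
cholesterol = [240, 232, 221, 214, 199, 194]

def ols_fit(x, y):
    """Return (a, b) minimizing the sum of squared residuals of y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = covariance of x and y divided by the variance of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

a, b = ols_fit(dosage, cholesterol)
print(f"intercept a = {a:.2f}, slope b = {b:.3f}")
```

A negative slope b would be read as lower cholesterol at higher doses, which is the "level on level" relationship the slide describes.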

Example 2: Drug & Mortality
Suppose a different question: Does increased drug dosage reduce the incidence of mortality among patients?
The dependent variable has a different character:
1. Whereas cholesterol is measured as a “level” (continuously), mortality is “discrete”
– Either the patient lives or they don’t (not a “level”)
2. Also, TIMING is an issue
– Not just whether a patient survives, but how long
– A drug that extends life is good, even if patients eventually die

Logit/Probit Strategies
Research strategies to address this problem:
1. Use a non-linear regression model for discrete outcomes: logit, probit, etc.
– Dependent variable is a dummy for patient mortality
– Look for a relationship between dosage and mortality
Benefit: Easy. An analog of regression.
Limitation: Doesn’t take timing into account
– All patients who die have the same influence on the model (whether they live 5 days or 20 years due to the drug dosage).
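A minimal sketch of the logit strategy, assuming made-up data and a hand-written Newton-Raphson fit (the function name `fit_logit` is hypothetical, and a statistics package would normally be used instead). Note that the outcome is only a 0/1 dummy, so a patient who dies after 5 days and one who dies after 20 years enter the model identically.

```python
import math

# Hypothetical data: dosage (in units of 10 mg) and a mortality dummy (1 = died).
dosage = [1, 2, 3, 4, 5, 6, 7, 8]
died = [1, 1, 0, 1, 0, 1, 0, 0]

def fit_logit(x, y, iterations=25):
    """Fit P(y=1) = 1 / (1 + exp(-(a + b*x))) by Newton-Raphson."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        p = [1.0 / (1.0 + math.exp(-(a + b * xi))) for xi in x]
        # Gradient of the log-likelihood with respect to a and b.
        g_a = sum(yi - pi for yi, pi in zip(y, p))
        g_b = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Entries of the information matrix (negative Hessian).
        w = [pi * (1.0 - pi) for pi in p]
        h_aa = sum(w)
        h_ab = sum(wi * xi for wi, xi in zip(w, x))
        h_bb = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h_aa * h_bb - h_ab * h_ab
        # Newton step: solve the 2x2 system by hand.
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    return a, b

a, b = fit_logit(dosage, died)
print(f"intercept a = {a:.3f}, dosage coefficient b = {b:.3f}")
```

A negative coefficient b would suggest that mortality becomes less likely as dosage rises, but the model says nothing about how long patients lived.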

Logit/Probit Strategy: Visual
The relationship between the level of X and the discrete variable Y is modeled as a non-linear function
[Figure: mortality (Yes/No) plotted against drug dosage (mg)]

Drug & Mortality: OLS Regression
Option #2: Use OLS regression to model the time elapsed (duration) until mortality
– Rather than ask “did they live or die?” (logit/probit), you ask “how long did they live?”
Compute a variable that reflects the time until mortality (in relevant time units, e.g., months since drug therapy started)
Model time as the dependent variable
Observe: Do patients with high drug doses die later than those with low doses?

OLS Duration Strategy: Visual
Q: Where do you put individuals who were alive at the end of the study?
[Figure: months until mortality plotted against drug dosage (mg)]

Drug & Mortality: OLS Regression
Problem #1: What about patients who don’t experience mortality during the study?
This is called “censored data”
If the study lasts 80 months, you know that Y > 80 for these patients…
– But you don’t have an exact value
What do you do?
– Treat them as experiencing mortality at the very end of the study? Or approximate the time of mortality?
– Exclude them? NO! That selects on the dependent variable!
Possible solution: Use models for censored data
– Ex: tobit model; “censored normal regression”
» Stata: tobit, cnreg.
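The two naive treatments of censored patients can be compared on a small made-up example. The sketch assumes hypothetical "true" survival times, which are known here only because the data are invented; in a real 80-month study the values above 80 would never be seen.

```python
STUDY_LENGTH = 80  # months, as in the slide's example

# Hypothetical "true" months until mortality; in a real study the values
# above STUDY_LENGTH would never be observed.
true_months = [12, 30, 45, 70, 95, 120, 150]

# What the study records: follow-up time capped at the end of the study,
# plus a flag for whether mortality was actually observed.
observed = [min(t, STUDY_LENGTH) for t in true_months]
event = [t <= STUDY_LENGTH for t in true_months]

def mean(values):
    return sum(values) / len(values)

true_mean = mean(true_months)
# Naive fix 1: pretend censored patients died at the end of the study.
end_of_study_mean = mean(observed)
# Naive fix 2: drop censored patients entirely.
excluded_mean = mean([t for t, e in zip(observed, event) if e])

print(f"true mean: {true_mean:.1f}")
print(f"censored treated as deaths at month 80: {end_of_study_mean:.1f}")
print(f"censored patients excluded: {excluded_mean:.1f}")
```

In this toy data both shortcuts understate average survival, and excluding the censored patients understates it the most, because the longest-lived patients are exactly the ones dropped.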

Drug & Mortality: OLS Regression
Problem #2: Temporal data often violate the normality assumption of OLS regression
– Often the violations are quite bad
“Censored” data is a surmountable problem, but the normality violation usually is not
So we shouldn’t typically use OLS… or models for censored data that assume normality…

Drug and Mortality: EHA Strategy
Event History Analysis (EHA) provides purchase on this exact type of problem
– And others, as well
In essence, EHA models a dependent variable that reflects both:
– 1. Whether or not a patient experiences mortality (like logit), and…
– 2. When it occurs (like an OLS regression of duration)
Note: This information is typically encoded in 2 or more variables
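One way to picture the "2 or more variables" encoding is a record holding a duration and an event indicator for each patient. The sketch below is illustrative only: the class name, field names, and values are assumptions, not a layout taken from the slides or from any particular software.

```python
from dataclasses import dataclass

@dataclass
class PatientSpell:
    """One patient's record; field names are illustrative assumptions."""
    dosage_mg: float
    months: float   # time from start of therapy to death or end of follow-up
    died: bool      # True if mortality was observed, False if censored

spells = [
    PatientSpell(dosage_mg=10, months=14, died=True),
    PatientSpell(dosage_mg=10, months=80, died=False),
    PatientSpell(dosage_mg=50, months=66, died=True),
    PatientSpell(dosage_mg=50, months=80, died=False),
]

# A logit-style analysis keeps only the dummy and discards the timing.
logit_outcome = [int(s.died) for s in spells]
# An OLS-duration analysis keeps only the timing and cannot tell that
# 80 months means "still alive" rather than "died at month 80".
ols_outcome = [s.months for s in spells]

print(logit_outcome)
print(ols_outcome)
```

Keeping both fields together preserves the whether and the when, which is the information an event history model works from.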

EHA: Overview and Terminology
EHA is referred to as “dynamic” modeling
– i.e., it addresses the timing of outcomes: rates
The dependent variable is best conceptualized as a rate of some occurrence
– Not a “level” or “amount” as in OLS regression
– Think: “How fast?” “How often?”
The “occurrence” may be something that can occur only once for each case: e.g., mortality
– Or it may be repeatable: e.g., marriages, strategic alliances.
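To make "rate" concrete, one simple summary is a crude occurrence rate: observed events divided by total time at risk. The sketch below uses invented records and a hypothetical helper `occurrence_rate`; it assumes the rate is roughly constant over time, which real analyses would not take for granted.

```python
# Hypothetical records: (dosage group, months observed, died during study?).
records = [
    ("low", 14, True), ("low", 30, True), ("low", 52, True), ("low", 80, False),
    ("high", 41, True), ("high", 66, True), ("high", 80, False), ("high", 80, False),
]

def occurrence_rate(rows):
    """Events per month at risk: observed deaths / total months observed."""
    deaths = sum(1 for _, _, died in rows if died)
    months_at_risk = sum(months for _, months, _ in rows)
    return deaths / months_at_risk

rates = {
    group: occurrence_rate([r for r in records if r[0] == group])
    for group in ("low", "high")
}
for group, rate in rates.items():
    print(f"{group} dose: {rate:.4f} deaths per patient-month")
```

Censored patients still contribute their months at risk to the denominator, so they inform the rate even though no event was observed for them.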

EHA: Overview
EHA involves both descriptive and parametric analysis of data
Just like regression
– Scatterplots, partial plots = descriptive
– OLS model/hypothesis tests = parametric
Descriptive analyses/plots
– Allow description of the overall rate of some outcome
– For all cases, or for various subgroups
Parametric models
– Allow hypothesis testing about variables that affect the rate (and can include control variables).

EHA: Types of Questions
Some types of questions EHA can address:
1. Mortality: Does drug dosage reduce rates?
– Does the “rate” decrease with larger doses?
– Also: control for race, gender, treatment options, etc.
2. Life stage transitions: timing of marriage
– Is the rate affected by gender, class, religion?
3. Organizational mortality
– Is the rate affected by size, historical era, competition?
4. Inter-state war
– Is the rate affected by economic, political factors?