Negative Binomial Regression


Negative Binomial Regression: NASCAR Lead Changes, 1975–1979

Data Description
- Units: 151 NASCAR races during the 1975–1979 seasons
- Response: # of lead changes in a race
- Predictors:
  - # of laps in the race
  - # of drivers in the race
  - Track length (circumference, in miles)
- Models:
  - Poisson (assumes E(Y) = V(Y))
  - Negative binomial (allows for V(Y) > E(Y))

Poisson Regression
- Random component: Poisson distribution for the # of lead changes
- Systematic component: linear function of the predictors Laps, Drivers, Trklength
- Link function: log, g(μ) = ln(μ)

Regression Coefficients – Z-tests
Note: all predictors are highly significant. Holding all other factors constant:
- As the # of laps increases, lead changes increase
- As the # of drivers increases, lead changes increase
- As track length increases, lead changes increase

Testing Goodness-of-Fit
Break the races into 10 groups of approximately equal size based on their fitted values. With O_g and E_g denoting the observed and fitted totals of lead changes in group g, the Pearson residuals are obtained by computing:

e_g = (O_g − E_g) / √E_g,   X² = Σ_{g=1}^{10} e_g²

Under the hypothesis that the model is adequate, X² is approximately chi-square with 10 − 4 = 6 degrees of freedom (10 cells, 4 estimated parameters). The critical value for an α = 0.05 level test is 12.59. The data (next slide) clearly are not consistent with the model. Note that the variances within each group are orders of magnitude larger than the means.
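The grouping-and-Pearson computation can be sketched as follows. The fitted values and counts are simulated stand-ins; the overdispersed negative binomial draws mimic the excess variation a Poisson fit fails to capture.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 151

# Simulated fitted means, sorted so grouping by fitted value is trivial
mu_hat = np.sort(rng.uniform(5, 40, size=n))
k = 5.0
# Overdispersed counts: E(Y) = mu, V(Y) = mu + mu^2/k
y = rng.negative_binomial(k, k / (k + mu_hat))

# Split races into 10 groups of approximately equal size by fitted value
groups = np.array_split(np.arange(n), 10)

# Grouped Pearson statistic: compare observed and fitted totals per group
O = np.array([y[g].sum() for g in groups])
E = np.array([mu_hat[g].sum() for g in groups])
X2 = np.sum((O - E) ** 2 / E)

df = 10 - 4                       # 10 cells minus 4 estimated parameters
crit = stats.chi2.ppf(0.95, df)   # 12.59
print(X2, crit, X2 > crit)
```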

Testing Goodness-of-Fit
X² = 107.4 >> 12.59 ⇒ the data are not consistent with the Poisson model.

Negative Binomial Regression
- Random component: negative binomial distribution for the # of lead changes
- Systematic component: linear function of the predictors Laps, Drivers, Trklength
- Link function: log, g(μ) = ln(μ)

Regression Coefficients – Z-tests
Note that SAS and Stata estimate the dispersion as 1/k, rather than k, in this model.

Goodness-of-Fit Test
Clearly this model fits better than the Poisson regression model. For the negative binomial model, SD/mean is estimated to be √(1/k) = 0.43; for these 10 cells, the observed ratios range from 0.24 to 0.67, consistent with that value.

Computational Aspects - I
k is restricted to be positive, so we estimate k* = log(k), which can take on any value. Note that software packages estimating 1/k are estimating −k* on the log scale, since log(1/k) = −k*.

Likelihood function (with μ_i = exp(x_i′β)):

L(β, k) = ∏_{i=1}^n [Γ(y_i + k) / (Γ(k) y_i!)] [k/(k + μ_i)]^k [μ_i/(k + μ_i)]^{y_i}

Log-likelihood function:

l(β, k) = Σ_{i=1}^n [ln Γ(y_i + k) − ln Γ(k) − ln(y_i!) + k ln(k) + y_i ln(μ_i) − (y_i + k) ln(k + μ_i)]
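The log-likelihood can be coded directly; this is a sketch under the NB2 parameterization μ_i = exp(x_i′β), V(Y_i) = μ_i + μ_i²/k, with gammaln handling the ln Γ terms (the toy data below are simulated):

```python
import numpy as np
from scipy.special import gammaln

def nb_loglik(kstar, beta, X, y):
    """Negative binomial log-likelihood with k = exp(kstar),
    so kstar is unconstrained while k stays positive."""
    k = np.exp(kstar)
    mu = np.exp(X @ beta)  # log link: mu_i = exp(x_i' beta)
    return np.sum(
        gammaln(y + k) - gammaln(k) - gammaln(y + 1)  # ln Gamma and ln(y!) terms
        + k * np.log(k) + y * np.log(mu)
        - (y + k) * np.log(k + mu)
    )

# Small evaluation on toy data
rng = np.random.default_rng(3)
X = np.column_stack([np.ones(20), rng.normal(size=20)])
beta = np.array([1.0, 0.5])
y = rng.poisson(np.exp(X @ beta))
ll = nb_loglik(0.3, beta, X, y)
print(ll)
```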

Computational Aspects - II
Derivatives with respect to k* and β (using k = exp(k*), so ∂l/∂k* = k · ∂l/∂k):

∂l/∂β = Σ_{i=1}^n [k(y_i − μ_i)/(k + μ_i)] x_i

∂l/∂k = Σ_{i=1}^n [ψ(y_i + k) − ψ(k) + ln(k) + 1 − ln(k + μ_i) − (y_i + k)/(k + μ_i)]

where ψ(·) is the digamma function.
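A finite-difference check of these derivatives (a self-contained sketch on toy simulated data; the digamma terms come from differentiating the ln Γ terms):

```python
import numpy as np
from scipy.special import gammaln, digamma
from scipy.optimize import approx_fprime

def loglik(theta, X, y):
    # theta = (kstar, beta); k = exp(kstar)
    k, beta = np.exp(theta[0]), theta[1:]
    mu = np.exp(X @ beta)
    return np.sum(gammaln(y + k) - gammaln(k) - gammaln(y + 1)
                  + k * np.log(k) + y * np.log(mu)
                  - (y + k) * np.log(k + mu))

def grad(theta, X, y):
    k, beta = np.exp(theta[0]), theta[1:]
    mu = np.exp(X @ beta)
    # dl/dbeta = sum_i k (y_i - mu_i) / (k + mu_i) * x_i
    g_beta = X.T @ (k * (y - mu) / (k + mu))
    # dl/dk, then chain rule: dl/dkstar = k * dl/dk
    dldk = np.sum(digamma(y + k) - digamma(k) + np.log(k) + 1
                  - np.log(k + mu) - (y + k) / (k + mu))
    return np.concatenate([[k * dldk], g_beta])

# Compare analytic and finite-difference gradients on small toy data
rng = np.random.default_rng(4)
X = np.column_stack([np.ones(30), rng.normal(size=30)])
y = rng.poisson(np.exp(X @ np.array([1.0, 0.5])))
theta = np.array([0.2, 0.8, 0.4])
num = approx_fprime(theta, loglik, 1e-6, X, y)
print(np.max(np.abs(grad(theta, X, y) - num)))  # should be near zero
```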

Computational Aspects - III
Newton-Raphson algorithm steps:
Step 1: Set k* = 0 (k = 1) and iterate to obtain an estimate of β.
Step 2: Fix β at its Step 1 estimate and iterate to obtain an estimate of k*.
Step 3: Use the results from Steps 1 and 2 as starting values to jointly estimate k* and β.
Step 4: Back-transform k* to get the estimate of k: k = exp(k*).
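The staged scheme can be sketched with scipy's general-purpose optimizer standing in for hand-coded Newton-Raphson updates (simulated toy data; not the original implementation):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln

def negll(theta, X, y):
    # theta = (kstar, beta); k = exp(kstar) keeps k > 0
    k, beta = np.exp(theta[0]), theta[1:]
    mu = np.exp(X @ beta)
    return -np.sum(gammaln(y + k) - gammaln(k) - gammaln(y + 1)
                   + k * np.log(k) + y * np.log(mu)
                   - (y + k) * np.log(k + mu))

rng = np.random.default_rng(5)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
mu = np.exp(X @ np.array([1.0, 0.5]))
k_true = 4.0
y = rng.negative_binomial(k_true, k_true / (k_true + mu))

p = X.shape[1]
# Step 1: fix kstar = 0 (k = 1), estimate beta
b1 = minimize(lambda b: negll(np.concatenate([[0.0], b]), X, y),
              np.zeros(p)).x
# Step 2: fix beta at the Step 1 estimate, estimate kstar
k1 = minimize(lambda ks: negll(np.concatenate([ks, b1]), X, y),
              np.array([0.0])).x
# Step 3: joint estimation from the Step 1/2 starting values
theta = minimize(negll, np.concatenate([k1, b1]), args=(X, y)).x
# Step 4: back-transform kstar to get k
k_hat = np.exp(theta[0])
print(k_hat, theta[1:])
```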