Negative Binomial Regression


Negative Binomial Regression: NASCAR Lead Changes, 1975–1979

Data Description
- Units: 151 NASCAR races during the 1975–1979 seasons
- Response: # of lead changes in a race
- Predictors:
  - # of laps in the race
  - # of drivers in the race
  - Track length (circumference, in miles)
- Models:
  - Poisson (assumes E(Y) = V(Y))
  - Negative binomial (allows for V(Y) > E(Y))

Poisson Regression
- Random component: Poisson distribution for the # of lead changes
- Systematic component: linear function of the predictors Laps, Drivers, Trklength
- Link function: log, g(μ) = ln(μ)

Regression Coefficients – Z-tests
Note: all predictors are highly significant. Holding all other factors constant:
- As the # of laps increases, lead changes increase
- As the # of drivers increases, lead changes increase
- As track length increases, lead changes increase

Testing Goodness-of-Fit
Break the races into 10 groups of approximately equal size based on their fitted values. With O_g and E_g denoting the observed and fitted totals of lead changes in group g, the Pearson residuals are obtained by computing:

e_g = (O_g − E_g) / √E_g,   X² = Σ_{g=1}^{10} e_g²

Under the hypothesis that the model is adequate, X² is approximately chi-square with 10 − 4 = 6 degrees of freedom (10 cells, 4 estimated parameters). The critical value for an α = 0.05 level test is 12.59. The data (next slide) clearly are not consistent with the model. Note that the variances within each group are orders of magnitude larger than the means.
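The grouping-and-Pearson computation can be sketched as follows. The fitted values and counts are simulated stand-ins; the overdispersed negative binomial draws mimic the excess variation a Poisson fit fails to capture.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 151

# Simulated fitted means, sorted so grouping by fitted value is trivial
mu_hat = np.sort(rng.uniform(5, 40, size=n))
k = 5.0
# Overdispersed counts: E(Y) = mu, V(Y) = mu + mu^2/k
y = rng.negative_binomial(k, k / (k + mu_hat))

# Split races into 10 groups of approximately equal size by fitted value
groups = np.array_split(np.arange(n), 10)

# Grouped Pearson statistic: compare observed and fitted totals per group
O = np.array([y[g].sum() for g in groups])
E = np.array([mu_hat[g].sum() for g in groups])
X2 = np.sum((O - E) ** 2 / E)

df = 10 - 4                       # 10 cells minus 4 estimated parameters
crit = stats.chi2.ppf(0.95, df)   # 12.59
print(X2, crit, X2 > crit)
```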

Testing Goodness-of-Fit
X² = 107.4 >> 12.59 ⇒ the data are not consistent with the Poisson model.

Negative Binomial Regression
- Random component: negative binomial distribution for the # of lead changes
- Systematic component: linear function of the predictors Laps, Drivers, Trklength
- Link function: log, g(μ) = ln(μ)

Regression Coefficients – Z-tests
Note that SAS and Stata estimate the dispersion as 1/k, rather than k, in this model.

Goodness-of-Fit Test
Clearly this model fits better than the Poisson regression model. For the negative binomial model, SD/mean is estimated to be √(1/k) = 0.43; for these 10 cells, the observed ratios range from 0.24 to 0.67, consistent with that value.

Computational Aspects - I
k is restricted to be positive, so we estimate k* = log(k), which can take on any value. Note that software packages estimating 1/k are estimating −k* on the log scale, since log(1/k) = −k*.

Likelihood function (with μ_i = exp(x_i′β)):

L(β, k) = ∏_{i=1}^n [Γ(y_i + k) / (Γ(k) y_i!)] [k/(k + μ_i)]^k [μ_i/(k + μ_i)]^{y_i}

Log-likelihood function:

l(β, k) = Σ_{i=1}^n [ln Γ(y_i + k) − ln Γ(k) − ln(y_i!) + k ln(k) + y_i ln(μ_i) − (y_i + k) ln(k + μ_i)]
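The log-likelihood can be coded directly; this is a sketch under the NB2 parameterization μ_i = exp(x_i′β), V(Y_i) = μ_i + μ_i²/k, with gammaln handling the ln Γ terms (the toy data below are simulated):

```python
import numpy as np
from scipy.special import gammaln

def nb_loglik(kstar, beta, X, y):
    """Negative binomial log-likelihood with k = exp(kstar),
    so kstar is unconstrained while k stays positive."""
    k = np.exp(kstar)
    mu = np.exp(X @ beta)  # log link: mu_i = exp(x_i' beta)
    return np.sum(
        gammaln(y + k) - gammaln(k) - gammaln(y + 1)  # ln Gamma and ln(y!) terms
        + k * np.log(k) + y * np.log(mu)
        - (y + k) * np.log(k + mu)
    )

# Small evaluation on toy data
rng = np.random.default_rng(3)
X = np.column_stack([np.ones(20), rng.normal(size=20)])
beta = np.array([1.0, 0.5])
y = rng.poisson(np.exp(X @ beta))
ll = nb_loglik(0.3, beta, X, y)
print(ll)
```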

Computational Aspects - II
Derivatives with respect to k* and β (using k = exp(k*), so ∂l/∂k* = k · ∂l/∂k):

∂l/∂β = Σ_{i=1}^n [k(y_i − μ_i)/(k + μ_i)] x_i

∂l/∂k = Σ_{i=1}^n [ψ(y_i + k) − ψ(k) + ln(k) + 1 − ln(k + μ_i) − (y_i + k)/(k + μ_i)]

where ψ(·) is the digamma function.
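A finite-difference check of these derivatives (a self-contained sketch on toy simulated data; the digamma terms come from differentiating the ln Γ terms):

```python
import numpy as np
from scipy.special import gammaln, digamma
from scipy.optimize import approx_fprime

def loglik(theta, X, y):
    # theta = (kstar, beta); k = exp(kstar)
    k, beta = np.exp(theta[0]), theta[1:]
    mu = np.exp(X @ beta)
    return np.sum(gammaln(y + k) - gammaln(k) - gammaln(y + 1)
                  + k * np.log(k) + y * np.log(mu)
                  - (y + k) * np.log(k + mu))

def grad(theta, X, y):
    k, beta = np.exp(theta[0]), theta[1:]
    mu = np.exp(X @ beta)
    # dl/dbeta = sum_i k (y_i - mu_i) / (k + mu_i) * x_i
    g_beta = X.T @ (k * (y - mu) / (k + mu))
    # dl/dk, then chain rule: dl/dkstar = k * dl/dk
    dldk = np.sum(digamma(y + k) - digamma(k) + np.log(k) + 1
                  - np.log(k + mu) - (y + k) / (k + mu))
    return np.concatenate([[k * dldk], g_beta])

# Compare analytic and finite-difference gradients on small toy data
rng = np.random.default_rng(4)
X = np.column_stack([np.ones(30), rng.normal(size=30)])
y = rng.poisson(np.exp(X @ np.array([1.0, 0.5])))
theta = np.array([0.2, 0.8, 0.4])
num = approx_fprime(theta, loglik, 1e-6, X, y)
print(np.max(np.abs(grad(theta, X, y) - num)))  # should be near zero
```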

Computational Aspects - III
Newton-Raphson algorithm steps:
Step 1: Set k* = 0 (k = 1) and iterate to obtain an estimate of β.
Step 2: Fix β at its Step 1 estimate and iterate to obtain an estimate of k*.
Step 3: Use the results from Steps 1 and 2 as starting values to jointly estimate k* and β.
Step 4: Back-transform k* to get the estimate of k: k = exp(k*).
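The staged scheme can be sketched with scipy's general-purpose optimizer standing in for hand-coded Newton-Raphson updates (simulated toy data; not the original implementation):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln

def negll(theta, X, y):
    # theta = (kstar, beta); k = exp(kstar) keeps k > 0
    k, beta = np.exp(theta[0]), theta[1:]
    mu = np.exp(X @ beta)
    return -np.sum(gammaln(y + k) - gammaln(k) - gammaln(y + 1)
                   + k * np.log(k) + y * np.log(mu)
                   - (y + k) * np.log(k + mu))

rng = np.random.default_rng(5)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
mu = np.exp(X @ np.array([1.0, 0.5]))
k_true = 4.0
y = rng.negative_binomial(k_true, k_true / (k_true + mu))

p = X.shape[1]
# Step 1: fix kstar = 0 (k = 1), estimate beta
b1 = minimize(lambda b: negll(np.concatenate([[0.0], b]), X, y),
              np.zeros(p)).x
# Step 2: fix beta at the Step 1 estimate, estimate kstar
k1 = minimize(lambda ks: negll(np.concatenate([ks, b1]), X, y),
              np.array([0.0])).x
# Step 3: joint estimation from the Step 1/2 starting values
theta = minimize(negll, np.concatenate([k1, b1]), args=(X, y)).x
# Step 4: back-transform kstar to get k
k_hat = np.exp(theta[0])
print(k_hat, theta[1:])
```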