Count Models Sociology 229: Advanced Regression Copyright © 2010 by Evan Schofer Do not copy or distribute without permission.



Announcements
Assignment #1 due
Assignment #2 handed out; due in 1 week
Agenda:
Basic count models
Intro to EHA (if time allows)

Count Variables
Many dependent variables are counts: non-negative integers
Number of crimes a person has committed in lifetime
Number of children living in a household
Number of new companies founded in a year (in an industry)
Number of social protests per month in a city
–Can you think of others?

Count Variables
Count variables can be modeled with OLS regression… but:
–1. Linear models can yield negative predicted values… whereas counts are never negative
Similar to the problem of the Linear Probability Model
–2. Count variables are often highly skewed
Ex: Number of crimes committed this year… most people are zero or very low; a few people are very high
Extreme skew violates the normality assumption of OLS regression.

Count Models
Two most common count models:
Poisson Regression Model
Negative Binomial Regression Model
Both based on the Poisson distribution:
Pr(y | μ) = exp(-μ) μ^y / y!
μ = expected count (and variance)
–Called lambda (λ) in some texts; I rely on Long & Freese 2006
y = observed count
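
As a rough illustration of the Poisson distribution this slide refers to, the sketch below evaluates Pr(y | μ) = exp(-μ) μ^y / y! for a few counts. The value of mu is a hypothetical choice for illustration, not a number from the lecture data.

```python
import math

def poisson_pmf(y, mu):
    """Pr(y | mu) = exp(-mu) * mu**y / y! for a non-negative integer y."""
    return math.exp(-mu) * mu ** y / math.factorial(y)

mu = 2.5  # hypothetical expected count (assumption for illustration)

# Probability of each count from 0 to 9
for y in range(10):
    print(f"Pr(y={y}) = {poisson_pmf(y, mu):.4f}")

# The mean and the variance of the distribution both equal mu
mean = sum(y * poisson_pmf(y, mu) for y in range(60))
var = sum((y - mean) ** 2 * poisson_pmf(y, mu) for y in range(60))
print(round(mean, 4), round(var, 4))
```

The last two lines check numerically that the single parameter plays both roles: expected count and variance.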

Poisson Regression
Strategy: Model log of μ as a function of Xs
ln(μ_i) = b0 + b1 X_1i + … + bk X_ki
Quite similar to modeling log odds in logit
Again, the log form avoids negative values
Which can be written as:
μ_i = exp(b0 + b1 X_1i + … + bk X_ki)
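
A small sketch of why the log form avoids negative values: the same linear index can go below zero, but its exponential cannot. The coefficients b0 and b1 are hypothetical values chosen for illustration; they are not estimates from the lecture.

```python
import math

# Hypothetical coefficients (assumptions, not estimates from the lecture data)
b0, b1 = 0.5, -0.8

def linear_prediction(x):
    """Linear index b0 + b1*x, as an OLS-style model would predict directly."""
    return b0 + b1 * x

def expected_count(x):
    """Poisson regression mean: mu = exp(b0 + b1*x), always positive."""
    return math.exp(linear_prediction(x))

for x in [0, 1, 2, 5]:
    print(x, round(linear_prediction(x), 3), round(expected_count(x), 3))
```

At x = 5 the linear index is negative while the modeled expected count stays above zero.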

Poisson Regression: Example
Hours per week spent on web

Poisson Regression: Web Use
Output = similar to logistic regression
. poisson wwwhr male age educ lowincome babies
Poisson regression, Number of obs = 1552
[Stata coefficient table for wwwhr: Coef., Std. Err., z, P>|z|, and 95% Conf. Interval for male, age, educ, lowincome, babies, _cons; numeric values not preserved in this transcript]
Men spend more time on the web than women
Number of young children in household reduces web use

Poisson Regression: Stata Output
Stata output yields familiar statistics:
–Standard errors, z/t-values, and p-values for coefficient hypothesis tests
–Pseudo R-square for model fit
Not a great measure… but gives a crude explained variance
–MLE log likelihood
–Likelihood ratio test: Chi-square and p-value
Comparing to null model (constant only)
Tests can also be conducted on nested models with Stata command “lrtest”.

Interpreting Coefficients
In Poisson Regression, Y is typically conceptualized as a rate…
Positive coefficients indicate higher rate; negative = lower rate
Like logit, Poisson models are non-linear
Coefficients don’t have a simple linear interpretation
Like logit, model has a log form; exponentiation aids interpretation
Exponentiated coefficients are multiplicative
Analogous to odds ratios… but called “incidence rate ratios”.

Interpreting Coefficients
Exponentiated coefficients: indicate effect of unit change of X on rate
In Stata: “incidence rate ratios”: “poisson …, irr”
e^b = 2.0 indicates that the rate doubles for each one-unit increase in X
e^b = .5 indicates that the rate drops by half for each one-unit increase in X
Recall: Exponentiated coefs are multiplicative
If e^b = 5.0, a 2-point change in X isn’t 10; it is 5 * 5 = 25
–Also: you must invert to see opposite effects
If e^b = 5.0, a 1-point decrease in X isn’t -5, it is 1/5 = .2
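
The multiplicative logic on this slide can be checked with a few lines of arithmetic. The helper name rate_multiplier is introduced here for illustration only.

```python
import math

b = math.log(5.0)  # a coefficient whose exponentiated value (rate ratio) is 5.0

def rate_multiplier(b, delta_x):
    """Factor by which the expected rate changes when X changes by delta_x."""
    return math.exp(b * delta_x)

print(round(rate_multiplier(b, 1), 4))   # one-unit increase
print(round(rate_multiplier(b, 2), 4))   # two-unit increase: 5 * 5, not 10
print(round(rate_multiplier(b, -1), 4))  # one-unit decrease: 1/5, not -5
```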

Interpreting Coefficients
Again, exponentiated coefficients (rate ratios) can be converted to % change
Formula: (e^b - 1) * 100%
Ex: a coefficient of ln(.5) ≈ -.693 gives (e^-.693 - 1) * 100% = -50%, a 50% decrease in rate.

Interpreting Coefficients
Exponentiated coefficients yield multiplier:
. poisson wwwhr male age educ lowincome babies
Poisson regression, Number of obs = 1552
[Stata coefficient table for wwwhr: Coef., Std. Err., z, P>|z|, and 95% Conf. Interval for male, age, educ, lowincome, babies, _cons; numeric values not preserved in this transcript]
Exponentiation of .359 = 1.43; rate is 1.43 times higher for men
(1.43 - 1) * 100 = 43% more
Exp(-.14) = .87. Each baby reduces rate by factor of .87
(.87 - 1) * 100 = 13% less
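
The two conversions quoted on this slide can be reproduced with the percent-change formula (e^b - 1) * 100. The coefficients .359 (male) and -.14 (babies) are the ones stated in the slide text; the function name percent_change is an illustrative choice.

```python
import math

def percent_change(b):
    """Percent change in the expected rate for a one-unit increase in X."""
    return (math.exp(b) - 1) * 100

# Coefficients as quoted in the slide text
for name, b in [("male", 0.359), ("babies", -0.14)]:
    ratio = math.exp(b)
    print(f"{name}: rate ratio = {ratio:.2f}, percent change = {percent_change(b):.0f}%")
```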

Probabilities of Count Outcomes
Stata extension “prcount” can compute probabilities for each possible count outcome
For all cases, or for particular groups
It plugs values (μ), Xs, & bs into formula:
Pr(y | x) = exp(-μ) μ^y / y!, where μ = exp(b0 + b1 X1 + … + bk Xk)
[prcount output: predicted rate and Pr(y=0|x) through Pr(y=9|x), each with a confidence interval, evaluated at x = values of male, age, educ, lowincome, babies; numeric values not preserved in this transcript]

Predicted Counts
Stata “predict varname, n” computes predicted value for each case
. predict predwww if e(sample), n
. list wwwhr predwww if e(sample)
[list output: observed wwwhr and predicted predwww for cases 1, 2, 3, 12, 13, 15, 16, 19, 20, 21, 23, 24, 25, 27, 33; values not preserved in this transcript]
Some of the predictions are close to the observed values…
Many of the predictions are quite bad…
Recall that the model fit was VERY poor!

Predicted Counts
Stata commands adjust (Stata 9/10) and margins (Stata 11) can summarize predicted counts
You can compute average predictions across all cases in your data… or for sub-groups of the data.
–The trick is to figure out what values to use for OTHER variables when you compute predictions
Hold other variables at the mean of all cases?
Hold other variables at the mean for each subgroup of the variable of interest?
Set other variables at values corresponding to an interesting hypothetical case?

Predicted Counts: adjust/margins
Example: comparing women and men
. margins, at(male=(0 1)) atmeans
Adjusted predictions, Number of obs = 1552
Expression: Predicted number of events, predict()
1._at: male = 0; age, educ, lowincome, babies = (mean)
2._at: male = 1; age, educ, lowincome, babies = (mean)
[margins table: Margin, Delta-method Std. Err., z, P>|z|, and 95% Conf. Interval for each _at row; numeric values not preserved in this transcript]
The _at = 2 prediction refers to men (male = 1), with other variables held at the mean of all cases

Issue: Exposure
Poisson outcome variables are typically conceptualized as rates
Web hours per week
Number of crimes committed in past year
Issue: Cases may vary in exposure to “risk” of a given outcome
To properly model rates, we must account for the fact that some cases have greater exposure than others
Ex: Number of crimes committed in lifetime
–Older people have greater opportunity to have higher counts
Alternately, exposure may vary due to research design
–Ex: Some cases followed for longer time than others…

Issue: Exposure
Poisson (and other count models) can address varying exposure:
μ_i = t_i * exp(b0 + b1 X_1i + … + bk X_ki), equivalently ln(μ_i) = b0 + b1 X_1i + … + bk X_ki + ln(t_i)
Where t_i = exposure time for case i
It is easy to incorporate into Stata, too:
Ex: poisson NumCrimes SES income, exposure(age)
Note: Also works with other “count” models.
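
A minimal sketch of the exposure adjustment, assuming the standard form in which ln(exposure) enters the linear index with its coefficient fixed at 1. The linear predictor value xb and the exposure times are hypothetical.

```python
import math

def expected_count(xb, exposure):
    """mu = t * exp(xb), i.e. ln(mu) = xb + ln(t) with the ln(t) coefficient fixed at 1."""
    return exposure * math.exp(xb)

xb = -2.0  # hypothetical linear predictor b0 + b1*X1 + ... (assumption)

# Same covariates, different exposure (e.g. years at risk)
for years in (10, 20, 40):
    mu = expected_count(xb, years)
    print(years, round(mu, 3), round(mu / years, 4))  # count grows, rate per year is constant
```

Doubling the exposure doubles the expected count while the underlying rate per unit of exposure stays the same, which is the point of the adjustment.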

Poisson Model Assumptions
Poisson regression makes a big assumption: that the variance of y = μ (“equidispersion”)
In other words, the mean and variance are the same
This assumption is often not met in real data
Dispersion is often greater than μ: overdispersion
–Consequence of overdispersion: Standard errors will be underestimated
Potential for overconfidence in results; rejecting H0 when you shouldn’t!
Note: overdispersion doesn’t necessarily affect predicted counts (compared to alternative models).

Poisson Model Assumptions
Overdispersion is most often caused by highly skewed dependent variables
–Often due to variables with high numbers of zeros
Ex: Number of traffic tickets per year
Most people have zero, some can have 50!
Mean of variable is low, but SD is high
–Other examples of skewed outcomes
Number of scholarly publications
Number of cigarettes smoked per day
Number of riots per year (for sample of cities in US).
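
A crude first look at overdispersion is to compare the mean and variance of the outcome. The ticket counts below are made-up data in the spirit of the traffic-ticket example; note that this is an unconditional check, whereas the Poisson regression assumption concerns the variance given the Xs.

```python
import statistics

# Hypothetical traffic-ticket counts: mostly zeros, a few very large values
tickets = [0] * 80 + [1] * 12 + [2] * 5 + [10, 25, 50]

mean = statistics.mean(tickets)
variance = statistics.variance(tickets)  # sample variance

print(f"mean = {mean:.2f}, variance = {variance:.2f}, ratio = {variance / mean:.1f}")
```

A variance far above the mean, as here, suggests that an equidispersed Poisson model is unlikely to fit.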

Negative Binomial Regression
Strategy: Modify the Poisson model to address overdispersion
Add an “error” term to the basic model:
ln(μ_i) = b0 + b1 X_1i + … + bk X_ki + ε_i
Additional model assumptions:
Expected value of exponentiated error = 1 (E[e^ε] = 1)
Exponentiated error is Gamma distributed
We hope that these assumptions are more plausible than the equidispersion assumption!

Negative Binomial Regression
Full negative binomial model:
Pr(y | x) = [Γ(y + 1/α) / (y! Γ(1/α))] * [(1/α) / (1/α + μ)]^(1/α) * [μ / (1/α + μ)]^y
Note that the model incorporates a new parameter: α
Alpha represents the extent of overdispersion
If α = 0 the model reduces to simple Poisson regression
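
The sketch below evaluates the negative binomial probability in its common mean-dispersion form (mean μ, variance μ + αμ²) and shows it approaching the Poisson probability as α shrinks toward zero. The parameterization is the standard one and is assumed here; the values of mu and alpha are hypothetical.

```python
import math

def poisson_pmf(y, mu):
    return math.exp(-mu) * mu ** y / math.factorial(y)

def negbin_pmf(y, mu, alpha):
    """Negative binomial probability with mean mu and dispersion alpha > 0."""
    inv = 1.0 / alpha
    log_p = (math.lgamma(y + inv) - math.lgamma(y + 1) - math.lgamma(inv)
             + inv * math.log(inv / (inv + mu))
             + y * math.log(mu / (inv + mu)))
    return math.exp(log_p)

mu = 2.0  # hypothetical expected count

# As alpha gets small, the probability of a zero approaches the Poisson value
for alpha in (1.0, 0.1, 0.0001):
    print(alpha, round(negbin_pmf(0, mu, alpha), 4), round(poisson_pmf(0, mu), 4))

# With alpha = 1 the variance is mu + alpha * mu**2 = 6, well above the mean of 2
nb_mean = sum(y * negbin_pmf(y, mu, 1.0) for y in range(200))
nb_var = sum((y - nb_mean) ** 2 * negbin_pmf(y, mu, 1.0) for y in range(200))
print(round(nb_mean, 4), round(nb_var, 4))
```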

Negative Binomial Regression
Question: Is alpha (α) = 0?
If so, we can use Poisson regression
If not, overdispersion is present; Poisson is inadequate
Strategy: conduct a statistical test of the hypothesis: H0: α = 0; H1: α > 0
Stata provides this information when you run a negative binomial model:
Likelihood ratio test (G^2) for alpha
P-value < .05 indicates that overdispersion is present; negative binomial is preferred
If P > .05, just use Poisson regression
–So you don’t have to make assumptions about gamma dist….
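
A sketch of how the likelihood ratio test for alpha can be computed by hand from two log likelihoods. It assumes the usual boundary correction, in which the statistic is referred to a 50:50 mixture of chi-square(0) and chi-square(1), matching the chibar2(01) label in Stata output. The log-likelihood values are hypothetical, not taken from the lecture output.

```python
import math

def lr_test_alpha(ll_poisson, ll_negbin):
    """Return (G2, p-value) for H0: alpha = 0 against H1: alpha > 0."""
    g2 = 2.0 * (ll_negbin - ll_poisson)
    if g2 <= 0:
        return max(g2, 0.0), 1.0
    # P(chi2 with 1 df > g2) = erfc(sqrt(g2 / 2)); halve it because alpha = 0
    # lies on the boundary of the parameter space
    p_value = 0.5 * math.erfc(math.sqrt(g2 / 2.0))
    return g2, p_value

# Hypothetical log likelihoods (assumptions for illustration)
g2, p = lr_test_alpha(ll_poisson=-5200.0, ll_negbin=-3400.0)
print(g2, p)
```

With a gap this large between the two log likelihoods the p-value is effectively zero, which is the situation described as "overdispersion is present".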

Negative Binomial Regression
Interpreting coefficients: Identical to Poisson regression
Predicted probabilities: Can be done. You must use the big negative binomial formula
Plugging in observed Xs, estimates of α, bs…
Probably best to get Stata to do this one…
Long & Freese created command: prvalue

Negative Binomial Example: Web Use
Note: bs are similar but SEs change a lot!
Negative binomial regression, Number of obs = 1552
[Stata coefficient table for wwwhr: Coef., Std. Err., z, P>|z|, and 95% Conf. Interval for male, age, educ, lowincome, babies, _cons, plus /lnalpha and alpha; likelihood-ratio test of alpha=0 reported as chibar2(01) with Prob>=chibar2; numeric values not preserved in this transcript]
Note: Standard error for education increased from .004 to .012!
Effect is no longer statistically significant.

Negative Binomial Example: Web Use
Note: Info on overdispersion is provided
Negative binomial regression, Number of obs = 1552
[Stata coefficient table for wwwhr: Coef., Std. Err., z, P>|z|, and 95% Conf. Interval for male, age, educ, lowincome, babies, _cons, plus /lnalpha and alpha; likelihood-ratio test of alpha=0 reported as chibar2(01) with Prob>=chibar2; numeric values not preserved in this transcript]
Alpha is clearly > 0! Overdispersion is evident; LR test p < .05
You should not use Poisson Regression in this case

General Remarks
Poisson & negative binomial models suffer all the same basic issues as “normal” regression
Model specification / omitted variable bias
Multicollinearity
Outliers/influential cases
–Also, they use Maximum Likelihood
N > 500 = fine; N < 100 can be worrisome
–Results aren’t necessarily wrong if N < 100;
–But it is a possibility; and hard to know when problems crop up
Plus ~10 cases per independent variable.

General Remarks
It is often useful to try both Poisson and Negative Binomial models
The latter allows you to test for overdispersion
Use the LR test on alpha (α) to guide model choice
–If you don’t suspect overdispersion and alpha appears to be zero, use Poisson Regression
It makes fewer assumptions
–Such as gamma-distributed error.

Example: Labor Militancy
Isaac & Christiansen 2002
Note: Results are presented as % change

Zero-Inflated Poisson & NB Reg
If outcome variable has many zero values it tends to be highly skewed
Under those circumstances, NBREG works better than ordinary Poisson due to overdispersion
–But, sometimes you have LOTS of zeros. Even nbreg isn’t sufficient
Model under-predicts zeros, doesn’t fit well
–Examples:
Number of violent crimes committed by a person in a year
Number of wars a country fights per year
Number of foreign subsidiaries of firms.

Zero-Inflated Poisson & NB Reg
Logic of zero-inflated models: Assume two types of groups in your sample
Type A: Always zero – no probability of non-zero value
Type ~A: Non-zero chance of positive count value
–Probability is variable, but not zero
–1. Use logit to model group membership
–2. Use poisson or nbreg to model counts for those in group ~A
–3. Compute probabilities based on those results.
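
The three steps above can be sketched for the Poisson case as a mixture: with probability psi a case belongs to the always-zero group, otherwise its count follows a Poisson distribution. The logit index and mu below are hypothetical values, and the function names are illustrative.

```python
import math

def poisson_pmf(y, mu):
    return math.exp(-mu) * mu ** y / math.factorial(y)

def logistic(z):
    """Step 1: a logit model gives the probability of the always-zero group."""
    return 1.0 / (1.0 + math.exp(-z))

def zip_pmf(y, mu, psi):
    """Step 3: combine group membership (psi) with the Poisson count model (mu)."""
    if y == 0:
        # Zeros come from the always-zero group or from a Poisson draw of zero
        return psi + (1.0 - psi) * math.exp(-mu)
    return (1.0 - psi) * poisson_pmf(y, mu)

psi = logistic(0.0)  # hypothetical logit index of 0 gives psi = 0.5
mu = 3.0             # hypothetical Poisson mean for the not-always-zero group (step 2)

print(round(zip_pmf(0, mu, psi), 4), round(poisson_pmf(0, mu), 4))
```

The zero-inflated probability of a zero is much larger than the plain Poisson value, which is how the model accommodates "LOTS of zeros".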

Zero-Inflated Poisson & NB Reg
Example: Web usage at work
More skewed than overall web usage. Why?
Many people don’t have computers at work!
So, web usage is zero for many

Zero-Inflated Poisson & NB Reg
Zero-inflated models in Stata
“zip” = Poisson, “zinb” = negative binomial
Commands accept two separate variable lists
–Variables that affect counts
For those not in the “always zero” group
Modeled with Poisson or NB regression
–Variables that predict membership in “zero” group
Modeled with logit
–Ex: zinb webatwork male age educ lowincome babies, inflate(male age educ lowincome babies)

ZINB Example: Web Hrs at Work
“Inflate” output = logit for group membership
Zero-inflated negative binomial regression, Number of obs = 1135
Nonzero obs = 562; Zero obs = 573
Inflation model = logit
[Stata coefficient tables: count equation (webatwork) and inflate equation, each with Coef., Std. Err., z, P>|z|, and 95% Conf. Interval for male, age, educ, lowincome, babies, _cons; numeric values not preserved in this transcript]
Inflate equation = model predicting zero group
Education reduces odds of zero value
But doesn’t have an effect on count for those that are non-zero

Zero-Inflated Poisson & NB Reg
Remarks
–ZINB produces estimate of alpha
Helps choose between zip & zinb
–Long and Freese (2006) have helpful tool to compare fit of count models: countfit
See textbook
–Zero-inflated models seem very useful
Count variables often have many zeros
It is often reasonable to assume an “always zero” group
–But, they are fairly new
Not many examples in the literature
Haven’t been widely scrutinized.

Zero-truncated Poisson & NB reg
Truncation – the absence of information about cases in some range of a variable
Example: Suppose we study income based on data from tax returns…
–Cases with income below a certain value are not required to submit a tax return… so data is missing
Example: Data on number of crimes committed, taken from legal records
–Individuals with zero crimes are not evident in data
Example: An on-line survey of web use
–Individuals with zero web use are not in data
Poisson & NB have been adapted to address truncated data:
–Zero-truncated Poisson & Zero-truncated NB reg.
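
For the Poisson case, the adaptation amounts to rescaling the ordinary probabilities by the chance of observing a non-zero count. The sketch below assumes that standard zero-truncated form; mu is a hypothetical value and the function names are illustrative.

```python
import math

def poisson_pmf(y, mu):
    return math.exp(-mu) * mu ** y / math.factorial(y)

def ztp_pmf(y, mu):
    """Pr(y | y > 0): Poisson probability rescaled by Pr(y > 0) = 1 - exp(-mu)."""
    if y < 1:
        return 0.0
    return poisson_pmf(y, mu) / (1.0 - math.exp(-mu))

def ztp_mean(mu):
    """Expected count among observed (non-zero) cases."""
    return mu / (1.0 - math.exp(-mu))

mu = 1.2  # hypothetical underlying Poisson mean

print(round(ztp_pmf(1, mu), 4), round(poisson_pmf(1, mu), 4))
print(round(ztp_mean(mu), 4))
```

The observed mean among non-zero cases is higher than the underlying mu, which is why fitting an ordinary Poisson model to zero-truncated data would overstate the rate.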

Example: Zero-truncated NB Reg
Web use (zeros removed)
Zero-truncated negative binomial regression, Number of obs = 1304
Dispersion = mean
[Stata coefficient table for wwwhr: Coef., Std. Err., z, P>|z|, and 95% Conf. Interval for male, age, educ, lowincome, babies, _cons, plus /lnalpha and alpha; likelihood-ratio test of alpha=0 reported as chibar2(01) with Prob>=chibar2; numeric values not preserved in this transcript]
Coefficient interpretation works just like ordinary Poisson or NB regression.

Empirical Example 2 Example: Haynie, Dana L “Delinquent Peers Revisited: Does Network Structure Matter?” American Journal of Sociology, 106, 4: