Count Models Sociology 229: Advanced Regression Copyright © 2010 by Evan Schofer Do not copy or distribute without permission
Announcements Assignment #1 Due Assignment #2 handed out Due in 1 week Agenda: Basic count models Intro to EHA (if time allows)
Count Variables Many dependent variables are counts: non-negative integers # Crimes a person has committed in lifetime # Children living in a household # New companies founded in a year (in an industry) # Social protests per month in a city –Can you think of others?
Count Variables Count variables can be modeled with OLS regression… but: –1. Linear models can yield negative predicted values… whereas counts are never negative Similar to the problem of the Linear Probability Model –2. Count variables are often highly skewed Ex: # crimes committed this year… most people are zero or very low; a few people are very high Extreme skew violates the normality assumption of OLS regression.
Count Models Two most common count models: Poisson Regression Model Negative Binomial Regression Model Both based on the Poisson distribution: Pr(y | μ) = (e^(−μ) μ^y) / y! μ = expected count (and variance) –Called lambda (λ) in some texts; I rely on Long & Freese 2006 y = observed count
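As a quick numerical sketch of the distribution above, the snippet below computes the Poisson probability for a given rate μ and verifies the defining property that the mean and variance are both equal to μ (here μ = 2 is an arbitrary illustrative value):

```python
import math

def poisson_pmf(y, mu):
    """Pr(Y = y) = e^(-mu) * mu^y / y! for a Poisson distribution with rate mu."""
    return math.exp(-mu) * mu ** y / math.factorial(y)

# For mu = 2, the distribution's mean and variance are both equal to mu:
mean = sum(y * poisson_pmf(y, 2.0) for y in range(100))
var = sum((y - mean) ** 2 * poisson_pmf(y, 2.0) for y in range(100))
```

This mean = variance property is exactly the "equidispersion" assumption discussed later in the lecture.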
Poisson Regression Strategy: Model the log of μ as a function of the Xs: ln(μ) = β0 + β1X1 + … + βkXk Quite similar to modeling log odds in logit Again, the log form avoids negative values Which can be written as: μ = e^(β0 + β1X1 + … + βkXk)
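A concrete sketch of the log link, using made-up coefficients (b0, b_male, b_age are hypothetical, not estimates from this lecture): because μ is computed by exponentiating, the predicted rate can never be negative, unlike an OLS prediction.

```python
import math

# Hypothetical coefficients for illustration only (not the lecture's estimates)
b0, b_male, b_age = 1.0, 0.36, -0.01

def predicted_rate(male, age):
    """mu = exp(b0 + b1*male + b2*age); always positive, unlike an OLS fit."""
    return math.exp(b0 + b_male * male + b_age * age)

rate = predicted_rate(male=1, age=40)
```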
Poisson Regression: Example Hours per week spent on web
Poisson Regression: Web Use Output = similar to logistic regression. poisson wwwhr male age educ lowincome babies Poisson regression, Number of obs = 1552 (LR chi2(5), log likelihood, pseudo R2, and the coefficient table for male, age, educ, lowincome, babies, and _cons not reproduced here) Men spend more time on the web than women Number of young children in household reduces web use
Poisson Regression: Stata Output Stata output yields familiar statistics: –Standard errors, z-values, and p-values for coefficient hypothesis tests –Pseudo R-square for model fit Not a great measure… but gives a crude sense of explained variance –MLE log likelihood –Likelihood ratio test: chi-square and p-value Comparing to the null model (constant only) Tests can also be conducted on nested models with the Stata command "lrtest".
Interpreting Coefficients In Poisson Regression, Y is typically conceptualized as a rate… Positive coefficients indicate higher rate; negative = lower rate Like logit, Poisson models are non-linear Coefficients don’t have a simple linear interpretation Like logit, model has a log form; exponentiation aids interpretation Exponentiated coefficients are multiplicative Analogous to odds ratios… but called “incidence rate ratios”.
Interpreting Coefficients Exponentiated coefficients indicate the effect of a unit change of X on the rate In Stata: "incidence rate ratios": "poisson …, irr" e^b = 2.0 indicates that the rate doubles for each unit change in X e^b = .5 indicates that the rate drops by half for each unit change in X Recall: exponentiated coefs are multiplicative If e^b = 5.0, a 2-point change in X isn't 10; it is 5 * 5 = 25 –Also: you must invert to see opposite effects If e^b = 5.0, a 1-point decrease in X isn't -5; it is 1/5 = .2
Interpreting Coefficients Again, exponentiated coefficients (rate ratios) can be converted to % change Formula: (e^b - 1) * 100% Ex: if e^b = .5, then (.5 - 1) * 100% = -50%, a 50% decrease in the rate.
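The interpretation rules above can be checked numerically. Using the male coefficient of .359 from the web-use example, the snippet computes the incidence rate ratio, the percent change, and the multiplicative effect of larger changes in X:

```python
import math

b = 0.359  # the male coefficient from the web-use example

irr = math.exp(b)             # incidence rate ratio, about 1.43
pct_change = (irr - 1) * 100  # percent change in the rate, about +43%

# Multiplicative, not additive: a 2-point change multiplies the rate by irr**2,
# and a 1-point decrease multiplies it by 1/irr.
two_point = irr ** 2
one_point_down = 1 / irr
```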
Interpreting Coefficients Exponentiated coefficients yield a multiplier:. poisson wwwhr male age educ lowincome babies (Poisson regression output, Number of obs = 1552; coefficient table not reproduced here) Exponentiation of .359 = 1.43; the rate is 1.43 times higher for men (1.43 - 1) * 100 = 43% more Exp(-.14) = .87. Each baby reduces the rate by a factor of .87 (.87 - 1) * 100 = 13% less
Probabilities of Count Outcomes The Stata extension "prcount" can compute probabilities for each possible count outcome For all cases, or for particular groups It plugs values (μ, Xs, & bs) into the Poisson formula and reports the rate and Pr(y=0|x) through Pr(y=9|x), each with a confidence interval, at the specified values of male, age, educ, lowincome, and babies (numeric output not reproduced here)
Predicted Counts Stata "predict varname, n" computes a predicted count for each case. predict predwww if e(sample), n. list wwwhr predwww if e(sample) (listing of observed wwwhr and predicted predwww values not reproduced here) Some of the predictions are close to the observed values… Many of the predictions are quite bad… Recall that the model fit was VERY poor!
Predicted Counts Stata command adjust (Stata 9/10) and margins (Stata 11) can summarize predicted counts You can compute average predictions for each case in your data… or for sub-groups of the data. –The trick is to figure out what values to use for OTHER variables when you compute probabilities Hold other variables at the mean of all cases? Hold other variables at the mean for each subgroup of the variable of interest? Set other variables at values corresponding to an interesting hypothetical case?
Predicted Counts: adjust/margins Example: comparing women and men. margins, at(male=(0 1)) atmeans Adjusted predictions, Number of obs = 1552 Expression: predicted number of events, predict() 1._at: male = 0, with age, educ, lowincome, and babies at their means 2._at: male = 1, with the same variables at their means (the margin, delta-method standard error, z, p-value, and confidence interval for each _at are not reproduced here) The second prediction (_at = 2) refers to men, with the other variables held at the mean of all cases
Issue: Exposure Poisson outcome variables are typically conceptualized as rates Web hours per week Number of crimes committed in past year Issue: Cases may vary in exposure to “risk” of a given outcome To properly model rates, we must account for the fact that some cases have greater exposure than others Ex: # crimes committed in lifetime –Older people have greater opportunity to have higher counts Alternately, exposure may vary due to research design –Ex: Some cases followed for longer time than others…
Issue: Exposure Poisson (and other count models) can address varying exposure: ln(μ_i) = β0 + β1X_i1 + … + βkX_ik + ln(t_i) Where t_i = exposure time for case i It is easy to incorporate into Stata, too: Ex: poisson NumCrimes SES income, exposure(age) Note: Also works with other "count" models.
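The offset formulation can be sketched numerically: because ln(t) enters the linear predictor with an implicit coefficient of 1, doubling exposure doubles the expected count while the covariate effects describe the rate per unit of exposure. Coefficients below are hypothetical, for illustration only:

```python
import math

# Hypothetical coefficients (illustration only, not estimates from the lecture)
b0, b_ses = -2.0, 0.5

def expected_count(ses, exposure):
    """ln(mu) = b0 + b_ses*ses + ln(exposure); exposure enters as an offset."""
    return math.exp(b0 + b_ses * ses + math.log(exposure))

# Doubling exposure doubles the expected count, holding covariates fixed:
c20 = expected_count(ses=1.0, exposure=20.0)
c40 = expected_count(ses=1.0, exposure=40.0)
```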
Poisson Model Assumptions Poisson regression makes a big assumption: that the variance of y equals μ ("equidispersion") In other words, the mean and variance are the same This assumption is often not met in real data Dispersion is often greater than μ: overdispersion –Consequence of overdispersion: standard errors will be underestimated Potential for overconfidence in results; rejecting H0 when you shouldn't! Note: overdispersion doesn't necessarily affect predicted counts (compared to alternative models).
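A crude first check for overdispersion is to compare the sample mean and variance of the raw count variable; under equidispersion they should be similar. Illustrated on made-up, highly skewed data of the kind described on the next slide:

```python
# Made-up, highly skewed count data: mostly zeros with a few large values
counts = [0, 0, 0, 0, 1, 0, 2, 0, 0, 15, 0, 1, 0, 0, 30]

n = len(counts)
mean = sum(counts) / n
var = sum((y - mean) ** 2 for y in counts) / (n - 1)  # sample variance
ratio = var / mean  # a ratio well above 1 suggests overdispersion
```

This is only an eyeball diagnostic; the formal LR test of alpha, discussed below, is the standard check.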
Poisson Model Assumptions Overdispersion is most often caused by highly skewed dependent variables –Often due to variables with high numbers of zeros Ex: Number of traffic tickets per year Most people have zero, some can have 50! Mean of variable is low, but SD is high –Other examples of skewed outcomes # of scholarly publications # cigarettes smoked per day # riots per year (for sample of cities in US).
Negative Binomial Regression Strategy: Modify the Poisson model to address overdispersion Add an "error" term to the basic model: ln(μ_i) = β0 + β1X_i1 + … + βkX_ik + ε_i Additional model assumptions: Expected value of the exponentiated error = 1 (E(e^ε) = 1) Exponentiated error is gamma distributed We hope that these assumptions are more plausible than the equidispersion assumption!
Negative Binomial Regression Full negative binomial model: the conditional variance becomes μ + αμ² rather than μ Note that the model incorporates a new parameter: alpha (α) Alpha represents the extent of overdispersion If α = 0 the model reduces to simple Poisson regression
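A sketch of the negative binomial (NB2) probability function implied by this model, written with log-gamma for numerical stability. As α shrinks toward zero it collapses to the Poisson with the same μ, matching the claim above:

```python
import math

def negbin_pmf(y, mu, alpha):
    """NB2 probability of count y: mean mu, variance mu + alpha*mu**2."""
    r = 1.0 / alpha  # inverse overdispersion parameter
    log_p = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
             + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))
    return math.exp(log_p)

# With tiny alpha, NB2 is numerically close to the Poisson with the same mu:
nb = negbin_pmf(3, 2.0, 1e-8)
pois = math.exp(-2.0) * 2.0 ** 3 / math.factorial(3)
```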
Negative Binomial Regression Question: Is alpha (α) = 0? If so, we can use Poisson regression If not, overdispersion is present; Poisson is inadequate Strategy: conduct a statistical test of the hypothesis: H0: α = 0; H1: α > 0 Stata provides this information when you run a negative binomial model: Likelihood ratio test (G²) for alpha P-value < .05 indicates that overdispersion is present; negative binomial is preferred If p > .05, just use Poisson regression –So you don't have to make assumptions about the gamma distribution….
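The LR test Stata reports can be sketched by hand: G² = 2(lnL_NB − lnL_Poisson), compared to a chi-square(1) whose tail probability is halved because α = 0 sits on the boundary of the parameter space (this is why Stata labels it "chibar2(01)"). The log likelihoods below are hypothetical, not the lecture's estimates:

```python
import math

def lr_test_alpha(ll_poisson, ll_negbin):
    """Boundary LR test of H0: alpha = 0 against H1: alpha > 0."""
    g2 = 2.0 * (ll_negbin - ll_poisson)
    # chi-square(1) upper tail equals erfc(sqrt(x/2)); halved for the boundary
    p_value = 0.5 * math.erfc(math.sqrt(g2 / 2.0))
    return g2, p_value

g2, p = lr_test_alpha(ll_poisson=-5300.0, ll_negbin=-4800.0)  # hypothetical
# Here p < .05, so overdispersion is present and negative binomial is preferred
```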
Negative Binomial Regression Interpreting coefficients: Identical to poisson regression Predicted probabilities: Can be done. You must use big Neg Binomial formula Plugging in observed Xs, estimates of a, Bs… Probably best to get STATA to do this one… Long & Freese created command: prvalue
Negative Binomial Example: Web Use Note: the Bs are similar but the SEs change a lot! (Negative binomial regression output, Number of obs = 1552; coefficient table, /lnalpha, alpha, and likelihood-ratio test of alpha=0 not reproduced here) Note: the standard error for education increased from .004 to .012! The effect is no longer statistically significant.
Negative Binomial Example: Web Use Note: info on overdispersion is provided (same negative binomial output as the previous slide; the /lnalpha and alpha estimates and likelihood-ratio test of alpha=0 are not reproduced here) Alpha is clearly > 0! Overdispersion is evident; LR test p < .05 You should not use Poisson regression in this case
General Remarks Poisson & negative binomial models suffer all the same basic issues as "normal" regression Model specification / omitted variable bias Multicollinearity Outliers/influential cases –Also, they use Maximum Likelihood N > 500 = fine; N < 100 can be worrisome –Results aren't necessarily wrong if N < 100; –But it is a possibility, and it is hard to know when problems crop up Plus: ~10 cases per independent variable.
General Remarks It is often useful to try both Poisson and negative binomial models The latter allows you to test for overdispersion Use the LR test on alpha (α) to guide model choice –If you don't suspect overdispersion and alpha appears to be zero, use Poisson regression It makes fewer assumptions –Such as the gamma-distributed error.
Example: Labor Militancy Isaac & Christiansen 2002 Note: Results are presented as % change
Zero-Inflated Poisson & NB Reg If the outcome variable has many zero values it tends to be highly skewed Under those circumstances, NBREG works better than ordinary Poisson due to overdispersion –But sometimes you have LOTS of zeros, and even nbreg isn't sufficient: the model under-predicts zeros and doesn't fit well –Examples: # violent crimes committed by a person in a year # of wars a country fights per year # of foreign subsidiaries of firms.
Zero-Inflated Poisson & NB Reg Logic of zero-inflated models: Assume two types of groups in your sample Type A: Always zero – no probability of non-zero value Type ~A: Non-zero chance of positive count value –Probability is variable, but not zero –1. Use logit to model group membership –2. Use poisson or nbreg to model counts for those in group ~A –3. Compute probabilities based on those results.
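The two-part logic above implies a simple mixture probability, sketched below for the Poisson case (zip); zinb would swap in the negative binomial for the count part. Here π is the (logit-modeled) chance of being in the "always zero" group, and all values are made up for illustration:

```python
import math

def zip_prob(y, pi_zero, mu):
    """Zero-inflated Poisson: Pr(0) = pi + (1-pi)e^-mu; Pr(k>0) = (1-pi)*Poisson(k)."""
    pois = math.exp(-mu) * mu ** y / math.factorial(y)
    if y == 0:
        return pi_zero + (1 - pi_zero) * pois
    return (1 - pi_zero) * pois

# Made-up values: a 40% "always zero" group greatly inflates the zeros
p0 = zip_prob(0, 0.4, 3.0)
p0_poisson_only = math.exp(-3.0)  # plain Poisson predicts far fewer zeros
```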
Zero-Inflated Poisson & NB Reg Example: Web usage at work More skewed than overall web usage. Why? Many people don’t have computers at work! So, web usage is zero for many
Zero-Inflated Poisson & NB Reg Zero-inflated models in Stata “zip” = Poisson, zinb = negative binomial Commands accept two separate variable lists –Variables that affect counts For those with non-zero counts Modeled with Poisson or NB regression –Variables that predict membership in “zero” group Modeled with logit –Ex: zinb webatwork male age educ lowincome babies, inflate(male age educ lowincome babies)
ZINB Example: Web Hrs at Work "Inflate" output = logit for group membership (the model predicting the zero group) Zero-inflated negative binomial regression: Number of obs = 1135, Nonzero obs = 562, Zero obs = 573, Inflation model = logit (the webatwork count equation and the inflate logit equation for male, age, educ, lowincome, babies, and _cons are not reproduced here) Education reduces the odds of a zero value But doesn't have an effect on the count for those that are non-zero
Zero-Inflated Poisson & NB Reg Remarks –ZINB produces an estimate of alpha Helps choose between zip & zinb –Long and Freese (2006) have a helpful tool to compare the fit of count models: countfit See textbook –Zero-inflated models seem very useful Count variables often have many zeros It is often reasonable to assume an "always zero" group –But they are fairly new Not many examples in the literature Haven't been widely scrutinized.
Zero-truncated Poisson & NB reg Truncation – the absence of information about cases in some range of a variable Example: Suppose we study income based on data from tax returns… –Cases with income below a certain value are not required to submit a tax return… so data is missing Example: Data on # crimes committed, taken from legal records –Individuals with zero crimes are not evident in data Example: An on-line survey of web use –Individuals with zero web use are not in data Poisson & NB have been adapted to address truncated data: –Zero-truncated Poisson & zero-truncated NB reg.
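Zero truncation simply renormalizes the count distribution over the positive values; a sketch for the Poisson case divides each probability by Pr(Y > 0):

```python
import math

def trunc_poisson_pmf(y, mu):
    """Pr(Y = y | Y > 0): Poisson probability divided by Pr(Y > 0) = 1 - e^-mu."""
    if y <= 0:
        return 0.0
    pois = math.exp(-mu) * mu ** y / math.factorial(y)
    return pois / (1.0 - math.exp(-mu))

# Each positive count is more likely than under the untruncated Poisson:
p1 = trunc_poisson_pmf(1, 2.0)
```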
Example: Zero-truncated NB Reg Web use (zeros removed) Zero-truncated negative binomial regression: Number of obs = 1304, Dispersion = mean (LR chi2(5), log likelihood, pseudo R2, the coefficient table, alpha, and the likelihood-ratio test of alpha=0 are not reproduced here) Coefficient interpretation works just like ordinary Poisson or NB regression.
Empirical Example 2 Example: Haynie, Dana L. "Delinquent Peers Revisited: Does Network Structure Matter?" American Journal of Sociology 106, 4.