Logistic regression.

Presentation on theme: "Logistic regression."— Presentation transcript:

1 Logistic regression

2 Analysis of proportion data
We know how many times an event occurred and how many times it did not occur. We want to know whether these proportions are affected by a treatment or a factor. Examples: the proportion dying, the proportion responding to a treatment, the proportion of one sex, the proportion flowering.

3 The old-fashioned way
People used to model these data using percentages as the response variable. The problems with this are: the errors are not normally distributed; the variance is not constant; the response is bounded (0-1); and information on the sample size is lost.

4 However…
Some data, such as percentage of plant cover, are better analyzed using conventional models (normal errors and constant variance) after the arcsine transformation, with the transformed response measured in radians.
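As a rough illustration of the transformation mentioned above, the sketch below applies the arcsine square-root form, which is the variant usually meant by "arcsine transformation" (that choice is an assumption, since the slide does not spell it out). The cover percentages are hypothetical values made up for the example.

```python
import math

def arcsine_sqrt(percent):
    """Arcsine square-root transform of a percentage (0-100); result in radians."""
    p = percent / 100.0
    if not 0.0 <= p <= 1.0:
        raise ValueError("percentage must lie between 0 and 100")
    return math.asin(math.sqrt(p))

# Hypothetical plant-cover percentages, for illustration only.
cover = [0, 5, 25, 50, 75, 95, 100]
transformed = [arcsine_sqrt(c) for c in cover]

for c, t in zip(cover, transformed):
    print(f"{c:3d}% cover -> {t:.4f} rad")
```

The transformed values run from 0 to pi/2 radians and stretch the scale near 0% and 100%, which is where the variance of raw percentages is most compressed.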


6 If the response variable takes the form of percentage change of some measurement
Usually it is better to use one of two approaches: an analysis of covariance, with final weight as the response variable and initial weight as the covariate; or a relative growth rate, measured as log(final/initial), as the response variable. Both can be analyzed with normal errors without further transformation.
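The relative growth rate can be sketched in a few lines. The weights below are hypothetical, and the function name is only illustrative.

```python
import math

def relative_growth_rate(initial, final):
    """Relative growth rate, log(final / initial), using the natural log."""
    if initial <= 0 or final <= 0:
        raise ValueError("weights must be positive")
    return math.log(final / initial)

# Hypothetical (initial, final) weights, for illustration only.
weights = [(10.0, 15.0), (20.0, 30.0), (12.0, 12.0), (8.0, 6.0)]

for initial, final in weights:
    rgr = relative_growth_rate(initial, final)
    print(f"initial={initial:5.1f} final={final:5.1f} RGR={rgr:+.4f}")
```

Unlike a percentage change, this measure is unbounded in both directions: the same proportional gain gives the same value whatever the starting weight, no change gives 0, and a loss gives a negative value.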

7 Rationale for logistic regression
The traditional transformation of proportion data was arcsine. This transformation took care of the error distribution. There is nothing wrong with this transformation, but a simpler approach is often preferable, and is likely to produce a model that is easier to interpret…

8 The logistic curve
The logistic curve is commonly used to describe data on proportions. It asymptotes at 0 and 1, so negative proportions and responses of more than 100% cannot be predicted.

9 Binomial errors
Let p be the proportion of individuals observed to respond in a given way. The proportion of individuals that respond in alternative ways is 1 - p, and we shall call this proportion q. n is the size of the sample (or number of attempts). An important point is that the variance of the binomial distribution is not constant. In fact, the variance of a binomial distribution with mean np is:
$\sigma^2 = npq = np(1 - p)$
So the variance changes with the mean: it is zero at p = 0 and p = 1, and largest at p = 0.5.
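To make the non-constant variance concrete, here is a minimal sketch that tabulates the binomial mean and variance over a range of p. The sample size n = 20 is a hypothetical choice.

```python
def binomial_mean(n, p):
    """Mean of a binomial count: n * p."""
    return n * p

def binomial_variance(n, p):
    """Variance of a binomial count: n * p * q, with q = 1 - p."""
    return n * p * (1.0 - p)

n = 20  # hypothetical sample size, for illustration only
for p in (0.0, 0.05, 0.25, 0.5, 0.75, 0.95, 1.0):
    mean = binomial_mean(n, p)
    var = binomial_variance(n, p)
    print(f"p={p:4.2f} mean={mean:5.2f} variance={var:5.2f}")
```

The variance is symmetric about p = 0.5, so a model that assumes constant variance is misspecified wherever the proportions are not all close together.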

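10 The logistic model
The logistic model for p as a function of x is given by:
$p = \frac{e^{a + bx}}{1 + e^{a + bx}}$
This model is bounded, since p approaches 0 as $a + bx \to -\infty$ and approaches 1 as $a + bx \to +\infty$.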

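11 The trick of linearizing the logistic model is a simple transformation known as the logit…
The logit of p is the log of the odds, $\ln\left(\frac{p}{1 - p}\right)$, which is a linear function of x under the logistic model.
See the class website for a fuller description of the logit transformation.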
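The linearizing effect can be checked numerically. In the sketch below the intercept a and slope b are hypothetical values chosen for illustration. Proportions are generated from a logistic curve and the logit is then applied to them.

```python
import math

def logistic(x, a, b):
    """Logistic curve: proportion p for a linear predictor a + b*x."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))

def logit(p):
    """Log odds of a proportion p strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

a, b = -2.0, 0.5          # hypothetical intercept and slope
xs = [0, 2, 4, 6, 8]
ps = [logistic(x, a, b) for x in xs]   # S-shaped in x, bounded by 0 and 1
zs = [logit(p) for p in ps]            # straight line in x

for x, p, z in zip(xs, ps, zs):
    print(f"x={x} p={p:.4f} logit(p)={z:+.4f}")
```

The logits recover a + b*x, so equal steps in x give equal steps on the logit scale even though the steps in p are unequal.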

12 Hypericum cumulicola
Small, short-lived perennial herb. Narrowly endemic and endangered. Flowers are small and bisexual. Self-compatible, but requires pollinators to set seed. (Menges et al. 1999; Dolan et al. 1999; Boyle and Menges 2001)

13 Demographic data
15 populations (various patch sizes), with more than 80 individuals per population each year. Data on height and number of reproductive structures. Survival between August 1994 and August 1995.

14 Histogram of height (cm) Hypericum cumulicola (1994)

15 Call:
glm(formula = survival ~ height, family = binomial)
Deviance Residuals:
Min 1Q Median 3Q Max
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) <2e-16 ***
height <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: on 878 degrees of freedom
Residual deviance: on 877 degrees of freedom
AIC:
Number of Fisher Scoring iterations: 4
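The output above reports Fisher scoring iterations. As a minimal sketch of what that fitting procedure does for a single predictor, the Python code below fits logit(p) = a + b*x to 0/1 responses. It is not R's glm implementation. The data are hypothetical, not the Hypericum data set, and the function names, starting values and stopping rule are assumptions made for the example.

```python
import math

def inv_logit(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(x, y, max_iter=25, tol=1e-10):
    """Fit logit(p) = a + b*x by Fisher scoring; returns (a, b, iterations)."""
    a, b = 0.0, 0.0
    for iteration in range(1, max_iter + 1):
        p = [inv_logit(a + b * xi) for xi in x]
        w = [pi * (1.0 - pi) for pi in p]  # binomial variance weights
        # Score vector (gradient of the log-likelihood).
        u_a = sum(yi - pi for yi, pi in zip(y, p))
        u_b = sum(xi * (yi - pi) for xi, yi, pi in zip(x, y, p))
        # 2x2 Fisher information matrix.
        i_aa = sum(w)
        i_ab = sum(wi * xi for wi, xi in zip(w, x))
        i_bb = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = i_aa * i_bb - i_ab * i_ab
        # Update = inverse information times score.
        step_a = (i_bb * u_a - i_ab * u_b) / det
        step_b = (i_aa * u_b - i_ab * u_a) / det
        a += step_a
        b += step_b
        if max(abs(step_a), abs(step_b)) < tol:
            break
    return a, b, iteration

def deviance(x, y, a, b):
    """Binomial deviance for 0/1 data: -2 times the log-likelihood."""
    total = 0.0
    for xi, yi in zip(x, y):
        p = inv_logit(a + b * xi)
        total += math.log(p) if yi == 1 else math.log(1.0 - p)
    return -2.0 * total

# Hypothetical heights (cm) and survival (1 = survived), for illustration only.
height = [10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65]
survival = [0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1]

a_hat, b_hat, n_iter = fit_logistic(height, survival)
y_bar = sum(survival) / len(survival)
null_dev = deviance(height, survival, math.log(y_bar / (1.0 - y_bar)), 0.0)
resid_dev = deviance(height, survival, a_hat, b_hat)

print(f"intercept={a_hat:.4f} slope={b_hat:.4f} iterations={n_iter}")
print(f"null deviance={null_dev:.3f} residual deviance={resid_dev:.3f}")
```

For the logit link, Fisher scoring coincides with Newton's method, which is why a handful of iterations is typically enough. The drop from null to residual deviance is the quantity used to judge whether height improves on an intercept-only model.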

16 Calculating a given proportion
You can back-transform from logits (z) to proportions (p) by:
$p = \frac{e^{z}}{1 + e^{z}} = \frac{1}{1 + e^{-z}}$
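A small sketch of the back-transformation follows. It is written so that very large positive or negative logits do not overflow. The intercept and slope used for the predictions are hypothetical placeholders, not the estimates from the fitted model, which are not reproduced in this transcript.

```python
import math

def inv_logit(z):
    """Back-transform a logit z to a proportion p without overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # z < 0, so exp(z) cannot overflow
    return ez / (1.0 + ez)

# Hypothetical coefficients for logit(survival) = intercept + slope * height.
intercept, slope = -1.0, 0.05

for height_cm in (10, 20, 30, 40, 50, 60):
    z = intercept + slope * height_cm
    print(f"height={height_cm} cm logit={z:+.2f} predicted proportion={inv_logit(z):.3f}")
```

Predictions are made on the logit scale first and back-transformed afterwards, so the resulting proportions always fall between 0 and 1.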

17 Survival vs. height

18 Survival vs. Rep. Structures


