1 Logistic Regression Multivariate Analysis

2 What is a log and an exponent? A log is the power to which a base must be raised to produce a given number. With base 10, the log of 1000 is 3 because 10^3 = 1000, and the log of an odds of 1.0 is 0 because 10^0 = 1. Logistic regression uses the natural log, whose base is e, approximately 2.718. Raising e to a power is the antilog of that number, so exp(β) is the antilog of β. The antilog of a log odds of 0 is e^0 = 1. Exponential increases are curvilinear.
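These relationships are easy to check numerically. A minimal sketch using Python's standard math module, covering the base-10 example above and the natural-log base e that logistic regression uses:

```python
import math

# Base-10 log: the power to which 10 must be raised to give the number
log_1000 = math.log10(1000)   # 10**3 = 1000, so the log is 3

# Logistic regression uses the natural log, whose base is e ~= 2.718
base = math.e

# The antilog (exponential) of a log odds of 0 is e**0 = 1,
# i.e. an odds of 1.0 corresponds to a log odds of 0
antilog_of_zero = math.exp(0)
```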

3 Main questions with logistic regression How do the odds of a successful outcome depend upon or change based on each explanatory variable (X)? How does the probability that a successful outcome occurs depend upon or change based on each explanatory variable (X)?

4 Logistic regression A single binary response variable predicted by categorical and interval variables Maximum likelihood model – the coefficients that make the sample observations most likely are reported in the final model Assumes a binomial distribution and a sigmoid (non-linear) curve The probability of success falls between 0 and 1 for all possible values of X (the s-curve bends)

5 Sigmoid curve for logistic regression

6 Response variable Denote Y by 0 and 1 (dummy coding) 0 and 1 are usually termed failure and success of an outcome (by convention, success is category 1) The sample mean of Y is the sum of the number of successes divided by the sample size (proportion of success)

7 Odds ratios in logistic regression Can be thought of as the likelihood or odds of success based on the impact of the predictors in the model For interval predictors: the odds of success for those who are a unit apart on X, net of other predictors For dummy predictors: the odds of success for those in the category coded 1 compared with those in the omitted reference category (0) Every unit increase in X has a multiplicative (exponential) effect on the odds of success, so an odds ratio can be > 1

8 Odds ratio π / (1 − π) is the odds of success When the probability of success π is ½, or 50-50, the odds of success equal .5/(1 − .5) = 1.0; this means that success is equally as likely as failure Thus, a predicted probability of .5 and an odds of 1.0 are our points of comparison when making inferences
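The odds computation on this slide can be sketched directly (the helper function is hypothetical, not from the deck):

```python
def odds(pi):
    """Convert a probability of success pi into the odds pi / (1 - pi)."""
    return pi / (1 - pi)

# When success is a 50-50 proposition, the odds equal 1.0 (the benchmark)
even_odds = odds(0.5)
```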

9 Logistic transformation of the odds To model dichotomous outcomes, SPSS takes the logistic transformation of the odds: log(π / (1 − π)) = α + β1X1 + β2X2 + … To interpret, we take the exponent of the beta coefficient for each predictor (can do this for all predictors in the model) The odds of success are then: π / (1 − π) = e^(α + βX) = e^α (e^β)^X
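A sketch of the transformation in both directions; the intercept and slope values here (α = −2, β = 0.5) are made up purely for illustration:

```python
import math

alpha, beta = -2.0, 0.5   # hypothetical intercept and slope

def log_odds(x):
    """Linear predictor: log(pi / (1 - pi)) = alpha + beta * x."""
    return alpha + beta * x

def odds_of_success(x):
    """Exponentiate the log odds: pi / (1 - pi) = e**(alpha + beta*x)."""
    return math.exp(log_odds(x))

# e**(alpha + beta*x) factors as e**alpha * (e**beta)**x,
# which is why exp(beta) is the multiplier per unit of X
x = 3
factored = math.exp(alpha) * math.exp(beta) ** x
```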

10 We can also talk about the percentage change in odds for interval and dummy variables The exponentiated beta value in the SPSS output can be converted to a percent by 100(exp(β) − 1): the percentage change in odds for each unit increase in the independent variable We don’t really talk about the intercept here; the betas for each predictor are our concern
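The percent-change conversion can be sketched as follows (the coefficient values are hypothetical):

```python
import math

def pct_change_in_odds(b):
    """Percent change in the odds per unit increase in X: 100 * (exp(b) - 1)."""
    return 100 * (math.exp(b) - 1)

# A positive coefficient raises the odds, a negative one lowers them
increase = pct_change_in_odds(0.5)    # roughly +64.9%
decrease = pct_change_in_odds(-0.5)   # roughly -39.3%
```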

11 We can also talk about the probability of success, π Can calculate point estimates by substituting specific X values, so it is good for forecasting given respondent characteristics The impact of X on π is interactive/non-constant π is the probability of success; it varies as X changes and ranges from 0 to 1 (can be expressed as a percent) π = e^(α + βX) / (1 + e^(α + βX)), or odds / (1 + odds)
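That formula can be sketched as a small function:

```python
import math

def prob_success(alpha, beta, x):
    """pi = e**(alpha + beta*x) / (1 + e**(alpha + beta*x)), i.e. odds / (1 + odds)."""
    odds = math.exp(alpha + beta * x)
    return odds / (1 + odds)

# With alpha = 0 and x = 0 the odds are 1.0, so pi = .5 (the benchmark)
benchmark = prob_success(0.0, 1.0, 0.0)
```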

12 Slope in logistic regression models (FYI) Like the slope of a straight line, β refers to whether the sigmoid curve (π, the probability of success) increases (β > 0) or decreases (β < 0) as an interval X increases or as we move from 0 to 1 on a dummy The steepness of the s-curve increases as the absolute value of β increases The rate at which the curve climbs or descends changes according to the values of the independent variable, thus β(X)

13 Slope in logistic regression models (FYI) When β = 0, π does not change as X increases (X has no bearing on the probability or odds of success), so the curve is flat – just a straight line For β > 0, π increases as X increases (the probability of success increases, so the curve rises) For β < 0, π decreases as X increases (the probability of success decreases, so the curve falls)

14 Slope in logistic regression (FYI)

15 Null hypothesis for predictors H0: β = 0 in log(π / (1 − π)) = α + βX X has no effect on the likelihood that an outcome will occur [Y = 1] Y is independent of X, so the likelihood of success is the same across all values of X

16 Wald Statistic The null hypothesis test statistic for each predictor in your model: the Wald statistic is the significance test for each parameter The null is that each β = 0 Has df = 1; chi-square distribution It is the square of the z statistic, where z = β / s.e.(β)
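The relationship between the Wald statistic and z can be sketched as follows (the coefficient and standard error values are hypothetical):

```python
def wald(beta, se):
    """Wald statistic: the square of z = beta / s.e.(beta); df = 1."""
    z = beta / se
    return z ** 2

# Hypothetical coefficient and standard error
w = wald(2.0, 1.0)   # z = 2.0, so Wald = 4.0
```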

17 -2 log likelihood as test of null hypothesis for entire model A test of significance for the whole model, analogous to the F-ratio; chi-square distribution; df = the number of predictor parameters added beyond the intercept Does the observed likelihood or odds of success differ from 1? Compares the model with the intercept alone to the model with the intercept and predictors: do your predictors add to the predictive power of the model? Tests whether the difference is 0; referred to as the model chi-square

18 Goodness of Fit Statistic – null for residuals (FYI) Compares observed probabilities (what you observed in the sample) to the predicted probabilities of an outcome occurring based on the model parameters in your equation Examines residuals – do the predictor coefficients significantly minimize the squared distances? Chi-square distribution; df = p Should be non-significant (NS), as the observed and predicted values are anticipated to be quite similar


21 Mean of our response variable attending self-help group (FYI) The sample mean of Y is the sum of the number of successes (yes to attend) divided by the sample size, n The sample mean is the proportion of successful outcomes Thus, 44 said yes and n = 400, so the mean proportion of yes is 44/400 = .11, or 11%
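The arithmetic on this slide:

```python
successes = 44   # respondents who said yes to attending
n = 400          # sample size

# The sample mean of a 0/1 variable is the proportion of successes
mean_y = successes / n
```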

22 Odds ratio and % change in odds by age Age: β = -.0586, p < .01 (beta negative). Thus, the log odds of attending a self-help group decrease as a person gets older exp(β) = .9431, the odds ratio [exp(β) < 1], thus the odds decrease The % change (in this case a reduction) in the odds of attending for each additional year of age is 100(exp(β) − 1) = 100(.9431 − 1) = -5.69: one is 5.69% less likely each year one ages
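The age calculation from the slide, in code:

```python
import math

beta_age = -0.0586                     # age coefficient reported on the slide

odds_ratio = math.exp(beta_age)        # ~ .9431, < 1 so the odds decrease
pct = 100 * (math.exp(beta_age) - 1)   # ~ -5.69% per additional year of age
```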

23 Predicted probability of attending by age When π < .5, the probability of attending is below the benchmark, and with a negative β we would see a downward dip in the sigmoid curve with increasing values of X (keeping in mind that probability ranges from 0 to 1) More meaningful with all predictors; however, a point estimate for age 80 would be: π = e^((-.0586)(80)) / (1 + e^((-.0586)(80))) = .009 The probability of those 80 years of age attending a group is about 1%
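The point estimate for age 80 in code; note that the slide omits the intercept α from this calculation, and the sketch below follows the slide as written:

```python
import math

beta_age = -0.0586
age = 80

log_odds = beta_age * age   # -4.688 (intercept omitted, as on the slide)
pi = math.exp(log_odds) / (1 + math.exp(log_odds))   # ~ .009, about a 1% probability
```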

24 Odds ratio and % change in odds by gender Gender: β = 1.2540, p < .05. Thus, the odds of attending a self-help group are greater among females (female is coded 1 and the beta is positive; male is the reference category) exp(β) = 3.5045: the odds of attending are 3.5 times as large for females as they are for males [exp(β) > 1] The % change (in this case an increase) in the odds of attending when a person is female is 100(exp(β) − 1) = 100(3.50 − 1) = 250%
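The gender calculation from the slide:

```python
import math

beta_gender = 1.2540                      # gender coefficient (female = 1)

odds_ratio = math.exp(beta_gender)        # ~ 3.50: females' odds ~3.5x males'
pct = 100 * (math.exp(beta_gender) - 1)   # ~ +250% change in odds
```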

25 Predicted probability of attending by gender π = e^((1.254)(1)) / (1 + e^((1.254)(1))) = .77 Thus, the probability of attending among females is 77% When π > .5, the probability of attending is above the benchmark, and we would see an upward trend in the sigmoid curve with increasing values of X on the horizontal axis (keeping in mind that probability ranges from 0 to 1)
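The corresponding computation; note it works out to roughly .778, which the slide reports as .77:

```python
import math

odds = math.exp(1.254 * 1)   # odds of attending for females (coded 1)
pi = odds / (1 + odds)       # predicted probability, ~ .778
```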

26 Wald statistic Null: the coefficient for each independent variable is 0 Tells us which variables significantly predict the likelihood of attending a self-help group: Age = 7.2298**, Gender = 5.7723*

27 Likelihood statistic for the model Null: the likelihood or odds are 1.0, the predicted probability is .5, and all of our predictor variables have β = 0 Constant alone minus constant and all predictors: 281.36838 - 240.518 = 40.85* With 12 df, a model chi-square of 40.85 has p < .0001 The predictors in the model significantly add to our capacity to predict attendance
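The model chi-square arithmetic from the slide:

```python
# -2 log likelihood for the intercept-only model minus the full model
neg2ll_intercept_only = 281.36838
neg2ll_full_model = 240.518

model_chi_square = neg2ll_intercept_only - neg2ll_full_model   # ~ 40.85 on 12 df
```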

28 Goodness of Fit (FYI) 371.093, 12 df, ns Our model parameters minimize the squared distances [residuals] between the actual sample observations of attendance and what the logistic regression equation predicts (odds and probabilities)

29 Logistic Regression References DeMaris, A. (1995). A tutorial in logistic regression. Journal of Marriage and the Family, 57, 956-968. Agresti, A., & Finlay, B. (1997). Logistic regression – modeling categorical responses. In Statistical methods for the social sciences (3rd ed., pp. 575-619). New Jersey: Prentice Hall. Dwyer, J. H. (1983). Statistical methods for the social and behavioral sciences (pp. 447-465). New York: Oxford University Press.

