Roger B. Hammer Assistant Professor Department of Sociology Oregon State University Conducting Social Research Logistic Regression Categorical Data Analysis.

Slides:



Advertisements
Similar presentations
Dummy Dependent variable Models
Advertisements

11-1 Empirical Models Many problems in engineering and science involve exploring the relationships between two or more variables. Regression analysis.
Brief introduction on Logistic Regression
Binary Logistic Regression: One Dichotomous Independent Variable
Irwin/McGraw-Hill © Andrew F. Siegel, 1997 and l Chapter 12 l Multiple Regression: Predicting One Factor from Several Others.
Logistic Regression.
Regression Analysis Simple Regression. y = mx + b y = a + bx.
Regression Analysis Module 3. Regression Regression is the attempt to explain the variation in a dependent variable using the variation in independent.
1 BINARY CHOICE MODELS: LOGIT ANALYSIS The linear probability model may make the nonsense predictions that an event will occur with probability greater.
Logistic Regression STA302 F 2014 See last slide for copyright information 1.
Models with Discrete Dependent Variables
Logistic Regression Multivariate Analysis. What is a log and an exponent? Log is the power to which a base of 10 must be raised to produce a given number.
1-1 Regression Models  Population Deterministic Regression Model Y i =  0 +  1 X i u Y i only depends on the value of X i and no other factor can affect.
QUALITATIVE AND LIMITED DEPENDENT VARIABLE MODELS.
Introduction to Logistic Regression. Simple linear regression Table 1 Age and systolic blood pressure (SBP) among 33 adult women.
SIMPLE LINEAR REGRESSION
An Introduction to Logistic Regression
11-1 Empirical Models Many problems in engineering and science involve exploring the relationships between two or more variables. Regression analysis.
BINARY CHOICE MODELS: LOGIT ANALYSIS
Simple Linear Regression Analysis
Generalized Linear Models
Unit 4b: Fitting the Logistic Model to Data © Andrew Ho, Harvard Graduate School of EducationUnit 4b – Slide 1
SIMPLE LINEAR REGRESSION
Introduction to Linear Regression and Correlation Analysis
ALISON BOWLING THE GENERAL LINEAR MODEL. ALTERNATIVE EXPRESSION OF THE MODEL.
Excepted from HSRP 734: Advanced Statistical Methods June 5, 2008.
Logistic Regression STA2101/442 F 2014 See last slide for copyright information.
University of Warwick, Department of Sociology, 2014/15 SO 201: SSAASS (Surveys and Statistics) (Richard Lampard) Week 7 Logistic Regression I.
Multiple Regression and Model Building Chapter 15 Copyright © 2014 by The McGraw-Hill Companies, Inc. All rights reserved.McGraw-Hill/Irwin.
When and why to use Logistic Regression?  The response variable has to be binary or ordinal.  Predictors can be continuous, discrete, or combinations.
Linear vs. Logistic Regression Log has a slightly better ability to represent the data Dichotomous Prefer Don’t Prefer Linear vs. Logistic Regression.
Copyright © 2014 McGraw-Hill Education. All rights reserved. No reproduction or distribution without the prior written consent of McGraw-Hill Education.
Week 5: Logistic regression analysis Overview Questions from last week What is logistic regression analysis? The mathematical model Interpreting the β.
Regression. Types of Linear Regression Model Ordinary Least Square Model (OLS) –Minimize the residuals about the regression linear –Most commonly used.
1 11 Simple Linear Regression and Correlation 11-1 Empirical Models 11-2 Simple Linear Regression 11-3 Properties of the Least Squares Estimators 11-4.
Roger B. Hammer Assistant Professor Department of Sociology Oregon State University Conducting Social Research Statistical Principles and An Overview of.
Logistic Regression. Linear Regression Purchases vs. Income.
Warsaw Summer School 2015, OSU Study Abroad Program Advanced Topics: Interaction Logistic Regression.
Multiple Logistic Regression STAT E-150 Statistical Methods.
Multiple Regression  Similar to simple regression, but with more than one independent variable R 2 has same interpretation R 2 has same interpretation.
Copyright © 2011 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Simple Linear Regression Analysis Chapter 13.
Logistic regression. Recall the simple linear regression model: y =  0 +  1 x +  where we are trying to predict a continuous dependent variable y from.
Logistic Regression Saed Sayad 1www.ismartsoft.com.
CSE 5331/7331 F'07© Prentice Hall1 CSE 5331/7331 Fall 2007 Regression Margaret H. Dunham Department of Computer Science and Engineering Southern Methodist.
1 1 Slide The Simple Linear Regression Model n Simple Linear Regression Model y =  0 +  1 x +  n Simple Linear Regression Equation E( y ) =  0 + 
Copyright © 2011 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Multiple Regression Chapter 14.
Logistic Regression and Odds Ratios Psych DeShon.
Nonparametric Statistics
1 BINARY CHOICE MODELS: LOGIT ANALYSIS The linear probability model may make the nonsense predictions that an event will occur with probability greater.
Chapter 14 Introduction to Regression Analysis. Objectives Regression Analysis Uses of Regression Analysis Method of Least Squares Difference between.
Instructor: R. Makoto 1richard makoto UZ Econ313 Lecture notes.
LOGISTIC REGRESSION. Purpose  Logistical regression is regularly used when there are only two categories of the dependent variable and there is a mixture.
Logistic Regression: Regression with a Binary Dependent Variable.
Nonparametric Statistics
Chapter 4: Basic Estimation Techniques
Chapter 4 Basic Estimation Techniques
Statistical Modelling
Simple Linear Regression
Logistic Regression.
Basic Estimation Techniques
THE LOGIT AND PROBIT MODELS
Drop-in Sessions! When: Hillary Term - Week 1 Where: Q-Step Lab (TBC) Sign up with Alice Evans.
Generalized Linear Models
Introduction to logistic regression a.k.a. Varbrul
Basic Estimation Techniques
THE LOGIT AND PROBIT MODELS
Nonparametric Statistics
Chapter 6 Logistic Regression: Regression with a Binary Dependent Variable Copyright © 2010 Pearson Education, Inc., publishing as Prentice-Hall.
Logistic Regression.
Presentation transcript:

Roger B. Hammer Assistant Professor Department of Sociology Oregon State University Conducting Social Research Logistic Regression Categorical Data Analysis

Random Variable A variable whose numerical value is determined by chance, the outcome of a random phenomenon.A variable whose numerical value is determined by chance, the outcome of a random phenomenon. Discrete has a countable number of values.Discrete has a countable number of values. Continuous can take on any value in an interval.Continuous can take on any value in an interval. Conducting Social Research Is “statistical anxiety” continuous or discrete?

Continuous or Discrete Actual measurement is always discrete.Actual measurement is always discrete. Distinction between variables that can take on lots versus many values.Distinction between variables that can take on lots versus many values. Not always clear or unambiguous.Not always clear or unambiguous. Conducting Social Research Is income continuous or discrete?

Binary. Ordinal/Ranked. Nominal/Unranked. Count. Censored. Conducting Social Research Categorical/Limited Variables

Normal distribution of dependent variable and independent variables will increase likelihood that the residuals are normally distributed, however this is not necessary for residuals to be normally distributed. Conducting Social Research Normally Distributed Errors

Conducting Social Research Binomial Distribution Fireplace

Ordinary Least Squares and the Linear Probability Model Conducting Social Research

Ordinary Least Squares Linear Probability Model

If we interpret the predicted value of Y as a probability. Conducting Social Research Ordinary Least Squares Model fire = total_living_area By definition, the observed values are 1 or 0, but the probability is continuous, so explained variation will be lower than might be expected.

If we interpret the predicted value of Y as a probability. Conducting Social Research Ordinary Least Squares Model fire = total_living_area By definition, probabilities cannot be greater than 1 (or less than 0), which is inevitable with any linear model.

Only two possible values of resdiuals for any value of X. Conducting Social Research Ordinary Least Squares Model fire = total_living_area If Y is a function of X and the errors depend on the mean of Y, then the assumption of constant variance is invalid.

A flexible, general-purpose modeling strategy with straight-forward interpretation for dichotomous Y variables. Similar to the Linear Regression Model but with different underlying mathematical basis. Estimated using maximum likelihood techniques, an iterative technique. Conducting Social Research Logistic Regression

P(Y = 1) denotes the probability that Y equals 1. Conducting Social Research Probabilities and Odds Finding a function that approaches but never exceeds the 0 and 1 probability limits. Odds range from 0 when P(Y = 1) to ∞ when P(Y = 0). Thus, we have a floor of 0, but not yet a ceiling of 1.

P(Y = 1) = 0.6 Conducting Social Research Probabilities and Odds P(Y = 1) = 0.4

Conducting Social Research Odds and the Log (Odds) Logit Logits range from - ∞ when P(Y = 0) to ∞ when P(Y = 1). If the logit is a linear function of X variables then P is a nonlinear sigmoid (s-shaped) curve that ranges from 0 to 1.

Binary Logistic Regression Model Conducting Social Research

Binary Logistic Regression Model Conducting Social Research If the linear part of the equation is as large as possible, that is infinity, then:

Binary Logistic Regression Model Conducting Social Research If the linear part of the equation is as small as possible, that is negative infinity then:

Conducting Social Research Logit and Logistic Regression

Conducting Social Research Logistic Regression The slope of the logit changes as the probability ranges from 0 to 1.

Conducting Social Research Interpreting Logistic Regression Coefficients The slope of the logit changes as the probability ranges from 0 to 1.

Conducting Social Research Logistic Regression Model fire = total_living_area

Easily accessible regression coefficient. Not easily interpretable. Conducting Social Research Interpreting Results Log Odds

Use the mean of all independent variables in the equation and then increase one variable one unit. Meaning of an average dummy variable is ambiguous. Conducting Social Research Interpreting Results Average Observation Change

At a probability of.5, multiplying a logit coefficient by.25, approximates a linear probability model coefficient. Conducting Social Research Interpreting Results Rough Estimate

More intuitive interpretation. Takes advantage of the tractable form of the logit. Assesses the change in the odds based on a change in X. Conducting Social Research Interpreting Results Odds Ratios

Conducting Social Research Interpreting Results Odds Ratios Taking the exponential: Compare the odd before and after changing the quantity of an X variable:

For a unit change in x the odds are expected to change by a factor of the odds ratio. If the odds ratio is greater than 1, the odds are larger. If the odds ratio is less than 1, the odds are smaller. The effect is not dependent on the level of any variable. Conducting Social Research Interpreting Results