Linear Discriminant Analysis and Logistic Regression.


Background
Linear Discriminant Analysis predicts a categorical variable based on one or more metric independent variables.

Example Data
Consider purchase data compared to a person's age, recorded in two columns: Age and Purchase. A 0 value for Purchase represents someone who didn't buy, while a 1 represents someone who did.

Graph Interpretation
[Figure: Purchase plotted against Age. Points at Purchase = 1 are potential customers who did purchase; points at Purchase = 0 are potential customers who did not purchase.]

Graphical Representation
A discriminant analysis fits a linear regression to this data as though the categorical variable were numerical.

Graphical Representation ctd.
The discriminant analysis then determines a cutoff score. For a single predictor variable, this score is the point where the regression line equals 0.5. Any data points to the left of the cutoff are predicted to be 0, while those to the right are predicted to be 1. For this data, any potential customer below the age of 41 is predicted not to buy, while anyone older is predicted to buy.
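
The cutoff rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ages and purchase values below are hypothetical (the slide's own data is not reproduced in the transcript), and the helper names fit_line and predict are made up for this example.

```python
# Hypothetical data: NOT the values from the original slides.
ages = [22, 25, 30, 33, 38, 44, 47, 52, 58, 63]
purchase = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]


def fit_line(x, y):
    """Ordinary least squares for a single predictor: returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Treat the 0/1 Purchase variable as if it were numerical and fit a line.
intercept, slope = fit_line(ages, purchase)

# The cutoff is the age at which the fitted line equals 0.5.
cutoff_age = (0.5 - intercept) / slope


def predict(age):
    """Predict 1 (buys) to the right of the cutoff, 0 to the left (assumes slope > 0)."""
    return 1 if age > cutoff_age else 0


predictions = [predict(a) for a in ages]
print("cutoff age:", round(cutoff_age, 1))
print("predictions:", predictions)
```

With equal numbers of buyers and non-buyers, as in this made-up sample, the fitted line passes through 0.5 at the mean age, so the cutoff lands at the sample mean.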

A 100% Accurate Discriminant Analysis
Even a discriminant analysis that provides perfect separation between purchasers and non-purchasers does not have a perfect R².

Classification Accuracy
Standard error measures the distance of the predicted value (the regression line) from the observed values. Even data points that are correctly predicted contribute to the error calculation: their distance from the regression line lowers the total R², even though the classification is correct. Classification accuracy is a better measure.
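
A small numeric check makes the point concrete. The sketch below uses hypothetical, perfectly separated data (not the slide's data) and computes both R² and classification accuracy for the same fitted line; the variable names are chosen for this example only.

```python
# Hypothetical, perfectly separated data: NOT the values from the original slides.
ages = [22, 25, 30, 33, 38, 44, 47, 52, 58, 63]
purchase = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

n = len(ages)
mean_age = sum(ages) / n
mean_purchase = sum(purchase) / n

# Least-squares line through the 0/1 outcome.
slope = (
    sum((a - mean_age) * (p - mean_purchase) for a, p in zip(ages, purchase))
    / sum((a - mean_age) ** 2 for a in ages)
)
intercept = mean_purchase - slope * mean_age
fitted = [intercept + slope * a for a in ages]

# Classify with the 0.5 cutoff and measure accuracy.
predicted = [1 if f > 0.5 else 0 for f in fitted]
accuracy = sum(1 for p, y in zip(predicted, purchase) if p == y) / n

# R^2 = 1 - (residual sum of squares / total sum of squares).
ss_res = sum((y - f) ** 2 for y, f in zip(purchase, fitted))
ss_tot = sum((y - mean_purchase) ** 2 for y in purchase)
r_squared = 1 - ss_res / ss_tot

print("accuracy:", accuracy)
print("R^2:", round(r_squared, 3))
```

Every point is classified correctly, yet each one still sits some distance from the line, so the residual sum of squares is nonzero and R² stays below 1.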

Discriminant Analysis in StatTools

StatTools – Interpreting Output
[Screenshot: classification matrix with the actual values, the predicted values, and the correct predictions labeled.]

StatTools – Interpreting Output ctd.
[Screenshot: classification matrix with the actual values, the predicted values, the false negatives, the false positives, and the overall accuracy labeled.]
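
The quantities labeled in that output can be computed by hand from paired actual and predicted values. The sketch below is not StatTools output; the two lists are hypothetical and only show how the counts and the overall accuracy relate.

```python
# Hypothetical actual and predicted classes (1 = purchased, 0 = did not purchase).
actual = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]
predicted = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

pairs = list(zip(actual, predicted))

true_positives = sum(1 for a, p in pairs if a == 1 and p == 1)
true_negatives = sum(1 for a, p in pairs if a == 0 and p == 0)
false_positives = sum(1 for a, p in pairs if a == 0 and p == 1)  # predicted to buy, did not
false_negatives = sum(1 for a, p in pairs if a == 1 and p == 0)  # predicted not to buy, did

# Overall accuracy is the share of correct predictions.
overall_accuracy = (true_positives + true_negatives) / len(pairs)

print("correct predictions:", true_positives + true_negatives)
print("false positives:", false_positives)
print("false negatives:", false_negatives)
print("overall accuracy:", overall_accuracy)
```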

Logistic Regression
A logistic regression fits a sigmoid, or S-shaped, curve instead of a straight line. On some datasets, this will provide greater classification accuracy.
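
As a rough illustration of fitting a sigmoid curve, the sketch below estimates a one-predictor logistic regression with plain gradient descent. Everything here is an assumption made for the example: the data is hypothetical, the learning rate and iteration count are arbitrary, and a statistics package such as StatTools would typically use a different fitting procedure.

```python
import math

# Hypothetical, overlapping data: NOT the values from the original slides.
ages = [22, 25, 30, 33, 38, 44, 47, 52, 58, 63]
purchase = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]


def sigmoid(t):
    """The S-shaped logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-t))


# Standardize age so that gradient descent behaves well.
n = len(ages)
mean_age = sum(ages) / n
std_age = math.sqrt(sum((a - mean_age) ** 2 for a in ages) / n)
z = [(a - mean_age) / std_age for a in ages]

# Minimize the average negative log-likelihood by gradient descent.
b0, b1 = 0.0, 0.0
learning_rate = 0.1  # arbitrary choice for this sketch
for _ in range(5000):
    errors = [sigmoid(b0 + b1 * zi) - yi for zi, yi in zip(z, purchase)]
    b0 -= learning_rate * sum(errors) / n
    b1 -= learning_rate * sum(e * zi for e, zi in zip(errors, z)) / n


def purchase_probability(age):
    """Estimated probability of purchase at a given age."""
    return sigmoid(b0 + b1 * (age - mean_age) / std_age)


probabilities = [purchase_probability(a) for a in ages]
predictions = [1 if p >= 0.5 else 0 for p in probabilities]
print([round(p, 2) for p in probabilities])
print(predictions)
```

The fitted probabilities rise smoothly with age and never leave the interval between 0 and 1, unlike the values of a straight regression line.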

Logistic Regression in StatTools

StatTools – Interpreting Output
[Screenshot: logistic regression output, with annotations noting that Age is highly statistically significant and marking the overall accuracy.]

Comparison
Discriminant Analysis: can be used for dependent variables with more than two possible values.
Logistic Regression: less reliant on basic assumptions about the data, such as normality and constant variance; more accurate on borderline points for some datasets.