DISCRIMINANT ANALYSIS. Discriminant Analysis  Discriminant analysis builds a predictive model for group membership. The model is composed of a discriminant.

Slides:



Advertisements
Similar presentations
Discriminant Analysis and Classification. Discriminant Analysis as a Type of MANOVA  The good news about DA is that it is a lot like MANOVA; in fact.
Advertisements

Multiple Regression and Model Building
Sociology 680 Multivariate Analysis Logistic Regression.
Irwin/McGraw-Hill © Andrew F. Siegel, 1997 and l Chapter 12 l Multiple Regression: Predicting One Factor from Several Others.
Inference for Regression
Correlation and regression
Chapter 17 Overview of Multivariate Analysis Methods
© 2005 The McGraw-Hill Companies, Inc., All Rights Reserved. Chapter 14 Using Multivariate Design and Analysis.
Intro to Statistics for the Behavioral Sciences PSYC 1900
Chapter 11 Multiple Regression.
1 Chapter 16 Linear regression is a procedure that identifies relationship between independent variables and a dependent variable.Linear regression is.
Multiple Regression and Correlation Analysis
Data Analysis Statistics. Inferential statistics.
Analysis of Variance & Multivariate Analysis of Variance
Multiple Regression – Basic Relationships
Multivariate Analysis Techniques
Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. More About Regression Chapter 14.
Simple Linear Regression Analysis
Discriminant Analysis Testing latent variables as predictors of groups.
Discriminant analysis
Lecture 5 Correlation and Regression
Marshall University School of Medicine Department of Biochemistry and Microbiology BMS 617 Lecture 12: Multiple and Logistic Regression Marshall University.
Statistics for the Social Sciences Psychology 340 Fall 2013 Tuesday, November 19 Chi-Squared Test of Independence.
Selecting the Correct Statistical Test
Chapter 13: Inference in Regression
Copyright © 2013, 2009, and 2007, Pearson Education, Inc. Chapter 14 Comparing Groups: Analysis of Variance Methods Section 14.2 Estimating Differences.
CHAPTER 26 Discriminant Analysis From: McCune, B. & J. B. Grace Analysis of Ecological Communities. MjM Software Design, Gleneden Beach, Oregon.
بسم الله الرحمن الرحیم.. Multivariate Analysis of Variance.
Chapter Eighteen Discriminant Analysis Chapter Outline 1) Overview 2) Basic Concept 3) Relation to Regression and ANOVA 4) Discriminant Analysis.
Statistical Decision Making. Almost all problems in statistics can be formulated as a problem of making a decision. That is given some data observed from.
Learning Objectives Copyright © 2002 South-Western/Thomson Learning Multivariate Data Analysis CHAPTER seventeen.
1 1 Slide © 2008 Thomson South-Western. All Rights Reserved Chapter 15 Multiple Regression n Multiple Regression Model n Least Squares Method n Multiple.
Discriminant Analysis
18-1 © 2007 Prentice Hall Chapter Eighteen 18-1 Discriminant and Logit Analysis.
Social Science Research Design and Statistics, 2/e Alfred P. Rovai, Jason D. Baker, and Michael K. Ponton Pearson Chi-Square Contingency Table Analysis.
Multiple Regression and Model Building Chapter 15 Copyright © 2014 by The McGraw-Hill Companies, Inc. All rights reserved.McGraw-Hill/Irwin.
April 4 Logistic Regression –Lee Chapter 9 –Cody and Smith 9:F.
Discriminant Analysis Discriminant analysis is a technique for analyzing data when the criterion or dependent variable is categorical and the predictor.
MGT-491 QUANTITATIVE ANALYSIS AND RESEARCH FOR MANAGEMENT OSMAN BIN SAIF Session 26.
17-1 COMPLETE BUSINESS STATISTICS by AMIR D. ACZEL & JAYAVEL SOUNDERPANDIAN 6 th edition (SIE)
Chapter 13 Multiple Regression
ANOVA: Analysis of Variance.
Copyright ©2011 Brooks/Cole, Cengage Learning Inference about Simple Regression Chapter 14 1.
Copyright © 2010 Pearson Education, Inc Chapter Eighteen Discriminant and Logit Analysis.
Review for Final Examination COMM 550X, May 12, 11 am- 1pm Final Examination.
© 2006 by The McGraw-Hill Companies, Inc. All rights reserved. 1 Chapter 12 Testing for Relationships Tests of linear relationships –Correlation 2 continuous.
Multiple Logistic Regression STAT E-150 Statistical Methods.
CJT 765: Structural Equation Modeling Class 8: Confirmatory Factory Analysis.
Module III Multivariate Analysis Techniques- Framework, Factor Analysis, Cluster Analysis and Conjoint Analysis Research Report.
Copyright © 2011 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Simple Linear Regression Analysis Chapter 13.
Using SPSS Note: The use of another statistical package such as Minitab is similar to using SPSS.
Two-Group Discriminant Function Analysis. Overview You wish to predict group membership. There are only two groups. Your predictor variables are continuous.
 Seeks to determine group membership from predictor variables ◦ Given group membership, how many people can we correctly classify?
Unit 7 Statistics: Multivariate Analysis of Variance (MANOVA) & Discriminant Functional Analysis (DFA) Chat until class starts.
Jump to first page Inferring Sample Findings to the Population and Testing for Differences.
Copyright © 2011 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Multiple Regression Chapter 14.
Nonparametric Statistics
Chapter Seventeen Copyright © 2004 John Wiley & Sons, Inc. Multivariate Data Analysis.
Marshall University School of Medicine Department of Biochemistry and Microbiology BMS 617 Lecture 13: Multiple, Logistic and Proportional Hazards Regression.
LOGISTIC REGRESSION. Purpose  Logistical regression is regularly used when there are only two categories of the dependent variable and there is a mixture.
Chapter 13 LOGISTIC REGRESSION. Set of independent variables Categorical outcome measure, generally dichotomous.
Chapter 12 REGRESSION DIAGNOSTICS AND CANONICAL CORRELATION.
Simple Bivariate Regression
BINARY LOGISTIC REGRESSION
Multiple Regression and Model Building
Discriminant Analysis
Regression Analysis Week 4.
Linear Discriminant Analysis
Aaker, Kumar, Day Seventh Edition Instructor’s Presentation Slides
Presentation transcript:

DISCRIMINANT ANALYSIS

Discriminant Analysis  Discriminant analysis builds a predictive model for group membership. The model is composed of a discriminant function (or, for more than two groups, a set of discriminant functions) based on linear combinations of the predictor variables that provide the best discrimination between the groups. The functions are generated from a sample of cases for which group membership is known; the functions can then be applied to new cases that have measurements for the predictor variables but have unknown group membership. Rahul Chandra

Discriminant Analysis  The grouping variable (dependent) can have more than two values. The codes for the grouping variable must be integers, however, and you need to specify their minimum and maximum values. Cases with values outside of these bounds are excluded from the analysis. Rahul Chandra

DA vs Linear Regression  linear regression is limited to cases where the dependent variable (Y) is an interval variable so that the combination of predictors (X) will, through the regression equation, produce estimated Y values for given values of weighted combinations of X values.  But many interesting variables are categorical, making a profit or not, holding a particular credit card, owning, renting or paying a mortgage for a house, employed/unemployed, satisfied versus dissatisfied employees, which customers are likely to buy a product or not buy, etc. Here DA is used. Rahul Chandra

DA is used when  The dependent variable is categorical with the predictor IV’s at interval level such as age, income, attitudes, perceptions, and years of education, although dummy variables can be used as predictors as in multiple regression. Logistic regression IV’s can be of any level of measurement.  There are more than two DV categories, unlike logistic regression, which is limited to a dichotomous dependent variable. Rahul Chandra

Discriminant analysis linear equation Rahul Chandra

Assumptions of discriminant analysis  The observations are a random sample.  Each predictor variable is normally distributed.  Each of the allocations for the dependent categories in the initial classification are correctly classified.  Within-group variance-covariance matrices should be equal across groups.  The groups or categories should be defined before collecting the data. Rahul Chandra

Assumptions of discriminant analysis  There must be at least two groups or categories, with each case belonging to only one group so that the groups are mutually exclusive and collectively exhaustive. For instance, three groups taking three available levels of amounts of housing loan etc.  Group sizes of the dependent should not be grossly different and should be at least five times the number of independent variables. Rahul Chandra

Statistical analysis in DA  The aim is to combine (weight) the variable scores in some way so that a single new composite variable, the discriminant score, is produced.  It is hoped that each group will have a normal distribution of discriminant scores.  The degree of overlap between the discriminant score distributions can then be used as a measure of the success of the technique. Rahul Chandra

Discriminant Distribution Rahul Chandra

Discriminant Distribution Rahul Chandra

Discriminant Distribution Rahul Chandra

Discriminant Analysis on SPSS  Given is an example of discriminant analysis for classifying employees of a company in two groups of smokers and Non-smokers based on certain independent variables.  Dependent variable is ‘smoke’ with two categories of smokers and Non-smokers.  The other variables to be used are age, days absent sick from work last year, self-concept score, anxiety score and attitudes to anti-smoking at work score.  The aim of the analysis is to determine whether these variables will discriminate between those who smoke and those who do not. Rahul Chandra

SPSS OUTPUTS Rahul Chandra

Group Statistics Rahul Chandra

Test of Equality of Group Means Strong statistical evidence of significant differences between means of smoke and no smoke groups for all IV’s, with self-concept and anxiety producing very high value F’s indicating they are strongest discriminator. Rahul Chandra

Within Group Inter correlation Matrices Good discriminator should have low inter correlations. The Pooled Within-Group Matrices also supports use of these IV’s as inter- correlations are low. Rahul Chandra

Log determinants and Box’s M tables  In DA the basic assumption is that the variance - Co- variance matrices are equivalent. Box’s M tests the null hypothesis that the covariance matrices do not differ between groups formed by the dependent. This test should not to be significant so that the null hypothesis that the groups do not differ can be retained. Rahul Chandra

Log determinants and Box’s M tables  For this assumption to hold, the log determinants should be equal.  When tested by Box’s M, we are looking for a non- significant M to show similarity and lack of significant differences. Rahul Chandra

Log Determinants In this case the log determinants appear similar. Where three or more groups exist, and M is significant, groups with very small log determinants should be deleted from the analysis. Rahul Chandra

BOX’s M Test Box’s M is with F = which is significant at p <.000. However, since the sample size is large, a significant result is not regarded as too important. Rahul Chandra

Eigen Values Function - This indicates the first or second canonical linear discriminant function. The number of functions is equal to the number of discriminating variables, if there are more groups than variables; or it is 1 less than the number of levels in the group variable. In this example dependent variable has only two levels and number of predictors are five, therefore only 1 (2-1) function is created. Each function acts as projections of the data onto a dimension that best separates or discriminates between the groups. Rahul Chandra

Eigenvalue  Eigenvalue are related to the canonical correlations and describe how much discriminating ability a function possesses. The magnitudes of the eigenvalues are indicative of the functions' discriminating abilities.  Canonical correlations is the correlation of our predictor variables and the discriminant functions. Rahul Chandra

Wilk’s Lamda Wilks’ lambda indicates the significance of the discriminant function. This table indicates a highly significant function (p <.000) and provides the proportion of total variability not explained. So we have 35.6% unexplained Rahul Chandra

More on Wilk’s Lamda  Wilk’s Lamda checks the Null hypothesis that the function, and all functions that follow, have no discriminating ability. This hypothesis is tested using this Chi-square statistic.  This has to be significant otherwise discriminant function is questionable as a discriminator. Rahul Chandra

Discriminant Function Coefficients These un-standardized coefficients (b) are used to create the discriminant function (equation). It operates just like a regression equation. Rahul Chandra

Discriminant Function Rahul Chandra

Group Centroids A further way of interpreting discriminant analysis results is to describe each group in terms of its profile, using the group means of the predictor variables. These group means are called centroids. Rahul Chandra

Classification Table Rahul Chandra