Spreadsheet Modeling & Decision Analysis


Spreadsheet Modeling & Decision Analysis A Practical Introduction to Management Science 5th edition Cliff T. Ragsdale

Chapter 10: Discriminant Analysis

Introduction to Discriminant Analysis (DA)
DA is a statistical technique that uses information from a set of independent variables to predict the value of a discrete or categorical dependent variable.
The goal is to develop a rule for predicting to which of two or more predefined groups a new observation belongs, based on the values of the independent variables.
Examples:
Credit scoring: will a new loan applicant (1) default or (2) repay?
Insurance rating: will a new client be a (1) high, (2) medium, or (3) low risk?

Types of DA Problems
2-group problems: regression can be used.
k-group problems (where k >= 2): regression cannot be used if k > 2.

Example of a 2-Group DA Problem: ACME Manufacturing
All employees of ACME Manufacturing are given a pre-employment test measuring mechanical and verbal aptitude.
Each current employee has also been classified into one of two groups: satisfactory or unsatisfactory.
We want to determine whether the two groups of employees differ with respect to their test scores. If so, we want to develop a rule for predicting whether new applicants will be satisfactory or unsatisfactory.

The Data
See file Fig10-1.xls

[Figure: Graph of Data for Current Employees. Scatter plot of Verbal Aptitude (vertical axis) against Mechanical Aptitude (horizontal axis) for satisfactory and unsatisfactory employees, with the group 1 centroid (C1) and group 2 centroid (C2) marked.]

Calculating Discriminant Scores
$$\hat{Y}_i = b_0 + b_1 X_{1i} + b_2 X_{2i}$$
where X1 = mechanical aptitude test score and X2 = verbal aptitude test score.
For our example, the coefficients b0, b1, and b2 are estimated by regression.

A Classification Rule
If an observation's discriminant score is less than or equal to some cutoff value, then assign it to group 1; otherwise assign it to group 2.
What should the cutoff value be?

[Figure: Possible Distributions of Discriminant Scores. Distributions of scores for Group 1 and Group 2, with the cutoff value marked between them.]

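Cutoff Value
For data that are multivariate-normal with equal covariances, the optimal cutoff value is the midpoint between the two groups' average discriminant scores:
$$\text{Cutoff value} = \frac{\bar{\hat{Y}}_1 + \bar{\hat{Y}}_2}{2}$$
For our example, the cutoff value is computed from the average discriminant scores of the two employee groups.
Even when the data are not multivariate-normal, this cutoff value tends to give good results.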
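
A minimal sketch of the classification rule and cutoff, assuming the cutoff is the midpoint between the two groups' mean discriminant scores and that group 1 has the lower mean score. The scores are made up for illustration and are not taken from Fig10-5.xls; the function names are hypothetical.

```python
from statistics import mean


def midpoint_cutoff(scores_g1, scores_g2):
    """Midpoint between the two groups' mean discriminant scores."""
    return (mean(scores_g1) + mean(scores_g2)) / 2


def classify(score, cutoff):
    """Assign group 1 if the score is at or below the cutoff, else group 2."""
    return 1 if score <= cutoff else 2


# Hypothetical discriminant scores for current employees (illustration only).
g1 = [1.1, 1.3, 0.9, 1.2]
g2 = [1.8, 2.0, 1.7, 1.9]

cutoff = midpoint_cutoff(g1, g2)
print(round(cutoff, 4))       # midpoint of 1.125 and 1.85
print(classify(1.4, cutoff))  # below the cutoff: group 1
print(classify(1.6, cutoff))  # above the cutoff: group 2
```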

Calculating Discriminant Scores
See file Fig10-5.xls

A Refined Cutoff Value
Costs of misclassification may differ.
Probabilities of group membership may differ.
The following refined cutoff value accounts for these considerations:
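
The refined formula itself is not reproduced in this transcript. As a hedged sketch, one standard form (an assumption here, derived for normally distributed scores with a common variance and a lower mean score in group 1) shifts the midpoint by (pooled variance / difference in means) times ln(p1·C(2|1) / (p2·C(1|2))), where p1 and p2 are prior probabilities and C(2|1) is the cost of assigning a group 1 observation to group 2. All names and data below are hypothetical.

```python
import math
from statistics import mean, variance


def pooled_variance(a, b):
    """Pooled sample variance of two groups of scores."""
    na, nb = len(a), len(b)
    return ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)


def refined_cutoff(scores_g1, scores_g2, p1, p2, cost_1_as_2, cost_2_as_1):
    """Midpoint cutoff shifted for unequal priors and misclassification costs.

    Assumes mean(scores_g1) < mean(scores_g2) and the rule
    'assign to group 1 if score <= cutoff'.
    """
    m1, m2 = mean(scores_g1), mean(scores_g2)
    s2 = pooled_variance(scores_g1, scores_g2)
    shift = s2 / (m2 - m1) * math.log((p1 * cost_1_as_2) / (p2 * cost_2_as_1))
    return (m1 + m2) / 2 + shift


# Hypothetical discriminant scores (illustration only).
g1 = [1.1, 1.3, 0.9, 1.2]
g2 = [1.8, 2.0, 1.7, 1.9]

# Equal priors and costs: the log term is zero, so this is the plain midpoint.
base = refined_cutoff(g1, g2, 0.5, 0.5, 1.0, 1.0)
# Misclassifying a group 1 observation is five times as costly: cutoff moves up,
# so more observations are assigned to group 1.
shifted = refined_cutoff(g1, g2, 0.5, 0.5, 5.0, 1.0)
print(base, shifted)
```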

Classification Accuracy

                  Predicted Group
Actual Group      1      2      Total
1                 9      2      11
2                 2      7      9
Total             11     9      20

Accuracy rate = 16/20 = 80%
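
The accuracy rate is the share of observations that fall on the diagonal of the table. A short sketch using the counts from the table above (the function name is hypothetical):

```python
# Rows are actual groups, columns are predicted groups (counts from the table).
confusion = [
    [9, 2],  # actual group 1
    [2, 7],  # actual group 2
]


def accuracy(matrix):
    """Correctly classified observations (diagonal) divided by all observations."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


print(accuracy(confusion))  # 16 correct out of 20
```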

Classifying New Employees
See file Fig10-5.xls

The k-Group DA Problem
Suppose we have 3 groups (A=1, B=2 & C=3) and one independent variable. We could then fit the following regression function:
$$\hat{Y}_i = b_0 + b_1 X_i$$
The classification rule is then a table that maps ranges of the discriminant score to groups A, B, and C.

[Figure: Graph Showing Linear Relationship. Y plotted against X for Group A, Group B, and Group C.]

The k-Group DA Problem
Now suppose we re-assign the group numbers as follows: A=2, B=1 & C=3.
The relation between X & Y is no longer linear.
There is no general way to ensure group numbers are assigned in a way that will always produce a linear relationship.
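
A small sketch of why the coding matters: the same X values are paired with the two group codings, and the linear correlation between X and Y is compared. The X values and group sizes are invented for illustration; only the two codings come from the slides.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: group A has the lowest X values, C the highest.
x = list(range(1, 14))
groups = ["A"] * 4 + ["B"] * 5 + ["C"] * 4

coding_1 = {"A": 1, "B": 2, "C": 3}  # codes follow the ordering of X
coding_2 = {"A": 2, "B": 1, "C": 3}  # codes re-assigned as in the slide

r1 = pearson_r(x, [coding_1[g] for g in groups])
r2 = pearson_r(x, [coding_2[g] for g in groups])
print(round(r1, 2))  # about 0.94: close to linear
print(round(r2, 2))  # about 0.45: the linear fit degrades
```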

[Figure: Graph Showing Nonlinear Relationship. Y plotted against X for Group A, Group B, and Group C.]

Example of a 3-Group DA Problem: ACME Manufacturing
All employees of ACME Manufacturing are given a pre-employment test measuring mechanical and verbal aptitude.
Each current employee has also been classified into one of three groups: superior, average, or inferior.
We want to determine whether the three groups of employees differ with respect to their test scores. If so, we want to develop a rule for predicting whether new applicants will be superior, average, or inferior.

The Data
See file Fig10-11.xls

[Figure: Graph of Data for Current Employees. Scatter plot of Verbal Aptitude (vertical axis) against Mechanical Aptitude (horizontal axis) for superior, average, and inferior employees, with the group 1, group 2, and group 3 centroids (C1, C2, C3) marked.]

The Classification Rule
Compute the distance from the point in question to the centroid of each group.
Assign it to the closest group.

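Distance Measures
Euclidean distance from observation i to the centroid of group j:
$$D_{ij} = \sqrt{\sum_k \left(X_{ik} - \bar{X}_{jk}\right)^2}$$
This does not account for possible differences in variances.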
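
A minimal sketch of the nearest-centroid rule with Euclidean distance. The test scores are made up for illustration and are not the contents of Fig10-11.xls; the function names are hypothetical.

```python
import math


def centroid(points):
    """Mean of each coordinate across a group's observations."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(len(points[0])))


def euclidean(p, q):
    """Straight-line distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def nearest_group(point, centroids):
    """Label of the group whose centroid is closest to the point."""
    return min(centroids, key=lambda g: euclidean(point, centroids[g]))


# Hypothetical (mechanical, verbal) scores for three groups (illustration only).
groups = {
    1: [(42, 40), (44, 42), (46, 38)],
    2: [(36, 35), (38, 37), (40, 33)],
    3: [(30, 30), (32, 28), (28, 32)],
}
centroids = {g: centroid(pts) for g, pts in groups.items()}

print(nearest_group((43, 39), centroids))  # closest to the group 1 centroid
print(nearest_group((31, 29), centroids))  # closest to the group 3 centroid
```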

[Figure: 99% Contours of Two Groups. Axes X1 and X2, with centroids C1 and C2 and a point P1 marked.]

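Distance Measures
Variance-adjusted distance, where $s_{jk}^2$ is the variance of variable k in group j:
$$D_{ij} = \sqrt{\sum_k \frac{\left(X_{ik} - \bar{X}_{jk}\right)^2}{s_{jk}^2}}$$
This can be adjusted further to account for differences in covariances.
The DA.xla add-in uses the Mahalanobis distance measure.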
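
A hedged sketch of the squared Mahalanobis distance for two variables, written out by hand for a 2x2 covariance matrix. It illustrates the distance measure only; how DA.xla estimates the covariance matrix (per group or pooled) is not stated in the slides, and the numbers and function name below are hypothetical.

```python
def mahalanobis_sq(point, center, cov):
    """Squared Mahalanobis distance in two dimensions.

    cov is (s11, s12, s22): the variances of the two variables and their
    covariance. With s12 = 0 this reduces to the squared variance-adjusted
    distance.
    """
    s11, s12, s22 = cov
    det = s11 * s22 - s12 ** 2
    dx, dy = point[0] - center[0], point[1] - center[1]
    # d' * inverse(cov) * d, with the 2x2 inverse expanded explicitly.
    return (s22 * dx * dx - 2 * s12 * dx * dy + s11 * dy * dy) / det


center = (0.0, 0.0)

# No covariance: two points at the same Euclidean distance (2 units) differ
# once each direction is scaled by its variance.
uncorrelated = (4.0, 0.0, 1.0)
print(mahalanobis_sq((2, 0), center, uncorrelated))  # along the high-variance axis
print(mahalanobis_sq((0, 2), center, uncorrelated))  # along the low-variance axis

# Positive covariance: a point lying along the direction of correlation is
# closer than one lying against it.
correlated = (4.0, 1.5, 1.0)
print(mahalanobis_sq((2, 1), center, correlated))
print(mahalanobis_sq((2, -1), center, correlated))
```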

Using the DA.XLA Add-In
See file Fig10-11.xls

End of Chapter 10