Published by Meghan Blair. Modified over 9 years ago.
1
Classification
Cheng Lei
rexlei86@uvic.ca
Department of Electrical and Computer Engineering
University of Victoria
April 24, 2015
2
Outline
❖ Basic concepts
❖ Evaluation
❖ Procedures in SAS
3
Goal & Data
❖ Data
  - Training data: a set of data records, each with a number of attributes and a class
  - Test data: used to test the model learned from the training data
❖ Goal
  - To learn a classification model from the data that can be used to predict the classes of new cases
4
Steps
❖ Determine the training data set
❖ Gather the training data
❖ Determine the input feature representation of the learned function
❖ Determine the learning algorithm
❖ Run the algorithm on the training data
❖ Evaluate the accuracy of the learned model
5
Classification Process
[Diagram: Training Data → Learning Algorithm → Model; Test Data → Model → Accuracy]
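The process on this slide can be sketched end to end in a few lines of Python. This is only an illustration: the nearest-centroid learner, the function names `learn`, `predict`, and `accuracy`, and the toy records are all assumptions chosen for brevity, not something taken from the slides.

```python
from collections import defaultdict

def learn(training_data):
    """Learning algorithm (assumed here: nearest centroid).
    Returns the model: one mean attribute vector per class."""
    sums = {}
    counts = defaultdict(int)
    for attributes, label in training_data:
        if label not in sums:
            sums[label] = [0.0] * len(attributes)
        for i, value in enumerate(attributes):
            sums[label][i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}

def predict(model, attributes):
    """Predict the class whose centroid is closest (squared Euclidean distance)."""
    def squared_distance(centroid):
        return sum((a - c) ** 2 for a, c in zip(attributes, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))

def accuracy(model, test_data):
    """Fraction of test records whose predicted class matches the actual class."""
    correct = sum(1 for attributes, label in test_data
                  if predict(model, attributes) == label)
    return correct / len(test_data)

# Hypothetical toy records: (attributes, class)
training_data = [([1.0, 1.0], "a"), ([1.2, 0.8], "a"),
                 ([4.0, 4.2], "b"), ([3.8, 4.0], "b")]
test_data = [([0.9, 1.1], "a"), ([4.1, 3.9], "b"), ([2.0, 2.0], "b")]

model = learn(training_data)          # Training Data -> Learning Algorithm -> Model
print(accuracy(model, test_data))     # Test Data -> Model -> Accuracy
```

Two of the three toy test records are classified correctly, so the reported accuracy is 2/3. Any other learning algorithm could be swapped in for `learn` without changing the rest of the flow.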
6
Evaluation Methods
❖ Predictive accuracy
❖ Efficiency
  - Time to build the model
  - Time to use the model
❖ Robustness
  - Handling noise and missing values
❖ Scalability
  - Efficiency in disk-resident databases
❖ Interpretability
  - Understandability of the model and the insight it provides
❖ Compactness of the model
  - Number of rules
7
Evaluation methods
❖ Holdout set
  - Divide the data into two parts: training data and test data
❖ N-fold cross-validation
  - Divide the data into N subsets; use each subset in turn as the test data and the rest as training data, so the procedure runs N times
❖ Leave-one-out cross-validation
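The three splitting schemes can be sketched as follows. The function names, the 30% test fraction, and the fixed seed are illustrative assumptions. Leave-one-out is written here as N-fold cross-validation with N equal to the number of records, which is its usual definition.

```python
import random

def holdout_split(records, test_fraction=0.3, seed=0):
    """Shuffle a copy of the records and split it into (training, test)."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def n_fold_splits(records, n):
    """Yield (training, test) pairs; each of the n subsets is the test data once."""
    folds = [records[i::n] for i in range(n)]
    for i in range(n):
        test = folds[i]
        training = [r for j, fold in enumerate(folds) if j != i for r in fold]
        yield training, test

def leave_one_out_splits(records):
    """Leave-one-out: every record is the test data exactly once."""
    return n_fold_splits(records, len(records))

records = list(range(10))             # stand-in for 10 data records
train, test = holdout_split(records)
folds = list(n_fold_splits(records, 5))
loo = list(leave_one_out_splits(records))
print(len(train), len(test), len(folds), len(loo))
```

With cross-validation, the accuracy is typically averaged over the N runs, so every record contributes to both training and testing.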
8
Precision & Recall
❖ TP: True Positive, the number of correct classifications of the positive instances
❖ FN: False Negative, the number of incorrect classifications of the positive instances
❖ FP: False Positive, the number of incorrect classifications of the negative instances
❖ TN: True Negative, the number of correct classifications of the negative instances

                    Classified Positive    Classified Negative
Actual Positive     TP                     FN
Actual Negative     FP                     TN
9
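Precision & Recall
Using the counts from the confusion matrix (TP, FN, FP, TN):
❖ Precision: p = TP / (TP + FP), the fraction of instances classified as positive that are actually positive
❖ Recall: r = TP / (TP + FN), the fraction of actual positive instances that are classified as positive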
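As a small worked example, the sketch below tallies TP, FN, FP, and TN from lists of actual and predicted classes, then computes precision as TP / (TP + FP) and recall as TP / (TP + FN). The function names and the toy labels are hypothetical, and returning 0.0 when a denominator is zero is an assumed convention.

```python
def confusion_counts(actual, predicted, positive):
    """Count TP, FN, FP, TN for the class treated as positive."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1          # positive instance, classified positive
        elif a == positive:
            fn += 1          # positive instance, classified negative
        elif p == positive:
            fp += 1          # negative instance, classified positive
        else:
            tn += 1          # negative instance, classified negative
    return tp, fn, fp, tn

def precision(tp, fp):
    """TP / (TP + FP); 0.0 if nothing was classified positive (assumed convention)."""
    return tp / (tp + fp) if tp + fp else 0.0

def recall(tp, fn):
    """TP / (TP + FN); 0.0 if there are no positive instances (assumed convention)."""
    return tp / (tp + fn) if tp + fn else 0.0

# Hypothetical labels for eight test instances
actual    = ["pos", "pos", "pos", "neg", "neg", "neg", "neg", "neg"]
predicted = ["pos", "pos", "neg", "pos", "pos", "neg", "neg", "neg"]

tp, fn, fp, tn = confusion_counts(actual, predicted, positive="pos")
p = precision(tp, fp)
r = recall(tp, fn)
print(tp, fn, fp, tn)   # 2 1 2 3
print(p, r)
```

Here precision is 2/4 and recall is 2/3: the classifier finds most of the positives but half of what it labels positive is wrong.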
10
F-measure
❖ It is hard to compare two classifiers using two measures (p, r), so combine them into one using their harmonic mean: F = 2pr / (p + r)
❖ For the F value to be large, both p and r must be large
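A short sketch shows why the harmonic mean, F = 2pr / (p + r), rewards balance between precision and recall. The function name `f_measure` and the example values are assumptions for illustration, and returning 0.0 when p + r is zero is an assumed convention.

```python
def f_measure(p, r):
    """Harmonic mean of precision p and recall r; 0.0 if both are zero."""
    return 2 * p * r / (p + r) if p + r else 0.0

balanced = f_measure(0.5, 0.5)     # p and r equal
lopsided = f_measure(0.9, 0.1)     # same arithmetic mean (0.5), very unequal

print(balanced, lopsided)
```

Both pairs have an arithmetic mean of 0.5, yet the balanced pair scores 0.5 while the lopsided pair scores about 0.18, so a classifier cannot reach a high F value by maximising only one of the two measures.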
11
Procedures in SAS
❖ BCHOICE: performs Bayesian analysis for discrete choice models
❖ CATMOD: performs categorical data modeling of data that can be represented by a contingency table
❖ DISCRIM: develops a discriminant criterion to classify each observation into groups
❖ STEPDISC: given a classification variable and several quantitative variables, performs a stepwise discriminant analysis to select a subset of the quantitative variables for use in discriminating among the classes
❖ …….
12
Thank You!!!