Slide 1: Performance Evaluation: Estimation of Recognition Rates
J.-S. Roger Jang (張智星)
CSIE Dept., National Taiwan Univ.
http://mirlab.org/jang, jang@mirlab.org
Machine Learning: Performance Evaluation
Slide 2: Outline
- Performance indices of a given classifier/model
  - Accuracy (recognition rate)
  - Computation load
- Methods to estimate the recognition rate
  - Inside test
  - One-sided holdout test
  - Two-sided holdout test
  - M-fold cross validation
  - Leave-one-out cross validation
Slide 3: Synonyms
The following pairs of synonyms will be used interchangeably:
- Classifier, model
- Recognition rate, accuracy
Slide 4: Performance Indices
- Performance indices of a classifier:
  - Recognition rate: requires an objective procedure to derive it
  - Computation load: design-time computation and run-time computation
- Our focus: the recognition rate and the procedures to derive it
- The estimated accuracy depends on:
  - The dataset
  - The model (type and complexity)
Slide 5: Methods for Deriving Recognition Rates
- Methods to derive the recognition rate:
  - Inside test (resubstitution recognition rate)
  - One-sided holdout test
  - Two-sided holdout test
  - M-fold cross validation
  - Leave-one-out cross validation
- Data partitioning:
  - Training set only
  - Training and test sets
  - Training, validation, and test sets
Slide 6: Inside Test
- Dataset partitioning: use the whole dataset for both training and evaluation
- Recognition rate:
  - Inside-test recognition rate
  - Also called resubstitution accuracy
Slide 7: Inside Test (2)
- Characteristics:
  - Too optimistic, since the inside-test RR tends to be higher than the true RR
    - For instance, the 1-NN classifier always has an inside-test RR of 100%!
  - Can be used as an upper bound on the true RR
- Potential reasons for a low inside-test RR:
  - Bad features in the dataset
  - A bad method for model construction, such as
    - Bad results from neural network training
    - Bad results from k-means clustering
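The inside test on slides 6 and 7 can be sketched in a few lines. This is a minimal illustration, not from the slides: the 1-NN helper and the tiny dataset are made up, and it shows why resubstitution with 1-NN always reports 100%.

```python
# Inside test (resubstitution): train and evaluate on the same dataset.
# The 1-NN classifier and the toy dataset below are illustrative assumptions.

def one_nn_predict(train, query):
    """Label of the training point closest (squared Euclidean) to the query."""
    nearest = min(train, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], query)))
    return nearest[1]

def inside_test_rr(data):
    """Resubstitution recognition rate: evaluate on the training data itself."""
    correct = sum(one_nn_predict(data, x) == y for x, y in data)
    return correct / len(data)

data = [((0.0, 0.0), "A"), ((0.1, 0.2), "A"), ((1.0, 1.0), "B"), ((0.9, 1.1), "B")]
print(inside_test_rr(data))  # each point's nearest neighbor is itself -> 1.0 (100%)
```

Because every query point is also a training point, 1-NN finds it at distance zero, which is why the resubstitution RR is only useful as an upper bound.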
Slide 8: One-sided Holdout Test
- Dataset partitioning:
  - Training set for model construction
  - Test set for performance evaluation
- Recognition rates:
  - Inside-test RR (on the training set)
  - Outside-test RR (on the test set)
Slide 9: One-sided Holdout Test (2)
- Characteristics:
  - Highly affected by how the data is partitioned
  - Usually adopted when the design-time computation load is high
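A one-sided holdout test can be sketched as below. The split ratio, seed parameter, and toy dataset are illustrative assumptions; varying the seed demonstrates the partition sensitivity mentioned above.

```python
# One-sided holdout: train on one part of the data, evaluate on the held-out rest.
# The 1-NN classifier, split ratio, and toy dataset are illustrative assumptions.
import random

def one_nn_predict(train, query):
    return min(train, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], query)))[1]

def one_sided_holdout_rr(data, test_fraction=0.25, seed=0):
    """Shuffle once, hold out a test set, train on the rest, report outside-test RR."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)  # different seeds give different partitions
    n_test = max(1, int(len(shuffled) * test_fraction))
    test, train = shuffled[:n_test], shuffled[n_test:]
    correct = sum(one_nn_predict(train, x) == y for x, y in test)
    return correct / len(test)

data = [((0.0, 0.0), "A"), ((0.1, 0.2), "A"), ((1.0, 1.0), "B"), ((0.9, 1.1), "B")]
print(one_sided_holdout_rr(data, seed=0))
print(one_sided_holdout_rr(data, seed=1))  # a different partition may give a different RR
```

Only one model is constructed, which is why this test suits classifiers with a high design-time computation load.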
Slide 10: Two-sided Holdout Test
- Dataset partitioning:
  - Training set for model construction
  - Test set for performance evaluation
  - The roles of the two sets are then reversed
Slide 11: Two-sided Holdout Test (2)
- Two-sided holdout test (used in GMDH):
  - Model A is constructed from data set A and evaluated on data set B, giving RR_A
  - Model B is constructed from data set B and evaluated on data set A, giving RR_B
  - Outside-test RR = (RR_A + RR_B) / 2
Slide 12: Two-sided Holdout Test (3)
- Characteristics:
  - Better use of the dataset
  - Still highly affected by the partitioning
  - Suitable for models/classifiers with a high design-time computation load
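The role-reversal scheme from slide 11 can be sketched directly. The 1-NN classifier and the two toy data sets are illustrative assumptions; the averaging step matches the formula Outside-test RR = (RR_A + RR_B) / 2.

```python
# Two-sided holdout: train on A / test on B, swap roles, then average the two RRs.
# The 1-NN classifier and the toy data sets are illustrative assumptions.

def one_nn_predict(train, query):
    return min(train, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], query)))[1]

def rr(train, test):
    """Outside-test recognition rate of a 1-NN model built on `train`."""
    return sum(one_nn_predict(train, x) == y for x, y in test) / len(test)

def two_sided_holdout_rr(set_a, set_b):
    rr_a = rr(set_a, set_b)   # model A: constructed on A, evaluated on B
    rr_b = rr(set_b, set_a)   # model B: constructed on B, evaluated on A
    return (rr_a + rr_b) / 2  # outside-test RR = (RR_A + RR_B) / 2

set_a = [((0.0, 0.0), "A"), ((1.0, 1.0), "B")]
set_b = [((0.1, 0.2), "A"), ((0.9, 1.1), "B")]
print(two_sided_holdout_rr(set_a, set_b))  # 1.0 on this easily separable toy data
```

Every sample is used once for training and once for testing, which is the "better use of the dataset" noted above.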
Slide 13: M-fold Cross Validation
- Data partitioning:
  - Partition the dataset into m folds
  - Use one fold for testing and the other m-1 folds for training
  - Repeat m times, each time with a different fold as the test set
Slide 14: M-fold Cross Validation (2)
[Figure: the dataset is partitioned into m disjoint sets; for each k, model k is constructed from the other m-1 sets and evaluated on set k, and the outside-test RR is the average over the m folds.]
Slide 15: M-fold Cross Validation (3)
- Characteristics:
  - When m = 2: equivalent to the two-sided holdout test
  - When m = n: leave-one-out cross validation
- The choice of m depends on the computation load imposed by the selected model/classifier.
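The m-fold procedure above can be sketched as follows. The striped fold assignment, the 1-NN classifier, and the toy dataset are illustrative assumptions.

```python
# M-fold cross validation: average the outside-test RR over m disjoint folds.
# The 1-NN classifier, fold assignment, and toy dataset are illustrative assumptions.

def one_nn_predict(train, query):
    return min(train, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], query)))[1]

def m_fold_cv_rr(data, m):
    """For each fold k: train on the other m-1 folds, test on fold k; average the RRs."""
    folds = [data[k::m] for k in range(m)]  # m disjoint subsets
    fold_rrs = []
    for k in range(m):
        test = folds[k]
        train = [pair for j in range(m) if j != k for pair in folds[j]]
        correct = sum(one_nn_predict(train, x) == y for x, y in test)
        fold_rrs.append(correct / len(test))
    return sum(fold_rrs) / m

data = [((0.0, 0.0), "A"), ((0.1, 0.2), "A"), ((1.0, 1.0), "B"), ((0.9, 1.1), "B")]
print(m_fold_cv_rr(data, m=2))  # m=2 behaves like the two-sided holdout test
```

Model construction runs m times, which is why m is chosen with the model's design-time computation load in mind.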
Slide 16: Leave-one-out Cross Validation
- Data partitioning: the special case m = n, where each fold S_i = {(x_i, y_i)} contains a single input/output pair
Slide 17: Leave-one-out Cross Validation (2)
[Figure: with n input/output pairs, model k is constructed from the n-1 pairs other than pair k and evaluated on pair k alone, so each per-fold RR is either 0% or 100%; the outside-test RR is the average over all n pairs.]
Slide 18: Leave-one-out Cross Validation (3)
- The general method for LOOCV: perform model construction (as a black box) n times
  - Slow!
- To speed up the computation of LOOCV:
  - Precompute a common part that will be used repeatedly, such as the global mean and covariance for QC
- More info: see the article on cross-validation on Wikipedia
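The general (black-box) LOOCV method described above can be sketched as below; the 1-NN classifier and toy dataset are illustrative assumptions. Note how each held-out pair contributes a per-pair RR of exactly 0% or 100%, as slide 17 indicates.

```python
# Leave-one-out CV: the m = n special case of m-fold cross validation.
# The 1-NN classifier and the toy dataset are illustrative assumptions.

def one_nn_predict(train, query):
    return min(train, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], query)))[1]

def loocv_rr(data):
    """Hold out each pair in turn; model construction runs n times (the slow part)."""
    n = len(data)
    correct = 0
    for i, (x, y) in enumerate(data):
        train = data[:i] + data[i + 1:]           # leave pair i out
        correct += one_nn_predict(train, x) == y  # per-pair RR is 0% or 100%
    return correct / n                            # average over all n pairs

data = [((0.0, 0.0), "A"), ((0.1, 0.2), "A"), ((1.0, 1.0), "B"), ((0.9, 1.1), "B")]
print(loocv_rr(data))
```

For models where training has a shared expensive part (as with the global statistics mentioned above), that part can be computed once outside the loop instead of n times.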
Slide 19: Applications and Misuse of CV
- Applications of CV:
  - Input (feature) selection
  - Determining model complexity
  - Performance comparison among different models
- Misuse of CV:
  - Do not try to boost the validation RR too much, or you run the risk of indirectly training on the left-out data!