Performance Evaluation: Estimation of Recognition Rates
J.-S. Roger Jang (張智星), CSIE Dept., National Taiwan University



Outline
- Performance indices of a given classifier/model
  - Accuracy (recognition rate)
  - Computation load
- Methods to estimate the recognition rate
  - Inside test
  - One-sided holdout test
  - Two-sided holdout test
  - M-fold cross validation
  - Leave-one-out cross validation

Synonyms
The following pairs of terms will be used interchangeably:
- Classifier, model
- Recognition rate, accuracy

Performance Indices
Performance indices of a classifier:
- Recognition rate: requires an objective procedure to derive it
- Computation load: design-time computation and run-time computation
Our focus: the recognition rate and the procedures to derive it.
The estimated accuracy depends on:
- The dataset
- The model (type and complexity)

Methods for Deriving Recognition Rates
Methods to derive the recognition rate:
- Inside test (resubstitution recognition rate)
- One-sided holdout test
- Two-sided holdout test
- M-fold cross validation
- Leave-one-out cross validation
Corresponding data partitioning:
- Training set only
- Training and test sets
- Training, validation, and test sets

Inside Test
Dataset partitioning:
- Use the whole dataset for both training and evaluation
Recognition rate:
- Inside-test recognition rate, also called resubstitution accuracy

Inside Test (2)
Characteristics:
- Too optimistic, since the inside-test RR tends to be higher than the true RR; for instance, 1-NNC always has an inside-test RR of 100%!
- Can be used as an upper bound of the true RR; see the sketch below.
Potential reasons for a low inside-test RR:
- Bad features in the dataset
- Bad method for model construction, such as:
  - Bad results from neural network training
  - Bad results from k-means clustering
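A minimal sketch of the inside test, assuming scikit-learn and the iris dataset as stand-ins (neither is specified in the slides):

```python
# Inside test: train and evaluate on the very same data (resubstitution).
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)   # stand-in dataset

knn = KNeighborsClassifier(n_neighbors=1)   # 1-NNC
knn.fit(X, y)
inside_rr = knn.score(X, y)                 # resubstitution accuracy
# 100% unless identical feature vectors carry different labels,
# because each training sample is its own nearest neighbor.
print(f"Inside-test RR of 1-NNC: {inside_rr:.2%}")
```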

One-sided Holdout Test
Dataset partitioning:
- Training set for model construction
- Test set for performance evaluation
Recognition rates:
- Inside-test RR (on the training set)
- Outside-test RR (on the test set)

One-sided Holdout Test (2)
Characteristics:
- Highly affected by how the data is partitioned
- Usually adopted when the design-time computation load is high; see the sketch below.
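A minimal sketch of the one-sided holdout test, again assuming scikit-learn; the 70/30 split ratio is an illustrative choice, not from the slides:

```python
# One-sided holdout test: train on one part, evaluate on the held-out part.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

knn = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train)
print(f"Inside-test RR:  {knn.score(X_train, y_train):.2%}")   # on the training set
print(f"Outside-test RR: {knn.score(X_test, y_test):.2%}")     # on the test set
```

Changing random_state gives noticeably different outside-test RRs, which is exactly the sensitivity to partitioning noted above.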

Two-sided Holdout Test
Dataset partitioning:
- Training set for model construction
- Test set for performance evaluation
- The two sets then reverse roles

Two-sided Holdout Test (2)
Two-sided holdout test (used in GMDH):
- Construct model A from data set A and evaluate it on data set B to get RR_A
- Construct model B from data set B and evaluate it on data set A to get RR_B
- Outside-test RR = (RR_A + RR_B) / 2

Two-sided Holdout Test (3)
Characteristics:
- Better usage of the dataset, since every sample is used for both training and evaluation
- Still highly affected by the partitioning
- Suitable for models/classifiers with a high design-time computation load
A sketch follows below.
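A minimal sketch of the two-sided holdout test, assuming scikit-learn and an even 50/50 split:

```python
# Two-sided holdout test: split into halves A and B, then train/evaluate both ways.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_A, X_B, y_A, y_B = train_test_split(X, y, test_size=0.5, random_state=0)

rr_A = KNeighborsClassifier(n_neighbors=1).fit(X_A, y_A).score(X_B, y_B)  # model A, tested on B
rr_B = KNeighborsClassifier(n_neighbors=1).fit(X_B, y_B).score(X_A, y_A)  # model B, tested on A

print(f"Outside-test RR = (RR_A + RR_B) / 2 = {(rr_A + rr_B) / 2:.2%}")
```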

M-fold Cross Validation
Data partitioning:
- Partition the dataset into m disjoint folds
- Use one fold for testing and the remaining m-1 folds for training
- Repeat m times, so each fold serves as the test set exactly once

M-fold Cross Validation (2)
[Diagram: the dataset is partitioned into m disjoint sets; for each k, model k is constructed from the other m-1 sets and evaluated on set k (outside test). The outside-test RRs of the m runs are averaged.]

M-fold Cross Validation (3)
Characteristics:
- When m = 2, this reduces to the two-sided holdout test
- When m = n (the dataset size), this becomes leave-one-out cross validation
- The value of m depends on the computation load imposed by the selected model/classifier; a sketch follows below.
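A minimal sketch of m-fold cross validation with m = 5, assuming scikit-learn (the slides do not prescribe a particular m):

```python
# M-fold cross validation: every fold serves as the test set exactly once.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
m = 5
rrs = []
for train_idx, test_idx in KFold(n_splits=m, shuffle=True, random_state=0).split(X):
    knn = KNeighborsClassifier(n_neighbors=1).fit(X[train_idx], y[train_idx])
    rrs.append(knn.score(X[test_idx], y[test_idx]))   # outside-test RR of this fold

print(f"{m}-fold CV RR: {np.mean(rrs):.2%}")
```

For classification, StratifiedKFold is often preferred over plain KFold so that each fold preserves the class proportions.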

Leave-one-out Cross Validation
Data partitioning: the special case of m-fold cross validation where m = n and each fold is a single input/output pair, S_i = (x_i, y_i).

Leave-one-out Cross Validation (2)
[Diagram: from the n input/output pairs, model k is constructed from all pairs except the k-th and evaluated on the left-out pair (outside test); each single-pair evaluation yields either 0% or 100%! The n results are averaged.]

Leave-one-out Cross Validation (3)
General method for LOOCV:
- Perform model construction (as a black box) n times, which is slow!
To speed up LOOCV:
- Precompute a common part that can be reused across the n runs, such as the global mean and covariance for QC
A sketch of the general (black-box) method follows below. More info: see the cross-validation article on Wikipedia.
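A minimal sketch of the general black-box LOOCV loop, assuming scikit-learn; it performs n model constructions, which is exactly the cost the slide warns about:

```python
# Leave-one-out CV: n runs, each leaving out a single input/output pair.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import LeaveOneOut
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
hits = []
for train_idx, test_idx in LeaveOneOut().split(X):
    knn = KNeighborsClassifier(n_neighbors=1).fit(X[train_idx], y[train_idx])
    hits.append(knn.score(X[test_idx], y[test_idx]))   # 0.0 or 1.0 for the single left-out pair

print(f"LOOCV RR: {np.mean(hits):.2%}")
```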

Applications and Misuse of CV
Applications of CV:
- Input (feature) selection
- Model complexity determination (see the sketch below)
- Performance comparison among different models
Misuse of CV:
- Do not try to boost the validation RR too much, or you run the risk of indirectly training on the left-out data!
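A minimal sketch of model complexity determination via CV, assuming scikit-learn; the candidate values of k are illustrative choices, not from the slides:

```python
# Model complexity determination: compare CV recognition rates across candidate k values.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
for k in (1, 3, 5, 7, 9):
    rr = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
    print(f"k = {k}: 5-fold CV RR = {rr:.2%}")
# Caveat from the slide: tuning too hard against the validation RR
# amounts to indirectly training on the left-out data.
```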