CSCI 347 / CS 4206: Data Mining Module 06: Evaluation Topic 01: Training, Testing, and Tuning Datasets.

Evaluation: The Key to Success
● How predictive is the model we learned?
● Error on the training data is not a good indicator of performance on future data
  ○ Otherwise 1-NN would be the optimum classifier!
● Simple solution that can be used if lots of (labeled) data is available:
  ○ Split data into training and test set
● However: (labeled) data is usually limited
  ○ More sophisticated techniques need to be used
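To make the distinction concrete, here is a minimal sketch (my own, not from the slides) using scikit-learn and the Iris data as a stand-in for a labeled dataset: a 1-NN classifier scores essentially perfectly on the data it was trained on, while a held-out test set gives a more realistic picture of future performance.

```python
# Hypothetical illustration (not from the slides): scikit-learn and the Iris
# data are used here only as a convenient stand-in for "lots of labeled data".
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Hold out an independent test set; the classifier never sees it during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

clf = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train)

# 1-NN effectively memorizes the training data, so its training accuracy is
# (near) perfect...
print("training accuracy:", clf.score(X_train, y_train))
# ...while the held-out test set gives a more honest estimate of future performance.
print("test accuracy:    ", clf.score(X_test, y_test))
```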

Issues in Evaluation
● Statistical reliability of estimated differences in performance (→ significance tests)
● Choice of performance measure:
  ○ Number of correct classifications
  ○ Accuracy of probability estimates
  ○ Error in numeric predictions
● Costs assigned to different types of errors
  ○ Many practical applications involve costs
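As a rough illustration of the second point, the snippet below (an assumption on my part, using scikit-learn's metrics and made-up predictions) computes one measure of each kind: classification accuracy, the quality of probability estimates via log loss, and mean absolute error for numeric predictions.

```python
# Sketch with illustrative values, not from the slides: three different
# performance measures computed with scikit-learn.
from sklearn.metrics import accuracy_score, log_loss, mean_absolute_error

y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 0]             # hard class predictions
y_prob = [0.9, 0.2, 0.4, 0.8, 0.1]   # estimated probabilities of class 1
y_num_true = [3.1, 2.4, 5.0]
y_num_pred = [2.9, 2.8, 4.6]         # numeric predictions

print("number correct / accuracy:", accuracy_score(y_true, y_pred))
print("quality of probability estimates (log loss):", log_loss(y_true, y_prob))
print("error in numeric predictions (MAE):", mean_absolute_error(y_num_true, y_num_pred))
```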

Training and Testing I
● Natural performance measure for classification problems: error rate
  ○ Success: instance’s class is predicted correctly
  ○ Error: instance’s class is predicted incorrectly
  ○ Error rate: proportion of errors made over the whole set of instances
● Resubstitution error: error rate obtained from training data
● Resubstitution error is (hopelessly) optimistic!
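A minimal worked example of the error rate, with illustrative values rather than anything from the slides: two wrong predictions out of eight instances give an error rate of 0.25.

```python
# Minimal sketch (illustrative values): the error rate is simply the
# proportion of instances whose class is predicted incorrectly.
actual    = ["yes", "no", "yes", "yes", "no",  "no", "yes", "no"]
predicted = ["yes", "no", "no",  "yes", "yes", "no", "yes", "no"]

errors = sum(a != p for a, p in zip(actual, predicted))
error_rate = errors / len(actual)      # 2 errors out of 8 instances = 0.25
success_rate = 1 - error_rate

print(f"error rate: {error_rate:.2f}, success rate: {success_rate:.2f}")
# If 'predicted' came from running the classifier back over its own training
# data, this figure would be the (optimistic) resubstitution error.
```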

Training and Testing II
● Test set: independent instances that have played no part in formation of classifier
● Assumption: both training data and test data are representative samples of the underlying problem
● Test and training data may differ in nature
  ○ Example: classifiers built using customer data from two different towns A and B
  ○ To estimate performance of classifier from town A in a completely new town, test it on data from B
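The town A / town B example might look like the following sketch; the customer table, its columns, and the decision tree model are all hypothetical stand-ins used only to show the pattern of fitting on one source of data and testing on another.

```python
# Hypothetical sketch of the town A / town B scenario: 'town' is an assumed
# column in a made-up customer table; the model is fit only on town A and
# evaluated on town B to estimate performance in a genuinely new town.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

customers = pd.DataFrame({
    "town":   ["A", "A", "A", "A", "B", "B", "B"],
    "age":    [23, 45, 31, 52, 36, 29, 60],
    "income": [28, 61, 40, 75, 48, 33, 80],
    "buys":   [0, 1, 0, 1, 1, 0, 1],
})

train = customers[customers.town == "A"]
test = customers[customers.town == "B"]   # played no part in building the model

clf = DecisionTreeClassifier(random_state=0).fit(train[["age", "income"]], train["buys"])
print("estimated accuracy in a new town:", clf.score(test[["age", "income"]], test["buys"]))
```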

Note on Parameter Tuning
● It is important that the test data is not used in any way to create the classifier
● Some learning schemes operate in two stages:
  ○ Stage 1: Build the basic structure
  ○ Stage 2: Optimize parameter settings
● The test data can’t be used for parameter tuning!
● Proper procedure uses three sets: training data, validation data, and test data
● Validation data is used to optimize parameters
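A sketch of the three-set procedure, assuming scikit-learn and a k-nearest-neighbour model whose parameter k is being tuned; the specific splits and candidate values of k are illustrative choices, not part of the slides.

```python
# Three-set procedure (sketch): training data builds the model, validation
# data picks the parameter, and the test data is used exactly once at the end.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# First carve off the test set, then split the remainder into training and validation.
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=0)

# Stage 2: choose k using ONLY the validation data; the test set is never touched here.
best_k, best_score = None, -1.0
for k in [1, 3, 5, 7, 9]:
    score = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train).score(X_val, y_val)
    if score > best_score:
        best_k, best_score = k, score

# Final estimate: evaluate the tuned model once on the untouched test set.
final = KNeighborsClassifier(n_neighbors=best_k).fit(X_train, y_train)
print(f"chosen k = {best_k}, test accuracy = {final.score(X_test, y_test):.2f}")
```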

Making the Most of the Data
● Once evaluation is complete, all the data can be used to build the final classifier
● Generally, the larger the training data the better the classifier (but returns diminish)
● The larger the test data the more accurate the error estimate
● Holdout procedure: method of splitting original data into training and test set
● Dilemma: ideally both training set and test set should be large!
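One way this plays out in practice (again a hedged sketch with assumed tools, not the course's prescribed code): use a holdout split only to obtain the error estimate, then rebuild the final classifier on all of the data before deployment.

```python
# Sketch: estimate the error with a holdout split first, then, once evaluation
# is complete, rebuild the final classifier on ALL of the data.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Holdout procedure: this split is used only to obtain an error estimate.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=0)
estimate = DecisionTreeClassifier(random_state=0).fit(X_train, y_train).score(X_test, y_test)
print("holdout accuracy estimate:", estimate)

# The deployed model is then trained on the full dataset to make the most of the data.
final_model = DecisionTreeClassifier(random_state=0).fit(X, y)
```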

The Mystery Sound
● And what is this sound?