Evaluating Models Part 1

Evaluation is Creation
Geoff Hulten

Training and Testing Data

1) Training set: used to build the model.
2) Validation set: used to tune the parameters of the model.
3) Test set: used to estimate how well the model works.

Common pattern:

for p in parametersToTry:
    model.fit(trainX, trainY, p)
    accuracies[p] = evaluate(validationY, model.predict(validationX))
bestPFound = bestParametersFound(accuracies)
finalModel.fit(trainX + validationX, trainY + validationY, bestPFound)
estimateOfGeneralizationPerformance = evaluate(testY, finalModel.predict(testX))
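As a runnable illustration of that pattern, here is a minimal sketch assuming scikit-learn; the synthetic dataset, the 60/20/20 split, the KNeighborsClassifier, and the n_neighbors grid are illustrative assumptions, not choices from the lecture.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score
import numpy as np

# Illustrative synthetic data, split 60% train / 20% validation / 20% test.
X, y = make_classification(n_samples=1000, random_state=0)
trainX, restX, trainY, restY = train_test_split(X, y, test_size=0.4, random_state=0)
validationX, testX, validationY, testY = train_test_split(restX, restY, test_size=0.5, random_state=0)

# Tune the parameter on the validation set only.
parametersToTry = [1, 3, 5, 9]   # hypothetical n_neighbors grid
accuracies = {}
for p in parametersToTry:
    model = KNeighborsClassifier(n_neighbors=p).fit(trainX, trainY)
    accuracies[p] = accuracy_score(validationY, model.predict(validationX))
bestPFound = max(accuracies, key=accuracies.get)

# Refit on train + validation with the best parameter; touch the test set once.
finalModel = KNeighborsClassifier(n_neighbors=bestPFound)
finalModel.fit(np.concatenate([trainX, validationX]), np.concatenate([trainY, validationY]))
estimateOfGeneralizationPerformance = accuracy_score(testY, finalModel.predict(testX))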

Risks with Evaluation

Failure to generalize:
1) If you test on the same data you trained on, your estimate will be too optimistic.
2) If you evaluate against the test data repeatedly while debugging, your estimate will be too optimistic.

Failure to learn the best model you can:
3) If you reserve too much data for testing, you may not learn as good a model.

Types of Mistakes: Confusion Matrix

              Actual 1              Actual 0
Prediction 1  True Positive (TP)    False Positive (FP)
Prediction 0  False Negative (FN)   True Negative (TN)
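A minimal sketch of counting the four cells from paired labels and predictions; the labels below are made up for illustration.

# Count the four confusion-matrix cells for a binary classifier.
actual    = [1, 1, 1, 0, 0, 1, 0, 1]   # made-up ground-truth labels
predicted = [1, 0, 1, 0, 1, 1, 0, 1]   # made-up model outputs

tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)

print(f"TP={tp} FP={fp} FN={fn} TN={tn}")   # TP=4 FP=1 FN=1 TN=2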

Basic Evaluation Metrics

Accuracy: what fraction does it get right?
    (#TP + #TN) / #Total
Precision: when it says 1, how often is it right?
    #TP / (#TP + #FP)
Recall: what fraction of the 1s does it get right?
    #TP / (#TP + #FN)
False Positive Rate: what fraction of 0s are called 1s?
    #FP / (#FP + #TN)
False Negative Rate: what fraction of 1s are called 0s?
    #FN / (#TP + #FN)
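These definitions translate directly into code. The helper below is a hypothetical convenience function, not part of the lecture materials, and it assumes none of the denominators is zero.

# Compute the five metrics above from the four confusion-matrix counts.
def evaluation_metrics(tp, fp, tn, fn):
    total = tp + fp + tn + fn
    return {
        "accuracy":            (tp + tn) / total,
        "precision":           tp / (tp + fp),
        "recall":              tp / (tp + fn),
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (tp + fn),
    }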

Example of Evaluation Metrics

Suppose a model's confusion matrix over 100 examples is TP = 90, FP = 9, TN = 1, FN = 0 (counts consistent with the rates reported on the slide):

Accuracy: (#TP + #TN) / #Total = (90 + 1) / 100 = 91%
False Negative Rate: #FN / (#TP + #FN) = 0 / 90 = 0%
False Positive Rate: #FP / (#FP + #TN) = 9 / 10 = 90%

Despite 91% accuracy, this model calls 90% of the actual 0s 1s: accuracy alone can be badly misleading when the classes are imbalanced.
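A quick, self-contained check of that arithmetic, using the counts reconstructed above from the slide's reported rates:

# Hypothetical counts consistent with the slide's 91% / 0% / 90% figures.
tp, fp, tn, fn = 90, 9, 1, 0

accuracy = (tp + tn) / (tp + fp + tn + fn)
false_negative_rate = fn / (tp + fn)
false_positive_rate = fp / (fp + tn)

print(f"Accuracy:            {accuracy:.0%}")             # 91%
print(f"False Negative Rate: {false_negative_rate:.0%}")  # 0%
print(f"False Positive Rate: {false_positive_rate:.0%}")  # 90%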