Evaluating Models, Part 1: Evaluation is Creation
Geoff Hulten
Training and Testing Data

1) Training set: used to build the model
2) Validation set: used to tune the parameters of the model
3) Test set: used to estimate how well the model works

Common Pattern (pseudocode; a runnable sketch follows below):

    accuracies = {}
    for p in parametersToTry:
        model.fit(trainX, trainY, p)
        accuracies[p] = evaluate(validationY, model.predict(validationX))
    bestPFound = bestParametersFound(accuracies)
    finalModel.fit(trainX + validationX, trainY + validationY, bestPFound)
    estimateOfGeneralizationPerformance = evaluate(testY, finalModel.predict(testX))
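A minimal runnable sketch of this pattern, assuming scikit-learn with a decision tree whose max_depth is the parameter being swept; the dataset, parameter grid, and variable names are illustrative, not prescribed by the slides.

    # Hedged sketch of the train/validation/test pattern above (scikit-learn).
    # The dataset, learner, and tuned parameter (max_depth) are assumptions.
    import numpy as np
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import accuracy_score

    X, y = load_breast_cancer(return_X_y=True)

    # Split once into train / validation / test (60% / 20% / 20%).
    trainX, holdX, trainY, holdY = train_test_split(X, y, test_size=0.4, random_state=0)
    validationX, testX, validationY, testY = train_test_split(holdX, holdY, test_size=0.5, random_state=0)

    parametersToTry = [1, 2, 4, 8, 16]   # candidate max_depth values (assumed)
    accuracies = {}
    for p in parametersToTry:
        model = DecisionTreeClassifier(max_depth=p, random_state=0)
        model.fit(trainX, trainY)
        accuracies[p] = accuracy_score(validationY, model.predict(validationX))

    bestPFound = max(accuracies, key=accuracies.get)

    # Refit on train + validation with the chosen parameter; touch the test set only once.
    finalModel = DecisionTreeClassifier(max_depth=bestPFound, random_state=0)
    finalModel.fit(np.concatenate([trainX, validationX]), np.concatenate([trainY, validationY]))
    estimateOfGeneralizationPerformance = accuracy_score(testY, finalModel.predict(testX))
    print(bestPFound, estimateOfGeneralizationPerformance)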
Risks with Evaluation

Failure to generalize:
1) If you test on the same data you train on, you'll be too optimistic (see the sketch after this list)
2) If you evaluate on the test data a lot as you're debugging, you'll be too optimistic

Failure to learn the best model you can:
3) If you reserve too much data for testing, you might not learn as good a model
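A small sketch of risk 1, assuming scikit-learn and an unconstrained decision tree; the dataset and model are illustrative. Accuracy measured on the training data is typically far more optimistic than accuracy on held-out data.

    # Risk 1: accuracy measured on the training data is optimistic.
    # Dataset and model (an unpruned decision tree) are assumptions for illustration.
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import accuracy_score

    X, y = load_breast_cancer(return_X_y=True)
    trainX, testX, trainY, testY = train_test_split(X, y, test_size=0.3, random_state=1)

    model = DecisionTreeClassifier(random_state=1)   # no depth limit: easy to overfit
    model.fit(trainX, trainY)

    trainAccuracy = accuracy_score(trainY, model.predict(trainX))   # typically ~1.0
    testAccuracy = accuracy_score(testY, model.predict(testX))      # noticeably lower
    print(trainAccuracy, testAccuracy)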
Types of Mistakes: Confusion Matrix

                    Actual 1                Actual 0
Prediction 1   True Positive (TP)      False Positive (FP)
Prediction 0   False Negative (FN)     True Negative (TN)
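A sketch of computing a confusion matrix from labels and predictions using scikit-learn's confusion_matrix; the toy label vectors are made up for illustration. Note that scikit-learn puts actual labels on rows and predictions on columns, the transpose of the table above.

    # Confusion matrix from labels and predictions (toy, made-up vectors).
    # With labels=[1, 0], rows and columns are ordered 1 then 0:
    # rows = actual, columns = predicted (the transpose of the table above).
    from sklearn.metrics import confusion_matrix

    actual    = [1, 1, 1, 0, 0, 1, 0, 1]
    predicted = [1, 0, 1, 0, 1, 1, 0, 1]

    matrix = confusion_matrix(actual, predicted, labels=[1, 0])
    (tp, fn), (fp, tn) = matrix   # row 0 = actual 1, row 1 = actual 0
    print(matrix)
    print("TP:", tp, "FN:", fn, "FP:", fp, "TN:", tn)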
Basic Evaluation Metrics

Accuracy: what fraction does it get right?
    (# TP + # TN) / # Total
Precision: when it says 1, how often is it right?
    # TP / (# TP + # FP)
Recall: what fraction of 1s does it get right?
    # TP / (# TP + # FN)
False Positive Rate: what fraction of 0s are called 1s?
    # FP / (# FP + # TN)
False Negative Rate: what fraction of 1s are called 0s?
    # FN / (# TP + # FN)
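A minimal sketch of these formulas as Python functions over raw counts; the function names and the small example counts are assumptions for illustration, not from the slides.

    # The basic metrics above, computed directly from TP / FP / FN / TN counts.
    def accuracy(tp, fp, fn, tn):
        return (tp + tn) / (tp + fp + fn + tn)

    def precision(tp, fp):
        return tp / (tp + fp)

    def recall(tp, fn):   # also called the true positive rate
        return tp / (tp + fn)

    def false_positive_rate(fp, tn):
        return fp / (fp + tn)

    def false_negative_rate(fn, tp):
        return fn / (tp + fn)

    # Made-up example counts: TP=3, FP=2, FN=2, TN=3
    print(accuracy(3, 2, 2, 3))          # 0.6
    print(precision(3, 2))               # 0.6
    print(recall(3, 2))                  # 0.6
    print(false_positive_rate(2, 3))     # 0.4
    print(false_negative_rate(2, 3))     # 0.4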
Example of Evaluation Metrics

Confusion matrix for 100 examples (TP = 90, FP = 9, FN = 0, TN = 1):
                Actual 1    Actual 0
Prediction 1       90           9
Prediction 0        0           1

Accuracy: 91%                 (# TP + # TN) / # Total = (90 + 1) / 100
False Negative Rate: 0%       # FN / (# TP + # FN) = 0 / (90 + 0)
False Positive Rate: 90%      # FP / (# FP + # TN) = 9 / (9 + 1)

Accuracy looks excellent, yet the model calls 9 of the 10 actual 0s a 1: accuracy alone can hide this kind of mistake.
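A quick arithmetic check of this example in Python; the counts TP=90, FP=9, FN=0, TN=1 are reconstructed from the stated percentages rather than copied from the slide.

    # Verifying the example's figures from the reconstructed counts (an assumption).
    tp, fp, fn, tn = 90, 9, 0, 1
    total = tp + fp + fn + tn

    print("Accuracy:", (tp + tn) / total)           # 0.91
    print("False Negative Rate:", fn / (tp + fn))   # 0.0
    print("False Positive Rate:", fp / (fp + tn))   # 0.9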