1 SYSTEM EVALUATION  The goal is to estimate the error probability of the designed classification system.  Error Counting Technique  Let ω_1, …, ω_M be the M classes.  Let N_i be the number of data points in class ω_i used for testing, and N the total number of test points.  Let P_i be the probability of error for class ω_i.  The classifier is assumed to have been designed using another, independent data set.  Assuming that the feature vectors in the test data set are independent, the probability of k_i vectors from ω_i being in error follows the binomial distribution: P(k_i) = (N_i choose k_i) P_i^k_i (1 − P_i)^(N_i − k_i)
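Under the independence assumption, the binomial error-count probability described above can be sketched in Python (the function name is illustrative, not from the slides):

```python
from math import comb

def prob_k_errors(n_i: int, k_i: int, p_i: float) -> float:
    """Probability that exactly k_i of the N_i independent test vectors
    from class omega_i are misclassified, given true error rate P_i:
    C(N_i, k_i) * P_i**k_i * (1 - P_i)**(N_i - k_i)."""
    return comb(n_i, k_i) * p_i**k_i * (1 - p_i)**(n_i - k_i)
```

Summing this probability over all k_i from 0 to N_i gives 1, as it must for a valid distribution.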

2  Since the P_i’s are not known, estimate P_i by maximizing the above binomial distribution. It turns out that P̂_i = k_i / N_i.  Thus, count the errors and divide by the total number of test points in the class.  Total probability of error: P̂ = Σ_i P(ω_i) k_i / N_i
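A minimal sketch of the error counting described above (function names are my own), combining the per-class estimates with the class priors P(ω_i):

```python
def class_error_rate(k_i: int, n_i: int) -> float:
    """Maximum-likelihood estimate of P_i: counted errors k_i divided
    by the number of test points N_i of that class."""
    return k_i / n_i

def total_error(priors, errors, sizes) -> float:
    """Total probability of error: sum over classes of
    P(omega_i) * k_i / N_i."""
    return sum(p * k / n for p, k, n in zip(priors, errors, sizes))
```

For example, two equiprobable classes with 5 and 10 errors out of 100 test points each give a total error of 0.075.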

3  Statistical Properties: E[P̂_i] = P_i and σ²(P̂_i) = P_i (1 − P_i) / N_i. Thus the estimator is unbiased but only asymptotically consistent: its variance vanishes only as N_i → ∞. Hence, for small N, the estimate may not be reliable.  A theoretically derived estimate exists for a sufficient number N of points in the test data set.
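These properties can be sketched numerically. The sufficient-size rule below assumes the commonly quoted form N ≈ 100/P for the slide's estimate; treat that constant as an assumption rather than the slide's exact formula:

```python
import math

def estimator_variance(p_i: float, n_i: int) -> float:
    """Variance of the error-counting estimator, P_i(1 - P_i)/N_i.
    It shrinks only as N_i grows, hence 'asymptotically consistent'."""
    return p_i * (1 - p_i) / n_i

def sufficient_test_size(p: float) -> int:
    """Assumed rule of thumb N ~ 100 / P for a sufficient test set,
    i.e. enough points to expect on the order of 100 errors."""
    return math.ceil(100 / p)
```

Note how a true error rate of 1% would call for on the order of 10,000 test points, illustrating why small test sets give unreliable estimates.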

4  Exploiting the finite size of the data set.  Resubstitution method: Use the same data for training and testing. It underestimates the error (an overoptimistic “apparent” error). The estimate improves for large N and large ratios of N to the feature-space dimensionality.  Holdout method: Given N points, divide them into N_1 training points and N_2 test points, with N = N_1 + N_2. Problem: less data for both training and testing.
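A minimal holdout split, assuming the data set is a plain Python list (the function name and fixed seed are illustrative):

```python
import random

def holdout_split(data, n_train: int, seed: int = 0):
    """Shuffle the N points and split them into N1 training points
    and N2 = N - N1 test points; both sets are smaller than N."""
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]
```

Every point lands in exactly one of the two sets, which is what makes the test error estimate independent of training but also what shrinks both sets.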

5  Leave-One-Out Method. The steps: Choose one sample out of the N. Train the classifier using the remaining N − 1 samples. Test the classifier on the selected sample and count an error if it is misclassified. Repeat the above, excluding a different sample each time. Compute the error probability by averaging the counted errors.
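The steps above can be sketched as a generic loop; `train_fn` and `predict_fn` stand in for any classifier and are illustrative names, not from the slides:

```python
def leave_one_out_error(samples, labels, train_fn, predict_fn) -> float:
    """Leave-one-out estimate: for each of the N samples, train on the
    remaining N-1, test on the held-out one, and average the errors."""
    n = len(samples)
    errors = 0
    for i in range(n):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = train_fn(train_x, train_y)
        if predict_fn(model, samples[i]) != labels[i]:
            errors += 1
    return errors / n
```

Plugging in, say, a one-nearest-neighbour rule for `train_fn`/`predict_fn` reproduces the classic leave-one-out error count.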

6  Advantages: All data are used for both testing and training. Independence between test and training samples is assured.  Disadvantages: High computational complexity.  Variants of the method exclude k > 1 points each time to reduce the complexity.
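One such variant, excluding k points per round (a sketch under the same illustrative `train_fn`/`predict_fn` convention as before):

```python
def leave_k_out_error(samples, labels, train_fn, predict_fn, k: int) -> float:
    """Hold out k points per round instead of one, so the classifier is
    retrained only about N/k times instead of N times."""
    n = len(samples)
    errors = 0
    for start in range(0, n, k):
        held = set(range(start, min(start + k, n)))
        train_x = [samples[j] for j in range(n) if j not in held]
        train_y = [labels[j] for j in range(n) if j not in held]
        model = train_fn(train_x, train_y)
        for j in held:
            if predict_fn(model, samples[j]) != labels[j]:
                errors += 1
    return errors / n
```

With k = 1 this reduces to leave-one-out; larger k trades a slightly more pessimistic estimate for fewer retrainings.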