Evaluating Hypotheses


Evaluating Hypotheses
Natural Language Processing Lab, Jang Jeong-ho

Overview
Evaluating the accuracy of hypotheses is fundamental to machine learning:
- it lets us decide whether to use a given hypothesis
- it is an integral component of many learning systems
Difficulty arises from the limited set of available data:
- bias in the estimate
- variance in the estimate

1. Contents
- Methods for evaluating learned hypotheses
- Methods for comparing the accuracy of two hypotheses
- Methods for comparing the accuracy of two learning algorithms when only a limited set of data is available

2. Estimating Hypothesis Accuracy
Two questions of interest:
1. Given a hypothesis h and a data sample, what is the best estimate of the accuracy of h over unseen data?
2. What is the probable error in this accuracy estimate?

2. Evaluating… (Cont'd)
Two Definitions of Error
1. Sample error of h with respect to target function f and data sample S:
   error_S(h) = (1/n) * Σ_{x in S} δ(f(x) ≠ h(x)), where δ(true) = 1 and δ(false) = 0
2. True error of h with respect to target function f and distribution D:
   error_D(h) = Pr_{x ~ D}[ f(x) ≠ h(x) ]
How good an estimate of error_D(h) is provided by error_S(h)?
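The sample error is directly computable; a minimal sketch, where the labels f(x) and predictions h(x) are made-up illustration data:

```python
def sample_error(f_labels, h_predictions):
    """error_S(h) = (1/n) * sum of delta(f(x) != h(x)) over the sample S."""
    n = len(f_labels)
    return sum(1 for f_x, h_x in zip(f_labels, h_predictions) if f_x != h_x) / n

# Hypothetical labels f(x) and predictions h(x) over a sample of n = 4 examples
f_labels = [1, 0, 1, 1]
h_predictions = [1, 1, 1, 0]
print(sample_error(f_labels, h_predictions))  # 0.5: h errs on 2 of 4 examples
```

The true error, by contrast, is defined over the whole distribution D and can only be estimated from such samples.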

2. Evaluating… (Cont'd)
Problems in Estimating Error
1. Bias: if S is the training set, error_S(h) is an optimistically biased estimate:
   estimation bias = E[error_S(h)] - error_D(h)
   For an unbiased estimate, h and S must be chosen independently.
2. Variance: even with an unbiased S, error_S(h) may still vary from error_D(h).

2. Evaluating… (Cont'd)
Estimators
Experiment:
1. Choose a sample S of size n according to distribution D
2. Measure error_S(h)
error_S(h) is a random variable, and it is an unbiased estimator of error_D(h).
Given an observed error_S(h), what can we conclude about error_D(h)?

2. Evaluating… (Cont'd)
Confidence Interval
If
1. S contains n examples, drawn independently of h and of each other
2. n >= 30
then with approximately N% probability, error_D(h) lies in the interval
   error_S(h) ± z_N * sqrt( error_S(h) * (1 - error_S(h)) / n )
where z_N is the constant for the chosen confidence level (e.g. z_90 = 1.64, z_95 = 1.96, z_99 = 2.58).
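A sketch of the interval computation; the z_N table below holds the standard two-sided Normal constants:

```python
import math

# Two-sided confidence level -> z_N (standard Normal constants)
Z = {0.50: 0.67, 0.80: 1.28, 0.90: 1.64, 0.95: 1.96, 0.98: 2.33, 0.99: 2.58}

def error_confidence_interval(error_s, n, confidence=0.95):
    """N% interval for error_D(h): error_S(h) +/- z_N * sqrt(error_S(h)(1-error_S(h))/n)."""
    if n < 30:
        raise ValueError("the Normal approximation assumes n >= 30")
    margin = Z[confidence] * math.sqrt(error_s * (1 - error_s) / n)
    return error_s - margin, error_s + margin

# e.g. error_S(h) = 0.30 observed on n = 100 test examples
low, high = error_confidence_interval(0.30, 100)
print(f"{low:.3f} {high:.3f}")  # 0.210 0.390
```

So a 30% observed error on 100 examples pins the true error down only to roughly the 21%-39% range at the 95% level.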

2. Evaluating… (Cont'd)
Normal Distribution Approximates Binomial Distribution
error_S(h) follows a Binomial distribution, with
   mean = error_D(h) and standard deviation = sqrt( error_D(h) * (1 - error_D(h)) / n )
For n >= 30, approximate this by a Normal distribution with the same mean and standard deviation.

2. Evaluating… (Cont'd)
More Correct Confidence Interval
If
1. S contains n examples, drawn independently of h and of each other
2. n >= 30
then with approximately 95% probability, error_S(h) lies in the interval
   error_D(h) ± 1.96 * sqrt( error_D(h) * (1 - error_D(h)) / n )
equivalently, error_D(h) lies in the interval
   error_S(h) ± 1.96 * sqrt( error_D(h) * (1 - error_D(h)) / n )
which is approximately
   error_S(h) ± 1.96 * sqrt( error_S(h) * (1 - error_S(h)) / n )

2. Evaluating… (Cont'd)
Two-sided and One-sided Bounds
1. Two-sided: what is the probability that error_D(h) is between L and U?
2. One-sided: what is the probability that error_D(h) is at most U?
A 100(1-α)% two-sided confidence interval implies a 100(1-α/2)% one-sided bound.

3. General Confidence Interval
Consider a set of independent, identically distributed random variables Y1…Yn, all governed by an arbitrary probability distribution with mean μ and variance σ².
Define the sample mean
   Ȳ = (1/n) * Σ_{i=1..n} Yi
Central Limit Theorem: as n → ∞, the distribution governing Ȳ approaches a Normal distribution with mean μ and variance σ²/n.
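The theorem can be illustrated empirically; a sketch drawing from a decidedly non-Normal Uniform(0, 1) distribution, whose mean is 0.5 and variance 1/12:

```python
import random
import statistics

random.seed(0)  # fixed seed so the simulation is repeatable

def sample_mean(n):
    """Mean of n i.i.d. Uniform(0, 1) draws (mu = 0.5, sigma^2 = 1/12)."""
    return sum(random.random() for _ in range(n)) / n

# Distribution of the sample mean for n = 40, over many repeated experiments
means = [sample_mean(40) for _ in range(5000)]
print(round(statistics.mean(means), 2))      # close to mu = 0.5
print(round(statistics.variance(means), 4))  # close to sigma^2/n = (1/12)/40 ~ 0.0021
```

Plotting a histogram of `means` would show the bell shape the theorem predicts, even though each Yi is uniform.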

3. General Confidence Interval (Cont'd)
1. Pick the parameter p to be estimated: error_D(h)
2. Choose an estimator: error_S(h)
3. Determine the probability distribution that governs the estimator: error_S(h) is governed by a Binomial distribution, approximated by a Normal distribution when n >= 30
4. Find the interval (L, U) such that N% of the probability mass falls in the interval

4. Difference in Error of Two Hypotheses
Setup:
- two hypotheses h1 and h2
- h1 is tested on sample S1 containing n1 random examples; h2 is tested on sample S2 containing n2 random examples
Objective: estimate the difference between the two true errors,
   d = error_D(h1) - error_D(h2)

4. Difference in Error of Two Hypotheses (Cont'd)
Procedure:
1. Choose an estimator for d:
   d̂ = error_S1(h1) - error_S2(h2)
2. Determine the probability distribution that governs the estimator: approximately Normal, with mean d and standard deviation
   σ_d̂ ≈ sqrt( error_S1(h1)(1 - error_S1(h1))/n1 + error_S2(h2)(1 - error_S2(h2))/n2 )
3. Find the interval (L, U) such that N% of the probability mass falls in the interval

4. Difference in Error of Two Hypotheses (Cont'd)
Hypothesis Test
Example: |S1| = |S2| = 100, error_S1(h1) = 0.30, error_S2(h2) = 0.20.
What is the probability that error_D(h1) > error_D(h2)?

4. Difference in Error of Two Hypotheses (Cont'd)
Solution
1. The problem is equivalent to finding the probability that d > 0, i.e. that the observed d̂ = 0.10 did not overestimate d by more than 0.10.
2. From the expression for σ_d̂ above,
   σ_d̂ ≈ sqrt( 0.3·0.7/100 + 0.2·0.8/100 ) ≈ 0.061, so 0.10 ≈ 1.64 σ_d̂
3. The Normal table shows that the confidence level associated with z = 1.64 for a two-sided interval is 90%, so for the one-sided bound it is 95%. Hence Pr(error_D(h1) > error_D(h2)) ≈ 0.95.
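The arithmetic in this solution can be checked with a short sketch; `statistics.NormalDist` supplies the standard Normal CDF:

```python
import math
from statistics import NormalDist

def one_sided_confidence(error_s1, n1, error_s2, n2):
    """Pr(error_D(h1) > error_D(h2)) under the Normal approximation of
    d_hat = error_S1(h1) - error_S2(h2)."""
    d_hat = error_s1 - error_s2
    sigma = math.sqrt(error_s1 * (1 - error_s1) / n1
                      + error_s2 * (1 - error_s2) / n2)
    return NormalDist().cdf(d_hat / sigma)  # one-sided probability that d > 0

# The example above: error_S1(h1) = 0.30, error_S2(h2) = 0.20, n1 = n2 = 100
p = one_sided_confidence(0.30, 100, 0.20, 100)
print(round(p, 2))  # 0.95
```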

5. Comparing Two Learning Algorithms
What we'd like to estimate:
   E_{S ⊂ D} [ error_D(L_A(S)) - error_D(L_B(S)) ]
where L(S) is the hypothesis output by learner L using training set S.
But given limited data D0, what is a good estimator?
- Could partition D0 into training set S0 and test set T0, and measure error_T0(L_A(S0)) - error_T0(L_B(S0))
- Even better, repeat this many times and average the results

5. Comparing Two Learning Algorithms (Cont'd)
1. Partition the data D0 into k disjoint test sets T1, T2, …, Tk of equal size, where this size is at least 30.
2. For 1 <= i <= k:
   use Ti as the test set and the remaining data as the training set Si = D0 - Ti
   hA = LA(Si), hB = LB(Si)
   δi = error_Ti(hA) - error_Ti(hB)
3. Return the average δ̄ = (1/k) Σ_{i=1..k} δi

5. Comparing Two Learning Algorithms (Cont'd)
4. Now use the paired t test on the δi to obtain a confidence interval.
The result: the N% confidence interval estimate for d is
   δ̄ ± t_{N, k-1} · s_δ̄
where s_δ̄ = sqrt( (1/(k(k-1))) Σ_{i=1..k} (δi - δ̄)² ) and t_{N, k-1} is the t-distribution constant for confidence level N with k-1 degrees of freedom.
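A sketch of steps 3-4 on hypothetical per-fold differences δi; the t constant 2.776 for 95% confidence and k-1 = 4 degrees of freedom is taken from a t table, since the Python standard library has no t distribution:

```python
import math

def paired_t_interval(deltas, t_constant):
    """delta_bar +/- t_{N,k-1} * s, with
    s = sqrt( (1/(k(k-1))) * sum (delta_i - delta_bar)^2 )."""
    k = len(deltas)
    delta_bar = sum(deltas) / k
    s = math.sqrt(sum((d - delta_bar) ** 2 for d in deltas) / (k * (k - 1)))
    return delta_bar - t_constant * s, delta_bar + t_constant * s

# Hypothetical delta_i = error_Ti(hA) - error_Ti(hB) from k = 5 folds
deltas = [0.05, 0.02, 0.03, 0.07, 0.03]
low, high = paired_t_interval(deltas, t_constant=2.776)  # t_{95, 4} = 2.776
print(f"{low:.3f} {high:.3f}")  # 0.015 0.065
```

Since this hypothetical interval excludes zero, one would conclude (at the 95% level) that LA has higher error than LB on this data.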