Slide 1: 10-610 The KDD Lab Intro: Outcome Analysis
Sebastian Thrun, Carnegie Mellon University
www.cs.cmu.edu/~10610
© Sebastian Thrun, CMU, 2000
Slide 2: Problem 1
On testing data, your speech recognizer recognizes sentences with 68% word accuracy, whereas previous recognizers achieve 60%. Would you advise a company to adopt your speech recognizer?
Slide 3: Problem 2
On testing data, your data mining algorithm predicts emergency C-sections with 68% accuracy, whereas a previous $1,000 test achieves 60% accuracy. Do you recommend replacing the previous test with your new method?
Slide 4: Characterize: What Should We Worry About?
- Pattern classification (+/-): classification error, FP/FN errors, cost/loss
- Regression: quadratic error
- Unsupervised learning: log likelihood
Slide 5: ROC Curves (ROC = Receiver Operating Characteristic)
Slide 6: Error Types
- Type I error (alpha error, false positive): probability of accepting the hypothesis when it is not true
- Type II error (beta error, false negative): probability of rejecting the hypothesis when it is true
Slide 7: ROC Curves (ROC = Receiver Operating Characteristic)
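An ROC curve is traced by sweeping a decision threshold over the classifier's scores and recording the false-positive and true-positive rates at each threshold. The following is a minimal sketch, not part of the original slides; the `scores` and `labels` arrays are made-up example data.

```python
# Sketch: trace ROC points by sweeping a threshold over classifier scores.
import numpy as np

def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    order = np.argsort(-scores)              # sort examples by descending score
    labels = np.asarray(labels)[order]
    tps = np.cumsum(labels == 1)             # true positives above each threshold
    fps = np.cumsum(labels == 0)             # false positives above each threshold
    tpr = tps / max((labels == 1).sum(), 1)
    fpr = fps / max((labels == 0).sum(), 1)
    return fpr, tpr

# Hypothetical scores and ground-truth labels
scores = np.array([0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2])
labels = np.array([1,   1,   0,   1,   0,    1,   0,   0  ])
fpr, tpr = roc_points(scores, labels)
print(list(zip(fpr.round(2), tpr.round(2))))
```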
Slide 8: ROC Curves (ROC = Receiver Operating Characteristic)
- Sensitivity: probability that a test result will be positive when the disease is present
- Specificity: probability that a test result will be negative when the disease is not present
- Positive likelihood ratio: ratio of the probability of a positive test result given the presence of the disease to the probability of a positive test result given the absence of the disease
- Negative likelihood ratio: ratio of the probability of a negative test result given the presence of the disease to the probability of a negative test result given the absence of the disease
- Positive predictive value (PPV): probability that the disease is present when the test is positive
- Negative predictive value (NPV): probability that the disease is not present when the test is negative
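All six quantities on this slide follow directly from a 2x2 confusion matrix. A small sketch with assumed counts (not from the slides):

```python
# Sketch: the six ROC-slide quantities computed from confusion-matrix counts.
def diagnostic_metrics(tp, fp, fn, tn):
    sensitivity = tp / (tp + fn)                 # P(test + | disease present)
    specificity = tn / (tn + fp)                 # P(test - | disease absent)
    lr_pos = sensitivity / (1 - specificity)     # positive likelihood ratio
    lr_neg = (1 - sensitivity) / specificity     # negative likelihood ratio
    ppv = tp / (tp + fp)                         # P(disease present | test +)
    npv = tn / (tn + fn)                         # P(disease absent  | test -)
    return dict(sensitivity=sensitivity, specificity=specificity,
                lr_pos=lr_pos, lr_neg=lr_neg, ppv=ppv, npv=npv)

# Assumed example counts; PPV/NPV depend on the prevalence in the test set.
print(diagnostic_metrics(tp=30, fp=10, fn=5, tn=55))
```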
Slide 9: Evaluating Machine Learning Algorithms
Plenty of data vs. little data
Slide 10: Holdout Set
Split the data: train on one part, evaluate the error on the held-out part. Often also used for parameter optimization.
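As a minimal sketch of the holdout idea (not from the slides): the data arrays and the `fit`/`predict` callables below are hypothetical placeholders for whatever learner is being evaluated.

```python
# Sketch: estimate the error on a randomly held-out portion of the data.
import numpy as np

def holdout_error(X, y, fit, predict, holdout_frac=0.3, seed=0):
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    n_hold = int(holdout_frac * len(y))
    hold, train = idx[:n_hold], idx[n_hold:]
    model = fit(X[train], y[train])                       # train on the training part
    return np.mean(predict(model, X[hold]) != y[hold])    # misclassification rate on holdout
```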
Slide 11: Example
A hypothesis misclassifies 12 out of 40 examples in the cross-validation set S.
Q: What will the "true" error on future examples be?
A:
Slide 12: Finite Cross-Validation Set
D = all data, S = test data, m = number of test examples
True error (true risk): e_D(h) = Pr_{x in D} [ h(x) ≠ f(x) ]
Test error (empirical risk): e_S(h) = (1/m) Σ_{x in S} δ( h(x) ≠ f(x) )
Slide 13: Confidence Intervals (see Mitchell 97)
If S contains m examples, drawn independently, and m ≥ 30, then with approximately 95% probability the true error e_D lies in the interval
e_S(h) ± 1.96 · sqrt( e_S(h) (1 − e_S(h)) / m )
Slide 14: Example
A hypothesis misclassifies 12 out of 40 examples in the cross-validation set S.
Q: What will the "true" error on future examples be?
A: With 95% confidence, the true error lies in the interval
0.30 ± 1.96 · sqrt( 0.30 · 0.70 / 40 ) ≈ 0.30 ± 0.14, i.e. roughly [0.16, 0.44].
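A quick check of the slide's arithmetic, as a sketch:

```python
# Sketch: 12 errors out of 40 test examples, 95% interval e_S ± 1.96·sqrt(e_S(1-e_S)/m).
import math

m, errors = 40, 12
e_s = errors / m                                      # 0.30
half_width = 1.96 * math.sqrt(e_s * (1 - e_s) / m)    # about 0.142
print(f"true error in [{e_s - half_width:.2f}, {e_s + half_width:.2f}] with ~95% confidence")
# prints roughly [0.16, 0.44]
```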
Slide 15: Confidence Intervals (see Mitchell 97)
If S contains n examples, drawn independently, and n ≥ 30, then with approximately N% probability the true error e_D lies in the interval
e_S(h) ± z_N · sqrt( e_S(h) (1 − e_S(h)) / n )

N%:   50%   68%   80%   90%   95%   98%   99%
z_N:  0.67  1.00  1.28  1.64  1.96  2.33  2.58
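The z_N row of the table is just the two-sided quantile of the standard normal distribution. A sketch, assuming scipy is available:

```python
# Sketch: recover the z_N values in the table from the normal quantile function.
from scipy.stats import norm

def z_for_confidence(conf):
    """Two-sided z value, e.g. conf=0.95 gives about 1.96."""
    return norm.ppf(0.5 + conf / 2.0)

for conf in (0.50, 0.68, 0.80, 0.90, 0.95, 0.98, 0.99):
    print(f"{int(conf * 100)}%: z = {z_for_confidence(conf):.2f}")
```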
Slide 16: Finite Cross-Validation Set
True error: e_D(h); test error: e_S(h)
Number of test errors: r = m · e_S(h)
r is binomially distributed: P(r) = C(m, r) · e_D(h)^r · (1 − e_D(h))^(m − r)
Slide 17: Binomial Distribution
[Figure: binomial distribution P(k) for e_D = 0.3 and m = 40.]
Approximates a Normal distribution (Central Limit Theorem).
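The slide's setting (e_D = 0.3, m = 40) can be reproduced numerically to see how close the binomial count of test errors is to its normal approximation. A sketch, assuming scipy is available:

```python
# Sketch: binomial distribution of the error count vs. its normal approximation.
import numpy as np
from scipy.stats import binom, norm

e_d, m = 0.3, 40
k = np.arange(m + 1)
p_binom = binom.pmf(k, m, e_d)                                        # exact P(k errors)
p_norm = norm.pdf(k, loc=m * e_d, scale=np.sqrt(m * e_d * (1 - e_d)))  # CLT approximation
print("max |binomial - normal| =", np.abs(p_binom - p_norm).max())     # small for m = 40
```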
Slide 18: 95% Confidence Intervals
Slide 19: Question
What's the difference between the variance and the confidence interval? Basically a scaling factor: the interval half-width is the standard error (the square root of the estimate's variance) scaled by z_N, e.g. 1.96 for 95%.
Slide 20: Common Performance Plot
[Figure: testing error plotted with 95% confidence-interval error bars.]
Slide 21: Comparing Different Hypotheses
True difference: d = e_D(h1) − e_D(h2)
Test set difference: d̂ = e_S1(h1) − e_S2(h2)
95% confidence interval: d̂ ± 1.96 · sqrt( e_S1(h1)(1 − e_S1(h1))/m1 + e_S2(h2)(1 − e_S2(h2))/m2 )
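A sketch of the interval for the difference in true errors; the error rates echo Problem 1 (68% vs. 60% accuracy), and the test set sizes of 500 are an assumed example, not from the slides:

```python
# Sketch: confidence interval for the difference in true errors of two hypotheses.
import math

def diff_confidence_interval(e1, m1, e2, m2, z=1.96):
    d_hat = e1 - e2
    sd = math.sqrt(e1 * (1 - e1) / m1 + e2 * (1 - e2) / m2)
    return d_hat - z * sd, d_hat + z * sd

# Error rates 0.32 vs 0.40 on (assumed) 500-example test sets:
print(diff_confidence_interval(0.32, 500, 0.40, 500))
# interval is roughly (-0.14, -0.02), i.e. it excludes zero
```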
Slide 22: Evaluating Machine Learning Algorithms
Plenty of data vs. little data
Slide 23: Holdout Set
Split the data: train on one part, evaluate the error on the held-out part.
Slide 24: k-fold Cross Validation
Split the data k ways. For each of the k partitions, train on the other k−1 parts (yellow) and evaluate on the held-out part (pink), yielding error_1, ..., error_k.
error = (1/k) · Σ_i error_i
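A minimal sketch of the procedure above; as before, `fit` and `predict` are hypothetical callables for the learner being evaluated, and k = 8 matches the eight folds drawn on the slide:

```python
# Sketch: k-fold cross-validation, averaging the per-fold misclassification rates.
import numpy as np

def k_fold_error(X, y, fit, predict, k=8, seed=0):
    idx = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(idx, k)
    errors = []
    for i in range(k):
        test = folds[i]                                            # held-out fold (pink)
        train = np.concatenate([folds[j] for j in range(k) if j != i])  # remaining folds (yellow)
        model = fit(X[train], y[train])
        errors.append(np.mean(predict(model, X[test]) != y[test]))
    return np.mean(errors), errors                                  # error = sum(error_i) / k
```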
Slide 25: The Jackknife
[Figure: jackknife partitioning of the data.]
Slide 26: The Bootstrap
Draw a training sample from the data (yellow), evaluate the error on the remaining points (pink); repeat and average.
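A sketch of bootstrap error estimation in the sense of the slide (sample a training set with replacement, test on the points left out, repeat and average); `fit` and `predict` are again hypothetical callables:

```python
# Sketch: bootstrap error estimation with out-of-sample evaluation.
import numpy as np

def bootstrap_error(X, y, fit, predict, n_rounds=100, seed=0):
    rng = np.random.default_rng(seed)
    n, errors = len(y), []
    for _ in range(n_rounds):
        train = rng.integers(0, n, size=n)            # sample n indices with replacement
        test = np.setdiff1d(np.arange(n), train)      # points never sampled ("pink")
        if len(test) == 0:
            continue
        model = fit(X[train], y[train])
        errors.append(np.mean(predict(model, X[test]) != y[test]))
    return np.mean(errors)                            # average over bootstrap rounds
```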
Slide 27: What's the Problem?
Confidence intervals assume independence, but our individual estimates are dependent.
Slide 28: Comparing Different Hypotheses: Paired t Test
True difference: d = e_D(h1) − e_D(h2)
For each partition k: d_k = e_{S_k}(h1) − e_{S_k}(h2), the test-error difference on partition k
Average: d̄ = (1/K) · Σ_k d_k
N% confidence interval: d̄ ± t_{N, K−1} · s_d̄, where s_d̄ = sqrt( Σ_k (d_k − d̄)² / (K(K−1)) ),
K−1 is the number of degrees of freedom and N is the confidence level.

t values by degrees of freedom ν and confidence level N:
         90%    95%    98%    99%
ν = 2    2.92   4.30   6.96   9.92
ν = 5    2.02   2.57   3.36   4.03
ν = 10   1.81   2.23   2.76   3.17
ν = 20   1.72   2.09   2.53   2.84
ν = 30   1.70   2.04   2.46   2.75
ν = 120  1.66   1.98   2.36   2.62
ν = ∞    1.64   1.96   2.33   2.58
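A sketch of the paired interval from per-partition errors; the two error lists are made-up example values, and scipy is assumed to be available for the t quantile:

```python
# Sketch: paired t confidence interval for the true difference between two hypotheses.
import numpy as np
from scipy.stats import t

def paired_t_interval(errors_a, errors_b, conf=0.95):
    d = np.asarray(errors_a) - np.asarray(errors_b)    # d_k for each partition k
    k = len(d)
    d_bar = d.mean()
    s_dbar = np.sqrt(np.sum((d - d_bar) ** 2) / (k * (k - 1)))  # standard error of d_bar
    t_val = t.ppf(0.5 + conf / 2.0, df=k - 1)           # k-1 degrees of freedom
    return d_bar - t_val * s_dbar, d_bar + t_val * s_dbar

# Hypothetical per-partition test errors of two hypotheses:
print(paired_t_interval([0.12, 0.10, 0.15, 0.11, 0.13],
                        [0.14, 0.13, 0.16, 0.15, 0.14]))
```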
Slide 29: Evaluating Machine Learning Algorithms
Plenty of data vs. little data vs. unlimited data
Slide 30: Asymptotic Prediction
Useful for very large data sets.
Slide 31: Summary
- Know your loss function!
- Finite testing data: report confidence intervals.
- Scarce data: repartition the training/testing set.
- Asymptotic prediction: exponential.
- Put thought into your evaluation, and be critical. Convince yourself!