Data Mining – Evaluation: 8. Evaluation Methods
H. Liu (ASU) & G. Dong (WSU), 7/03

Presentation transcript:

Slide 1: 8. Evaluation Methods
  – Errors and Error Rates
  – Precision and Recall
  – Similarity
  – Cross Validation
  – Various Presentations of Evaluation Results
  – Statistical Tests

Slide 2: How to evaluate/estimate error
Resubstitution
  – one data set used for both training and testing
Holdout (training and testing)
  – 2/3 for training, 1/3 for testing
Leave-one-out
  – if a data set is small
Cross validation
  – 10-fold, why 10?
  – m × 10-fold CV
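The holdout and leave-one-out schemes above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names `holdout_split` and `leave_one_out`, the fixed seed, and the rounding of the 2/3 cut point are all assumptions made for the example.

```python
import random

def holdout_split(data, train_fraction=2 / 3, seed=0):
    """Shuffle a copy of the data and split it into (train, test)."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)  # assumed rounding rule
    return shuffled[:cut], shuffled[cut:]

def leave_one_out(data):
    """Yield (train, test) pairs in which test is one held-out instance."""
    items = list(data)
    for i in range(len(items)):
        yield items[:i] + items[i + 1:], [items[i]]

train, test = holdout_split(range(9))
print(len(train), len(test))

for train_part, test_part in leave_one_out(["a", "b", "c"]):
    print(train_part, test_part)
```

Leave-one-out trains once per instance, which is why the slide recommends it only for small data sets.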

Slide 3: Error and Error Rate
Mean and Median
  – mean = $\frac{1}{n}\sum_{i=1}^{n} x_i$
  – weighted mean = $\left(\sum_i w_i x_i\right) / \sum_i w_i$
  – median = $x_{(n+1)/2}$ if $n$ is odd, else $\left(x_{n/2} + x_{(n/2)+1}\right)/2$ (values sorted)
Error: disagreement between y and y' (predicted)
  – 1 if they disagree, 0 otherwise (0-1 loss $l_{01}$)
  – Other definitions depend on the output of a predictor, such as quadratic loss $l_2$ and absolute loss $l_{|\cdot|}$
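The summary statistics and loss functions on this slide translate directly into code. The sketch below uses hypothetical function names and assumes numeric inputs; it is meant only to make the formulas concrete.

```python
def mean(xs):
    return sum(xs) / len(xs)

def weighted_mean(xs, ws):
    return sum(w * x for w, x in zip(ws, xs)) / sum(ws)

def median(xs):
    s = sorted(xs)
    n = len(s)
    mid = n // 2
    # Odd n: the middle value; even n: average of the two middle values.
    return s[mid] if n % 2 == 1 else (s[mid - 1] + s[mid]) / 2

def zero_one_loss(y, y_pred):
    return 0 if y == y_pred else 1

def quadratic_loss(y, y_pred):
    return (y - y_pred) ** 2

def absolute_loss(y, y_pred):
    return abs(y - y_pred)

print(mean([1, 2, 3, 4]), median([4, 1, 3, 2]), weighted_mean([1, 3], [1, 3]))
```

The 0-1 loss suits class labels, while the quadratic and absolute losses assume a numeric prediction.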

Slide 4: Error estimation
  – Error rate e = #Errors/N, where N is the total number of instances
  – Accuracy A = 1 − e

Slide 5: Precision and Recall
False negative and false positive
Types of errors for k classes = $k^2 - k$
  – k = 3: 3*3 − 3 = 6; k = 2: 2*2 − 2 = 2
Precision (with respect to the retrieved)
  – P = TP/(TP+FP)
Recall (with respect to the total relevant)
  – R = TP/(TP+FN)
Precision×Recall (PR) and PR gain
  – PR gain = (PR' − PR_0)/PR_0
Accuracy
  – A = (TP+TN)/(TP+TN+FP+FN)
Confusion matrix (rows: observed class O, columns: predicted class Pred):
  O \ Pred | P've | N've
  P've     | TP   | FN
  N've     | FP   | TN
Precision is computed over the predicted-positive column, recall over the observed-positive row.
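As a worked example of these definitions, the following sketch counts TP, FP, FN, and TN for a two-class problem and derives precision, recall, accuracy, and error rate from them. The function name `confusion_counts`, the label encoding (1 = positive), and the sample labels are made up for illustration; the code also assumes at least one predicted positive and one actual positive, so the denominators are non-zero.

```python
def confusion_counts(actual, predicted, positive=1):
    """Return (TP, FP, FN, TN) for a binary classification result."""
    tp = fp = fn = tn = 0
    for y, y_pred in zip(actual, predicted):
        if y == positive and y_pred == positive:
            tp += 1
        elif y == positive:
            fn += 1  # actual positive predicted as negative
        elif y_pred == positive:
            fp += 1  # actual negative predicted as positive
        else:
            tn += 1
    return tp, fp, fn, tn

actual    = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]

tp, fp, fn, tn = confusion_counts(actual, predicted)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
accuracy = (tp + tn) / (tp + tn + fp + fn)
error_rate = 1 - accuracy

print(tp, fp, fn, tn)                           # 3 1 1 3
print(precision, recall, accuracy, error_rate)  # 0.75 0.75 0.75 0.25
```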

Slide 6: Similarity or Dissimilarity Measures
Distance (dissimilarity) measures (Triangle Inequality)
  – Euclidean
  – City-block, or Manhattan
  – Cosine$(p_i, p_j) = \sum_k p_{ik} p_{jk} \,/\, \sqrt{\sum_k p_{ik}^2 \, \sum_k p_{jk}^2}$
Inter-clusters and intra-clusters
  – Single linkage vs. complete linkage
      $D_{min} = \min |p_i - p_j|$, two data points
      $D_{max} = \max |p_i - p_j|$
  – Centroid methods
      $D_{avg} = \frac{1}{n_i n_j} \sum \sum |p_i - p_j|$
      $D_{mean} = |m_i - m_j|$, two means
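The point-to-point measures and the four inter-cluster distances can be sketched as follows. All function names are hypothetical, points are assumed to be equal-length numeric tuples, and Euclidean distance is assumed as the default for |p_i − p_j|.

```python
import math

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

def cosine(p, q):
    dot = sum(a * b for a, b in zip(p, q))
    return dot / math.sqrt(sum(a * a for a in p) * sum(b * b for b in q))

def single_linkage(A, B, dist=euclidean):
    """D_min: distance between the closest pair of points."""
    return min(dist(p, q) for p in A for q in B)

def complete_linkage(A, B, dist=euclidean):
    """D_max: distance between the farthest pair of points."""
    return max(dist(p, q) for p in A for q in B)

def average_distance(A, B, dist=euclidean):
    """D_avg: mean of all pairwise distances between the clusters."""
    return sum(dist(p, q) for p in A for q in B) / (len(A) * len(B))

def centroid_distance(A, B, dist=euclidean):
    """D_mean: distance between the two cluster means."""
    def centroid(cluster):
        return tuple(sum(c) / len(cluster) for c in zip(*cluster))
    return dist(centroid(A), centroid(B))

A = [(0, 0), (0, 1)]
B = [(3, 0), (3, 1)]
print(single_linkage(A, B), complete_linkage(A, B))
print(average_distance(A, B), centroid_distance(A, B))
```

Note that the cosine formula returns a similarity (1 for parallel vectors, 0 for orthogonal ones), so larger values mean closer points, the opposite of the two distances.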

Slide 7: k-Fold Cross Validation
Cross validation
  – 1 fold for testing, the rest for training
  – rotate until every fold has been used for testing
  – calculate the average
m × k-fold cross validation
  – reshuffle data, repeat XV m times
  – what is a suitable k?
Model complexity
  – use of XV: tree complexity, training/testing error rates
[Figure: data partitioned into Fold 1, Fold 2, Fold 3]
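A minimal sketch of k-fold and m × k-fold cross validation follows, assuming the usual convention that each fold serves exactly once as the test set while the remaining folds are used for training. The helper names, the `fit`/`predict` interface, and the majority-class baseline learner are hypothetical choices made for this example; the code also assumes n ≥ k so that no fold is empty.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k folds of near-equal size
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

def cross_validate(xs, ys, fit, predict, k=10, seed=0):
    """Return the error rate averaged over the k test folds."""
    fold_errors = []
    for train, test in k_fold_indices(len(xs), k, seed):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        wrong = sum(predict(model, xs[i]) != ys[i] for i in test)
        fold_errors.append(wrong / len(test))
    return sum(fold_errors) / len(fold_errors)

def repeated_cross_validate(xs, ys, fit, predict, k=10, m=5):
    """m x k-fold CV: reshuffle with a new seed each run, then average."""
    runs = [cross_validate(xs, ys, fit, predict, k, seed=s) for s in range(m)]
    return sum(runs) / m

# Hypothetical baseline learner: always predict the majority training label.
def fit_majority(train_x, train_y):
    return max(set(train_y), key=train_y.count)

def predict_majority(model, x):
    return model

xs = list(range(20))
ys = [1] * 20
print(repeated_cross_validate(xs, ys, fit_majority, predict_majority, k=5, m=3))
```

Because the labels in this toy data set are all identical, the baseline makes no mistakes; with real data the averaged error rate is the quantity to compare across models.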

Slide 8: Presentations of Evaluation Results
Learning (happy) curves
  – Accuracy increases over X
  – Its opposite (or error) decreases over X
Box-plot
  – Whiskers (min, max)
  – Box: confidence interval
  – Graphical equivalent of t-test
Results are usually about time, space, trend, average case
[Figure: box-plot with min, max, and mean marked]

Slide 9: Statistical Tests
Null hypothesis and alternative hypothesis
Type I and Type II errors
Student's t test comparing two means
Paired t test comparing two means
Chi-Square test
  – Contingency table

Slide 10: Null Hypothesis
Null hypothesis ($H_0$)
  – No difference between the test statistic and the actual value of the population parameter
  – E.g., $H_0: \mu = \mu_0$
Alternative hypothesis ($H_1$)
  – It specifies the parameter value(s) to be accepted if $H_0$ is rejected.
  – E.g., $H_1: \mu \neq \mu_0$ (two-tailed test)
  – Or $H_1: \mu > \mu_0$ (one-tailed test)

Slide 11: Type I, II errors
Type I errors ($\alpha$)
  – Rejecting a null hypothesis when it is true (FP)
Type II errors ($\beta$)
  – Accepting a null hypothesis when it is false (FN)
  – Power = $1 - \beta$
Costs of different errors
  – A life-saving medicine appears to be effective; it is cheap and has no side effects ($H_0$: non-effective)
      Type I error: concluding it is effective when it is not; not costly
      Type II error: concluding it is non-effective when it is effective; very costly

Slide 12: Test using Student's t Distribution
Using the t distribution to test the difference between two population means is appropriate if
  – The population standard deviations are not known
  – The samples are small (n < 30)
  – The populations are assumed to be approximately normal
  – The two unknown standard deviations are assumed equal: $\sigma_1 = \sigma_2$
$H_0: (\mu_1 - \mu_2) = 0$, $H_1: (\mu_1 - \mu_2) \neq 0$
  – Check the difference of the estimated means, normalized by the standard error computed from the common (pooled) variance
Degrees of freedom and p level of significance
  – df = $n_1 + n_2 - 2$
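The slide does not spell out the test statistic. A common form under the equal-variance assumption is the pooled two-sample t statistic, sketched below. The function name `pooled_t` and the sample data are illustrative assumptions; the resulting t still has to be compared with a critical value from a t table (or a statistics library) at the chosen significance level.

```python
import math

def pooled_t(a, b):
    """Two-sample t statistic with pooled variance; returns (t, df)."""
    n1, n2 = len(a), len(b)
    m1, m2 = sum(a) / n1, sum(b) / n2
    ss1 = sum((x - m1) ** 2 for x in a)  # sum of squared deviations
    ss2 = sum((x - m2) ** 2 for x in b)
    df = n1 + n2 - 2
    pooled_var = (ss1 + ss2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / std_err, df

t, df = pooled_t([1, 2, 3], [2, 3, 4])
print(round(t, 4), df)  # -1.2247 4
```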

Slide 13: Paired t test
With paired observations, use the paired t test
Now $H_0: \mu_d = 0$ and $H_1: \mu_d \neq 0$
  – Check the estimated mean difference
The t statistics in the previous and current cases are calculated differently.
  – Both are two-tailed tests; p = 1% means 0.5% on each side
  – Excel can do that for you!
[Figure: t distribution centered at 0 with a rejection region of $\alpha/2$ in each tail]
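To show how the paired statistic differs from the two-sample one, here is a sketch of the standard paired t computation: it works on the per-pair differences and has n − 1 degrees of freedom. The function name `paired_t` and the sample values are assumptions for illustration, and the code assumes the differences are not all identical (otherwise the standard deviation is zero).

```python
import math

def paired_t(a, b):
    """Paired t statistic on the differences a_i - b_i; returns (t, df)."""
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    mean_d = sum(d) / n
    sd_d = math.sqrt(sum((x - mean_d) ** 2 for x in d) / (n - 1))
    return mean_d / (sd_d / math.sqrt(n)), n - 1

t, df = paired_t([5, 7, 9, 6], [4, 5, 8, 6])
print(round(t, 4), df)  # 2.4495 3
```

A typical use in data mining is comparing two learning algorithms on the same cross-validation folds, where each fold yields one paired observation.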

Slide 14: Chi-Square Test (the goodness-of-fit)
Testing a null hypothesis that the population distribution for a random variable follows a specified form.
The chi-square statistic is calculated as
  $\chi^2 = \sum_{i=1}^{2} \sum_{j=1}^{k} (A_{ij} - E_{ij})^2 / E_{ij}$
Degrees of freedom: df = k − m − 1
  – k = number of data categories
  – m = number of parameters estimated (0 for uniform, 1 for Poisson, 2 for normal)
  – Each cell's expected frequency should be at least 5
One-tail test
Contingency table:
        | C1    | C2    | Σ
  I-1   | A_11  | A_12  | R_1
  I-2   | A_21  | A_22  | R_2
  Σ     | C_1   | C_2   | N
[Figure: chi-square distribution with the rejection region in the upper tail]
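The statistic can be computed for both uses mentioned on the slide: goodness-of-fit against specified expected counts, and a contingency table. The sketch below assumes the usual rule that a table's expected counts are E_ij = R_i × C_j / N; the function names and sample counts are made up for illustration.

```python
def chi_square(observed, expected):
    """Goodness-of-fit statistic: sum of (A - E)^2 / E over categories."""
    return sum((a - e) ** 2 / e for a, e in zip(observed, expected))

def expected_counts(table):
    """Expected cell counts E_ij = R_i * C_j / N for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square_table(table):
    """Chi-square statistic summed over every cell of the table."""
    expected = expected_counts(table)
    return sum(
        (a - e) ** 2 / e
        for row, exp_row in zip(table, expected)
        for a, e in zip(row, exp_row)
    )

# Goodness of fit against a uniform distribution: k = 3, m = 0, df = 2.
print(chi_square([10, 20, 30], [20, 20, 20]))  # 10.0

# 2 x 2 contingency table.
print(chi_square_table([[10, 20], [20, 10]]))
```

The statistic is then compared with the upper-tail critical value of the chi-square distribution for the given degrees of freedom.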
