CogNova Technologies: Evaluating Induced Models, with Daniel L. Silver. Copyright (c), 2004 All Rights Reserved.

Presentation transcript:

CogNova Technologies 1 Evaluating Induced Models with Daniel L. Silver. Copyright (c), 2004 All Rights Reserved

CogNova Technologies 2 Agenda
• Interpretation and Evaluation Phase
• Model accuracy (fitness) and confidence
• Testing the difference between two models
• Testing the difference between two DM methods (e.g. IDT versus ANN)

CogNova Technologies 3 The KDD Process
[Figure: the KDD process. Stages: Data Consolidation, Selection and Preprocessing, Data Mining, Interpretation and Evaluation. Artifacts: Data Sources, Data Warehouse, Consolidated Data, Prepared Data, Patterns & Models (e.g. p(x)=0.02), Knowledge.]

CogNova Technologies 4 Inductive Modeling = Data Mining
Basic Framework for Inductive Learning
[Figure: the Environment supplies Training Examples (x, f(x)) and Testing Examples to the Inductive Learning System, which produces an Induced Model of Classifier. The Output Classification (x, h(x)) is compared with the target: h(x) ~= f(x)?]
Focus is on developing models that can accurately classify new examples.

CogNova Technologies 5 Model Accuracy and Confidence
• Preferably a separate verification set is used to judge fitness or accuracy
• Statistical confidence in the accuracy of a model can be expressed as an interval
[Figure: model h1 placed on a Mean Error or Error Rate axis]

CogNova Technologies 6 The Normal Curve and Confidence Intervals
• Consider a class of 30 persons
• True mean (average) mark of 75%
• How can we estimate this from the marks of only 10 sample persons?
• Let's do an example using Excel
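The Excel exercise on this slide is not reproduced in the transcript. As a rough stand-in, the sketch below estimates a class mean from 10 sampled marks and attaches a 95% confidence interval. The marks are hypothetical, not data from the slides, and the interval uses the Student-t critical value for 9 degrees of freedom (2.262). It ignores any finite-population correction for the class of 30.

```python
import math
import statistics

# Hypothetical marks (percent) for 10 sampled students; not data from the slides.
sample_marks = [72, 81, 68, 77, 90, 64, 75, 79, 70, 74]


def mean_confidence_interval(values, t_critical):
    """Return (mean, half_width) of a confidence interval for the mean."""
    n = len(values)
    mean = statistics.mean(values)
    # Standard error of the mean, based on the sample standard deviation (n - 1).
    std_err = statistics.stdev(values) / math.sqrt(n)
    return mean, t_critical * std_err


# 2.262 is the two-sided 95% Student-t critical value for n - 1 = 9 degrees of freedom.
mean, half_width = mean_confidence_interval(sample_marks, t_critical=2.262)
print(f"estimated mean = {mean:.1f}% +/- {half_width:.1f}%")
```

A wider interval signals that 10 marks say less about the class than the point estimate alone suggests.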

CogNova Technologies 7 Model Accuracy and Confidence
Approach #1: Large Sample
When the amount of available data is large...
[Figure: Available Examples are divided randomly (70% / 30%) among a Training Set, a Verify Set and a Test Set. The training data is used to develop one model, and the test error is then computed.]
Generalization = test/verify fit
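A minimal sketch of the random division in Approach #1, assuming a plain 70/30 split into training and test portions. The function name `holdout_split` and the integer "examples" are illustrative only. A verification set, if used, could be carved out of the training portion in the same way.

```python
import random


def holdout_split(examples, train_fraction=0.7, seed=0):
    """Shuffle a copy of the examples and split it into (train, test)."""
    shuffled = list(examples)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical data set of 100 examples, identified here only by index.
examples = list(range(100))
train_set, test_set = holdout_split(examples)
print(len(train_set), len(test_set))
```

The model is developed on `train_set` only, and the error measured on `test_set` serves as the generalization estimate.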

CogNova Technologies 8 Model Accuracy and Confidence
• Generalization statistic (fit, error or accuracy) is provided by the learning system
• Confidence interval must be computed:
  o Continuous target variable: compute the mean error over n examples and its confidence interval using Excel (evaluate_models.xls)
  o Nominal (binary) target variable: given an error rate of P from a sample of n examples, the 95% confidence interval is P ± 1.96 sqrt( P(1-P) / n ) = P ± 1.96 stdev
    - P = number incorrect / n
    - Strictly speaking this is for n >= 30
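The binary-target formula above can be sketched in a few lines. The function name and the example counts (25 errors in 100 test examples) are hypothetical, and the normal approximation is assumed to hold (n >= 30, as the slide notes).

```python
import math


def error_rate_interval(num_incorrect, n, z=1.96):
    """Return (P, half_width) using the normal approximation to the binomial."""
    p = num_incorrect / n                 # P = number incorrect / n
    std_dev = math.sqrt(p * (1 - p) / n)  # stdev of the error-rate estimate
    return p, z * std_dev                 # z = 1.96 gives the 95% interval


# Hypothetical test result: 25 of 100 test examples misclassified.
p, half_width = error_rate_interval(num_incorrect=25, n=100)
print(f"error rate = {p:.3f} +/- {half_width:.3f}")
```

Because n sits under the square root, the interval narrows only with the square root of the test set size: four times as many test examples are needed to halve it.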

CogNova Technologies 9 Testing the Difference Between Two Models
• Which of the following two hypotheses is the better? … h1 or h2?
[Figure: models labelled h1, h2 and h3 placed on a Fitness or Error Rate axis]

CogNova Technologies 10 Testing the Difference Between Two Models
• Assumption: if some measurable characteristic of the models is statistically different, then we will consider the models different
• We will focus on two characteristics, mean error and error rate (proportion incorrect), which can be computed from the test results

CogNova Technologies 11 Testing the Difference Between Two Models
• Continuous target variable
  o Use a Difference of Means Test
• Nominal (binary) target variable
  o Use a Difference of Proportions Test
• For 95% confidence in a difference, the p-value statistic must be <= 0.05 (see Excel spreadsheet example)
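The spreadsheet example is not part of the transcript. As one possible reading of the Difference of Proportions Test, the sketch below runs a pooled two-sample z-test on two error rates. It assumes the two models were scored on independent test sets, and the error counts are hypothetical.

```python
import math


def two_proportion_z_test(errors_1, n_1, errors_2, n_2):
    """Return (z, two-sided p-value) for the difference of two error rates."""
    p1 = errors_1 / n_1
    p2 = errors_2 / n_2
    # Pooled error rate under the null hypothesis that both rates are equal.
    pooled = (errors_1 + errors_2) / (n_1 + n_2)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    z = (p1 - p2) / std_err
    # Two-sided tail area of the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical results: h1 makes 30 errors in 200, h2 makes 50 errors in 200.
z, p_value = two_proportion_z_test(30, 200, 50, 200)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print("different at 95% confidence" if p_value <= 0.05 else "no significant difference")
```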

CogNova Technologies 12 Testing the Difference Between Two DM Methods
• Cross-validation must be performed
• Requires generating several models with different train, test and verify sets
• With WEKA, use the accuracy or error rate on the test sets

CogNova Technologies 13 Network Training
Approach #2: Cross-validation
Provides a sense of confidence in model...
[Figure: Available Examples are split (10% / 90%) among a Training Set, a Ver. Set and a Test Set; repeat 10 times. Used to develop 10 different models; accumulate test errors.]
Generalization determined by mean test fit and stddev
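The cross-validation loop can be sketched as follows, assuming 10 folds with 90% of the examples used for training and 10% for testing in each round. The majority-class "learner" and the labelled data are hypothetical placeholders for a real IDT or ANN and a real data set. No separate verification set is carved out here.

```python
import random
import statistics


def k_fold_splits(examples, k=10, seed=0):
    """Yield (train, test) pairs; each example lands in exactly one test fold."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, folds[i]


def majority_class_error(train, test):
    """Toy learner: predict the most common training label, return the test error rate."""
    labels = [label for _, label in train]
    majority = max(set(labels), key=labels.count)
    wrong = sum(1 for _, label in test if label != majority)
    return wrong / len(test)


# Hypothetical labelled examples (x, label); not data from the slides.
examples = [(x, 1 if x % 3 == 0 else 0) for x in range(100)]

# Develop 10 models, one per fold, and accumulate the test errors.
test_errors = [majority_class_error(train, test)
               for train, test in k_fold_splits(examples)]
mean_error = statistics.mean(test_errors)
stddev_error = statistics.stdev(test_errors)
print(f"mean test error = {mean_error:.3f}, stddev = {stddev_error:.3f}")
```

The list of per-fold errors is what a later significance test between two methods would consume.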

CogNova Technologies 14 Testing the Difference Between Two DM Methods
• A Difference of Means T-test can be used to determine a p-value statistic
• For 95% confidence in a difference, the p-value statistic must be <= 0.05 (see Excel spreadsheet example)
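The slides do not say which variant of the t-test the spreadsheet uses. The sketch below assumes an independent two-sample test with pooled variance, and obtains the two-sided p-value by numerically integrating the Student-t density so that only the standard library is needed. The per-fold scores are hypothetical. If both methods are run on identical folds, a paired test may be the more appropriate choice.

```python
import math
import statistics


def t_pdf(x, df):
    """Density of the Student-t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_value(t_stat, df, steps=2000):
    """Two-sided p-value via Simpson's rule on the t density (steps must be even)."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1 - 2 * area)


def difference_of_means_t_test(a, b):
    """Pooled-variance two-sample t-test; returns (t statistic, p-value)."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * statistics.variance(a)
                  + (n_b - 1) * statistics.variance(b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_stat = (statistics.mean(a) - statistics.mean(b)) / std_err
    return t_stat, two_sided_p_value(t_stat, df)


# Hypothetical test-set scores from 10 cross-validation runs per method.
method_a = [25, 28, 27, 26, 24, 29, 27, 26, 28, 26]
method_b = [31, 33, 30, 32, 34, 31, 32, 30, 33, 32]
t_stat, p_value = difference_of_means_t_test(method_a, method_b)
print(f"t = {t_stat:.2f}, p-value = {p_value:.2g}")
print("significantly different" if p_value <= 0.05 else "no significant difference")
```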

CogNova Technologies 15 Example: Using Census Data
• Problem: to identify males given census data
• Performance measure: Accuracy = Goodness of fit
• Model generation: IDT and ANN

CogNova Technologies 16 Example: Using Census Data
• Record results: goodness of fit stats on test set for 10 different models
  o Mean fitness: ANN = 26.6, IDT = 31.8
• Test difference between models: use a difference of means T-test (see evaluate_models.xls)
  o p-value =
  o Since p-value < 0.05, the two models are significantly different

CogNova Technologies 17 THE END