University of Haifa, Department of Statistics, Seminar for M.A., April 2015
Hastie, Tibshirani and Friedman, The Elements of Statistical Learning (2nd edition, 2009)
Presentation by: Ayellet Jehassi

1) Introduction
2) Bias, Variance and Model Complexity
3) AIC and BIC
4) Cross-Validation
5) Bootstrap Methods

The process of evaluating a model's performance is known as model assessment, whereas the process of selecting the proper level of flexibility for a model is known as model selection. Both are commonly approached with resampling, for example:
- Repeatedly drawing samples from a training set and refitting a model of interest on each sample, in order to obtain additional information about the fitted model.
- Fitting the same statistical method multiple times using different subsets of the training data.

We have many different potential predictors. Why not base the model on all of them?
Two sides of one coin: bias and variance.
- A model with more predictors can describe the phenomenon better, i.e. it has less bias.
- But when we estimate more parameters, the variance of the estimators grows: we "fit the noise", i.e. we overfit.
A clever model selection strategy should resolve the bias-variance trade-off.
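A minimal sketch of this trade-off (assuming NumPy; the sine-plus-noise data and the use of polynomial degree as a stand-in for model flexibility are illustrative choices, not from the slides). Training error keeps falling as the model grows, while the error on an independent test sample eventually rises again:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    # Illustrative data-generating process: smooth signal plus Gaussian noise.
    x = rng.uniform(-1, 1, n)
    y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=n)
    return x, y

x_train, y_train = make_data(30)
x_test, y_test = make_data(200)

for degree in [1, 3, 5, 9, 15]:
    coefs = np.polyfit(x_train, y_train, degree)   # least-squares polynomial fit
    train_mse = np.mean((np.polyval(coefs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coefs, x_test) - y_test) ** 2)
    print(f"degree {degree:2d}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")
```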

The best approach for both problems is to randomly divide the dataset into three parts: a training set, a validation set, and a test set.
- The training set is used to fit the models.
- The validation set is used to estimate prediction error for model selection.
- The test set is used to assess the generalization error of the final chosen model.
Typical proportions are 50%, 25% and 25%, respectively.
The methods of this chapter approximate the validation step either analytically or by efficient sample re-use.
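A minimal sketch of such a three-way split (assuming NumPy; the synthetic X and y and the seed are illustrative, only the 50/25/25 proportions come from the slide):

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative data; in practice X, y are the observed covariates and response.
X = rng.normal(size=(200, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + rng.normal(size=200)

n = len(X)
perm = rng.permutation(n)                     # random shuffle of the row indices

n_train = int(0.50 * n)                       # 50% for training
n_valid = int(0.25 * n)                       # 25% for validation

train_idx = perm[:n_train]
valid_idx = perm[n_train:n_train + n_valid]
test_idx  = perm[n_train + n_valid:]          # remaining ~25% for the test set

X_train, y_train = X[train_idx], y[train_idx]
X_valid, y_valid = X[valid_idx], y[valid_idx]
X_test,  y_test  = X[test_idx],  y[test_idx]
```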

We randomly divide the available set of samples into two parts: a training set and a validation (or hold-out) set.
- The model is fit on the training set, and the fitted model is used to predict the responses for the observations in the validation set.
- The resulting validation-set error provides an estimate of the test error.
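In code, the hold-out idea might look like the following sketch (NumPy only; the least-squares model, the 50/50 split and the synthetic data are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative data: linear signal plus noise.
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(size=100)

# Random split into a training half and a hold-out (validation) half.
perm = rng.permutation(len(X))
train, valid = perm[:50], perm[50:]

# Fit ordinary least squares on the training part only.
Xt = np.column_stack([np.ones(len(train)), X[train]])
beta, *_ = np.linalg.lstsq(Xt, y[train], rcond=None)

# Validation-set MSE: an estimate of the test error of this fit.
Xv = np.column_stack([np.ones(len(valid)), X[valid]])
val_mse = np.mean((Xv @ beta - y[valid]) ** 2)
print(f"validation-set MSE: {val_mse:.3f}")
```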

Cross-validation is a widely used approach for estimating test error.
- The estimates can be used to select the best model, and to give an idea of the test error of the final chosen model.
- The idea is to randomly divide the data into K equal-sized parts. We leave out part k, fit the model to the other K − 1 parts combined, and then obtain predictions for the left-out k-th part.
- This is done in turn for each part k = 1, 2, ..., K, and the results are then combined.
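A bare-bones version of that loop (a sketch assuming NumPy; K = 5, the least-squares model and the synthetic data are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(size=100)

K = 5
folds = np.array_split(rng.permutation(len(X)), K)    # K roughly equal-sized parts

fold_mse = []
for k in range(K):
    test_idx = folds[k]                                # leave out part k
    train_idx = np.concatenate([folds[j] for j in range(K) if j != k])

    # Fit on the other K-1 parts combined (least squares with intercept).
    Xt = np.column_stack([np.ones(len(train_idx)), X[train_idx]])
    beta, *_ = np.linalg.lstsq(Xt, y[train_idx], rcond=None)

    # Predict on the left-out part and record its error.
    Xv = np.column_stack([np.ones(len(test_idx)), X[test_idx]])
    fold_mse.append(np.mean((Xv @ beta - y[test_idx]) ** 2))

cv_error = np.mean(fold_mse)                           # combined CV estimate of test error
print(f"{K}-fold CV estimate of test MSE: {cv_error:.3f}")
```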

Consider a simple classifier for microarrays:
1) Starting with 5,000 genes, find the top 200 genes having the largest correlation with the class label.
2) Carry out nearest-centroid classification using those top 200 genes.
How do we cross-validate this classifier, e.g. to select its tuning parameter?
Way 1: apply cross-validation to step 2 only.
Way 2: apply cross-validation to steps 1 and 2.
Which is right? Way 2: we must cross-validate the whole procedure.
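A sketch of the two ways (NumPy only; 50 samples, 1,000 features, the top 20 screened features and a plain nearest-centroid rule are illustrative stand-ins for the 5,000-gene setting). The labels here are generated independently of the features, so the honest error rate is 50%; screening on the full data before cross-validating (Way 1) reports a misleadingly low error, while screening inside each fold (Way 2) does not:

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, top = 50, 1000, 20
X = rng.normal(size=(n, p))
y = rng.integers(0, 2, size=n)          # labels independent of X: no real signal

def select_top(X, y, k):
    # Screen features by absolute correlation with the class label.
    corr = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    return np.argsort(corr)[-k:]

def nearest_centroid_error(X_tr, y_tr, X_te, y_te):
    # Classify each test point by the nearer of the two class centroids.
    c0, c1 = X_tr[y_tr == 0].mean(axis=0), X_tr[y_tr == 1].mean(axis=0)
    pred = (np.linalg.norm(X_te - c1, axis=1) < np.linalg.norm(X_te - c0, axis=1)).astype(int)
    return np.mean(pred != y_te)

K = 5
folds = np.array_split(rng.permutation(n), K)

# Way 1 (wrong): screen once on ALL the data, then cross-validate only the classifier.
sel_all = select_top(X, y, top)
err1 = []
for k in range(K):
    te = folds[k]
    tr = np.concatenate([folds[j] for j in range(K) if j != k])
    err1.append(nearest_centroid_error(X[np.ix_(tr, sel_all)], y[tr],
                                       X[np.ix_(te, sel_all)], y[te]))

# Way 2 (right): redo the screening inside every fold, using training samples only.
err2 = []
for k in range(K):
    te = folds[k]
    tr = np.concatenate([folds[j] for j in range(K) if j != k])
    sel = select_top(X[tr], y[tr], top)
    err2.append(nearest_centroid_error(X[np.ix_(tr, sel)], y[tr],
                                       X[np.ix_(te, sel)], y[te]))

print(f"Way 1 (screen outside CV): error ≈ {np.mean(err1):.2f}  (optimistically low)")
print(f"Way 2 (screen inside CV):  error ≈ {np.mean(err2):.2f}  (honest, near 0.5)")
```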

The bootstrap is a flexible and powerful statistical tool that can be used to quantify the uncertainty associated with a given estimator or statistical learning method.
For example, it can provide an estimate of the standard error of a coefficient, or a confidence interval for that coefficient.
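A minimal sketch of that use of the bootstrap (assuming NumPy; the simple linear regression, B = 1,000 resamples and the percentile interval are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(4)

# Illustrative data: y depends linearly on a single covariate x.
n = 100
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

def slope(x, y):
    # Ordinary least-squares slope of y on x.
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

B = 1000
boot_slopes = np.empty(B)
for b in range(B):
    idx = rng.integers(0, n, size=n)              # draw n observations with replacement
    boot_slopes[b] = slope(x[idx], y[idx])        # refit on the bootstrap data set

se = boot_slopes.std(ddof=1)                      # bootstrap estimate of the standard error
ci = np.percentile(boot_slopes, [2.5, 97.5])      # simple percentile confidence interval
print(f"bootstrap SE of slope: {se:.3f}, 95% percentile CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```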

The bootstrap approach allows us to use a computer to mimic the process of obtaining new data sets, so that we can estimate the variability of our estimate without generating additional samples.
Rather than repeatedly obtaining independent data sets from the population, we instead obtain distinct data sets by repeatedly sampling observations from the original data set with replacement.

Each of these "bootstrap data sets" is created by sampling with replacement and is the same size as the original data set.
As a result, some observations may appear more than once in a given bootstrap data set and some not at all.
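A small illustration (assuming NumPy; n = 20 is arbitrary): draw n indices with replacement and count how often each original observation appears.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 20
boot_idx = rng.integers(0, n, size=n)            # one bootstrap sample of row indices

counts = np.bincount(boot_idx, minlength=n)      # how many times each observation appears
print("appearance counts:", counts)
print("observations left out entirely:", np.sum(counts == 0))
print("observations appearing more than once:", np.sum(counts > 1))
```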

To estimate prediction error using the bootstrap, we could think about using each bootstrap data set as our training sample and the original sample as our validation sample.
But each bootstrap sample has significant overlap with the original data: about two-thirds of the original data points appear in each bootstrap sample.
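The "about two-thirds" figure comes from a short calculation: a given observation is missed in one draw with probability 1 − 1/n, so it is missed in all n draws (made with replacement) with probability (1 − 1/n)^n, and therefore

\[
\Pr\{\text{observation } i \in \text{bootstrap sample}\} \;=\; 1 - \Bigl(1 - \tfrac{1}{n}\Bigr)^{n} \;\longrightarrow\; 1 - e^{-1} \approx 0.632 \quad (n \to \infty).
\]

So, on average, a bootstrap sample shares about 63% of the distinct original observations with the "validation" data, which is why this naive bootstrap estimate of prediction error is optimistic.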
