Statistical Learning
Dong Liu, Dept. EEIS, USTC


Chapter 5. Non-Parametric Supervised Learning
Parzen window
k-nearest-neighbor (k-NN)
Sparse coding
2018/11/17 Chap 5. Non-Parametric Supervised Learning

Non-parametric learning
Most statistical learning methods assume a model. For example, linear regression assumes a linear function y = w'x + b, and linear classification assumes a linear decision boundary. Learning is then converted into the problem of solving for, or estimating, the model parameters. In this chapter, we consider learning without an explicit model. Non-parametric learning is sometimes equated with instance-based (memory-based) learning.

Parzen window
Consider the problem of probability density estimation: we want to estimate an unknown density p(x) from a given set of samples x_1, ..., x_N. Parametric learning assumes a functional form for p(x) and estimates its parameters; non-parametric learning does not presume any form. The Parzen window, also known as kernel density estimation, uses

p_hat(x) = (1/N) sum_{i=1}^{N} K(x - x_i),

where the kernel function K must be non-negative and integrate to 1, e.g. the Gaussian kernel K(u) = (1/sqrt(2*pi)) exp(-u^2/2).

[Figure] Comparison between histogram and Parzen-window density estimates.

Parzen window: Window size
Introduce a hyper-parameter h (the bandwidth) to control the window size. The estimate becomes

p_hat(x) = (1/(N*h)) sum_{i=1}^{N} K((x - x_i)/h).

[Figure] Gray: true density; red: h = 0.05 (undersmoothed); black: h = 0.337; green: h = 2 (oversmoothed).
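The estimator above can be sketched in a few lines of NumPy. This is a minimal 1-D illustration assuming a Gaussian kernel and the same three bandwidths as on the slide; the data, a standard-normal sample, are made up for the demo:

```python
import numpy as np

def gaussian_kernel(u):
    """A valid kernel: non-negative and integrates to 1."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

def parzen_estimate(x, samples, h):
    """Parzen-window (kernel) density estimate at point(s) x with bandwidth h."""
    x = np.atleast_1d(x)
    # Average one scaled kernel centred on each stored sample
    return np.array([np.mean(gaussian_kernel((xi - samples) / h)) / h
                     for xi in x])

rng = np.random.default_rng(0)
samples = rng.normal(0.0, 1.0, 1000)     # draws from a standard normal

# The estimate at the mode, for the three bandwidths shown on the slide
for h in (0.05, 0.337, 2.0):
    print(h, round(parzen_estimate(0.0, samples, h)[0], 3))
```

With h = 2 the peak is flattened (oversmoothing), while h = 0.05 produces a noisy estimate; a moderate h tracks the true density N(0, 1), whose value at 0 is about 0.399.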

Chapter 5. Non-Parametric Supervised Learning
Parzen window
k-nearest-neighbor (k-NN)
Sparse coding

Nearest-neighbor
"If it walks like a duck and quacks like a duck, then it is probably a duck." Compare the new sample against the samples stored in memory and find the "closest" one.

Illustration of 1-NN (reproduced from ESL)
1-NN is sensitive to noise and outliers in the dataset. How can this be improved?

k-nearest-neighbor
Compare against the stored samples, find the k "closest" ones, and make the decision based on all of them:
k-NN classification: take a majority vote over the k neighbors' labels.
k-NN regression: average the k neighbors' target values.
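Both rules can be sketched in plain NumPy, assuming Euclidean distance; the toy data below are invented for illustration:

```python
import numpy as np

def knn_predict(X_train, y_train, x, k, task="classify"):
    """k-NN prediction: majority vote for classification, mean for regression."""
    dists = np.linalg.norm(X_train - x, axis=1)   # Euclidean distances
    idx = np.argsort(dists)[:k]                   # indices of the k nearest samples
    if task == "classify":
        labels, counts = np.unique(y_train[idx], return_counts=True)
        return labels[np.argmax(counts)]          # majority vote
    return float(y_train[idx].mean())             # average of the neighbors

X = np.array([[0.0, 0.0], [0.1, 0.1], [1.0, 1.0], [0.9, 1.1]])
y_cls = np.array([0, 0, 1, 1])
print(knn_predict(X, y_cls, np.array([0.05, 0.05]), k=3))  # -> 0
```

With k = 3, the query (0.05, 0.05) has two class-0 neighbors and one class-1 neighbor, so the vote yields class 0.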

Illustration of k-NN (reproduced from ESL)
Better than 1-NN at suppressing the effect of noise.

Variants of k-NN
How to define the neighborhood?
  Fixed k (then how to set it?)
  Adaptive k (e.g. distance thresholding)
How to define the distance?
  Euclidean distance
  Lp distance
  Mahalanobis distance
  Cosine similarity
  Pearson correlation coefficient
How to weight the neighbors?
  Equal weight
  Less distance, more weight
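As one example of the last variant, "less distance, more weight" for k-NN regression might be sketched with inverse-distance weighting. The eps guard and the toy call are assumptions of this demo, not from the slides:

```python
import numpy as np

def weighted_knn_regress(X_train, y_train, x, k, eps=1e-8):
    """Distance-weighted k-NN regression: nearer neighbors get larger weight."""
    dists = np.linalg.norm(X_train - x, axis=1)
    idx = np.argsort(dists)[:k]
    w = 1.0 / (dists[idx] + eps)        # inverse-distance weights
    return float(np.sum(w * y_train[idx]) / np.sum(w))

X = np.array([[0.0], [1.0], [2.0]])
y = np.array([0.0, 1.0, 2.0])
# The query at 0.1 is pulled strongly toward its nearest sample's target
print(weighted_knn_regress(X, y, np.array([0.1]), k=3))
```

Even though all three samples are used (k = 3), the prediction stays close to 0 because the nearest neighbor dominates the weights; equal weighting would return the plain mean 1.0.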

How to set k: Bias-variance tradeoff
Consider k-NN regression at a fixed query point x0. Assume the data are generated by y = f(x) + e, with E[e] = 0 and Var(e) = sigma^2. Then the expected prediction error decomposes as

E[(y0 - f_hat_k(x0))^2] = sigma^2 + [f(x0) - (1/k) sum_{l=1}^{k} f(x_(l))]^2 + sigma^2 / k,

where x_(l) is the l-th nearest neighbor of x0. The variance term sigma^2 / k shrinks as k grows, while the squared-bias term grows, because increasingly distant neighbors are averaged in.
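The trade-off can be checked numerically. The sketch below, with an assumed true function f = sin and noise level sigma = 0.3 (both illustrative choices, not from the slides), Monte-Carlo-estimates the error of k-NN regression at one query point for several k; 1-NN suffers from the variance term, and very large k from the bias term:

```python
import numpy as np

rng = np.random.default_rng(0)
f = np.sin          # illustrative true regression function (an assumption)
sigma = 0.3         # noise standard deviation
x0 = 1.0            # fixed query point

def knn_regress(x_train, y_train, x, k):
    """1-D k-NN regression: average the targets of the k nearest inputs."""
    idx = np.argsort(np.abs(x_train - x))[:k]
    return y_train[idx].mean()

def mse_at_x0(k, trials=2000, n=50):
    """Monte-Carlo estimate of the expected squared error at x0."""
    errs = []
    for _ in range(trials):
        x_train = rng.uniform(0.0, 2.0, n)
        y_train = f(x_train) + rng.normal(0.0, sigma, n)
        y0 = f(x0) + rng.normal(0.0, sigma)     # fresh noisy test target
        errs.append((y0 - knn_regress(x_train, y_train, x0, k)) ** 2)
    return float(np.mean(errs))

# Small k: high variance; large k: high bias. A moderate k does best here.
for k in (1, 5, 25):
    print(k, round(mse_at_x0(k), 3))
```

For 1-NN the variance term equals sigma^2, so the error is roughly 2*sigma^2; a moderate k cuts the variance to sigma^2/k while keeping the bias small.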

Example: Rating prediction for making recommendations 1/5
Items (products, news, movies, music, ...) reach users through search, recommendations, and advertisements.

[Figures] Example: Rating prediction for making recommendations, parts 2/5 to 5/5.
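Since the worked slides are figures, here is a hedged sketch of what such a rating predictor could look like: user-based k-NN with cosine similarity on a made-up rating matrix. All data and parameter choices are illustrative assumptions, not the slides' example:

```python
import numpy as np

# Hypothetical user-item rating matrix (rows: users, columns: items, 0 = unrated)
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
    [0, 1, 5, 4],
], dtype=float)

def cosine(R, u, v):
    """Cosine similarity over the items both users have rated."""
    mask = (R[u] > 0) & (R[v] > 0)
    if not mask.any():
        return 0.0
    a, b = R[u, mask], R[v, mask]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def predict_rating(R, user, item, k=2):
    """User-based k-NN: similarity-weighted average of the k most similar
    users who have rated the item."""
    raters = [u for u in range(R.shape[0]) if u != user and R[u, item] > 0]
    sims = sorted(((cosine(R, user, u), u) for u in raters), reverse=True)[:k]
    den = sum(s for s, _ in sims)
    if den <= 0:
        return 0.0                      # no usable neighbors
    return sum(s * R[u, item] for s, u in sims) / den

print(predict_rating(R, user=1, item=1))
```

The missing rating is filled in from the neighbors' ratings, weighted by how similar their rating histories are to the target user's.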

Chapter 5. Non-Parametric Supervised Learning
Parzen window
k-nearest-neighbor (k-NN)
Sparse coding

Sparse coding
Instead of finding the "closest" samples, we want to find "correlated" samples.

Sparse coding
Sparse coding aims to solve

min_a ||a||_0  s.t.  x = A a,

where the columns of A are the stored samples and the code a is required to be sparse. The problem can be relaxed to deal with noise or corruption in the data, for example:
To deal with (Gaussian) noise: min_a ||x - A a||_2^2 + lambda ||a||_1
To deal with (sparse) corruption: min_{a,e} ||a||_1 + ||e||_1  s.t.  x = A a + e
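The Gaussian-noise relaxation above is the lasso problem, and one classical solver for it is iterative soft-thresholding (ISTA). The sketch below runs it on a synthetic 3-sparse problem; the dictionary size, sparsity pattern, and lambda are illustrative assumptions:

```python
import numpy as np

def ista(A, x, lam, steps=1000):
    """Solve the l1-relaxed problem min_a 0.5*||x - A a||^2 + lam*||a||_1
    by iterative soft-thresholding (ISTA)."""
    L = np.linalg.norm(A, 2) ** 2            # Lipschitz constant of the gradient
    a = np.zeros(A.shape[1])
    for _ in range(steps):
        z = a - A.T @ (A @ a - x) / L        # gradient step on the quadratic term
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return a

rng = np.random.default_rng(0)
A = rng.normal(size=(20, 50))
A /= np.linalg.norm(A, axis=0)               # unit-norm dictionary columns
a_true = np.zeros(50)
a_true[[3, 17, 41]] = [1.5, -2.0, 1.0]       # a 3-sparse code
x = A @ a_true                               # noiseless observation
a_hat = ista(A, x, lam=0.01)
print(sorted(np.argsort(np.abs(a_hat))[-3:].tolist()))  # largest-magnitude indices
```

With a small lambda and a noiseless observation, the l1 solution recovers the support of the true sparse code even though the system x = A a is underdetermined.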

Example: Robust face recognition 1/2
Wright, J., Yang, A. Y., Ganesh, A., Sastry, S. S., & Ma, Y. (2009). Robust face recognition via sparse representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(2), 210-227.

Example: Robust face recognition 2/2
The method is also able to identify valid vs. invalid input.
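A hedged sketch of the sparse-representation classification rule from the cited paper: code the query over all training samples with an l1 penalty, then assign the class whose coefficients alone give the smallest reconstruction residual. The toy "faces" here are synthetic vectors, an assumption of this demo:

```python
import numpy as np

def src_classify(A, labels, x, lam=0.01, steps=1000):
    """Sparse-representation classification: l1-code x over all training
    samples, then pick the class with the smallest class-wise residual."""
    L = np.linalg.norm(A, 2) ** 2                  # step size from Lipschitz constant
    a = np.zeros(A.shape[1])
    for _ in range(steps):                         # ISTA for the coding step
        z = a - A.T @ (A @ a - x) / L
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    labels = np.asarray(labels)
    best, best_r = None, np.inf
    for c in np.unique(labels):
        a_c = np.where(labels == c, a, 0.0)        # keep only class-c coefficients
        r = float(np.linalg.norm(x - A @ a_c))     # class-wise reconstruction residual
        if r < best_r:
            best, best_r = c, r
    return best

# Toy data: two classes of near-duplicate synthetic vectors
rng = np.random.default_rng(1)
b0, b1 = rng.normal(size=10), rng.normal(size=10)
cols, labs = [], []
for b, c in ((b0, 0), (b1, 1)):
    for _ in range(5):
        v = b + 0.1 * rng.normal(size=10)
        cols.append(v / np.linalg.norm(v))
        labs.append(c)
A = np.array(cols).T                               # dictionary of training samples
x = b0 + 0.1 * rng.normal(size=10)
x /= np.linalg.norm(x)
print(src_classify(A, labs, x))
```

A query near the class-0 samples is coded almost entirely with class-0 atoms, so the class-0 residual is much smaller than the class-1 residual.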

Chapter summary
Dictionary: instance-based learning, memory-based learning
Toolbox: Parzen window, k-NN, sparse coding