Pattern Recognition and Machine Learning

Slides:



Advertisements
Similar presentations
Pattern Recognition and Machine Learning
Advertisements

Pattern Recognition and Machine Learning
: INTRODUCTION TO Machine Learning Parametric Methods.
Classification. Introduction A discriminant is a function that separates the examples of different classes. For example – IF (income > Q1 and saving >Q2)
Notes Sample vs distribution “m” vs “µ” and “s” vs “σ” Bias/Variance Bias: Measures how much the learnt model is wrong disregarding noise Variance: Measures.
INTRODUCTION TO MACHINE LEARNING Bayesian Estimation.
Pattern Recognition and Machine Learning
Pattern Recognition and Machine Learning
Chapter 2: Bayesian Decision Theory (Part 2) Minimum-Error-Rate Classification Classifiers, Discriminant Functions and Decision Surfaces The Normal Density.
Pattern Classification, Chapter 2 (Part 2) 0 Pattern Classification All materials in these slides were taken from Pattern Classification (2nd ed) by R.
Pattern Classification. Chapter 2 (Part 1): Bayesian Decision Theory (Sections ) Introduction Bayesian Decision Theory–Continuous Features.
Pattern Classification, Chapter 2 (Part 2) 0 Pattern Classification All materials in these slides were taken from Pattern Classification (2nd ed) by R.
Chapter 2: Bayesian Decision Theory (Part 2) Minimum-Error-Rate Classification Classifiers, Discriminant Functions and Decision Surfaces The Normal Density.
Pattern Classification Chapter 2 (Part 2)0 Pattern Classification All materials in these slides were taken from Pattern Classification (2nd ed) by R. O.
Chapter 4: Linear Models for Classification
What is Statistical Modeling
Laboratory for Social & Neural Systems Research (SNS) PATTERN RECOGNITION AND MACHINE LEARNING Institute of Empirical Research in Economics (IEW)
ETHEM ALPAYDIN © The MIT Press, Lecture Slides for.
INTRODUCTION TO Machine Learning ETHEM ALPAYDIN © The MIT Press, Lecture Slides for.
Classification and risk prediction
0 Pattern Classification All materials in these slides were taken from Pattern Classification (2nd ed) by R. O. Duda, P. E. Hart and D. G. Stork, John.
Machine Learning CMPT 726 Simon Fraser University CHAPTER 1: INTRODUCTION.
Machine Learning CMPT 726 Simon Fraser University
Machine Learning CUNY Graduate Center Lecture 3: Linear Regression.
CHAPTER 4: Parametric Methods. Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 2 Parametric Estimation X = {
CHAPTER 4: Parametric Methods. Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 2 Parametric Estimation Given.
PATTERN RECOGNITION AND MACHINE LEARNING
ECSE 6610 Pattern Recognition Professor Qiang Ji Spring, 2011.
Principles of Pattern Recognition
Speech Recognition Pattern Classification. 22 September 2015Veton Këpuska2 Pattern Classification  Introduction  Parametric classifiers  Semi-parametric.
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 3: LINEAR MODELS FOR REGRESSION.
Overview of Supervised Learning Overview of Supervised Learning2 Outline Linear Regression and Nearest Neighbors method Statistical Decision.
Empirical Research Methods in Computer Science Lecture 7 November 30, 2005 Noah Smith.
INTRODUCTION TO Machine Learning 3rd Edition
Bias and Variance of the Estimator PRML 3.2 Ethem Chp. 4.
Linear Models for Classification
Lecture 2: Statistical learning primer for biologists
ETHEM ALPAYDIN © The MIT Press, Lecture Slides for.
INTRODUCTION TO MACHINE LEARNING 3RD EDITION ETHEM ALPAYDIN © The MIT Press, Lecture.
Over-fitting and Regularization Chapter 4 textbook Lectures 11 and 12 on amlbook.com.
Bias and Variance of the Estimator PRML 3.2 Ethem Chp. 4.
Basic Technical Concepts in Machine Learning Introduction Supervised learning Problems in supervised learning Bayesian decision theory.
Machine Learning CUNY Graduate Center Lecture 6: Linear Regression II.
Ch 1. Introduction (Latter) Pattern Recognition and Machine Learning, C. M. Bishop, Summarized by J.W. Ha Biointelligence Laboratory, Seoul National.
Ch 1. Introduction Pattern Recognition and Machine Learning, C. M. Bishop, Summarized by J.W. Ha Biointelligence Laboratory, Seoul National University.
Ch 1. Introduction Pattern Recognition and Machine Learning, C. M. Bishop, Updated by J.-H. Eom (2 nd round revision) Summarized by K.-I.
ETHEM ALPAYDIN © The MIT Press, Lecture Slides for.
CS Statistical Machine learning Lecture 7 Yuan (Alan) Qi Purdue CS Sept Acknowledgement: Sargur Srihari’s slides.
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 1: INTRODUCTION.
Lecture 2. Bayesian Decision Theory
Basic Technical Concepts in Machine Learning
Stat 223 Introduction to the Theory of Statistics
Probability Theory and Parameter Estimation I
ICS 280 Learning in Graphical Models
CS 2750: Machine Learning Probability Review Density Estimation
CH 5: Multivariate Methods
CS668: Pattern Recognition Ch 1: Introduction
Special Topics In Scientific Computing
Bias and Variance of the Estimator
Pattern Recognition PhD Course.
Lecture 26: Faces and probabilities
Pattern Classification All materials in these slides were taken from Pattern Classification (2nd ed) by R. O. Duda, P. E. Hart and D. G. Stork, John.
INTRODUCTION TO Machine Learning 3rd Edition
Pattern Classification All materials in these slides were taken from Pattern Classification (2nd ed) by R. O. Duda, P. E. Hart and D. G. Stork, John.
Loss.
Stat 223 Introduction to the Theory of Statistics
Mathematical Foundations of BME
Parametric Methods Berlin Chen, 2005 References:
Pattern Classification All materials in these slides were taken from Pattern Classification (2nd ed) by R. O. Duda, P. E. Hart and D. G. Stork, John.
Hairong Qi, Gonzalez Family Professor
Presentation transcript:

Pattern Recognition and Machine Learning Chapter 1: Introduction

Example Handwritten Digit Recognition

Polynomial Curve Fitting

Sum-of-Squares Error Function

0th Order Polynomial

1st Order Polynomial

3rd Order Polynomial

9th Order Polynomial

Over-fitting Root-Mean-Square (RMS) Error:

Polynomial Coefficients

Data Set Size: 9th Order Polynomial

Data Set Size: 9th Order Polynomial

Regularization Penalize large coefficient values

Regularization:

Regularization:

Regularization: vs.

Polynomial Coefficients

Probability Theory Apples and Oranges

Probability Theory Marginal Probability Conditional Probability Joint Probability

Probability Theory Sum Rule Product Rule

The Rules of Probability Sum Rule Product Rule

Bayes’ Theorem posterior  likelihood × prior

Probability Densities

Expectations Conditional Expectation (discrete) Approximate Expectation (discrete and continuous)

Variances and Covariances

The Gaussian Distribution

Gaussian Mean and Variance

The Multivariate Gaussian

Gaussian Parameter Estimation Likelihood function

Maximum (Log) Likelihood

Properties of and

Curve Fitting Re-visited

Maximum Likelihood Determine by minimizing sum-of-squares error, .

Predictive Distribution

MAP: A Step towards Bayes Determine by minimizing regularized sum-of-squares error, .

Bayesian Curve Fitting

Bayesian Predictive Distribution

Model Selection Cross-Validation

Curse of Dimensionality

Curse of Dimensionality Polynomial curve fitting, M = 3 Gaussian Densities in higher dimensions

Decision Theory Inference step Determine either or . Decision step For given x, determine optimal t.

Minimum Misclassification Rate

Minimum Expected Loss Example: classify medical images as ‘cancer’ or ‘normal’ Decision Truth

Minimum Expected Loss Regions are chosen to minimize

Reject Option

Why Separate Inference and Decision? Minimizing risk (loss matrix may change over time) Reject option Unbalanced class priors Combining models

Decision Theory for Regression Inference step Determine . Decision step For given x, make optimal prediction, y(x), for t. Loss function:

The Squared Loss Function

Generative vs Discriminative Generative approach: Model Use Bayes’ theorem Discriminative approach: Model directly