Lecture Slides for INTRODUCTION TO MACHINE LEARNING, 3RD EDITION. ETHEM ALPAYDIN © The MIT Press, 2014.

CHAPTER 3: BAYESIAN DECISION THEORY

Probability and Inference

The result of tossing a coin is in {Heads, Tails}. Random variable $X \in \{1, 0\}$, Bernoulli: $P\{X = 1\} = p_o^X (1 - p_o)^{1 - X}$.

Sample: $\mathcal{X} = \{x^t\}_{t=1}^{N}$. Estimation: $\hat{p}_o = \#\{\text{Heads}\} / \#\{\text{Tosses}\} = \sum_t x^t / N$.

Prediction of next toss: Heads if $\hat{p}_o > 1/2$, Tails otherwise.
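A minimal sketch of this estimator in Python; the sample values are made up for illustration:

```python
# Maximum-likelihood estimate of p_o from a Bernoulli sample and the
# resulting prediction rule. Sample values are illustrative.
sample = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]   # x^t in {1 (Heads), 0 (Tails)}

p_hat = sum(sample) / len(sample)          # p_o estimate = sum_t x^t / N
prediction = "Heads" if p_hat > 0.5 else "Tails"
print(f"p_hat = {p_hat:.2f}, predict next toss: {prediction}")
```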

Classification

Credit scoring: inputs are income and savings; output is low-risk vs. high-risk.

Input: $\mathbf{x} = [x_1, x_2]^T$, output: $C \in \{0, 1\}$.

Prediction: choose $C = 1$ if $P(C = 1 \mid x_1, x_2) > 0.5$, and $C = 0$ otherwise.
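As a sketch of how this rule might look in code: a threshold on the posterior. The function `posterior_high_risk` below is a made-up placeholder, since the slide does not say how the posterior is estimated from data:

```python
# Hedged sketch: classify as high-risk (C = 1) when P(C=1 | x1, x2) > 0.5.
def posterior_high_risk(income: float, savings: float) -> float:
    # Hypothetical posterior that decreases with income and savings.
    return 1.0 / (1.0 + 0.001 * (income + savings))

def predict(income: float, savings: float) -> int:
    return 1 if posterior_high_risk(income, savings) > 0.5 else 0

print(predict(income=3000.0, savings=7000.0))  # -> 0 (low-risk)
```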

Bayes' Rule

$$P(C \mid x) = \frac{p(x \mid C)\, P(C)}{p(x)}$$

posterior = likelihood × prior / evidence, where for two classes the evidence is $p(x) = p(x \mid C = 1)\, P(C = 1) + p(x \mid C = 0)\, P(C = 0)$ and $P(C = 0 \mid x) + P(C = 1 \mid x) = 1$.

Bayes' Rule: K > 2 Classes

$$P(C_i \mid x) = \frac{p(x \mid C_i)\, P(C_i)}{p(x)} = \frac{p(x \mid C_i)\, P(C_i)}{\sum_{k=1}^{K} p(x \mid C_k)\, P(C_k)}$$

with $P(C_i) \ge 0$ and $\sum_{i=1}^{K} P(C_i) = 1$. Choose $C_i$ if $P(C_i \mid x) = \max_k P(C_k \mid x)$.
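A minimal sketch of this computation, with illustrative priors and likelihood values at a fixed x:

```python
priors = [0.5, 0.3, 0.2]        # P(C_k), sum to 1
likelihoods = [0.1, 0.4, 0.2]   # p(x | C_k) at some fixed x

evidence = sum(p * l for p, l in zip(priors, likelihoods))  # p(x)
posteriors = [p * l / evidence for p, l in zip(priors, likelihoods)]

best = max(range(len(posteriors)), key=lambda k: posteriors[k])
print(posteriors, "-> choose C", best + 1)
```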

Losses and Risks

Actions: $\alpha_i$. Loss of $\alpha_i$ when the state is $C_k$: $\lambda_{ik}$.

Expected risk (Duda and Hart, 1973):

$$R(\alpha_i \mid x) = \sum_{k=1}^{K} \lambda_{ik}\, P(C_k \mid x)$$

Choose $\alpha_i$ if $R(\alpha_i \mid x) = \min_k R(\alpha_k \mid x)$.
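A sketch of the expected-risk computation with a made-up loss matrix for two classes; note how the unequal losses make the less probable class the rational choice:

```python
posteriors = [0.7, 0.3]      # P(C_1 | x), P(C_2 | x)
loss = [[0.0, 10.0],         # lambda_1k: losses for choosing C_1
        [1.0,  0.0]]         # lambda_2k: losses for choosing C_2

# R(alpha_i | x) = sum_k lambda_ik * P(C_k | x)
risks = [sum(l * p for l, p in zip(row, posteriors)) for row in loss]
print(risks)  # [3.0, 0.7] -> choose C_2, the action with minimum risk
```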

Losses and Risks: 0/1 Loss

$$\lambda_{ik} = \begin{cases} 0 & \text{if } i = k \\ 1 & \text{if } i \ne k \end{cases}$$

$$R(\alpha_i \mid x) = \sum_{k \ne i} P(C_k \mid x) = 1 - P(C_i \mid x)$$

For minimum risk, choose the most probable class.

Losses and Risks: Reject

Define an additional action $\alpha_{K+1}$, reject, with

$$\lambda_{ik} = \begin{cases} 0 & \text{if } i = k \\ \lambda & \text{if } i = K + 1 \\ 1 & \text{otherwise} \end{cases} \qquad 0 < \lambda < 1$$

$$R(\alpha_{K+1} \mid x) = \lambda, \qquad R(\alpha_i \mid x) = 1 - P(C_i \mid x)$$

Choose $C_i$ if $P(C_i \mid x) > P(C_k \mid x)$ for all $k \ne i$ and $P(C_i \mid x) > 1 - \lambda$; reject otherwise.
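A sketch of the reject rule; the posterior values and $\lambda$ are illustrative:

```python
def decide(posteriors: list[float], lam: float):
    """Choose the most probable class only if its posterior exceeds
    1 - lam; otherwise reject."""
    best = max(range(len(posteriors)), key=lambda k: posteriors[k])
    return best if posteriors[best] > 1.0 - lam else "reject"

print(decide([0.55, 0.45], lam=0.2))  # 0.55 <= 0.8 -> 'reject'
print(decide([0.90, 0.10], lam=0.2))  # 0.90 >  0.8 -> class index 0
```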

Different Losses and Reject

[Figure: posteriors and decision boundaries under equal losses, unequal losses, and with a reject region.]

Discriminant Functions

$g_i(x)$, $i = 1, \dots, K$: choose $C_i$ if $g_i(x) = \max_k g_k(x)$. The discriminant can be $g_i(x) = -R(\alpha_i \mid x)$, $g_i(x) = P(C_i \mid x)$, or $g_i(x) = p(x \mid C_i)\, P(C_i)$.

This divides the feature space into K decision regions $\mathcal{R}_1, \dots, \mathcal{R}_K$, where $\mathcal{R}_i = \{x \mid g_i(x) = \max_k g_k(x)\}$.
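As an illustration, a sketch using $g_i(x) = \log p(x \mid C_i) + \log P(C_i)$ with one-dimensional Gaussian class likelihoods; the Gaussian form, means, variances, and priors are assumptions made for the example:

```python
import math

def g(x: float, mean: float, var: float, prior: float) -> float:
    # log p(x | C_i) for a 1-D Gaussian, plus log P(C_i)
    log_likelihood = -0.5 * math.log(2 * math.pi * var) \
                     - (x - mean) ** 2 / (2 * var)
    return log_likelihood + math.log(prior)

classes = [(0.0, 1.0, 0.5), (3.0, 1.0, 0.3), (6.0, 2.0, 0.2)]  # (mean, var, prior)
x = 2.0
scores = [g(x, m, v, p) for m, v, p in classes]
print("choose C", scores.index(max(scores)) + 1)  # maximize g_i(x)
```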

K=2 Classes

Dichotomizer (K = 2) vs. polychotomizer (K > 2).

$g(x) = g_1(x) - g_2(x)$: choose $C_1$ if $g(x) > 0$, $C_2$ otherwise.

Log odds: $\log \dfrac{P(C_1 \mid x)}{P(C_2 \mid x)}$.
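A one-line sketch of the log-odds dichotomizer; the posterior value is illustrative:

```python
import math

def log_odds(p_c1: float) -> float:
    # Two classes, so P(C_2 | x) = 1 - P(C_1 | x)
    return math.log(p_c1 / (1.0 - p_c1))

print("C1" if log_odds(0.8) > 0 else "C2")  # log(0.8 / 0.2) > 0 -> C1
```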

Utility Theory

Probability of state $k$ given evidence $x$: $P(S_k \mid x)$. Utility of $\alpha_i$ when the state is $k$: $U_{ik}$.

Expected utility:

$$EU(\alpha_i \mid x) = \sum_k U_{ik}\, P(S_k \mid x)$$

Choose $\alpha_i$ if $EU(\alpha_i \mid x) = \max_j EU(\alpha_j \mid x)$.
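A sketch of expected-utility maximization; the utility matrix and posteriors are made up:

```python
posteriors = [0.6, 0.4]        # P(S_k | x)
utility = [[100.0, -50.0],     # U_1k: utilities of action alpha_1
           [ 20.0,  10.0]]     # U_2k: utilities of action alpha_2

# EU(alpha_i | x) = sum_k U_ik * P(S_k | x)
eu = [sum(u * p for u, p in zip(row, posteriors)) for row in utility]
print(eu, "-> choose action", eu.index(max(eu)) + 1)  # [40.0, 16.0] -> 1
```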

Association Rules

Association rule: X → Y. People who buy/click/visit/enjoy X are also likely to buy/click/visit/enjoy Y.

A rule implies association, not necessarily causation.

Association Measures

Support (X → Y): $P(X, Y) = \dfrac{\#\{\text{customers who bought } X \text{ and } Y\}}{\#\{\text{customers}\}}$

Confidence (X → Y): $P(Y \mid X) = \dfrac{P(X, Y)}{P(X)}$

Lift (X → Y): $\dfrac{P(X, Y)}{P(X)\, P(Y)} = \dfrac{P(Y \mid X)}{P(Y)}$

Example

[Worked basket-data example computing support, confidence, and lift; shown as a table in the original slides.]
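As a stand-in for the original table, a sketch computing the three measures for a rule X → Y over a small, made-up set of baskets:

```python
baskets = [{"X", "Y"}, {"X"}, {"X", "Y", "Z"}, {"Y"}, {"X", "Y"}, {"Z"}]
n = len(baskets)

p_x  = sum("X" in b for b in baskets) / n          # P(X)
p_y  = sum("Y" in b for b in baskets) / n          # P(Y)
p_xy = sum({"X", "Y"} <= b for b in baskets) / n   # P(X, Y)

support = p_xy                # 0.50
confidence = p_xy / p_x       # P(Y | X) = 0.75
lift = p_xy / (p_x * p_y)     # 1.125 > 1: X and Y are positively associated
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.3f}")
```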

Apriori Algorithm (Agrawal et al., 1996)

For a 3-item set (X, Y, Z) to be frequent (have enough support), (X, Y), (X, Z), and (Y, Z) must all be frequent.

Conversely, if (X, Y) is not frequent, none of its supersets can be frequent.

Once we find the frequent k-item sets, we convert them to rules: X, Y → Z, ... and X → Y, Z, ...
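A minimal sketch of the Apriori idea, not the full algorithm from the paper: grow frequent k-item sets from frequent (k-1)-item sets, pruning any candidate with an infrequent subset. Baskets and the support threshold are illustrative:

```python
from itertools import combinations

def apriori(baskets, min_support):
    n = len(baskets)
    items = {i for b in baskets for i in b}
    # Level 1: frequent single items
    frequent = [{frozenset([i]) for i in items
                 if sum(i in b for b in baskets) / n >= min_support}]
    k = 2
    while frequent[-1]:
        # Candidate k-item sets from unions of frequent (k-1)-item sets
        candidates = {a | b for a in frequent[-1] for b in frequent[-1]
                      if len(a | b) == k}
        # Prune: every (k-1)-subset must itself be frequent
        candidates = {c for c in candidates
                      if all(frozenset(s) in frequent[-1]
                             for s in combinations(c, k - 1))}
        frequent.append({c for c in candidates
                         if sum(c <= b for b in baskets) / n >= min_support})
        k += 1
    return [s for level in frequent for s in level]

baskets = [{"X", "Y"}, {"X", "Y", "Z"}, {"X", "Z"}, {"Y", "Z"}]
print(apriori(baskets, min_support=0.5))  # all singletons and all pairs
```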