1
Machine Learning with Discriminative Methods
Lecture 02 – PAC Learning and tail bounds intro
CS Spring 2015, Alex Berg
2
Today’s lecture: PAC learning and tail bounds…
3
Rectangle learning
[Figure: hypothesis H drawn as an axis-aligned rectangle, with + points inside and − points outside]
The hypothesis is any axis-aligned rectangle. Inside the rectangle is positive.
4
Rectangle learning – Realizable case
The actual boundary is also an axis-aligned rectangle, “The Realizable Case” (no approximation error).
[Figure: hypothesis H with + points inside and − points outside]
5
Rectangle learning – Realizable case
The actual boundary is also an axis-aligned rectangle, “The Realizable Case” (no approximation error).
[Figure: hypothesis H with + and − points; one point falls where H disagrees with the true rectangle – a mistake for the hypothesis H!]
Measure ERROR by the probability of making a mistake.
6
Rectangle learning – a strategy for a learning algorithm…
Hypothesis H (output of the learning algorithm so far…).
[Figure: the tightest axis-aligned rectangle around the + points, with − points outside]
Make the smallest rectangle consistent with all the data so far.
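The strategy on this slide is simple enough to sketch in code. The following is my own minimal illustration of the tightest-fit rule, not code from the lecture; the function names and data layout are made up for the example.

```python
import numpy as np

def fit_tightest_rectangle(X, y):
    """Smallest axis-aligned rectangle containing every positive example
    seen so far (the 'tightest fit' strategy from the slide)."""
    pos = X[y == 1]                           # keep only the positive points
    if len(pos) == 0:
        return None                           # no positives yet: empty hypothesis
    return pos.min(axis=0), pos.max(axis=0)   # (lower-left, upper-right) corners

def predict(rect, x):
    """Label a point +1 if it lies inside the learned rectangle, else -1."""
    if rect is None:
        return -1
    lo, hi = rect
    return 1 if np.all(x >= lo) and np.all(x <= hi) else -1

# Toy usage with four labeled points in the plane.
X = np.array([[0.2, 0.3], [0.6, 0.7], [0.9, 0.1], [0.1, 0.9]])
y = np.array([1, 1, -1, -1])
rect = fit_tightest_rectangle(X, y)
print(predict(rect, np.array([0.4, 0.5])))    # falls inside the fitted rectangle -> 1
```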
7
Rectangle learning – making a mistake
Hypothesis H (output of the learning algorithm so far…). The current hypothesis makes a mistake on a new data item…
[Figure: a new + point falls just outside the current rectangle H]
Make the smallest rectangle consistent with all the data so far.
8
Rectangle learning – making a mistake
Hypothesis H (output of the learning algorithm so far…). The current hypothesis makes a mistake on a new data item…
[Figure: a new + point falls just outside the current rectangle H]
Use the probability of such a mistake (this is our error measure) to bound how likely it was that we had not yet seen a training example in this region…
Make the smallest rectangle consistent with all the data so far.
9
Very subtle formulation…
R = the actual decision boundary; R′ = the result of the algorithm so far (after m samples). From the Kearns and Vazirani reading.
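The slide image carries the details; roughly, the argument from Kearns and Vazirani goes as follows (reconstructed from memory, so check the reading for the exact constants). In the realizable case the tightest-fit rectangle satisfies R′ ⊆ R, so the error region R − R′ can be covered by four strips along the sides of R, each chosen to have probability mass ε/4. The learned rectangle has error greater than ε only if some strip contains none of the m training samples. A single strip is missed by all m samples with probability at most (1 − ε/4)^m ≤ e^{−mε/4}, so by the union bound P(error(R′) > ε) ≤ 4 e^{−mε/4}, which is at most δ once m ≥ (4/ε) ln(4/δ). That is exactly the “probably (1 − δ) approximately (ε) correct” flavor of guarantee that PAC learning asks for.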
10
From the Kearns and Vazirani Reading
11
PAC Learning
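The slide gives the formal definition (as an image); informally, a concept class C is PAC-learnable if there is an algorithm that, for every target concept in C, every data distribution, and every ε, δ > 0, outputs with probability at least 1 − δ a hypothesis whose error is at most ε, using a number of samples (and, for efficient PAC learning, running time) polynomial in 1/ε and 1/δ.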
12
Flashback: Learning/fitting is a process…
Estimating the probability p that a tossed coin comes up heads: X_i is the i’th coin toss, and p̂_n = (1/n) ∑_{i=1}^n X_i is the estimator based on n tosses. The estimate is good when it is within epsilon, |p̂_n − p| ≤ ε, and bad otherwise. The probability of being bad is inversely proportional to the number of samples… (the underlying computation is an example of a tail bound). From Raginsky’s notes.
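As a quick sanity check of this 1/n behavior, here is a small simulation I added (not from the slides): it compares the empirical probability that the estimate misses by more than ε with the Chebyshev-style bound p(1 − p)/(nε²).

```python
import numpy as np

rng = np.random.default_rng(0)
p, eps = 0.6, 0.05            # true heads probability and accuracy target
trials = 10_000               # repetitions used to measure the tail probability

for n in (100, 400, 1600):
    tosses = rng.random((trials, n)) < p        # trials x n Bernoulli(p) coin flips
    p_hat = tosses.mean(axis=1)                 # estimator from n tosses, per trial
    bad = np.mean(np.abs(p_hat - p) > eps)      # empirical P(|p_hat - p| > eps)
    bound = min(p * (1 - p) / (n * eps**2), 1)  # Chebyshev: Var(X_i)/(n eps^2), capped at 1
    print(f"n={n:5d}  empirical={bad:.4f}  Chebyshev bound={bound:.4f}")
```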
13
Markov’s Inequality
From Raginsky’s notes
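For reference (the slide body is an image), the standard statement: for a nonnegative random variable X and any a > 0, P(X ≥ a) ≤ E[X] / a.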
14
Chebyshev’s Inequality
From Raginsky’s notes
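For reference, the usual form: for a random variable X with finite variance and any a > 0, P(|X − E[X]| ≥ a) ≤ Var(X) / a². It follows from Markov’s inequality applied to the nonnegative variable (X − E[X])².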
15
Not quite good enough… From Raginsky’s notes
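The point of this slide, as I read it: Chebyshev only gives a tail probability that shrinks like 1/n, whereas Chernoff-type bounds shrink exponentially in n. For the coin-toss estimator, Hoeffding’s inequality (a Chernoff-style bound for bounded variables) gives P(|p̂_n − p| ≥ ε) ≤ 2 exp(−2nε²), which is the kind of bound used in PAC-style sample-complexity arguments and is the subject of the assigned reading.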
16
For next class:
Read the Wikipedia page for the Chernoff bound.
Read at least the first part of Raginsky’s introductory notes on tail bounds (pages 1–5).
Come to class with questions! It is fine to have questions, but first spend some time trying to work through the reading/problems. Feel free to post questions to the Sakai discussion board!