1
Online Learning Yiling Chen
2
Machine Learning
Use past observations to automatically learn to make better predictions or decisions in the future. It is a large field; this lecture only scratches the surface of one part of it.
3
Example: Click Prediction
4
Example: Recommender System (the Netflix challenge)
5
Spam Prediction
Each email is described by four binary features and a label:
Unknown Sender | Sent to more than 10 people | "Cheap" or "Sale" | "Dear Sir" | Spam?
0 | 0 | 0 | 0 | 0
1 | 1 | 1 | 1 | 1
1 | 1 | 0 | 1 | 1
We need some reasonable concept classes, e.g.:
Disjunctions: Spam if "Dear Sir" or "Sent to more than 10 people"
Threshold: Spam if "Dear Sir" + "Sent to more than 10 people" + "Unknown Sender" > 2
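To make the two rule families concrete, here is a minimal Python sketch; the feature ordering and example rows follow the table above, while the function and variable names are illustrative rather than from the lecture:

# Feature order (assumed): unknown_sender, sent_to_10_plus, cheap_or_sale, dear_sir
emails = [
    ((0, 0, 0, 0), 0),  # row 1 of the table: not spam
    ((1, 1, 1, 1), 1),  # row 2: spam
    ((1, 1, 0, 1), 1),  # row 3: spam
]

def disjunction_rule(x):
    # Disjunction over "Dear Sir" and "Sent to more than 10 people":
    # predict spam if either feature is present.
    unknown_sender, sent_to_10_plus, cheap_or_sale, dear_sir = x
    return int(dear_sir or sent_to_10_plus)

def threshold_rule(x):
    # Spam if "Dear Sir" + "Sent to more than 10 people" + "Unknown Sender" > 2.
    unknown_sender, sent_to_10_plus, cheap_or_sale, dear_sir = x
    return int(dear_sir + sent_to_10_plus + unknown_sender > 2)

for x, label in emails:
    print(x, disjunction_rule(x), threshold_rule(x), label)

Both rules classify the three example rows correctly; the point is that each line defines a whole class of candidate prediction rules.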
6
Batch Learning
All labeled examples (the table from the previous slide) are given to the Learning Algorithm at once, and it outputs a Prediction Rule that is then applied to new examples.
7
Online Learning
Examples arrive one at a time, with the label initially hidden. The algorithm first sees 0 0 0 0 and must predict "Spam?" before the true label (0) is revealed; it then sees 1 1 1 1, predicts, and the true label (1) is revealed; and so on. How should the prediction rule be updated after each round?
8
Competitive Ratio
Optimal offline algorithm: optimal in hindsight.
Competitive ratio = (performance of the online algorithm) / (performance of the optimal offline algorithm).
9
Why We Care?
The "Learning from Expert Advice" setting is an information aggregation problem. Each candidate rule can be viewed as an "expert":
– Spam if "Dear Sir" or "Sent to more than 10 people"
– Spam if "Dear Sir" + "Sent to more than 10 people" + "Unknown Sender" > 2
– Yahoo!'s spam filter
Can we make use of the predictions of these "experts"?
10
Basic Online Learning Setting
In each round:
– The learning algorithm sees a new example
– The algorithm predicts a label for this example
– After the prediction, the true label is observed
– The algorithm makes a mistake if its prediction differs from the true label
– The algorithm then updates its prediction rule
The protocol is sketched in code below.
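A minimal sketch of this protocol, assuming a hypothetical learner object with predict and update methods:

def online_protocol(learner, examples):
    # `examples` is a stream of (x, true_label) pairs; the label is revealed
    # only after the learner has committed to a prediction.
    mistakes = 0
    for x, y_true in examples:
        y_pred = learner.predict(x)   # predict before seeing the label
        if y_pred != y_true:
            mistakes += 1             # a mistake: prediction != true label
        learner.update(x, y_true)     # update the prediction rule
    return mistakes

The algorithms that follow (Halving, Winnow, Weighted Majority) all instantiate this loop with different prediction and update rules.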
11
Two Goals
Minimize the number of mistakes
– Hope that (# of mistakes / # of rounds) -> 0
– Assume that there is a perfect target function
Minimize regret
– Hope that (# of mistakes - # of mistakes by the comparator) / # of rounds -> 0
– Adversarial setting
12
Minimizing the Number of Mistakes
13
Halving Algorithm
Let C be a finite concept class. Assume that there exists some c* in C that is consistent with every example, i.e. c*(x_t) = y_t for all t. Then the number of mistakes made by Halving is no more than log2 |C|.
14
Halving Algorithm
The current version space contains all functions in C that are consistent with the observations so far. At each round t, predict the label given to the new example by the majority of the functions in the current version space. After the true label is revealed, update the version space by removing every function that disagrees with it. Each mistake eliminates at least half of the version space, which gives the log2 |C| mistake bound.
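A minimal sketch of Halving, assuming the concept class is given explicitly as a list of Python functions mapping an example to 0 or 1 (feasible only when |C| is small, which is exactly the tractability issue raised on the next slide):

def halving(concepts, examples):
    version_space = list(concepts)        # all concepts consistent so far
    mistakes = 0
    for x, y_true in examples:
        votes = sum(c(x) for c in version_space)
        y_pred = 1 if 2 * votes >= len(version_space) else 0   # majority vote
        if y_pred != y_true:
            mistakes += 1                 # each mistake halves the version space
        # keep only the concepts that agree with the revealed label
        version_space = [c for c in version_space if c(x) == y_true]
    return mistakes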
15
Monotonic Disjunctions
The concept class is disjunctions of r out of n variables. |C| can be very large, so Halving is not computationally tractable here.
16
The Winnow Algorithm
17
# of mistakes <= O(r log n)
We can treat each variable (feature) as an expert. Winnow maintains a weight for each of these experts and updates the weights dynamically as examples arrive.
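A minimal sketch of a standard Winnow-style rule (threshold n, promotion by a factor of 2 on a missed positive, demotion by 1/2 on a false positive); the exact constants and variant used in the lecture may differ:

def winnow(examples, n):
    w = [1.0] * n                        # one weight per variable ("expert")
    mistakes = 0
    for x, y_true in examples:           # x is a 0/1 vector of length n
        score = sum(wi * xi for wi, xi in zip(w, x))
        y_pred = 1 if score >= n else 0  # threshold prediction
        if y_pred != y_true:
            mistakes += 1
            factor = 2.0 if y_true == 1 else 0.5
            # multiplicatively update only the weights of active variables
            w = [wi * factor if xi else wi for wi, xi in zip(w, x)]
    return mistakes

For target disjunctions of r out of n variables, multiplicative updates of this kind yield the O(r log n) mistake bound.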
18
Minimizing Regret
– No assumption on the distribution of examples
– No assumption on the target function
– Adversarial setting
19
# of mistakes <= 2.41 (m + log n), where m is the number of mistakes made by the best expert and n is the number of experts.
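A minimal sketch of the deterministic Weighted Majority algorithm over n experts; with beta = 1/2 the number of mistakes stays within roughly 2.41 (m + log2 n), where m is the best expert's number of mistakes:

def weighted_majority(rounds, beta=0.5):
    # `rounds` yields (expert_predictions, true_label) pairs with 0/1 predictions.
    w = None
    mistakes = 0
    for preds, y_true in rounds:
        if w is None:
            w = [1.0] * len(preds)       # one unit weight per expert
        vote_1 = sum(wi for wi, p in zip(w, preds) if p == 1)
        vote_0 = sum(wi for wi, p in zip(w, preds) if p == 0)
        y_pred = 1 if vote_1 >= vote_0 else 0   # weighted majority vote
        if y_pred != y_true:
            mistakes += 1
        # shrink the weight of every expert that was wrong this round
        w = [wi * beta if p != y_true else wi for wi, p in zip(w, preds)]
    return mistakes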
20
Expected # of mistakes <= m + log n + O(sqrt(m log n)), where m is again the number of mistakes of the best expert.
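A bound of this form comes from randomizing the prediction: follow a single expert drawn with probability proportional to its weight (Randomized Weighted Majority). A minimal sketch that tracks the expected number of mistakes directly instead of sampling; the parameter beta would need to be tuned using m and n to obtain the stated bound:

def randomized_weighted_majority(rounds, beta=0.9):
    w = None
    expected_mistakes = 0.0
    for preds, y_true in rounds:
        if w is None:
            w = [1.0] * len(preds)       # one unit weight per expert
        total = sum(w)
        # probability of a mistake = fraction of total weight held by wrong experts
        wrong = sum(wi for wi, p in zip(w, preds) if p != y_true)
        expected_mistakes += wrong / total
        # shrink the weights of the wrong experts, as in the deterministic version
        w = [wi * beta if p != y_true else wi for wi, p in zip(w, preds)]
    return expected_mistakes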