CS 416 Artificial Intelligence


1 CS 416 Artificial Intelligence
Lecture 24 Statistical Learning Chapter 20

2 AI: Creating rational agents
The pursuit of autonomous, rational agents. It’s all about search, with varying amounts of model information: tree searching (informed/uninformed), simulated annealing, value/policy iteration. Searching for an explanation of observations is used to develop a model.

3 Searching for explanation of observations
If I can explain observations… can I predict the future? Can I explain why ten coin tosses are 6 H and 4 T? Can I predict the 11th coin toss?

4 Running example: Candy
Surprise Candy comes in two flavors: cherry (yum) and lime (yuk). All candy is wrapped in the same opaque wrapper. Candy is packaged in large bags, each with one of five different allocations of cherry and lime.

5 Statistics
Given a bag of candy, what distribution of flavors will it have? Let H be the random variable corresponding to your hypothesis, e.g. H1 = all cherry, H2 = all lime, H3 = 50/50 cherry/lime. As you open pieces of candy, let each observation of data D1, D2, D3, … be either cherry or lime, e.g. D1 = cherry, D2 = cherry, D3 = lime, … Predict the flavor of the next piece of candy: if the data caused you to believe H1 was correct, you’d pick cherry.

6 Bayesian Learning
Use available data to calculate the probability of each hypothesis and make a prediction. Because each hypothesis has its own likelihood, we use all their relative likelihoods when making a prediction. Probabilistic inference using Bayes’ rule: P(h_i | d) = α P(d | h_i) P(h_i), where P(d | h_i) is the likelihood, P(h_i) is the hypothesis prior, and α is a normalizing constant. The probability of hypothesis h_i being active given that you observed sequence d equals (up to normalization) the probability of seeing data sequence d generated by hypothesis h_i multiplied by the prior probability of hypothesis h_i being active.

7 Prediction of an unknown quantity X
The probability of X given that d has been observed depends on how strongly each hypothesis predicts X, weighted by how probable that hypothesis is given d. Even if a hypothesis assigns X a high probability, this prediction is discounted if the hypothesis itself is unlikely to be true given the observation of d.
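The weighting described on this slide can be written out as a sum over hypotheses. The form below is the standard Bayesian prediction rule; it assumes that each hypothesis h_i fully determines the distribution of X, so that X is independent of d once h_i is known.

```latex
P(X \mid d) \;=\; \sum_i P(X \mid d, h_i)\, P(h_i \mid d)
           \;=\; \sum_i P(X \mid h_i)\, P(h_i \mid d)
```

Each term is one hypothesis's prediction P(X | h_i) scaled by that hypothesis's posterior P(h_i | d), which is why an unlikely hypothesis contributes little even when it predicts X strongly.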

8 Details of Bayes’ rule
All observations within d are independent and identically distributed (i.i.d.). The probability of a hypothesis explaining a series of observations d is the product of the probabilities of explaining each component: P(d | h_i) = Π_j P(d_j | h_i).
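As a minimal sketch of the product rule for i.i.d. observations, the function below multiplies per-candy probabilities. The function name `sequence_likelihood` and the string labels "cherry" and "lime" are illustrative choices, not part of the lecture.

```python
from math import prod

def sequence_likelihood(observations, p_cherry):
    """P(d | h) for i.i.d. draws, where hypothesis h says each candy
    is cherry with probability p_cherry and lime otherwise."""
    return prod(p_cherry if obs == "cherry" else 1.0 - p_cherry
                for obs in observations)

# Ten limes in a row under the 50/50 hypothesis: (0.5)^10
print(sequence_likelihood(["lime"] * 10, 0.5))
```

Because the draws are independent, the order of the observations does not change the result; only the counts of each flavor matter.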

9 Example
Prior distribution across hypotheses: h1 = 100% cherry = 0.1, h2 = 75/25 cherry/lime = 0.2, h3 = 50/50 cherry/lime = 0.4, h4 = 25/75 cherry/lime = 0.2, h5 = 100% lime = 0.1. Prediction: for any sequence d of 10 candies, P(d | h3) = (0.5)^10.

10 Example
The probability of each hypothesis starts at its prior value <.1, .2, .4, .2, .1>. Probability of the h3 hypothesis as 10 lime candies are observed, before normalization: P(d | h3) P(h3) = (0.5)^10 × 0.4.
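To see what the unnormalized value means, it helps to work out the same product for all five hypotheses and normalize. The sketch below assumes the priors <.1, .2, .4, .2, .1> and lime probabilities of 0, 0.25, 0.5, 0.75, and 1 under h1 through h5; the decimal values are rounded.

```latex
\begin{align*}
P(d \mid h_1)\,P(h_1) &= 0^{10} \times 0.1 = 0 \\
P(d \mid h_2)\,P(h_2) &= (0.25)^{10} \times 0.2 \approx 1.9 \times 10^{-7} \\
P(d \mid h_3)\,P(h_3) &= (0.5)^{10} \times 0.4 \approx 3.9 \times 10^{-4} \\
P(d \mid h_4)\,P(h_4) &= (0.75)^{10} \times 0.2 \approx 1.1 \times 10^{-2} \\
P(d \mid h_5)\,P(h_5) &= 1^{10} \times 0.1 = 0.1 \\[4pt]
P(h_3 \mid d) &= \frac{P(d \mid h_3)\,P(h_3)}{\sum_i P(d \mid h_i)\,P(h_i)} \approx 0.0035
\end{align*}
```

After ten limes, h3 has dropped from a prior of 0.4 to a posterior well under 1%, while the all-lime hypothesis h5 dominates.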

11 Prediction of 11th candy
If we’ve observed 10 lime candies, is the 11th lime? Build a weighted sum of each hypothesis’s prediction. The weighted sum can become expensive to compute. Instead, use the most probable hypothesis given the observations and ignore the others. This is the MAP (maximum a posteriori) hypothesis.
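The two prediction strategies can be compared directly on the candy example. This is a minimal sketch, assuming the priors <.1, .2, .4, .2, .1> and that the data are a run of lime candies; the function names are illustrative, not from the lecture.

```python
# Priors P(h_i) and per-candy lime probabilities P(lime | h_i) for h1..h5.
priors = [0.1, 0.2, 0.4, 0.2, 0.1]
p_lime = [0.0, 0.25, 0.5, 0.75, 1.0]

def posterior(num_limes):
    """P(h_i | d) when d is num_limes lime candies in a row."""
    unnormalized = [prior * p ** num_limes for prior, p in zip(priors, p_lime)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

def bayes_predict_lime(num_limes):
    """Full Bayesian prediction: posterior-weighted sum over all hypotheses."""
    return sum(p * w for p, w in zip(p_lime, posterior(num_limes)))

def map_predict_lime(num_limes):
    """MAP prediction: use only the single most probable hypothesis."""
    post = posterior(num_limes)
    best = max(range(len(post)), key=lambda i: post[i])
    return p_lime[best]

for n in (0, 1, 2, 3, 10):
    print(n, round(bayes_predict_lime(n), 4), map_predict_lime(n))
```

With only five hypotheses the weighted sum is cheap, so the MAP shortcut matters mainly when the hypothesis space is large. The two predictions move closer together as more data arrive and one hypothesis dominates the posterior.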

12 Overfitting Remember overfitting from NN discussion?
The number of hypotheses influences predictions. Too many hypotheses can lead to overfitting.

13 Overfitting Example
Say we’ve observed 3 cherry and 7 lime. Consider our 5 hypotheses from before: the prediction is a weighted average of the 5. Now consider having 11 hypotheses, one for each possible cherry/lime split of the 10 candies. The 3/7 hypothesis will be 1 and all others will be 0.

14 Learning with Data
First talk about parameter learning. Let’s create a hypothesis h_θ for candies that says the probability a cherry is drawn is θ. If we unwrap N candies and c are cherry, what is θ? The likelihood is P(d | h_θ) = θ^c (1 − θ)^(N − c), so the log-likelihood is L = log P(d | h_θ) = c log θ + (N − c) log(1 − θ).

15 Learning with Data
We want to find the θ that maximizes the log-likelihood: differentiate L with respect to θ and set the derivative to 0. Here dL/dθ = c/θ − (N − c)/(1 − θ) = 0 gives θ = c/N. In general, this solution process may not be easy to carry out in closed form, and iterative numerical methods may be needed.
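When no closed form is available, the maximization can be done numerically. The sketch below uses a ternary search, which is one simple choice among many and is not prescribed by the lecture. It assumes the candy log-likelihood c·log(theta) + (n − c)·log(1 − theta), where theta is the probability of drawing a cherry; this function is concave on (0, 1), so a ternary search is valid. The function names and the tolerance are illustrative.

```python
import math

def log_likelihood(theta, n, c):
    """Log-likelihood of seeing c cherries in n i.i.d. draws."""
    return c * math.log(theta) + (n - c) * math.log(1.0 - theta)

def maximize_theta(n, c, tol=1e-7):
    """Ternary search for the maximizer of the concave log-likelihood."""
    lo, hi = 1e-12, 1.0 - 1e-12  # stay strictly inside (0, 1) to avoid log(0)
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if log_likelihood(m1, n, c) < log_likelihood(m2, n, c):
            lo = m1  # the maximum lies to the right of m1
        else:
            hi = m2  # the maximum lies to the left of m2
    return (lo + hi) / 2.0

# 3 cherries out of 10 candies; compare with the closed-form answer c / n.
print(maximize_theta(10, 3), 3 / 10)
```

For this one-parameter model the numerical answer should agree with c/N up to the search tolerance, which makes it a convenient check before applying the same idea to models without a closed-form solution.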

