Bayesian Learning, cont’d

Administrivia. Homework 1 returned today (details in a second). Reading 2 assigned today: S. Thrun, "Learning occupancy grids with forward sensor models," Autonomous Robots, 2002. Due: Oct 26. Much crunchier than the first! Don’t slack. Work with your group to sort out the math. Questions to the mailing list and me. Midterm exam: Oct 21.

Homework 1 results. Mean = 30.3; std = 6.9.

IID Samples. In supervised learning, we usually assume that data points are sampled independently and from the same distribution. IID assumption: data are independent and identically distributed ⇒ the joint PDF can be written as the product of the individual (marginal) PDFs: $f(x_1, \ldots, x_N) = \prod_{i=1}^{N} f(x_i)$
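
As a minimal sketch of this factorization, assume each point follows a univariate Gaussian. The function names `gaussian_pdf`, `joint_pdf`, and `joint_log_pdf` and the sample values are illustrative, not from the lecture. The joint density of IID data is the product of the per-point densities, and its log is the sum of the per-point log densities:

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of a univariate Gaussian N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)

def joint_pdf(xs, mu, sigma):
    """Joint density of IID points: product of the marginal densities."""
    return math.prod(gaussian_pdf(x, mu, sigma) for x in xs)

def joint_log_pdf(xs, mu, sigma):
    """Log of the joint density: sum of the per-point log densities."""
    return sum(math.log(gaussian_pdf(x, mu, sigma)) for x in xs)

xs = [0.2, -1.1, 0.7]  # made-up sample values, assumed IID
p = joint_pdf(xs, mu=0.0, sigma=1.0)
lp = joint_log_pdf(xs, mu=0.0, sigma=1.0)
```

Working with the log form is usually preferable in practice, since a product of many small densities can underflow.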

The max likelihood recipe. Start with IID data. Assume a model for an individual data point, f(X; Θ). Construct the joint likelihood function (PDF): $L(\Theta) = \prod_{i=1}^{N} f(x_i; \Theta)$. Find the params Θ that maximize L: $\hat{\Theta} = \arg\max_{\Theta} L(\Theta)$. (If you’re lucky): differentiate L w.r.t. Θ, set = 0, and solve. Repeat for each class.
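
When no closed-form solution is available, the recipe can still be followed numerically. The sketch below assumes a univariate Gaussian model with a fixed, known σ and maximizes the log-likelihood over μ by ternary search. Both the search method and the names `log_likelihood` and `maximize_1d` are illustrative choices, not something the lecture prescribes:

```python
import math

def log_likelihood(xs, mu, sigma=1.0):
    """Log of the joint likelihood of IID Gaussian data, as a function of mu."""
    n = len(xs)
    sq = sum((x - mu) ** 2 for x in xs)
    return -0.5 * n * math.log(2.0 * math.pi * sigma ** 2) - sq / (2.0 * sigma ** 2)

def maximize_1d(f, lo, hi, tol=1e-8):
    """Ternary search for the maximizer of a unimodal function on [lo, hi]."""
    while hi - lo > tol:
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if f(a) < f(b):
            lo = a  # the maximum cannot lie in [lo, a]
        else:
            hi = b  # the maximum cannot lie in [b, hi]
    return 0.5 * (lo + hi)

xs = [2.1, 1.9, 2.4, 1.6, 2.0]  # made-up IID sample
# The Gaussian log-likelihood is concave in mu, so a bracket of
# [min(xs), max(xs)] is guaranteed to contain the maximizer.
mu_hat = maximize_1d(lambda m: log_likelihood(xs, m), min(xs), max(xs))
```

For this model the numerical answer should agree with the sample mean up to the search tolerance.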

Exercise. Find the maximum likelihood estimator of μ for the univariate Gaussian: $f(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$. Find the maximum likelihood estimator of β for the degenerate gamma distribution. Hint: consider the log of the likelihood functions in both cases.

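Solutions. PDF for one data point: $f(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$. Joint likelihood of N data points: $L(\mu, \sigma) = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x_i-\mu)^2}{2\sigma^2}\right)$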

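Solutions. Log-likelihood: $\log L(\mu, \sigma) = -N \log\left(\sqrt{2\pi}\,\sigma\right) - \frac{1}{2\sigma^2} \sum_{i=1}^{N} (x_i - \mu)^2$. Differentiate w.r.t. μ and set to zero: $\frac{\partial \log L}{\partial \mu} = \frac{1}{\sigma^2} \sum_{i=1}^{N} (x_i - \mu) = 0 \;\Rightarrow\; \hat{\mu} = \frac{1}{N} \sum_{i=1}^{N} x_i$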
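
Setting the derivative of the Gaussian log-likelihood with respect to μ to zero gives the sample mean as the estimator. A quick check of that result, using made-up data and the illustrative helper names `gaussian_mu_mle` and `score_mu`:

```python
def gaussian_mu_mle(xs):
    """Closed-form ML estimate of mu for a univariate Gaussian: the sample mean."""
    return sum(xs) / len(xs)

def score_mu(xs, mu, sigma=1.0):
    """Derivative of the Gaussian log-likelihood with respect to mu."""
    return sum(x - mu for x in xs) / sigma ** 2

xs = [4.0, 7.0, 1.0, 8.0]  # made-up sample
mu_hat = gaussian_mu_mle(xs)

# The derivative vanishes at mu_hat, is positive to its left and
# negative to its right, so mu_hat is a maximum rather than a minimum.
slope_at_hat = score_mu(xs, mu_hat)
slope_left = score_mu(xs, mu_hat - 1.0)
slope_right = score_mu(xs, mu_hat + 1.0)
```

Note that the estimate of μ does not depend on σ: the factor $1/\sigma^2$ scales the derivative but does not move its zero.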

Solutions. What about the gamma PDF?

Putting the parts together. [X, Y]: complete training data.

Putting the parts together. Assumed distribution family (hypothesis space) with parameters Θ. Parameters for class a: $\Theta_a$. Specific PDF for class a: $f(X; \Theta_a)$.
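
The pipeline from training data to per-class PDFs can be sketched as follows. This assumes a single real-valued feature and a univariate Gaussian family, so each class's parameters are a pair (μ, σ). The names `fit_class_gaussians` and `class_pdf` and the toy data are hypothetical, and the variance estimate uses the standard maximum likelihood form (dividing by N), which the slides do not derive:

```python
import math
from collections import defaultdict

def fit_class_gaussians(X, Y):
    """Return {label: (mu, sigma)}, fitted by maximum likelihood per class."""
    by_class = defaultdict(list)
    for x, y in zip(X, Y):
        by_class[y].append(x)  # split the training data by class label
    params = {}
    for label, xs in by_class.items():
        mu = sum(xs) / len(xs)
        var = sum((x - mu) ** 2 for x in xs) / len(xs)  # ML variance (divides by N)
        params[label] = (mu, math.sqrt(var))
    return params

def class_pdf(x, mu, sigma):
    """Specific PDF for one class: a Gaussian with that class's parameters."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)

X = [1.0, 3.0, 10.0, 14.0]   # made-up feature values
Y = ["a", "a", "b", "b"]     # made-up class labels
params = fit_class_gaussians(X, Y)
density_a = class_pdf(2.0, *params["a"])  # PDF for class a, evaluated at x = 2
```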

Gaussian Distributions

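5 minutes of math... Recall your friend the Gaussian PDF: $f(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$. I asserted that the d-dimensional form is: $f(\mathbf{x}; \boldsymbol{\mu}, \Sigma) = \frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}} \exp\left(-\frac{1}{2} (\mathbf{x}-\boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{x}-\boldsymbol{\mu})\right)$. Let’s look at the parts...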
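
To make the parts of the d-dimensional Gaussian concrete, here is a minimal sketch for the special case d = 2, where the determinant and inverse of the covariance matrix can be written out by hand. The function name `mvn_pdf_2d` is illustrative, and restricting to two dimensions is an assumption made to keep the example free of linear algebra libraries:

```python
import math

def mvn_pdf_2d(x, mu, cov):
    """Density of a 2-D Gaussian at x, with mean mu and 2x2 covariance cov."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if a <= 0 or det <= 0:
        raise ValueError("covariance must be positive definite")
    # Inverse of a 2x2 matrix, written out explicitly.
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx = (x[0] - mu[0], x[1] - mu[1])
    # Quadratic form (x - mu)^T cov^{-1} (x - mu).
    q = (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
         + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))
    dim = 2
    norm = (2.0 * math.pi) ** (dim / 2.0) * math.sqrt(det)
    return math.exp(-0.5 * q) / norm

# At the mean of a standard 2-D Gaussian, the density is 1 / (2 * pi).
peak = mvn_pdf_2d((0.0, 0.0), (0.0, 0.0), ((1.0, 0.0), (0.0, 1.0)))
```

With a diagonal covariance the quadratic form separates by dimension, so the density reduces to a product of two univariate Gaussians.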

Ok, but what do the parts mean? Mean vector, μ: the mean of the data along each dimension.

5 minutes of math... Covariance matrix, Σ: like variance, but describes the spread of the data.

5 minutes of math... Note: the covariances on the diagonal of Σ are the same as the standard variances along that dimension of the data: $\Sigma_{ii} = \sigma_i^2$. But what about skewed data?

5 minutes of math... Off-diagonal covariances ($\Sigma_{ij}$, $i \neq j$) describe the pairwise variance: how much $x_i$ changes as $x_j$ changes (on average).

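5 minutes of math... Calculating Σ from data. In practice, you want to measure the covariance between every pair of random variables (dimensions): $\Sigma_{ij} = \frac{1}{N} \sum_{k=1}^{N} (x_{ki} - \mu_i)(x_{kj} - \mu_j)$. Or, in linear algebra: $\Sigma = \frac{1}{N} (X - \mathbf{1}\boldsymbol{\mu}^T)^T (X - \mathbf{1}\boldsymbol{\mu}^T)$, where X is the N × d data matrix with one data point per row.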
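
A minimal sketch of the pairwise calculation, in plain Python. The names `mean_vector` and `covariance_matrix` and the toy data are illustrative, and the normalization by N (the maximum likelihood form) is an assumption. Dividing by N − 1 instead gives the unbiased estimate:

```python
def mean_vector(X):
    """Mean of the data along each dimension (X is a list of rows)."""
    n, d = len(X), len(X[0])
    return [sum(row[j] for row in X) / n for j in range(d)]

def covariance_matrix(X):
    """Covariance between every pair of dimensions, normalized by N."""
    n = len(X)
    mu = mean_vector(X)
    d = len(mu)
    centered = [[row[j] - mu[j] for j in range(d)] for row in X]
    return [[sum(r[i] * r[j] for r in centered) / n for j in range(d)]
            for i in range(d)]

# Made-up data in which the second dimension is exactly twice the first.
X = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
mu = mean_vector(X)
S = covariance_matrix(X)
```

In this toy data the diagonal entries of `S` are the per-dimension variances, and the positive off-diagonal entry reflects that the two dimensions rise together.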
