CS 188: Artificial Intelligence Spring 2007 Lecture 14: Bayes Nets III 3/1/2007 Srini Narayanan – ICSI and UC Berkeley

Announcements
 Office hours this week will be on Friday (11-1)
 Assignment 2 grading
 Midterm 3/13
 Review 3/8 (next Thursday)
 Midterm review materials up over the weekend
 Extended office hours next week (Thursday 11-1, Friday 2:30-4:30)

Representing Knowledge

Inference  Inference: calculating some statistic from a joint probability distribution  Examples:  Posterior probability:  Most likely explanation: R T B D L T’

Inference in Graphical Models
 Queries:
 Value of information: what evidence should I seek next?
 Sensitivity analysis: what probability values are most critical?
 Explanation: e.g., why do I need a new starter motor?
 Prediction: e.g., what would happen if my fuel pump stops working?

Inference Techniques
 Exact inference
 Inference by enumeration
 Variable elimination
 Approximate inference / Monte Carlo
 Prior sampling
 Rejection sampling
 Likelihood weighting
 Markov chain Monte Carlo (MCMC)

Reminder: Alarm Network

Inference by Enumeration
 Given unlimited time, inference in BNs is easy
 Recipe:
 State the marginal probabilities you need
 Figure out ALL the atomic probabilities you need
 Calculate and combine them
 Example: (see the sketch below)
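
As a concrete (and entirely illustrative) sketch of this recipe: a tiny two-variable joint distribution stored as an explicit table, queried by summing the matching atomic events and normalizing. The variable names and numbers are made up, not from the lecture.

# A toy joint distribution P(Burglary, Alarm) stored explicitly as a table.
# (Numbers are made up for illustration; a real BN would synthesize these
# entries from its CPTs.)
joint = {
    (True,  True):  0.0009,
    (True,  False): 0.0001,
    (False, True):  0.0099,
    (False, False): 0.9891,
}

def enumerate_query(joint, query_index, evidence):
    """P(query variable | evidence): sum the matching atomic events, then normalize."""
    totals = {}
    for assignment, p in joint.items():
        if all(assignment[i] == v for i, v in evidence.items()):
            q = assignment[query_index]
            totals[q] = totals.get(q, 0.0) + p
    z = sum(totals.values())
    return {q: p / z for q, p in totals.items()}

# P(Burglary | Alarm = true)
print(enumerate_query(joint, query_index=0, evidence={1: True}))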

Example Where did we use the BN structure? We didn’t!

Example  In this simple method, we only need the BN to synthesize the joint entries

Normalization Trick
 Compute an unnormalized entry for each value of the query variable, then normalize (divide by their sum) to get a proper distribution
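
A small worked instance of the trick, with made-up numbers (not from the slide):

  P(B \mid +j, +m) \propto \langle 0.12,\ 0.48 \rangle
    \;\Rightarrow\; P(B \mid +j, +m) = \langle \tfrac{0.12}{0.60},\ \tfrac{0.48}{0.60} \rangle = \langle 0.2,\ 0.8 \rangle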

Nesting Sums
 Atomic inference is extremely slow!
 Slightly clever way to save work: move the sums as far right as possible
 Example: (see the derivation below)
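
The slide's example equation did not survive the transcript; as a sketch, here is the usual alarm-network query with the sums pushed as far right as possible (B = Burglary, e = Earthquake, a = Alarm, j, m = John/Mary calling; these names are assumed from the classic example):

  P(B \mid j, m) \;\propto\; \sum_{e} \sum_{a} P(B)\, P(e)\, P(a \mid B, e)\, P(j \mid a)\, P(m \mid a)
                 \;=\; P(B) \sum_{e} P(e) \sum_{a} P(a \mid B, e)\, P(j \mid a)\, P(m \mid a)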

Example

Evaluation Tree
 View the nested sums as a computation tree
 Still repeated work: we calculate P(m | a) P(j | a) twice, etc.

Variable Elimination: Idea
 Lots of redundant work in the computation tree
 We can save time if we carry out the summations right to left and cache all intermediate results in objects called factors
 This is the basic idea behind variable elimination

Basic Objects
 Track objects called factors
 Initial factors are the local CPTs
 During elimination, we create new factors
 Anatomy of a factor: its argument variables, the variables introduced so far, and the variables already summed out

Basic Operations
 First basic operation: joining factors
 Combining two factors:
 Just like a database join
 Build a factor over the union of the domains
 Example: (see the sketch below)
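
A minimal sketch of the join operation, assuming factors are represented as dicts that map assignments (frozensets of variable-value pairs) to numbers; the factor names and entries are illustrative, not from the slides.

# Toy factors P(R) and P(T | R), each a dict: {assignment: probability entry}.
f_R = {frozenset({("R", r)}): p for r, p in [(True, 0.1), (False, 0.9)]}
f_T_given_R = {
    frozenset({("R", True),  ("T", True)}):  0.8,
    frozenset({("R", True),  ("T", False)}): 0.2,
    frozenset({("R", False), ("T", True)}):  0.1,
    frozenset({("R", False), ("T", False)}): 0.9,
}

def join(f1, f2):
    """Pointwise product over the union of the variables (like a database join)."""
    result = {}
    for a1, p1 in f1.items():
        for a2, p2 in f2.items():
            merged = dict(a1)
            consistent = all(merged.get(var, val) == val for var, val in a2)
            if consistent:
                merged.update(dict(a2))
                result[frozenset(merged.items())] = p1 * p2
    return result

f_RT = join(f_R, f_T_given_R)   # a factor over {R, T} with entries P(r) * P(t | r)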

Basic Operations
 Second basic operation: marginalization
 Take a factor and sum out a variable
 Shrinks a factor to a smaller one
 A projection operation
 Example: (see the sketch below)
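
Using the same illustrative representation (dicts keyed by frozensets of variable-value pairs), summing a variable out of a factor might look like this:

def sum_out(factor, var):
    """Marginalize: drop `var` from every assignment and add up the matching entries."""
    result = {}
    for assignment, p in factor.items():
        reduced = frozenset((v, val) for v, val in assignment if v != var)
        result[reduced] = result.get(reduced, 0.0) + p
    return result

# Toy joint factor over {R, T} (illustrative numbers).
f_RT = {
    frozenset({("R", True),  ("T", True)}):  0.08,
    frozenset({("R", True),  ("T", False)}): 0.02,
    frozenset({("R", False), ("T", True)}):  0.09,
    frozenset({("R", False), ("T", False)}): 0.81,
}

f_T = sum_out(f_RT, "R")   # factor over {T}: {T=True: 0.17, T=False: 0.83}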

Example

General Variable Elimination
 Query: P(Q | E1=e1, ..., Ek=ek)
 Start with initial factors:
 Local CPTs (but instantiated by evidence)
 While there are still hidden variables (not Q or evidence):
 Pick a hidden variable H
 Join all factors mentioning H
 Project out H
 Join all remaining factors and normalize
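
Putting the two operations together, a compact sketch of the elimination loop (again with the illustrative dict-of-assignments factor representation; evidence is assumed to have already been instantiated into the factors):

def join(f1, f2):
    """Pointwise product over the union of the variables."""
    result = {}
    for a1, p1 in f1.items():
        for a2, p2 in f2.items():
            merged = dict(a1)
            if all(merged.get(var, val) == val for var, val in a2):
                merged.update(dict(a2))
                result[frozenset(merged.items())] = p1 * p2
    return result

def sum_out(factor, var):
    """Marginalize a variable out of a factor."""
    result = {}
    for assignment, p in factor.items():
        reduced = frozenset((v, val) for v, val in assignment if v != var)
        result[reduced] = result.get(reduced, 0.0) + p
    return result

def variables_of(factor):
    return {var for assignment in factor for var, _ in assignment}

def variable_elimination(factors, hidden_vars):
    """hidden_vars: variables that are neither the query nor evidence."""
    factors = list(factors)
    for h in hidden_vars:                        # pick hidden variables in some order
        mentioning = [f for f in factors if h in variables_of(f)]
        rest = [f for f in factors if h not in variables_of(f)]
        if not mentioning:                       # h appears in no factor; nothing to do
            continue
        joined = mentioning[0]
        for f in mentioning[1:]:                 # join all factors mentioning h
            joined = join(joined, f)
        factors = rest + [sum_out(joined, h)]    # project h out
    final = factors[0]
    for f in factors[1:]:                        # join the remaining factors
        final = join(final, f)
    z = sum(final.values())
    return {a: p / z for a, p in final.items()}  # normalize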

Variable Elimination
 What you need to know:
 VE caches intermediate computations
 Polynomial time for tree-structured graphs!
 Saves time by marginalizing variables as soon as possible rather than at the end
 Approximations:
 Exact inference is slow, especially when you have a lot of hidden nodes
 Approximate methods give you a (close) answer, faster

Sampling
 Basic idea:
 Draw N samples from a sampling distribution S
 Compute an approximate posterior probability
 Show this converges to the true probability P
 Outline:
 Sampling from an empty network
 Rejection sampling: reject samples disagreeing with evidence
 Likelihood weighting: use evidence to weight samples
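
Every method in this outline rests on one primitive: drawing a value from a discrete distribution, e.g. one row of a CPT. A minimal sketch, with an illustrative distribution:

import random

def sample_from(distribution):
    """Draw a value from a dict {value: probability} using one uniform random number."""
    r = random.random()
    cumulative = 0.0
    for value, p in distribution.items():
        cumulative += p
        if r <= cumulative:
            return value
    return value  # guard against floating-point round-off

print(sample_from({"+c": 0.5, "-c": 0.5}))   # e.g., one draw of Cloudy in the sprinkler example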

Prior Sampling
(Figure: the Cloudy / Sprinkler / Rain / WetGrass network and its CPTs)
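
A sketch of prior sampling on the sprinkler network: sample the variables in topological order, each from its CPT given its already-sampled parents. The CPT numbers below are illustrative stand-ins, not necessarily the ones on the slide.

import random

def bernoulli(p):
    """Return True with probability p."""
    return random.random() < p

def prior_sample():
    """One sample (c, s, r, w), drawn topologically; CPT numbers are illustrative."""
    c = bernoulli(0.5)                          # P(+c)
    s = bernoulli(0.1 if c else 0.5)            # P(+s | c)
    r = bernoulli(0.8 if c else 0.2)            # P(+r | c)
    w = bernoulli({(True, True): 0.99, (True, False): 0.90,
                   (False, True): 0.90, (False, False): 0.01}[(s, r)])  # P(+w | s, r)
    return c, s, r, w

samples = [prior_sample() for _ in range(10000)]
p_w = sum(w for _, _, _, w in samples) / len(samples)   # estimate of P(+w)
print(p_w)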

Prior Sampling  This process generates samples with probability …i.e. the BN’s joint probability  Let the number of samples of an event be  Then  I.e., the sampling procedure is consistent

Example  We’ll get a bunch of samples from the BN: c,  s, r, w c, s, r, w  c, s, r,  w c,  s, r, w  c, s,  r, w  If we want to know P(W)  We have counts  Normalize to get P(W) =  This will get closer to the true distribution with more samples  Can estimate anything else, too  What about P(C|  r)? P(C|  r,  w)? Cloudy Sprinkler Rain WetGrass C S R W

Rejection Sampling
 Let's say we want P(C)
 No point keeping all samples around
 Just tally counts of C outcomes
 Let's say we want P(C | s)
 Same thing: tally C outcomes, but ignore (reject) samples which don't have S=s
 This is rejection sampling
 It is also consistent (correct in the limit)
 Samples:
   c, ¬s, r, w
   c, s, r, w
   ¬c, s, r, ¬w
   c, ¬s, r, w
   ¬c, s, ¬r, w
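
A rejection-sampling sketch built on the same illustrative sprinkler network (names and CPT numbers are stand-ins):

import random

def prior_sample():
    """One topological sample from the sprinkler network (illustrative CPT numbers)."""
    c = random.random() < 0.5
    s = random.random() < (0.1 if c else 0.5)
    r = random.random() < (0.8 if c else 0.2)
    w = random.random() < {(True, True): 0.99, (True, False): 0.90,
                           (False, True): 0.90, (False, False): 0.01}[(s, r)]
    return {"C": c, "S": s, "R": r, "W": w}

def rejection_sample(query_var, evidence, n=10000):
    """Estimate P(query_var | evidence): tally query outcomes, rejecting
    samples that disagree with the evidence."""
    counts = {True: 0, False: 0}
    for _ in range(n):
        sample = prior_sample()
        if all(sample[var] == val for var, val in evidence.items()):
            counts[sample[query_var]] += 1
        # otherwise: reject the sample
    total = counts[True] + counts[False]
    return {val: cnt / total for val, cnt in counts.items()} if total else None

print(rejection_sample("C", {"S": True}))   # estimate of P(C | +s)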

Likelihood Weighting  Problem with rejection sampling:  If evidence is unlikely, you reject a lot of samples  You don’t exploit your evidence as you sample  Consider P(B|a)  Idea: fix evidence variables and sample the rest  Problem: sample distribution not consistent!  Solution: weight by probability of evidence given parents BurglaryAlarm BurglaryAlarm

Likelihood Sampling
(Figure: the Cloudy / Sprinkler / Rain / WetGrass network with evidence variables fixed and the rest sampled)

Likelihood Weighting  Sampling distribution if z sampled and e fixed evidence  Now, samples have weights  Together, weighted sampling distribution is consistent Cloudy Rain C S R W =

Likelihood Weighting  Note that likelihood weighting doesn’t solve all our problems  Rare evidence is taken into account for downstream variables, but not upstream ones  A better solution is Markov-chain Monte Carlo (MCMC), more advanced  We’ll return to sampling for robot localization and tracking in dynamic BNs Cloudy Rain C S R W

Summary
 Exact inference in graphical models is tractable only for polytrees.
 Variable elimination caches intermediate results to make exact inference more efficient for a given query.
 Approximate inference techniques use a variety of sampling methods and can scale to large models.
 NEXT: Adding dynamics and change to graphical models

Bayes Net for insurance