CHAPTER 7: BAYESIAN NETWORK INDEPENDENCE, BAYESIAN NETWORK INFERENCE, MACHINE LEARNING ISSUES

Review: Alarm Network

Causality?
When Bayesian Networks reflect the true causal patterns:
- Often simpler (nodes have fewer parents)
- Often easier to think about
- Often easier to elicit from experts
BNs need not actually be causal:
- Sometimes no causal net exists over the domain
- E.g. consider the variables Traffic and RoofDrips
- End up with arrows that reflect correlation, not causation
What do the arrows really mean?
- Topology may happen to encode causal structure
- Topology really encodes conditional independencies

Creating Bayes’ Nets
Last time: we talked about how any fixed Bayesian Network encodes a joint distribution
Today: how to represent a fixed distribution as a Bayesian Network
- Key ingredient: conditional independence
- The exercise we did in “causal” assembly of BNs was a kind of intuitive use of conditional independence
- Now we have to formalize the process
After that: how to answer queries (inference)
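As a concrete illustration of the point that a Bayes’ net encodes a joint distribution, here is a minimal Python sketch of the alarm network from the review slide; the CPT numbers are illustrative placeholders, not values taken from these slides. The joint factors as P(B, E, A, J, M) = P(B) P(E) P(A | B, E) P(J | A) P(M | A):

# The numbers below are illustrative placeholders, not from the slides.
P_B = {True: 0.001, False: 0.999}                  # P(Burglary)
P_E = {True: 0.002, False: 0.998}                  # P(Earthquake)
P_A = {(True, True): 0.95, (True, False): 0.94,    # P(Alarm=True | B, E)
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}                    # P(JohnCalls=True | A)
P_M = {True: 0.70, False: 0.01}                    # P(MaryCalls=True | A)

def joint(b, e, a, j, m):
    """Probability of one complete assignment, via the BN factorization."""
    p_a = P_A[(b, e)] if a else 1 - P_A[(b, e)]
    p_j = P_J[a] if j else 1 - P_J[a]
    p_m = P_M[a] if m else 1 - P_M[a]
    return P_B[b] * P_E[e] * p_a * p_j * p_m

# e.g. P(no burglary, no earthquake, alarm rings, John calls, Mary calls)
print(joint(False, False, True, True, True))

Summing this product over all 32 complete assignments gives 1, and any query about the five variables can in principle be answered from it, which is what the inference slides below exploit.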

Conditional Independence

Conditional Independence

Independence in a BN

Causal Chains

Common Cause

Common Effect
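The chain, common-cause, and common-effect patterns can be checked numerically. Here is a minimal sketch for the common-effect (v-structure) case, with made-up CPT numbers: the two causes are independent a priori but become dependent once their common effect is observed (“explaining away”).

import itertools

# Made-up numbers for a v-structure X -> Z <- Y.
P_X = {True: 0.3, False: 0.7}
P_Y = {True: 0.4, False: 0.6}
P_Z = {(True, True): 0.9, (True, False): 0.6,
       (False, True): 0.5, (False, False): 0.1}   # P(Z=True | X, Y)

def joint(x, y, z):
    pz = P_Z[(x, y)] if z else 1 - P_Z[(x, y)]
    return P_X[x] * P_Y[y] * pz

def prob(pred):
    return sum(joint(x, y, z)
               for x, y, z in itertools.product([True, False], repeat=3)
               if pred(x, y, z))

# Marginal independence: P(X=1, Y=1) equals P(X=1) P(Y=1)
pxy = prob(lambda x, y, z: x and y)
px, py = prob(lambda x, y, z: x), prob(lambda x, y, z: y)
print(abs(pxy - px * py) < 1e-12)          # True

# Conditioned on Z=1: P(X=1, Y=1 | z) no longer equals P(X=1 | z) P(Y=1 | z)
pz1 = prob(lambda x, y, z: z)
pxy_z = prob(lambda x, y, z: x and y and z) / pz1
px_z = prob(lambda x, y, z: x and z) / pz1
py_z = prob(lambda x, y, z: y and z) / pz1
print(abs(pxy_z - px_z * py_z) > 1e-6)     # True: dependent given Z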

The General Case

Reachability

Reachability (the Bayes Ball)

Example

Inference

Reminder: Alarm Network

Atomic Inference

Inference by Enumeration
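A minimal sketch of inference by enumeration, on a made-up three-variable chain Rain → Traffic → Late: sum the full joint over the hidden variables consistent with the evidence, then normalize.

# Illustrative CPTs, not taken from these slides.
P_R = {True: 0.1, False: 0.9}   # P(Rain)
P_T = {True: 0.8, False: 0.1}   # P(Traffic=True | Rain)
P_L = {True: 0.7, False: 0.2}   # P(Late=True | Traffic)

def joint(r, t, l):
    pt = P_T[r] if t else 1 - P_T[r]
    pl = P_L[t] if l else 1 - P_L[t]
    return P_R[r] * pt * pl

def query_rain_given_late(l_obs):
    """P(Rain | Late=l_obs), summing out the hidden variable Traffic."""
    unnormalized = {}
    for r in (True, False):
        unnormalized[r] = sum(joint(r, t, l_obs) for t in (True, False))
    z = sum(unnormalized.values())
    return {r: p / z for r, p in unnormalized.items()}

print(query_rain_given_late(True))   # posterior over Rain given Late=True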

Evaluation Tree

Variable Elimination
Still lots of redundant work in the computation tree!
We can save time if we cache all partial results
This is the basic idea behind the variable elimination algorithm
- Compute and store factors over variables which represent results of intermediate computations
- All CPDs are factors, but not all factors are CPDs
- Thus not always “human interpretable”
Just improves efficiency, doesn’t improve worst-case time complexity
- Still exponential in the number of variables
That’s all we’ll expect you to know!
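Here is a minimal sketch of the two factor operations that variable elimination repeats, join (pointwise product) and summing a variable out; the factors and numbers are made up for illustration.

from itertools import product

# Made-up factors over Boolean variables, for illustration only.
def make_factor(variables, table):
    return {"vars": list(variables), "table": dict(table)}

def join(f, g):
    """Pointwise product of two factors over the union of their variables."""
    vs = f["vars"] + [v for v in g["vars"] if v not in f["vars"]]
    table = {}
    for assignment in product([True, False], repeat=len(vs)):
        a = dict(zip(vs, assignment))
        fk = tuple(a[v] for v in f["vars"])
        gk = tuple(a[v] for v in g["vars"])
        table[assignment] = f["table"][fk] * g["table"][gk]
    return {"vars": vs, "table": table}

def sum_out(var, f):
    """Marginalize one variable out of a factor (the 'elimination' step)."""
    keep = [v for v in f["vars"] if v != var]
    idx = f["vars"].index(var)
    table = {}
    for key, val in f["table"].items():
        new_key = tuple(v for i, v in enumerate(key) if i != idx)
        table[new_key] = table.get(new_key, 0.0) + val
    return {"vars": keep, "table": table}

# Example: join P(R) with P(T | R), then sum R out, to get P(T).
P_R = make_factor(["R"], {(True,): 0.1, (False,): 0.9})
P_T_given_R = make_factor(["R", "T"], {(True, True): 0.8, (True, False): 0.2,
                                       (False, True): 0.1, (False, False): 0.9})
print(sum_out("R", join(P_R, P_T_given_R))["table"])  # approx {(True,): 0.17, (False,): 0.83}

Eliminating variables one at a time keeps the intermediate factors small when the network is sparse, which is where the savings over plain enumeration come from.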

Classification

Tuning on Held-Out Data
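The body of this slide is not preserved in the transcript; the usual recipe is to try several values of a hyperparameter (for example, the smoothing strength in Naïve Bayes), keep whichever scores best on held-out data, and report accuracy only on a separate test set. A minimal sketch using scikit-learn, which is an assumption for illustration and not something these slides rely on:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy data, made up for illustration.
train_texts = ["win cash now", "meeting at noon", "cheap meds now", "lunch tomorrow?"]
train_labels = ["spam", "ham", "spam", "ham"]
heldout_texts = ["free cash", "see you at lunch"]
heldout_labels = ["spam", "ham"]

vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_texts)
X_heldout = vectorizer.transform(heldout_texts)

best_alpha, best_acc = None, -1.0
for alpha in [0.01, 0.1, 1.0, 10.0]:          # candidate smoothing strengths
    model = MultinomialNB(alpha=alpha).fit(X_train, train_labels)
    acc = model.score(X_heldout, heldout_labels)
    if acc > best_acc:
        best_alpha, best_acc = alpha, acc

print(best_alpha, best_acc)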

Confidences from a Classifier

Precision vs. Recall

Precision vs. Recall
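The body of these two slides is not preserved in the transcript. For reference, precision is the fraction of items the classifier flags as positive that really are positive, and recall is the fraction of truly positive items that get flagged; raising one typically lowers the other. A minimal helper, where treating “spam” as the positive class is just an example:

def precision_recall(predicted, gold, positive="spam"):
    # "spam" as the positive class is an assumed example.
    tp = sum(p == positive and g == positive for p, g in zip(predicted, gold))
    fp = sum(p == positive and g != positive for p, g in zip(predicted, gold))
    fn = sum(p != positive and g == positive for p, g in zip(predicted, gold))
    precision = tp / (tp + fp) if tp + fp else 0.0   # of flagged items, how many were right
    recall = tp / (tp + fn) if tp + fn else 0.0      # of true positives, how many were found
    return precision, recall

print(precision_recall(["spam", "spam", "ham"], ["spam", "ham", "spam"]))  # (0.5, 0.5)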

Errors, and What to Do

What to Do About Errors?
Need more features: words aren’t enough!
- Have you emailed the sender before?
- Have 1K other people just gotten the same email?
- Is the sending information consistent?
- Is the email in ALL CAPS?
- Do inline URLs point where they say they point?
- Does the email address you by (your) name?
Naïve Bayes models can incorporate a variety of features, but tend to do best in homogeneous cases (e.g. all features are word occurrences)

Features
A feature is a function which signals a property of the input
Examples:
- ALL_CAPS: value is 1 iff the email is in all caps
- HAS_URL: value is 1 iff the email contains a URL
- NUM_URLS: number of URLs in the email
- VERY_LONG: 1 iff the email is longer than 1K
- SUSPICIOUS_SENDER: 1 iff the reply-to domain doesn’t match the originating server
Features are anything you can write code to evaluate on an input
- Some cheap, some very expensive to calculate
- Can even be the output of another classifier
- Domain knowledge goes here!
In Naïve Bayes, how did we encode features?
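A minimal sketch of feature extractors along the lines named above; the URL regex and the exact thresholds are assumptions made for illustration:

import re

def extract_features(email_text, reply_to_domain=None, sender_domain=None):
    # The regex and the 1K length threshold are illustrative assumptions.
    urls = re.findall(r"https?://\S+", email_text)
    letters = [c for c in email_text if c.isalpha()]
    return {
        "ALL_CAPS": int(bool(letters) and all(c.isupper() for c in letters)),
        "HAS_URL": int(bool(urls)),
        "NUM_URLS": len(urls),
        "VERY_LONG": int(len(email_text) > 1000),
        "SUSPICIOUS_SENDER": int(reply_to_domain is not None
                                 and reply_to_domain != sender_domain),
    }

print(extract_features("CLICK http://example.com NOW",
                       reply_to_domain="evil.example", sender_domain="bank.example"))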

Feature Extractors

Generative vs. Discriminative
Generative classifiers:
- E.g. Naïve Bayes
- We build a causal model of the variables
- We then query that model for causes, given evidence
Discriminative classifiers:
- E.g. Perceptron (next)
- No causal model, no Bayes rule, often no probabilities
- Try to predict output directly
- Loosely: mistake driven rather than model driven
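A minimal sketch of the “mistake driven” idea: a perceptron only changes its weights when its current prediction is wrong. The perceptron itself is covered later; the tiny data set here is made up.

def perceptron_train(examples, num_passes=10):
    """examples: list of (feature_dict, label) with label in {+1, -1}."""
    w = {}
    for _ in range(num_passes):
        for features, label in examples:
            score = sum(w.get(f, 0.0) * v for f, v in features.items())
            predicted = 1 if score >= 0 else -1
            if predicted != label:                     # update only on mistakes
                for f, v in features.items():
                    w[f] = w.get(f, 0.0) + label * v
    return w

# Made-up data: one "spam-like" and one "ham-like" example.
data = [({"free": 1, "cash": 1}, +1), ({"meeting": 1, "noon": 1}, -1)]
print(perceptron_train(data))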

Some (Vague) Biology