Bayesian Networks
Lecture 1: Basics and Knowledge-Based Construction
Lecture 1 is based on David Heckerman’s tutorial slides (Microsoft Research).
Requirements: 50% homework; 50% exam or a project
What I hope you will get out of this course...
• What are Bayesian networks?
• Why do we use them?
• How do we build them by hand?
• How do we build them from data?
• What are some applications?
• What is their relationship to other models?
• What are the properties of conditional independence that make these models appropriate?
• Usage in genetic linkage analysis
Applications of hand-built Bayes Nets
• Answer Wizard 95, Office Assistant 97, 2000
• Troubleshooters in Windows 98
• Lymph node pathology
• Trauma care
• NASA mission control
Some Applications of learned Bayes Nets
• Clustering users on the web (MSNBC)
• Classifying text (spam filtering)
Some factors that support intelligence
• Knowledge representation
• Reasoning
• Learning / adapting
Artificial Intelligence
Artificial Intelligence is better than none !
Artificial Intelligence is better than ours !
Outline for today
• Basics
• Knowledge-based construction
• Probabilistic inference
• Applications of hand-built BNs at Microsoft
Bayesian Networks: History
• 1920s: Wright -- analysis of crop failure
• 1950s: I.J. Good -- causality
• Early 1980s: Howard and Matheson, Pearl
• Other names:
  - directed acyclic graphical (DAG) models
  - belief networks
  - causal networks
  - probabilistic networks
  - influence diagrams
  - knowledge maps
Bayesian Network
A directed acyclic graph, annotated with probability distributions.
[Figure: nodes Fuel, Battery, FuelGauge, Engine Turns Over, and Start, annotated with p(f), p(b), p(t|b), p(g|f,b), and p(s|f,t)]
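BN structure: Definition
Missing arcs encode independencies such that
$p(x_1, \dots, x_n) = \prod_{i=1}^{n} p(x_i \mid \mathrm{pa}_i)$   (*)
where pa_i denotes the parents of x_i in the graph.
[Figure: the Fuel, Battery, FuelGauge, Engine Turns Over, Start network, annotated with p(f), p(b), p(t|b), p(g|f,b), and p(s|f,t)]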
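As a rough illustration of what the missing arcs buy, the sketch below writes the car network's structure as a parent map, prints the factorization it implies, and counts free parameters. The single-letter variable names follow the slide's p(f), p(b), ... notation; the helper names and the assumption that every variable is binary are ours, not the slides'.

```python
# Parent map for the car network, read off the local distributions
# p(f), p(b), p(t|b), p(g|f,b), p(s|f,t) shown on the slide.
PARENTS = {"f": [], "b": [], "t": ["b"], "g": ["f", "b"], "s": ["f", "t"]}


def factorization(parents):
    """Render the product of local distributions implied by a parent map."""
    terms = []
    for var, pa in parents.items():
        terms.append(f"p({var}|{','.join(pa)})" if pa else f"p({var})")
    return " ".join(terms)


def free_parameters(parents, arity=2):
    """Count free parameters, assuming every variable has `arity` values."""
    return sum((arity - 1) * arity ** len(pa) for pa in parents.values())


# An unrestricted joint over five binary variables, for comparison.
FULL_JOINT_PARAMETERS = 2 ** len(PARENTS) - 1

if __name__ == "__main__":
    print(factorization(PARENTS))
    print(free_parameters(PARENTS), "vs", FULL_JOINT_PARAMETERS)
```

Under the binary assumption the factored form needs 1 + 1 + 2 + 4 + 4 = 12 numbers, against 31 for an unrestricted joint over the same five variables.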
Independencies in a Bayes net
Many other independencies are entailed by (*): they can be read from the graph using d-separation (Pearl).
Example:
Explaining Away and Induced Dependencies
[Figure: Fuel, TurnOver, and Start, illustrating "explaining away" and "induced dependencies"]
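A small numerical sketch can make both effects concrete. It assumes the structure Fuel -> Start <- TurnOver and takes p(Start | TurnOver, Fuel) from the local-distribution table in these slides (0.99 when the engine turns over and the tank is not empty, 0 otherwise). The two priors are hypothetical values chosen only for illustration.

```python
from itertools import product

P_FUEL_EMPTY = 0.05  # hypothetical prior, not from the slides
P_TURNOVER = 0.90    # hypothetical prior, not from the slides


def p_start(turnover, fuel_empty):
    """p(Start=yes | TurnOver, Fuel), values taken from the slides' table."""
    return 0.99 if (turnover and not fuel_empty) else 0.0


def joint(fuel_empty, turnover, start):
    pf = P_FUEL_EMPTY if fuel_empty else 1 - P_FUEL_EMPTY
    pt = P_TURNOVER if turnover else 1 - P_TURNOVER
    ps = p_start(turnover, fuel_empty)
    return pf * pt * (ps if start else 1 - ps)


def prob_fuel_empty(**evidence):
    """p(fuel_empty=True | evidence), by enumerating all eight worlds."""
    num = den = 0.0
    for fe, t, s in product([True, False], repeat=3):
        world = {"fuel_empty": fe, "turnover": t, "start": s}
        if any(world[k] != v for k, v in evidence.items()):
            continue
        p = joint(fe, t, s)
        den += p
        if fe:
            num += p
    return num / den


if __name__ == "__main__":
    print(prob_fuel_empty())                             # prior
    print(prob_fuel_empty(turnover=False))               # unchanged: marginally independent
    print(prob_fuel_empty(start=False))                  # rises: the car did not start
    print(prob_fuel_empty(start=False, turnover=False))  # back to the prior: explained away
```

With these numbers, observing that the car does not start raises the belief in an empty tank. Learning in addition that the engine does not turn over returns that belief to its prior, because the failure is already explained. Fuel and TurnOver are independent until Start is observed, which is the induced dependency.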
Local distributions
Table: p(Start | TurnOver, Fuel), with Fuel (empty, not), TurnOver (yes, no), Start (yes, no)

T    F    p(S=y | T, F)
n    e    0.0
n    n    0.0
y    e    0.0
y    n    0.99
Local distributions
Tree: p(Start | TurnOver, Fuel), with Fuel (empty, not), TurnOver (yes, no), Start (yes, no)

TurnOver
  no  -> p(start) = 0
  yes -> Fuel
           empty     -> p(start) = 0
           not empty -> p(start) = 0.99
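The table and the tree are two encodings of the same local distribution. A minimal sketch of both, using the values from the slides; the function and constant names are illustrative only.

```python
# Table form: one entry per parent configuration (TurnOver, Fuel).
# Keys use the slides' abbreviations: y/n for TurnOver, e/n for Fuel.
START_TABLE = {
    ("n", "e"): 0.0,
    ("n", "n"): 0.0,
    ("y", "e"): 0.0,
    ("y", "n"): 0.99,
}


def start_tree(turnover, fuel):
    """Tree form: split on TurnOver first, then on Fuel only if needed."""
    if turnover == "n":
        return 0.0  # engine does not turn over, so Fuel is irrelevant
    if fuel == "e":
        return 0.0  # turns over, but the tank is empty
    return 0.99     # turns over and the tank is not empty


if __name__ == "__main__":
    for (t, f), p in START_TABLE.items():
        print(t, f, p, start_tree(t, f))
```

The tree needs three leaves where the table needs four rows, because the Fuel split is skipped whenever TurnOver is "n".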
Lots of possibilities for a local distribution...
• y = discrete node: any probabilistic classifier
  - Decision tree
  - Neural net
• y = continuous node: any probabilistic regression model
  - Linear regression with Gaussian noise
  - Neural net
[Figure labels: node, parents]
Naïve Bayes Classifier
[Figure: a discrete Class node with arcs to Input 1, Input 2, ..., Input n]
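Read as a Bayes net, this structure says that the inputs are independent given the class, so the class posterior is proportional to p(class) times the product of p(input_i | class). The sketch below applies that rule; the two-feature spam example and all of its numbers are hypothetical.

```python
def naive_bayes_posterior(prior, likelihoods, observed):
    """Return p(class | observed inputs) under the naive Bayes factorization.

    prior:       {class: p(class)}
    likelihoods: {class: [ {value: p(input_i = value | class)}, ... ]}
    observed:    sequence of observed input values, one per input
    """
    scores = {}
    for c, p_c in prior.items():
        score = p_c
        for i, value in enumerate(observed):
            score *= likelihoods[c][i][value]
        scores[c] = score
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}


# Hypothetical example: input 1 = message contains "free",
# input 2 = message contains "meeting".
PRIOR = {"spam": 0.4, "ham": 0.6}
LIKELIHOODS = {
    "spam": [{True: 0.6, False: 0.4}, {True: 0.05, False: 0.95}],
    "ham": [{True: 0.1, False: 0.9}, {True: 0.3, False: 0.7}],
}

if __name__ == "__main__":
    print(naive_bayes_posterior(PRIOR, LIKELIHOODS, (True, False)))
```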
Hidden Markov Model
[Figure: a chain of discrete, hidden nodes H1, H2, H3, H4, H5, ..., each with an observation node X1, X2, X3, X4, X5, ...]
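The chain structure is what makes inference in this model cheap. As a sketch, the forward recursion below computes the probability of an observation sequence in time linear in its length, and a brute-force sum over all hidden paths is included as a check. The two-state model and its numbers are hypothetical, and the functions assume a non-empty observation sequence.

```python
from itertools import product


def forward_likelihood(init, trans, emit, observations):
    """p(x_1, ..., x_T) via the forward recursion over hidden states."""
    states = list(init)
    alpha = {h: init[h] * emit[h][observations[0]] for h in states}
    for x in observations[1:]:
        alpha = {
            h: emit[h][x] * sum(alpha[g] * trans[g][h] for g in states)
            for h in states
        }
    return sum(alpha.values())


def brute_force_likelihood(init, trans, emit, observations):
    """Same quantity, summing the joint over every hidden path."""
    states = list(init)
    total = 0.0
    for path in product(states, repeat=len(observations)):
        p = init[path[0]] * emit[path[0]][observations[0]]
        for prev, cur, x in zip(path, path[1:], observations[1:]):
            p *= trans[prev][cur] * emit[cur][x]
        total += p
    return total


# Hypothetical two-state model, for illustration only.
INIT = {"rainy": 0.6, "sunny": 0.4}
TRANS = {
    "rainy": {"rainy": 0.7, "sunny": 0.3},
    "sunny": {"rainy": 0.4, "sunny": 0.6},
}
EMIT = {
    "rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
    "sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1},
}
OBS = ["walk", "shop", "clean"]

if __name__ == "__main__":
    print(forward_likelihood(INIT, TRANS, EMIT, OBS))
    print(brute_force_likelihood(INIT, TRANS, EMIT, OBS))
```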
Feed-Forward Neural Network
[Figure: input nodes X, a hidden layer, and binary output nodes Y1, Y2, Y3; local distributions labelled "sigmoid"]
Outline
• Basics
• Knowledge-based construction
• Probabilistic inference
• Decision making
• Applications of hand-built BNs at Microsoft
Building a Bayes net by hand (ok, now we're starting to be Bayesian)
• Define variables
• Assess the structure
• Assess the local probability distributions
What is a variable?
• Collectively exhaustive, mutually exclusive values
[Figure: a variable with the values "Error Occurred" and "No Error"]
Clarity Test: Is the variable knowable in principle?
• Is it raining? {Where, when, how many inches?}
• Is it hot? {T ≥ 100F, T < 100F}
• Is user’s personality dominant or submissive? {numerical result of standardized personality test}
Assessing structure (one approach)
• Choose an ordering for the variables
• For each variable, identify parents Pa_i such that
  $p(x_i \mid x_1, \dots, x_{i-1}) = p(x_i \mid \mathrm{pa}_i)$
Example (variables: Fuel, Battery, TurnOver, Gauge, Start)
p(f)
p(b|f) = p(b)
p(t|b,f) = p(t|b)
p(g|f,b,t) = p(g|f,b)
p(s|f,b,t,g) = p(s|f,t)
p(f,b,t,g,s) = p(f) p(b) p(t|b) p(g|f,b) p(s|f,t)
Why is this the wrong way?
Variable order can be critical.
[Figure: network over Battery, TurnOver, Start, Fuel, Gauge]
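To see the order sensitivity numerically, the sketch below runs the ordering procedure by brute force: for each variable it searches for the smallest subset of its predecessors that leaves its conditional distribution unchanged. All probabilities are hypothetical except p(Start | TurnOver, Fuel), which comes from the slides. The non-causal ordering used in the demonstration is our own choice, and `find_parents` is an illustrative helper, not a method from the slides.

```python
from itertools import combinations, product

NAMES = ["fuel", "battery", "turnover", "gauge", "start"]

# Each entry is p(variable=True | parents). Hypothetical values, except
# P_START, which follows the slides (fuel=True means "not empty").
P_FUEL = 0.95
P_BATTERY = 0.90
P_TURNOVER = {True: 0.97, False: 0.02}                 # keyed by battery
P_GAUGE = {(True, True): 0.96, (True, False): 0.10,
           (False, True): 0.04, (False, False): 0.01}  # keyed by (fuel, battery)
P_START = {(True, True): 0.99, (True, False): 0.0,
           (False, True): 0.0, (False, False): 0.0}    # keyed by (fuel, turnover)


def bern(p_true, value):
    return p_true if value else 1.0 - p_true


def joint(w):
    return (bern(P_FUEL, w["fuel"]) * bern(P_BATTERY, w["battery"])
            * bern(P_TURNOVER[w["battery"]], w["turnover"])
            * bern(P_GAUGE[(w["fuel"], w["battery"])], w["gauge"])
            * bern(P_START[(w["fuel"], w["turnover"])], w["start"]))


WORLDS = [dict(zip(NAMES, v)) for v in product([True, False], repeat=5)]


def prob(assignment):
    """Probability of a partial assignment, by summing the joint."""
    return sum(joint(w) for w in WORLDS
               if all(w[k] == v for k, v in assignment.items()))


def conditional(var, given):
    """p(var=True | given), or None if the conditioning event is impossible."""
    den = prob(given)
    return None if den == 0 else prob({**given, var: True}) / den


def is_sufficient(var, subset, contexts, tol):
    """True if conditioning on `subset` matches conditioning on all predecessors."""
    for ctx in contexts:
        full = conditional(var, ctx)
        if full is None:
            continue  # skip impossible predecessor configurations
        reduced = conditional(var, {k: ctx[k] for k in subset})
        if abs(full - reduced) > tol:
            return False
    return True


def find_parents(order, tol=1e-9):
    """For each variable, the smallest sufficient subset of its predecessors."""
    parents = {}
    for i, var in enumerate(order):
        preds = order[:i]
        contexts = [dict(zip(preds, v)) for v in product([True, False], repeat=i)]
        candidates = (set(c) for size in range(i + 1)
                      for c in combinations(preds, size))
        parents[var] = next(c for c in candidates
                            if is_sufficient(var, c, contexts, tol))
    return parents


if __name__ == "__main__":
    for order in (NAMES, ["battery", "turnover", "start", "fuel", "gauge"]):
        found = find_parents(order)
        print(order, found, sum(len(p) for p in found.values()), "arcs")
```

With these particular numbers the causal ordering recovers the five arcs of the original network, while the alternative ordering needs six, because Fuel must then take both TurnOver and Start as parents. Brute-force enumeration like this is only feasible for a handful of variables.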
A better way: Use causal knowledge
[Figure: network over Fuel, Gauge, Start, Battery, TurnOver]
Conditional Independence Simplifies Probabilistic Inference
[Figure: network over Fuel, Gauge, Battery, TurnOver, Start]
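One way to see the simplification is to compute p(fuel, start) twice: once by summing the full joint over Battery, TurnOver, and Gauge, and once by pushing each sum inside the factorization p(f) p(b) p(t|b) p(g|f,b) p(s|f,t). The sketch below does both. All numbers are hypothetical except p(Start | TurnOver, Fuel), which follows the slides, and the function names are illustrative.

```python
from itertools import product

# Each entry is p(variable=True | parents). Hypothetical values, except
# P_START, which follows the slides (fuel=True means "not empty").
P_FUEL = 0.95
P_BATTERY = 0.90
P_TURNOVER = {True: 0.97, False: 0.02}                 # keyed by battery
P_GAUGE = {(True, True): 0.96, (True, False): 0.10,
           (False, True): 0.04, (False, False): 0.01}  # keyed by (fuel, battery)
P_START = {(True, True): 0.99, (True, False): 0.0,
           (False, True): 0.0, (False, False): 0.0}    # keyed by (fuel, turnover)


def bern(p_true, value):
    return p_true if value else 1.0 - p_true


def joint(f, b, t, g, s):
    return (bern(P_FUEL, f) * bern(P_BATTERY, b) * bern(P_TURNOVER[b], t)
            * bern(P_GAUGE[(f, b)], g) * bern(P_START[(f, t)], s))


def p_fuel_start_naive(f, s):
    """Sum the full joint over battery, turnover, and gauge (8 terms)."""
    return sum(joint(f, b, t, g, s)
               for b, t, g in product((True, False), repeat=3))


def p_fuel_start_factored(f, s):
    """Same quantity with the sums pushed inside the factorization."""
    total = 0.0
    for b in (True, False):
        sum_t = sum(bern(P_TURNOVER[b], t) * bern(P_START[(f, t)], s)
                    for t in (True, False))
        # The sum over g of p(g|f,b) equals 1, so the gauge drops out entirely.
        total += bern(P_BATTERY, b) * sum_t
    return bern(P_FUEL, f) * total


def posterior_fuel_empty_given_no_start(p_fuel_start):
    """p(fuel empty | start = no), from any function computing p(fuel, start)."""
    num = p_fuel_start(False, False)
    return num / (num + p_fuel_start(True, False))


if __name__ == "__main__":
    print(posterior_fuel_empty_given_no_start(p_fuel_start_naive))
    print(posterior_fuel_empty_given_no_start(p_fuel_start_factored))
```

On five binary variables the saving is small, but the naive sum grows exponentially with the number of variables summed out, while the factored form grows only with the size of the largest intermediate term.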
Online Troubleshooters
Define Problem
Gather Information
Get Recommendations
Portion of BN for print troubleshooting (see Breese & Heckerman, 1996)
Office Assistant 97
Lumière Project
[Figure labels: User’s Goals, User’s Needs, User Activity]
(see Horvitz, Breese, Heckerman, Hovel & Rommelse 1998)
Studies with Human Subjects
• “Wizard of Oz” experiments at MS Usability Labs
[Figure labels: Expert Advisor, Inexperienced user, User Actions, Typed Advice]
Activities with Relevance to User’s Needs
Several classes of evidence:
• Search: e.g., menu surfing
• Introspection: e.g., sudden pause, slowing of command stream
• Focus of attention: e.g., selected objects
• Undesired effects: e.g., command/undo, dialogue opened and cancelled
• Inefficient command sequences
• Goal-specific sequences of actions
Summary so far
Bayes nets are useful because...
• They encode independence explicitly
  - More parsimonious models
  - Efficient inference
• They encode independence graphically
  - Easier explanation
  - Easier encoding
• They sometimes correspond to causal models
  - Easier explanation
  - Easier encoding
  - Modularity leads to easier maintenance
Teenage Bayes
MICRONEWS 97: Microsoft Researchers Exchange Brainpower with Eighth-grader
Teenager Designs Award-Winning Science Project ...
For her science project, which she called "Dr. Sigmund Microchip," Tovar wanted to create a computer program to diagnose the probability of certain personality types. With only answers from a few questions, the program was able to accurately diagnose the correct personality type 90 percent of the time.
Artificial Intelligence is a promising field: always was, always will be.