Probabilistic Reasoning
ECE457 Applied Artificial Intelligence
Spring 2007, Lecture #9
ECE457 Applied Artificial Intelligence, R. Khoury (2007)

Outline
- Bayesian networks
- D-separation and independence
- Inference
Reading: Russell & Norvig, sections 14.1 to 14.4
Recall the Story from FOL
Anyone passing their 457 exam and winning the lottery is happy. Anyone who studies or is lucky can pass all their exams. Bob did not study but is lucky. Anyone who's lucky can win the lottery. Is Bob happy?
Add Probabilities
Anyone passing their 457 exam and winning the lottery has a 99% chance of being happy. Anyone only passing their 457 exam has an 80% chance of being happy, someone only winning the lottery has a 60% chance, and someone who does neither has a 20% chance. Anyone who studies has a 90% chance of passing their exams. Anyone who's lucky has a 50% chance of passing their exams. Anyone who's both lucky and who studied has a 99% chance of passing, but someone who didn't study and is unlucky has a 1% chance of passing. There's a 20% chance that Bob studied, but a 75% chance that he'll be lucky. Anyone who's lucky has a 40% chance of winning the lottery, while an unlucky person only has a 1% chance of winning. What's the probability of Bob being happy?
Probabilities in the Story
Examples of probabilities in the story:
- P(Lucky) = 0.75
- P(Study) = 0.2
- P(PassExam|Study) = 0.9
- P(PassExam|Lucky) = 0.5
- P(Win|Lucky) = 0.4
- P(Happy|PassExam,Win) = 0.99
Some variables directly affect others! Can we build a graphical representation of the dependencies and conditional independencies between variables?
Bayesian Network
Also called a belief network:
- Directed acyclic graph
- Nodes represent variables
- Edges represent conditional relationships
- Concise representation of any full joint probability distribution
[Diagram: Lucky and Study point to PassExam; Lucky points to Win; PassExam and Win point to Happy]
Bayesian Network
Nodes with no parents have prior probabilities. Nodes with parents have conditional probability tables, covering all truth-value combinations of their parents.
[Diagram: the Study/Lucky/PassExam/Win/Happy network]
Bayesian Network
P(L) = 0.75    P(S) = 0.2

 L | P(W)
 F | 0.01   = P(W|¬L)
 T | 0.4    = P(W|L)

 L  S | P(E)
 F  F | 0.01   = P(E|¬L,¬S)
 T  F | 0.5    = P(E|L,¬S)
 F  T | 0.9    = P(E|¬L,S)
 T  T | 0.99   = P(E|L,S)

 W  E | P(H) | P(¬H)
 F  F | 0.2  | 0.8
 T  F | 0.6  | 0.4
 F  T | 0.8  | 0.2
 T  T | 0.99 | 0.01
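The priors and CPTs above are small enough to encode directly; here is a minimal sketch in Python (the dictionary layout and variable names are my own choices, not from the slides):

```python
# Priors and conditional probability tables from the network.
# Each CPT maps the parents' truth values to the probability the child is True.
P_L = 0.75                       # P(Lucky)
P_S = 0.2                        # P(Study)
P_W = {True: 0.4, False: 0.01}   # P(Win=T | Lucky)
P_E = {(True, True): 0.99,       # P(PassExam=T | Lucky, Study)
       (True, False): 0.5,
       (False, True): 0.9,
       (False, False): 0.01}
P_H = {(True, True): 0.99,       # P(Happy=T | Win, PassExam)
       (True, False): 0.6,
       (False, True): 0.8,
       (False, False): 0.2}

# Probabilities of False are implicit: e.g. P(Win=F | Lucky=T) = 1 - P_W[True].
```

Storing only the True-entries keeps each table half the size; the complements are recovered as 1 - p when needed.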
Bayesian Network
[Diagram: a larger example network with nodes a through z]
Chain Rule
Recall the chain rule:
P(A,B) = P(A|B)P(B)
P(A,B,C) = P(A|B,C)P(B,C)
P(A,B,C) = P(A|B,C)P(B|C)P(C)
P(A1,A2,…,An) = P(A1|A2,…,An)P(A2|A3,…,An)…P(An-1|An)P(An)
P(A1,A2,…,An) = ∏i=1..n P(Ai|Ai+1,…,An)
Chain Rule
If we know the value of a node's parents, we don't care about more distant ancestors; their influence is included through the parents. A node is conditionally independent of its predecessors given its parents. More generally, a node is conditionally independent of its non-descendants given its parents.
Updated chain rule:
P(A1,A2,…,An) = ∏i=1..n P(Ai|parents(Ai))
Chain Rule Example
Probability that Bob is happy, won the lottery and passed his exam, and is lucky but did not study:
P(H,W,E,L,¬S) = P(H|W,E) × P(W|L) × P(E|L,¬S) × P(L) × P(¬S)
P(H,W,E,L,¬S) = 0.99 × 0.4 × 0.5 × 0.75 × 0.8
P(H,W,E,L,¬S) = 0.1188 ≈ 0.12
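The factored product can be checked mechanically; a small sketch (the `joint` helper and its argument order are my own, not part of the slides):

```python
# CPTs from the slides: P(Win|Lucky), P(PassExam|Lucky,Study), P(Happy|Win,PassExam).
P_W = {True: 0.4, False: 0.01}
P_E = {(True, True): 0.99, (True, False): 0.5,
       (False, True): 0.9, (False, False): 0.01}
P_H = {(True, True): 0.99, (True, False): 0.6,
       (False, True): 0.8, (False, False): 0.2}

def joint(h, w, e, l, s):
    """P(H=h, W=w, E=e, L=l, S=s) as the product of the network's factors."""
    p = 0.75 if l else 0.25                      # P(L)
    p *= 0.2 if s else 0.8                       # P(S)
    p *= P_E[(l, s)] if e else 1 - P_E[(l, s)]   # P(E | L, S)
    p *= P_W[l] if w else 1 - P_W[l]             # P(W | L)
    p *= P_H[(w, e)] if h else 1 - P_H[(w, e)]   # P(H | W, E)
    return p

# P(H, W, E, L, ¬S) = 0.99 * 0.4 * 0.5 * 0.75 * 0.8
print(round(joint(True, True, True, True, False), 4))  # 0.1188
```

A useful sanity check on any such factorization is that the joint sums to 1 over all 32 truth assignments.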
Constructing Bayesian Nets
Build from the top down:
- Start with root nodes
- Add children
- Go down to the leaves
[Diagram: the Study/Lucky/PassExam/Win/Happy network]
Constructing Bayesian Nets
What happens if we build with the wrong node order? The network becomes needlessly complicated. Node ordering is important!
[Diagram: the same variables connected in a needlessly complicated order]
Connections
We can understand dependence in a network by considering how evidence is transmitted through it. Information entered at one node propagates to descendants and ancestors through connected nodes, provided no node on the path already has evidence (in which case the propagation stops there).
Serial Connection
Study and Happy are dependent. Study and Happy are independent given PassExam. Intuitively, along the chain Study → PassExam → Happy, the only way Study can affect Happy is through PassExam.
[Diagram: the chain Study → PassExam → Happy highlighted in the network]
Converging Connection
Lucky and Study are independent. Lucky and Study are dependent given PassExam. Intuitively, Lucky can be used to explain away Study.
[Diagram: the converging connection Lucky → PassExam ← Study highlighted in the network]
Diverging Connection
Win and PassExam are dependent. Win and PassExam are independent given Lucky. Intuitively, Lucky can explain both Win and PassExam; Win and PassExam can affect each other by changing the belief in Lucky.
[Diagram: the diverging connection Win ← Lucky → PassExam highlighted in the network]
D-Separation
Determines whether two variables are independent given some other variables. X is independent of Y given Z if X and Y are d-separated given Z. X is d-separated from Y if, for every (undirected) path between X and Y, there exists a node Z on the path for which either:
- The connection at Z is serial or diverging and there is evidence for Z, or
- The connection at Z is converging and there is no evidence for Z or any of its descendants.
D-Separation
- Serial connection X → Z → Y: Z blocks the path if Z is in evidence.
- Diverging connection X ← Z → Y: Z blocks the path if Z is in evidence.
- Converging connection X → Z ← Y: Z blocks the path if neither Z nor any of its descendants is in evidence.
D-Separation
Can be computed in linear time using a depth-first-search algorithm: a fast way to know whether two nodes are independent. It lets us infer whether learning the value of one variable might give us information about another, given what we already know. All d-separated variables are independent, but not all independent variables are d-separated.
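For small networks, the definition can also be applied directly by enumerating undirected paths and checking the blocking rules at each intermediate node. A sketch (the child-list graph encoding is an assumption of mine; the linear-time algorithm mentioned above works by reachability instead of path enumeration):

```python
def descendants(graph, node):
    """All nodes reachable from node along directed edges."""
    seen, stack = set(), [node]
    while stack:
        for child in graph[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen

def d_separated(graph, x, y, evidence):
    """True if every undirected path between x and y is blocked given evidence.

    graph maps each node to a list of its children; evidence is a set of
    observed nodes. Enumerates simple paths, so only suitable for small nets.
    """
    parents = {n: {p for p in graph if n in graph[p]} for n in graph}

    def blocked(path):
        for a, z, b in zip(path, path[1:], path[2:]):
            if a in parents[z] and b in parents[z]:   # converging connection at z
                if z not in evidence and not (descendants(graph, z) & evidence):
                    return True
            elif z in evidence:                       # serial or diverging at z
                return True
        return False

    def paths(node, path):
        if node == y:
            yield path
            return
        for nxt in set(graph[node]) | parents[node]:
            if nxt not in path:
                yield from paths(nxt, path + [nxt])

    return all(blocked(p) for p in paths(x, [x]))

net = {'Lucky': ['Win', 'PassExam'], 'Study': ['PassExam'],
       'Win': ['Happy'], 'PassExam': ['Happy'], 'Happy': []}
print(d_separated(net, 'Lucky', 'Study', set()))         # True: independent
print(d_separated(net, 'Lucky', 'Study', {'PassExam'}))  # False: explaining away
print(d_separated(net, 'Win', 'PassExam', {'Lucky'}))    # True: diverging, blocked
```

The three printed checks match the serial/converging/diverging examples from the earlier slides.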
D-Separation Exercise
If we observe a value for node g, what other nodes are updated? Nodes f, h and i.
If we observe a value for node a, what other nodes are updated? Nodes b, c, d, e, f.
[Diagram: example network with nodes a through j]
D-Separation Exercise
Given an observation of c, are nodes a and f independent? Yes.
Given an observation of i, are nodes g and j independent? No.
[Diagram: the same network with nodes a through j]
Other Independence Criteria
A node is conditionally independent of its non-descendants given its parents. Recall this from the updated chain rule.
[Diagram: a node's parents shaded in the large example network, with its non-descendants marked]
Other Independence Criteria
A node is conditionally independent of all others in the network given its parents, children, and children's parents. This set is called the node's Markov blanket.
[Diagram: a node's Markov blanket shaded in the large example network]
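Computing the blanket from a child-list representation is straightforward; a sketch (same hypothetical graph encoding as before):

```python
def markov_blanket(graph, node):
    """Parents, children, and the children's other parents of node.

    graph maps each node to a list of its children.
    """
    parents = {n: {p for p in graph if n in graph[p]} for n in graph}
    blanket = parents[node] | set(graph[node])
    for child in graph[node]:
        blanket |= parents[child]   # children's other parents
    blanket.discard(node)
    return blanket

net = {'Lucky': ['Win', 'PassExam'], 'Study': ['PassExam'],
       'Win': ['Happy'], 'PassExam': ['Happy'], 'Happy': []}
print(sorted(markov_blanket(net, 'PassExam')))  # ['Happy', 'Lucky', 'Study', 'Win']
```

In the five-node example every other node is in PassExam's blanket, but in a large network the blanket is typically a small fraction of the nodes.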
Inference in Bayesian Network
Compute the posterior probability of a query variable given an observed event, using
P(A1,A2,…,An) = ∏i=1..n P(Ai|parents(Ai))
- Observed evidence variables E = E1,…,Em
- Query variable X
- Between them: nonevidence (hidden) variables Y = Y1,…,Yl
The belief network's variables are X ∪ E ∪ Y.
Inference in Bayesian Network
We want P(X|E).
Recall Bayes' theorem: P(A|B) = P(A,B) / P(B), so
P(X|E) = α P(X,E)
Recall marginalization: P(Ai) = Σj P(Ai,Bj), so
P(X|E) = α ΣY P(X,E,Y)
Recall the chain rule: P(A1,A2,…,An) = ∏i=1..n P(Ai|parents(Ai)), so
P(X|E) = α ΣY ∏A∈{X}∪E∪Y P(A|parents(A))
Inference Example
P(L) = 0.75    P(S) = 0.2

 L | P(W)
 F | 0.01
 T | 0.4

 L  S | P(E)
 F  F | 0.01
 T  F | 0.5
 F  T | 0.9
 T  T | 0.99

 W  E | P(H)
 F  F | 0.2
 T  F | 0.6
 F  T | 0.8
 T  T | 0.99
Inference Example #1
With only the information in the network (and no observations), what's the probability that Bob won the lottery?
P(W) = Σl P(W,l)
P(W) = Σl P(W|l)P(l)
P(W) = P(W|L)P(L) + P(W|¬L)P(¬L)
P(W) = 0.4×0.75 + 0.01×0.25
P(W) = 0.3025
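The same marginalization in code, using the numbers from the network:

```python
# P(W) = P(W|L)P(L) + P(W|¬L)P(¬L)
p_w = 0.4 * 0.75 + 0.01 * 0.25
print(round(p_w, 4))  # 0.3025
```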
Inference Example #2
Given that we know that Bob is happy, what's the probability that Bob won the lottery? From the network, we know
P(h,e,w,s,l) = P(l)P(s)P(e|l,s)P(w|l)P(h|w,e)
We want to find
P(W|H) = α Σl Σs Σe P(l)P(s)P(e|l,s)P(W|l)P(H|W,e)
P(¬W|H) is also needed, to normalize.
Inference Example #2

 l  s  e | P(s) | P(l) | P(e|l,s) | P(W|l) | P(H|W,e) | Product
 F  F  F | 0.8  | 0.25 | 0.99     | 0.01   | 0.6      | 0.001188
 T  F  F | 0.8  | 0.75 | 0.5      | 0.4    | 0.6      | 0.072
 F  T  F | 0.2  | 0.25 | 0.1      | 0.01   | 0.6      | 0.00003
 T  T  F | 0.2  | 0.75 | 0.01     | 0.4    | 0.6      | 0.00036
 F  F  T | 0.8  | 0.25 | 0.01     | 0.01   | 0.99     | 0.0000198
 T  F  T | 0.8  | 0.75 | 0.5      | 0.4    | 0.99     | 0.1188
 F  T  T | 0.2  | 0.25 | 0.9      | 0.01   | 0.99     | 0.0004455
 T  T  T | 0.2  | 0.75 | 0.99     | 0.4    | 0.99     | 0.058806

P(W|H) = α 0.2516493
Inference Example #2

 l  s  e | P(s) | P(l) | P(e|l,s) | P(¬W|l) | P(H|¬W,e) | Product
 F  F  F | 0.8  | 0.25 | 0.99     | 0.99    | 0.2       | 0.039204
 T  F  F | 0.8  | 0.75 | 0.5      | 0.6     | 0.2       | 0.036
 F  T  F | 0.2  | 0.25 | 0.1      | 0.99    | 0.2       | 0.00099
 T  T  F | 0.2  | 0.75 | 0.01     | 0.6     | 0.2       | 0.00018
 F  F  T | 0.8  | 0.25 | 0.01     | 0.99    | 0.8       | 0.001584
 T  F  T | 0.8  | 0.75 | 0.5      | 0.6     | 0.8       | 0.144
 F  T  T | 0.2  | 0.25 | 0.9      | 0.99    | 0.8       | 0.03564
 T  T  T | 0.2  | 0.75 | 0.99     | 0.6     | 0.8       | 0.07128

P(¬W|H) = α 0.328878
Inference Example #2
α = 1 / (0.2516493 + 0.328878) ≈ 1.7226
P(W|H) = α × 0.2516493 ≈ 0.4335
P(¬W|H) = α × 0.328878 ≈ 0.5665
Note that P(¬W|H) > P(W|H), because P(¬W|¬L) ≫ P(W|¬L). Still, the probability of Bob having won the lottery has increased by 13.1 percentage points (from 0.3025 to 0.4335) thanks to our knowledge that he is happy!
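The whole of example #2 can be reproduced by enumeration; a sketch (the `joint` helper and its argument order are my own, not from the slides):

```python
from itertools import product

# CPTs from the slides (True-entries; complements are 1 - p).
P_W = {True: 0.4, False: 0.01}
P_E = {(True, True): 0.99, (True, False): 0.5,
       (False, True): 0.9, (False, False): 0.01}
P_H = {(True, True): 0.99, (True, False): 0.6,
       (False, True): 0.8, (False, False): 0.2}

def joint(h, w, e, l, s):
    """P(H=h, W=w, E=e, L=l, S=s) as the product of the network's factors."""
    p = (0.75 if l else 0.25) * (0.2 if s else 0.8)
    p *= P_E[(l, s)] if e else 1 - P_E[(l, s)]
    p *= P_W[l] if w else 1 - P_W[l]
    return p * (P_H[(w, e)] if h else 1 - P_H[(w, e)])

# Sum out the hidden variables L, S, E with evidence H = True, then normalize.
unnorm = {w: sum(joint(True, w, e, l, s)
                 for l, s, e in product([True, False], repeat=3))
          for w in (True, False)}
alpha = 1 / (unnorm[True] + unnorm[False])
print(round(unnorm[True], 7))           # 0.2516493
print(round(unnorm[False], 6))          # 0.328878
print(round(alpha * unnorm[True], 4))   # 0.4335
```

The two unnormalized sums match the row totals of the two tables, and normalizing recovers the posterior.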
Expert Systems
Bayesian networks can be used to implement expert systems: diagnostic systems that contain subject-specific knowledge. The knowledge (nodes, relationships, probabilities) is typically provided by human experts. The system observes evidence by asking the user questions, then infers the most likely conclusion.
Pathfinder
An expert system for medical diagnosis of lymph-node diseases, built on a very large Bayesian network:
- Over 60 diseases
- Over 100 features of lymph nodes
- Over 30 features for clinical information
It took a lot of work from medical experts:
- 8 hours to define features and diseases
- 35 hours to build the network topology
- 40 hours to assess the probabilities
Pathfinder
One node for each disease; this assumes the diseases are mutually exclusive and exhaustive. Such a large domain is hard to handle, so several small networks for individual diagnostic tasks were built separately, then combined into a single large network.
Pathfinder
Testing the network on 53 test cases (real diagnoses), its diagnostic accuracy was as good as a medical expert's.
Assumptions
Learning agent
Environment:
- Fully observable / Partially observable
- Deterministic / Strategic / Stochastic
- Sequential
- Static / Semi-dynamic
- Discrete / Continuous
- Single agent / Multi-agent
Assumptions Updated
We can handle a new combination!
- Fully observable & Deterministic: no uncertainty (map of Romania)
- Fully observable & Stochastic: games of chance (Monopoly, Backgammon)
- Partially observable & Deterministic: logic (Wumpus World)
- Partially observable & Stochastic: the new combination, handled with probabilistic reasoning