1
Reasoning under Uncertainty: Introduction to Graphical Models, Part 1 of 2
CIS 530 / 730 Artificial Intelligence, Lecture 27 of 42
William H. Hsu, Department of Computing and Information Sciences, Kansas State University
KSOL course page: http://snipurl.com/v9v3
Course web site: http://www.kddresearch.org/Courses/CIS730
Instructor home page: http://www.cis.ksu.edu/~bhsu
Reading for Next Class: Sections 14.3 – 14.5, p. 500 – 518, Russell & Norvig 2nd edition
2
Lecture Outline
  Reading for Next Class: Sections 14.3 – 14.5 (p. 500 – 518), R&N 2nd edition
  Last Class: Uncertainty, Chapter 13 (p. 462 – 489)
  Today: Graphical Models, Sections 14.1 – 14.2 (p. 492 – 499), R&N 2nd edition
  Coming Week: More Applied Probability, Graphical Models
3
Graphical Models of Probability
Example joint probability for the cancer network below:
  P(20s, Female, Low, Non-Smoker, No-Cancer, Negative, Negative)
    = P(20s) · P(Female) · P(Low | 20s) · P(Non-Smoker | 20s, Female) · P(No-Cancer | Low, Non-Smoker) · P(Negative | No-Cancer) · P(Negative | No-Cancer)
Conditional Independence
  X is conditionally independent (CI) of Y given Z iff P(X | Y, Z) = P(X | Z) for all values of X, Y, and Z
  Example: P(Thunder | Rain, Lightning) = P(Thunder | Lightning), i.e., Thunder ⫫ Rain | Lightning
Bayesian (Belief) Network
  Acyclic directed graph model B = (V, E, Θ) representing CI assertions; Θ denotes the conditional probability tables (CPTs)
  Vertices (nodes) V: denote events (each a random variable)
  Edges (arcs, links) E: denote conditional dependencies
Markov Condition for BBNs (Chain Rule):
  P(X_1, X_2, …, X_n) = ∏_{i=1}^{n} P(X_i | parents(X_i))
Example BBN: X1 = Age, X2 = Gender, X3 = Exposure-To-Toxins, X4 = Smoking, X5 = Cancer, X6 = Serum Calcium, X7 = Lung Tumor
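The chain-rule factorization can be read off mechanically once each node's CPT is given. Below is a minimal Python sketch (not the course's BNJ code): the graph structure follows the example BBN on this slide, but every CPT number is invented for illustration.

```python
# A minimal sketch of the Markov condition / chain rule:
#   P(X1, ..., X7) = product over i of P(Xi | parents(Xi))
# Structure follows the slide's example network; all CPT numbers are invented.

parents = {
    "Age": [], "Gender": [],
    "Exposure": ["Age"],
    "Smoking": ["Age", "Gender"],
    "Cancer": ["Exposure", "Smoking"],
    "SerumCalcium": ["Cancer"],
    "LungTumor": ["Cancer"],
}

cpt = {
    "Age":      {(): {"20s": 0.3, "other": 0.7}},
    "Gender":   {(): {"Female": 0.5, "Male": 0.5}},
    "Exposure": {("20s",):   {"Low": 0.9, "High": 0.1},
                 ("other",): {"Low": 0.7, "High": 0.3}},
    "Smoking":  {("20s", "Female"):   {"NonSmoker": 0.8, "Smoker": 0.2},
                 ("20s", "Male"):     {"NonSmoker": 0.7, "Smoker": 0.3},
                 ("other", "Female"): {"NonSmoker": 0.75, "Smoker": 0.25},
                 ("other", "Male"):   {"NonSmoker": 0.6, "Smoker": 0.4}},
    "Cancer":   {("Low", "NonSmoker"):  {"No": 0.99, "Yes": 0.01},
                 ("Low", "Smoker"):     {"No": 0.90, "Yes": 0.10},
                 ("High", "NonSmoker"): {"No": 0.95, "Yes": 0.05},
                 ("High", "Smoker"):    {"No": 0.80, "Yes": 0.20}},
    "SerumCalcium": {("No",):  {"Negative": 0.95, "Positive": 0.05},
                     ("Yes",): {"Negative": 0.30, "Positive": 0.70}},
    "LungTumor":    {("No",):  {"Negative": 0.98, "Positive": 0.02},
                     ("Yes",): {"Negative": 0.20, "Positive": 0.80}},
}

def joint(assignment):
    """Chain rule: multiply one local conditional P(Xi | parents(Xi)) per node."""
    prob = 1.0
    for node, pa in parents.items():
        parent_values = tuple(assignment[p] for p in pa)
        prob *= cpt[node][parent_values][assignment[node]]
    return prob

# The joint probability written out on the slide:
print(joint({"Age": "20s", "Gender": "Female", "Exposure": "Low",
             "Smoking": "NonSmoker", "Cancer": "No",
             "SerumCalcium": "Negative", "LungTumor": "Negative"}))
```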
4
Semantics of Bayesian Networks
5
Markov Blanket
6
Constructing Bayesian Networks: Chain Rule of Inference
7
Evidential Reasoning: Example – Car Diagnosis
8
BNJ Visualization [1]: Pseudo-Code Annotation (Code Page) – ALARM Network
© 2004 KSU BNJ Development Team
9
BNJ Visualization [2]: Poker Network
© 2004 KSU BNJ Development Team
10
Graphical Models Overview [2]: Markov Blankets & d-Separation
Motivation: the conditional independence status of nodes within a BBN may change as the availability of evidence E changes.
Direction-dependent separation (d-separation) is a technique used to determine conditional independence of nodes as evidence changes.
Definition: a set of evidence nodes E d-separates two sets of nodes X and Y if every undirected path from a node in X to a node in Y is blocked given E.
A path is blocked if it contains a node Z for which one of three conditions holds:
  (1) Z is a head-to-tail (chain) node on the path and Z ∈ E
  (2) Z is a tail-to-tail (fork) node on the path and Z ∈ E
  (3) Z is a head-to-head (collider) node on the path, and neither Z nor any descendant of Z is in E
Adapted from J. Schlabach (1996)
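As a concrete illustration, the following rough sketch (not from the course materials) enumerates undirected paths in the earlier cancer example network and applies the three blocking conditions; the parent sets are assumed from that slide, and plain path enumeration only scales to toy networks.

```python
# Sketch of the d-separation test via the three blocking conditions above.
parents = {
    "Age": [], "Gender": [],
    "Exposure": ["Age"], "Smoking": ["Age", "Gender"],
    "Cancer": ["Exposure", "Smoking"],
    "SerumCalcium": ["Cancer"], "LungTumor": ["Cancer"],
}
children = {v: [c for c, ps in parents.items() if v in ps] for v in parents}

def descendants(v):
    out, stack = set(), list(children[v])
    while stack:
        c = stack.pop()
        if c not in out:
            out.add(c)
            stack.extend(children[c])
    return out

def undirected_paths(x, y, path=None):
    path = path or [x]
    if path[-1] == y:
        yield path
        return
    for nbr in parents[path[-1]] + children[path[-1]]:
        if nbr not in path:
            yield from undirected_paths(x, y, path + [nbr])

def blocked(path, evidence):
    """True if some interior node Z satisfies blocking condition (1), (2), or (3)."""
    for a, z, b in zip(path, path[1:], path[2:]):
        if a in parents[z] and b in parents[z]:            # head-to-head at z
            if z not in evidence and not (descendants(z) & evidence):
                return True                                # condition (3)
        elif z in evidence:                                # chain or fork at z
            return True                                    # conditions (1) / (2)
    return False

def d_separated(x, y, evidence):
    return all(blocked(p, set(evidence)) for p in undirected_paths(x, y))

# Serum Calcium and Lung Tumor become independent once Cancer is observed:
print(d_separated("SerumCalcium", "LungTumor", {"Cancer"}))  # True
print(d_separated("SerumCalcium", "LungTumor", set()))       # False
```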
11
Graphical Models Overview [3]: Inference Problem
Multiply-connected case: exact and approximate inference are #P-complete
Adapted from slide © 2004 S. Russell & P. Norvig. Reused with permission.
12
Other Topics in Graphical Models [1]: Temporal Probabilistic Reasoning
Goal: estimate the belief state P(X_t | y_1, …, y_r), the state at time t given observations through time r
Filtering: r = t
  Intuition: infer current state from observations
  Applications: signal identification
  Variation: Viterbi algorithm
Prediction: r < t
  Intuition: infer future state
  Applications: prognostics
Smoothing: r > t
  Intuition: infer past hidden state
  Applications: signal enhancement
CF Tasks
  Plan recognition by smoothing
  Prediction cf. WebCANVAS – Cadez et al. (2000)
Adapted from Murphy (2001), Guo (2002)
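The filtering task (r = t) has a simple recursive form for hidden Markov models; a minimal sketch follows, with a two-state chain and invented transition, emission, and prior numbers (not taken from the course materials).

```python
# Forward filtering: recursively maintain P(X_t | y_1..y_t) for a toy 2-state HMM.
import numpy as np

transition = np.array([[0.7, 0.3],    # P(X_t = j | X_{t-1} = i), row i -> column j
                       [0.4, 0.6]])
emission = np.array([[0.9, 0.1],      # P(y_t = k | X_t = i), row i, column k
                     [0.2, 0.8]])
prior = np.array([0.5, 0.5])          # P(X_0)

def filter_states(observations):
    belief = prior
    history = []
    for y in observations:
        belief = emission[:, y] * (transition.T @ belief)  # predict, then weight by evidence
        belief = belief / belief.sum()                     # normalize to a distribution
        history.append(belief)
    return history

for t, b in enumerate(filter_states([0, 0, 1]), start=1):
    print(f"P(X_{t} | y_1..y_{t}) =", b)
```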
13
Other Topics in Graphical Models [2]: Learning Structure from Data
General-Case BN Structure Learning: Use Inference to Compute Scores
Optimal Strategy: Bayesian Model Averaging
  Assumption: models h ∈ H are mutually exclusive and exhaustive
  Combine predictions of models in proportion to marginal likelihood
  Compute conditional probability of hypothesis h given observed data D
  i.e., compute expectation over unknown h for unseen cases: P(x | D) = Σ_{h ∈ H} P(x | h) P(h | D)
Let h denote the structure and Θ the parameters (CPTs)
  Posterior score: P(h | D) ∝ P(D | h) P(h)   (marginal likelihood × prior over structures)
  Marginal likelihood: P(D | h) = ∫ P(D | Θ, h) P(Θ | h) dΘ   (likelihood × prior over parameters)
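A minimal sketch of the averaging step follows; the hypothesis space, priors, likelihoods, and per-model predictions are all invented stand-ins (in practice each likelihood would itself come from inference over the structure's parameters).

```python
# Bayesian model averaging over a toy, hand-specified hypothesis space.
hypotheses = {
    "h1": {"prior": 0.5, "likelihood": 0.02, "predict_x": 0.9},  # P(x | h1)
    "h2": {"prior": 0.3, "likelihood": 0.05, "predict_x": 0.4},
    "h3": {"prior": 0.2, "likelihood": 0.01, "predict_x": 0.7},
}

# Posterior score: P(h | D) proportional to P(D | h) * P(h)
unnorm = {h: v["likelihood"] * v["prior"] for h, v in hypotheses.items()}
z = sum(unnorm.values())
posterior = {h: u / z for h, u in unnorm.items()}

# Averaged prediction: P(x | D) = sum over h of P(x | h) * P(h | D)
p_x = sum(hypotheses[h]["predict_x"] * posterior[h] for h in hypotheses)
print(posterior, p_x)
```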
14
Propagation Algorithm in Singly Connected BNs – Pearl (1983)
(Figure: example tree of nodes C1 … C6)
Upward (child-to-parent) λ messages: λ(C_i) modified during the message-passing phase
Downward (parent-to-child) π messages: π(C_i) computed during the message-passing phase
Multiply-connected case: exact and approximate inference are #P-complete
  (counting problem is #P-complete iff decision problem is NP-complete)
Adapted from Neapolitan (1990), Guo (2000)
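A toy illustration of how π and λ messages combine is sketched below on an invented two-state chain A → B → C with evidence at the leaf C; this is only a hand-worked instance of the idea, not the course's BNJ implementation of Pearl's algorithm.

```python
# Pi/lambda messages on a chain A -> B -> C with C observed; all CPTs invented.
import numpy as np

P_A = np.array([0.6, 0.4])            # P(A)
P_B_given_A = np.array([[0.9, 0.1],   # rows: A value, cols: B value
                        [0.3, 0.7]])
P_C_given_B = np.array([[0.8, 0.2],   # rows: B value, cols: C value
                        [0.1, 0.9]])

observed_c = 1                         # evidence: C = 1

pi_B = P_A @ P_B_given_A               # downward message: sum_a P(B | a) P(a)
lam_B = P_C_given_B[:, observed_c]     # upward message: lambda(b) = P(C = 1 | B = b)

belief_B = pi_B * lam_B                # combine causal and evidential support
belief_B /= belief_B.sum()             # P(B | C = 1)
print(belief_B)
```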
15
Inference by Clustering [1]: Moralization, Triangulation, Cliques
Steps on the example 8-node network (nodes A through H):
  1. Start with the Bayesian network (acyclic digraph)
  2. Moralize: connect ("marry") the parents of each node, then drop edge directions
  3. Triangulate: add fill-in edges so that every cycle of length greater than three has a chord
  4. Find the maximal cliques of the triangulated, moralized graph (Clq1 … Clq6 in the example)
Adapted from Neapolitan (1990), Guo (2000)
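A minimal sketch of moralization plus a simple elimination-based triangulation is shown below (not the course's BNJ toolkit). The parent sets are read off the clique potentials on the clique-tree slide that follows, and the elimination order is an assumption chosen so that the single fill-in edge E–G matches the figure.

```python
# Moralization and fill-in triangulation for the slide's 8-node example.
from itertools import combinations

parents = {"A": [], "B": ["A"], "C": ["B", "E"], "D": ["C"],
           "E": ["F"], "F": [], "G": ["F"], "H": ["C", "G"]}

def moralize(parents):
    """Marry the parents of each node, then drop edge directions."""
    adj = {v: set() for v in parents}
    for child, ps in parents.items():
        for p in ps:                        # original arcs become undirected edges
            adj[child].add(p); adj[p].add(child)
        for u, w in combinations(ps, 2):    # marry co-parents
            adj[u].add(w); adj[w].add(u)
    return adj

def triangulate(adj, order):
    """Eliminate nodes in `order`, adding fill-in edges between remaining neighbors."""
    adj = {v: set(ns) for v, ns in adj.items()}
    remaining = {v: set(ns) for v, ns in adj.items()}
    fill = []
    for v in order:
        for u, w in combinations(sorted(remaining[v]), 2):
            if w not in adj[u]:
                adj[u].add(w); adj[w].add(u)
                remaining[u].add(w); remaining[w].add(u)
                fill.append((u, w))
        for u in remaining[v]:
            remaining[u].discard(v)
        del remaining[v]
    return adj, fill

moral = moralize(parents)
_, fill_edges = triangulate(moral, ["D", "H", "F", "G", "C", "E", "B", "A"])
print(fill_edges)   # expected: [('E', 'G')]
```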
16
Inference by Clustering [2]: Junction Tree Algorithm
Input: list of cliques of the triangulated, moralized graph G_u
Output: tree of cliques, with separator nodes S_i, residual nodes R_i, and potential probability ψ(Clq_i) for all cliques
Algorithm:
  1. S_i = Clq_i ∩ (Clq_1 ∪ Clq_2 ∪ … ∪ Clq_{i-1})
  2. R_i = Clq_i − S_i
  3. If i > 1, identify a j < i such that Clq_j is a parent of Clq_i
  4. Assign each node v to a unique clique Clq_i such that {v} ∪ c(v) ⊆ Clq_i, where c(v) denotes the parents of v
  5. Compute ψ(Clq_i) = ∏_{v assigned to Clq_i} P(v | c(v)), or 1 if no v is assigned to Clq_i
  6. Store Clq_i, R_i, S_i, and ψ(Clq_i) at each vertex in the tree of cliques
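Steps 1–2 are easy to reproduce directly; a minimal sketch follows, using the clique ordering Clq1 … Clq6 from the next slide (this is only an illustration of the separator/residual computation, not the course's BNJ implementation).

```python
# Separators S_i and residuals R_i for the slide's clique ordering.
cliques = [
    {"A", "B"},            # Clq1
    {"B", "E", "C"},       # Clq2
    {"E", "C", "G"},       # Clq3
    {"E", "G", "F"},       # Clq4
    {"C", "G", "H"},       # Clq5
    {"C", "D"},            # Clq6
]

seen = set()
for i, clq in enumerate(cliques, start=1):
    separator = clq & seen          # S_i = Clq_i intersect (Clq_1 union ... union Clq_{i-1})
    residual = clq - separator      # R_i = Clq_i minus S_i
    print(f"Clq{i}: S = {sorted(separator)}  R = {sorted(residual)}")
    seen |= clq
```

Running this reproduces the residual and separator sets listed on the next slide (e.g., S3 = {C, E}, R3 = {G}).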
17
Inference by Clustering [3]: Clique Tree Operations
Clique contents, residuals R_i, separators S_i, and potentials ψ(Clq_i) for the example:
  Clq1 = {A, B}:    R1 = {A, B},  S1 = {},      ψ(Clq1) = P(B|A) P(A)
  Clq2 = {B, E, C}: R2 = {C, E},  S2 = {B},     ψ(Clq2) = P(C|B,E)
  Clq3 = {E, C, G}: R3 = {G},     S3 = {E, C},  ψ(Clq3) = 1
  Clq4 = {E, G, F}: R4 = {F},     S4 = {E, G},  ψ(Clq4) = P(E|F) P(G|F) P(F)
  Clq5 = {C, G, H}: R5 = {H},     S5 = {C, G},  ψ(Clq5) = P(H|C,G)
  Clq6 = {C, D}:    R6 = {D},     S6 = {C},     ψ(Clq6) = P(D|C)
R_i: residual nodes; S_i: separator nodes; ψ(Clq_i): potential probability of clique i
(Figure: the resulting clique tree over AB, BEC, ECG, EGF, CGH, CD, with separator sets B, EC, EG, CG, and C labeling its edges.)
Adapted from Neapolitan (1990), Guo (2000)
18
Inference by Loop Cutset Conditioning
Split a vertex in an undirected cycle; condition upon each of its state values
  (e.g., in the cancer network, split Age into its instantiations X_{1,1}: Age ∈ [0, 10), X_{1,2}: Age ∈ [10, 20), …, X_{1,10}: Age ∈ [100, ∞))
Number of network instantiations: product of arity of nodes in the minimal loop cutset
Posterior: marginal conditioned upon cutset variable values
Deciding the optimal cutset is NP-hard
Current open problems
  Bounded cutset conditioning: ordering heuristics
  Finding randomized algorithms for loop cutset optimization
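The combination step can be sketched as follows: run singly-connected (polytree) inference once per cutset instantiation and mix the resulting posteriors by the weight P(cutset value, evidence). The `polytree_query` helper and all numbers below are hypothetical stand-ins, not part of the course materials.

```python
# Cutset conditioning: mix per-instantiation posteriors by P(cutset value, evidence).
def polytree_query(cutset_value, evidence):
    """Stand-in for exact singly-connected inference with the cutset clamped.
    Returns (P(X | cutset_value, evidence), P(cutset_value, evidence))."""
    table = {  # invented numbers purely for illustration
        "low":  ({"cancer": 0.02, "no_cancer": 0.98}, 0.30),
        "high": ({"cancer": 0.10, "no_cancer": 0.90}, 0.05),
    }
    return table[cutset_value]

def cutset_conditioning(cutset_values, evidence):
    mix, weights = {}, []
    for c in cutset_values:
        posterior, weight = polytree_query(c, evidence)
        weights.append(weight)
        for x, p in posterior.items():
            mix[x] = mix.get(x, 0.0) + weight * p
    z = sum(weights)                       # normalizes by P(evidence)
    return {x: p / z for x, p in mix.items()}

print(cutset_conditioning(["low", "high"], evidence={"serum_calcium": "negative"}))
```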
19
Inference by Variable Elimination [1]: Factoring Operations
Adapted from slide © 2004 S. Russell & P. Norvig. Reused with permission.
20
Inference by Variable Elimination [2]: Factoring Operations
Adapted from slide © 2004 S. Russell & P. Norvig. Reused with permission.
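Variable elimination rests on two factoring operations: pointwise product of factors and summing a variable out of a factor. A minimal sketch of both is given below (an illustration under the assumption of binary variables, not R&N's reference code).

```python
# Factor operations behind variable elimination: pointwise product and sum-out.
from itertools import product

class Factor:
    """A factor maps assignments of its variables (binary, values 0/1) to numbers."""
    def __init__(self, variables, table):
        self.variables = list(variables)    # e.g. ["A", "B"]
        self.table = dict(table)            # e.g. {(0, 0): 0.3, ...}

def pointwise_product(f, g):
    variables = f.variables + [v for v in g.variables if v not in f.variables]
    table = {}
    for assignment in product([0, 1], repeat=len(variables)):
        a = dict(zip(variables, assignment))
        fv = f.table[tuple(a[v] for v in f.variables)]
        gv = g.table[tuple(a[v] for v in g.variables)]
        table[assignment] = fv * gv
    return Factor(variables, table)

def sum_out(var, f):
    variables = [v for v in f.variables if v != var]
    table = {}
    for assignment, value in f.table.items():
        key = tuple(v for name, v in zip(f.variables, assignment) if name != var)
        table[key] = table.get(key, 0.0) + value
    return Factor(variables, table)

# Example: sum B out of P(A) * P(B | A); the result is the marginal over A.
fA = Factor(["A"], {(0,): 0.4, (1,): 0.6})
fBA = Factor(["B", "A"], {(0, 0): 0.7, (1, 0): 0.3, (0, 1): 0.2, (1, 1): 0.8})
print(sum_out("B", pointwise_product(fA, fBA)).table)
```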
21
Genetic Algorithms for Parameter Tuning in Learning
Genetic wrapper for change of representation and inductive bias control
  [1] Genetic algorithm: proposes a candidate representation α
  [2] Representation evaluator for learning problems: computes the representation fitness f(α)
  Inputs: training data D, split into D_train (inductive learning) and D_val (inference), plus an inference specification
  Output: optimized representation
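A minimal sketch of the wrapper idea follows; every detail (bit-mask representations, truncation selection, the stand-in fitness function) is invented for illustration. In a real wrapper, fitness would be validation performance of a learner trained on D_train with the candidate representation.

```python
# Toy genetic wrapper: a GA searches over feature masks; fitness stands in for
# validation accuracy of a learner using that representation.
import random

random.seed(0)
N_FEATURES = 8

def fitness(mask, d_train=None, d_val=None):
    """Stand-in for train-on-d_train / score-on-d_val; rewards a fixed 'useful' subset."""
    useful = [1, 1, 0, 0, 1, 0, 0, 0]
    return sum(1 for m, u in zip(mask, useful) if m == u) / N_FEATURES

def mutate(mask, rate=0.1):
    return [1 - b if random.random() < rate else b for b in mask]

def crossover(a, b):
    cut = random.randrange(1, N_FEATURES)
    return a[:cut] + b[cut:]

population = [[random.randint(0, 1) for _ in range(N_FEATURES)] for _ in range(20)]
for generation in range(30):
    ranked = sorted(population, key=fitness, reverse=True)
    parents = ranked[:10]                          # truncation selection
    offspring = [mutate(crossover(random.choice(parents), random.choice(parents)))
                 for _ in range(10)]
    population = parents + offspring

best = max(population, key=fitness)
print(best, fitness(best))
```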
22
References: Graphical Models & Inference
Graphical Models
  Bayesian (Belief) Networks tutorial – Murphy (2001): http://www.cs.berkeley.edu/~murphyk/Bayes/bayes.html
  Learning Bayesian Networks – Heckerman (1996, 1999): http://research.microsoft.com/~heckerman
Inference Algorithms
  Junction Tree (Join Tree, L-S, Hugin): Lauritzen & Spiegelhalter (1988): http://citeseer.nj.nec.com/huang94inference.html
  (Bounded) Loop Cutset Conditioning: Horvitz & Cooper (1989): http://citeseer.nj.nec.com/shachter94global.html
  Variable Elimination (Bucket Elimination, ElimBel): Dechter (1996): http://citeseer.nj.nec.com/dechter96bucket.html
Recommended Books
  Neapolitan (1990) – out of print; see Pearl (1988), Jensen (2001)
  Castillo, Gutierrez, Hadi (1997)
  Cowell, Dawid, Lauritzen, Spiegelhalter (1999)
Stochastic Approximation: http://citeseer.nj.nec.com/cheng00aisbn.html
23
Terminology
Uncertain Reasoning: ability to perform inference in the presence of uncertainty about
  premises
  rules
  nondeterminism
Representations for Uncertain Reasoning
  Probability: measure of belief in sentences
    Founded on Kolmogorov axioms
    Prior, joint vs. conditional
    Bayes's theorem: P(A | B) = P(B | A) · P(A) / P(B)
  Graphical models: graph theory + probability
  Dempster-Shafer theory: upper and lower probabilities, reserved belief
  Fuzzy representation (sets), fuzzy logic: degree of membership
  Others
    Truth maintenance system: logic-based network representation
    Endorsements: evidential reasoning mechanism
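A quick worked instance of Bayes's theorem as stated above, with invented numbers and P(B) expanded by the law of total probability:

```python
# Bayes's theorem: P(A | B) = P(B | A) * P(A) / P(B), all numbers invented.
p_a = 0.01                 # prior P(A), e.g. a rare condition
p_b_given_a = 0.9          # likelihood P(B | A), e.g. a positive test given A
p_b_given_not_a = 0.05     # false-positive rate P(B | not A)

p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)   # total probability
p_a_given_b = p_b_given_a * p_a / p_b
print(p_a_given_b)         # about 0.154: the posterior stays small despite the evidence
```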
24
Summary Points
Last Class: Reasoning under Uncertainty and Probability
  Uncertainty is pervasive: planning, reasoning, learning (later)
  What are we uncertain about? Sensor error; incomplete or faulty domain theory; "nondeterministic" environment
Today: Graphical Models
  Graphical models as KR for uncertainty: Bayesian networks, etc.
  Some inference algorithms for Bayes nets
Coming Week: More Applied Probability