COMP 538 (Reasoning and Decision under Uncertainty): Introduction to Bayesian Networks / Introduction to Course



COMP 538 Introduction / Slide 2 (Nevin L. Zhang)

Bayesian Networks

- Bayesian networks:
  - are networks of random variables;
  - are a marriage between probability theory and graph theory;
  - represent conditional independence: a random variable is directly related to only a few neighboring variables, and is independent of all other variables given those neighbors;
  - facilitate the application of probability theory to many problems in AI, applied mathematics, statistics, and engineering that are complex and involve uncertainty.

Slide 3: Probability Theory & Uncertainty in AI

- Bayesian networks were developed in the AI community as a tool for building intelligent reasoning systems, in particular expert systems. We next give a brief historical account of that development.
- Prior to 1980, intelligent reasoning systems were based on symbolic logic.
  - To handle uncertainty, numerical tags were attached to if-then rules.
  - Sometimes the numbers were interpreted probabilistically (MYCIN (Buchanan et al. 1984), PROSPECTOR).
  - The probabilistic interpretation was not justified, because the numbers were not manipulated according to the principles of probability theory.

Slide 4: Probability Theory & Uncertainty in AI

- Rule-based systems
  - The uncertainty associated with a rule summarizes its exceptions.
    - Consider the rule: "If the ground is wet, then it rained."
    - Exceptions: the sprinkler was on, a water truck leaked, a water pipe burst, ...
    - In general, there are too many exceptions to list explicitly; they are summarized by a weight: "If the ground is wet, then it rained (0.8)."
  - Application of a rule "if A then B":
    - if you see A in the knowledge base, then conclude B,
    - (Locality) regardless of what else is in the knowledge base,
    - (Detachment) regardless of how A was derived.

Slide 5: Probability Theory & Uncertainty in AI

- Problem with locality (Pearl 1988)
  - Rule 1: If the ground is wet, then it rained (0.8).
  - If we see wet ground, we conclude that it rained with 80% probability.
  - But what if, somewhere in the knowledge base, there is the sentence "Sprinkler on last night"?
- Problem with detachment (Pearl 1988)
  - Rule 2: If the sprinkler was on (last night), then the ground is wet (this morning).
  - We know: sprinkler on.
  - Using rule 2: ground wet.
  - Using rule 1 (detachment here): it rained.
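The locality problem can be made concrete with a small probability model. The sketch below uses made-up numbers (an assumed 20% prior on rain, an independent 30% prior on the sprinkler, and illustrative wet-ground probabilities, none of which come from the slides): once the sprinkler is known to have run, wet ground supports rain much less than Rule 1 suggests.

```python
# A small probability model for Pearl's wet-ground example. All numbers are
# illustrative assumptions; they are not taken from the slides.
import itertools

P_RAIN = 0.2       # assumed prior P(Rain)
P_SPRINKLER = 0.3  # assumed prior P(Sprinkler), independent of Rain

def p_wet(rain, sprinkler):
    """Assumed P(WetGround = 1 | Rain, Sprinkler): wet if either cause is active."""
    if rain and sprinkler:
        return 0.99
    if rain:
        return 0.9
    if sprinkler:
        return 0.8
    return 0.05

# Build the full joint P(Rain, Sprinkler, Wet) over the 8 binary configurations.
joint = {}
for r, s, w in itertools.product([0, 1], repeat=3):
    pw = p_wet(r, s)
    joint[(r, s, w)] = ((P_RAIN if r else 1 - P_RAIN)
                        * (P_SPRINKLER if s else 1 - P_SPRINKLER)
                        * (pw if w else 1 - pw))

def p_rain_given(s=None, w=None):
    """P(Rain = 1 | evidence) by summing over the joint."""
    def ok(s_, w_):
        return (s is None or s_ == s) and (w is None or w_ == w)
    num = sum(p for (r_, s_, w_), p in joint.items() if r_ == 1 and ok(s_, w_))
    den = sum(p for (r_, s_, w_), p in joint.items() if ok(s_, w_))
    return num / den

print(p_rain_given(w=1))       # wet ground raises belief in rain above the prior
print(p_rain_given(w=1, s=1))  # knowing the sprinkler ran lowers it again
```

A purely local rule cannot express this "explaining away" effect, because the sprinkler evidence sits elsewhere in the knowledge base.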

Slide 6: Probability Theory & Uncertainty in AI

- Detachment also implies that there is no way to determine whether two pieces of information originate from the same source or from two independent sources (Pearl 1988, Henrion 1986).
  - Analogy: is Shanghai Disney larger than Hong Kong Disney? (Slide diagram: a single statement by a Shanghai businessman reaches "my belief" through both a TV report and a newspaper report.)

Slide 7: Probability Theory & Uncertainty in AI

- Rule-based systems can operate safely only in tree-structured networks, and they can perform either diagnosis or prediction, but not both (Shafer and Pearl 1990, Introduction to Chapter 5).
- Classical logic does not suffer from these problems, because the truth values characterize the logical formulae themselves rather than exceptions.

Slide 8: Probability Theory & Uncertainty in AI

- Model-based systems
  - The uncertainty measure is placed not on individual logical formulae, but on sets of possible worlds.
  - In the case of probability models, there is a probability distribution over the sample space, i.e. the Cartesian product of the state spaces of all random variables; equivalently, a joint probability distribution over all the random variables.
    - A well-known and well-understood framework for uncertainty
    - Clear semantics
    - Provides principled answers for:
      - combining evidence,
      - predictive and diagnostic reasoning,
      - belief update.

Slide 9: Probability Theory & Uncertainty in AI

- Difficulties in applying probability theory:
  - complexity of model construction,
  - complexity of problem solving,
  - both exponential in the problem size, i.e. the number of variables.
- Example:
  - Patients in a hospital are described by several attributes:
    - Background: age, gender, history of diseases, ...
    - Symptoms: fever, blood pressure, headache, ...
    - Diseases: pneumonia, heart attack, ...
  - A joint probability distribution must assign a number to each combination of values of these attributes, giving an exponential model size:
    - 20 binary attributes already require 2^20 ≈ 10^6 numbers;
    - real applications usually involve hundreds of attributes.
- This is one of the reasons why probability theory played no significant role in AI reasoning systems before 1980.
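The exponential blow-up is easy to check with a one-line sketch (a minimal illustration, not part of the slides):

```python
# An explicit joint table over n binary attributes has 2**n entries
# (2**n - 1 free parameters, since the probabilities must sum to one).
def joint_table_entries(n_attributes: int) -> int:
    return 2 ** n_attributes

print(joint_table_entries(20))   # over a million entries for just 20 attributes
print(joint_table_entries(100))  # hopeless for realistic problems
```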

Slide 10: Probability Theory & Uncertainty in AI

- The breakthrough came in the early 1980s (Pearl 1986, 1988; Howard & Matheson 1984).
  - In a joint probability distribution, every variable is, in principle, directly related to all other variables.
  - Pearl and others realized:
    - It is often reasonable to make the (sometimes simplifying) assumption that each variable is directly related to only a few other variables.
    - This leads to modularity: a complex model and its associated calculations are split into small, manageable pieces.

Slide 11: Probability Theory & Uncertainty in AI

- Example: Africa Visit (Lauritzen & Spiegelhalter 1988, modified)
  - Variables:
    - Patient complaint: Dyspnea (D)
    - Q&A and exam: Visit-Africa (A), Smoking (S), X-Ray (X)
    - Diagnosis: Lung Cancer (L), Tuberculosis (T), Bronchitis (B)
  - Assuming all variables are binary, the size of the joint probability model P(A, S, T, B, D, X) is 64 - 1 = 63.

Slide 12: Probability Theory & Uncertainty in AI

- Reasonable assumptions:
  - X is directly influenced by T and L; conditioned on T and L, it is independent of all other variables.
  - D is directly influenced by T, L, and B; conditioned on T, L, and B, it is independent of all other variables.
  - A directly influences T.
  - Smoking directly influences L and B.
- Break up the model P(A, S, T, L, B, X, D) into:
  - P(A), P(S), P(T|A), P(L|S), P(B|S), P(TorL|T, L), P(X|TorL), P(D|TorL, B)
  - Total number of parameters: 1 + 1 + 2 + 2 + 2 + 4 + 2 + 4 = 18.
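The parameter count can be verified mechanically: a conditional table for a binary variable needs one free number per configuration of its parents. In this minimal sketch, the dictionary encodes the number of parents of each factor in the factorization above:

```python
# Free parameters of P(X | parents) for a binary X: one per parent
# configuration, i.e. 2**|parents| (P(X=0|..) is determined as 1 - P(X=1|..)).
def factor_params(n_parents: int) -> int:
    return 2 ** n_parents

# Number of parents of each variable in the slide's factorization.
structure = {
    "A": 0, "S": 0,          # P(A), P(S)
    "T": 1, "L": 1, "B": 1,  # P(T|A), P(L|S), P(B|S)
    "TorL": 2,               # P(TorL | T, L)
    "X": 1,                  # P(X | TorL)
    "D": 2,                  # P(D | TorL, B)
}
total = sum(factor_params(n) for n in structure.values())
print(total)  # 1 + 1 + 2 + 2 + 2 + 4 + 2 + 4 = 18, versus 63 for the full joint
```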

Slide 13: Probability Theory & Uncertainty in AI

- Modularity (conditional independence)
  - Simplifies model construction:
    - 18 parameters instead of 63;
    - the savings are even more drastic in real-world applications.
  - Makes model construction, inference, and model learning feasible for realistic applications:
    - Before: exponential in the problem size, i.e. the total number of variables.
    - Now: exponential in the "number of neighbors" (more precisely, the size of the largest clique).

Slide 14: Probability Theory & Uncertainty in AI

- After the breakthrough
  - 1980s:
    - representation (graphical representation of conditional independence);
    - inference (polytree propagation, clique tree propagation).
  - 1990s:
    - inference (variable elimination, search, MCMC, variational methods, special local structures, ...);
    - learning (parameter learning, structure learning, incomplete data, latent variables, ...);
    - sensitivity analysis, temporal models, causal models, ...
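As a taste of the inference methods listed above, here is a minimal variable-elimination sketch on the simplest possible network, a chain A -> B -> C of binary variables. The conditional probability tables are made-up illustrations, not from the slides:

```python
# Variable elimination on the chain A -> B -> C: compute P(C) by summing
# out A first, then B. All numbers are illustrative assumptions.
P_A = [0.6, 0.4]                        # P(A)
P_B_given_A = [[0.7, 0.3], [0.2, 0.8]]  # P(B|A), rows indexed by A
P_C_given_B = [[0.9, 0.1], [0.4, 0.6]]  # P(C|B), rows indexed by B

# Eliminate A: f1(b) = sum_a P(a) * P(b|a)
f1 = [sum(P_A[a] * P_B_given_A[a][b] for a in range(2)) for b in range(2)]

# Eliminate B: P(c) = sum_b f1(b) * P(c|b)
P_C = [sum(f1[b] * P_C_given_B[b][c] for b in range(2)) for c in range(2)]

print(P_C)  # a valid distribution over C: the entries sum to 1
```

The work is local to each factor; that is exactly why the cost depends on the largest intermediate factor (clique) rather than on the total number of variables.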

Slide 15: Impact of Bayesian Networks

- From non-existence to prominence
  - Prior to 1980, probability theory had essentially no role in AI.
  - 1980s: the breakthrough and much research activity.
    - However, by 1990, "there is still no consensus on the theoretical and practical role of probability in AI" (Shafer and Pearl 1990, Introduction).
  - 2002: the role of Bayesian networks in AI was so prominent that the first invited talk at AAAI-2002 was entitled "Probabilistic AI" (M. Jordan).

Slide 16: Impact of Bayesian Networks

- Bayesian networks are now a major topic in influential textbooks on:
  - AI (Russell & Norvig 1995, Artificial Intelligence: A Modern Approach);
  - machine learning (Mitchell 1997, Machine Learning).
- They are also discussed in textbooks on:
  - data mining (Hand et al. 2001, Principles of Data Mining);
  - pattern recognition (Duda et al. 2000, Pattern Classification).

Slide 17: Impact of Bayesian Networks

- Impact beyond AI
  - In statistics, Bayesian networks are:
    - viewed as a kind of statistical model, just like regression models;
    - called graphical models;
    - used for multivariate data analysis.
  - A side note, Bayesian networks vs. neural networks:
    - Bayesian networks model the data-generation process and are more interpretable;
    - neural networks are motivated by biological processes and are less interpretable.

Slide 18: Impact of Bayesian Networks

- Bayesian networks provide a uniform framework for viewing a variety of models in statistics and engineering: hidden Markov models, mixture models, latent class models, Kalman filters, factor analysis, and Ising models.
- Books have been written that:
  - use Bayesian networks to explain algorithms in digital communication, in particular data compression and channel coding (Frey 1998, Graphical Models for Machine Learning and Digital Communication);
  - draw connections between Bayesian networks and contemporary cognitive psychology (Glymour 2001, The Mind's Arrows: Bayes Nets and Graphical Causal Models in Psychology).
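To illustrate the "uniform framework" point: the forward algorithm for a hidden Markov model is just repeated variable elimination in the corresponding dynamic Bayesian network. A minimal sketch with assumed, purely illustrative parameters:

```python
# Forward algorithm for a 2-state HMM, seen as variable elimination in the
# dynamic Bayesian network Z_1 -> Z_2 -> ... with X_t hanging off each Z_t.
# All numbers are made-up illustrations.
init = [0.5, 0.5]                        # P(Z_1)
trans = [[0.8, 0.2], [0.3, 0.7]]         # P(Z_t | Z_{t-1}), rows: previous state
emit = [[0.9, 0.1], [0.2, 0.8]]          # P(X_t | Z_t), rows: hidden state

def forward(observations):
    """Return alpha_T(z) = P(Z_T = z, x_1..x_T)."""
    alpha = [init[z] * emit[z][observations[0]] for z in range(2)]
    for x in observations[1:]:
        # Eliminate the previous hidden state, then absorb the new observation.
        alpha = [sum(alpha[zp] * trans[zp][z] for zp in range(2)) * emit[z][x]
                 for z in range(2)]
    return alpha

alpha = forward([0, 0, 1])
likelihood = sum(alpha)  # P(x_1..x_T): marginalize out the last hidden state
print(likelihood)
```

Each step touches only one transition and one emission table, so the cost is linear in the sequence length rather than exponential, the same modularity argument as before.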

Slide 19: Impact of Bayesian Networks

- Applications: too many to survey. Some samples:
  - medical diagnostic systems;
  - real-time weapons scheduling;
  - jet-engine fault diagnosis;
  - Intel processor fault diagnosis (Intel);
  - generator monitoring expert system (General Electric);
  - software troubleshooting (Microsoft Office Assistant, Win98 print troubleshooting);
  - space shuttle engine monitoring (Vista project);
  - biological sequence analysis and classification.

Slide 20: Contents of This Course

- This course is designed for graduate students in science and engineering.
- Our objective is to give in-depth coverage of what we deem the core concepts, ideas, and results of Bayesian networks:
  - the concept and semantics of Bayesian networks;
  - representational issues: what can and cannot be represented;
  - inference: how to answer queries efficiently;
  - learning: how to learn Bayesian networks from data and adapt them to data;
  - special models: hidden Markov models, latent class models;
  - Bayesian networks for classification and cluster analysis.