
Uncertain KR&R Chapter 9

Outline Probability Bayesian networks Fuzzy logic

Probability FOL fails for a domain due to: Laziness: it is too much work to list the complete set of rules, and too hard to use the enormous rules that result. Theoretical ignorance: there is no complete theory for the domain. Practical ignorance: we have not run, or cannot run, all the necessary tests.

Probability Probability = a degree of belief Probability comes from: Frequentist: experiments and statistical assessment Objectivist: real aspects of the universe Subjectivist: a way of characterizing an agent’s beliefs Decision theory = probability theory + utility theory

Probability Prior probability: probability in the absence of any other information P(Dice = 2) = 1/6 random variable: Dice domain = <1, 2, 3, 4, 5, 6> probability distribution: P(Dice) = <1/6, 1/6, 1/6, 1/6, 1/6, 1/6>

Probability Conditional probability: probability in the presence of some evidence. P(Dice = 2 | Dice is even) = 1/3. P(Dice = 2 | Dice is odd) = 0. P(A | B) = P(A ∧ B)/P(B). P(A ∧ B) = P(A | B)·P(B)

Probability Example: S = stiff neck, M = meningitis. P(S | M) = 0.5, P(M) = 1/50000, P(S) = 1/20. P(M | S) = P(S | M)·P(M)/P(S) = 1/5000
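A quick numerical check of this calculation (a minimal Python sketch; the variable names are mine):

```python
# Bayes' rule: P(M | S) = P(S | M) * P(M) / P(S)
p_s_given_m = 0.5      # P(stiff neck | meningitis)
p_m = 1 / 50000        # prior P(meningitis)
p_s = 1 / 20           # prior P(stiff neck)

print(p_s_given_m * p_m / p_s)   # 0.0002 = 1/5000
```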

Probability Joint probability distributions: X: <x1, …, xm> Y: <y1, …, yn> P(X = xi, Y = yj)

Probability Axioms: 0 ≤ P(A) ≤ 1. P(true) = 1 and P(false) = 0. P(A ∨ B) = P(A) + P(B) − P(A ∧ B)

Probability Derived properties: P(¬A) = 1 − P(A). P(U) = P(A1) + P(A2) + ... + P(An) when U = A1 ∨ A2 ∨ ... ∨ An (collectively exhaustive) and Ai ∧ Aj = false for i ≠ j (mutually exclusive)

Probability Bayes' theorem: P(Hi | E) = P(E | Hi)·P(Hi) / Σi P(E | Hi)·P(Hi), where the Hi's are collectively exhaustive and mutually exclusive

Probability Problem: a full joint probability distribution P(X1, X2, ..., Xn) is sufficient for computing any (conditional) probability on the Xi's, but the number of joint probabilities it requires grows exponentially with the number of variables.

Probability Independence: P(A ∧ B) = P(A)·P(B), equivalently P(A) = P(A | B). Conditional independence: P(A ∧ B | E) = P(A | E)·P(B | E), equivalently P(A | E) = P(A | E ∧ B). Example: P(Toothache | Cavity ∧ Catch) = P(Toothache | Cavity); P(Catch | Cavity ∧ Toothache) = P(Catch | Cavity)

Probability "In John's and Mary's house, an alarm is installed to sound in case of burglary or earthquake. When the alarm sounds, John and Mary may make a call for help or rescue."

Probability "In John's and Mary's house, an alarm is installed to sound in case of burglary or earthquake. When the alarm sounds, John and Mary may make a call for help or rescue." Q1: If earthquake happens, how likely will John make a call?

Probability "In John's and Mary's house, an alarm is installed to sound in case of burglary or earthquake. When the alarm sounds, John and Mary may make a call for help or rescue." Q1: If earthquake happens, how likely will John make a call? Q2: If the alarm sounds, how likely is the house burglarized?

Bayesian Networks Pearl, J. (1982). Reverend Bayes on Inference Engines: A Distributed Hierarchical Approach. In Proceedings of the Second National Conference on Artificial Intelligence (AAAI-82), Pittsburgh, PA. Winner of the 2000 AAAI Classic Paper Award.

Bayesian Networks The burglary network: Burglary -> Alarm <- Earthquake; Alarm -> JohnCalls; Alarm -> MaryCalls.

P(B) = .001    P(E) = .002

B E | P(A)       A | P(J)       A | P(M)
T T | .95        T | .90        T | .70
T F | .94        F | .05        F | .01
F T | .29
F F | .001
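These tables translate directly into code. A minimal sketch (my own Python encoding, with True/False keys standing for the T/F rows):

```python
# Conditional probability tables of the burglary network.
P_B = {True: 0.001, False: 0.999}                      # P(Burglary)
P_E = {True: 0.002, False: 0.998}                      # P(Earthquake)
P_A = {(True, True): 0.95, (True, False): 0.94,        # P(Alarm=T | B, E)
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}                        # P(JohnCalls=T | Alarm)
P_M = {True: 0.70, False: 0.01}                        # P(MaryCalls=T | Alarm)
```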

Bayesian Networks Syntax: A set of random variables makes up the nodes. A set of directed links connects pairs of nodes. Each node has a conditional probability table that quantifies the effects of its parent nodes. The graph has no directed cycles. [figure: a legal DAG versus a cyclic graph that is not allowed]

Bayesian Networks Semantics: Fix an ordering of the nodes such that Xi is a predecessor of Xj implies i < j. By the chain rule, P(X1, X2, …, Xn) = P(Xn | Xn-1, …, X1)·P(Xn-1 | Xn-2, …, X1)· … ·P(X2 | X1)·P(X1) = Πi P(Xi | Xi-1, …, X1) = Πi P(Xi | Parents(Xi)), using the assumption P(Xi | Xi-1, …, X1) = P(Xi | Parents(Xi)) with Parents(Xi) ⊆ {Xi-1, …, X1}: each node is conditionally independent of its predecessors given its parents.

Bayesian Networks Example: P(J ∧ M ∧ A ∧ ¬B ∧ ¬E) = P(J | A)·P(M | A)·P(A | ¬B ∧ ¬E)·P(¬B)·P(¬E) ≈ 0.00062
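With the CPT dictionaries from the sketch above, the factored joint is a five-term product (joint is a hypothetical helper name, not from the slides):

```python
def joint(j, m, a, b, e):
    """P(J=j, M=m, A=a, B=b, E=e) via the network factorization."""
    pa = P_A[(b, e)] if a else 1 - P_A[(b, e)]
    pj = P_J[a] if j else 1 - P_J[a]
    pm = P_M[a] if m else 1 - P_M[a]
    return pj * pm * pa * P_B[b] * P_E[e]

print(joint(True, True, True, False, False))   # ~0.00062
```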

Bayesian Networks Why Bayesian Networks?

Uncertain Question Answering P(Query | Evidence) = ? Diagnostic (from effects to causes): P(B | J). Causal (from causes to effects): P(J | B). Intercausal (between causes of a common effect): P(B | A ∧ E). Mixed: P(A | J ∧ ¬E), P(B | J ∧ ¬E)

Uncertain Question Answering The independence assumptions in a Bayesian Network simplify computation of conditional probabilities on its variables

Uncertain Question Answering Q1: If an earthquake happens, how likely is John to call? Q2: If the alarm sounds, how likely is it that the house has been burglarized? Q3: If the alarm sounds, how likely is it that both John and Mary call?

Uncertain Question Answering P(B | A) = P(B ∧ A)/P(A) = α·P(B ∧ A) and P(¬B | A) = α·P(¬B ∧ A), where α = 1/(P(B ∧ A) + P(¬B ∧ A)) is a normalization constant
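Inference by enumeration is then a matter of summing the joint over the unobserved variables and normalizing, e.g. for Q2 (a sketch reusing the hypothetical joint defined earlier):

```python
from itertools import product

def p_burglary_given_alarm():
    """P(B | A) by enumerating the hidden variables J, M, E."""
    score = {}
    for b in (True, False):
        score[b] = sum(joint(j, m, True, b, e)
                       for j, m, e in product((True, False), repeat=3))
    alpha = 1 / (score[True] + score[False])   # normalization constant
    return alpha * score[True]

print(p_burglary_given_alarm())   # ~0.374
```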

General Conditional Independence A Bayesian network encodes, through its structure, all of the conditional independence statements that hold among its variables

General Conditional Independence A node X is conditionally independent of its non-descendants (the Zij's) given its parents (the Ui's). [figure: X with parents U1, …, Um, children Y1, …, Yn, and children's other parents Z1j, …, Znj]

General Conditional Independence A node X is conditionally independent of all other nodes given its parents (the Ui's), its children (the Yi's), and its children's parents (the Zij's), i.e. given its Markov blanket. [same figure as above]

General Conditional Independence X and Y are conditionally independent given E. [figure: every path between X and Y is blocked by the evidence nodes in E]

General Conditional Independence Example: a car network with variables Battery, Gas, Ignition, Radio, Starts, Moves. Gas ⊥ Radio given Ignition; Gas ⊥ Radio given Battery; Gas ⊥ Radio given no evidence at all; but Gas and Radio become dependent given Starts, and likewise given Moves (observing a common effect, or one of its descendants, couples its causes).

Vagueness The Oxford Companion to Philosophy (1995): “Words like smart, tall, and fat are vague since in most contexts of use there is no bright line separating them from not smart, not tall, and not fat respectively …”

Vagueness Imprecision vs. uncertainty: "The bottle is about half-full" (imprecision) vs. "It is likely to a degree of 0.5 that the bottle is full" (uncertainty).

Fuzzy Sets Zadeh, L.A. (1965). Fuzzy Sets. Information and Control, 8(3), 338–353.


Fuzzy Set Definition A fuzzy set A is defined by a membership function μA that maps elements of a given domain (a crisp set U) into values in [0, 1]: μA: U → [0, 1]. The set is identified with its membership function, μA ≡ A. [figure: the membership function of 'young' over Age, equal to 1 up to age 25 and falling to 0 at age 40]

Fuzzy Set Representation Discrete domain: high-dice-score = {1: 0, 2: 0, 3: 0.2, 4: 0.5, 5: 0.9, 6: 1}. Continuous domain (the young set): A(u) = 1 for u ∈ [0, 25]; A(u) = (40 − u)/15 for u ∈ [25, 40]; A(u) = 0 for u ∈ [40, 150].
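The piecewise membership function of young is easy to code up (a minimal sketch):

```python
def young(u):
    """Membership degree of age u in the fuzzy set 'young'."""
    if u <= 25:
        return 1.0
    if u <= 40:
        return (40 - u) / 15
    return 0.0

print(young(20), young(32.5), young(50))   # 1.0 0.5 0.0
```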

Fuzzy Set Representation α-cuts: Aα = {u | A(u) ≥ α}; strong α-cut: Aα+ = {u | A(u) > α}. The membership function can be recovered from its α-cuts: A(u) = sup {α | u ∈ Aα}. For the young set above, A0.5 = [0, 32.5].
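An α-cut can be recovered numerically from the membership function; on the young set this confirms the 0.5-cut [0, 32.5] (a sketch over a discretized domain):

```python
def alpha_cut(mu, alpha, domain):
    """All domain points with membership degree at least alpha."""
    return [u for u in domain if mu(u) >= alpha]

domain = [u / 10 for u in range(1501)]    # [0, 150] in steps of 0.1
cut = alpha_cut(young, 0.5, domain)
print(cut[0], cut[-1])                    # 0.0 32.5
```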

Fuzzy Set Representation Support: supp(A) = {u | A(u) > 0} = A0+. Core: core(A) = {u | A(u) = 1} = A1. Height: h(A) = supu∈U A(u)

Fuzzy Set Representation Normal fuzzy set: h(A) = 1. Sub-normal fuzzy set: h(A) < 1

Membership Degrees Subjective definition. Voting model: each voter has a subset of U as his/her own crisp definition of the concept that A represents; A(u) is the proportion of voters whose crisp definitions include u. [figure: a table of yes/no votes by voters P1–P10 for elements of the domain]

Fuzzy Subset Relations A ⊆ B iff A(u) ≤ B(u) for every u ∈ U. A is more "specific" than B: "X is A" entails "X is B"

Fuzzy Set Operations Standard definitions: Complement: (¬A)(u) = 1 − A(u). Intersection: (A ∩ B)(u) = min[A(u), B(u)]. Union: (A ∪ B)(u) = max[A(u), B(u)]
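On a discrete domain, with a fuzzy set stored as a dict from elements to degrees, the standard operations are one-liners (illustrative Python, reusing the high-dice-score set):

```python
def f_not(A):
    return {u: 1 - a for u, a in A.items()}

def f_and(A, B):   # standard intersection: pointwise min
    return {u: min(A[u], B[u]) for u in A}

def f_or(A, B):    # standard union: pointwise max
    return {u: max(A[u], B[u]) for u in A}

high_score = {1: 0.0, 2: 0.0, 3: 0.2, 4: 0.5, 5: 0.9, 6: 1.0}
# Unlike crisp sets, A and not-A can overlap:
print(f_and(high_score, f_not(high_score)))   # {..., 4: 0.5, 5: 0.1, ...}
```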

Fuzzy Set Operations Example: not young = ¬young; not old = ¬old; middle-age = ¬young ∩ ¬old; note that ¬old ≠ young

Fuzzy Relations Crisp relation: R(U1, ..., Un) ⊆ U1 × ... × Un, with R(u1, ..., un) = 1 iff (u1, ..., un) ∈ R, and 0 otherwise. Fuzzy relation: a fuzzy set on U1 × ... × Un

Fuzzy Relations Example: a fuzzy relation R = "very far" on U1 × U2 with U1 = {New York, Paris} and U2 = {Beijing, New York, London}:

              NY     Paris
Beijing       1      .9
New York      0      .7
London        .6     .3

e.g. R = {(NY, Beijing): 1, (Paris, London): .3, ...}

Fuzzy Numbers A fuzzy number A is a fuzzy set on R such that: A is a normal fuzzy set; Aα is a closed interval for every α ∈ (0, 1]; supp(A) = A0+ is bounded

Basic Types of Fuzzy Numbers [figures: graphs of the basic membership-function shapes, each with height 1]

Operations of Fuzzy Numbers Interval-based operations: (A ∗ B)α = Aα ∗ Bα for every α, where ∗ is an arithmetic operation applied to the α-cut intervals

Operations of Fuzzy Numbers Arithmetic operations on intervals: [a, b] ∗ [d, e] = {f ∗ g | a ≤ f ≤ b, d ≤ g ≤ e}. In particular: [a, b] + [d, e] = [a + d, b + e]; [a, b] − [d, e] = [a − e, b − d]; [a, b]·[d, e] = [min(ad, ae, bd, be), max(ad, ae, bd, be)]; [a, b]/[d, e] = [a, b]·[1/e, 1/d], provided 0 ∉ [d, e]
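These interval rules transcribe directly to code (a sketch; intervals are (lo, hi) tuples, and division assumes 0 ∉ [d, e]):

```python
def i_add(a, b, d, e): return (a + d, b + e)
def i_sub(a, b, d, e): return (a - e, b - d)

def i_mul(a, b, d, e):
    products = (a * d, a * e, b * d, b * e)
    return (min(products), max(products))

def i_div(a, b, d, e):   # requires 0 not in [d, e]
    return i_mul(a, b, 1 / e, 1 / d)

print(i_mul(-1, 2, 3, 4))   # (-4, 8)
```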

Operations of Fuzzy Numbers about 2 + about 3 = ? about 2 · about 3 = ? [figure: triangular fuzzy numbers 'about 2' and 'about 3' peaking at 2 and 3]

Operations of Fuzzy Numbers Discrete domains: A = {xi: A(xi)}, B = {yi: B(yi)}. A ∗ B = ?

Operations of Fuzzy Numbers Extension principle: a crisp function f: U1 × U2 → V induces a function g mapping fuzzy sets on U1 and U2 to a fuzzy set on V: [g(A, B)](v) = sup over {(u1, u2) | v = f(u1, u2)} of min{A(u1), B(u2)}

Operations of Fuzzy Numbers Discrete domains: A = {xi: A(xi)}, B = {yi: B(yi)}. (A ∗ B)(v) = sup over {(xi, yj) | v = xi ∗ yj} of min{A(xi), B(yj)}
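Applying this to the earlier "about 2 + about 3" question with discrete triangular fuzzy numbers (the membership values below are my own illustrative choice):

```python
from collections import defaultdict

def f_op(A, B, op):
    """Extension principle on discrete fuzzy sets: sup of min."""
    C = defaultdict(float)
    for x, ax in A.items():
        for y, by in B.items():
            v = op(x, y)
            C[v] = max(C[v], min(ax, by))
    return dict(C)

about2 = {1: 0.5, 2: 1.0, 3: 0.5}
about3 = {2: 0.5, 3: 1.0, 4: 0.5}
print(f_op(about2, about3, lambda x, y: x + y))
# {3: 0.5, 4: 0.5, 5: 1.0, 6: 0.5, 7: 0.5}, i.e. 'about 5'
```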

Fuzzy Logic Generalized modus ponens:
if x is A then y is B
x is A*
------------------------
y is B*

Fuzzy Logic View a fuzzy rule as a fuzzy relation Measure similarity of A and A*

Fuzzy Controller A special kind of expert system, used when it is difficult to construct a mathematical model of the process, or when an acquired model is too expensive to use

Fuzzy Controller IF the temperature is very high AND the pressure is slightly low THEN the heat change should be slightly negative

Fuzzy Controller Architecture: the controlled process emits model conditions; fuzzification converts them to fuzzy sets; the fuzzy inference engine applies the fuzzy rule base; defuzzification converts the fuzzy conclusions into model actions fed back to the controlled process
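A single inference cycle can be sketched end to end; everything below (the membership functions, the crisp action value -5) is a toy assumption of mine, not the slides' controller:

```python
def very_high_temp(t):     # membership of temperature in 'very high'
    return max(0.0, min(1.0, (t - 80) / 20))

def slightly_low_pres(p):  # membership of pressure in 'slightly low'
    return max(0.0, min(1.0, (100 - p) / 20))

def heat_change(t, p):
    # Fuzzification + AND (min) gives the rule's firing strength;
    # scaling the crisp action -5 by it is a crude stand-in for
    # full inference followed by defuzzification.
    strength = min(very_high_temp(t), slightly_low_pres(p))
    return -5 * strength

print(heat_change(95, 90))   # min(0.75, 0.5) = 0.5 -> -2.5
```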

Fuzzification [figure: a crisp input x0 is mapped to its membership degrees in the input fuzzy sets]

Defuzzification Center of Area: x = Σz A(z)·z / Σz A(z)

Defuzzification Center of Maxima: M = {z | A(z) = h(A)}, x = (min M + max M)/2

Defuzzification Mean of Maxima: M = {z | A(z) = h(A)}, x = Σz∈M z / |M|
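The three defuzzification methods on a discrete output domain (a sketch; the example fuzzy set is mine):

```python
def center_of_area(A):
    return sum(a * z for z, a in A.items()) / sum(A.values())

def center_of_maxima(A):
    M = [z for z, a in A.items() if a == max(A.values())]
    return (min(M) + max(M)) / 2

def mean_of_maxima(A):
    M = [z for z, a in A.items() if a == max(A.values())]
    return sum(M) / len(M)

A = {1: 0.2, 2: 0.8, 3: 1.0, 4: 1.0, 5: 0.4}
print(center_of_area(A), center_of_maxima(A), mean_of_maxima(A))
# ~3.18, 3.5, 3.5
```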

Exercises In Klir-Yuan’s textbook: 1.9, 1.10, 2.11, 2.12, 4.5