Knowledge-Based Systems
Dr. Marco Antonio Ramos Corchado, Computer Science Department
Overview: Reasoning and Uncertainty
- Motivation
- Objectives
- Sources of Uncertainty and Inexactness in Reasoning
  - Incorrect and Incomplete Knowledge
  - Ambiguities
  - Belief and Ignorance
- Probability Theory
- Bayesian Networks
- Certainty Factors
  - Belief and Disbelief
- Dempster-Shafer Theory
  - Evidential Reasoning
- Important Concepts and Terms
- Chapter Summary
Logistics
- Introductions
- Course Materials
  - textbooks (see below)
  - lecture notes (PowerPoint slides will be available on my Web page)
  - handouts
- Web page: http://fi.uaemex.mx/mramos
- Term Project
- Lab and Homework Assignments
- Exams
- Grading
Bridge-In
Pre-Test
Motivation
- reasoning for real-world problems involves missing knowledge, inexact knowledge, inconsistent facts or rules, and other sources of uncertainty
- while traditional logic is in principle capable of capturing and expressing these aspects, it is not very intuitive or practical
  - it requires the explicit introduction of predicates or functions
- many expert systems have mechanisms to deal with uncertainty
  - sometimes introduced as ad hoc measures, lacking a sound foundation
Objectives
- be familiar with various sources of uncertainty and imprecision in knowledge representation and reasoning
- understand the main approaches to dealing with uncertainty
  - probability theory
  - Bayesian networks
  - Dempster-Shafer theory
- know the important characteristics of these approaches
  - differences between methods, advantages, disadvantages, performance, typical scenarios
- evaluate the suitability of these approaches
  - application of methods to scenarios or tasks
- apply selected approaches to simple problems
Introduction
- reasoning under uncertainty and with inexact knowledge is frequently necessary for real-world problems
- heuristics
  - ways to mimic heuristic knowledge processing
  - methods used by experts
- empirical associations
  - experiential reasoning
  - based on limited observations
- probabilities
  - objective (frequency counting)
  - subjective (human experience)
- reproducibility
  - will observations deliver the same results when repeated?
Dealing with Uncertainty
- expressiveness
  - can concepts used by humans be represented adequately?
  - can the confidence of experts in their decisions be expressed?
- comprehensibility
  - representation of uncertainty
  - utilization in reasoning methods
- correctness
  - probabilities: adherence to the formal aspects of probability theory
  - relevance ranking: probabilities don't add up to 1, but the "most likely" result is sufficient
  - long inference chains: tend to result in extreme (0, 1) or not very useful (0.5) results
- computational complexity
  - feasibility of calculations for practical purposes
Sources of Uncertainty
- data
  - data missing, unreliable, ambiguous
  - representation imprecise, inconsistent, subjective, derived from defaults, …
- expert knowledge
  - inconsistency between different experts
  - plausibility: the "best guess" of experts
  - quality
    - causal knowledge: deep understanding
    - statistical associations: observations
  - scope: only the current domain, or more general?
Sources of Uncertainty (cont.)
- knowledge representation
  - restricted model of the real system
  - limited expressiveness of the representation mechanism
- inference process
  - deductive
    - the derived result is formally correct, but inappropriate
    - derivation of the result may take very long
  - inductive
    - new conclusions are not well-founded
    - not enough samples
    - samples are not representative
  - unsound reasoning methods
    - induction, non-monotonic reasoning, default reasoning
Uncertainty in Individual Rules
- errors
  - domain errors
  - representation errors
  - inappropriate application of the rule
- likelihood of evidence
  - for each premise
  - for the conclusion
  - combination of evidence from multiple premises
Uncertainty and Multiple Rules
- conflict resolution
  - if multiple rules are applicable, which one is selected?
  - explicit priorities, provided by domain experts
  - implicit priorities derived from rule properties
    - specificity of patterns, ordering of patterns, creation time of rules, most recent usage, …
- compatibility
  - contradictions between rules
  - subsumption: one rule is a more general version of another one
  - redundancy
  - missing rules
  - data fusion: integration of data from multiple sources
Basics of Probability Theory
- mathematical approach for processing uncertain information
- sample space X = {x1, x2, …, xn}
  - collection of all possible events
  - can be discrete or continuous
- probability P(xi) reflects the likelihood of event xi occurring
  - non-negative value in [0, 1]
  - total probability of the sample space (sum of all probabilities) is 1
  - for mutually exclusive events, the probability that at least one of them occurs is the sum of their individual probabilities
- experimental probability: based on the frequency of events
- subjective probability: based on expert assessment
Compound Probabilities
- describe independent events: events that do not affect each other in any way
- joint probability of two independent events A and B:
  P(A ∩ B) = n(A ∩ B) / n(S) = P(A) * P(B)
  where n(S) is the number of elements in S
- union probability of two independent events A and B:
  P(A ∪ B) = P(A) + P(B) - P(A ∩ B) = P(A) + P(B) - P(A) * P(B)
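The two formulas above can be sketched directly (the function names are illustrative, not from the slides):

```python
# Minimal sketch of compound probabilities for independent events A and B.

def joint(p_a: float, p_b: float) -> float:
    """P(A and B) for independent events: P(A) * P(B)."""
    return p_a * p_b

def union(p_a: float, p_b: float) -> float:
    """P(A or B) for independent events: P(A) + P(B) - P(A) * P(B)."""
    return p_a + p_b - p_a * p_b

# Example: two fair coin flips -- both heads has probability 0.25,
# at least one head has probability 0.75.
```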
Conditional Probabilities
- describe dependent events: events that affect each other in some way
- conditional probability of event A, given that event B has already occurred:
  P(A|B) = P(A ∩ B) / P(B)
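As a small sketch of the definition (the name and guard are illustrative):

```python
# Conditional probability P(A|B) = P(A and B) / P(B); only defined for P(B) > 0.

def conditional(p_a_and_b: float, p_b: float) -> float:
    if p_b <= 0:
        raise ValueError("P(B) must be positive")
    return p_a_and_b / p_b
```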
Advantages and Problems: Probabilities
- advantages
  - formal foundation
  - reflection of reality (a posteriori)
- problems
  - may be inappropriate: the future is not always similar to the past
  - inexact or incorrect, especially for subjective probabilities
  - ignorance: probabilities must be assigned even if no information is available, which assigns an equal amount of probability to all such items
  - non-local reasoning: requires the consideration of all available evidence, not only the rules currently under consideration
  - no compositionality: complex statements with conditional dependencies cannot be decomposed into independent parts
Bayesian Approaches
- derive the probability of a cause given a symptom
- have gained importance recently due to advances in efficiency
  - more computational power available
  - better methods
- especially useful in diagnostic systems
  - medicine, computer help systems
- inverse or a posteriori probability
  - inverse of the conditional probability of an earlier event given that a later one occurred
Bayes' Rule for a Single Event
- single hypothesis H, single event E:
  P(H|E) = P(E|H) * P(H) / P(E)
  or, expanding the evidence term,
  P(H|E) = P(E|H) * P(H) / (P(E|H) * P(H) + P(E|¬H) * P(¬H))
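The expanded form can be checked with a small sketch; the function name and the diagnostic numbers below are illustrative, not from the slides:

```python
# Bayes' rule for a single hypothesis, using the expanded evidence term.

def bayes_single(p_e_given_h: float, p_h: float, p_e_given_not_h: float) -> float:
    """P(H|E) = P(E|H)P(H) / (P(E|H)P(H) + P(E|not H)P(not H))."""
    numerator = p_e_given_h * p_h
    evidence = numerator + p_e_given_not_h * (1 - p_h)  # P(E) by total probability
    return numerator / evidence

# Diagnostic example: prior P(H) = 0.01, P(E|H) = 0.9, P(E|not H) = 0.05;
# the posterior is 0.009 / 0.0585, roughly 0.15 -- rare causes stay unlikely
# even after a positive test.
```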
Bayes' Rule for Multiple Events
- multiple hypotheses Hi, multiple events E1, …, En:
  P(Hi|E1, E2, …, En) = P(E1, E2, …, En|Hi) * P(Hi) / P(E1, E2, …, En)
  or, with conditionally independent pieces of evidence Ej:
  P(Hi|E1, …, En) = P(E1|Hi) * P(E2|Hi) * … * P(En|Hi) * P(Hi) / Σk [P(E1|Hk) * P(E2|Hk) * … * P(En|Hk) * P(Hk)]
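A sketch of the independent-evidence form, computing posteriors for all hypotheses at once (the function name and dictionary layout are illustrative assumptions):

```python
# Posterior over multiple hypotheses, assuming conditionally independent evidence.

def naive_bayes_posteriors(priors: dict, likelihoods: dict) -> dict:
    """priors: {H_i: P(H_i)}; likelihoods: {H_i: [P(E_1|H_i), ..., P(E_n|H_i)]}.
    Returns the normalized posteriors P(H_i | E_1, ..., E_n)."""
    unnormalized = {}
    for h, prior in priors.items():
        p = prior
        for lik in likelihoods[h]:   # product of P(E_j | H_i)
            p *= lik
        unnormalized[h] = p
    total = sum(unnormalized.values())  # the denominator: sum over all H_k
    return {h: p / total for h, p in unnormalized.items()}
```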
Advantages and Problems of Bayesian Reasoning
- advantages
  - sound theoretical foundation
  - well-defined semantics for decision making
- problems
  - requires large amounts of probability data and sufficient sample sizes
  - subjective evidence may not be reliable
  - the assumption that pieces of evidence are independent is often not valid
  - the relationship between hypothesis and evidence is reduced to a single number
  - explanations for the user are difficult
  - high computational overhead
Certainty Factors
- denote the belief in a hypothesis H, given that some pieces of evidence E are observed
- no statement about the belief means that no evidence is present
- in contrast to probabilities and Bayes' method, they work reasonably well with partial evidence
- separation of belief, disbelief, and ignorance
- share some foundations with Dempster-Shafer theory, but are more practical
- introduced in an ad hoc way in MYCIN; later mapped to DS theory
Belief and Disbelief
- measure of belief MB(H,E): degree to which hypothesis H is supported by evidence E
  MB(H,E) = 1, if P(H) = 1
  MB(H,E) = (P(H|E) - P(H)) / (1 - P(H)), otherwise
- measure of disbelief MD(H,E): degree to which doubt in hypothesis H is supported by evidence E
  MD(H,E) = 1, if P(H) = 0
  MD(H,E) = (P(H) - P(H|E)) / P(H), otherwise
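The two measures can be sketched literally from the definitions above (argument names are illustrative; practical implementations often clamp negative values to 0):

```python
# Measures of belief and disbelief, as defined on this slide.

def mb(p_h: float, p_h_given_e: float) -> float:
    """Measure of belief in H given E."""
    if p_h == 1:
        return 1.0
    return (p_h_given_e - p_h) / (1 - p_h)

def md(p_h: float, p_h_given_e: float) -> float:
    """Measure of disbelief in H given E."""
    if p_h == 0:
        return 1.0
    return (p_h - p_h_given_e) / p_h

# Example: with P(H) = 0.3 and P(H|E) = 0.65, MB = 0.35 / 0.7 = 0.5;
# with P(H|E) = 0.15 instead, MD = 0.15 / 0.3 = 0.5.
```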
Certainty Factor
- certainty factor CF
  - ranges between -1 (denial of the hypothesis H) and +1 (confirmation of H)
  - allows the ranking of hypotheses
- difference between belief and disbelief:
  CF(H,E) = MB(H,E) - MD(H,E)
- combining antecedent evidence: use of premises with less than absolute confidence
  CF(H, E1 ∧ E2) = min(CF(H, E1), CF(H, E2))
  CF(H, E1 ∨ E2) = max(CF(H, E1), CF(H, E2))
  CF(H, ¬E) = -CF(H, E)
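The definition and the antecedent-combination rules above fit in a few lines (the helper names are illustrative):

```python
# CF(H,E) = MB - MD, plus the combination rules for compound antecedents.

def cf(mb_val: float, md_val: float) -> float:
    """Certainty factor: belief minus disbelief, in [-1, +1]."""
    return mb_val - md_val

def cf_and(cf1: float, cf2: float) -> float:
    """Conjunction of premises: take the weakest link (min)."""
    return min(cf1, cf2)

def cf_or(cf1: float, cf2: float) -> float:
    """Disjunction of premises: take the strongest link (max)."""
    return max(cf1, cf2)

def cf_not(cf1: float) -> float:
    """A negated premise flips the sign."""
    return -cf1
```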
Combining Certainty Factors
- certainty factors that support the same conclusion
  - several rules can lead to the same conclusion
  - applied incrementally as new evidence becomes available
  CFrev(CFold, CFnew) =
    CFold + CFnew * (1 - CFold),  if both ≥ 0
    CFold + CFnew * (1 + CFold),  if both < 0
    (CFold + CFnew) / (1 - min(|CFold|, |CFnew|)),  if they have opposite signs
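The incremental rule above, as a small sketch (the function name is illustrative):

```python
# Combine two certainty factors that support the same conclusion.

def cf_combine(cf_old: float, cf_new: float) -> float:
    if cf_old >= 0 and cf_new >= 0:
        return cf_old + cf_new * (1 - cf_old)
    if cf_old < 0 and cf_new < 0:
        return cf_old + cf_new * (1 + cf_old)
    # opposite signs
    return (cf_old + cf_new) / (1 - min(abs(cf_old), abs(cf_new)))

# Example: combining 0.6 and 0.4 yields 0.6 + 0.4 * (1 - 0.6) = 0.76;
# for same-sign inputs the order of combination does not matter.
```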
Characteristics of Certainty Factors
- ranges
  - measure of belief:    0 ≤ MB ≤ 1
  - measure of disbelief: 0 ≤ MD ≤ 1
  - certainty factor:     -1 ≤ CF ≤ +1

  Aspect            Probability      MB   MD   CF
  certainly true    P(H|E) = 1       1    0    1
  certainly false   P(¬H|E) = 1      0    1    -1
  no evidence       P(H|E) = P(H)    0    0    0
Advantages and Problems of Certainty Factors
- advantages
  - simple implementation
  - reasonable modeling of human experts' belief
  - expression of both belief and disbelief
  - successful applications for certain problem classes
  - evidence relatively easy to gather; no statistical base required
- problems
  - partially ad hoc approach
    - the theoretical foundation through Dempster-Shafer theory was developed later
  - combination of non-independent evidence is unsatisfactory
  - new knowledge may require changes in the certainty factors of existing knowledge
  - certainty factors can become the opposite of conditional probabilities in certain cases
  - not suitable for long inference chains
Dempster-Shafer Theory
- mathematical theory of evidence
- uncertainty is modeled through a range of probabilities
  - instead of a single number indicating a probability
- sound theoretical foundation
- allows a distinction between belief, disbelief, and ignorance (non-belief)
- certainty factors are a special case of DS theory
DS Theory Notation
- environment Θ = {O1, O2, ..., On}
  - set of objects Oi that are of interest
- frame of discernment FD
  - an environment whose elements may be possible answers
  - only one answer is the correct one
- mass probability function m
  - assigns a value from [0, 1] to every item in the frame of discernment
  - describes the degree of belief, in analogy to the mass of a physical object
- mass probability m(A)
  - portion of the total mass probability that is assigned to a specific element A of FD
Belief and Certainty
- belief Bel(A) in a subset A
  - sum of the mass probabilities of all the subsets of A (including A itself)
  - likelihood that one of its members is the conclusion
- plausibility Pl(A)
  - maximum belief of A: Pl(A) = 1 - Bel(¬A)
- certainty Cer(A)
  - the interval [Bel(A), Pl(A)]
  - expresses the range of belief
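Belief and plausibility can be sketched over a mass function whose focal elements are represented as frozensets (the medical example frame is invented for illustration):

```python
# Bel(A): total mass of all subsets of A; Pl(A): total mass of all sets
# that intersect A (equivalently 1 - Bel(not A)).

def bel(mass: dict, a: frozenset) -> float:
    return sum(m for s, m in mass.items() if s <= a)

def pl(mass: dict, a: frozenset) -> float:
    return sum(m for s, m in mass.items() if s & a)

# Example: m({flu}) = 0.5, m({flu, cold}) = 0.5 (all other subsets get 0).
# Then Bel({flu}) = 0.5 and Pl({flu}) = 1.0, so Cer({flu}) = [0.5, 1.0].
```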
Combination of Mass Probabilities
- combining two masses so that the new mass represents a consensus of the contributing pieces of evidence
- set intersection puts the emphasis on common elements of evidence, rather than on conflicting evidence
- Dempster's rule of combination:
  m1 ⊕ m2 (C) = Σ{X ∩ Y = C} m1(X) * m2(Y) / (1 - Σ{X ∩ Y = ∅} m1(X) * m2(Y))
  where X, Y are hypothesis subsets and C = X ∩ Y is their intersection; ⊕ is the orthogonal or direct sum
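A sketch of the rule above, with mass functions as dicts keyed by frozensets (the helper name is illustrative):

```python
# Dempster's rule: intersect focal elements pairwise, accumulate consensus
# mass, and renormalize by the non-conflicting mass.

def dempster_combine(m1: dict, m2: dict) -> dict:
    combined = {}
    conflict = 0.0
    for x, mx in m1.items():
        for y, my in m2.items():
            c = x & y
            if c:  # common elements: consensus mass
                combined[c] = combined.get(c, 0.0) + mx * my
            else:  # empty intersection: conflicting mass
                conflict += mx * my
    if conflict >= 1.0:
        raise ValueError("total conflict: combination is undefined")
    return {c: m / (1 - conflict) for c, m in combined.items()}
```

With m1 = {m({a}) = 0.6, m({a,b}) = 0.4} and m2 = {m({b}) = 0.5, m({a,b}) = 0.5}, the conflicting mass is 0.3 and the combined masses renormalize to sum to 1.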
Differences between Probabilities and DS Theory

  Aspect                            Probabilities        Dempster-Shafer
  aggregate                         Σi Pi = 1            m(Θ) ≤ 1
  subset X ⊆ Y                      P(X) ≤ P(Y)          m(X) > m(Y) allowed
  relationship X, ¬X (ignorance)    P(X) + P(¬X) = 1     m(X) + m(¬X) ≤ 1
Evidential Reasoning
- extension of DS theory that deals with uncertain, imprecise, and possibly inaccurate knowledge
- also uses evidential intervals to express the confidence in a statement
Evidential Intervals
- Bel: belief, the lower bound of the evidential interval
- Pls: plausibility, the upper bound

  Meaning                            Evidential Interval
  completely true                    [1, 1]
  completely false                   [0, 0]
  completely ignorant                [0, 1]
  tends to support                   [Bel, 1], where 0 < Bel < 1
  tends to refute                    [0, Pls], where 0 < Pls < 1
  tends to both support and refute   [Bel, Pls], where 0 < Bel ≤ Pls < 1
Advantages and Problems of Dempster-Shafer
- advantages
  - clear, rigorous foundation
  - ability to express confidence through intervals
    - certainty about certainty
  - proper treatment of ignorance
- problems
  - non-intuitive determination of mass probability
  - very high computational overhead
  - may produce counterintuitive results due to normalization
  - usability somewhat unclear
Post-Test
Evaluation
- Criteria
Important Concepts and Terms
Bayesian networks, belief, certainty factor, compound probability, conditional probability, Dempster-Shafer theory, disbelief, evidential reasoning, ignorance, inference, inference mechanism, knowledge, knowledge representation, mass function, probability, reasoning, rule, sample, set, uncertainty
Summary: Reasoning and Uncertainty
- many practical tasks require reasoning under uncertainty
  - missing, inexact, or inconsistent knowledge
- variations of probability theory are often combined with rule-based approaches
  - this works reasonably well for many practical problems
- Bayesian networks have gained some prominence
  - improved methods, sufficient computational power