1
Uncertainty in AI
2
Birds can fly, right? Seems like common sense knowledge.
3
The problem with First-Order Logic is that universal rules are too strict. ∀x bird(x) → flies(x) means all birds must fly, but there are many exceptions... If we asserted:
penguin(opus)
∀x penguin(x) → ¬flies(x)
∀x penguin(x) → bird(x)
this would make the KB inconsistent.
4
[Image: example birds: Opus, Tweety, Big Bird, Woodstock]
5
We can't enumerate all possible exceptions: what about a robin with a broken wing? What about birds that are made out of plastic? What about Big Bird? This is known as the "qualification problem".
6
Uncertainty in actions: if a gun is loaded and you pull the trigger, the gun will fire, right? ...unless it is a toy gun ...unless it is underwater ...unless the barrel is filled with concrete ...unless it is defective.
7
What we might want to say is "most birds fly", or "some birds do not fly", or "tweety possibly flies". We cannot express these in FOL: there are no degrees of truth. "Some birds do not fly" becomes ∃x bird(x) ∧ ¬flies(x). Solutions:
Default (non-monotonic) Logics
Fuzzy Logics
procedural approaches
Probability
8
Default Logic: extensions of FOL that allow for exceptions. FOL is monotonic: anything that is entailed by a set of sentences will still be entailed by a superset of those sentences. If bird(opus) implies flies(opus), then bird(opus) ∧ penguin(opus) implies flies(opus) [necessarily]. Default Logics are non-monotonic: old conclusions can get retracted due to new information, so bird(opus) ∧ penguin(opus) need not imply flies(opus).
9
Default Logics require an alternative model theory, such as possible-worlds (Kripke) semantics. Without going into the complexities, most default logics are based on drawing conclusions from "minimal models" that make the fewest assumptions necessary.
10
Default Logic system I: a new type of "soft" implication, bird(x) ⇝ flies(x), where "⇝" means "...and if it is not inconsistent to believe so, then...". Example:
penguin(opus)
bird(tweety)
∀x penguin(x) → ¬flies(x)
∀x penguin(x) → bird(x)
∀x bird(x) ⇝ flies(x)
Model: m0 = U={a,b}, D={opus→a, tweety→b}, R: bird={a,b}, penguin={a}, flies={b}. Because opus is a penguin, 'a' cannot be in the set of things that fly; this makes it inconsistent to apply the "birds fly" rule to opus. This model is the "minimal" consistent model of the sentences.
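A minimal Python sketch of this idea (illustrative only; the set-based encoding and names are assumptions, not a standard default-logic implementation): apply the hard rules first, then apply the soft "birds fly" default only where it does not create an inconsistency.

# Illustrative sketch of applying a default rule only where consistent.
birds = {"opus", "tweety"}
penguins = {"opus"}

cannot_fly = set(penguins)   # hard rule: penguin(x) -> not flies(x)
birds |= penguins            # hard rule: penguin(x) -> bird(x)

# Soft default bird(x) ~> flies(x): apply only if not already inconsistent.
flies = {x for x in birds if x not in cannot_fly}

print(flies)  # {'tweety'} -- the "minimal" consistent conclusion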
11
Circumscription: another approach to defeasible reasoning. Add "abnormal" predicates to default sentences:
penguin(opus)
bird(tweety)
∀x penguin(x) → ¬flies(x)
∀x penguin(x) → bird(x)
∀x bird(x) ∧ ¬abnormal1(x) → flies(x)
Find models that make the fewest abnormality assumptions needed for consistency: U={a,b}, D={opus→a, tweety→b}, R: bird={a,b}, penguin={a}, flies={b}, abnormal1={a}. There is no reason to assume tweety is abnormal, but opus needs to be abnormal to explain not flying.
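The "fewest abnormality assumptions" idea can be sketched as a search over candidate abnormal sets, smallest first (again an illustrative encoding with invented names, not a real circumscription engine):

from itertools import chain, combinations

# Search for the smallest set of 'abnormal' individuals that keeps the
# rules consistent.
birds = {"opus", "tweety"}
penguins = {"opus"}

def consistent(abnormal):
    # bird(x) and not abnormal(x) -> flies(x); penguin(x) -> not flies(x)
    flies = {x for x in birds if x not in abnormal}
    return not (flies & penguins)   # no penguin may end up flying

# Enumerate subsets of birds from smallest to largest; take the first
# consistent one (the minimal abnormality assumption).
subsets = chain.from_iterable(combinations(birds, r) for r in range(len(birds) + 1))
minimal = next(set(s) for s in subsets if consistent(set(s)))
print(minimal)  # {'opus'}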
12
Procedural Solutions
Semantic networks: 'isa' links encode inheritance relationships; to answer a query, follow pointers until the property is defined. [Diagram: tweety isa bird (canfly: T); opus isa penguin (canfly: F); penguin isa bird]
Closed World Assumption in Prolog: anything you cannot prove is assumed to be false. This makes it more concise to write KBs, by asserting only true things:
flies(X) :- bird(X), \+ penguin(X), \+ too_big_to_fly(X).
bird(tweety).
bird(opus).
bird(big_bird).
penguin(opus).
too_big_to_fly(big_bird).
% ?- flies(tweety).   succeeds
% ?- flies(opus).     fails, because penguin(opus) is provable
Note that in FOL we could not conclude flies(tweety), because ¬penguin(tweety) would have to be explicitly asserted as a fact.
13
Rule strengths or priorities in Expert Systems: an old, ad-hoc approach (with unclear semantics):
penguin(x) →0.9 ¬flies(x)
bird(x) →0.5 flies(x)
Example: you can assign a 'salience' (an integer) to defrules in Jess; this determines the order in which rules fire during forward chaining.
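A toy sketch of salience-based conflict resolution (the rule encoding and the salience values are invented for illustration; real Jess rules use its own Lisp-like defrule syntax):

# Toy forward-chaining step: among rules whose conditions match, fire in
# descending salience order, skipping conclusions that contradict earlier ones.
facts = {"bird(opus)", "penguin(opus)"}

rules = [
    # (salience, condition fact, conclusion fact)
    (90, "penguin(opus)", "not_flies(opus)"),
    (50, "bird(opus)", "flies(opus)"),
]

for salience, condition, conclusion in sorted(rules, reverse=True):
    if condition in facts and conclusion not in facts:
        negated = ("not_" + conclusion) if not conclusion.startswith("not_") else conclusion[4:]
        if negated not in facts:
            facts.add(conclusion)

print(facts)  # the high-salience penguin rule fires first, blocking flies(opus)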
14
Probability: encode knowledge in the form of prior probabilities and conditional probabilities:
P(x speaks Portuguese | x is from Brazil) = 0.9
P(x speaks Portuguese) = 0.012
P(x is from Brazil) = 0.007
P(x flies | x is a bird) = 0.8 (?)
Inference is done by calculating posterior probabilities given evidence:
compute P(cavity | toothache, flossing, dental history, recent consumption of candy...)
compute P(fed will raise interest rate | employment=5%, inflation=0.5%, GDP=2%, recent geopolitical events...)
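As a quick check of how such numbers combine, a small sketch inverting the Brazil conditional with Bayes' rule (introduced formally a few slides below), using only the figures above:

# Invert P(speaks | brazil) into P(brazil | speaks) with Bayes' rule.
p_speaks_given_brazil = 0.9
p_speaks = 0.012
p_brazil = 0.007

p_brazil_given_speaks = p_speaks_given_brazil * p_brazil / p_speaks
print(round(p_brazil_given_speaks, 3))  # 0.525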
15
Many modern knowledge-based systems are based on probabilistic inference, including Bayesian networks, Hidden Markov Models, and Markov Decision Processes. Example: Bayesian networks are used for inferring user goals or help needs from actions like mouse clicks in an automated software help system (think 'Clippy'). Decision Theory combines utilities with probabilities of outcomes to decide which actions to take. The challenge is capturing all the numbers needed for the prior and conditional probabilities: objectivists (frequentists) hold that probabilities represent outcomes of trials/experiments; subjectivists hold that probabilities are degrees of belief. Probability and statistics are at the core of many Machine Learning algorithms.
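A minimal decision-theory sketch (the actions, outcome probabilities, and utilities are all invented for illustration): choose the action with the highest expected utility.

# Expected utility of an action = sum over outcomes of P(outcome) * U(outcome).
actions = {
    "treat_cavity": [(0.9, 100), (0.1, -50)],   # (P(outcome), utility)
    "wait_and_see": [(0.4, 80),  (0.6, -120)],
}

def expected_utility(outcomes):
    return sum(p * u for p, u in outcomes)

best = max(actions, key=lambda a: expected_utility(actions[a]))
print(best, expected_utility(actions[best]))  # treat_cavity 85.0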
16
Causal vs. diagnostic knowledge. Causal: P(x has a toothache | x has a cavity) = 0.9. Diagnostic: P(x has a cavity | x has a toothache) = 0.5. Typically it is easier to articulate knowledge in the causal direction, but we often want to use it in the diagnostic direction, making inferences from observations.
17
Product rule: the joint P(A,B) = P(A|B) * P(B). Bayes' Rule converts between causal and diagnostic: P(H|E) = P(E|H) * P(H) / P(E), where H = hypothesis (cause, disease) and E = evidence (effect, symptoms).
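To connect the rule to the numbers on the previous slide: if we additionally assume P(cavity) = 0.1 and P(toothache) = 0.18 (these two priors are assumptions, not given in the slides), Bayes' rule recovers the diagnostic value from the causal one:

# Bayes' rule: P(H|E) = P(E|H) * P(H) / P(E)
p_toothache_given_cavity = 0.9   # causal knowledge (from the slide)
p_cavity = 0.1                   # assumed prior, for illustration
p_toothache = 0.18               # assumed prior, for illustration

p_cavity_given_toothache = p_toothache_given_cavity * p_cavity / p_toothache
print(round(p_cavity_given_toothache, 3))  # 0.5, the diagnostic value above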
18
Joint probability table (JPT): you can calculate the answer to any query from the JPT. The problem is that there are exponentially many entries (2^N, where N is the number of binary random variables).
19
Example:
P(¬cavity | toothache) = P(¬cavity ∧ toothache) / P(toothache) = (0.016 + 0.064) / (0.108 + 0.012 + 0.016 + 0.064) = 0.08 / 0.2 = 0.4
(each sum combines the two JPT entries that agree on toothache and cavity but differ on the table's third variable)
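The same lookup-and-normalize computation as code. Only the four toothache entries come from the slide; the third variable's name ("catch") and the four ¬toothache entries are assumed here so that the table sums to 1:

# Full joint over three binary variables (toothache, catch, cavity).
jpt = {
    # (toothache, catch, cavity): probability
    (True,  True,  True):  0.108,
    (True,  False, True):  0.012,
    (True,  True,  False): 0.016,
    (True,  False, False): 0.064,
    (False, True,  True):  0.072,   # assumed entries below
    (False, False, True):  0.008,
    (False, True,  False): 0.144,
    (False, False, False): 0.576,
}

def p(predicate):
    """Sum the joint entries where predicate(toothache, catch, cavity) holds."""
    return sum(pr for world, pr in jpt.items() if predicate(*world))

p_toothache = p(lambda t, c, cav: t)
p_not_cavity_and_toothache = p(lambda t, c, cav: t and not cav)
print(round(p_not_cavity_and_toothache / p_toothache, 2))  # 0.4, as above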
20
Solution to reduce complexity: employ an independence assumption. Most variables are not strictly independent, in that there is some statistical association (but which is cause and which is effect?). However, many variables are conditionally independent: P(A,B|C) = P(A|C) P(B|C).
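A quick illustration of the savings in table size (a parameter count only, under an assumed structure where N binary evidence variables are conditionally independent given one binary cause):

# Parameter counts: full joint vs. conditional-independence factorization.
N = 20  # number of binary evidence variables

full_joint_entries = 2 ** (N + 1)   # one entry per complete assignment
factored_entries = 1 + 2 * N        # P(C) plus P(Ei|C) and P(Ei|~C) per Ei

print(full_joint_entries)   # 2097152
print(factored_entries)     # 41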
21
Bayesian networks. [Diagram: burglary network, including the links Burglary → Alarm → JohnCalls] Note: the absence of a direct link between Burglary and JohnCalls means that JohnCalls is conditionally independent of Burglary given Alarm.
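A minimal sketch of that conditional-independence claim as code (only the Burglary → Alarm → JohnCalls chain from the slide is modeled, and all CPT numbers are invented for illustration):

# Tiny Bayesian network: Burglary -> Alarm -> JohnCalls.
p_b = 0.001                              # P(Burglary), invented
p_a_given_b = {True: 0.94, False: 0.01}  # P(Alarm | Burglary), invented
p_j_given_a = {True: 0.90, False: 0.05}  # P(JohnCalls | Alarm), invented

def joint(b, a, j):
    """P(b, a, j) factored along the network's links."""
    pb = p_b if b else 1 - p_b
    pa = p_a_given_b[b] if a else 1 - p_a_given_b[b]
    pj = p_j_given_a[a] if j else 1 - p_j_given_a[a]
    return pb * pa * pj

# JohnCalls is conditionally independent of Burglary given Alarm:
# P(JohnCalls | Alarm, Burglary) is the same whether Burglary is true or false.
for b in (True, False):
    num = joint(b, True, True)
    den = sum(joint(b, True, j) for j in (True, False))
    print(b, round(num / den, 2))   # both lines print 0.9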