1
Uncertainty in AI
2
Birds can fly, right? Seems like common sense knowledge.
3
The problem with First-Order Logic is that universal rules are too strict. ∀x bird(x) → flies(x) means all birds must fly, but there are many exceptions... If we asserted:
penguin(opus)
∀x penguin(x) → ¬flies(x)
∀x penguin(x) → bird(x)
this would make the KB inconsistent.
4
[Image: example birds: Opus, Tweety, Big Bird, Woodstock]
5
We can't enumerate all possible exceptions: what about a robin with a broken wing? What about birds that are made out of plastic? What about Big Bird? This is known as the "qualification problem".
6
Uncertainty in actions: if a gun is loaded and you pull the trigger, the gun will fire, right? ...unless it is a toy gun ...unless it is underwater ...unless the barrel is filled with concrete ...unless it is defective.
7
What we might want to say is "most birds fly", or "some birds do not fly", or "tweety possibly flies". We cannot express these in FOL: there are no degrees of truth. "Some birds do not fly" becomes ∃x bird(x) ∧ ¬flies(x). Solutions:
Default (non-monotonic) Logics
Fuzzy Logics
procedural approaches
Probability
8
Default Logic: extensions of FOL that allow for exceptions. FOL is monotonic: anything that is entailed by a set of sentences will still be entailed by a superset of those sentences. If bird(opus) implies flies(opus), then bird(opus) ∧ penguin(opus) implies flies(opus) [necessarily]. Default Logics are non-monotonic: old conclusions can get retracted due to new information, so bird(opus) ∧ penguin(opus) need not imply flies(opus).
9
Default Logics require an alternative model theory, such as possible-worlds (Kripke) semantics. Without going into the complexities, most default logics are based on drawing conclusions from "minimal models" that make the fewest assumptions necessary.
10
Default Logic system I: a new type of "soft" implication, bird(x) ⇝ flies(x), where "⇝" means "...and if it is not inconsistent to believe so, then...". Example:
penguin(opus)
bird(tweety)
∀x penguin(x) → ¬flies(x)
∀x penguin(x) → bird(x)
∀x bird(x) ⇝ flies(x)
Model: m0 = U={a,b}, D={opus→a, tweety→b}, R: bird={a,b}, penguin={a}, flies={b}. Because opus is a penguin, 'a' cannot be in the set of things that fly; this makes it inconsistent to apply the "birds fly" rule to opus. This model is the "minimal" consistent model of the sentences.
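A minimal Python sketch of this idea (illustrative only; the set-based encoding and names are assumptions, not a standard default-logic implementation): apply the hard rules first, then apply the soft "birds fly" default only where it does not create an inconsistency.

# Illustrative sketch of applying a default rule only where consistent.
birds = {"opus", "tweety"}
penguins = {"opus"}

cannot_fly = set(penguins)   # hard rule: penguin(x) -> not flies(x)
birds |= penguins            # hard rule: penguin(x) -> bird(x)

# Soft default bird(x) ~> flies(x): apply only if not already inconsistent.
flies = {x for x in birds if x not in cannot_fly}

print(flies)  # {'tweety'} -- the "minimal" consistent conclusion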
11
Circumscription: another approach to defeasible reasoning. Add "abnormal" predicates to default sentences:
penguin(opus)
bird(tweety)
∀x penguin(x) → ¬flies(x)
∀x penguin(x) → bird(x)
∀x bird(x) ∧ ¬abnormal1(x) → flies(x)
Find models that make the fewest abnormality assumptions needed for consistency: U={a,b}, D={opus→a, tweety→b}, R: bird={a,b}, penguin={a}, flies={b}, abnormal1={a}. There is no reason to assume tweety is abnormal, but opus needs to be abnormal to explain not flying.
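The "fewest abnormality assumptions" idea can be sketched as a search over candidate abnormal sets, smallest first (again an illustrative encoding with invented names, not a real circumscription engine):

from itertools import chain, combinations

# Search for the smallest set of 'abnormal' individuals that keeps the
# rules consistent.
birds = {"opus", "tweety"}
penguins = {"opus"}

def consistent(abnormal):
    # bird(x) and not abnormal(x) -> flies(x); penguin(x) -> not flies(x)
    flies = {x for x in birds if x not in abnormal}
    return not (flies & penguins)   # no penguin may end up flying

# Enumerate subsets of birds from smallest to largest; take the first
# consistent one (the minimal abnormality assumption).
subsets = chain.from_iterable(combinations(birds, r) for r in range(len(birds) + 1))
minimal = next(set(s) for s in subsets if consistent(set(s)))
print(minimal)  # {'opus'}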
12
Procedural Solutions
Semantic networks: 'isa' links encode inheritance relationships; to answer a query, follow pointers until the property is defined. [Diagram: tweety isa bird (canfly: T); opus isa penguin (canfly: F); penguin isa bird]
Closed World Assumption in Prolog: anything you cannot prove is assumed to be false. This makes it more concise to write KBs, by asserting only true things:
flies(X) :- bird(X), \+ penguin(X), \+ too_big_to_fly(X).
bird(tweety).
bird(opus).
bird(big_bird).
penguin(opus).
too_big_to_fly(big_bird).
% ?- flies(tweety).   succeeds
% ?- flies(opus).     fails, because penguin(opus) is provable
Note that in FOL we could not conclude flies(tweety), because ¬penguin(tweety) would have to be explicitly asserted as a fact.
13
Rule strengths or priorities in Expert Systems: an old, ad-hoc approach (with unclear semantics):
penguin(x) →0.9 ¬flies(x)
bird(x) →0.5 flies(x)
Example: you can assign a 'salience' (an integer) to defrules in Jess; this determines the order in which rules fire during forward chaining.
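A toy sketch of salience-based conflict resolution (the rule encoding and the salience values are invented for illustration; real Jess rules use its own Lisp-like defrule syntax):

# Toy forward-chaining step: among rules whose conditions match, fire in
# descending salience order, skipping conclusions that contradict earlier ones.
facts = {"bird(opus)", "penguin(opus)"}

rules = [
    # (salience, condition fact, conclusion fact)
    (90, "penguin(opus)", "not_flies(opus)"),
    (50, "bird(opus)", "flies(opus)"),
]

for salience, condition, conclusion in sorted(rules, reverse=True):
    if condition in facts and conclusion not in facts:
        negated = ("not_" + conclusion) if not conclusion.startswith("not_") else conclusion[4:]
        if negated not in facts:
            facts.add(conclusion)

print(facts)  # the high-salience penguin rule fires first, blocking flies(opus)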
14
Probability: encode knowledge in the form of prior probabilities and conditional probabilities:
P(x speaks Portuguese | x is from Brazil) = 0.9
P(x speaks Portuguese) = 0.012
P(x is from Brazil) = 0.007
P(x flies | x is a bird) = 0.8 (?)
Inference is done by calculating posterior probabilities given evidence:
compute P(cavity | toothache, flossing, dental history, recent consumption of candy...)
compute P(fed will raise interest rate | employment=5%, inflation=0.5%, GDP=2%, recent geopolitical events...)
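As a quick check of how such numbers combine, a small sketch inverting the Brazil conditional with Bayes' rule (introduced formally a few slides below), using only the figures above:

# Invert P(speaks | brazil) into P(brazil | speaks) with Bayes' rule.
p_speaks_given_brazil = 0.9
p_speaks = 0.012
p_brazil = 0.007

p_brazil_given_speaks = p_speaks_given_brazil * p_brazil / p_speaks
print(round(p_brazil_given_speaks, 3))  # 0.525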
15
Many modern knowledge-based systems are based on probabilistic inference, including Bayesian networks, Hidden Markov Models, and Markov Decision Processes. Example: Bayesian networks are used for inferring user goals or help needs from actions like mouse clicks in an automated software help system (think 'Clippy'). Decision Theory combines utilities with probabilities of outcomes to decide which actions to take. The challenge is capturing all the numbers needed for the prior and conditional probabilities: objectivists (frequentists) hold that probabilities represent outcomes of trials/experiments; subjectivists hold that probabilities are degrees of belief. Probability and statistics are at the core of many Machine Learning algorithms.
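A minimal decision-theory sketch (the actions, outcome probabilities, and utilities are all invented for illustration): choose the action with the highest expected utility.

# Expected utility of an action = sum over outcomes of P(outcome) * U(outcome).
actions = {
    "treat_cavity": [(0.9, 100), (0.1, -50)],   # (P(outcome), utility)
    "wait_and_see": [(0.4, 80),  (0.6, -120)],
}

def expected_utility(outcomes):
    return sum(p * u for p, u in outcomes)

best = max(actions, key=lambda a: expected_utility(actions[a]))
print(best, expected_utility(actions[best]))  # treat_cavity 85.0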
16
Causal vs. diagnostic knowledge. Causal: P(x has a toothache | x has a cavity) = 0.9. Diagnostic: P(x has a cavity | x has a toothache) = 0.5. Typically it is easier to articulate knowledge in the causal direction, but we often want to use it in the diagnostic direction, making inferences from observations.
17
Product rule: the joint P(A,B) = P(A|B) * P(B). Bayes' Rule converts between causal and diagnostic: P(H|E) = P(E|H) * P(H) / P(E), where H = hypothesis (cause, disease) and E = evidence (effect, symptoms).
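To connect the rule to the numbers on the previous slide: if we additionally assume P(cavity) = 0.1 and P(toothache) = 0.18 (these two priors are assumptions, not given in the slides), Bayes' rule recovers the diagnostic value from the causal one:

# Bayes' rule: P(H|E) = P(E|H) * P(H) / P(E)
p_toothache_given_cavity = 0.9   # causal knowledge (from the slide)
p_cavity = 0.1                   # assumed prior, for illustration
p_toothache = 0.18               # assumed prior, for illustration

p_cavity_given_toothache = p_toothache_given_cavity * p_cavity / p_toothache
print(round(p_cavity_given_toothache, 3))  # 0.5, the diagnostic value above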
18
Joint probability table (JPT): you can calculate the answer to any query from the JPT. The problem is that there are exponentially many entries (2^N, where N is the number of binary random variables).
19
Example:
P(¬cavity | toothache) = P(¬cavity ∧ toothache) / P(toothache) = (0.016 + 0.064) / (0.108 + 0.012 + 0.016 + 0.064) = 0.08 / 0.2 = 0.4
(each sum combines the two JPT entries that agree on toothache and cavity but differ on the table's third variable)
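The same lookup-and-normalize computation as code. Only the four toothache entries come from the slide; the third variable's name ("catch") and the four ¬toothache entries are assumed here so that the table sums to 1:

# Full joint over three binary variables (toothache, catch, cavity).
jpt = {
    # (toothache, catch, cavity): probability
    (True,  True,  True):  0.108,
    (True,  False, True):  0.012,
    (True,  True,  False): 0.016,
    (True,  False, False): 0.064,
    (False, True,  True):  0.072,   # assumed entries below
    (False, False, True):  0.008,
    (False, True,  False): 0.144,
    (False, False, False): 0.576,
}

def p(predicate):
    """Sum the joint entries where predicate(toothache, catch, cavity) holds."""
    return sum(pr for world, pr in jpt.items() if predicate(*world))

p_toothache = p(lambda t, c, cav: t)
p_not_cavity_and_toothache = p(lambda t, c, cav: t and not cav)
print(round(p_not_cavity_and_toothache / p_toothache, 2))  # 0.4, as above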
20
Solution to reduce complexity: employ an independence assumption. Most variables are not strictly independent, in that there is some statistical association (but which is cause and which is effect?). However, many variables are conditionally independent: P(A,B|C) = P(A|C) P(B|C).
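A quick illustration of the savings in table size (a parameter count only, under an assumed structure where N binary evidence variables are conditionally independent given one binary cause):

# Parameter counts: full joint vs. conditional-independence factorization.
N = 20  # number of binary evidence variables

full_joint_entries = 2 ** (N + 1)   # one entry per complete assignment
factored_entries = 1 + 2 * N        # P(C) plus P(Ei|C) and P(Ei|~C) per Ei

print(full_joint_entries)   # 2097152
print(factored_entries)     # 41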
21
Bayesian networks. [Diagram: burglary network, including the links Burglary → Alarm → JohnCalls] Note: the absence of a direct link between Burglary and JohnCalls means that JohnCalls is conditionally independent of Burglary given Alarm.
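A minimal sketch of that conditional-independence claim as code (only the Burglary → Alarm → JohnCalls chain from the slide is modeled, and all CPT numbers are invented for illustration):

# Tiny Bayesian network: Burglary -> Alarm -> JohnCalls.
p_b = 0.001                              # P(Burglary), invented
p_a_given_b = {True: 0.94, False: 0.01}  # P(Alarm | Burglary), invented
p_j_given_a = {True: 0.90, False: 0.05}  # P(JohnCalls | Alarm), invented

def joint(b, a, j):
    """P(b, a, j) factored along the network's links."""
    pb = p_b if b else 1 - p_b
    pa = p_a_given_b[b] if a else 1 - p_a_given_b[b]
    pj = p_j_given_a[a] if j else 1 - p_j_given_a[a]
    return pb * pa * pj

# JohnCalls is conditionally independent of Burglary given Alarm:
# P(JohnCalls | Alarm, Burglary) is the same whether Burglary is true or false.
for b in (True, False):
    num = joint(b, True, True)
    den = sum(joint(b, True, j) for j in (True, False))
    print(b, round(num / den, 2))   # both lines print 0.9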