DR.M.THIAGARAJAN ASSOCIATE PROFESSOR OF MATHEMATICS

1 DR.M.THIAGARAJAN ASSOCIATE PROFESSOR OF MATHEMATICS
ST JOSEPH’S COLLEGE TRICHIRAPPALLI

2 Uncertainty in AI
Outline: Introduction. Basic Probability Theory. Probabilistic Reasoning. Why should we use probability theory? Dutch Book Theorem.

3 Sources of Uncertainty
Information is partial. Information is not fully reliable. Representation language is inherently imprecise. Information comes from multiple sources and is conflicting. Information is approximate. Non-absolute cause-effect relationships exist.

4 Basic Probability Probability theory enables us to make rational decisions. Which mode of transportation is safer: Car or Plane? What is the probability of an accident?

5 Basic Probability Theory
An experiment has a set of potential outcomes, e.g., throwing a die. The sample space of an experiment is the set of all possible outcomes, e.g., {1, 2, 3, 4, 5, 6}. An event is a subset of the sample space, e.g., {2}, {3, 6}, even = {2, 4, 6}, odd = {1, 3, 5}.

6 Probability as Relative Frequency
An event has a probability. Consider a long sequence of experiments. If we look at the number of times a particular event occurs in that sequence, and compare it to the total number of experiments, we can compute a ratio. This ratio is one way of estimating the probability of the event. P(E) = (# of times E occurred)/(total # of trials)

7 Example 100 attempts are made to swim a length in 30 secs. The swimmer succeeds on 20 occasions; therefore the probability that the swimmer can complete the length in 30 secs is 20/100 = 0.2, and the probability of failure is 1 − 0.2 = 0.8. The experiments, the sample space and the events must be defined clearly for probability to be meaningful. What is the probability of an accident?

8 Theoretical Probability
Principle of Indifference: alternatives are always to be judged equiprobable if we have no reason to expect or prefer one over the other. Each outcome in the sample space is assigned equal probability. Example: throw a die. P({1}) = P({2}) = ... = P({6}) = 1/6.
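The equiprobable assignment can be sketched in a few lines of Python. This is only an illustration: the helper name `prob` and the constant `SAMPLE_SPACE` are made up here and are not part of the slides.

```python
from fractions import Fraction

# Sample space for one throw of a fair six-sided die.
SAMPLE_SPACE = frozenset({1, 2, 3, 4, 5, 6})

def prob(event):
    """Probability of an event (a subset of SAMPLE_SPACE) when every
    outcome is judged equiprobable."""
    event = frozenset(event)
    if not event <= SAMPLE_SPACE:
        raise ValueError("an event must be a subset of the sample space")
    return Fraction(len(event), len(SAMPLE_SPACE))

even = {2, 4, 6}
odd = {1, 3, 5}

print(prob({2}))     # 1/6
print(prob({3, 6}))  # 1/3
print(prob(even))    # 1/2
```

Using `Fraction` keeps the results exact, so P(even) + P(odd) comes out as exactly 1.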

9 Law of Large Numbers As the number of experiments increases, the relative frequency of an event more closely approximates the theoretical probability of the event, if the theoretical assumptions hold. Buffon's Needle for computing π: draw parallel lines 1 inch apart on a plane and throw a 1-inch needle on the plane. Then P(needle crossing a line) = 2/π.
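A minimal simulation sketch of Buffon's needle, assuming the needle's centre and angle are uniformly distributed (the usual idealization). The function name and the fixed seed are choices made for this illustration. Note that the sampler itself uses `math.pi` to draw the angle, so this demonstrates the law of large numbers rather than offering an independent way to obtain π.

```python
import math
import random

def buffon_crossing_fraction(num_throws, seed=0):
    """Relative frequency with which a 1-inch needle crosses one of the
    parallel lines drawn 1 inch apart."""
    rng = random.Random(seed)
    crossings = 0
    for _ in range(num_throws):
        # Distance from the needle's centre to the nearest line: uniform on [0, 1/2].
        centre = rng.uniform(0.0, 0.5)
        # Acute angle between the needle and the lines: uniform on [0, pi/2].
        angle = rng.uniform(0.0, math.pi / 2)
        # The needle crosses when its half-length, projected perpendicular
        # to the lines, reaches the nearest line.
        if centre <= 0.5 * math.sin(angle):
            crossings += 1
    return crossings / num_throws

fraction = buffon_crossing_fraction(200_000)
pi_estimate = 2 / fraction
print(fraction, 2 / math.pi)   # relative frequency vs. theoretical 2/pi
print(pi_estimate)
```

With more throws the relative frequency should settle closer to 2/π, as the law of large numbers suggests.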

10 Large Number Reveals Untruth in Assumptions
Results of 1,000,000 throws of a die Number Fraction

11 Axioms of Probability Theory
Suppose P(.) is a probability function. Then: 1. For any event E, 0 ≤ P(E) ≤ 1. 2. P(S) = 1, where S is the sample space. 3. For any two mutually exclusive events E1 and E2, P(E1 ∪ E2) = P(E1) + P(E2). Any function that satisfies the above three axioms is a probability function. The probability of a logical truth is 1. The probability that one of two logically exclusive statements is true equals the sum of their respective probabilities.

12 Joint Probability Let A and B be two events. The joint probability of both A and B being true is denoted by P(A, B). Example: P(spade) is the probability of the top card being a spade. P(king) is the probability of the top card being a king. P(spade, king) is the probability of the top card being both a spade and a king, i.e., the king of spades. Is P(king, spade) = P(spade, king)? When does P(A, B) = P(A)?

13 Properties of Probability
1. P(¬E) = 1 − P(E). 2. If E1 and E2 are logically equivalent, then P(E1) = P(E2). E1: Not all philosophers are more than six feet tall. E2: Some philosopher is not more than six feet tall. Then P(E1) = P(E2). 3. P(E1, E2) ≤ P(E1).

14 Conditional Probability
The probability of an event may change after knowing another event. The probability of A given B is denoted by P(A|B). Example: P(W=space) is the probability that a randomly selected word from an English text is 'space'. P(W=space | W'=outer) is the probability of 'space' if the previous word is 'outer'.

15 Example
A: the top card of a deck of poker cards is the king of spades. P(A) = 1/52. However, if we know B: the top card is a king, then the probability of A given that B is true is P(A|B) = 1/4.

16 How to Compute P(A|B)? [Figure: diagram of events A and B]
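The slide's figure is not preserved in this transcript. One standard answer is the definition P(A|B) = P(A, B)/P(B) for P(B) > 0. The sketch below applies it to the king-of-spades example by counting cards. The helper names `prob` and `cond_prob` are illustrative, not from the slides.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["spades", "hearts", "diamonds", "clubs"]
DECK = list(product(RANKS, SUITS))  # 52 equally likely top cards

def prob(event):
    """P(event), where event is a predicate on a (rank, suit) card."""
    return Fraction(sum(1 for card in DECK if event(card)), len(DECK))

def cond_prob(a, b):
    """P(a | b) = P(a, b) / P(b); requires P(b) > 0."""
    p_b = prob(b)
    if p_b == 0:
        raise ValueError("P(b) must be positive")
    return prob(lambda card: a(card) and b(card)) / p_b

def is_king(card):
    return card[0] == "K"

def is_king_of_spades(card):
    return card == ("K", "spades")

print(prob(is_king_of_spades))                # 1/52
print(cond_prob(is_king_of_spades, is_king))  # 1/4
```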

17 Business Students Of 100 students completing a course, 20 were business majors. Ten students received As in the course, and three of these were business majors. Suppose A is the event that a randomly selected student got an A in the course, and B is the event that a randomly selected student is a business major. What is the probability of A? What is the probability of A after knowing B is true?
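A worked sketch of the two questions, using only the counts given on the slide and the definition P(A|B) = P(A, B)/P(B). The variable names are chosen for this illustration.

```python
from fractions import Fraction

total_students = 100
business_majors = 20          # event B
a_grades = 10                 # event A
business_majors_with_a = 3    # event A and B

p_a = Fraction(a_grades, total_students)
p_b = Fraction(business_majors, total_students)
p_a_and_b = Fraction(business_majors_with_a, total_students)

# Conditional probability from its definition.
p_a_given_b = p_a_and_b / p_b

print(p_a)          # 1/10
print(p_a_given_b)  # 3/20
```

Knowing that the student is a business major raises the probability of an A from 0.10 to 0.15.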

18 Probabilistic Reasoning
Evidence What we know about a situation. Hypothesis What we want to conclude. Compute P( Hypothesis | Evidence )

19 Credit Card Authorization
E is the data about the applicant's age, job, education, income, credit history, etc. H is the hypothesis that the credit card will provide a positive return. The decision of whether to issue the credit card to the applicant is based on the probability P(H|E).

20 Medical Diagnosis E is a set of symptoms, such as, coughing, sneezing, headache, ... H is a disorder, e.g., common cold, SARS, flu. The diagnosis problem is to find an H (disorder) such that P(H|E) is maximum.

21 Linda is 31 years old, single, outspoken, and very bright
Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in antinuclear demonstrations. Please rank the following statements by their probability, using 1 for the most probable and 8 for the least probable. a. Linda is a teacher in elementary school. b. Linda works in a bookstore and takes yoga classes. c. Linda is active in the feminist movement. d. Linda is a psychiatric social worker. e. Linda is a member of the League of Women Voters. f. Linda is a bank teller. g. Linda is an insurance salesperson. h. Linda is a bank teller and is active in the feminist movement.

22 Example A patient takes a lab test and the result comes back positive. The test has a false negative rate of 2% and false positive rate of 3%. Furthermore, 0.8% of the entire population have this cancer. What is the probability of cancer if we know the test result is positive?

23 Bayes Theorem If P(E2)>0, then P(E1|E2)=P(E2|E1)P(E1)/P(E2)
This can be derived from the definition of conditional probability.
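Bayes' theorem can be applied to the lab-test example from slide 22. This is a sketch: it reads "false negative rate 2%" as P(positive | cancer) = 0.98 and "false positive rate 3%" as P(positive | no cancer) = 0.03, and expands P(positive) by the law of total probability. The function name `posterior` is illustrative.

```python
from fractions import Fraction

def posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
    """P(H | E) by Bayes' theorem, with P(E) expanded over H and not-H."""
    p_evidence = p_evidence_given_h * prior + p_evidence_given_not_h * (1 - prior)
    return p_evidence_given_h * prior / p_evidence

p_cancer = Fraction(8, 1000)                 # 0.8% of the population
p_pos_given_cancer = 1 - Fraction(2, 100)    # 1 - false negative rate
p_pos_given_healthy = Fraction(3, 100)       # false positive rate

p_cancer_given_pos = posterior(p_cancer, p_pos_given_cancer, p_pos_given_healthy)
print(p_cancer_given_pos, float(p_cancer_given_pos))  # 49/235, about 0.21
```

Even after a positive result the probability of cancer is only about 21%, because the disease is rare and false positives outnumber true positives.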

24 The Three-Card Problem
Three cards are in a hat. One is red on both sides (the red-red card). One is white on both sides (the white-white card). One is red on one side and white on the other (the red-white card). A single card is drawn randomly and tossed into the air. a. What is the probability that the red-red card was drawn? (RR) b. What is the probability that the drawn card lands with a white side up? (W-up) c. What is the probability that the red-red card was not drawn, assuming that the drawn card lands with a red side up? (not-RR|R-up)
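The three questions can be answered by enumerating the six equally likely (card, side-up) outcomes. This sketch assumes each card and each of its two faces is equally likely to end up on top. The names below are illustrative.

```python
from fractions import Fraction

CARDS = {"RR": ("R", "R"), "WW": ("W", "W"), "RW": ("R", "W")}

# Six equally likely outcomes: which card was drawn and which face is up.
OUTCOMES = [(name, face) for name, faces in CARDS.items() for face in faces]

def prob(event):
    """P(event), where event is a predicate on a (card, face_up) outcome."""
    return Fraction(sum(1 for o in OUTCOMES if event(o)), len(OUTCOMES))

p_rr = prob(lambda o: o[0] == "RR")
p_w_up = prob(lambda o: o[1] == "W")
p_r_up = prob(lambda o: o[1] == "R")
p_not_rr_given_r_up = prob(lambda o: o[0] != "RR" and o[1] == "R") / p_r_up

print(p_rr)                 # 1/3
print(p_w_up)               # 1/2
print(p_not_rr_given_r_up)  # 1/3
```

The answer to (c) is 1/3 rather than the intuitive 1/2: two of the three red faces belong to the red-red card.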

25 Fair Bets A bet is fair to an individual I if, according to the individual's probability assessment, the bet will break even in the long run. The following three bets are fair: Bet (a): Win $4.20 if RR; lose $2.10 otherwise. [since you believe P(RR)=1/3] Bet (b): Win $2.00 if W-up; lose $2.00 otherwise. [since you believe P(W-up)=1/2] Bet (c): Win $4.00 if R-up and not-RR; lose $4.00 if R-up and RR; neither win nor lose if not-R-up. [since you believe P(not-RR|R-up)=1/2]

26 Dutch Book The bets that you accepted have an interesting property:
No matter what card is drawn in the three-card problem, and no matter how it lands, you are guaranteed to lose money. This is called a Dutch Book.

27 Verification
There are three possible outcomes:
1. Some card other than red-red is drawn, and it lands with a white side up. That is, W-up and not-RR.
2. Some card other than red-red is drawn, and it lands with a red side up. That is, R-up and not-RR.
3. The red-red card is drawn, and it lands (of course) with a red side up. That is, R-up and RR.

Bet      Outcome 1   Outcome 2   Outcome 3
a.       –$2.10      –$2.10      +$4.20
b.       +$2.00      –$2.00      –$2.00
c.       ±$0.00      +$4.00      –$4.00
total    –$0.10      –$0.10      –$1.80
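The verification can be reproduced mechanically. The sketch below encodes the three bets from slide 25 as payoff functions (amounts in cents, to avoid floating-point rounding) and totals them for each of the three outcomes. The function and variable names are illustrative.

```python
# Each outcome is (face_up, card), matching the three cases in the verification.
OUTCOMES = [("W-up", "not-RR"), ("R-up", "not-RR"), ("R-up", "RR")]

def bet_a(face_up, card):
    """Win $4.20 if RR; lose $2.10 otherwise."""
    return 420 if card == "RR" else -210

def bet_b(face_up, card):
    """Win $2.00 if W-up; lose $2.00 otherwise."""
    return 200 if face_up == "W-up" else -200

def bet_c(face_up, card):
    """Win $4.00 if R-up and not-RR; lose $4.00 if R-up and RR; else nothing."""
    if face_up != "R-up":
        return 0
    return 400 if card == "not-RR" else -400

totals_in_cents = [
    bet_a(*outcome) + bet_b(*outcome) + bet_c(*outcome) for outcome in OUTCOMES
]
print(totals_in_cents)  # [-10, -10, -180]

# A Dutch book: the bettor loses in every possible outcome.
is_dutch_book = all(total < 0 for total in totals_in_cents)
print(is_dutch_book)    # True
```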

28 The Dutch Book Theorem Suppose that an individual I is willing to accept any bet that is fair for I. Then a Dutch book can be made against I if and only if I's assessment of probability violates Bayesian axiomatization.

29 Independence: Intuition
Events are independent if one has nothing whatever to do with the others. Therefore, for two independent events, knowing that one happened does not change the probability of the other event happening. One toss of a coin is independent of another toss (assuming it is a regular coin). The price of tea in England is independent of the result of a general election in Canada.

30 Independent or Dependent?
Getting a cold and getting a cat allergy. Miles per gallon and acceleration. The size of a person's vocabulary and the person's shoe size.

31 Independence: Definition
Events A and B are independent iff P(A, B) = P(A) × P(B), which is equivalent to P(A|B) = P(A) and P(B|A) = P(B) when P(A) > 0 and P(B) > 0. Example: T1: the first toss is a head. T2: the second toss is a tail. P(T2|T1) = P(T2).
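A small check of the definition on the two-toss example, assuming a fair coin so that the four outcomes HH, HT, TH, TT are equally likely. The helper names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Sample space for two tosses of a fair coin.
OUTCOMES = list(product("HT", repeat=2))

def prob(event):
    """P(event), where event is a predicate on a (first, second) outcome."""
    return Fraction(sum(1 for o in OUTCOMES if event(o)), len(OUTCOMES))

def t1(o):
    return o[0] == "H"   # the first toss is a head

def t2(o):
    return o[1] == "T"   # the second toss is a tail

p_joint = prob(lambda o: t1(o) and t2(o))
p_t2_given_t1 = p_joint / prob(t1)

print(p_joint == prob(t1) * prob(t2))  # True: P(T1, T2) = P(T1) x P(T2)
print(p_t2_given_t1 == prob(t2))       # True: P(T2|T1) = P(T2)
```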

32 Conditional Independence
Dependent events can become independent given certain other events. Example: size of shoe and size of vocabulary are dependent, but both depend on age and become independent once age is given. Two events A, B are conditionally independent given a third event C iff P(A|B, C) = P(A|C). Independence is a very nice property that makes probability calculation much simpler. However, whenever we are interested in calculating the joint probability of some events, chances are that these events are related somehow and therefore dependent on one another.

33 Conditional Independence: Definition
Let E1 and E2 be two events. They are conditionally independent given E iff P(E1|E, E2) = P(E1|E), that is, the probability of E1 is not changed after knowing E2, given that E is true. Equivalent formulations: P(E1, E2|E) = P(E1|E) P(E2|E), and P(E2|E, E1) = P(E2|E).
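A short derivation of why the formulations agree, assuming all conditioning events have positive probability so that every conditional probability below is defined. The first step is the chain rule, the second uses the definition P(E1|E, E2) = P(E1|E), and the last line divides by P(E1|E).

```latex
\begin{align*}
P(E_1, E_2 \mid E) &= P(E_1 \mid E, E_2)\, P(E_2 \mid E) \\
                   &= P(E_1 \mid E)\, P(E_2 \mid E), \\
P(E_2 \mid E, E_1) &= \frac{P(E_1, E_2 \mid E)}{P(E_1 \mid E)} = P(E_2 \mid E).
\end{align*}
```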

34 Example: Play Tennis? Predict playing tennis when <sunny, cool, high, strong> What probability should be used to make the prediction? How to compute the probability?

35 Probabilities of Individual Attributes
Given the training set, we can compute the probabilities P(+) = 9/14 P(−) = 5/14

36 Naïve Bayes Method
Knowledge base contains: a set of hypotheses; a set of evidences; the probability of an evidence given a hypothesis. Given: a subset of the evidences known to be present in a situation. Find: the hypothesis with the highest posterior probability P(H|E1, E2, …, Ek). The probability itself does not matter so much.

37 Naïve Bayes Method Assumptions
Hypotheses are exhaustive and mutually exclusive: H1 ∨ H2 ∨ … ∨ Hk, and ¬(Hi ∧ Hj) for any i≠j. Evidences are conditionally independent given a hypothesis: P(E1, E2,…, Ek|H) = P(E1|H)…P(Ek|H). Then P(H | E1, E2,…, Ek) = P(E1, E2,…, Ek, H)/P(E1, E2,…, Ek) = P(E1, E2,…, Ek|H)P(H)/P(E1, E2,…, Ek).

38 Naïve Bayes Method The goal is to find the H that maximizes P(H|E1, E2,…, Ek). Since P(H|E1, E2,…, Ek) = P(E1, E2,…, Ek|H)P(H)/P(E1, E2,…, Ek) and P(E1, E2,…, Ek) is the same for different hypotheses, maximizing P(H|E1, E2,…, Ek) is equivalent to maximizing P(E1, E2,…, Ek|H)P(H) = P(E1|H)…P(Ek|H)P(H). Naïve Bayes method: find a hypothesis that maximizes P(E1|H)…P(Ek|H)P(H).

39 Example: Play Tennis
Compare P(+| sunny, cool, high, strong) vs. P(−| sunny, cool, high, strong), that is, P(sunny|+)P(cool|+)P(high|+)P(strong|+)P(+) vs. P(sunny|−)P(cool|−)P(high|−)P(strong|−)P(−), where P(+) = 9/14 and P(−) = 5/14.
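The comparison can be sketched numerically. The training table is not preserved in this transcript, so the conditional probabilities below are an assumption: they are the counts from the widely used 14-example PlayTennis table in Mitchell's "Machine Learning", which agrees with the priors 9/14 and 5/14 given on the slide. If the slide used a different table, the numbers would differ.

```python
from fractions import Fraction

# ASSUMPTION: conditional probabilities taken from the standard 14-example
# PlayTennis table (Mitchell); the transcript itself only gives the priors.
PRIOR = {"+": Fraction(9, 14), "-": Fraction(5, 14)}
LIKELIHOOD = {
    "+": {"sunny": Fraction(2, 9), "cool": Fraction(3, 9),
          "high": Fraction(3, 9), "strong": Fraction(3, 9)},
    "-": {"sunny": Fraction(3, 5), "cool": Fraction(1, 5),
          "high": Fraction(4, 5), "strong": Fraction(3, 5)},
}

def naive_bayes_score(label, evidence):
    """P(E1|H)...P(Ek|H)P(H): proportional to the posterior of H."""
    score = PRIOR[label]
    for attribute_value in evidence:
        score *= LIKELIHOOD[label][attribute_value]
    return score

evidence = ["sunny", "cool", "high", "strong"]
scores = {label: naive_bayes_score(label, evidence) for label in PRIOR}
prediction = max(scores, key=scores.get)

print(scores["+"], scores["-"])  # 1/189 18/875
print(prediction)                # -
```

Under these assumed counts the negative class scores higher, so the prediction is not to play tennis.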

40 Application: Spam Detection
Dear sir, We want to transfer to overseas ($ 126, USD) One hundred and Twenty six million United States Dollars) from a Bank in Africa, I want to ask you to quietly look for a reliable and honest person who will be capable and fit to provide either an existing …… Legitimate messages are called "ham", for lack of a better name.

41 Hypotheses: {Spam, Ham} Evidence: a document
The document is treated as a set (or bag) of words. Knowledge: P(Spam), the prior probability of a message being spam. How to estimate this probability? P(w|Spam), the probability that a word is w if we know it is chosen from a spam message.
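A minimal sketch of a bag-of-words Naïve Bayes classifier along these lines. All numbers are hypothetical, invented purely for illustration. In practice P(Spam) and P(w|Spam) would be estimated from labeled messages, and the floor probability for unseen words is an assumption standing in for proper smoothing. Log probabilities are summed to avoid underflow on long documents.

```python
import math

# HYPOTHETICAL probabilities, for illustration only (not from the slides).
PRIOR = {"spam": 0.4, "ham": 0.6}
WORD_PROB = {
    "spam": {"transfer": 0.05, "bank": 0.04, "meeting": 0.005},
    "ham": {"transfer": 0.005, "bank": 0.01, "meeting": 0.03},
}
UNSEEN_WORD_PROB = 1e-6  # assumed floor for words missing from the table

def classify(words):
    """Return the hypothesis maximizing log P(H) + sum of log P(w|H)."""
    scores = {}
    for label, prior in PRIOR.items():
        score = math.log(prior)
        for word in words:
            score += math.log(WORD_PROB[label].get(word, UNSEEN_WORD_PROB))
        scores[label] = score
    return max(scores, key=scores.get)

print(classify(["transfer", "bank"]))  # spam
print(classify(["meeting"]))           # ham
```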

42 Limitations of Naïve Bayesian
Cannot handle composite hypotheses well. Suppose the individual hypotheses are independent of each other. Consider a composite hypothesis. How do we compute its posterior probability?

43 Using the Bayes’ Theorem

44 P(B|A, E) << P(B|A)
But this is a very unreasonable assumption. We need a better representation and a better assumption. E: earthquake. B: burglar. A: alarm set off. E and B are independent. But when A is given, they are (adversely) dependent because they become competitors to explain A: P(B|A, E) << P(B|A). E explains away A.
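The explaining-away effect can be illustrated numerically. Every probability below is hypothetical, chosen only so that the effect is visible. The slides give no numbers. The model assumes B and E are independent a priori and that A depends on both.

```python
from itertools import product

# HYPOTHETICAL numbers, for illustration only (not from the slides).
P_B = 0.01   # prior probability of a burglar
P_E = 0.02   # prior probability of an earthquake
P_A_GIVEN = {            # P(alarm | burglar, earthquake)
    (True, True): 0.95,
    (True, False): 0.94,
    (False, True): 0.29,
    (False, False): 0.001,
}

def joint(b, e, a):
    """P(B=b, E=e, A=a), with B and E independent a priori."""
    p = (P_B if b else 1 - P_B) * (P_E if e else 1 - P_E)
    p_alarm = P_A_GIVEN[(b, e)]
    return p * (p_alarm if a else 1 - p_alarm)

def prob(event):
    """Sum the joint distribution over all worlds where event holds."""
    return sum(
        joint(b, e, a)
        for b, e, a in product([True, False], repeat=3)
        if event(b, e, a)
    )

p_b_given_a = prob(lambda b, e, a: b and a) / prob(lambda b, e, a: a)
p_b_given_a_e = prob(lambda b, e, a: b and a and e) / prob(lambda b, e, a: a and e)

print(p_b_given_a)    # burglar is a likely explanation of the alarm
print(p_b_given_a_e)  # much smaller once the earthquake is known
```

With these numbers, learning about the earthquake sharply lowers the probability of a burglar, even though the two causes are independent before the alarm is observed.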

45 Cannot handle causal chaining
Example. A: weather of the year. B: cotton production of the year. C: cotton price of next year. Observed: A influences C. The influence is not direct (A -> B -> C). P(C|B, A) = P(C|B): instantiation of B blocks the influence of A on C.

46 Summary
Basics of Probability Theory: experiment, sample space, events; axioms and properties; joint probability; conditional probability. Probabilistic Reasoning: Bayes Theorem; Dutch Book Theorem; independence and conditional independence; Naïve Bayes method.

