Naïve Bayes Model. Outline Independence and Conditional Independence Naïve Bayes Model Application: Spam Detection.


1 Naïve Bayes Model

2 Outline: Independence and Conditional Independence; Naïve Bayes Model; Application: Spam Detection

3 Independence: Intuition. Events are independent if one has nothing whatever to do with the others. Therefore, for two independent events, knowing that one happened does not change the probability of the other happening. Examples: one toss of a coin is independent of another toss (assuming it is a regular coin); the price of tea in England is independent of the result of a general election in Canada.

4 Independent or Dependent? Catching a cold and having a cat allergy. Miles per gallon and acceleration. The size of a person’s vocabulary and the person’s shoe size.

5 Independence: Definition. Events A and B are independent iff P(A, B) = P(A) × P(B), which is equivalent to P(A|B) = P(A) and P(B|A) = P(B) when P(A, B) > 0. Example: T1: the first toss is a head. T2: the second toss is a tail. P(T2|T1) = P(T2).
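The coin-toss example can be checked by direct enumeration. The following is a minimal sketch, assuming a fair coin (the slide only says "regular"); the helper `prob` and the variable names are illustrative, not from the slides.

```python
from fractions import Fraction
from itertools import product

# Sample space for two tosses of a fair coin (assumption): 4 equally likely outcomes.
outcomes = list(product("HT", repeat=2))
p = {o: Fraction(1, len(outcomes)) for o in outcomes}

def prob(event):
    """Probability of the set of outcomes for which the predicate `event` is true."""
    return sum(p[o] for o in outcomes if event(o))

def t1(o):
    return o[0] == "H"   # T1: the first toss is a head

def t2(o):
    return o[1] == "T"   # T2: the second toss is a tail

p_t1 = prob(t1)
p_t2 = prob(t2)
p_joint = prob(lambda o: t1(o) and t2(o))
p_t2_given_t1 = p_joint / p_t1

print(p_joint == p_t1 * p_t2)   # product rule P(T1, T2) = P(T1) P(T2)
print(p_t2_given_t1 == p_t2)    # conditioning on T1 leaves P(T2) unchanged
```

Exact fractions are used so that the equalities can be tested without floating-point tolerance.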

6 Conditional Independence. Dependent events can become independent given certain other events. Example: shoe size and vocabulary size are dependent, but they become independent once age is given. Two events A, B are conditionally independent given a third event C iff P(A|B, C) = P(A|C).

7 Conditional Independence: Definition. Let E1 and E2 be two events. They are conditionally independent given E iff P(E1|E, E2) = P(E1|E); that is, given that E is true, the probability of E1 does not change after learning E2. Equivalent formulations: P(E1, E2|E) = P(E1|E) P(E2|E); P(E2|E, E1) = P(E2|E).
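To make the definition concrete, the sketch below builds a small joint distribution in which E is a common cause of E1 and E2, then checks that E1 and E2 are dependent on their own but independent given E. All numbers are hypothetical and chosen only for illustration; `bern` and `prob` are helper names introduced here.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical numbers: E is a common cause of E1 and E2.
p_e = {True: F(1, 2), False: F(1, 2)}
p_e1_given_e = {True: F(9, 10), False: F(1, 10)}   # P(E1 = true | E)
p_e2_given_e = {True: F(8, 10), False: F(2, 10)}   # P(E2 = true | E)

def bern(p_true, value):
    """Probability of a boolean value when P(true) = p_true."""
    return p_true if value else 1 - p_true

# Joint built as P(E) P(E1|E) P(E2|E), i.e. conditionally independent by construction.
joint = {
    (e, e1, e2): p_e[e] * bern(p_e1_given_e[e], e1) * bern(p_e2_given_e[e], e2)
    for e, e1, e2 in product([True, False], repeat=3)
}

def prob(pred):
    return sum(p for key, p in joint.items() if pred(*key))

# Given E: P(E1 | E, E2) equals P(E1 | E).
p_e1_given_e_e2 = prob(lambda e, e1, e2: e and e1 and e2) / prob(lambda e, e1, e2: e and e2)
p_e1_given_e_only = prob(lambda e, e1, e2: e and e1) / prob(lambda e, e1, e2: e)

# Without E: P(E1 | E2) differs from P(E1), so E1 and E2 are dependent.
p_e1 = prob(lambda e, e1, e2: e1)
p_e1_given_e2 = prob(lambda e, e1, e2: e1 and e2) / prob(lambda e, e1, e2: e2)

print(p_e1_given_e_e2, p_e1_given_e_only)   # equal
print(p_e1, p_e1_given_e2)                  # different
```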

8 Example: Play Tennis? Predict whether tennis is played under given weather conditions. What probability should be used to make the prediction? How is that probability computed?

9 Probabilities of Individual Attributes. Given the training set, we can compute the probabilities: P(+) = 9/14, P(−) = 5/14.

10 Naïve Bayes Method. The knowledge base contains: a set of hypotheses; a set of evidences; the probability of an evidence given a hypothesis. Given: a subset of the evidences known to be present in a situation. Find: the hypothesis with the highest posterior probability, P(H|E1, E2, …, Ek). The probability itself does not matter so much.

11 Naïve Bayes Method: Assumptions. Hypotheses are exhaustive and mutually exclusive: H1 ∨ H2 ∨ … ∨ Hk, and ¬(Hi ∧ Hj) for any i ≠ j. Evidences are conditionally independent given a hypothesis: P(E1, E2, …, Ek|H) = P(E1|H) … P(Ek|H). Posterior: P(H|E1, E2, …, Ek) = P(E1, E2, …, Ek, H) / P(E1, E2, …, Ek) = P(E1, E2, …, Ek|H) P(H) / P(E1, E2, …, Ek).

12 Naïve Bayes Method. The goal is to find the H that maximizes the posterior P(H|E1, E2, …, Ek). Since P(H|E1, E2, …, Ek) = P(E1, E2, …, Ek|H) P(H) / P(E1, E2, …, Ek), and P(E1, E2, …, Ek) is the same for different hypotheses, maximizing P(H|E1, E2, …, Ek) is equivalent to maximizing P(E1, E2, …, Ek|H) P(H) = P(E1|H) … P(Ek|H) P(H). Naïve Bayes Method: find a hypothesis that maximizes P(E1|H) … P(Ek|H) P(H).
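The rule "find a hypothesis that maximizes P(E1|H) … P(Ek|H) P(H)" can be sketched in a few lines. The function name `naive_bayes_argmax`, its dictionary-based inputs, and the numbers are hypothetical, introduced only for illustration. Sums of logarithms replace the product, which is a common way to avoid underflow and does not change the maximizer; this sketch assumes every probability is strictly positive.

```python
import math

def naive_bayes_argmax(priors, likelihoods, evidence):
    """Return the hypothesis H that maximizes P(H) * prod_i P(E_i | H).

    priors:      dict mapping hypothesis -> P(H)
    likelihoods: dict mapping hypothesis -> dict mapping evidence -> P(E | H)
    evidence:    list of observed evidence items
    All probabilities must be > 0, since logarithms are taken.
    """
    def log_score(h):
        return math.log(priors[h]) + sum(math.log(likelihoods[h][e]) for e in evidence)
    return max(priors, key=log_score)

# Hypothetical numbers, for illustration only.
priors = {"H1": 0.7, "H2": 0.3}
likelihoods = {
    "H1": {"E1": 0.1, "E2": 0.2},
    "H2": {"E1": 0.8, "E2": 0.6},
}

best = naive_bayes_argmax(priors, likelihoods, ["E1", "E2"])
print(best)
```

With no evidence the prior alone decides, so the same call with an empty evidence list returns the hypothesis with the larger prior.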

13 Example: Play Tennis. Compare P(+|sunny, cool, high, strong) vs. P(−|sunny, cool, high, strong). Under the naïve Bayes assumption this reduces to comparing P(sunny|+) P(cool|+) P(high|+) P(strong|+) P(+) vs. P(sunny|−) P(cool|−) P(high|−) P(strong|−) P(−).
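The training table is not preserved in this transcript. The sketch below assumes it is the standard 14-row PlayTennis table from the machine learning literature, which agrees with the priors P(+) = 9/14 and P(−) = 5/14 given earlier; if the slides used a different table, the numbers would differ. The ASCII label "-" stands for "−".

```python
from fractions import Fraction

# Assumed training set: standard PlayTennis table
# (outlook, temperature, humidity, wind, label).
DATA = [
    ("sunny", "hot", "high", "weak", "-"),
    ("sunny", "hot", "high", "strong", "-"),
    ("overcast", "hot", "high", "weak", "+"),
    ("rain", "mild", "high", "weak", "+"),
    ("rain", "cool", "normal", "weak", "+"),
    ("rain", "cool", "normal", "strong", "-"),
    ("overcast", "cool", "normal", "strong", "+"),
    ("sunny", "mild", "high", "weak", "-"),
    ("sunny", "cool", "normal", "weak", "+"),
    ("rain", "mild", "normal", "weak", "+"),
    ("sunny", "mild", "normal", "strong", "+"),
    ("overcast", "mild", "high", "strong", "+"),
    ("overcast", "hot", "normal", "weak", "+"),
    ("rain", "mild", "high", "strong", "-"),
]

def score(label, query):
    """P(label) times the product of P(attribute value | label), from raw counts."""
    rows = [r for r in DATA if r[-1] == label]
    s = Fraction(len(rows), len(DATA))          # prior P(label)
    for i, value in enumerate(query):
        matches = sum(1 for r in rows if r[i] == value)
        s *= Fraction(matches, len(rows))       # P(value | label)
    return s

query = ("sunny", "cool", "high", "strong")
score_pos = score("+", query)
score_neg = score("-", query)
prediction = "+" if score_pos > score_neg else "-"

print(float(score_pos), float(score_neg), prediction)
```

Under this assumed table the "−" score is larger, so the prediction for (sunny, cool, high, strong) is not to play.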

14 Application: Spam Detection. Spam: “Dear sir, We want to transfer to overseas ($ 126,000.000.00 USD) One hundred and Twenty six million United States Dollars) from a Bank in Africa, I want to ask you to quietly look for a reliable and honest person who will be capable and fit to provide either an existing ……” Legitimate email: “ham”, for lack of a better name.

15 Hypotheses: {Spam, Ham}. Evidence: a document, treated as a set (or bag) of words. Knowledge: P(Spam), the prior probability of an e-mail message being spam. How to estimate this probability? P(w|Spam), the probability that a word is w given that it is drawn from a spam message. How to estimate this probability?
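One common answer to both "how to estimate" questions is to count: P(Spam) as the fraction of training messages labelled spam, and P(w|Spam) as the relative frequency of w among the words of spam messages. The sketch below follows that approach on a hypothetical toy corpus. The add-one (Laplace) smoothing and the choice to ignore words outside the training vocabulary are assumptions of this sketch, not something the slides specify.

```python
from collections import Counter
import math

# Hypothetical toy corpus, for illustration only.
train = [
    ("spam", "transfer million dollars bank"),
    ("spam", "reliable honest person transfer"),
    ("ham", "meeting notes attached"),
    ("ham", "lunch meeting tomorrow"),
    ("ham", "bank statement attached"),
]

labels = Counter(label for label, _ in train)
word_counts = {label: Counter() for label in labels}
for label, text in train:
    word_counts[label].update(text.split())
vocab = {w for counts in word_counts.values() for w in counts}

def prior(label):
    """P(label): fraction of training messages carrying this label."""
    return labels[label] / len(train)

def word_prob(word, label, alpha=1.0):
    """P(word | label) with add-alpha smoothing (an assumption of this sketch)."""
    total = sum(word_counts[label].values())
    return (word_counts[label][word] + alpha) / (total + alpha * len(vocab))

def classify(text):
    """Pick the label maximizing log P(label) + sum of log P(w | label)."""
    words = [w for w in text.split() if w in vocab]   # unknown words are ignored
    def log_score(label):
        return math.log(prior(label)) + sum(math.log(word_prob(w, label)) for w in words)
    return max(labels, key=log_score)

print(classify("transfer million dollars"))
print(classify("meeting attached"))
```

Smoothing keeps a word that never appeared in one class from forcing that class's product to zero.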

16 Limitations of Naïve Bayes. Cannot handle composite hypotheses well. Suppose the hypotheses are independent of each other. Consider a composite hypothesis. How do we compute its posterior probability?

17 Using Bayes’ Theorem

18 But this is a very unreasonable assumption. We need a better representation and a better assumption. E: earthquake; B: burglar; A: alarm set off. E and B are independent. But when A is given, they are (adversely) dependent, because they become competitors to explain A: P(B|A, E) << P(B|A). E explains away A.
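The explaining-away effect can be reproduced numerically. In the sketch below, the prior probabilities of burglary and earthquake and the alarm probabilities are hypothetical values chosen only to illustrate the effect; the joint distribution is built so that B and E are independent before the alarm is observed.

```python
from itertools import product

# Hypothetical probabilities, chosen only to illustrate the effect.
P_B = 0.01   # P(burglar)
P_E = 0.02   # P(earthquake)
P_A = {      # P(alarm | B, E)
    (True, True): 0.95,
    (True, False): 0.94,
    (False, True): 0.29,
    (False, False): 0.001,
}

def joint(b, e, a):
    """P(B=b, E=e, A=a), with B and E independent a priori."""
    pb = P_B if b else 1 - P_B
    pe = P_E if e else 1 - P_E
    pa = P_A[(b, e)] if a else 1 - P_A[(b, e)]
    return pb * pe * pa

def prob(pred):
    return sum(joint(b, e, a)
               for b, e, a in product([True, False], repeat=3)
               if pred(b, e, a))

# Without the alarm, learning E says nothing about B.
p_b_given_e = prob(lambda b, e, a: b and e) / prob(lambda b, e, a: e)
# With the alarm observed, E competes with B as an explanation.
p_b_given_a = prob(lambda b, e, a: b and a) / prob(lambda b, e, a: a)
p_b_given_a_e = prob(lambda b, e, a: b and a and e) / prob(lambda b, e, a: a and e)

print(round(p_b_given_e, 4), round(p_b_given_a, 4), round(p_b_given_a_e, 4))
```

With these numbers, the probability of a burglar given the alarm drops sharply once the earthquake is also known, which is the pattern P(B|A, E) << P(B|A).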

19 Cannot handle causal chaining. Example: A: weather of the year; B: cotton production of the year; C: cotton price of next year. Observed: A influences C. The influence is not direct (A -> B -> C). P(C|B, A) = P(C|B): instantiation of B blocks the influence of A on C.
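The blocking behaviour in a chain A -> B -> C can also be checked by enumeration. The sketch below treats A, B, C as binary events (good weather, high production, high price) with hypothetical probabilities; both the binary simplification and the numbers are assumptions made for illustration.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical probabilities for the chain A -> B -> C.
P_A = F(1, 2)                                   # P(A): good weather
P_B_GIVEN_A = {True: F(4, 5), False: F(1, 5)}   # P(B | A): high production
P_C_GIVEN_B = {True: F(1, 10), False: F(7, 10)} # P(C | B): high price

def bern(p_true, value):
    return p_true if value else 1 - p_true

# Joint factorizes along the chain: P(A) P(B|A) P(C|B).
joint = {
    (a, b, c): bern(P_A, a) * bern(P_B_GIVEN_A[a], b) * bern(P_C_GIVEN_B[b], c)
    for a, b, c in product([True, False], repeat=3)
}

def prob(pred):
    return sum(p for key, p in joint.items() if pred(*key))

# A influences C when B is unknown.
p_c = prob(lambda a, b, c: c)
p_c_given_a = prob(lambda a, b, c: c and a) / prob(lambda a, b, c: a)
# Once B is known, A adds nothing.
p_c_given_b = prob(lambda a, b, c: c and b) / prob(lambda a, b, c: b)
p_c_given_b_a = prob(lambda a, b, c: c and b and a) / prob(lambda a, b, c: b and a)

print(p_c, p_c_given_a)             # different: A influences C
print(p_c_given_b, p_c_given_b_a)   # equal: B blocks the influence
```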

