Final Exam: Thursday, May 10
Bayesian reasoning
If event E occurs, then the probability that event H will occur is p(H|E):
IF E (evidence) is true THEN H (hypothesis) is true with probability p
Bayesian reasoning example: Cancer and Test
P(C) = 0.01, P(¬C) = 0.99
P(+|C) = 0.9, P(−|C) = 0.1
P(+|¬C) = 0.2, P(−|¬C) = 0.8
P(C|+) = ?
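The posterior asked for on this slide follows directly from Bayes' rule; a minimal worked sketch (variable names are mine):

```python
# Bayes' rule for the cancer/test example:
# P(C|+) = P(+|C)P(C) / [P(+|C)P(C) + P(+|~C)P(~C)]
p_c = 0.01         # prior P(C)
p_not_c = 0.99     # P(~C)
p_pos_c = 0.9      # likelihood P(+|C)
p_pos_not_c = 0.2  # false-positive rate P(+|~C)

numerator = p_pos_c * p_c
evidence = numerator + p_pos_not_c * p_not_c
p_c_pos = numerator / evidence
print(round(p_c_pos, 4))  # 0.0435
```

Note that despite the positive test, the posterior stays low because the prior P(C) is small.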
Bayesian reasoning with multiple hypotheses and evidences
Expand the Bayesian rule to work with multiple hypotheses (H1 ... Hm) and evidences (E1 ... En), assuming conditional independence among the evidences E1 ... En.
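The formula from the original slide is not reproduced in the text; under the stated conditional-independence assumption it commonly takes the form:

```latex
p(H_i \mid E_1 \ldots E_n) =
\frac{p(E_1 \mid H_i)\, p(E_2 \mid H_i) \cdots p(E_n \mid H_i)\, p(H_i)}
     {\sum_{k=1}^{m} p(E_1 \mid H_k)\, p(E_2 \mid H_k) \cdots p(E_n \mid H_k)\, p(H_k)}
```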
Bayesian reasoning example: expert data (prior and conditional probabilities supplied by the expert)
Bayesian reasoning example: the user observes evidences E3, E1, E2
Bayesian reasoning example: the user observes E2, and the expert system computes the posterior probabilities.
Propagation of CFs
For a single-antecedent rule: cf(H, E) = cf(E) × cf(R), where cf(E) is the certainty factor of the evidence and cf(R) is the certainty factor of the rule.
Single-antecedent rule example
IF patient has toothache THEN problem is cavity {cf 0.3}
Patient has toothache {cf 0.9}
What is cf(cavity, toothache)?
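With the single-antecedent propagation rule cf(H, E) = cf(E) × cf(R), the answer is a one-line computation:

```python
# Single-antecedent CF propagation: cf(H, E) = cf(E) * cf(R)
cf_evidence = 0.9  # "patient has toothache" {cf 0.9}
cf_rule = 0.3      # IF toothache THEN cavity {cf 0.3}

cf_cavity = cf_evidence * cf_rule
print(round(cf_cavity, 2))  # 0.27
```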
Propagation of CFs (multiple antecedents)
For conjunctive rules: IF <E1> AND ... AND <En> THEN <H> {cf}
For two evidences E1 and E2: cf(E1 AND E2) = min(cf(E1), cf(E2))
Propagation of CFs (multiple antecedents)
For disjunctive rules: IF <E1> OR ... OR <En> THEN <H> {cf}
For two evidences E1 and E2: cf(E1 OR E2) = max(cf(E1), cf(E2))
Exercise
IF (P1 AND P2) OR P3 THEN C1 {cf 0.7} AND C2 {cf 0.3}
Assume cf(P1) = 0.6, cf(P2) = 0.4, cf(P3) = 0.2.
What are cf(C1) and cf(C2)?
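Applying the min/max combination rules from the previous slides and then single-antecedent propagation gives a short worked solution:

```python
# CF exercise: IF (P1 AND P2) OR P3 THEN C1 {0.7} AND C2 {0.3}
cf_p1, cf_p2, cf_p3 = 0.6, 0.4, 0.2

cf_and = min(cf_p1, cf_p2)          # conjunction: min(0.6, 0.4) = 0.4
cf_antecedent = max(cf_and, cf_p3)  # disjunction: max(0.4, 0.2) = 0.4

cf_c1 = cf_antecedent * 0.7  # 0.28
cf_c2 = cf_antecedent * 0.3  # 0.12
```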
Defining fuzzy sets with fit-vectors
A fuzzy set A can be defined by a fit-vector: a list of membership/value pairs, with membership interpolated between the listed points. So, for example:
Tall men = (0/180, 1/190)
Short men = (1/160, 0/170)
Average men = (0/165, 1/175, 0/185)
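A sketch of how a fit-vector can be evaluated at an arbitrary value, assuming linear interpolation between the listed anchor points (the function name is mine):

```python
# A fit-vector is a list of (membership, value) anchor points sorted by value.
def membership(fit_vector, x):
    if x <= fit_vector[0][1]:
        return fit_vector[0][0]   # clamp below the first anchor
    if x >= fit_vector[-1][1]:
        return fit_vector[-1][0]  # clamp above the last anchor
    for (m0, v0), (m1, v1) in zip(fit_vector, fit_vector[1:]):
        if v0 <= x <= v1:
            # linear interpolation between neighbouring anchors
            return m0 + (m1 - m0) * (x - v0) / (v1 - v0)

tall = [(0, 180), (1, 190)]    # Tall men = (0/180, 1/190)
print(membership(tall, 185))   # 0.5
```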
Qualifiers & Hedges
What about linguistic values with qualifiers, e.g. very tall, extremely short, etc.?
Hedges are qualifying terms that modify the shape of fuzzy sets, e.g. very, somewhat, quite, slightly, extremely, etc.
Representing Hedges
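The figure from this slide is not reproduced here. A common convention is to represent hedges as powers of the membership degree, e.g. very A(x) = A(x)², extremely A(x) = A(x)³, and somewhat A(x) = A(x)^0.5; a sketch under that assumption:

```python
# Hedges as powers of the membership degree (a common convention):
very = lambda mu: mu ** 2        # concentration
extremely = lambda mu: mu ** 3   # stronger concentration
somewhat = lambda mu: mu ** 0.5  # dilation

tall_men = {180: 0.0, 182: 0.25, 185: 0.5, 187: 0.75, 190: 1.0}
very_tall_men = {x: round(very(mu), 2) for x, mu in tall_men.items()}
print(very_tall_men)  # {180: 0.0, 182: 0.06, 185: 0.25, 187: 0.56, 190: 1.0}
```

Note that squaring the "tall men" memberships reproduces the "very tall men" set used on the containment slide.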
Crisp Set Operations
Fuzzy Set Operations: Complement
To what degree do elements not belong to this set? ¬A(x) = 1 − A(x)
tall men = {0/180, 0.25/182, 0.5/185, 0.75/187, 1/190};
NOT tall men = {1/180, 0.75/182, 0.5/185, 0.25/187, 0/190};
Fuzzy Set Operations: Containment
Which sets belong to other sets? Each element of the fuzzy subset has membership no greater than in the containing set.
tall men = {0/180, 0.25/182, 0.5/185, 0.75/187, 1/190};
very tall men = {0/180, 0.06/182, 0.25/185, 0.56/187, 1/190};
Fuzzy Set Operations: Intersection
To what degree is the element in both sets? (A ∩ B)(x) = min[A(x), B(x)]
tall men = {0/165, 0/175, 0/180, 0.25/182, 0.5/185, 1/190};
average men = {0/165, 1/175, 0.5/180, 0.25/182, 0/185, 0/190};
tall men ∩ average men = {0/165, 0/175, 0/180, 0.25/182, 0/185, 0/190};
or, keeping only the nonzero region: tall men ∩ average men = {0/180, 0.25/182, 0/185};
(A ∩ B)(x) = min[A(x), B(x)]
Fuzzy Set Operations: Union
To what degree is the element in either or both sets? (A ∪ B)(x) = max[A(x), B(x)]
tall men = {0/165, 0/175, 0/180, 0.25/182, 0.5/185, 1/190};
average men = {0/165, 1/175, 0.5/180, 0.25/182, 0/185, 0/190};
tall men ∪ average men = {0/165, 1/175, 0.5/180, 0.25/182, 0.5/185, 1/190};
(A ∪ B)(x) = max[A(x), B(x)]
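All three operations on the tall/average men sets can be checked with a few dictionary comprehensions (representing each fuzzy set as a value → membership map):

```python
# Fuzzy complement, intersection, and union on the slide's example sets.
tall = {165: 0.0, 175: 0.0, 180: 0.0, 182: 0.25, 185: 0.5, 190: 1.0}
average = {165: 0.0, 175: 1.0, 180: 0.5, 182: 0.25, 185: 0.0, 190: 0.0}

complement = {x: 1 - mu for x, mu in tall.items()}          # NOT tall
intersection = {x: min(tall[x], average[x]) for x in tall}  # min rule
union = {x: max(tall[x], average[x]) for x in tall}         # max rule

print(intersection)  # {165: 0.0, 175: 0.0, 180: 0.0, 182: 0.25, 185: 0.0, 190: 0.0}
print(union)         # {165: 0.0, 175: 1.0, 180: 0.5, 182: 0.25, 185: 0.5, 190: 1.0}
```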
Choosing the Best Attribute: Binary Classification
Want a formal measure that returns a maximum value when an attribute makes a perfect split and a minimum when it makes no distinction.
Information theory (Shannon and Weaver, 1949).
Entropy: a measure of the uncertainty of a random variable.
A coin that always comes up heads → 0 bits
A flip of a fair coin (heads or tails) → 1 bit
The roll of a fair four-sided die → 2 bits
Information gain: the expected reduction in entropy caused by partitioning the examples according to this attribute.
Formula for Entropy: H(p1, ..., pn) = −Σi pi log2 pi
Examples:
Suppose we have a collection of 10 examples, 5 positive, 5 negative:
H(1/2, 1/2) = −(1/2) log2(1/2) − (1/2) log2(1/2) = 1 bit
Suppose we have a collection of 100 examples, 1 positive and 99 negative:
H(1/100, 99/100) = −0.01 log2(0.01) − 0.99 log2(0.99) ≈ 0.08 bits
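Both slide examples can be verified with a direct implementation of the entropy formula:

```python
from math import log2

def entropy(probs):
    # H = -sum(p * log2(p)), with 0 * log2(0) taken as 0
    return -sum(p * log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))              # 1.0
print(round(entropy([0.01, 0.99]), 2))  # 0.08
```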
Information gain
Information gain (from the attribute test) = difference between the original information requirement and the new requirement:
Gain(A) = I(p/(p+n), n/(p+n)) − Remainder(A), where Remainder(A) = Σi (pi + ni)/(p + n) · I(pi/(pi+ni), ni/(pi+ni))
Choose the attribute with the largest IG.
Information gain
For the whole training set, p = n = 6, so I(6/12, 6/12) = 1 bit.
Consider the attributes Patrons and Type (and the others too): Patrons has the highest IG of all attributes, so it is chosen by the DTL algorithm as the root.
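The per-value positive/negative counts below are taken from the standard 12-example restaurant data set this slide appears to use (Patrons splits into None 0+/2−, Some 4+/0−, Full 2+/4−; Type splits evenly), so treat them as an assumption; with them, the gains work out as:

```python
from math import log2

def H(p, n):
    # entropy of a (positive, negative) count pair
    total = p + n
    return -sum(q * log2(q) for q in (p / total, n / total) if q > 0)

def gain(splits, p=6, n=6):
    # splits: list of (positive, negative) counts per attribute value
    remainder = sum((pi + ni) / (p + n) * H(pi, ni) for pi, ni in splits if pi + ni)
    return H(p, n) - remainder

patrons = [(0, 2), (4, 0), (2, 4)]        # None, Some, Full (assumed counts)
type_ = [(1, 1), (1, 1), (2, 2), (2, 2)]  # French, Italian, Thai, Burger

print(round(gain(patrons), 3))  # 0.541
print(round(gain(type_), 3))    # 0.0
```

Patrons gains about 0.541 bits while Type gains nothing, which is why Patrons is chosen as the root.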
Example contd. The decision tree learned from the 12 examples is substantially simpler than the "true" tree.
Perceptrons
The perceptron computes the weighted sum X = x1·w1 + x2·w2 and passes it through a step activation: Y = step(X), which outputs 1 when X reaches the threshold θ and 0 otherwise.
Perceptrons
How does a perceptron learn?
A perceptron has initial (often random) weights, typically in the range [−0.5, 0.5].
Apply an established training dataset.
Calculate the error as expected output minus actual output: e = Yexpected − Yactual
Adjust the weights to reduce the error.
Perceptrons
How do we adjust a perceptron's weights to produce Yexpected?
If e is positive, we need to increase Yactual (and vice versa).
Use this formula: wi ← wi + Δwi, where Δwi = α × xi × e,
α is the learning rate (between 0 and 1) and e is the calculated error.
Perceptron Example – AND Train a perceptron to recognize logical AND Use threshold Θ = 0.2 and learning rate α = 0.1
Perceptron Example – AND
Repeat until convergence, i.e. the final weights do not change and there is no error.
Use threshold Θ = 0.2 and learning rate α = 0.1.
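The whole training loop can be sketched in a few lines. The initial weights w1 = 0.3 and w2 = −0.1 are an assumption (the slides' worked table is not reproduced here); with them, training converges to w1 = w2 = 0.1:

```python
def step(x, theta=0.2):
    # step activation; rounding guards against float noise at the threshold
    return 1 if round(x, 9) >= theta else 0

def train_and(w1=0.3, w2=-0.1, alpha=0.1, theta=0.2):
    # truth table for logical AND: (x1, x2, desired output)
    data = [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 1)]
    while True:
        converged = True
        for x1, x2, desired in data:
            actual = step(x1 * w1 + x2 * w2, theta)
            e = desired - actual
            if e != 0:
                converged = False
                w1 += alpha * x1 * e  # delta rule: dw_i = alpha * x_i * e
                w2 += alpha * x2 * e
        if converged:  # an epoch with no weight change and no error
            return w1, w2

w1, w2 = train_and()
print(round(w1, 1), round(w2, 1))  # 0.1 0.1
```

After convergence, only the (1, 1) input reaches the threshold 0.2, so the perceptron computes AND.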