SAD: 6th Project


1 SAD: 6th Project

2

3

4 Lift Charts
Comparing classifiers on 1,000,000 prospective respondents:
- Baseline prediction: 0.1% of all 1,000,000 households will respond (1,000 responses).
- Targeted prediction: 0.4% of a specified subset of 100,000 households will respond (400 responses).
- Lift factor = increase in response rate = 0.4% / 0.1% = 4.
Given a classifier that outputs a probability for the predicted class value of each test instance, what should we do with those probabilities?

5 Lift Factor
sample success proportion = (number of positive instances in sample) / (sample size)
lift factor = (sample success proportion) / (total test set success proportion)
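To make the two definitions concrete, here is a minimal Python sketch that recomputes the mailing example from slide 4 (the function name and the household figures simply restate that example; they are not part of the project's tooling):

    def lift_factor(sample_positives, sample_size, total_positives, total_size):
        # lift = (success proportion in the sample) / (success proportion in the whole test set)
        sample_success = sample_positives / sample_size   # e.g. 400 / 100,000 = 0.4%
        total_success = total_positives / total_size      # e.g. 1,000 / 1,000,000 = 0.1%
        return sample_success / total_success

    # Slide 4's example: the targeted 100,000 households contain 400 of the 1,000 responders.
    print(lift_factor(400, 100_000, 1_000, 1_000_000))    # 4.0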

6 Lift

7 Evaluation: The confusion matrix
The classifier report lists the correctly and incorrectly classified instances together with a confusion matrix of the form:
   a   b   <-- classified as
           |   a = yes
           |   b = no
Comments: For a boolean classification, the entropy is 0 if all instances belong to the same class; the entropy is 1 if the collection contains an equal number of positive and negative examples. The typical measure of entropy is the number of bits of information needed to encode the classification. Note that the first term of the gain is the entropy of the original collection, and the second term is the expected value of the entropy after C is partitioned using attribute A. Gain is the expected reduction in entropy caused by knowing the value of attribute A.
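As an illustration of how these counts are obtained (scikit-learn is an assumption here, used only to show the idea; the course tool reports the same figures in its own format):

    from sklearn.metrics import confusion_matrix, accuracy_score

    # Hypothetical yes/no predictions for a handful of test instances.
    y_true = ["yes", "yes", "no", "no", "yes", "no", "no", "yes"]
    y_pred = ["yes", "no",  "no", "no", "yes", "yes", "no", "yes"]

    # Rows are the actual class, columns the predicted class (order: yes, no).
    print(confusion_matrix(y_true, y_pred, labels=["yes", "no"]))   # [[3 1]
                                                                    #  [1 3]]
    print(accuracy_score(y_true, y_pred))   # correctly classified proportion: 0.75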

8

9 b) (1 pt) Perform cross-validation of all your algorithms with Fold Count 4 and 8 and the specified Maximum Cases. Which algorithm is the best, and which varies less? Which is the better choice?
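Fold Count and Maximum Cases are parameters of the mining tool used in the course; as a rough sketch of the same experiment outside that tool (scikit-learn and the stand-in dataset are assumptions, not the project's data):

    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import cross_val_score
    from sklearn.naive_bayes import GaussianNB
    from sklearn.neural_network import MLPClassifier
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_breast_cancer(return_X_y=True)   # placeholder dataset

    models = {
        "DT": DecisionTreeClassifier(random_state=0),
        "NB": GaussianNB(),
        "NN": MLPClassifier(max_iter=2000, random_state=0),
    }

    for folds in (4, 8):                         # the slide's Fold Count values
        for name, model in models.items():
            scores = cross_val_score(model, X, y, cv=folds)
            # The mean answers "which is best"; the standard deviation answers "which varies less".
            print(f"{folds}-fold {name}: mean={scores.mean():.3f} std={scores.std():.3f}")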

10 Decision Tree

11

12 Naïve Bayes

13

14 NN

15

16 Paired Sample t Test
Given a set of paired observations (from two normal populations):

A     B     Δ = A − B
x1    y1    x1 − y1
x2    y2    x2 − y2
x3    y3    x3 − y3
x4    y4    x4 − y4
x5    y5    x5 − y5

17 Calculate the mean and the standard deviation s of the differences Δ.
H0: μ = 0 (no difference)    H0: μ = k (the difference is a constant)

18 DT

19 NB

20 NN

21 DT [182 – 188 – ] NB [184 – – ] NN [163.1 – 166 – 168]

22 DT [89.28 – 94 – 98.72] NB [91.73 – – ] NN [76.38 – – 90.38]

23 Shannon formalized these intuitions
Given a universe of messages M = {m1, m2, ..., mn} and a probability p(mi) for the occurrence of each message, the information content (also called entropy) of a message M is given by
I(M) = Σ_i p(mi) · (−log2 p(mi))
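A small sketch of the formula (standard Shannon entropy; the function name is my own):

    import math

    def entropy(probabilities):
        # Sum of p * (-log2 p) over all messages; result is in bits.
        return sum(p * -math.log2(p) for p in probabilities if p > 0)

    print(entropy([0.5, 0.5]))   # 1.0 bit: balanced boolean classification
    print(entropy([1.0]))        # 0.0 bits: every instance in the same class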

24 The amount of information needed to complete the tree is defined as the weighted average of the information content of each subtree, weighted by the percentage of the examples present in that subtree. Let C be a set of training instances. If a property P (for example, income) has n values, C will be divided into the subsets {C1, C2, ..., Cn}. The expected information needed to complete the tree after making P the root is
E(P) = Σ_i (|Ci| / |C|) · I(Ci)

25 The gain from the property P is computed by subtracting the expected information needed to complete the tree, E(P), from the total information:
gain(P) = I(C) − E(P)
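Putting slides 23-25 together, a sketch of E(P) and gain(P) on a toy attribute (the data, names, and income example are illustrative only):

    from collections import Counter
    import math

    def entropy(labels):
        n = len(labels)
        return sum((c / n) * -math.log2(c / n) for c in Counter(labels).values())

    def gain(instances, attribute, label):
        # gain(P) = I(C) - E(P), where E(P) is the size-weighted entropy of the subsets Ci.
        total = entropy([row[label] for row in instances])
        expected = 0.0
        for value in {row[attribute] for row in instances}:
            subset = [row[label] for row in instances if row[attribute] == value]
            expected += (len(subset) / len(instances)) * entropy(subset)
        return total - expected

    # Toy training set: does the "income" property help predict the class "buys"?
    data = [
        {"income": "high", "buys": "no"},
        {"income": "high", "buys": "no"},
        {"income": "low",  "buys": "yes"},
        {"income": "low",  "buys": "yes"},
    ]
    print(gain(data, "income", "buys"))   # 1.0 here: income fully determines the class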

26 2. (6 pts) Decision Tree

27

28

29

