SAD: 6th Project

Lift Charts. Comparing classifiers in a direct-mailing scenario with 1,000,000 prospective respondents:
- one prediction is that 0.1% of all 1,000,000 households will respond (1,000 responses);
- another is that 0.4% of a specified subset of 100,000 households will respond (400 responses).
The lift factor is the increase in response rate: 0.4% / 0.1% = 4. Given a classifier that outputs a probability for the predicted class value of each test instance, what do we do? Sort the instances in descending order of that probability; a lift chart is obtained by evaluating successively larger top-ranked samples.

Lift Factor
sample success proportion = (number of positive instances in sample) / (sample size)
lift factor = (sample success proportion) / (total test-set success proportion)
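A minimal sketch of these two formulas in Python (the function and variable names are illustrative, not part of the original slides):

```python
def lift_factor(positives_in_sample, sample_size,
                positives_in_total, total_size):
    """Lift factor = sample success proportion / total success proportion."""
    sample_proportion = positives_in_sample / sample_size
    total_proportion = positives_in_total / total_size
    return sample_proportion / total_proportion

# The mailing example above: 400 respondents among 100,000 selected homes
# versus 1,000 respondents among all 1,000,000 households.
print(lift_factor(400, 100_000, 1_000, 1_000_000))  # 4.0
```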

Lift

Evaluation: the confusion matrix.

   a  b   <-- classified as
   7  2 |  a = yes
   4  1 |  b = no

Correctly classified instances: 7 + 1 = 8. Incorrectly classified instances: 2 + 4 = 6.

Comments: for a boolean classification, the entropy is 0 if all entities belong to the same class; the entropy is 1 if the collection contains an equal number of positive and negative examples. The typical measure of entropy is the number of bits of information needed to encode the classification. Note that the first term of the gain is the entropy of the original collection, and the second term is the expected value of the entropy after the collection C is partitioned using attribute A; the gain is the expected reduction in entropy caused by knowing the value of attribute A.
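As a small illustration, the summary counts above can be read off the matrix directly; a Python sketch using the slide's numbers (the layout convention, rows = actual and columns = predicted, is an assumption):

```python
# Confusion matrix from the slide: rows = actual class, columns = predicted.
confusion = [
    [7, 2],  # actual a = yes: 7 classified as yes, 2 as no
    [4, 1],  # actual b = no:  4 classified as yes, 1 as no
]

correct = sum(confusion[i][i] for i in range(len(confusion)))
total = sum(sum(row) for row in confusion)

print("Correctly classified:", correct)            # 8
print("Incorrectly classified:", total - correct)  # 6
print("Accuracy:", correct / total)                # 8/14, about 0.571
```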

b) (1 pt) Perform cross-validation of all your algorithms with fold counts 4 and 8; Maximum Cases should be 1000. Which algorithm performs best, and which varies less? Which is the better choice?
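For intuition, a hedged sketch of this comparison in Python with scikit-learn (assumed available; it is not the course's own tool, whose Fold Count and Maximum Cases parameters map onto `cv` and the 1000-row cap here, and NN is read as nearest neighbour, so substitute a neural network if that is what is meant):

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

# Placeholder data capped at 1000 cases; load your own X and y instead.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

models = {
    "DT": DecisionTreeClassifier(),
    "NB": GaussianNB(),
    "NN": KNeighborsClassifier(),
}

for folds in (4, 8):
    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=folds)
        # The mean accuracy says which is best; the std says which varies less.
        print(f"{folds}-fold {name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```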

Decision Tree

Naïve Bayes

NN

Paired Sample t Test. Given a set of paired observations (from two normal populations):

A     B     Δ = A - B
x1    y1    x1 - y1
x2    y2    x2 - y2
x3    y3    x3 - y3
x4    y4    x4 - y4
x5    y5    x5 - y5

Calculate the mean and the standard deviation s of the differences Δ.
H0: μ = 0 (no difference)
H0: μ = k (the difference is a constant)
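A minimal sketch of the test in Python, assuming scipy is available and the paired samples sit in two equal-length arrays (the numbers are invented placeholders):

```python
import numpy as np
from scipy import stats

# Illustrative paired observations; replace with your own A and B columns.
a = np.array([188.0, 185.0, 192.0, 190.0, 187.0])
b = np.array([183.0, 184.0, 186.0, 189.0, 185.0])

diff = a - b
mean_diff = diff.mean()
s = diff.std(ddof=1)  # sample standard deviation of the differences

# t statistic for H0: mu = 0, computed by hand...
t_manual = mean_diff / (s / np.sqrt(len(diff)))

# ...and via the library routine, which agrees with the manual value.
t_stat, p_value = stats.ttest_rel(a, b)
print(t_manual, t_stat, p_value)
```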

DT

NB

NN

DT: 188 ± 5.79 [182 – 188 – 193.79]
NB: 196 ± 7.43 [184 – 191.5 – 196.93]
NN: 166 ± 2.9 [163.1 – 166 – 168]

DT: 94 ± 4.72 [89.28 – 94 – 98.72]
NB: 96.13 ± 4.4 [91.73 – 96.13 – 100.53]
NN: 83.38 ± 7 [76.38 – 83.38 – 90.38]

Shannon formalized these intuitions. Given a universe of messages M = {m1, m2, ..., mn} and a probability p(mi) for the occurrence of each message, the information content (also called entropy) of the message set M is given by:

I(M) = Σi -p(mi) * log2(p(mi))
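A small Python sketch of this formula (the function name is mine):

```python
import math

def information_content(probabilities):
    """Shannon entropy I(M) = sum of -p(mi) * log2(p(mi)), in bits."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)

# A fair coin carries 1 bit of information; a certain outcome carries 0 bits.
print(information_content([0.5, 0.5]))  # 1.0
print(information_content([1.0]))       # 0.0
```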

The amount of information needed to complete the tree is defined as the weighted average of the information content of each subtree, weighted by the percentage of the examples present in it. Let C be a set of training instances. If a property P (for example, income) has n values, C will be divided into the subsets {C1, C2, ..., Cn}. The expected information needed to complete the tree after making P the root is:

E(P) = Σi (|Ci| / |C|) * I(Ci)

The gain from the property P is computed by subtracting the expected information needed to complete the tree, E(P), from the total information:

gain(P) = I(C) - E(P)
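Putting the two formulas together, a hedged Python sketch of information gain (the helper names and the example split are mine; the split mirrors the classic 14-instance weather data):

```python
from math import log2

def subset_entropy(labels):
    """Entropy I(Ci) of a list of class labels, from empirical probabilities."""
    total = len(labels)
    probs = [labels.count(v) / total for v in set(labels)]
    return sum(-p * log2(p) for p in probs if p > 0)

def gain(all_labels, partition):
    """gain(P) = I(C) - E(P), where partition is the list of subsets C1..Cn."""
    total = len(all_labels)
    expected = sum(len(ci) / total * subset_entropy(ci) for ci in partition)
    return subset_entropy(all_labels) - expected

# Invented example: 14 instances split three ways by a property P.
labels = ["yes"] * 9 + ["no"] * 5
partition = [["yes"] * 2 + ["no"] * 3,
             ["yes"] * 4,
             ["yes"] * 3 + ["no"] * 2]
print(gain(labels, partition))  # about 0.247 bits
```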

2. (6 pts) Decision Tree