SAD: Project 6
Lift Charts
Comparing classifiers: suppose there are 1,000,000 prospective respondents. Mailing to all 1,000,000 households is predicted to yield a 0.1% response rate, while mailing only to a specified subset of 100,000 homes is predicted to yield a 0.4% response rate. The lift factor is the increase in response rate: 0.4% / 0.1% = 4. Given a classifier that outputs probabilities for the predicted class value of each test instance, what do we do with them?
Lift Factor
sample success proportion = (number of positive instances in sample) / (sample size)
lift factor = (sample success proportion) / (total test set success proportion)
Lift chart (figure)
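To answer the question above, one option, sketched below in Python with made-up scores and labels, is to rank the test instances by the predicted probability of the positive class, take the top of the ranking as the sample, and compare its success proportion with the overall success proportion; doing this for increasing sample sizes gives the points of a lift chart.

def lift_factor(ranked, sample_size):
    # ranked: list of (p_yes, actual) pairs sorted by p_yes descending,
    # where actual is True for a positive ("yes") instance.
    total_rate = sum(actual for _, actual in ranked) / len(ranked)
    sample = ranked[:sample_size]
    sample_rate = sum(actual for _, actual in sample) / sample_size
    return sample_rate / total_rate

# Hypothetical scores: instances ranked by the predicted probability of "yes".
scored = sorted([(0.9, True), (0.8, False), (0.7, True), (0.4, False),
                 (0.3, False), (0.2, False), (0.1, False), (0.05, False)],
                key=lambda pair: pair[0], reverse=True)
print(lift_factor(scored, 4))   # success rate of the top half vs. the whole set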
Evaluation: The confusion matrix

    a   b   <-- classified as
    7   2   |   a = yes
    4   1   |   b = no

The diagonal entries (7 and 1) are the correctly classified instances; the off-diagonal entries (2 and 4) are the incorrectly classified instances.

Comments: For a boolean classification, the entropy is 0 if all entities belong to the same class, and the entropy is 1 if the collection contains an equal number of positive and negative examples. The typical measure of entropy is the number of bits of information needed to encode the classification. Note that the first term of the gain is the entropy of the original collection, and the second term is the expected value of the entropy after C is partitioned using attribute A. Gain is the expected reduction in entropy caused by knowing the value of attribute A.
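As an illustration of reading this matrix, a minimal Python sketch (the dictionary layout is just one way to hold the counts):

# Confusion matrix from the slide: outer keys are the actual class,
# inner keys are the predicted class.
matrix = {"yes": {"yes": 7, "no": 2},   # actual a = yes
          "no":  {"yes": 4, "no": 1}}   # actual b = no

correct = sum(matrix[c][c] for c in matrix)                 # diagonal: 7 + 1
total = sum(sum(row.values()) for row in matrix.values())   # all 14 instances
incorrect = total - correct                                 # off-diagonal: 2 + 4

print(f"correctly classified:   {correct} of {total}")
print(f"incorrectly classified: {incorrect} of {total}")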
b) (1 pt) Perform cross-validation of all your algorithms with fold counts 4 and 8; Maximum Cases should be 1000. Which algorithm is the best, and which varies less? Which is the better choice? (A sketch of an analogous check appears after the list of algorithms below.)
Decision Tree
Naïve Bayes
NN
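A sketch of an analogous comparison using scikit-learn rather than the assignment's own tool; the dataset and the choice of MLPClassifier to stand in for "NN" are assumptions made for illustration:

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X, y = X[:1000], y[:1000]   # cap the number of cases at 1000, as in the assignment

models = {"DT": DecisionTreeClassifier(random_state=0),
          "NB": GaussianNB(),
          "NN": MLPClassifier(max_iter=2000, random_state=0)}

for folds in (4, 8):
    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=folds)
        # The mean tells which algorithm does best; the standard deviation
        # tells which one varies less across folds.
        print(f"{folds}-fold {name}: {scores.mean():.3f} +- {scores.std():.3f}")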
Paired Sample t Test
Given a set of paired observations (from two normal populations):

    A     B     D = A - B
    x1    y1    x1 - y1
    x2    y2    x2 - y2
    x3    y3    x3 - y3
    x4    y4    x4 - y4
    x5    y5    x5 - y5
Calculate the mean d̄ and the standard deviation s of the differences; the test statistic is t = (d̄ - μ0) / (s / √n) with n - 1 degrees of freedom.
H0: μ = 0 (no difference)
H0: μ = k (the difference is a constant k)
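A minimal sketch of the paired test in Python, using hypothetical per-fold accuracies for two classifiers A and B evaluated on the same folds:

from math import sqrt
from scipy import stats

a = [0.91, 0.89, 0.93, 0.90, 0.92]   # classifier A, one value per fold (made up)
b = [0.88, 0.87, 0.90, 0.89, 0.90]   # classifier B on the same folds (made up)

d = [x - y for x, y in zip(a, b)]    # paired differences D = A - B
n = len(d)
mean_d = sum(d) / n
s = sqrt(sum((x - mean_d) ** 2 for x in d) / (n - 1))
t = mean_d / (s / sqrt(n))           # test statistic for H0: mu = 0

print(f"mean difference = {mean_d:.4f}, s = {s:.4f}, t = {t:.3f}")
print(stats.ttest_rel(a, b))         # the same test via SciPy, for comparison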
DT
NB
NN
DT  188 ± 5.79     [182 – 188 – 193.79]
NB  196 ± 7.43     [184 – 191.5 – 196.93]
NN  166 ± 2.9      [163.1 – 166 – 168]
DT  94 ± 4.72      [89.28 – 94 – 98.72]
NB  96.13 ± 4.4    [91.73 – 96.13 – 100.53]
NN  83.38 ± 7      [76.38 – 83.38 – 90.38]
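The bracketed intervals above appear to be mean ± the reported spread. A sketch of producing such an interval from per-fold results (the fold values here are hypothetical):

from math import sqrt

# Hypothetical per-fold results for one classifier; the real values come from
# the cross-validation runs above.
folds = [90, 98, 95, 93]

n = len(folds)
mean = sum(folds) / n
s = sqrt(sum((x - mean) ** 2 for x in folds) / (n - 1))

# Reported as: mean ± s  [mean - s  –  mean  –  mean + s]
print(f"{mean:.2f} ± {s:.2f} [{mean - s:.2f} – {mean:.2f} – {mean + s:.2f}]")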
Shannon formalized these intuitions. Given a universe of messages M = {m1, m2, ..., mn} and a probability p(mi) for the occurrence of each message, the information content (also called entropy) of a message M is given by:

    I[M] = Σ -p(mi) log2 p(mi)   (summed over i = 1, ..., n)
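A minimal sketch of this formula in Python:

from math import log2

def information(probabilities):
    # I[M] = sum of -p(mi) * log2 p(mi) over all messages mi
    return sum(-p * log2(p) for p in probabilities if p > 0)

# Boolean classification: 1 bit for an even split, 0 when one class holds all examples.
print(information([0.5, 0.5]))   # 1.0
print(information([1.0]))        # 0.0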
The amount of information needed to complete the tree is defined as the weighted average of the information content of each subtree, weighted by the percentage of the examples present in it. Let C be a set of training instances. If property P (for example, income) has n values, C will be divided into the subsets {C1, C2, ..., Cn}. The expected information needed to complete the tree after making P the root is:

    E[P] = Σ (|Ci| / |C|) I[Ci]   (summed over i = 1, ..., n)
The gain from the property P is computed by subtracting the expected information needed to complete the tree, E[P], from the total information:

    gain(P) = I[C] - E[P]
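A sketch that puts I[C], E[P], and gain(P) together in Python; the tiny training set with an income property is hypothetical:

from math import log2

def information(instances, target):
    # I[C]: entropy of the class attribute over the instance set
    counts = {}
    for row in instances:
        counts[row[target]] = counts.get(row[target], 0) + 1
    total = len(instances)
    return sum(-c / total * log2(c / total) for c in counts.values())

def expected_information(instances, prop, target):
    # E[P]: information of each subset Ci weighted by |Ci| / |C|
    subsets = {}
    for row in instances:
        subsets.setdefault(row[prop], []).append(row)
    total = len(instances)
    return sum(len(ci) / total * information(ci, target)
               for ci in subsets.values())

def gain(instances, prop, target):
    # gain(P) = I[C] - E[P]
    return information(instances, target) - expected_information(instances, prop, target)

# Tiny hypothetical training set with an income property and a yes/no class.
C = [{"income": "high",   "class": "no"},
     {"income": "high",   "class": "no"},
     {"income": "medium", "class": "yes"},
     {"income": "low",    "class": "yes"},
     {"income": "low",    "class": "yes"}]

print(gain(C, "income", "class"))   # I[C] - E[income]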
2. (6 pts) Decision Tree