The Good News: Why are Decision Trees Currently Quite Popular for Classification Problems?

- Very robust --- good average testing performance: they outperform other methods over sets of diverse benchmarks.
- Decision trees are still somewhat understandable for domain experts.
- Very useful in the early stages of a data analysis project: attributes near the root are very important, attributes near the leaves are somewhat important, and attributes that do not occur, or occur only rarely near the leaves, are not important.
- The information gain heuristic avoids searching a huge search space --- claim: it searches an NP-hard search space quite well (finding an optimal decision tree is NP-hard).
- The approach avoids the combinatorial explosion of rules/nodes that other approaches face, through the use of sophisticated pruning techniques and its hierarchical knowledge representation.
- Can cope with missing data, noisy data, and mixed (numerical and symbolic) data.
- Easy to use; does not require the user to provide additional domain knowledge. The simplicity of the approach is appealing.
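The information gain heuristic mentioned above can be sketched in a few lines of Python. The attribute names and toy data below are hypothetical illustrations, not from any real benchmark: the attribute whose split most reduces label entropy would be chosen for the root.

```python
# A minimal sketch of the information-gain heuristic used to pick the
# root attribute of a decision tree (attribute names and data are toy values).
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Entropy reduction from splitting the data on one symbolic attribute."""
    total = entropy(labels)
    n = len(labels)
    # Partition the labels by the attribute's value in each row.
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return total - remainder

# Toy data: does a customer buy (yes/no), given two symbolic attributes.
rows = [
    {"income": "high", "student": "no"},
    {"income": "high", "student": "no"},
    {"income": "low",  "student": "yes"},
    {"income": "low",  "student": "no"},
]
labels = ["no", "no", "yes", "yes"]

# "income" separates the classes perfectly, so its gain equals the full
# label entropy (1.0 bit) and it would be placed at the root.
print(information_gain(rows, labels, "income"))   # 1.0
print(information_gain(rows, labels, "student"))  # ~0.311
```

Greedily re-applying this choice to each partition yields the tree top-down, which is why depth in the final tree loosely tracks attribute importance.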

Decision Trees: The Bad News

- Rely on rectangular (axis-parallel) approximations of the decision boundary --- this kind of approximation is sometimes not well suited for particular application domains.
- Decision trees rely on the ordering of attribute values, not on their absolute differences; e.g. 5>3>1 and 3.0001>3>2.9999 are the same in the context of C5.0. Basically, decision trees employ ordering-based classification, in contrast to the distance-based classification used by techniques such as nearest neighbors. If the notion of distance is of key importance for an application, decision trees might be less suitable for it.
- Not necessarily good for applications in which many attributes have a minor impact and very few or no attributes have a major impact on a decision --- this violates the hierarchical nature of decision trees.
- Data collections have to be in flat-file format, which causes problems with multi-valued attributes (but other approaches face similar problems).

Summary: Although decision trees might not be "perfect" for all applications, I consider decision trees one of the most promising machine learning and data mining technologies for classification tasks.
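The ordering-based vs. distance-based contrast above can be illustrated with a sketch in pure Python (all functions and data are hypothetical toy constructions, not C5.0 itself): a one-split "decision stump" stands in for a tree. A monotone rescaling of the attribute preserves the ordering, so the stump's prediction is unchanged, but it distorts distances, so a 1-nearest-neighbor prediction can flip.

```python
# Ordering-based (decision stump) vs. distance-based (1-NN) classification
# under a monotone rescaling of the attribute. Toy data, hypothetical setup.
from math import log

def fit_stump(xs, ys):
    """Pick the training value v whose split 'x <= v' misclassifies the
    fewest training points; return the resulting predict function."""
    best = None
    for v in xs:
        for left, right in ((0, 1), (1, 0)):
            errors = sum(1 for x, y in zip(xs, ys)
                         if (left if x <= v else right) != y)
            if best is None or errors < best[0]:
                best = (errors, v, left, right)
    _, v, left, right = best
    return lambda x: left if x <= v else right

def nn1(xs, ys, query):
    """1-nearest-neighbor prediction on one numeric attribute."""
    return min(zip(xs, ys), key=lambda p: abs(p[0] - query))[1]

xs = [1.0, 2.0, 3.0, 10.0]
ys = [0, 0, 0, 1]

# Monotone rescaling: ordering preserved, absolute differences are not.
zs = [log(x) for x in xs]

query = 6.0
stump_raw = fit_stump(xs, ys)(query)        # split at x <= 3 -> class 1
stump_log = fit_stump(zs, ys)(log(query))   # same split in log space -> class 1

nn_raw = nn1(xs, ys, query)        # nearest raw neighbor is 3.0  -> class 0
nn_log = nn1(zs, ys, log(query))   # nearest log neighbor is log(10) -> class 1
print(stump_raw, stump_log, nn_raw, nn_log)
```

The stump (like a tree) only asks which side of a training value the query falls on, so any order-preserving transformation leaves its answer intact, while the nearest-neighbor answer depends on the distance scale.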

Decision Trees & the Concept Learning / Classification Tool Market

Main competitors (performance is "comparable" to decision trees):
- Neural networks (good overall learning performance, but it is hard to tell what they learned)
- Discriminant analysis (sound theoretical foundation, but not very stable learning performance: does very well on some benchmarks and very badly on others)

Other competitors ("inferior performance" or other problems):
- Fuzzy techniques (combinatorial explosion of rules, not easy to use, lack of heuristics, poor learning performance)
- Association rule finding (needs symbolic data sets, combinatorial explosion of rules)
- Bayesian rule-learning approaches (many diverse approaches, which makes it difficult to evaluate the members of this group; most approaches are restricted to symbolic data sets)
- Classical and symbolic regression (poor learning performance)
- Nearest neighbor (success depends strongly on the availability of a "good" distance function; learning performance not very stable)
- Logic-based rule-learning approaches, such as the AQ family (currently not very popular)

Remark: This evaluation is based on research projects, conducted by the author and his students Y.J. Kim, Brandon Rabke, Ruijiang Zhang, Jim Reynolds, and Zheng Wen, that benchmarked various approaches.
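Comparisons like the one above rest on benchmarking loops of the following general shape. This is a minimal hypothetical sketch, not the actual experimental setup of the projects mentioned: leave-one-out cross-validation scoring two stand-in classifiers (1-nearest-neighbor and a majority-class baseline) on a toy one-attribute dataset.

```python
# A minimal sketch of a classifier-benchmarking loop: leave-one-out
# cross-validation over a toy dataset. Both classifiers and the data are
# hypothetical stand-ins for the methods compared in real benchmarks.

def majority_baseline(train_x, train_y, query):
    """Predict the most common training label, ignoring the query."""
    return max(set(train_y), key=train_y.count)

def nearest_neighbor(train_x, train_y, query):
    """Predict the label of the closest training point."""
    return min(zip(train_x, train_y), key=lambda p: abs(p[0] - query))[1]

def leave_one_out_accuracy(classifier, xs, ys):
    """Hold out each point in turn, train on the rest, score the prediction."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        hits += classifier(train_x, train_y, xs[i]) == ys[i]
    return hits / len(xs)

xs = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]
ys = [0, 0, 0, 1, 1, 1]

acc_nn = leave_one_out_accuracy(nearest_neighbor, xs, ys)
acc_base = leave_one_out_accuracy(majority_baseline, xs, ys)
print(acc_nn, acc_base)  # well-separated classes favor 1-NN over the baseline
```

Real comparisons would repeat this over many diverse datasets and use a significance test across them, since a single benchmark says little about relative performance.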