The Good News: Why are Decision Trees Currently Quite Popular for Classification Problems?

- Very robust --- good average testing performance: they outperform other methods over sets of diverse benchmarks.
- Decision trees are still somewhat understandable for domain experts.
- Very useful in the early stages of a data analysis project: attributes near the root are very important, attributes near the leaves are somewhat important, and attributes that do not occur, or occur only rarely near the leaves, are not important.
- The information gain heuristic avoids searching a huge search space --- claim: it searches an NP-hard search space quite well (finding an optimal decision tree is NP-hard).
- The approach avoids the combinatorial explosion of rules/nodes that other approaches face, through the use of sophisticated pruning techniques and its hierarchical knowledge representation.
- Can cope with missing data, noisy data, and mixed (numerical and symbolic) data.
- Easy to use; does not require the user to provide additional domain knowledge. The simplicity of the approach is appealing.
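The information gain heuristic mentioned above can be sketched in a few lines of Python. The attribute names and toy data below are hypothetical illustrations, not from any real benchmark: the attribute whose split most reduces label entropy would be chosen for the root.

```python
# A minimal sketch of the information-gain heuristic used to pick the
# root attribute of a decision tree (attribute names and data are toy values).
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Entropy reduction from splitting the data on one symbolic attribute."""
    total = entropy(labels)
    n = len(labels)
    # Partition the labels by the attribute's value in each row.
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return total - remainder

# Toy data: does a customer buy (yes/no), given two symbolic attributes.
rows = [
    {"income": "high", "student": "no"},
    {"income": "high", "student": "no"},
    {"income": "low",  "student": "yes"},
    {"income": "low",  "student": "no"},
]
labels = ["no", "no", "yes", "yes"]

# "income" separates the classes perfectly, so its gain equals the full
# label entropy (1.0 bit) and it would be placed at the root.
print(information_gain(rows, labels, "income"))   # 1.0
print(information_gain(rows, labels, "student"))  # ~0.311
```

Greedily re-applying this choice to each partition yields the tree top-down, which is why depth in the final tree loosely tracks attribute importance.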

Decision Trees: The Bad News

- Rely on rectangular (axis-parallel) approximations of the decision boundary --- this kind of approximation is sometimes not well suited for particular application domains.
- Decision trees rely on the ordering of attribute values, not on their absolute differences; e.g. 5>3>1 and 3.0001>3>2.9999 are the same in the context of C5.0. Basically, decision trees employ ordering-based classification, in contrast to the distance-based classification used by techniques such as nearest neighbors. If the notion of distance is of key importance for an application, decision trees might be less suitable for it.
- Not necessarily good for applications in which many attributes have a minor impact and very few or no attributes have a major impact on a decision --- this violates the hierarchical nature of decision trees.
- Data collections have to be in flat-file format, which causes problems with multi-valued attributes (but other approaches face similar problems).

Summary: Although decision trees might not be "perfect" for all applications, I consider decision trees one of the most promising machine learning and data mining technologies for classification tasks.
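The ordering-based vs. distance-based contrast above can be illustrated with a sketch in pure Python (all functions and data are hypothetical toy constructions, not C5.0 itself): a one-split "decision stump" stands in for a tree. A monotone rescaling of the attribute preserves the ordering, so the stump's prediction is unchanged, but it distorts distances, so a 1-nearest-neighbor prediction can flip.

```python
# Ordering-based (decision stump) vs. distance-based (1-NN) classification
# under a monotone rescaling of the attribute. Toy data, hypothetical setup.
from math import log

def fit_stump(xs, ys):
    """Pick the training value v whose split 'x <= v' misclassifies the
    fewest training points; return the resulting predict function."""
    best = None
    for v in xs:
        for left, right in ((0, 1), (1, 0)):
            errors = sum(1 for x, y in zip(xs, ys)
                         if (left if x <= v else right) != y)
            if best is None or errors < best[0]:
                best = (errors, v, left, right)
    _, v, left, right = best
    return lambda x: left if x <= v else right

def nn1(xs, ys, query):
    """1-nearest-neighbor prediction on one numeric attribute."""
    return min(zip(xs, ys), key=lambda p: abs(p[0] - query))[1]

xs = [1.0, 2.0, 3.0, 10.0]
ys = [0, 0, 0, 1]

# Monotone rescaling: ordering preserved, absolute differences are not.
zs = [log(x) for x in xs]

query = 6.0
stump_raw = fit_stump(xs, ys)(query)        # split at x <= 3 -> class 1
stump_log = fit_stump(zs, ys)(log(query))   # same split in log space -> class 1

nn_raw = nn1(xs, ys, query)        # nearest raw neighbor is 3.0  -> class 0
nn_log = nn1(zs, ys, log(query))   # nearest log neighbor is log(10) -> class 1
print(stump_raw, stump_log, nn_raw, nn_log)
```

The stump (like a tree) only asks which side of a training value the query falls on, so any order-preserving transformation leaves its answer intact, while the nearest-neighbor answer depends on the distance scale.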

Decision Trees & the Concept Learning / Classification Tool Market

Main competitors (performance is "comparable" to decision trees):
- Neural networks (good overall learning performance, but it is hard to tell what they learned)
- Discriminant analysis (sound theoretical foundation, but not very stable learning performance: does very well on some benchmarks and very badly on others)

Other competitors ("inferior performance" or other problems):
- Fuzzy techniques (combinatorial explosion of rules, not easy to use, lack of heuristics, poor learning performance)
- Association rule finding (needs symbolic data sets, combinatorial explosion of rules)
- Bayesian rule-learning approaches (many diverse approaches, which makes it difficult to evaluate the members of this group; most approaches are restricted to symbolic data sets)
- Classical and symbolic regression (poor learning performance)
- Nearest neighbor (success depends strongly on the availability of a "good" distance function; learning performance not very stable)
- Logic-based rule-learning approaches, such as the AQ family (currently not very popular)

Remark: This evaluation is based on research projects, conducted by the author and his students Y.J. Kim, Brandon Rabke, Ruijiang Zhang, Jim Reynolds, and Zheng Wen, that benchmarked various approaches.
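Comparisons like the one above rest on benchmarking loops of the following general shape. This is a minimal hypothetical sketch, not the actual experimental setup of the projects mentioned: leave-one-out cross-validation scoring two stand-in classifiers (1-nearest-neighbor and a majority-class baseline) on a toy one-attribute dataset.

```python
# A minimal sketch of a classifier-benchmarking loop: leave-one-out
# cross-validation over a toy dataset. Both classifiers and the data are
# hypothetical stand-ins for the methods compared in real benchmarks.

def majority_baseline(train_x, train_y, query):
    """Predict the most common training label, ignoring the query."""
    return max(set(train_y), key=train_y.count)

def nearest_neighbor(train_x, train_y, query):
    """Predict the label of the closest training point."""
    return min(zip(train_x, train_y), key=lambda p: abs(p[0] - query))[1]

def leave_one_out_accuracy(classifier, xs, ys):
    """Hold out each point in turn, train on the rest, score the prediction."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        hits += classifier(train_x, train_y, xs[i]) == ys[i]
    return hits / len(xs)

xs = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]
ys = [0, 0, 0, 1, 1, 1]

acc_nn = leave_one_out_accuracy(nearest_neighbor, xs, ys)
acc_base = leave_one_out_accuracy(majority_baseline, xs, ys)
print(acc_nn, acc_base)  # well-separated classes favor 1-NN over the baseline
```

Real comparisons would repeat this over many diverse datasets and use a significance test across them, since a single benchmark says little about relative performance.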