Neural Trees Olcay Taner Yıldız, Ethem Alpaydın Boğaziçi University Computer Engineering Department

Overview
- Decision Trees
- Neural Trees
  - Linear Model
  - Nonlinear Model
  - Hybrid Model
- Class Separation Problem
  - Selection Method
  - Exchange Method
- Results
- Conclusion and Future Work

Decision Trees

Neural Trees
- A neural network at each decision node
- Three neural network models:
  - Linear perceptron
  - Multilayer perceptron (Guo & Gelfand, 1992)
  - Hybrid model (a statistical test decides between the linear and the nonlinear model)
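
As a rough sketch, the two trainable node models might be instantiated as below; the scikit-learn classes, hidden-layer width, and iteration limit are illustrative assumptions, not the exact setup of the paper.

```python
from sklearn.linear_model import Perceptron
from sklearn.neural_network import MLPClassifier

def make_node_model(kind, n_hidden=8):
    """Illustrative stand-ins for the node models (assumed, not the paper's setup)."""
    if kind == "linear":
        return Perceptron(max_iter=1000)          # single-layer (linear) perceptron
    if kind == "mlp":
        return MLPClassifier(hidden_layer_sizes=(n_hidden,),
                             max_iter=1000)       # one hidden layer (Guo & Gelfand, 1992)
    raise ValueError(kind)  # the hybrid model picks between the two per node,
                            # using the statistical test on the next slide
```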

Network Models
- Linear perceptron
- Multilayer perceptron
- Hybrid: select the multilayer or the linear perceptron according to the result of a 5×2 cv F-test at the 95% confidence level
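
A minimal sketch of the combined 5×2 cv F-test (Alpaydın, 1999) behind the hybrid choice; the function name is illustrative, and `p` is assumed to hold the ten error-rate differences p_ij between the two perceptrons on fold j of replication i.

```python
import numpy as np
from scipy.stats import f as f_dist

def five_by_two_cv_f_test(p, alpha=0.05):
    """Combined 5x2 cv F-test: p[i, j] is the difference in error rates of
    the two models on fold j of replication i of 2-fold cross-validation."""
    p = np.asarray(p, dtype=float).reshape(5, 2)
    p_bar = p.mean(axis=1, keepdims=True)
    s2 = ((p - p_bar) ** 2).sum(axis=1)      # per-replication variance
    F = (p ** 2).sum() / (2.0 * s2.sum())    # ~ F(10, 5) under the null
    p_value = 1.0 - f_dist.cdf(F, 10, 5)
    return F, p_value < alpha                # True -> difference is significant
```

With alpha = 0.05 (the 95% confidence level), a hybrid node would keep the multilayer perceptron only when the test finds a significant difference in its favor.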

Training of Neural Trees
1. Divide the k classes at that node into two groups.
2. Solve the resulting two-class problem with the neural network model at that node.
3. Repeat steps 1 and 2 recursively for each of the two child nodes until every node contains only one class.
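
A minimal sketch of this recursion, assuming integer class labels, a linear perceptron at each node, and a `split_classes` heuristic (the selection or exchange method of the following slides); all names here are illustrative.

```python
from collections import Counter
import numpy as np
from sklearn.linear_model import Perceptron

class Node:
    def __init__(self):
        self.model = None              # binary discriminant at this node
        self.left = self.right = None
        self.label = None              # class label if this is a leaf

def fit_tree(X, y, split_classes):
    node = Node()
    classes = np.unique(y)
    if len(classes) == 1:              # step 3's stopping rule: one class -> leaf
        node.label = classes[0]
        return node
    left_set, right_set = split_classes(X, y, classes)     # step 1
    target = np.isin(y, right_set).astype(int)
    node.model = Perceptron().fit(X, target)               # step 2
    side = node.model.predict(X)
    if side.min() == side.max():       # degenerate split: fall back to a leaf
        node.model, node.label = None, Counter(y).most_common(1)[0][0]
        return node
    node.left = fit_tree(X[side == 0], y[side == 0], split_classes)
    node.right = fit_tree(X[side == 1], y[side == 1], split_classes)
    return node

def predict_one(node, x):
    while node.label is None:
        node = node.right if node.model.predict([x])[0] else node.left
    return node.label
```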

Class Separation Problem
- Dividing k classes into two groups can be done in 2^(k-1) - 1 different ways, which is too many for large k.
- Two heuristic methods:
  - Selection method, O(k)
  - Exchange method, O(k^2)
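
To see where the 2^(k-1) - 1 count comes from: fixing one class on the left side removes mirror-image duplicates, leaving one non-trivial split per non-empty subset of the remaining k-1 classes. A small sketch (names illustrative):

```python
from itertools import combinations

def bipartitions(classes):
    """Yield the 2**(len(classes)-1) - 1 two-way splits, fixing classes[0]
    on the left so mirror-image splits are not counted twice."""
    first, rest = classes[0], classes[1:]
    for r in range(len(rest) + 1):
        for chosen in combinations(rest, r):
            right = [c for c in rest if c not in chosen]
            if right:                       # skip the split with an empty right side
                yield [first, *chosen], right

# k = 4 classes -> 2**3 - 1 = 7 candidate splits
print(sum(1 for _ in bipartitions(["A", "B", "C", "D"])))   # 7
```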

Selection Method
1. Select two classes C_i and C_j at random; put one in C_L and the other in C_R.
2. Train the discriminant with the given partition; do not consider the instances of the other classes yet.
3. Among the remaining classes in the class list, search for the class C_k that is best placed into one of the partitions.
4. Add C_k to C_L or C_R depending on which side its instances fall more, and continue adding classes one by one, repeating steps 2 to 4, until no classes are left.
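
A hedged sketch of this O(k) heuristic, assuming integer labels and a linear perceptron as the discriminant; scoring "best placed" by how one-sidedly a class's instances fall is an illustrative reading, not the paper's exact criterion.

```python
import numpy as np
from sklearn.linear_model import Perceptron

def selection_split(X, y, classes, seed=0):
    rng = np.random.default_rng(seed)
    pool = list(rng.permutation(classes))
    left, right = [pool.pop()], [pool.pop()]   # step 1: two random seed classes
    while pool:
        placed = np.isin(y, left + right)      # step 2: train on placed classes only
        net = Perceptron().fit(X[placed], np.isin(y[placed], right).astype(int))
        best, best_margin, best_side = None, -1.0, None
        for c in pool:                         # step 3: most one-sided class wins
            frac_right = net.predict(X[y == c]).mean()
            margin = abs(frac_right - 0.5)
            if margin > best_margin:
                best, best_margin = c, margin
                best_side = right if frac_right > 0.5 else left
        pool.remove(best)                      # step 4: commit and repeat
        best_side.append(best)
    return left, right
```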

Exchange Method
1. Select an initial partition of C into C_L and C_R, each containing k/2 classes.
2. Train the discriminant to separate C_L from C_R; compute the entropy E_0 with the selected entropy formula.
3. For each class C_k in C_1, ..., C_k, form the partitions C_L(k) and C_R(k) by changing the side to which C_k is assigned in C_L and C_R.
4. Train the neural network with the partitions C_L(k) and C_R(k); compute the entropy E_k and the decrease in entropy ΔE_k = E_0 - E_k.
5. Let ΔE* be the maximum of the impurity decreases over all k, and k* the class achieving it. If this impurity decrease is less than zero, exit; otherwise set C_L = C_L(k*), C_R = C_R(k*), and go to step 2.
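
A hedged sketch of this O(k^2) heuristic under the same assumptions (non-negative integer labels, linear perceptron); `entropy` stands in for whichever impurity formula the node uses.

```python
import numpy as np
from sklearn.linear_model import Perceptron

def entropy(net, X, y):
    """Weighted entropy of the two-way split the discriminant produces."""
    side, total = net.predict(X), 0.0
    for s in (0, 1):
        ys = y[side == s]
        if len(ys):
            p = np.bincount(ys) / len(ys)
            p = p[p > 0]
            total += (len(ys) / len(y)) * -(p * np.log2(p)).sum()
    return total

def exchange_split(X, y, classes):
    classes = list(classes)
    left, right = classes[:len(classes) // 2], classes[len(classes) // 2:]  # step 1
    while True:
        net = Perceptron().fit(X, np.isin(y, right).astype(int))            # step 2
        e0 = entropy(net, X, y)
        best_gain, best_partition = 0.0, None
        for c in classes:                      # step 3: flip each class in turn
            if c in left:
                l, r = [x for x in left if x != c], right + [c]
            else:
                l, r = left + [c], [x for x in right if x != c]
            if not l or not r:
                continue                       # skip flips that empty one side
            net_c = Perceptron().fit(X, np.isin(y, r).astype(int))          # step 4
            gain = e0 - entropy(net_c, X, y)   # impurity decrease dE_k
            if gain > best_gain:
                best_gain, best_partition = gain, (l, r)
        if best_partition is None:             # step 5: no decrease -> stop
            return left, right
        left, right = best_partition
```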

Experiments
- 20 data sets from the UCI Repository are used.
- Three different criteria are used:
  - Accuracy
  - Tree size
  - Learning time
- The 5×2 cv F-test is used for comparison.

Results for Accuracy

            ID3   CART   ID-LP   ID-MLP   ID-Hybrid
ID3          -
CART         0     -       1       1        1
ID-LP        4     7       -       0        0
ID-MLP       4     5       1       -        1
ID-Hybrid    4     8       0       0        -

Results for Tree Size

            ID3   CART   ID-LP   ID-MLP   ID-Hybrid
ID3          -
CART         3     -       0       0        0
ID-LP                      -
ID-MLP                             -
ID-Hybrid   18             0       0        -

Results for Learning Time

            ID3   CART   ID-LP   ID-MLP   ID-Hybrid
ID3          -
CART         0     -       0       0        1
ID-LP                      -
ID-MLP                             -
ID-Hybrid    2    17       0       0        -

Conclusion
- Accuracy: ID-LP = ID-MLP = ID-Hybrid > ID3 = CART
- Tree size: ID-MLP = ID-Hybrid > ID-LP > CART > ID3
- Learning time: ID3 > ID-LP > ID-MLP > ID-Hybrid > CART
- Future work: Linear Discriminant Trees (ICML 2000)