Radosław Wesołowski, Tomasz Pękalski, Michał Borkowicz, Maciej Kopaczyński (12-03-2008)
What is it anyway? Decision tree T – a tree with a root (in the graph-theory sense), in which we assign the following meanings to its elements:
- inner nodes represent attributes,
- edges represent values of the attribute,
- leaves represent classification decisions.
Using a decision tree we can visualize a program consisting only of 'if-then' instructions.
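To make the 'if-then' analogy concrete, here is a minimal sketch of a small decision tree written directly as nested conditionals; the attributes and labels use the familiar weather/play example and are illustrative assumptions, not taken from the slides:

```python
def classify(outlook, humidity, wind):
    # Inner nodes test attributes; each branch corresponds to an
    # attribute value; the returned labels are the leaf decisions.
    if outlook == "sunny":
        if humidity == "high":
            return "don't play"   # leaf
        return "play"             # leaf
    if outlook == "overcast":
        return "play"             # leaf
    # remaining branch: outlook == "rainy"
    if wind == "strong":
        return "don't play"       # leaf
    return "play"                 # leaf
```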
Testing functions: Let us consider an attribute A (e.g. temperature). Let V_A denote the set of all possible values of A (0 K up to infinity). Let R_t denote the set of all possible test results (hot, mild, cold). By a testing function we mean a map t: V_A → R_t. We distinguish two main types of testing functions, depending on the set V_A: discrete and continuous.
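A continuous testing function for the temperature attribute can be sketched as below; the slide only fixes the result set {hot, mild, cold}, so the concrete thresholds are assumptions made for illustration:

```python
def t_temperature(value_kelvin: float) -> str:
    """Testing function t: V_A -> R_t for the continuous attribute
    'temperature'; V_A is [0, inf), R_t = {cold, mild, hot}.
    The cut points are illustrative assumptions."""
    if value_kelvin < 283.0:    # assumed threshold (~10 °C)
        return "cold"
    if value_kelvin < 298.0:    # assumed threshold (~25 °C)
        return "mild"
    return "hot"
```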
Quality of a decision tree (Occam's razor):
- we prefer small, simple trees,
- we want to gain maximum accuracy of classification (training set, test set).
For example: Q(T) = α·size(T) + β·accuracy(T)
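The weighted criterion translates directly into code; a minimal sketch, where the concrete weights are assumptions (α is taken negative so that larger trees are penalized):

```python
def quality(tree_size: int, accuracy: float,
            alpha: float = -0.01, beta: float = 1.0) -> float:
    """Q(T) = alpha*size(T) + beta*accuracy(T).
    alpha < 0 penalizes large trees (Occam's razor); the default
    weights here are illustrative assumptions."""
    return alpha * tree_size + beta * accuracy

# Example: a 15-node tree with 92% accuracy scores 0.77.
print(quality(15, 0.92))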
Optimal tree – we are given:
- a training set S,
- a set of testing functions TEST,
- a quality criterion Q.
Target: find T optimizing Q(T). Fact: usually this is an NP-hard problem. Conclusion: we have to use heuristics.
Building a decision tree:
- top-down method (see the sketch below):
a. in the beginning the root contains all training examples,
b. we divide them recursively, choosing one attribute at a time;
- bottom-up method: we remove subtrees or edges to improve accuracy when judging new cases.
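A minimal sketch of the top-down method (essentially ID3 without refinements); `best_attribute` stands in for any attribute-selection heuristic, such as the information gain introduced below, and the data format (a list of (attribute-dict, label) pairs) is an assumption:

```python
from collections import Counter

def build_tree(examples, attributes, best_attribute):
    """Top-down induction: start with all examples at the root,
    split on one attribute at a time, recurse on each subset."""
    labels = [label for _, label in examples]
    if len(set(labels)) == 1:          # pure node -> leaf
        return labels[0]
    if not attributes:                 # no tests left -> majority leaf
        return Counter(labels).most_common(1)[0][0]
    a = best_attribute(examples, attributes)       # choose the split
    tree = {a: {}}
    for v in {x[a] for x, _ in examples}:          # one edge per value
        subset = [(x, y) for x, y in examples if x[a] == v]
        rest = [b for b in attributes if b != a]
        tree[a][v] = build_tree(subset, rest, best_attribute)
    return tree
```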
Entropy – the average number of bits needed to represent a decision d for a randomly chosen object from a given set S. Why? Because an optimal binary representation assigns −log2(p) bits to a decision whose probability is p. We have the formula: entropy(p1, ..., pn) = −p1·log2(p1) − ... − pn·log2(pn)
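The formula translates directly into code; a minimal sketch computing entropy from a class-probability distribution:

```python
from math import log2

def entropy(probabilities):
    """entropy(p1,...,pn) = -sum(pi * log2(pi)); terms with pi = 0
    contribute 0 bits by convention."""
    return -sum(p * log2(p) for p in probabilities if p > 0)

# Example: a balanced two-class decision needs exactly 1 bit.
assert entropy([0.5, 0.5]) == 1.0
```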
Information gain: gain = info before dividing − info after dividing (i.e. the reduction in entropy achieved by splitting on a test).
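In code, "info after dividing" is the size-weighted average entropy of the subsets produced by the split; a sketch, where the label-list data format is an assumption:

```python
from collections import Counter
from math import log2

def class_entropy(labels):
    """Entropy of the class distribution in a list of labels."""
    total = len(labels)
    return -sum(c / total * log2(c / total)
                for c in Counter(labels).values())

def information_gain(labels, subsets):
    """gain = entropy before the split - size-weighted entropy after;
    `subsets` are the label lists produced by one testing function."""
    total = len(labels)
    after = sum(len(s) / total * class_entropy(s) for s in subsets)
    return class_entropy(labels) - after

# A perfect split of a balanced two-class set gains a full bit:
assert information_gain(["+", "+", "-", "-"], [["+", "+"], ["-", "-"]]) == 1.0
```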
Overtraining: We say that a model H overfits if there is a model H' such that:
- training_error(H) < training_error(H'),
- testing_error(H) > testing_error(H').
Avoiding overtraining:
- adequate stopping criteria,
- post-pruning,
- pre-pruning.
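One common form of post-pruning can be sketched as below, assuming the nested-dict tree structure from `build_tree` above: a subtree is collapsed into a majority leaf whenever that does not reduce accuracy on a held-out validation set. The helpers `classify` and `majority_label` are hypothetical, not defined on the slides.

```python
def prune(tree, validation, classify, majority_label):
    """Reduced-error post-pruning sketch: replace a subtree by a
    majority leaf whenever validation accuracy does not decrease.
    `classify(tree, x)` and `majority_label(labels)` are assumed
    helpers (hypothetical)."""
    if not isinstance(tree, dict) or not validation:
        return tree                    # a leaf, or nothing to judge by
    attribute, branches = next(iter(tree.items()))
    for value, subtree in branches.items():        # prune bottom-up
        subset = [(x, y) for x, y in validation if x[attribute] == value]
        branches[value] = prune(subtree, subset, classify, majority_label)
    def accuracy(t):
        return sum(classify(t, x) == y for x, y in validation)
    leaf = majority_label([y for _, y in validation])
    if accuracy(leaf) >= accuracy(tree):
        return leaf                    # collapsing did not hurt
    return tree
```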
Some decision tree algorithms:
- 1R,
- ID3 (Iterative Dichotomiser 3),
- C4.5 (ID3 + discretization + pruning),
- CART (Classification and Regression Trees),
- CHAID (CHi-squared Automatic Interaction Detection).