Slide 1: Ordinal Classification
Rob Potharst, Erasmus University Rotterdam
SIKS Advanced Course on Computational Intelligence, October 2001
Slide 2: What is ordinal classification?
Slide 3: Company: catering service Swift
- total liabilities / total assets: 1
- net income / net worth: 3
- ...
- managers' work experience: 5
- market niche position: 3
- ...
Bankruptcy risk: + (acceptable)
Slide 4: The data set
39 companies, each row giving 12 ordinal attribute values and a bankruptcy-risk class.
20: + (acceptable), 9: - (unacceptable), 10: ? (uncertain).
From: Greco, Matarazzo, Slowinski (1996).

2 2 2 2 1 3 5 3 5 4 2 4  +
4 5 2 3 3 3 5 4 5 5 4 5  +
3 5 1 1 2 2 5 3 5 5 3 5  +
2 3 2 1 2 4 5 2 5 4 3 4  +
3 4 3 2 2 2 5 3 5 5 3 5  +
3 5 3 3 3 2 5 3 4 4 3 4  +
3 5 2 3 4 4 5 4 4 5 3 5  +
1 1 4 1 2 3 5 2 4 4 1 4  +
3 4 3 3 2 4 4 2 4 3 1 3  +
3 4 2 1 2 2 4 2 4 4 1 4  +
2 5 1 1 3 4 4 3 4 4 3 4  +
3 3 4 4 3 4 4 2 4 4 1 3  +
1 1 2 1 1 3 4 2 4 4 1 4  +
2 1 1 1 4 3 4 2 4 4 3 3  +
2 3 2 1 1 2 4 4 4 4 2 5  +
2 3 4 3 1 5 4 2 4 3 2 3  +
2 2 2 1 1 4 4 4 4 4 2 4  +
2 1 3 1 1 3 5 2 4 2 1 3  +
2 1 2 1 1 3 4 2 4 4 2 4  +
2 1 2 1 1 5 4 2 4 4 2 4  +
2 1 1 1 1 3 2 2 4 4 2 3  ?
1 1 3 1 2 1 3 4 4 4 3 4  ?
2 1 2 1 1 2 4 3 3 2 1 2  ?
1 1 1 1 1 1 3 2 4 4 2 3  ?
2 2 2 1 1 3 3 2 4 4 2 3  ?
2 2 1 1 1 3 2 2 4 4 2 3  ?
2 1 2 1 1 3 2 2 4 4 2 4  ?
1 1 4 1 3 1 2 2 3 3 1 2  ?
3 4 4 3 2 3 3 4 4 4 3 4  ?
3 1 3 3 1 2 2 3 4 4 2 3  ?
1 1 2 1 1 1 3 3 4 4 2 3  -
3 5 2 1 1 1 3 2 3 4 1 3  -
2 2 1 1 1 1 3 3 3 4 3 4  -
2 1 1 1 1 1 2 2 3 4 3 4  -
1 1 2 1 1 1 3 1 4 3 1 2  -
1 1 3 1 2 1 2 1 3 3 2 3  -
1 1 1 1 1 1 2 2 4 4 2 3  -
1 1 3 1 1 1 1 1 4 3 1 3  -
2 1 1 1 1 1 1 1 2 1 1 2  -
Slide 5: A possible classifier
- if man.exp. > 4, then class = '+'
- if man.exp. < 4 and net.inc/net.worth = 1, then class = '-'
- all other cases: class = '?'
When applied to the dataset of 39 companies, this makes 3 mistakes.
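A minimal Python sketch of this rule set, assuming each company record is a dict with hypothetical field names man_exp and net_inc_net_worth (the slides only name the attributes, not a data format):

```python
def classify(company):
    """The three rules from the slide (field names are assumptions)."""
    if company["man_exp"] > 4:
        return "+"
    if company["man_exp"] < 4 and company["net_inc_net_worth"] == 1:
        return "-"
    return "?"

# A company with top managerial experience is rated acceptable:
print(classify({"man_exp": 5, "net_inc_net_worth": 3}))  # -> '+'
```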
Slide 6: What is classification?
The act of assigning objects to classes, using the values of relevant features of those objects. So we need:
- objects (individuals, cases), all belonging to some domain
- classes, of a prescribed number and kind
- features (attributes, variables)
- a classifier (classification function) that assigns a class to any object
Slide 7: Building classifiers = induction
From a training set of examples (two illustrations: data without noise, data with noise).
Slide 8: Induction methods (especially from the AI world)
- decision trees: C4.5, CART (from 1984 on)
- neural networks: backpropagation (from 1986, with a false start in 1974)
- rule induction algorithms: CN2 (1989)
- newer methods: rough sets, fuzzy methods, decision lists, pattern-based methods, etc.
Slide 9: Decision tree: example
(Diagram of a tree with yes/no splits on man.exp. < 3, gen.exp./sales = 1 and tot.liab/cashfl = 1, and leaves labelled +, ?, ? and -.)
It classifies 37 out of the 39 examples correctly.
Slide 10: Ordinal classification
- the features have an ordinal scale
- the classes have an ordinal scale
- the ordering must be preserved!
Slide 11: Preservation of ordering
A classifier is monotone iff: if A ≤ B, then also class(A) ≤ class(B).
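As a sketch, this condition can be checked directly on a finite labelled dataset, comparing attribute vectors componentwise (the natural product order in this setting):

```python
def leq(a, b):
    """Componentwise order on attribute vectors: a <= b in every position."""
    return all(x <= y for x, y in zip(a, b))

def is_monotone(data):
    """data: list of (vector, class_label) pairs.
    True iff A <= B always implies class(A) <= class(B)."""
    return all(not leq(a, b) or ca <= cb
               for a, ca in data for b, cb in data)
```

Checked over all ordered pairs this costs O(n²) comparisons, which is unproblematic for datasets of the size shown here.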
Slide 12: Relevance of ordinal classification
- selection problems
- credit worthiness
- pricing (e.g. real estate)
- etc.
Slide 13: Induction of monotone decision trees
- using C4.5 or CART: non-monotone trees
- needed: an algorithm that is guaranteed to generate only monotone trees
- Makino, Ibaraki et al. (1996): only for 2-class problems, cumbersome
- Potharst & Bioch (2000): for k-class problems, fast and efficient
Slide 14: The algorithm
try to split subset T:
1) update D for subset T
2) if D ∩ T is homogeneous
   then assign its class label to T and make T a leaf definitively
   else split T into two non-empty subsets T_L and T_R using entropy;
        try to split subset T_L;
        try to split subset T_R
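A high-level Python rendering of this recursion, as a sketch: intervals of attribute vectors are (lo, hi) pairs, D is a dict from vector to class, and update_D and best_entropy_split are assumed helper names, filled in by the following slides:

```python
def in_interval(x, T):
    """Is vector x inside the interval T = (lo, hi), componentwise?"""
    lo, hi = T
    return all(l <= v <= h for v, l, h in zip(x, lo, hi))

def try_to_split(T, D):
    """Recursive tree construction (sketch of the slide's algorithm)."""
    update_D(D, T)  # 1) label min(T) and max(T) if missing (slide 15)
    classes = {c for x, c in D.items() if in_interval(x, T)}
    if len(classes) == 1:                       # 2) D ∩ T is homogeneous:
        return ("leaf", classes.pop())          #    make T a leaf definitively
    split, T_L, T_R = best_entropy_split(T, D)  # else: lowest-entropy split
    return ("node", split, try_to_split(T_L, D), try_to_split(T_R, D))
```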
Slide 15: The update rule
update D for T:
1) if min(T) is not in D then
   - add min(T) to D
   - class(min(T)) = the maximal value allowed, given D
2) if max(T) is not in D then
   - add max(T) to D
   - class(max(T)) = the minimal value allowed, given D
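In the same sketch style, the update rule labels the two corner points of the interval; min_allowed and max_allowed are defined on the next two slides, and the default top class 3 matches the four-class example further on (both are assumptions of the sketch):

```python
def update_D(D, T, max_class=3):
    """Update rule: ensure the corners of interval T = (lo, hi) are labelled."""
    lo, hi = T
    if lo not in D:
        D[lo] = max_allowed(lo, D, max_class)  # the maximal value allowed, given D
    if hi not in D:
        D[hi] = min_allowed(hi, D)             # the minimal value allowed, given D
```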
Slide 16: The minimal value allowed, given D
For each x ∈ X \ D it is possible to calculate the minimal and the maximal class value possible, given D.
Let ↓x be the downset { y ∈ X | y ≤ x } of x.
Let y* be an element of D ∩ ↓x with the highest class value.
Then the minimal class value possible for x is class(y*).
Slide 17: The maximal value allowed, given D
Let ↑x be the upset { y ∈ X | y ≥ x } of x.
Let y* be an element of D ∩ ↑x with the lowest class value.
Then the maximal class value possible for x is class(y*).
If there is no such element, take the maximal class value (or the minimal, in the former case).
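Both bounds translate directly into code; a sketch, reusing the componentwise leq from the monotonicity check above:

```python
def min_allowed(x, D, min_class=0):
    """Highest class among labelled points in the downset of x,
    else the bottom of the class scale."""
    below = [c for y, c in D.items() if leq(y, x)]
    return max(below) if below else min_class

def max_allowed(x, D, max_class):
    """Lowest class among labelled points in the upset of x,
    else the top of the class scale."""
    above = [c for y, c in D.items() if leq(x, y)]
    return min(above) if above else max_class
```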
Slide 18: Example
attr. 1: values 0,1,2; attr. 2: values 0,1,2; attr. 3: values 0,1,2; classes: 0, 1, 2, 3
X: 000, 100, 010, …, 222 (all 27 attribute vectors)
D:
0 0 1 → 0
0 0 2 → 1
1 1 2 → 2
2 0 2 → 2
2 1 2 → 3
Let us calculate the min and max possible value for x = 022:
min value: y* = 002, so the min value = 1
max value: there is no y*, so the max value = 3
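Running the slide's example through the sketch functions above reproduces both bounds for x = 022:

```python
D = {(0, 0, 1): 0, (0, 0, 2): 1, (1, 1, 2): 2, (2, 0, 2): 2, (2, 1, 2): 3}
x = (0, 2, 2)
print(min_allowed(x, D))               # -> 1 (y* = 002, the highest class below x)
print(max_allowed(x, D, max_class=3))  # -> 3 (no labelled point above x)
```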
Slide 19: Tracing the algorithm
Try to split subset T = X:
update D for X:
- min(X) = 000 is not in D; the max value of 000 is 0, so add 000 with class 0 to D
- max(X) = 222 is not in D; the min value of 222 is 3, so add 222 with class 3 to D
D is now: 000 → 0, 001 → 0, 002 → 1, 112 → 2, 202 → 2, 212 → 3, 222 → 3
D ∩ X is not homogeneous, so consider all the possible splits:
A1 ≤ 0; A1 ≤ 1; A2 ≤ 0; A2 ≤ 1; A3 ≤ 0; A3 ≤ 1
Slide 20: The entropy of each split
The split A1 ≤ 0 splits X into T_L = [000,022] and T_R = [100,222].
D ∩ T_L = { 000 → 0, 001 → 0, 002 → 1 }: entropy = 0.92
D ∩ T_R = { 112 → 2, 202 → 2, 212 → 3, 222 → 3 }: entropy = 1
Average entropy of this split = 3/7 × 0.92 + 4/7 × 1 = 0.97
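These numbers follow from the standard entropy formula; a small sketch (note that the slide's 0.97 comes from rounding 0.918 to 0.92 before averaging; the exact average is about 0.965):

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum(k / n * log2(k / n) for k in Counter(labels).values())

left = [0, 0, 1]       # classes of the D-points in T_L = [000, 022]
right = [2, 2, 3, 3]   # classes of the D-points in T_R = [100, 222]
avg = (len(left) * entropy(left) + len(right) * entropy(right)) / 7
print(f"{entropy(left):.2f} {entropy(right):.2f} {avg:.3f}")  # 0.92 1.00 0.965
```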
Slide 21: Going on with the trace
The split with the lowest entropy is A1 ≤ 0, so we go on with T = T_L = [000,022].
Try to split subset T = [000,022]:
update D for T:
- min(T) = 000 is already in D
- max(T) = 022 has minimum value 1, so it is added to D
D is now: 000 → 0, 001 → 0, 002 → 1, 022 → 1, 112 → 2, 202 → 2, 212 → 3, 222 → 3
D ∩ T is not homogeneous, so we go on to consider the following splits:
A2 ≤ 0; A2 ≤ 1; A3 ≤ 0; A3 ≤ 1 (lowest entropy: A3 ≤ 1)
Slide 22: We now have the following tree:
(Diagram: root split A1 ≤ 0; its yes-branch splits on A3 ≤ 1; the three leaves are still unlabelled, marked ?.)
Slide 23: Going on...
The split A3 ≤ 1 splits T into T_L = [000,021] and T_R = [002,022].
We go on with T = T_L = [000,021].
Try to split subset T = [000,021]:
- min(T) = 000 is already in D
- max(T) = 021 has minimum value 0, so it is added to D
D ∩ T is homogeneous, so we stop and make T into a leaf with class value 0.
Next, we go on with T = T_R = [002,022], etc.
Slide 24: Finally...
(Diagram of the finished tree: internal splits A1 ≤ 0, A3 ≤ 1, A1 ≤ 1 and A2 ≤ 0, with leaf classes 0, 1, 2, 2 and 3.)
Slide 25: A monotone tree for the bankruptcy problem
- can be seen on p. 107 of the paper that was handed out with this course
- it is a tree with 6 leaves
- it uses the same attributes as those that come up in an ordinal version of the rough set approach: see Viara Popova's lecture
Slide 26: Conclusions and remaining problems
- We described an efficient algorithm for the induction of monotone decision trees, in case we have a monotone dataset.
- We also have an algorithm to repair a non-monotone decision tree, but it makes the tree larger.
- What if we have noise in the dataset? Is it possible to repair by pruning?