Ordinal Classification
Rob Potharst, Erasmus University Rotterdam
SIKS-Advanced Course on Computational Intelligence, October 2001

Presentation transcript:

Slide 1: Ordinal Classification. Rob Potharst, Erasmus University Rotterdam.

Slide 2: What is ordinal classification?

Slide 3: Example. Company: catering service Swift
- total liabilities / total assets: 1
- net income / net worth: 3
- ...
- managers' work experience: 5
- market niche position: 3
- ...
- bankruptcy risk: + (acceptable)

Slide 4: Data set: 39 companies, each described by 12 ordinal attributes (one row per company, class label in the last column): 20 + (acceptable), 9 - (unacceptable), 10 ? (uncertain). From Greco, Matarazzo, Slowinski (1996).

2 2 2 2 1 3 5 3 5 4 2 4  +
4 5 2 3 3 3 5 4 5 5 4 5  +
3 5 1 1 2 2 5 3 5 5 3 5  +
2 3 2 1 2 4 5 2 5 4 3 4  +
3 4 3 2 2 2 5 3 5 5 3 5  +
3 5 3 3 3 2 5 3 4 4 3 4  +
3 5 2 3 4 4 5 4 4 5 3 5  +
1 1 4 1 2 3 5 2 4 4 1 4  +
3 4 3 3 2 4 4 2 4 3 1 3  +
3 4 2 1 2 2 4 2 4 4 1 4  +
2 5 1 1 3 4 4 3 4 4 3 4  +
3 3 4 4 3 4 4 2 4 4 1 3  +
1 1 2 1 1 3 4 2 4 4 1 4  +
2 1 1 1 4 3 4 2 4 4 3 3  +
2 3 2 1 1 2 4 4 4 4 2 5  +
2 3 4 3 1 5 4 2 4 3 2 3  +
2 2 2 1 1 4 4 4 4 4 2 4  +
2 1 3 1 1 3 5 2 4 2 1 3  +
2 1 2 1 1 3 4 2 4 4 2 4  +
2 1 2 1 1 5 4 2 4 4 2 4  +
2 1 1 1 1 3 2 2 4 4 2 3  ?
1 1 3 1 2 1 3 4 4 4 3 4  ?
2 1 2 1 1 2 4 3 3 2 1 2  ?
1 1 1 1 1 1 3 2 4 4 2 3  ?
2 2 2 1 1 3 3 2 4 4 2 3  ?
2 2 1 1 1 3 2 2 4 4 2 3  ?
2 1 2 1 1 3 2 2 4 4 2 4  ?
1 1 4 1 3 1 2 2 3 3 1 2  ?
3 4 4 3 2 3 3 4 4 4 3 4  ?
3 1 3 3 1 2 2 3 4 4 2 3  ?
1 1 2 1 1 1 3 3 4 4 2 3  -
3 5 2 1 1 1 3 2 3 4 1 3  -
2 2 1 1 1 1 3 3 3 4 3 4  -
2 1 1 1 1 1 2 2 3 4 3 4  -
1 1 2 1 1 1 3 1 4 3 1 2  -
1 1 3 1 2 1 2 1 3 3 2 3  -
1 1 1 1 1 1 2 2 4 4 2 3  -
1 1 3 1 1 1 1 1 4 3 1 3  -
2 1 1 1 1 1 1 1 2 1 1 2  -

Slide 5: A possible classifier
- if man.exp. > 4, then class = '+'
- if man.exp. < 4 and net.inc/net.worth = 1, then class = '-'
- all other cases: class = '?'
When applied to the data set of 39 companies: 3 mistakes.
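As code, this rule set might look as follows (a minimal sketch; the parameter names are illustrative, and both attributes are assumed to be integers on their 1-5 scales):

```python
def classify(man_exp: int, net_inc_net_worth: int) -> str:
    """The three rules from this slide; returns '+', '-' or '?'."""
    if man_exp > 4:
        return '+'    # acceptable
    if man_exp < 4 and net_inc_net_worth == 1:
        return '-'    # unacceptable
    return '?'        # uncertain

# The Swift example of slide 3 (man. exp. 5, net income / net worth 3):
print(classify(5, 3))   # -> '+'
```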

Slide 6: What is classification?
The act of assigning objects to classes, using the values of relevant features of those objects. So we need:
- objects (individuals, cases), all belonging to some domain
- classes, number and kind prescribed
- features (attributes, variables)
- a classifier (classification function) that assigns a class to any object
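In code, these ingredients amount to a function type; a small sketch (the type names are illustrative, not from the slides):

```python
from typing import Callable, Sequence

FeatureVector = Sequence[int]   # the values of the relevant features of an object
ClassLabel = str                # one of the prescribed classes
# A classifier assigns a class to any object in the domain:
Classifier = Callable[[FeatureVector], ClassLabel]
```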

Slide 7: Building classifiers = induction from a training set of examples:
- data without noise
- data with noise

Slide 8: Induction methods (especially from the AI world)
- decision trees: C4.5, CART (from 1984 on)
- neural networks: backpropagation (from 1986, with a false start from 1974)
- rule induction algorithms: CN2 (1989)
- newer methods: rough sets, fuzzy methods, decision lists, pattern-based methods, etc.

Slide 9: Decision tree: example
[Tree diagram: yes/no tests on man.exp. < 3, gen.exp./sales = 1 and tot.liab/cashfl = 1, with leaves +, ?, ? and -.]
This tree classifies 37 out of 39 examples correctly.

Slide 10: Ordinal classification
- features have an ordinal scale
- classes have an ordinal scale
- the ordering must be preserved!

Slide 11: Preservation of ordering
A classifier is monotone iff: if A ≤ B, then also class(A) ≤ class(B).
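A direct way to test this property on a finite set of labelled examples (a hypothetical helper, not from the slides; vectors are compared componentwise, class labels are assumed comparable, e.g. integers, and a classifier can be checked by listing its (x, class(x)) pairs over X):

```python
from itertools import combinations

def leq(a, b):
    """Componentwise order on attribute vectors: a <= b in every coordinate."""
    return all(x <= y for x, y in zip(a, b))

def is_monotone(examples):
    """examples: list of (attribute_vector, class_value) pairs.
    Returns False as soon as two comparable vectors have reversed labels."""
    for (a, ca), (b, cb) in combinations(examples, 2):
        if leq(a, b) and ca > cb:
            return False
        if leq(b, a) and cb > ca:
            return False
    return True

print(is_monotone([((1, 2), 0), ((2, 2), 1)]))  # True
print(is_monotone([((1, 2), 1), ((2, 2), 0)]))  # False: ordering not preserved
```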

Slide 12: Relevance of ordinal classification
- selection problems
- creditworthiness
- pricing (e.g. real estate)
- etc.

Slide 13: Induction of monotone decision trees
- using C4.5 or CART: non-monotone trees
- needed: an algorithm that is guaranteed to generate only monotone trees
- Makino, Ibaraki et al. (1996): only for 2-class problems, cumbersome
- Potharst & Bioch (2000): for k-class problems, fast and efficient

Slide 14: The algorithm
try to split subset T:
1) update D for subset T
2) if D ∩ T is homogeneous, then assign a class label to T and make T a leaf, definitively;
   else split T into two non-empty subsets T_L and T_R using entropy, and then:
   - try to split subset T_L
   - try to split subset T_R
(A runnable sketch of this procedure, combined with the update rule of the next slides, follows slide 17.)

Slide 15: The update rule
update D for T:
1) if min(T) is not in D, then
   - add min(T) to D
   - class(min(T)) = the maximal value allowed, given D
2) if max(T) is not in D, then
   - add max(T) to D
   - class(max(T)) = the minimal value allowed, given D

Slide 16: The minimal value allowed, given D
For each x ∈ X \ D it is possible to calculate the minimal and the maximal class value possible, given D.
Let ↓x be the downset { y ∈ X | y ≤ x } of x.
Let y* be an element of D ∩ ↓x with the highest class value.
Then the minimal class value possible for x is class(y*).

Slide 17: The maximal value allowed, given D
Let ↑x be the upset { y ∈ X | y ≥ x } of x.
Let y* be an element of D ∩ ↑x with the lowest class value.
Then the maximal class value possible for x is class(y*).
If there is no such element, take the maximal class value (or the minimal, in the former case).
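Slides 14-17 together specify the procedure fully enough to sketch it in code. The following is an illustrative reconstruction, not the authors' implementation: a subset T is represented as an interval pair (lo, hi) of attribute vectors, D as a dict from vectors to integer classes, ties between equally good splits are broken by first occurrence, and the tree is returned with bare class labels as leaves and (attribute_index, threshold, left, right) tuples as internal nodes.

```python
from math import log2
from collections import Counter

def leq(a, b):
    """Componentwise order on attribute vectors: a <= b in every coordinate."""
    return all(x <= y for x, y in zip(a, b))

def entropy(labels):
    n = len(labels)
    return -sum(k / n * log2(k / n) for k in Counter(labels).values())

def min_allowed(D, x, lo_class):
    """Slide 16: highest class among D-elements below x (else the overall minimum)."""
    below = [c for y, c in D.items() if leq(y, x)]
    return max(below) if below else lo_class

def max_allowed(D, x, hi_class):
    """Slide 17: lowest class among D-elements above x (else the overall maximum)."""
    above = [c for y, c in D.items() if leq(x, y)]
    return min(above) if above else hi_class

def update(D, lo, hi, lo_class, hi_class):
    """Slide 15: give min(T) the maximal and max(T) the minimal allowed class."""
    if lo not in D:
        D[lo] = max_allowed(D, lo, hi_class)
    if hi not in D:
        D[hi] = min_allowed(D, hi, lo_class)

def try_to_split(lo, hi, D, lo_class, hi_class):
    """Slide 14: grow a monotone tree on the interval T = [lo, hi]."""
    update(D, lo, hi, lo_class, hi_class)
    inside = {y: c for y, c in D.items() if leq(lo, y) and leq(y, hi)}  # D ∩ T
    if len(set(inside.values())) == 1:
        return inside[lo]                        # homogeneous: make T a leaf
    best = None
    for i in range(len(lo)):                     # candidate splits A_i <= t
        for t in range(lo[i], hi[i]):
            left = [c for y, c in inside.items() if y[i] <= t]
            right = [c for y, c in inside.items() if y[i] > t]
            e = (len(left) * entropy(left) + len(right) * entropy(right)) / len(inside)
            if best is None or e < best[0]:
                best = (e, i, t)
    _, i, t = best
    hi_left = hi[:i] + (t,) + hi[i + 1:]         # T_L = [lo, hi_left]
    lo_right = lo[:i] + (t + 1,) + lo[i + 1:]    # T_R = [lo_right, hi]
    return (i, t,
            try_to_split(lo, hi_left, D, lo_class, hi_class),
            try_to_split(lo_right, hi, D, lo_class, hi_class))
```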

Slide 18: Example
attr. 1: values 0, 1, 2; attr. 2: values 0, 1, 2; attr. 3: values 0, 1, 2; classes: 0, 1, 2, 3
X: all vectors 000, 100, 010, ..., 222
D:
001 → 0
002 → 1
112 → 2
202 → 2
212 → 3
Let us calculate the min and max possible value for x = 022:
- min-value: y* = 002, so the min-value = 1
- max-value: there is no y*, so the max-value = 3
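With D as on this slide, the helpers from the sketch above reproduce these numbers:

```python
D = {(0, 0, 1): 0, (0, 0, 2): 1, (1, 1, 2): 2, (2, 0, 2): 2, (2, 1, 2): 3}
x = (0, 2, 2)
print(min_allowed(D, x, 0))   # -> 1  (y* = 002 with class 1)
print(max_allowed(D, x, 3))   # -> 3  (no y* above 022, so the top class 3)
```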

Slide 19: Tracing the algorithm
Try to split subset T = X:
update D for X:
- min(X) = 000 is not in D; the max-value of 000 is 0, so add 000 with class 0 to D
- max(X) = 222 is not in D; the min-value of 222 is 3, so add 222 with class 3 to D
D is now:
000 → 0
001 → 0
002 → 1
112 → 2
202 → 2
212 → 3
222 → 3
D ∩ X is not homogeneous, so consider all the possible splits:
A1 ≤ 0; A1 ≤ 1; A2 ≤ 0; A2 ≤ 1; A3 ≤ 0; A3 ≤ 1

Slide 20: The entropy of each split
The split A1 ≤ 0 splits X into T_L = [000,022] and T_R = [100,222].
D ∩ T_L = { 000 → 0, 001 → 0, 002 → 1 }: entropy = 0.92
D ∩ T_R = { 112 → 2, 202 → 2, 212 → 3, 222 → 3 }: entropy = 1
Average entropy of this split = 3/7 × 0.92 + 4/7 × 1 = 0.97
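The entropy helper from the sketch above confirms these figures; note that the exact average is about 0.965, which the slide rounds to 0.97:

```python
left = [0, 0, 1]        # classes of D ∩ T_L = {000, 001, 002}
right = [2, 2, 3, 3]    # classes of D ∩ T_R = {112, 202, 212, 222}
print(round(entropy(left), 2))                       # 0.92
print(entropy(right))                                # 1.0
print((3 * entropy(left) + 4 * entropy(right)) / 7)  # 0.9649..., ≈ 0.97 after rounding
```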

Slide 21: Going on with the trace
The split with the lowest entropy is A1 ≤ 0, so we go on with T = T_L = [000,022].
Try to split subset T = [000,022]:
update D for T:
- min(T) = 000 is already in D
- max(T) = 022 has min-value 1, so it is added to D with class 1
D is now:
000 → 0
001 → 0
002 → 1
022 → 1
112 → 2
202 → 2
212 → 3
222 → 3
D ∩ T is not homogeneous, so we go on to consider the following splits:
A2 ≤ 0; A2 ≤ 1; A3 ≤ 0; A3 ≤ 1 (of which A3 ≤ 1 has the lowest entropy)

Slide 22: We now have the following tree:
[Tree diagram: the root splits on A1 ≤ 0; its left child splits on A3 ≤ 1; the three resulting branches are still unlabelled (?).]

Slide 23: Going on...
The split A3 ≤ 1 splits T into T_L = [000,021] and T_R = [002,022].
We go on with T = T_L = [000,021].
Try to split subset T = [000,021]:
- min(T) = 000 is already in D
- max(T) = 021 has min-value 0, so it is added to D
D ∩ T is homogeneous, so we stop and make T into a leaf with class value 0.
Next, we go on with T = T_R = [002,022], etc.

Slide 24: Finally...
[Tree diagram of the finished monotone tree: the root splits on A1 ≤ 0, with further splits on A3 ≤ 1, A1 ≤ 1 and A2 ≤ 0; the five leaves carry the class labels 0, 1, 2, 2 and 3.]
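Running the reconstruction from after slide 17 on the slide-18 data yields exactly this tree, under the stated tie-breaking assumption:

```python
D = {(0, 0, 1): 0, (0, 0, 2): 1, (1, 1, 2): 2, (2, 0, 2): 2, (2, 1, 2): 3}
print(try_to_split((0, 0, 0), (2, 2, 2), D, lo_class=0, hi_class=3))
# -> (0, 0, (2, 1, 0, 1), (0, 1, 2, (1, 0, 2, 3)))
#    i.e. root A1 <= 0; left child A3 <= 1 with leaves 0 and 1;
#    right child A1 <= 1 with leaf 2, then A2 <= 0 with leaves 2 and 3.
```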

Slide 25: A monotone tree for the bankruptcy problem
- can be seen on p. 107 of the paper that was handed out with this course
- it is a tree with 6 leaves
- it uses the same attributes as those that come up with an ordinal version of the rough set approach: see Viara Popova's lecture

Slide 26: Conclusions and remaining problems
- We described an efficient algorithm for the induction of monotone decision trees, provided we have a monotone data set.
- We also have an algorithm to repair a non-monotone decision tree, but it makes the tree larger.
- What if we have noise in the data set? Is it possible to repair by pruning?

