1 Conditional Classification Trees using Instrumental Variables
Roberta Siciliano, Valerio Aniello Tutore, Massimo Aria
Department of Mathematics and Statistics, University of Naples Federico II

2 Outline
The framework
- Standard CART (Breiman et al., 1984)
- Two-stage methods (Mola and Siciliano, 1992)
The problem
- Specific application contexts with structural information on either the objects or the variables
- Conditional data analysis using trees
The proposed approach
- Introduction of instrumental variables
- Two methods and some applications

3 Tree-based model
Sample L = {(y_n, x_n); n = 1, ..., N} drawn from the distribution (Y, X), where Y is the response and X is the set of predictors.
Segmentation: recursive partitioning of the objects into subgroups that are homogeneous with respect to Y, by means of a sequence of splitting variables generated by the predictors.
Supervised classification and regression to predict Y.
[Figure: a binary tree growing from root node N1 through splits s1, ..., s6 down to terminal nodes with predictions y7, ..., y13.]
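To make the splitting step concrete, here is a minimal CART-style impurity search: a generic sketch assuming Gini impurity, numeric predictors and a binary response, with illustrative function names, not the authors' implementation.

```python
import numpy as np

def gini(y):
    """Gini impurity of an integer-coded label vector."""
    if len(y) == 0:
        return 0.0
    p = np.bincount(y, minlength=2) / len(y)
    return 1.0 - np.sum(p ** 2)

def best_split(X, y):
    """Try every predictor and every threshold; return the (predictor
    index, threshold) pair whose binary split most decreases impurity."""
    n = len(y)
    best = (None, None, -np.inf)
    for j in range(X.shape[1]):
        # candidate thresholds: all but the largest observed value
        for t in np.unique(X[:, j])[:-1]:
            left = X[:, j] <= t
            gain = gini(y) \
                 - (left.sum() / n) * gini(y[left]) \
                 - ((~left).sum() / n) * gini(y[~left])
            if gain > best[2]:
                best = (j, t, gain)
    return best[0], best[1]
```

Recursive partitioning then applies `best_split` to each child node in turn until a stopping rule (minimum node size, no impurity decrease) is met.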

4 Two problems when using trees
There are applications where the objects are not intrinsically homogeneous, since they can be divided into distinct subsamples: the final partitioning cannot ignore this condition!
There are applications where the predictors are within-group correlated and organized into several blocks: standard trees yield unstable and unfair partitioning!

5 The genesis of our contribution
This work defines a segmentation methodology for a three-way data matrix, starting from some recent results (Tutore et al., 2006). A three-way data matrix consists of measurements of a response variable, a set of predictors and, in addition, a stratifying or descriptor variable (of categorical type). The stratifying variable plays the role of an instrumental variable, identifying either subsamples of objects or groups of predictors. Two basic methods are proposed:
- Partial predictability trees
- Multiple discriminant trees

6 Partial predictability trees: the idea
In a classification context with categorical predictors X_m (m = 1, ..., M), the instrumental (categorical or categorized) variable serves to distinguish different subsamples of objects: a standard splitting criterion would divide the objects regardless of the subsample they belong to. We introduce a criterion that finds the best split conditioned on the instrumental variable. Partial predictability trees can be understood as an extension of two-stage segmentation and of the CART methodology.

7 Partial predictability trees: the method
We consider the two-stage splitting criterion based on the predictability index tau of Goodman and Kruskal for two-way cross-classifications and its extension to three-way cross-classifications due to Gray and Williams.
1. First stage: the best predictor is found by maximizing the global prediction with respect to the response variable, conditional on the instrumental variable.
2. Second stage: the best split of the best predictor is found by maximizing the partial prediction.

8 The proposed splitting criterion
First stage: at each node, among all predictors X_m we maximize the partial index to find the best predictor conditioned on the instrumental variable Z:
tau(Y | X_m . Z) = [tau(Y | X_m, Z) - tau(Y | Z)] / [1 - tau(Y | Z)]
where tau(Y | X_m, Z) and tau(Y | Z) are the multiple and the simple predictability measures.
Second stage: we find the best split of the best predictor by maximizing the partial index over all possible splits of that predictor.
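The simple, multiple and partial predictability measures can be sketched numerically. The following is an illustrative implementation of the Goodman-Kruskal tau and of a Gray-Williams-style partial index, assuming integer-coded categorical variables; function names are ours, not from the authors' software.

```python
import numpy as np

def tau(y, x):
    """Goodman-Kruskal tau(Y|X): proportional reduction in the error of
    predicting Y when the categorical predictor X is known."""
    y, x = np.asarray(y), np.asarray(x)
    p_y = np.bincount(y) / len(y)
    err_marginal = 1.0 - np.sum(p_y ** 2)       # error ignoring X
    err_conditional = 0.0                       # error within X-categories
    for c in np.unique(x):
        sub = y[x == c]
        p = np.bincount(sub, minlength=len(p_y)) / len(sub)
        err_conditional += (len(sub) / len(y)) * (1.0 - np.sum(p ** 2))
    return (err_marginal - err_conditional) / err_marginal

def partial_tau(y, x, z):
    """Partial index tau(Y|X.Z): the predictive gain of X over and above
    the instrumental variable Z."""
    x, z = np.asarray(x), np.asarray(z)
    xz = x * (z.max() + 1) + z                  # joint coding of (X, Z) cells
    t_multiple, t_simple = tau(y, xz), tau(y, z)
    return (t_multiple - t_simple) / (1.0 - t_simple)
```

At each node, the first stage would evaluate `partial_tau(y, x_m, z)` for every predictor and keep the maximizer.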

9 Partial predictability trees – An example
A dataset on consumer credit in Germany from the UCI repository (Prof. Dr. Hans Hofmann, University of Hamburg):
2026 objects
20 variables:
- 1 categorical response: bad or good client
- 18 predictors
- 1 instrumental variable (amount of credit requested, in four classes)


12 Path 1-23 Good clients

13 Multiple discriminant trees: the idea
The data set consists of G blocks of internally correlated covariates X_g, for g = 1, ..., G, and a dummy response variable Y.
The idea is to find a compromise of all predictors within each block, and then a compromise of all blocks, in such a way as to maximize the predictability of Y.
The approach is an improvement of TS-DIS segmentation (Siciliano and Mola, 2002).

14 Multiple discriminant trees: the method
I. Within-block latent compromises: in each block g, find the linear combination of the covariates using discriminant functions.
II. Across-block latent compromise: find the discriminant function of all the within-block compromises.
III. Multiple factorial split: find the best split of the across-block compromise.
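The three steps above can be sketched as follows. This is an illustrative reconstruction in which a plain two-class Fisher discriminant stands in for the paper's discriminant functions, and the split threshold is simply the midpoint of the class means on the compromise; names and details are assumptions, not the authors' method.

```python
import numpy as np

def fisher_direction(X, y):
    """Two-class Fisher discriminant direction w ~ Sw^-1 (m1 - m0),
    with a tiny ridge term for numerical stability."""
    X0, X1 = X[y == 0], X[y == 1]
    Sw = np.atleast_2d(np.cov(X0, rowvar=False) + np.cov(X1, rowvar=False))
    return np.linalg.solve(Sw + 1e-8 * np.eye(Sw.shape[0]),
                           X1.mean(axis=0) - X0.mean(axis=0))

def multiple_factorial_split(blocks, y):
    """I.  project each block on its own discriminant direction;
       II. combine the block scores with a second discriminant;
       III. split at the midpoint of the two class means of the compromise.
       Returns a boolean mask sending each object to the left child."""
    scores = np.column_stack([X @ fisher_direction(X, y) for X in blocks])
    compromise = scores @ fisher_direction(scores, y)
    threshold = (compromise[y == 0].mean() + compromise[y == 1].mean()) / 2
    return compromise <= threshold
```

Because the split acts on one latent compromise per node rather than on all raw predictors, input dimensionality is reduced while the block structure is respected.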

15 Multiple Discriminant Trees – An example
A real dataset from a customer satisfaction survey of a local public transport service (2006):
1290 objects
13 variables:
- 12 predictors, in 4 groups (each group of 3 predictors is internally correlated)
- 1 categorical response: global satisfaction


18 Path 1-8 Unsatisfied customers

19 Path 1-55 Satisfied customers

20 Some remarks
Partial predictability trees:
- It can be shown that considering only the second stage (trying out all possible splits of all predictors) yields a conditional version of the CART-like splitting criterion (based on the decrease of the impurity measure).
Multiple discriminant trees:
- Extension to regression trees, or to classification trees with more than two response classes: define the dummy response variable so as to distinguish the two groups of response values/classes that are the most distant, without regard to the predictors (the so-called retrospective split, Siciliano and Mola, 2002).
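A minimal sketch of what such a retrospective split might look like for a numeric response, under our reading of the idea (not the authors' code): sort the response values and cut where the between-group deviance is largest, ignoring the predictors entirely.

```python
import numpy as np

def retrospective_split(y):
    """Return a 0/1 dummy separating the sorted response values at the
    cut point that maximizes the between-group deviance."""
    y = np.asarray(y, dtype=float)
    order = np.argsort(y)
    ys = y[order]
    n, best_cut, best_dev = len(ys), 1, -np.inf
    for k in range(1, n):
        left, right = ys[:k], ys[k:]
        # between-group deviance of the two candidate groups
        dev = k * (left.mean() - ys.mean()) ** 2 \
            + (n - k) * (right.mean() - ys.mean()) ** 2
        if dev > best_dev:
            best_cut, best_dev = k, dev
    dummy = np.zeros(n, dtype=int)
    dummy[order[best_cut:]] = 1
    return dummy
```

The resulting dummy variable can then feed the discriminant steps of the multiple discriminant trees, extending them to regression responses.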

21 Concluding remarks
Main results of the proposed conditional methods:
- Multiple discriminant trees reduce input dimensionality by introducing a multidimensional split that takes into account the structural relationships among the variables.
- Partial predictability trees provide a three-way classification that takes distinct subsamples into account at each node.

22 Last but not least
Matching two scientific worlds:
- Computational statistics / data mining: the science of extracting useful information from large data sets by means of a strategy of analysis encompassing data preprocessing and statistical methods.
- Computer science / machine learning: the approach that combines data-driven procedures with computationally intensive methods, exploiting information technology so as to obtain a comprehensive and detailed explanation of the phenomenon under analysis.
Intelligent data analysis: statistical learning and information management.
- Turning data into information, and then information into knowledge, are the main steps of the knowledge discovery process in the statistical learning paradigm.

23 References
Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J. (1984): Classification and Regression Trees. Wadsworth, Belmont, CA.
Hastie, T.J., Tibshirani, R.J., Friedman, J. (2001): The Elements of Statistical Learning. Springer Verlag.
Mola, F., Siciliano, R. (1992): A two-stage predictive splitting algorithm in binary segmentation. In: Y. Dodge, J. Whittaker (Eds.): Computational Statistics: COMPSTAT '92, 1, Physica Verlag, Heidelberg, 179-184.
Mola, F., Siciliano, R. (1997): A fast splitting procedure for classification trees. Statistics and Computing, 7, 208-216.
Siciliano, R., Mola, F. (2002): Discriminant analysis and factorial multiple splits in recursive partitioning for data mining. In: Roli, F., Kittler, J. (Eds.): Proceedings of the International Conference on Multiple Classifier Systems (Chia, June 24-26, 2002), 118-126, Lecture Notes in Computer Science, Springer, Heidelberg.
Siciliano, R., Aria, M., Conversano, C. (2004): Harvesting trees: methods, software and applications. In: Proceedings in Computational Statistics: 16th Symposium of IASC (COMPSTAT 2004), Prague, August 23-27, 2004, electronic edition (CD), Physica-Verlag, Heidelberg.

