Learning Recursive Bayesian Multinets for Data Clustering by Means of Constructive Induction
Peña, J.M., Lozano, J.A., and Larrañaga, P.
Machine Learning, 47(1), pp. 63-89, 2002.
Summarized by Kyu-Baek Hwang
Data clustering
- The data clustering (data partitioning) problem, e.g., k-means.
- Representing the joint probability distribution of a database:
  - mixture density models (e.g., Gaussian mixtures)
  - Bayesian networks
[Figure: a naive Bayes network with cluster node C and predictive variables Y1-Y5]
(c) 2003 SNU CSE Biointelligence Laboratory
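As a sketch of how such a naive-Bayes-style BN encodes the joint distribution of the database, P(c, y) = P(c) * prod_i P(y_i | c), here is a minimal example with hypothetical (randomly drawn) probability tables; the variable names mirror the figure, not the paper's parameters:

```python
import numpy as np

# Hypothetical naive Bayes model: hidden cluster node C with 2 values and
# five binary predictive variables Y1..Y5, conditionally independent given C.
rng = np.random.default_rng(0)
p_c = np.array([0.6, 0.4])               # P(C)
p_y_given_c = rng.uniform(size=(5, 2))   # P(Y_i = 1 | C) for i = 1..5

def joint(c, y):
    """P(C=c, Y=y) = P(c) * prod_i P(y_i | c)."""
    probs = np.where(y == 1, p_y_given_c[:, c], 1 - p_y_given_c[:, c])
    return p_c[c] * probs.prod()

# Sanity check: the joint over all (c, y) configurations sums to 1.
total = sum(joint(c, np.array(y))
            for c in range(2)
            for y in np.ndindex(*(2,) * 5))
```

The factorization is what makes BN-based clustering tractable: each conditional table is small, yet their product defines a full joint distribution over all 2^5 attribute configurations per cluster.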
Recursive Bayesian multinets (RBMNs)
- A decision tree in which each decision path ends in an alternate component Bayesian network (BN).
- Encode context-specific conditional independencies.
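A minimal data-structure sketch of this idea, assuming the paper's "decision path ends in a component BN" reading; the class and attribute names are hypothetical, and the component networks are reduced to labels:

```python
from dataclasses import dataclass

# Hypothetical RBMN skeleton: an inner node tests one distinguished
# attribute; each decision path ends in its own component BN (represented
# here only by a label standing in for a full network).

@dataclass
class ComponentBN:
    name: str  # placeholder for a component Bayesian network

@dataclass
class DecisionNode:
    attribute: str
    children: dict  # attribute value -> DecisionNode or ComponentBN

def component_for(node, case):
    """Follow the decision path selected by `case` to its component BN."""
    while isinstance(node, DecisionNode):
        node = node.children[case[node.attribute]]
    return node

rbmn = DecisionNode("Y1", {
    0: ComponentBN("BN for Y1=0"),
    1: DecisionNode("Y2", {0: ComponentBN("BN for Y1=1,Y2=0"),
                           1: ComponentBN("BN for Y1=1,Y2=1")}),
})

leaf = component_for(rbmn, {"Y1": 1, "Y2": 0})
```

Each leaf BN only has to model the cases reaching it, which is exactly how the tree captures context-specific conditional independencies: the structure of the component network may differ from one decision path to another.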
BNs for data clustering
- Represent the joint probability distribution of the database by a BN.
Bayesian multinets (BMNs)
- Encode context-specific conditional independencies.
RBMNs
- Can be seen as extensions of BMNs, or as partitional clustering systems.
Real-world domain
- Geographical distribution of malignant tumors.
Component BN structures: extended naive Bayes (ENB) models
- Selection of the attributes to be included in the models: not performed (X).
- Grouping some attributes together under the same node: allowed (O).
Learning algorithm for ENB models
Parameter search
- EM (expectation-maximization) algorithm.
- BC (bound and collapse) + EM algorithm: the approach used (O).
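To make the parameter-search step concrete, here is a minimal EM sketch for clustering with a naive Bayes model over binary attributes. The data and shapes are hypothetical, and the bound-and-collapse component the paper combines with EM is omitted; this is plain EM only:

```python
import numpy as np

# Minimal EM sketch: naive Bayes clustering of binary attributes.
rng = np.random.default_rng(1)
X = rng.integers(0, 2, size=(200, 5))  # 200 cases, 5 binary attributes
K = 2                                   # number of clusters

pi = np.full(K, 1.0 / K)                # P(C)
theta = rng.uniform(0.3, 0.7, (K, 5))   # P(Y_i = 1 | C)

for _ in range(150):                    # iteration cap, as in the paper
    # E-step: responsibilities P(C | case), computed in log space
    log_lik = (X[:, None, :] * np.log(theta)
               + (1 - X[:, None, :]) * np.log(1 - theta)).sum(-1) + np.log(pi)
    log_lik -= log_lik.max(1, keepdims=True)
    resp = np.exp(log_lik)
    resp /= resp.sum(1, keepdims=True)
    # M-step: re-estimate parameters from expected counts
    nk = resp.sum(0)
    pi = nk / len(X)
    theta = (resp.T @ X + 1) / (nk[:, None] + 2)  # Laplace smoothing
```

The Laplace smoothing in the M-step keeps every conditional probability strictly inside (0, 1), so the log terms in the next E-step stay finite; BC+EM, as used in the paper, constrains the parameters differently.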
Structure search
- Constructive induction: the process of changing the representation of the cases in the database by creating new attributes from existing attributes.
- Forward algorithm.
- Backward algorithm.
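A greedy forward search of this kind can be sketched as follows. This is an assumption-laden illustration, not the paper's algorithm: `score` stands in for the marginal likelihood of the ENB model induced by a grouping, and the toy scoring function at the end is purely illustrative:

```python
from itertools import combinations

# Hypothetical greedy forward search over attribute groupings: start from
# singleton attributes and repeatedly merge the pair of groups (forming a
# new compound attribute) that most improves the model score; stop when no
# merge improves it.

def forward_grouping(attributes, score):
    groups = [frozenset([a]) for a in attributes]
    best = score(groups)
    while len(groups) > 1:
        candidates = [(score([g for g in groups if g not in (a, b)] + [a | b]),
                       a, b) for a, b in combinations(groups, 2)]
        top, a, b = max(candidates, key=lambda t: t[0])
        if top <= best:
            break
        best = top
        groups = [g for g in groups if g not in (a, b)] + [a | b]
    return groups

# Toy score that rewards putting Y1 and Y2 in the same compound attribute
toy = lambda gs: sum(1 for g in gs if {"Y1", "Y2"} <= g)
result = forward_grouping(["Y1", "Y2", "Y3"], toy)
```

The backward variant would run in the opposite direction, starting from one large compound attribute and greedily splitting it while the score improves.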
Marginal likelihood criterion for RBMNs
- With an uninformative Dirichlet prior and some reasonable assumptions, including parameter independence, closed-form marginal likelihood expressions are obtained for BNs, for BMNs, and for RBMNs.
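For reference, the standard closed-form marginal likelihood for a single BN over discrete variables with a uniform Dirichlet prior (the Cooper-Herskovits form) is shown below; the notation is the conventional one and is assumed here, not copied from the slide:

```latex
% Marginal likelihood of data D given BN structure S, all Dirichlet
% hyperparameters set to 1 (uninformative prior).
% q_i = number of parent configurations of X_i, r_i = number of states of
% X_i, N_{ijk} = number of cases with X_i = k and parents in configuration
% j, and N_{ij} = \sum_k N_{ijk}.
p(D \mid S) = \prod_{i=1}^{n} \prod_{j=1}^{q_i}
  \frac{(r_i - 1)!}{(N_{ij} + r_i - 1)!}
  \prod_{k=1}^{r_i} N_{ijk}!
```

The BMN and RBMN criteria build on this per-component expression, with the tree's decision paths determining which cases contribute counts to which component network.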
Learning algorithm for RBMNs
Experimental setup
- Both synthetic data and real data.
- Discrete variables with (unrestricted) multinomial distributions.
- Convergence criterion for the BC + EM algorithm: the change in the log marginal likelihood value is less than 10^-6, or 150 iterations.
- fixing_probability_threshold: 0.51.
- Initial structure: naive Bayes model.
- 5 independent runs for each experiment.
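The stated stopping rule (change in log marginal likelihood below 10^-6, or 150 iterations) can be sketched as a driver loop; `em_step` here is a hypothetical stand-in for one BC+EM update that returns the current log marginal likelihood:

```python
# Convergence loop matching the stated criterion; `em_step` is a
# hypothetical callable performing one BC+EM update and returning the
# current log marginal likelihood.
def run_until_converged(em_step, tol=1e-6, max_iters=150):
    prev = em_step()
    for i in range(1, max_iters):
        cur = em_step()
        if abs(cur - prev) < tol:
            return cur, i + 1
        prev = cur
    return prev, max_iters

# Toy em_step: log marginal likelihood approaching -100 geometrically
vals = iter(-100 - 10 * 0.5 ** i for i in range(200))
ll, iters = run_until_converged(lambda: next(vals))
```

With the geometric toy sequence, the tolerance fires well before the 150-iteration cap, which is the intended behavior of a combined "tolerance or cap" criterion.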
1-level RBMNs for the experiments
2-level RBMNs for the experiments
Performance on 4 synthetic databases
Performance on real-world data
- tic-tac-toe data: 2 clusters, 9 predictive variables, 958 cases.
- nursery data: 5 clusters, 8 predictive variables, 12960 cases.
Conclusions and future research
- Context-specific conditional independencies for data partitioning: efficient representation; related to Bayesian committees and mixtures of experts.
- Learning speed problem: a trade-off against the efficient representation.
- Monothetic decision tree: polythetic paths would enrich the modeling power.
- Extensions to the continuous domain.