Kansas State University, Department of Computing and Information Sciences: KDD Lab
Permutation Genetic Algorithms for Score-Based Bayesian Network Structure Learning
Monday, 16 August 2004
William H. Hsu and Roby Joehanes
Joint work with: Haipeng Guo, Benjamin B. Perry, Julie A. Thornton
Thanks to: Jeffrey M. Barber, Andrew King, Chris Meyer
Laboratory for Knowledge Discovery in Databases
Department of Computing and Information Sciences, Kansas State University
Presented at: Computing, Communications and Control Technologies (CCCT) 2004
Research Overview
Graphical Models of Probability
–Markov graphs
–Bayesian (belief) networks
–Causal semantics
–Direction-dependent separation (d-separation) property
Learning and Reasoning: Problems and Algorithms
–Inference: exact and approximate
 Junction tree: Lauritzen and Spiegelhalter (1988)
 (Bounded) loop cutset conditioning: Horvitz and Cooper (1989)
 Variable elimination: Dechter (1996)
–Structure learning
 K2 algorithm: Cooper and Herskovits (1992)
 Variable ordering problem: Larrañaga (1996), Hsu et al. (2002, 2004)
Probabilistic Reasoning in Machine Learning and Data Mining
Current Research and Open Problems
Graphical Models of Probability
Example joint probability (chain rule over the example BBN below):
P(20s, Female, Low, Non-Smoker, No-Cancer, Negative, Negative)
 = P(20s) · P(Female) · P(Low | 20s) · P(Non-Smoker | 20s, Female) · P(No-Cancer | Low, Non-Smoker) · P(Negative | No-Cancer) · P(Negative | No-Cancer)
Conditional Independence
–X is conditionally independent (CI) of Y given Z iff P(X | Y, Z) = P(X | Z) for all values of X, Y, and Z
–Example: P(Thunder | Rain, Lightning) = P(Thunder | Lightning), i.e., Thunder ⊥ Rain | Lightning
Bayesian (Belief) Network
–Acyclic directed graph model B = (V, E, Θ) representing CI assertions over the variables in V
–Vertices (nodes) V: denote events (each a random variable)
–Edges (arcs, links) E: denote conditional dependencies
Markov Condition for BBNs (Chain Rule): P(X1, …, Xn) = ∏i P(Xi | Parents(Xi))
Example BBN (vertices X1–X7): Age, Gender, Exposure-To-Toxins, Smoking, Cancer, Serum Calcium, Lung Tumor
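The chain-rule factorization on this slide can be made concrete with a short sketch. The CPT numbers below are made up purely for illustration; only the factorization structure (which conditional each factor represents) comes from the slide's cancer example.

```python
# Chain rule for BBNs: P(x1..xn) = prod_i P(xi | parents(xi)).
# One joint assignment from the slide: Age=20s, Gender=Female,
# Exposure=Low, Smoking=Non-Smoker, Cancer=No, SerumCalcium=Negative,
# LungTumor=Negative. All probabilities here are illustrative.
p_age = 0.3        # P(Age = 20s)
p_gender = 0.5     # P(Gender = Female)
p_exposure = 0.8   # P(Exposure = Low | Age = 20s)
p_smoking = 0.7    # P(Smoking = Non-Smoker | Age = 20s, Gender = Female)
p_cancer = 0.95    # P(Cancer = No | Exposure = Low, Smoking = Non-Smoker)
p_serum = 0.9      # P(SerumCalcium = Negative | Cancer = No)
p_tumor = 0.9      # P(LungTumor = Negative | Cancer = No)

# Markov condition: the joint is the product of the local conditionals.
joint = (p_age * p_gender * p_exposure * p_smoking
         * p_cancer * p_serum * p_tumor)
```

With these illustrative numbers the product works out to about 0.0646; the point is that seven local CPT entries suffice, rather than a full joint table over all seven variables.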
Model Averaging Procedure (Schuurmans et al.)
General-Case BBN Structure Learning: Use Inference to Compute Scores
Optimal Strategy: Bayesian Model Averaging
–Assumption: models h ∈ H are mutually exclusive and exhaustive
–Combine predictions of models in proportion to marginal likelihood
 Compute conditional probability of hypothesis h given observed data D
 i.e., compute expectation over unknown h for unseen cases
 Let h ≡ structure, parameters Θ ≡ CPTs
Posterior score: P(h | D) ∝ P(D | h) · P(h), i.e., marginal likelihood × prior over structures
Marginal likelihood: P(D | h) = ∫ P(D | h, Θ) · P(Θ | h) dΘ, i.e., likelihood × prior over parameters
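The relations named on this slide are the standard Bayesian model averaging decomposition; written out, with each term labeled as the slide labels it:

```latex
% Model averaging: combine predictions in proportion to posterior score
P(x \mid D) \;=\; \sum_{h \in H} P(x \mid h)\, P(h \mid D)

% Posterior score: marginal likelihood times prior over structures
P(h \mid D) \;\propto\; P(D \mid h)\, P(h)

% Marginal likelihood: integrate the likelihood against the prior over
% parameters (CPTs) to remove the dependence on \Theta
P(D \mid h) \;=\; \int P(D \mid h, \Theta)\, P(\Theta \mid h)\, d\Theta
```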
Greedy Score-Based Algorithm for Structure Learning (K2, Cooper & Herskovits)
Algorithm Learn-BBN-Structure-K2 (D, Max-Parents)
 FOR i ← 1 TO n DO // given an ordering of variables {x1, x2, …, xn}
  WHILE (Parents[xi].Size < Max-Parents) DO // find best candidate parent
   Best ← argmax j<i score(D, xi, Parents[xi] ∪ {xj}) // max Dirichlet score
   IF ((Parents[xi] + Best).Score > Parents[xi].Score) THEN Parents[xi] += Best
   ELSE BREAK
 RETURN ({Parents[xi] | i ∈ {1, 2, …, n}})
ALARM: A Logical Alarm Reduction Mechanism [Beinlich et al., 1989]
–BN2O (3-layer) graphical model for patient monitoring in surgical anesthesia
–Vertices (37): findings (e.g., esophageal intubation), intermediates, observables
–K2 finds a BBN differing in only one edge from the gold standard (elicited from experts)
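The pseudocode above can be sketched as a small runnable program. This is a minimal illustration, not the BNJ implementation: variables are discrete, data is a list of tuples, the ordering is given, and the score is the log of the K2/Dirichlet metric from Cooper & Herskovits (1992) with uniform priors, computed via `math.lgamma`.

```python
# Minimal K2 sketch: greedily add the best-scoring predecessor as a parent
# while the Dirichlet (K2) score improves, up to max_parents.
import math
from collections import Counter

def k2_log_score(i, parents, data, arity):
    """log g(x_i, parents): K2 metric with uniform Dirichlet priors."""
    r = arity[i]
    counts = Counter()                       # (parent_config, value) -> N_ijk
    for row in data:
        cfg = tuple(row[p] for p in parents)
        counts[(cfg, row[i])] += 1
    score = 0.0
    for cfg in set(c for c, _ in counts):
        n_ij = sum(counts[(cfg, k)] for k in range(r))
        score += math.lgamma(r) - math.lgamma(n_ij + r)   # (r-1)!/(N_ij+r-1)!
        for k in range(r):
            score += math.lgamma(counts[(cfg, k)] + 1)    # N_ijk!
    return score

def k2(order, data, arity, max_parents=2):
    parents = {i: [] for i in order}
    for pos, i in enumerate(order):
        old = k2_log_score(i, parents[i], data, arity)
        while len(parents[i]) < max_parents:
            # candidate parents are predecessors of x_i in the ordering
            cands = [j for j in order[:pos] if j not in parents[i]]
            if not cands:
                break
            best = max(cands,
                       key=lambda j: k2_log_score(i, parents[i] + [j], data, arity))
            new = k2_log_score(i, parents[i] + [best], data, arity)
            if new > old:                    # add parent only if score improves
                parents[i].append(best)
                old = new
            else:
                break
    return parents

# Toy data in which x0 determines x1 (with one noisy record):
data = [(0, 0)] * 20 + [(1, 1)] * 20 + [(0, 1)]
result = k2([0, 1], data, arity=[2, 2])      # expect {0: [], 1: [0]}
```

Because the ordering fixes which variables may serve as parents, a bad ordering can make the true structure unreachable; this is exactly the variable ordering problem that the permutation GA in this talk addresses.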
BNJ Development History
Bayesian Network Tools (BNTools)
–Junction Tree
–Editor
–Structure learning (K2)
BNJ v1
–Semistructured data format (XML) based on MSBN, XBN
–ConverterFactory: Hugin, Ergo, Netica, MSBN
–Importance sampling
–Other inference algorithms (Guo: Multi-Start Hill Climbing, Tabu Search)
BNJ v2
–Relational models
–Wizards: learning, inference
BNJ v3 (2004-present)
–Visualization framework
–Run mode: online constraint propagation
–Refactoring for speed: speedup over v2
–Better memory management
BNJ Graphical User Interface: Editor
Asia (Chest Clinic) Network
© 2004 KSU BNJ Development Team
Permutation GA for Greedy Structure Learning (architecture diagram)
–[1] Genetic Algorithm: proposes a candidate representation α (a variable ordering) and receives its representation fitness f(α); outputs the optimized representation
–[2] Representation Evaluator for Learning Problems: genetic wrapper for change of representation and inductive bias control
–Inputs: D (training data) and a specification of the inference task; D is split into D_train (inductive learning) and D_val (inference)
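The wrapper loop in this diagram can be sketched as a standard permutation GA. This is an illustrative stand-in, not the authors' implementation: it uses order crossover (OX) and swap mutation, and the fitness function here is a toy (agreement with a known target ordering), where the talk's fitness is the inferential loss of the K2-learned network on D_val.

```python
# Permutation GA sketch: elitist selection, order crossover, swap mutation.
import random

def order_crossover(p1, p2, rng):
    """OX: copy a random slice from p1, fill remaining slots in p2's order."""
    n = len(p1)
    a, b = sorted(rng.sample(range(n), 2))
    hole = set(p1[a:b])
    fill = [g for g in p2 if g not in hole]
    return fill[:a] + p1[a:b] + fill[a:]

def swap_mutation(perm, rng):
    perm = list(perm)
    i, j = rng.sample(range(len(perm)), 2)
    perm[i], perm[j] = perm[j], perm[i]
    return perm

def permutation_ga(fitness, n, pop_size=40, gens=120, seed=0):
    rng = random.Random(seed)
    pop = [rng.sample(range(n), n) for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=fitness, reverse=True)
        elite = pop[:pop_size // 2]          # keep the best half unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            child = order_crossover(p1, p2, rng)
            if rng.random() < 0.5:
                child = swap_mutation(child, rng)
            children.append(child)
        pop = elite + children
    return max(pop, key=fitness)

# Toy fitness: number of variables already at their target position.
target = list(range(8))
best = permutation_ga(lambda p: sum(x == t for x, t in zip(p, target)), 8)
```

In the talk's setting, each fitness evaluation is expensive (a K2 run plus sampling-based inference), which is why the slides report a fixed sample budget per evaluation.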
Fitness Function
Results: Asia (Chest Clinic)
Histogram of estimated fitness for all 8! = 40320 permutations of the Asia variables
Results table for Asia (columns: K2FS, Samples, Best f of final gen); 5000 samples per fitness evaluation in D_val and D_test
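The "5000 samples per fitness evaluation" refers to sampling-based inference of the kind listed earlier in the talk (importance sampling). As a minimal illustration of the idea, not the BNJ code, here is likelihood weighting on a hypothetical two-node network Rain → WetGrass, estimating P(Rain | WetGrass = true):

```python
# Likelihood weighting sketch: sample non-evidence nodes from the network,
# weight each sample by the likelihood of the evidence given its parents.
import random

def likelihood_weighting(n_samples, seed=0):
    rng = random.Random(seed)
    p_rain = 0.2                             # P(Rain = true), illustrative
    p_wet_given = {True: 0.9, False: 0.1}    # P(WetGrass = true | Rain)
    num = den = 0.0
    for _ in range(n_samples):
        rain = rng.random() < p_rain         # sample the non-evidence node
        w = p_wet_given[rain]                # weight = P(evidence | parents)
        num += w * rain
        den += w
    return num / den                         # estimate of P(Rain | WetGrass=T)

est = likelihood_weighting(5000)
```

The exact posterior here is 0.18 / 0.26 ≈ 0.692, so 5000 weighted samples give an estimate within a percent or two; with fitness defined via such estimates, each GA evaluation carries sampling noise, which the histogram on this slide reflects.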
Results: ALARM-13
BNJ Core [1] Design
BNJ Core [2] Graph Architecture
CPCS-54 Network
© 2004 KSU BNJ Development Team
BNJ Graphical User Interface: Network © 2004 KSU BNJ Development Team ALARM Network
BNJ Visualization [1] Framework © 2004 KSU BNJ Development Team
BNJ Visualization [2] Pseudo-Code Annotation (Code Page) © 2004 KSU BNJ Development Team ALARM Network
BNJ Visualization [3] Network © 2004 KSU BNJ Development Team Poker Network
Current Work: Features in Progress
Scalability
–Large networks (50+ vertices, 10+ parents)
–Very large data sets (10^6+ records)
Other Visualizations
–K2 for structure learning
–Conditioning
BNJ v1-2 Ports
–Guo's dissertation algorithms
–Importance sampling (CABeN)
Lazy Evaluation
Barley Network
© 2004 KSU BNJ Development Team
Future Work: Desired Features
Grid Computing
–Very large networks (200+ vertices)
New Visualizations
–Variable elimination (difficult)
–Other structure learning
New Representations
–Relational graphical models
–Dynamic Bayes nets
–Decision networks
BNJ v1-2 Reimplementations
–Database GUI
–Wizards
Current Research Topics: Bioinformatics
Microarray workflow (adapted from Friedman et al., 2000): Treatment 1 (control) and Treatment 2 (pathogen) → messenger RNA (mRNA) extracts 1 and 2 → cDNA → DNA hybridization microarray (under laser)
Learning environment (diagram): D, the data (user, microarray) → [A] structure learning over genes G1 … G5 yields G = (V, E) → specification fitness (inferential loss) → [B] parameter estimation yields B = (V, E, Θ) → D_val (model validation by inference)
References: Graphical Models and Inference Algorithms
Inference Algorithms
–Junction Tree (Join Tree, L-S, Hugin): Lauritzen & Spiegelhalter (1988)
–(Bounded) Loop Cutset Conditioning: Horvitz & Cooper (1989)
–Variable Elimination (Bucket Elimination, ElimBel): Dechter (1996)
–Stochastic Approximation
Recommended Books
–Neapolitan (1990), out of print; see Pearl (1988), Jensen (2001)
–Castillo, Gutiérrez, Hadi (1997)
–Cowell, Dawid, Lauritzen, Spiegelhalter (1999)
Bioinformatics
–European Bioinformatics Institute Tutorial: Brazma et al. (2001)
–K-State BMI Group: literature survey and resource catalog (2002)
Acknowledgements
Kansas State University Lab for Knowledge Discovery in Databases
–Undergraduates: Jeff Barber, Andrew King
–Graduate Students: Chris Meyer, Julie A. Thornton
Other Universities
–Carnegie Mellon University: Dr. Clark Glymour, Dr. Richard Scheines
–Iowa State University: Dr. Vasant Honavar, Dr. Dimitris Margaritis, Dr. Jin Tian
BNJ v3 Test Sites
For More Information
Commercial Tools: Ergo, Netica, TETRAD, Hugin
Bayes Net Toolbox (BNT): Murphy (1997-present)
–Distribution page
–Development group
Bayesian Network tools in Java (BNJ): Hsu et al. (2000-present)
–Distribution page
–Development group
–Current (re)implementation projects for KSU KDD Lab
 Continuous state: Minka (2002) (Hsu, Barber)
 Formats: XMLBIF (MSBN), Netica (Guo, Hsu)
 Bounded cutset conditioning (Chandak)
 Space-efficient DBN inference