
Kansas State University Department of Computing and Information Sciences
Kansas State University KDD Lab (www.kddresearch.org)

Permutation Genetic Algorithms for Score-Based Bayesian Network Structure Learning

Monday, 16 August 2004
William H. Hsu and Roby Joehanes
Joint work with: Haipeng Guo, Benjamin B. Perry, Julie A. Thornton
Thanks to: Jeffrey M. Barber, Andrew King, Chris Meyer

Laboratory for Knowledge Discovery in Databases
Department of Computing and Information Sciences, Kansas State University

Computing, Communications and Control Technologies (CCCT) 2004

Research Overview

Graphical Models of Probability
– Markov graphs
– Bayesian (belief) networks
– Causal semantics
– Direction-dependent separation (d-separation) property

Learning and Reasoning: Problems, Algorithms
– Inference: exact and approximate
  – Junction tree: Lauritzen and Spiegelhalter (1988)
  – (Bounded) loop cutset conditioning: Horvitz and Cooper (1989)
  – Variable elimination: Dechter (1996)
– Structure learning
  – K2 algorithm: Cooper and Herskovits (1992)
  – Variable ordering problem: Larrañaga et al. (1996), Hsu et al. (2002, 2004)

Probabilistic Reasoning in Machine Learning and Data Mining

Current Research and Open Problems
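The exact-inference algorithms listed above all exploit the factored joint distribution. A minimal sketch of the simplest case, variable elimination on a three-node chain A → B → C; the CPT numbers are made-up placeholders, not an example from the talk:

```python
# Variable elimination on a chain A -> B -> C: compute P(C) by summing
# out A first, then B.  All CPT numbers are illustrative placeholders.
p_a = {0: 0.6, 1: 0.4}
p_b_given_a = {(0, 0): 0.7, (1, 0): 0.3, (0, 1): 0.2, (1, 1): 0.8}  # key = (b, a)
p_c_given_b = {(0, 0): 0.9, (1, 0): 0.1, (0, 1): 0.4, (1, 1): 0.6}  # key = (c, b)

def eliminate_chain(p_a, p_b_given_a, p_c_given_b):
    # Eliminate A: phi(b) = sum_a P(a) * P(b | a)
    phi_b = {b: sum(p_a[a] * p_b_given_a[(b, a)] for a in (0, 1)) for b in (0, 1)}
    # Eliminate B: P(c) = sum_b phi(b) * P(c | b)
    return {c: sum(phi_b[b] * p_c_given_b[(c, b)] for b in (0, 1)) for c in (0, 1)}

marginal_c = eliminate_chain(p_a, p_b_given_a, p_c_given_b)
print(marginal_c)  # a proper distribution: the two values sum to 1
```

Eliminating variables one at a time keeps every intermediate factor small; the junction-tree and cutset methods above generalize the same idea to networks with loops.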

Graphical Models of Probability

Chain rule example:
P(20s, Female, Low, Non-Smoker, No-Cancer, Negative, Negative)
  = P(T) · P(F) · P(L | T) · P(N | T, F) · P(N | L, N) · P(N | N) · P(N | N)

Conditional Independence
– X is conditionally independent (CI) of Y given Z iff P(X | Y, Z) = P(X | Z) for all values of X, Y, and Z
– Example: P(Thunder | Rain, Lightning) = P(Thunder | Lightning), i.e., T ⫫ R | L

Bayesian (Belief) Network
– Acyclic directed graph model B = (V, E, Θ) representing CI assertions over the variables in V
– Vertices (nodes) V: denote events (each a random variable)
– Edges (arcs, links) E: denote conditional dependencies

Markov Condition for BBNs (Chain Rule):
P(X₁, X₂, …, Xₙ) = ∏ᵢ P(Xᵢ | parents(Xᵢ))

[Example BBN: X₁ = Age, X₂ = Gender, X₃ = Exposure-To-Toxins, X₄ = Smoking, X₅ = Cancer, X₆ = Serum Calcium, X₇ = Lung Tumor]
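The chain rule on this slide is easy to check numerically. A minimal sketch over the seven-variable example network; every CPT entry below is a hypothetical placeholder, not a value from the slide:

```python
# Chain-rule factorization for the example network:
#   P(x1, ..., x7) = product_i P(xi | parents(xi)).
# All CPT entries here are hypothetical placeholders.
parents = {
    "Age": [], "Gender": [],
    "Exposure-To-Toxins": ["Age"],
    "Smoking": ["Age", "Gender"],
    "Cancer": ["Exposure-To-Toxins", "Smoking"],
    "Serum Calcium": ["Cancer"],
    "Lung Tumor": ["Cancer"],
}

def joint_probability(assignment, cpt, parents):
    """Multiply P(x | parents(x)) over all nodes for one full assignment."""
    p = 1.0
    for node, pars in parents.items():
        p *= cpt[(node, assignment[node], tuple(assignment[q] for q in pars))]
    return p

assignment = {"Age": "20s", "Gender": "F", "Exposure-To-Toxins": "Low",
              "Smoking": "No", "Cancer": "No",
              "Serum Calcium": "Neg", "Lung Tumor": "Neg"}
cpt = {  # only the entries this one assignment touches
    ("Age", "20s", ()): 0.3,
    ("Gender", "F", ()): 0.5,
    ("Exposure-To-Toxins", "Low", ("20s",)): 0.8,
    ("Smoking", "No", ("20s", "F")): 0.7,
    ("Cancer", "No", ("Low", "No")): 0.95,
    ("Serum Calcium", "Neg", ("No",)): 0.9,
    ("Lung Tumor", "Neg", ("No",)): 0.95,
}
print(joint_probability(assignment, cpt, parents))  # product of seven local factors
```

The product has exactly one factor per node, each conditioned only on that node's parents, which is the Markov condition stated above.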

Model Averaging Procedure (Schuurmans et al.)

General-Case BBN Structure Learning: Use Inference to Compute Scores

Optimal Strategy: Bayesian Model Averaging
– Assumption: models h ∈ H are mutually exclusive and exhaustive
– Combine predictions of models in proportion to marginal likelihood
– Compute the conditional probability of hypothesis h given observed data D, i.e., compute the expectation over the unknown h for unseen cases
– Let h ≡ structure, parameters Θ ≡ CPTs

Posterior score: P(h | D) ∝ P(D | h) · P(h), the marginal likelihood times the prior over structures
Marginal likelihood: P(D | h) = ∫ P(D | Θ, h) P(Θ | h) dΘ, the likelihood integrated against the prior over parameters
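The averaging step itself can be sketched numerically. The three candidate structures, their scores, and their per-model predictions below are invented for illustration:

```python
import math

# Bayesian model averaging: weight each structure's prediction by its
# posterior P(h | D) proportional to P(D | h) * P(h), then combine.
def model_average(predictions, log_marginal_likelihoods, priors):
    """predictions[k] = P(x | h_k, D); returns the averaged prediction."""
    weights = [math.exp(lml) * prior
               for lml, prior in zip(log_marginal_likelihoods, priors)]
    z = sum(weights)  # normalizer (P(D) when H is exhaustive)
    return sum((w / z) * pred for w, pred in zip(weights, predictions))

# Three hypothetical structures with invented scores and predictions:
p = model_average(predictions=[0.9, 0.6, 0.2],
                  log_marginal_likelihoods=[-10.0, -12.0, -15.0],
                  priors=[1 / 3, 1 / 3, 1 / 3])
print(round(p, 4))  # prints 0.8603 -- dominated by the best-scoring model
```

Because marginal likelihoods are exponentiated log scores, the highest-scoring structure typically dominates the average, which is why a single MAP structure is often a reasonable surrogate.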

Greedy Score-Based Algorithm for Structure Learning (K2, Cooper & Herskovits)

Algorithm Learn-BBN-Structure-K2 (D, Max-Parents)
  FOR i ← 1 to n DO                                // fixed ordering of variables {x₁, x₂, …, xₙ}
    WHILE (Parents[xᵢ].Size < Max-Parents) DO      // find best candidate parent
      Best ← argmax_{j<i} P(D | xⱼ ∈ Parents[xᵢ])  // max Dirichlet score; candidates precede xᵢ in the ordering
      IF ((Parents[xᵢ] + Best).Score > Parents[xᵢ].Score) THEN Parents[xᵢ] += Best
      ELSE BREAK
  RETURN ({Parents[xᵢ] | i ∈ {1, 2, …, n}})

A Logical Alarm Reduction Mechanism (ALARM) [Beinlich et al., 1989]
– BN2O (3-layer) graphical model for patient monitoring in surgical anesthesia
– Vertices (37): findings (e.g., esophageal intubation), intermediates, observables
– K2 finds a BBN that differs in only 1 edge from the gold standard (elicited from an expert)
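The K2 loop on this slide can be sketched directly in Python. Here `score` is a stand-in for the Dirichlet (Cooper-Herskovits) family score, supplied by the caller, and the toy score used in the demo is purely hypothetical:

```python
def k2_structure(order, max_parents, score):
    """Greedy K2 parent search over a fixed variable ordering.

    `order` is a list of variable names; `score(x, parents)` is any
    decomposable family score (stand-in for the Dirichlet/CH score).
    """
    structure = {}
    for i, x in enumerate(order):
        parents = set()
        best = score(x, parents)
        while len(parents) < max_parents:
            # Candidate parents must precede x in the ordering.
            candidates = [y for y in order[:i] if y not in parents]
            if not candidates:
                break
            y = max(candidates, key=lambda c: score(x, parents | {c}))
            new = score(x, parents | {y})
            if new > best:          # greedy: add only if the score improves
                parents.add(y)
                best = new
            else:
                break
        structure[x] = parents
    return structure

# Toy score favoring the chain A -> B -> C (hypothetical, for illustration):
def toy_score(x, parents):
    wanted = {"A": set(), "B": {"A"}, "C": {"B"}}
    return -len(parents ^ wanted[x])  # 0 when parents match, negative otherwise

print(k2_structure(["A", "B", "C"], max_parents=2, score=toy_score))
```

Because the score decomposes over families, each greedy step touches only one node's parent set, which is what makes K2 fast once an ordering is fixed; the quality of that ordering is exactly the problem the permutation GA in this talk addresses.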

BNJ Development History

Bayesian Network Tools (BNTools)
– Junction Tree
– Editor
– Structure learning (K2)

BNJ v1
– Semistructured data format (XML) based on MSBN, XBN
– ConverterFactory: Hugin, Ergo, Netica, MSBN
– Importance sampling
– Other inference algorithms (Guo: Multi-Start Hill Climbing, Tabu Search)

BNJ v2
– Relational models
– Wizards: Learning, Inference

BNJ v3 (2004-present)
– Visualization framework
– Run mode: online constraint propagation
– Refactoring for speed over v2
– Better memory management

BNJ Graphical User Interface: Editor

© 2004 KSU BNJ Development Team
Asia (Chest Clinic) Network

Permutation GA for Greedy Structure Learning

Genetic Wrapper for Change of Representation and Inductive Bias Control

[Diagram:
– D: training data, split into D_train (inductive learning) and D_val (inference), with an inference specification
– [1] Genetic Algorithm proposes a candidate representation α
– [2] Representation Evaluator for Learning Problems scores α, returning representation fitness f(α) to the GA
– Output: optimized representation]
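The wrapper's outer loop is an ordinary permutation GA. A minimal sketch, assuming a caller-supplied fitness such as the validation-set inferential loss of the network K2 learns under each ordering; the operators here (swap mutation, elitist truncation selection) are illustrative choices, not necessarily the paper's exact configuration:

```python
import random

def permutation_ga(items, fitness, pop_size=20, generations=50, seed=0):
    """Evolve orderings of `items`; higher fitness is better.

    Swap mutation + elitist truncation selection -- illustrative
    operators; `fitness` is supplied by the caller.
    """
    rng = random.Random(seed)

    def mutate(perm):
        child = perm[:]
        i, j = rng.sample(range(len(child)), 2)  # swap two positions
        child[i], child[j] = child[j], child[i]
        return child

    pop = [rng.sample(items, len(items)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]          # keep the best half
        pop = survivors + [mutate(rng.choice(survivors)) for _ in survivors]
    return max(pop, key=fitness)

# Toy fitness: reward orderings close to alphabetical (hypothetical).
target = ["A", "B", "C", "D", "E"]
best = permutation_ga(target,
                      fitness=lambda p: -sum(p.index(x) != i
                                             for i, x in enumerate(target)))
print(best)
```

In the actual wrapper, evaluating one individual means running K2 under that ordering and measuring inference quality on D_val, so fitness evaluations dominate the cost and the population size matters far more than it does in this toy.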

Fitness Function

Results: Asia (Chest Clinic)

Histogram of estimated fitness for all 8! = 40320 permutations of the Asia variables

[Table: K2 vs. FS – samples, best f of final generation]

Results for Asia (5000 samples per fitness evaluation in D_val and D_test)

Results: ALARM-13

BNJ Core [1] Design

BNJ Core [2] Graph Architecture

© 2004 KSU BNJ Development Team
CPCS-54 Network

BNJ Graphical User Interface: Network

© 2004 KSU BNJ Development Team
ALARM Network

BNJ Visualization [1] Framework

© 2004 KSU BNJ Development Team

BNJ Visualization [2] Pseudo-Code Annotation (Code Page)

© 2004 KSU BNJ Development Team
ALARM Network

BNJ Visualization [3] Network

© 2004 KSU BNJ Development Team
Poker Network

Current Work: Features in Progress

Scalability
– Large networks (50+ vertices, 10+ parents)
– Very large data sets (10⁶+ records)

Other Visualizations
– K2 for structure learning
– Conditioning

BNJ v1-2 Ports
– Guo's dissertation algorithms
– Importance sampling (CABeN)

Lazy Evaluation

© 2004 KSU BNJ Development Team
Barley Network

Future Work: Desired Features

Grid Computing
– Very large networks (200+ vertices)

New Visualizations
– Variable elimination (difficult)
– Other structure learning

New Representations
– Relational graphical models
– Dynamic Bayes nets
– Decision networks

BNJ v1-2 Reimplementations
– Database GUI
– Wizards

Current Research Topics: Bioinformatics

[Figure, adapted from Friedman et al. (2000):
– Treatment 1 (Control) and Treatment 2 (Pathogen) yield Messenger RNA (mRNA) Extracts 1 and 2
– cDNA hybridization produces a DNA microarray (read under LASER)
– D: data (user, microarray) enters the learning environment
– [A] Structure learning over genes G₁ … G₅ yields the specification G = (V, E)
– [B] Parameter estimation yields B = (V, E, Θ)
– D_val: model validation by inference; fitness measured as inferential loss]

References: Graphical Models and Inference Algorithms

Inference Algorithms
– Junction Tree (Join Tree, L-S, Hugin): Lauritzen & Spiegelhalter (1988)
– (Bounded) Loop Cutset Conditioning: Horvitz & Cooper (1989)
– Variable Elimination (Bucket Elimination, ElimBel): Dechter (1996)

Recommended Books
– Neapolitan (1990), out of print; see Pearl (1988), Jensen (2001)
– Castillo, Gutiérrez, & Hadi (1997)
– Cowell, Dawid, Lauritzen, & Spiegelhalter (1999)

Stochastic Approximation

Bioinformatics
– European Bioinformatics Institute Tutorial: Brazma et al. (2001)
– K-State BMI Group: literature survey and resource catalog (2002)

Acknowledgements

Kansas State University Lab for Knowledge Discovery in Databases
– Undergraduates: Jeff Barber, Andrew King
– Graduate students: Chris Meyer, Julie A. Thornton

Other Universities
– Carnegie Mellon University: Dr. Clark Glymour, Dr. Richard Scheines
– Iowa State University: Dr. Vasant Honavar, Dr. Dimitris Margaritis, Dr. Jin Tian

BNJ v3 Test Sites

For More Information

Commercial Tools: Ergo, Netica, TETRAD, Hugin

Bayes Net Toolbox (BNT), Murphy (1997-present)
– Distribution page
– Development group

Bayesian Network tools in Java (BNJ), Hsu et al. (2000-present)
– Distribution page
– Development group
– Current (re)implementation projects for the KSU KDD Lab
  – Continuous state: Minka (2002) – Hsu, Barber
  – Formats: XML BNIF (MSBN), Netica – Guo, Hsu
  – Bounded cutset conditioning – Chandak
  – Space-efficient DBN inference