Simulation and Application on learning gene causal relationships Xin Zhang.



Introduction
- High-throughput genetic technologies make it possible to study how genes interact with each other.
- We use simulation to evaluate how well the IC algorithm learns gene causal relationships.
- We present an algorithm (the mIC algorithm) that learns causal relationships using topological ordering information, and apply it to a melanoma dataset.

Steps for Simulation Study
- Construct a causal network N.
- Generate datasets from the causal network.
- Run a causal learning algorithm (e.g. the IC algorithm) on the simulated data to obtain a network N´.
- Compare the original network N with the obtained network N´ with respect to precision and recall.

Modeling and simulation of a causal Boolean network (BN)
- Boolean network example: genes A and B are the causal parents of C through a function f, so that C = f(A, B).
- Construct a causal structure.
- Assign parameters (proper functions) to each node that has causal parents.
- Assign a probability distribution.

Constructing Boolean Network
1. Generate M BNs with up to 3 causal parents for each node.
2. For each BN, generate a random proper function for each node.
3. Assign random probabilities to the root gene(s).
4. Given one configuration, get the probability distribution.
5. Collect 200 data points for each network.
6. Repeat steps 3-5 above for all M networks.
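The construction steps above can be sketched in Python as follows. This is a minimal illustration and not the author's implementation: the function names (`random_network`, `sample`, `depends_on_all`), the choice to draw parents only from earlier genes (which keeps the structure acyclic), and the rejection sampling used to obtain proper functions are all assumptions made for this example.

```python
import itertools
import random

def depends_on_all(table, k):
    """True if flipping any single one of the k inputs changes the output somewhere."""
    for pos in range(k):
        if all(table[bits] == table[bits[:pos] + (1 - bits[pos],) + bits[pos + 1:]]
               for bits in table):
            return False  # input `pos` is a pseudo-predictor
    return True

def random_network(n_genes, max_parents=3, rng=random):
    """Random causal Boolean network; genes are numbered in topological order."""
    parents, tables = {}, {}
    for g in range(n_genes):
        k = rng.randint(0, min(max_parents, g))          # up to 3 causal parents
        parents[g] = tuple(sorted(rng.sample(range(g), k)))
        if k == 0:
            continue                                      # root gene: no function
        inputs = list(itertools.product((0, 1), repeat=k))
        while True:                                       # redraw until proper
            table = {bits: rng.randint(0, 1) for bits in inputs}
            if depends_on_all(table, k):
                break
        tables[g] = table
    # Assumed here: each root gene is ON with its own random probability.
    root_prob = {g: rng.random() for g in parents if not parents[g]}
    return parents, tables, root_prob

def sample(parents, tables, root_prob, n_points=200, rng=random):
    """Draw n_points joint configurations of all genes."""
    data = []
    for _ in range(n_points):
        state = {}
        for g in sorted(parents):                         # topological order
            if parents[g]:
                state[g] = tables[g][tuple(state[p] for p in parents[g])]
            else:
                state[g] = int(rng.random() < root_prob[g])
        data.append([state[g] for g in sorted(parents)])
    return data

if __name__ == "__main__":
    rng = random.Random(0)
    net = random_network(6, rng=rng)
    print(len(sample(*net, n_points=200, rng=rng)))
```

In this sketch the non-root genes are deterministic functions of their parents, so all randomness in the data comes from the root probabilities; the slides may have used a different probability model.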

Constructing Causal Structure
[Figure: example causal structure over the nodes A, B, C, D, E]

Steps for constructing causal structure

Proper function (1)
Proper function: a function that reflects the influence of all of its operands (predictors).
Example: by simplifying f, c turns out to be a function of a alone, with c = a. Here b is a pseudo-predictor of c and has no effect on c, so f is not a proper function.

Proper function (2)
Definition: With n predictors, the number of proper functions is given by:
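The formula from the slide is not preserved in this transcript. As an illustration only, the sketch below assumes that "proper" means the Boolean function depends on every one of its n predictors, and counts such functions in two ways: by brute force over all truth tables, and with the standard inclusion-exclusion count, sum over k of (-1)^(n-k) * C(n, k) * 2^(2^k). The helper names are hypothetical, and the expression on the original slide may have been written differently.

```python
from itertools import product
from math import comb

def is_proper(truth_table, n):
    """truth_table[x] is the output for the input whose bits are the binary digits of x."""
    for pos in range(n):
        mask = 1 << pos
        # If flipping input `pos` never changes the output, it is a pseudo-predictor.
        if all(truth_table[x] == truth_table[x ^ mask] for x in range(2 ** n)):
            return False
    return True

def count_proper_brute_force(n):
    """Enumerate all 2**(2**n) truth tables; only practical for small n."""
    return sum(is_proper(t, n) for t in product((0, 1), repeat=2 ** n))

def count_proper_formula(n):
    """Inclusion-exclusion over the subset of inputs the function really depends on."""
    return sum((-1) ** (n - k) * comb(n, k) * 2 ** (2 ** k) for k in range(n + 1))

if __name__ == "__main__":
    # f(a, b) = a, with a as bit 0 and b as bit 1: b is a pseudo-predictor.
    print(is_proper((0, 1, 0, 1), 2))
    for n in range(1, 4):
        print(n, count_proper_brute_force(n), count_proper_formula(n))
```

Under this assumption there are 2, 10 and 218 proper functions for n = 1, 2 and 3 predictors respectively.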

Probability Distribution

Generating dataset

Steps of learning gene causal relationships
- Step 1: obtain the probability distribution and sample data.
- Step 2: apply the algorithms to find causal relations.
- Step 3: compare the original and obtained networks in terms of precision and recall.
- Step 4: repeat steps 1-3 for every random network.

Comparing two networks
[Figure: the original network and the obtained network, each over the nodes A, B, C, D]

Precision and Recall
The original graph is a DAG, while the obtained graph has both directed and undirected edges.
Edge categories when comparing the original graph with the obtained graph: FN, TP, TN, FP; PFN, PTP, PTN, PFP.
Recall = ATP / (AFN + ATP)
Precision = ATP / (ATP + AFP)
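As a simplified illustration of the two ratios, the sketch below compares two sets of directed edges only. It does not reproduce the slide's partial categories (PFN, PTP, PTN, PFP) or the way they are aggregated into ATP, AFP and AFN, since that aggregation is not spelled out in the transcript; plain TP, FP and FN counts stand in for the aggregated values, and the function name is hypothetical.

```python
def precision_recall(true_edges, learned_edges):
    """Precision and recall of learned directed edges against the true directed edges.

    Simplification (assumption): undirected edges in the learned graph are not
    handled; every edge is a (parent, child) pair.
    """
    true_edges, learned_edges = set(true_edges), set(learned_edges)
    tp = len(true_edges & learned_edges)   # edges recovered with the right direction
    fp = len(learned_edges - true_edges)   # edges reported but not in the original
    fn = len(true_edges - learned_edges)   # original edges that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

if __name__ == "__main__":
    original = {("A", "B"), ("A", "C"), ("C", "D")}
    obtained = {("A", "B"), ("C", "D"), ("B", "D")}
    print(precision_recall(original, obtained))
```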

Observational equivalence and Transitive Closure
Two DAGs are said to be observationally equivalent (OE) if they have the same skeleton and the same set of v-structures.
[Figure: two observationally equivalent DAGs over the nodes A, B, C, D]
Transitive closure (TC): given A -> B -> C, the closure also contains A -> C.
cc(x,y): true if there is a directed or an undirected edge from x to y.
pcc(x,y): true if there is a path from x to y consisting of properly directed and undirected edges.
pcc(x,y) := cc(x,y) ∨ (pcc(x,z) ∧ pcc(z,y))
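Both notions on this slide can be checked mechanically. The sketch below is an illustration with hypothetical function names: `transitive_closure` computes pcc from cc with Warshall's algorithm (an undirected edge x - y is assumed to be entered as both (x, y) and (y, x)), and `observationally_equivalent` compares the skeletons and v-structures of two DAGs given as sets of (parent, child) pairs.

```python
from itertools import combinations

def transitive_closure(nodes, cc):
    """pcc(x, y) := cc(x, y) or (pcc(x, z) and pcc(z, y)), via Warshall's algorithm."""
    pcc = set(cc)
    for z in nodes:              # z is the intermediate node
        for x in nodes:
            for y in nodes:
                if (x, z) in pcc and (z, y) in pcc:
                    pcc.add((x, y))
    return pcc

def skeleton(edges):
    """Undirected version of the graph: each edge without its direction."""
    return {frozenset(e) for e in edges}

def v_structures(edges):
    """All a -> child <- b where a and b are not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for x, y in edges:
        parents.setdefault(y, set()).add(x)
    found = set()
    for child, ps in parents.items():
        for a, b in combinations(sorted(ps), 2):
            if frozenset((a, b)) not in skel:
                found.add((frozenset((a, b)), child))
    return found

def observationally_equivalent(edges1, edges2):
    return (skeleton(edges1) == skeleton(edges2)
            and v_structures(edges1) == v_structures(edges2))

if __name__ == "__main__":
    chain = {("A", "B"), ("B", "C")}
    reversed_chain = {("C", "B"), ("B", "A")}
    collider = {("A", "B"), ("C", "B")}
    print(("A", "C") in transitive_closure("ABC", chain))
    print(observationally_equivalent(chain, reversed_chain))
    print(observationally_equivalent(chain, collider))
```

The chain A -> B -> C and its reversal share a skeleton and have no v-structures, so they are equivalent; the collider A -> B <- C has the same skeleton but adds a v-structure, so it is not.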

Result for IC algorithm

How to improve the IC algorithm
- The original IC algorithm did not give good results when learning gene causal relationships.
- One possible way to improve its performance is to incorporate extra information.
- If the topological ordering of the regulatory network is known, it can be used to improve the learning result.

Gene topological ordering
Ordering information is available when:
- a specific gene is known to be the causal parent of another gene;
- one gene appears before another gene in a pathway;
- a gene is known to be at the beginning or at the end of a pathway.
Idea: IC algorithm + topological ordering information.

mIC algorithm
The mIC algorithm is based on IC, but combines topological ordering information with steady-state data to infer causality.
The 3 steps of the mIC algorithm:
– Find conditional independence: for each pair of genes g_i and g_j in a dataset, test pairwise independence. If they are dependent, search for a set S_ij = {g_k | g_i and g_j are independent given g_k, with i<k<j or j<k<i}. Construct an undirected graph G such that g_i and g_j are connected by an edge if and only if they are pairwise dependent and no S_ij can be found.
– Find v-structures: for each pair of nonadjacent genes g_i and g_j with a common neighbor g_k, if g_k ∉ S_ij, and k>i, k>j, add arrowheads pointing at g_k, i.e. g_i -> g_k <- g_j.
– Orient more edges according to rules: orient the remaining undirected edges without creating new cycles or v-structures.
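The first two steps can be sketched as below. This is a rough illustration under several assumptions, not the author's code: genes are indexed 0..n-1 in the known topological order, the statistical tests are passed in as two hypothetical callables (`dependent(i, j)` and `indep_given(i, j, k)`), and the third step (rule-based orientation of the remaining edges) is left out.

```python
from itertools import combinations

def mic_steps_1_and_2(n, dependent, indep_given):
    """Skeleton and v-structure steps, with genes 0..n-1 in topological order.

    dependent(i, j)      -> True if genes i and j are pairwise dependent
    indep_given(i, j, k) -> True if i and j are independent given gene k
    Returns (undirected_edges, directed_edges) as sets of index pairs.
    """
    sepset = {}
    undirected = set()
    for i, j in combinations(range(n), 2):              # always i < j
        if not dependent(i, j):
            continue                                     # independent: no edge
        # Only genes strictly between i and j in the ordering may separate them.
        s = {k for k in range(i + 1, j) if indep_given(i, j, k)}
        if s:
            sepset[(i, j)] = s
        else:
            undirected.add((i, j))                       # dependent, no S_ij found

    def adjacent(a, b):
        return (min(a, b), max(a, b)) in undirected

    directed = set()
    for i, j in combinations(range(n), 2):
        if adjacent(i, j):
            continue
        for k in range(j + 1, n):                        # k > i and k > j
            if (adjacent(i, k) and adjacent(j, k)
                    and k not in sepset.get((i, j), set())):
                directed.add((i, k))                     # g_i -> g_k
                directed.add((j, k))                     # g_j -> g_k
    return undirected - directed, directed

if __name__ == "__main__":
    # Toy oracle for the collider 0 -> 2 <- 1: genes 0 and 1 are independent.
    dep = lambda i, j: (i, j) != (0, 1)
    ind = lambda i, j, k: False
    print(mic_steps_1_and_2(3, dep, ind))
```

Passing the tests in as callables keeps the sketch independent of any particular statistical test; in practice they would be computed from the steady-state data.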

Results from mIC algorithm

Melanoma dataset
- The 10 genes involved in this study were chosen from the 587 genes in the melanoma data.
- Previous studies have identified WNT5A as a gene of interest involved in melanoma.
- Controlling the influence of WNT5A in the regulation can reduce the chance of melanoma metastasizing.

Applying the mIC algorithm to the Melanoma Dataset
- Partial biological prior knowledge: MMP3 is expected to be at the end of the pathway.
- Pirin causally influences WNT5A: to maintain the level of WNT5A, we need to control WNT5A directly or through pirin.
- WNT5A directly causes MART-1.

Conclusion
- We evaluated the IC algorithm using simulated data.
- We presented the mIC algorithm, which can infer gene causal relationships from steady-state data together with gene topological ordering information.
- We performed simulations based on Boolean networks to evaluate the performance of the causal algorithms.
- We applied the mIC algorithm to real biological microarray data, the melanoma dataset.
- The results showed that the mIC algorithm identified some of the important causal relationships associated with the WNT5A gene.