Integrative Analysis of Biological Data Sai Moturu.

MAGIC: Multisource Association of Genes by Integration of Clusters
- Goal: integrate heterogeneous types of high-throughput data for accurate gene function prediction
- Bayesian reasoning
- Incorporates expert knowledge
- Yeast data

Integrative analysis: why?
- High-throughput methods sacrifice specificity for scale
- Microarray data alone are good for hypothesis generation but lack the specificity needed for accurate gene function prediction
- Using heterogeneous functional data improves prediction accuracy

Need for MAGIC
- Studies have combined different types of data heuristically, on a case-by-case basis
- No general scheme or probabilistic representation is applied
- Existing methods are tailored to combining specific types of data
- MAGIC is a general method for integrating disparate data sources

Input to MAGIC
- Input: one gene-gene relation matrix for each data source
- Each matrix element is a score indicating whether there could be a relationship between two genes
- The score can be binary, discrete, or continuous
- The input format is flexible and allows a gene to belong to more than one group or cluster
- It therefore does not exclude biclustering or fuzzy clustering methods
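One way to picture the input format is the minimal Python sketch below. It turns the cluster memberships from a single data source into a binary gene-gene relation matrix. The function name `relation_matrix` and the toy genes and clusters are hypothetical, and this illustrates the matrix idea only, not MAGIC's actual input code.

```python
from itertools import combinations


def relation_matrix(genes, clusters):
    """Binary gene-gene matrix: 1 if two genes share at least one cluster."""
    index = {gene: i for i, gene in enumerate(genes)}
    n = len(genes)
    matrix = [[0] * n for _ in range(n)]
    for members in clusters:
        # Every pair of distinct genes in the same cluster is marked as related.
        for a, b in combinations(sorted(set(members)), 2):
            i, j = index[a], index[b]
            matrix[i][j] = matrix[j][i] = 1
    return matrix


# Hypothetical toy data: g2 belongs to two clusters, which the format allows.
genes = ["g1", "g2", "g3", "g4"]
clusters = [["g1", "g2"], ["g2", "g3"]]
m = relation_matrix(genes, clusters)
```

Because a gene may appear in any number of clusters, overlapping groupings such as those produced by biclustering or fuzzy clustering fit the same representation.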

Structure of the MAGIC Bayesian network
- Prior probabilities assessed by experts
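The network structure itself is not reproduced in this transcript. As a rough illustration of how expert-assessed probabilities can combine evidence from several sources, here is a naive Bayes sketch in Python. It assumes the sources are conditionally independent, a simplification that is not claimed to match MAGIC's actual network. The source names and all probabilities are invented for the example.

```python
def posterior_related(prior, evidence, likelihoods):
    """Posterior P(related | evidence) under a naive Bayes assumption.

    prior       -- expert-assessed P(two genes are functionally related)
    evidence    -- dict: source name -> True/False (relationship observed?)
    likelihoods -- dict: source name -> (P(observed | related),
                                         P(observed | unrelated))
    """
    p_related, p_unrelated = prior, 1.0 - prior
    for source, observed in evidence.items():
        p_obs_rel, p_obs_unrel = likelihoods[source]
        # Multiply in the likelihood of what this source actually reported.
        p_related *= p_obs_rel if observed else 1.0 - p_obs_rel
        p_unrelated *= p_obs_unrel if observed else 1.0 - p_obs_unrel
    return p_related / (p_related + p_unrelated)


# Hypothetical expert-assessed values, for illustration only.
likelihoods = {"coexpression": (0.8, 0.2), "two_hybrid": (0.6, 0.1)}
one_source = posterior_related(0.5, {"coexpression": True}, likelihoods)
two_sources = posterior_related(
    0.5, {"coexpression": True, "two_hybrid": True}, likelihoods
)
```

With these made-up numbers, agreement between two sources yields a higher posterior than either source alone, which is the intuition behind integrating heterogeneous data.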

Evaluation
- No gold standard for gene groupings exists
- The Gene Ontology (GO) is the best available reflection of current biological knowledge
- A cutoff of 3 levels in the GO hierarchy is used to decide whether two genes are functionally related
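The slide does not spell out how the 3-level cutoff is applied. The Python sketch below shows one possible reading, stated here as an assumption: two genes count as functionally related if they share a GO term, or an ancestor term, that lies at least `cutoff` levels below the root. The toy hierarchy and term names are hypothetical.

```python
def ancestors(term, parents):
    """Return the term together with all of its ancestors in the hierarchy."""
    seen, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def depth(term, parents):
    """Shortest distance from the term up to a root (a root has depth 0)."""
    term_parents = parents.get(term, [])
    if not term_parents:
        return 0
    return 1 + min(depth(p, parents) for p in term_parents)


def functionally_related(terms_a, terms_b, parents, cutoff=3):
    """True if the genes share a term or ancestor at depth >= cutoff."""
    anc_a = set().union(*(ancestors(t, parents) for t in terms_a))
    anc_b = set().union(*(ancestors(t, parents) for t in terms_b))
    return any(depth(t, parents) >= cutoff for t in anc_a & anc_b)


# Hypothetical toy hierarchy: child term -> list of parent terms.
parents = {"A": ["root"], "B": ["A"], "C": ["B"],
           "D": ["C"], "E": ["B"], "F": ["C"]}
deep_pair = functionally_related({"D"}, {"F"}, parents)     # share C (depth 3)
shallow_pair = functionally_related({"D"}, {"E"}, parents)  # share only B (depth 2)
```

Requiring a deep shared term keeps very general annotations, which nearly all genes share, from counting as evidence of a functional relationship.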

Results

Results

AVID: Annotation Via Integration of Data
- Integrates data to build high-confidence networks in which proteins are connected if they are likely to share a common annotation
- AVID predicts functional annotation in all three GO categories
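The general idea of annotating proteins through a high-confidence network can be sketched in Python as follows. This is a generic guilt-by-association illustration and does not reproduce AVID's stages. The pair scores, the threshold, and the majority-vote rule are all assumptions introduced for the example.

```python
from collections import Counter


def build_network(pair_scores, threshold):
    """Keep only protein pairs whose confidence score meets the threshold."""
    network = {}
    for (a, b), score in pair_scores.items():
        if score >= threshold:
            network.setdefault(a, set()).add(b)
            network.setdefault(b, set()).add(a)
    return network


def predict_annotation(protein, network, known):
    """Predict the most common annotation among annotated neighbours."""
    votes = Counter(
        known[n] for n in network.get(protein, ()) if n in known
    )
    return votes.most_common(1)[0][0] if votes else None


# Hypothetical confidence scores and annotations, for illustration only.
pair_scores = {("p1", "p2"): 0.9, ("p1", "p3"): 0.8, ("p1", "p4"): 0.3}
known = {"p2": "transport", "p3": "transport", "p4": "signaling"}
network = build_network(pair_scores, threshold=0.7)
prediction = predict_annotation("p1", network, known)
```

The threshold trades coverage for confidence: raising it removes weakly supported edges, so fewer proteins receive a prediction but each prediction rests on stronger evidence.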

AVID stages

AVID results