ResponseNet Esti Yeger-Lotem, Laura Riva, Linhui Julie Su, Aaron D Gitler, Anil G Cashikar, Oliver D King, Pavan K Auluck, Melissa L Geddie, Julie S Valastyan,

Slides:

Advertisements

Similar presentations

Control Case Common Always active

Advertisements

Introduction to Training and Learning in Neural Networks n CS/PY 399 Lab Presentation # 4 n February 1, 2001 n Mount Union College.

Putting genetic interactions in context through a global modular decomposition Jamal.

CSE Fall. Summary Goal: infer models of transcriptional regulation with annotated molecular interaction graphs The attributes in the model.

D ISCOVERING REGULATORY AND SIGNALLING CIRCUITS IN MOLECULAR INTERACTION NETWORK Ideker Bioinformatics 2002 Presented by: Omrit Zemach April Seminar.

Yeast Lab!. What makes something living? Consider the following questions… How big/complex must something be? What must it be able to do? Where must it.

Global Mapping of the Yeast Genetic Interaction Network Tong et. al, Science, Feb 2004 Presented by Bowen Cui.

. Inferring Subnetworks from Perturbed Expression Profiles D. Pe’er A. Regev G. Elidan N. Friedman.

Petri net modeling of biological networks Claudine Chaouiya.

Teresa Przytycka NIH / NLM / NCBI RECOMB 2010 Bridging the genotype and phenotype.

Clustering short time series gene expression data Jason Ernst, Gerard J. Nau and Ziv Bar-Joseph BIOINFORMATICS, vol

Networks are useful for describing systems of interacting objects, where the nodes represent the objects and the edges represent the interactions between.

A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae Article by Peter Uetz, et.al. Presented by Kerstin Obando.

Genome-wide prediction and characterization of interactions between transcription factors in S. cerevisiae Speaker: Chunhui Cai.

Genomic analysis of regulatory network dynamics reveals large topological changes Paper Study Speaker: Cai Chunhui Sep 21, 2004.

Regulatory networks 10/29/07. Definition of a module Module here has broader meanings than before. A functional module is a discrete entity whose function.

Microarrays and Cancer Segal et al. CS 466 Saurabh Sinha.

Integrated analysis of regulatory and metabolic networks reveals novel regulatory mechanisms in Saccharomyces cerevisiae Speaker: Zhu YANG 6 th step, 2006.

Gene Regulatory Networks - the Boolean Approach Andrey Zhdanov Based on the papers by Tatsuya Akutsu et al and others.

Data Mining Presentation Learning Patterns in the Dynamics of Biological Networks Chang hun You, Lawrence B. Holder, Diane J. Cook.

Biological networks Construction and Analysis. Recap Gene regulatory networks –Transcription Factors: special proteins that function as “keys” to the.

6. Gene Regulatory Networks

Introduction to molecular networks Sushmita Roy BMI/CS 576 Nov 6 th, 2014.

Network analysis and applications Sushmita Roy BMI/CS 576 Dec 2 nd, 2014.

Systems Biology, April 25 th 2007Thomas Skøt Jensen Technical University of Denmark Networks and Network Topology Thomas Skøt Jensen Center for Biological.

Comparative Expression Moran Yassour +=. Goal Build a multi-species gene-coexpression network Find functions of unknown genes Discover how the genes.

Efficient Algorithms for Detecting Signaling Pathways in Protein Interaction Networks Jacob Scott, Trey Ideker, Richard M. Karp, Roded Sharan RECOMB 2005.

Systematic Analysis of Interactome: A New Trend in Bioinformatics KOCSEA Technical Symposium 2010 Young-Rae Cho, Ph.D. Assistant Professor Department of.

Bayesian integration of biological prior knowledge into the reconstruction of gene regulatory networks Dirk Husmeier Adriano V. Werhli.

Large-scale organization of metabolic networks Jeong et al. CS 466 Saurabh Sinha.

Detecting robust time-delayed regulation in Mycobacterium tuberculosis Iti Chaturvedi and Jagath C Rajapakse INCOB 2009.

Genetic network inference: from co-expression clustering to reverse engineering Patrik D’haeseleer,Shoudan Liang and Roland Somogyi.

Gene Set Enrichment Analysis (GSEA)

A systems biology approach to the identification and analysis of transcriptional regulatory networks in osteocytes Angela K. Dean, Stephen E. Harris, Jianhua.

Genetic Regulatory Network Inference Russell Schwartz Department of Biological Sciences Carnegie Mellon University.

Networks and Interactions Boo Virk v1.0.

Yeast Lab!. What makes something living? Consider the following questions… How big/complex must something be? What must it be able to do? Where must it.

Clustering of protein networks: Graph theory and terminology Scale-free architecture Modularity Robustness Reading: Barabasi and Oltvai 2004, Milo et al.

ResponseNet revealing signaling and regulatory networks linking genetic and transcriptomic screening data CSE Fall.

Reconstructing gene networks Analysing the properties of gene networks Gene Networks Using gene expression data to reconstruct gene networks.

Reconstruction of Transcriptional Regulatory Networks

Analysis of the yeast transcriptional regulatory network.

Tutorial session 3 Network analysis Exploring PPI networks using Cytoscape EMBO Practical Course Session 8 Nadezhda Doncheva and Piet Molenaar.

Systems Biology ___ Toward System-level Understanding of Biological Systems Hou-Haifeng.

Module networks Sushmita Roy BMI/CS 576 Nov 18 th & 20th, 2014.

Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction.

IMPROVED RECONSTRUCTION OF IN SILICO GENE REGULATORY NETWORKS BY INTEGRATING KNOCKOUT AND PERTURBATION DATA Yip, K. Y., Alexander, R. P., Yan, K. K., &

While gene expression data is widely available describing mRNA levels in different cancer cells lines, the molecular regulatory mechanisms responsible.

Lecture 24: Quantitative Traits IV Date: 11/14/02  Sources of genetic variation additive dominance epistatic.

Introduction to biological molecular networks

Abdul-Rahman Elshafei – ID  Introduction  SLAT & iSTAT  Multiplet Scoring  Matching Passing Tests  Matching Complex Failures  Multiplet.

Discovering functional interaction patterns in Protein-Protein Interactions Networks   Authors: Mehmet E Turnalp Tolga Can Presented By: Sandeep Kumar.

 Signal Transduction transmits signals from outside to the inside of the cell  Integer Linear Programming model is used to unravel STN.

Biological Networks. Can a biologist fix a radio? Lazebnik, Cancer Cell, 2002.

Nonlinear differential equation model for quantification of transcriptional regulation applied to microarray data of Saccharomyces cerevisiae Vu, T. T.,

1 Lesson 12 Networks / Systems Biology. 2 Systems biology  Not only understanding components! 1.System structures: the network of gene interactions and.

Computational methods for inferring cellular networks II Stat 877 Apr 17 th, 2014 Sushmita Roy.

Network applications Sushmita Roy BMI/CS 576 Dec 9 th, 2014.

Network Motifs See some examples of motifs and their functionality Discuss a study that showed how a miRNA also can be integrated into motifs Today’s plan.

Algorithms and Computational Biology Lab, Department of Computer Science and & Information Engineering, National Taiwan University, Taiwan Network Biology.

Inferring Regulatory Networks from Gene Expression Data BMI/CS 776 Mark Craven April 2002.

Yiming Kang, Hien-haw Liow, Ezekiel Maier, & Michael Brent

Journal club Jun , Zhen.

1. SELECTION OF THE KEY GENE SET 2. BIOLOGICAL NETWORK SELECTION

Biological networks CS 5263 Bioinformatics.

1 Department of Engineering, 2 Department of Mathematics,

1 Department of Engineering, 2 Department of Mathematics,

1 Department of Engineering, 2 Department of Mathematics,

Anastasia Baryshnikova Cell Systems

Presentation transcript:

ResponseNet Esti Yeger-Lotem, Laura Riva, Linhui Julie Su, Aaron D Gitler, Anil G Cashikar, Oliver D King, Pavan K Auluck, Melissa L Geddie, Julie S Valastyan, David R Karger, Susan Lindquist & Ernest Fraenkel Bridging high-throughput genetic and transcriptional data reveals cellular responses to alpha-synuclein toxicity, Nature Genetics 2009 Presented by Shahar Zini June 2013

Introduction Gene expression profiling alone reveals very little about the inner workings of the cell and the mechanism involved. In the last semester we’ve seen many efforts to combine data from gene expression and protein-protein/protein-DNA knowledge in order to reveal more about the processes that occur within the cell. ResponseNet is a method which adds another high-throughput experimental data into the mix: genetic screening.

Genetic screening A method which involves alteration or deletion of an organism’s gene products and the examination of the phenotype of the resulting organism. Phenotype: the composite of an organism’s observable characteristics The process: Pick-up an healthy organism Delete/alter/overexpress a gene or a group of genes Test the resulting organism’s phenotype against a desirable characteristic (called “scoring”) For example: delete the MEP1 gene in a yeast cell and check for impaired growth.

Genetic screening This process can be used as a high-throughput procedure, meaning one could test alterations of 1000x of genes for numerous desired characteristics. The set of genes found to affect a process is called “genetic hit”. Can reveal an answer to some questions like: “What genes, if removed, will cause a yeast cell to die?” Just like gene expression: lots of data, yet, does not reveal a lot about the procedures the took place inside the cell to cause the observed phenotype.

Motivation Cells respond to stimuli by changes in various processes. Efforts to identify components of these responses using both methods (gene expression and genetic screening) yields different sets of genes. Genetic screens tend to identify response regulators. Gene expression tend to identify metabolic responses. Combining the data together may reveal something deeper about the processes of the cell. It is biologically plausible that some of the reported response regulators will be connected through regulatory pathways to differently expressed genes.

ResponseNet Identify molecular interaction paths connecting genetic hits and differentially expressed genes. Create a graph containing 2 types of nodes: genes and proteins. Genetic hits represented as their corresponding protein Fill the graph with (mostly bi-directional) edges connecting protein to another protein if they have a known interaction Fill the graph with directional edges connecting a protein node to a gene node if the protein is probable transcriptional regulator of the gene.

Genetic hit Differentially expressed gene Protein

The signaling and regulatory sub-network, by which stimulus-related proteins detected by genetic screens may lead to the measured transcriptomic response. Stimulus- related proteins stimulus- related genes intermediary proteins TFs

Min-cost flow Represent the problem as a flow problem. Each edge’s weight set to the negative log of it’s probability Try to pass flow from the genetic hits to the differently expressed genes using regulatory proteins. Strong flow => Highly regulated path => Important component in the process Rank nodes based on amount of flow to distinguish core component of the cellular response and peripherals.

Genetic hit Differentially expressed gene Protein Source Sink

Weighting scheme – protein-protein There are K types of evidence for a protein-protein interaction For example, “Two-hybrid” connection can be tested using direct experiment and using high-throughput experiment We trust the direct experiment more We are interested in finding regulatory pathways, and thus we would like to emphasis interactions between proteins that function together in a common response pathway.

Weighting scheme – protein-protein For each pair of proteins A and B we construct a vector X = (x 1,x 2 …,x k ) X i is 1  There is evidence of type i that A and B are connected w ab = P(a and b are connected | X) By bayes’ rule: = P(X | a and b are connected) * P(a and b are connected) / P(X) P(X) = P(X | a and b are connected) * P(a and b are connected) + P(X | a and b are not connected) * P(a and b are not connected) Assume independence between types of evidence: P(X | a and b are [not] connected) = P(X i | a and b are [not] connected)

Weighting scheme – protein-protein In order to calculate values using the equations derived we need to calculate the prior distributions P(X k | a and b are [not] connected) They used 3 sets of data: A set of response pathways A positive set containing all interacting protein pairs in a common response pathway based on reliable GO annotation (direct assays or expert knowledge) A negative set of interacting protein pairs known not to be in a common response pathway They capped the weight to 0.7 because some interactions received a score close to 1 (probably well studied proteins)

Weighting scheme – protein-gene They used data obtained using “Chip-chip” and “Chip-chip motif” methods. Those experiments are used to detect transcription factors for genes. The weighting scheme is irrelevant, but it was set so that the weight of each edge would reflect the interaction’s reliability.

Flow problem details For each genetic hit, the weight and capacity of the edge from the source to the node is set to: Strength is determined by results of the experiment For each edge connecting a differentially expressed gene to the sink, set the weight and capacity to: Strength is determined by fold-change or p-value of the experiment’s results All other capacities are set to 1 (and weight using the weighting scheme) Source Sink

Linear programming Min-cost flow optimization can be solved efficiently using linear programming The objective function used included a nice twist, so we would investigate it further: The first term is what you would expect from min-cost-flow problem Reduce the sum of cost*flow in all edges The second term is used to control the size of the input connected by flow

Smaller values will result in an empty flow (any flow will increase the cost) A value of 0 means you will only allow for perfect edges (weight=1) As the value increases, you give room to more and more imperfection, and allow for lower-probabilities edges to be added Adding more edges => better coverage Meaning, we can adjust the coverage of the solution and so the conflict between coverage and accuracy continues.

Results Validation of algorithm results on well-studied response pathways STE5 deletion DNA Damage Broader validation on less-studied response pathways Alpha-synuclein toxicity

STE5 deletion A protein that coordinates a MAPK cascade activated by pheromone Multi-tier pathway activated by the presence of pheromone Simple and well-studied response pathway Used a set of genetic hits associated with STE5 and differentially expressed genes from a yeast strain lacking STE5

DNA Damage There are many mechanisms in the cell which identifies and fixes damage created to the DNA. A much more complex pathway, but still highly-studied The input is very large: 91 genetic hits 9 differentially expressed genes (exposed to mutagen) Large output

Results – DNA Damage Contains 361 edges between 166 intermediate proteins Highly enriched for DNA damage stimulus (p< ) and DNA repair (p< )

MEC1, RAD53,RFC2,RFC3,RFC4 and RFC5 are essential genes And thus could not have been detected via genetic screening Some of highly ranked parts of the network are proteins that Have been uncovered by years of intense research. (MEC1, RAD53, RFC2,RFC3,RFC4,RFC5, and RFX1)

Broader experiments In order to evaluate the algorithm’s results better a broader evaluation took place The writers used data from 101 different perturbation, each involved the inactivation of a single gene In most cases, the response pathways are poorly studied The genetic hits contained the genetic interactors of the inactive gene The differentially expressed genes are based on microchip profiling of the inactivated strain

Broader experiments In order to validate the results, the identity of the inactivated gene was hidden from the algorithm The result of each test were considered successful if: They included the inactivated gene of the specific test and it was ranked significantly well They were significantly enriched for a specific biological process attributed to the specific gene Significance was tested by creating a randomized inputs for each test: 50 times the genetic hits inputs were randomized 50 times the differentially expressed genes were randomized A significant solution was one which was better (in ranking / enrichment level) than 95% of the randomized solutions

Broader experiments – results Finding the inactivated gene was successful in 25 of the tests In 8 of the tests the results contained the inactivated gene but it was not scored high enough Finding enriched results where successful in 28 of the tests The success rate for cases based on incomplete genetic hits was 40% and under full genetic screens it was 47% => An indication that the algorithm works well even under very limited data sets A typical result covered only 1% of the yeast genes We can infer that both criteria were important Overall: 41% success (considered high because of the low quality data)

Alpha-synuclein toxicity A protein known to be a key component to Parkinson disease. Despite intense research, the cellular pathways leading to cell death are just beginning to get uncovered. Genetic screen experiment detected 77 proteins No significant overlap between genetic hits and transcriptional profiling results were found, meaning, we have lots of data, but no of to explain it Applying ResponseNet revealed a sub-network connecting 34 genetic hits and 166 differentially expressed genes shading light on the inner processes effecting alpha-synyclein. Some of the conclusions were tested in the lab, and confirmed. Real contribution to the understanding of the Parkinson disease!

Freedom of data The authors complained that many experiments are taking place world-wide but only a few release source codes and complete data sets on the internet So, they decided to open their algorithm and method to the world by creating a web page that anyone can upload data into and receive results in minutes One can use yeast data (as used in the article) or can upload it’s own data representing the protein-protein and protein-gene relations in a different organism

Any Questions?

Questions What is the motivation to connect genetic hits and differentially expressed genes? Why do the authors capped the weight of each edge? What would happen they would have skipped this stage? What would happen if we set the gamma parameter to value which is too high? Please send answers to my mail (shshzi at gmail)