“An Extension of Weighted Gene Co-Expression Network Analysis to Include Signed Interactions” Michael Mason Department of Statistics, UCLA.



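Contents
Here we consider the application of a generalized WGCNA that keeps track of the sign of the co-expression information. Standard unsigned networks are based on the absolute value of the correlation, $|\mathrm{cor}(x_i, x_j)|$. Here we focus on signed networks, which preserve the sign of $\mathrm{cor}(x_i, x_j)$.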

General Framework of Network Construction
Step 1: Define a Gene Co-expression Similarity
Step 2: Define a Family of Adjacency Functions
Step 3: Determine the AF Parameters
Step 4: Define a Measure of Node Dissimilarity
Step 5: Identify Network Modules (Clustering)
Step 6: Find Biologically Interesting Modules
Step 7: Find Key Genes in Interesting Modules

Adjacency Functions: Hard and Soft Thresholding
A network can be represented by an adjacency matrix, $A = [a_{ij}]$, that encodes how a pair of nodes is connected.
– $A$ is a symmetric matrix with entries in [0,1].
– For unweighted networks, hard thresholding is applied to the similarity matrix $S = [s_{ij}]$ to yield $A$: if $s_{ij} > \tau$ then $a_{ij} = 1$, else $a_{ij} = 0$.
– For weighted networks, soft thresholding is applied: $a_{ij} = s_{ij}^{\beta}$, with $0 \le a_{ij} \le 1$.
– Both types of adjacency functions can be applied to unsigned and signed co-expression similarity measures.
In this analysis we employ soft thresholding.
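The two adjacency functions can be sketched in a few lines of plain Python. This is only an illustration of the definitions on the slide; the function names and the small similarity matrix are made up for the example, and the values of tau and beta are arbitrary.

```python
def hard_threshold(similarity, tau):
    """Unweighted adjacency: 1 if the similarity exceeds tau, else 0."""
    return [[1 if s_ij > tau else 0 for s_ij in row] for row in similarity]


def soft_threshold(similarity, beta):
    """Weighted adjacency: raise each similarity in [0, 1] to the power beta."""
    return [[s_ij ** beta for s_ij in row] for row in similarity]


# Illustrative (made-up) symmetric similarity matrix with entries in [0, 1].
similarity = [
    [1.0, 0.9, 0.5],
    [0.9, 1.0, 0.2],
    [0.5, 0.2, 1.0],
]

unweighted = hard_threshold(similarity, tau=0.7)
weighted = soft_threshold(similarity, beta=6)
```

Hard thresholding discards how strong a retained connection is, while soft thresholding keeps a continuous weight and merely pushes weak similarities toward zero.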

Defining a co-expression similarity measure that keeps track of the sign
Unsigned networks are based on the absolute value of the correlation, $|\mathrm{cor}(x_i, x_j)|$. Signed networks preserve the sign information from the correlation $\mathrm{cor}(x_i, x_j)$.
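The slide's formulas did not survive extraction. As a hedged sketch, the code below uses the definitions commonly given for WGCNA: unsigned similarity |cor(x_i, x_j)| and signed similarity (1 + cor(x_i, x_j)) / 2. The signed form is an assumption here, not taken from the transcript, and the function names and expression profiles are invented for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def unsigned_similarity(x, y):
    """Absolute correlation: the sign is discarded."""
    return abs(pearson(x, y))


def signed_similarity(x, y):
    """Assumed signed form: maps cor = -1 to 0 and cor = +1 to 1."""
    return (1 + pearson(x, y)) / 2


# Made-up profiles: gene_b is perfectly anti-correlated with gene_a.
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [8.0, 6.0, 4.0, 2.0]

s_unsigned = unsigned_similarity(gene_a, gene_b)
s_signed = signed_similarity(gene_a, gene_b)
```

Under these assumed definitions, a strongly negative correlation gives a high unsigned similarity but a signed similarity near zero, which is the distinction the slide draws.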

Generalized Connectivity
A gene’s connectivity (also known as degree) equals the row sum of the adjacency matrix. Intuitively, for unweighted networks this is the number of direct neighbors a gene has. For our signed networks, the connectivity of the i-th gene measures the extent of positive correlations with the other genes in the network.
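A minimal sketch of the row-sum definition follows. Excluding the diagonal entry (a gene's connection to itself) is a convention assumed here; the slide does not say how the diagonal is handled. The adjacency values are invented.

```python
def connectivity(adjacency):
    """Row sums of the adjacency matrix, skipping the diagonal (assumed convention)."""
    return [
        sum(a_ij for j, a_ij in enumerate(row) if j != i)
        for i, row in enumerate(adjacency)
    ]


# Made-up weighted adjacency matrix for three genes.
adjacency = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.3],
    [0.1, 0.3, 1.0],
]

k = connectivity(adjacency)
```

For a 0/1 adjacency matrix the same function simply counts each gene's direct neighbors.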

For high values of the power β, signed weighted networks exhibit approximate scale-free topology
Scale-free topology refers to the frequency distribution of the connectivity $k$: $p(k) \sim k^{-\lambda}$, where $p(k)$ is the proportion of nodes that have connectivity $k$.

How to check Scale Free Topology?
Idea: log-transform $p(k)$ and $k$ and look at scatter plots. The $R^2$ index of a linear model fit can be used to quantify scale-free topology. In our cancer and mouse embryonic stem cell applications, we find $R^2 = 0.97$ and $0.94$ for $\beta = 12$ and $22$, respectively.
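The check can be sketched as a least-squares fit of log10 p(k) on log10 k. The binning scheme below (equal-width bins, mean connectivity per bin) is an assumption made for the example and may differ from what the authors used. All names are hypothetical, and all connectivities are assumed to be positive.

```python
import math


def connectivity_distribution(connectivities, n_bins=10):
    """Bin connectivities; return (mean k per non-empty bin, proportion of nodes per bin)."""
    lo, hi = min(connectivities), max(connectivities)
    width = (hi - lo) / n_bins or 1.0  # guard against all-identical values
    counts = [0] * n_bins
    totals = [0.0] * n_bins
    for k in connectivities:
        b = min(int((k - lo) / width), n_bins - 1)
        counts[b] += 1
        totals[b] += k
    n = len(connectivities)
    ks = [t / c for t, c in zip(totals, counts) if c > 0]
    pks = [c / n for c in counts if c > 0]
    return ks, pks


def log_log_fit(ks, pks):
    """Fit log10 p(k) = intercept + slope * log10 k; return (slope, R^2)."""
    xs = [math.log10(k) for k in ks]
    ys = [math.log10(p) for p in pks]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx, sxy ** 2 / (sxx * syy)


# Exact power law p(k) = k^(-2): the fit should recover slope -2 with R^2 of 1.
example_ks = [1.0, 2.0, 4.0, 8.0]
example_pks = [k ** -2 for k in example_ks]
slope, r_squared = log_log_fit(example_ks, example_pks)
```

In practice one would pass the output of `connectivity_distribution` to `log_log_fit`; the estimated slope corresponds to -λ in the power law.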

The scale-free topology criterion for choosing the parameter values of an adjacency function
A) Consider only those parameter values that result in approximate scale-free topology.
B) Select the parameters that result in the highest mean number of connections.
Criterion A is motivated by the finding that most biological networks (including gene co-expression networks, protein-protein interaction networks and cellular networks) have been found to exhibit a scale-free topology. Criterion B leads to high power for detecting modules (clusters of genes) and hub genes.
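One way to combine the two criteria is to filter candidate powers by their scale-free fit and then take the one with the highest mean connectivity. The sketch below assumes an R² cutoff of 0.8, which is not stated on the slide, and the candidate table contains made-up numbers purely for illustration.

```python
def choose_power(candidates, r2_cutoff=0.8):
    """Criterion A: keep powers with R^2 >= cutoff. Criterion B: maximize mean connectivity."""
    admissible = [c for c in candidates if c["r_squared"] >= r2_cutoff]
    if not admissible:
        return None  # no power reaches approximate scale-free topology
    return max(admissible, key=lambda c: c["mean_connectivity"])["beta"]


# Hypothetical values, not results from the talk.
candidates = [
    {"beta": 4, "r_squared": 0.55, "mean_connectivity": 120.0},
    {"beta": 8, "r_squared": 0.82, "mean_connectivity": 35.0},
    {"beta": 12, "r_squared": 0.95, "mean_connectivity": 9.0},
]

chosen_beta = choose_power(candidates)
```

Because mean connectivity typically falls as β rises, this rule tends to select the smallest power that passes criterion A, which reflects the trade-off between the two criteria.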

Trade-off between criterion A and criterion B when varying the power β in the signed cancer network

Trade-off between criterion A and criterion B when varying the power β in the signed mouse embryonic stem cell network

How to measure distance in a network?
Biological answer: look at shared neighbors with the topological overlap matrix (TOM).
– Intuition: if two people share the same friends, they are close in a social network.
– In an unsigned network, negatively correlated genes are treated as friends, while in the signed network they are treated as enemies.
– Two genes have high topological overlap if they share (positively correlated) friends.

Topological overlap leads to a network distance measure (Ravasz et al. 2002)
Generalized in Zhang and Horvath (2005) to the case of weighted networks.
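The formula shown on this slide was lost in extraction. The sketch below uses the weighted form as it is usually stated for Zhang and Horvath (2005): TOM_ij = (l_ij + a_ij) / (min(k_i, k_j) + 1 - a_ij), with l_ij = Σ_u a_iu a_uj and k_i = Σ_u a_iu, the diagonal of the adjacency treated as zero, and TOM_ii = 1. Treat it as an assumption about what the slide showed; the function names and the example matrix are hypothetical.

```python
def topological_overlap(adjacency):
    """Weighted topological overlap matrix from a symmetric adjacency with entries in [0, 1]."""
    n = len(adjacency)
    # Work on a copy with a zero diagonal so self-connections do not count.
    a = [[0.0 if i == j else adjacency[i][j] for j in range(n)] for i in range(n)]
    k = [sum(row) for row in a]
    tom = [[1.0] * n for _ in range(n)]  # diagonal stays at 1
    for i in range(n):
        for j in range(n):
            if i != j:
                shared = sum(a[i][u] * a[u][j] for u in range(n))
                tom[i][j] = (shared + a[i][j]) / (min(k[i], k[j]) + 1 - a[i][j])
    return tom


def tom_dissimilarity(adjacency):
    """Network distance: one minus the topological overlap."""
    return [[1 - w for w in row] for row in topological_overlap(adjacency)]


# Made-up network in which every pair of genes has adjacency 0.5.
adjacency = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]

tom = topological_overlap(adjacency)
dissimilarity = tom_dissimilarity(adjacency)
```

The dissimilarity matrix is what a clustering step would take as input when searching for modules.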

Simple TOM example
In this simple example, $\mathrm{TOM}_{1,2}$ reduces to $a$. If $\mathrm{cor}(x_1, x_u) = \mathrm{cor}(x_u, x_2) = -1$, then in an unsigned network $\mathrm{TOM}_{1,2} = 1$, while in a signed network $\mathrm{TOM}_{1,2} = 0$.

Application: comparing Signed to Unsigned Networks using brain cancer data
Data described in Horvath S, Zhang B, Carlson M, Lu KV, Zhu S, Felciano RM, Laurance MF, Zhao W, Shu Q, Lee Y, Scheck AC, Liau LM, Wu H, Geschwind DH, Febbo PG, Kornblum HI, Cloughesy TF, Nelson SF, Mischel PS (2006) "Analysis of Oncogenic Signaling Networks in Glioblastoma Identifies ASPM as a Novel Molecular Target", PNAS, November 14, 2006, vol. 103, no. 46.

Preservation of Modules between Unsigned and Signed Methods in Brain Cancer
[Figure panels: Unsigned Network, Signed Network]
Message: no difference between signed and unsigned analysis.

Analysis of Networks in Mouse ESC data described in Ivanova et al.

Preservation of Large Modules between Unsigned and Signed Methods in Mouse Embryonic Stem Cells
The signed network exhibits 4 additional modules.
[Figure panels: Unsigned Network, Signed Network]

Gene Significance Definition
Differential gene expression test between control and knockout:
– Control: mouse microarray samples treated with empty virus
– Knockout: microarray samples treated with an Oct4 RNAi (Oct4 is of major biological importance in ES pluripotency)
Individual gene significance = t-test statistic
– Note that the t-test keeps track of the sign.
Goal: to relate gene significance to intramodular connectivity.
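A signed gene significance of this kind can be sketched as a two-sample t statistic. The slide does not say which t-test variant or which sign convention was used, so the pooled-variance (Student) form and the "knockout minus control" direction below are assumptions, and the expression values are invented.

```python
import math


def t_statistic(control, knockout):
    """Pooled-variance two-sample t statistic, signed as knockout minus control (assumed)."""
    n1, n2 = len(control), len(knockout)
    mean1, mean2 = sum(control) / n1, sum(knockout) / n2
    ss1 = sum((x - mean1) ** 2 for x in control)
    ss2 = sum((x - mean2) ** 2 for x in knockout)
    pooled_var = (ss1 + ss2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean2 - mean1) / standard_error


# Made-up expression values for one gene: lower after knockdown.
control_samples = [5.0, 5.2, 4.8]
knockout_samples = [3.0, 3.1, 2.9]

gene_significance = t_statistic(control_samples, knockout_samples)
```

With this convention a negative value means the gene is expressed at a lower level in the knockout samples; taking the absolute value would discard that direction.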

Absolute Mean Significance Increases Once New Modules are Found via Signed WGCNA
[Figure panels: Unsigned, Signed]
Message: signed networks allowed us to split large modules into smaller, biologically more significant modules.

Behind the Scenes: Brown Module is Hidden within Turquoise
[Figure panels: Unsigned, Signed]

Signed WGCNA shows influence of known pluripotency transcription factors
Once the transcription factors (TFs) are separated into their own module, both their connectivity and their relative gene significance increase.

Brown Module
The brown module shows that Oct4 is a highly connected hub and that it is highly significant in this module. This module could not have been detected in an unsigned network. Note that the signed intramodular connectivity is a biologically important screening variable. The biological importance of the module is verified by 2-fold enrichment of Oct4 and Nanog binding.
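Intramodular connectivity is usually computed as a gene's row sum restricted to the genes of its own module. The sketch below follows that usual definition, which is assumed rather than quoted from the slides. The adjacency values, gene order and labels are invented, with only the module colour names borrowed from the talk.

```python
def intramodular_connectivity(adjacency, module_labels, module):
    """Sum each member gene's adjacencies to the other genes in the same module."""
    members = [i for i, label in enumerate(module_labels) if label == module]
    return {
        i: sum(adjacency[i][j] for j in members if j != i)
        for i in members
    }


# Made-up signed adjacency matrix for four genes and their module assignments.
adjacency = [
    [1.0, 0.8, 0.1, 0.6],
    [0.8, 1.0, 0.2, 0.4],
    [0.1, 0.2, 1.0, 0.3],
    [0.6, 0.4, 0.3, 1.0],
]
module_labels = ["brown", "brown", "turquoise", "brown"]

k_in = intramodular_connectivity(adjacency, module_labels, "brown")
hub_gene = max(k_in, key=k_in.get)  # index of the most connected gene in the module
```

Ranking genes by this quantity within a module is one way to screen for hub genes, which can then be compared against gene significance.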

Conclusion
Signed weighted gene co-expression network analysis is a robust extension of unsigned WGCNA, preserving large modules while finding new and biologically interesting modules, thus facilitating a systems-level understanding of gene and/or protein interactions.

Acknowledgement
Biostatistics/Bioinformatics: Steve Horvath, Qing Zhou, Peter Langfelder