A SENSITIVITY ANALYSIS OF A BIOLOGICAL MODULE DISCOVERY PIPELINE James Long International Arctic Research Center University of Alaska Fairbanks March 25, 2015

A SENSITIVITY ANALYSIS OF A BIOLOGICAL MODULE DISCOVERY PIPELINE James Long International Arctic Research Center University of Alaska Fairbanks March 25, 2015

Gene Expression Synthetic Gene Expression Data CODENSE Regionalized Sensitivity Analysis Results A Sensitivity Analysis of a Biological Module Discovery Pipeline

Gene Expression
Hill Function
Generalized Hill Function: one form for activators, one form for repressors
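The Hill function, in one standard convention (maximal expression B, activation coefficient K, Hill coefficient n; this notation is an assumption, chosen to match the B, K, n parameters that appear in the generated rate equations):

```latex
% Hill function, activator X acting on a gene:
f_{\mathrm{act}}(X) \;=\; B\,\frac{(X/K)^{n}}{1 + (X/K)^{n}}

% Hill function, repressor X:
f_{\mathrm{rep}}(X) \;=\; \frac{B}{1 + (X/K)^{n}}

% Generalized Hill function: each regulator X_i contributes an
% activator or repressor factor of the above form, combined into a
% single production rate for the target gene.
```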

Gene Expression Synthetic Gene Expression Data CODENSE Regionalized Sensitivity Analysis Results A Sensitivity Analysis of a Biological Module Discovery Pipeline

Synthetic Gene Expression Data
NEMO – Network Motif Language
COPASI – Complex Pathway Simulator

NEMO
Each gene lists its regulators in parentheses, with + for activation and - for repression: G0(P1+,P2+,P3-,P4-,P5+)
A network is written as a bracketed gene list: [ GLIST(G0(P1+,P2+,P3-,P4-,P5+), G1(…), G2(…), etc.) ]
(diagram: six genes, G0–G5)

NEMO: Dense Overlapping Regulon (DOR)
DOR(G3(P0+,P1+,P2-), G4(P0+,P1+), G5(P1-,P2+))
(diagram: genes G0–G5)

NEMO: Negative auto-regulation
G0(P0-)

NEMO: Feed-forward loop (FFL)
P0(+G1+G2+)
(diagram: genes G0, G1, G2)

NEMO: Multi-output FFL
TMLIST(P0(+G1+(G2,G3,G4)+))
(diagram: genes G0–G4)

NEMO: Single-input module (SIM)
P0(+G1,G2,G3)
(diagram: genes G0–G3)

NEMO: a complete network (diagram: genes G0–G10)
[ GLIST(G0(P5-)), TMLIST(P0(+G1-G2-), P1(+G3,G4,G5,G6)), DOR(G7(P3+,P4-,P5+), G8(P4+), G9(P3+,P5+,P6+), G10(P5- :F(power(sin(P5),2)))) ]
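A gene statement like those above can be parsed mechanically; a minimal sketch (illustrative Python; the function name is mine, and the full NEMO grammar with GLIST/TMLIST/DOR nesting and rate-law annotations like :F(...) is richer than this):

```python
import re

def parse_gene_stmt(stmt):
    """Parse a NEMO-style gene statement like 'G0(P1+,P2-)' into
    (gene, [(regulator, sign), ...]).  Illustrative only: handles a
    bare gene statement, not the surrounding GLIST/TMLIST/DOR forms."""
    m = re.fullmatch(r'\s*(G\d+)\(([^)]*)\)\s*', stmt)
    if not m:
        raise ValueError(f"not a gene statement: {stmt!r}")
    gene, body = m.groups()
    regs = []
    for tok in body.split(','):
        tok = tok.strip()
        if tok:
            regs.append((tok[:-1], tok[-1]))  # e.g. ('P1', '+')
    return gene, regs
```

For example, parsing "G0(P1+,P2+,P3-,P4-,P5+)" yields gene "G0" with activators P1, P2, P5 and repressors P3, P4.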

NEMO
NEMO compiler emits SBML – Systems Biology Markup Language
– Uses libsbml
More accurate to call it a “language translator”, that adds random generalized Hill functions!
Bioinformatics 2008 Jan 1;24(1):132-4

NEMO example: two mutually activating genes
[ GLIST(G0(P1+), G1(P0+)) ]

NEMO-generated rate equations (SBML MathML, shown here as formulas):
dP0/dt = ( B_0 · (P1/K_0)^n_0 / (1 + (P1/K_0)^n_0) − dc_0 · P0 ) / tau
dP1/dt = ( B_1 · (P0/K_1)^n_1 / (1 + (P0/K_1)^n_1) − dc_1 · P1 ) / tau

Synthetic Gene Expression Data NEMO – Network Motif Language COPASI – Complex Pathway Simulator

COPASI – Complex Pathway Simulator
Also outputs a table of data for each time step, the last column of which is our synthetic data!
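A minimal sketch of how such a time course yields synthetic data (illustrative Python using forward-Euler integration of the two-gene mutual-activation network; the parameter values for B, K, n, dc, tau are made up for illustration, and COPASI integrates the actual SBML with proper solvers):

```python
def hill(x, K, n):
    """Activator Hill term (x/K)^n / (1 + (x/K)^n)."""
    return (x / K) ** n / (1.0 + (x / K) ** n)

def simulate(steps=200, dt=0.05):
    # Hypothetical parameters for [GLIST(G0(P1+), G1(P0+))]:
    B0 = B1 = 4.0          # maximal expression
    K0 = K1 = 1.0          # activation coefficients
    n0 = n1 = 2.0          # Hill coefficients
    dc0 = dc1 = 1.0        # degradation coefficients
    tau = 1.0              # time constant
    P0, P1 = 1.0, 1.5      # initial protein levels
    table = []             # one row per time step, like COPASI's report
    for i in range(steps):
        dP0 = (B0 * hill(P1, K0, n0) - dc0 * P0) / tau
        dP1 = (B1 * hill(P0, K1, n1) - dc1 * P1) / tau
        P0 += dt * dP0
        P1 += dt * dP1
        table.append((round(i * dt, 3), P0, P1))
    return table

rows = simulate()
synthetic = [row[-1] for row in rows]   # last column = our synthetic data
```

With these (invented) parameters the pair switches each other on and settles near a high steady state, so the last column gives a plausible "module on" expression trace.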

Gene Expression Synthetic Gene Expression Data CODENSE Regionalized Sensitivity Analysis Results A Sensitivity Analysis of a Biological Module Discovery Pipeline

The CODENSE algorithm
Input – a series of expression correlation graphs, each representing a different state for an organism.
Output – groups of genes (modules) whose expression is correlated across the series of expression correlation graphs.

Expression Correlation
Pearson’s Correlation – linear dependence between two variables
Pearson’s Correlation with Z-score – number of standard deviations above the mean
Mutual Information – a measure of mutual dependence between variables; non-linear dependence OK
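Two of the measures above can be sketched in a few lines (illustrative Python, not the pipeline's implementation; the histogram MI estimator and its bin count are assumptions):

```python
import math
from itertools import product

def pearson(x, y):
    """Pearson's correlation: linear dependence between two variables."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def mutual_information(x, y, bins=4):
    """Histogram estimate of MI in bits; captures non-linear dependence."""
    def binned(v):
        lo, hi = min(v), max(v)
        w = (hi - lo) / bins or 1.0
        return [min(int((a - lo) / w), bins - 1) for a in v]
    bx, by = binned(x), binned(y)
    n = len(x)
    mi = 0.0
    for i, j in product(range(bins), repeat=2):
        pxy = sum(1 for a, b in zip(bx, by) if a == i and b == j) / n
        px = bx.count(i) / n
        py = by.count(j) / n
        if pxy > 0:
            mi += pxy * math.log2(pxy / (px * py))
    return mi
```

A symmetric non-linear relationship (e.g. y = (x − mean)²) has near-zero Pearson correlation but clearly positive mutual information, which is why MI is the fallback measure.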

The CODENSE algorithm

Manuscript in preparation

ODES – Overlapping Dense Subgraph Algorithm

Graph terminology (illustrated on the slides): vertex, edge, connected graph, cut vertex – a vertex whose removal leaves the graph disconnected.

ODES
Density of a graph: number of actual edges / number of possible edges
Number of possible edges for n vertices: n(n−1)/2, so density = 2E / (n(n−1))
Degree of a vertex: number of edges incident to the vertex
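In code, these two definitions (illustrative Python, not part of ODES):

```python
def density(n_vertices, n_edges):
    """Actual over possible edges in an undirected graph: 2E / (n(n-1))."""
    return 2.0 * n_edges / (n_vertices * (n_vertices - 1))

def average_degree(n_vertices, n_edges):
    """Each edge contributes to two vertex degrees, so 2E / n."""
    return 2.0 * n_edges / n_vertices
```

For the 8-vertex, 22-edge example that follows: average degree 44/8 = 5.5, density 44/56 ≈ 0.78.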

ODES Theorem
A connected graph G, with density d and number of vertices n, has at least one non-cut vertex v whose degree is at most the average degree of vertices in G. Removal of v from G does not decrease the density of G.
Bioinformatics (2010) 26 (21):

8 vertices, 22 edges, average degree = 44/8 = 5.5, density = 2·22/(8(8−1)) ≈ 0.78
Peeling one vertex at a time:
7 vertices, 17 edges, average degree ≈ 4.86, density ≈ 0.81
6 vertices, 13 edges, average degree ≈ 4.33, density ≈ 0.87
5 vertices, 9 edges, average degree = 3.6, density = 0.9
4 vertices, 6 edges, average degree = 3.0, density = 1.0
3 vertices, 3 edges, average degree = 2.0, density = 1.0
2 vertices, 1 edge, average degree = 1.0, density = 1.0
Then back out:
3 vertices, 3 edges, average degree = 2.0, density = 1.0
4 vertices, 6 edges, average degree = 3.0, density = 1.0
5 vertices, 9 edges, average degree = 3.6, density = 0.9
6 vertices, 13 edges, average degree ≈ 4.33, density ≈ 0.87
7 vertices, 17 edges, average degree ≈ 4.86, density ≈ 0.81
8 vertices, 22 edges, average degree = 5.5, density ≈ 0.78
9 vertices, 24 edges, average degree ≈ 5.3, density ≈ 0.67
8 vertices, 22 edges, average degree = 5.5, density ≈ 0.78
Note: brute-force search is confined to actual dense subgraphs.
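The peeling behaviour above can be sketched as follows (illustrative Python in the spirit of the theorem; the adjacency-dict representation, the choice among eligible vertices, and the stopping rule are mine, not the ODES implementation):

```python
def is_connected(adj, removed):
    """Depth-first check that the graph minus `removed` is connected."""
    verts = [v for v in adj if v not in removed]
    if not verts:
        return True
    seen, stack = {verts[0]}, [verts[0]]
    while stack:
        v = stack.pop()
        for w in adj[v]:
            if w not in removed and w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(verts)

def peel(adj):
    """Repeatedly remove a non-cut vertex of degree <= average degree;
    by the theorem, density never decreases.  Returns (n, edges, density)
    snapshots before each removal."""
    removed, results = set(), []
    while True:
        verts = [v for v in adj if v not in removed]
        n = len(verts)
        if n < 3:
            break
        edges = sum(1 for v in verts for w in adj[v]
                    if w not in removed and v < w)
        results.append((n, edges, 2.0 * edges / (n * (n - 1))))
        avg = 2.0 * edges / n
        def deg(v):
            return sum(1 for w in adj[v] if w not in removed)
        for v in sorted(verts, key=deg):          # try low-degree vertices first
            if deg(v) <= avg and is_connected(adj, removed | {v}):
                removed.add(v)
                break
        else:
            break
    return results
```

Running it on, say, a 5-clique with one pendant vertex shows the same monotone behaviour as the slides: each removal leaves the density the same or higher.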

Gene Expression Synthetic Gene Expression Data CODENSE Regionalized Sensitivity Analysis Results A Sensitivity Analysis of a Biological Module Discovery Pipeline

Regionalized Sensitivity Analysis
Monte Carlo model runs
Evaluate one or more binary Objective Functions
– Are only exact known modules returned?
– Exact modules returned w/ limited false positives?
– Approximate modules returned w/ limited false positives?
– Half of known modules returned approximately w/ limited false positives?
Increment parameter bins based on Objective Function conformance or non-conformance
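A minimal sketch of the RSA bookkeeping (illustrative Python; the model, objective function, bin count, and uniform sampling are stand-ins, not the pipeline's):

```python
import random

def rsa(model, objective, param_ranges, runs=1000, bins=10, seed=0):
    """Sample parameters, label each run by a binary objective function,
    and bin every parameter separately for conforming ('pass') and
    non-conforming ('fail') runs."""
    rng = random.Random(seed)
    hist = {p: {"pass": [0] * bins, "fail": [0] * bins}
            for p in param_ranges}
    for _ in range(runs):
        params = {p: rng.uniform(lo, hi)
                  for p, (lo, hi) in param_ranges.items()}
        outcome = "pass" if objective(model(params)) else "fail"
        for p, (lo, hi) in param_ranges.items():
            b = min(int((params[p] - lo) / (hi - lo) * bins), bins - 1)
            hist[p][outcome][b] += 1
    return hist
```

A sensitive parameter shows clearly different pass and fail bin distributions (compared, for example, with a Kolmogorov–Smirnov statistic); an insensitive one shows similar distributions in both.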

Regionalized Sensitivity Analysis
Compile the NEMO representation of the canonical network
Import into COPASI
Set up model, and save in COPASI format
Create template from COPASI format file where all genes are turned off (B = 0)
Create synthetic data by turning some genes and SIMs on, taking last column of COPASI output
Add noise to output

Gene Expression Synthetic Gene Expression Data CODENSE Regionalized Sensitivity Analysis Results A Sensitivity Analysis of a Biological Module Discovery Pipeline

Results, One Module
First RSA runs used a static transcription network
For Objective Function 1, no noise added:
– Highly sensitive to PC cutoff score for PC only correlation
– to 1.0
– Very low conformance/non-conformance ratio
– For Z-scores, sensitivity to PC cutoff score greatly attenuated
– Lower PC scores allow more false positives to enter pipeline
– Significant sensitivities to similarity cutoff score and minimum density for a dense subgraph in coherent dense subgraphs
– Still a very low conformance/non-conformance ratio

Results, One Module
First RSA runs used a static transcription network
For other Objective Functions:
– Conformance/non-conformance ratio rises to ~ 50%
– Only sensitive to the parameter that declares the fraction of data sets that an edge must exist in to be included in the summary graph
Observations:
– No good at finding the exact module
– No noise is unrealistic
– Static network parameters are unrealistic

Results, One Module
RSA with noise and dynamic network parameters
– Normally distributed noise, mu=0, sigma=10% of datum
– Still low conformance/non-conformance ratios for PC only, ~ 10%
– Jumps to 30-45% with Z-score methods
– Sensitivities mostly disappear in the presence of noise
– Sensitive to how fast module genes reach equilibrium
– Sensitive to percentage of time a module is ‘on’ in the data

Results, One Module
RSA with noise and dynamic network parameters – different amounts of noise, Z-score methods
– sigma = 20% of datum to be modified: conformance/non-conformance drops to ~ 25-35%
– sigma = 25% of datum to be modified: conformance/non-conformance drops to ~ 15-25%
– sigma = 33% of datum to be modified: conformance/non-conformance drops to ~ 5-10%
– For 25% & 33% cases, sensitivities appear to non-module maximum expression coefficient distribution mu and sigma

Results, Relaxed Module
RSA with noise and dynamic network parameters
– Pipeline parameters fixed, MI calculation performed
– Module parameters picked from a wider distribution
– Sensitive to rate at which module parameters reach equilibrium, and percentage of time module is ‘on’ in the data
– Conformance/non-conformance ratio is ~ 35-50%

Results, Relaxed Module
MI calculation
– Invoked only when PC + Z-score fails to infer an edge
– Invoked only if expression levels are comparable (hypothesis: module members are expressed in quantities that are not vastly different from each other)
– Invoked only if expression levels are not small (at levels considered ‘noise’)
– Typically increases conformance/non-conformance ratios by ~ 0-5%

Results, Two Modules
Most realistic case yet
With PC, Z-score, and noise, pipeline is sensitive to rate at which module genes reach equilibrium, to the Z-score cutoff value, and to percentage of time module is ‘on’ in the data
– Conforming/non-conforming ratio is ~ 25-40%
With the addition of an MI calculation, the only sensitivity is percentage of time module is ‘on’ in the data
– Conforming/non-conforming ratio is ~ 30-45%

Results, Two Modules
With only a sensitivity to the percentage of time the module is ‘on’ in the data:
– Pipeline is robust in the face of network variation
– Can start the pipeline ‘support’ parameter at a high value, and turn it down until modules are detected!
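The suggested tuning strategy is a short loop; `run_pipeline`, the start/stop values, and the step size are hypothetical stand-ins for the real pipeline and its 'support' parameter range.

```python
def detect_modules_with_decreasing_support(run_pipeline,
                                           start=0.9, stop=0.3, step=0.1):
    """Start the 'support' parameter high and turn it down until the
    pipeline reports at least one module.  `run_pipeline` takes a
    support value and returns a list of detected modules."""
    support = start
    while support >= stop:
        modules = run_pipeline(support)
        if modules:
            return support, modules
        support = round(support - step, 10)  # avoid float drift
    return None, []

# Toy pipeline that only finds a module once support drops to 0.6.
fake = lambda s: (['module A'] if s <= 0.6 else [])
```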

Conclusions

Conclusions
NEMO – Network Motif language developed
– Language translator from a qualitative transcription network description to a quantitative SBML model
– Used as input to the COPASI biochemical simulator to generate synthetic gene expression data
– Microarray data
– NGS data
– Bioinformatics (2008) 24 (1):
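NEMO's own syntax is not reproduced here. As a stand-in, this is the kind of ODE the qualitative-to-SBML translation ultimately hands to COPASI for a single activated gene, integrated here with forward Euler; every rate constant in the sketch is made up.

```python
def simulate_activation(k_syn=2.0, k_deg=0.5, K=1.0, n=2,
                        tf=4.0, dt=0.01, steps=1000):
    """Forward-Euler integration of one activated gene:
        d[G]/dt = k_syn * TF^n / (K^n + TF^n) - k_deg * [G]
    A minimal stand-in for the Hill-kinetics ODEs a NEMO->SBML
    translation would produce; constants are illustrative only."""
    g = 0.0
    for _ in range(steps):
        hill = tf ** n / (K ** n + tf ** n)  # activation term
        g += dt * (k_syn * hill - k_deg * g)
    return g
```

With the transcription factor 'on', expression approaches the steady state k_syn·hill/k_deg; with it 'off' (tf = 0), expression stays at zero, which is what makes the 'percentage of time the module is on' matter in the synthetic data.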

Conclusions
ODES – Overlapping Dense Subgraph Algorithm
– In the class of exact exponential-time algorithms
– Confines brute-force search domain to actual dense subgraphs
– Bioinformatics (2010) 26 (21):
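A toy version of the problem ODES attacks: enumerating dense k-subgraphs by brute force. The unrestricted enumeration below is exponential in the number of vertices, which is exactly why confining the search to regions that are actually dense pays off. This is an illustration of the problem, not the ODES algorithm itself.

```python
from itertools import combinations

def density(nodes, edges):
    """Edge density of the subgraph induced by `nodes`."""
    nodes = set(nodes)
    if len(nodes) < 2:
        return 0.0
    inside = sum(1 for u, v in edges if u in nodes and v in nodes)
    return inside / (len(nodes) * (len(nodes) - 1) / 2)

def dense_subgraphs(vertices, edges, k, gamma=0.8):
    """Brute-force enumeration of gamma-dense k-subgraphs --
    exponential in |V|, hence the value of pruning the domain."""
    return [set(c) for c in combinations(vertices, k)
            if density(c, edges) >= gamma]
```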

Conclusions
Open source CODENSE algorithm developed
– Improved expression correlation algorithms
– Uses ODES dense subgraph algorithm
– Successful identification of modules from synthetic data
– Manuscript in preparation
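The first CODENSE-style step, building one coexpression graph per dataset, can be sketched with a plain Pearson correlation and a hard cutoff; the cutoff value is illustrative, and the actual pipeline's "improved" correlation algorithms are not reproduced here.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = math.sqrt(sum((a - mx) ** 2 for a in x))
    vy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (vx * vy) if vx and vy else 0.0

def coexpression_edges(expr, cutoff=0.9):
    """Add an edge between two genes whenever |r| clears the cutoff.
    `expr` maps gene name -> expression vector for one dataset."""
    genes = sorted(expr)
    return {(g, h) for i, g in enumerate(genes) for h in genes[i + 1:]
            if abs(pearson(expr[g], expr[h])) >= cutoff}
```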

Conclusions
Regionalized Sensitivity Analysis performed
– Pipeline is insensitive to reasonably chosen parameters in the presence of noise, except for the ‘support’ parameter
– Pipeline is insensitive to transcription network variability
– Pipeline is robust in the face of noise, up to a 50% conformance/non-conformance ratio
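The core RSA test behind these sensitivity statements is a comparison between the distributions of a parameter in conforming ("behavioral") and non-conforming runs; a Kolmogorov–Smirnov distance is the usual statistic. A minimal sketch, with hypothetical sample data:

```python
def ks_distance(behavioral, non_behavioral):
    """Kolmogorov-Smirnov distance between the empirical CDFs of one
    parameter in the conforming ('behavioral') and non-conforming
    runs.  A large distance means the outcome depends on where the
    parameter was drawn, i.e. the pipeline is sensitive to it."""
    pooled = sorted(set(behavioral) | set(non_behavioral))
    def cdf(sample, t):
        return sum(1 for v in sample if v <= t) / len(sample)
    return max(abs(cdf(behavioral, t) - cdf(non_behavioral, t))
               for t in pooled)
```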

Future Work

Future Work
Package the pipeline code for use by researchers.
Make sensitivity runs on the canonical network, where more than two modules have the opportunity of being turned on.
Make sensitivity runs on different network topologies.
Test on larger networks.
Test on real data!
Investigate spanning tree initialization of ODES
– Changes exact algorithm into high-probability heuristic
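A minimal sketch of the spanning-tree idea: seed the search with a BFS spanning tree of a candidate region instead of exhaustive enumeration. Since this is stated as future work, the code below is only a plausible starting point, not the proposed ODES modification.

```python
from collections import deque

def bfs_spanning_tree(adj, root):
    """BFS spanning tree of the component containing `root`.
    Seeding ODES's search with tree-guided vertex orders (instead of
    exhaustive enumeration) is the proposed direction: fast, but it
    could in principle miss a dense subgraph -- hence the slide's
    'high-probability heuristic' rather than an exact algorithm."""
    seen = {root}
    tree = []
    q = deque([root])
    while q:
        u = q.popleft()
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                tree.append((u, v))
                q.append(v)
    return tree
```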

Thank You