A Sensitivity Analysis of a Biological Module Discovery Pipeline
James Long
International Arctic Research Center, University of Alaska Fairbanks
March 25, 2015
Outline
- Gene Expression
- Synthetic Gene Expression Data
- CODENSE
- Regionalized Sensitivity Analysis
- Results
Gene Expression
- Hill Function
- Generalized Hill Function
  - for activators
  - for repressors
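The generalized Hill functions referred to above can be written in their standard form (the symbol names B for maximum expression rate, K for the activation/repression threshold, and n for the Hill coefficient are assumed here; the deck's exact notation was lost in extraction, though B, K, and n also appear in the generated ODEs later):

```latex
% Generalized Hill function for an activator P:
f_{\mathrm{act}}(P) = B \, \frac{(P/K)^{n}}{1 + (P/K)^{n}}

% Generalized Hill function for a repressor P:
f_{\mathrm{rep}}(P) = \frac{B}{1 + (P/K)^{n}}
```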
Synthetic Gene Expression Data
- NEMO – Network Motif Language
- COPASI – Complex Pathway Simulator
NEMO
(figure: genes G0–G5 with regulatory edges)
[ GLIST( G0(P1+,P2+,P3-,P4-,P5+) G1(…) G2(…) etc. ) ]
NEMO: Dense Overlapping Regulon (DOR)
(figure: G0, G1, G2 regulating G3, G4, G5)
DOR(G3(P0+,P1+,P2-), G4(P0+,P1+), G5(P1-,P2+))
NEMO: Negative auto-regulation
(figure: G0 repressing itself)
G0(P0-)
NEMO: Feed-forward loop (FFL)
(figure: G0 → G1 → G2, with G0 → G2)
P0(+G1+G2+)
NEMO: Multi-output FFL
(figure: G0 and G1 driving G2, G3, G4)
TMLIST( P0(+G1+(G2,G3,G4)+) )
NEMO: Single-input module (SIM)
(figure: G0 driving G1, G2, G3)
P0(+G1,G2,G3)
NEMO: canonical network
(figure: canonical network, genes G0–G10)
[ GLIST( G0(P5-) ),
  TMLIST( P0(+G1-G2-), P1(+G3,G4,G5,G6) ),
  DOR( G7(P3+,P4-,P5+), G8(P4+), G9(P3+,P5+,P6+), G10(P5- :F(power(sin(P5),2))) ) ]
NEMO
- NEMO compiler emits SBML – Systems Biology Markup Language
  - Uses libsbml
- More accurate to call it a “language translator” that adds random generalized Hill functions!
- Bioinformatics (2008) 24(1):132-4
NEMO example: two mutually activating genes
(figure: G0 and G1 activating each other)
[ GLIST( G0(P1+), G1(P0+) ) ]

Generated ODEs (generalized Hill kinetics):
dP0/dt = ( B_0 (P1/K_0)^{n_0} / (1 + (P1/K_0)^{n_0}) − dc_0 P0 ) · τ
dP1/dt = ( B_1 (P0/K_1)^{n_1} / (1 + (P0/K_1)^{n_1}) − dc_1 P1 ) · τ
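A minimal sketch of what a simulator integrates for the [ GLIST( G0(P1+), G1(P0+) ) ] example above, using forward Euler in place of COPASI's solvers. All parameter values are illustrative placeholders, not values from the deck:

```python
# Forward-Euler integration of the two-gene mutual-activation model:
#   dP0/dt = (B0*(P1/K0)**n0 / (1 + (P1/K0)**n0) - dc0*P0) * tau
#   dP1/dt = (B1*(P0/K1)**n1 / (1 + (P0/K1)**n1) - dc1*P1) * tau
# Parameter values below are illustrative placeholders.

def hill_act(p, K, n):
    """Generalized Hill activation term."""
    x = (p / K) ** n
    return x / (1.0 + x)

def simulate(steps=5000, dt=0.01, B=(1.0, 1.0), K=(0.5, 0.5),
             n=(2.0, 2.0), dc=(0.2, 0.2), tau=1.0, p0=(0.1, 0.1)):
    P0, P1 = p0
    series = []
    for _ in range(steps):
        d0 = (B[0] * hill_act(P1, K[0], n[0]) - dc[0] * P0) * tau
        d1 = (B[1] * hill_act(P0, K[1], n[1]) - dc[1] * P1) * tau
        P0, P1 = P0 + dt * d0, P1 + dt * d1
        series.append((P0, P1))
    return series

traj = simulate()
# Both proteins settle toward a positive steady state (mutual activation).
print(traj[-1])
```

With these placeholder parameters the pair climbs to its stable fixed point; a time-course table of such trajectories is what the pipeline treats as expression data.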
COPASI – Complex Pathway Simulator
- Also outputs a table of data for each time step, the last column of which is our synthetic data!
The CODENSE algorithm
- Input – a series of expression correlation graphs, each representing a different state for an organism.
- Output – groups of genes (modules) whose expression is correlated across the series of expression correlation graphs.
Expression Correlation
- Pearson’s Correlation
  - Linear dependence between two variables
- Pearson’s Correlation with Z-score
  - Number of standard deviations above the mean
- Mutual Information
  - A measure of mutual dependence between variables
  - Non-linear dependence OK
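The three measures above can be sketched in pure Python (function names and the histogram-based MI estimator are illustrative choices, not the pipeline's actual implementation):

```python
import math
from collections import Counter

def pearson(x, y):
    """Pearson correlation: linear dependence between two variables."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def zscores(values):
    """Number of standard deviations above the mean, per value."""
    n = len(values)
    mu = sum(values) / n
    sd = math.sqrt(sum((v - mu) ** 2 for v in values) / n)
    return [(v - mu) / sd for v in values]

def mutual_information(x, y, bins=4):
    """Histogram estimate of MI; captures non-linear dependence too."""
    def binned(v):
        lo, hi = min(v), max(v)
        w = (hi - lo) / bins or 1.0
        return [min(int((t - lo) / w), bins - 1) for t in v]
    bx, by = binned(x), binned(y)
    n = len(x)
    pxy = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    return sum((c / n) * math.log((c / n) / ((px[i] / n) * (py[j] / n)))
               for (i, j), c in pxy.items())

xs = [0.1 * i for i in range(20)]
print(pearson(xs, [2 * v + 1 for v in xs]))  # exactly linear -> 1.0
```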
The CODENSE algorithm
- Manuscript in preparation
- ODES – Overlapping Dense Subgraph Algorithm
ODES: graph terminology
(figure: example graphs)
- vertex, edge
- connected graph
- cut vertex – a vertex whose removal leaves the graph disconnected
ODES
- Density of a graph
  - Number of actual edges / number of possible edges
  - Number of possible edges = n(n−1)/2 for n vertices
  - Density d = 2E / (n(n−1)) for E edges
- Degree of a vertex
  - Number of edges incident to the vertex
ODES
Theorem: A connected graph G, with density d and n vertices, has at least one non-cut vertex v whose degree is at most the average degree of the vertices in G. Removal of v from G does not decrease the density of G.
Bioinformatics (2010) 26(21)
ODES walk-through
Peeling (remove a minimum-degree non-cut vertex at each step):
- 8 vertices, 22 edges, average degree = 44/8 = 5.5, density = 2·22/(8·(8−1)) ≈ 0.78
- 7 vertices, 17 edges, average degree ≈ 4.86, density ≈ 0.81
- 6 vertices, 13 edges, average degree ≈ 4.33, density ≈ 0.87
- 5 vertices, 9 edges, average degree = 3.6, density ≈ 0.9
- 4 vertices, 6 edges, average degree = 3.0, density = 1.0
- 3 vertices, 3 edges, average degree = 2.0, density = 1.0
- 2 vertices, 1 edge, average degree = 1.0, density = 1.0
Building back up (2 → 3 → 4 → 5 → 6 → 7 → 8 vertices, retracing the densities above), then:
- 9 vertices, 24 edges, average degree ≈ 5.3, density ≈ 0.67
- back to 8 vertices, 22 edges, average degree = 5.5, density ≈ 0.78
Note: brute-force search is confined to actual dense subgraphs.
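The density bookkeeping of the peeling steps above can be sketched in Python (graph representation, helper names, and the small example graph are illustrative, not taken from the deck):

```python
# Sketch of the ODES peeling step: remove a lowest-degree non-cut vertex;
# by the theorem, density never decreases. Graph is an adjacency-set dict.

def density(adj):
    n = len(adj)
    e = sum(len(nbrs) for nbrs in adj.values()) // 2
    return 2.0 * e / (n * (n - 1)) if n > 1 else 1.0

def is_cut_vertex(adj, v):
    """True if removing v disconnects the graph (DFS reachability)."""
    rest = [u for u in adj if u != v]
    if not rest:
        return False
    seen, stack = {rest[0]}, [rest[0]]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w != v and w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) != len(rest)

def peel_once(adj):
    """Remove a minimum-degree non-cut vertex; return the new graph."""
    v = min((u for u in adj if not is_cut_vertex(adj, u)),
            key=lambda u: len(adj[u]))
    return {u: nbrs - {v} for u, nbrs in adj.items() if u != v}

# Illustrative graph: a 5-cycle plus one chord (5 vertices, 6 edges).
g = {0: {1, 4, 2}, 1: {0, 2}, 2: {1, 3, 0}, 3: {2, 4}, 4: {3, 0}}
while len(g) > 2:
    h = peel_once(g)
    assert density(h) >= density(g)  # theorem: density never decreases
    g = h
```

Since a minimum-degree non-cut vertex has degree at most the average degree, each removal keeps density non-decreasing, which is what confines the later brute-force search to genuinely dense subgraphs.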
Regionalized Sensitivity Analysis
- Monte Carlo model runs
- Evaluate one or more binary Objective Functions
  - Are only exact known modules returned?
  - Exact modules returned w/ limited false positives?
  - Approximate modules returned w/ limited false positives?
  - Half of known modules returned approximately w/ limited false positives?
- Increment parameter bins based on Objective Function conformance or non-conformance
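The Monte Carlo / binary-objective / bin-increment loop above can be sketched as follows (all names and the toy objective are illustrative; the pipeline's real objective functions are the module-recovery tests listed above):

```python
import random

def rsa(sample_params, objective, runs=1000, bins=10, seed=0):
    """Regionalized Sensitivity Analysis sketch: split each parameter's
    sampled values into 'conforming' / 'non-conforming' histograms.
    A large difference between a parameter's two histograms signals
    that the objective function is sensitive to that parameter."""
    rng = random.Random(seed)
    conform = {}     # param name -> bin counts over conforming runs
    nonconform = {}  # param name -> bin counts over non-conforming runs
    for _ in range(runs):
        params = sample_params(rng)          # dict: name -> value in [0, 1)
        target = conform if objective(params) else nonconform
        for name, value in params.items():
            hist = target.setdefault(name, [0] * bins)
            hist[min(int(value * bins), bins - 1)] += 1
    return conform, nonconform

# Illustrative model: the objective conforms only when 'cutoff' is low;
# 'other' is irrelevant, so its two histograms should look similar.
def sample(rng):
    return {"cutoff": rng.random(), "other": rng.random()}

c, n = rsa(sample, lambda p: p["cutoff"] < 0.3)
print(sum(c["cutoff"]), sum(n["cutoff"]))
```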
Regionalized Sensitivity Analysis
- Compile the NEMO representation of the canonical network
- Import into COPASI
- Set up model, and save in COPASI format
- Create template from COPASI format file where all genes are turned off (B = 0)
- Create synthetic data by turning some genes and SIMs on, taking last column of COPASI output
- Add noise to output
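The final noise step might look like this sketch, assuming multiplicative Gaussian noise whose sigma is a fixed fraction of each datum (the 10% default matches the sigma = 10% setting reported in the Results; the function name is illustrative):

```python
import random

def add_noise(data, frac=0.10, seed=42):
    """Add N(0, (frac*datum)^2) noise to each expression value,
    i.e. sigma is a fixed percentage of the datum itself."""
    rng = random.Random(seed)
    return [x + rng.gauss(0.0, frac * abs(x)) for x in data]

clean = [10.0, 250.0, 3.5, 0.0]
noisy = add_noise(clean)
print(noisy)
```

Because sigma scales with the datum, a zero expression value stays exactly zero, and large values receive proportionally larger perturbations.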
Results, One Module
- First RSA runs used a static transcription network
  - For Objective Function 1, no noise added:
    - Highly sensitive to PC cutoff score for PC-only correlation – to 1.0
    - Very low conformance/non-conformance ratio
    - For Z-scores, sensitivity to PC cutoff score greatly attenuated
      - Lower PC scores allow more false positives to enter pipeline
      - Significant sensitivities to similarity cutoff score and minimum density for a dense subgraph in coherent dense subgraphs
      - Still a very low conformance/non-conformance ratio
  - For other Objective Functions:
    - Conformance/non-conformance ratio rises to ~50%
    - Only sensitive to the parameter that declares the fraction of data sets that an edge must exist in to be included in the summary graph
- Observations
  - No good at finding the exact module
  - No noise is unrealistic
  - Static network parameters are unrealistic
Results, One Module
- RSA with noise and dynamic network parameters
  - Normally distributed noise, mu = 0, sigma = 10% of datum
  - Still low conformance/non-conformance ratios for PC only, ~10%
  - Jumps to 30-45% with Z-score methods
  - Sensitivities mostly disappear in the presence of noise, except:
    - sensitive to how fast module genes reach equilibrium
    - sensitive to percentage of time a module is ‘on’ in the data
- Different amounts of noise, Z-score methods:
  - sigma = 20% of datum: conformance/non-conformance drops to ~25-35%
  - sigma = 25% of datum: conformance/non-conformance drops to ~15-25%
  - sigma = 33% of datum: conformance/non-conformance drops to ~5-10%
  - For the 25% and 33% cases, sensitivities appear to the non-module maximum expression coefficient distribution mu and sigma
Results, Relaxed Module
- RSA with noise and dynamic network parameters
  - Pipeline parameters fixed, MI calculation performed
  - Module parameters picked from a wider distribution
  - Sensitive to rate at which module parameters reach equilibrium, and percentage of time module is ‘on’ in the data
  - Conformance/non-conformance ratio is ~35-50%
- MI calculation
  - Invoked only when PC + Z-score fails to infer an edge
  - Invoked only if expression levels are comparable
    - Hypothesis is that module members are expressed in quantities that are not vastly different from each other
  - Invoked only if expression levels are not small
    - i.e., not at levels considered ‘noise’
  - Typically increases conformance/non-conformance ratios by ~0-5%
Results, Two Modules
- Most realistic case yet
- With PC, Z-score, and noise, pipeline is sensitive to the rate at which module genes reach equilibrium, to the Z-score cutoff value, and to the percentage of time module is ‘on’ in the data
  - Conforming/non-conforming ratio is ~25-40%
- With the addition of an MI calculation, the only sensitivity is percentage of time module is ‘on’ in the data
  - Conforming/non-conforming ratio is ~30-45%
- With only a sensitivity to the percentage of time the module is ‘on’ in the data:
  - Pipeline is robust in the face of network variation
  - Can start the pipeline ‘support’ parameter at a high value, and turn it down until modules are detected!
Conclusions
- NEMO – Network Motif language developed
  - Language translator from a qualitative transcription network description to a quantitative SBML model
  - Used as input to the COPASI biochemical simulator to generate synthetic gene expression data
    - Microarray data
    - NGS data
  - Bioinformatics (2008) 24(1):132-4
Conclusions
- ODES – Overlapping Dense Subgraph Algorithm
  - In the class of exact, exponential-time algorithms
  - Confines brute-force search domain to actual dense subgraphs
  - Bioinformatics (2010) 26(21)
Conclusions
- Open-source CODENSE algorithm developed
  - Improved expression correlation algorithms
  - Uses ODES dense subgraph algorithm
  - Successful identification of modules from synthetic data
  - Manuscript in preparation
Conclusions
- Regionalized Sensitivity Analysis performed
  - Pipeline is insensitive to reasonably chosen parameters in the presence of noise, except for the ‘support’ parameter
  - Pipeline is insensitive to transcription network variability
  - Pipeline is robust in the face of noise
    - Up to a 50% conformance/non-conformance ratio
Future Work
- Package the pipeline code for use by researchers
- Make sensitivity runs on the canonical network, where more than two modules have the opportunity of being turned on
- Sensitivity runs on different network topologies
- Test on larger networks
- Test on real data!
- Investigate spanning tree initialization of ODES
  - Changes the exact algorithm into a high-probability heuristic
Thank You