Individual Gene Analysis, Categorized on Validity of Inputs.

Wt, initial weight 1 run

Genes with No Inputs (FHL1, SKO1, SWI6)
[Expression plots for wt, dCIN5, and dGLN3. Surviving B&H p-values: wt p=0.4454 (first panel) and dGLN3 p=0.4557 (last panel); the others were not preserved.]
Panel notes, in order:
1. Good fit. No significant change. Modeled well.
2. Good fit. No significant change, but maybe because of large variance. Modeled well, could have an activator.
3. Fair fit. No significant change, but some. Modeled fairly well, may have a missing repressor.
The current downward trend of the model is due to a production rate only slightly larger than the degradation rate.
It would be helpful to know the actual production rate.
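
The remark about the downward model line can be illustrated with a minimal sketch. It assumes a sigmoidal production term, dx/dt = P * sigmoid(weighted inputs - b) - d * x, with threshold b = 0 and initial expression x0 = 1. The slides do not state the model equation, so this form, the function name simulate_gene, and every numeric value below are assumptions chosen for illustration only. Under these assumptions a gene with no inputs settles at P / (2d), so a production rate only slightly larger than the degradation rate still pulls the model line below its starting value.

```python
import math


def sigmoid(z):
    """Logistic function, used here as the assumed production nonlinearity."""
    return 1.0 / (1.0 + math.exp(-z))


def simulate_gene(P, d, weighted_input=0.0, b=0.0, x0=1.0, t_end=200.0, dt=0.01):
    """Euler-integrate dx/dt = P * sigmoid(weighted_input - b) - d * x.

    All parameter values are hypothetical; weighted_input = 0 stands for
    a gene with no regulators.
    """
    x = x0
    for _ in range(int(t_end / dt)):
        x += dt * (P * sigmoid(weighted_input - b) - d * x)
    return x


# Production only slightly larger than degradation: expression ends below x0 = 1.
slightly_larger = simulate_gene(P=1.1, d=1.0)
# Production four times degradation: expression ends above x0 = 1.
four_times = simulate_gene(P=4.0, d=1.0)
print(slightly_larger < 1.0, four_times > 1.0)
```

If the actual model uses a different threshold or nonlinearity, the break-even ratio between P and d shifts accordingly, which is one reason literature values for the rates would help.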

Genes with No Inputs Are Modeled Well
Because these genes are not regulated by any other genes, they should have no significant dynamics.
This is reflected in their p-values:
- No significant dynamics in the wt or in the dCIN5 and dGLN3 deletion strains.
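
For readers unfamiliar with the "B&H p" labels, the sketch below shows how a Benjamini and Hochberg adjustment is commonly computed from a list of raw p-values. It assumes "B&H" refers to that correction; the slides do not say how their values were produced, and the function name and example inputs are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values, not taken from the slides.
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```

A gene would then be called dynamic when its adjusted value falls below whatever significance cutoff the analysis uses.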

Genes with One Input (HAP5, HMO1)
[Expression plots; B&H p-values not preserved.]
Panel notes, in order:
1. Poor fit. No significant dynamics.
2. Good fit. Significant upward dynamics.

HAP5
Regulator: SWI4 (weight: -4.4E-5)
[B&H p-values not preserved.]
Because the dynamics of HAP5's only regulator are not significant, it is difficult to estimate HAP5's w and b.
SWI4 seems to have essentially no effect on HAP5.
The estimated production rate and weight are extremely small.
This could just be modeling noise.

HMO1
Regulator: FHL1 (weight: 0.24)
[B&H p-values not preserved.]
Although FHL1 has insignificant dynamics, which makes parameters difficult to estimate, it does produce the correct output in HMO1.
The high estimated production rate also contributes to the large upward trend.
It would be helpful to know the actual production rate.
HMO1 is probably modeled well.

Genes with Two Inputs (ACE2, HOT1, MGA2, MAL33)
[Expression plots; B&H p-values not preserved.]
Panel notes, in order:
1. Poor fit, large variance. Significant dynamics.
2. Okay fit, given large variance. No significant dynamics.
3. Fairly good fit. No statistically significant dynamics, but a visible upward trend.
4. Decent fit. No significant dynamics.

ACE2
Regulators: ZAP1 and FKH2
Weights shown: 0.22 and 0.082
[B&H p-values not preserved.]
The w and b parameters of ACE2 are easier to estimate because both of its regulators have dynamics.
Both regulators activate ACE2 in the network. If this were true, ACE2 should show significant upward dynamics.
ACE2 is wired incorrectly.

HOT1
Regulators: CIN5 and SKN7
[B&H p-values and weights not preserved.]
Both regulators show significant dynamics, so it is easier to estimate HOT1's parameters.
Given that both regulators increase their expression, HOT1's expression should decrease more towards the end of the time series.
Unsure: the variance of the data makes it tricky to determine the problem.

MGA2
Regulators: GLN3 and SMP1
Weights shown: 0.33 (the second weight and the B&H p-values were not preserved)
The regulators do not have significant dynamics, so MGA2's parameters are difficult to estimate.
The high estimated production rate relative to the degradation rate is also causing the upward dynamics.
Knowing the actual production rate would give a more conclusive case.

MGA2 with dGLN3
wt B&H p=0.1028 (the dGLN3 B&H p-value was not preserved)
Deleting GLN3 decreases the expression of MGA2.
MGA2's wiring to GLN3 is modeled correctly.

MAL33
Regulators: MBP1 and SMP1
[Surviving values: one B&H p=0.0101 (first panel) and one weight of 0.77 (second weight shown); the other p-values and the first weight were not preserved.]
The production rate is huge relative to other genes. The model is attempting to fit the large initial spike.
Are these dynamics due to a regulator we're not seeing?
Why does MBP1 repress MAL33?
Because the inputs have no dynamics, it is difficult to estimate the w's and b.
Unsure of MAL33's connection.

Genes with Three Inputs (MSS11)
Regulators: SKO1, CIN5, and SKN7
[Surviving values: one weight of 0.16 (second of three) and one B&H p=0.0228 (last panel); the others were not preserved.]
Good fit. No significant dynamics.
The inputs have some significant dynamics, so the weights are probably estimated well.
Given this, why is there downward expression?
The estimated production rate is about the same as the degradation rate, and this is causing the downward model line.
The weights are probably good. This is a good example of why we need to find production and degradation rates from the literature.
The validity of the connection is uncertain.

Genes with Self-Regulation Only (MBP1, SKN7, ZAP1)
[Expression plots; B&H p-values not preserved.]
Panel notes, in order:
1. Decent fit, large variance though. No significant dynamics.
2. Good fit, large variance though. Significant upward dynamics.
3. Decent fit, large variance though. Significant upward dynamics.

MBP1 Self Regulation
[B&H p-value and weight not preserved.]
The upward trend of the model is described by an estimated production rate 4X that of the degradation rate.
The weight can be almost anything because MBP1 has no significant dynamics; the model made it small so it would fit the up-ish trend of the data.
Because of the variance of the data, it is difficult to tell if MBP1 is missing an activator. It is probably not wired correctly, though; we really need production rates to tell.

SKN7 Self Regulation
Weight: 0.50 (B&H p-value not preserved)
Because of SKN7's significant dynamics, we can be fairly confident in the validity of the weight value.
The model appears to fit the positive feedback connection in the network.
However, the trend looks as if it is leveling off, which should not be the case with complete positive feedback (see ZAP1), unless the weight is smallish and the degradation rate kicks in.
SKN7 may have a repressor that levels this off, or a larger degradation rate.
Connection validity is uncertain.
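
The leveling-off question can be explored with a small simulation of a single self-activating gene. This is a sketch under stated assumptions, not the deck's model: it assumes dx/dt = P * sigmoid(w * x - b) - d * x, uses the two self-regulation weights quoted on the slides (0.50 for SKN7, 0.77 for ZAP1), and picks hypothetical values for P, d, b, and the initial expression. With a saturating production term, expression cannot exceed P / d, so even pure positive feedback eventually plateaus; the larger weight simply plateaus higher.

```python
import math


def simulate_self_regulation(w, P=1.0, d=0.5, b=0.0, x0=1.0, t_end=100.0, dt=0.01):
    """Euler-integrate dx/dt = P * sigmoid(w * x - b) - d * x; return the trajectory.

    P, d, b, and x0 are hypothetical values chosen only for illustration.
    """
    xs = [x0]
    for _ in range(int(t_end / dt)):
        x = xs[-1]
        production = P / (1.0 + math.exp(-(w * x - b)))
        xs.append(x + dt * (production - d * x))
    return xs


skn7_like = simulate_self_regulation(w=0.50)  # weight quoted for SKN7
zap1_like = simulate_self_regulation(w=0.77)  # weight quoted for ZAP1
print(round(skn7_like[-1], 3), round(zap1_like[-1], 3))
```

Whether the measured time series is long enough to distinguish a plateau from continued growth is a separate question that this sketch does not address.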

ZAP1 Self Regulation
Weight: 0.77 (B&H p-value not preserved)
Because of ZAP1's significant dynamics, we can be a little more confident in the validity of this weight.
ZAP1 seems to be exhibiting a positive feedback cycle, which matches the continuous upward trend in expression.
The strength of this weight could be masking other activators, but other than this ZAP1 seems to be modeled well.

Genes with Self Regulation and Other Inputs (FKH2, AFT2, GLN3, CIN5, SMP1, SWI4, YAP6, PHD1)
[Expression plots for the eight genes. Only one B&H p-value survives: p=0.0017, in the last position.]

AFT2 (Two regulators)
Regulators: AFT2 and SKN7
[B&H p-values and weights not preserved.]
AFT2 has a decent fit and no significant dynamics.
The weights are too small to see any effect.
Uncertain of AFT2's connectivity: because of its dynamics (or absence thereof), it looks like we're modeling noise.

FKH2 (Two regulators)
Regulators: FKH2 and FHL1
[B&H p-values and weights not preserved.]
FKH2 has a fairly good fit with statistically insignificant dynamics, but a visible downward trend.
Because FKH2's regulators do not have much dynamics, it is difficult to estimate the w's and b.
The degradation rate is higher than the production rate; the model decreased P in an attempt to fit the data.
Unsure of connection validity: there are no glaring errors, but having a literature production rate would help elucidate the weights.

GLN3 (Two regulators)
Regulators: GLN3 and MAL33
Weights shown: 0.55 (second of two; the first weight and the B&H p-values were not preserved)
GLN3 has an okay fit with no significant dynamics.
Slight self-repression may work to keep levels stable, but there are not enough data points to tell.
Regulator MAL33 has a relatively large weight and increased levels of expression, so GLN3 should exhibit more increased expression. Perhaps GLN3 is missing a repressor.
GLN3 is probably missing an input, although the variance gives uncertainty.

CIN5 (Four regulators)
Regulators: CIN5, SKO1, PHD1, YAP6
Weights shown: 0.73 (first of four; the other weights and the B&H p-values were not preserved)
CIN5 has a fairly good fit with visible expression change, but no statistically significant dynamics because of large variance.
The majority of regulators show significant dynamics, making the weights easier to estimate (i.e., they are probably reliable).
CIN5's production rate is much higher than its degradation rate. This is contributing to large upward dynamics, even with several repressors present.
Again, without knowledge of production and degradation rates, it is difficult to say if CIN5 has the correct inputs.

SMP1 (Four regulators)
Regulators: SMP1, CIN5, FHL1, PHD1
Weights shown: 0.19 and 0.03 (second and third of four; the other weights and the B&H p-values were not preserved)
SMP1 has a good fit with no significant dynamics.
Only two of its four regulators have significant dynamics, making the weights difficult to estimate.
The largest weight comes from CIN5, which is also up-regulated, so SMP1 should also exhibit upward expression.
The slightly downward trend is due to an estimated production rate that is roughly equivalent to the degradation rate.
Without knowledge of production and degradation rates, we can't say much about the validity of SMP1's inputs.
All the weights are so low and SMP1 has no dynamics, so we could just be modeling noise.

SWI4 (Six regulators)
Regulators: SWI4, MBP1, MAL33, PHD1, SWI6, YAP6
[B&H p-values and weights not preserved.]
With SWI4, we are probably modeling noise.
SWI4 has a fairly poor fit with no significant dynamics.
Several regulators have insignificant dynamics or poor fits themselves.
The weights are all very small.
The production rate is tiny compared to the degradation rate; the model is trying to account for the slightly downward trend.
Too many inputs?

YAP6 (Seven regulators)
Regulators: YAP6, CIN5, FHL1, FKH2, PHD1, SKN7, SKO1
Weights shown: 0.26 and 0.19 (second and fourth of seven; the other weights and the B&H p-values were not preserved)
YAP6 has significant dynamics and is modeled fairly well.
The estimated production rate is less than the degradation rate. This is contributing to the downward trend, even when the strongest weights (coming from genes with significant dynamics) are activating YAP6.
Because YAP6's regulators are mostly dynamic, the weights are probably estimated well. However, the validity of these inputs is uncertain without further knowledge of the actual production and degradation rates.

PHD1 (Seven regulators)
Regulators: PHD1, CIN5, FHL1, SKN7, SKO1, SWI4, SWI6
Weights shown: 0.16, 0.16, and 0.14 (first, fourth, and seventh of seven; the other weights and the B&H p-values were not preserved)
Total activation: 0.61 (the total repression value was not preserved)
PHD1 has a good fit with significant dynamics.
Most regulators also have significant dynamics, making the weights easier to estimate.
The production rate is 3X the degradation rate (a relatively stable value).
Although it is difficult to tell with so many inputs, PHD1's model follows the trend of its inputs well: it is initially activated, then slightly repressed as the two repressors (CIN5 and SKN7) increase their expression.
PHD1's inputs seem justified.
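
The "total activation" and "total repression" figures are presumably sums of the positive and negative input weights; that reading is an assumption, since the slides do not define them. A minimal sketch of that tally follows, using placeholder regulator names and made-up weights rather than the PHD1 values, most of which are missing from the transcript.

```python
def total_activation_and_repression(weights):
    """Sum positive weights (activation) and negative weights (repression)."""
    activation = sum(w for w in weights.values() if w > 0)
    repression = sum(w for w in weights.values() if w < 0)
    return activation, repression


# Hypothetical weights for illustration only; not taken from the slides.
example_weights = {
    "regulator_a": 0.5,
    "regulator_b": 0.25,
    "regulator_c": -0.25,
    "regulator_d": -0.125,
}
activation, repression = total_activation_and_repression(example_weights)
print(activation, repression)
```

Comparing the two totals gives a rough sense of the net push on a gene, though the actual effect also depends on how strongly each regulator is expressed at a given time.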

General Conclusions
We definitely need to get production and degradation rates from the literature.
It is difficult to make any conclusive statements about the connections in the network without knowing the production and degradation rates; there are too many parameters to take into consideration.
In terms of inputs:
- 6 genes are modeled well.
- 11 genes are uncertain in their inputs because P and d are needed. We could say something definitive about these genes if we didn't have to estimate P.
- 4 genes are modeled poorly or are missing inputs.