GRNmap Testing Analysis Grace Johnson and Natalie Williams June 10, 2015.

Slides:

Advertisements

Similar presentations

© Copyright 2001, Alan Marshall1 Regression Analysis Time Series Analysis.

Advertisements

Visualization of hidden node activity in a feed forward neural network Adam Arvay.

Regression Analysis Once a linear relationship is defined, the independent variable can be used to forecast the dependent variable. Y ^ = bo + bX bo is.

Prediction, Correlation, and Lack of Fit in Regression (§11. 4, 11

Agenda Review homework Lecture/discussion Week 10 assignment

Chapter 13 Multiple Regression

Longitudinal Experiments Larry V. Hedges Northwestern University Prepared for the IES Summer Research Training Institute July 28, 2010.

Regression Diagnostics Using Residual Plots in SAS to Determine the Appropriateness of the Model.

2008/10/21 J. Fontenla, M Haberreiter LASP – Univ. of Colorado NADIR MURI Focus Area.

Plots, Correlations, and Regression Getting a feel for the data using plots, then analyzing the data with correlations and linear regression.

Mean Comparison With More Than Two Groups

Analysis of Variance & Multivariate Analysis of Variance

Stat 112: Lecture 9 Notes Homework 3: Due next Thursday

Correlation and Regression Analysis

Modeling the Gene Expression of Saccharomyces cerevisiae Δcin5 Under Cold Shock Conditions Kevin McKay Laura Terada Department of Biology Loyola Marymount.

Descriptive measures of the strength of a linear association r-squared and the (Pearson) correlation coefficient r.

Slides 13b: Time-Series Models; Measuring Forecast Error

Inference for regression - Simple linear regression

Forecasting and Statistical Process Control MBA Statistics COURSE #5.

Determining the Identity and Dynamics of the Gene Regulatory Network Controlling the Response to Cold Shock in Saccharomyces cerevisiae June 24, 2015.

Calibration Guidelines 1. Start simple, add complexity carefully 2. Use a broad range of information 3. Be well-posed & be comprehensive 4. Include diverse.

September In Chapter 14: 14.1 Data 14.2 Scatterplots 14.3 Correlation 14.4 Regression.

McGraw-Hill/IrwinCopyright © 2009 by The McGraw-Hill Companies, Inc. All Rights Reserved. Chapter 3 Descriptive Statistics: Numerical Methods.

1 Statistical Inference. 2 The larger the sample size (n) the more confident you can be that your sample mean is a good representation of the population.

Deletion of ZAP1 as a transcriptional factor has minor effects on S. cerevisiae regulatory network in cold shock KARA DISMUKE AND KRISTEN HORSTMANN MAY.

Regression Examples. Gas Mileage 1993 SOURCES: Consumer Reports: The 1993 Cars - Annual Auto Issue (April 1993), Yonkers, NY: Consumers Union. PACE New.

Regression For the purposes of this class: –Does Y depend on X? –Does a change in X cause a change in Y? –Can Y be predicted from X? Y= mX + b Predicted.

Instrumentation (cont.) February 28 Note: Measurement Plan Due Next Week.

Chapter 3 Descriptive Statistics: Numerical Methods Copyright © 2014 by The McGraw-Hill Companies, Inc. All rights reserved.McGraw-Hill/Irwin.

Tutorial for solution of Assignment week 39 “A. Time series without seasonal variation Use the data in the file 'dollar.txt'. “

GRNmap Testing Grace Johnson and Natalie Williams June 3, 2015.

SURP 2015 Presentation draft 15 minutes. Wt, initial weight 1 run.

Ordinary Least Squares Estimation: A Primer Projectseminar Migration and the Labour Market, Meeting May 24, 2012 The linear regression model 1. A brief.

Changes in Gene Regulation in Δ Zap1 Strain of Saccharomyces cerevisiae due to Cold Shock Jim McDonald and Paul Magnano.

GRNmap and GRNsight June 24, Systems Biology Workflow DNA microarray data: wet lab-generated or published Generate gene regulatory network Modeling.

Data Analysis and GRNmap Testing Grace Johnson and Natalie Williams June 24, 2015.

Index Slide 2-5: Statistical testing results 6-14: Clustering results 15-17: GRNsight visualization of YEASTRACT results 18-20: GRNmap output visualization.

Stat 112 Notes 9 Today: –Multicollinearity (Chapter 4.6) –Multiple regression and causal inference.

IMPROVED RECONSTRUCTION OF IN SILICO GENE REGULATORY NETWORKS BY INTEGRATING KNOCKOUT AND PERTURBATION DATA Yip, K. Y., Alexander, R. P., Yan, K. K., &

MEASURE : Measurement System Analysis

Business Statistics: A Decision-Making Approach, 6e © 2005 Prentice- Hall, Inc. Chap 14-1 Business Statistics: A Decision-Making Approach 6 th Edition.

SIMPLE LINEAR REGRESSION AND CORRELLATION

GRNmap Testing Grace Johnson and Natalie Williams June 17, 2015.

Student: Trixie Anne M. Roque, Tessa A. Morris Faculty Mentors: Dr. Kam D. Dahlquist, Dr. Ben G. Fitzpatrick, & Dr. John David N. Dionisio SURP 2015 Final.

Within Strain ANOVA WTdHAP4 p < (38.4%)2387 (38.6%) p < (24.7%)1489 (24.1 %) p < (13.8%)679 (11.0%) p < (7.25%)240.

Individual Gene Analysis, Categorized on Validity of Inputs.

Comparison of the wild type of S. cerevisiae and S. paradoxus Karina Alvarez and Natalie Williams.

Outline S. cerevisiae, a eukaryote known for cold-shock adaption, used in cold-shock experiments Deletion strand HMO1 and the comparison of microarray.

Hypothesis Testing and Statistical Significance

LESSON 5 - STATISTICS & RESEARCH STATISTICS – USE OF MATH TO ORGANIZE, SUMMARIZE, AND INTERPRET DATA.

Comparison of the wild type of S. cerevisiae and S. paradoxus Karina Alvarez and Natalie Williams.

Irwin/McGraw-Hill © Andrew F. Siegel, 1997 and l Chapter 9 l Simple Linear Regression 9.1 Simple Linear Regression 9.2 Scatter Diagram 9.3 Graphical.

Week 14 Assignment Kara Dismuke.

Agenda Review homework Lecture/discussion Week 10 assignment

Modeling Nitrogen Metabolism in Yeast

ANOVA Within Strain P-values

Week 14 Output Figures Kristen Horstmann.

Within Strain ANOVA for All Strains

GRNmap Testing and Results

Consolidation of Gephi Stats

Loyola Marymount Unviersity

ANOVA Within Strain P-values

Within-strain ANOVA for dCIN5

(Within-Strain ANOVA)

Comparison of the P-value significance between Wild Type and dGLN3

Consolidation of Gephi Stats

Jeffrey Crosson and William Gendron

There are 16 Different Combinations for the Test Inputs

Jeffrey Crosson and William Gendron

Jeffrey Crosson and William Gendron

Presentation transcript:

GRNmap Testing Analysis Grace Johnson and Natalie Williams June 10, 2015

Strain Run Comparison We looked at outputs from last week’s various strain runs to interpret the results

Estimated production rates and b values varied widely between strain runs Figure 1: Estimated b values Figure 2: Estimated production rates

We investigated genes that had either fluctuating or stable estimated b’s and production rates Auto-Regulators with no other inputs (fluctuating) Genes with no inputs (stable) SKN7FHL1 ZAP1SWI6 MBP1SKO1

Figure 9: All strains, initial weights 1 Figure 8: wt only, initial weights 1 Visualized networks also display variations between runs.

To evaluate each gene, we looked at: Production vs. degradation rate – How do these combine to produce effects seen in output graphs? ANOVA within-strain p-values – Are the genes significant? If so, is this due to dynamics or tight data? Fit vs. parameter stability – Good fit with stable parameter values? – Bad fit with unstable parameter values? – Bad fit with stable parameter values? – Good fit with unstable parameter values?

Genes with no inputs (stable parameters) revealed how to detect missing information. For a gene with no inputs, you would expect little dynamics, aside from initial adjusting to production/degradation effect FHL1 – modeled well SWI6 – missing input SKO1 – missing input

FHL1 is modeled well. Production vs. degradation rate – At steady state, expect logFC = 0.58 – Slight increase is observed in output graphs No p-value significance for wt or dHAP4 ANOVA – Variance is not huge, so FHL1 must not have significant dynamics. This is expected of a gene that has no inputs. FHL1 has a good fit with stable parameters, indicating it is modeled well in this 21 gene network

SWI6 is missing an input. Production vs. degradation rate – At steady state, expect logFC = – Slight decrease is observed in output graphs, excepting dCIN5 No p-value significance for wt or dHAP4 ANOVA – Could be because of variance, or just little dynamics – It would be useful to see dCIN5 ANOVA SWI6 has a okay fit with stable parameters, but the dCIN5 run indicates that SWI6 should be connected to CIN5 in some way

SKO1 is missing an input. Production vs. degradation rate – At steady state, expect logFC = 0.7 – Increase is observed in output graphs No p-value significance for wt ANOVA – Could be because of variance, or just little dynamics P-value significance (B&H 0.015) for dHAP4 ANOVA – SKO1 is affected by the deletion of HAP4 SKO1 has a fairly good fit with stable parameters, however the dHAP4 p-value indicates that SKO1 could be missing an input connected to HAP4 (not in current network)

Auto-regulators with no other inputs revealed problems fitting the model to the data. For self-regulating genes, you would expect: – Uncontrolled increasing dynamics, or – Cycles of up- then down-regulation MBP1: missing regulator SKN7: missing an input ZAP1: lacks various inputs

MBP1 needs an additional input Production vs. Degradation rate – All production rates fell within 0.08 – 0.13 range No significance in ANOVA p-values – A lot of variance or little dynamics MBP1 needs input from ZAP1 – The model fit the trends of the provided data points, however it failed to correctly adjust to T60 – Production rate values seem stable, but the threshold values vary in magnitude Production B values

SKN7 is missing an input Production vs. degradation rate – Initial dynamics of model relate to production rate Significant p-value (B&H: ; ) – Significance most likely results from large changes in the dynamics SKN7 should have GLN3 as a regulator – Exceptional fit of the data except for dGLN3 model for T30 & T60 – The parameters are very unstable with a lot of variance P-value to Fit of Model: wt is significant – The model fits this data better than other strains – Should analyze dGLN3 p-value Production B values

ZAP1 lacks necessary inputs. Production vs. degradation rate – All values lie below 0.1 – There were two negative overall effects (pro rate – deg rate) Significant p-values (B&H: ; ) – The magnitude of change makes the gene significant ZAP1 seems to be activated by CIN5, GLN3, and HMO1 – The model fits the data poorly, but best fit to wt – Does not adjust to fit T60 ZAP1 ProductionB values

General Conclusions Production vs. degradation rate seems to drive most dynamics – Sets the model on its trajectory – Difficult for model to alter its dynamics to fit to T60 Correlation between p-value and better fit of model to data – Accounting for variance

Model uses overall average of provided data points to fit multi-strain runs Finds curve that fits all data, not individual strains Averages similar slopes from individual runs to create single curve 5 Output Overlay All-Strain Run