Data Analysis and GRNmap Testing
Grace Johnson and Natalie Williams
June 24, 2015

General Overview
1. Microarray Data Analysis Workflow
2. GRNmap Testing SURP 2015

Microarray Data Analysis Workflow
1. Generating Log2 Ratios with GenePix Pro
2. Within- and Between-chip Normalization with R
3. Statistical Analysis
   a) Within-strain ANOVA
   b) Modified t-test for each time point
   c) Between-strain ANOVA
4. GenMAPP
5. Clustering with STEM
6. YEASTRACT
7. GRNmap and GRNsight

Generating Log2 Ratios with GenePix Pro
Microarray chips are the raw data from the wet lab (wt, dCIN5, dGLN3, dHAP4, dHMO1, dSWI4, dZAP1, Spar)
Quantitate the fluorescence signal in each spot by counting pixels
Calculate the ratio of red/green fluorescence
Log2 transform the ratios to put increases and decreases on the same scale
  – A 2-fold increase becomes 1
  – A 2-fold decrease becomes -1
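The log2 transform described on this slide can be illustrated with a short Python sketch. It is not part of the GenePix Pro workflow itself; the function name `log2_ratio` and the intensity values are hypothetical and chosen only to show how fold changes map onto a symmetric scale.

```python
import math


def log2_ratio(red, green):
    """Return log2(red / green) for one spot's fluorescence intensities.

    Hypothetical helper for illustration; GenePix Pro computes its own ratios.
    """
    if red <= 0 or green <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(red / green)


# A 2-fold increase (red twice green) maps to 1,
# and a 2-fold decrease (red half of green) maps to -1.
increase = log2_ratio(2000.0, 1000.0)
decrease = log2_ratio(500.0, 1000.0)
unchanged = log2_ratio(1000.0, 1000.0)
```

Without the transform, a 2-fold increase (ratio 2) and a 2-fold decrease (ratio 0.5) sit at unequal distances from the no-change ratio of 1; after it, they are equally far from 0.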

Within- and Between-chip Normalization with R
Normalization scripts written for R (64-bit)
Within-array normalization for Ontario chips
Within-array normalization for GCAT chips
Between-array normalization for all chips
Visualization plots of before and after normalization

Statistical Analysis
Each group continued on, analyzing either wt, dCIN5, dGLN3, dHAP4, or dSWI4
Within-strain ANOVA told us how many genes had significant expression changes at any time point
Modified t-test told us how many genes had significant changes at each time point
Between-strain ANOVA told us how many genes changed their expression between strains
  – wt vs. deletion strain

Between-Strain ANOVA for wt Microarray Data
ANOVA            WT          dCIN5           dGLN3           dHAP4           dSWI4
p <              (38.41%)    1995 (32.23%)   1856 (29.99%)   2387 (38.57%)   2583 (41.74%)
p <              (24.74%)    1157 (18.69%)   1007 (16.27%)   1489 (24.06%)   1679 (27.13%)
p <              (13.73%)    566 (9.15%)     398 (6.43%)     679 (10.97%)    869 (14.04%)
p <              (7.25%)     280 (4.52%)     121 (1.96%)     240 (3.88%)     446 (7.21%)
B & H p <        (27.03%)    1117 (18.05%)   889 (14.36%)    1615 (26.09%)   1855 (29.97%)
Bonferroni p <   (3.65%)     109 (1.76%)     20 (0.32%)      61 (0.99%)      179 (2.89%)

P-values Used in Statistical Analysis
Uncorrected (0.05, 0.01, 0.001, [value missing])
  – We run into the multiple testing problem
Bonferroni corrected (0.05)
  – Multiply each p-value by the number of tests (6189)
  – More stringent
Benjamini and Hochberg corrected (0.05)
  – Adjust Bonferroni by dividing by p-value rank
  – Less stringent
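The two corrections on this slide can be sketched in a few lines of Python. This is a generic illustration of the standard Bonferroni and Benjamini-Hochberg adjustments, not the spreadsheet or script the group actually used (the slides do not say which tool was used). The function names and the four example p-values are made up, and the running minimum in the second function is the usual step that keeps adjusted values monotonic; the slide does not mention it.

```python
def bonferroni(pvalues):
    """Multiply each p-value by the number of tests, capping at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjustment: p * m / rank, made monotonic.

    Rank 1 is the smallest p-value. Walking from the largest rank down,
    a running minimum keeps adjusted values from decreasing with rank.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values for four genes (illustration only).
raw = [0.01, 0.04, 0.03, 0.005]
bonf = bonferroni(raw)
bh = benjamini_hochberg(raw)
```

Every Benjamini-Hochberg adjusted value is at most the Bonferroni one, which is why fewer genes pass the Bonferroni cutoff in the ANOVA table.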

GenMAPP Guided Further Wet Lab Research
In GenMAPP, we visualized results from the ANOVA and t-tests and categorized them based on p-value significance
We set up a voting system to determine which strains to test further (visible, significant dynamics)
  – Microarray winner: dYAP1
  – Test for growth impairment winners: dNRG1, dPHD1, dRSF2, dYHP1, dRTG3, dYOX1

Clustering with STEM
STEM (Short Time-series Expression Miner) groups genes based on similar dynamics
We built STEM profiles from genes with B&H p < 0.05 from the within-strain ANOVA for our strain
Profiles include GO information

YEASTRACT
Genes from significant STEM profiles were entered as target genes into YEASTRACT
  – Inferring that the same set of TFs regulates genes that have similar dynamics
YEASTRACT outputs a list of candidate TFs ranked by significance

Using YEASTRACT to create a hypothesis network
To the resulting list of significant regulators, CIN5, GLN3, HAP4, HMO1, SWI4, and ZAP1 were added
The new list of genes was entered into the YEASTRACT Gene Regulation Matrix as both regulators and target genes
YEASTRACT outputs an adjacency matrix that can be fed into GRNmap and visualized with GRNsight
  – Selecting “DNA binding evidence plus expression evidence” gives a more connected network
  – Selecting “only DNA binding evidence” gives a less connected network
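As a rough illustration of what an adjacency matrix for such a network looks like, the Python sketch below builds one from a list of regulator-to-target edges. The gene names and edges are taken from the regulator lists on the PHD1 and MAL33 slides, but the subset chosen, the function name `build_adjacency`, and the orientation (rows as targets, columns as regulators) are assumptions made for this example, not a description of YEASTRACT's actual output format.

```python
def build_adjacency(genes, edges):
    """Build a 0/1 adjacency matrix from (regulator, target) pairs.

    Assumed orientation: matrix[target_row][regulator_column] == 1
    means the regulator controls the target.
    """
    index = {gene: i for i, gene in enumerate(genes)}
    matrix = [[0] * len(genes) for _ in genes]
    for regulator, target in edges:
        matrix[index[target]][index[regulator]] = 1
    return matrix


# Small subset of the network, for illustration only.
genes = ["CIN5", "MAL33", "MBP1", "PHD1", "SKN7", "SMP1"]
edges = [
    ("CIN5", "PHD1"),
    ("SKN7", "PHD1"),
    ("PHD1", "PHD1"),  # self-regulation
    ("MBP1", "MAL33"),
    ("SMP1", "MAL33"),
]
adjacency = build_adjacency(genes, edges)

# The number of edges is the number of nonzero entries.
edge_count = sum(sum(row) for row in adjacency)
```

Counting nonzero entries this way is how a description such as "21-gene, 50-edge network" relates to the matrix: 21 rows and columns, 50 nonzero cells.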

GRNmap Estimates Parameters and Runs a Forward Simulation
Networks from YEASTRACT were formatted in an input sheet for MATLAB
  – Input sheet included log2 fold change data from wt and deletion strains
Outputs were obtained by fitting the model to wt data and the chosen deletion strain data
Production rates and weights were estimated
[Figure panels: Fix b, Estimated b]
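To make the idea of a forward simulation concrete, here is a minimal Python sketch. It assumes a sigmoidal production term minus linear degradation, dx_i/dt = P_i / (1 + exp(-(sum_j w_ij x_j - b_i))) - d_i x_i, which is a commonly described form for this kind of model; the slides do not give the equation, so the exact form, the sign convention for b, the simple Euler integrator, and all parameter values here are assumptions. GRNmap itself runs in MATLAB and is not reproduced here.

```python
import math


def simulate(weights, production, degradation, thresholds, x0,
             dt=0.01, t_end=60.0):
    """Euler forward simulation of an assumed sigmoidal network model.

    weights[i][j] is the influence of gene j on gene i (positive values
    activate, negative values repress). Returns the list of states over time.
    """
    n = len(x0)
    x = list(x0)
    trajectory = [list(x)]
    for _ in range(int(round(t_end / dt))):
        new_x = []
        for i in range(n):
            drive = sum(weights[i][j] * x[j] for j in range(n)) - thresholds[i]
            rate = production[i] / (1.0 + math.exp(-drive)) - degradation[i] * x[i]
            new_x.append(x[i] + dt * rate)
        x = new_x
        trajectory.append(list(x))
    return trajectory


# One unregulated gene with made-up rates: production 1.0, degradation 0.5.
# With no inputs and b = 0 the sigmoid is 0.5, so the steady state is
# 0.5 * P / d = 1.0.
traj = simulate(weights=[[0.0]], production=[1.0], degradation=[0.5],
                thresholds=[0.0], x0=[0.0])
final_value = traj[-1][0]
```

The example also shows why the production/degradation ratio matters when judging a fit: under this assumed model, that ratio sets the level a gene can settle at, independent of the weights.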

Estimated weights from GRNmap were visualized using GRNsight
Profile 16 Plus from STEM, using wt and dHAP4 data

GRNmap Testing SURP 2015
Analyzed each gene based on:
  – Fit (visual, SSE)
  – Dynamics (B&H p-value)
  – Dynamics of regulators (B&H p-value)
  – Output production/degradation rate ratio
Genes fell into three categories when looking at the validity of inputs
  – Inputs to the gene are wired correctly
  – Inputs to the gene are wired incorrectly
  – Validity of inputs is uncertain due to the number and type of estimated parameters
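The SSE fit criterion listed above is the sum of squared differences between the measured and modeled expression values for a gene. A minimal Python sketch follows; the function name and the example numbers are invented for illustration and are not GRNmap output.

```python
def sum_squared_error(observed, predicted):
    """Sum of squared differences between data and model at each time point."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


# Invented log2 fold changes at three time points (illustration only).
data = [0.0, 1.0, -1.0]
model = [0.0, 0.5, -1.0]
sse = sum_squared_error(data, model)
```

A lower SSE means the model curve passes closer to the data, which is why the slides pair it with a visual check of the fit.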

Analyzed Each Gene from wt Alone Run
21-gene, 50-edge weighted network

PHD1 is Modeled Well
Regulators: PHD1, CIN5, FHL1, SKN7, SKO1, SWI4, SWI6
[Figure: B&H p-value and weight labels for each regulator; legible weights: 0.16, 0.16, 0.14]
PHD1 has a good fit with significant dynamics
Most regulators also have significant dynamics, making the weights easier to estimate
Production rate is 3X degradation rate (a relatively stable value)
Although it is difficult to tell with so many inputs, PHD1’s model follows the trend of its inputs well
  – Initially activated, then slightly repressed as the two repressors (CIN5 and SKN7) increase their expression
PHD1’s inputs seem justified
Total repression: [value missing]
Total activation: 0.61

MAL33 is Modeled Poorly
Regulators: MBP1 and SMP1
[Figure: B&H p-value and weight labels; legible values: B&H p = 0.0101, weight 0.77]
Production rate is huge relative to other genes
  – The model is attempting to fit the large initial spike
Are these dynamics due to a regulator we’re not seeing?
Because inputs have no dynamics, it is difficult to estimate w’s and b
Unsure of MAL33 connection

YAP6 Could Be Modeled Well
Regulators: YAP6, CIN5, FHL1, FKH2, PHD1, SKN7, SKO1
[Figure: B&H p-value and weight labels for each regulator; legible weights: 0.26, 0.19]
YAP6 has significant dynamics and is modeled fairly well
Because YAP6’s regulators are mostly dynamic, the weights are probably estimated well
However, the validity of these inputs is uncertain without further knowledge of actual production and degradation rates
Estimated production rate is less than the degradation rate
  – This is contributing to the downward trend, even when the strongest weights (coming from genes with significant dynamics) are activating YAP6
Total repression: [value missing]
Total activation: 0.45

General Conclusions
Genes fell into three categories when looking at the validity of inputs
  – 5 genes have correctly wired inputs and are modeled well
  – 4 genes are modeled poorly
  – For the other 12 genes (and really all 21 genes), the validity of inputs is uncertain due to the number and type of estimated parameters
Genes with weaker dynamics are more difficult to model
It is difficult to make any conclusive statements about the connections in the network without knowing the production and degradation rates

Acknowledgments
Dr. Dahlquist
Dr. Fitzpatrick
Dondi
Natalie Williams