Preliminary Exploratory Data Analysis for Ruth’s G2P Workflow Bernice Rogowitz 4/30/2010.


1 Preliminary Exploratory Data Analysis for Ruth’s G2P Workflow Bernice Rogowitz 4/30/2010

2 Exploratory Data Analysis Exercise. A look at Bjorn’s experimental data with Lecong’s Lemma analysis. Main focus: to demonstrate the potential of exploratory data analysis techniques for G2P visualization, and to demonstrate ViVA capabilities. Preliminary identification of “interesting” clusters of genes, which can feed additional analyses of pathway and metabolic activity. This type of analysis can pave the way to creating an exploratory analysis component based on ViVA for integration into Ruth’s workflow.

3 Experimental Data. 15,085 genes. 2 experiments: Long Exposure and Short Exposure. 5 temperatures for each: 17, 14, 12, 10, and 8 degrees Celsius. Control condition for each experiment: 20 degrees Celsius. Data: gene expression values, and statistical significance relative to the control condition. A quick look at expression value distributions.

4 Distribution of p values for short and long experiments Notice: lots of highly significant values, p<=.05

5 Exploratory data analysis. For each condition, mark values where p<=0.05 in green. For each condition, mark values where p>0.05 in black. Result: green = all those genes that are significant in all the short conditions. Note: none are significantly different from the control in the warmest condition, 17 degrees Celsius.
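The marking rule on this slide can be sketched in Python. This is only an illustration under stated assumptions, not how ViVA does it: the gene IDs and p-values are made up, the helper names `mark_significant` and `significant_in_all` are hypothetical, and the threshold is assumed to be 0.05, in line with the p<=.05 cutoff noted for the p-value distributions.

```python
ALPHA = 0.05  # assumed significance threshold

# Hypothetical p-values per gene, keyed by short-exposure temperature (deg C).
p_values = {
    "gene_a": {14: 0.01, 12: 0.02, 10: 0.001, 8: 0.0004},
    "gene_b": {14: 0.20, 12: 0.03, 10: 0.04, 8: 0.01},
    "gene_c": {14: 0.04, 12: 0.05, 10: 0.03, 8: 0.02},
}


def mark_significant(p_by_condition, alpha=ALPHA):
    """Map each condition to 'green' (p <= alpha) or 'black' (otherwise)."""
    return {cond: ("green" if p <= alpha else "black")
            for cond, p in p_by_condition.items()}


def significant_in_all(p_values, conditions, alpha=ALPHA):
    """Return sorted gene IDs with p <= alpha in every listed condition."""
    return sorted(gene for gene, ps in p_values.items()
                  if all(ps[c] <= alpha for c in conditions))


marks = {gene: mark_significant(ps) for gene, ps in p_values.items()}
all_green = significant_in_all(p_values, [14, 12, 10, 8])
```

With these made-up values, `gene_b` is marked black at 14C and so drops out of the "green in every condition" set.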

6 Identify genes by ID in “Category Table”

7 Genes that are significantly different from controls across 14C, 12C, 10C and 8C Short Experiment

8 Genes that are significantly different from controls across 14C, 12C, 10C and 8C Long Experiment. The “yellow” genes are significantly different from controls in both the short and long experiments.

9 Genes that are significantly different from controls across 14C, 12C, 10C and 8C Short Experiment

10 Two genes. At 17C, no genes behaved differently from the control. Two genes were significantly different from their controls in all other experimental conditions: in both the long- and short-duration experiments, for temperature = 14C, 12C, 10C and 8C. These are 246114_at and 257252_at.
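Finding the genes common to both experiments is a set intersection. A minimal sketch follows; the two IDs reported on this slide are real, while the other set members (`example_1_at`, `example_2_at`) are hypothetical placeholders added only so the intersection has something to exclude.

```python
# Genes significant across 14C, 12C, 10C and 8C in each experiment.
# Only 246114_at and 257252_at come from the slide; the rest are placeholders.
short_significant = {"246114_at", "257252_at", "example_1_at"}
long_significant = {"246114_at", "257252_at", "example_2_at"}

# Genes significant in both the short and the long experiment.
in_both = sorted(short_significant & long_significant)
```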

11 Another Analysis. Identify genes that are differentially expressed in the different experiments and conditions. First, identify genes that are visually different from controls. Second, filter out identified genes that are not statistically different from the controls.
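The two-step procedure can be approximated in code. Note the substitution: in the presentation the first step is done by eye in scatterplots, whereas this sketch stands in an absolute log2 expression ratio cutoff for "visually different". The cutoff `MIN_ABS_LOG2`, the threshold `ALPHA`, and all data values are assumptions for illustration.

```python
import math

MIN_ABS_LOG2 = 1.0  # assumed stand-in for "visually different" (2-fold change)
ALPHA = 0.05        # assumed significance threshold

# Hypothetical records: expression in one condition, control expression, p-value.
genes = [
    {"id": "g1", "expr": 400.0, "control": 100.0, "p": 0.003},
    {"id": "g2", "expr": 380.0, "control": 100.0, "p": 0.30},
    {"id": "g3", "expr": 110.0, "control": 100.0, "p": 0.01},
]


def log2_ratio(expr, control):
    """Log2 of the expression ratio relative to the control."""
    return math.log2(expr / control)


# Step 1: keep genes that sit far from the control.
candidates = [g for g in genes
              if abs(log2_ratio(g["expr"], g["control"])) >= MIN_ABS_LOG2]

# Step 2: drop candidates that are not statistically different from control.
confirmed = [g["id"] for g in candidates if g["p"] <= ALPHA]
```

Here `g2` looks different but fails the significance filter, and `g3` is significant but too close to the control to be picked out in the first step.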

12 Expression vs. Control Short Experiment Long Experiment

13 Visual Exploration: Short Condition 1. Visually identify genes in the short-14 condition that are different from controls; color them red. 2. Examine the short-8 condition. Additional genes identified in short-8 that are visually different from controls are colored green.

14 Visual Exploration: Short Condition 1. Genes that are active in the short-14 condition tend to be more highly differentiated at the lower temperature, short-8 (red). 2. Additional genes are differentiated at the lower temperature (green).

15 Visual Exploration: Long Condition 1. Some red and green genes are also visually differentiated in the long condition. 2. Additional genes that are neither green nor red are visually differentiated; color them blue.

16 Visual Exploration: Return to the Short Condition 1. Genes that are differentiated in the long condition (blue) are basically not differentiated in the short condition (no blue genes are present).

17 S-14 S-8 L-8

18

19 “Blue” Genes. 28 genes were identified that are very different from the control for the long-8 condition, but not for the short conditions. Next: which of these are significantly different?

20 ViVA “Category Table”. Table sorted by p-value. Not all “blue” genes are significantly different from the control. To see which ones are, look at the blue genes only.

21 Identifying p<=0.05 deviations. p<=0.05; p<=0.01

22 Genes that are significantly expressed for long, but not short, duration. p<=0.05; p<=0.01
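Selecting genes that are significant for the long duration but not the short one, at each of the two thresholds, might look like the sketch below. The gene names, p-values, and the helper `long_only` are hypothetical, and "not significant for short" is assumed to mean p above the same threshold.

```python
# Hypothetical (long-8 p-value, short-8 p-value) pairs for "blue" candidates.
blue_genes = {
    "blue_1": (0.004, 0.40),
    "blue_2": (0.03, 0.02),
    "blue_3": (0.20, 0.60),
    "blue_4": (0.03, 0.50),
}


def long_only(p_pairs, alpha):
    """Genes with p <= alpha for long duration and p > alpha for short."""
    return sorted(gene for gene, (p_long, p_short) in p_pairs.items()
                  if p_long <= alpha and p_short > alpha)


at_05 = long_only(blue_genes, 0.05)
at_01 = long_only(blue_genes, 0.01)
```

Tightening the threshold from 0.05 to 0.01 shrinks the list, which is the comparison the two panels on this slide make.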

23 Goal: Venn Diagram of Co-Expression (a la Ruth). Diagram labels: Short 14, Long 14, Long 8, Short 8; count shown: 2.
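The counts for each region of such a four-set Venn diagram can be computed before any drawing. In this minimal sketch the set contents are invented, and `venn_regions` is a hypothetical helper; only the four condition labels come from the slide.

```python
from collections import Counter

# Hypothetical sets of significant genes per condition.
significant = {
    "Short 14": {"g1", "g2", "g3"},
    "Short 8": {"g1", "g2", "g4"},
    "Long 14": {"g1", "g2", "g5"},
    "Long 8": {"g1", "g2", "g5", "g6"},
}


def venn_regions(named_sets):
    """Count genes per exclusive region, keyed by the tuple of sets they are in."""
    all_items = set().union(*named_sets.values())
    regions = Counter()
    for item in all_items:
        key = tuple(name for name in named_sets if item in named_sets[name])
        regions[key] += 1
    return regions


regions = venn_regions(significant)
center = regions[("Short 14", "Short 8", "Long 14", "Long 8")]
```

Each gene lands in exactly one region, so the region counts sum to the number of distinct genes.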

24 Next step. Translate gene IDs to a searchable format. Use PlantMetGen and MapMan to identify the involved pathways and metabolic functions.

