DRS data: “noisy” signal analysis
Signal vs. “noisy” signal
Several replicates allow an estimate of the noise level: if a site has reads in one replicate and none in the other replicates, it is a “dubious” noise site. We can/should/must exclude such a site from consideration.
Here we follow another prescription, a hard cut:
- Join replicates into one data set
- Build bins (3 nts per bin)
- Normalise data
- Choose bins with a cut (Nreads > 4.0 norm. reads): signal bins
- Find genes (TAIR9 gene +/- 25 nts) with signal bins, Nreads > 5.0
Result: “noisy” signal
Arabidopsis Project Dundee, 29/06/2010
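The hard-cut prescription above can be sketched in a few lines of Python. This is only an illustration, not the code used for the slides. Several things are assumptions: reads are given as (chromosome, strand, position) tuples, normalisation rescales to a fixed total (`TARGET_TOTAL`, a hypothetical value), and the gene-level Nreads is taken as the sum of signal-bin reads inside the gene window. All function names are made up for the example.

```python
from collections import Counter

BIN_SIZE = 3              # nts per bin (from the slide)
BIN_CUT = 4.0             # normalised reads per bin (from the slide)
GENE_CUT = 5.0            # normalised reads per gene (from the slide)
FLANK = 25                # nts added on each side of a TAIR9 gene (from the slide)
TARGET_TOTAL = 1_000_000  # ASSUMPTION: normalise every sample to this total


def bin_reads(replicates, bin_size=BIN_SIZE):
    """Join all replicates and count reads per (chrom, strand, bin index)."""
    counts = Counter()
    for reads in replicates:
        for chrom, strand, pos in reads:
            counts[(chrom, strand, pos // bin_size)] += 1
    return counts


def normalise(counts, target_total=TARGET_TOTAL):
    """Rescale raw bin counts so that they sum to target_total."""
    scale = target_total / sum(counts.values())
    return {b: n * scale for b, n in counts.items()}


def signal_bins(norm_counts, cut=BIN_CUT):
    """Keep only bins above the hard cut."""
    return {b: n for b, n in norm_counts.items() if n > cut}


def expressed_genes(sig_bins, genes, flank=FLANK, cut=GENE_CUT,
                    bin_size=BIN_SIZE):
    """genes maps gene id -> (chrom, strand, start, end).

    ASSUMPTION: a gene's Nreads is the sum over signal bins that fall
    inside the gene extended by `flank` nts on both sides.
    """
    result = {}
    for gene_id, (chrom, strand, start, end) in genes.items():
        lo = (start - flank) // bin_size
        hi = (end + flank) // bin_size
        total = sum(n for (c, s, b), n in sig_bins.items()
                    if c == chrom and s == strand and lo <= b <= hi)
        if total > cut:
            result[gene_id] = total
    return result
```

The linear scan over all signal bins for every gene is fine for a sketch; a real run over tens of thousands of signal bins would more likely index bins per chromosome first.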
DRS data statistics
WT sample (2 biological replicates):
- Nreads = 6567.0 Kreads
- Data normalised: 6.6 real reads correspond to 1 norm. read!
- Nsites = 723.6 Ksites (Nbin = 3 nts)
- 32 834 signal sites (Nreads > 4.0 norm. reads)
- 7459 expressed genes (Nreads > 5.0)
FPAox sample (3 biological replicates):
- Nreads = 11646.7 Kreads
- Nsites = 1008.5 Ksites (Nbin = 3 nts)
- 32 224 signal sites (Nreads > 4.0 norm. reads)
- 7357 expressed genes
fpa-8 sample (3 biological replicates):
- Nreads = 1219.4 Kreads
- Nsites = 1049.2 Ksites (Nbin = 3 nts)
- 32 475 signal sites
- 7010 expressed genes
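To get a feeling for what the cuts mean in real reads for WT, the factor quoted above (6.6 real reads per normalised read) can be applied directly. The short sketch below uses only numbers from the slide; the helper name `to_raw_reads` is made up for the example.

```python
WT_RAW_PER_NORM = 6.6   # from the slide: 6.6 real reads = 1 normalised read (WT)
WT_NREADS_K = 6567.0    # from the slide: WT total, in Kreads


def to_raw_reads(norm_reads, raw_per_norm=WT_RAW_PER_NORM):
    """Convert a normalised read count back to real reads."""
    return norm_reads * raw_per_norm


bin_cut_raw = to_raw_reads(4.0)    # signal-bin cut in real reads
gene_cut_raw = to_raw_reads(5.0)   # expressed-gene cut in real reads
implied_total_k = WT_NREADS_K / WT_RAW_PER_NORM  # normalised WT total, in Kreads

print(f"bin cut  ~ {bin_cut_raw:.1f} real reads")
print(f"gene cut ~ {gene_cut_raw:.1f} real reads")
print(f"normalised WT total ~ {implied_total_k:.0f} Kreads")
```

For WT the bin cut is therefore about 26 real reads and the gene cut about 33. The implied normalised total of about 995 Kreads would be consistent with scaling to roughly one million reads, but the slides do not state the target explicitly.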
Expressed TAIR9 genes
Analysis
- Build bins and select “noisy” signal bins in 3 data sets: WT, FPAox, fpa-8
- Select expressed genes
- Analyse differential expression in 3 data sets:
  (1) Select genes which are expressed in all 3 datasets
  (2) Select genes which are expressed in 2 datasets
  (3) Select genes which are expressed in 1 dataset only
- Apply cuts for these gene sets:
  Down-regulated in FPAox and up-regulated in fpa-8 for (1) and (2)
  Up-regulated in both FPAox and fpa-8 for (3)
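The split into gene sets (1), (2) and (3) can be sketched as follows. This is a minimal illustration under the assumption that each sample has already been reduced to a set of expressed gene identifiers; the function name and the toy identifiers are hypothetical.

```python
SAMPLES = ("WT", "FPAox", "fpa-8")


def group_by_expression(expressed, samples=SAMPLES):
    """Group genes by the number of samples in which they are expressed.

    expressed maps sample name -> set of expressed gene ids.
    Returns {3: set (1), 2: set (2), 1: set (3)} in the slide's numbering.
    """
    groups = {1: set(), 2: set(), 3: set()}
    all_genes = set().union(*(expressed[s] for s in samples))
    for gene in all_genes:
        n = sum(gene in expressed[s] for s in samples)
        groups[n].add(gene)
    return groups


# Toy input with made-up gene ids
toy = {
    "WT": {"geneA", "geneB"},
    "FPAox": {"geneA", "geneB", "geneC"},
    "fpa-8": {"geneA", "geneD"},
}
groups = group_by_expression(toy)
print(sorted(groups[3]), sorted(groups[2]), sorted(groups[1]))
```

The dictionary is keyed by the number of samples, so set (1) of the slide is `groups[3]`, set (2) is `groups[2]` and set (3) is `groups[1]`.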
All expressed genes in all 3 samples (log scale): 6293 genes
Let's add gene identifiers. Mess...
Let's apply a condition: down-regulated in FPAox and up-regulated in fpa-8
Genes in (2) and (3): expressed in 1 or 2 data sets. 1965 genes
Let's remove genes expressed in 1 data set only
… and apply the same condition → 8 genes
A distribution of genes expressed in one sample only
Set a cut → 13 genes
List of expressed genes
Expressed genes in all three datasets (condition: FPAox/WT < 1.0 and fpa8/WT > 2.0)
AT1G14700 PAP3 (PURPLE ACID PHOSPHATASE 3)
AT1G48300 protein coding: unknown protein
AT1G63940 protein coding: monodehydroascorbate reductase
AT1G74090 SOT18 (DESULFO-GLUCOSINOLATE SULFOTRANSFERASE 18)
AT1G74210 protein coding: glycerophosphoryl diester phosphodiesterase family protein
AT1G43560 Aty2 (Arabidopsis thioredoxin y2)
AT2G14740 ATVSR3 (ARABIDOPSIS THALIANA VACUOLAR SORTING RECEPTOR 3)
AT2G18950 HPT1 (HOMOGENTISATE PHYTYLTRANSFERASE 1)
AT2G32860 BGLU33 (BETA GLUCOSIDASE 33)
AT2G18193 protein coding: AAA-type ATPase family protein
AT3G10310 protein coding: ATP binding / microtubule motor
AT3G25770 AOC2 (ALLENE OXIDE CYCLASE 2)
AT3G48310 CYP71A22
AT3G62750 BGLU8 (BETA GLUCOSIDASE 8)
AT3G21720 ICL (ISOCITRATE LYASE)
AT3G51750 protein coding: unknown protein
AT3G55120 TT5 (TRANSPARENT TESTA 5)
AT4G00030 protein coding: plastid-lipid associated protein PAP / fibrillin family protein
AT4G15210 BAM5 (BETA-AMYLASE 5)
AT4G18440 protein coding: adenylosuccinate lyase, putative / adenylosuccinase, putative
AT5G14200 protein coding: 3-isopropylmalate dehydrogenase, chloroplast, putative
AT5G24160 SQE6 (SQUALENE MONOXYGENASE 6)
List of expressed genes (2)
Expressed genes in two datasets (condition: FPAox/WT < 1.0 and fpa8/WT > 2.0)
AT2G47970 protein coding: NPL4 family protein
AT2G45560 CYP76C1
AT3G10450 SCPL7 (SERINE CARBOXYPEPTIDASE-LIKE 7)
AT3G56360 protein coding: unknown protein
AT5G16590 LRR1
AT5G46330 FLS2 (FLAGELLIN-SENSITIVE 2)
AT5G53480 protein coding: importin beta-2, putative
AT5G62720 protein coding: integral membrane HPP family protein
Expressed genes in one dataset only (condition: FPAox/WT > 10.0 or fpa8/WT > 10.0)
AT1G21250 WAK1 (CELL WALL-ASSOCIATED KINASE)
AT1G72060 protein coding: serine-type endopeptidase inhibitor
AT2G18690 protein coding: unknown protein
AT2G43410 FPA
AT2G43570 protein coding: chitinase, putative
AT2G43620 protein coding: chitinase, putative
AT3G30720 QQS (QUA-QUINE STARCH)
AT3G57260 BGL2 (BETA-1,3-GLUCANASE 2)
AT4G27140 protein coding: 2S seed storage protein 1 / 2S albumin storage protein / NWMU1-2S albumin 1
AT4G27160 AT2S3
AT4G28520 CRU3 (CRUCIFERIN 3)
AT5G10140 FLC (FLOWERING LOCUS C)
AT5G50860 protein coding: protein kinase family protein
List of expressed genes (3)
Expressed genes in three datasets (condition: FPAox/WT > 2.0 and fpa8/WT < 1.0)
AT1G02920 GSTF7
AT1G06550 protein: enoyl-CoA hydratase/isomerase family protein
AT1G14870 undefined
AT1G24147 protein: unknown protein
AT1G62540 FMO GS-OX2 (FLAVIN-MONOOXYGENASE GLUCOSINOLATE S-OXYGENASE 2)
AT1G65845 protein: unknown protein
AT1G72970 HTH (HOTHEAD)
AT2G03780 protein: translin family protein
AT2G37710 RLK (receptor lectin kinase)
AT2G41090 protein: calmodulin-like calcium-binding protein, 22 kDa (CaBP-22)
AT3G11820 SYP121 (SYNTAXIN OF PLANTS 121)
AT3G26200 CYP71B22
AT3G44720 ADT4 (arogenate dehydratase 4)
AT3G52400 SYP122 (SYNTAXIN OF PLANTS 122)
AT4G02520 ATGSTF2 (GLUTATHIONE S-TRANSFERASE PHI 2)
AT4G08470 MAPKKK10
AT4G12490 protein: protease inhibitor/seed storage/lipid transfer protein (LTP) family protein
AT4G17070 protein: peptidyl-prolyl cis-trans isomerase
AT4G17570 protein: zinc finger (GATA type) family protein
AT4G25810 XTR6 (XYLOGLUCAN ENDOTRANSGLYCOSYLASE 6)
AT5G47990 CYP705A5
AT5G48380 protein: leucine-rich repeat family protein / protein kinase family protein
AT5G64120 protein: peroxidase, putative
List of expressed genes (4)
Expressed genes in two datasets (condition: FPAox/WT > 2.0 and fpa8/WT < 1.0)
AT1G07135 protein: glycine-rich protein
AT1G25220 ASB1 (ANTHRANILATE SYNTHASE BETA SUBUNIT 1)
AT1G27730 STZ (salt tolerance zinc finger)
AT1G34750 protein: protein phosphatase 2C, putative / PP2C, putative
AT1G49410 TOM6 (translocase of the outer mitochondrial membrane 6)
AT1G65500 protein: unknown protein
AT1G73650 protein: oxidoreductase, acting on the CH-CH group of donors
AT1G77420 protein: hydrolase, alpha/beta fold family protein
AT2G04400 protein: indole-3-glycerol phosphate synthase (IGPS)
AT2G14610 PR1 (PATHOGENESIS-RELATED GENE 1)
AT2G17540 protein: unknown protein
AT2G26530 AR781
AT2G30250 WRKY25
AT2G31880 protein: leucine-rich repeat transmembrane protein kinase, putative
AT2G35410 protein: 33 kDa ribonucleoprotein, chloroplast, putative / RNA-binding protein cp33, putative
AT2G48020 protein: sugar transporter, putative
AT3G07590 protein: small nuclear ribonucleoprotein D1, putative / snRNP core protein D1, putative
AT3G09085 protein: unknown protein
AT3G26230 CYP71B24
AT3G46280 protein: protein kinase-related
AT3G56400 WRKY70
AT5G08760 protein: unknown protein
AT5G16990 protein: NADP-dependent oxidoreductase, putative
AT5G19240 protein: unknown protein
AT5G48540 protein: 33 kDa secretory protein-related
AT5G55450 protein: protease inhibitor/seed storage/lipid transfer protein (LTP) family protein
AT5G61600 protein: ethylene-responsive element-binding family protein
AT5G63130 protein: octicosapeptide/Phox/Bem1p (PB1) domain-containing protein
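The three ratio conditions quoted in the gene lists can be written as simple predicates on normalised read counts per gene. The sketch below is illustrative only: the function names are made up, and the handling of a zero WT count (ratio treated as infinite) is an assumption, since the slides do not say how such genes were treated.

```python
import math


def ratio(sample, wt):
    """sample/WT ratio of normalised reads.

    ASSUMPTION: with WT == 0 the ratio is infinite if the sample has
    reads and 0.0 otherwise; the slides do not specify this case.
    """
    if wt == 0:
        return math.inf if sample > 0 else 0.0
    return sample / wt


def down_fpaox_up_fpa8(wt, fpaox, fpa8):
    """Condition FPAox/WT < 1.0 and fpa8/WT > 2.0."""
    return ratio(fpaox, wt) < 1.0 and ratio(fpa8, wt) > 2.0


def up_fpaox_down_fpa8(wt, fpaox, fpa8):
    """Condition FPAox/WT > 2.0 and fpa8/WT < 1.0."""
    return ratio(fpaox, wt) > 2.0 and ratio(fpa8, wt) < 1.0


def strongly_up_in_either(wt, fpaox, fpa8):
    """Condition FPAox/WT > 10.0 or fpa8/WT > 10.0 (one-dataset genes)."""
    return ratio(fpaox, wt) > 10.0 or ratio(fpa8, wt) > 10.0


# Toy counts (WT, FPAox, fpa-8) in normalised reads, not real data
print(down_fpaox_up_fpa8(10.0, 5.0, 25.0))
print(up_fpaox_down_fpa8(10.0, 25.0, 5.0))
print(strongly_up_in_either(1.0, 0.5, 11.0))
```

All three calls on the toy counts return True. Keeping the thresholds inside small named predicates makes it easy to reuse the same cut for the three-dataset and two-dataset gene sets.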