SPH 247 Statistical Analysis of Laboratory Data
May 14, 2013
Annotation
Given that one has found one or more differentially expressed genes, there are a number of useful things to know:
  What is the putative function?
  What pathways are known to contain this gene?
  What other proteins interact with the given protein?
  etc.
Two-color array example
> alldata[1,]
 [1]
[16]
> geneID[1,]
  Name ID
1 NM_ discoidin domain receptor family, member
Official Symbol: DDR2 (provided by HGNC)
Official Full Name: discoidin domain receptor tyrosine kinase 2 (provided by HGNC)
Primary source: HGNC:2731
Locus tag: RP11-572K18.1
See related: Ensembl:ENSG ; HPRD:01868; MIM:191311; Vega:OTTHUMG
Gene type: protein coding
RefSeq status: REVIEWED
Organism: Homo sapiens
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini; Hominidae; Homo
Also known as: TKT; MIG20a; NTRKR3; TYRO10
Summary: Receptor tyrosine kinases (RTKs) play a key role in the communication of cells with their microenvironment. These molecules are involved in the regulation of cell growth, differentiation, and metabolism. In several cases the biochemical mechanism by which RTKs transduce signals across the membrane has been shown to be ligand-induced receptor oligomerization and subsequent intracellular phosphorylation. This autophosphorylation leads to phosphorylation of cytosolic targets as well as association with other molecules, which are involved in pleiotropic effects of signal transduction. RTKs have a tripartite structure with extracellular, transmembrane, and cytoplasmic regions. This gene encodes a member of a novel subclass of RTKs and contains a distinct extracellular region encompassing a factor VIII-like domain. Alternative splicing in the 5' UTR results in multiple transcript variants encoding the same protein. [provided by RefSeq, Jul 2008]
Affy Example
> source("
> biocLite("annaffy")
> biocLite("hgu95av2.db")
> library(annaffy)
> library(affy)
Loading required package: Biobase
Loading required package: tools
…
Loading required package: GO
Loading required package: KEGG
> probeids <- featureNames(eset)[pv2$Posterior.FDR < 0.05]
> probeids[1:5]
[1] "1005_at"   "1009_at"   "1034_at"   "1035_g_at" "1045_s_at"
> symbols <- aafSymbol(probeids, "hgu95av2.db")
Loading required package: hgu95av2
> symbols[1]
An object of class "aafList"
[[1]]
An object of class "aafSymbol"
[1] "DUSP1"
> getText(symbols[1])
[1] "DUSP1"
> descs <- aafDescription(probeids, "hgu95av2.db")[1]
> getText(descs)[1]
[1] "dual specificity phosphatase 1"
> gos <- aafGO(probeids, "hgu95av2.db")
> gos[1]
An object of class "aafList"
[[1]]
An object of class "aafGO"
[[1]][[1]]
An object of class "protein amino acid "Biological "IEA"
[[1]][[2]]
An object of class "response to oxidative "Biological "TAS"
[[1]][[3]]
An object of class "cell "Biological "IEA"
[[1]][[4]]
An object of class "non-membrane spanning protein tyrosine phosphatase "Molecular "TAS"
[[1]][[5]]
An object of class "protein "Molecular "IPI"
[[1]][[6]]
An object of class "hydrolase "Molecular "IEA"
[[1]][[7]]
An object of class "MAP kinase tyrosine/serine/threonine phosphatase "Molecular "IEA"
GO Evidence Codes
IEA = inferred from electronic annotation (e.g., BLAST). Uncurated.
TAS = traceable author statement (i.e., someone said so).
IDA = inferred from direct assay
IEP = inferred from expression pattern
IGI = inferred from genetic interaction
IMP = inferred from mutant phenotype
IPI = inferred from physical interaction
ISS = inferred from sequence similarity
NAS = non-traceable author statement
ND = no biological data available
NR = not recorded
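The evidence codes can be kept in a small lookup table so that the GO output for a probe is easier to summarize. The following is only a sketch in base R: the names go_evidence, dusp1_evidence, and evidence_summary are illustrative and are not part of annaffy, and the seven codes are typed in by hand from the gos[1] output for DUSP1 rather than extracted from the aafGO object.

```r
# Descriptions copied from the evidence-code list in these slides.
go_evidence <- c(
  IEA = "inferred from electronic annotation",
  TAS = "traceable author statement",
  IDA = "inferred from direct assay",
  IEP = "inferred from expression pattern",
  IGI = "inferred from genetic interaction",
  IMP = "inferred from mutant phenotype",
  IPI = "inferred from physical interaction",
  ISS = "inferred from sequence similarity",
  NAS = "non-traceable author statement",
  ND  = "no biological data available",
  NR  = "not recorded"
)

# Assumption: evidence codes of the seven GO terms printed for the first
# probe (DUSP1), transcribed by hand from the gos[1] output.
dusp1_evidence <- c("IEA", "TAS", "IEA", "TAS", "IPI", "IEA", "IEA")

# Count how often each code occurs and attach its description.
evidence_counts <- table(dusp1_evidence)
evidence_summary <- data.frame(
  code = names(evidence_counts),
  n = as.integer(evidence_counts),
  meaning = unname(go_evidence[names(evidence_counts)]),
  stringsAsFactors = FALSE
)
print(evidence_summary)

# IEA is the only code the slides describe as uncurated.
n_uncurated <- sum(dusp1_evidence == "IEA")
```

Counting the uncurated (IEA) terms separately gives a quick sense of how much of a gene's annotation rests on electronic inference alone.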
Online Access
> gbs <- aafGenBank(probeids, "hgu95av2.db")
> getURL(gbs[[1]])
[1] " fcgi?cmd=search&db=nucleotide&term=X68277%5BACCN%5D&doptcmdl=GenBank"
> lls <- aafLocusLink(probeids, "hgu95av2.db")
> getURL(lls[[1]])
[1] " Db=gene&Cmd=DetailsSearch&Term=1843"
Abstracts
> pmids <- aafPubMed(probeids, "hgu95av2.db")
> pmids[[1]]
An object of class "aafPubMed"
 [1]
[13]
> pmids[1]
An object of class "aafPubMed"
 [1]
 [9]
[17]
[25]
[33]
[41]
> browseURL(getURL(lls[[1]]))
Direct Browsing
> browseURL(getURL(lls[[1]]))
> browseURL(getURL(gbs[[1]]))
> browseURL(getURL(pmids[1]))
Top Genes
> probeids.ord <- featureNames(eset)[order(pv1$Posterior)]
> getText(aafSymbol(probeids.ord[1:10], "hgu95av2.db"))
 [1] ""       "PSPHP1" ""       "COPA"   ""       "GM2A"   "S100A2" "RPLP1"  ""       ""
> getText(aafDescription(probeids.ord[1:10], "hgu95av2.db"))
 [1] ""                                        "phosphoserine phosphatase pseudogene 1"
 [3] ""                                        "coatomer protein complex, subunit alpha"
 [5] ""                                        "GM2 ganglioside activator"
 [7] "S100 calcium binding protein A2"         "ribosomal protein, large, P1"
 [9] ""                                        ""
> aafGO(probeids.ord[7], "hgu95av2.db")
An object of class "aafList"
[[1]]
An object of class "aafGO"
[[1]][[1]]
An object of class "calcium ion "Molecular "NAS"
[[1]][[2]]
An object of class "Cellular "ND"
[[1]][[3]]
An object of class "endothelial cell "Biological "IMP"
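Half of the ten top-ranked probes come back with an empty symbol, so it can help to separate annotated from unannotated probes before looking anything up. This is a minimal base-R sketch: top_symbols is typed in by hand from the getText(aafSymbol(...)) output for probeids.ord[1:10], and the names has_symbol and annotated are illustrative only.

```r
# Assumption: symbols for the ten top-ranked probes, copied from the
# getText(aafSymbol(probeids.ord[1:10], "hgu95av2.db")) output.
top_symbols <- c("", "PSPHP1", "", "COPA", "", "GM2A", "S100A2", "RPLP1", "", "")

# An empty string means no symbol was returned for that probe.
has_symbol <- nzchar(top_symbols)

# Keep each annotated probe's position in the ordered list next to its symbol.
annotated <- data.frame(
  rank = which(has_symbol),
  symbol = top_symbols[has_symbol],
  stringsAsFactors = FALSE
)
print(annotated)

# Number of top-ten probes without a symbol.
n_unannotated <- sum(!has_symbol)
```

With the real objects, the same logical vector could be used to subset probeids.ord before calling aafGO or aafPubMed, so that unannotated probes are reported separately rather than silently returning empty results.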