As discoveries of genetic polymorphisms in the human population expand, so do the opportunity and the challenge of correlating them with disease risk. There is thus a critical need to organize large-scale polymorphism analyses efficiently and to prioritize polymorphisms for further testing through experimental and epidemiologic studies. To address this need, we have developed a web-accessible server, PolyDoms, as a resource that can serve as an initial filter for identifying potentially high-impact nsSNPs (nonsynonymous single nucleotide polymorphisms). In particular, we have concentrated on potentially deleterious polymorphisms that might affect gene expression, protein function, and pathways, and that could lead to increased sensitivity to environmental agents and increased risk of diseases associated with loss of genomic integrity. We have now mapped all EGP (Environmental Genome Project) cSNPs (coding SNPs) onto the corresponding conserved and known functional protein domains from NCBI’s CDD (Conserved Domain Database) and onto the 3D structures from the PDB (Protein Data Bank). PolyDoms provides an interactive graphical web interface, is easy to update with new polymorphisms, and automatically retrieves relevant literature references. Through an automatic link to the PolyView server, polymorphisms are also mapped onto high-quality predictions of protein secondary structure and relative solvent accessibility, with polymorphic residues highlighted. This view is also available for each protein in the DNA repair and cell cycle control groups. In addition, predictions from nsSNP effect-prediction servers such as PolyPhen (Polymorphism Phenotyping) and SIFT (Sorting Intolerant from Tolerant) are obtained automatically and made available for each nsSNP.
We are also extending the range of polymorphisms studied to include insertions and deletions in putative regulatory regions of environmentally responsive genes, which are likely to have a much higher regulatory impact than SNPs.

PolyDoms: Web Resource for Initial Functional Characterization of Protein Polymorphisms

Jing Chen 1, Sivakumar Gowrisankar 1, Anil G Jegga 1, Robert Livingston 2, Andrew von Niederhausern 3, Dana C. Crawford 2, Christopher S. Carlson 2, Mark J. Rieder 2, Robert B. Weiss 3, Deborah A. Nickerson 2 and Bruce J Aronow 1

1 Biomedical Informatics, Cincinnati Children’s Hospital Medical Center and University of Cincinnati
2 Department of Genome Sciences, University of Washington
3 Department of Human Genetics, University of Utah

PolyDoms: Twenty-three of the NIEHS candidate genes have at least one nsSNP that occurs in a known protein functional domain and is predicted as deleterious or damaging by the SIFT and PolyPhen servers, respectively.

[Table: PolyDoms Database Statistics. Columns: NCBI RefSeq; NIEHS Candidates (Total, DNA Repair, Cell Cycle). Rows: Total number; With at least 1 nsSNP; With at least 1 nsSNP in a functional domain; With at least 1 nonsense SNP; With a known PDB structure; With a Pfam functional domain defined.]

Of the 127 NIEHS cell cycle candidate proteins with a known functional domain, 9 had a transcription factor E2F/dimerization domain.

Of the 90 NIEHS DNA repair candidate proteins with a known functional domain, 6 had a BRCT domain.

Additional gene targets from NCBI (not included in the NIEHS candidate gene list) with a BRCT domain

Gene symbol  No. of nsSNPs in a domain  Nonsense SNP count  Nonsense SNP  nsSNPs (NCBI-dbSNP) in a domain
FLJ                                     0                   0             Ser 7 Asn|Arg 76 Ser|Ala 117 Thr|Ile 209 Arg|His 219 Asp|Asp 297 Gly|Thr 587 Asn|Ala 666 Val|Pro 733 Ser
ADPRTL1      6                          0                   0             Ser 122 Asn|Tyr 215 Phe|Ala 899 Thr|Met 936 Val|Thr 936 Met|Ile 1012 Val
TOPBP1       5                          0                   0             His 58 Asn|Pro 212 Leu|Gln 370 Lys|Ile 550 Val|Ser 955 Asn
BARD1        3                          0                   0             Met 507 Val|Leu 653 Phe|Arg 658 Cys
CTDP1        3                          0                   0             Ser 282 Phe|Thr 340 Met|Pro 519 His
PAXIP1L      3                          1                   * 232 Trp     Ile 124 Leu|Ala 882 Thr|Met 979 Val
BRAP         2                          1                   * 489 Glu     Tyr 249 Asp|* 489 Glu
DNTT         1                          0                   0             Arg 112 Gly
ECT

DNA repair genes with at least one nsSNP occurring in a conserved domain and predicted as damaging/deleterious

Gene    No. of nsSNPs  nsSNPs
XRCC1   2              Trp 194 Arg; Arg 560 Trp
MPG     1              Pro 64 Leu
MGMT    1              Gly 160 Arg
POLG    1              Arg 1146 Cys
MLH1    1              His 718 Tyr
FANCC   1              Gly 139 Glu

6 cell cycle genes with at least one nsSNP occurring in a conserved domain and predicted as damaging/deleterious

Gene    No. of nsSNPs  nsSNPs
ORC3L   1              Arg 588 Cys
NBS1    1              Arg 169 Gly
DDC     1              Pro 210 Leu
MCM3    1              Asp 280 Val
ERBB2   1              Ala 1170 Pro
CDKN1B  1              Arg 15 Trp

Schematic representation of secondary structures and relative solvent accessibilities (RSA), with polymorphic residues highlighted. The RSA reflects the degree of a residue’s exposure to the surrounding solvent in the protein structure. The relative probability of disease-causing mutations is highest in the protein interior.

NIEHS Candidate Genes: Prioritized List

A search of the RefSeq protein database for additional protein targets with an E2F/dimerization domain returned 4 other proteins (3 hypothetical proteins and E2F7). E2F7 has an nsSNP (Ala324Asp) that occurs in the E2F domain of the protein.

Mutations at Arg residues account for almost 15% of disease mutations. A random mutation at a Trp or Cys residue has the highest probability of causing disease. Mutations at Gly, which is frequently present at the turns of alpha-helices, might have a negative impact on protein structural stability (Vitkup et al., 2003).
Of the 135 cell cycle proteins, 48 had at least one nsSNP affecting an Arg residue, 11 had at least one affecting a Trp residue, 15 had at least one affecting a Cys residue, and 30 had at least one nsSNP at a Gly residue.

References
1. PolyDoms
2. PolyView
3. SIFT
4. PolyPhen
5. GeneSNPs
6. NCBI RefSeq: ftp://ftp.ncbi.nih.gov/refseq/
7. NCBI-CDD

Support: NIEHS U01 ES11038 Mouse Centers Genomics Consortium

NIEHS Candidate Genes (not sequenced): List of genes with at least one nsSNP occurring in a domain and predicted as deleterious/damaging.

Gene      nsSNP          Frequency
BIRC1     Leu 1323 Trp   0|0|0|
CAPN2     Lys 568 Gln    0|
CKB       Lys 267 Glu    0|
CYP2A6    Leu 160 His    0|0|.1|.035|.022|
EPHX1     Arg 49 Cys     0|.021|0|.005|0|0|
EPHX1     Tyr 113 His    0|.38|0|0|0|.29|.24|0|
F13B      Tyr 543 Ser    .025|
F3        Arg 163 Trp    .025|
F5        Pro 809 Ser    .025|
F5        Asn 817 Thr    .1|0|0|0|0|
GTF2H2    Thr 199 Ile    0|0|0|
HSPA6     Pro 276 Leu    0|0|0|
HSPA6     Thr 297 Lys    0|
IL12A     Met 213 Thr    0|
MMP1      Asp 252 Gly    0|
NF1       Thr 354 Lys    0|
PTGS1     Arg 108 Gln    0|
SERPINA1  Cys 256 Trp    0|
SERPINA1  Glu 288 Val    .025|0|0|
SERPINA1  Gly 373 Trp    0|

NIEHS Candidate Genes (not sequenced): List of genes with at least one nsSNP predicted as deleterious/damaging.

Gene      nsSNP          Frequency
CYP2D6    Arg 365 His    0
CYP2D6    Met 451 Ile    0
FGFR4     Arg 388 Gly    0
HSPB1     Val 6 Phe      0

CYP2A6*1 (wild type) is responsible for the 7-hydroxylation of coumarin. A point mutation (T to A) in codon 160 leads to a single amino acid substitution (Leu to His), and the resulting protein, CYP2A6*2, is unable to 7-hydroxylate coumarin (Cok et al., 2001).

The EPHX1 (microsomal epoxide hydrolase) codon 113 Tyr/Tyr variant is associated with oropharyngeal carcinogenesis (Amador et al., 2002).

A significant correlation has been reported between the FGFR4 SNP (Arg 388 Gly) and prognosis in patients with soft tissue sarcoma. This SNP might be used to improve the prediction of clinical prognosis and could lead to new treatment strategies for these patients (Morimoto et al., 2003).
NIEHS-EGP sequenced genes that have at least one nsSNP occurring in a protein functional domain and predicted as probably damaging or deleterious by the PolyPhen and SIFT algorithms, respectively.

Gene      nsSNP          Frequency
CDKN1B    Arg 15 Trp      |0|.0056|
DDC       Pro 210 Leu    0|.0111|
ERBB2     Ala 1170 Pro   0|.4861| |0|0|.4824|0|
FANCC     Gly 139 Glu     |.0056|
MCM3      Asp 280 Val    0|
MGMT      Gly 160 Arg    0|
MLH1      His 718 Tyr    0|0|0|
MPG       Pro 64 Leu     0|.0056|
NBS1      Arg 169 Gly    0|
ORC3L     Arg 588 Cys    0|
PGR       Arg 625 Ile    0|
POLG      Arg 1146 Cys   0| |.0079|
PTGS2     Glu 488 Gly    .05|0|0|0|0|0|
STAT2     Gln 66 His     0|
STAT2     Leu 220 Pro    0|
STAT2     Thr 448 Met    0|.0169|
STAT2     Ser 501 Ile    0|
XRCC1     Arg 560 Trp    0|.0135|
XRCC1     Trp 194 Arg    .1193|0|0| |0|

There are no reports of disease implications for the Ala1170Pro nsSNP of ERBB2, although it is predicted as deleterious and occurs in a functional protein domain. There are, however, conflicting reports about another ERBB2 nsSNP, Ile655Val. It has been associated with an increased risk of breast cancer, particularly among younger women, but its frequency varies among ethnic groups (Ameyaw et al., 2002). SIFT and PolyPhen predicted it as tolerated and benign, respectively.
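The selection criterion behind the prioritized lists (an nsSNP that falls within a known functional domain and is predicted deleterious by SIFT and probably damaging by PolyPhen) can be illustrated with a minimal Python sketch. This is not the PolyDoms implementation: the `NsSNP` record, the `in_domain` and `prioritize` helpers, the gene names `GENE_A` and `GENE_B`, the domain coordinates, and the prediction labels attached to each toy variant are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# A domain is a closed interval of 1-based residue positions (hypothetical format).
Domain = Tuple[int, int]


@dataclass
class NsSNP:
    gene: str
    ref: str       # reference amino acid, three-letter code
    pos: int       # 1-based residue position
    alt: str       # variant amino acid, three-letter code
    sift: str      # assumed labels: "deleterious" or "tolerated"
    polyphen: str  # assumed labels: "probably damaging", "possibly damaging", "benign"

    @property
    def label(self) -> str:
        return f"{self.gene} {self.ref} {self.pos} {self.alt}"


def in_domain(pos: int, domains: List[Domain]) -> bool:
    """Return True if the residue position lies inside any domain interval."""
    return any(start <= pos <= end for start, end in domains)


def prioritize(snps: List[NsSNP], domains_by_gene: Dict[str, List[Domain]]) -> List[NsSNP]:
    """Keep nsSNPs that are in a domain and flagged by both predictors."""
    hits = []
    for snp in snps:
        domains = domains_by_gene.get(snp.gene, [])
        if (
            in_domain(snp.pos, domains)
            and snp.sift == "deleterious"
            and snp.polyphen == "probably damaging"
        ):
            hits.append(snp)
    return hits


# Toy data only: none of these coordinates or calls come from PolyDoms.
DOMAINS = {"GENE_A": [(100, 180)], "GENE_B": [(20, 90)]}
SNPS = [
    NsSNP("GENE_A", "Arg", 150, "Trp", "deleterious", "probably damaging"),
    NsSNP("GENE_A", "Ile", 300, "Val", "tolerated", "benign"),
    NsSNP("GENE_B", "Gly", 50, "Arg", "deleterious", "benign"),
    NsSNP("GENE_B", "Pro", 95, "Leu", "deleterious", "probably damaging"),
]

if __name__ == "__main__":
    for snp in prioritize(SNPS, DOMAINS):
        print(snp.label)
```

In this sketch only the first toy variant passes all three tests. The others fail because they lie outside every domain interval or because one of the two predictors does not flag them. Requiring agreement between both predictors is one possible reading of the criterion; a more permissive filter could accept a call from either one.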