SNP Resources: Finding SNPs, Databases and Data Extraction Debbie Nickerson NIEHS SNPs Workshop
Genotype - Phenotype Studies
  You have a candidate gene/region/pathway of interest and samples ready to study:
  What SNPs are available?
  How do I find the common SNPs?
  What is the validation/quality of the SNPs?
  Are these SNPs informative in my population/samples?
  What information can I download?
  How do I pick the "best" SNPs? - Dana Crawford
Minimal SNP information for genotyping/characterization
  What is the SNP? Flanking sequence and alleles, in FASTA format:
    >snp_name
    ACCGAGTAGCCAG [A/G] ACTGGGATAGAAC
  dbSNP reference SNP # (rs #)
  Where is the SNP mapped? Exon, promoter, UTR, etc.
  How was it discovered? Method
  What assurances do you have that it is real? Validated how?
  What population: African, European, etc.?
  What is the allele frequency of each SNP? Common (>5%), rare
  Are other SNPs associated (redundant)?
  Is genotyping data for control populations available?
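A minimal sketch of how a record in the flank/allele layout shown above might be split into its parts before ordering an assay. The function name `parse_snp_record` and the returned field names are hypothetical, and the pattern assumes exactly two alleles in brackets, as in the slide's example.

```python
import re

# Assumed layout, taken from the slide example: LEFT_FLANK [A/G] RIGHT_FLANK
SNP_PATTERN = re.compile(
    r"^([ACGTN]+)\s*\[([ACGT-]+)/([ACGT-]+)\]\s*([ACGTN]+)$", re.IGNORECASE
)

def parse_snp_record(header, sequence):
    """Split a '>name' header and a flank[allele/allele]flank line into fields."""
    if not header.startswith(">"):
        raise ValueError("header must start with '>'")
    match = SNP_PATTERN.match(sequence.strip())
    if match is None:
        raise ValueError("sequence is not in LEFT [A/B] RIGHT form")
    left, allele1, allele2, right = (g.upper() for g in match.groups())
    return {
        "name": header[1:].strip(),
        "left_flank": left,
        "alleles": (allele1, allele2),
        "right_flank": right,
    }

record = parse_snp_record(">snp_name", "ACCGAGTAGCCAG [A/G] ACTGGGATAGAAC")
print(record)
```

Keeping the flanks and alleles as separate fields makes it easy to check flank length or to rebuild either allele-specific sequence later.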
Finding SNPs: Databases and Extraction
  How do I find and download SNP data for analysis/genotyping?
  1. NIEHS Environmental Genome Project (EGP) Candidate gene website
  2. NIEHS web applications and other tools: GeneSNPs, PolyDoms, PolyPhen, GVS
  3. HapMap Genome Browser
  4. Entrez Gene - dbSNP - Entrez SNP
Finding SNPs: NIEHS SNPs Candidate Genes egp.gs.washington.edu
Populations: African American; African (YRI); European (CEU); Hispanic; Asian (CHB, JPT)
Genotype data columns: SNP_pos Ind_ID allele1 allele2 (repeat for all individuals, then repeat for the next SNP)
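A rough sketch of how rows in the four-column layout above (SNP_pos, Ind_ID, allele1, allele2) could be tallied into allele frequencies and checked against the >5% "common" threshold mentioned earlier. The function names, the whitespace-separated parsing, the use of `N` for a missing call, and the sample rows are all assumptions for illustration, not data from the EGP site.

```python
from collections import Counter, defaultdict

def allele_frequencies(lines):
    """Return {snp_pos: {allele: frequency}} from 'SNP_pos Ind_ID allele1 allele2' rows."""
    counts = defaultdict(Counter)
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed rows
        snp_pos, _ind_id, allele1, allele2 = fields
        for allele in (allele1, allele2):
            if allele.upper() != "N":  # assumption: 'N' marks a missing call
                counts[snp_pos][allele.upper()] += 1
    return {
        pos: {a: n / sum(c.values()) for a, n in c.items()}
        for pos, c in counts.items()
    }

def minor_allele_frequency(freqs):
    """Frequency of the least frequent allele; 0.0 for a monomorphic site."""
    return min(freqs.values()) if len(freqs) > 1 else 0.0

# Made-up example rows: two SNP positions, four individuals each.
rows = [
    "1001 ind1 A G",
    "1001 ind2 A A",
    "1001 ind3 A A",
    "1001 ind4 A A",
    "2050 ind1 C C",
    "2050 ind2 C C",
    "2050 ind3 C C",
    "2050 ind4 N N",
]
freqs = allele_frequencies(rows)
for pos, f in freqs.items():
    maf = minor_allele_frequency(f)
    print(pos, "common" if maf > 0.05 else "rare or monomorphic", round(maf, 3))
```

Running the same tally separately on each population's rows is one way to check whether a SNP is informative in the samples at hand.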
PolyPhen = Polymorphism Phenotyping: structural protein characteristics and evolutionary comparison
SIFT = Sorting Intolerant From Tolerant: evolutionary comparison of non-synonymous SNPs
GeneSNPs
  Graphic view of SNPs in context of gene elements
  All NIEHS genes presented, organized by pathway/function
  SNPs from dbSNP, organized by submitter handle
  Link-outs to Entrez SNP pages and other resources
  Multiple views of SNPs in contexts of gene elements, protein domains, linkage disequilibrium
  Tutorial available from OpenHelix
GeneSNPs navigation
GeneSNPs links to other resources
GeneSNPs: multiple views of SNPs in context of gene elements
PolyDoms
  A web-based application that maps synonymous and non-synonymous SNPs onto known functional protein domains
  SNPs are from dbSNP and GeneSNPs
  Domain structures from NCBI's Conserved Domain Database
  Functional predictions based on SIFT and PolyPhen
  3-dimensional mapping of SNPs on protein structure using Chime viewer
PolyPhen: Polymorphism Phenotyping - prediction of functional effect of human nsSNPs
  Physical and comparative analyses used to make predictions
  Uses SwissProt annotations to identify known domains
  Calculates a substitution probability from BLAST alignments of homologous and orthologous sequences
  Ranks substitutions on a scale of predicted functional effects from "benign" to "probably damaging"
GVS: Genome Variation Server
  Provides rapid analysis of 4.5 million genotyped SNPs from dbSNP and the HapMap
  Mapped to human genome build 36 (hg18)
  Displays genotype data in text and image formats
  Displays tagSNPs or clusters of informative SNPs in text and image formats
  Displays linkage disequilibrium (LD) in text and image formats
  Online tutorial provided at OpenHelix.com
GVS: Genome Variation Server - ADH4
  Genotypes displayed in a prettybase table (table of genotypes) and a visual genotype graphic (image of visual genotypes)
GVS: Genome Variation Server
  Dense genotypes around a candidate gene can be integrated with broader, lower-density HapMap genotypes
  High-density genic coverage (EGP): EGP SNP discovery (1/200 bp)
  Low-density genome coverage (HapMap): HapMap SNPs (~1/1000 bp)
GVS: Genome Variation Server - combining data sets (HapMap, EGP)
  A. Common samples, combined variations
  B. Combined samples, common variations
  C. Combined samples, combined variations
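One way to read options A, B and C is as set operations on the sample IDs and SNP IDs of the two data sets: "common" as an intersection and "combined" as a union. The sketch below illustrates that reading only; it is not GVS's implementation, and the function name `merge_axes` and all sample and SNP identifiers are made up.

```python
def merge_axes(samples_1, snps_1, samples_2, snps_2, option):
    """Return (samples, snps) kept under merge option 'A', 'B' or 'C'.

    Interpretation assumed here: 'common' = intersection, 'combined' = union.
    """
    samples_1, samples_2 = set(samples_1), set(samples_2)
    snps_1, snps_2 = set(snps_1), set(snps_2)
    if option == "A":    # common samples, combined variations
        samples, snps = samples_1 & samples_2, snps_1 | snps_2
    elif option == "B":  # combined samples, common variations
        samples, snps = samples_1 | samples_2, snps_1 & snps_2
    elif option == "C":  # combined samples, combined variations
        samples, snps = samples_1 | samples_2, snps_1 | snps_2
    else:
        raise ValueError("option must be 'A', 'B' or 'C'")
    return sorted(samples), sorted(snps)

# Hypothetical data sets: a dense gene-centric set and a sparser genome-wide set.
dense_samples, dense_snps = ["s1", "s2", "s3"], ["snp1", "snp2", "snp3", "snp4"]
sparse_samples, sparse_snps = ["s2", "s3", "s4"], ["snp2", "snp5"]

for option in "ABC":
    print(option, merge_axes(dense_samples, dense_snps, sparse_samples, sparse_snps, option))
```

Under this reading, option A yields a complete genotype matrix only if both data sets typed the shared samples at every SNP, while options B and C can leave samples with missing genotypes.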
Finding SNPs: HapMap Browser
Finding SNPs: HapMap Genotypes
Finding SNPs: HapMap Browser
  1. HapMap data sets are useful because individual genotype data in deeply sampled populations can be used to determine optimal genotyping strategies (tagSNPs) or perform population genetic analyses (linkage disequilibrium)
  2. Data are specific to the HapMap project (not all of dbSNP); HapMap data are available in dbSNP
  3. Visualization of data and direct access to SNP data, individual genotypes, and LD analysis are possible in the browser, and formats can be saved for Haploview
NCBI - Database Resource NOS2A
Finding SNPs using NCBI databases
Default View cSNPs
Entrez SNP - Query Term Capabilities
Finding SNPs - Entrez SNP Summary
  1. dbSNP is useful for investigating detailed information on a small number of SNPs, and it is good for a picture of the gene
  2. Entrez SNP is a direct, fast database for querying SNP data
  3. Data from Entrez SNP can be retrieved in batches for many SNPs
  4. Entrez SNP data can be "limited" to specific subsets of SNPs and formatted in plain text for easy parsing and manipulation
  5. More detailed queries can be formed using specific "field tags" for retrieving SNP data
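The slides show field-tag queries in the Entrez SNP web interface. As a hedged illustration of the same idea in script form, the sketch below builds (but does not send) a query URL for NCBI's E-utilities `esearch` service against the `snp` database. Using E-utilities here is a substitution for the web form, the helper name `build_snp_query` is hypothetical, and the field tags `Gene Name` and `Organism` are assumptions that should be checked against the current Entrez SNP field list before use.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def build_snp_query(terms, retmax=20):
    """Build an esearch URL from (value, field_tag) pairs joined with AND."""
    term = " AND ".join(f"{value}[{tag}]" for value, tag in terms)
    params = {"db": "snp", "term": term, "retmax": retmax}
    return EUTILS_ESEARCH + "?" + urlencode(params)

# NOS2A is the example gene on the NCBI slide; the tags are assumed, not verified.
url = build_snp_query([("NOS2A", "Gene Name"), ("human", "Organism")])
print(url)
```

Building the term from pairs keeps each field tag next to its value, so adding a further limit is a one-line change.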
Summary - Finding SNPs: Databases and Extraction
  Reviewing candidate genes using views and resources in NIEHS SNPs and GeneSNPs
  Prediction of functional variations: PolyDoms and PolyPhen
  Integration of dense, gene-centric SNP maps with genomic HapMap SNPs: GVS
  HapMap viewer
  NCBI databases through the Entrez portal: Entrez Gene, dbSNP, Entrez SNP (many ways to retrieve and format data)