SNP Resources: Finding SNPs, Databases and Data Extraction. Debbie Nickerson, NIEHS SNPs Workshop.

Genotype - Phenotype Studies
You have a candidate gene/region/pathway of interest and samples ready to study:
- What SNPs are available?
- How do I find the common SNPs?
- What is the validation/quality of the SNPs?
- Are these SNPs informative in my population/samples?
- What information can I download?
- How do I pick the “best” SNPs? (Dana Crawford)

Minimal SNP information for genotyping/characterization
- What is the SNP? Flanking sequence and alleles, in FASTA format:
  >snp_name
  ACCGAGTAGCCAG [A/G] ACTGGGATAGAAC
- dbSNP reference SNP # (rs #)
- Where is the SNP mapped? Exon, promoter, UTR, etc.
- How was it discovered? Method.
- What assurances do you have that it is real? Validated how?
- What population: African, European, etc.?
- What is the allele frequency of each SNP? Common (>5%) or rare?
- Are other SNPs associated (redundant)?
- Is genotyping data for control populations available?
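As an illustration of the flanking-sequence record above, here is a minimal Python sketch that splits such a record into its name, flanks, and alleles. The function and class names are hypothetical, and the sketch assumes the whole sequence sits on a single line with the alleles in square brackets, as in the slide's example.

```python
import re
from typing import NamedTuple, Tuple


class SnpRecord(NamedTuple):
    name: str
    left_flank: str
    alleles: Tuple[str, ...]
    right_flank: str


# Header, 5' flank, bracketed alleles separated by "/", 3' flank.
_SNP_PATTERN = re.compile(
    r"^>(\S+)\s+([ACGTN]+)\s*\[([ACGTN-]+(?:/[ACGTN-]+)+)\]\s*([ACGTN]+)$",
    re.IGNORECASE,
)


def parse_snp_record(text: str) -> SnpRecord:
    """Parse '>name' + 'FLANK [A/G] FLANK' (assumed layout, one sequence line)."""
    flat = " ".join(text.split())  # collapse newlines and repeated spaces
    match = _SNP_PATTERN.match(flat)
    if match is None:
        raise ValueError(f"not a recognised SNP record: {text!r}")
    name, left, alleles, right = match.groups()
    return SnpRecord(
        name=name,
        left_flank=left.upper(),
        alleles=tuple(a.upper() for a in alleles.split("/")),
        right_flank=right.upper(),
    )


record = parse_snp_record(">snp_name\nACCGAGTAGCCAG [A/G] ACTGGGATAGAAC")
print(record.name, record.alleles, len(record.left_flank), len(record.right_flank))
```

Keeping the flanks and alleles as separate fields makes it easy to check that enough flanking sequence is present before submitting a SNP for assay design.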

Finding SNPs: Databases and Extraction
How do I find and download SNP data for analysis/genotyping?
1. NIEHS Environmental Genome Project (EGP) candidate gene website
2. NIEHS web applications and other tools: GeneSNPs, PolyDoms, PolyPhen, GVS
3. HapMap Genome Browser
4. Entrez Gene - dbSNP - Entrez SNP

Finding SNPs: NIEHS SNPs Candidate Genes egp.gs.washington.edu

Populations: African American; African (YRI); European (CEU); Hispanic; Asian (CHB, JPT)

Genotype data format (one line per genotype):
SNP_pos Ind_ID allele1 allele2
Repeat for all individuals, then repeat for the next SNP.
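A file in the four-column layout above (SNP_pos, Ind_ID, allele1, allele2) can be summarised with a few lines of Python. The sketch below is illustrative only: the sample rows and individual IDs are made up, and treating any allele other than A, C, G, or T (for example "N") as missing data is an assumption, not something the slides specify.

```python
from collections import Counter, defaultdict

VALID_ALLELES = {"A", "C", "G", "T"}


def allele_frequencies(lines):
    """Return {snp_pos: {allele: frequency}} from 'SNP_pos Ind_ID allele1 allele2' rows."""
    counts = defaultdict(Counter)
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed rows
        pos, _individual, allele1, allele2 = fields
        for allele in (allele1, allele2):
            if allele in VALID_ALLELES:  # assumption: anything else is missing data
                counts[int(pos)][allele] += 1
    return {
        pos: {allele: n / sum(c.values()) for allele, n in c.items()}
        for pos, c in counts.items()
    }


# Hypothetical example rows, not real EGP data.
example_rows = [
    "1001 D001 A A",
    "1001 D002 A G",
    "1001 D003 G G",
    "1001 D004 N N",
    "2050 D001 C T",
    "2050 D002 C C",
]
frequencies = allele_frequencies(example_rows)
print(frequencies)
```

With frequencies in hand, the "common (>5%)" question from the earlier slide becomes a simple filter on the minor allele frequency at each position.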

PolyPhen (Polymorphism Phenotyping): structural protein characteristics and evolutionary comparison
SIFT (Sorting Intolerant From Tolerant): evolutionary comparison of non-synonymous SNPs

GeneSNPs
- Graphic view of SNPs in context of gene elements
- All NIEHS genes presented, organized by pathway/function
- SNPs from dbSNP, organized by submitter handle
- Link-outs to Entrez SNP pages and other resources
- Multiple views of SNPs in the context of gene elements, protein domains, linkage disequilibrium
- Tutorial available from OpenHelix

GeneSNPs navigation

GeneSNPs links to other resources

GeneSNPs: multiple views of SNPs in the context of gene elements

Polydoms
A web-based application that maps synonymous and non-synonymous SNPs onto known functional protein domains
- SNPs are from dbSNP and GeneSNPs
- Domain structures from NCBI's Conserved Domain Database
- Functional predictions based on SIFT and PolyPhen
- Three-dimensional mapping of SNPs on protein structure using Chime viewer

PolyPhen: Polymorphism Phenotyping - prediction of functional effect of human nsSNPs
- Physical and comparative analyses used to make predictions
- Uses SwissProt annotations to identify known domains
- Calculates a substitution probability from BLAST alignments of homologous and orthologous sequences
- Ranks substitutions on a scale of predicted functional effects from “benign” to “probably damaging”

GVS: Genome Variation Server
- Provides rapid analysis of 4.5 million genotyped SNPs from dbSNP and the HapMap
- Mapped to human genome build 36 (hg18)
- Displays genotype data in text and image formats
- Displays tagSNPs or clusters of informative SNPs in text and image formats
- Displays linkage disequilibrium (LD) in text and image formats
- Online tutorial provided at OpenHelix.com
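For readers unfamiliar with the LD values that GVS displays, one widely used pairwise measure is r². The Python sketch below computes it from phased two-SNP haplotypes using the standard definition D = p_AB - p_A p_B and r² = D² / (p_A (1 - p_A) p_B (1 - p_B)). It is not GVS's own code; the function name and the example haplotypes are hypothetical, and it assumes phase is already known.

```python
def ld_r_squared(haplotypes, allele_a, allele_b):
    """r^2 between two biallelic SNPs from phased (snp1_allele, snp2_allele) pairs."""
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes supplied")
    p_a = sum(1 for x, _ in haplotypes if x == allele_a) / n
    p_b = sum(1 for _, y in haplotypes if y == allele_b) / n
    p_ab = sum(1 for x, y in haplotypes if x == allele_a and y == allele_b) / n
    d = p_ab - p_a * p_b  # deviation from independence
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    return d * d / denominator


# Hypothetical haplotypes: the two SNPs always travel together.
perfect_ld = [("A", "C")] * 4 + [("G", "T")] * 4
# Hypothetical haplotypes: every allele combination equally often.
no_ld = [("A", "C"), ("A", "T"), ("G", "C"), ("G", "T")]

print(ld_r_squared(perfect_ld, "A", "C"))
print(ld_r_squared(no_ld, "A", "C"))
```

An r² near 1 means one SNP predicts the other almost perfectly, which is the property tagSNP selection relies on.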

ADH4

GVS: Genome Variation Server

GVS: Genome Variation Server
Genotypes are displayed as a prettybase table (table of genotypes) and as a visual genotype graphic (image of visual genotypes).

GVS: Genome Variation Server
Dense genotypes around a candidate gene can be integrated with broader, lower-density HapMap genotypes:
- EGP SNP discovery (1/200 bp): high-density genic coverage
- HapMap SNPs (~1/1000 bp): low-density genome coverage

GVS: Genome Variation Server
Options for combining data sets (for example, EGP and HapMap):
A. Common samples - combined variations
B. Combined samples - common variations
C. Combined samples - combined variations
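One way to read the three merge options (A, B, C) is as set operations on the sample lists and variation lists of two data sets: "common" as an intersection and "combined" as a union. The Python sketch below follows that reading; it is an interpretation rather than a description of how GVS is implemented, and all sample and SNP identifiers are made up.

```python
def merge_options(samples_1, snps_1, samples_2, snps_2):
    """Return the (samples, variations) selected under options A, B, and C."""
    s1, s2 = set(samples_1), set(samples_2)
    v1, v2 = set(snps_1), set(snps_2)
    return {
        "A": (s1 & s2, v1 | v2),  # common samples, combined variations
        "B": (s1 | s2, v1 & v2),  # combined samples, common variations
        "C": (s1 | s2, v1 | v2),  # combined samples, combined variations
    }


# Hypothetical identifiers for a dense gene-centric set and a genome-wide set.
dense_samples = ["ind1", "ind2", "ind3"]
dense_snps = ["snp1", "snp2", "snp3", "snp4"]
broad_samples = ["ind2", "ind3", "ind4"]
broad_snps = ["snp2", "snp4", "snp5"]

options = merge_options(dense_samples, dense_snps, broad_samples, broad_snps)
for name, (samples, snps) in sorted(options.items()):
    print(name, sorted(samples), sorted(snps))
```

Under this reading, options B and C bring in samples that were typed in only one data set, so some sample-by-SNP cells in option C would have no genotype.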

Finding SNPs: HapMap Browser

Finding SNPs: HapMap Genotypes

Finding SNPs: HapMap Browser
1. HapMap data sets are useful because individual genotype data in deeply sampled populations can be used to determine optimal genotyping strategies (tagSNPs) or perform population genetic analyses (linkage disequilibrium)
2. Data are specific to the HapMap project (not all of dbSNP); HapMap data is available in dbSNP
3. Visualization of data and direct access to SNP data, individual genotypes, and LD analysis are possible in the browser, and formats can be saved for Haploview
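To make the tagSNP idea concrete, here is a simplified greedy binning sketch in Python: repeatedly pick the SNP that is in strong LD with the most not-yet-tagged SNPs and let it represent that bin. This is a generic illustration, not the procedure used by the HapMap browser, GVS, or Haploview; the r² threshold of 0.8, the function name, and the pairwise values are assumptions.

```python
def greedy_tag_snps(snps, r2, threshold=0.8):
    """Return a list of (tag_snp, [snps it tags]) bins using pairwise r^2 values."""

    def linked(a, b):
        # A SNP always tags itself; otherwise look up r^2 in either key order.
        return a == b or r2.get((a, b), r2.get((b, a), 0.0)) >= threshold

    untagged = list(snps)
    bins = []
    while untagged:
        # Choose the SNP that covers the most remaining SNPs (first wins on ties).
        best = max(untagged, key=lambda s: sum(linked(s, t) for t in untagged))
        members = [t for t in untagged if linked(best, t)]
        bins.append((best, members))
        untagged = [t for t in untagged if t not in members]
    return bins


# Hypothetical pairwise r^2 values; unlisted pairs are treated as 0.
snp_ids = ["snp1", "snp2", "snp3", "snp4"]
pairwise_r2 = {
    ("snp1", "snp2"): 0.95,
    ("snp2", "snp3"): 0.85,
    ("snp1", "snp3"): 0.40,
    ("snp3", "snp4"): 0.10,
}
tag_bins = greedy_tag_snps(snp_ids, pairwise_r2)
print(tag_bins)
```

In this toy example two genotyped tagSNPs stand in for four SNPs, which is the cost saving that tagSNP strategies aim for.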

NCBI - Database Resource NOS2A

Finding SNPs using NCBI databases

Default View cSNPs

Entrez SNP - Query Term Capabilities

Finding SNPs - Entrez SNP Summary
1. dbSNP is useful for investigating detailed information on a small number of SNPs, and it’s good for a picture of the gene
2. Entrez SNP is a direct, fast database for querying SNP data
3. Data from Entrez SNP can be retrieved in batches for many SNPs
4. Entrez SNP data can be “limited” to specific subsets of SNPs and formatted in plain text for easy parsing and manipulation
5. More detailed queries can be formed using specific “field tags” for retrieving SNP data
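Field-tagged queries like those in point 5 can also be sent programmatically through NCBI's E-utilities ESearch endpoint. The Python sketch below only builds the request URL and makes no network call. The endpoint and the db, term, and retmax parameters are standard ESearch parameters, but the field tag names used in the example ("Gene Name", "Organism") are illustrative and should be checked against the current Entrez SNP field list; the gene symbol NOS2A is simply the example gene from the NCBI slide.

```python
from urllib.parse import urlencode

EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_snp_query(terms, retmax=20):
    """Build an ESearch URL for the SNP database from (field_tag, value) pairs."""
    # Entrez syntax: value[Field Tag], joined with boolean AND.
    term = " AND ".join(f"{value}[{field}]" for field, value in terms)
    params = {"db": "snp", "term": term, "retmax": retmax}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


# Illustrative field tags; verify them before relying on the results.
query_url = build_snp_query([("Gene Name", "NOS2A"), ("Organism", "human")])
print(query_url)
```

The returned URL can be opened in a browser or fetched with any HTTP client; ESearch answers with a list of matching record IDs that can then be retrieved in batches.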

Summary - Finding SNPs: Databases and Extraction
- Reviewing candidate genes using views and resources in NIEHS SNPs and GeneSNPs
- Prediction of functional variations: Polydoms and PolyPhen
- Integration of dense, gene-centric SNP maps with genomic HapMap SNPs: GVS
- HapMap viewer
- NCBI databases through the Entrez portal: Entrez Gene, dbSNP, Entrez SNP (many ways to retrieve and format data)