PolyPhen and SIFT: Tools for predicting functional effects of SNPs Epi 244 Spring 2009 Sam S. Oh.


Frazer et al. Nat Rev Genet, 2009;10:
Human genome variation
3.2 billion base pairs (bp)
99.9% similarity across individuals
–3.2 million bp dissimilar
~11 million SNPs
–Coding vs. non-coding (intron and intergenic regions)
–Most are synonymous
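The 3.2 million figure follows directly from the two numbers above it. A minimal sketch of that arithmetic, assuming the 99.9% similarity is read as "1 base in every 1,000 differs" (the function name is illustrative only):

```python
GENOME_BP = 3_200_000_000      # 3.2 billion base pairs, as stated on the slide
DIFFERING_PER_THOUSAND = 1     # 99.9% similarity => 0.1% of positions differ


def dissimilar_bp(genome_bp: int, differing_per_thousand: int) -> int:
    """Number of base pairs expected to differ between two individuals."""
    # Integer arithmetic avoids floating-point rounding in 1 - 0.999.
    return genome_bp * differing_per_thousand // 1000


print(dissimilar_bp(GENOME_BP, DIFFERING_PER_THOUSAND))
```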

DNA → RNA → Protein

Example: sickle-cell anemia
An A-to-T SNP in the beta-globin gene results in a glutamate (hydrophilic) to valine (hydrophobic) substitution
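The sickle-cell change can be traced at the codon level. The sketch below assumes the affected codon is GAG (glutamate) and that the A-to-T change hits its middle base, giving GTG (valine); the lookup table holds only the two codons needed here, not the full genetic code, and the helper name is illustrative.

```python
# Two entries from the standard genetic code, enough for this example.
CODON_TO_AA = {"GAG": "Glu", "GTG": "Val"}

# Side-chain character as described on the slide.
AA_PROPERTY = {"Glu": "hydrophilic", "Val": "hydrophobic"}


def apply_snp(codon: str, offset: int, new_base: str) -> str:
    """Return the codon with the base at `offset` (0-based) replaced."""
    return codon[:offset] + new_base + codon[offset + 1:]


normal = "GAG"
mutant = apply_snp(normal, 1, "T")  # A -> T at the middle position

for codon in (normal, mutant):
    aa = CODON_TO_AA[codon]
    print(codon, aa, AA_PROPERTY[aa])
```

Because the substituted amino acid differs from the original, this SNP is non-synonymous, which is the class of variant that SIFT and PolyPhen evaluate.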

Example: MTHFR
Folate metabolism

Finding MTHFR SNPs

Highlight all refSNP numbers (use scroll bar) and copy
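Text copied from a results page usually carries more than the refSNP numbers themselves. As a rough sketch, a regular expression can pull out just the rs IDs before they are pasted elsewhere. The sample text is made up for illustration (the two rs numbers are well-known MTHFR variants, not taken from the slides), and the function name is hypothetical.

```python
import re


def extract_rs_ids(text: str) -> list[str]:
    """Return unique refSNP IDs in order of first appearance."""
    matches = re.findall(r"\brs(\d+)\b", text, flags=re.IGNORECASE)
    # dict.fromkeys keeps insertion order while dropping duplicates.
    return list(dict.fromkeys("rs" + digits for digits in matches))


copied = """rs1801133 [Homo sapiens]
rs1801131 [Homo sapiens]
RS1801133 duplicate line"""

print("\n".join(extract_rs_ids(copied)))
```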

Note Build number (currently Build 130)

SIFT: Sorting Intolerant From Tolerant
Predicts whether an amino acid (AA) substitution (i.e., a non-synonymous SNP) is tolerated, based on
–Sequence homology
–Physical properties of amino acids
Can be applied to naturally occurring non-synonymous polymorphisms and laboratory-induced missense mutations

Copy all SNP IDs and paste into SIFT
Choose “Submit Query”
Compare Build numbers

Getting more info for rs Enter “rs ”

Allele info
Protein name
Contig name
mRNA name
Position of SNP in mRNA, protein, contig
Flanking sequence, IUPAC code, flanking sequence
Build number
Select protein
Scroll down
Note AA1, AA2, and position
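The IUPAC code shown between the flanking sequences is a single letter that stands for both alleles of the SNP. A minimal sketch of the standard two-base ambiguity codes follows; the function name is illustrative only.

```python
# Standard IUPAC ambiguity codes for two-base combinations.
IUPAC_TWO_BASE = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}


def iupac_code(allele1: str, allele2: str) -> str:
    """Return the IUPAC letter for a biallelic SNP such as A/T."""
    pair = frozenset((allele1.upper(), allele2.upper()))
    if pair not in IUPAC_TWO_BASE:
        raise ValueError(f"not a biallelic SNP of two distinct bases: {allele1}/{allele2}")
    return IUPAC_TWO_BASE[pair]


print(iupac_code("A", "T"))
print(iupac_code("C", "T"))
```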

Copy FASTA-formatted protein sequence

Paste FASTA-formatted protein sequence
Enter AA substitution [Letter1-position-Letter2]
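Before submitting, it can help to confirm that Letter1 really is the residue at the stated position in the pasted sequence. The sketch below assumes the substitution is written without separators (for example G566E) and uses a short made-up sequence, not the real protein; all function names are hypothetical.

```python
import re


def read_fasta_sequence(fasta: str) -> str:
    """Join the sequence lines of a single-record FASTA string."""
    lines = fasta.strip().splitlines()
    return "".join(line.strip() for line in lines if not line.startswith(">"))


def parse_substitution(sub: str) -> tuple[str, int, str]:
    """Split 'G566E' into ('G', 566, 'E')."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub.strip().upper())
    if match is None:
        raise ValueError(f"expected Letter1-position-Letter2, got {sub!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def matches_sequence(sequence: str, sub: str) -> bool:
    """True if Letter1 is the residue at the 1-based position."""
    ref, pos, _alt = parse_substitution(sub)
    return 1 <= pos <= len(sequence) and sequence[pos - 1] == ref


# Toy record for illustration only; position 5 is G.
toy_fasta = """>toy_protein
MKTAG
LLE"""

seq = read_fasta_sequence(toy_fasta)
print(matches_sequence(seq, "G5E"))
print(matches_sequence(seq, "A5E"))
```

A mismatch usually means the position refers to a different protein isoform or the sequence was copied from another record.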

Substitution occurs at AA 566 Scroll down

Check tolerance of AA substitutions

Tolerance of specified substitution
“Substitution at pos 566 from G to E is predicted to AFFECT PROTEIN FUNCTION with a score of 0.01.”
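The reported score is what drives the call. The sketch below parses the quoted sentence and applies the commonly cited SIFT cutoff of 0.05 (at or below is predicted to affect function, above is tolerated). The cutoff is not stated on the slides, so treat it as an assumption; the function names are illustrative.

```python
import re

SIFT_CUTOFF = 0.05  # assumed conventional threshold, not taken from the slides


def parse_sift_line(line: str) -> tuple[int, str, str, float]:
    """Extract (position, from_aa, to_aa, score) from a SIFT result sentence."""
    match = re.search(
        r"pos (\d+) from ([A-Z]) to ([A-Z]).*score of (\d+(?:\.\d+)?)", line
    )
    if match is None:
        raise ValueError("unrecognized SIFT result line")
    pos, ref, alt, score = match.groups()
    return int(pos), ref, alt, float(score)


def sift_call(score: float, cutoff: float = SIFT_CUTOFF) -> str:
    """Low scores mean the substitution is rarely seen at that position."""
    return "AFFECT PROTEIN FUNCTION" if score <= cutoff else "TOLERATED"


line = ("Substitution at pos 566 from G to E is predicted to "
        "AFFECT PROTEIN FUNCTION with a score of 0.01.")
pos, ref, alt, score = parse_sift_line(line)
print(pos, ref, alt, score, sift_call(score))
```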

Polymorphism Phenotyping (PolyPhen)
Tool for predicting the possible impact of an amino acid substitution (i.e., a non-synonymous SNP) on protein structure and function, based on:
–Amino acid sequence
In what part of the protein did the SNP occur? (E.g., active site, binding site, transmembrane region)
–Multiple alignments with homologous proteins and mammalian orthologues
How compatible is the substitution, judged against proteins of comparable sequence?
–3D structural properties with the substituted amino acid
What is the substitution’s effect on the protein’s physicochemistry? (E.g., hydrophobicity, electrostatic interactions, ligand binding)

PolyPhen data flow

Four potential predictions
Probably damaging
–Predicted with high confidence to affect protein function or structure
Possibly damaging
–Predicted to affect protein function or structure
Benign
–Most likely lacking any phenotypic effect
Unknown
–Lack of data does not allow PolyPhen to make a prediction

Copy FASTA-formatted protein sequence
Enter AA position, ancestral AA, and substituted AA

Enter SNP rs#
In dbSNP Build 129, corresponds to protein NP_

Query vs. SNP Collection
Columns: Query | SNP Collection
Prediction: Probably damaging
PSIC
dbSNP Build #: N/A | 126

References NCBI dbSNP – SIFT – PolyPhen –