Sequence Variation in Ensembl

Presentation transcript:

1 of 32 Sequence Variation in Ensembl

2 of 32 Outline
- SNPs
- SNPs in Ensembl
- Haplotypes & Linkage Disequilibrium
- SNPs in BioMart
- HapMap project
- Strain-specific SNPs

3 of 32 Single nucleotide polymorphisms (SNPs)
- Two human genomes differ by ~0.1%
- Polymorphism: a DNA variation in which each possible sequence is present in at least 1% of people
- Most polymorphisms (~90%) take the form of SNPs: variations that involve just one nucleotide
  - ~1 out of every 300 bases in the human genome
  - ~10 million in the human genome
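The figures on this slide can be checked against each other with a rough calculation. The sketch below assumes a haploid genome size of about 3.1 billion bases, a value that is not stated on the slide.

```python
# Assumption (not from the slide): haploid human genome of ~3.1 billion bases.
GENOME_SIZE_BP = 3_100_000_000

# ~1 SNP every 300 bases gives the expected number of SNP sites in the genome.
snps_in_genome = GENOME_SIZE_BP / 300

# Two genomes differing by ~0.1% gives the expected number of differing bases.
pairwise_differences = GENOME_SIZE_BP * 0.001

print(f"SNP sites in the genome: about {snps_in_genome / 1e6:.0f} million")
print(f"Differences between two genomes: about {pairwise_differences / 1e6:.1f} million")
```

Under that assumption, one SNP per 300 bases comes out close to the ~10 million quoted on the slide.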

4 of 32 Functional Consequences
Type | Consequence
SNPs in coding areas that alter aa sequence | Cause of most monogenic disorders, e.g. hemochromatosis (HFE), cystic fibrosis (CFTR), hemophilia (F8)
SNPs in coding areas that don't alter aa sequence | May affect splicing
SNPs in promoter or regulatory regions | May affect the level, location or timing of gene expression
SNPs in other regions | No direct known impact on phenotype; useful as markers

5 of 32 Practical Applications
- Disease diagnosis
- Association studies
- Pharmacogenomics
- Forensic testing
- Population genetics and evolutionary studies
- Marker-assisted selection

6 of 32 Practical Applications

7 of 32 SNPs in Ensembl
- Most SNPs imported from dbSNP (rs……):
  - Imported data: alleles, flanking sequences, frequencies, ...
  - Calculated data: position, synonymous status, peptide shift, ...
- For human also:
  - HGVbase
  - TSC
  - Affy GeneChip 100K and 500K Mapping Array
  - Ensembl-called SNPs (from Celera reads)
- For mouse and rat also:
  - Sanger- and Ensembl-called SNPs (other strains)

8 of 32 dbSNP
- Central repository for both SNPs and short deletion and insertion polymorphisms
- For human (dbSNP build 127):
  - 31, submissions (ss#'s)
  - 11,811,594 RefSNP clusters (rs#'s)
  - 5,689,286 validated
  - 5,559,898 with genotype
  - 710,090 with frequency

9 of 32 SNPs in Ensembl - Types
- Non-synonymous: In coding sequence, resulting in an aa change
- Synonymous: In coding sequence, not resulting in an aa change
- Frameshift: In coding sequence, resulting in a frameshift
- Stop lost: In coding sequence, resulting in the loss of a stop codon
- Stop gained: In coding sequence, resulting in the gain of a stop codon
- Essential splice site: In the first 2 or the last 2 basepairs of an intron
- Splice site: 1-3 bps into an exon or 3-8 bps into an intron
- Upstream: Within 5 kb upstream of the 5'-end of a transcript
- Regulatory region: In regulatory region annotated by Ensembl
- 5' UTR: In 5' UTR
- Intronic: In intron
- 3' UTR: In 3' UTR
- Downstream: Within 5 kb downstream of the 3'-end of a transcript
- Intergenic: More than 5 kb away from a transcript
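The position-based types in this list can be illustrated with a small classifier. This is only a sketch and not Ensembl's implementation: the function name `classify_position`, the exon-list transcript model and the catch-all label "exonic" are assumptions made for illustration, and the coding, UTR and regulatory types are left out because they need sequence and annotation data.

```python
FLANK = 5000  # 5 kb window used on the slide for upstream/downstream


def classify_position(pos, exons, strand=1):
    """Classify a 1-based genomic position against one transcript.

    exons:  list of (start, end) tuples, 1-based inclusive, genomic coordinates.
    strand: 1 for a transcript on the + strand, -1 for the - strand.
    """
    exons = sorted(exons)
    tx_start, tx_end = exons[0][0], exons[-1][1]

    # Outside the transcript: upstream, downstream or intergenic.
    if pos < tx_start or pos > tx_end:
        distance = tx_start - pos if pos < tx_start else pos - tx_end
        if distance > FLANK:
            return "intergenic"
        before_5prime_end = (pos < tx_start) == (strand == 1)
        return "upstream" if before_5prime_end else "downstream"

    # Inside an exon: 1-3 bp from an exon/intron boundary counts as splice site.
    for i, (start, end) in enumerate(exons):
        if start <= pos <= end:
            near_left = i > 0 and pos - start < 3
            near_right = i < len(exons) - 1 and end - pos < 3
            return "splice site" if near_left or near_right else "exonic"

    # Otherwise the position lies in an intron between two consecutive exons.
    for (_, left_end), (right_start, _) in zip(exons, exons[1:]):
        if left_end < pos < right_start:
            into_intron = min(pos - left_end, right_start - pos)  # 1 = first intronic base
            if into_intron <= 2:
                return "essential splice site"
            if into_intron <= 8:
                return "splice site"
            return "intronic"


# Hypothetical two-exon transcript on the + strand.
exons = [(1000, 1200), (2000, 2300)]
print(classify_position(1201, exons))  # essential splice site
print(classify_position(1500, exons))  # intronic
print(classify_position(500, exons))   # upstream
```

For a transcript on the - strand the 5' end is at the higher coordinate, which is why the upstream/downstream decision depends on `strand`.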

10 of 32 SNPs in Ensembl - Species
Human, Chimp, Mouse, Rat, Dog, Cow, Platypus, Chicken, Zebrafish, Tetraodon, Mosquito

11 of 32 SNPs in Ensembl
MapView: SNP density on chromosome

12 of 32 SNPs in Ensembl
ContigView: SNPs in genomic context

13 of 32 SNPs in Ensembl
GeneSeqView: SNPs in genomic sequence

14 of 32 SNPs in Ensembl
TransView & ProtView: SNPs in transcript/protein

15 of 32 SNPs in Ensembl
What SNPs does my gene contain? > GeneSNPView

16 of 32 SNPs in Ensembl
Info about one specific SNP? > SNPView:
- SNP Report
- Genotype and allele frequencies per population
- Located in transcripts
- SNP Context
- Individual genotypes

17 of 32 Caveat
For human, mouse and rat, Ensembl defines all SNP alleles relative to the + strand of the genome assembly (to be able to merge dbSNP data with Sanger resequencing data).
Exceptions: TransView, ProtView and GeneSeqView show alleles as they are in the transcript, the protein, or the strand from which the transcript is transcribed, respectively.
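In practice this caveat means that alleles reported on the + strand have to be reverse-complemented before they are compared with a transcript on the - strand. A minimal sketch, assuming alleles are given as a slash-separated string such as "A/G"; the function name is hypothetical:

```python
# Translation table for complementing DNA bases; other characters
# (for example "-" for a deletion) pass through unchanged.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def alleles_on_transcript_strand(alleles, transcript_strand):
    """Convert a +-strand allele string such as 'A/G' to the transcript's strand.

    transcript_strand: 1 for the + strand, -1 for the - strand.
    """
    if transcript_strand == 1:
        return alleles  # already expressed on the transcript's strand
    # Reverse-complement each allele separately, keeping the allele order.
    return "/".join(a.translate(COMPLEMENT)[::-1] for a in alleles.split("/"))


print(alleles_on_transcript_strand("A/G", -1))  # T/C
print(alleles_on_transcript_strand("A/G", 1))   # A/G
```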

18 of 32 Haplotypes and Linkage Disequilibrium
- A haplotype is a set of SNPs on a single chromatid that are statistically associated
- Linkage disequilibrium describes a situation in which some combinations of SNP alleles occur more or less frequently in a population than would be expected if haplotypes formed at random from alleles according to their frequencies

19 of 32 Measures of LD
- D = P(AB) - P(A)P(B)
  - D ranges from -0.25 to +0.25
  - D = 0 indicates linkage equilibrium
  - dependent on allele frequencies, therefore of little use
- D' = D / maximum possible value
  - D' = 1 indicates perfect LD
  - estimates of D' strongly inflated in small samples
- r^2 = D^2 / (P(A)P(B)P(a)P(b))
  - r^2 = 1 indicates perfect LD
  - measure of choice
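The three measures can be computed directly from one haplotype frequency and two allele frequencies. The sketch below uses the common normalisation for D' (the largest |D| attainable for the given allele frequencies, chosen by the sign of D) and reports it as an absolute value; the slide does not spell out that convention, so treat it as an assumption. The function name is hypothetical.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab:     frequency of the haplotype carrying allele A and allele B.
    p_a, p_b: frequencies of alleles A and B, each strictly between 0 and 1
              (the alternative alleles a and b have frequencies 1 - p_a, 1 - p_b).
    """
    d = p_ab - p_a * p_b

    # Largest |D| compatible with the allele frequencies, depending on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Allele B only ever occurs together with allele A, but A is much more common than B.
d, d_prime, r2 = ld_measures(p_ab=0.2, p_a=0.5, p_b=0.2)
print(round(d, 3), round(d_prime, 3), round(r2, 3))
```

In this example D' reaches 1 while r^2 stays at 0.25, because r^2 only reaches 1 when the two loci also have matching allele frequencies.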

20 of 32 Linkage Disequilibrium
LDView
It is also possible to export SNP information for upload into the HaploView software tool

21 of 32 Linkage Disequilibrium
LDTableView

22 of 32 SNPs in BioMart
SNP datasets

23 of 32 SNPs in BioMart
Ensembl gene datasets (FILTER, OUTPUT)

24 of 32 SNPs in BioMart
- Start with a Genes dataset to retrieve SNPs associated with a particular gene
- Start with a SNPs dataset to retrieve SNPs located in a certain region
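The second route (a SNPs dataset filtered by region) can also be written as a BioMart XML query. The sketch below only builds the query text and does not contact any server. The dataset name `hsapiens_snp`, the filter names `chr_name`, `start`, `end` and the attribute names `refsnp_id`, `chrom_start`, `allele` are assumptions modelled on public Ensembl BioMart naming; they are not taken from the slides and may differ between releases, so check them in the BioMart interface before use.

```python
import xml.etree.ElementTree as ET


def build_region_query(chromosome, start, end):
    """Build a BioMart-style XML query for SNPs in a genomic region.

    All dataset, filter and attribute names below are assumptions.
    """
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",
        "header": "1",
        "uniqueRows": "1",
    })
    # Starting point: a SNPs dataset (assumed name).
    dataset = ET.SubElement(query, "Dataset",
                            {"name": "hsapiens_snp", "interface": "default"})
    # FILTER: restrict the SNPs to one region.
    for name, value in (("chr_name", chromosome), ("start", start), ("end", end)):
        ET.SubElement(dataset, "Filter", {"name": name, "value": str(value)})
    # OUTPUT: the columns to return for each SNP.
    for name in ("refsnp_id", "chrom_start", "allele"):
        ET.SubElement(dataset, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


# Hypothetical region, for illustration only.
print(build_region_query("6", 26_000_000, 26_100_000))
```

The first route works the same way with a genes dataset as the starting point and a gene identifier as the filter.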

25 of 32 HapMap
- A multi-country effort to identify and catalog genetic similarities and differences in human beings
- Collaboration among scientists and funding agencies from Japan, the United Kingdom, Canada, China, Nigeria, and the United States
- All of the information generated by the project is released into the public domain

26 of 32 HapMap
- Samples from populations with African, Asian and European ancestry
- 270 DNA samples from 4 populations:
  - 30 trios (two parents and an adult child) from the Yoruba people of Ibadan, Nigeria
  - 45 unrelated Japanese from the Tokyo area
  - 45 unrelated Han Chinese from Beijing
  - 30 trios from Utah with Northern and Western European ancestry (CEPH)
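The total of 270 samples follows from the four panels once each trio is counted as three individuals, as this short check shows:

```python
# Each trio contributes three individuals (two parents and one child).
samples = {
    "Yoruba, Ibadan (30 trios)": 30 * 3,
    "Japanese, Tokyo": 45,
    "Han Chinese, Beijing": 45,
    "CEPH, Utah (30 trios)": 30 * 3,
}
total = sum(samples.values())
print(total)  # 270
```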

27 of 32 HapMap

28 of 32 HapMart

29 of 32 Strain-specific SNPs
- Mice and rats for experimental research are selected from inbred strains in order to allow reproducibility
- C57BL/6J and BN/SsNHsd/MCW (BN) are the strains selected for the mouse and rat sequencing projects, respectively

30 of 32 Strain-specific SNPs
TranscriptSNPView
Now also available for dog breeds and human individuals (Celera)

31 of 32 Strain-specific SNPs

32 of 32 Q & A
Questions and Answers