1
Outline of the SNP bioinformatics lecture
Brief introduction
SNPs in cell biology
SNP discovery
SNP assessment
SNP databases
SNPs in genome browsers
2
Single Nucleotide Polymorphisms
Must be present in at least 1% of the population
Account for most (about 90%) of the sequence variation between two genomes
Two humans differ by about 0.1%
About 1 SNP per 300 bp in the human genome
Lower density in coding regions
About 10 million in the human genome
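The 1% criterion can be illustrated with a short calculation. This is a minimal sketch, assuming a biallelic site with known genotype counts; the function names and the example counts are hypothetical and not from the lecture.

```python
def minor_allele_frequency(n_ref_hom, n_het, n_alt_hom):
    """Return the frequency of the rarer allele at a biallelic site.

    Each individual carries two alleles, so a heterozygote contributes
    one copy of each allele and a homozygote contributes two of one.
    """
    total_alleles = 2 * (n_ref_hom + n_het + n_alt_hom)
    if total_alleles == 0:
        raise ValueError("no genotypes supplied")
    alt_freq = (n_het + 2 * n_alt_hom) / total_alleles
    return min(alt_freq, 1 - alt_freq)


def is_snp(n_ref_hom, n_het, n_alt_hom, threshold=0.01):
    """Apply the 'at least 1% of the population' criterion."""
    return minor_allele_frequency(n_ref_hom, n_het, n_alt_hom) >= threshold


# Hypothetical sample of 100 individuals: 90 AA, 9 AG, 1 GG.
maf = minor_allele_frequency(90, 9, 1)
print(f"MAF = {maf:.3f}, counts as SNP: {is_snp(90, 9, 1)}")
```

A variant seen once in 1000 individuals, such as `is_snp(999, 1, 0)`, falls below the threshold and would be treated as a rare variant rather than a SNP under this definition.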
3
Categories of SNPs
Missense/Non-synonymous
  Changes an amino acid
  About half of the SNPs in coding sequence
  Can alter the function and/or structure of the protein
  Cause of most monogenic diseases
    Hemochromatosis (HFE)
    Cystic fibrosis (CFTR)
    Hemophilia (F8)
Nonsense
  Introduces a stop codon
  Same consequences as non-synonymous
4
Categories of SNPs
Synonymous
  Does not alter the amino acid sequence
  May alter splicing
Non-coding
  Can be located in promoter or regulatory regions
  Can impact the expression of the gene
All SNPs can be used as markers
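The coding categories (synonymous, missense, nonsense) follow directly from the genetic code. The sketch below assumes the standard codon table and a single-base change within one in-frame codon; `classify_coding_snp` is a hypothetical helper, not part of any tool named in the lecture.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(ref_codon, position, alt_base):
    """Classify a single-base substitution inside one codon.

    position is 0, 1 or 2 within the codon. A change from a stop codon
    to an amino acid is reported as 'missense' here for simplicity.
    """
    alt_codon = ref_codon[:position] + alt_base + ref_codon[position + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if alt_aa == ref_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    return "missense"


print(classify_coding_snp("GAA", 2, "G"))  # GAA (Glu) -> GAG (Glu)
print(classify_coding_snp("GAG", 1, "T"))  # GAG (Glu) -> GTG (Val)
print(classify_coding_snp("GAA", 0, "T"))  # GAA (Glu) -> TAA (stop)
```

Non-coding SNPs fall outside this scheme because they do not lie within a codon; their effects on expression or splicing have to be assessed by other means.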
5
Use to cell biologists
Association studies
  Use SNPs as markers to find regions associated with a phenotype
Causative SNPs
  Altered protein
  Altered expression
Regions of altered conservation between strains/species/individuals
Evolutionary analyses
Etc…
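A marker-based association test can be sketched as a comparison of allele counts between cases and controls. The example below assumes a simple allelic chi-square test on a 2x2 table with one degree of freedom; the lecture does not specify a test, and the counts are made up for illustration.

```python
import math


def allelic_association(case_a, case_b, control_a, control_b):
    """Chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Rows are cases and controls, columns are the two alleles.
    Returns the chi-square statistic and its p-value.
    """
    n = case_a + case_b + control_a + control_b
    denominator = (
        (case_a + case_b)
        * (control_a + control_b)
        * (case_a + control_a)
        * (case_b + control_b)
    )
    if denominator == 0:
        raise ValueError("a row or column of the table is empty")
    chi2 = n * (case_a * control_b - case_b * control_a) ** 2 / denominator
    # For one degree of freedom the upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: allele A is seen 60 times in cases and 40 in controls.
chi2, p = allelic_association(60, 40, 40, 60)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

A significant marker is not necessarily the causative SNP. It may only lie close to the variant that alters the protein or its expression.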
6
SNP discovery
SNPs are usually discovered from sequencing data
Discovery is based on separating sequencing errors from 'real' differences and assessing their frequency in the sequenced population
Separation of paralogous sequences
Validation, genotyping
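Separating errors from real differences can be sketched with base quality scores. This is a deliberately simplified filter, assuming each aligned base comes with a Phred quality (Q = -10 log10 of the error probability); it is not the algorithm of any tool named in the lecture, and the thresholds are arbitrary assumptions.

```python
from collections import Counter


def phred_to_error_prob(quality):
    """Convert a Phred quality score into a base-call error probability."""
    return 10 ** (-quality / 10)


def candidate_snp(column, min_quality=20, min_minor_reads=2):
    """Inspect one alignment column given as (base, phred_quality) pairs.

    Low-quality calls are discarded as likely sequencing errors. A site is
    reported only if a second allele is supported by enough reads.
    Returns (major_allele, minor_allele, minor_frequency) or None.
    """
    counts = Counter(base for base, q in column if q >= min_quality)
    if len(counts) < 2:
        return None
    (major, _), (minor, minor_reads) = counts.most_common(2)
    if minor_reads < min_minor_reads:
        return None
    return major, minor, minor_reads / sum(counts.values())


# Hypothetical column: six A reads, two G reads, one low-quality T.
column = [("A", 30)] * 6 + [("G", 35)] * 2 + [("T", 5)]
print(phred_to_error_prob(20))
print(candidate_snp(column))
```

The low-quality T is ignored, while the G allele passes because two independent high-quality reads support it. Paralogous sequences are not handled here and would need a separate check before such a call is trusted.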
7
SNP discovery resources
PolyBayes
  SNP discovery in redundant sequences
PolyPhred
  SNP discovery based on phred/phrap/consed
NovoSNP
  Graphical identification of SNPs
8
Example: PolyPhred
Detects heterozygotes from chromatograms
Runs together with phred/phrap/consed
Command line
9
SNP assessment
Assess SNPs for functional effects
Non-synonymous SNPs
  Conservation across species
  Amino acid properties
  Protein structure
  Transmembrane regions, signal peptides, etc.
10
SNP assessment resources
SIFT
PolyPhen
Pmut
SNPs3D
PANTHER PSEC
TopoSNP
MAPP
Etc.
11
Example: SIFT
Sorting Intolerant From Tolerant
Builds an alignment of similar sequences
Calculates a score based on the amino acids in the alignment
Takes the environment into account
Takes the properties of the amino acids into account
Does not use structure
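The idea of scoring a substitution from an alignment can be sketched as follows. This is a toy conservation score, assuming one alignment column and a fixed pseudocount; it is not SIFT's actual algorithm, which also weights sequences and accounts for amino acid properties. The 0.05 cutoff mirrors the threshold SIFT commonly reports, but its use here is only illustrative.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def tolerance_scores(column, pseudocount=0.1):
    """Score every amino acid at one alignment position.

    column holds the residues observed at that position across the aligned
    sequences. Counts are smoothed with a pseudocount and scaled so that
    the most frequent residue scores 1.0.
    """
    counts = Counter(aa for aa in column if aa in AMINO_ACIDS)
    weighted = {aa: counts[aa] + pseudocount for aa in AMINO_ACIDS}
    top = max(weighted.values())
    return {aa: weight / top for aa, weight in weighted.items()}


def is_tolerated(column, substitution, cutoff=0.05):
    """Call a substitution tolerated if its scaled score reaches the cutoff."""
    return tolerance_scores(column)[substitution] >= cutoff


# Hypothetical column from ten aligned sequences: mostly Leu, some Ile and Val.
column = "LLLLLLLIIV"
print(is_tolerated(column, "I"))
print(is_tolerated(column, "W"))
```

Isoleucine is already seen at this position, so it scores well above the cutoff, whereas tryptophan is never observed and falls below it.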
13
SNP databases
Maps of SNPs in human, mouse, etc.
Haplotype maps
Functional SNPs
Disease databases
14
SNP databases
dbSNP
F-SNP
HGVbase
PolyDoms
OMIM
Etc…
15
Example: dbSNP
50 million submissions
18 million clusters
7 million in genes
44 organisms
91 million SNPs submitted
16
dbSNP
Search for SNPs, location, etc.
Information submitted on method, flanking sequence, alleles, population, sample size, validation, etc.
Information computed on SNPs at the same location, including functional analysis, population diversity, etc.
18
SNPs in genome browsers
Ensembl
UCSC
19
Example: UCSC
23
HapMap
Aim: a haplotype map of the human genome describing common patterns of sequence variation
A haplotype map is based on the observation that alleles of SNPs located close together tend to be inherited together
HapMap will identify which SNPs are informative in mapping, reducing the number of SNPs to genotype by an order of magnitude
Populations from Asia, Europe and Africa
2nd generation map with over 3.1 million SNPs
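Why co-inherited SNPs reduce the genotyping effort can be shown with a linkage disequilibrium calculation. The sketch below assumes phased haplotypes for two biallelic SNPs and uses r², a common LD measure that the slides do not name; the haplotypes are invented for illustration.

```python
def r_squared(haplotypes):
    """Compute r^2 between two biallelic SNPs from phased haplotypes.

    haplotypes is a list of (allele_at_snp1, allele_at_snp2) pairs. The
    alleles on the first haplotype serve as the reference alleles.
    """
    ref1, ref2 = haplotypes[0]
    n = len(haplotypes)
    p1 = sum(a == ref1 for a, _ in haplotypes) / n
    p2 = sum(b == ref2 for _, b in haplotypes) / n
    p12 = sum(a == ref1 and b == ref2 for a, b in haplotypes) / n
    if p1 in (0.0, 1.0) or p2 in (0.0, 1.0):
        raise ValueError("both sites must be polymorphic")
    d = p12 - p1 * p2  # deviation from independent inheritance
    return d * d / (p1 * (1 - p1) * p2 * (1 - p2))


# Hypothetical haplotypes: the two SNPs always travel together.
linked = [("A", "G"), ("A", "G"), ("C", "T"), ("C", "T")]
# Hypothetical haplotypes: every allele combination is equally common.
unlinked = [("A", "G"), ("A", "T"), ("C", "G"), ("C", "T")]
print(r_squared(linked), r_squared(unlinked))
```

When r² is close to 1, genotyping one of the two SNPs predicts the other, so only one of them needs to be typed. Selecting such informative SNPs is what allows the number of genotyped markers to shrink.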
24
Ng PC, Henikoff S. Predicting the effects of amino acid substitutions on protein function. Annu Rev Genomics Hum Genet. 2006;7: Review.
Bhatti P, Church DM, Rutter JL, Struewing JP, Sigurdson AJ. Candidate single nucleotide polymorphism selection using publicly available tools: a guide for epidemiologists. Am J Epidemiol Oct 15;164(8): Epub 2006 Aug 21.
Clifford RJ, Edmonson MN, Nguyen C, Scherpbier T, Hu Y, Buetow KH. Bioinformatics tools for single nucleotide polymorphism discovery and analysis. Ann N Y Acad Sci May;1020: Review.
The International HapMap Consortium. A second generation human haplotype map of over 3.1 million SNPs. Nature 449,