Published by Valentine Kelley. Modified over 9 years ago.
1
SNP: molecular function, evolution and disease
Md Imtiyaz Hassan, Ph.D.
2
Effect on molecular function, Phenotype, Natural selection
Medical Genetics, Structural Biology, Biochemistry, Evolutionary Genetics
3
Predicting the effect of mutations in proteins
4
Why is this useful?
– Understanding variation in molecular function and structure
– Evolutionary genetics: comparing polymorphism and divergence rates between different functional categories is a robust way to detect selection
5
Linkage analysis Rare
6
Classical association studies
Control, Disease
Common
7
Quantitative trait
Mendelians, Biometricians
Forces to maintain variation: Selection, Mutation
8
Common disease / Common variant
– Trade off (antagonistic pleiotropy)
– Balancing selection
– Recent positive selection
– Reverse in direction of selection
Examples:
– APOE: Alzheimer’s disease
– AGT: Hypertension
– CYP3A: Hypertension
– CAPN10: Type 2 diabetes
9
Individual human genome is a target for deleterious mutations!
– ~40% of human Mendelian diseases are due to hypermutable sites
– Frequency of deleterious variants is directly proportional to mutation rate (q = μ/s, where μ is the mutation rate and s is the selection coefficient)
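As a rough numerical illustration of the proportionality stated on this slide, read here as q = μ/s (μ the mutation rate, s the selection coefficient), the following Python sketch computes the equilibrium frequency for a few mutation rates. The function name and the example values of μ and s are illustrative assumptions, not figures from the presentation.

```python
def equilibrium_frequency(mu: float, s: float) -> float:
    """Equilibrium frequency q = mu / s of a deleterious allele
    under mutation-selection balance (relation quoted on the slide)."""
    if mu < 0:
        raise ValueError("mutation rate must be non-negative")
    if not 0 < s <= 1:
        raise ValueError("selection coefficient must be in (0, 1]")
    return mu / s


# Hypothetical values: an ordinary site versus a ten-fold hypermutable site,
# both under the same (assumed) selection coefficient.
S_ASSUMED = 0.01
for mu in (1e-8, 1e-7):
    q = equilibrium_frequency(mu, S_ASSUMED)
    print(f"mu = {mu:.0e}, s = {S_ASSUMED}: q = {q:.1e}")
```

With s held fixed, a ten-fold higher mutation rate gives a ten-fold higher equilibrium frequency, which is one way to see why hypermutable sites would be over-represented among disease variants.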
10
Multiple mostly rare variants
Many deleterious alleles in mutation-selection balance
Examples:
– Plasma level of HDL-C
– Plasma level of LDL-C
– Colorectal adenomas
11
Harmful mutations
– Function: damaging
– Evolution: deleterious
– Phenotype: detrimental
– Advantageous pseudogenization (Zhang et al. 2006)
– Gain of function disease mutations
– Sickle Cell Anemia
13
protein multiple alignment profile
14
PolyPhen
15
Prediction rate of damaging substitutions

                    possibly   probably
Disease mutations   82%        57%
Divergence          9%         3%
Polymorphism        27%        15%
16
10% of PolyPhen false-positives are due to compensatory substitutions
17
Neutral mutation model
Human       ACCTTGCAAAT
Chimpanzee  ACCTTACAAAT
Baboon      ACCTTACAAAT
Prob(TAC->TGC)
Prob(TGC->TAC)
Prob(XY1Z->XY2Z): 64x3 matrix
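One way the counts behind such a context-dependent matrix could be gathered is sketched below in Python. It assumes, beyond what the slide states, that the three sequences are gap-free and aligned, that the baboon sequence serves as an outgroup, and that the ancestral base is inferred by simple parsimony. The function name is hypothetical.

```python
from collections import Counter

# 64 trinucleotide contexts, each with 3 possible changes of the middle base.
N_RATES = 4 ** 3 * 3


def count_context_substitutions(human: str, chimp: str, outgroup: str) -> Counter:
    """Count substitutions XY1Z -> XY2Z, keyed as (ancestral, derived) triplets.

    A site is used only if both flanking bases are identical in all three
    species. The lineage that disagrees with the outgroup is taken as derived.
    """
    if not (len(human) == len(chimp) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter()
    for i in range(1, len(human) - 1):
        triplets = [seq[i - 1:i + 2] for seq in (human, chimp, outgroup)]
        if len({(t[0], t[2]) for t in triplets}) != 1:
            continue  # flanking context differs between species
        h, c, o = (t[1] for t in triplets)
        if h == c:
            continue  # no human-chimp difference at this site
        left, right = triplets[2][0], triplets[2][2]
        if c == o:  # change occurred on the human lineage
            counts[(left + o + right, left + h + right)] += 1
        elif h == o:  # change occurred on the chimpanzee lineage
            counts[(left + o + right, left + c + right)] += 1
        # if all three bases differ, the ancestral state is ambiguous: skip
    return counts


example = count_context_substitutions("ACCTTGCAAAT", "ACCTTACAAAT", "ACCTTACAAAT")
print(example)   # Counter({('TAC', 'TGC'): 1})
print(N_RATES)   # 192
```

For the slide's alignment the single difference is read as TAC->TGC on the human lineage. Turning such counts into probabilities would additionally require the number of occurrences of each context, which this sketch does not compute.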
18
Strongly detrimental mutations
19
Effectively neutral mutations
20
Mildly deleterious mutations
21
54 genes, 757 individuals
Inflammatory response: 236 genes, 46-47 individuals
DNA repair and cell cycle pathways: 518 genes, 90-95 individuals
22
Fitness and selection coefficient
Wild type: N1 = 4, fitness = 1
New mutation: N2 = 3, fitness = N2/N1 = 1 - s
s: selection coefficient
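The arithmetic on this slide can be checked with a short Python sketch. It reads the slide's numbers as N1 = 4 offspring for the wild type and N2 = 3 for the new mutation, with relative fitness N2/N1 = 1 - s. The function name is an illustrative assumption.

```python
def selection_coefficient(n_wild_type: int, n_mutant: int) -> float:
    """Selection coefficient s, defined by relative fitness n_mutant / n_wild_type = 1 - s."""
    if n_wild_type <= 0:
        raise ValueError("wild-type offspring number must be positive")
    relative_fitness = n_mutant / n_wild_type
    return 1 - relative_fitness


s = selection_coefficient(4, 3)
print(s)  # 0.25
```

With these numbers the mutant's relative fitness is 0.75, so s = 0.25.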
23
Classical association studies
Control, Disease
Common
24
Genetic polymorphism
– Genetic Polymorphism: A difference in DNA sequence among individuals, groups, or populations.
– Genetic Mutation: A change in the nucleotide sequence of a DNA molecule. Genetic mutations are a kind of genetic polymorphism.
Genetic Variation:
– Single nucleotide polymorphism (point mutation)
– Repeat heterogeneity
25
SNP: Single Nucleotide Polymorphisms
– A Single Nucleotide Polymorphism is a source of variation in a genome.
– A SNP ("snip") is a single-base mutation in DNA.
– SNPs are the simplest and most common form of genetic polymorphism in the human genome (90% of all human DNA polymorphisms).
– There are two types of nucleotide base substitutions resulting in SNPs:
  – Transition: substitution between purines (A, G) or between pyrimidines (C, T). Transitions constitute two thirds of all SNPs.
  – Transversion: substitution between a purine and a pyrimidine.
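The transition/transversion distinction above can be expressed as a small Python sketch. The function name is an illustrative assumption. The final lines simply enumerate the twelve possible base changes.

```python
from itertools import permutations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError("expected two different bases from A, C, G, T")
    same_class = ({ref, alt} <= PURINES) or ({ref, alt} <= PYRIMIDINES)
    return "transition" if same_class else "transversion"


print(classify_substitution("A", "G"))  # transition
print(classify_substitution("C", "A"))  # transversion

# Of the 12 possible ordered substitutions, 4 are transitions and 8 are transversions.
kinds = [classify_substitution(a, b) for a, b in permutations("ACGT", 2)]
print(kinds.count("transition"), kinds.count("transversion"))  # 4 8
```

Only 4 of the 12 possible changes are transitions, so the two-thirds share quoted on the slide means transitions are observed far more often than equal rates for all changes would predict.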
26
SNP
– Instead of using restriction enzymes, these are found by direct sequencing
– They are extremely useful for mapping
Markers:
  Classical Mendelian   100
  RFLPs                 7000
  SNPs                  1.4 x 10^6
Example: ...ACGGCTAA... versus ...ATGGCTAA...
– SNPs occur every 300-1000 bp along the 3 billion bp human genome
– Many SNPs have no effect on cell function
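Finding SNPs by direct sequencing amounts to comparing aligned reads base by base. A minimal Python sketch, using the two example sequences from this slide and assuming they are already aligned and of equal length (the function name is hypothetical):

```python
def find_snps(seq_a: str, seq_b: str) -> list:
    """Return (0-based position, base in seq_a, base in seq_b) for each mismatch."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


print(find_snps("ACGGCTAA", "ATGGCTAA"))  # [(1, 'C', 'T')]
```

Real SNP discovery also has to separate true variants from sequencing errors, which this sketch ignores.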
27
Human Genome and SNPs
– Human genome is (mostly) sequenced; attention is turning to the evaluation of variation
– Alterations in DNA involving a single base pair are called single nucleotide polymorphisms, or SNPs
– Map of ~1.4 million SNPs (Feb 2001)
– It is estimated that ~60,000 SNPs occur within exons
28
Goals of SNP Initiatives
Immediate goals:
– Detection/identification of all SNPs estimated to be present in the human genome
– Interest also in other organisms, e.g. potatoes(!)
– Establishment of SNP Database(s)
29
SNPs
– Humans are genetically >99 per cent identical; it is the remaining tiny percentage that differs between individuals.
– Much of our genetic variation is caused by single-nucleotide differences in our DNA: these are called single nucleotide polymorphisms, or SNPs. As a result, each of us has a unique genotype that typically differs in about three million nucleotides from every other person.
– SNPs occur about once every 300-1000 base pairs in the genome, and the frequency of a particular polymorphism tends to remain stable in the population.
– Because only about 3 to 5 percent of a person's DNA sequence codes for the production of proteins, most SNPs are found outside of "coding sequences".
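The density figure can be turned into a rough count with a few lines of Python. The genome length of 3 billion bp is the round figure used elsewhere in this presentation, and the function name is an illustrative assumption.

```python
GENOME_LENGTH_BP = 3_000_000_000  # round figure for the human genome


def expected_snp_count(genome_length_bp: int, spacing_bp: int) -> int:
    """Number of SNPs expected if one occurs every `spacing_bp` base pairs."""
    return genome_length_bp // spacing_bp


low = expected_snp_count(GENOME_LENGTH_BP, 1000)
high = expected_snp_count(GENOME_LENGTH_BP, 300)
print(f"{low:,} to {high:,} SNPs")  # 3,000,000 to 10,000,000 SNPs
```

The lower end of this range matches the slide's figure of about three million nucleotide differences between two people.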
30
Longer term goals: Areas of SNP Application
– Gene discovery and mapping
– Association-based candidate polymorphism testing
– Diagnostics/risk profiling
– Response prediction
– Homogeneity testing/study design
– Gene function identification
– etc.
31
Polymorphism
– Technical definition: the most common variant (allele) occurs with less than 99% frequency in the population
– Also used as a general term for variation
– Many types of DNA polymorphisms, including RFLPs, VNTRs, microsatellites
– ‘Highly polymorphic’ = many variants
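The technical definition above is a simple threshold test on allele frequencies. A minimal Python sketch, with a hypothetical function name and made-up allele counts:

```python
def is_polymorphic(allele_counts: dict, threshold: float = 0.99) -> bool:
    """True if the most common allele has a frequency below `threshold`."""
    total = sum(allele_counts.values())
    if total <= 0:
        raise ValueError("need at least one observed allele")
    major_frequency = max(allele_counts.values()) / total
    return major_frequency < threshold


# Made-up counts from 1000 chromosomes at two hypothetical sites.
print(is_polymorphic({"A": 900, "G": 100}))  # True  (major allele at 90%)
print(is_polymorphic({"A": 995, "G": 5}))    # False (major allele at 99.5%)
```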
32
SNPs in Genetic Analysis
– Abundance: lots
– Position: throughout genome
– Haplotype patterns: groups of SNPs may provide exploitable diversity
– Rapid and efficient to genotype
– Increased stability over other types of mutation
– Recombination patterns, e.g. ‘hot spots’
33
Coding Region SNPs
– Occasionally, a SNP may actually cause a disease.
– SNPs within a coding sequence are of particular interest to researchers because they are more likely to alter the biological function of a protein.
– Types of coding region SNPs:
  – Synonymous: the substitution causes no amino acid change in the encoded protein. This is also called a silent mutation.
  – Non-synonymous: the substitution alters the encoded amino acid. A missense mutation changes the codon so that it specifies a different amino acid. A nonsense mutation converts the codon into a stop codon, causing premature termination.
– One half of all coding sequence SNPs result in non-synonymous codon changes.
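The categories above can be illustrated with a short Python sketch that compares a reference codon with its single-base variant using the standard genetic code. The function name and the example codons are illustrative assumptions. Stop-codon losses are not separated out in this simplified version.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def classify_coding_snp(ref_codon: str, alt_codon: str) -> str:
    """Classify a single-base codon change as synonymous, missense or nonsense."""
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    if len(ref_codon) != 3 or len(alt_codon) != 3:
        raise ValueError("codons must be three bases long")
    if sum(a != b for a, b in zip(ref_codon, alt_codon)) != 1:
        raise ValueError("codons must differ at exactly one base")
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    # Simplification: a change from a stop codon to an amino acid also lands here.
    return "missense"


print(classify_coding_snp("GAA", "GAG"))  # synonymous (Glu -> Glu)
print(classify_coding_snp("GAG", "GTG"))  # missense   (Glu -> Val)
print(classify_coding_snp("TGG", "TGA"))  # nonsense   (Trp -> stop)
```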
34
Intergenic SNPs
Researchers have found that most SNPs are not responsible for a disease state because they are intergenic. Instead, they serve as biological markers for pinpointing a disease on the human genome map, because they are usually located near a gene found to be associated with a certain disease.
Scientists have long known that diseases caused by single genes and inherited according to the laws of Mendel are actually rare. Most common diseases, like diabetes, are caused by multiple genes, and finding all of these genes is a difficult task. Recently, attention has focused on the idea that all of the genes involved can be traced by using SNPs. By comparing the SNP patterns in affected and non-affected individuals (patients with diabetes and healthy controls, for example), scientists can catalog the specific DNA variations that underlie susceptibility to diabetes.
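One common way to compare SNP patterns between affected and non-affected individuals is an allelic chi-square test on a 2x2 table of allele counts. The Python sketch below is an illustration under stated assumptions, not the presenter's method: the counts are made up, the function name is hypothetical, and the p-value uses the one-degree-of-freedom chi-square distribution.

```python
import math


def allelic_chi_square(case_a: int, case_b: int, control_a: int, control_b: int):
    """Pearson chi-square (1 d.f.) for allele counts in cases versus controls.

    Returns (chi-square statistic, p-value).
    """
    n = case_a + case_b + control_a + control_b
    denominator = (
        (case_a + case_b)
        * (control_a + control_b)
        * (case_a + control_a)
        * (case_b + control_b)
    )
    if denominator == 0:
        raise ValueError("every row and column of the table must be non-empty")
    chi2 = n * (case_a * control_b - case_b * control_a) ** 2 / denominator
    # Survival function of the chi-square distribution with 1 d.f.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Made-up allele counts: 100 case chromosomes and 100 control chromosomes.
chi2, p = allelic_chi_square(case_a=60, case_b=40, control_a=40, control_b=60)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")  # chi2 = 8.00, p = 0.0047
```

In a genome-wide setting the same test is repeated for many SNPs, so the significance threshold has to be corrected for multiple testing.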
35
Polymorphic Sites Revealed in Sequencing
36
Medium- and Low-throughput SNP Genotyping
I. SNP discovery and validation.
   A. Database mining, “resequencing” on microarrays, de novo sequencing of EST libraries.
   B. Genotyping of pooled samples for determining heterozygosity.
II. How many SNPs are to be typed in how many samples?
   A. What degree of multiplexing is possible for the “before-typing” PCR reactions?
   B. What degree of multiplexing is possible for the genotyping reactions?
III. What is the appropriate platform given the size of the project, the budget and the degree of automation desired?
38
Mapping 100K Coverage: 116,204 SNPs
July 2003, NCBI build 34
Red = at least 1 SNP per 100 kb
Black = gaps in genome coverage
– 92% of genome within 100 kb of a SNP
– 83% of genome within 50 kb of a SNP
– 50% of genome within 15 kb of a SNP
– 25% of genome within 5 kb of a SNP
39
Chemistry/Demultiplexing/Detection Options in SNP Genotyping
Diagram columns: Enzyme Chemistry, Demultiplexing, Detection Method, Platform/Company
Diagram entries:
– Allele-Specific Hybridization
– Allele-Specific Extend + Ligate
– Allele-Specific PCR
– Single Nucleotide Primer Extension
– Oligonucleotide Ligation Assay
– Homogeneous
– Semi-Homogeneous
– Solid phase microarray
– Solid phase microspheres
– Capillary Electrophoresis
– Fluorescence
– Fluorescence Polarization
– Fluorescence Resonance Energy Transfer (FRET)
– Mass Spectrometry
– Sequenom iPlex™ Mass Spec.
– “DASH”, Amplicon Tm
– Luminex 100 Flow Cytometry
– ABI SNPlex™
– ABI SNaPshot™
– Microarray Minisequencing
– Perkin-Elmer FP-TDI
– ABI Taqman™ 5’-Nuclease
– Illumina BeadArray™
41
Enzymatic Options in SNP Genotyping
– Single Base Primer Extension, “Minisequencing”
– Allele-specific Primer Extension
– Allele-specific Primer Extension and Ligation
– Allele-specific Hybridization
– PCR only: Tm-shift Primers
Diagram labels: SBE Primer; LSO Probes; ddC-biot or ddA-biot; ddA-biot, dATP, dTTP, dGTP; Short GC, Long GC
42
SNP Genotyping on Beads/Microarrays
1) Selection of SNPs
2) Design of PCR and “Tag” SBE/ASPE primers
3) Preparation of beads with “Anti-Tag” primers
4) Multiplex PCR
5) Cyclic SBE/ASPE with biot(fluor.)-ddNTP/dNTP
6) Capture of products on beads
7) Signal measurement in flow cytometer/scanner
43
Single Base Extension (SBE) of Targets on Microarrays
Pastinen, et al., Gen. Res. 7, 606, 1997
44
SBE (Minisequencing) of Target DNA with Glass-immobilized primers
45
Allele-Specific Extension & Identification in CE: “Minisequencing” (ABI SNaPshot™)
46
Degree of Multiplexing Depends on Resolution in CE
ABI SNaPshot® on 3130xl
Dye labels: dR6G, dR110
47
Fluorescence Polarization
Gen. Res. 9: 492, 1999
48
SBE (Minisequencing) with Detection by Fluorescence Polarization
Gen. Res. 9: 492, 1999
49
Genotyping by SBE and Mass Spectrometry
– PCR amplification
– SAP treatment
– Single base extension
– Spot on 384-place chips
– MALDI-TOF mass spectrometry
50
Allele-specific Primer Extension (ASPE) with Chain Termination
52
Use of Allele-specific Probes in Genotyping by Melting Curve Analysis: “DASH”
Nature Biotech. 17: 87, 1999
Figure labels: One base mismatch, Matched, Heterozygote, Intercalating dye
53
Use of Modified Tm-shifting Primers in Genotyping
Wang, et al., Biotechniques 39: 885, 2005
54
Bead Technologies for SNP Genotyping/Gene Expression and Massively Parallel Sequencing (not currently supported in CIF)
Bead Arrays: DNA immobilized on silica or polystyrene beads; random array requires decoding steps.
1) Lynx (www.lynxgen.com). In rows. Limited to ca. 20 bases/read.
2) Illumina BeadChip (www.illumina.com). In etched microwells.
3) Luminex coded microspheres (luminexcorp.com). Measurements by flow cytometry.
4) 454 LifeSciences (www.454.com). Clonal amplification and sequencing on 28 µm beads. Minimum 100 bases/read.
55
Lynx/Solexa Bead Arrays for Gene Expression and MPSS
Clones on Beads
Brenner et al., PNAS 97: 1665, 2000, and Nature Biotech. 18: 630, 2000
– 1.8 x 10^15 unique tags
– Separate loaded from unloaded beads (FACS), ligate to anti-tag.
– Competitively hybridize beads with labeled libraries, then sort by FACS, OR…
– Sequence signatures with type IIs restriction enzymes and labeled, encoded adaptors.
56
Expression profiling with Illumina BeadChips in Microwells
Gen. Res. 14: 870 & 2347, 2004
– Total setup costs, satellite facility: <$6000
– HumanRef-8: 24k probes, $100/sample, $50 labeling
– Random loading of beads in etched 3 µm microwells
– Decoding by sequential hybridization: 11012202. 3^8 = 6561 codes (4^8 = 65,536)
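The code counts on this slide follow from the number of distinguishable states per hybridization round raised to the number of rounds. The Python sketch below assumes, as an interpretation rather than something the slide states, that "11012202" is an eight-digit base-3 code with one digit per round. The function names are hypothetical.

```python
def number_of_codes(states_per_round: int, rounds: int) -> int:
    """How many distinct bead codes a sequential-hybridization scheme can give."""
    return states_per_round ** rounds


def code_to_index(code: str, states_per_round: int = 3) -> int:
    """Interpret a decoding string as a number in base `states_per_round`."""
    return int(code, states_per_round)


print(number_of_codes(3, 8))      # 6561
print(number_of_codes(4, 8))      # 65536
print(code_to_index("11012202"))  # 3071
```

Under this reading, eight rounds with three states each are enough to tell apart 6561 bead types, and a fourth state would raise that to 65,536.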
57
Illumina Allele Specific Primer Extension (ASPE) and Ligation
– ASOs and LSOs
– Cy3 and Cy5-labeled universal primers
58
Luminex coded microspheres and multiplexed assays
Green laser: Up to 100 different transcripts can be monitored simultaneously in high throughput by flow cytometry, e.g., “PR” genes in Arabidopsis (Gen. Res. 11: 1888, 2001) and 217 miRNAs in human cancers (Nature 435: 834, 2005).
Red laser: Coding is in the ratio of red and orange fluorescence inside the microsphere.
59
SNP Genotyping Costs by Platform
Columns: Platform | # SNPs/sample | # samples | $ Oligo Set/$ SNP | $ Mix/SNP | $ per SNP | Min $
Illumina (UCLA): 15364880.0969,892
AB SNPlex (ABI 3730): 48; 5000; 500; 72/0.0144; 0.04; 0.20; 0.078; 0.214; 14,840
AB SNaPshot (ABI 3100): 50; 500; 50/0.10; 0.476; 0.576; 14,400
AB Taqman (ABI 7700): 1; 750; 310/0.413; 0.751.21910
Allele-specific PCR: 50; 5000; 500; 17.60/0.0035; 17.60/0.035; 0.422; 0.43
60
S.-H. Lee et al., Theor. Appl. Genet. 110:167, 2004