Published by Dwain Turner. Modified over 9 years ago.
1
MEDG 505 Pharmacogenomics March 17, 2005 A. Brooks-Wilson
2
Reminder: What is Genomics?
According to “Genomics is operationally defined as investigations into the structure and function of very large numbers of genes undertaken in a simultaneous fashion” So lots of people are trying to do pharmacogenetics but is anyone doing pharmacogenomics?
3
Pharmacogenetics
“The study of how genes affect people’s response to medicines” (NIH)
A subset of complex genetics for which the traits relate to drugs
First observed in 1957
Part of “personalized medicine”
20-95% of variability in drug disposition and effects is thought to be genetic
Non-genetic factors: age, interacting medications, organ function
Drug absorption, distribution, metabolism, excretion: >30 families of genes
So I’ll talk about complex genetics and let you consider that in the context of drug response, etc.
4
Pharmacogenetics: Examples
Drug metabolism genes
  NAT2: isoniazid (anti-tuberculosis drug) hepatotoxicity
  CYP3A5: many drugs
  Thiopurine S-methyltransferase (TPMT): 6-thioguanine
Drug targets (receptors)
  β2-adrenergic receptor: inhaled β-agonists for asthma
Drug transporters
  P-glycoprotein (ABCB1, MDR1): resistance to anti-epileptic drugs
The examples known today are those that come closest to simple genetic traits
5
Potential Consequences
Extended / shortened pharmacological effect
Adverse drug reactions
Lack of pro-drug activation
Increased / decreased effective dose
Metabolism by alternative, deleterious pathways
Exacerbated drug-drug interactions
6
The Goal of Pharmacogenomics
Picture from Perlegen website:
7
Complex Genetics: Concepts
Family studies vs. population studies
Penetrance
Genetic heterogeneity
Linkage vs. association
Haplotypes in family and association studies
Genetic variation, SNPs
Genotyping
These concepts relate to PGX-related and other traits
8
Types of Genetic Studies
Family studies
  multi-generation families
Association studies
  Case / control (easiest to collect)
9
Penetrance
Penetrance = the proportion of carriers who show the phenotype
Expressivity = severity of the phenotype
10
Genetic Heterogeneity
Locus heterogeneity (what we usually refer to when we talk about genetic heterogeneity) Allelic heterogeneity Examples: in a family study, effect in an association study, effect
11
Family Studies Identify Highly Penetrant Mutations
High penetrance disease allele(s)
Availability of suitable families is the limiting factor
Family studies are effective for only a minority of conditions
Many of you are well acquainted with the established and successful method of identifying cancer genes through the use of cancer families, studies of the kind that produced notable successes like BRCA1 or MEN2. These are the special cases of cancer genetics, where the genetic defect has such a clear effect and high enough penetrance that the pattern of inheritance can be distinguished in families. Penetrance is the proportion of mutation carriers that develop cancer. For a high penetrance mutation, if an individual inherits the mutation, it is very probable that they will develop the cancer. Families with clear inheritance of a type of cancer are immensely valuable but often hard to find in sufficient size and number to identify the disease gene. These cases are the minority, extreme examples of inherited cancers. Availability of families not linked to known cancer genes will determine how much time I would invest in this method. But what of the majority of cancers?
12
Association Studies Can Identify Variants with High or Low Penetrance
Case / control groups
Not limited to high penetrance alleles
Amenable to the study of gene-environment interactions
A preferred approach for the majority of complex genetic disorders
KNOW EXAMPLES OF SUCCESSES FOR CANCER
The majority of cancers are late adult onset, and will have both genetic and environmental components to their etiology. As opposed to a high penetrance gene, more subtle genetic variation may lead to cancer only in the presence of other genetic or environmental factors. This is well documented in mice where, if a cancer-causing mutation is bred onto two different background strains of mice, the animals may develop cancer in one background but be cancer-free in another.
- Imagine this situation in the mice, where cancer risk depends on presence or absence of a mutation, AND on genetic background.
- Now imagine that there are 20 strains of mice, all with different genetic backgrounds that affect development of different tumours in different ways.
- Now vary the animals' diet and living conditions.
- Now allow these mice to mate for several generations in an essentially random fashion.
- What you have is a model for a human population.
As for other complex disorders like type II diabetes, the majority of cancers will be more easily approached through a population-based approach.
13
Complex Diseases / Phenotypes
Multigenic (genetic heterogeneity)
Environmental effects (multiple)
Gene-gene interactions
Gene-environment interactions (for pharmacogenetic traits: age, alcohol consumption, hepatitis exposure, etc.)
Association studies will hold up under these complications but family-based linkage studies will not!
14
Linkage vs. Association
Linkage is to a locus
  different families can be linked to the same locus but have different disease alleles
  how to take advantage of this in proving a gene is responsible for a disease
Association is with an allele
  done in groups or populations
  the allele arose and was propagated in the population; the haplotype was degraded by recombination
15
Genetic Markers
SNPs: Substitutions, for example, C / T
  Most common type of genetic variation
  Ideal for association mapping over short distances
  1 SNP every ~200 base pairs in a population
  1 SNP every ~1000 base pairs between 2 individuals
  dbSNP: >10M putative SNPs, >5M validated SNPs
Microsatellites: (CA)n or other short repeats
  More polymorphic than SNPs
  Less common than SNPs
  1 polymorphic microsatellite per ~100,000 base pairs
  Best for linkage mapping over long distances, in families
Microsatellites are the preferred marker type for mapping in families. These markers, which are further apart than SNPs, can be used because in families the small expected number of recombination events means that very large chunks of chromosomes are often shared between family members, and can be used in linkage mapping. In association studies between unrelated individuals, ancestral recombination events in the population have decreased the size of the shared region, necessitating the use of closely linked markers.
16
SNPs Single Nucleotide Polymorphisms
Can also use “Indels”, though some investigators throw them away! Synonymous, non-synonymous SNPs Mutation vs. polymorphism vs. variant or variation The 1% definition
17
SNP Databases
dbSNP (more than just human)
Human Genome Variation Database
At least 11 others!
~10 million SNPs with minor allele >1%
~7 million SNPs with minor allele >5%
~50,000 non-synonymous SNPs in the human genome
Kruglyak and Nickerson, 2001
18
Case / Control Studies
Collect blood samples from patients and controls, with consent
Establish database of clinical and epidemiological data
Select ‘candidate’ genes of interest for each trait
Sequence the candidate genes in a small group of patients
Genotype selected variants in case / control groups
Analyze for association with a phenotype
Analyze for gene-gene and gene-environment interactions
Genetic, Ethical, Legal and Social (GELS) issues investigations
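The "analyze for association with a phenotype" step can be illustrated with a minimal sketch. It assumes a single biallelic SNP, collapses genotype counts to allele counts, and applies a 1-degree-of-freedom chi-square test to the resulting 2x2 table. The function name and the genotype counts are hypothetical illustrations, not data or methods from the lecture.

```python
from math import erfc, sqrt

def allelic_association(case_genotypes, control_genotypes):
    """Allelic chi-square test for one biallelic SNP.

    Each argument is a tuple of genotype counts (n_AA, n_Aa, n_aa).
    Returns (chi2, p_value, odds_ratio) for allele A vs. allele a.
    """
    def allele_counts(genotypes):
        n_AA, n_Aa, n_aa = genotypes
        return 2 * n_AA + n_Aa, 2 * n_aa + n_Aa

    a, b = allele_counts(case_genotypes)      # A and a alleles in cases
    c, d = allele_counts(control_genotypes)   # A and a alleles in controls
    n = a + b + c + d

    # Pearson chi-square for a 2x2 table (no continuity correction)
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 df
    p_value = erfc(sqrt(chi2 / 2))
    odds_ratio = (a * d) / (b * c)
    return chi2, p_value, odds_ratio

# Hypothetical counts: allele A at 40% in cases, 20% in controls
chi2, p_value, odds_ratio = allelic_association((20, 40, 40), (5, 30, 65))
print(f"chi2={chi2:.2f}  p={p_value:.2g}  OR={odds_ratio:.2f}")
```

A real analysis would normally also check Hardy-Weinberg equilibrium in controls and correct for the number of variants tested; those steps are left out here.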
19
Linkage Disequilibrium
The difference between the observed frequency of a haplotype and its expected frequency if all alleles were segregating randomly
For adjacent loci with alleles A,a and B,b: $D = P_{AB} - P_A P_B$
D is dependent on allele frequencies
Other related measures also used
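As a small worked sketch of the definition, the function below computes D from a haplotype frequency and two allele frequencies. The slide does not name the "other related measures"; D' and r² are included here as two commonly used ones, and that choice is an assumption of this example. The function name and input values are hypothetical.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, D_prime, r_squared) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    d = p_ab - p_a * p_b

    # D' rescales D by its maximum possible magnitude given the allele frequencies
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0

    # r^2 is the squared correlation between alleles at the two loci
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r_squared

# Hypothetical frequencies: alleles A and B always travel together
print(ld_measures(0.5, 0.5, 0.5))
# Hypothetical frequencies: alleles are independent
print(ld_measures(0.25, 0.5, 0.5))
```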
20
Human haplotype blocks . . .
Ancestral chromosomes Observed pattern of historical recombination in common haplotypes Looking for variation over short distances would require many markers to assess the whole human genome Early estimates were as high as 2 million markers Recent groundbreaking studies reported in October of this year reveal that haplotypes observed within human populations tend to consist of kb blocks that tend not to recombine, separated by 1-2 kb hot spots for recombination. This is expected to decrease the number of markers necessary to conduct genome-wide scanning for association to as low as 60,000. We will generate haplotype information in the vicinity of each candidate gene we assess, in order to maximize the genetic information obtained from each analysis. Rather than 50 kb
21
. . . Simplify association studies
[Figure: ancestral chromosomes carrying SNP1 and SNP2; a disease-causing mutation (*) arises on one haplotype and is then associated with nearby SNPs; the location of the mutation is shown relative to the gene]
Even if SNP1 and SNP2 are the only SNPs genotyped (i.e., the mutation itself is not genotyped), the haplotype blocks allow the mutation to be assigned to a minimal region
Reduces the number of markers required.
22
LD and Association
Direct association
  asks about the effect of a variant
  if negative, the gene may still be involved!
Indirect association
  uses LD
  can be more convincingly negative if haplotypes are assessed
23
Haplotype Blocks Became clear in October 2001
87% of the genome is in blocks ~> 30 kb Not all of the genome is in haplotype blocks! Average block 22 kb, 11kb in African populations (Gabriel et al, 2002) A few common haplotypes at a given locus in a given population African populations generally have the greatest number of haplotypes and the shortest haplotype blocks Strength of LD and size of blocks varies greatly between regions
24
How to Generate Haplotypes
Haplotyping in families
Physical determination
  long-range PCR, separation of molecules
  cloning of single molecules
  labor intensive
Estimate haplotype frequencies
  Expectation Maximization algorithm, others
  generate frequencies for case group, control group
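The Expectation Maximization idea can be sketched for the simplest case of two biallelic SNPs typed in unrelated individuals. Only double heterozygotes are phase-ambiguous, so the E-step splits them between the two possible phasings in proportion to the current haplotype frequency estimates, and the M-step recounts. This is a minimal illustration under those assumptions, not the software used in the lecture; the function name and genotype data are hypothetical.

```python
from collections import Counter

def em_haplotype_frequencies(genotypes, max_iter=100, tol=1e-10):
    """Estimate two-SNP haplotype frequencies by EM.

    genotypes: list of (g1, g2), where each value is the number of copies
               (0, 1 or 2) of allele '1' at SNP1 and SNP2.
    Returns a dict mapping haplotype (allele1, allele2) to frequency.
    """
    counts = Counter(genotypes)
    n_chromosomes = 2 * len(genotypes)
    haplotypes = [(0, 0), (0, 1), (1, 0), (1, 1)]

    # Count haplotypes whose phase is unambiguous
    known = {h: 0.0 for h in haplotypes}
    for (g1, g2), n in counts.items():
        if (g1, g2) == (1, 1):
            continue  # double heterozygote: phase unknown
        if g1 == 1:
            pair = [(0, g2 // 2), (1, g2 // 2)]
        elif g2 == 1:
            pair = [(g1 // 2, 0), (g1 // 2, 1)]
        else:
            pair = [(g1 // 2, g2 // 2)] * 2
        for h in pair:
            known[h] += n
    n_double_het = counts.get((1, 1), 0)

    freqs = {h: 0.25 for h in haplotypes}  # uniform starting point
    for _ in range(max_iter):
        # E-step: probability a double heterozygote is 00/11 rather than 01/10
        cis = freqs[(0, 0)] * freqs[(1, 1)]
        trans = freqs[(0, 1)] * freqs[(1, 0)]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5

        # M-step: add expected counts from double heterozygotes and renormalize
        expected = dict(known)
        expected[(0, 0)] += n_double_het * w
        expected[(1, 1)] += n_double_het * w
        expected[(0, 1)] += n_double_het * (1 - w)
        expected[(1, 0)] += n_double_het * (1 - w)
        new_freqs = {h: c / n_chromosomes for h, c in expected.items()}

        change = max(abs(new_freqs[h] - freqs[h]) for h in haplotypes)
        freqs = new_freqs
        if change < tol:
            break
    return freqs

# Hypothetical sample: run once per group (cases, controls) in practice
sample = [(0, 0)] * 5 + [(2, 2)] * 5 + [(1, 1)] * 10
print(em_haplotype_frequencies(sample))
```

Running the function separately on the case group and the control group gives the two sets of haplotype frequencies that the slide refers to.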
25
Tag SNPs
[Figure: chromosome copies 1, 2 and 3]
26
The HapMap
Reference map for association studies
Expected to reduce the number of markers required to conduct effective genome scans for association
270 samples from 4 populations:
  30 Yoruban trios (Nigeria)
  45 unrelated Japanese (Tokyo)
  45 unrelated Chinese (Beijing)
  30 U.S. trios (CEPH, N/W European ancestry)
>400,000 markers genotyped in all samples, nearly 1M in CEPH trios
27
Strategies
Candidate gene based studies
  hypothesis-driven
  must guess (one of) the right gene(s)!!
  Current state of the art
Genome scans
  “hypothesis-free”
  scans of ~1 million markers are now possible
28
SNP Discovery is Still Necessary
Many have been found by multi-read sequence mining Directed public SNP discovery in certain sets of genes, e.g.: SNP500Cancer Environmental Genome Project (EGP) Individuals used usually “unaffected” My own group finds previously unknown SNPs every time we sequence a gene in multiple individuals. Not all of these SNPs we find are rare.
29
SNP Discovery
All exons and regulatory regions of each gene
Identify regulatory regions by comparative genomics
Bi-directional sequencing
Denaturing High Performance Liquid Chromatography (DHPLC)
Other methods
Comparative genomics uses evolutionary conservation
How many individuals to sequence is an issue
Relates to the statistical power of a study
I do 95 individuals. Accepting up to 10% sequence dropout in each of the forward and reverse directions, expect at least 75 individuals to have good bi-directional sequences (150 chromosomes). A variant observed in 1/150 chromosomes would have a frequency of approximately 0.7%.
Public data was shown by John Todd’s group to be inadequate for the specification of all common haplotypes.
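The sample-size arithmetic in the notes can be checked with a short sketch. It assumes forward and reverse dropouts are independent (an assumption of this example, not stated in the lecture), and it adds the probability of seeing a variant of a given frequency at least once, which is one way to read "statistical power" here. All function names are hypothetical.

```python
def expected_bidirectional(n_individuals, dropout_per_direction):
    """Expected number of individuals with good reads in both directions,
    assuming forward and reverse dropouts are independent."""
    return n_individuals * (1 - dropout_per_direction) ** 2

def lowest_observable_frequency(n_chromosomes):
    """Frequency of a variant seen on exactly one chromosome."""
    return 1 / n_chromosomes

def detection_probability(variant_frequency, n_chromosomes):
    """Probability that a variant at the given population frequency appears
    on at least one of the sampled chromosomes."""
    return 1 - (1 - variant_frequency) ** n_chromosomes

good = expected_bidirectional(95, 0.10)
print(f"expected good bi-directional samples: {good:.2f}")
print(f"singleton frequency in 150 chromosomes: {lowest_observable_frequency(150):.4f}")
print(f"chance of seeing a 1% variant in 150 chromosomes: {detection_probability(0.01, 150):.3f}")
```

Under these assumptions roughly 77 of 95 individuals are expected to pass, consistent with the "at least 75" figure, and a singleton in 150 chromosomes corresponds to about 0.7%.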
30
The re-sequencing pipeline
1 Template aliquotting: Robbins Hydra
2 PCR set-up: Packard Multiprobe II liquid handler
3 PCR: MJ Tetrads
4 Purification of PCR products: Agencourt
5 Cycle sequencing: MJ Tetrads
6 Sequencing: ABI 3700s
PROCESS DEVELOPMENT
We hand off to the sequencing group just at the machine-load stage
THIS PIPELINE IS FULLY ESTABLISHED
Started out slowly but now it’s cranking
31
SNP Discovery: PolyPhred and Consed
A screen shot from the NHL Sequence Analysis PolyPhred developed by Debbie Nickerson’s group PolyPhred: Debbie Nickerson; Consed, Phil Green
32
Sample Output GG GA AA Collaboration with David Huntsman
Found 5 E-cadherin mutations in the first batch of 22 families so far
33
Genotyping, Technology
Determining the allele(s) present in a particular sample at a particular (SNP) marker Many methods
34
TaqMan (ABI): Uniplex genotyping
35
TaqMan
36
TaqMan Output Homozygous 1,1 Heterozygous Homozygous 2,2
37
MassEXTEND Reaction
[Diagram courtesy of Sequenom: the same unlabeled 23-mer primer is extended with enzyme, ddATP and dCTP/dGTP/dTTP. Allele 1 yields an extended primer of 24 nucleotides; allele 2 yields an extended primer of 26 nucleotides.]
Allelotyping: determination of allele frequencies within pools of DNA samples
The key design feature is the use of a terminator mix that maximizes the mass difference between alleles. In this example, dideoxyA is used with deoxyC, G, and T. For allele 1, the dideoxyA is incorporated immediately, extending a 23-mer (seen here in the MS) to a 24-mer (seen here). For allele 2, the SNP calls for incorporation of a T residue, then a G, prior to incorporation of the dideoxyA, extending the 23-mer PROBE primer to a 26-mer. The difference in mass between these two products is enormous compared to the resolution of the MS, and as you will see, this allows for completely automated, error-free calling of the genotype. I should also point out that although this SNP is depicted as biallelic, the PROBE assay design combined with the resolution of the MS would detect other alleles as well. Indeed, we have customers who have discovered triallelic SNPs previously thought to be biallelic.
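The termination logic described in these notes can be mimicked with a toy sketch: the primer grows one nucleotide at a time and stops as soon as a base supplied only in dideoxy form is incorporated. The function name is hypothetical, and the sketch models product length only, not actual masses.

```python
def extended_length(primer_length, bases_to_incorporate, terminators=("A",)):
    """Length of the extension product.

    bases_to_incorporate: nucleotides added to the primer in order,
                          starting at the SNP position.
    terminators: bases supplied as dideoxynucleotides; incorporating one
                 stops the extension.
    """
    length = primer_length
    for base in bases_to_incorporate:
        length += 1
        if base in terminators:
            break  # a dideoxy base blocks further extension
    return length

# Allele 1: ddA is incorporated immediately after the 23-mer primer
print(extended_length(23, "A"))
# Allele 2: T, then G, then the terminating ddA
print(extended_length(23, "TGA"))
```

The two alleles therefore give products of 24 and 26 nucleotides, matching the notes; the two-nucleotide gap is what makes the mass peaks easy to separate.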
38
Sequenom MassARRAY: < 12-plex
[Diagram courtesy of Sequenom: mass spectrum of a 5-plex of SNPs on chromosome 22, with primers marked * and allele peaks A/G, A/G, T/C, A/G, C/T]
As mentioned, element chips can be loaded at a time. This is equal to 3,840 elements. If single-plex assays are loaded, this is 3,840 genotypes; if penta-plex assays are loaded, then 19,200 genotypes are determined per ~4 hour run. We typically multiplex at a level of 5 or 6 (same tube for amplification and MassEXTEND). All SNP assays are computer-designed.
This is an example of a 5-plex of SNPs on chromosome 22. Each SNP is in a different color. In all cases, an asterisk (*) is used to denote the primer. The genotype is indicated by the letter(s) for the allele(s) being underlined. A dotted arrow for a primer means that all of the primer was converted to product. A dotted arrow for an allele simply means that the allele is absent (i.e., the genotype is homozygous for the other allele).
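The throughput figures quoted in these notes follow from multiplying the number of chip elements by the multiplex level. A minimal sketch, with hypothetical function names and the run time taken as the approximate 4 hours mentioned above:

```python
def genotypes_per_run(n_elements, plex_level):
    """Each chip element holds one reaction; each reaction reports plex_level SNPs."""
    return n_elements * plex_level

def genotypes_per_hour(n_elements, plex_level, run_hours=4.0):
    """Approximate hourly rate, assuming a fixed run time."""
    return genotypes_per_run(n_elements, plex_level) / run_hours

for plex in (1, 5, 6):
    print(plex, genotypes_per_run(3840, plex), genotypes_per_hour(3840, plex))
```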
39
Illumina BeadArray System: 1152-plex
1152-fold multiplexing 0.26 ng of genomic DNA per genotype $ 0.05 USD per genotype
40
Illumina BeadArray System
41
ParAllele Molecular Inversion Probes: 10,000 Plex
Allele-specific gap filling The probe is only amplifiable (and can become labelled) in the tube that contained dA.
42
Affymetrix Whole Genome Sampling Analysis: 500,000-plex
Perlegen is similar to Affy “Affy on steroids” Perlegen: At least 1.6 million SNPs per experiment Kennedy et al., 2003
43
Affymetrix: Allele-Specific Hybridization
PM = perfect match
MM = mismatch
44
DNA Pooling Strategies
Reduce the number of genotypes and genotyping cost, particularly for whole genome scans
Pool of case DNAs vs. pool of control DNAs
DNAs must be mixed in precisely equimolar proportions in the pools!
Requires a quantitative genotyping technique
E.g. 40% in cases vs. 20% in controls
Verify positives by genotyping individual samples
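Why the equimolar requirement matters can be shown with a small sketch: the allele frequency measured in a pool is an average of each individual's allele dosage weighted by how much DNA that individual contributed. The function name, genotypes and DNA amounts below are hypothetical, and measurement error in the genotyping assay itself is ignored.

```python
def pool_allele_frequency(genotypes, dna_amounts=None):
    """Allele frequency that a perfectly quantitative assay would read from a pool.

    genotypes:   copies (0, 1 or 2) of the allele carried by each individual
    dna_amounts: relative molar amount of DNA contributed by each individual;
                 defaults to an equimolar pool
    """
    if dna_amounts is None:
        dna_amounts = [1.0] * len(genotypes)
    total = sum(dna_amounts)
    return sum((g / 2) * amount for g, amount in zip(genotypes, dna_amounts)) / total

pool = [2, 0, 0, 0, 0]  # one homozygous carrier among five individuals

# Equimolar pool: reads the true frequency
print(pool_allele_frequency(pool))
# Carrier over-represented threefold: the pool reads far too high
print(pool_allele_frequency(pool, [3, 1, 1, 1, 1]))
```

In this toy case a single mis-pipetted sample moves the apparent frequency from 20% to about 43%, which is the same size as the case/control difference in the slide's example. That is one reason positives are verified by genotyping individual samples.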