Generation and Analysis of AFLP Data. ESPM 150/290: Biology, Ecology, and Genetics of Forest Diseases. Laboratory Exercise, April 1, 2010
Some Considerations in Choosing a Genotyping Method. What level of taxonomic resolution is desired (populations, species, phyla)? Comparing distantly related individuals requires slowly evolving markers (e.g., protein-coding DNA or amino acid sequences), because changes in quickly evolving markers become saturated. Comparing closely related individuals requires rapidly evolving markers (e.g., microsatellites or non-coding DNA sequences). What level of genotypic resolution is desired? Dominant vs. codominant markers. Fine (e.g., nucleotide-level) vs. coarse (e.g., fragment-size) data. Genomic scale: detailed information about one or a few loci vs. less-detailed information about more loci.
Some Considerations in Choosing a Genotyping Method (continued). How much previous sequence knowledge is available? DNA sequencing, microsatellite amplification, PCR-RFLP, etc. require previous sequence information so that PCR primers can be designed. AFLPs and RAPDs allow genetic fingerprinting when previous sequence knowledge is not available. What are the cost and labor constraints? DNA sequencing is more costly than fragment analysis. Techniques requiring fluorescent labeling are generally more costly than techniques that don’t require labeling.
A Review of PCR Amplification. Requirements: DNA template; 2 oligonucleotide primers; nucleotides (dATP, dCTP, dGTP, dTTP); Taq polymerase. Steps: double-strand denaturation; annealing of the primers; elongation (5’ to 3’).
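The role of the two primers can be illustrated with a small in-silico sketch. It is not part of the original exercise: the template, the 7-base primers (real primers are typically around 20 bases), and the function names are all invented for illustration, and the search only handles exact matches on a single linear template.

```python
def revcomp(seq):
    """Reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def amplicon(template, fwd, rev):
    """Return the expected PCR product, or None if either primer has no site.

    template: top strand, written 5'->3'
    fwd:      forward primer, 5'->3' (matches the top strand)
    rev:      reverse primer, 5'->3' (matches the bottom strand)
    """
    start = template.find(fwd)
    if start == -1:
        return None
    # On the top strand, the reverse primer's site appears as its reverse complement.
    rev_site = revcomp(rev)
    end = template.find(rev_site, start + len(fwd))
    if end == -1:
        return None
    return template[start:end + len(rev_site)]

# Toy data, invented for illustration only.
template = "TTGACCATGGCATTACGGATCCGTAAGCTTGG"
product = amplicon(template, fwd="GACCATG", rev="AGCTTAC")
print(product, len(product))
```

The product spans from the 5’ end of the forward primer to the 5’ end of the reverse primer, which is why both priming sites must be present for a fragment to amplify.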
Restriction Enzymes. Found in bacteria. Cut DNA within the molecule (endonuclease). Cut at sequences that are specific for each enzyme (restriction sites). Leave either blunt or sticky ends, depending upon the specific enzyme. Sources: Tobin & Dusheck, Asking About Life, 2nd ed. Copyright 2001, Harcourt, Inc.; http://users.rcn.com/jkimball.ma.ultranet/BiologyPages/R/RestrictionEnzymes.html
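As a rough illustration of site-specific cutting, the sketch below digests a toy sequence with EcoRI (G^AATTC) and MseI (T^TAA), the two enzymes used later for AFLP. The toy sequence and the function names are assumptions made for this example; only the top strand is modeled, so sticky-end overhangs are not represented.

```python
# Recognition sequence and the offset within it where the top strand is cut.
# EcoRI cuts G^AATTC and MseI cuts T^TAA.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),
    "MseI": ("TTAA", 1),
}

def cut_positions(seq, enzyme):
    """Return the 0-based top-strand cut positions for one enzyme."""
    site, offset = ENZYMES[enzyme]
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions

def digest(seq, enzymes):
    """Cut a linear sequence with one or more enzymes and return the fragments."""
    cuts = sorted({p for e in enzymes for p in cut_positions(seq, e)})
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]

# Toy sequence invented for illustration only.
toy = "ACGTGAATTCGGCCTTAAGGATCCA"
fragments = digest(toy, ["EcoRI", "MseI"])
print(fragments)
```

Joining the fragments back together reproduces the input, which is a quick sanity check that no bases were lost at the cut points.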
Random Genomic Markers. Useful when the DNA sequence of suitable SNPs is not available. Relatively inexpensive. Scan the entire genome, producing information on several variations in the same reaction. Examples: RAPD (Random Amplification of Polymorphic DNA) and AFLP (Amplified Fragment Length Polymorphism).
AFLP: Amplified Fragment Length Polymorphisms (Vos et al., 1995). Genomic DNA is digested with 2 restriction enzymes: EcoRI (6 bp restriction site, GAATTC/CTTAAG), which cuts infrequently, and MseI (4 bp restriction site, TTAA/AATT), which cuts frequently. The technique is similar to both RFLP (Restriction Fragment Length Polymorphisms) and RAPD (Random Amplification of Polymorphic DNA): like RFLP it is based on restriction digestion, but like RAPD it uses random amplification of genomic DNA.
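Why a 6 bp site cuts less often than a 4 bp site can be shown with a back-of-the-envelope calculation. The sketch below assumes equal base frequencies and independence between positions, which real genomes only approximate (AT-rich genomes, for example, contain more TTAA sites than predicted). The genome size is a hypothetical value chosen for illustration.

```python
def expected_spacing(site_length, n_bases=4):
    """Average distance (bp) between occurrences of one specific site,
    assuming all bases are equally frequent and positions are independent."""
    return n_bases ** site_length

def expected_cuts(genome_bp, site_length):
    """Approximate number of cut sites in a genome of the given size."""
    return genome_bp / expected_spacing(site_length)

GENOME_BP = 40_000_000  # hypothetical genome size, not from the lecture

for enzyme, site in (("EcoRI", "GAATTC"), ("MseI", "TTAA")):
    spacing = expected_spacing(len(site))
    cuts = expected_cuts(GENOME_BP, len(site))
    print(f"{enzyme}: one site every ~{spacing} bp, ~{cuts:.0f} sites")
```

Under these assumptions the 4 bp cutter is expected to cut 16 times more often than the 6 bp cutter, since each extra base in the recognition site divides the chance of a match by four.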
Fragments of DNA resulting from restriction digestion are ligated with end-specific adaptors (a different one for each enzyme) to create a new PCR priming site. Preselective PCR amplification is done using primers complementary to the adaptor + 1 bp (chosen by the user; shown as N in the figure).
Selective amplification uses primers complementary to the adaptor (+1 bp) + 2 bp, for three selective bases per primer (shown as NNN in the figure).
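The effect of the selective bases can be sketched as follows. Each selective base is expected to retain about one quarter of the fragments, assuming equal base frequencies; that assumption, the toy insert sequence, the example extensions (A/ACG and C/CTG), and the function names are all illustrative and not taken from the lecture.

```python
def revcomp(seq):
    """Reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def fraction_amplified(selective_per_primer, n_primers=2):
    """Expected fraction of fragments retained, assuming equal base frequencies."""
    return 0.25 ** (selective_per_primer * n_primers)

def is_amplified(insert, eco_sel, mse_sel):
    """True if both primers' selective bases match the fragment.

    insert:  bases between the two restriction sites, top strand 5'->3',
             with the EcoRI end first
    eco_sel: selective bases on the EcoRI primer (read into the insert)
    mse_sel: selective bases on the MseI primer (read into the insert from
             the other end, on the opposite strand)
    """
    return insert.startswith(eco_sel) and revcomp(insert).startswith(mse_sel)

print(fraction_amplified(1))  # preselective step: +1 base on each primer
print(fraction_amplified(3))  # selective step: +3 bases on each primer

toy_insert = "ACGGTTTTTCAG"  # invented for illustration
print(is_amplified(toy_insert, "A", "C"))      # passes the +1 step
print(is_amplified(toy_insert, "ACG", "CTG"))  # passes the +3 step
print(is_amplified(toy_insert, "ACT", "CTG"))  # fails: third EcoRI-side base differs
```

Changing the selective bases samples a different subset of fragments from the same digest, which is how different primer combinations yield different fingerprints.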
AFLP OVERVIEW (VOS ET AL., 1995)
Sample AFLP Gel
AFLP Electropherogram (axes: peak height vs. fragment size in bp). Source: Wikimedia Commons
AFLP Fluorescent electrophoresis
AFLP Data Map from Urbanelli et al. (2007). Rows: individuals. Columns: alleles.
AFLP Genotyping. PCR amplification uses primers corresponding to the new sequence. If there are 2 new priming sites within 400 to 1600 bp of each other, there is amplification. The result is presence or absence of amplification, scored as 1 or 0. AFLPs are dominant markers: a band does not distinguish a heterozygote from a homozygote for the band-present allele. Polymorphisms are due mostly to SNPs but also to deletions/insertions.
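Presence/absence scoring can be sketched as below. The peak sizes, the locus sizes, and the 0.5 bp matching tolerance are invented for illustration. Because the marker is dominant, allele frequencies cannot be counted directly; the second function uses the common square-root estimate, which assumes Hardy-Weinberg equilibrium (an assumption added here, not stated in the lecture).

```python
from math import sqrt

def score_bands(peaks_by_individual, loci, tolerance=0.5):
    """Return {individual: [1 or 0 per locus]}.

    A locus is scored 1 if any detected peak lies within `tolerance` bp of it.
    """
    return {
        name: [int(any(abs(p - locus) <= tolerance for p in peaks)) for locus in loci]
        for name, peaks in peaks_by_individual.items()
    }

def null_allele_frequency(scores):
    """Estimate the frequency q of the band-absent allele at one locus.

    Assumes Hardy-Weinberg equilibrium: band-absent individuals are
    homozygous for the null allele, so their frequency is q**2.
    """
    return sqrt(scores.count(0) / len(scores))

# Hypothetical fragment sizes (bp) detected in four individuals.
peaks = {
    "ind1": [450.1, 612.0, 979.8],
    "ind2": [450.0, 980.1],
    "ind3": [612.2],
    "ind4": [449.9, 612.1, 980.0],
}
loci = [450, 612, 980]

matrix = score_bands(peaks, loci)
for name, row in matrix.items():
    print(name, row)

first_locus = [row[0] for row in matrix.values()]
print(null_allele_frequency(first_locus))
```

Each row of the resulting matrix is one individual's fingerprint, in the rows-by-loci layout that the clustering methods take as input.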
Analysis of AFLP Data. Similarity (cluster analysis): NJ (Neighbor Joining); UPGMA (Unweighted Pair Group Method with Arithmetic mean). AMOVA (Analysis of Molecular Variance). Model-based: maximum likelihood; Bayesian. Figure: example of a sequence distance matrix. Image source: http://media.wiley.com/CurrentProtocols/BI/bi0603/bi0603-fig-0002-1-full.gif
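A minimal sketch of the similarity-based approach: pairwise distances are computed from 0/1 band profiles and then clustered with UPGMA (average linkage). The Jaccard distance is used here as one common choice for dominant markers because it ignores shared absences; the lecture does not specify a distance measure, and the four profiles are invented. Branch lengths are omitted, so the result is only the tree topology as nested tuples.

```python
from itertools import combinations

def jaccard_distance(a, b):
    """1 - (bands shared) / (bands present in either profile)."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return 0.0 if either == 0 else 1.0 - shared / either

def upgma(profiles):
    """Cluster by average linkage and return the topology as nested tuples."""
    names = list(profiles)
    dist = {
        frozenset((a, b)): jaccard_distance(profiles[a], profiles[b])
        for a, b in combinations(names, 2)
    }
    # Each cluster is (subtree, list of member names).
    clusters = [(name, [name]) for name in names]

    def average(c1, c2):
        # UPGMA distance: mean of all pairwise distances between members.
        pairs = [(a, b) for a in c1[1] for b in c2[1]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the closest pair of clusters.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]

# Hypothetical band profiles (1 = band present, 0 = absent).
profiles = {
    "A": [1, 1, 0, 1, 0],
    "B": [1, 1, 0, 1, 1],
    "C": [0, 0, 1, 0, 1],
    "D": [0, 1, 1, 0, 1],
}
tree = upgma(profiles)
print(tree)
```

Recomputing cluster averages from the original pairwise distances keeps the sketch short; for larger data sets an established clustering library would be the more practical choice.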
AFLP Clustering Analysis. Panels: clustering dendrogram; fragment visualization. Source: Wikimedia Commons
AFLP Data Map with UPGMA dendrogram from Urbanelli et al. (2007): “Distinguishing taxa in the Pleurotus eryngii (King Oyster Mushroom) complex using AFLPs”. 90 populations sampled; 94 AFLP loci scored. Photos: (Top) The New York Times; (Bottom L) Wikimedia Commons; (Bottom R) http://steinpilz.up.seesaa.net
Example Structure Output “Estimated population structure for 10 runs of structure using 1056 individuals from 52 human populations. Each graph represents the output of one run of structure. In each graph, each individual is represented by a vertical line, which is partitioned into 5 colors that represent its estimated membership fractions in K=5 clusters.” (Source: http://rosenberglab.bioinformatics.med.umich.edu/clumppExample.html) Rosenberg et al. (2002). Science 298: 2381-2385.