Genomics of Adaptation

Questions
- What are the genetic changes that underlie adaptation?
- What are the population genetic or genomic signatures of adaptation?
- How do non-adaptive processes affect tests of selection?

Goals
- Understand some top-down and bottom-up approaches used to identify genes responsible for adaptation
- Explain the patterns of sequence variation expected under directional and balancing selection
- Understand the principles of population genetic tests of selection

The genetic basis of adaptation Ross-Ibarra J et al. PNAS 2007;104:8641-8648

Quantitative trait loci (QTL)
- Genomic regions associated with trait variation
- Loci detected may differ across individuals and environments
- Statistical issues (sample size, genes of small effect, epistasis)
- Detected regions can span large parts of a chromosome (further mapping within the region is needed)
- Cannot be performed in all species

Quantitative trait loci (QTL)
- Precision is limited by the density of markers and the number of recombination events
- The number of recombination events is limited by the number of individuals and by how finely the parental genomes have been recombined (i.e. F2, F3, etc.)
- [Figure: F1, F2, F3, F4] Parental genomes are more finely recombined with each generation of consecutive intercrosses.

Association mapping
- Tests for associations between markers (SNPs) and phenotypes in natural populations
- Example: different populations of Arabidopsis have different leaf shapes; scan the whole genome for SNPs associated with leaf shape

Association mapping
Pros:
- Much higher resolution
- No need for crosses
Cons:
- Population structure may lead to spurious associations
- Requires many more markers than QTL mapping
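As a rough illustration of a single-marker association test, the sketch below computes a chi-square statistic for a 2x2 table of allele by phenotype class. The function name and the counts are hypothetical, and the test deliberately ignores population structure, which is exactly the source of spurious associations listed above.

```python
def chi_square_2x2(table):
    """Chi-square statistic (1 d.f., no continuity correction) for a 2x2 table.

    table = [[a, b], [c, d]], e.g. rows = SNP allele, columns = phenotype class.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a row or column is empty, so no association can be tested
    return n * (a * d - b * c) ** 2 / denom


# Hypothetical counts: allele 1 is mostly found in lines with phenotype X.
counts = [[40, 10],   # allele 1: phenotype X, phenotype Y
          [10, 40]]   # allele 2: phenotype X, phenotype Y
stat = chi_square_2x2(counts)
print(f"chi-square = {stat:.1f}")
```

In a genome-wide scan this statistic would be computed for every SNP, with a correction for multiple testing and for relatedness among lines.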

Example association map
- Response to the avirulence gene AvrRpm1, a very simple trait
- Used 95 inbred lines and 250,000 SNPs

Bulk Segregant Analysis
- Cross two plants with divergent phenotypes, then self or intercross to make an F2 population
- Select F2 individuals with extreme phenotypes for the trait
- Genotype both pools for many markers
- Look for genes where different alleles are enriched in each pool
- [Figure: Parents crossed; F1 selfed or intercrossed; F2]
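The last step of bulk segregant analysis can be sketched as a comparison of allele frequencies between the two pools. This is a minimal sketch with hypothetical marker names and read counts; real analyses would also account for sequencing depth and sampling noise.

```python
def pool_allele_freq_diff(high_pool, low_pool):
    """Per-marker difference in reference-allele frequency between two pools.

    Each pool maps a marker name to (ref_allele_count, alt_allele_count).
    """
    diffs = {}
    for marker, (high_ref, high_alt) in high_pool.items():
        low_ref, low_alt = low_pool[marker]
        high_freq = high_ref / (high_ref + high_alt)
        low_freq = low_ref / (low_ref + low_alt)
        diffs[marker] = high_freq - low_freq
    return diffs


# Hypothetical allele counts in the two extreme-phenotype F2 pools.
high = {"m1": (48, 52), "m2": (90, 10), "m3": (55, 45)}
low = {"m1": (50, 50), "m2": (12, 88), "m3": (47, 53)}

diffs = pool_allele_freq_diff(high, low)
top = max(diffs, key=lambda m: abs(diffs[m]))
print(top, round(diffs[top], 2))
```

Markers unlinked to the trait should sit near a difference of zero, while markers close to a causal gene show opposite alleles enriched in each pool.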

Example of Top-Down Approach in Sunflowers
[Figure: Wild, Landrace, and Elite sunflowers; stages labeled Domestication and Improvement]

Top-Down Strategy for Identifying Genes Controlling Flowering Time Differences between Wild, Domesticated, and Improved Sunflowers
- Isolate sunflower homologs of known flowering time genes
- Test for co-localization with previously mapped QTLs
- Test for molecular signatures of positive selection
- Investigate function
Ben Blackman Genetics 2011; 187:271-287

Isolate and map sunflower homologs of known flowering time genes

Four flowering time gene homologs – all members of the FT gene family – experienced selective sweeps during a stage of sunflower domestication

Genetic and Functional Analyses Identify Causative Mutations in HaFT paralogs

Heterozygotes with Frameshift Allele exhibit Heterosis

Advantages and Disadvantages of Top-Down Strategy
Advantages:
- Proven strategy for identifying major genes
*Disadvantages:
- Time-consuming and expensive
- Requires segregating populations
- May miss genes with small phenotypic effects
*Disadvantages can be mitigated by association mapping, in which genome-wide scans are employed to search for correlations between genetic markers and phenotypic traits in highly recombinant populations

The genetic basis of adaptation Ross-Ibarra J et al. PNAS 2007;104:8641-8648

Which locus is likely involved in the change in floral phenotype?
[Figure: loci 1 to 4, with a selective sweep]

Which locus is likely involved in the divergence in floral phenotype?

Detecting natural selection
- The neutral theory proposes that most molecular changes are neutral and are fixed by random genetic drift
- This is used as a null hypothesis; deviations from neutral expectations are taken as evidence of selection
- It is important to consider how non-selective processes such as population structure and linkage affect the test statistics

The effect of selection on the genome
Directional selection
- Best allele(s) sweep to fixation
- Loss of variation
- Change in frequency distribution of polymorphisms
- Increase in linkage disequilibrium around the site
Balancing selection
- Maintains variation that would otherwise be lost to drift
- Heterozygote advantage, frequency-dependent selection, fluctuating selection, (divergent selection)

Directional selection
- A beneficial allele arises
- Variants carrying this allele rapidly spread through the species
- Genetic diversity is reduced around this adaptive locus
- [Figure: ancestral population; population after selection]

Chance of detecting natural selection
Depends on:
- Time
- Strength of selection
- Recombination, mutation
- Initial frequency
[Figure: selective sweep]

Methods for detecting selection
A. McDonald-Kreitman Type Tests
B. Site Frequency Spectrum Approaches
C. Linkage Disequilibrium (LD) and Haplotype Structure
D. Population Differentiation: Lewontin-Krakauer Methods
These tests can be applied to single genes or across the whole genome.

A. McDonald-Kreitman type tests
- Synonymous substitutions ("silent substitutions"): mutations that do not cause an amino acid change (usually at the 3rd codon position)
- Nonsynonymous substitutions ("replacement substitutions"): mutations that cause an amino acid change (mostly at the 1st or 2nd codon position)

A. McDonald-Kreitman type tests
- GCU (Alanine) to GCC (Alanine): synonymous
- GCU (Alanine) to GUU (Valine): nonsynonymous
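The codon example above can be checked with a small sketch that translates two codons using the standard genetic code and compares the amino acids. The helper names are hypothetical; the codon table is built from the conventional TCAG ordering.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(codon):
    """Return the one-letter amino acid for an RNA or DNA codon."""
    return CODON_TABLE[codon.upper().replace("U", "T")]


def classify_substitution(codon_before, codon_after):
    """Label a codon change as synonymous or nonsynonymous."""
    if translate(codon_before) == translate(codon_after):
        return "synonymous"
    return "nonsynonymous"


print(classify_substitution("GCU", "GCC"))  # Ala -> Ala
print(classify_substitution("GCU", "GUU"))  # Ala -> Val
```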

A. McDonald-Kreitman type tests: Ka/Ks Test
- Nonsynonymous substitutions: Ka
- Synonymous substitutions: Ks
- Uses coding sequence (sequence that codes for proteins)
- Controls for the maximum possible rate of each type of substitution
- Ks does not change the protein, so it is treated as "neutral" and used as the baseline rate
- Important to remember that both types of mutation arise at the same rate; it is the fixation rate that varies

A. McDonald-Kreitman type tests: Ka/Ks Test
- Ka/Ks = 1: Neutral drift. Protein changes are not being selected for or against.
- Ka/Ks > 1: Positive selection. Protein changes are being selected for.
- Ka/Ks < 1: Purifying selection. Protein changes are being selected against.
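The interpretation above can be sketched in a few lines. This assumes the substitution and site counts have already been obtained from an alignment, and it uses raw proportions with no correction for multiple substitutions at the same site, which real estimators apply. The function names, the tolerance parameter, and the counts are hypothetical.

```python
def ka_ks_ratio(nonsyn_subs, nonsyn_sites, syn_subs, syn_sites):
    """Ka/Ks from raw counts: substitutions per site of each class."""
    ka = nonsyn_subs / nonsyn_sites
    ks = syn_subs / syn_sites
    if ks == 0:
        raise ValueError("Ks is zero; the ratio is undefined")
    return ka / ks


def interpret(ratio, tolerance=0.05):
    """Map a Ka/Ks ratio to the three cases; tolerance is an arbitrary choice."""
    if abs(ratio - 1) <= tolerance:
        return "neutral"
    return "positive" if ratio > 1 else "purifying"


# Hypothetical gene: 6 changes at 300 nonsynonymous sites, 10 at 100 synonymous sites.
ratio = ka_ks_ratio(6, 300, 10, 100)
label = interpret(ratio)
print(f"Ka/Ks = {ratio:.2f} ({label} selection)")
```

Dividing by the number of sites of each class is what controls for the fact that most random changes in a coding sequence are nonsynonymous.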

A. McDonald-Kreitman type tests: Ka/Ks Test
- Can be done with a single sequence per species/group (population genetic data are not needed)
- Can pinpoint where selection occurred on a phylogeny
- Proteins very rarely have Ka/Ks > 1 across their entire sequence; often only small pieces or single codons are under selection
- Proteins with Ka/Ks > 1 are often under diversifying selection, e.g. immune or self-incompatibility genes

B. Site Frequency Spectrum
- Selection affects the distribution of allele frequencies within populations
- The method examines the site frequency spectrum and compares it to neutral expectations
- Can be applied to a single locus; now often used in genomic scans for selective sweeps

B. Site Frequency Spectrum
- Tajima’s D
- Fu’s Fs
- Fay and Wu’s H

B. Site Frequency Spectrum
- Many derived alleles are at high frequency because they were swept there along with the positively selected site
- [Figure: proportion of polymorphic sites against population derived allele frequency, low to high]

B. Site Frequency Spectrum
- Many derived alleles at low frequency are new mutations in the swept region
- [Figure: proportion of polymorphic sites against population derived allele frequency, low to high]

B. Site Frequency Spectrum
- Few medium-frequency derived alleles
- Pre-sweep alleles were either swept to high frequency or removed
- Post-sweep alleles are too young to have reached medium frequency
- [Figure: proportion of polymorphic sites against population derived allele frequency, low to high]
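The U-shaped spectrum described in these slides can be made concrete by tabulating an unfolded site frequency spectrum from a sample of haplotypes. This is a minimal sketch that assumes the ancestral state of every site is known; the haplotypes are made up for illustration.

```python
def unfolded_sfs(haplotypes):
    """Count sites by number of derived copies.

    haplotypes: equal-length strings of '0' (ancestral) and '1' (derived).
    Returns a list where index k-1 is the number of sites with k derived copies.
    """
    n = len(haplotypes)
    sfs = [0] * (n - 1)
    for column in zip(*haplotypes):
        k = column.count("1")
        if 0 < k < n:  # skip sites that are not polymorphic in the sample
            sfs[k - 1] += 1
    return sfs


# Hypothetical sample of six haplotypes from a recently swept region.
sample = ["110001", "110000", "110010", "111000", "010000", "110100"]
sfs = unfolded_sfs(sample)
print(sfs)
```

In this toy sample the polymorphic sites fall only in the lowest and highest frequency classes, with nothing at intermediate frequency.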

Maize cupulate fruitcase genetics
[Figure: wild-type teosinte with hard fruitcase; teosinte with maize tga1 gene]

Maize cupulate fruitcase genetics
- Maize has less diversity than teosinte

Maize cupulate fruitcase genetics
- Tajima’s D summarizes the site frequency spectrum
- Negative values suggest an excess of rare polymorphisms, as expected after positive selection
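To show where the sign of Tajima's D comes from, here is a minimal sketch of the statistic computed from aligned sequences, following the standard formula from Tajima (1989). It assumes an alignment with no gaps or missing data, and the toy sequences are invented.

```python
from collections import Counter
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for a list of aligned, equal-length sequences (n >= 4)."""
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least 4 sequences")
    columns = [Counter(col) for col in zip(*seqs)]
    segregating = [c for c in columns if len(c) > 1]
    s = len(segregating)
    if s == 0:
        return 0.0  # no variation, so D is undefined; 0.0 is a convention here
    n_pairs = n * (n - 1) / 2
    # Mean number of pairwise differences (pi), summed over segregating sites.
    pi = sum((n * n - sum(v * v for v in c.values())) / 2
             for c in segregating) / n_pairs
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Every variant is a singleton: excess of rare alleles, so D is negative.
rare = ["AAAA", "TAAA", "ATAA", "AATA", "AAAT"]
# Variants at intermediate frequency: D is positive.
common = ["AA", "AA", "TT", "TT", "AA"]
d_rare = tajimas_d(rare)
d_common = tajimas_d(common)
print(round(d_rare, 2), round(d_common, 2))
```

D compares two estimates of diversity: pairwise differences, which are dominated by intermediate-frequency variants, and the number of segregating sites, which counts rare variants equally.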

Maize cupulate fruitcase genetics
- The HKA test asks whether there is more divergence between species than would be expected from the amount of polymorphism within species

C. Linkage Disequilibrium (LD)
- The nonrandom association of alleles at different loci
- Levels of linkage disequilibrium increase during selective sweeps
- As a new mutation rises in frequency, it drags along linked sites
- The resulting haplotype block has high LD until recombination breaks it up over time
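The definition above can be illustrated with the two most common pairwise LD measures, D and r squared, computed from phased two-locus haplotypes. This is a minimal sketch; the allele labels and haplotype samples are hypothetical.

```python
def ld_stats(haplotypes, allele_a="A", allele_b="B"):
    """Return (D, r^2) for two biallelic loci.

    haplotypes: two-character strings, first character = allele at locus 1,
    second character = allele at locus 2.
    """
    n = len(haplotypes)
    p_a = sum(h[0] == allele_a for h in haplotypes) / n
    p_b = sum(h[1] == allele_b for h in haplotypes) / n
    p_ab = sum(h[0] == allele_a and h[1] == allele_b for h in haplotypes) / n
    d = p_ab - p_a * p_b  # observed minus expected haplotype frequency
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0  # undefined if a locus is fixed
    return d, r2


# Alleles always travel together, as on a recently swept haplotype.
swept = ["AB"] * 8 + ["ab"] * 2
# All four haplotypes equally common: no association between loci.
mixed = ["AB", "Ab", "aB", "ab"]
d_swept, r2_swept = ld_stats(swept)
d_mixed, r2_mixed = ld_stats(mixed)
print(round(d_swept, 2), round(r2_swept, 2), d_mixed, r2_mixed)
```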

D. Population Differentiation: Lewontin-Krakauer Methods
- Selection will often increase the degree of genetic distance between populations
- Compute pairwise genetic distances (e.g., FST) for many loci between populations
- When a locus shows an extraordinary level of genetic distance relative to other loci, this “outlier” locus is a candidate for positive selection
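An outlier scan of this kind can be sketched as follows, assuming two populations of equal sample size, biallelic loci, and a simple heterozygosity-based FST. The locus names, allele frequencies, and the choice to flag a fixed top fraction of loci are all hypothetical; published scans use more careful estimators and null models.

```python
def fst_two_pops(p1, p2):
    """FST = (Ht - Hs) / Ht for one biallelic locus in two populations."""
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_t == 0:
        return 0.0  # locus is fixed for the same allele in both populations
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    return (h_t - h_s) / h_t


def outlier_loci(freqs, top_fraction=0.2):
    """Rank loci by FST and return the top fraction plus all scores."""
    scores = {locus: fst_two_pops(p1, p2) for locus, (p1, p2) in freqs.items()}
    n_keep = max(1, int(len(scores) * top_fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n_keep], scores


# Hypothetical allele frequencies (population 1, population 2) at five loci.
freqs = {
    "locus1": (0.50, 0.55),
    "locus2": (0.30, 0.35),
    "locus3": (0.10, 0.90),
    "locus4": (0.60, 0.50),
    "locus5": (0.20, 0.25),
}
outliers, scores = outlier_loci(freqs)
print(outliers, round(scores["locus3"], 2))
```

Ranking against the empirical distribution of all loci is what lets the genome-wide background stand in for the neutral expectation.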

Example of FST scan between sunflower species
[Figure: genome scan, with map position in cM] Renaut et al. (2013; Nature Comm.)

Example of Top-Down Approach in Sunflowers
[Figure: Wild, Landrace, and Elite sunflowers; stages labeled Domestication and Improvement]

Methods & Analyses
- Transcriptome sequencing of wild, landrace, and elite lines using Illumina and 454 sequencing
- Assayed circa 200,000 SNPs per comparison
- Scanned genome to identify outlier SNPs
Greg Baute

Outlier Scan
- [Figure: Domestication Candidates; Improvement Candidate Genes]
- The most strongly selected genes are involved in fatty acid biosynthesis (oil production!)
- But many outlier genes cannot be linked with a phenotype and may represent false positives

Wild introgressions Sclerotinia resistance locus

False Positive Rate can be Reduced by Conducting Multiple Independent Comparisons

Advantages and Disadvantages of Bottom-Up Strategy
Advantages:
- Efficient and relatively inexpensive
- Does not require a segregating population
*Disadvantages:
- False positives are frequent due to complex demographic history
- High LD (especially in selfing species) may obscure the target of selection
- Linking swept genes to phenotypic variation can be challenging
*Disadvantages can be mitigated by conducting multiple independent comparisons and by integrating the genome scan with association mapping data