Lecture 13: Population Structure

Presentation transcript:

Lecture 13: Population Structure October 8, 2012

Last Time: Effective population size calculations; historical importance of drift (shifting balance or noise?); population structure.

Today: Course feedback; the F-statistics; sample calculations of FST; defining populations on genetic criteria.

Midterm Course Evaluations (based on five responses; it's not too late to have an impact!): Lectures are generally OK. Labs are valuable, but better organization and more feedback are needed. Difficulty level is OK. Book is awful.

F-Coefficients: Quantify the structure of genetic variation in populations (population structure). Partition variation among the Total Population (T), Subpopulations (S), and Individuals (I).

F-Coefficients: Combine different sources of reduction in expected heterozygosity into one equation: $(1 - F_{IT}) = (1 - F_{ST})(1 - F_{IS})$, where $F_{IT}$ is the overall deviation from H-W expectations, $F_{ST}$ is the deviation due to subpopulation differentiation, and $F_{IS}$ is the deviation due to inbreeding within populations.

F-Coefficients and IBD: View F-statistics as probabilities of identity by descent (IBD) for different samples. $F_{IS}$: probability of IBD within an individual. $F_{IT}$: overall probability of IBD. $F_{ST}$: probability of IBD for 2 individuals in a subpopulation.

F-Statistics Can Measure Departures from Expected Heterozygosity Due to Wahlund Effect: $F_{IS} = (H_S - H_I)/H_S$, $F_{ST} = (H_T - H_S)/H_T$, $F_{IT} = (H_T - H_I)/H_T$, where $H_T$ is the average expected heterozygosity in the total population, $H_S$ is the average expected heterozygosity in subpopulations, and $H_I$ is the observed heterozygosity within a subpopulation.
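
The three heterozygosity-based ratios can be checked numerically. The sketch below assumes the standard definitions F_IS = (H_S - H_I)/H_S, F_ST = (H_T - H_S)/H_T, and F_IT = (H_T - H_I)/H_T. The function name `f_statistics` and the input values are hypothetical, chosen only for illustration. It also confirms that the results satisfy (1 - F_IT) = (1 - F_IS)(1 - F_ST).

```python
def f_statistics(h_i, h_s, h_t):
    """Return (F_IS, F_ST, F_IT) from heterozygosities.

    h_i: observed heterozygosity within subpopulations
    h_s: average expected heterozygosity in subpopulations
    h_t: expected heterozygosity in the total population
    """
    f_is = (h_s - h_i) / h_s
    f_st = (h_t - h_s) / h_t
    f_it = (h_t - h_i) / h_t
    return f_is, f_st, f_it


# Hypothetical heterozygosities, not taken from the lecture.
f_is, f_st, f_it = f_statistics(h_i=0.30, h_s=0.40, h_t=0.50)
print(f"F_IS={f_is:.3f}  F_ST={f_st:.3f}  F_IT={f_it:.3f}")

# The three coefficients are linked multiplicatively.
identity_gap = abs((1 - f_it) - (1 - f_is) * (1 - f_st))
print("identity holds:", identity_gap < 1e-12)
```

Because all three ratios are built from the same three heterozygosities, any two of them determine the third.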

Calculating FST: Recessive allele for flower color. B2B2 = white; B1B1 and B1B2 = dark pink. The recessive allele frequency is the square root of the white-flower frequency (assuming Hardy-Weinberg proportions within each subpopulation).
Subpopulation 1 (White: 10, Dark: 10): F(white) = 10/20 = 0.5; F(B2)1 = $q_1 = \sqrt{0.5} = 0.707$; $p_1 = 1 - 0.707 = 0.293$.
Subpopulation 2 (White: 2, Dark: 18): F(white) = 2/20 = 0.1; F(B2)2 = $q_2 = \sqrt{0.1} = 0.32$; $p_2 = 1 - 0.32 = 0.68$.

Calculating FST:
Calculate average HE of subpopulations (HS). For 2 subpopulations: $H_S = \sum 2 p_i q_i / 2 = (2(0.707)(0.293) + 2(0.32)(0.68))/2 = 0.425$.
Calculate HE for merged subpopulations (HT): F(white) = 12/40 = 0.3; $q = \sqrt{0.3} = 0.55$; $p = 0.45$; $H_T = 2pq = 2(0.55)(0.45) = 0.495$.

Bottom Line: $F_{ST} = (H_T - H_S)/H_T = (0.495 - 0.425)/0.495 = 0.14$. 14% of the total variation in flower color alleles is due to variation among populations, AND expected heterozygosity is increased 14% when subpopulations are merged (Wahlund Effect).
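
The flower-color calculation can be reproduced in a few lines. This sketch follows the same steps as the slides: take the square root of the white-flower frequency to get q, average 2pq over subpopulations for HS, and repeat on the pooled counts for HT. The helper names are hypothetical. Without intermediate rounding the result is roughly 0.146; the slide's 0.14 reflects values rounded at each step.

```python
from math import sqrt


def recessive_allele_freq(n_recessive, n_total):
    """Estimate q from the recessive phenotype count, assuming H-W proportions."""
    return sqrt(n_recessive / n_total)


def expected_het(q):
    """Expected heterozygosity 2pq for a biallelic locus."""
    return 2 * (1 - q) * q


# (white flowers, total plants) per subpopulation, from the worked example.
counts = [(10, 20), (2, 20)]

# H_S: mean expected heterozygosity across subpopulations.
q_sub = [recessive_allele_freq(white, n) for white, n in counts]
h_s = sum(expected_het(q) for q in q_sub) / len(q_sub)

# H_T: expected heterozygosity after merging the subpopulations.
white_total = sum(white for white, _ in counts)
n_total = sum(n for _, n in counts)
h_t = expected_het(recessive_allele_freq(white_total, n_total))

f_st = (h_t - h_s) / h_t
print(f"H_S={h_s:.3f}  H_T={h_t:.3f}  F_ST={f_st:.3f}")
```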

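Nei's Gene Diversity (GST): Nei's generalization of FST to multiple, multiallelic loci: $G_{ST} = (H_T - H_S)/H_T$, with $H_S = \frac{1}{m}\sum_{i=1}^{m}\left(1 - \sum_{j=1}^{n} p_{ij}^2\right)$ and $H_T = 1 - \sum_{j=1}^{n} \bar{p}_j^2$, where $H_S$ is the mean HE of m subpopulations, calculated for n alleles with frequencies $p_{ij}$ in subpopulation i, and $\bar{p}_j$ is the mean frequency of allele j over all subpopulations.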
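
A minimal single-locus sketch of this calculation, assuming equally weighted subpopulations and the usual gene diversity 1 - sum(p^2). The function names and allele frequencies are hypothetical. For several loci, one common convention is to average HS and HT across loci before taking the ratio.

```python
def gene_diversity(freqs):
    """Expected heterozygosity 1 - sum(p_j^2) for one set of allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def nei_gst(subpop_freqs):
    """G_ST for one locus.

    subpop_freqs: list of allele-frequency lists, one per subpopulation,
    all listing the same alleles in the same order.
    """
    m = len(subpop_freqs)
    n_alleles = len(subpop_freqs[0])
    # H_S: mean within-subpopulation gene diversity.
    h_s = sum(gene_diversity(f) for f in subpop_freqs) / m
    # H_T: gene diversity computed from mean allele frequencies.
    mean_freqs = [sum(f[j] for f in subpop_freqs) / m for j in range(n_alleles)]
    h_t = gene_diversity(mean_freqs)
    return (h_t - h_s) / h_t


# Hypothetical three-allele locus in two subpopulations.
example = [[0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
print(f"G_ST = {nei_gst(example):.4f}")
```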

Unbiased Estimate of FST: Weir and Cockerham's (1984) Theta. Compensates for sampling error, which can cause large biases in FST or GST (e.g., if samples represent different proportions of populations). Calculated in terms of correlation coefficients. Calculated by FSTAT software: http://www2.unil.ch/popgen/softwares/fstat.htm. Often simply referred to as FST in the literature.
Goudet, J. 1995. FSTAT (Version 1.2): A computer program to calculate F-statistics. Journal of Heredity 86(6):485-486.
Weir, B.S. and C.C. Cockerham. 1984. Estimating F-statistics for the analysis of population structure. Evolution 38:1358-1370.

Linanthus parryae population structure: Annual plant in the Mojave desert; a classic example of migration vs. drift. The allele for blue flower color is recessive. Use F-statistics to partition variation among regions, subpopulations, and individuals. FST can be calculated for any hierarchy. FRT: variation due to differentiation of regions. FSR: variation due to differentiation among subpopulations within regions. (Schemske and Bierzychudek 2007 Evolution)

Linanthus parryae population structure

Hartl and Clark 2007

FST as Variance Partitioning: Think of FST as the proportion of genetic variation partitioned among populations: $F_{ST} = V(q)/(\bar{p}\,\bar{q})$, where $V(q)$ is the variance of q across subpopulations. The denominator $\bar{p}\,\bar{q}$ is the maximum amount of variance that could occur among subpopulations.
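
For a biallelic locus with equally weighted subpopulations, the variance form V(q)/(p̄q̄) and the heterozygosity form (HT - HS)/HT give the same number, because HT - HS = 2V(q) when HT is computed from the mean allele frequency. The sketch below checks this. The function names and allele frequencies are hypothetical.

```python
def fst_from_variance(q_values):
    """F_ST = V(q) / (p_bar * q_bar), using the population variance of q."""
    m = len(q_values)
    q_bar = sum(q_values) / m
    var_q = sum((q - q_bar) ** 2 for q in q_values) / m
    return var_q / (q_bar * (1 - q_bar))


def fst_from_heterozygosity(q_values):
    """F_ST = (H_T - H_S) / H_T, with H_T taken from the mean allele frequency."""
    m = len(q_values)
    q_bar = sum(q_values) / m
    h_s = sum(2 * q * (1 - q) for q in q_values) / m
    h_t = 2 * q_bar * (1 - q_bar)
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies in two subpopulations.
qs = [0.2, 0.6]
print(f"variance form:       {fst_from_variance(qs):.4f}")
print(f"heterozygosity form: {fst_from_heterozygosity(qs):.4f}")

# Subpopulations fixed for alternate alleles reach the maximum variance.
print(f"fixed differences:   {fst_from_variance([0.0, 1.0]):.1f}")
```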

Analysis of Molecular Variance (AMOVA): Analogous to Analysis of Variance (ANOVA). Uses pairwise genetic distances as the 'response'. Tests significance using permutations. Partitions genetic diversity into different hierarchical levels, including regions, subpopulations, and individuals. Many types of marker data can be used. Method of choice for dominant markers, sequence data, and SNPs.

Phi Statistics from AMOVA:
ΦRT: correlation of random pairs of haplotypes drawn from a region relative to pairs drawn from the whole population.
ΦSR: correlation of random pairs of haplotypes drawn from an individual subpopulation relative to pairs drawn from a region.
ΦST: correlation of random pairs of haplotypes drawn from an individual subpopulation relative to pairs drawn from the whole population.
http://www.bioss.ac.uk/smart/unix/mamova/slides/frames.htm
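
A minimal sketch of how ΦST can be obtained from pairwise distances. It assumes the simplest design (subpopulations only, no regional level) and the usual sums-of-squared-deviations formulation of AMOVA. The function name `phi_st`, the distance matrix, and the labels are hypothetical. Significance testing by permuting haplotypes among subpopulations is not shown.

```python
def phi_st(sq_dist, pops):
    """Phi_ST from a symmetric matrix of squared pairwise distances.

    sq_dist: square matrix (list of lists), sq_dist[i][j] = squared distance
    pops: subpopulation label for each haplotype, in matrix order
    """
    n_total = len(pops)
    labels = sorted(set(pops))
    n_pops = len(labels)

    # Total sum of squared deviations: all unordered pairs, divided by N.
    ssd_total = sum(
        sq_dist[i][j] for i in range(n_total) for j in range(i + 1, n_total)
    ) / n_total

    # Within-subpopulation sum of squared deviations.
    ssd_within = 0.0
    sizes = []
    for label in labels:
        idx = [i for i, p in enumerate(pops) if p == label]
        sizes.append(len(idx))
        pair_sum = sum(sq_dist[i][j] for a, i in enumerate(idx) for j in idx[a + 1:])
        ssd_within += pair_sum / len(idx)

    # Mean squares and variance components, as in a one-way ANOVA.
    msd_among = (ssd_total - ssd_within) / (n_pops - 1)
    msd_within = ssd_within / (n_total - n_pops)
    n0 = (n_total - sum(s * s for s in sizes) / n_total) / (n_pops - 1)
    var_among = (msd_among - msd_within) / n0
    return var_among / (var_among + msd_within)


# Hypothetical data: 4 haplotypes, squared distance 1 within and 2 between groups.
d = [
    [0, 1, 2, 2],
    [1, 0, 2, 2],
    [2, 2, 0, 1],
    [2, 2, 1, 0],
]
print(f"Phi_ST = {phi_st(d, ['A', 'A', 'B', 'B']):.2f}")
```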

What if you don’t know how your samples are organized into populations (i.e., you don’t know how many source populations you have)? What if reference samples aren’t from a single population? What if they are offspring from parents coming from different source populations (admixture)?

What’s a population anyway?

Defining populations on genetic criteria: Assume subpopulations are at Hardy-Weinberg equilibrium and linkage equilibrium. Probabilistically 'assign' individuals to populations to minimize departures from equilibrium. Can allow for admixture (individuals with different proportions of each population) and geographic information. Bayesian approach using a Markov chain Monte Carlo (MCMC) method to explore parameter space. Implemented in the STRUCTURE program: http://pritch.bsd.uchicago.edu/structure.html (Londo and Schaal 2007 Mol Ecol 16:4523)

Example: Taita Thrush data*. Three main sampling locations in Kenya. Low migration rates (radio-tagging study). 155 individuals, genotyped at 7 microsatellite loci. Slide courtesy of Jonathan Pritchard.

Estimating K: Structure is run separately at different values of K. The program computes a statistic that measures the fit of each value of K (sort of a penalized likelihood); this can be used to help select K.
Taita thrush data:
Assumed value of K    Posterior probability of K
1                     ~0
2                     ~0
3                     0.993
4                     0.007
5                     0.00005
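
One way to turn per-K fit statistics into posterior probabilities like those tabulated for the thrush data is to treat each run's estimated ln P(X|K) as a log likelihood and normalize under a uniform prior on K. The sketch below does that. The function name and the log-probability values are hypothetical and are not the thrush results.

```python
from math import exp


def posterior_of_k(log_prob_by_k):
    """Posterior P(K|X) from estimated ln P(X|K), assuming a uniform prior on K."""
    # Subtract the maximum before exponentiating to avoid underflow.
    top = max(log_prob_by_k.values())
    weights = {k: exp(v - top) for k, v in log_prob_by_k.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


# Hypothetical ln P(X|K) estimates from runs at K = 1..5.
log_probs = {1: -1200.0, 2: -1150.0, 3: -1100.0, 4: -1102.0, 5: -1110.0}
posterior = posterior_of_k(log_probs)
for k in sorted(posterior):
    print(f"K={k}: {posterior[k]:.4f}")
print("best K:", max(posterior, key=posterior.get))
```

Because differences in log probability are exponentiated, even modest gaps between values of K produce posteriors concentrated almost entirely on one K, so this is better read as a rough guide than as a calibrated probability.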

Another method for inference of K: The ΔK method of Evanno et al. (2005, Mol. Ecol. 14: 2611-2620). (Eckert, Population Structure, 5-Aug-2008)
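
A minimal sketch of this statistic, assuming replicate runs at consecutive values of K. L(K) denotes ln P(X|K), and the statistic is computed as |mean L(K+1) - 2 mean L(K) + mean L(K-1)| divided by the standard deviation of L(K) across runs, which is one common way it is calculated in practice. The function name and run values are hypothetical.

```python
from statistics import mean, stdev


def delta_k(runs_by_k):
    """Delta K for each interior K.

    runs_by_k: dict mapping K to a list of L(K) = ln P(X|K) from replicate runs.
    Returns a dict for every K that has both K-1 and K+1 available.
    Raises ZeroDivisionError if replicate runs at some K are identical.
    """
    result = {}
    for k in sorted(runs_by_k):
        if k - 1 not in runs_by_k or k + 1 not in runs_by_k:
            continue  # second-order difference needs both neighbours
        second_diff = abs(
            mean(runs_by_k[k + 1]) - 2 * mean(runs_by_k[k]) + mean(runs_by_k[k - 1])
        )
        result[k] = second_diff / stdev(runs_by_k[k])
    return result


# Hypothetical L(K) values from three replicate runs per K.
runs = {
    1: [-1200.0, -1201.0, -1199.0],
    2: [-1150.0, -1152.0, -1148.0],
    3: [-1100.0, -1101.0, -1099.0],
    4: [-1098.0, -1105.0, -1102.0],
    5: [-1100.0, -1112.0, -1106.0],
}
dk = delta_k(runs)
for k in sorted(dk):
    print(f"K={k}: delta K = {dk[k]:.2f}")
print("best K:", max(dk, key=dk.get))
```

The statistic is undefined at the smallest and largest K tested, so it cannot by itself support K = 1.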

Inferred population structure: [Figure: cluster membership plot with groups labeled Africans, Europeans, MidEast, Cent/S Asia, Asia, Oceania, America.] Each individual is a thin vertical line that is partitioned into K colored segments according to its membership coefficients in K clusters. Rosenberg et al. 2002 Science 298: 2381-2385

Inferred population structure – regions Rosenberg et al. 2002 Science 298: 2381-2385