Lab 7. Estimating Population Structure



Goals
1. Estimate and interpret statistics (AMOVA + Bayesian) that characterize population structure.
2. Demonstrate the roles of gene flow and genetic drift in shaping population structure.

Gene flow and genetic drift
Gene flow keeps allele frequencies similar among subpopulations.
Genetic drift causes random differences in allele frequencies among small subpopulations.
Wright's island model: assumes that gene flow occurs with equal probability (migration rate $m$) from the continent (a large source population) to each island (smaller subpopulations).
[Figure: island model diagram. A continent with allele frequency $q_m$ sends migrants at rate $m$ to each island; each island has allele frequency $q_0$.]

Gene flow and genetic drift
Assuming Wright's island model and an equilibrium between gene flow (which introduces variation into subpopulations) and genetic drift (which removes variation from finite subpopulations), differentiation among subpopulations ($F_{ST}$) can be calculated as:

$$F_{ST} = \frac{(1-m)^2}{2N_e - (2N_e - 1)(1-m)^2}$$

where $N_e$ is the effective size of each subpopulation and $m$ is the migration rate.
If $m = 0$, $F_{ST} = 1$; i.e., genetic differentiation among subpopulations is at its maximum.
If $m = 1$, $F_{ST} = 0$; i.e., there is no genetic differentiation among subpopulations.
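The relationship between migration and differentiation can be checked numerically. The sketch below assumes the standard textbook equilibrium solution for the island model, $F_{ST} = (1-m)^2 / [2N_e - (2N_e - 1)(1-m)^2]$, and its common small-$m$ approximation $F_{ST} \approx 1/(4N_e m + 1)$; the function names and the example values of $N_e$ and $m$ are illustrative, not taken from the lab.

```python
def fst_island_exact(n_e: float, m: float) -> float:
    """Equilibrium F_ST under Wright's island model (standard textbook form).

    n_e: effective size of each subpopulation (assumed equal across islands)
    m:   migration rate, the fraction of each island replaced by migrants
    """
    if not 0.0 <= m <= 1.0:
        raise ValueError("m must lie between 0 and 1")
    if n_e < 1:
        raise ValueError("n_e must be at least 1")
    s = (1.0 - m) ** 2
    return s / (2.0 * n_e - (2.0 * n_e - 1.0) * s)


def fst_island_approx(n_e: float, m: float) -> float:
    """Small-m approximation: F_ST ~ 1 / (4 * N_e * m + 1)."""
    return 1.0 / (4.0 * n_e * m + 1.0)


if __name__ == "__main__":
    # Hypothetical subpopulation size; sweep over migration rates.
    for m in (0.0, 0.001, 0.01, 0.1, 1.0):
        exact = fst_island_exact(100, m)
        approx = fst_island_approx(100, m)
        print(f"m={m:<6} exact={exact:.4f} approx={approx:.4f}")
```

The exact form reproduces both limiting cases on the slide ($m = 0$ gives 1, $m = 1$ gives 0), whereas the approximation is only reliable when $m$ is small.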

F coefficients at different levels of structure

| F | Formula | Meaning |
|---|---------|---------|
| $F_{IT}$ | $(H_T - H_I)/H_T$ | Measure of deviation from HWE in the total population (TP). 0: no deviation from HWE in TP. Positive: deviation due to a deficiency of heterozygotes in TP. Negative: deviation due to an excess of heterozygotes in TP. |
| $F_{ST}$ | $(H_T - H_S)/H_T$ | Measure of genetic differentiation among subpopulations. Ranges from 0 to 1. 0: no genetic differentiation among subpopulations. 1: strong genetic differentiation among subpopulations. |
| $F_{IS}$ | $(H_S - H_I)/H_S$ | Measure of deviation from HWE within subpopulations (SP). 0: no deviation from HWE within SP. Positive: deviation due to a deficiency of heterozygotes within SP. Negative: deviation due to an excess of heterozygotes within SP. |

Here $H_I$ is the mean observed heterozygosity within subpopulations, $H_S$ is the mean expected heterozygosity within subpopulations, and $H_T$ is the expected heterozygosity of the total population.
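As a worked illustration of these coefficients, the sketch below computes them from three heterozygosities using the standard definitions ($H_I$: mean observed heterozygosity within subpopulations; $H_S$: mean expected heterozygosity within subpopulations; $H_T$: expected heterozygosity in the total population). The function name and the input values are hypothetical.

```python
def f_statistics(h_i: float, h_s: float, h_t: float) -> tuple[float, float, float]:
    """Return (F_IS, F_ST, F_IT) from observed and expected heterozygosities."""
    if h_s <= 0 or h_t <= 0:
        raise ValueError("expected heterozygosities must be positive")
    f_is = (h_s - h_i) / h_s  # deviation from HWE within subpopulations
    f_st = (h_t - h_s) / h_t  # differentiation among subpopulations
    f_it = (h_t - h_i) / h_t  # deviation from HWE in the total population
    return f_is, f_st, f_it


if __name__ == "__main__":
    # Hypothetical heterozygosities, chosen only to show the arithmetic.
    f_is, f_st, f_it = f_statistics(h_i=0.30, h_s=0.40, h_t=0.50)
    print(f"F_IS={f_is:.3f}  F_ST={f_st:.3f}  F_IT={f_it:.3f}")
    # The three coefficients are linked: (1 - F_IT) = (1 - F_IS) * (1 - F_ST).
    print(abs((1 - f_it) - (1 - f_is) * (1 - f_st)) < 1e-12)
```

A positive $F_{IS}$ here reflects fewer observed heterozygotes than expected within subpopulations, matching the "deficiency of heterozygotes" reading in the table.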

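F coefficients at different levels of structure

| Parameter | Formula | Meaning |
|-----------|---------|---------|
| $F_{SR}$ | $(H_R - H_S)/H_R$ | Measure of genetic differentiation among subpopulations within a region. 0: no genetic differentiation among subpopulations within a region. 1: strong genetic differentiation among subpopulations within a region. |
| $F_{RT}$ | $(H_T - H_R)/H_T$ | Measure of genetic differentiation among regions for the total population (TP). 0: no genetic differentiation among regions in TP. 1: strong genetic differentiation among regions in TP. |

Here $H_S$ is the mean expected heterozygosity within subpopulations, $H_R$ is the mean expected heterozygosity within regions, and $H_T$ is the expected heterozygosity of the total population.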
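The hierarchical coefficients can be sketched the same way. This example assumes the standard definitions $F_{SR} = (H_R - H_S)/H_R$ and $F_{RT} = (H_T - H_R)/H_T$, where $H_R$ is the mean expected heterozygosity within regions; the function name and the numbers are hypothetical.

```python
def hierarchical_f(h_s: float, h_r: float, h_t: float) -> tuple[float, float, float]:
    """Return (F_SR, F_RT, F_ST) for subpopulations nested within regions."""
    if h_r <= 0 or h_t <= 0:
        raise ValueError("expected heterozygosities must be positive")
    f_sr = (h_r - h_s) / h_r  # among subpopulations within a region
    f_rt = (h_t - h_r) / h_t  # among regions in the total population
    f_st = (h_t - h_s) / h_t  # among subpopulations in the total population
    return f_sr, f_rt, f_st


if __name__ == "__main__":
    # Hypothetical heterozygosities at the three levels.
    f_sr, f_rt, f_st = hierarchical_f(h_s=0.40, h_r=0.45, h_t=0.50)
    print(f"F_SR={f_sr:.3f}  F_RT={f_rt:.3f}  F_ST={f_st:.3f}")
    # The levels combine multiplicatively: (1 - F_ST) = (1 - F_SR) * (1 - F_RT).
    print(abs((1 - f_st) - (1 - f_sr) * (1 - f_rt)) < 1e-12)
```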

Estimation of F coefficients using AMOVA

| Parameter | AMOVA (Arlequin) |
|-----------|------------------|
| $F_{ST}$ | $\phi_{ST}$ or FST |
| $F_{SR}$ | $\phi_{SC}$ or FSC |
| $F_{RT}$ | $\phi_{CT}$ or FCT |

Population structure in a worldwide sample of human populations
[Figure: populations grouped into five regions: America, Africa, Eurasia, East Asia, and Oceania.]
In Arlequin terminology, population = subpopulation and group = region.

AMOVA result interpretations:

| Source of variation | Percentage of variation |
|---------------------|-------------------------|
| Among groups (regions) | 10 |
| Among (sub)populations within a region | 4 |
| Within (sub)populations | 86 |

Fixation indices: FST: 0.14, FSC: 0.04, FCT: 0.10

14% of total genetic variation is due to differentiation among subpopulations (FST).
86% of total genetic variation is found within subpopulations.
4% of the genetic variation within regions is due to differentiation among subpopulations (FSC).
10% of total genetic variation is due to differentiation among regions (FCT).
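The link between the percentages of variation and the fixation indices can be verified with a few lines of arithmetic. The sketch below uses the usual three-level AMOVA definitions in terms of variance components; the function and argument names are illustrative, and the percentages from the table are used directly as relative variance components.

```python
def phi_statistics(among_groups: float,
                   among_pops_within_groups: float,
                   within_pops: float) -> dict[str, float]:
    """Compute AMOVA fixation indices from the three variance components."""
    total = among_groups + among_pops_within_groups + within_pops
    if total <= 0:
        raise ValueError("total variance must be positive")
    within_groups = among_pops_within_groups + within_pops
    return {
        # among groups (regions), relative to the total
        "FCT": among_groups / total,
        # among populations within groups, relative to within-group variance
        "FSC": among_pops_within_groups / within_groups,
        # among populations, relative to the total
        "FST": (among_groups + among_pops_within_groups) / total,
    }


if __name__ == "__main__":
    # Percentages of variation from the AMOVA table above.
    result = phi_statistics(10, 4, 86)
    for name, value in result.items():
        print(f"{name}: {value:.2f}")
```

Note that FSC is 4/(4 + 86), about 0.044, which rounds to the 0.04 reported; it is expressed relative to the variation within regions, not the total.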

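Human structure data
[Figure: excerpt of the human structure data file with columns ID and Population (for example, individual 46, Colombian), showing the Colombian, Karitiana, Maya, and Pima populations. Annotations mark the fields required for conversion: number of individuals, number of populations, number of individuals in each population, number of regions, and number of individuals in each region.]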

Problem 1. File human_struc.xls contains data for 10 microsatellite loci used to genotype 41 human populations from a worldwide sample.
a.) Convert the file into Arlequin format and perform AMOVA based on this grouping of populations within regions, using distance measures based on the IAM and the SMM. How do you interpret these results? Report values of the phi-statistics and their statistical significance for each AMOVA you run.
b.) Do you think that any of these regions can justifiably be divided into subregions? Pick a region, form a hypothesis for what would be a reasonable grouping of populations into subregions, then run AMOVA only for the region you selected, using distance measures based on both the IAM and the SMM. Was your hypothesis supported by the data?
c.) How do the phi-statistics calculated from distance measures based on the SMM compare to those based on the IAM?
d.) GRADUATE STUDENTS: Which of the 5 initially defined regions has the highest diversity in terms of effective number of alleles? What is your biological explanation for this?
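For part d, the effective number of alleles at a locus is commonly defined as $A_e = 1/\sum_i p_i^2$, where $p_i$ is the frequency of allele $i$. The sketch below applies that definition to a list of allele calls; the function name and the microsatellite allele sizes are hypothetical and are not taken from human_struc.xls.

```python
from collections import Counter


def effective_number_of_alleles(alleles: list) -> float:
    """Return A_e = 1 / sum(p_i^2) for the allele copies sampled at one locus."""
    if not alleles:
        raise ValueError("at least one allele copy is required")
    counts = Counter(alleles)
    n = sum(counts.values())
    # Sum of squared allele frequencies (expected homozygosity).
    homozygosity = sum((c / n) ** 2 for c in counts.values())
    return 1.0 / homozygosity


if __name__ == "__main__":
    # Hypothetical allele sizes (in base pairs) pooled across individuals.
    even = [120, 120, 124, 124, 128, 128]
    skewed = [120, 120, 120, 120, 124, 128]
    print(effective_number_of_alleles(even))
    print(effective_number_of_alleles(skewed))
```

With equal frequencies, $A_e$ equals the observed number of alleles; uneven frequencies pull it below that count, which is why it is a useful diversity summary when comparing regions with different sample sizes.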

How to choose K?

Picking the best K
[Table: log-likelihood for each value of K.]

Problem 2. Use Structure to further test the hypotheses you developed in Problem 1.
a.) Calculate the posterior probabilities to test whether:
   i. All populations form a single genetically homogeneous group.
   ii. There are two genetically distinct groups within your selected region.
   iii. There are three genetically distinct groups within your selected region.
b.) Use the ΔK method to determine the most likely number of groups. How does this compare to the method based on posterior probabilities?
c.) How do the groupings of subpopulations compare to your expectations from Problem 1?
d.) Is there evidence of admixture among the groups? If so, include a table or figure showing the proportion of each subpopulation assigned to each group.
e.) GRADUATE STUDENTS: Provide a brief, literature-based explanation for the groupings you observe.
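Both ways of choosing K in parts a and b can be sketched from the estimated log-likelihoods that Structure reports for each run. The code below makes two assumptions that should be checked against the lab's instructions: posterior probabilities are computed with a uniform prior over the K values tested, as Pr(K) = exp(L_K) / Σ exp(L_j), and ΔK follows the Evanno-style ratio |L(K+1) − 2L(K) + L(K−1)| / sd(L(K)) using the mean of L(K) across replicate runs. The function names and the log-likelihood values are hypothetical.

```python
import math
import statistics


def posterior_probabilities(mean_lnp: dict[int, float]) -> dict[int, float]:
    """Pr(K) under a uniform prior on K, computed with the log-sum-exp trick."""
    peak = max(mean_lnp.values())
    # Subtracting the maximum avoids underflow for large negative log-likelihoods.
    weights = {k: math.exp(v - peak) for k, v in mean_lnp.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def delta_k(runs: dict[int, list[float]]) -> dict[int, float]:
    """Delta K for each interior K, from replicate log-likelihoods per K."""
    means = {k: statistics.mean(v) for k, v in runs.items()}
    result = {}
    for k in sorted(runs):
        if k - 1 not in runs or k + 1 not in runs:
            continue  # Delta K is undefined for the smallest and largest K
        sd = statistics.stdev(runs[k])  # needs at least two replicates
        if sd == 0:
            raise ValueError(f"zero standard deviation at K={k}")
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / sd
    return result


if __name__ == "__main__":
    # Hypothetical ln P(D|K) values from two replicate runs per K.
    runs = {1: [-1000.0, -1002.0], 2: [-900.0, -902.0],
            3: [-890.0, -892.0], 4: [-889.0, -891.0]}
    means = {k: statistics.mean(v) for k, v in runs.items()}
    print(posterior_probabilities(means))
    print(delta_k(runs))
```

In this made-up example the posterior probability favors the largest K because the likelihood keeps creeping upward, while ΔK peaks at K = 2, where the rate of improvement changes most sharply; that contrast is the comparison part b asks about.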