Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium November 12, 2012.


1 Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium November 12, 2012

2 Last Time
- Sequence data and quantification of variation
  - Infinite sites model
  - Nucleotide diversity (π)
- Sequence-based tests of neutrality
  - Tajima's D
  - Hudson-Kreitman-Aguadé
  - Synonymous versus nonsynonymous substitutions
  - McDonald-Kreitman

3 Today
- Signatures of selection based on synonymous and nonsynonymous substitutions
- Multiple loci and independent segregation
- Estimating linkage disequilibrium

4 Using Synonymous Substitutions to Control for Factors Other Than Selection: dN/dS or Ka/Ks Ratios

5 Types of Mutations (Polymorphisms)

6 Synonymous versus Nonsynonymous SNPs
- First- and second-position SNPs often change the amino acid
- Third-position SNPs are often synonymous
  - UCA, UCU, UCG, and UCC all code for serine
- The majority of positions are nonsynonymous
- Not all amino acid changes affect fitness: allozymes
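The position-by-position pattern can be checked against the standard genetic code. The following is a minimal sketch, assuming the standard code, DNA letters (T in place of U), and counting only single-nucleotide changes between sense codons; the names `CODON_TABLE` and `synonymous_fraction_by_position` are illustrative and not from the lecture.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (T stands in for U).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_fraction_by_position():
    """Fraction of single-base changes that are synonymous, per codon position.

    Assumption: changes to or from stop codons are ignored.
    """
    syn = [0, 0, 0]
    total = [0, 0, 0]
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                mutant_aa = CODON_TABLE[mutant]
                if mutant_aa == "*":
                    continue
                total[pos] += 1
                syn[pos] += mutant_aa == aa
    return [s / t for s, t in zip(syn, total)]


if __name__ == "__main__":
    for pos, frac in enumerate(synonymous_fraction_by_position(), start=1):
        print(f"position {pos}: {frac:.2f} of changes are synonymous")
```

Under these assumptions no second-position change is synonymous, a few first-position changes are (for example within leucine and arginine), and most third-position changes are.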

7 Synonymous & Nonsynonymous Substitutions
- The synonymous substitution rate can be used to set the neutral expectation for the nonsynonymous rate
- dS is the rate of synonymous substitutions per synonymous site
- dN is the rate of nonsynonymous substitutions per nonsynonymous site
- ω = dN/dS
  - If ω = 1, neutral evolution
  - If ω < 1, purifying selection
  - If ω > 1, positive Darwinian selection
- For human genes, ω ≈ 0.1
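As a rough illustration of how ω is computed from counts, the sketch below applies a Jukes-Cantor multiple-hit correction to the proportion of differing sites, in the spirit of simple counting methods. The function names and the example counts are hypothetical, and real analyses would use the model-based methods discussed on the next slide.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor distance for a proportion p of differing sites."""
    if not 0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def omega(syn_diffs, syn_sites, nonsyn_diffs, nonsyn_sites):
    """dN/dS from counts of differences and sites (hypothetical inputs)."""
    d_s = jukes_cantor(syn_diffs / syn_sites)
    d_n = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    if d_s == 0:
        raise ValueError("dS is zero; omega is undefined")
    return d_n / d_s


def interpret(w):
    """Map omega onto the three cases listed on the slide."""
    if w < 1:
        return "purifying selection"
    if w > 1:
        return "positive selection"
    return "neutral evolution"


if __name__ == "__main__":
    # Hypothetical gene: 10 of 100 synonymous sites and 5 of 300
    # nonsynonymous sites differ between two species.
    w = omega(10, 100, 5, 300)
    print(f"omega = {w:.3f}: {interpret(w)}")
```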

8 Complications in Estimating dN/dS
- Multiple mutations in a codon give multiple possible paths, e.g. for CGT(Arg)->AGA(Arg):
  - CGT(Arg)->AGT(Ser)->AGA(Arg)
  - CGT(Arg)->CGA(Arg)->AGA(Arg)
- The two types of nucleotide substitution that produce SNPs, transitions and transversions, are not equally likely (http://www.mun.ca/biology/scarr/Transitions_vs_Transversions.html)
- Back-mutations are invisible
- Complex evolutionary models using likelihood and Bayesian approaches must be used to estimate dN/dS (also called KA/KS or KN/KS depending on method), e.g. the PAML package
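The multiple-path problem can be made concrete by enumerating every ordering of the single-base changes between two codons. This is a minimal sketch assuming the standard genetic code; `mutation_paths` and `nonsynonymous_steps` are illustrative names, not part of PAML or any other package.

```python
from itertools import permutations, product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def mutation_paths(start, end):
    """All shortest paths from start to end, changing one base per step."""
    differing = [i for i in range(3) if start[i] != end[i]]
    paths = []
    for order in permutations(differing):
        codon = list(start)
        path = [start]
        for pos in order:
            codon[pos] = end[pos]
            path.append("".join(codon))
        paths.append(path)
    return paths


def nonsynonymous_steps(path):
    """Number of steps along a path that change the amino acid."""
    return sum(
        CODON_TABLE[a] != CODON_TABLE[b] for a, b in zip(path, path[1:])
    )


if __name__ == "__main__":
    for path in mutation_paths("CGT", "AGA"):
        labelled = "->".join(f"{c}({CODON_TABLE[c]})" for c in path)
        print(labelled, "nonsynonymous steps:", nonsynonymous_steps(path))
```

For CGT to AGA the two paths imply either two nonsynonymous changes (via AGT, serine) or none (via CGA, arginine), which is why the path has to be weighted or modelled rather than simply counted.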

9 dN/dS ratios for 363 mouse-rat comparisons (Hartl and Clark 2007)
- Most genes show purifying selection (dN/dS < 1)
- Some evidence of positive selection, especially in genes related to the immune system
- Figure annotation: interleukin-3 (mast cells and bone marrow cells in immune system)

10 McDonald-Kreitman Test
- Conceptually similar to the HKA test
- Uses only one gene
- Contrasts the ratio of synonymous divergence to polymorphism with the ratio of nonsynonymous divergence to polymorphism
- The gene provides an internal control for evolutionary rates and demography
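The contrast reduces to a 2x2 table of counts: nonsynonymous versus synonymous, fixed differences versus polymorphisms. The sketch below computes the neutrality index and a two-sided Fisher's exact test using only the standard library. The counts in the example are hypothetical, and the function names are illustrative.

```python
from math import comb


def neutrality_index(dn, ds, pn, ps):
    """NI = (Pn/Ps) / (Dn/Ds); values below 1 suggest excess nonsynonymous divergence."""
    return (pn / ps) / (dn / ds)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum all tables at least as extreme as the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


if __name__ == "__main__":
    # Hypothetical counts: Dn, Ds = fixed differences; Pn, Ps = polymorphisms.
    dn, ds, pn, ps = 20, 10, 5, 20
    print("NI =", neutrality_index(dn, ds, pn, ps))
    print("p  =", fisher_exact_two_sided(dn, pn, ds, ps))
```

Under neutrality the two ratios should be similar (NI near 1); a significant excess of nonsynonymous fixed differences is the pattern usually read as positive selection.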

11 Application of the McDonald-Kreitman Test (Bustamante et al. 2005. Nature 437, 1153-1157)
- Aligned 11,624 gene sequences between human and chimp
- Calculated synonymous and nonsynonymous substitutions between species (divergence) and within humans (SNPs)
- Identified 304 genes showing evidence of positive selection (blue) and 814 genes showing purifying selection (red) in humans
- Positive selection: defense/immunity, apoptosis, sensory perception, and transcription factors
- Purifying selection: structural and housekeeping genes

12 Genes showing purifying (red) or positive (blue) selection in the human genome based on the McDonald-Kreitman Test (Bustamante et al. 2005. Nature 437, 1153-1157)

13 How can you differentiate between effects of selection and demographic effects on sequence variation? Will this work for organellar DNA?

14 Extending to Multiple Loci
- So far, we have considered only the dynamics of alleles at single loci
- Loci occur on chromosomes, linked to other loci!
  “The fitness of a single locus ripped from its interactive context is about as relevant to real problems of evolutionary genetics as the study of the psychology of individuals isolated from their social context is to an understanding of man’s sociopolitical evolution” Richard Lewontin (quoted in Hedrick 2005)
- The size of the region that must be considered depends on linkage disequilibrium

15 Gametic (Linkage) Disequilibrium (LD)
- Nonrandom association of alleles at different loci into gametes
- Haplotype: genotype of a group of closely linked loci
- LD is a major factor in evolution
- LD itself provides insights into population history
- Estimation of LD is critical for ALL population genetic data

16 Nomenclature and concepts
- Two loci, two alleles
  - Frequency of allele i at locus 1 is p_i
  - Frequency of allele i at locus 2 is q_i
- Diagram: alleles A1 and A2 (frequencies p1, p2) at locus 1; alleles B1 and B2 (frequencies q1, q2) at locus 2

17 Nomenclature and concepts
- Genotype is written as A1B1/A2B2 (diagram: A1 and B1 on one chromosome, A2 and B2 on the other)
- A1 and B1 are in coupling phase
- A1 and B2 are in repulsion phase

18 Gametic Disequilibrium
- Easiest to think about for physically linked loci, but physical linkage is not necessarily required
- What are the expected frequencies of gametes in a population under independent assortment?
- Diagram: meiosis yields four gamete types with expected frequencies
  - A1B1: p1q1
  - A1B2: p1q2
  - A2B1: p2q1
  - A2B2: p2q2

19 What are the expected frequencies of gametes with complete linkage?
- Diagram: alleles A1, A2 (frequencies p1, p2) and B1, B2 (frequencies q1, q2); meiosis yields four gamete types with observed frequencies
  - A1B1: x11
  - A1B2: x12
  - A2B1: x21
  - A2B2: x22

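20 Linkage disequilibrium measure, D
- Independent assortment: x11 = p1q1, x12 = p1q2, x21 = p2q1, x22 = p2q2
- With LD: x11 = p1q1 + D, x12 = p1q2 - D, x21 = p2q1 - D, x22 = p2q2 + D
- Substituting from the above table: D = x11x22 - x12x21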

21 Problem: D is sensitive to allele frequencies
- Can't have negative gamete frequencies
- Maximum D is set by allele frequencies
- Example, if D is positive: with p1 = 0.5 and q2 = 0.5, Dmax = 0.25, but with p1 = 0.1 and q2 = 0.9, Dmax = 0.09
- Solution: D' = D/Dmax, which ranges from -1 to 1
- Dmax calculation:
  - If D is positive, Dmax is the lesser of p1q2 or p2q1
  - If D is negative, Dmax is the lesser of p1q1 or p2q2
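The Dmax rules translate directly into a short calculation from the four gamete frequencies. The sketch below takes D = x11·x22 - x12·x21, the usual two-locus definition, as an assumption; `ld_stats` is an illustrative name.

```python
def ld_stats(x11, x12, x21, x22):
    """Return (D, Dmax, D') from gamete frequencies of A1B1, A1B2, A2B1, A2B2."""
    if abs(x11 + x12 + x21 + x22 - 1.0) > 1e-9:
        raise ValueError("gamete frequencies must sum to 1")
    p1 = x11 + x12          # frequency of A1
    p2 = 1.0 - p1           # frequency of A2
    q1 = x11 + x21          # frequency of B1
    q2 = 1.0 - q1           # frequency of B2
    d = x11 * x22 - x12 * x21
    # Dmax depends on the sign of D, as stated on the slide.
    if d >= 0:
        d_max = min(p1 * q2, p2 * q1)
    else:
        d_max = min(p1 * q1, p2 * q2)
    d_prime = d / d_max if d_max > 0 else 0.0
    return d, d_max, d_prime


if __name__ == "__main__":
    # The two cases from the slide: complete association at different
    # allele frequencies gives different D but the same D'.
    print(ld_stats(0.5, 0.0, 0.0, 0.5))
    print(ld_stats(0.1, 0.0, 0.0, 0.9))
```

Both calls describe populations with only A1B1 and A2B2 gametes, yet D differs (0.25 versus 0.09) while D' is 1 in each case, which is the motivation for standardizing.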

22 LD can also be estimated as correlation between alleles
- r can also be standardized to a -1 to 1 scale
- It is equivalent to D' in this case
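A minimal sketch of the correlation measure, assuming the common definition r = D / sqrt(p1·p2·q1·q2), which is not spelled out in the transcript. The function name `ld_correlation` is illustrative.

```python
import math


def ld_correlation(x11, x12, x21, x22):
    """Correlation r between alleles at two loci from gamete frequencies."""
    p1 = x11 + x12          # frequency of A1
    q1 = x11 + x21          # frequency of B1
    d = x11 * x22 - x12 * x21
    denom = math.sqrt(p1 * (1 - p1) * q1 * (1 - q1))
    if denom == 0:
        raise ValueError("r is undefined when a locus is monomorphic")
    return d / denom


if __name__ == "__main__":
    # Equal allele frequencies, complete association: r = 1.
    print(ld_correlation(0.5, 0.0, 0.0, 0.5))
    # Unequal allele frequencies with one gamete type missing:
    # D' would be 1 here, but the unstandardized r is only about 0.33.
    print(ld_correlation(0.1, 0.0, 0.4, 0.5))
```

The second case shows why the unstandardized r and D' can disagree: r reaches ±1 only when allele frequencies at the two loci match.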

23 Recombination
- Shuffling of parental alleles during meiosis
- Occurs for unlinked loci and linked loci
- Rate of recombination for linked markers is partially a function of physical distance
- Diagrams: chromosomes carrying alleles A1, A2, B1, B2

24 What is the expected recombination rate for unlinked loci?
- Diagram: meiosis yields coupling gametes (A1B1, A2B2) and repulsion gametes (A1B2, A2B1)
- c = n_r / (n_r + n_c), where n_r is the number of repulsion phase gametes and n_c is the number of coupling phase gametes

25 LD is partially a function of recombination rate
- Expected proportions of gametes produced by various genotypes over two generations (table columns: first generation, second generation)
- c is the recombination rate and D0 is the initial amount of LD

26 Recombination degrades LD over time
- Dt = D0(1 - c)^t ≈ D0 e^(-ct), where t is time (in generations) and e is the base of the natural log (≈ 2.718)
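The decay can be tabulated for a few recombination rates. This sketch assumes the standard recursion in which D shrinks by a factor of (1 - c) each generation, together with its exponential approximation for small c; the function names are illustrative.

```python
import math


def ld_after(d0, c, t):
    """Exact decay: D_t = D_0 * (1 - c)**t."""
    return d0 * (1 - c) ** t


def ld_after_approx(d0, c, t):
    """Exponential approximation, accurate when c is small."""
    return d0 * math.exp(-c * t)


def generations_to_fraction(c, fraction):
    """Generations needed for D to fall to the given fraction of D_0."""
    return math.log(fraction) / math.log(1 - c)


if __name__ == "__main__":
    d0 = 0.25
    for c in (0.5, 0.1, 0.01):
        trajectory = [round(ld_after(d0, c, t), 4) for t in range(0, 6)]
        print(f"c = {c}: D over generations 0-5 = {trajectory}")
    print("generations to halve D at c = 0.01:",
          round(generations_to_fraction(0.01, 0.5), 1))
```

Even at c = 0.5 the association is only halved each generation, so several generations pass before D is close to zero.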

27 Effects of recombination rate on LD
- Decline in LD over time with different theoretical recombination rates (c)
- Even with independent segregation (c = 0.5), multiple generations are required to break up allelic associations
- Genome-wide linkage disequilibrium can be caused by demographic factors (more later)

