BME 130 – Genomes Lecture 20 Population Genomics I
Phylogenetics
Human: MTEPTLKDRSALGTLARTHL
Chimp: MTEPALKDRTALGTLARTHL
Orang: MTEPALKQRSALGTIADTHL
Mouse: MTDPSLKQRSALTGLARTHL
[Figure: phylogenetic tree with tips mouse, human, chimp, orang]
Ka/Ks ratio test
Ka = number of non-synonymous changes per non-synonymous site
Ks = number of synonymous changes per synonymous site
Ka/Ks << 1.0 => purifying selection
Ka/Ks = 1.0 => no selection
Ka/Ks >> 1.0 => positive selection
ATG ACA AGT CTG ATG CCT GGT GCA GGA TTG CTT CCA ATA CCG ACC CCA AAT CCT
 M   T   S   L   M   P   G   A   G   L   L   P   I   P   T   P   N   P
[Figure legend: synonymous site; non-synonymous site; synonymous and non-synonymous site]
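To make the "per site" denominators concrete, here is a minimal Python sketch that translates the slide's sequence and counts its synonymous and non-synonymous sites. It assumes the standard genetic code and a simplified Nei-Gojobori-style count in which each of the three possible point mutations at a position contributes 1/3 of a site, and mutations to a stop codon are counted as non-synonymous. The function names and the uncorrected `ka_ks` ratio (no multiple-hit correction) are illustrative choices, not something given in the lecture.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: no
# alternative code such as the mitochondrial one is in use).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

SEQ = "ATGACAAGTCTGATGCCTGGTGCAGGATTGCTTCCAATACCGACCCCAAATCCT"


def codons(dna):
    """Split a coding sequence into complete codons."""
    return [dna[i:i + 3] for i in range(0, len(dna) - 2, 3)]


def translate(dna):
    """Translate a coding sequence into one-letter amino acids."""
    return "".join(CODON_TABLE[c] for c in codons(dna))


def synonymous_sites(codon):
    """Expected number of synonymous sites in one codon (between 0 and 3)."""
    total = 0.0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                total += 1 / 3
    return total


def site_counts(dna):
    """Return (synonymous sites, non-synonymous sites) for a sequence."""
    cs = codons(dna)
    s = sum(synonymous_sites(c) for c in cs)
    return s, 3 * len(cs) - s


def ka_ks(nonsyn_changes, syn_changes, nonsyn_sites, syn_sites):
    """Uncorrected Ka/Ks from counted changes and counted sites."""
    return (nonsyn_changes / nonsyn_sites) / (syn_changes / syn_sites)


if __name__ == "__main__":
    print(translate(SEQ))
    print(site_counts(SEQ))
```

Because most point mutations change the amino acid, a coding sequence has far more non-synonymous than synonymous sites, which is why the raw change counts are normalized before taking the ratio.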
Human accelerated regions
[Figure: human, chimp, other mammals]
Prabhakar et al., Science 321(5894): HAR2/HACNS1
But, there’s not just one human (or chimp, or orang, or mouse)
Wright-Fisher model of reproduction
finite and constant N
random mating with respect to the gene being studied
non-overlapping generations
[Figure: N = 10, generations t0, t1, t2, t3]
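The three assumptions of this model translate almost directly into a simulation. The sketch below is one possible reading, assuming a diploid population (so N individuals carry 2N gene copies) and a single locus with two alleles; the function name `wright_fisher`, its parameters, and the seeded generator are illustrative choices rather than anything specified on the slide.

```python
import random


def wright_fisher(n_individuals, p0, generations, seed=0):
    """Simulate the frequency of one allele under pure drift.

    Constant size: every generation has exactly 2N gene copies.
    Random mating: each copy picks its parent copy uniformly at random.
    Non-overlapping generations: the parents are fully replaced.
    """
    rng = random.Random(seed)
    copies = 2 * n_individuals
    count = round(p0 * copies)
    trajectory = [count / copies]
    for _ in range(generations):
        p = count / copies
        # Each offspring copy carries the allele with probability p.
        count = sum(1 for _ in range(copies) if rng.random() < p)
        trajectory.append(count / copies)
    return trajectory


if __name__ == "__main__":
    # N = 10 as in the figure; different seeds give different histories.
    for seed in range(3):
        print(wright_fisher(10, 0.5, 20, seed=seed))
```

Running it with several seeds gives a different trajectory each time, and with N this small many runs hit 0 or 1 and stay there.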
Genotype: $A_1A_1$, $A_1A_2$, $A_2A_2$
Frequency: $x_{11}$, $x_{12}$, $x_{22}$
$x_{11} + x_{12} + x_{22} = 1$
$p = x_{11} + \frac{1}{2} x_{12}$
$q = 1 - p = x_{22} + \frac{1}{2} x_{12}$
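As a quick check of these definitions, a small Python sketch can compute the allele frequencies p and q from the three genotype frequencies. The function name and the input validation are assumptions added for illustration.

```python
def allele_frequencies(x11, x12, x22):
    """Return (p, q) from genotype frequencies of A1A1, A1A2 and A2A2."""
    total = x11 + x12 + x22
    if abs(total - 1.0) > 1e-9:
        raise ValueError("genotype frequencies must sum to 1")
    # Homozygotes contribute both copies, heterozygotes one of two.
    p = x11 + 0.5 * x12
    return p, 1.0 - p


if __name__ == "__main__":
    print(allele_frequencies(0.25, 0.5, 0.25))
    print(allele_frequencies(0.5, 0.0, 0.5))
```

Both example populations have the same allele frequencies even though their genotype frequencies differ, which is the point of keeping the two levels separate.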
What happens to genotype frequencies over time (generations)?
$p^2 + 2pq + q^2 = 1$ (Hardy-Weinberg)
How long does it take to get to Hardy-Weinberg equilibrium for the most extreme case?
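One way to explore the "how long" question is to apply a round of random mating to a population far from equilibrium. The sketch below assumes an infinitely large, randomly mating population (so no drift) and takes as its extreme case a population with only the two homozygotes and no heterozygotes; the function name `random_mating` is hypothetical.

```python
def random_mating(x11, x12, x22):
    """Genotype frequencies after one generation of random mating."""
    p = x11 + 0.5 * x12   # frequency of allele A1 among gametes
    q = 1.0 - p           # frequency of allele A2 among gametes
    # Gametes pair at random, so genotypes are products of allele frequencies.
    return p * p, 2 * p * q, q * q


if __name__ == "__main__":
    start = (0.5, 0.0, 0.5)            # no heterozygotes at all
    gen1 = random_mating(*start)
    gen2 = random_mating(*gen1)
    print(start, gen1, gen2)
```

Under these assumptions the genotype frequencies reach Hardy-Weinberg proportions after a single generation and do not change in the next one, because random mating leaves p itself unchanged.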
Graph heterozygosity against allele frequency p, for p from 0% to 100%. What allele frequency p maximizes heterozygosity, assuming Hardy-Weinberg equilibrium? Concerning rare alleles, how much rarer are homozygotes for an allele at frequency p = 0.001 than for an allele at p = 0.01?
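Both questions can be checked numerically. This Python sketch assumes Hardy-Weinberg proportions, so heterozygosity is 2p(1 - p) and the homozygote frequency is p squared; the 1% grid and the function names are illustrative.

```python
def heterozygosity(p):
    """Expected heterozygote frequency 2pq at Hardy-Weinberg equilibrium."""
    return 2 * p * (1 - p)


def homozygote_frequency(p):
    """Expected frequency of homozygotes for an allele at frequency p."""
    return p * p


# Allele frequencies from 0% to 100% in 1% steps, ready for plotting.
GRID = [i / 100 for i in range(101)]
CURVE = [(p, heterozygosity(p)) for p in GRID]

# Allele frequency on the grid with the highest heterozygosity.
BEST_P = max(GRID, key=heterozygosity)

# How many times rarer homozygotes are at p = 0.001 than at p = 0.01.
FOLD = homozygote_frequency(0.01) / homozygote_frequency(0.001)

if __name__ == "__main__":
    print(BEST_P, heterozygosity(BEST_P))
    print(FOLD)
```

The curve is a parabola that is symmetric about p = 0.5, and because homozygote frequency scales with the square of p, a tenfold drop in allele frequency makes homozygotes a hundredfold rarer.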
Genetic drift
Why?
Under drift, what is the chance that a given allele at frequency p will go to fixation? Pr(fixation | p) = p. Why?
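The claim can be verified exactly for a small population without any random sampling. The sketch below treats the Wright-Fisher model as a Markov chain on the number of copies of the allele, with binomial transition probabilities, and pushes a starting distribution forward until essentially all probability sits at loss or fixation. It uses only the standard library (`math.comb`); the function names and the default of 2000 generations are assumptions chosen so that a population with 20 gene copies is fully absorbed.

```python
from math import comb


def transition_row(i, copies):
    """Binomial probabilities of j copies next generation, given i now."""
    p = i / copies
    return [comb(copies, j) * p**j * (1 - p) ** (copies - j)
            for j in range(copies + 1)]


def fixation_probability(start_count, copies, generations=2000):
    """Probability that an allele with start_count copies is fixed."""
    matrix = [transition_row(i, copies) for i in range(copies + 1)]
    dist = [0.0] * (copies + 1)
    dist[start_count] = 1.0
    for _ in range(generations):
        new = [0.0] * (copies + 1)
        for i, mass in enumerate(dist):
            if mass:
                for j, t in enumerate(matrix[i]):
                    new[j] += mass * t
        dist = new
    # State `copies` means every gene copy carries the allele.
    return dist[copies]


if __name__ == "__main__":
    copies = 20  # 2N for N = 10 diploid individuals
    for k in (1, 6, 10):
        print(k / copies, fixation_probability(k, copies))
```

The reason it works out to p is that drift has no preferred direction: the expected allele frequency stays at p every generation, and once every population has ended at 0 or 1, an average of p requires a fraction p of them to have ended at 1.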
…so drift removes heterozygosity. What balances that?
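How fast heterozygosity is lost can be sketched with the standard Wright-Fisher result that expected heterozygosity shrinks by a factor of 1 - 1/(2N) each generation. That formula is not stated on the slide and is brought in here as an assumption, as are the function names.

```python
from math import log


def expected_heterozygosity(h0, n_individuals, t):
    """Expected heterozygosity after t generations of drift alone."""
    return h0 * (1 - 1 / (2 * n_individuals)) ** t


def half_life(n_individuals):
    """Generations needed for expected heterozygosity to halve."""
    return log(0.5) / log(1 - 1 / (2 * n_individuals))


if __name__ == "__main__":
    for n in (10, 1000, 100000):
        print(n, expected_heterozygosity(0.5, n, 100), half_life(n))
```

The loss is rapid in small populations and very slow in large ones, so even a low rate of new variation can offset it when N is large.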
Mutation! New mutations enter the population at rate $\mu$ per generation (at mutation-drift equilibrium, let's see…)
Molecular evolution
What is the rate of fixation of new mutations over evolutionary time?
$2N\mu$ new alleles per generation (with $\mu$ the mutation rate per generation), each of which starts life at frequency $\frac{1}{2N}$
Chance of fixation is the allele frequency
Rate of fixation / generation = number of new alleles x chance that each goes to fixation = $2N\mu \times \frac{1}{2N} = \mu$
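The arithmetic on this slide can be checked for several population sizes. The sketch assumes neutral mutations, a diploid population of N individuals, and a per-copy mutation rate per generation called `mu` here; the numeric value of `mu` is an arbitrary illustration, not a figure from the lecture.

```python
def fixation_rate(n_individuals, mu):
    """Expected number of new mutations fixed per generation."""
    new_alleles = 2 * n_individuals * mu       # mutations arising each generation
    chance_fixed = 1 / (2 * n_individuals)     # starting frequency of each one
    return new_alleles * chance_fixed


if __name__ == "__main__":
    mu = 1e-8  # hypothetical mutation rate, for illustration only
    for n in (10, 1000, 1000000):
        print(n, fixation_rate(n, mu))
```

Population size cancels: a large population produces more new mutations, but each one is proportionally less likely to drift to fixation.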
Tic, toc, molecular clock
The coalescent
Homework:
1. Find yourself a copy of the 1000 Genomes data
2. Learn the VCF format specification of these data
3. Install and familiarize yourself with vcftools
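As a warm-up for step 2, here is a minimal Python sketch that reads VCF-style records and computes the non-reference allele frequency at each site from the genotype (GT) fields. The two records in `TOY_VCF` are made up for illustration and are not taken from the 1000 Genomes data; the function names are hypothetical, and the parser handles only the basics (tab-separated fixed columns, a FORMAT column containing GT, alleles separated by `|` or `/`, `.` for missing). All non-reference alleles are lumped together.

```python
import re

# Made-up toy records in VCF layout: 8 fixed columns, FORMAT, then samples.
TOY_VCF = """\
##fileformat=VCFv4.2
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3
1\t1000\t.\tA\tG\t.\tPASS\t.\tGT\t0|0\t0|1\t1|1
1\t2000\t.\tC\tT\t.\tPASS\t.\tGT\t0|0\t0|0\t0|1
"""


def alt_allele_frequency(vcf_line):
    """Return (chrom, pos, non-reference allele frequency) for one record."""
    fields = vcf_line.rstrip("\n").split("\t")
    chrom, pos = fields[0], int(fields[1])
    gt_index = fields[8].split(":").index("GT")
    alleles = []
    for sample in fields[9:]:
        genotype = sample.split(":")[gt_index]
        # Phased (|) and unphased (/) genotypes are treated the same way.
        alleles += [a for a in re.split(r"[|/]", genotype) if a != "."]
    alt_count = sum(1 for a in alleles if a != "0")
    return chrom, pos, alt_count / len(alleles)


def site_frequencies(vcf_text):
    """Apply alt_allele_frequency to every non-header line."""
    return [alt_allele_frequency(line)
            for line in vcf_text.splitlines()
            if line and not line.startswith("#")]


if __name__ == "__main__":
    for record in site_frequencies(TOY_VCF):
        print(record)
```

A site where every genotype is missing would divide by zero in this sketch, and real files are usually compressed and far larger, which is where a dedicated tool such as vcftools comes in.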