Great Dane x Mexican Chihuahua
F1: Big (Great Danes)
F2: 3 Big : 1 Small
The Genius of Mendel
Highly inbred strains of peas
Differed by a single character
Round x Wrinkled (WT x mutant)
F1: all Round
F2: Round : Wrinkled (1850 wrinkled) = 2.96:1 (3:1)
Needs statistics
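"Needs statistics" can be made concrete with a chi-square goodness-of-fit test against 3:1. This is only a sketch: the 1850 wrinkled count is from the slide, while the 5474 round count is Mendel's published figure and is an assumption here because it is not legible in the slide text. The function name is hypothetical.

```python
def chi_square_3_to_1(dominant, recessive):
    """Pearson chi-square statistic for a 3:1 expectation (1 degree of freedom)."""
    total = dominant + recessive
    expected = (0.75 * total, 0.25 * total)
    observed = (dominant, recessive)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# 1850 wrinkled is from the slide; 5474 round is an assumption
# (Mendel's published count, not readable in the slide text).
ROUND, WRINKLED = 5474, 1850
CRITICAL_5_PERCENT_DF1 = 3.841  # standard chi-square table value, df = 1

stat = chi_square_3_to_1(ROUND, WRINKLED)
print(f"ratio = {ROUND / WRINKLED:.2f}:1, chi-square = {stat:.3f}")
if stat < CRITICAL_5_PERCENT_DF1:
    print("no significant deviation from 3:1 at the 5% level")
else:
    print("deviates from 3:1 at the 5% level")
```

A statistic below the critical value means the observed 2.96:1 is compatible with a true 3:1 ratio.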
Mapping in Drosophila
Ly Sb br / + + +
Lots of variation in people
There must be a genetic component
How do we assign “traits” to genes?
Ultimately we want a molecular description
Start with inherited diseases
Pedigrees: Mendel’s First Law
Autosomal Dominant Disorder
Autosomal Recessive
Disease is apparent because of consanguinity (III-5 and III-6)
Population Genetics
Science of Intraspecific Variation: Phenotypic, GENOTYPIC
Genotypic Variation: Alleles, Polymorphism
–Ultimate Source of Variation: Mutation
Dynamics of Variation during Population History
–Changes in Allele Frequencies due to Drift and Selection
–Persistence of Allele Combinations due to Linkage: Linkage Disequilibrium
Some Basics 1
Allele 1: GAT T ACA (other strand: TGT A ATC)
Allele 2: GAT C ACA (other strand: TGT G ATC)
1. Only refer to one strand, and don’t confuse strands with alleles
2. Context is unimportant (unless we have linkage…next)
Allele 1: T
AGACAGAAAGGAAAAGAACCTTCCATTTTTGGCTGTGCCAAGAAGCTCAGAAAGG T GATAATATAAAAAATATATAGTTAATTGGGAATTGAATTTACAAAATACATTGTG
Allele 2: C
AGACAGAAAGGAAAAGAACCTTCCATTTTTGGCTGTGCCAAGAAGCTCAGAAAGG C GATAATATAAAAAATATATAGTTAATTGGGAATTGAATTTACAAAATACATTGTG
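The strand-versus-allele point can be checked in a few lines. This is a minimal sketch using the GATTACA example from the slide; the helper name `reverse_complement` is hypothetical. The two sequences shown for each allele are the two strands of one molecule, while the alleles themselves differ at a single position.

```python
# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

allele_1 = "GATTACA"
allele_2 = "GATCACA"

# Same allele, other strand: not a new allele.
print(allele_1, "<->", reverse_complement(allele_1))
print(allele_2, "<->", reverse_complement(allele_2))

# Positions (0-based) where the two alleles differ on the same strand.
differences = [i for i, (a, b) in enumerate(zip(allele_1, allele_2)) if a != b]
print("alleles differ at positions:", differences)
```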
Some Basics 2
3. Because mutations are rare events, the vast majority of variation is BINARY at the base pair level.
Allele 1:
CAAAGGAAAAGAATGCCTTCCATTTTTGGCTGTGCCAAGAAGCTCAGAAAGG T GATAATATAAAAAATATATAGTTAATTGGGAATTGAATTTACAAAATACATT
Allele 2:
CAAAGGAAAAGAATGCCTTCCATTTTTGGCTGTGCCAAGAAGCTCAGAAAGG C GATAATATAAAAAATATATAGTTAATTGGGAATTGAATTTACAAAATACATT
4. Linkage makes things more complicated, but only if you actually care about linkage: Linkage equilibrium/disequilibrium.
Each site has 2 alleles; three haplotypes (Haplotype 1, 2, 3) are shown:
GAAAGGAAAAGAAGATTT A CTTCC [1396bp] GAAGCTCAGAAAGG C GATAATATAAAAAATAT [2502bp] TTGGGAATTTACA G AATAC
GAAAGGAAAAGAAGATTT C CTTCC [1396bp] GAAGCTCAGAAAGG T GATAATATAAAAAATAT [2502bp] TTGGGAATTTACA G AATAC
GAAAGGAAAAGAAGATTT A CTTCC [1396bp] GAAGCTCAGAAAGG C GATAATATAAAAAATAT [2502bp] TTGGGAATTTACA A AATAC
Some Basics 3
5. Alleles have frequencies in the population (which sum to 1)
Frequency of Allele 1 (T) = 0.59 = p (frequency of the major allele)
Frequency of Allele 2 (C) = 0.41 = q
We’ll be talking about diploids, and genotype probabilities (which sum to 1) can be calculated from allele frequencies (and vice versa, under certain assumptions).
Genotype: T,T | T,C or C,T | C,C
Prob. of having: $p^2$ | $2pq$ | $q^2$
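The genotype probabilities for the slide's frequencies (0.59 and 0.41) can be worked out directly. A minimal sketch, assuming random mating (Hardy-Weinberg proportions); the function names are hypothetical.

```python
def genotype_probs(p):
    """Genotype probabilities from the major-allele frequency p, assuming random mating."""
    q = 1.0 - p
    return {"TT": p * p, "TC": 2 * p * q, "CC": q * q}

def allele_freq(probs):
    """Go back the other way: frequency of T from genotype probabilities."""
    return probs["TT"] + 0.5 * probs["TC"]

probs = genotype_probs(0.59)  # 0.59 is the frequency of T on the slide
for genotype, prob in probs.items():
    print(f"{genotype}: {prob:.4f}")
print(f"sum = {sum(probs.values()):.4f}, recovered p = {allele_freq(probs):.2f}")
```

The "vice versa" direction needs no assumption at all: counting alleles in genotypes always recovers p. It is the forward direction that depends on random mating.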
What about two different genes?
Consider two genes, A and B, that each have two alleles: A, a and B, b
Allelic frequencies are 0.5 (at the “A” locus A = 0.5, a = 0.5; at the “B” locus B = 0.5, b = 0.5)
For A and a, genotype frequencies are $p^2 + 2pq + q^2$: AA, Aa and aa individuals = 0.25, 0.50 and 0.25
The same for BB, Bb and bb
Fraction of AA BB individuals: 0.25 x 0.25 = 0.0625
Fraction of aa Bb individuals: 0.25 x 0.50 = 0.125
Both genes are in “equilibrium” (Hardy and Weinberg)
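The two-gene products can be tabulated for all nine genotype combinations. A sketch, assuming the two loci are independent (linkage equilibrium) and each is in Hardy-Weinberg proportions; `genotype_freqs` is a hypothetical helper.

```python
from itertools import product

def genotype_freqs(first, second, p):
    """Hardy-Weinberg genotype frequencies at one locus; p is the frequency of `first`."""
    q = 1.0 - p
    return {first + first: p * p, first + second: 2 * p * q, second + second: q * q}

locus_a = genotype_freqs("A", "a", 0.5)
locus_b = genotype_freqs("B", "b", 0.5)

# Independence assumption: joint frequency = product of single-locus frequencies.
joint = {
    f"{ga} {gb}": fa * fb
    for (ga, fa), (gb, fb) in product(locus_a.items(), locus_b.items())
}

for genotype, freq in joint.items():
    print(f"{genotype}: {freq:.4f}")
```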
Hardy Weinberg is the Population Equivalent of the Punnett Square
      A    a
 A    AA   Aa
 a    Aa   aa
$(p + q)^2 = p^2 + 2pq + q^2$
Mutation Rate per Generation
How often per generation does this happen?
Average Mutation Rates in Mammals
Point substitution (nuc): 0.5 x […] per base pair
Microdeletion (1-10bp): ~$10^{-9}$ per base pair
Microinsertion (1-10bp): ~0.5 x […] per base pair
Mobile element ins’n: ~[…]
Inversion: ?? much rarer
Exceptions
Hypermutable sites (CpGs): C->T = 10x avg point rate
Simple Sequence Repeats: […] x indel rate (some […]!)
Mitochondrial DNA: […] x nuclear point rate
[Figure: 1 generation]
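A per-base-pair rate turns into an expected count per genome by simple multiplication. A rough sketch using only the microdeletion rate, which is the one rate fully legible above; the haploid genome size of 3 x 10^9 bp is an assumption of this example, and the function name is hypothetical.

```python
def expected_new_events(rate_per_bp, genome_bp, copies=1):
    """Expected number of new mutations of one type per generation."""
    return rate_per_bp * genome_bp * copies

MICRODELETION_RATE = 1e-9  # per base pair per generation (from the slide)
HAPLOID_GENOME_BP = 3e9    # assumption: approximate haploid human genome size

per_gamete = expected_new_events(MICRODELETION_RATE, HAPLOID_GENOME_BP)
per_zygote = expected_new_events(MICRODELETION_RATE, HAPLOID_GENOME_BP, copies=2)

print(f"expected new microdeletions per gamete: {per_gamete:.1f}")
print(f"expected new microdeletions per zygote: {per_zygote:.1f}")
```

Even a rate that looks tiny per base pair gives a few new events per individual once it is multiplied over the whole genome.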
Haploid Human Genome is ~$3 \times 10^9$ base pairs
Most of the DNA is non-coding
Introns, intergenic regions, LINEs, SINEs etc.
At the DNA level, there can be tremendous variation with no phenotypic consequences
Remember the LacI gene (the repressor)
Nonsense mutations at every codon
Substitute every amino acid (AA) at every position
White means no phenotype
Lesson: most mutations in coding regions are silent
Drift vs. Selection
The two forces that determine the fate of alleles in a population
Drift
–Change in allele frequencies due to sampling
Selection
–Change in allele frequencies due to function
Genetic Drift
Gen 0 to Gen 19
This is like 107 independent populations
For every bottle: after the eggs hatch, pick 8 male larvae and 8 female larvae and put them in a new bottle. Repeat for 19 generations.
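The bottle experiment can be mimicked with a simple Wright-Fisher simulation. This is a sketch, not a model of the actual experiment: it treats the 16 flies as 32 randomly sampled gene copies, ignores the separate sexes, and assumes a starting allele frequency of 0.5. The function name and the random seed are arbitrary choices.

```python
import random

def drift_one_population(n_individuals, generations, p0, rng):
    """Follow one allele's frequency through repeated random sampling of 2N gene copies."""
    copies = 2 * n_individuals
    p = p0
    for _ in range(generations):
        # Each gene copy in the next generation is drawn at random from the current pool.
        count = sum(1 for _ in range(copies) if rng.random() < p)
        p = count / copies
    return p

rng = random.Random(0)  # fixed seed so the run is repeatable
final = [drift_one_population(16, 19, 0.5, rng) for _ in range(107)]

fixed_or_lost = sum(1 for p in final if p in (0.0, 1.0))
print(f"{fixed_or_lost} of {len(final)} populations fixed or lost the allele")
print(f"mean final frequency: {sum(final) / len(final):.3f}")
```

Individual bottles wander far from 0.5, while the average over all bottles stays close to the starting frequency: drift changes frequencies without a preferred direction.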
Genetic Drift: Size Matters
4 populations: 2 at N=25, 2 at N=250
From Li (1997) Molecular Evolution, Sinauer Press
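Why size matters can be seen from the standard expectation for loss of heterozygosity under drift, $H_t = H_0 (1 - 1/(2N))^t$. The sketch below applies that textbook formula to the two population sizes on the slide; the 50-generation horizon and starting heterozygosity of 0.5 are assumptions chosen for illustration, and the formula assumes an idealized randomly mating population.

```python
def expected_heterozygosity(h0, n_individuals, generations):
    """Expected heterozygosity after drift in an idealized population of size N."""
    return h0 * (1.0 - 1.0 / (2 * n_individuals)) ** generations

H0 = 0.5          # assumption: starting heterozygosity (p = q = 0.5)
GENERATIONS = 50  # assumption: illustrative time span

for n in (25, 250):  # the two population sizes on the slide
    h = expected_heterozygosity(H0, n, GENERATIONS)
    print(f"N = {n}: expected heterozygosity after {GENERATIONS} generations = {h:.3f}")
```

The small populations lose variation roughly ten times faster per generation than the large ones.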
Selection & Fitness
“Absolute Fitness” = “Viability” = # of survivors / total # of progeny produced = P(survival until mean reproductive age)
If Fitness depends on Genotype, then we have (natural) Selection
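Once fitness depends on genotype, allele frequencies shift each generation. A minimal sketch of one round of viability selection at a single locus with alleles A and a, using the standard one-locus recursion; the survivor counts and the starting frequency are made-up illustration values, and the function names are hypothetical.

```python
def absolute_fitness(survivors, progeny):
    """Viability: fraction of progeny surviving to reproductive age."""
    return survivors / progeny

def next_allele_freq(p, w_AA, w_Aa, w_aa):
    """Frequency of allele A after one generation of viability selection."""
    q = 1.0 - p
    mean_w = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa
    return (p * p * w_AA + p * q * w_Aa) / mean_w

# Hypothetical counts: aa individuals survive half as often as AA or Aa.
w_AA = absolute_fitness(800, 1000)
w_Aa = absolute_fitness(800, 1000)
w_aa = absolute_fitness(400, 1000)

p = 0.9  # assumption: starting frequency of allele A
p_next = next_allele_freq(p, w_AA, w_Aa, w_aa)
print(f"frequency of A: {p:.4f} -> {p_next:.4f}")
```

Only the ratios of the fitnesses matter in the recursion, because the result is divided by the mean fitness.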
Selection vs Drift Recap
From the perspective of disease severity: given a particular selection coefficient (picture severity of disease), selection is only effective in a population whose size is large enough to overcome the effect of drift.
From the perspective of population size: given a particular population size, only alleles that bear a large enough selection coefficient (picture severity of disease) will be strongly selected against.
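Both perspectives reduce to comparing the selection coefficient s with the population size N. The sketch below uses a common rule of thumb, that selection dominates when 2N|s| is well above 1. The exact threshold varies between textbooks, so the cutoff is exposed as a parameter; the value s = 0.01 is a made-up illustration, and the function name is hypothetical.

```python
def selection_overcomes_drift(pop_size, s, threshold=1.0):
    """Rule of thumb (an assumption here): selection is effective when 2N|s| > threshold."""
    return 2 * pop_size * abs(s) > threshold

s = 0.01  # assumption: a mildly deleterious allele
for n in (25, 250, 2500):
    verdict = "selection" if selection_overcomes_drift(n, s) else "drift"
    print(f"N = {n}, s = {s}: 2N|s| = {2 * n * s:.1f} -> mostly {verdict}")
```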
Linkage disequilibrium: the big (and oversimplified) picture
A new mutation! (on the "red" chromosome)
Small number (maybe one) of ancestral disease-causing mutations
Isolation of chromosome bearing disease-causing mutation
"Reasonable" opportunity for recombination during population history (Think Finland: 1000 founders 2000 years ago; consistent expansion)
Few (maybe no) recurrences of disease-causing mutation
Eager geneticist obtains samples from multiple affected individuals
LD and time: history at work
Do we care about the age of the mutation or the age of the founding population?
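How time erodes LD can be sketched with the standard expectation that a marker at recombination fraction theta from the mutation stays on the ancestral chromosome with probability $(1 - \theta)^t$ after t generations. The 2000 years comes from the Finland example; the 25-year generation time and the theta values are assumptions of this illustration, and the function name is hypothetical.

```python
def fraction_still_linked(theta, generations):
    """Expected fraction of disease chromosomes still carrying the ancestral marker allele."""
    return (1.0 - theta) ** generations

YEARS_SINCE_FOUNDING = 2000  # from the Finland example
GENERATION_YEARS = 25        # assumption: average human generation time
generations = YEARS_SINCE_FOUNDING // GENERATION_YEARS

for theta in (0.001, 0.01, 0.05):  # assumption: illustrative recombination fractions
    kept = fraction_still_linked(theta, generations)
    print(f"theta = {theta}: {kept:.3f} of chromosomes keep the ancestral marker "
          f"after {generations} generations")
```

Close markers stay associated with the mutation for the whole history of the population; distant ones are shuffled away, which is what lets the shared segment point to the gene.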
Two common types of DNA variants
DNA haplotype Haplotype = a series of marker alleles on a chromosome (DNA molecule) E.g.: DNA sequence, a series of SNPs or microsatellites along a chromosome.
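Treating each haplotype as a string of marker alleles makes linkage disequilibrium easy to measure. A sketch using the usual definitions $D = p_{AB} - p_A p_B$ and $r^2 = D^2 / (p_A (1 - p_A) p_B (1 - p_B))$; the sample haplotypes are hypothetical, and `ld_between` is an illustrative helper, not a library function.

```python
from collections import Counter

def ld_between(haplotypes, i, j):
    """Return (D, r^2) between marker positions i and j in a list of haplotype strings."""
    n = len(haplotypes)
    a, b = haplotypes[0][i], haplotypes[0][j]  # use the first haplotype's alleles as reference
    p_a = sum(h[i] == a for h in haplotypes) / n
    p_b = sum(h[j] == b for h in haplotypes) / n
    p_ab = sum(h[i] == a and h[j] == b for h in haplotypes) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0  # undefined if a site is monomorphic
    return d, r2

# Hypothetical sample: three SNPs typed on six chromosomes.
sample = ["TCG", "TCG", "CTG", "CTA", "CTA", "TCG"]

print("haplotype counts:", dict(Counter(sample)))
for i, j in ((0, 1), (0, 2), (1, 2)):
    d, r2 = ld_between(sample, i, j)
    print(f"sites {i} and {j}: D = {d:.3f}, r^2 = {r2:.3f}")
```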