Download presentation
Presentation is loading. Please wait.
1
Introduction to Basic and Quantitative Genetics
2
Darwin & Mendel Darwin (1859) Origin of Species –Instant Classic, major immediate impact –Problem: Model of Inheritance Darwin assumed Blending inheritance Offspring = average of both parents z o = (z m + z f )/2 Fleming Jenkin (1867) pointed out problem –Var(z o ) = Var[(z m + z f )/2] = (1/2) Var(parents) –Hence, under blending inheritance, half the variation is removed each generation and this must somehow be replenished by mutation.
3
Mendel Mendel (1865), Experiments in Plant Hybridization No impact, paper essentially ignored –Ironically, Darwin had an apparently unread copy in his library –Why ignored? Perhaps too mathematical for 19th century biologists The rediscovery in 1900 (by three independent groups) Mendel’s key idea: Genes are discrete particles passed on intact from parent to offspring
4
Mendel’s experiments with the Garden Pea 7 traits examined
5
Mendel crossed a pure-breeding yellow pea line with a pure-breeding green line. Let P1 denote the pure-breeding yellow (parental line 1) P2 the pure-breed green (parental line 2) The F1, or first filial, generation is the cross of P1 x P2 (yellow x green). All resulting F1 were yellow The F2, or second filial, generation is a cross of two F1’s In F2, 1/4 are green, 3/4 are yellow This outbreak of variation blows the theory of blending inheritance right out of the water.
6
Mendel also observed that the P1, F1 and F2 Yellow lines behaved differently when crossed to pure green P1 yellow x P2 (pure green) --> all yellow F1 yellow x P2 (pure green) --> 1/2 yellow, 1/2 green F2 yellow x P2 (pure green) --> 2/3 yellow, 1/3 green
7
Mendel’s explanation Genes are discrete particles, with each parent passing one copy to its offspring. Let an allele be a particular copy of a gene. In Diploids, each parent carries two alleles for every gene Pure Yellow parents have two Y (or yellow) alleles We can thus write their genotype as YY Likewise, pure green parents have two g (or green) alleles Their genotype is thus gg Since there are lots of genes, we refer to a particular gene by given names, say the pea-color gene (or locus)
8
Each parent contributes one of its two alleles (at random) to its offspring Hence, a YY parent always contributes a Y, while a gg parent always contributes a g In the F1, YY x gg --> all individuals are Yg An individual carrying only one type of an allele (e.g. yy or gg) is said to be a homozygote An individual carrying two types of alleles is said to be a heterozygote.
9
The phenotype of an individual is the trait value we observe For this particular gene, the map from genotype to phenotype is as follows: YY --> yellow Yg --> yellow gg --> green Since the Yg heterozygote has the same phenotypic value as the YY homozygote, we say (equivalently) Y is dominant to g, or g is recessive to Y
10
Explaining the crosses F1 x F1 -> Yg x Yg Prob(YY) = yellow(dad)*yellow(mom) = (1/2)*(1/2) Prob(gg) = green(dad)*green(mom) = (1/2)*(1/2) Prob(Yg) = 1-Pr(YY) - Pr(gg) = 1/2 Prob(Yg) = yellow(dad)*green(mom) + green(dad)*yellow(mom) Hence, Prob(Yellow phenotype) = Pr(YY) + Pr(Yg) = 3/4 Prob(green phenotype) = Pr(gg) = 1/4
11
Dealing with two (or more) genes For his 7 traits, Mendel observed Independent Assortment The genotype at one locus is independent of the second RR, Rr - round seeds, rr - wrinkled seeds Pure round, green (RRgg) x pure wrinkled yellow (rrYY) F1 --> RrYg = round, yellow What about the F2?
12
Let R- denote RR and Rr. R- are round. Note in F2, Pr(R-) = 1/2 + 1/4 = 3/4 Likewise, Y- are YY or Yg, and are yellow PhenotypeGenotypeFrequency Yellow, round Y-R- (3/4)*(3/4) = 9/16 Yellow, wrinkled Y-rr (3/4)*(1/4) = 3/16 Green, round ggR- (1/4)*(3/4) = 3/16 Green, wrinkled ggrr (1/4)*(1/4) = 1/16 Or a 9:3:3:1 ratio
13
Probabilities for more complex genotypes Cross AaBBCcDD X aaBbCcDd What is Pr(aaBBCCDD)? Under independent assortment, = Pr(aa)*Pr(BB)*Pr(CC)*Pr(DD) = (1/2*1)*(1*1/2)*(1/2*1/2)*(1*1/2) = 1/2 5 What is Pr(AaBbCc)? = Pr(Aa)*Pr(Bb)*Pr(Cc) = (1/2)*(1/2)*(1/2) = 1/8
14
Mendel was wrong: Linkage PhenotypeGenotypeObservedExpected Purple longP-L-284215 Purple roundP-ll2171 Red longppL-2171 Red roundppll5524 Bateson and Punnet looked at flower color: P (purple) dominant over p (red ) pollen shape: L (long) dominant over l (round) Excess of PL, pl gametes over Pl, pL Departure from independent assortment
15
Linkage If genes are located on different chromosomes they (with very few exceptions) show independent assortment. Indeed, peas have only 7 chromosomes, so was Mendel lucky in choosing seven traits at random that happen to all be on different chromosomes? Problem: compute this probability. However, genes on the same chromosome, especially if they are close to each other, tend to be passed onto their offspring in the same configuation as on the parental chromosomes.
16
Consider the Bateson-Punnet pea data Let PL / pl denote that in the parent, one chromosome carries the P and L alleles (at the flower color and pollen shape loci, respectively), while the other chromosome carries the p and l alleles. Unless there is a recombination event, one of the two parental chromosome types (PL or pl) are passed onto the offspring. These are called the parental gametes. However, if a recombination event occurs, a PL/pl parent can generate Pl and pL recombinant chromosomes to pass onto its offspring.
17
Let c denote the recombination frequency --- the probability that a randomly-chosen gamete from the parent is of the recombinant type (i.e., it is not a parental gamete). For a PL/pl parent, the gamete frequencies are Gamete typeFrequencyExpectation under independent assortment PL(1-c)/21/4 pl(1-c)/21/4 pLc/21/4 Plc/21/4 Parental gametes in excess, as (1-c)/2 > 1/4 for c < 1/2 Recombinant gametes in deficiency, as c/2 < 1/4 for c < 1/2
18
Expected genotype frequencies under linkage Suppose we cross PL/pl X PL/pl parents What are the expected frequencies in their offspring? Pr(PPLL) = Pr(PL|father)*Pr(PL|mother) = [(1-c)/2]*[(1-c)/2] = (1-c) 2 /4 Recall from previous data that freq(ppll) = 55/381 =0.144 Hence, (1-c) 2 /4 = 0.144, or c = 0.24 Likewise, Pr(ppll) = (1-c) 2 /4
19
A (slightly) more complicated case Again, assume the parents are both PL/pl. Compute Pr(PpLl) Two situations, as PpLl could be PL/pl or Pl/pL Pr(PL/pl) = Pr(PL|dad)*Pr(pl|mom) + Pr(PL|mom)*Pr(pl|dad) = [(1-c)/2]*[(1-c)/2] + [(1-c)/2]*[(1-c)/2] Pr(Pl/pL) = Pr(Pl|dad)*Pr(pL|mom) + Pr(Pl|mom)*Pr(pl|dad) = (c/2)*(c/2) + (c/2)*(c/2) Thus, Pr(PpLl) = (1-c) 2 /2 + c 2 /2
20
Generally, to compute the expected genotype probabilities, need to consider the frequencies of gametes produced by both parents. Suppose dad = Pl/pL, mom = PL/pl Pr(PPLL) = Pr(PL|dad)*Pr(PL|mom) = [c/2]*[(1-c)/2] Notation: when PL/pl, we say that alleles P and L are in coupling When parent is Pl/pL, we say that P and L are in repulsion
21
Molecular Markers You and your neighbor differ at roughly 22,000,000 nucleotides (base pairs) out of the roughly 3 billion bp that comprises the human genome Hence, LOTS of molecular variation to exploit SNP -- single nucleotide polymorphism. A particular position on the DNA (say base 123,321 on chromosome 1) that has two different nucleotides (say G or A) segregating STR -- simple tandem arrays. An STR locus consists of a number of short repeats, with alleles defined by the number of repeats. For example, you might have 6 and 4 copies of the repeat on your two chromosome 7s
22
SNPs vs STRs SNPs Cons: Less polymorphic (at most 2 alleles) Pros: Low mutation rates, alleles very stable Excellent for looking at historical long-term associations (association mapping) STRs Cons: High mutation rate Pros: Very highly polymorphic Excellent for linkage studies within an extended Pedigree (QTL mapping in families or pedigrees)
23
Quantitative Genetics The analysis of traits whose variation is determined by both a number of genes and environmental factors Phenotype is highly uninformative as to underlying genotype
24
Complex (or Quantitative) trait No (apparent) simple Mendelian basis for variation in the trait May be a single gene strongly influenced by environmental factors May be the result of a number of genes of equal (or differing) effect Most likely, a combination of both multiple genes and environmental factors Example: Blood pressure, cholesterol levels –Known genetic and environmental risk factors Molecular traits can also be quantitative traits –mRNA level on a microarray analysis –Protein spot volume on a 2-D gel
25
Phenotypic distribution of a trait Consider a specific locus influencing the trait For this locus, mean phenotype = 0.15, while overall mean phenotype = 0
26
Basic model of Quantitative Genetics Basic model: P = G + E Phenotypic value -- we will occasionally also use z for this value Genotypic value Environmental value G = average phenotypic value for that genotype if we are able to replicate it over the universe of environmental values, G = E[P] G x E interaction --- G values are different across environments. Basic model now becomes P = G + E + GE
27
Q1Q1Q1Q1 Q2Q1Q2Q1 Q2Q2Q2Q2 CC + a(1+k)C + 2a CC + a + dC + 2a C -aC + dC + a 2a = G(Q 2 Q 2 ) - G(Q 1 Q 1 ) d = ak =G(Q 1 Q 2 ) - [G(Q 2 Q 2 ) + G(Q 1 Q 1 ) ]/2 d measures dominance, with d = 0 if the heterozygote is exactly intermediate to the two homozygotes k = d/a is a scaled measure of the dominance Contribution of a locus to a trait
28
Example: Apolipoprotein E & Alzheimer’s GenotypeeeEeEE Average age of onset68.475.584.3 2a = G(EE) - G(ee) = 84.3 - 68.4 --> a = 7.95 ak =d = G(Ee) - [ G(EE)+G(ee)]/2 = -0.85 k = d/a = 0.10Only small amount of dominance
29
Example: Booroola (B) gene GenotypebbBbBB Average Litter size1.482.172.66 2a = G(BB) - G(bb) = 2.66 -1.46 --> a = 0.59 ak =d = G(Bb) - [ G(BB)+G(bb)]/2 = 0.10 k = d/a = 0.17
30
Fisher’s (1918) Decomposition of G One of Fisher’s key insights was that the genotypic value consists of a fraction that can be passed from parent to offspring and a fraction that cannot. π G = X G ij ¢freq(Q i Q j ) Mean value, with Average contribution to genotypic value for allele i Since parents pass along single alleles to their offspring, the i (the average effect of allele i) represent these contributions G ij =π G +Æ i +Æ j +± ij b G ij =π G +Æ i +Æ j The genotypic value predicted from the individual allelic effects is thus G ij °G ij =± ij b Dominance deviations --- the difference (for genotype A i A j ) between the genotypic value predicted from the two single alleles and the actual genotypic value, Consider the genotypic value G ij resulting from an A i A j individual
31
G ij =π G +2Æ 1 +(Æ 2 °Æ 1 )N+± ij 2Æ 1 +(Æ 2 °Æ 1 )N= 8 > < > : 2Æ 1 forN=0;e.g,Q 1 Q 1 Æ 1 +Æ 1 forN=1;e.g,Q 1 Q 2 2Æ 1 forN=2;e.g,Q 2 Q 2 G ij =π G +Æ i +Æ j +± ij Fisher’s decomposition is a Regression Predicted value Residual error A notational change clearly shows this is a regression, Independent (predictor) variable N = # of Q 2 alleles Regression slope Intercept Regression residual
32
0 12 N G G 22 G 11 G 21 Allele Q 1 common, 2 > 1 Slope = 2 - 1 Allele Q 2 common, 1 > 2 Both Q 1 and Q 2 frequent, 1 = 2 = 0
33
GenotypeQ1Q1Q1Q1 Q2Q1Q2Q1 Q2Q2Q2Q2 Genotypic value 0a(1+k)2a Consider a diallelic locus, where p 1 = freq( Q 1 ) π G =2p 2 a(1+p 1 k) Mean Allelic effects Æ 2 =p 1 a[1+k(p 1 °p 2 )] Æ 1 =°p 2 a[1+k(p 1 °p 2 )] Dominance deviations ± ij =G ij °π G °Æ i °Æ j
34
Average effects and Additive Genetic Values A(G ij )=Æ i +Æ j A= n X k =1 ≥ Æ ( k ) i +Æ ( k ) k ¥ The values are the average effects of an allele A key concept is the Additive Genetic Value (A) of an individual Why all the fuss over A? Suppose father has A = 10 and mother has A = -2 for (say) blood pressure Expected blood pressure in their offspring is (10-2)/2 = 4 units above the population mean. Offspring A = Average of parental A’s KEY: parents only pass single alleles to their offspring. Hence, they only pass along the A part of their genotypic Value G
35
Genetic Variances G ij =π g +(Æ i +Æ j )+± ij æ 2 (G)= n X k =1 æ 2 (Æ ( k ) i +Æ ( k ) j )+ n X k =1 æ 2 (± ( k ) ij ) æ 2 G =æ 2 A +æ 2 D æ 2 (G)=æ 2 (π g +(Æ i +Æ j )+± ij )=æ 2 (Æ i +Æ j )+æ 2 (± ij ) As Cov( ) = 0 Additive Genetic Variance (or simply Additive Variance) Dominance Genetic Variance (or simply dominance variance)
36
Key concepts (so far) i = average effect of allele i –Property of a single allele in a particular population (depends on genetic background) A = Additive Genetic Value (A) –A = sum (over all loci) of average effects –Fraction of G that parents pass along to their offspring –Property of an Individual in a particular population Var(A) = additive genetic variance –Variance in additive genetic values –Property of a population Can estimate A or Var(A) without knowing any of the underlying genetical detail (forthcoming)
37
æ 2 D =2E[± 2 ]= m X i =1 m X j =1 ± 2 ij p i p j æ 2 D =(2p 1 p 2 ak) 2 æ 2 A =2p 1 p 2 a 2 [1+k(p 1 °p 2 )] 2 One locus, 2 alleles: Q 1 Q 1 Q 1 Q 2 Q 2 Q 2 0 a(1+k) 2a Dominance effects additive variance When dominance present, asymmetric function of allele frequencies Equals zero if k = 0 This is a symmetric function of allele frequencies æ 2 A =2E[Æ 2 ]=2 m X i =1 Æ 2 i p i Since E[ ] = 0, Var( ) = E[( - a ) 2 ] = E[ 2 ]
38
Additive variance, V A, with no dominance (k = 0) Allele frequency, p VAVA
39
Complete dominance (k = 1) Allele frequency, p VAVA VDVD
40
Epistasis Additive Genetic value Dominance value -- interaction between the two alleles at a locus Additive x Additive interactions -- interactions between a single allele at one locus with a single allele at another Additive x Dominant interactions -- interactions between an allele at one locus with the genotype at another, e.g. allele A i and genotype B kj Dominance x dominance interaction --- the interaction between the dominance deviation at one locus with the dominance deviation at another. These components are defined to be uncorrelated, (or orthogonal), so that æ 2 G =æ 2 A +æ 2 D +æ 2 AA +æ 2 AD +æ 2 DD
41
Resemblance Between Relatives
42
Heritability Central concept in quantitative genetics Proportion of variation due to additive genetic values (Breeding values) –h 2 = V A /V P –Phenotypes (and hence V P ) can be directly measured –Breeding values (and hence V A ) must be estimated Estimates of V A require known collections of relatives
43
Ancestral relatives e.g., parent and offspring Collateral relatives, e.g. sibs
44
Full-sibs Half-sibs
45
Key observations The amount of phenotypic resemblance among relatives for the trait provides an indication of the amount of genetic variation for the trait. If trait variation has a significant genetic basis, the closer the relatives, the more similar their appearance
46
Genetic Covariance between relatives Genetic covariances arise because two related individuals are more likely to share alleles than are two unrelated individuals. Sharing alleles means having alleles that are identical by descent (IBD): both copies of can be traced back to a single copy in a recent common ancestor. No alleles IBD One allele IBD Both alleles IBD
47
Parent-offspring genetic covariance Cov(G p, G o ) --- Parents and offspring share EXACTLY one allele IBD Denote this common allele by A 1 G p =A p +D p =Æ 1 +Æ x +D 1 x G o =A o +D o =Æ 1 +Æ y +D 1 y IBD allele Non-IBD alleles
48
All white covariance terms are zero. By construction, and D are uncorrelated By construction, from non-IBD alleles are uncorrelated By construction, D values are uncorrelated unless both alleles are IBD
49
Cov(Æ x ;Æ y )= Ω 0ifx6=y;i.e.,notIBD Var(A)=2ifx=y;i.e.,IBD Var(A)=Var(Æ 1 +Æ 2 )=2Var(Æ 1 ) sothat Var(Æ 1 )=Cov(Æ 1 ;Æ 1 )=Var(A)=2 Hence, relatives sharing one allele IBD have a genetic covariance of Var(A)/2 The resulting parent-offspring genetic covariance becomes Cov(G p,G o ) = Var(A)/2
50
Half-sibs The half-sibs share one allele IBD occurs with probability 1/2 The half-sibs share no alleles IBD occurs with probability 1/2 Each sib gets exactly one allele from common father, different alleles from the different mothers Hence, the genetic covariance of half-sibs is just (1/2)Var(A)/2 = Var(A)/4
51
Full-sibs Paternal allele not IBD [ Prob = 1/2 ] Maternal allele not IBD [ Prob = 1/2 ] -> Prob(zero alleles IBD) = 1/2*1/2 = 1/4 Paternal allele IBD [ Prob = 1/2 ] Maternal allele IBD [ Prob = 1/2 ] -> Prob(both alleles IBD) = 1/2*1/2 = 1/4 Prob(exactly one allele IBD) = 1/2 = 1- Prob(0 IBD) - Prob(2 IBD) Each sib gets exact one allele from each parent
52
IBD alleles ProbabilityContribution 01/40 1 2Var(A)/2 2 4Var(A) +Var(D) Resulting Genetic Covariance between full-sibs Cov(Full-sibs) = Var(A)/2 + Var(D)/4
53
Genetic Covariances for General Relatives Let r = (1/2)Prob(1 allele IBD) + Prob(2 alleles IBD) Let u = Prob(both alleles IBD) General genetic covariance between relatives Cov(G) = rVar(A) + uVar(D) When epistasis is present, additional terms appear r 2 Var(AA) + ruVar(AD) + u 2 Var(DD) + r 3 Var(AAA) +
54
Components of the Environmental Variance E = E c + E s Total environmental value Common environmental value experienced by all members of a family, e.g., shared maternal effects Specific environmental value, any unique environmental effects experienced by the individual V E = V Ec + V Es The Environmental variance can thus be written in terms of variance components as One can decompose the environmental further, if desired. For example, plant breeders have terms for the location variance, the year variance, and the location x year variance.
55
Shared Environmental Effects contribute to the phenotypic covariances of relatives Cov(P 1,P 2 ) = Cov(G 1 +E 1,G 2 +E 2 ) = Cov(G 1,G 2 ) + Cov(E 1,E 2 ) Shared environmental values are expected when sibs share the same mom, so that Cov(Full sibs) and Cov(Maternal half-sibs) not only contain a genetic covariance, but an environmental covariance as well, V Ec Cov(Full-sibs) = Var(A)/2 + Var(D)/4 + V Ec
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.