Published by Charity Randall. Modified over 9 years ago.
1
Introduction to Linkage Analysis
Pak Sham
Twin Workshop 2003
2
Human Genome
22 autosomes, XY
$3 \times 10^9$ base-pairs (2 metres long)
2% coding sequences, rest regulatory & “junk”
30,000 - 40,000 genes
Much commonality with other species
3
Genetic Variation
Chromosomal abnormalities
  Duplication (e.g. Down’s)
  Deletion (e.g. Velo-cardio-facial syndrome)
Major deleterious mutations
  Usually rare (e.g. Huntington’s)
Polymorphisms
  Single nucleotide polymorphisms (SNPs)
  Variable length repeats (e.g. microsatellites)
  Some are functional (“normal variation”)
  Most are non-functional (neutral markers)
4
Genetic Mapping of Disease
Levels of Genetic Analysis
  Estimate heritability (family, twins, adoption)
  Find chromosomal locations (linkage)
  Identify risk variants (association)
  Understand mechanisms (cell biology, etc)
Applications
  Prediction of genetic risk
  More accurate prediction of genetic risk
  Even more accurate prediction of genetic risk; prediction of prognosis and treatment response
  Development of new drug targets
5
Strategies of Gene Mapping
Functional
  Uses knowledge of disease to identify candidate genes
  Finds variants in candidate genes
  Looks for association between variants and disease
Positional
  Systematic screen of whole genome
  Uses a set of 400 evenly-spaced markers
  Looks for markers which co-segregate with disease
6
Co-segregation
[Pedigree diagram; marker genotypes shown: A2A4, A3A4, A1A3, A1A2, A2A3, A1A2, A1A4, A3A4, A3A2]
Marker allele A1 co-segregates with dominant disease
7
Linkage Co-segregation
[Diagram: parent and gametes]
Alleles on the same chromosome tend to stay together in meiosis; therefore they tend to be co-transmitted.
8
Crossing over between homologous chromosomes
9
Map Distance
Map distance between two loci (Morgans) = Expected number of crossovers per meiosis (1 Morgan = 100 centiMorgans)
Note: Map distances are additive
Heterogeneity in recombination frequencies
Total map length: 33 Morgans (1 cM ≈ $10^6$ base pairs)
10
Recombination
[Diagram: parental genotypes with alleles A1, A2 at one locus and Q1, Q2 at the other; non-recombinant gametes labelled $1-\theta$, recombinant gametes labelled $\theta$]
11
Recombination Fraction
Recombination fraction ($\theta$) between two loci = Proportion of gametes that are recombinant with respect to the two loci
12
Recombination & map distance
Haldane map function
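The Haldane map function relates a map distance m (in Morgans) to the recombination fraction θ, under the assumption that crossovers occur without interference: θ = (1 - e^(-2m)) / 2. A minimal Python sketch of the function and its inverse follows; the function names are illustrative and do not come from the slides.

```python
import math

def haldane_theta(m_morgans: float) -> float:
    """Recombination fraction for a map distance in Morgans (no interference assumed)."""
    return 0.5 * (1.0 - math.exp(-2.0 * m_morgans))

def haldane_distance(theta: float) -> float:
    """Inverse map function: distance in Morgans for a recombination fraction below 0.5."""
    if not 0.0 <= theta < 0.5:
        raise ValueError("theta must lie in [0, 0.5)")
    return -0.5 * math.log(1.0 - 2.0 * theta)

# For short distances theta is close to m; for long distances it approaches 0.5.
for m in (0.01, 0.1, 0.5, 2.0):
    print(m, round(haldane_theta(m), 4))
```

Because map distances are additive while recombination fractions are not, a natural way to combine adjacent intervals is to convert to distances, add, and convert back.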
13
Double Backcross: Fully Informative Gametes
[Diagram: AABB × aabb gives AaBb; AaBb × aabb gives offspring AaBb and aabb (non-recombinant), Aabb and aaBb (recombinant)]
14
Linkage Analysis: Fully Informative Gametes
Count data: Recombinant gametes: R; Non-recombinant gametes: N
Parameter: Recombination fraction: $\theta$
Likelihood: $L(\theta) = \theta^R (1-\theta)^N$
Estimation: $\hat{\theta} = R/(N+R)$
Chi-square: $\chi^2 = (N-R)^2/(N+R)$
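With R recombinant and N non-recombinant gametes, and θ denoting the recombination fraction, the likelihood is binomial in form, so the estimate of θ is the observed proportion of recombinants. The sketch below works through this with hypothetical counts. The function names are illustrative, and the chi-square shown is the usual 1 df Pearson statistic against θ = 1/2, which is an assumption about the test intended on the slide.

```python
import math

def likelihood(theta: float, r: int, n: int) -> float:
    """L(theta) = theta^R * (1 - theta)^N for R recombinants and N non-recombinants."""
    return theta ** r * (1.0 - theta) ** n

def theta_mle(r: int, n: int) -> float:
    """Maximum-likelihood estimate: the observed proportion of recombinant gametes."""
    return r / (r + n)

def chi_square(r: int, n: int) -> float:
    """Pearson chi-square (1 df) against theta = 1/2, i.e. equal expected counts."""
    return (n - r) ** 2 / (n + r)

def lod(theta: float, r: int, n: int) -> float:
    """log10 of the likelihood ratio L(theta) / L(1/2)."""
    return math.log10(likelihood(theta, r, n) / likelihood(0.5, r, n))

# Hypothetical counts: 2 recombinant and 8 non-recombinant gametes.
r, n = 2, 8
theta_hat = theta_mle(r, n)
print(theta_hat, chi_square(r, n), round(lod(theta_hat, r, n), 3))
```

Direct powers are fine for small counts like these; with many meioses a log-likelihood form would avoid underflow.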
15
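Phase Unknown Meioses
[Diagram: AaBb × aabb with offspring AaBb, aabb, Aabb, aaBb. Either: AaBb and aabb are non-recombinant, Aabb and aaBb are recombinant. Or: AaBb and aabb are recombinant, Aabb and aaBb are non-recombinant]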
16
Mixture distribution likelihood
The probability of observed data X depends on the status of a discrete variable G: $P(X|G)$
The status of G is not observed, but the probability distribution of G is available: $P(G)$
Then the likelihood of the observed data X is $P(X) = \sum_G P(X|G)\,P(G)$
17
Linkage Analysis: Phase-unknown Meioses
Count data: Recombinant gametes: X, Non-recombinant gametes: Y; or Recombinant gametes: Y, Non-recombinant gametes: X
Likelihood: $L(\theta) = \theta^X (1-\theta)^Y + \theta^Y (1-\theta)^X$
An example of incomplete data: mixture distribution likelihood function
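When phase is unknown, the likelihood sums over the two possible phases, each of which swaps the roles of the counts X and Y. The sketch below evaluates that two-term likelihood for a recombination fraction θ and finds its maximum on [0, 0.5] by a crude grid search. The function names and the grid search are illustrative choices, not something the slides prescribe.

```python
import math

def phase_unknown_likelihood(theta: float, x: int, y: int) -> float:
    """theta^X (1-theta)^Y + theta^Y (1-theta)^X: one term per possible phase."""
    return theta ** x * (1 - theta) ** y + theta ** y * (1 - theta) ** x

def phase_unknown_lod(theta: float, x: int, y: int) -> float:
    """log10 likelihood ratio against theta = 1/2 (constant factors cancel)."""
    return math.log10(phase_unknown_likelihood(theta, x, y)
                      / phase_unknown_likelihood(0.5, x, y))

def grid_mle(x: int, y: int, steps: int = 500) -> float:
    """Grid search for the theta in [0, 0.5] that maximises the likelihood."""
    grid = [0.5 * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda t: phase_unknown_likelihood(t, x, y))

# Hypothetical family: five offspring split 0 versus 5 between the two classes.
print(grid_mle(0, 5), round(phase_unknown_lod(0.0, 0, 5), 3))
```

In this example the maximum lod is log10(16), whereas five phase-known non-recombinants would give log10(32); the difference of log10(2) is the price of not knowing phase.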
18
Parental genotypes unknown
Likelihood will be a function of allele frequencies (population parameters) and $\theta$ (transmission parameter)
[Diagram: offspring AaBb, aabb, Aabb, aaBb]
19
Complex Phenotypes
Penetrance parameters (probability of each phenotype given genotype):
  Genotype AA: Disease $f_2$, Normal $1-f_2$
  Genotype Aa: Disease $f_1$, Normal $1-f_1$
  Genotype aa: Disease $f_0$, Normal $1-f_0$
Each phenotype is compatible with multiple genotypes.
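Because each phenotype is compatible with several genotypes, the probability of a phenotype is a sum over genotypes weighted by their frequencies. The sketch below assumes Hardy-Weinberg genotype frequencies for an allele A with frequency p, which is an assumption added here rather than something stated on the slide. The penetrances f2, f1, f0 correspond to genotypes AA, Aa, aa, and the numerical values are hypothetical.

```python
def genotype_frequencies(p: float) -> dict:
    """Hardy-Weinberg genotype frequencies for allele A with frequency p (assumed)."""
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}

def prob_disease(p: float, f2: float, f1: float, f0: float) -> float:
    """P(Disease) = sum over genotypes of P(genotype) * penetrance."""
    freqs = genotype_frequencies(p)
    penetrance = {"AA": f2, "Aa": f1, "aa": f0}
    return sum(freqs[g] * penetrance[g] for g in freqs)

def genotype_given_disease(p: float, f2: float, f1: float, f0: float) -> dict:
    """Posterior genotype probabilities for an affected individual (Bayes' rule)."""
    freqs = genotype_frequencies(p)
    penetrance = {"AA": f2, "Aa": f1, "aa": f0}
    total = prob_disease(p, f2, f1, f0)
    return {g: freqs[g] * penetrance[g] / total for g in freqs}

# Hypothetical dominant model: rare allele, high penetrance, some phenocopies.
print(prob_disease(0.01, 0.8, 0.8, 0.01))
print(genotype_given_disease(0.01, 0.8, 0.8, 0.01))
```

With these illustrative values, a sizeable share of affected individuals carry genotype aa, which shows why an affected phenotype does not pin down the genotype.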
20
General Pedigree Likelihood
Likelihood is a sum of products (mixture distribution likelihood)
Number of terms = $(m_1 m_2 \cdots m_k)^{2n}$, where $m_j$ is the number of alleles at locus $j$
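The count of terms grows very quickly, which a short calculation makes concrete. In the sketch below, n is taken to be the number of pedigree members; the transcript does not define n, so that reading is an assumption, as is the function name.

```python
import math

def number_of_terms(alleles_per_locus: list, n: int) -> int:
    """(m_1 * m_2 * ... * m_k) ** (2n) terms in the brute-force likelihood sum."""
    return math.prod(alleles_per_locus) ** (2 * n)

# Hypothetical case: a single 4-allele marker in a pedigree of 10 people.
print(number_of_terms([4], 10))
```

Even this small hypothetical case yields about 10^12 terms, which motivates algorithms that avoid enumerating every genotype configuration.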
21
Elston-Stewart algorithm
Reduces computations by peeling:
  Step 1: Condition likelihoods of family 1 on genotype of X
  Step 2: Joint likelihood of families 2 and 1
[Diagram: families 1 and 2 connected through individual X]
22
Lod Score: Morton (1955)
Lod > 3: conclude linkage
  Prior odds 1:50, linkage ratio 1000, posterior odds 20:1
Lod < -2: exclude linkage
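The arithmetic behind the lod > 3 threshold is that a lod of 3 corresponds to a likelihood ratio of 10^3 = 1000, and multiplying prior odds of 1:50 by 1000 gives posterior odds of 20:1. A minimal sketch of that calculation, with an illustrative function name:

```python
def posterior_odds(prior_odds: float, lod: float) -> float:
    """Posterior odds of linkage = prior odds * likelihood ratio, where ratio = 10 ** lod."""
    return prior_odds * 10 ** lod

print(posterior_odds(1 / 50, 3))    # odds in favour of linkage at the lod > 3 threshold
print(posterior_odds(1 / 50, -2))   # odds in favour of linkage at the lod < -2 threshold
```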
23
Lod Score Curves
[Plot: lod score curve; axis labels lod, 0 and 0.5]
Lod score curves are additive over pedigrees
24
Lods, chi-squares & p-values
In large samples: $2 \log_e(10) \times \text{Max lod} \sim \chi^2_1$
In small samples: $P \le 10^{-\text{Max lod}}$
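The large-sample conversion multiplies the maximum lod by 2 log_e(10), roughly 4.6, and refers the result to a chi-square distribution with 1 degree of freedom. The sketch below carries this out with the standard library only, using the identity that the upper tail of a 1 df chi-square equals erfc(sqrt(x / 2)). Function names are illustrative, and any one-sided halving of the p-value is deliberately left out because the slide does not mention it.

```python
import math

def lod_to_chi_square(max_lod: float) -> float:
    """Large-sample statistic: 2 * ln(10) * max lod."""
    return 2.0 * math.log(10.0) * max_lod

def chi_square_1df_p_value(chi_sq: float) -> float:
    """Upper-tail probability of a chi-square with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_sq / 2.0))

def small_sample_bound(max_lod: float) -> float:
    """Upper bound on the p-value: 10 ** (-max lod)."""
    return 10.0 ** (-max_lod)

for max_lod in (1.0, 2.0, 3.0):
    chi_sq = lod_to_chi_square(max_lod)
    print(max_lod, round(chi_sq, 3), chi_square_1df_p_value(chi_sq), small_sample_bound(max_lod))
```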
25
Problems with parametric linkage
Requires parameters of the disease model to be specified
  Allele frequency
  Penetrances
  These are generally unknown for a complex trait
Disease model assumes that a single locus is the only source of familial resemblance
  This is generally unrealistic
26
Linkage Analysis Admixture Test (CAB Smith)
Model: Probability of linkage in family = $\alpha$
Likelihood: $L(\alpha, \theta) = \alpha L(\theta) + (1-\alpha) L(\theta = 1/2)$
Note: Another example of mixture likelihood
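In the admixture model each family is linked with probability α, so its likelihood is a two-component mixture of the linked likelihood at recombination fraction θ and the unlinked likelihood at θ = 1/2. The sketch below combines families by summing log10 likelihood ratios. It assumes that families are independent and that each contributes phase-known counts of recombinant and non-recombinant gametes; both are simplifications chosen for illustration, and the data are hypothetical.

```python
import math

def family_likelihood(theta: float, r: int, n: int) -> float:
    """Phase-known likelihood for r recombinants and n non-recombinants."""
    return theta ** r * (1.0 - theta) ** n

def admixture_lod(alpha: float, theta: float, families: list) -> float:
    """Sum over families of log10[(alpha*L(theta) + (1-alpha)*L(1/2)) / L(1/2)]."""
    total = 0.0
    for r, n in families:
        linked = family_likelihood(theta, r, n)
        unlinked = family_likelihood(0.5, r, n)
        total += math.log10((alpha * linked + (1.0 - alpha) * unlinked) / unlinked)
    return total

# Hypothetical data: two families that look linked, two that look unlinked.
families = [(0, 6), (1, 7), (3, 3), (4, 4)]
for alpha in (1.0, 0.5):
    print(alpha, round(admixture_lod(alpha, 0.05, families), 3))
```

Setting α = 1 recovers the ordinary homogeneity lod, so comparing the two maximised values gives a sense of how much evidence there is for heterogeneity.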
27
Linkage Analysis: MOD
Maximise lod score over several sets of disease models, e.g. dominant, recessive, additive
Make correction for multiple (k) models
Adjusted lod = lod - $\log_{10}(k)$
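The correction subtracts log10(k) from the best lod found across k disease models. A minimal sketch with hypothetical lod scores; the function name and the values are illustrative only.

```python
import math

def adjusted_mod(lods_by_model: dict) -> float:
    """Maximum lod over the models tried, minus log10 of the number of models."""
    k = len(lods_by_model)
    return max(lods_by_model.values()) - math.log10(k)

# Hypothetical maximum lods under three disease models.
lods = {"dominant": 3.2, "recessive": 1.1, "additive": 2.7}
print(round(adjusted_mod(lods), 3))
```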
28
Allele sharing (non-parametric) methods
Penrose (1935): Sib Pair linkage
For rare disease, IBD sharing by sib-pair type: Concordant affected, Concordant normal, Discordant
Therefore affected sib pair (ASP) design efficient
Test $H_0$: Proportion of alleles IBD = 1/2; $H_A$: Proportion of alleles IBD > 1/2
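One common way to test whether the proportion of alleles shared IBD exceeds 1/2 in affected sib pairs is the so-called mean test. The slides do not say which statistic was intended, so the sketch below is only one possible choice. It assumes IBD status (0, 1 or 2 alleles) is known with certainty for every pair, in which case the per-pair sharing proportion has mean 1/2 and variance 1/8 under the null hypothesis. The counts are hypothetical.

```python
import math

def asp_mean_test(n0: int, n1: int, n2: int):
    """One-sided z-test of mean IBD proportion > 1/2 from counts of pairs sharing 0, 1, 2 alleles."""
    n = n0 + n1 + n2
    pi_bar = (0.5 * n1 + n2) / n                 # observed mean proportion of alleles IBD
    z = (pi_bar - 0.5) / math.sqrt(1.0 / (8.0 * n))
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))  # upper tail of the standard normal
    return pi_bar, z, p_value

# Hypothetical sample of 100 affected sib pairs.
print(asp_mean_test(15, 50, 35))
```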
29
Correlation between IBD of two loci
For sib pairs, $\mathrm{Corr}(\pi_A, \pi_B) = (1-2\theta_{AB})^2$
Attenuation of linkage signal with increasing genetic distance from disease locus
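For sib pairs the correlation between IBD sharing at two loci separated by recombination fraction θ is (1 - 2θ)^2, so the signal fades quickly as a marker moves away from the disease locus. A short sketch tabulating this; the function name is illustrative.

```python
def ibd_correlation(theta_ab: float) -> float:
    """Correlation of sib-pair IBD sharing at loci A and B: (1 - 2*theta_AB) ** 2."""
    return (1.0 - 2.0 * theta_ab) ** 2

for theta in (0.0, 0.05, 0.1, 0.2, 0.5):
    print(theta, round(ibd_correlation(theta), 3))
```

Under the Haldane map function, 1 - 2θ equals e^(-2m) for map distance m in Morgans, so this correlation decays as e^(-4m).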
30
Joint distribution of Pedigree IBD
IBD of relative pairs are not independent
  e.g. if IBD(1,2) = 2 and IBD(1,3) = 2 then IBD(2,3) = 2
Inheritance vector gives joint IBD distribution
  Each element indicates whether paternally inherited allele is transmitted (1) or maternally inherited allele is transmitted (0)
  Vector of 2N elements (N = # of non-founders)
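The dependence between pairwise IBD values can be checked by enumerating inheritance vectors. The sketch below handles the simplest case, a sibship with two founder parents, and assumes a layout in which sib k occupies elements 2k (the father's meiosis) and 2k+1 (the mother's meiosis) of the vector. That layout and the function names are assumptions made for illustration.

```python
from itertools import product

def sib_ibd(vector: tuple, i: int, j: int) -> int:
    """Number of alleles sibs i and j share IBD, read from an inheritance vector."""
    same_paternal = vector[2 * i] == vector[2 * j]          # same allele from the father
    same_maternal = vector[2 * i + 1] == vector[2 * j + 1]  # same allele from the mother
    return int(same_paternal) + int(same_maternal)

def transitivity_holds(n_sibs: int = 3) -> bool:
    """Check over all vectors: IBD(0,1) = 2 and IBD(0,2) = 2 imply IBD(1,2) = 2."""
    for vector in product((0, 1), repeat=2 * n_sibs):
        if sib_ibd(vector, 0, 1) == 2 and sib_ibd(vector, 0, 2) == 2:
            if sib_ibd(vector, 1, 2) != 2:
                return False
    return True

example = (1, 0, 1, 0, 0, 0)  # hypothetical vector for three sibs
print(sib_ibd(example, 0, 1), sib_ibd(example, 0, 2), sib_ibd(example, 1, 2))
print(transitivity_holds())
```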
31
Inheritance Vector: An Example
[Pedigree diagram with ordered genotypes 1/2, 3/4, 1/3, 1/4, 2/3, 2/4]
Ordered genotype notation: 1st allele = paternally inherited, 2nd allele = maternally inherited
Inheritance vector = (1, 1, 1, 0, 1, 0)
32
Pedigree allele-sharing methods
APM: Affected Pedigree Members: Uses IBS
  very sensitive to allele frequency mis-specification
  less powerful than IBD-based methods
NPL: Non-Parametric Linkage (Genehunter)
  Conservative at positions between markers
LRT: “Delta parameter” (Genehunter+, Allegro)
All these methods consider affected members only
33
Variance Components Linkage
Models trait values of pedigree members jointly
Assumes multivariate normality conditional on IBD
Covariance between relative pairs = $Vr + V_Q[\pi - E(\pi)]$
Where
  $V$ = trait variance
  $r$ = correlation (depends on relationship)
  $V_Q$ = QTL additive variance
  $E(\pi)$ = expected proportion IBD
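The covariance between two relatives rises or falls with how much they actually share IBD at the test position, relative to what is expected for their relationship. The sketch below evaluates the expression for a sib pair, writing π for the proportion of alleles shared IBD and using an expected value of 0.5 for sibs. The parameter values are hypothetical and the function name is illustrative.

```python
def pair_covariance(v: float, r: float, v_q: float, pi: float, expected_pi: float) -> float:
    """Trait covariance for a relative pair: V*r + V_Q * (pi - E(pi))."""
    return v * r + v_q * (pi - expected_pi)

# Hypothetical values: trait variance 1.0, sib correlation 0.4, QTL variance 0.2.
for pi in (0.0, 0.5, 1.0):
    print(pi, round(pair_covariance(1.0, 0.4, 0.2, pi, 0.5), 3))
```

When the QTL variance is zero the covariance no longer depends on IBD sharing, which corresponds to the null hypothesis of no linkage.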
34
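Path Diagram for Sib-Pair QTL model
[Path diagram: phenotypes $P_{T1}$ and $P_{T2}$, each influenced by latent factors Q, S and N through paths q, s and n; the two sibs' latent factors are linked by correlations 1 and [0 / 0.5 / 1]]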
35
Incomplete Marker Information
IBD sharing cannot always be deduced from marker genotypes with certainty
Obtain probabilities of IBD values ($Z_0$, $Z_1$, $Z_2$)
  Finite mixture likelihood
  Pi-hat likelihood
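Given probabilities Z0, Z1, Z2 that a pair shares 0, 1 or 2 alleles IBD, the two approaches differ in where the averaging happens. The finite mixture likelihood averages the likelihood over the three IBD states, while the pi-hat approach first forms the expected sharing proportion, Z1/2 + Z2, and evaluates the likelihood once at that value. The sketch below shows both; the function names and the toy likelihood are hypothetical stand-ins for a real trait model.

```python
def pi_hat(z0: float, z1: float, z2: float) -> float:
    """Expected proportion of alleles shared IBD: Z1/2 + Z2."""
    return 0.5 * z1 + z2

def finite_mixture_likelihood(z: tuple, likelihood_at_pi) -> float:
    """Sum over IBD states k = 0, 1, 2 of Z_k * L(data | pi = k/2)."""
    return sum(zk * likelihood_at_pi(k / 2.0) for k, zk in enumerate(z))

def pi_hat_likelihood(z: tuple, likelihood_at_pi) -> float:
    """Likelihood evaluated once at the estimated sharing proportion."""
    return likelihood_at_pi(pi_hat(*z))

def toy_likelihood(pi: float) -> float:
    """Hypothetical likelihood of a pair's trait data as a function of pi."""
    return 0.1 + 0.2 * pi * pi

z = (0.1, 0.5, 0.4)
print(pi_hat(*z), finite_mixture_likelihood(z, toy_likelihood), pi_hat_likelihood(z, toy_likelihood))
```

The two likelihoods coincide when the likelihood is linear in π or when IBD is known with certainty, and differ otherwise.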
36
Pi-hat Model
[Path diagram: phenotypes $P_{T1}$ and $P_{T2}$, latent factors Q, S and N, paths q, s and n, and a correlation of 1]
37
Parametric / Allele Sharing
[Diagram: trait data, marker data and IBD sharing, with parametric and allele-sharing analyses]