Presentation transcript:

Grouping loci
Criteria:
Maximum two-point recombination fraction r_ij
– Example: r_ij ≤ 0.40
Minimum LOD score Z_ij
– For n loci, there are n(n-1)/2 possible pairwise combinations that will be tested
– Expect a nonzero probability of false positives across that many tests
Significant probability value p_ij
– Example: p_ij ≤ [value missing]

Locus ordering
Ideally, we would estimate the likelihoods for all possible orders and take the one that is most probable by comparing log likelihoods. That is computationally inefficient when there are more than ~10 loci. Several methods have been proposed for producing a preliminary order.

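Locus ordering
Number of orders among k loci: k!/2 (an order and its mirror image describe the same map)
Number of triplets among k loci: k(k-1)(k-2)/6
For example, k = 10 loci give 1,814,400 possible orders, and k = 40 loci give 9,880 triplets.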
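The growth of the search space can be checked with a few lines of Python. This is a minimal sketch, assuming the counts are k!/2 orders and k-choose-3 triplets; the function names `n_orders` and `n_triplets` are illustrative and not from the slides.

```python
from math import comb, factorial


def n_orders(k: int) -> int:
    """Distinct locus orders among k loci, counting an order and its mirror once."""
    return factorial(k) // 2 if k > 1 else 1


def n_triplets(k: int) -> int:
    """Number of three-locus subsets among k loci."""
    return comb(k, 3)


# Print how fast both quantities grow with the number of loci.
for k in (3, 5, 10, 20, 40):
    print(f"k={k:>2}  orders={n_orders(k):.3e}  triplets={n_triplets(k)}")
```

The number of orders grows factorially while the number of triplets grows only cubically, which is why exhaustive likelihood comparison becomes impractical beyond roughly ten loci.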

Three-point analysis
Number of unique orders among k loci: k!/2
For three loci (k = 3) there are three unique orders, because each order and its mirror are equivalent:
Order   Mirror order
ABC     CBA
ACB     BCA
BAC     CAB

Three-point analysis

Non-additivity of recombination frequencies
Loci A, B, and C lie in that order, with pairwise recombination frequencies r_AB, r_BC, and r_AC.
The recombination frequency over the interval A-C (r_AC) is less than the sum of r_AB and r_BC: r_AC < r_AB + r_BC. This is because (rare) double recombination events (a recombination in both A-B and B-C) do not contribute to recombination between A and C.

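Non-additivity of recombination frequencies
If recombination events in the two intervals are independent, the four gamete classes have the following probabilities (first index: recombination in A-B, second index: recombination in B-C; 1 = recombinant, 0 = non-recombinant):
P_00 = (1 - r_AB)(1 - r_BC)
P_10 = r_AB (1 - r_BC)
P_01 = (1 - r_AB) r_BC
P_11 = r_AB r_BC
Only the single-recombinant classes separate A from C, so:
r_AC = P_10 + P_01 = r_AB (1 - r_BC) + (1 - r_AB) r_BC
r_AC = r_AB + r_BC - 2 r_AB r_BC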

Interference
Interference means that recombination events in adjacent intervals are not independent. The occurrence of an event in a given interval may reduce or enhance the occurrence of an event in its neighbourhood.
Positive interference refers to the 'suppression' of recombination events in the neighbourhood of a given one. Negative interference refers to the opposite: enhancement of clusters of recombination events.
Positive interference results in fewer double recombinants (over adjacent intervals) than expected on the basis of independence of recombination events.
With a coefficient of coincidence C, the recombination frequency across both intervals becomes:
r_AC = r_AB + r_BC - 2C r_AB r_BC
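The relation between the flanking recombination frequency and the two interval frequencies can be illustrated numerically. The sketch below is only an illustration: the function name `r_flanking` and the input values 0.1 and 0.2 are assumptions chosen for the example, not data from the slides.

```python
def r_flanking(r_ab: float, r_bc: float, c: float = 1.0) -> float:
    """Recombination frequency between A and C for the order A-B-C.

    c is the coefficient of coincidence; c = 1 means no interference.
    """
    return r_ab + r_bc - 2.0 * c * r_ab * r_bc


# Illustrative (assumed) interval frequencies.
r_ab, r_bc = 0.1, 0.2

# No interference: build r_AC from the single-recombinant classes.
p10 = r_ab * (1 - r_bc)
p01 = (1 - r_ab) * r_bc
print(p10 + p01, r_flanking(r_ab, r_bc))      # both routes give the same value

# Complete interference (c = 0): no double recombinants, frequencies add up.
print(r_flanking(r_ab, r_bc, c=0.0))
```

With complete interference the frequencies become additive, which is the limiting case in which map distance and recombination frequency coincide.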

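Interference
Parental haplotypes: A B C and a b c
C = coefficient of coincidence
Interference: I = 1 - C
Coefficient of coincidence: C = observed number of double crossovers / expected number of double crossovers
Expected number of double crossovers = r_AB r_BC N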

Observed Count: DH population N=100, locus order ABC

Interference
No interference
– C = 1 and Interference = 1 - C = 0
Complete interference
– C = 0 and Interference = 1 - C = 1
Negative interference
– C > 1 and Interference = 1 - C < 0
Positive interference
– C < 1 and Interference = 1 - C > 0
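As a hedged sketch, the coefficient of coincidence can be computed as observed over expected double crossovers and then classified using the cases above. The inputs used here (21 observed double recombinants, r_AB = 0.22, r_BC = 0.30, N = 100) are taken from the worked DH example elsewhere in this transcript for the ABC order; the helper names are illustrative.

```python
def coincidence(n_double: int, r1: float, r2: float, n: int) -> float:
    """Coefficient of coincidence: observed / expected double crossovers."""
    expected = r1 * r2 * n
    return n_double / expected


def classify(c: float, tol: float = 1e-9) -> str:
    """Name the interference regime for a given coefficient of coincidence."""
    if abs(c - 1.0) <= tol:
        return "no interference"
    if abs(c) <= tol:
        return "complete interference"
    return "negative interference" if c > 1.0 else "positive interference"


# ABC order in the worked example: AbC + aBc = 21 double recombinants.
c_abc = coincidence(21, 0.22, 0.30, 100)
print(round(c_abc, 2), round(1 - c_abc, 2), classify(c_abc))
```

A value of C well above 1 under a candidate order, as here, is itself a hint that the order may be wrong, because far more "double recombinants" are observed than independence would predict.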

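Three locus analysis, DH population
Expected frequencies for the ABC locus order. r_1 = recombination fraction in interval A-B, r_2 = recombination fraction in interval B-C, C = coefficient of coincidence. NR = non-recombinant, SC 1 = single crossover in interval 1, SC 2 = single crossover in interval 2, DC 12 = double crossover.

Genotype   Observed count   Class   Without interference      With interference
ABC/ABC    f_1              NR      0.5(1 - r_1)(1 - r_2)     0.5(1 - r_1 - r_2 + C r_1 r_2)
ABc/ABc    f_2              SC 2    0.5(1 - r_1) r_2          0.5(r_2 - C r_1 r_2)
AbC/AbC    f_3              DC 12   0.5 r_1 r_2               0.5 C r_1 r_2
Abc/Abc    f_4              SC 1    0.5 r_1 (1 - r_2)         0.5(r_1 - C r_1 r_2)
aBC/aBC    f_5              SC 1    0.5 r_1 (1 - r_2)         0.5(r_1 - C r_1 r_2)
aBc/aBc    f_6              DC 12   0.5 r_1 r_2               0.5 C r_1 r_2
abC/abC    f_7              SC 2    0.5(1 - r_1) r_2          0.5(r_2 - C r_1 r_2)
abc/abc    f_8              NR      0.5(1 - r_1)(1 - r_2)     0.5(1 - r_1 - r_2 + C r_1 r_2)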

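MLE of two-locus recombination fractions
Observed counts and expected frequencies for the ABC locus order (N = 100):

Genotype   Observed count   Expected frequency
ABC/ABC    f_1 = 34         0.5(1 - r_1 - r_2 + C r_1 r_2)
ABc/ABc    f_2 = 5          0.5(r_2 - C r_1 r_2)
AbC/AbC    f_3 = 11         0.5 C r_1 r_2
Abc/Abc    f_4 = 0          0.5(r_1 - C r_1 r_2)
aBC/aBC    f_5 = 1          0.5(r_1 - C r_1 r_2)
aBc/aBc    f_6 = 10         0.5 C r_1 r_2
abC/abC    f_7 = 4          0.5(r_2 - C r_1 r_2)
abc/abc    f_8 = 35         0.5(1 - r_1 - r_2 + C r_1 r_2)

Regardless of locus order, the MLEs of r are the observed proportions of recombinants for each pair of loci:
r_AB = (f_3 + f_4 + f_5 + f_6) / N = 0.22
r_BC = (f_2 + f_3 + f_6 + f_7) / N = 0.30
r_AC = (f_2 + f_4 + f_5 + f_7) / N = 0.10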
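The two-locus estimates can be reproduced directly from the eight observed counts. This is a minimal sketch, assuming each DH line is scored as a three-letter haplotype string (upper case for one parental allele, lower case for the other); the names `COUNTS` and `rec_fraction` are illustrative.

```python
# Observed DH counts from the slide, keyed by haplotype at loci A, B, C.
COUNTS = {
    "ABC": 34, "ABc": 5, "AbC": 11, "Abc": 0,
    "aBC": 1, "aBc": 10, "abC": 4, "abc": 35,
}
LOCI = "ABC"


def rec_fraction(x: str, y: str, counts: dict = COUNTS) -> float:
    """Proportion of lines that are recombinant between loci x and y."""
    i, j = LOCI.index(x), LOCI.index(y)
    n = sum(counts.values())
    # A line is recombinant when the two loci carry alleles of different parents.
    rec = sum(c for g, c in counts.items() if g[i].isupper() != g[j].isupper())
    return rec / n


r_ab = rec_fraction("A", "B")
r_bc = rec_fraction("B", "C")
r_ac = rec_fraction("A", "C")
print(r_ab, r_bc, r_ac)
```

Because each estimate uses only the pair of loci involved, the values do not depend on which locus order is assumed.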

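Ordering loci by minimizing double crossovers

Genotype   Observed count
ABC/ABC    f_1 = 34
ABc/ABc    f_2 = 5
AbC/AbC    f_3 = 11
Abc/Abc    f_4 = 0
aBC/aBC    f_5 = 1
aBc/aBc    f_6 = 10
abC/abC    f_7 = 4
abc/abc    f_8 = 35

Pooling complementary genotypes:

Genotypes   Observed count
ABC + abc   f_1 + f_8 = 34 + 35 = 69
ABc + abC   f_2 + f_7 = 5 + 4 = 9
AbC + aBc   f_3 + f_6 = 11 + 10 = 21
Abc + aBC   f_4 + f_5 = 0 + 1 = 1

The rarest genotypes (Abc + aBC) are the double recombinants. Starting from the parental haplotypes B A C and b a c, a crossover on each side of A produces B a C and b A c, so A must be the middle locus.
The order of loci is BAC.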
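The double-crossover rule can be sketched in code: pool complementary haplotypes, take the most frequent class as parental and the rarest as the double recombinant, and pick the locus that is switched relative to the other two. The function names below are illustrative, and the sketch assumes exactly three loci and unambiguous most and least frequent classes.

```python
COUNTS = {
    "ABC": 34, "ABc": 5, "AbC": 11, "Abc": 0,
    "aBC": 1, "aBc": 10, "abC": 4, "abc": 35,
}


def pool(counts: dict) -> dict:
    """Pool complementary haplotypes (e.g. Abc with aBC) into one class."""
    pooled = {}
    for g, c in counts.items():
        key = g if g[0].isupper() else g.swapcase()
        pooled[key] = pooled.get(key, 0) + c
    return pooled


def middle_locus(counts: dict) -> str:
    """Locus whose allele is switched in the rarest (double-recombinant) class."""
    pooled = pool(counts)
    parental = max(pooled, key=pooled.get)
    double = min(pooled, key=pooled.get)
    differs = [p != d for p, d in zip(parental, double)]
    # The middle locus is the odd one out: either the only locus that differs,
    # or the only locus that does not (the pooled key fixes the first allele).
    pos = differs.index(True) if sum(differs) == 1 else differs.index(False)
    return parental[pos].upper()


pooled = pool(COUNTS)
mid = middle_locus(COUNTS)
print(pooled, "middle locus:", mid)
```

With the middle locus identified as A, the remaining two loci flank it, giving the order B-A-C (or equivalently C-A-B).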

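Ordering loci by using recombination fractions
MLEs of r are r_AB = 0.22, r_BC = 0.30, and r_AC = 0.10.
Largest r is r_BC = 0.3, so B and C are the outermost loci.
Smallest r is r_AC = 0.1, so A lies next to C, between B and C.
Order: B A C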

Minimum Sum of Adjacent Recombination Frequencies (SARF) (Falk 1989)
SARF = sum of r over all pairs of adjacent loci, where r = recombination frequency between adjacent loci a_i and a_j for a given order: 1, 2, 3, …, l-1, l

Order   SARF
ABC     0.22 + 0.30 = 0.52
BAC     0.22 + 0.10 = 0.32
ACB     0.10 + 0.30 = 0.40

The B-A-C order gives MIN[SARF] and the minimum distance (MD) map.
Simulations have shown that SARF is a reliable method to obtain marker orders for large datasets.

Minimum Product of Adjacent Recombination Frequencies (PARF) (Wilson 1988)
PARF = product of r over all pairs of adjacent loci, where r = recombination frequency between adjacent loci a_i and a_j for a given order: 1, 2, 3, …, l-1, l

Order   PARF
ABC     0.22 x 0.30 = 0.066
BAC     0.22 x 0.10 = 0.022
ACB     0.10 x 0.30 = 0.030

The B-A-C order gives MIN[PARF] and the minimum distance (MD) map.
SARF and PARF are equivalent methods to obtain marker orders for large datasets.
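Both criteria can be evaluated over the three candidate orders with a short script. This is a sketch under the assumption that the pairwise estimates are r_AB = 0.22, r_BC = 0.30, and r_AC = 0.10, as in the worked example; the dictionary `R` and the function names are illustrative.

```python
import math

# Pairwise recombination frequencies from the worked example.
R = {
    frozenset("AB"): 0.22,
    frozenset("BC"): 0.30,
    frozenset("AC"): 0.10,
}
ORDERS = ["ABC", "BAC", "ACB"]


def adjacent_r(order: str) -> list:
    """Recombination frequencies of neighbouring loci in the given order."""
    return [R[frozenset(pair)] for pair in zip(order, order[1:])]


def sarf(order: str) -> float:
    """Sum of adjacent recombination frequencies."""
    return sum(adjacent_r(order))


def parf(order: str) -> float:
    """Product of adjacent recombination frequencies."""
    return math.prod(adjacent_r(order))


best_sarf = min(ORDERS, key=sarf)
best_parf = min(ORDERS, key=parf)
for o in ORDERS:
    print(o, round(sarf(o), 3), round(parf(o), 3))
print("min SARF:", best_sarf, " min PARF:", best_parf)
```

For more than a handful of loci the list of candidate orders would have to come from a search heuristic rather than full enumeration.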

Maximum Sum of Adjacent LOD Scores (SALOD)
SALOD = sum of Z over all pairs of adjacent loci, where Z = LOD score for recombination frequency between adjacent loci a_i and a_j for a given order: 1, 2, 3, …, l-1, l

Order   SALOD
ABC     [values missing]
BAC     [values missing]
ACB     [values missing]

The B-A-C order gives MAX[SALOD].
SALOD is sensitive to locus informativeness.
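A possible way to compute SALOD for the worked example is sketched below. It assumes the usual two-point LOD for a DH or backcross-type population, Z = log10[L(r) / L(0.5)], with all N = 100 lines informative for every pair of loci; the slide's own LOD values are not available, so these assumptions may not match the original calculation exactly.

```python
import math

N = 100
R = {
    frozenset("AB"): 0.22,
    frozenset("BC"): 0.30,
    frozenset("AC"): 0.10,
}
ORDERS = ["ABC", "BAC", "ACB"]


def lod(r: float, n: int) -> float:
    """Two-point LOD score: log10 of L(r) / L(0.5) for n scored lines."""
    rec = r * n
    score = n * math.log10(2.0)          # minus log10 L(0.5)
    if r > 0:
        score += rec * math.log10(r)
    if r < 1:
        score += (n - rec) * math.log10(1.0 - r)
    return score


def salod(order: str) -> float:
    """Sum of LOD scores over adjacent locus pairs in the given order."""
    return sum(lod(R[frozenset(p)], N) for p in zip(order, order[1:]))


best_salod = max(ORDERS, key=salod)
for o in ORDERS:
    print(o, round(salod(o), 2))
print("max SALOD:", best_salod)
```

Because the LOD depends on the number of informative lines per pair, loci with missing or less informative data contribute smaller scores, which is one way to read the remark that SALOD is sensitive to locus informativeness.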

Minimum Count of Crossover Events (COUNT) (Van Os et al. 2005)
COUNT = sum of X over all pairs of adjacent loci, where X = simple count of recombination events between adjacent loci a_i and a_j for a given sequence: 1, 2, 3, …, l-1, l

Order   COUNT
ABC     22 + 30 = 52
BAC     22 + 10 = 32
ACB     10 + 30 = 40

The B-A-C order gives MIN[COUNT].
COUNT is equivalent to SARF and PARF with perfect data. COUNT is superior to SARF with incomplete data.
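The COUNT criterion can be computed straight from the genotype counts, without estimating recombination fractions first. The following is a minimal sketch with illustrative names; it assumes complete data (no missing scores), so it does not show the incomplete-data behaviour mentioned above.

```python
COUNTS = {
    "ABC": 34, "ABc": 5, "AbC": 11, "Abc": 0,
    "aBC": 1, "aBc": 10, "abC": 4, "abc": 35,
}
LOCI = "ABC"
ORDERS = ["ABC", "BAC", "ACB"]


def count_crossovers(order: str, counts: dict = COUNTS) -> int:
    """Total number of parental-phase switches between adjacent loci."""
    idx = [LOCI.index(locus) for locus in order]
    total = 0
    for g, c in counts.items():
        # Read the haplotype in the candidate order and count phase changes.
        phase = [g[i].isupper() for i in idx]
        switches = sum(a != b for a, b in zip(phase, phase[1:]))
        total += c * switches
    return total


best_count = min(ORDERS, key=count_crossovers)
for o in ORDERS:
    print(o, count_crossovers(o))
print("min COUNT:", best_count)
```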

Locus order: likelihood approach
r_1 = recombination fraction in interval 1
r_2 = recombination fraction in interval 2
C = coefficient of coincidence
p_i = f_i / n
f_i = expected frequency of the i-th pooled phenotypic class
i = 1, 2, …, k
k = number of pooled phenotypic classes


Expected frequencies of the pooled haplotype classes under each locus order (N = 100). For each order, r_1 and r_2 are the recombination fractions in its first and second interval, and C is the coefficient of coincidence estimated for that order. With the fitted C the expected frequencies equal the observed proportions.

ABC order (r_1 = r_AB = 0.22, r_2 = r_BC = 0.30, fitted C = 3.18)
Haplotypes   Obs. no.    Expected frequency          p_i, C = 3.18   p_i, C = 0   p_i, C = 1
ABC + abc    f_1 = 69    1 - r_1 - r_2 + C r_1 r_2   0.69            0.48         0.546
ABc + abC    f_2 = 9     r_2 - C r_1 r_2             0.09            0.30         0.234
AbC + aBc    f_3 = 21    C r_1 r_2                   0.21            0.00         0.066
Abc + aBC    f_4 = 1     r_1 - C r_1 r_2             0.01            0.22         0.154

BAC order (r_1 = r_AB = 0.22, r_2 = r_AC = 0.10, fitted C = 0.45)
Haplotypes   Obs. no.    Expected frequency          p_i, C = 0.45   p_i, C = 0   p_i, C = 1
ABC + abc    f_1 = 69    1 - r_1 - r_2 + C r_1 r_2   0.69            0.68         0.702
ABc + abC    f_2 = 9     r_2 - C r_1 r_2             0.09            0.10         0.078
AbC + aBc    f_3 = 21    r_1 - C r_1 r_2             0.21            0.22         0.198
Abc + aBC    f_4 = 1     C r_1 r_2                   0.01            0.00         0.022

ACB order (r_1 = r_AC = 0.10, r_2 = r_BC = 0.30, fitted C = 3.00)
Haplotypes   Obs. no.    Expected frequency          p_i, C = 3.00   p_i, C = 0   p_i, C = 1
ABC + abc    f_1 = 69    1 - r_1 - r_2 + C r_1 r_2   0.69            0.60         0.63
ABc + abC    f_2 = 9     C r_1 r_2                   0.09            0.00         0.03
AbC + aBc    f_3 = 21    r_2 - C r_1 r_2             0.21            0.30         0.27
Abc + aBC    f_4 = 1     r_1 - C r_1 r_2             0.01            0.10         0.07

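Likelihood method

Order   Unconstrained model: C   Likelihood, LOD    Constrained model (C = 1): likelihood, LOD
ABC     3.18                     [values missing]   [values missing]
BAC     0.45                     [values missing]   [values missing]
ACB     3.00                     [values missing]   [values missing]

The B-A-C order gives the highest likelihood and LOD under a no-interference (C = 1) model.
Most multipoint ML mapping algorithms use no-interference models.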
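The comparison of orders under the constrained (C = 1) model can be sketched as a multinomial log-likelihood over the eight DH genotype classes. This is an illustrative reconstruction, not the slide's own calculation: it plugs in the two-point MLEs for r_1 and r_2 and uses the expected frequencies 0.5 x (class probability) from the three-locus table, and all function names are assumptions.

```python
import math

COUNTS = {
    "ABC": 34, "ABc": 5, "AbC": 11, "Abc": 0,
    "aBC": 1, "aBc": 10, "abC": 4, "abc": 35,
}
LOCI = "ABC"
ORDERS = ["ABC", "BAC", "ACB"]


def rec_fraction(x: str, y: str) -> float:
    """Observed proportion of recombinants between loci x and y."""
    i, j = LOCI.index(x), LOCI.index(y)
    n = sum(COUNTS.values())
    return sum(c for g, c in COUNTS.items() if g[i].isupper() != g[j].isupper()) / n


def class_probs(r1: float, r2: float, c: float = 1.0) -> dict:
    """Probabilities keyed by (recombinant in interval 1, recombinant in interval 2)."""
    d = c * r1 * r2
    return {(0, 0): 1 - r1 - r2 + d, (1, 0): r1 - d, (0, 1): r2 - d, (1, 1): d}


def log10_likelihood(order: str, c: float = 1.0) -> float:
    """Multinomial log10-likelihood of the counts for a locus order (up to a constant)."""
    r1 = rec_fraction(order[0], order[1])
    r2 = rec_fraction(order[1], order[2])
    probs = class_probs(r1, r2, c)
    idx = [LOCI.index(locus) for locus in order]
    total = 0.0
    for g, n_g in COUNTS.items():
        if n_g == 0:
            continue  # empty classes contribute nothing
        phase = [g[i].isupper() for i in idx]
        key = (int(phase[0] != phase[1]), int(phase[1] != phase[2]))
        total += n_g * math.log10(0.5 * probs[key])
    return total


best_order = max(ORDERS, key=log10_likelihood)
for o in ORDERS:
    print(o, round(log10_likelihood(o), 2))
print("highest likelihood under C = 1:", best_order)
```

Under the unconstrained model the fitted C reproduces the observed proportions for every order, so the three orders would fit equally well; only the constrained model discriminates among them in this example.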

Ordering loci
GMENDEL (Liu and Knapp 1990) minimizes SARF (sum of adjacent recombination frequencies).
PGRI (Lu and Liu 1995) minimizes SARF or maximizes the likelihood.
RECORD (Van Os et al. 2005) minimizes COUNT (count of crossover events).

Ordering loci
JoinMap 4 (Van Ooijen, 2005)
– finds the least-squares locus order using a stepwise search (regression)
– Monte Carlo maximum likelihood (ML): very fast computation of high-density maps