
Presentation transcript:

(1) Schedule
- Mar 15: Linkage disequilibrium (LD) mapping
- Mar 17: LD mapping
- Mar 22: Guest speaker, Dr Yang
- Mar 24: Overview
Attend ENAR Biometrical meeting in Austin from Mar 20 to 23

(2) Projects
- Work on a problem learnt in the class
- Select a problem from your own projects

What I have learnt from my trip to Seattle
- Fred Hutchinson Cancer Research Center
- University of Washington

Statistical Genetics of Complex Traits
- Single Nucleotide Polymorphisms (SNPs)
- Haplotype blocks
- HIV/AIDS dynamics
- Cancer progression

Statistical Genetics of Complex Traits
Rongling Wu, Chang-Xing Ma and George Casella
Springer-Verlag New York
Linkage, Disequilibrium and QTL

Linkage Disequilibrium
- Linkage analysis: controlled crosses (backcross or F2) and structured pedigrees (grandparent-parent-children generations)
- Linkage disequilibrium analysis: natural populations

Linkage mapping is used in plant and animal genetics, as well as in human genetics of diseases such as cancers. LD mapping is used in human genetics of diseases such as HIV/AIDS and SARS.

Linkage mapping - backcross
Mixture model-based likelihood without marker information

$$L(y|\Omega) = \prod_{i=1}^{n} \left[ \tfrac{1}{2} f_1(y_i) + \tfrac{1}{2} f_0(y_i) \right]$$

Sample   Height (cm, y)   QTL genotype
                          Qq     qq
1        184              1/2    1/2
2        185              1/2    1/2
3        180              1/2    1/2
4        182              1/2    1/2
5        167              1/2    1/2
6        169              1/2    1/2
7        165              1/2    1/2
8        166              1/2    1/2
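The mixture likelihood above can be evaluated directly. The following is a minimal sketch, assuming normal densities with a common variance for both QTL genotype groups (as the later slides do); the function names and the trial parameter values are illustrative assumptions, not part of the lecture.

```python
import math

def normal_pdf(y, mu, sigma2):
    """Normal density with mean mu and variance sigma2."""
    return math.exp(-(y - mu) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)

def mixture_loglik(heights, mu1, mu0, sigma2):
    """log L(y | Omega) for a backcross without marker information:
    every individual is Qq or qq with probability 1/2 each."""
    return sum(
        math.log(0.5 * normal_pdf(y, mu1, sigma2) + 0.5 * normal_pdf(y, mu0, sigma2))
        for y in heights
    )

# Heights (cm) of the eight samples listed on the slide.
heights = [184, 185, 180, 182, 167, 169, 165, 166]

# Trial parameter values (assumed for illustration only).
ll_two_means = mixture_loglik(heights, mu1=183.0, mu0=167.0, sigma2=4.0)
ll_one_mean = mixture_loglik(heights, mu1=175.0, mu0=175.0, sigma2=4.0)
print(ll_two_means, ll_one_mean)
```

With these trial values, the two-mean model gives a much higher log-likelihood than a single shared mean, which is the kind of contrast a test for a segregating QTL relies on.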

Linkage mapping - backcross
Mixture model-based likelihood with marker information

$$L(y, M|\Omega) = \prod_{i=1}^{n} \left[ \omega_{1|i} f_1(y_i) + \omega_{0|i} f_0(y_i) \right]$$

Sample   Height (cm, y)   Marker genotype       QTL genotype (prior prob.)
                          M1        M2          Qq           qq
1        184              Mm (1)    Nn (1)      1            0
2        185              Mm (1)    Nn (1)      1            0
3        180              Mm (1)    Nn (1)      1            0
4        182              Mm (1)    nn (0)      1-\theta     \theta
5        167              mm (0)    Nn (1)      \theta       1-\theta
6        169              mm (0)    nn (0)      0            1
7        165              mm (0)    nn (0)      0            1
8        166              mm (0)    nn (0)      0            1

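Linkage mapping - backcross
Conditional probabilities of the QTL genotypes (missing) based on marker genotypes (observed)

$$
\begin{aligned}
L(y, M|\Omega) &= \prod_{i=1}^{n} \left[ \omega_{1|i} f_1(y_i) + \omega_{0|i} f_0(y_i) \right] \\
&= \prod_{i=1}^{n_1} \left[ 1 \cdot f_1(y_i) + 0 \cdot f_0(y_i) \right] && \text{conditional on 11 } (n_1) \\
&\times \prod_{i=1}^{n_2} \left[ (1-\theta) f_1(y_i) + \theta f_0(y_i) \right] && \text{conditional on 10 } (n_2) \\
&\times \prod_{i=1}^{n_3} \left[ \theta f_1(y_i) + (1-\theta) f_0(y_i) \right] && \text{conditional on 01 } (n_3) \\
&\times \prod_{i=1}^{n_4} \left[ 0 \cdot f_1(y_i) + 1 \cdot f_0(y_i) \right] && \text{conditional on 00 } (n_4)
\end{aligned}
$$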

Linkage mapping - backcross
Normal distributions of phenotypic values for each QTL genotype group

$$f_1(y_i) = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left[ -\frac{(y_i - \mu_1)^2}{2\sigma^2} \right], \quad \mu_1 = \mu + a^*$$

$$f_0(y_i) = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left[ -\frac{(y_i - \mu_0)^2}{2\sigma^2} \right], \quad \mu_0 = \mu$$

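Linkage mapping - backcross
Differentiate log L with respect to each unknown parameter, set the derivatives equal to zero and solve the log-likelihood equations.

$$L(y, M|\Omega) = \prod_{i=1}^{n} \left[ \omega_{1|i} f_1(y_i) + \omega_{0|i} f_0(y_i) \right]$$

$$\log L(y, M|\Omega) = \sum_{i=1}^{n} \log \left[ \omega_{1|i} f_1(y_i) + \omega_{0|i} f_0(y_i) \right]$$

Define the posterior probabilities

$$\Pi_{1|i} = \frac{\omega_{1|i} f_1(y_i)}{\omega_{1|i} f_1(y_i) + \omega_{0|i} f_0(y_i)} \quad (1)$$

$$\Pi_{0|i} = \frac{\omega_{0|i} f_0(y_i)}{\omega_{1|i} f_1(y_i) + \omega_{0|i} f_0(y_i)} \quad (2)$$

The parameter estimates are then

$$\mu_1 = \frac{\sum_{i=1}^{n} \Pi_{1|i} y_i}{\sum_{i=1}^{n} \Pi_{1|i}} \quad (3)$$

$$\mu_0 = \frac{\sum_{i=1}^{n} \Pi_{0|i} y_i}{\sum_{i=1}^{n} \Pi_{0|i}} \quad (4)$$

$$\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} \left[ \Pi_{1|i} (y_i - \mu_1)^2 + \Pi_{0|i} (y_i - \mu_0)^2 \right] \quad (5)$$

$$\theta = \frac{\sum_{i=1}^{n_2} \Pi_{0|i} + \sum_{i=1}^{n_3} \Pi_{1|i}}{n_2 + n_3} \quad (6)$$

In Eq. 6 the sums run over the two recombinant marker classes: 10 ($n_2$), where qq has prior probability $\theta$, and 01 ($n_3$), where Qq has prior probability $\theta$.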
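The updates for the backcross can be iterated as an EM loop. The sketch below is an illustration under stated assumptions rather than the lecture's own implementation: it uses the eight samples from the earlier table, takes the prior probabilities for the marker classes 11, 10, 01 and 00 as (1, 0), (1-θ, θ), (θ, 1-θ) and (0, 1), and updates θ from the posterior weight of the genotype whose prior is θ in each recombinant class. Function names, starting values and the iteration count are hypothetical.

```python
import math

def normal_pdf(y, mu, sigma2):
    """Normal density with mean mu and variance sigma2."""
    return math.exp(-(y - mu) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)

def prior_probs(marker_class, theta):
    """Prior (Qq, qq) probabilities given the two-marker class."""
    table = {
        "11": (1.0, 0.0),
        "10": (1.0 - theta, theta),
        "01": (theta, 1.0 - theta),
        "00": (0.0, 1.0),
    }
    return table[marker_class]

def em_backcross(y, classes, mu1, mu0, sigma2, theta, n_iter=50):
    """EM iterations for (mu1, mu0, sigma2, theta) in a backcross."""
    n = len(y)
    for _ in range(n_iter):
        # E-step: posterior probability of Qq for each individual.
        post1 = []
        for yi, c in zip(y, classes):
            w1, w0 = prior_probs(c, theta)
            a = w1 * normal_pdf(yi, mu1, sigma2)
            b = w0 * normal_pdf(yi, mu0, sigma2)
            post1.append(a / (a + b))
        post0 = [1.0 - p for p in post1]
        # M-step: weighted means, pooled variance, then theta.
        mu1 = sum(p * yi for p, yi in zip(post1, y)) / sum(post1)
        mu0 = sum(p * yi for p, yi in zip(post0, y)) / sum(post0)
        sigma2 = sum(
            p1 * (yi - mu1) ** 2 + p0 * (yi - mu0) ** 2
            for p1, p0, yi in zip(post1, post0, y)
        ) / n
        n_rec = sum(1 for c in classes if c in ("10", "01"))
        if n_rec > 0:
            num = sum(p0 for p0, c in zip(post0, classes) if c == "10")
            num += sum(p1 for p1, c in zip(post1, classes) if c == "01")
            theta = num / n_rec
    return mu1, mu0, sigma2, theta

# Data from the slide: heights and two-marker classes for samples 1 to 8.
heights = [184, 185, 180, 182, 167, 169, 165, 166]
classes = ["11", "11", "11", "10", "01", "00", "00", "00"]

# Starting values are arbitrary assumptions.
mu1_hat, mu0_hat, sigma2_hat, theta_hat = em_backcross(
    heights, classes, mu1=180.0, mu0=170.0, sigma2=10.0, theta=0.2
)
print(mu1_hat, mu0_hat, sigma2_hat, theta_hat)
```

On this small data set the two height clusters line up with the first marker, so the estimate of θ shrinks towards zero and the two means settle on the cluster averages.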

Linkage disequilibrium mapping - natural population
Mixture model-based likelihood without marker information

Suppose there is a natural population with a segregating QTL of two alternative alleles, Q and q, with Prob(Q) = q and Prob(q) = 1-q. Under random mating, Prob(QQ) = q^2, Prob(Qq) = 2q(1-q) and Prob(qq) = (1-q)^2.

$$L(y|\Omega) = \prod_{i=1}^{n} \left[ q^2 f_2(y_i) + 2q(1-q) f_1(y_i) + (1-q)^2 f_0(y_i) \right]$$

Sample   Height (cm, y)   QTL genotype
                          QQ      Qq          qq
1        184              q^2     2q(1-q)     (1-q)^2
2        185              q^2     2q(1-q)     (1-q)^2
3        180              q^2     2q(1-q)     (1-q)^2
4        182              q^2     2q(1-q)     (1-q)^2
5        167              q^2     2q(1-q)     (1-q)^2
6        169              q^2     2q(1-q)     (1-q)^2
7        165              q^2     2q(1-q)     (1-q)^2
8        166              q^2     2q(1-q)     (1-q)^2

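Linkage disequilibrium mapping - natural population
Association between marker and QTL
- Marker: Prob(M) = p, Prob(m) = 1-p
- QTL: Prob(Q) = q, Prob(q) = 1-q

Four haplotypes:

$$
\begin{aligned}
\text{Prob}(MQ) &= p_{11} = pq + D \\
\text{Prob}(Mq) &= p_{10} = p(1-q) - D \\
\text{Prob}(mQ) &= p_{01} = (1-p)q - D \\
\text{Prob}(mq) &= p_{00} = (1-p)(1-q) + D
\end{aligned}
$$

Conversely, the allele frequencies and the linkage disequilibrium D follow from the haplotype frequencies:

$$p = p_{11} + p_{10}, \quad q = p_{11} + p_{01}, \quad D = p_{11} p_{00} - p_{10} p_{01}$$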
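The relations between (p, q, D) and the four haplotype frequencies can be checked numerically. This is a small sketch with hypothetical function names and illustrative input values; because the four haplotype frequencies sum to one, the marker allele frequency is recovered as p = p11 + p10 and the QTL allele frequency as q = p11 + p01.

```python
def haplotype_freqs(p, q, D):
    """Haplotype frequencies (p11, p10, p01, p00) for MQ, Mq, mQ, mq."""
    p11 = p * q + D
    p10 = p * (1.0 - q) - D
    p01 = (1.0 - p) * q - D
    p00 = (1.0 - p) * (1.0 - q) + D
    return p11, p10, p01, p00

def allele_freqs_and_ld(p11, p10, p01, p00):
    """Recover (p, q, D) from the four haplotype frequencies."""
    p = p11 + p10
    q = p11 + p01
    D = p11 * p00 - p10 * p01
    return p, q, D

# Illustrative values (assumed, not from the lecture).
freqs = haplotype_freqs(p=0.6, q=0.3, D=0.05)
recovered = allele_freqs_and_ld(*freqs)
print(freqs, recovered)
```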

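Joint and conditional ($\omega_{j|i}$) genotype probabilities between marker and QTL

Joint probabilities:

        QQ              Qq                            qq              Obs
MM      p_{11}^2        2 p_{11} p_{10}               p_{10}^2        n_2
Mm      2 p_{11} p_{01} 2(p_{11} p_{00} + p_{10} p_{01})  2 p_{10} p_{00} n_1
mm      p_{01}^2        2 p_{01} p_{00}               p_{00}^2        n_0

Conditional probabilities $\omega_{j|i}$ (QTL genotype j given marker genotype i), obtained by dividing each row by its marker genotype frequency:

        QQ                        Qq                                        qq                        Obs
MM      p_{11}^2 / p^2            2 p_{11} p_{10} / p^2                     p_{10}^2 / p^2            n_2
Mm      2 p_{11} p_{01} / [2p(1-p)]   2(p_{11} p_{00} + p_{10} p_{01}) / [2p(1-p)]  2 p_{10} p_{00} / [2p(1-p)]   n_1
mm      p_{01}^2 / (1-p)^2        2 p_{01} p_{00} / (1-p)^2                 p_{00}^2 / (1-p)^2        n_0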
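The conditional probabilities in the table can be computed by normalising each row of the joint table. The sketch below assumes haplotype frequencies are given; the function name and the example values are hypothetical. As a sanity check, when D = 0 the marker carries no information and every row reduces to the Hardy-Weinberg proportions q^2, 2q(1-q), (1-q)^2.

```python
def conditional_qtl_probs(p11, p10, p01, p00):
    """Return omega[i][j]: probability of QTL genotype j (2=QQ, 1=Qq, 0=qq)
    given marker genotype i (2=MM, 1=Mm, 0=mm)."""
    joint = {
        2: {2: p11 ** 2, 1: 2 * p11 * p10, 0: p10 ** 2},
        1: {2: 2 * p11 * p01, 1: 2 * (p11 * p00 + p10 * p01), 0: 2 * p10 * p00},
        0: {2: p01 ** 2, 1: 2 * p01 * p00, 0: p00 ** 2},
    }
    omega = {}
    for i, row in joint.items():
        total = sum(row.values())  # equals p^2, 2p(1-p) or (1-p)^2
        omega[i] = {j: value / total for j, value in row.items()}
    return omega

# Illustrative case with no association (D = 0), p = 0.6 and q = 0.3.
omega_no_ld = conditional_qtl_probs(0.18, 0.42, 0.12, 0.28)
# Illustrative case with positive association (D = 0.05).
omega_ld = conditional_qtl_probs(0.23, 0.37, 0.07, 0.33)
print(omega_no_ld[2], omega_ld[2])
```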

Linkage disequilibrium mapping - natural population
Mixture model-based likelihood with marker information

$$L(y, M|\Omega) = \prod_{i=1}^{n} \left[ \omega_{2|i} f_2(y_i) + \omega_{1|i} f_1(y_i) + \omega_{0|i} f_0(y_i) \right]$$

Sample   Height (cm, y)   Marker genotype M   QTL genotype (prior prob.)
                                              QQ             Qq             qq
1        184              MM (2)              \omega_{2|i}   \omega_{1|i}   \omega_{0|i}
2        185              MM (2)              \omega_{2|i}   \omega_{1|i}   \omega_{0|i}
3        180              Mm (1)              \omega_{2|i}   \omega_{1|i}   \omega_{0|i}
4        182              Mm (1)              \omega_{2|i}   \omega_{1|i}   \omega_{0|i}
5        167              Mm (1)              \omega_{2|i}   \omega_{1|i}   \omega_{0|i}
6        169              Mm (1)              \omega_{2|i}   \omega_{1|i}   \omega_{0|i}
7        165              mm (0)              \omega_{2|i}   \omega_{1|i}   \omega_{0|i}
8        166              mm (0)              \omega_{2|i}   \omega_{1|i}   \omega_{0|i}

Linkage disequilibrium mapping - natural population
Conditional probabilities of the QTL genotypes (missing) based on marker genotypes (observed)

$$
\begin{aligned}
L(y, M|\Omega) &= \prod_{i=1}^{n} \left[ \omega_{2|i} f_2(y_i) + \omega_{1|i} f_1(y_i) + \omega_{0|i} f_0(y_i) \right] \\
&= \prod_{i=1}^{n_2} \left[ \omega_{2|2i} f_2(y_i) + \omega_{1|2i} f_1(y_i) + \omega_{0|2i} f_0(y_i) \right] && \text{conditional on 2 } (n_2) \\
&\times \prod_{i=1}^{n_1} \left[ \omega_{2|1i} f_2(y_i) + \omega_{1|1i} f_1(y_i) + \omega_{0|1i} f_0(y_i) \right] && \text{conditional on 1 } (n_1) \\
&\times \prod_{i=1}^{n_0} \left[ \omega_{2|0i} f_2(y_i) + \omega_{1|0i} f_1(y_i) + \omega_{0|0i} f_0(y_i) \right] && \text{conditional on 0 } (n_0)
\end{aligned}
$$

Linkage disequilibrium mapping - natural population
Normal distributions of phenotypic values for each QTL genotype group

$$f_2(y_i) = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left[ -\frac{(y_i - \mu_2)^2}{2\sigma^2} \right], \quad \mu_2 = \mu + a$$

$$f_1(y_i) = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left[ -\frac{(y_i - \mu_1)^2}{2\sigma^2} \right], \quad \mu_1 = \mu + d$$

$$f_0(y_i) = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left[ -\frac{(y_i - \mu_0)^2}{2\sigma^2} \right], \quad \mu_0 = \mu - a$$

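Linkage disequilibrium mapping - natural population
Differentiate log L with respect to each unknown parameter, set the derivatives equal to zero and solve the log-likelihood equations.

$$L(y, M|\Omega) = \prod_{i=1}^{n} \left[ \omega_{2|i} f_2(y_i) + \omega_{1|i} f_1(y_i) + \omega_{0|i} f_0(y_i) \right]$$

$$\log L(y, M|\Omega) = \sum_{i=1}^{n} \log \left[ \omega_{2|i} f_2(y_i) + \omega_{1|i} f_1(y_i) + \omega_{0|i} f_0(y_i) \right]$$

Define the posterior probabilities

$$\Pi_{2|i} = \frac{\omega_{2|i} f_2(y_i)}{\omega_{2|i} f_2(y_i) + \omega_{1|i} f_1(y_i) + \omega_{0|i} f_0(y_i)} \quad (1)$$

$$\Pi_{1|i} = \frac{\omega_{1|i} f_1(y_i)}{\omega_{2|i} f_2(y_i) + \omega_{1|i} f_1(y_i) + \omega_{0|i} f_0(y_i)} \quad (2)$$

$$\Pi_{0|i} = \frac{\omega_{0|i} f_0(y_i)}{\omega_{2|i} f_2(y_i) + \omega_{1|i} f_1(y_i) + \omega_{0|i} f_0(y_i)} \quad (3)$$

The parameter estimates are then

$$\mu_1 = \frac{\sum_{i=1}^{n} \Pi_{1|i} y_i}{\sum_{i=1}^{n} \Pi_{1|i}} \quad (4)$$

$$\mu_0 = \frac{\sum_{i=1}^{n} \Pi_{0|i} y_i}{\sum_{i=1}^{n} \Pi_{0|i}} \quad (5)$$

with $\mu_2$ obtained in the same way from $\Pi_{2|i}$, and

$$\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} \left[ \Pi_{2|i} (y_i - \mu_2)^2 + \Pi_{1|i} (y_i - \mu_1)^2 + \Pi_{0|i} (y_i - \mu_0)^2 \right] \quad (6)$$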

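Complete data

Prior probabilities:

        QQ              Qq                            qq              Obs
MM      p_{11}^2        2 p_{11} p_{10}               p_{10}^2        n_2
Mm      2 p_{11} p_{01} 2(p_{11} p_{00} + p_{10} p_{01})  2 p_{10} p_{00} n_1
mm      p_{01}^2        2 p_{01} p_{00}               p_{00}^2        n_0

Observed counts (if the QTL genotypes were known):

        QQ        Qq        qq        Obs
MM      n_{22}    n_{21}    n_{20}    n_2
Mm      n_{12}    n_{11}    n_{10}    n_1
mm      n_{02}    n_{01}    n_{00}    n_0

Counting haplotypes gives

$$
\begin{aligned}
p_{11} &= \left[ 2n_{22} + (n_{21} + n_{12}) + \phi\, n_{11} \right] / 2n \\
p_{10} &= \left[ 2n_{20} + (n_{21} + n_{10}) + (1-\phi)\, n_{11} \right] / 2n \\
p_{01} &= \left[ 2n_{02} + (n_{12} + n_{01}) + (1-\phi)\, n_{11} \right] / 2n \\
p_{00} &= \left[ 2n_{00} + (n_{10} + n_{01}) + \phi\, n_{11} \right] / 2n
\end{aligned}
$$

where

$$\phi = \frac{p_{11} p_{00}}{p_{11} p_{00} + p_{10} p_{01}}$$

is the probability that a double heterozygote (Mm, Qq) carries the haplotype pair MQ/mq rather than Mq/mQ.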
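The haplotype-counting estimates for complete data can be written as a short function. This sketch is illustrative only: the function names are hypothetical, φ is passed in as a known value, and the "counts" are expected counts generated from assumed haplotype frequencies rather than real data. Each genotype contributes two haplotypes; only the double heterozygote (Mm, Qq) is ambiguous and is split between MQ/mq and Mq/mQ with weights φ and 1-φ.

```python
def expected_counts(p11, p10, p01, p00, n):
    """Expected genotype counts n[i][j] (marker i, QTL j; 2, 1, 0 as in the table)."""
    return {
        2: {2: n * p11 ** 2, 1: n * 2 * p11 * p10, 0: n * p10 ** 2},
        1: {2: n * 2 * p11 * p01, 1: n * 2 * (p11 * p00 + p10 * p01), 0: n * 2 * p10 * p00},
        0: {2: n * p01 ** 2, 1: n * 2 * p01 * p00, 0: n * p00 ** 2},
    }

def haplotype_freqs_from_counts(c, phi):
    """Haplotype frequencies (p11, p10, p01, p00) by counting, given phi."""
    n = sum(sum(row.values()) for row in c.values())
    p11 = (2 * c[2][2] + c[2][1] + c[1][2] + phi * c[1][1]) / (2 * n)
    p10 = (2 * c[2][0] + c[2][1] + c[1][0] + (1 - phi) * c[1][1]) / (2 * n)
    p01 = (2 * c[0][2] + c[1][2] + c[0][1] + (1 - phi) * c[1][1]) / (2 * n)
    p00 = (2 * c[0][0] + c[1][0] + c[0][1] + phi * c[1][1]) / (2 * n)
    return p11, p10, p01, p00

# Assumed haplotype frequencies, used only to generate illustrative counts.
true_freqs = (0.23, 0.37, 0.07, 0.33)
phi_true = true_freqs[0] * true_freqs[3] / (
    true_freqs[0] * true_freqs[3] + true_freqs[1] * true_freqs[2]
)
counts = expected_counts(*true_freqs, n=1000)
estimated = haplotype_freqs_from_counts(counts, phi_true)
print(estimated)
```

Because φ itself depends on the haplotype frequencies, in practice it would be recomputed from the current estimates at each iteration rather than supplied.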

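Incomplete (observed) data

Posterior probabilities:

        QQ            Qq            qq            Obs
MM      \Pi_{2|2i}    \Pi_{1|2i}    \Pi_{0|2i}    n_2
Mm      \Pi_{2|1i}    \Pi_{1|1i}    \Pi_{0|1i}    n_1
mm      \Pi_{2|0i}    \Pi_{1|0i}    \Pi_{0|0i}    n_0

The unknown counts are replaced by sums of posterior probabilities:

$$p_{11} = \frac{1}{2n} \left\{ \sum_{i=1}^{n_2} \left[ 2\Pi_{2|2i} + \Pi_{1|2i} \right] + \sum_{i=1}^{n_1} \left[ \Pi_{2|1i} + \phi\, \Pi_{1|1i} \right] \right\} \quad (7)$$

$$p_{10} = \frac{1}{2n} \left\{ \sum_{i=1}^{n_2} \left[ 2\Pi_{0|2i} + \Pi_{1|2i} \right] + \sum_{i=1}^{n_1} \left[ \Pi_{0|1i} + (1-\phi)\, \Pi_{1|1i} \right] \right\} \quad (8)$$

$$p_{01} = \frac{1}{2n} \left\{ \sum_{i=1}^{n_0} \left[ 2\Pi_{2|0i} + \Pi_{1|0i} \right] + \sum_{i=1}^{n_1} \left[ \Pi_{2|1i} + (1-\phi)\, \Pi_{1|1i} \right] \right\} \quad (9)$$

$$p_{00} = \frac{1}{2n} \left\{ \sum_{i=1}^{n_0} \left[ 2\Pi_{0|0i} + \Pi_{1|0i} \right] + \sum_{i=1}^{n_1} \left[ \Pi_{0|1i} + \phi\, \Pi_{1|1i} \right] \right\} \quad (10)$$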

EM algorithm
(1) Give initial values $\Omega^{(0)} = (\mu_2, \mu_1, \mu_0, \sigma^2, p_{11}, p_{10}, p_{01}, p_{00})^{(0)}$,
(2) Calculate $\Pi_{2|i}^{(1)}$, $\Pi_{1|i}^{(1)}$ and $\Pi_{0|i}^{(1)}$ using Eqs. 1-3,
(3) Calculate $\Omega^{(1)}$ using $\Pi_{2|i}^{(1)}$, $\Pi_{1|i}^{(1)}$ and $\Pi_{0|i}^{(1)}$ based on Eqs. 4-10,
(4) Repeat (2) and (3) until convergence.
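The four steps can be put together in a compact loop. The code below is a sketch under several assumptions, not the lecture's implementation: it uses the eight heights and marker genotypes from the earlier table, a common variance over all three QTL genotypes, a mean update for QQ that mirrors the ones for Qq and qq, and a fixed number of iterations in place of a convergence test. All function names and starting values are hypothetical.

```python
import math

def normal_pdf(y, mu, sigma2):
    """Normal density with mean mu and variance sigma2."""
    return math.exp(-(y - mu) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)

def conditional_probs(h):
    """omega[i] = (P(QQ|i), P(Qq|i), P(qq|i)) for marker genotype i in {2, 1, 0}."""
    p11, p10, p01, p00 = h
    joint = {
        2: (p11 * p11, 2 * p11 * p10, p10 * p10),
        1: (2 * p11 * p01, 2 * (p11 * p00 + p10 * p01), 2 * p10 * p00),
        0: (p01 * p01, 2 * p01 * p00, p00 * p00),
    }
    return {i: tuple(x / sum(row) for x in row) for i, row in joint.items()}

def em_ld(y, marker, mu, sigma2, h, n_iter=100):
    """EM for means mu = [mu2, mu1, mu0], variance sigma2 and
    haplotype frequencies h = [p11, p10, p01, p00]."""
    mu, h, n = list(mu), list(h), len(y)
    for _ in range(n_iter):
        omega = conditional_probs(h)
        den = h[0] * h[3] + h[1] * h[2]
        phi = h[0] * h[3] / den if den > 0 else 0.5
        # E-step (Eqs. 1-3): posterior QTL genotype probabilities.
        post = []
        for yi, mi in zip(y, marker):
            w = [omega[mi][k] * normal_pdf(yi, mu[k], sigma2) for k in range(3)]
            total = sum(w)
            post.append([x / total for x in w])
        # M-step (Eqs. 4-6): weighted means and pooled variance.
        for k in range(3):
            weight = sum(P[k] for P in post)
            if weight > 0:
                mu[k] = sum(P[k] * yi for P, yi in zip(post, y)) / weight
        sigma2 = sum(
            P[k] * (yi - mu[k]) ** 2 for P, yi in zip(post, y) for k in range(3)
        ) / n
        # M-step (Eqs. 7-10): expected haplotype counts.
        c = [0.0, 0.0, 0.0, 0.0]
        for (P2, P1, P0), mi in zip(post, marker):
            if mi == 2:
                c[0] += 2 * P2 + P1
                c[1] += 2 * P0 + P1
            elif mi == 1:
                c[0] += P2 + phi * P1
                c[1] += P0 + (1 - phi) * P1
                c[2] += P2 + (1 - phi) * P1
                c[3] += P0 + phi * P1
            else:
                c[2] += 2 * P2 + P1
                c[3] += 2 * P0 + P1
        h = [x / (2 * n) for x in c]
    return mu, sigma2, h

# Heights and marker genotypes (2 = MM, 1 = Mm, 0 = mm) from the slide.
heights = [184, 185, 180, 182, 167, 169, 165, 166]
marker = [2, 2, 1, 1, 1, 1, 0, 0]

# Starting values are arbitrary assumptions.
mu_hat, sigma2_hat, h_hat = em_ld(
    heights, marker, mu=[185.0, 175.0, 165.0], sigma2=25.0, h=[0.3, 0.2, 0.2, 0.3]
)
print(mu_hat, sigma2_hat, h_hat)
```

With only eight individuals the fit is purely illustrative. One useful check is that p11 + p10 always equals the observed frequency of marker allele M, since the marker genotypes are fully observed.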

Example: Human Obesity