Interval mapping with maximum likelihood

Data files:
- Marker file: all markers
- Traits file: all traits
- Linkage map: built from the markers

For example:


Marker file, header row: ID #PN RG472 RG246 K5 U10 RG532 W1 RG173 Amy1B RZ276 RG146

Traits file, header row: ID 10D 20D 30D 40D 50D 60D 70D 80D 90D

[Figure: linkage map of the markers on four chromosomes (chrom1, chrom2, chrom3, chrom4). Marker labels as extracted from the figure: RG472 RG K5 U10 RG532 W1 RG173 RZ276 Amy1B RG146 RG345 RG381 RZ19 RG690 RZ730 RZ801 RG810 RG RG437 RG544 RG171 RG157 RZ318 Pall RZ58 CDO686 Amy1A/C RG95 RG654 RG256 RZ213 RZ123 RG RG104 RG348 RZ329 RZ892 RG100 RG191 RZ678 RZ574 RZ284 RZ394 pRD10A RZ403 RG179 CDO337 RZ337A RZ448 RZ519 Pgi-1 CDO87 RG910 RG418A RG218 RZ262 RG190 RG908 RG91 RG449 RG788 RZ565 RZ675 RG163 RZ590 RG214 RG143 RG]

Interval Mapping Program: screens

- Type of Study; Genetic Design
- Data and Options: names of markers (optional); cumulative marker distance (cM); map function; QTL searching step (cM); parameters (for simulation study only)
- Data: put markers and trait data into the box below OR [alternative not shown]
- Analyze Data
- Profile
- Permutation Test: #Tests; the cut point at a given level is based on the number of tests
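The permutation test screen can be illustrated with a small sketch. This is only an assumed outline of the idea, not the program's actual implementation: the trait values are shuffled against the marker data, the largest test statistic over the genome is recorded for each shuffle, and the cut point is an upper quantile of those maxima. For brevity a single-marker likelihood ratio stands in for the interval-mapping LR; the data, `nPerm` and `alpha` are made up for illustration.

```matlab
% Sketch of a permutation threshold (assumed procedure, simulated data).
rng(2);                          % fixed seed so the run is reproducible
N = 100; nmrk = 5;
mk = rand(N, nmrk) > 0.5;        % simulated backcross markers (illustrative)
y  = randn(N, 1);                % trait simulated without any QTL effect
nPerm = 200;                     % assumed number of permutation tests
alpha = 0.05;                    % assumed significance level

rss0  = sum((y - mean(y)).^2);   % null residual sum of squares (same for every shuffle)
maxLR = zeros(nPerm, 1);
for p = 1:nPerm
    yp = y(randperm(N));         % break the marker-trait association
    best = 0;
    for k = 1:nmrk
        g = mk(:, k);
        rss1 = sum((yp(g) - mean(yp(g))).^2) + sum((yp(~g) - mean(yp(~g))).^2);
        best = max(best, N * log(rss0 / rss1));   % single-marker LR (stand-in statistic)
    end
    maxLR(p) = best;             % genome-wide maximum for this shuffle
end

sorted = sort(maxLR);
cut = sorted(ceil((1 - alpha) * nPerm));   % empirical (1 - alpha) cut point
fprintf('Cut point at level %.2f based on %d tests: %.3f\n', alpha, nPerm, cut);
```

An observed LR above `cut` would then be declared significant at the chosen genome-wide level.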

Backcross Population – Two Point

Marker   Freq   Qq         qq
Mm       1/2    (1-r)/2    r/2
mm       1/2    r/2        (1-r)/2

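Backcross Population – Three Point

Marker   Freq       Qq              qq
MmNn     (1-r)/2    (1-r1)(1-r2)    r1*r2
Mmnn     r/2        (1-r1)r2        r1(1-r2)
mmNn     r/2        r1(1-r2)        (1-r1)r2
mmnn     (1-r)/2    r1*r2           (1-r1)(1-r2)

Locus order: M, Q, N, with r1 between M and Q, r2 between Q and N, and r between M and N. The Qq and qq entries in each row sum to 1-r or r (taking r = r1 + r2 - 2*r1*r2), so dividing them by that sum gives the conditional QTL genotype probabilities.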
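The three-point table can be turned into conditional probabilities of the QTL genotype given the flanking markers. The following is a minimal sketch, assuming no crossover interference and made-up values of `r1` and `r2`; the variable names are illustrative. It also compares the exact values with the approximation `[1, 1-theta, theta, 0]` that the EM code in these slides uses for `th`, here with `theta = r1/r` as an assumed stand-in for the distance ratio.

```matlab
% Conditional probability of Qq given the flanking marker genotypes
% in a backcross, assuming no interference.
r1 = 0.05;                       % assumed recombination fraction M-Q
r2 = 0.10;                       % assumed recombination fraction Q-N
r  = r1 + r2 - 2*r1*r2;          % recombination fraction M-N

% Rows: MmNn, Mmnn, mmNn, mmnn; columns: Qq, qq (joint frequencies).
joint = [(1-r1)*(1-r2), r1*r2; ...
         (1-r1)*r2,     r1*(1-r2); ...
         r1*(1-r2),     (1-r1)*r2; ...
         r1*r2,         (1-r1)*(1-r2)] / 2;

markerFreq = sum(joint, 2);              % (1-r)/2, r/2, r/2, (1-r)/2
condQq     = joint(:, 1) ./ markerFreq;  % P(Qq | marker genotype)

% Approximation that ignores double recombinants.
theta  = r1 / r;
approx = [1; 1 - theta; theta; 0];

disp([condQq approx]);                   % exact vs. approximate, row by row
```

For closely spaced markers the two columns differ only slightly, which is why the simpler form is often acceptable.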

F2 Population – Two Point

Marker   Freq   QQ           Qq             qq
MM       1/4    (1-r)^2/4    (1-r)r/2       r^2/4
Mm       1/2    (1-r)r/2     1/2-(1-r)r     (1-r)r/2
mm       1/4    r^2/4        (1-r)r/2       (1-r)^2/4

F2 Population – Three Point

Marker   Freq          QQ                      Qq                              qq
MMNN     (1-r)^2/4     1/4(1-a)^2(1-b)^2       1/2a(1-a)b(1-b)                 1/4a^2b^2
MMNn     (1-r)r/2      1/2(1-a)^2b(1-b)        1/2a(2b^2-2b+1)(1-a)            1/2a^2b(1-b)
MMnn     r^2/4         1/4(1-a)^2b^2           1/2a(1-a)b(1-b)                 1/4a^2(1-b)^2
MmNN     (1-r)r/2      1/2a(1-a)(1-b)^2        1/2b(1-2a+2a^2)(1-b)            1/2a(1-a)b^2
MmNn     1/2-(1-r)r    a(1-a)b(1-b)            1/2(2b^2-2b+1)(1-2a+2a^2)       a(1-a)b(1-b)
Mmnn     (1-r)r/2      1/2a(1-a)b^2            1/2b(1-2a+2a^2)(1-b)            1/2a(1-a)(1-b)^2
mmNN     r^2/4         1/4a^2(1-b)^2           1/2a(1-a)b(1-b)                 1/4(1-a)^2b^2
mmNn     (1-r)r/2      1/2a^2b(1-b)            1/2a(2b^2-2b+1)(1-a)            1/2(1-a)^2b(1-b)
mmnn     (1-r)^2/4     1/4a^2b^2               1/2a(1-a)b(1-b)                 1/4(1-a)^2(1-b)^2

Locus order: M, Q, N, with recombination fraction a between M and Q, b between Q and N, and r = a+b-2ab between M and N.
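The F2 three-point table can be checked numerically. The sketch below rebuilds the joint frequencies from two-point conditional probabilities, assuming no interference, and confirms that each row sums to the marker frequency in the Freq column. The values of `a` and `b` and all variable names are illustrative assumptions.

```matlab
% Rebuild the F2 three-point table from two-point transition probabilities.
a = 0.08;                 % assumed recombination fraction M-Q
b = 0.12;                 % assumed recombination fraction Q-N
r = a + b - 2*a*b;        % recombination fraction M-N

% T(c): rows = genotype at one locus (2, 1, 0 capital alleles),
% columns = genotype at a locus with recombination fraction c.
T = @(c) [(1-c)^2, 2*c*(1-c), c^2; ...
          c*(1-c), 1-2*c*(1-c), c*(1-c); ...
          c^2, 2*c*(1-c), (1-c)^2];
pQ = [1/4, 1/2, 1/4];     % F2 frequencies of QQ, Qq, qq
Ta = T(a);  Tb = T(b);  Tr = T(r);

% Rows: MMNN, MMNn, MMnn, MmNN, ..., mmnn; columns: QQ, Qq, qq.
joint = zeros(9, 3);
for m = 1:3
    for n = 1:3
        for q = 1:3
            joint(3*(m-1)+n, q) = pQ(q) * Ta(q, m) * Tb(q, n);
        end
    end
end

markerFreq = sum(joint, 2);                    % row sums of the table
expected   = reshape((diag(pQ) * Tr)', [], 1); % two-point M-N frequencies
disp(max(abs(markerFreq - expected)));         % should be at rounding level
```

Dividing each row of `joint` by its sum gives the conditional QTL genotype probabilities needed for F2 interval mapping.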

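The log-likelihood equations are obtained by differentiating log L with respect to each unknown parameter, setting the derivatives equal to zero, and solving.

$$L(y, M \mid \Omega) = \prod_{i=1}^{n} \left[ \pi_{1|i} f_1(y_i) + \pi_{0|i} f_0(y_i) \right]$$

$$\log L(y, M \mid \Omega) = \sum_{i=1}^{n} \log \left[ \pi_{1|i} f_1(y_i) + \pi_{0|i} f_0(y_i) \right]$$

Define

$$\Pi_{1|i} = \frac{\pi_{1|i} f_1(y_i)}{\pi_{1|i} f_1(y_i) + \pi_{0|i} f_0(y_i)} \quad (1)$$

$$\Pi_{0|i} = \frac{\pi_{0|i} f_0(y_i)}{\pi_{1|i} f_1(y_i) + \pi_{0|i} f_0(y_i)} \quad (2)$$

$$\mu_1 = \frac{\sum_{i=1}^{n} \Pi_{1|i}\, y_i}{\sum_{i=1}^{n} \Pi_{1|i}} \quad (3)$$

$$\mu_0 = \frac{\sum_{i=1}^{n} \Pi_{0|i}\, y_i}{\sum_{i=1}^{n} \Pi_{0|i}} \quad (4)$$

$$\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} \left[ \Pi_{1|i} (y_i - \mu_1)^2 + \Pi_{0|i} (y_i - \mu_0)^2 \right] \quad (5)$$

$$\theta = \frac{\sum_{i=1}^{n_2} \Pi_{1|i} + \sum_{i=1}^{n_3} \Pi_{0|i}}{n_2 + n_3} \quad (6)$$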
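As a worked step, equation (3) can be derived as follows. This assumes that f_1 and f_0 are normal densities with means \mu_1 and \mu_0 and a common variance \sigma^2 (the model used in the MATLAB code of these slides), so that \partial f_1(y_i)/\partial \mu_1 = f_1(y_i)(y_i - \mu_1)/\sigma^2. \Pi_{1|i} is the posterior probability defined in (1).

```latex
\begin{aligned}
\frac{\partial \log L}{\partial \mu_1}
  &= \sum_{i=1}^{n}
     \frac{\pi_{1|i}\, \partial f_1(y_i) / \partial \mu_1}
          {\pi_{1|i} f_1(y_i) + \pi_{0|i} f_0(y_i)}
   = \sum_{i=1}^{n} \Pi_{1|i}\, \frac{y_i - \mu_1}{\sigma^2} = 0 \\
  &\Longrightarrow \quad
   \mu_1 = \frac{\sum_{i=1}^{n} \Pi_{1|i}\, y_i}{\sum_{i=1}^{n} \Pi_{1|i}}
\end{aligned}
```

Because \Pi_{1|i} itself depends on the parameters, the equation is not a closed-form solution; it is iterated together with (1), (2) and (4) to (6), which is the EM scheme.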

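Randomly Generate Markers for Backcross Population

function [mk, testres] = GenMarkerForBackcross(dist, N)
% Generate N backcross marker genotypes from the marker distances (cM) in dist.
if dist(1) ~= 0
    cm = [0 dist] / 100;      % prepend the first marker; convert cM to Morgans
else
    cm = dist / 100;
end
n = length(cm);
% Kosambi map function, r = tanh(2d)/2. cm(i) is used as the distance
% between markers i-1 and i.
rs = 1/2 * (exp(2*cm) - exp(-2*cm)) ./ (exp(2*cm) + exp(-2*cm));
mk = zeros(N, n);
testres = [];                 % declared output; not computed in this excerpt
for j = 1:N
    mk(j,1) = (rand > 0.5);   % first marker: 1 or 0 with probability 1/2
end
for i = 2:n
    for j = 1:N
        if mk(j,i-1) == 1
            mk(j,i) = rand > rs(i);   % stays 1 unless a recombination occurs
        else
            mk(j,i) = rand < rs(i);   % becomes 1 only after a recombination
        end
    end
end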
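The expression used for `rs` in the generator is the Kosambi map function, r = tanh(2d)/2 with d in Morgans. The sketch below checks that identity and simulates one marker pair to confirm that the empirical recombination fraction is close to the target. The distance, sample size and seed are assumptions chosen for illustration.

```matlab
% Check the map function and simulate one pair of backcross markers.
rng(1);                           % fixed seed so the run is reproducible
d  = 10 / 100;                    % assumed distance: 10 cM, in Morgans
r  = 1/2 * (exp(2*d) - exp(-2*d)) / (exp(2*d) + exp(-2*d));
rK = tanh(2*d) / 2;               % Kosambi map function in closed form

N  = 20000;                       % assumed number of simulated individuals
m1 = rand(N, 1) > 0.5;            % first marker: 1 or 0 with probability 1/2
recomb = rand(N, 1) < r;          % recombination event between the markers
m2 = xor(m1, recomb);             % second marker flips when recombination occurs

rHat = mean(m1 ~= m2);            % empirical recombination fraction
fprintf('formula %.4f, tanh form %.4f, simulated %.4f\n', r, rK, rHat);
```

The simulated value fluctuates around the formula value with a standard error of roughly sqrt(r(1-r)/N).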

EM algorithm for Interval Mapping

function res = intmapbackross(Datas, mrkplace)
% Interval mapping for a backcross by EM.
% Datas: columns 1..nmrk are marker genotypes (1/0); the last column is the trait.
% for example, mrkplace=[ ];
N = size(Datas,1);
nmrk = numel(mrkplace);
y = Datas(:, nmrk+1);
mm = mean(y);
vv = var(y, 1);
ll0 = N * (-log(2 * pi * vv) - 1) / 2;   % log-likelihood under the null (no QTL)
tol = 1e-8;   % convergence tolerance; the slide's value was lost, 1e-8 is an assumption
res = [];
for cm = 1:2:mrkplace(nmrk)
    for i = 1:nmrk-1                     % left flanking marker of position cm
        if mrkplace(i) <= cm
            qtlk = i;
        end
    end
    theta = (cm - mrkplace(qtlk)) / (mrkplace(qtlk + 1) - mrkplace(qtlk));
    th(1) = 1; th(2) = 1 - theta; th(3) = theta; th(4) = 0;
    mu1 = mm; mu0 = mm; s2 = vv;
    omu1 = Inf;                          % force at least one EM iteration
    while abs(mu1 - omu1) > tol
        omu1 = mu1;
        cmu1 = 0; cmu0 = 0; cs2 = 0; cpi = 0; ll = 0;
        for j = 1:N
            f1 = 1 / sqrt(2 * pi * s2) * exp(-(y(j) - mu1)^2 / 2 / s2);
            f0 = 1 / sqrt(2 * pi * s2) * exp(-(y(j) - mu0)^2 / 2 / s2);
            pi1i = th(4 - Datas(j, qtlk + 1) - Datas(j, qtlk) * 2);
            pi0i = 1 - pi1i;
            ll = ll + log(pi1i * f1 + pi0i * f0);
            BPi1i = pi1i * f1 / (pi1i * f1 + pi0i * f0);   % E-step
            BPi0i = 1 - BPi1i;
            cmu1 = cmu1 + BPi1i * y(j);                    % M-step sums
            cmu0 = cmu0 + BPi0i * y(j);
            cs2 = cs2 + BPi1i * (y(j) - mu1)^2 + BPi0i * (y(j) - mu0)^2;
            cpi = cpi + BPi1i;
        end
        mu1 = cmu1 / cpi;                                  % M-step updates
        mu0 = cmu0 / (N - cpi);
        s2 = cs2 / N;
    end
    prob = th(4 - Datas(:, qtlk + 1) - Datas(:, qtlk) * 2)';
    % Simplex Local Search Method. The call was garbled in the slide;
    % fminsearch and the output name fval are assumed here.
    [mmmm, fval] = fminsearch(@(p) likelihoodback(p, y, [prob 1-prob]), [mm mm vv]);
    LR = 2 * (ll - ll0);
    res = [res; [cm mu1 mu0 s2 LR]];
end

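function A = likelihoodback(par, y, marker)
% Negative log-likelihood of the two-component normal mixture.
% par = [mu1 mu0 s2]; marker = [P(Qq) P(qq)] for each individual (N x 2).
mu1 = par(1); mu0 = par(2); s2 = par(3);
yy1 = y - mu1;
yy0 = y - mu0;
A = sum(log(sum([exp(-yy1.^2/2/s2) exp(-yy0.^2/s2/2)] .* marker, 2)) ...
        - log(s2)/2 - 1/2*log(2*pi)) - 10.E5*(s2 < 0.001);   % penalize tiny variances
A = -A;   % negated so that a minimizer maximizes the likelihood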
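The null log-likelihood used in the LR statistic, ll0 = N(-log(2*pi*vv) - 1)/2, is the normal log-likelihood evaluated at the sample mean and the maximum-likelihood variance. The sketch below checks this on simulated data and converts a likelihood ratio to a LOD score using LOD = LR / (2 ln 10). The data and the example LR value are made up for illustration.

```matlab
% Verify the closed form of the null log-likelihood on simulated trait data.
rng(3);                               % fixed seed so the run is reproducible
N  = 150;                             % assumed sample size
y  = 5 + 2 * randn(N, 1);             % simulated trait values (illustrative)
mm = mean(y);
vv = var(y, 1);                       % ML variance: normalized by N, not N-1

ll0 = N * (-log(2 * pi * vv) - 1) / 2;                            % closed form
llDirect = sum(-0.5 * log(2 * pi * vv) - (y - mm).^2 / (2 * vv)); % term-by-term sum

LR  = 13.8;                           % example likelihood ratio (made up)
LOD = LR / (2 * log(10));             % same evidence on the log10 scale

fprintf('ll0 = %.6f, direct = %.6f, LOD = %.3f\n', ll0, llDirect, LOD);
```

The two log-likelihood values agree because the squared deviations sum to N*vv, which reduces the second term to -N/2.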