June 1999, NCSU QTL Workshop © Brian S. Yandell. Bayesian Inference for QTLs in Inbred Lines II. Brian S Yandell, University of Wisconsin-Madison, www.stat.wisc.edu/~yandell

Presentation transcript:

Slide 1: Bayesian Inference for QTLs in Inbred Lines II
Brian S Yandell, University of Wisconsin-Madison
NCSU Statistical Genetics, June 1999

Slide 2: Many Thanks
Michael Newton, Daniel Sorensen, Daniel Gianola, Jaya Satagopan, Patrick Gaffney, Fei Zou, Tom Osborn, David Butruille, Marcio Ferrera, Josh Udahl, Pablo Quijada
USDA Hatch Grants

Slide 3: Overview II
quick review of trait model
–single & multiple QTL
–details of Gibbs sampler full conditionals
–vector notation
reversible jump MCMC
–multiple regression
–number of QTLs
deconstructing Bayesian LODs

Slide 4: Quick Review of Trait Model
single QTL
details of Gibbs sampler
–normal priors & likelihoods: mean, additive effects
–inverse gamma prior for variance, or inverse chi-square
–vague priors lead to usual estimates as posterior means
multiple QTL trait model
–model with vector notation

Slide 5: Single QTL Trait Model
trait = mean + additive + error
trait = effect_of_geno + error
prob( trait | geno, effects )
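The single QTL trait model above can be sketched in a few lines of Python. This is an illustration only: the function names, the normal error distribution and the parameter values are assumptions, and the genotype coding (-1, 1) follows the marker coding mentioned later in the software slides.

```python
import numpy as np

def simulate_trait(n, mean, additive, sigma, rng):
    """Simulate trait = mean + additive * geno + error (assumed normal error)."""
    geno = rng.choice([-1, 1], size=n)            # assumed genotype coding
    trait = mean + additive * geno + rng.normal(0.0, sigma, size=n)
    return geno, trait

def log_likelihood(trait, geno, mean, additive, sigma):
    """log prob( trait | geno, effects ) under independent normal errors."""
    resid = trait - (mean + additive * geno)
    n = trait.size
    return -0.5 * n * np.log(2 * np.pi * sigma**2) - 0.5 * np.sum(resid**2) / sigma**2

rng = np.random.default_rng(1)
geno, trait = simulate_trait(200, mean=10.0, additive=2.0, sigma=1.0, rng=rng)
ll_qtl = log_likelihood(trait, geno, 10.0, 2.0, 1.0)    # model with the QTL effect
ll_null = log_likelihood(trait, geno, 10.0, 0.0, 1.0)   # same model, additive = 0
```

With an effect this large, the log likelihood at the simulating values should clearly exceed the one with the additive effect set to zero.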

Slide 6: Gibbs Sampler Updates
diagram: variance, mean, traits, additive, genos

Slide 7: Full Conditional for mean
normal prior with large variance leads to normal posterior
posterior mean
posterior variance

Slide 8: Full Conditional for additive Effect
normal prior with large variance leads to normal posterior
posterior mean
posterior variance

Slide 9: Full Conditional for variance
inverse gamma prior with large v/a
posterior distribution
posterior mean

Slide 10: MCMC run for variance

Slide 11: Alternative for Variance: use Inverse Chi-square
inverse chi-square prior with large d,v
posterior distribution
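A minimal sketch of the three Gibbs updates (mean, additive effect, variance) with genotypes treated as known. The prior settings are assumptions chosen for illustration: N(0, tau2) priors on the mean and additive effect, and an inverse gamma prior with shape a and scale b on the variance. The slides' own parametrization is not preserved in this transcript, so this is not necessarily the one used there.

```python
import numpy as np

def gibbs_single_qtl(trait, geno, n_iter=2000, tau2=100.0, a=1.0, b=1.0, seed=0):
    """Gibbs sampler for (mean, additive, variance) given known genotypes."""
    rng = np.random.default_rng(seed)
    n = trait.size
    mu, add, sig2 = trait.mean(), 0.0, trait.var()
    draws = np.empty((n_iter, 3))
    for t in range(n_iter):
        # mean | rest: normal, precision = n / sig2 + 1 / tau2
        v = 1.0 / (n / sig2 + 1.0 / tau2)
        mu = rng.normal(v * np.sum(trait - add * geno) / sig2, np.sqrt(v))
        # additive | rest: normal, precision = sum(geno^2) / sig2 + 1 / tau2
        v = 1.0 / (np.sum(geno**2) / sig2 + 1.0 / tau2)
        add = rng.normal(v * np.sum(geno * (trait - mu)) / sig2, np.sqrt(v))
        # variance | rest: inverse gamma(a + n/2, b + SS/2), drawn as 1 / gamma
        resid = trait - mu - add * geno
        sig2 = 1.0 / rng.gamma(a + n / 2.0, 1.0 / (b + 0.5 * np.sum(resid**2)))
        draws[t] = mu, add, sig2
    return draws

rng = np.random.default_rng(42)
geno = rng.choice([-1, 1], size=300)
trait = 10.0 + 2.0 * geno + rng.normal(0.0, 1.0, size=300)
draws = gibbs_single_qtl(trait, geno)
post_mean = draws[500:].mean(axis=0)   # discard the first 500 draws as burn-in
```

With vague priors like these, the posterior means should sit close to the usual least squares estimates, which is the point made on slide 4.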

Slide 12: Markov chain updates
diagram: effects, locus, traits, genos

Slide 13: Prior for locus
no prior information on locus
–uniform prior over genome
–use framework map: choose interval proportional to length, then pick uniform position within interval
prior information from other studies
–concentrate on credible regions
–use posterior of previous study as new prior
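The two-stage uniform prior draw (interval chosen in proportion to its length, then a uniform position inside it) might look like the sketch below. The marker positions are hypothetical and a single linkage group is assumed.

```python
import numpy as np

def draw_locus(marker_pos, rng):
    """Draw a locus uniformly over one linkage group (positions in cM)."""
    lengths = np.diff(marker_pos)
    # choose a marker interval with probability proportional to its length
    k = rng.choice(lengths.size, p=lengths / lengths.sum())
    # then pick a uniform position within that interval
    return marker_pos[k] + rng.uniform(0.0, lengths[k])

markers = np.array([0.0, 10.0, 15.0, 40.0, 60.0])   # hypothetical framework map
rng = np.random.default_rng(2)
loci = np.array([draw_locus(markers, rng) for _ in range(5000)])
```

The combined draw is uniform over the whole map, so about 25/60 of the draws should land in the 15 to 40 cM interval.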

Slide 14: Metropolis-Hastings Step
pick new locus based upon current locus
–propose new locus from distribution q( ): pick value near current one? pick uniformly across genome?
–accept new locus with probability a( )
Gibbs sampler is special case of M-H
–always accept new proposal
acceptance ensures right stable distribution
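A sketch of one Metropolis-Hastings update for a locus, using the "pick value near current one" option with a symmetric uniform proposal. The target `toy_log_post` is a made-up stand-in for the real locus posterior, and the step size and map length are assumptions.

```python
import numpy as np

def mh_locus_step(current, log_post, rng, step=5.0, lo=0.0, hi=100.0):
    """One Metropolis-Hastings update of a locus position (cM)."""
    proposal = current + rng.uniform(-step, step)   # symmetric proposal q()
    if proposal < lo or proposal > hi:
        return current, False                       # posterior is zero off the map
    log_a = log_post(proposal) - log_post(current)  # q() cancels when symmetric
    if np.log(rng.uniform()) < log_a:               # accept with prob min(1, a)
        return proposal, True
    return current, False

# Hypothetical target: locus posterior peaked at 40 cM with sd 3 cM.
def toy_log_post(x):
    return -0.5 * ((x - 40.0) / 3.0) ** 2

rng = np.random.default_rng(5)
locus, chain, n_accept = 10.0, [], 0
for _ in range(5000):
    locus, accepted = mh_locus_step(locus, toy_log_post, rng)
    chain.append(locus)
    n_accept += accepted
chain = np.array(chain)
```

A proposal drawn uniformly across the genome would use the same acceptance rule; only the proposal line changes.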

Slide 15: Full Conditional for genos
full conditional for genotype depends on
–effects via trait model
–locus via recombination model
can explicitly decompose by individual j
–binomial (or trinomial) probability
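For a two-genotype cross, the per-individual binomial probability can be sketched as the product of a recombination term and a trait-model term. The Haldane map function, the (-1, 1) coding, fully observed flanking markers and all numeric values are assumptions made for this illustration.

```python
import numpy as np

def haldane(d_cm):
    """Recombination fraction for a map distance in cM (Haldane, assumed)."""
    return 0.5 * (1.0 - np.exp(-2.0 * d_cm / 100.0))

def geno_full_conditional(trait, left, right, d_left, d_right, mu, add, sig2):
    """P(geno = +1 | trait, flanking markers, effects) for each individual j."""
    r1, r2 = haldane(d_left), haldane(d_right)
    weights = []
    for g in (-1, 1):
        p_left = np.where(left == g, 1 - r1, r1)      # locus via recombination model
        p_right = np.where(right == g, 1 - r2, r2)
        lik = np.exp(-0.5 * (trait - mu - add * g) ** 2 / sig2)   # effects via trait model
        weights.append(p_left * p_right * lik)
    return weights[1] / (weights[0] + weights[1])

trait = np.array([12.0, 8.0, 10.0])
left = np.array([1, -1, 1])       # flanking marker genotypes, coded (-1, 1)
right = np.array([1, -1, -1])
p_plus = geno_full_conditional(trait, left, right, 5.0, 5.0, mu=10.0, add=2.0, sig2=1.0)

# Gibbs draw of genotypes, one independent Bernoulli draw per individual
rng = np.random.default_rng(0)
geno = np.where(rng.uniform(size=p_plus.size) < p_plus, 1, -1)
```

The third individual has recombinant flanking markers at equal distances and a trait equal to the mean, so both genotypes are equally likely.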

Slide 16: Missing Marker Data
sample missing marker data a la QT genotypes
full conditional for missing markers depends on
–flanking markers
–possible flanking QTL
can explicitly decompose by individual j
–binomial (or trinomial) probability

Slide 17: Multiple QTL model
trait = mean + add1 + add2 + error
trait = effect_of_genos + error
prob( trait | genos, effects )

Slide 18: Vector Notation for QTLs
inner product for sum
condense notation

Slide 19: Multiple loci
vector of loci across linkage map
careful bookkeeping during update
–identifiability & bump hunting
–possibility of two loci in one marker interval
ordered loci are sufficient

Slide 20: Posterior: Multiple QTLs
posterior = likelihood * prior / constant
posterior( parameters | data )
prob( genos, effects, loci | traits, map ) is proportional to likelihood * prior

Slide 21: MCMC for Multiple QTLs
construct Markov chain around posterior
update one (or several) components at a time
–update effects given genotypes & traits
–update loci given genotypes & traits
–update genotypes given loci & effects
update all terms for each locus at one time?
–open questions of efficient mixing

Slide 22: MCMC Updates
diagram: effects, loci, traits, genos

Slide 23: MCMC Conditions
construct Markov chain with stable distribution
ergodic Markov chain
–reversible (detailed balance)
–irreducible (can get from any value to any other)
–aperiodic (no fixed pattern)
–positive recurrent (chance to visit all possible values)

Slide 24: Reversible Jump MCMC
basic idea of Green (1995)
model selection in regression
how many QTLs?
–number of QTL is random
–estimate the number m
RJ-MCMC vs. Bayes factors
other similar ideas

Slide 25: Jumping the Number of QTL
model changes with number of QTL
–almost analogous to stepwise regression
–use reversible jump MCMC to change number
bookkeeping helps in comparing models
change of variables between models
prior on number of QTL
–uniform over some range
–Poisson with prior mean

Slide 26: Posterior: Number of QTL
posterior = likelihood * prior / constant
posterior( parameters | data )
prob( genos, effects, loci, m | traits, map ) is proportional to likelihood * prior

Slide 27: Reversible Jump Choices
action step: draw one of three choices
update step with probability 1-b(m+1)-d(m)
–update current model
–loci, effects, genotypes as before
add a locus with probability b(m+1)
–propose a new locus
–innovate effect and genotypes at new locus
–decide whether to accept the “birth” of new locus
drop a locus with probability d(m)
–pick one of existing loci to drop
–decide whether to accept the “death” of locus
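The action step can be sketched as a draw among birth, death and update. The slides do not say how b(m+1) and d(m) are set; the choice below follows the common Green (1995) style rule tied to a Poisson prior on m, and the constant c and the cap m_max are assumptions.

```python
import numpy as np

def move_probabilities(m, prior_mean, c=0.4, m_max=10):
    """Return (b(m+1), d(m), update probability) for a model with m loci."""
    # birth: c * min(1, prior(m+1) / prior(m)) for a Poisson prior
    birth = c * min(1.0, prior_mean / (m + 1)) if m < m_max else 0.0
    # death: c * min(1, prior(m-1) / prior(m)); impossible when m = 0
    death = c * min(1.0, m / prior_mean) if m > 0 else 0.0
    return birth, death, 1.0 - birth - death

def choose_move(m, prior_mean, rng):
    """Draw one of the three reversible jump actions."""
    birth, death, update = move_probabilities(m, prior_mean)
    return rng.choice(["birth", "death", "update"], p=[birth, death, update])

rng = np.random.default_rng(3)
moves = [choose_move(2, 3.0, rng) for _ in range(1000)]
```

Keeping c at or below 0.5 guarantees the update probability 1-b(m+1)-d(m) is never negative.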

Slide 28: Markov chain for number m
diagram: states 0, 1, ..., m-1, m, m+1, ...
moves: add a new locus, drop a locus, update current model

Slide 29: Jumping QTL number & loci

Slide 30: RJ-MCMC Updates
diagram: effects, loci, traits, genos
add locus with probability b(m+1)
drop locus with probability d(m)
update with probability 1-b(m+1)-d(m)

Slide 31: Propose to Add a locus
propose a new locus
–similar proposal to ordinary update: uniform chance over genome; easier to avoid interval with another QTL
–need genotypes at locus & model effect
innovate effect & genotypes at new locus
–draw genotypes based on recombination (prior): no dependence on trait model yet
–draw effect as in Green’s reversible jump: adjust for collinearity; modify other parameters accordingly
check acceptance...

Slide 32: Propose to Drop a locus
choose an existing locus
–equal weight for all loci?
–more weight to loci with small effects?
“drop” effect & genotypes at old locus
–adjust effects at other loci for collinearity
–this is reverse jump of Green (1995)
check acceptance...
–do not drop locus, effects & genotypes
–until move is accepted

Slide 33: Acceptance of Reversible Jump
accept birth of new locus with probability min(1,A)
accept death of old locus with probability min(1,1/A)

Slide 34: Acceptance of Reversible Jump
move probabilities
birth & death proposals
Jacobian between models
–fudge factor
–see stepwise regression example
diagram: m, m+1
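The pieces listed on slide 34 combine into the acceptance ratio A. The slide's own formula is not preserved in this transcript; what follows is the general form from Green (1995), written in the slides' notation as far as it can be matched. The symbols u (innovations for the new locus, effect and genotypes), q_b (their proposal density) and q_d (probability of picking that locus in the reverse death move) are labels introduced here for illustration.

```latex
A \;=\;
\underbrace{\frac{p(y \mid \theta', m+1)\, p(\theta', m+1)}
                 {p(y \mid \theta, m)\, p(\theta, m)}}_{\text{likelihood} \times \text{prior ratio}}
\;\times\;
\underbrace{\frac{d(m+1)\, q_d(\theta')}{b(m+1)\, q_b(u)}}_{\text{move and proposal ratio}}
\;\times\;
\underbrace{\left| \frac{\partial \theta'}{\partial (\theta, u)} \right|}_{\text{Jacobian}}
```

Here b(m+1) is the probability of proposing a birth from the model with m loci, and d(m+1) is the probability of proposing the reverse death from the model with m+1 loci. A birth is accepted with probability min(1,A) and the matching death with probability min(1,1/A).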

Slide 35: RJ-MCMC: Number of QTL

Slide 36: Posterior # QTL for 8-week Data
98% credible region for m: (1,3)
based on 1 million steps
with prior mean of 3

Slide 37: How Good is RJ-MCMC?
simulations with 0, 1 or 2 QTL
–strong effects (additive = 2, variance = 1)
–linked loci 36cM apart
differences with number of QTL
–clear differences by actual number
–works well with 100,000, better with 1M
effect of Poisson prior mean
–larger prior mean shifts posterior up
–but prior does not take over

Slide 38: Posterior for Simulated Data
0, 1 or 2 large QTL
prior Poisson mean of 2
100,000 RJ-MCMC runs

Slide 39: Effect of Prior Mean

Slide 40: # QTL in Brassica Data
4-week & 8-week vernalization
–log( days to flower )
–105 lines, 10 markers
–modest effects
–evidence of 1 or 2 QTL using Bayes factors
histograms of posterior number of QTL
–depends somewhat on prior
–mode is 1 or 2 QTL
90% credible sets
–all include 2 QTL
–include 1 QTL if prior not huge

Slide 41: #QTL for Brassica 8-week

Slide 42: Brassica #QTL 90% Credible Sets
panels: 8-week, 4-week

Slide 43: Brassica #QTL Comparison

Slide 44: Reversible Jump II
reversible jump MCMC details
–can update model with m QTL
–have basic idea of jumping models
–now: careful bookkeeping between models
RJ-MCMC & Bayes factors
–Bayes factors from RJ-MCMC chain
–components of Bayes factors

Slide 45: RJ-MCMC Updates
diagram: effects, loci, traits, genos
add locus with probability b(m+1)
drop locus with probability d(m)
update with probability 1-b(m+1)-d(m)

Slide 46: Reversible Jump Idea
expand idea of MCMC to compare models
adjust for parameters in different models
–augment smaller model with innovations
–constraints on larger model
calculus “change of variables” is key
–add or drop parameter(s)
–carefully compute the Jacobian
consider stepwise regression
–Mallick (1995) & Green (1995)
–efficient calculation with Householder decomposition

Slide 47: Model Selection in Regression
known regressors (e.g. markers)
–models with 1 or 2 regressors
jump between models
–centering regressors simplifies calculations

Slide 48: Slope Estimate for 1 Regressor
recall least squares estimate of slope
note relation of slope to correlation
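The slide's equations are not preserved in this transcript. For reference, the standard least squares slope for one regressor x and its relation to the sample correlation can be written as below, where r_{xy} is the sample correlation and s_x, s_y are the sample standard deviations (notation chosen here, not taken from the slides).

```latex
\hat{b}
\;=\; \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}
\;=\; r_{xy}\,\frac{s_y}{s_x}
```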

Slide 49: 2 Correlated Regressors
slopes adjusted for other regressors
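A numerical sketch of what "adjusted for other regressors" means: the slope for the first regressor in the two-regressor model equals the simple slope of the trait on the part of that regressor left over after regressing out the second. The simulated data and coefficient values are hypothetical.

```python
import numpy as np

def simple_slope(x, y):
    """Least squares slope of y on a single regressor x."""
    xc, yc = x - x.mean(), y - y.mean()
    return np.sum(xc * yc) / np.sum(xc**2)

rng = np.random.default_rng(7)
n = 500
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + 0.8 * rng.normal(size=n)            # correlated with x1
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=n)

b1_simple = simple_slope(x1, y)                      # ignores x2

# adjusted slope: remove the part of x1 explained by x2, then regress y on the rest
x1_adj = (x1 - x1.mean()) - simple_slope(x2, x1) * (x2 - x2.mean())
b1_adjusted = np.sum(x1_adj * (y - y.mean())) / np.sum(x1_adj**2)

# reference fit of the full two-regressor model
X = np.column_stack([np.ones(n), x1, x2])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
```

The simple slope absorbs part of the second regressor's effect through the correlation, which is why a jump between the 1-regressor and 2-regressor models has to adjust the existing slope.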

Slide 50: Gibbs Sampler for Model 1
mean
slope
variance

Slide 51: Gibbs Sampler for Model 2
mean
slopes
variance

Slide 52: Updates from 2->1
drop 2nd regressor
adjust other regressor

Slide 53: Updates from 1->2
add 2nd slope, adjusting for collinearity
adjust other slope & variance

Slide 54: Model Selection in Regression
known regressors (e.g. markers)
–models with 1 or 2 regressors
jump between models
–augment with new innovation z

Slide 55: Change of Variables
change variables from model 1 to model 2
calculus issues for integration
–need to formally account for change of variables
–infinitesimal steps in integration (db)
–involves partial derivatives (next page)

Slide 56: Jacobian & the Calculus
Jacobian sorts out change of variables
–careful: easy to mess up here!

Slide 57: Geometry of Reversible Jump

Slide 58: QT additive Reversible Jump

Slide 59: Credible Set for additive
90% & 95% sets based on normal
regression line corresponds to slope of updates

Slide 60: Efficient Updating of additive
more computations when m > 2
want to avoid matrix inverses
–decompose matrix instead
–solve linear system of equations
use linear algebra
–Householder (QR) decomposition
–LAPACK User’s Guide (1995, 2nd ed), Anderson et al., SIAM

Slide 61: Householder (QR) Decomposition
decomposition
–G is upper triangular
–F is orthogonal
orthogonality
design matrix
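A sketch of regression through a QR decomposition, keeping the slide's names F (orthogonal factor) and G (upper triangular factor). It uses NumPy rather than the C and LAPACK code mentioned in the slides, and the design matrix and effects are simulated for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 105
genos = rng.choice([-1.0, 1.0], size=(n, 2))         # hypothetical genotypes at 2 loci
X = np.column_stack([np.ones(n), genos])             # design matrix: mean + 2 additive
y = X @ np.array([10.0, 2.0, -1.0]) + rng.normal(size=n)

# reduced QR: X = F G, F has orthonormal columns, G is 3 x 3 upper triangular
F, G = np.linalg.qr(X)

# estimators solve the small triangular system G b = F' y, no matrix inverse needed
# (np.linalg.solve is a general solver; a dedicated triangular solver also works)
b_qr = np.linalg.solve(G, F.T @ y)

b_ref = np.linalg.lstsq(X, y, rcond=None)[0]         # reference least squares fit
```

Because G is triangular, adding a regressor only appends a column to the factorization, which is what makes absorbing the old model cheap.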

Slide 62: QR & Regression
model
error piece
model piece
estimators

Slide 63: Absorbing Old Model
old model
–m regressors
–QR decomposition
new model
–m+1 regressors
–use QR to absorb old model

Slide 64: Adjusted Slope Estimators
old slopes
–note m=1 case
added slope
–note sum of squares
variance
–note Jacobian
new slopes

Slide 65: How To Infer loci?
if m is known, use fixed MCMC
–histogram of loci
–issue of bump hunting
combining loci estimates in RJ-MCMC
–some steps are from wrong model: too few loci (bias); too many loci (variance/identifiability)
–condition on number of loci: subsets of Markov chain

Slide 66: Brassica 8-week Data
locus MCMC with m=2

Slide 67: Jumping QTL number & loci

Slide 68: RJ-MCMC loci chain

Slide 69: Raw Histogram of loci

Slide 70: Conditional Histograms

Slide 71: Bayes Factors
ratio of posterior odds to prior odds
–RJ-MCMC gives posterior on number of QTL
–prior is Poisson
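A sketch of a Bayes factor computed as posterior odds over prior odds from the sampled number of QTL. The chain of m values below is made up for illustration, and the function names are not from the slides' software.

```python
import numpy as np
from math import exp, factorial

def poisson_pmf(m, mean):
    """Poisson prior probability of m QTL."""
    return exp(-mean) * mean**m / factorial(m)

def bayes_factor(m_chain, m1, m2, prior_mean):
    """Bayes factor for m1 versus m2 QTL: posterior odds / prior odds."""
    m_chain = np.asarray(m_chain)
    post_odds = np.mean(m_chain == m1) / np.mean(m_chain == m2)
    prior_odds = poisson_pmf(m1, prior_mean) / poisson_pmf(m2, prior_mean)
    return post_odds / prior_odds

# hypothetical RJ-MCMC output: number of QTL at each saved step
m_chain = [1] * 300 + [2] * 600 + [3] * 100
bf_2_vs_1 = bayes_factor(m_chain, 2, 1, prior_mean=3.0)
```

In this toy chain the posterior odds of 2 versus 1 QTL are 2, the Poisson(3) prior odds are 1.5, so the Bayes factor is 4/3. The estimate breaks down when either model is visited rarely or never.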

Slide 72: #QTL for Brassica 8-week

Slide 73: RJ-Bayes Factors (8-week Brassica data)

Slide 74: Simulation Study of Prior Effect
how dramatic is the effect of prior?
simulations of 0, 1 or 2 QTL
–QTL have large effect: additive = 2, variance = 1
–2 QTL spaced 36cM apart
–sample size of 105
RJ-MCMC runs of 100,000

Slide 75: Effect of Prior Mean

Slide 76: Bayes Factor
panels: prior of 2, prior of 4

Slide 77: Computing Bayes Factors
arithmetic mean
–using samples from prior
–mean across Monte Carlo or MCMC runs
–can be inefficient if prior differs from posterior
harmonic mean
–using samples from posterior
–more efficient but less stable
–careful choice of weight h() close to posterior
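The two estimators can be compared on a toy model where the marginal likelihood is known exactly. The model (normal data with unit variance and a normal prior on the mean) and all numbers are assumptions picked only so that the answer can be checked. The weighted harmonic mean with h() is not shown.

```python
import numpy as np

def log_lik(theta, y):
    """log prob( y | theta ) for each draw of theta, with y_i ~ N(theta, 1)."""
    return (-0.5 * y.size * np.log(2 * np.pi)
            - 0.5 * np.sum((y[None, :] - theta[:, None]) ** 2, axis=1))

def log_mean_exp(v):
    """Numerically stable log of the mean of exp(v)."""
    vmax = v.max()
    return vmax + np.log(np.mean(np.exp(v - vmax)))

rng = np.random.default_rng(11)
n, tau2, n_draws = 10, 4.0, 50_000
y = rng.normal(1.0, 1.0, size=n)

# arithmetic mean of likelihoods over samples from the prior N(0, tau2)
prior_draws = rng.normal(0.0, np.sqrt(tau2), size=n_draws)
log_am = log_mean_exp(log_lik(prior_draws, y))

# harmonic mean of likelihoods over samples from the (conjugate) posterior
post_var = 1.0 / (n + 1.0 / tau2)
post_draws = rng.normal(post_var * y.sum(), np.sqrt(post_var), size=n_draws)
log_hm = -log_mean_exp(-log_lik(post_draws, y))

# exact log marginal likelihood for this conjugate model
log_true = (-0.5 * n * np.log(2 * np.pi) - 0.5 * np.log(1 + n * tau2)
            - 0.5 * (np.sum(y**2) - tau2 / (1 + n * tau2) * y.sum() ** 2))
```

The arithmetic mean is reliable here because the prior is not much wider than the posterior. The harmonic mean has infinite variance in this model, which is the instability the slide warns about.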

Slide 78: Stable Bayes Factors
Satagopan, Raftery & Newton (1999)
–weighted harmonic mean
–absorb variance (normal to t dist)
replace by

Slide 79: Bayes Factors & LODs
others have tried arithmetic & harmonic mean
why not geometric mean?
terms that are averaged are log likelihoods...

Slide 80: Bayesian LOD
Bayesian “LOD” computed at each step
–based on LR given sampled genotypes and effects
–can be larger or smaller than profile LOD
–informal diagnostic of fit
–combine to form geometric estimates of Bayes factors
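A sketch of a per-step Bayesian LOD as a base-10 log likelihood ratio at the sampled genotypes and effects. The null model used here (no QTL, with the sample mean and variance of the trait) is an assumption; the slides do not spell out the denominator.

```python
import numpy as np

def normal_loglik(resid, sig2):
    """Normal log likelihood of residuals with variance sig2."""
    return -0.5 * resid.size * np.log(2 * np.pi * sig2) - 0.5 * np.sum(resid**2) / sig2

def bayesian_lod(trait, geno, mu, add, sig2):
    """log10 LR of the QTL model at sampled values versus an assumed no-QTL fit."""
    ll_qtl = normal_loglik(trait - mu - add * geno, sig2)
    ll_null = normal_loglik(trait - trait.mean(), trait.var())
    return (ll_qtl - ll_null) / np.log(10.0)

rng = np.random.default_rng(8)
geno = rng.choice([-1, 1], size=105)
trait = 10.0 + 2.0 * geno + rng.normal(size=105)

blod_strong = bayesian_lod(trait, geno, mu=10.0, add=2.0, sig2=1.0)
blod_zero = bayesian_lod(trait, geno, mu=trait.mean(), add=0.0, sig2=trait.var())
```

Averaging these log likelihood ratios over MCMC steps is an average on the log scale, which is the geometric-mean idea raised on slide 79.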

Slide 81: Compare LODs
scatter plot of loci and Bayesian LODs
–same BLOD for all loci of step
–overlay LOD from interval mapping (red)
–overlay LOD from CIM (green)
–vertical lines at true or inferred loci (blue)
steps with higher BLOD
–may have more likely genotypes
–basis for MCMC step: always up to higher likelihood; sometimes down to lower likelihood

Slide 82: BLODs with no QTL
LOD profile roughly zero
most BLOD values should be negative
–no pattern across linkage group
distribution similar to rescaled chi-square with 1 df
–-2log(LR) approximately chi-square
–assignment of genotypes “irrelevant”
RJ-MCMC skewed with inferred QTL
–numbers indicate #QTL

Slide 83: Simulated Data with No QTL

Slide 84: BLOD: no QTL

Slide 85: RJ-MCMC BLOD: no QTL

Slide 86: BLODs with 1 QTL
LOD peaks at correct locus
most BLOD values near locus
–some considerably larger than LOD
–inferred genotype vs EM average
non-central distribution of BLOD
–rescaled non-central chi-square?
RJ-MCMC dispersed from peak
–locus proposal has local dispersion

Slide 87: LOD for 1 QTL

Slide 88: RJ-MCMC BLOD: 1 QTL

Slide 89: BLODs with 2 QTL
incorrect fit of 1 QTL model
–LOD peaks at ghost locus
–BLOD values near LOD peaks
correct 2-QTL model
–IM misses, CIM gets loci
–BLOD at loci
–BLOD approx. 2x LOD: simultaneous fit of both loci
RJ-MCMC dispersed from peak
–need to look conditional on m=2

Slide 90: LOD for 2 QTL with 1 Fit

Slide 91: LOD for 2 QTL with 2 Fit

Slide 92: RJ-MCMC BLOD: 2 QTL

Slide 93: Brassica BLODs
4-week clearer than 8-week
–ghost QTL or smear for 8-week?
MCMC with m=2 fairly clear
RJ-MCMC dispersed
–conditioning on m=2 similar to MCMC (not shown)
–mixing models together
–local proposal moves hamper mixing

Slide 94: Brassica 4-week BLOD Map

Slide 95: Brassica 8-week BLOD Map

Slide 96: 4-week RJ-MCMC BLOD

Slide 97: 8-week RJ-MCMC BLOD

Slide 98: The Art of MCMC
convergence issues
–burn-in period & when to stop
proper mixing of the chain
–smart proposals & smart updates
frequentist approach
–simulated annealing: reaching the peak
–simulated tempering: heating & cooling the chain
Bayesian approach
–influence of priors on posterior
–Rao-Blackwell smoothing
bump-hunting for mixtures (e.g. QTL)

Slide 99: RJ-MCMC Software
General MCMC software
–U Bristol links
–BUGS (Bayesian inference Using Gibbs Sampling)
Our MCMC software for QTLs
–C code using LAPACK: ftp://ftp.stat.wisc.edu/pub/yandell/revjump.tar.gz
–coming soon: perl preprocessing (to/from QtlCart format); Splus post processing; Bayes factor computation

Slide 100: RJ-MCMC Software Details
input files
–marker.dist: distances between loci (cM)
–marker.mark: marker genotypes (-1,1), one line per marker
–trait.y: trait phenotypes
output files
–result.write: results
–result.error: errors if any

Slide 101: trait.y file
missing value =

Slide 102: marker.dist file

Slide 103: marker.mark file
one row per marker, one column per line

Slide 104: nval.dat file
1 # 1=revjump, 0=no
# n=individuals markers
# N=MCMC_runs skips
2 # m
0 0.5 # mu sigmasq
0 0 # initial b’s
# initial loci
# prior(mu)~N(0,10) prior(sigmasq)~IG(1,1)
0 10 # prior(b)~N(0,1)
4 # prior(m)~Poisson(4)

Slide 105: result.write file
m mu b(1)..b(m) sigmasq lambda(1)..lambda(m) move LOD propose accept birth death locus m=0 m=1 m=2 m=3 m=4 m=

Slide 106: Bayes Factor References
MA Newton & AE Raftery (1994) “Approximate Bayesian inference with the weighted likelihood bootstrap”, J Royal Statist Soc B 56:
RE Kass & AE Raftery (1995) “Bayes factors”, J Amer Statist Assoc 90:
JM Satagopan, MA Newton & AE Raftery (1999) “On the harmonic mean estimator of marginal probability”, ms in prep,

Slide 107: Reversible Jump MCMC References
PJ Green (1995) “Reversible jump Markov chain Monte Carlo computation and Bayesian model determination”, Biometrika 82:
S Richardson & PJ Green (1997) “On Bayesian analysis of mixtures with an unknown number of components”, J Royal Statist Soc B 59:
BK Mallick (1995) “Bayesian curve estimation by polynomials of random order”, TR 95-19, Math Dept, Imperial College London.
L Kuo & B Mallick (1996) “Bayesian variable selection for regression models”, ASA Proc Section on Bayesian Statistical Science,

Slide 108: QTL Reversible Jump MCMC: Inbred Lines
JM Satagopan & BS Yandell (1996) “Estimating the number of quantitative trait loci via Bayesian model determination”, Proc JSM Biometrics Section.
DA Stephens & RD Fisch (1998) “Bayesian analysis of quantitative trait locus data using reversible jump Markov chain Monte Carlo”, Biometrics 54:
MJ Sillanpaa & E Arjas (1998) “Bayesian mapping of multiple quantitative trait loci from incomplete inbred line cross data”, Genetics 148:
R Waagepetersen & D Sorensen (1999) “Understanding reversible jump MCMC”,

Slide 109: QTL Reversible Jump MCMC: Pedigrees
S Heath (1997) “Markov chain Monte Carlo segregation and linkage analysis for oligogenic models”, Am J Hum Genet 61:
I Hoeschele, P Uimari, FE Grignola, Q Zhang & KM Gage (1997) “Advances in statistical methods to map quantitative trait loci in outbred populations”, Genetics 147:
P Uimari & I Hoeschele (1997) “Mapping linked quantitative trait loci using Bayesian analysis and Markov chain Monte Carlo algorithms”, Genetics 146:
MJ Sillanpaa & E Arjas (1999) “Bayesian mapping of multiple quantitative trait loci from incomplete outbred offspring data”, Genetics 151,

Slide 110: QTLs and Polygenes
phenotype = design + QTLs + polygenes + error
QTLs: quantitative trait loci
polygenes: many genes of small effect spread throughout genome
distinction is arbitrary, depending on: sample size; magnitude of effects; design/cross/marker polymorphism
analogy to multiple regression

Slide 111: Polygenes and Inbred Lines
same (raw) genetic correlation across cohort:
–1/2 for DH
–2/3 for F2
–4/7 for F3
–1/2 for RI
modified by specific information:
–major & minor QTLs
–marker surrogates for polygenes (CIM)

Slide 112: Composite QTL model
trait = mean + add + dom + other + error
trait = effect_of_geno + other + error
prob( trait | genos, effects, other )
other ( ): other linked and unlinked QTLs, and polygenes