Functional Mapping of QTL and Recent Developments


Functional Mapping of QTL and Recent Developments
Chang-Xing Ma, Department of Biostatistics, University at Buffalo, cxma@buffalo.edu
Rongling Wu, University of Florida

Outline
Interval Mapping
Functional Mapping
Functional Mapping Demo
Recent Developments
Conclusion

Gene, Allele, Genotype, Phenotype
Chromosomes from: Father, Mother
Gene A, with two alleles A and a

Genotype   Phenotype
           Height   IQ
AA         185      100
AA         182      104
Aa         175      103
Aa         171      102
aa         155      101
aa         152      103

Regression model for estimating the genotypic effect
Phenotype = Genotype + Error
$y_i = \sum_j x_{ij}\,\mu_j + e_i$
$x_{ij}$ is the indicator for QTL genotype $j$ of subject $i$
$\mu_j$ is the mean for genotype $j$
$e_i \sim N(0, \sigma^2)$
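A minimal sketch of this regression model, assuming the genotypes are observed (which, as the following slides stress, they are not in real QTL mapping). With indicator variables, the least-squares estimate of each genotype mean is simply the group mean. The height values come from the example table on the previous slide; the function names are illustrative only.

```python
from collections import defaultdict

# (genotype, height) pairs from the example table on the previous slide
records = [("AA", 185), ("AA", 182), ("Aa", 175),
           ("Aa", 171), ("aa", 155), ("aa", 152)]

def genotype_means(data):
    """Least-squares estimate of each genotype mean: the per-genotype sample mean."""
    groups = defaultdict(list)
    for genotype, y in data:
        groups[genotype].append(y)
    return {g: sum(values) / len(values) for g, values in groups.items()}

def residual_variance(data, means):
    """Pooled residual variance: residual sum of squares over (n - number of genotype classes)."""
    rss = sum((y - means[g]) ** 2 for g, y in data)
    return rss / (len(data) - len(means))

means = genotype_means(records)
sigma2 = residual_variance(records, means)
print(means, sigma2)
```

The gap between the AA and aa means is what the model attributes to the gene; the residual variance captures everything else.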

The genotypes for the trait are not observable and should be predicted from linked neutral molecular markers (M). The genes that lead to the phenotypic variation are called Quantitative Trait Loci (QTL). Our task is to construct a statistical model that connects the QTL genotypes and marker genotypes through observed phenotypes.
[Diagram: markers M1, M2, M3, ..., Mm along a chromosome, with a QTL between M1 and M2]

Data Structure
Cross design: Parents AA × aa; F1 Aa; F2 AA : Aa : aa = ¼ : ½ : ¼

Subject   M1      M2      …   Mm   Phenotype (y)
1         AA(2)   BB(2)   …        y1
2         AA(2)   BB(2)   …        y2
3         Aa(1)   Bb(1)   …        y3
4         …                        y4
5         …                        y5
6         Aa(1)   bb(0)   …        y6
7         aa(0)   Bb(1)   …        y7
8         aa(0)   bb(0)   …        y8

For every subject, the unobserved QTL genotype has F2 frequencies QQ(2) = ¼, Qq(1) = ½, qq(0) = ¼.
The sample size is the sum over the nine two-marker genotype classes: n = n22 + n21 + n20 + n12 + n11 + n10 + n02 + n01 + n00

Finite mixture model for estimating genotypic effects
$y_i \sim p(y_i \mid \mu, \sigma^2) = \tfrac{1}{4} f_2(y_i) + \tfrac{1}{2} f_1(y_i) + \tfrac{1}{4} f_0(y_i)$
QTL genotype ($j$): QQ, Qq, qq; Code: 2, 1, 0
where $f_j(y_i)$ is a normal distribution density with mean $\mu_j$ and variance $\sigma^2$, and $\mu = (\mu_2, \mu_1, \mu_0)$
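The mixture density on this slide can be sketched in a few lines. This is an illustration under stated assumptions: the weights are the unconditional F2 frequencies ¼, ½, ¼ from the slide, while the numerical means and variance below are made-up values, not estimates from the talk.

```python
import math

def normal_pdf(y, mean, var):
    """Density of N(mean, var) evaluated at y."""
    return math.exp(-((y - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

# Unconditional F2 frequencies of the QTL genotypes, coded 2 (QQ), 1 (Qq), 0 (qq)
F2_WEIGHTS = {2: 0.25, 1: 0.5, 0: 0.25}

def mixture_density(y, means, var, weights=F2_WEIGHTS):
    """Weighted sum of three normal densities sharing one variance."""
    return sum(weights[j] * normal_pdf(y, means[j], var) for j in weights)

# Hypothetical genotype means and variance, for illustration only
example_means = {2: 183.5, 1: 173.0, 0: 153.5}
density = mixture_density(173.0, example_means, var=25.0)
print(density)
```

When the three means coincide, the mixture collapses to a single normal density, which corresponds to the "no QTL" null hypothesis.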

Likelihood function based on the mixture model
$L(\omega, \mu, \sigma^2 \mid M, y) = \prod_{i=1}^{n} \sum_{j=0}^{2} \omega_{j|i}\, f_j(y_i)$
$\omega_{j|i}$ is the conditional (prior) probability of QTL genotype $j$ (= 2, 1, 0) given marker genotypes for subject $i$ (= 1, …, n).

We model the parameters contained within the mixture model using particular functions
QTL genotype frequency: $\omega_{j|i} = g_j(\Omega_p)$
Mean: $\mu_j = h_j(\Omega_m)$
Variance: $\sigma^2 = l(\Omega_v)$
$\Omega_p$ contains the population genetic parameters
$\Omega_q = (\Omega_m, \Omega_v)$ contains the quantitative genetic parameters

j|i 2|22 1|12 r=a+b-2ab F2 QTL genotype frequency: M a Q b N QQ(2) MM(2) NN(2) (1-r)2/4 1/4(1-a)2(1-b)2 1/2a(1-a)b(1-b) 1/4a2b2 Nn(1) (1-r)r/2 1/2(1-a)2b(1-b) 1/2a(2b2-2b+1)(1-a) 1/2a2b(1-b) nn(0) r2/4 1/4(1-a)2b2 1/4a2(1-b)2 Mm(1) 1/2a(1-a)(1-b)2 1/2b(1-2a+2a2)(1-b) 1/2a(1-a)b2 ½-(1-r)r a(1-a)b(1-b) 1/2(2b2-2b+1)(1-2a+2a2) mm(0) 2|22 1|12

Log-Likelihood Function
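The equation on this slide did not survive in the transcript. Under the mixture model described on the earlier slides, the log-likelihood would take the following form; the symbols $\omega_{j|i}$ (conditional probability of QTL genotype $j$ given the markers of subject $i$), $\mu_j$ and $\sigma^2$ are notation assumed here for the characters lost in extraction.

```latex
\log L(\omega, \mu, \sigma^2 \mid M, y)
  = \sum_{i=1}^{n} \log\!\left[ \sum_{j=0}^{2} \omega_{j|i}\, f_j(y_i) \right],
\qquad
f_j(y_i) = \frac{1}{\sqrt{2\pi\sigma^2}}
           \exp\!\left[ -\frac{(y_i - \mu_j)^2}{2\sigma^2} \right]
```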

The EM algorithm
E step: Calculate the posterior probability of QTL genotype j for individual i that carries a known marker genotype
M step: Solve the log-likelihood equations
Iterations are made between the E and M steps until convergence
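The E and M steps can be sketched for the univariate normal mixture as follows. This is a minimal illustration, not the authors' program: it treats the conditional genotype probabilities as fixed inputs (one dict per subject), uses a crude starting point, and stops when the log-likelihood stops changing. All names are hypothetical.

```python
import math

def normal_pdf(y, mean, var):
    """Density of N(mean, var) evaluated at y."""
    return math.exp(-((y - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_qtl_mixture(y, prior, max_iter=500, tol=1e-10):
    """EM for genotype means and a common variance.

    prior[i][j] is the conditional probability of QTL genotype j (2, 1, 0)
    for subject i given its marker genotype.
    """
    genotypes = (2, 1, 0)
    n = len(y)
    ybar = sum(y) / n
    var = sum((v - ybar) ** 2 for v in y) / n
    sd = math.sqrt(var)
    means = {2: ybar + sd, 1: ybar, 0: ybar - sd}   # crude starting values
    previous = -math.inf
    loglik = previous
    for _ in range(max_iter):
        # E step: posterior probability of each genotype for each subject
        posterior = []
        loglik = 0.0
        for yi, w in zip(y, prior):
            joint = {j: w[j] * normal_pdf(yi, means[j], var) for j in genotypes}
            total = sum(joint.values())
            loglik += math.log(total)
            posterior.append({j: joint[j] / total for j in genotypes})
        # M step: posterior-weighted means and common variance
        for j in genotypes:
            weight = sum(p[j] for p in posterior)
            if weight > 0.0:
                means[j] = sum(p[j] * yi for p, yi in zip(posterior, y)) / weight
        var = sum(p[j] * (yi - means[j]) ** 2
                  for p, yi in zip(posterior, y) for j in genotypes) / n
        # loglik was evaluated at the parameters entering this iteration
        if abs(loglik - previous) < tol:
            break
        previous = loglik
    return means, var, loglik

# Sanity check with fully informative priors (genotype effectively observed)
heights = [185, 182, 175, 171, 155, 152]
one_hot = [{2: 1.0, 1: 0.0, 0: 0.0}] * 2 + [{2: 0.0, 1: 1.0, 0: 0.0}] * 2 \
        + [{2: 0.0, 1: 0.0, 0: 1.0}] * 2
est_means, est_var, est_loglik = em_qtl_mixture(heights, one_hot)
print(est_means, est_var)
```

With fully informative priors the algorithm reduces to per-genotype sample means, which makes a convenient check; with realistic marker-based priors the posteriors are fractional and several iterations are needed.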

Interval Mapping Program - Type of Study - Genetic Design

Interval Mapping Program - Data and Options
Names of Markers (optional)
Cumulative Marker Distance (cM)
Map Function
QTL Searching Step (cM)
Parameters Here for Simulation Study Only

Interval Mapping Program - Data Put Markers and Trait Data into box below OR

Interval Mapping Program - Analyze Data Trait:

Interval Mapping Program - Profile

Interval Mapping Program - Permutation Test
#Tests: …
Cut-off Point at Level … Is … Based on … Tests.

Functional Mapping
An innovative model for genetic dissection of complex traits by incorporating mathematical aspects of biological principles into a mapping framework.
Provides a tool for cutting-edge research at the interplay between gene action and development.

Data Structure
Cross design: Parents AA × aa; F1 Aa; F2 AA : Aa : aa = ¼ : ½ : ¼

Subject   Marker (M)       Phenotype (y)
          1   2   …   m    1       2       …   T
1         2   2   …        y1(1)   y1(2)   …   y1(T)
2         2   2   …        y2(1)   y2(2)   …   y2(T)
3         1   1   …        y3(1)   y3(2)   …   y3(T)
4         …                y4(1)   y4(2)   …   y4(T)
5         …                y5(1)   y5(2)   …   y5(T)
6         1   0   …        y6(1)   y6(2)   …   y6(T)
7         0   1   …        y7(1)   y7(2)   …   y7(T)
8         0   0   …        y8(1)   y8(2)   …   y8(T)

For every subject, the unobserved QTL genotype has F2 frequencies QQ(2) = ¼, Qq(1) = ½, qq(0) = ¼.
The sample size is the sum over the nine two-marker genotype classes: n = n22 + n21 + n20 + n12 + n11 + n10 + n02 + n01 + n00

The Finite Mixture Model
Observation vector: $y_i = [y_i(1), \dots, y_i(T)] \sim \mathrm{MVN}(u_j, \Sigma)$
Mean vector: $u_j = [u_j(1), u_j(2), \dots, u_j(T)]$
(Co)variance matrix: $\Sigma$, a $T \times T$ matrix

Modeling the Mean Vector
Parametric approach
  Growth trajectories: logistic curve
  HIV dynamics: bi-exponential function
  Biological clock: Van der Pol equation
  Drug response: Emax model
Nonparametric approach
  Legendre function (orthogonal polynomial)
  B-spline
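Two of the parametric mean functions named on this slide can be written down directly. The forms below are the common textbook versions of a bi-exponential decay and an Emax dose-response curve; the slide does not give the equations, so both the exact parameterisation and the parameter names (p1, d1, p2, d2, e0, e_max, ec50) are assumptions for illustration.

```python
import math

def biexponential(t, p1, d1, p2, d2):
    """Bi-exponential decay: the sum of a fast and a slow exponential phase."""
    return p1 * math.exp(-d1 * t) + p2 * math.exp(-d2 * t)

def emax(c, e0, e_max, ec50):
    """Emax model: baseline e0 plus a saturating effect of concentration c."""
    return e0 + e_max * c / (ec50 + c)

# Each QTL genotype would get its own parameter set; the time course of a
# genotype is then the function evaluated at the measurement times.
times = [0, 1, 2, 4, 8]
viral_load_curve = [biexponential(t, 1.0, 1.0, 2.0, 0.1) for t in times]
response_at_ec50 = emax(5.0, e0=1.0, e_max=10.0, ec50=5.0)
print(viral_load_curve, response_at_ec50)
```

In either case the genotype-specific mean vector is determined by a handful of curve parameters instead of one free mean per time point.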

Stem diameter growth in poplar trees Ma, Casella & Wu: Genetics 2002

Logistic Curve of Growth – A Universal Biological Law (West et al.: Nature 2001)
Instead of estimating $u_j$, we estimate curve parameters $\Omega_q = (a_j, b_j, r_j)$
Modeling the genotype-dependent mean vector:
$u_j = [u_j(1), u_j(2), \dots, u_j(T)] = \left[ \frac{a_j}{1 + b_j e^{-r_j}}, \frac{a_j}{1 + b_j e^{-2 r_j}}, \dots, \frac{a_j}{1 + b_j e^{-T r_j}} \right]$

Number of parameters to be estimated in the mean vector
Time points   Traditional approach   Our approach
5             3 × 5 = 15             3 × 3 = 9
10            3 × 10 = 30            3 × 3 = 9
50            3 × 50 = 150           3 × 3 = 9
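The parameter saving shown on this slide is easy to reproduce. The sketch below assumes the logistic growth form a / (1 + b·exp(-r·t)) with three QTL genotypes; the numerical curve parameters are invented for illustration and are not estimates from the poplar data.

```python
import math

def logistic_mean_vector(times, a, b, r):
    """Genotype-specific mean at each time point under the logistic growth curve."""
    return [a / (1.0 + b * math.exp(-r * t)) for t in times]

def n_mean_parameters(n_time_points, n_genotypes=3, n_curve_params=3):
    """(traditional, functional) counts of mean parameters to estimate."""
    traditional = n_genotypes * n_time_points   # one free mean per genotype per time
    functional = n_genotypes * n_curve_params   # (a, b, r) per genotype
    return traditional, functional

# Hypothetical curve parameters for one genotype, measured at T = 11 time points
curve = logistic_mean_vector(range(1, 12), a=30.0, b=10.0, r=0.5)

for T in (5, 10, 50):
    print(T, n_mean_parameters(T))
```

The functional count stays at nine however many time points are measured, which is the point of the table on the slide.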

Modeling the Variance Matrix
Stationary parametric approach
  Autoregressive (AR) model
Nonstationary parametric approach
  Structured antedependence (SAD) model
  Ornstein-Uhlenbeck (OU) process
Nonparametric approach
  Legendre function

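Autoregressive model AR(1)
$\Sigma = \sigma^2 \begin{pmatrix} 1 & \rho & \cdots & \rho^{T-1} \\ \rho & 1 & \cdots & \rho^{T-2} \\ \vdots & \vdots & \ddots & \vdots \\ \rho^{T-1} & \rho^{T-2} & \cdots & 1 \end{pmatrix}$
$\Omega_q = (a_j, b_j, r_j, \rho, \sigma^2)$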
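An AR(1) covariance matrix is fully determined by two numbers, ρ and σ². The sketch below builds it as a nested list, assuming equally spaced time points so that the covariance between times s and t is σ²ρ^|s−t|. The function name is illustrative.

```python
def ar1_covariance(n_times, rho, sigma2):
    """T x T AR(1) covariance matrix: entry (s, t) equals sigma2 * rho**|s - t|."""
    return [[sigma2 * rho ** abs(s - t) for t in range(n_times)]
            for s in range(n_times)]

sigma = ar1_covariance(3, rho=0.5, sigma2=2.0)
for row in sigma:
    print(row)
```

An unstructured T × T covariance matrix would need T(T+1)/2 parameters; the AR(1) structure needs two regardless of T, at the cost of assuming constant variance and a correlation that decays geometrically with the time lag.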

Box-Cox Transformation
[Figure: differences in growth across ages for the poplar data, untransformed versus log-transformed]
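The figure for this slide is not in the transcript, but the transformation itself is short. The sketch below uses the standard Box-Cox family, of which the log transform compared on the slide is the λ = 0 case. The parameter name `lam` and the sample values are assumptions for illustration.

```python
import math

def box_cox(y, lam):
    """Box-Cox transform of a positive value y; lam = 0 gives the natural log."""
    if y <= 0:
        raise ValueError("Box-Cox requires positive data")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam

# Hypothetical growth measurements at increasing ages
growth = [1.0, 4.0, 9.0, 16.0]
log_transformed = [box_cox(v, 0) for v in growth]
sqrt_like = [box_cox(v, 0.5) for v in growth]
print(log_transformed, sqrt_like)
```

Transformations of this kind compress large values more than small ones, which is how they can bring the variance at different ages closer to the constant variance that a stationary model such as AR(1) assumes.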

EM Algorithm (Ma et al. 2002, Genetics)
Estimate $(a_j, b_j, r_j; \rho, \sigma^2)$

An example of a forest tree
The study material used was derived from the triple hybridization of Populus (poplar). A Populus deltoides clone (designated I-69) was used as a female parent to mate with an interspecific P. deltoides x P. nigra clone (designated I-45) as a male parent (Wu et al. 1992). In the spring of 1988, a total of 450 1-year-old rooted three-way hybrid seedlings were planted at a spacing of 4 x 5 m at a forest farm near Xuzhou City, Jiangsu Province, China. The total stem heights and diameters measured at the end of each of 11 growing seasons are used in this example. A genetic linkage map has been constructed using 90 genotypes randomly selected from the 450 hybrids with random amplified polymorphic DNAs (RAPDs) (Yin 2002).

Functional mapping incorporating logistic curves and the AR(1) model
[Figure; label: QTL]

The dynamic pattern of QTL expression:

Functional Mapping - Data
Genetic Design; Curve; Marker Place; Time Point
Parameters Here for Simulation Study: QTL Position; Sample Size; Curve Parameters; Sigma^2; Correlation rho
Search Step (cM); Map Function

Functional Mapping - Data Put Markers and Trait Data into box below OR

Functional Mapping - Data Curves

Functional Mapping - Profile Initiate Values

Functional Mapping - Profile

Functional Mapping - Data Curves

Recent Developments
Transform-both-sides logistic model. Wu, Ma, et al.: Biometrics 2004
Multiple genes: epistatic gene-gene interactions. Wu, Ma, et al.: Genetics 2004
Multiple environments: genotype × environment interactions. Zhao, Zhu, Gallo-Meagher & Wu: Genetics 2004
Multiple traits: trait correlations. Zhao et al.: Biometrics 2005
Genotype × sex interactions. Zhao, Ma, Cheverud & Wu: Physiological Genomics 2004

Transform-both-sides logistic model: developmental pattern of genetic effects
Wu, Ma, Lin, Wang & Casella: Biometrics 2004
[Figure; annotation: timing at which the QTL is switched on]

Functional mapping for epistasis in poplar
Wu, Ma, Lin & Casella: Genetics 2004
[Figure; labels: QTL 1, QTL 2]

Functional mapping for epistasis in poplar
The growth curves of four different QTL genotypes for two QTL detected on the same linkage group D16

Genotype × environment interaction in rice
Zhao, Zhu, Gallo-Meagher & Wu: Genetics 2004

Plant height growth trajectories in rice affected by QTL in two contrasting environments
[Figure; curves for genotypes QQ and qq. Red: subtropical Hangzhou. Blue: tropical Hainan]

Functional mapping: Genotype × sex interaction
Zhao, Ma, Cheverud & Wu: Physiological Genomics 2004

Body weight growth trajectories affected by QTL in male and female mice
[Figure; curves for genotypes QQ, Qq and qq. Red: male mice. Blue: female mice]

Functional mapping for trait correlation Zhao, Hou, Littell & Wu: Biometrics submitted

Growth trajectories for stem height and diameter affected by a pleiotropic QTL
[Figure; curves for genotypes QQ and Qq. Red: diameter. Blue: height]

Functional Mapping: toward high-dimensional biology
A new conceptual model for genetic mapping of complex traits
A systems approach for studying sophisticated biological problems
A framework for testing biological hypotheses at the interplay among genetics, development, physiology and biomedicine

Functional Mapping: Simplicity from complexity
Estimating fewer, biologically meaningful parameters that model the mean vector
Modeling the structure of the variance matrix with powerful statistical methods, so that few parameters need to be estimated
The reduction of dimension increases the power and precision of parameter estimation