Genetic evaluation under parental uncertainty Robert J. Tempelman Michigan State University, East Lansing, MI National Animal Breeding Seminar Series December.

Presentation transcript:

Genetic evaluation under parental uncertainty Robert J. Tempelman Michigan State University, East Lansing, MI National Animal Breeding Seminar Series December 6, 2004.

Key papers from our lab:
– Cardoso, F.F., and R.J. Tempelman. 2003. Bayesian inference on genetic merit under uncertain paternity. Genetics, Selection, Evolution 35:
– Cardoso, F.F., and R.J. Tempelman. Genetic evaluation of beef cattle accounting for uncertain paternity. Livestock Production Science 89:

Multiple sires – The situation
– Cows are mated with a group of bulls under pasture conditions.
– Common in large beef cattle populations raised under extensive pasture conditions:
  – Accounts for up to 50% of calves in some herds under genetic evaluation in Brazil (~25-30% on average).
  – Multiple-sire group sizes range from 2 to 12+ (breeding-cow group sizes range from 50 to 300+).
– Common in commercial U.S. herds:
  – Potential bottleneck for genetic evaluations beyond the seedstock level (Pollak, 2003).

Multiple sires – The situation: who is the sire?

The tabular method for computing genetic relationships
Recall the basic tabular method for computing the numerator relationship matrix:
– Henderson, C.R. A simple method for computing the inverse of a numerator relationship matrix used in prediction of breeding values. Biometrics 32:69.
$A = \{a_{ij}\}$, where $a_{ij}$ is the genetic relationship between animals i and j. Let the parents of j be $s_j$ and $d_j$.
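The equations on this slide did not survive in the transcript. The recursions usually given for the tabular method are sketched below, assuming animals are ordered so that parents precede their progeny and that an unknown parent contributes zero.

```latex
a_{ij} = a_{ji} = \tfrac{1}{2}\left(a_{i s_j} + a_{i d_j}\right), \qquad i < j
\\
a_{jj} = 1 + \tfrac{1}{2}\, a_{s_j d_j}
```

The diagonal exceeds 1 only when the parents of j are themselves related, that is, when j is inbred.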

The average numerator relationship matrix (ANRM)
– Henderson, C.R. 1988. Use of an average numerator relationship matrix for multiple-sire joining. Journal of Animal Science 66:
$a_{ij}$ is the genetic relationship between animals i and j. Suppose the dam of j is known to be $d_j$, whereas there are $v_j$ different candidate sires $(s_1, s_2, \ldots, s_{v_j})$ with probabilities $(p_1, p_2, \ldots, p_{v_j})$ of being the true sire:
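A minimal Python sketch of the averaging idea follows. It assumes the usual form of the ANRM recursions, in which the sire contribution is replaced by its probability-weighted average: $a_{ij} = \tfrac{1}{2} a_{i d_j} + \tfrac{1}{2}\sum_k p_k a_{i s_k}$ and $a_{jj} = 1 + \tfrac{1}{2}\sum_k p_k a_{s_k d_j}$. The function name, the pedigree layout and the four-animal pedigree are hypothetical illustrations, not Henderson's example.

```python
def average_relationship_matrix(pedigree):
    """Build an average numerator relationship matrix.

    pedigree: list of (candidate_sires, sire_probabilities, dam) tuples, one per
    animal, for animals numbered 1..n with parents listed before progeny.
    Base animals use ([], [], None). Index 0 stands for an unknown parent.
    """
    n = len(pedigree)
    # Row and column 0 stay at zero, so an unknown parent contributes nothing.
    A = [[0.0] * (n + 1) for _ in range(n + 1)]
    for j in range(1, n + 1):
        sires, probs, dam = pedigree[j - 1]
        d = dam or 0
        for i in range(1, j):
            # Probability-weighted average over candidate sires.
            sire_part = sum(p * A[i][s] for s, p in zip(sires, probs))
            A[i][j] = A[j][i] = 0.5 * (sire_part + A[i][d])
        A[j][j] = 1.0 + 0.5 * sum(p * A[s][d] for s, p in zip(sires, probs))
    return A


# Hypothetical pedigree: animals 1 and 2 are unrelated base animals,
# animal 3 has sire 1 and dam 2, animal 4 has dam 2 and candidate sires 1 or 3.
pedigree = [
    ([], [], None),
    ([], [], None),
    ([1], [1.0], 2),
    ([1, 3], [0.5, 0.5], 2),
]
A = average_relationship_matrix(pedigree)
```

In this toy pedigree the diagonal for animal 4 is the average of 1.0 (true sire 1, unrelated to the dam) and 1.25 (true sire 3, a son of the dam), which mirrors the kind of contrast noted for animal 7 in the Henderson example. With a single candidate sire of probability 1 the function reduces to the ordinary tabular method.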

Pedigree file example from Henderson (1988)
Animal   Sires      Sire probabilities   Dam
7        3, 5       0.6, 0.4             …
…        …, 5       0.3, …               …
…        …, 4, 5    0.3, 0.6, …          …
Sire probabilities could be determined using genetic markers.

Numerator relationship matrix: symmetric (rest provided in Henderson, 1988)
Animal   Sires      Sire probabilities   Dam
7        3, 5       0.6, 0.4             …
…        …, 5       0.3, …               …
…        …, 4, 5    0.3, 0.6, …          …
Note: if the true sire of 7 is 3, $a_{77} = 1.25$; otherwise $a_{77} =$ …

How about inferring upon what might be the correct sire?
Empirical Bayes strategy:
– Foulley, J.L., D. Gianola, and D. Planchenault. 1987. Sire evaluation with uncertain paternity. Genetics, Selection, Evolution 19:
– Sire model implementation.

Simple sire model
$y = X\beta + Zs + e$
Animal   Sires      Sire probabilities
7        3, 5       0.6, 0.4
…        …, 5       0.3, …
…        …, 4, 5    0.3, 0.6, …

One possibility: substitute sire probabilities for elements of Z.
Animal   Sires      Sire probabilities
7        3, 5       0.6, 0.4
…        …, 5       0.3, …
…        …, 4, 5    0.3, 0.6, …
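As an illustration of this substitution, the sketch below builds one row of Z for a record whose sire is uncertain. The helper name and the assumption of five sires in the model are hypothetical; only the probabilities 0.6 and 0.4 for sires 3 and 5 come from the example.

```python
def sire_incidence_row(candidates, probabilities, n_sires):
    """Return one row of Z with sire probabilities in place of a single 1."""
    row = [0.0] * n_sires
    for sire, p in zip(candidates, probabilities):
        row[sire - 1] += p  # sires are numbered from 1
    return row


# Animal 7: candidate sires 3 and 5 with probabilities 0.6 and 0.4.
z_uncertain = sire_incidence_row([3, 5], [0.6, 0.4], n_sires=5)

# A record with certain paternity keeps the usual single 1 (sire 2 here).
z_certain = sire_incidence_row([2], [1.0], n_sires=5)
```

Each row still sums to 1, so a record with uncertain paternity is treated as a probability-weighted mixture of the candidate sires' effects.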

Strategy of Foulley et al. (1987): estimate the elements of Z by posterior probabilities, using the provided sire probabilities as "prior" probabilities and the data y. Computed iteratively.
Limitation: can only be used for sire models.
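One way to write the update this describes is sketched below. It is an assumption-laden illustration rather than the exact expression in Foulley et al.: record $y_j$ is taken to be normal with mean $x_j'\beta + s_k$ under candidate sire k, $p_k$ is the supplied prior probability, and $\phi$ denotes the normal density.

```latex
\Pr(\text{sire of } j = s_k \mid y_j)
  = \frac{p_k \,\phi\!\left(y_j;\; x_j'\beta + s_k,\; \sigma_e^2\right)}
         {\sum_{l=1}^{v_j} p_l \,\phi\!\left(y_j;\; x_j'\beta + s_l,\; \sigma_e^2\right)}
```

Because $\beta$ and the sire effects are themselves estimated from the data, the probabilities and the effects are updated in turn until they stabilize, which is why the procedure is iterative.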

Inferring upon elements of the design matrix
Where else is this method currently used? Segregation analysis:
– Estimating allelic frequencies and genotypic effects for a biallelic locus WITHOUT molecular marker information.
– Prior probabilities based on Hardy-Weinberg (HW) equilibrium for the base population.
– Posterior probabilities based on data.
– Reference: Janss, L.L.G., R. Thompson, and J.A.M. Van Arendonk. Application of Gibbs sampling for inference in a mixed major gene-polygenic inheritance model in animal populations. Theoretical and Applied Genetics 91:

Another strategy (most commonly used)
Use phantom groups (Westell et al., 1988; Quaas, 1988).
– Commonly used in genetic evaluation systems with incomplete ancestral pedigrees, in order to mitigate bias due to genetic trend.
– Limitations (applied to multiple sires):
  1. Assumes the number of candidate sires within a group is effectively infinite.
  2. Assumes none of the phantom parents are related.
  3. Potential confounding problems for small groups (Quaas, 1988).

The ineffectiveness of phantom grouping for genetic evaluations in multiple-sire pastures:
– Perez-Enciso, M., and R.L. Fernando. Genetic evaluation with uncertain parentage: A comparison of methods. Theoretical and Applied Genetics 84:
– Sullivan, P.G. Alternatives for genetic evaluation with uncertain paternity. Canadian Journal of Animal Science 75:
  – Greater selection response using Henderson's ANRM relative to phantom grouping (simulation studies).
  – Excluding animals with uncertain paternity reduces expected selection response by as much as 37%.

Uncertain paternity - objectives
1. To propose a hierarchical Bayes animal model for genetic evaluation of individuals having uncertain paternity.
2. To estimate the posterior probability of each bull in the group being the correct sire of the individual.
3. To compare the proposed method with Henderson's ANRM via:
  1. A simulation study.
  2. Application to Hereford post-weaning gain (PWG) and weaning weight (WW) data.

Uncertain paternity - hierarchical Bayes model, 1st stage
$y = X\beta + Za + e$; $e \sim N(0, I\sigma_e^2)$
– Data: $y$ (performance records)
– Non-genetic effects: $\beta$ (contemporary groups, age of dam, age of calf, gender)
– Animal genetic values: $a$
– Residual terms: $e$ (assumed to be normal)

Uncertain paternity - hierarchical Bayes model, 2nd stage
– Non-genetic effects: $\beta \sim N(\beta_o, V_\beta)$. Prior means based on literature information; variance based on the reliability of prior information.
– Animal genetic values: $a \mid s \sim N(0, A_s \sigma_a^2)$. (Co)variances based on relationship ($A$), sire assignments ($s$) and genetic variance ($\sigma_a^2$).
– Residual variance: $\sigma_e^2 \sim s_e^2 \chi^{-2}(\nu_e)$. Prior knowledge based on literature information.

Uncertain paternity - hierarchical Bayes model, 3rd stage
– Sire assignments: probability for sire assignments ($\pi_j$). Could be based on marker data.
– Genetic variance: $\sigma_a^2 \sim s_a^2 \chi^{-2}(\nu_a)$. Prior knowledge based on literature information.

Uncertain paternity - hierarchical Bayes model, 4th stage
Specifying uncertainty for the probabilities of sire assignments: Dirichlet prior.
– e.g. How sure are you about the prior probabilities of 0.6 and 0.4 for Sires 3 and 5, respectively, being the correct sire?
– Assessment based on how much you trust the genotype-based probabilities.
– Could also model genotyping error rates explicitly (Rosa, G.J.M., B.S. Yandell, and D. Gianola. A Bayesian approach for constructing genetic maps when markers are miscoded. Genetics, Selection, Evolution 34:).
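To make "how sure are you" concrete, the sketch below shows how a Dirichlet prior can hold the same prior means (0.6 and 0.4) while expressing very different degrees of trust. Parameterizing the Dirichlet as prior probabilities times a single concentration value is an assumption made for illustration, and the function name is hypothetical.

```python
import math


def dirichlet_mean_sd(prior_probs, concentration):
    """Mean and standard deviation of each component of a Dirichlet prior.

    The Dirichlet parameters are prior_probs scaled by `concentration`;
    a larger concentration means more trust in the stated probabilities.
    """
    alpha = [p * concentration for p in prior_probs]
    a0 = sum(alpha)
    means = [a / a0 for a in alpha]
    sds = [math.sqrt(a * (a0 - a) / (a0 ** 2 * (a0 + 1.0))) for a in alpha]
    return means, sds


# Same prior means for sires 3 and 5, weak versus strong trust.
weak_means, weak_sds = dirichlet_mean_sd([0.6, 0.4], concentration=1.0)
strong_means, strong_sds = dirichlet_mean_sd([0.6, 0.4], concentration=100.0)
```

Both settings give prior means of 0.6 and 0.4, but the prior standard deviation is much smaller under the larger concentration, so the data can move the sire probabilities far less.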

Uncertain paternity - joint posterior density
– 1st stage: data.
– 2nd stage: non-genetic fixed effects (prior means from literature information; variance reflecting reliability of priors), genetic effects ((co)variances from relationships, sire assignments and genetic variance) and residual error (prior knowledge based on literature information).
– 3rd stage: prior probability for sire assignments; prior knowledge on the genetic variance based on literature information.
– 4th stage: reliability of the priors on sire assignment probabilities.
– Inference by Markov chain Monte Carlo (MCMC).
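One MCMC update that such a model suggests is drawing each uncertain sire assignment from its full conditional distribution. The sketch below is a plausible illustration, not necessarily the sampler used by the authors. It assumes that, given the current genetic values, animal j's value is normal around the parent average with Mendelian sampling variance equal to half the genetic variance (inbreeding ignored), and that all prior probabilities are positive. All names are hypothetical.

```python
import math
import random


def sire_full_conditional(a_j, a_dam, a_sires, prior_probs, sigma2_a):
    """Probability of each candidate sire given current genetic values."""
    mendelian_var = 0.5 * sigma2_a  # assumes non-inbred, known dam
    log_w = [
        math.log(p) - 0.5 * (a_j - 0.5 * (a_s + a_dam)) ** 2 / mendelian_var
        for p, a_s in zip(prior_probs, a_sires)
    ]
    top = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(v - top) for v in log_w]
    total = sum(w)
    return [v / total for v in w]


def sample_sire(probs, rng):
    """Draw the index of one candidate sire from the full conditional."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Toy values: the animal's current genetic value matches the parent average
# under the first candidate sire much better than under the second.
probs = sire_full_conditional(
    a_j=4.0, a_dam=2.0, a_sires=[6.0, -6.0], prior_probs=[0.5, 0.5], sigma2_a=20.0
)
draw = sample_sire(probs, random.Random(0))
```

Averaging the sampled assignments over MCMC cycles gives an estimate of the posterior probability that each bull is the true sire.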

Simulation Study (Cardoso and Tempelman, 2003)
[Flow diagram of the simulated population]
– Generation 0: base population; selection (20 sires & 100 dams) forms the breeding population; random mating (inbreeding avoided) produces offspring (500 animals).
– Later generations: selection (15 sires & 75 dams) and selection (5 sires & 25 dams) form the next breeding population; random mating (inbreeding avoided) produces offspring (500 animals); one box is labelled offspring (360 animals).
– Totals: 80 sires, 400 dams, 2000 non-parents.

Paternity assignment
– Offspring are randomly assigned to a paternity condition: uncertain (0.3) or certain (0.7).
– Uncertain offspring are assigned to multiple-sire groups.
– Within the assigned group, one of the sires is picked to be the true sire (with equal or unequal probabilities).
– Sire record: sires averaged 23.6 progeny, dams averaged 5.9 progeny.
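A rough sketch of this kind of assignment is given below. It is not the authors' simulation code: the group composition, the equal-probability choice of true sire, the seed and all names are assumptions made for illustration; only the 0.3 probability of uncertain paternity is taken from the slide.

```python
import random


def assign_paternity(offspring, sire_groups, p_uncertain=0.3, seed=1):
    """Assign a true sire and a recorded candidate list to each offspring."""
    rng = random.Random(seed)
    records = []
    for animal in offspring:
        group = rng.choice(sire_groups)      # multiple-sire group
        true_sire = rng.choice(group)        # equal probabilities within group
        uncertain = rng.random() < p_uncertain
        # Uncertain records list the whole group; certain ones only the true sire.
        candidates = list(group) if uncertain else [true_sire]
        records.append(
            {"animal": animal, "true_sire": true_sire,
             "candidates": candidates, "uncertain": uncertain}
        )
    return records


# Hypothetical groups of 2, 3 and 4 bulls.
records = assign_paternity(range(1, 2001), [[1, 2], [3, 4, 5], [6, 7, 8, 9]])
share_uncertain = sum(r["uncertain"] for r in records) / len(records)
```

The true sire is kept alongside the recorded candidates so that posterior sire probabilities can later be compared with the truth.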

Simulated traits
Ten datasets were generated for each of two different types of traits:
– Trait 1 (WW):
– Trait 2 (PWG):
Naïve prior assignments: equal prior probabilities for each candidate sire (i.e. no information based on genetic markers available).

[Table: posterior probabilities of sire assignments being equal to true sires, by multiple-sire group size and animal category (parents, non-parents), for Trait 1 and Trait 2]

[Figure: rank correlation of predicted genetic effects]
– ANRM = Henderson's ANRM
– HIER = proposed model
– TRUE = all sires known
Sidenote: model fit criteria were clearly in favor of HIER over ANRM.

Uncertain paternity - application to field data
– Data set
  – 3,402 post-weaning gain records on Hereford calves raised in southern Brazil (from …)
  – 4,703 animals
  – Paternity: 57% certain, 15% uncertain and 28% unknown (base animals)
  – Group sizes 2, 3, 4, 5, 6, 10, 12 & 17
– Methods
  – ANRM (average relationship)
  – HIER (uncertain paternity hierarchical Bayes model)

Posterior inference for PWG genetic parameters under ANRM versus HIER models
Model   Parameter       Posterior median   95% credible set
ANRM    $h^2$           0.231              (0.153, 0.316)
ANRM    $\sigma_a^2$    73.8               (48.0, 103.6)
ANRM    $\sigma_e^2$    246.5              (221.5, 271.2)
ANRM    …               404.5              (334.3, 494.0)
HIER    $h^2$           0.244              (0.162, 0.336)
HIER    $\sigma_a^2$    78.2               (51.1, 111.2)
HIER    $\sigma_e^2$    242.9              (216.5, 268.2)
HIER    …               404.5              (333.9, 493.8)

Uncertain paternity - results summary
– Model choice criteria (DIC and PBF) decisively favored HIER over ANRM.
– Very high rank correlations between genetic evaluations using ANRM versus HIER.
– Some non-trivial differences in posterior means of additive genetic value for some animals.

Uncertain paternity - assessment of accuracy (PWG)
[Figure: standard deviation of additive genetic effects; panels: sire with 50 progeny, sire with 9 progeny]
i.e. accuracies are generally slightly overstated with Henderson's ANRM.

Conclusions
– Uncertain paternity modeling complements genetic marker information (as priors).
  – Reliability of prior information can be expressed (via Dirichlet).
– Little advantage over the use of Henderson's ANRM.
  – However, accuracies of EPDs are overstated using ANRM.
  – Power of inference may improve with better statistical assumptions (i.e. heterogeneous residual variances).

Implementation issues
– Providing genetic evaluations will likely require a non-MCMC approach.
– Some hybrid with phantom grouping will likely be needed.
  – For some animals, the candidate sires are simply not known.
– Bob Weaber's talk.