Modeling nuisance variables for phenotypic evaluation of bull fertility
M. T. Kuhn, J. L. Hutchison, and H. D. Norman*
Animal Improvement Programs Laboratory, Agricultural Research Service, USDA, Beltsville, MD
Abstract T

INTRODUCTION
◆ In May 2006, AIPL began evaluation of U.S. bull fertility. General research objectives: investigate options for modeling and trait definition that might improve accuracy
◆ Specific goal of this research: determine which available nuisance variables to include in the evaluation model and how to model them
◆ Factors considered were management (mgt) groups based on herd-yr-season-parity-registry status (HYSPR), Yr-State (St)-Mo, cow age, DIM, lactation, service number, milk yield, cow effects, and short heat intervals

MATERIALS & METHODS
Comparing Predictors from Alternative Models
◆ Bulls’ predicted conception rates (CR) computed from estimation data (n=3,613,907) and compared to their average CR in set-aside data (n=2,025,884) using accuracy, bias, and MSE; the 803 bulls with a min. of 50 matings for estimation and 100 matings in the set-aside data were included in comparisons. Only AI cow breedings were included

Management Groups
◆ Minimum (target) group sizes tested were 3, 5, 10, 20
◆ Many small groups occurred; thus 3 basic strategies tested:
1. Exclude records if HYSPR does not have the min.
number (exact HYSPR groups)
2. Combine groups until the target size is reached; exclude the group if target not reached
3. Combine to target size, but if the HY has a specified minimum number of records, allow it into the evaluation; HY minimums were 2, 5, and 10 when target sizes were 5, 10, and 20, respectively
◆ Model:
$$y = \text{HYSPR} + \beta_1\,\text{Milk} + \beta_2\,\text{Milk}^2 + \beta_3\,\text{Age}_{\text{Cow}} + \beta_4\,\text{Age}_{\text{Cow}}^2 + \beta_5\,\text{DIM} + \beta_6\,\text{DIM}^2 + \beta_7\,F_{\text{Bull}} + \beta_8\,F_{\text{Mating}} + \text{Age}_{\text{Bull}} + \text{Stud-Yr} + \text{Service Sire (SSR)} + A_{\text{Cow}} + PE_{\text{Cow}} + e$$
where y = conception, yes or no

Other Factors
◆ Tested by dropping/adding factors of interest from/to the basic model of:
y = HYSPR + SSR variables + PE + A + Age Cow + DIM + Yr-St-Mo + Milk + Lact + e
◆ SSR variables included: F SSR, F Mating, Age Bull, Stud-Yr, SSR
◆ The HYSPR strategy used was to combine to a target group size of 20 and allow the HY into the evaluation if it had at least 10 breedings
◆ Preliminary results showed:
1. Use of 305d-2x-ME milk yield provided as good or better predictions than use of test-day yields; ME records also did as well as FCM. Thus, ME milk yield was used
2. For quantitative nuisance variables (e.g., cow age), categorical variables were found to be preferable over linear and quadratic covariates; relationships with CR were not linear or quadratic. Thus, quantitative vars. were fit as categorical
3. Combining mgt groups implies some groups contain multiple seasons and lactations; inclusion of Yr-St-Month and Lactation (Lact) was found to improve prediction, and both were therefore included in all models

RESULTS
Management Groups

Min. N   Strategy             N. Records for estimation   Corr   Mean Diff (%)   Std. Dev. Diff
3        No combine           3,467,
3        Combine              3,612,
5        No combine           3,249,
5        Combine              3,609,
5        Combine, allow 2     3,613,
10       No combine           2,670,
10       Combine              3,596,
10       Combine, allow 5     3,609,
20       No combine           1,905,
20       Combine              3,542,
20       Combine, allow 10    3,595,

◆ Except when min. group size was 3, combining groups resulted in higher correlations of predicted CR with bulls’ average CR in set-aside data than did using exact HYSPRs (no combining); restricting to exact HYSPRs resulted in the loss of too many records
◆ Allowing HYs into the evaluation that had fewer than the target number of records was beneficial only when target group size was 20; considerably more records were salvaged when target group size was 20 than when it was 5 or 10
◆ In general, though, differences among the options tested were small; provided that excessive data exclusion is avoided, formation of mgt groups will not have a large impact on accuracy
◆ Combining groups to a target group size of 20 and allowing HYs in if they have a min. of 10 records maximized accuracy. The small mean difference for this option was eliminated by categorization of quantitative nuisance variables, as can be seen below (Basic model)

Other Factors

Model                   Mean Diff (%)   Model                   Corr    Model                   MSE
Omit Cow Age                            ServN, Omit DIM         55.17   ServN, Omit DIM         3.254
Basic Model                             Lact*ServN, Omit DIM    55.11   Lact*ServN, Omit DIM    3.256
Omit DIM                                Lact*ServN and DIM      55.07   Lact*ServN and DIM      3.258
Lact*ServN, Omit DIM                    ServN and DIM           55.06   ServN and DIM           3.259
ServN, Omit DIM         -0.020          DIM*ServN               55.05   Basic Model             3.260
Omit Cow                -0.021          Basic Model             54.97   DIM*ServN               3.260
Lact*ServN and DIM      -0.022          Omit Cow                54.93   Omit Cow Age            3.263
DIM*ServN               -0.028          Omit Cow Age            54.89   Omit Cow                3.265
Omit Milk               -0.028          Omit Milk               54.72   Omit Milk               3.265
ServN and DIM           -0.042          Omit DIM                53.35   Omit DIM                3.300
Omit All                                Omit All                51.54   Omit All                3.501

◆ Models are sorted from best to worst for each statistic (mean difference, correlation, and mean square error); the model listed first was the best for that statistic and the model listed last was the poorest
◆ The model without cow age (but with Lact; see basic model in methods) had the smallest mean difference, but the mean difference between bulls’ predicted CR and CR in the set-aside data was nearly 0 for all models, except when all nuisance variables were dropped from the model (Omit All)
◆ The model with service number (ServN) and without DIM maximized accuracy and minimized MSE; correlations with both in the model were lower than with just service number because these 2 variables are highly correlated; the importance of including at least one is seen from the correlation when DIM was omitted without including ServN (Omit DIM)
◆ The range in correlations and MSEs, however, was generally small, except when all nuisance variables were omitted
◆ While simple average CR was 9% lower for breedings preceded by a short breeding interval (10-17 days, min. of 10 required), they accounted for only 2.5% of all breedings and the max percentage for any one bull was 9%; thus, this variable had minimal impact overall. For bulls where these breedings accounted for at least 5% of their matings (52 out of 803), accuracy improved by 0.4% when this variable was included

CONCLUSIONS
◆ Combining HYSPR groups to a target size of 20 and allowing HYs in with a min. of 10 records maximized accuracy and thus will be implemented
◆ Other nuisance variables to include are: cow PE, cow breeding value, cow age, Yr-St-Mo, ME milk yield, lactation, service number, and a short breeding interval variable; quantitative nuisance variables will be fit as categorical variables
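
The model comparisons rest on three statistics computed across bulls: the correlation between predicted CR and average CR in the set-aside data (accuracy), the mean difference (bias), and MSE. A minimal sketch of how such statistics could be computed is given here. The function name `compare_predictions`, the unweighted treatment of bulls, and the sign convention (predicted minus set-aside) are illustrative assumptions, not details taken from the poster.

```python
import math


def compare_predictions(predicted, observed):
    """Summarize agreement between paired per-bull values.

    predicted: predicted conception rates (%), one per bull.
    observed:  average conception rates (%) in the set-aside data, same order.
    Assumption: every bull gets equal weight; the poster does not say
    whether the original statistics were weighted by number of matings.
    """
    n = len(predicted)
    if n != len(observed) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 bulls")

    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n

    # Assumed sign convention: difference = predicted - observed.
    diffs = [p - o for p, o in zip(predicted, observed)]
    mean_diff = sum(diffs) / n
    # Sample standard deviation of the differences (n - 1 denominator).
    sd_diff = math.sqrt(sum((d - mean_diff) ** 2 for d in diffs) / (n - 1))
    mse = sum(d * d for d in diffs) / n

    # Pearson correlation between predicted and observed values.
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined when one input is constant")
    corr = cov / math.sqrt(var_p * var_o)

    return {"corr": corr, "mean_diff": mean_diff, "sd_diff": sd_diff, "mse": mse}


if __name__ == "__main__":
    # Made-up illustrative values, not data from the study.
    stats = compare_predictions([30.0, 32.0, 28.0, 35.0], [29.0, 33.0, 27.0, 36.0])
    for name, value in stats.items():
        print(f"{name}: {value:.4f}")
```

Under these assumptions, a candidate model would be preferred when its correlation is higher, its mean difference is closer to 0, and its MSE is lower than those of the alternatives.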