Breeding Value Prediction Using a Functional Data Multiple Regression Equation
Kunio Takezawa *
Agroinformatics Division, Agricultural Research Center, National Agriculture and Food Research Organization, Kannondai 3-1-1, Tsukuba, Ibaraki 305-8666, Japan.
*Author to whom correspondence should be addressed.
Abstract
This study evaluated the applicability of a multiple regression equation to the prediction of breeding values from the high-density SNP (single nucleotide polymorphism) markers found in the whole-genome sequences of animals and plants. The genotypes of a large number of SNPs distributed along the chromosomes were treated as functional data, and the phenotypic values of a trait were treated as a scalar target variable in functional data multiple regression equations. The functional data analysis R package (“fda”, version 2.4.0) was used to construct the functional data multiple linear regression equations, and an outline of the procedure is presented in this paper. The accuracy of these equations was evaluated by predicting breeding values from simulated data sets, with SNP genotypes as predictors and the phenotypic values of a trait as the target variable. The regression equations predicted the breeding values with considerable accuracy even though no predictor selection was performed and no prior distributions were assumed.
Keywords: B-spline, genome sequence, genomic selection, genotype, single nucleotide polymorphism, smoothing splines.