MeSH terms: Sorghum / genetics; Plant Breeding / methods; Genotype; Pedigree; Models, Genetic; Ethiopia; Environment; Linear Models; Phenotype

Source: DOI:10.1007/s00122-024-04684-z

Abstract:
CONCLUSIONS: We investigate a method for extracting and fitting synthetic environmental covariates and pedigree information in the analysis of multilocation trial data, with the aim of predicting genotype performance in untested locations. Plant breeding trials are usually conducted across multiple test locations to predict genotype performance in the target population of environments, and predictive accuracy can be increased by using adequate statistical models. We compared linear mixed models with and without synthetic covariates (SCs) and pedigree information under the identity, diagonal and factor-analytic variance-covariance structures for the genotype-by-location interactions. The models were compared on their accuracy in predicting genotype performance in untested locations, using the mean squared error of predicted differences (MSEPD) and the Spearman rank correlation between predicted and adjusted means. The data came from a multi-environment trial (MET) dataset evaluated for yield performance in the dry lowland sorghum (Sorghum bicolor (L.) Moench) breeding program of Ethiopia. The models were validated with a leave-one-location-out cross-validation strategy. A total of 65 environmental covariates (ECs) obtained from the sorghum test locations were considered. The SCs were extracted from the ECs using multivariate partial least squares analysis and subsequently fitted in the linear mixed model. The model was then extended to account for pedigree information. According to the MSEPD, models with SCs predicted genotype performance more accurately than models without SCs under all three variance-covariance structures. The rank correlation was also higher for the models with SCs: when the SCs were fitted, it was 0.58 for the factor-analytic, 0.51 for the diagonal and 0.46 for the identity variance-covariance structure. Our approach indicates that SCs improve predictive accuracy in the context of genotype-by-location interactions in a sorghum breeding program in Ethiopia.
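The two accuracy measures and the leave-one-location-out scheme named in the abstract can be illustrated with a minimal sketch. The abstract does not give formulas, so the sketch assumes a common definition of the MSEPD: the average, over all pairs of genotypes, of the squared gap between the predicted difference and the difference in adjusted means. The function names, the toy yield values and the stand-in predictor `mean_over_training_locations` are hypothetical and are not taken from the paper; in the study itself the predictions come from linear mixed models with SCs and pedigree information, which this sketch does not implement.

```python
from itertools import combinations


def msepd(predicted, observed):
    """Mean squared error of predicted differences (assumed pairwise definition)."""
    pairs = list(combinations(range(len(predicted)), 2))
    total = sum(
        ((predicted[i] - predicted[j]) - (observed[i] - observed[j])) ** 2
        for i, j in pairs
    )
    return total / len(pairs)


def average_ranks(values):
    """Rank values from 1 to n; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average of 1-based positions start..end
        for k in order[start:end + 1]:
            ranks[k] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation, computed as the Pearson correlation of ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def mean_over_training_locations(training, held_out):
    """Hypothetical stand-in predictor: genotype mean across training locations.

    It ignores `held_out`; a real model would use that location's covariates.
    """
    genotypes = next(iter(training.values())).keys()
    return {
        g: sum(loc_means[g] for loc_means in training.values()) / len(training)
        for g in genotypes
    }


def leave_one_location_out(adjusted_means, predict):
    """Hold out each location in turn and score the predictions made for it."""
    results = {}
    for held_out in adjusted_means:
        training = {loc: m for loc, m in adjusted_means.items() if loc != held_out}
        by_genotype = predict(training, held_out)
        genotypes = sorted(adjusted_means[held_out])
        predicted = [by_genotype[g] for g in genotypes]
        observed = [adjusted_means[held_out][g] for g in genotypes]
        results[held_out] = (msepd(predicted, observed), spearman(predicted, observed))
    return results


# Made-up adjusted means (location -> genotype -> yield), for illustration only.
toy_means = {
    "LocA": {"G1": 2.0, "G2": 3.0, "G3": 4.0},
    "LocB": {"G1": 2.5, "G2": 3.5, "G3": 5.0},
    "LocC": {"G1": 1.5, "G2": 3.0, "G3": 3.5},
}
results = leave_one_location_out(toy_means, mean_over_training_locations)
for location, (error, rho) in results.items():
    print(location, round(error, 4), round(rho, 4))
```

Because the MSEPD is built on differences between genotypes, a prediction that is off by the same constant for every genotype in a location scores zero error; the measure rewards getting genotype contrasts right, which is what matters for selection. Swapping the stand-in predictor for a fitted model only requires a function with the same two arguments.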