Keywords: Bin PLS; Chemometrics; PCA; PLS; Quantification; Robust PLS

Source: DOI:10.1016/j.aca.2024.342895

Abstract:
BACKGROUND: Multivariate calibration by Partial Least Squares (PLS) on near-infrared data has been applied successfully in several industrial sectors, including pulp and paper. The creation of multivariate calibration models relies on a set of well-characterised samples that cover the range of the intended application. However, sample sets that originate from an industrial process often show an uneven distribution of reference values. This can be addressed through curation of the reference data and through the choice of methodology for multivariate calibration. How these approaches affect the quality and scope of the final model needs to be better understood.
RESULTS: We describe the effect of log10 transformation of the reference values, regular PLS, robust PLS, the newly introduced bin PLS, and their combinations to select more evenly distributed reference values for the quantification of five pulp characteristics (kappa number, R18, R10, cuen viscosity, and brightness; 200 samples) by near-infrared spectroscopy. The quality of the models was assessed by root mean squared error of prediction, calibration range, and coverage of sample types. The best models yielded uncertainty levels equivalent to those of the reference measurement. The optimal approach depended on the investigated reference value.
CONCLUSIONS: Robust PLS commonly gives the model with the lowest error, but this usually comes at the cost of a notably reduced calibration range. The other approaches rarely impacted the calibration range. None of them stood out as superior; their performance depended on the calibrated parameter. It is therefore worthwhile to investigate various calibration options to obtain a model that matches the requirements of the application without compromising calibration range and sample coverage.
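As a minimal illustration of two quantities named in the abstract, the sketch below computes a root mean squared error of prediction and applies a log10 transformation to reference values, with the matching back-transformation. It is not the authors' implementation: the function names, the sample values, and the constant offset standing in for model predictions are all hypothetical, and no PLS model is fitted here.

```python
import math


def rmsep(y_true, y_pred):
    """Root mean squared error of prediction over a set of test samples."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def to_log10(values):
    """Transform strictly positive reference values to log10 space."""
    return [math.log10(v) for v in values]


def from_log10(values):
    """Back-transform values from log10 space to the original units."""
    return [10.0 ** v for v in values]


# Hypothetical, unevenly distributed reference values (not data from the paper).
reference = [1.0, 2.0, 4.0, 50.0, 100.0]

# Calibrating in log10 space compresses the upper end of the range.
log_reference = to_log10(reference)

# Stand-in for model output: a constant offset in log10 space (assumption).
log_predicted = [v + 0.05 for v in log_reference]

# Report the error in the original units after back-transformation.
predicted = from_log10(log_predicted)
print(rmsep(reference, predicted))
```

Because the error is reported after back-transformation, a model calibrated in log10 space can be compared with one calibrated on the untransformed values in the same units.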