Keywords: PTR-TOF-MS, data normalisation, exhaled breath, machine learning

MeSH: Humans; Protons; Benchmarking; COVID-19 Testing; Breath Tests / methods; Mass Spectrometry / methods; Volatile Organic Compounds / analysis; COVID-19

Source: DOI:10.1088/1752-7163/ad08ce

Abstract:
Volatilomics is the branch of metabolomics dedicated to the analysis of volatile organic compounds in exhaled breath for medical diagnostic or therapeutic monitoring purposes. Real-time mass spectrometry (MS) technologies such as proton transfer reaction (PTR) MS are commonly used, and data normalisation is an important step to discard unwanted variation from non-biological sources, as batch effects and loss of sensitivity over time may be observed. As normalisation methods for real-time breath analysis have been poorly investigated, we aimed to benchmark known metabolomic data normalisation methods and apply them to PTR-MS data analysis. We compared seven normalisation methods, five statistically based and two using multiple standard metabolites, on two datasets from clinical trials for COVID-19 diagnosis in patients from the emergency department or intensive care unit. We evaluated different means of feature selection to select the standard metabolites, as well as the use of multiple repeat measurements of ambient air to train the normalisation methods. We show that the normalisation tools can correct for time-dependent drift. The methods that provided the best corrections for both cohorts were probabilistic quotient normalisation and normalisation using optimal selection of multiple internal standards. Normalisation also improved the diagnostic performance of the machine learning models, significantly increasing sensitivity, specificity and area under the receiver operating characteristic (ROC) curve for the diagnosis of COVID-19. Our results highlight the importance of adding an appropriate normalisation step during the processing of PTR-MS data, which allows significant improvements in the predictive performance of statistical models.
Clinical trials: VOC-COVID-Diag (EudraCT 2020-A02682-37); RECORDS trial (EudraCT 2020-000296-21).
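As an illustration of one of the two best-performing methods named in the abstract, probabilistic quotient normalisation (PQN) can be sketched in a few lines. This is a minimal sketch of the general technique, not the authors' implementation: the function name `pqn_normalise`, the toy intensities, and the choice of a feature-wise median as the reference spectrum are all assumptions made for illustration, and the initial total-intensity scaling that often precedes PQN is omitted for brevity.

```python
from statistics import median


def pqn_normalise(samples, reference=None):
    """Probabilistic quotient normalisation of a samples x features table.

    `samples` is a list of rows (one row of feature intensities per sample).
    `reference` is an optional reference spectrum; if omitted, the
    feature-wise median over all samples is used (an assumption of this sketch).
    """
    if reference is None:
        # Reference spectrum: median intensity of each feature across samples.
        reference = [median(col) for col in zip(*samples)]
    normalised = []
    for row in samples:
        # Quotients against the reference, skipping features whose reference is zero.
        quotients = [x / r for x, r in zip(row, reference) if r > 0]
        # The median quotient estimates the sample-wide scaling
        # (e.g. a sensitivity loss); it must be non-zero for the division below.
        factor = median(quotients)
        normalised.append([x / factor for x in row])
    return normalised


# Hypothetical toy data: the second spectrum is the first one measured at
# half sensitivity; the third has one genuinely elevated feature.
spectra = [
    [10.0, 20.0, 30.0, 40.0],
    [5.0, 10.0, 15.0, 20.0],
    [10.0, 20.0, 30.0, 80.0],
]
corrected = pqn_normalise(spectra)
```

In this toy case the half-sensitivity spectrum is rescaled to match the first one, while the single elevated feature in the third spectrum is left in place because the median quotient is robust to a minority of changed features. A reference spectrum derived from other measurements could be passed through `reference`; whether that matches how the study used its ambient air repeats is not stated in the abstract.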