Keywords: R packages; data pre-processing; liquid chromatography; mass spectrometry; metabolomics; vendor software

Source: DOI:10.3390/metabo10010028

Abstract:
Pre-processing of LC-MS data is a critical step in untargeted metabolomics studies and is required for correct biological interpretation. Several pre-processing tools have been developed, and they can be classified as either commercial or open source software. This case report compares two specific methodologies, Agilent Profinder and an R pipeline, in a metabolomic study with a large number of samples. Specifically, 369 plasma samples were analyzed by HPLC-ESI-QTOF-MS. The collected data were pre-processed with both methodologies and then evaluated on several parameters (number of peaks, degree of missingness, peak quality, degree of misalignment, and robustness in multivariate models). The vendor software was characterized by ease of use, a friendly interface, and good-quality graphs. The open source methodology was able to correct drifts due to between- and within-batch effects more effectively. In addition, the evaluated statistical methods achieved better classification results with higher parsimony for the open source methodology, indicating higher data quality. Although both methodologies have strengths and weaknesses, the open source methodology seems more appropriate for studies with a large number of samples, mainly because of its higher capacity and its versatility, which allows different packages, functions, and methods to be combined in a single environment.