Keywords: Gas chromatography; Molecular descriptors; Pyridinium-based ionic liquids; Quantitative structure-retention relationships; Stationary phases

MeSH: Ionic Liquids / chemistry; Chromatography, Gas / methods; Pyridinium Compounds / chemistry; Software; Reproducibility of Results; Quantitative Structure-Activity Relationship; Linear Models; Polyethylene Glycols / chemistry

Source: DOI:10.1016/j.chroma.2024.465144

Abstract:
Ionic liquids, i.e., organic salts with a low melting point, can be used as liquid stationary phases in gas chromatography. Such stationary phases offer distinctive selectivity, high polarity, and thermal stability. Although many previous studies address them, sufficiently large retention data sets covering structurally diverse compounds are still lacking. Consequently, very few studies address quantitative structure-retention relationships (QSRR) for ionic liquid-based stationary phases. This work aims to close that gap. Three ionic liquids with substituted pyridinium cations are considered. We provide data sets of 123-158 compounds that can be used in further work on QSRR and related methods. We carry out a QSRR study using these data and demonstrate the following. The retention index for a polyethylene glycol stationary phase (denoted RI_PEG), predicted using another model, can be used as a molecular descriptor, and this descriptor significantly improves the accuracy of the QSRR model. Both deep learning-based and linear models were considered for RI_PEG prediction. Retention indices for ionic liquid-based stationary phases can be predicted with high accuracy. Particular attention is paid to the reproducibility and reliability of the QSRR study: adding or removing several compounds, or otherwise slightly perturbing the data set, can considerably affect results such as descriptor importance and model accuracy. These effects have to be taken into account to avoid misleading conclusions.

For the QSRR research, we developed a software tool with a graphical user interface, called CHERESHNYA. It is intended for selecting molecular descriptors and constructing linear equations that relate molecular descriptors to gas chromatographic retention indices for any stationary phase. The software allows the user to generate several hundred molecular descriptors (one-dimensional and two-dimensional). Among them, predicted retention indices for popular stationary phases such as polydimethylsiloxane and polyethylene glycol are used as molecular descriptors. Various methods for selecting molecular descriptors and assessing their importance have been implemented, including the Boruta algorithm, partial least squares, genetic algorithms, and L1-regularized regression (LASSO), among others. The software is free, open-source and available online: https://github.com/mtshn/chereshnya.
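The two ideas stressed above, L1-regularized descriptor selection and its sensitivity to small changes in the data set, can be illustrated with a minimal sketch. This is not the CHERESHNYA implementation: the data are synthetic, the coordinate-descent LASSO is a generic textbook version, and every name (`select`, `ri_peg`, `others`, the penalty `alpha=20.0`) is a hypothetical choice made only for illustration. The first column stands in for a predicted RI_PEG descriptor.

```python
import random

def standardize(col):
    """Scale one descriptor column to zero mean and unit (population) variance."""
    n = len(col)
    mean = sum(col) / n
    sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5
    return [(v - mean) / sd for v in col]

def soft_threshold(z, t):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso(columns, y, alpha, n_iter=100):
    """Coordinate descent for (1/2n)*||y - Xw||^2 + alpha*||w||_1.

    Assumes standardized columns and a centered y, so each update
    reduces to a soft-thresholded covariance with the partial residual.
    """
    n = len(y)
    w = [0.0] * len(columns)
    resid = list(y)
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            rho = sum(c * (r + w[j] * c) for c, r in zip(col, resid)) / n
            new_w = soft_threshold(rho, alpha)
            delta = new_w - w[j]
            if delta != 0.0:
                resid = [r - delta * c for r, c in zip(resid, col)]
                w[j] = new_w
    return w

def select(descriptors, ri, alpha):
    """Return the indices of descriptors that keep a nonzero LASSO weight."""
    cols = [standardize(c) for c in descriptors]
    mean_ri = sum(ri) / len(ri)
    y = [v - mean_ri for v in ri]
    w = lasso(cols, y, alpha)
    return {j for j, wj in enumerate(w) if wj != 0.0}

# Synthetic data (assumption): 120 compounds, 6 descriptors.
rng = random.Random(0)
n = 120
ri_peg = [rng.uniform(800, 2500) for _ in range(n)]   # stand-in for predicted RI_PEG
others = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(5)]
descriptors = [ri_peg] + others
# Target retention index depends on descriptor 0 and descriptor 1 plus noise.
ri = [1.1 * ri_peg[i] + 60 * others[0][i] + rng.gauss(0, 30) for i in range(n)]

selected_full = select(descriptors, ri, alpha=20.0)

# Perturbation check: drop 5 random compounds and repeat the selection.
perturbed = []
for _ in range(5):
    keep = sorted(rng.sample(range(n), n - 5))
    sub_desc = [[col[i] for i in keep] for col in descriptors]
    sub_ri = [ri[i] for i in keep]
    perturbed.append(select(sub_desc, sub_ri, alpha=20.0))

print("full data set:", sorted(selected_full))
for k, s in enumerate(perturbed):
    print("perturbed run", k, ":", sorted(s))
```

Comparing the selected sets across the perturbed runs is a crude stability check. On real retention data with hundreds of correlated descriptors, the selected set would be expected to vary far more than in this clean synthetic case, which is the reproducibility concern the abstract raises.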