Multi-model approach

  • Article type: Journal Article
    Ensuring the safety and efficacy of chemical compounds is crucial in small-molecule drug development. Toxic compounds that are only identified in the later stages of drug development pose a significant challenge, wasting valuable resources and time. Early and accurate prediction of compound toxicity using deep learning models offers a promising way to mitigate these risks during drug discovery. In this study, we present the development of several deep learning models for evaluating different types of compound toxicity, including acute toxicity, carcinogenicity, hERG cardiotoxicity (cardiotoxicity related to the human ether-a-go-go-related gene), hepatotoxicity, and mutagenicity. To address the inherent variations in data size, label type, and distribution across the different types of toxicity, we employed diverse training strategies. Our first approach used a graph convolutional network (GCN) regression model to predict acute toxicity, which achieved Pearson R values of 0.76, 0.74, and 0.65 for the intraperitoneal, intravenous, and oral administration routes, respectively. Furthermore, we trained multiple GCN binary classification models, each tailored to a specific type of toxicity. These models achieved area under the curve (AUC) scores of 0.69, 0.77, 0.88, and 0.79 for predicting carcinogenicity, hERG cardiotoxicity, mutagenicity, and hepatotoxicity, respectively. Additionally, we used an approved-drug dataset to determine an appropriate threshold on the prediction score for use of the models. We integrated these models into a virtual screening pipeline to assess their effectiveness in identifying potential low-toxicity drug candidates. Our findings indicate that this deep learning approach has the potential to significantly reduce the cost and risk associated with drug development by expediting the selection of compounds with low toxicity profiles. Therefore, the models developed in this study hold promise as critical tools for early drug candidate screening and selection.
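
    The abstract above mentions using an approved-drug dataset to pick a threshold on the prediction score but does not say how. As a minimal sketch, assuming the threshold is chosen so that a given fraction of approved drugs score at or below it, the idea could look like the following. The function names, the `fraction` parameter, and the example scores are all hypothetical and are not taken from the paper.

```python
import math


def percentile_threshold(scores, fraction=0.95):
    """Return the smallest score such that at least `fraction` of `scores` are <= it.

    Assumption: a higher score means "more likely toxic", and approved drugs
    serve as a reference set of acceptable compounds.
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(scores)
    # Index of the smallest element that covers the requested fraction.
    k = math.ceil(fraction * len(ordered)) - 1
    return ordered[k]


def flag_toxic(score, threshold):
    """Flag a candidate whose predicted score exceeds the chosen threshold."""
    return score > threshold


# Illustrative (made-up) prediction scores for ten approved drugs.
approved_scores = [0.05, 0.10, 0.12, 0.20, 0.22, 0.30, 0.35, 0.41, 0.55, 0.80]
threshold = percentile_threshold(approved_scores, fraction=0.9)
candidates = {"cand_a": 0.15, "cand_b": 0.95}
flags = {name: flag_toxic(s, threshold) for name, s in candidates.items()}
```

    A stricter or looser `fraction` trades false alarms on safe compounds against missed toxic ones; the paper may well use a different criterion.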

  • Article type: Journal Article
    The extensive use of plastics leads to the release and diffusion of microplastics. Household plastic products account for a large share of plastics and are closely tied to daily life. Because microplastics are small and complex in composition, identifying and quantifying them is challenging. Therefore, a multi-model machine learning approach based on Raman spectroscopy was developed for the classification of household microplastics. In this study, Raman spectroscopy and machine learning algorithms are combined to accurately identify seven standard microplastic samples, real microplastic samples, and real microplastic samples after exposure to environmental stresses. Four single-model machine learning methods were used: support vector machine (SVM), K-nearest neighbor (KNN), linear discriminant analysis (LDA), and multi-layer perceptron (MLP) models. Principal component analysis (PCA) was applied before SVM, KNN, and LDA. The classification accuracy of the four models on standard plastic samples exceeded 88%, and the reliefF algorithm was used to distinguish HDPE and LDPE samples. A multi-model is proposed based on four single models including PCA-LDA, PCA-KNN and MLP. The recognition accuracy of the multi-model for standard microplastic samples, real microplastic samples, and microplastic samples after exposure to environmental stresses exceeded 98%. Our study demonstrates that the multi-model coupled with Raman spectroscopy is a valuable tool for microplastic classification.
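
    The abstract above does not state how the multi-model combines its single models. One common choice is majority voting, sketched below under that assumption. The model names are taken from the abstract; the predicted labels, the tie-breaking rule, and the function names are hypothetical.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label; ties go to the earliest model in the list."""
    counts = Counter(labels)
    best = max(counts.values())
    for label in labels:  # iterate in model priority order to break ties
        if counts[label] == best:
            return label


def combine(per_model_predictions):
    """Combine per-spectrum predictions from several single models.

    `per_model_predictions` maps a model name to its list of predicted labels,
    one label per spectrum, with all lists in the same spectrum order.
    """
    columns = zip(*per_model_predictions.values())
    return [majority_vote(list(column)) for column in columns]


# Made-up predictions for three spectra from the three single models named above.
predictions = {
    "PCA-LDA": ["PE", "PP", "PS"],
    "PCA-KNN": ["PE", "PET", "PVC"],
    "MLP": ["PP", "PET", "PA"],
}
combined = combine(predictions)
```

    Here the third spectrum is a three-way tie, so the sketch falls back to the first model listed; weighting models by validation accuracy would be another reasonable design.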

  • Article type: Review
    Big data and data analysis methods and models are important tools in food security (FS) studies for gap analysis and for preparing appropriate analytical frameworks. These innovations necessitate the development of novel methods for collecting, storing, processing, and extracting data.
    The primary goal of this study was to conduct a critical review of agricultural big data and of the methods and models used for FS studies published in peer-reviewed journals since 2010. Approximately 130 articles were selected for full content review after the pre-screening process.
    Data are collected from many sources, including but not limited to online databases, the internet, omics, the Internet of Things, social media, survey rounds, remote sensing, and the Food and Agriculture Organization Corporate Statistical Database. The collected data require analysis (i.e., mining, neural networks, Bayesian networks, and other ML algorithms) before data visualization using Python, R, Circos, Gephi, Tableau, or Cytoscape. Approximately 122 models, all of which were used in FS studies worldwide, were selected from the 130 articles. However, most of these models addressed only one or two dimensions of FS (i.e., availability and access) and ignored the other dimensions (i.e., stability and utilization), creating a gap in the global context.
    There are certain FS gaps, both worldwide and in the United Arab Emirates, that need to be addressed by scientists and policymakers. Following the identification of the drivers, policies, and indicators, the findings of this review could be used to develop an appropriate analytical framework for FS and nutrition.

  • Article type: Journal Article
    An insightful understanding of the size-dependent optical signatures and precise regularity of nanosensors is critical for developing applications of plasmonic sensing. This work presents a systematic study of localized surface plasmon resonance (LSPR)-based nanosensors made of plasmonic silver nanocubes (AgNCs) with edge lengths of 59.84 ± 7.97 nm (no. 1 AgNCs), 75.70 ± 9.05 nm (no. 2 AgNCs) and 110.32 ± 14.63 nm (no. 3 AgNCs). The effects of size on the scattering signatures and refractive index (RI) sensitivities of the AgNCs were determined in situ using a multi-model co-localization approach on single AgNCs that combines dark-field microscopy (DFM), LSPR spectroscopy and scanning electron microscopy (SEM). As the edge length of a single AgNC grew, the colour of its scattered light underwent a bathochromic shift from monocolour to multicolour. The LSPR scattering spectra of no. 1 and no. 2 AgNCs exhibited a singlet and a singlet with a shoulder peak from the quadrupolar resonance mode, respectively. In contrast to the scattering signatures of no. 1 and no. 2 AgNCs, a plasmon line shape with two distinct peaks was observed on single no. 3 AgNCs. In situ studies of the scattering spectral response of single AgNCs to ambient solvents, and of probing small-molecule adsorbates on the surface of a single silver nanocube, reveal that the no. 2 AgNC is more suitable as a nanosensor owing to its strong regularity and higher sensitivity. The mechanism underlying the optical signatures was elucidated by combining the experiments with theoretical simulation.

  • Article type: Journal Article
    The geological conceptual model is considered a major source of uncertainty in groundwater modelling and well capture zone delineation. However, how to account for it in groundwater policy and management remains largely unresolved. We explore the drivers and barriers to account for geological conceptual uncertainty in groundwater protection amongst decision makers and stakeholders in an agricultural groundwater catchment in Denmark. Using a groundwater model, we analyze the impact of alternative geological conceptual models on capture zone delineation. A focus area, which covers multiple modelled capture zones, is defined and considered for groundwater protection. Model uncertainty and focus area are discussed at two workshops, one with local and national stakeholders and another with local farmers. The drivers to account for model uncertainty include: i) safer drinking water protection by considering a larger area for protection than identified from a single geological model; and ii) stability over time of management plans. The main barrier is the additional cost to the stakeholders for the protection of a larger area. We conclude that integration of geological uncertainty in groundwater protection plans may be improved through: i) better communication between the research community and the national water authority; ii) more constraining guidelines regarding the estimation of geological uncertainty; and iii) the development of a framework ensuring knowledge transfer to the local water authorities and detailing how to integrate uncertainty in management plans.

  • Article type: Journal Article
    Although the world understands the possible threat of future climate change, serious barriers to policy decisions remain unresolved. The scientific and societal uncertainties in climate change policy are likely a large part of these barriers. Following the Paris Agreement, the world has moved to the next stage of deciding on further action. Without a risk management perspective, any decision will amount to behavior "based on neglecting alternatives". The Ministry of the Environment, Japan established an interdisciplinary research project called Integrated Climate Assessment-Risks, Uncertainties, and Society (ICA-RUS), conducted by Dr. Seita Emori of the National Institute for Environmental Studies. ICA-RUS consists of five research themes, i.e., (1) synthesis of global climate risks, (2) optimization of land, water, and ecosystem for climate risks, (3) analysis of critical climate risks, (4) evaluation of climate risk management options, and (5) interactions between scientific and social rationalities. We participated in the fourth theme to provide a quantitative assessment of technology options and policy measures through integrated assessment model simulations. We employ a multi-model approach to deal with the complex relationships among fields such as technology, economics, and land use change. Four different types of integrated assessment models, i.e., MARIA-14 (Mori), EMEDA (Washida), GRAPE (Kurosawa), and AIM (Masui), participate in the fourth research theme. These models contribute to ICA-RUS by providing two categories of information. First, the models provide common simulation results based on the shared socioeconomic pathway scenarios and the shared climate policy cases given by the first theme of ICA-RUS, to show the range of the evaluation. Second, each model also provides model-specific outcomes that address special topics, e.g., geoengineering, sectoral trade, adaptation, and decision making under uncertainty. The purpose of this paper is to describe the outline and the main outcomes of the multi-model inter-comparison among the four models, with a focus on the first category. Furthermore, we introduce a statistical meta-analysis of the multi-model simulation results to see whether the differently structured models provide mutually consistent findings. The major findings of our activities are as follows. First, under the stringent climate target, regional economic losses tend to diverge among models, whereas the global total economic loss does not. Second, both carbon capture and storage (CCS) and BECCS are essential to the feasibility of stringent climate targets, even though their deployment potential varies among models. Third, the models show small changes in total world crop production, whereas large differences appear between regions. Fourth, the statistical meta-analysis of the multi-model simulation results suggests that the models share an implicit but common relationship between gross domestic product losses and mitigation options, even though their structures and simulation results differ. Since this study is no more than a preliminary exercise in statistical meta-analysis, more sophisticated methods such as data mining or machine learning could be applied to the simulation database to extract the implicit information behind the models.

  • Article type: Journal Article
    Gene expression data sets typically have thousands of genes but only a small number of samples, which challenges the computational methods that use them. This work aims to overcome that lack of samples.
    This paper introduces a multi-model artificial gene expression data generation framework in which different gene regulatory network (GRN) models contribute to the final set of samples according to the characteristics of their underlying paradigms. In the first stage, we build different GRN models and sample data from each of them separately. Then, we pool the generated samples into a rich set of gene expression samples. Finally, we select the best of the generated samples using a multi-objective selection method that measures sample quality from three aspects: compatibility, diversity and coverage. We use four alternative GRN models, namely, ordinary differential equations, probabilistic Boolean networks, a multi-objective genetic algorithm and a hierarchical Markov model.
    We conducted a comprehensive set of experiments on both real-life biological and synthetic gene expression data sets. We show that our multi-objective sample selection mechanism effectively combines samples from different models, reaching up to 95% compatibility, 10% diversity and 50% coverage. We show that the samples generated by our framework have up to 1.5x higher compatibility, 2x higher diversity and 2x higher coverage than the samples generated by the individual models that the multi-model framework uses. Moreover, the results show that the GRNs inferred from the samples generated by our framework can have 2.4x higher precision, 12x higher recall, and 5.4x higher f-measure values than the GRNs inferred from the original gene expression samples.
    Therefore, we show that we can significantly improve the quality of generated gene expression samples by integrating different computational models into one unified framework, without dealing with the complex internal details of each individual model. Moreover, the rich set of artificial gene expression samples is able to capture some biological relations that cannot be captured even by the original gene expression data set.
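
    The abstract above scores pooled samples on compatibility, diversity and coverage but does not specify the selection rule. A minimal sketch, assuming selection by Pareto dominance with all three objectives maximized, is shown below. The sample names, the scores, and the function names are hypothetical and not taken from the paper.

```python
def dominates(a, b):
    """True if score tuple `a` is at least as good as `b` everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates):
    """Return the names of candidates that no other candidate dominates.

    `candidates` maps a sample name to (compatibility, diversity, coverage),
    where larger values are assumed to be better for every objective.
    """
    front = []
    for name, score in candidates.items():
        dominated = any(
            dominates(other_score, score)
            for other_name, other_score in candidates.items()
            if other_name != name
        )
        if not dominated:
            front.append(name)
    return front


# Made-up scores for three pooled samples.
pooled = {
    "sample_1": (0.95, 0.05, 0.40),
    "sample_2": (0.80, 0.10, 0.50),
    "sample_3": (0.70, 0.04, 0.30),
}
selected = pareto_front(pooled)
```

    In this toy pool, `sample_3` is worse than `sample_1` on every objective and is dropped, while the other two trade compatibility against diversity and coverage and are both kept.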

  • Article type: Journal Article
    This paper deals with characterization and modelling of human handwriting motion from two forearm muscle activity signals, called electromyography signals (EMG). In this work, an experimental approach was used to record the coordinates of a pen tip moving on the (x, y) plane and EMG signals during the handwriting act. The main purpose is to design a new mathematical model which characterizes this biological process. Based on a multi-model approach, this system was originally developed to generate letters and geometric forms written by different writers. A Recursive Least Squares algorithm is used to estimate the parameters of each sub-model of the multi-model basis. Simulations show good agreement between predicted results and the recorded data.
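
    The abstract above names Recursive Least Squares (RLS) as the estimator for each sub-model's parameters. The sketch below shows the standard RLS update for a model that is linear in its parameters, y = θᵀφ. The regressor layout (two inputs standing in for two EMG channels), the initial covariance `delta`, the forgetting factor `lam`, and the data are illustrative assumptions, not the authors' model.

```python
def rls_update(theta, P, phi, y, lam=1.0):
    """One RLS step: update parameter estimate `theta` and covariance `P`.

    `phi` is the regressor vector, `y` the measured output, and `lam` the
    forgetting factor (1.0 means no forgetting).
    """
    n = len(theta)
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(phi[i] * P_phi[i] for i in range(n))
    gain = [value / denom for value in P_phi]
    error = y - sum(theta[i] * phi[i] for i in range(n))  # prediction error
    new_theta = [theta[i] + gain[i] * error for i in range(n)]
    # P is symmetric, so the row vector phi^T P equals (P phi)^T.
    new_P = [[(P[i][j] - gain[i] * P_phi[j]) / lam for j in range(n)] for i in range(n)]
    return new_theta, new_P


def fit_rls(samples, n_params, lam=1.0, delta=1e6):
    """Run RLS over (phi, y) pairs, starting from theta = 0 and P = delta * I."""
    theta = [0.0] * n_params
    P = [[delta if i == j else 0.0 for j in range(n_params)] for i in range(n_params)]
    for phi, y in samples:
        theta, P = rls_update(theta, P, phi, y, lam)
    return theta


# Noise-free toy data generated from y = 2.0 * u1 - 0.5 * u2.
samples = [([1.0, 0.0], 2.0), ([0.0, 1.0], -0.5), ([1.0, 1.0], 1.5), ([2.0, 1.0], 3.5)]
estimate = fit_rls(samples, n_params=2)
```

    On this toy data the estimate converges close to the generating parameters; a forgetting factor below 1.0 would let the estimate track parameters that drift over time, at the cost of more noise sensitivity.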