random forest classifier

  • Article type: Journal Article
    The identification of microbe-drug associations can greatly facilitate drug research and development. Traditional methods for screening microbe-drug associations are time-consuming, labor-intensive, and costly, so computational methods are a good alternative. However, most of them do not combine the abundant sequence information, structural information, and microbe-drug network topology.
    In this study, we developed a computational framework based on a modified graph attention variational autoencoder (MGAVAEMDA) to infer potential microbe-drug associations by combining biological information with the variational autoencoder. In MGAVAEMDA, we first used multiple databases, including microbial sequence, drug structure, and microbe-drug association databases, to establish two comprehensive feature matrices of microbes and drugs after multiple similarity computations, fusion, smoothing, and thresholding. Then, we employed a combination of variational autoencoder and graph attention to extract low-dimensional feature representations of microbes and drugs. Finally, the low-dimensional feature representations and the graph adjacency matrix were input into a random forest classifier to obtain microbe-drug association scores and identify potential microbe-drug associations. Moreover, to reduce model complexity and redundant computation and so improve efficiency, we introduced a modified graph convolutional neural network embedded into the variational autoencoder for computing the low-dimensional features.
    The experimental results demonstrate that the prediction performance of MGAVAEMDA is better than that of five state-of-the-art methods. For the major measurements (AUC = 0.9357, AUPR = 0.9378), the relative improvements of MGAVAEMDA over the second-best method are 1.76% and 1.47%, respectively.
    We conducted case studies on two drugs and found that more than 85% of the predicted associations have been reported in PubMed. The comprehensive experimental results validated the reliability of our model in accurately inferring potential microbe-drug associations.
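The relative-improvement figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below assumes that "relative improvement" means (score - baseline) / baseline. The baseline values it prints are back-calculated from the reported numbers and are not stated in the abstract. The function names are illustrative.

```python
def relative_improvement(score: float, baseline: float) -> float:
    """Relative gain of `score` over `baseline`, as a fraction."""
    return (score - baseline) / baseline


def implied_baseline(score: float, rel_gain: float) -> float:
    """Baseline implied by a score and its relative gain (assumed definition)."""
    return score / (1.0 + rel_gain)


# Reported in the abstract: AUC = 0.9357 (+1.76%), AUPR = 0.9378 (+1.47%).
# The baselines below are derived here, not reported by the authors.
auc_baseline = implied_baseline(0.9357, 0.0176)
aupr_baseline = implied_baseline(0.9378, 0.0147)
print(f"implied second-best AUC:  {auc_baseline:.4f}")
print(f"implied second-best AUPR: {aupr_baseline:.4f}")
```

Under that assumed definition, the second-best method would sit at roughly 0.92 on both metrics, so the absolute gains are about 0.015.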

  • Article type: Journal Article
    BACKGROUND: Health care-associated infections due to multidrug-resistant organisms (MDROs), such as methicillin-resistant Staphylococcus aureus (MRSA) and Clostridioides difficile infection (CDI), place a significant burden on our health care infrastructure.
    OBJECTIVE: Screening for MDROs is an important mechanism for preventing spread but is resource intensive. The objective of this study was to develop automated tools that can predict colonization or infection risk using electronic health record (EHR) data, provide useful information to aid infection control, and guide empiric antibiotic coverage.
    METHODS: We retrospectively developed a machine learning model to detect MRSA colonization and infection in undifferentiated patients at the time of sample collection from hospitalized patients at the University of Virginia Hospital. We used clinical and nonclinical features derived from on-admission and throughout-stay information in the patient's EHR data to build the model. In addition, we used a class of features derived from contact networks in EHR data; these network features can capture patients' contacts with providers and other patients, improving model interpretability and accuracy for predicting the outcome of surveillance tests for MRSA. Finally, we explored whether heterogeneous models for different patient subpopulations (for example, those admitted to an intensive care unit or emergency department, or those with specific testing histories) perform better.
    RESULTS: We found that penalized logistic regression performs better than other methods, and its area under the receiver operating characteristic curve improves by nearly 11% when we use a polynomial (second-degree) transformation of the features. Significant features in predicting MDRO risk include antibiotic use, surgery, use of devices, dialysis, the patient's comorbidity conditions, and network features. Among these, network features add the most value and improve the model's performance by at least 15%. The penalized logistic regression model with the same feature transformation also performs better than other models for specific patient subpopulations.
    CONCLUSIONS: Our study shows that MRSA risk prediction can be conducted quite effectively by machine learning methods using clinical and nonclinical features derived from EHR data. Network features are the most predictive and provide significant improvement over prior methods. Furthermore, heterogeneous prediction models for different patient subpopulations enhance the model's performance.
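The second-degree polynomial transformation mentioned in the results can be illustrated as follows. This is a minimal sketch of the usual degree-2 expansion (a bias term, the original features, then all squares and pairwise products). The abstract does not say how the authors implemented it, so the function name and the output ordering are assumptions.

```python
from itertools import combinations_with_replacement


def quadratic_features(row):
    """Expand one sample into [1, x_i ..., x_i * x_j for i <= j]."""
    linear = list(row)
    # combinations_with_replacement yields (x_i, x_j) with i <= j,
    # which covers both the squares and the cross terms.
    products = [a * b for a, b in combinations_with_replacement(linear, 2)]
    return [1.0] + linear + products


# Hypothetical two-feature sample.
expanded = quadratic_features([2.0, 3.0])
print(expanded)
```

With n input features the expansion has 1 + n + n(n + 1)/2 columns, so the feature count grows quadratically. That growth is one reason such a transformation is typically paired with a penalized (regularized) model, as in the abstract.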

  • Article type: Journal Article
    The bicipital groove is an important anatomical feature of the proximal humerus that needs to be identified during surgical planning for procedures such as shoulder arthroplasty and proximal humeral fracture reconstruction. Current algorithms for automatic identification are ineffective in arthritic humeri because of osteophytes, which reduces their usefulness for total shoulder arthroplasty. Our methodology uses a Random Forest Classifier (RFC) to automatically detect the bicipital groove on segmented computed tomography scans of humeri. We evaluated our model on two distinct test datasets: one comprising non-arthritic humeri and another comprising arthritic humeri with significant osteophytes. Our model detected the bicipital groove with a mean absolute error of less than 1 mm on arthritic humeri, a significant improvement over the previous gold standard approach. The bicipital groove was thus identified with a high degree of accuracy even in arthritic humeri. This model is open source and included in the Python package shoulder.

  • Article type: Journal Article
    Background and Objectives: Heart failure (HF) is a prevalent and debilitating condition that imposes a significant burden on healthcare systems and adversely affects the quality of life of patients worldwide. Comorbidities such as chronic kidney disease (CKD), arterial hypertension, and diabetes mellitus (DM) are common among HF patients, as they share similar risk factors. This study aimed to identify the prognostic significance of multiple factors and their correlation with disease prognosis and outcomes in a Jordanian cohort.
    Materials and Methods: Data from the Jordanian Heart Failure Registry (JoHFR) were analyzed, encompassing medical records from acute and chronic HF patients attending public and private cardiology clinics and hospitals across Jordan. An online form was used for data collection, focusing on three kidney function tests: estimated glomerular filtration rate (eGFR), blood urea nitrogen (BUN), and creatinine levels, with the eGFR calculated using the Cockcroft-Gault formula. We also built six machine learning models to predict mortality in our cohort.
    Results: From the JoHFR, 2151 HF patients were included, with 644, 1799, and 1927 records analyzed for eGFR, BUN, and creatinine levels, respectively. Age negatively impacted all measures (p ≤ 0.001), while smokers surprisingly showed better results than non-smokers (p ≤ 0.001). Males had more normal eGFR levels compared to females (p = 0.002). Comorbidities such as hypertension, diabetes, arrhythmias, and implanted devices were inversely related to eGFR (all with p-values < 0.05). Higher BUN levels were associated with chronic HF, dyslipidemia, and ASCVD (p ≤ 0.001). Higher creatinine levels were linked to hypertension, diabetes, dyslipidemia, arrhythmias, and previous HF history (all with p-values < 0.05). Low eGFR levels were associated with increased mechanical ventilation needs (p = 0.049) and mortality (p ≤ 0.001), while BUN levels did not significantly affect these outcomes. Machine learning analysis employing the Random Forest Classifier revealed that length of hospital stay and creatinine > 115 were the most significant predictors of mortality. The classifier achieved an accuracy of 90.02% with an AUC of 80.51%, indicating its efficacy in predictive modeling.
    Conclusions: This study reveals the intricate relationship among kidney function tests, comorbidities, and clinical outcomes in HF patients in Jordan, highlighting the importance of kidney function as a predictive tool. Integrating machine learning models into clinical practice may enhance the predictive accuracy of patient outcomes, thereby supporting a more personalized approach to managing HF and related kidney dysfunction. Further research is necessary to validate these findings and to develop innovative treatment strategies for the CKD population within the HF cohort.
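For reference, the Cockcroft-Gault formula named in the methods can be written as a short function. This is a sketch of the standard published formula, which strictly estimates creatinine clearance in mL/min. The patient values are hypothetical, and the unit of the "creatinine > 115" cutoff is not given in the abstract; the conversion below assumes it is in µmol/L.

```python
UMOL_PER_L_PER_MG_DL = 88.4  # creatinine: 1 mg/dL is about 88.4 umol/L


def cockcroft_gault(age_years, weight_kg, creatinine_mg_dl, female):
    """Estimated creatinine clearance in mL/min (Cockcroft-Gault)."""
    clearance = (140 - age_years) * weight_kg / (72 * creatinine_mg_dl)
    # The formula applies a 0.85 correction factor for women.
    return clearance * 0.85 if female else clearance


# Hypothetical patient: 60 years old, 72 kg, serum creatinine 1.0 mg/dL.
male_clearance = cockcroft_gault(60, 72, 1.0, female=False)
female_clearance = cockcroft_gault(60, 72, 1.0, female=True)

# Assumption: the abstract's cutoff of 115 is expressed in umol/L.
cutoff_mg_dl = 115 / UMOL_PER_L_PER_MG_DL
print(male_clearance, female_clearance, round(cutoff_mg_dl, 2))
```

Because the formula needs body weight as well as age, sex, and creatinine, it can only be evaluated for records that contain all four values.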

  • Article type: Journal Article
    The leading cause of cancer-related deaths worldwide is skin cancer. Effective therapy depends on the early diagnosis of skin cancer through the precise classification of skin lesions. However, dermatologists may find it difficult and time-consuming to classify skin lesions accurately. Using transfer learning to boost the precision of skin cancer classification models is a promising strategy. In this work, we propose a hybrid CNN with a transfer learning model and a random forest classifier for skin cancer detection. To evaluate its efficacy, the proposed model was verified on two datasets of benign and malignant skin moles. The proposed model classifies images with an accuracy of up to 90.11%. The empirical results and analysis support the feasibility and effectiveness of the proposed model for skin cancer classification.

  • Article type: Journal Article
    Background: Although acute kidney injury (AKI) is a common complication in patients undergoing hematopoietic stem cell transplantation (HSCT), its prophylaxis remains a clinical challenge. Attempts at prevention or early diagnosis focus on various methods for identifying factors that influence the incidence of AKI. Our aim was to test the potential of artificial intelligence (AI) for constructing a model that identifies parameters predicting AKI development.
    Methods: The analysis covered the clinical data of children followed up for 6 months after HSCT. Kidney function was assessed before conditioning therapy; 24 h after HSCT; 1, 2, 3, 4, and 8 weeks after transplantation; and, finally, 3 and 6 months post-transplant. The type of donor, conditioning protocol, and complications were incorporated into the model.
    Results: A random forest classifier (RFC) labeled the 93 patients according to the presence or absence of AKI. The RFC model revealed that the values of the estimated glomerular filtration rate (eGFR) before and just after HSCT, as well as methotrexate use, acute graft versus host disease (GvHD), and viral infection occurrence, were the major determinants of AKI incidence within the 6-month post-transplant observation period.
    Conclusions: Artificial intelligence seems a promising tool for predicting the potential risk of developing AKI, even before HSCT or just after the procedure.

  • Article type: Journal Article
    Chest X-ray images contain sufficient information to find widespread application in the diagnosis of diverse diseases and in decision making to assist medical experts. This paper proposes an intelligent approach to detect COVID-19 from chest X-ray images using a hybridization of deep convolutional neural network (CNN) and discrete wavelet transform (DWT) features. First, the X-ray image is enhanced and segmented through preprocessing tasks, and then deep CNN and DWT features are extracted. The optimum features are selected from these hybridized features through minimum redundancy and maximum relevance (mRMR) along with recursive feature elimination (RFE). Finally, a random forest-based bagging approach is used for the detection task. An extensive experiment was performed, and the results confirm that our approach gives satisfactory performance compared with existing methods, with an overall accuracy of more than 98.5%.
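The "random forest-based bagging" step named in this abstract rests on three ideas: bootstrap resampling, random feature selection, and majority voting. The pure-Python sketch below illustrates only those ideas, under loud simplifications: each "tree" is a depth-1 decision stump rather than a full tree, and the CNN/DWT feature extraction and mRMR/RFE selection stages are omitted. The toy data and all names are hypothetical, and this is not the authors' implementation.

```python
import random
from collections import Counter


def fit_stump(rows, labels, feature):
    """Best single-threshold split on one feature (fewest training errors)."""
    values = sorted(set(r[feature] for r in rows))
    majority = Counter(labels).most_common(1)[0][0]
    best = (len(labels) + 1, None, majority, majority)
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        left = [y for r, y in zip(rows, labels) if r[feature] <= thr]
        right = [y for r, y in zip(rows, labels) if r[feature] > thr]
        l_lab = Counter(left).most_common(1)[0][0]
        r_lab = Counter(right).most_common(1)[0][0]
        errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
        if errors < best[0]:
            best = (errors, thr, l_lab, r_lab)
    _, thr, l_lab, r_lab = best
    return feature, thr, l_lab, r_lab


def predict_stump(stump, row):
    feature, thr, l_lab, r_lab = stump
    if thr is None:  # the sample had a single feature value: use the majority
        return l_lab
    return l_lab if row[feature] <= thr else r_lab


def fit_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        feature = rng.randrange(len(rows[0]))       # random feature choice
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        forest.append(fit_stump(sample_rows, sample_labels, feature))
    return forest


def predict_forest(forest, row):
    """Majority vote over all stumps."""
    votes = Counter(predict_stump(s, row) for s in forest)
    return votes.most_common(1)[0][0]


# Toy, well-separated two-class data (hypothetical feature vectors).
rows = [[i / 10, i / 10] for i in range(10)] + [[5 + i / 10, 5 + i / 10] for i in range(10)]
labels = [0] * 10 + [1] * 10
forest = fit_forest(rows, labels)
low_pred = predict_forest(forest, [0.5, 0.5])
high_pred = predict_forest(forest, [5.5, 5.5])
```

In practice a library implementation with full decision trees would be used. The stump version is kept only because it makes the resampling and voting steps easy to read.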

  • Article type: Journal Article
    The human gut microbiota relies on complex carbohydrates (glycans) for energy and growth, primarily dietary fiber and host-derived mucins. We introduce a mathematical model of a glycan generalist and a mucin specialist in a two-compartment chemostat model of the human colon. Our objective is to characterize the influence of dietary fiber and mucin supply on the abundance of mucin-degrading species within the gut ecosystem. Current mathematical gut reactor models that include the enzymatic degradation of glycans do not differentiate between glycan types and their degraders. The model we present distinguishes between a generalist that can degrade both dietary fiber and mucin, and a specialist species that can only degrade mucin. The integrity of the colonic mucus barrier is essential for overall human health and well-being, with the mucin specialist Akkermansia muciniphila being associated with a healthy mucus layer. Competition, particularly between the specialist and generalists like Bacteroides thetaiotaomicron, may lead to mucus layer erosion, especially during periods of dietary fiber deprivation. Our model treats the colon as a gut reactor system, dividing it into two compartments that represent the lumen and the mucus of the gut, resulting in a complex system of ordinary differential equations with a large and uncertain parameter space. To understand the influence of model parameters on long-term behavior, we employ a random forest classifier, a supervised machine learning method. Additionally, a variance-based sensitivity analysis is used to determine the sensitivity of steady-state values to changes in model parameter inputs. By constructing this model, we can investigate the underlying mechanisms that control gut microbiota composition and function, free from confounding factors.

  • Article type: Journal Article
    Forest canopy cover (FCC) is essential in forest assessment and management, affecting ecosystem services such as carbon sequestration, wildlife habitat, and water regulation. Ongoing advancements in techniques for accurately and efficiently mapping and extracting FCC information require a thorough evaluation of their validity and reliability. The primary objectives of this study are to: (1) create a large-scale FCC dataset with a 1-m spatial resolution, (2) assess the spatial distribution of FCC at a regional scale, and (3) investigate differences in FCC areas among the Global Forest Change (Hansen et al., 2013) and U.S. Forest Service (USFS) Tree Canopy Cover products at various spatial scales in Arkansas (i.e., county and city levels). This study used high-resolution aerial imagery and a machine learning algorithm, processed and analyzed on the Google Earth Engine cloud computing platform, to produce the FCC dataset. The accuracy of this dataset was validated using one-third of the reference locations obtained from the Global Forest Change (Hansen et al., 2013) dataset and the National Agriculture Imagery Program (NAIP) aerial imagery with a 0.6-m spatial resolution. The results showed that the dataset successfully identified FCC at a 1-m resolution in the study area, with overall accuracy ranging between 83.31% and 94.35% per county. Spatial comparison between the produced FCC dataset and the Hansen et al. (2013) and USFS products indicated a strong positive correlation, with R² values ranging between 0.94 and 0.98 at the county and city levels. This dataset provides valuable information for monitoring, forecasting, and managing forest resources in Arkansas and beyond. The methodology followed in this study enhances efficiency, cost-effectiveness, and scalability, as it enables the processing of large-scale datasets with high computational demands in a cloud-based environment. It also demonstrates that machine learning and cloud computing technologies can generate high-resolution forest cover datasets, which might be helpful in other regions of the world.

  • Article type: Journal Article
    The rapid, low-cost, and efficient detection of SARS-CoV-2 virus infection, especially in clinical samples, remains a major challenge. A promising solution to this problem is the combination of a spectroscopic technique, surface-enhanced Raman spectroscopy (SERS), with advanced chemometrics based on machine learning (ML) algorithms. In the present study, we conducted SERS investigations of saliva and nasopharyngeal swabs taken from a cohort of patients (saliva: 175; nasopharyngeal swabs: 114). The SERS spectra obtained were analyzed using a range of classifiers, among which random forest (RF) achieved the best results; for saliva, for example, precision and recall were 94.0% and 88.9%, respectively. The results demonstrate that even with a relatively small number of clinical samples, the combination of SERS and shallow machine learning can be used to identify SARS-CoV-2 virus in clinical practice.
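Precision and recall, the two metrics reported for saliva in this abstract, are computed from true-positive, false-positive, and false-negative counts. The sketch below uses made-up labels purely for illustration. The study's own confusion matrix is not given in the abstract, and the function name is an assumption.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical labels: 1 = infected, 0 = not infected.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)
```

A precision higher than recall, as reported for saliva, means the classifier's positive calls were mostly correct while it missed a somewhat larger share of the truly positive samples.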