random forest classifier

  • Article type: Journal Article
    The identification of microbe-drug associations can greatly facilitate drug research and development. Traditional methods for screening microbe-drug associations are time-consuming, labor-intensive, and costly, so computational methods are a good alternative. However, most of them ignore the combination of abundant sequence information, structural information, and microbe-drug network topology.
    In this study, we developed a computational framework based on a modified graph attention variational autoencoder (MGAVAEMDA) to infer potential microbe-drug associations by combining biological information with the variational autoencoder. In MGAVAEMDA, we first used multiple databases, covering microbial sequences, drug structures, and microbe-drug associations, to establish two comprehensive feature matrices of microbes and drugs after multiple similarity computations, fusion, smoothing, and thresholding. Then, we employed a combination of a variational autoencoder and graph attention to extract low-dimensional feature representations of microbes and drugs. Finally, the low-dimensional feature representations and the graph adjacency matrix were input into a random forest classifier to obtain microbe-drug association scores and thereby identify potential microbe-drug associations. Moreover, to reduce model complexity and redundant computation and thus improve efficiency, we introduced a modified graph convolutional neural network embedded into the variational autoencoder for computing the low-dimensional features.
    The experimental results demonstrate that MGAVAEMDA outperforms five state-of-the-art methods. For the major measurements (AUC = 0.9357, AUPR = 0.9378), the relative improvements of MGAVAEMDA over the second-best method are 1.76% and 1.47%, respectively.
    We conducted case studies on two drugs and found that more than 85% of the predicted associations have been reported in PubMed. The comprehensive experimental results validated the reliability of our model in accurately inferring potential microbe-drug associations.
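    The relative-improvement figures in this abstract can be unpacked with a little arithmetic. The sketch below assumes "relative improvement" means (new - baseline) / baseline, which the abstract does not state explicitly; the runner-up scores it prints are back-calculated under that assumption and are not values reported by the authors. The function names are hypothetical.

```python
def relative_improvement(new: float, baseline: float) -> float:
    """Relative improvement of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


def implied_baseline(new: float, improvement_pct: float) -> float:
    """Back-calculate the baseline implied by a score and its relative improvement."""
    return new / (1.0 + improvement_pct / 100.0)


# Scores reported in the abstract for MGAVAEMDA.
auc, aupr = 0.9357, 0.9378

# Assumption: back-calculated runner-up scores, NOT reported in the abstract.
runner_up_auc = implied_baseline(auc, 1.76)
runner_up_aupr = implied_baseline(aupr, 1.47)

print(f"implied runner-up AUC:  {runner_up_auc:.4f}")
print(f"implied runner-up AUPR: {runner_up_aupr:.4f}")
```

    Under this reading, a 1.76% relative gain corresponds to an absolute AUC difference of roughly 0.016.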

  • Article type: Journal Article
    BACKGROUND: Health care-associated infections due to multidrug-resistant organisms (MDROs), such as methicillin-resistant Staphylococcus aureus (MRSA) and Clostridioides difficile (CDI), place a significant burden on our health care infrastructure.
    OBJECTIVE: Screening for MDROs is an important mechanism for preventing spread but is resource intensive. The objective of this study was to develop automated tools that can predict colonization or infection risk using electronic health record (EHR) data, provide useful information to aid infection control, and guide empiric antibiotic coverage.
    METHODS: We retrospectively developed a machine learning model to detect MRSA colonization and infection in undifferentiated patients at the time of sample collection from hospitalized patients at the University of Virginia Hospital. We used clinical and nonclinical features derived from on-admission and throughout-stay information from the patient's EHR data to build the model. In addition, we used a class of features derived from contact networks in EHR data; these network features can capture patients' contacts with providers and other patients, improving model interpretability and accuracy for predicting the outcome of surveillance tests for MRSA. Finally, we explored heterogeneous models for different patient subpopulations, for example, those admitted to an intensive care unit or emergency department or those with specific testing histories, to determine which perform better.
    RESULTS: We found that penalized logistic regression performs better than other methods, and this model's performance, measured by the area under the receiver operating characteristic curve (ROC-AUC), improves by nearly 11% when we use a polynomial (second-degree) transformation of the features. Some significant features in predicting MDRO risk include antibiotic use, surgery, use of devices, dialysis, the patient's comorbidity conditions, and network features. Among these, network features add the most value and improve the model's performance by at least 15%. The penalized logistic regression model with the same transformation of features also performs better than other models for specific patient subpopulations.
    CONCLUSIONS: Our study shows that MRSA risk prediction can be conducted quite effectively by machine learning methods using clinical and nonclinical features derived from EHR data. Network features are the most predictive and provide significant improvement over prior methods. Furthermore, heterogeneous prediction models for different patient subpopulations enhance the model's performance.
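    The second-degree polynomial transformation mentioned in the results can be illustrated as follows. This is a minimal stdlib sketch of one common form of the expansion (original features plus all squares and pairwise products, without a constant term); the abstract does not say which implementation or options the authors used, and the function name is hypothetical.

```python
from itertools import combinations_with_replacement
from math import prod


def polynomial_features(row, degree=2):
    """Expand one feature vector with all monomials up to `degree`.

    For degree 2 and input [a, b] this yields [a, b, a*a, a*b, b*b].
    No constant (bias) column is added here.
    """
    features = []
    for d in range(1, degree + 1):
        # Every multiset of d feature indices gives one monomial.
        for combo in combinations_with_replacement(range(len(row)), d):
            features.append(prod(row[i] for i in combo))
    return features


expanded = polynomial_features([2.0, 3.0])
print(expanded)
```

    The expanded vector lets a linear model such as penalized logistic regression capture interactions between features (for example, antibiotic use combined with a device), at the cost of a feature count that grows quadratically, which is one reason a penalty term is useful.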

  • Article type: Journal Article
    Background and Objectives: Heart failure (HF) is a prevalent and debilitating condition that imposes a significant burden on healthcare systems and adversely affects the quality of life of patients worldwide. Comorbidities such as chronic kidney disease (CKD), arterial hypertension, and diabetes mellitus (DM) are common among HF patients, as they share similar risk factors. This study aimed to identify the prognostic significance of multiple factors and their correlation with disease prognosis and outcomes in a Jordanian cohort. Materials and Methods: Data from the Jordanian Heart Failure Registry (JoHFR) were analyzed, encompassing medical records from acute and chronic HF patients attending public and private cardiology clinics and hospitals across Jordan. An online form was utilized for data collection, focusing on three kidney function tests, estimated glomerular filtration rate (eGFR), blood urea nitrogen (BUN), and creatinine levels, with the eGFR calculated using the Cockcroft-Gault formula. We also built six machine learning models to predict mortality in our cohort. Results: From the JoHFR, 2151 HF patients were included, with 644, 1799, and 1927 records analyzed for eGFR, BUN, and creatinine levels, respectively. Age negatively impacted all measures (p ≤ 0.001), while smokers surprisingly showed better results than non-smokers (p ≤ 0.001). Males had more normal eGFR levels compared to females (p = 0.002). Comorbidities such as hypertension, diabetes, arrhythmias, and implanted devices were inversely related to eGFR (all with p-values <0.05). Higher BUN levels were associated with chronic HF, dyslipidemia, and ASCVD (p ≤ 0.001). Higher creatinine levels were linked to hypertension, diabetes, dyslipidemia, arrhythmias, and previous HF history (all with p-values <0.05). Low eGFR levels were associated with increased mechanical ventilation needs (p = 0.049) and mortality (p ≤ 0.001), while BUN levels did not significantly affect these outcomes. 
    Machine learning analysis employing the Random Forest Classifier revealed that length of hospital stay and creatinine >115 were the most significant predictors of mortality. The classifier achieved an accuracy of 90.02% with an AUC of 80.51%, indicating its efficacy in predictive modeling. Conclusions: This study reveals the intricate relationship among kidney function tests, comorbidities, and clinical outcomes in HF patients in Jordan, highlighting the importance of kidney function as a predictive tool. Integrating machine learning models into clinical practice may enhance the predictive accuracy of patient outcomes, thereby supporting a more personalized approach to managing HF and related kidney dysfunction. Further research is necessary to validate these findings and to develop innovative treatment strategies for the CKD population within the HF cohort.
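    Accuracy and AUC, both reported above, measure different things: accuracy depends on a decision threshold, while ROC AUC is the probability that a randomly chosen positive case is scored above a randomly chosen negative one. The sketch below computes both on a small made-up data set; the labels and scores are illustrative assumptions, not data from the registry, and the function names are hypothetical.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases classified correctly when predicting 1 for score >= threshold."""
    correct = sum((s >= threshold) == (y == 1) for y, s in zip(labels, scores))
    return correct / len(labels)


# Made-up, imbalanced toy data: 8 survivors (0) and 2 deaths (1).
toy_labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
toy_scores = [0.1, 0.2, 0.1, 0.3, 0.2, 0.4, 0.1, 0.6, 0.7, 0.3]

print("accuracy:", accuracy(toy_labels, toy_scores))
print("ROC AUC: ", roc_auc(toy_labels, toy_scores))
```

    Because the two numbers answer different questions, it helps to read them together, especially when one outcome is much rarer than the other.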

  • Article type: Journal Article
    Background: Although acute kidney injury (AKI) is a common complication in patients undergoing hematopoietic stem cell transplantation (HSCT), its prophylaxis remains a clinical challenge. Attempts at prevention or early diagnosis focus on various methods for identifying factors that influence the incidence of AKI. Our aim was to test the potential of artificial intelligence (AI) for constructing a model that identifies parameters predicting AKI development. Methods: The analysis covered the clinical data of children followed up for 6 months after HSCT. Kidney function was assessed before conditioning therapy, 24 h after HSCT, 1, 2, 3, 4, and 8 weeks after transplantation, and, finally, 3 and 6 months post-transplant. The type of donor, conditioning protocol, and complications were incorporated into the model. Results: A random forest classifier (RFC) labeled the 93 patients according to the presence or absence of AKI. The RFC model revealed that the values of the estimated glomerular filtration rate (eGFR) before and just after HSCT, as well as methotrexate use, acute graft versus host disease (GvHD), and viral infection occurrence, were the major determinants of AKI incidence within the 6-month post-transplant observation period. Conclusions: Artificial intelligence seems to be a promising tool for predicting the potential risk of developing AKI, even before HSCT or just after the procedure.

  • Article type: Journal Article
    Chest X-ray images contain sufficient information to find widespread application in diverse disease diagnosis and decision-making tasks that assist medical experts. This paper proposes an intelligent approach to detect Covid-19 from chest X-ray images using a hybridization of deep convolutional neural network (CNN) and discrete wavelet transform (DWT) features. First, the X-ray image is enhanced and segmented through preprocessing tasks, and then deep CNN and DWT features are extracted. The optimum features are selected from these hybridized features through minimum redundancy and maximum relevance (mRMR) along with recursive feature elimination (RFE). Finally, a random forest-based bagging approach is used for the detection task. An extensive experiment is performed, and the results confirm that our approach gives satisfactory performance compared to the existing methods, with an overall accuracy of more than 98.5%.
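    The bagging idea behind a random forest can be sketched in a few lines. The example below is a deliberate simplification, not the authors' pipeline: it bags one-split decision stumps rather than full trees, omits the random feature subsetting a true random forest adds, and runs on made-up one-dimensional data. All names are hypothetical.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Find the single (feature, threshold) split with the fewest training errors."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [label for row, label in zip(X, y) if row[f] <= t]
            right = [label for row, label in zip(X, y) if row[f] > t]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = sum(v != left_label for v in left) + sum(v != right_label for v in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    return best[1:]


def predict_stump(stump, row):
    f, t, left_label, right_label = stump
    return left_label if row[f] <= t else right_label


def fit_bagging(X, y, n_estimators=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return stumps


def predict_bagging(stumps, row):
    """Aggregate the ensemble by majority vote."""
    votes = Counter(predict_stump(s, row) for s in stumps)
    return votes.most_common(1)[0][0]


# Made-up toy data: one feature, two well-separated classes.
X_toy = [[0.1], [0.2], [0.3], [0.8], [0.9], [1.0]]
y_toy = [0, 0, 0, 1, 1, 1]
ensemble = fit_bagging(X_toy, y_toy)
print(predict_bagging(ensemble, [0.0]), predict_bagging(ensemble, [1.2]))
```

    Averaging many models trained on resampled data reduces variance, which is the main reason bagged tree ensembles tend to be more stable than a single tree.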

  • Article type: Journal Article
    Forest canopy cover (FCC) is essential in forest assessment and management, affecting ecosystem services such as carbon sequestration, wildlife habitat, and water regulation. Ongoing advancements in techniques for accurately and efficiently mapping and extracting FCC information require a thorough evaluation of their validity and reliability. The primary objectives of this study are to: (1) create a large-scale FCC dataset with a 1-m spatial resolution, (2) assess the spatial distribution of FCC at a regional scale, and (3) investigate differences in FCC areas among the Global Forest Change (Hansen et al., 2013) and U.S. Forest Service (USFS) Tree Canopy Cover products at various spatial scales in Arkansas (i.e., county and city levels). This study utilized high-resolution aerial imagery and a machine learning algorithm, processed and analyzed using the Google Earth Engine cloud computing platform, to produce the FCC dataset. The accuracy of this dataset was validated using one-third of the reference locations obtained from the Global Forest Change (Hansen et al., 2013) dataset and the National Agriculture Imagery Program (NAIP) aerial imagery with a 0.6-m spatial resolution. The results showed that the dataset successfully identified FCC at a 1-m resolution in the study area, with overall accuracy ranging between 83.31% and 94.35% per county. Spatial comparison results between the produced FCC dataset and the Hansen et al. (2013) and USFS products indicated a strong positive correlation, with R² values ranging between 0.94 and 0.98 for county and city levels. This dataset provides valuable information for monitoring, forecasting, and managing forest resources in Arkansas and beyond. The methodology followed in this study enhances efficiency, cost-effectiveness, and scalability, as it enables the processing of large-scale datasets with high computational demands in a cloud-based environment. It also demonstrates that machine learning and cloud computing technologies can generate high-resolution forest cover datasets, which might be helpful in other regions of the world.
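    The R² values used to compare FCC products can be computed as the squared Pearson correlation between two sets of area estimates. The abstract does not say whether R² came from a correlation or a fitted regression; for a simple linear regression with an intercept the two coincide, and the sketch below assumes that case. The county-level numbers are invented for illustration only.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Invented example: canopy area per county from two hypothetical products.
product_a = [120.0, 340.0, 560.0, 210.0, 480.0]
product_b = [115.0, 355.0, 540.0, 225.0, 470.0]
print(f"R^2 = {r_squared(product_a, product_b):.3f}")
```

    A high R² shows that two products rank and scale areas similarly; it does not by itself rule out a systematic offset between them.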

  • Article type: Journal Article
    The rapid, low-cost, and efficient detection of SARS-CoV-2 virus infection, especially in clinical samples, remains a major challenge. A promising solution to this problem is the combination of a spectroscopic technique, surface-enhanced Raman spectroscopy (SERS), with advanced chemometrics based on machine learning (ML) algorithms. In the present study, we conducted SERS investigations of saliva and nasopharyngeal swabs taken from a cohort of patients (saliva: 175; nasopharyngeal swabs: 114). The obtained SERS spectra were analyzed using a range of classifiers, among which random forest (RF) achieved the best results; e.g., for saliva, the precision and recall were 94.0% and 88.9%, respectively. The results demonstrate that even with a relatively small number of clinical samples, the combination of SERS and shallow machine learning can be used to identify SARS-CoV-2 virus in clinical practice.
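    Precision and recall, the two metrics quoted for saliva samples, follow directly from the counts of true positives, false positives, and false negatives. The sketch below uses made-up labels rather than the study's data, and the function name is hypothetical.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the given positive class label."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up example: 1 = infected, 0 = not infected.
truth = [1, 1, 1, 0, 0, 1]
predicted = [1, 1, 0, 0, 1, 1]
print(precision_recall(truth, predicted))
```

    Precision answers "of the samples flagged as infected, how many truly were?", while recall answers "of the truly infected samples, how many were flagged?".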

  • Article type: Journal Article
    BACKGROUND: Previous models for predicting delirium after cardiac surgery remained inadequate. This study aimed to develop and validate a machine learning-based prediction model for postoperative delirium (POD) in cardiac valve surgery patients.
    METHODS: The electronic medical information of the cardiac surgical intensive care unit (CSICU) was extracted from a tertiary and major referral hospital in southern China over 1 year, from June 2019 to June 2020. A total of 507 patients admitted to the CSICU after cardiac valve surgery were included in this study. Seven classical machine learning algorithms (Random Forest Classifier, Logistic Regression, Support Vector Machine Classifier, K-nearest Neighbors Classifier, Gaussian Naive Bayes, Gradient Boosting Decision Tree, and Perceptron) were used to develop delirium prediction models under full (q = 31) and selected (q = 19) feature sets, respectively.
    RESULTS: The Random Forest classifier performs exceptionally well in both feature datasets, with an Area Under the Curve (AUC) of 0.92 for the full feature dataset and an AUC of 0.86 for the selected feature dataset. Additionally, it achieves a relatively lower Expected Calibration Error (ECE) and the highest Average Precision (AP), with an AP of 0.80 for the full feature dataset and an AP of 0.73 for the selected feature dataset. To further evaluate the best-performing Random Forest classifier, SHAP (Shapley Additive Explanations) was used, and the importance matrix plot, scatter plots, and summary plots were generated.
    CONCLUSIONS: We established machine learning-based prediction models to predict POD in patients undergoing cardiac valve surgery. The random forest model has the best predictive performance and can help improve the prognosis of patients with POD.
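    The Expected Calibration Error (ECE) reported in the results summarizes how far predicted probabilities are from observed event rates. The sketch below implements one common binary formulation with equal-width bins; the abstract does not state which ECE variant or bin count was used, so both are assumptions here, and the inputs are invented.

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """Weighted mean gap between predicted probability and observed positive rate.

    `probs` are predicted probabilities of the positive class (0..1),
    `labels` are 0/1 outcomes. Bins are equal-width over [0, 1].
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_prob = sum(p for p, _ in members) / len(members)
        frac_pos = sum(y for _, y in members) / len(members)
        ece += len(members) / total * abs(mean_prob - frac_pos)
    return ece


# Invented example: the model says 50% for four patients, but only one has delirium.
print(expected_calibration_error([0.5, 0.5, 0.5, 0.5], [1, 0, 0, 0]))
```

    A low ECE means a predicted risk of, say, 30% corresponds to roughly 30% of such patients developing the outcome, which matters when predictions are used to guide care rather than only to rank patients.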

  • Article type: Journal Article
    Human-to-human communication via the computer is mainly carried out using a keyboard or microphone. In the field of virtual reality (VR), where the most immersive experience possible is desired, the use of a keyboard contradicts this goal, while the use of a microphone is not always desirable (e.g., silent commands during task-force training) or simply not possible (e.g., if the user has hearing loss). Data gloves help to increase immersion within VR, as they correspond to our natural interaction. At the same time, they offer the possibility of accurately capturing hand shapes, such as those used in non-verbal communication (e.g., thumbs up, okay gesture, …) and in sign language. In this paper, we present a hand-shape recognition system using Manus Prime X data gloves, including data acquisition, data preprocessing, and data classification, to enable nonverbal communication within VR. We investigate the impact on accuracy and classification time of using an outlier detection and a feature selection approach in our data preprocessing. To obtain a more generalized approach, we also studied the impact of artificial data augmentation, i.e., we created new artificial data from the recorded and filtered data to augment the training data set. With our approach, 56 different hand shapes could be distinguished with an accuracy of up to 93.28%. With a reduced number of 27 hand shapes, an accuracy of up to 95.55% could be achieved. The voting meta-classifier (VL2) proved to be the most accurate, albeit slowest, classifier. A good alternative is random forest (RF), which was even able to achieve better accuracy values in a few cases and was generally somewhat faster. Outlier detection proved to be an effective approach, especially in improving the classification time. Overall, we have shown that our hand-shape recognition system using data gloves is suitable for communication within VR.
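    A voting meta-classifier combines the outputs of several base classifiers. The abstract does not say whether VL2 uses hard (majority) or soft (probability-averaging) voting, so the sketch below shows soft voting purely as an assumption; the hand-shape labels and probabilities are invented, and the function name is hypothetical.

```python
def soft_vote(probability_maps):
    """Pick the label with the highest summed probability across classifiers.

    Each element of `probability_maps` maps label -> predicted probability
    from one base classifier.
    """
    totals = {}
    for probs in probability_maps:
        for label, p in probs.items():
            totals[label] = totals.get(label, 0.0) + p
    return max(totals, key=totals.get)


# Invented outputs of three base classifiers for one glove reading.
outputs = [
    {"thumbs_up": 0.90, "okay": 0.10},
    {"thumbs_up": 0.40, "okay": 0.60},
    {"thumbs_up": 0.45, "okay": 0.55},
]
print(soft_vote(outputs))
```

    In this example two of the three classifiers lean toward "okay", yet soft voting selects "thumbs_up" because one classifier is far more confident. Querying every base model for every prediction is also why a voting ensemble tends to be slower than a single classifier, consistent with the trade-off the abstract describes.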

  • Article type: Journal Article
    COVID-19 mortality prediction. Background: COVID-19 has become a major global public health problem, despite prevention efforts. The daily number of COVID-19 cases is increasing rapidly, and the time and financial costs associated with testing procedures are burdensome. Method: To overcome this, we aim to identify immunological and metabolic biomarkers to predict COVID-19 mortality using a machine learning model. We included inpatients from Hong Kong's public hospitals between January 1 and September 30, 2020, who were diagnosed with COVID-19 using RT-PCR. We developed three machine learning models to predict the mortality of COVID-19 patients based on data in their electronic medical records. We performed statistical analysis to compare the trained machine learning models, namely Deep Neural Networks (DNN), Random Forest Classifier (RF), and Support Vector Machine (SVM), using data from a cohort of 5,059 patients (median age = 46 years; 49.3% male) who had tested positive for COVID-19 based on electronic health records and data from 532,427 patients as controls. Result: We identified the top 20 immunological and metabolic biomarkers that can accurately predict the risk of mortality from COVID-19, with a ROC-AUC of 0.98 (95% CI 0.96-0.98). Of the three models used, our results demonstrate that the random forest (RF) model achieved the most accurate prediction of mortality among COVID-19 patients, with age, glomerular filtration, albumin, urea, procalcitonin, C-reactive protein, oxygen, bicarbonate, carbon dioxide, ferritin, glucose, erythrocytes, creatinine, lymphocytes, blood pH, and leukocytes among the most important biomarkers identified. A cohort from Kwong Wah Hospital (131 patients) was used for model validation, with a ROC-AUC of 0.90 (95% CI 0.84-0.92). Conclusion: We recommend physicians closely monitor hematological, coagulation, cardiac, hepatic, renal, and inflammatory factors for potential progression to severe conditions among COVID-19 patients. To the best of our knowledge, no previous research has identified important immunological and metabolic biomarkers to the extent demonstrated in our study.
    The online version contains supplementary material available at 10.1186/s44247-022-00001-0.