Supervised learning

  • Article type: Journal Article
    BACKGROUND: Venovenous extracorporeal membrane oxygenation (VV-ECMO) is a therapy for patients with refractory respiratory failure. The decision to decannulate someone from extracorporeal membrane oxygenation (ECMO) often involves weaning trials and clinical intuition. To date, there are limited prognostication metrics to guide clinical decision-making to determine which patients will be successfully weaned and decannulated.
    OBJECTIVE: This study aims to assist clinicians with the decision to decannulate a patient from ECMO, using Continuous Evaluation of VV-ECMO Outcomes (CEVVO), a deep learning-based model for predicting success of decannulation in patients supported on VV-ECMO. The running metric may be applied daily to categorize patients into high-risk and low-risk groups. Using these data, providers may consider initiating a weaning trial based on their expertise and CEVVO.
    METHODS: Data were collected from 118 patients supported with VV-ECMO at the Columbia University Irving Medical Center. Using a long short-term memory-based network, CEVVO is the first model capable of integrating discrete clinical information with continuous data collected from an ECMO device. A total of 12 sets of 5-fold cross validations were conducted to assess the performance, which was measured using the area under the receiver operating characteristic curve (AUROC) and average precision (AP). To translate the predicted values into a clinically useful metric, the model results were calibrated and stratified into risk groups, ranging from 0 (high risk) to 3 (low risk). To further investigate the performance edge of CEVVO, 2 synthetic data sets were generated using Gaussian process regression. The first data set preserved the long-term dependency of the patient data set, whereas the second did not.
    RESULTS: CEVVO demonstrated consistently superior classification performance compared with contemporary models (P<.001 and P=.04 compared with the next highest AUROC and AP). Although the model's patient-by-patient predictive power may be too low to be integrated into a clinical setting (AUROC 95% CI 0.6822-0.7055; AP 95% CI 0.8515-0.8682), the patient risk classification system displayed greater potential. When measured at 72 hours, the high-risk group had a successful decannulation rate of 58% (7/12), whereas the low-risk group had a successful decannulation rate of 92% (11/12; P=.04). When measured at 96 hours, the high- and low-risk groups had a successful decannulation rate of 54% (6/11) and 100% (9/9), respectively (P=.01). We hypothesized that the improved performance of CEVVO was owing to its ability to efficiently capture transient temporal patterns. Indeed, CEVVO exhibited improved performance on synthetic data with inherent temporal dependencies (P<.001) compared with logistic regression and a dense neural network.
    CONCLUSIONS: The ability to interpret and integrate large data sets is paramount for creating accurate models capable of assisting clinicians in risk stratifying patients supported on VV-ECMO. Our framework may guide future incorporation of CEVVO into more comprehensive intensive care monitoring systems.
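The two evaluation metrics named in this abstract, AUROC and average precision (AP), can be computed directly from binary labels and predicted scores. The following is a minimal pure-Python sketch, not the study's code: the function names `auroc` and `average_precision` and the toy labels and scores are assumptions made for illustration only.

```python
from typing import Sequence


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outranks a random
    negative, counting tied scores as half a win."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AP as the mean of the precision values observed at each rank where a
    positive case appears, with cases sorted by descending score.
    Tied scores keep their input order in this simplified version."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            precision_sum += hits / rank
    if hits == 0:
        raise ValueError("AP needs at least one positive")
    return precision_sum / hits


# Hypothetical toy data: 1 = successful decannulation, 0 = unsuccessful.
labels = [1, 0, 1, 1, 0, 1]
predicted = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
toy_auroc = auroc(labels, predicted)
toy_ap = average_precision(labels, predicted)
print(f"AUROC={toy_auroc:.3f}  AP={toy_ap:.3f}")
```

AP depends on the proportion of positive cases, so a data set in which most patients are successfully decannulated can show a high AP alongside a modest AUROC, which is one way to read the pair of intervals reported above.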

  • Article type: Journal Article
    BACKGROUND: In the hospital setting, frailty is a significant risk factor, but difficult to measure in clinical practice. We propose a reweighting of an existing diagnoses-based frailty score using routine data from a tertiary care teaching hospital in southern Germany.
    METHODS: The dataset includes patient characteristics such as sex, age, primary and secondary diagnoses and in-hospital mortality. Based on this information, we recalculate the existing Hospital Frailty Risk Score. The cohort includes patients aged ≥ 75 and was divided into a development cohort (admission year 2011 to 2013, N = 30,525) and a validation cohort (2014, N = 11,202). A limited external validation is also conducted in a second validation cohort containing inpatient cases aged ≥ 75 in 2022 throughout Germany (N = 491,251). In the development cohort, LASSO regression analysis was used to select the most relevant variables and to generate a reweighted Frailty Score for the German setting. Discrimination is assessed using the area under the receiver operating characteristic curve (AUC). Visualization of calibration curves and decision curve analysis were carried out. Applicability of the reweighted Frailty Score in a non-elderly population was assessed using logistic regression models.
    RESULTS: Reweighting of the Frailty Score included only 53 out of the 109 frailty-related diagnoses and resulted in substantially better discrimination than the initial weighting of the score (AUC = 0.89 vs. AUC = 0.80, p < 0.001 in the validation cohort). Calibration curves show a good agreement between score-based predictions and actual observed mortality. Additional external validation using inpatient cases aged ≥ 75 in 2022 throughout Germany (N = 491,251) confirms the results regarding discrimination and calibration and underlines the geographic and temporal validity of the reweighted Frailty Score. Decision curve analysis indicates that the clinical usefulness of the reweighted score as a general decision support tool is superior to the initial version of the score. Assessment of the applicability of the reweighted Frailty Score in a non-elderly population (N = 198,819) shows that discrimination is superior to the initial version of the score (AUC = 0.92 vs. AUC = 0.87, p < 0.001). In addition, we observe a fairly age-stable influence of the reweighted Frailty Score on in-hospital mortality, which does not differ substantially for women and men.
    CONCLUSIONS: Our data indicate that the reweighted Frailty Score is superior to the original Frailty Score for identification of older, frail patients at risk for in-hospital mortality. Hence, we recommend using the reweighted Frailty Score in the German in-hospital setting.

  • Article type: Journal Article
    Early-stage Alzheimer's disease (AD) and frontotemporal dementia (FTD) share similar symptoms, complicating their diagnosis and the development of specific treatment strategies. Our study evaluated multiple feature extraction techniques for identifying AD and FTD biomarkers from electroencephalographic (EEG) signals. We developed an optimised machine learning architecture that integrates sliding windowing, feature extraction, and supervised learning to distinguish between AD and FTD patients, as well as from healthy controls (HCs). Our model, with a 90% overlap for sliding windowing, SVD entropy for feature extraction, and K-Nearest Neighbors (KNN) for supervised learning, achieved a mean F1-score and accuracy of 93% and 91%, 92.5% and 93%, and 91.5% and 91% for discriminating AD and HC, FTD and HC, and AD and FTD, respectively. The feature importance array, an explainable AI feature, highlighted the brain lobes that contributed to identifying and distinguishing AD and FTD biomarkers. This research introduces a novel framework for detecting and discriminating AD and FTD using EEG signals, addressing the need for accurate early-stage diagnostics. Furthermore, a comparative evaluation of sliding windowing, multiple feature extraction, and machine learning methods on AD/FTD detection and discrimination is documented.
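The first two stages described in this abstract, sliding windowing with 90% overlap and SVD entropy as the per-window feature, can be sketched as below. This is an illustrative NumPy sketch under stated assumptions, not the authors' pipeline: the window length, the embedding `order` and `delay`, the helper names, and the synthetic signals are all hypothetical, and the KNN classification stage is omitted.

```python
import numpy as np


def sliding_windows(signal: np.ndarray, window_size: int, overlap: float = 0.9) -> np.ndarray:
    """Split a 1-D signal into fixed-length windows; consecutive windows
    share roughly `overlap` of their samples."""
    step = max(1, int(round(window_size * (1.0 - overlap))))
    starts = range(0, len(signal) - window_size + 1, step)
    return np.array([signal[s:s + window_size] for s in starts])


def svd_entropy(window: np.ndarray, order: int = 3, delay: int = 1) -> float:
    """Shannon entropy (in bits) of the normalized singular values of the
    time-delay embedding matrix of `window`."""
    span = (order - 1) * delay
    n_rows = len(window) - span
    # Each row holds `order` samples spaced `delay` apart.
    embedded = np.array([window[i:i + span + 1:delay] for i in range(n_rows)])
    singular_values = np.linalg.svd(embedded, compute_uv=False)
    total = singular_values.sum()
    if total == 0:
        return 0.0  # an all-zero window carries no structure to measure
    p = singular_values / total
    p = p[p > 0]  # treat 0 * log(0) as 0
    return float(-np.sum(p * np.log2(p)))


# Hypothetical stand-ins for one EEG channel: a flat signal and white noise.
flat = np.ones(1000)
noise = np.random.default_rng(0).standard_normal(1000)

windows = sliding_windows(flat, window_size=100, overlap=0.9)
flat_entropy = svd_entropy(windows[0])
noise_entropy = svd_entropy(sliding_windows(noise, 100)[0])
print(windows.shape, flat_entropy, noise_entropy)
```

In a full pipeline of this kind, one entropy value per window and channel would form the feature vector passed to the classifier. Because heavily overlapping windows from the same recording are strongly correlated, train and test splits are usually made by subject rather than by window.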

  • Article type: Journal Article
    Falls by the elderly pose considerable health hazards, leading not only to physical harm but also to a number of other related problems. A timely alert about a deteriorating gait, as an indication of an impending fall, can assist in fall prevention. In this investigation, a comprehensive comparative analysis was conducted between a commercially available mobile phone system and two wristband systems: one commercially available and another representing a novel approach. Each system was equipped with a single three-axis accelerometer. The walk suggestive of a potential fall was induced by special glasses worn by the participants. The same standard machine-learning techniques were employed for the classification with all three systems based on a single three-axis accelerometer, yielding a best average accuracy of 86%, a specificity of 88%, and a sensitivity of 86% via the support vector machine (SVM) method using a wristband. A smartphone, on the other hand, achieved a best average accuracy of 73%, also with an SVM using only a three-axis accelerometer sensor. The significance analysis of the mean accuracy, sensitivity, and specificity between the innovative wristband and the smartphone yielded a p-value of 0.000. Furthermore, the study applied unsupervised and semi-supervised learning methods, incorporating principal component analysis and t-distributed stochastic neighbor embedding. To sum up, both wristbands demonstrated the usability of wearable sensors in the early detection and mitigation of falls in the elderly, outperforming the smartphone.

  • Article type: Journal Article
    BACKGROUND: Negation and speculation unrelated to abnormal findings can lead to false-positive alarms for automatic radiology report highlighting or flagging by laboratory information systems.
    OBJECTIVE: This internal validation study evaluated the performance of natural language processing methods (NegEx, NegBio, NegBERT, and transformers).
    METHODS: We annotated all negative and speculative statements unrelated to abnormal findings in reports. In experiment 1, we fine-tuned several transformer models (ALBERT [A Lite Bidirectional Encoder Representations from Transformers], BERT [Bidirectional Encoder Representations from Transformers], DeBERTa [Decoding-Enhanced BERT With Disentangled Attention], DistilBERT [Distilled version of BERT], ELECTRA [Efficiently Learning an Encoder That Classifies Token Replacements Accurately], ERNIE [Enhanced Representation through Knowledge Integration], RoBERTa [Robustly Optimized BERT Pretraining Approach], SpanBERT, and XLNet) and compared their performance using precision, recall, accuracy, and F1-scores. In experiment 2, we compared the best model from experiment 1 with 3 established negation and speculation-detection algorithms (NegEx, NegBio, and NegBERT).
    RESULTS: Our study collected 6000 radiology reports from 3 branches of the Chi Mei Hospital, covering multiple imaging modalities and body parts. A total of 15.01% (105,755/704,512) of words and 39.45% (4529/11,480) of important diagnostic keywords occurred in negative or speculative statements unrelated to abnormal findings. In experiment 1, all models achieved an accuracy of >0.98 and F1-score of >0.90 on the test data set. ALBERT exhibited the best performance (accuracy=0.991; F1-score=0.958). In experiment 2, ALBERT outperformed the optimized NegEx, NegBio, and NegBERT methods in terms of overall performance (accuracy=0.996; F1-score=0.991), in the prediction of whether diagnostic keywords occur in speculative statements unrelated to abnormal findings, and in the improvement of the performance of keyword extraction (accuracy=0.996; F1-score=0.997).
    CONCLUSIONS: The ALBERT deep learning method showed the best performance. Our results represent a significant advancement in the clinical applications of computer-aided notification systems.

  • Article type: Journal Article
    Exposure to harmful and potentially harmful constituents in cigarette smoke is a risk factor for cardiovascular and respiratory diseases. Tobacco products that could reduce exposure to these constituents have been developed. However, the long-term effects of their use on health remain unclear. The Population Assessment of Tobacco and Health (PATH) study is a population-based study examining the health effects of smoking and cigarette smoking habits in the U.S.
    Participants include users of tobacco products, including electronic cigarettes and smokeless tobacco. In this study, we attempted to evaluate the population-wide effects of these products, using machine learning techniques and data from the PATH study.
    Biomarkers of exposure (BoE) and potential harm (BoPH) in cigarette smokers and former smokers in wave 1 of PATH were used to create binary classification machine-learning models that classified participants as either current (BoE: N = 102, BoPH: N = 428) or former smokers (BoE: N = 102, BoPH: N = 428). Data on the BoE and BoPH of users of electronic cigarettes (BoE: N = 210, BoPH: N = 258) and smokeless tobacco (BoE: N = 206, BoPH: N = 242) were input into the models to investigate whether these product users were classified as current or former smokers. The disease status of individuals classified as either current or former smokers was investigated.
    The classification models for BoE and BoPH both had high model accuracy. More than 60% of participants who used either one of electronic cigarettes or smokeless tobacco were classified as former smokers in the classification model for BoE. Fewer than 15% of current smokers and dual users were classified as former smokers. A similar trend was found in the classification model for BoPH. Compared with those classified as former smokers, a higher percentage of those classified as current smokers had cardiovascular disease (9.9-10.9% vs. 6.3-6.4%) and respiratory diseases (19.4-22.2% vs. 14.2-16.7%).
    Users of electronic cigarettes or smokeless tobacco are likely to be similar to former smokers in their biomarkers of exposure and potential harm. This suggests that using these products helps to reduce exposure to the harmful constituents of cigarettes, and they are potentially less harmful than conventional cigarettes.

  • Article type: Journal Article
    In psychology, linear discriminant analysis (LDA) is the method of choice for two-group classification tasks based on questionnaire data. In this study, we present a comparison of LDA with several supervised learning algorithms. In particular, we examine to what extent the predictive performance of LDA relies on the multivariate normality assumption. As nonparametric alternatives, the linear support vector machine (SVM), classification and regression tree (CART), random forest (RF), probabilistic neural network (PNN), and the ensemble k conditional nearest neighbor (EkCNN) algorithms are applied. Predictive performance is determined using measures of overall performance, discrimination, and calibration, and is compared in two reference data sets as well as in a simulation study. The reference data are Likert-type data, and comprise 5 and 10 predictor variables, respectively. Simulations are based on the reference data and are done for a balanced and an unbalanced scenario in each case. In order to compare the algorithms' performance, data are simulated from multivariate distributions with differing degrees of nonnormality. Results differ depending on the specific performance measure. The main finding is that LDA is always outperformed by RF in the bimodal data with respect to overall performance. Discriminative ability of the RF algorithm is often higher compared to LDA, but its model calibration is usually worse. Still, LDA mostly ranks second in cases where it is outperformed by another algorithm, or the differences are only marginal. In consequence, we still recommend LDA for this type of application.

  • Article type: Journal Article
    Quantifying emotional aspects of animal behavior (e.g., anxiety, social interactions, reward, and stress responses) is a major focus of neuroscience research. Because manual scoring of emotion-related behaviors is time-consuming and subjective, classical methods rely on easily quantified measures such as lever pressing or time spent in different zones of an apparatus (e.g., open vs. closed arms of an elevated plus maze). Recent advancements have made it easier to extract pose information from videos, and multiple approaches for extracting nuanced information about behavioral states from pose estimation data have been proposed. These include supervised, unsupervised, and self-supervised approaches, employing a variety of different model types. Representations of behavioral states derived from these methods can be correlated with recordings of neural activity to increase the scope of connections that can be drawn between the brain and behavior. In this mini review, we will discuss how deep learning techniques can be used in behavioral experiments and how different model architectures and training paradigms influence the type of representation that can be obtained.

  • Article type: Journal Article
    BACKGROUND: Despite the numerous studies evaluating various rhythm control strategies for atrial fibrillation (AF), determination of the optimal strategy in a single patient is often based on trial and error, with no one-size-fits-all approach based on international guidelines/recommendations. The decision, therefore, remains personal and lends itself well to help from a clinical decision support system, specifically one guided by artificial intelligence (AI). QRhythm utilizes a 2-stage machine learning (ML) model to identify the optimal rhythm management strategy in a given patient based on a set of clinical factors, in which the model first uses supervised learning to predict the actions of an expert clinician and identifies the best strategy through reinforcement learning to obtain the best clinical outcome, a composite of symptomatic recurrence, hospitalization, and stroke.
    OBJECTIVE: We qualitatively evaluated a novel, AI-based, clinical decision support system (CDSS) for AF rhythm management, called QRhythm, which uses both supervised and reinforcement learning to recommend either a rate control or one of 3 types of rhythm control strategies (external cardioversion, antiarrhythmic medication, or ablation) based on individual patient characteristics.
    METHODS: Thirty-three clinicians, including cardiology attendings and fellows and internal medicine attendings and residents, performed an assessment of QRhythm, followed by a survey to assess relative comfort with automated CDSS in rhythm management and to examine areas for future development.
    RESULTS: The 33 providers were surveyed with training levels ranging from resident to fellow to attending. Of the characteristics of the app surveyed, safety was most important to providers, with an average importance rating of 4.7 out of 5 (SD 0.72). This priority was followed by clinical integrity (a desire for the advice provided to make clinical sense; importance rating 4.5, SD 0.9), backward interpretability (transparency in the population used to create the algorithm; importance rating 4.3, SD 0.65), transparency of the algorithm (reasoning underlying the decisions made; importance rating 4.3, SD 0.88), and provider autonomy (the ability to challenge the decisions made by the model; importance rating 3.85, SD 0.83). Providers who used the app ranked the integrity of recommendations as their highest concern with ongoing clinical use of the model, followed by efficacy of the application and patient data security. Trust in the app varied; 1 (17%) provider responded that they somewhat disagreed with the statement, "I trust the recommendations provided by the QRhythm app," 2 (33%) providers responded with neutrality to the statement, and 3 (50%) somewhat agreed with the statement.
    CONCLUSIONS: Safety of ML applications was the highest priority of the providers surveyed, and trust of such models remains varied. Widespread clinical acceptance of ML in health care is dependent on how much providers trust the algorithms. Building this trust involves ensuring transparency and interpretability of the model.

  • Article type: Journal Article
    Predicting treatment outcome in major depressive disorder (MDD) remains an essential challenge for precision psychiatry. Clinical prediction models (CPMs) based on supervised machine learning have been a promising approach for this endeavor. However, only a few CPMs have focused on model sparsity even though sparser models might facilitate the translation into clinical practice and lower the expenses of their application.
    In this study, we developed a predictive modeling pipeline that combines hyperparameter tuning and recursive feature elimination in a nested cross-validation framework. We applied this pipeline to a real-world clinical data set on MDD treatment response and to a second simulated data set using three different classification algorithms. Performance was evaluated by permutation testing and comparison to a reference pipeline without nested feature selection.
    Across all models, the proposed pipeline led to sparser CPMs compared to the reference pipeline. Except for one comparison, the proposed pipeline resulted in equally or more accurate predictions. For MDD treatment response, balanced accuracy scores ranged between 61 and 71% when models were applied to hold-out validation data.
    The resulting models might be particularly interesting for clinical applications as they could reduce expenses for clinical institutions and stress for patients.
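The pipeline design described in this abstract, hyperparameter tuning and recursive feature elimination (RFE) nested inside cross-validation, can be sketched with scikit-learn as follows. This is a generic illustration under stated assumptions, not the authors' implementation: the synthetic data, the logistic regression classifier, the parameter grid, and the fold counts are all hypothetical choices.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a clinical data set (assumption, not study data).
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=5, random_state=0
)

# Scaling, feature elimination, and the classifier live in one pipeline, so
# every step is refit on the training portion of each fold only.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("select", RFE(LogisticRegression(max_iter=1000))),
    ("clf", LogisticRegression(max_iter=1000)),
])

# The number of retained features is tuned alongside the regularization
# strength, which is what allows the search to prefer sparser models.
param_grid = {
    "select__n_features_to_select": [3, 5, 10],
    "clf__C": [0.1, 1.0, 10.0],
}

inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

# Inner loop: choose hyperparameters. Outer loop: estimate performance on
# folds that played no part in feature selection or tuning.
search = GridSearchCV(pipeline, param_grid, cv=inner_cv, scoring="balanced_accuracy")
outer_scores = cross_val_score(search, X, y, cv=outer_cv, scoring="balanced_accuracy")

print("Balanced accuracy per outer fold:", outer_scores.round(3))
print("Mean balanced accuracy:", outer_scores.mean().round(3))
```

Keeping feature selection inside the inner loop matters because selecting features on the full data set before splitting lets information from the test folds leak into the model and inflates the accuracy estimate.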