clinical notes

  • Article type: Journal Article
    OBJECTIVE: Active learning (AL) has rarely integrated diversity-based and uncertainty-based strategies into a dynamic sampling framework for clinical named entity recognition (NER). Machine-assisted annotation is becoming popular for creating gold-standard labels. This study investigated the effectiveness of dynamic AL strategies under simulated machine-assisted annotation scenarios for clinical NER.
    METHODS: We proposed 3 new AL strategies: a diversity-based strategy (CLUSTER) based on Sentence-BERT and 2 dynamic strategies (CLC and CNBSE) capable of switching from diversity-based to uncertainty-based strategies. Using BioClinicalBERT as the foundational NER model, we conducted simulation experiments on 3 medication-related clinical NER datasets independently: i2b2 2009, n2c2 2018 (Track 2), and MADE 1.0. We compared the proposed strategies with uncertainty-based (LC and NBSE) and passive-learning (RANDOM) strategies. Performance was primarily measured by the number of edits made by the annotators to achieve a desired target effectiveness evaluated on independent test sets.
    RESULTS: When aiming for 98% overall target effectiveness, on average, CLUSTER required the fewest edits. When aiming for 99% overall target effectiveness, CNBSE required 20.4% fewer edits than NBSE did. CLUSTER and RANDOM could not achieve such a high target under the pool-based simulation experiment. For high-difficulty entities, CNBSE required 22.5% fewer edits than NBSE to achieve 99% target effectiveness, whereas neither CLUSTER nor RANDOM achieved 93% target effectiveness.
    CONCLUSIONS: When the target effectiveness was set high, the proposed dynamic strategy CNBSE exhibited both strong learning capabilities and low annotation costs in machine-assisted annotation. CLUSTER required the fewest edits when the target effectiveness was set low.
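
The uncertainty-based baseline can be illustrated with a minimal sketch. It assumes that LC denotes least-confidence sampling (the abstract does not expand the acronym) and that per-token label probabilities are independent, so a sentence's confidence is the product of each token's highest probability. The function names and the toy pool are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def least_confidence(token_probs: Sequence[Sequence[float]]) -> float:
    """Return 1 minus the probability of the most likely label sequence.

    Assumes independent per-token predictions, so the sequence probability
    is the product of each token's highest label probability.
    """
    sequence_prob = 1.0
    for probs in token_probs:
        sequence_prob *= max(probs)
    return 1.0 - sequence_prob


def select_most_uncertain(pool: List[Sequence[Sequence[float]]], k: int) -> List[int]:
    """Return the indices of the k sentences with the highest uncertainty."""
    scores = [least_confidence(sentence) for sentence in pool]
    ranked = sorted(range(len(pool)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Hypothetical per-token label distributions for three unlabeled sentences.
pool = [
    [[0.9, 0.1], [0.8, 0.2]],  # fairly confident
    [[0.5, 0.5], [0.6, 0.4]],  # uncertain
    [[0.99, 0.01]],            # very confident
]
picked = select_most_uncertain(pool, k=1)
```

In a pool-based loop, the selected sentences would be sent for annotation, added to the training set, and the model retrained before the next round. A dynamic strategy of the kind described above would start with a diversity-based selector and switch to a score like this one later.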

  • Article type: Journal Article
    Heart failure has become a huge public health problem, and failure to accurately predict readmission further contributes to the disease's high cost and high mortality. Constructing a readmission prediction model can assist doctors in making decisions to prevent patients from deteriorating and to reduce the cost burden. This paper extracts patient discharge records from the MIMIC-III database and divides the patients into three categories for readmission prediction: no readmission, readmission within 30 days, and readmission after 30 days. We propose the HR-BGCN model to predict the readmission of patients. First, we use Adaptive-TMix to improve the prediction metrics of the minority categories and reduce the impact of class imbalance. Then, a knowledge-informed graph attention mechanism is proposed. By introducing a document-level explicit graph structure, the encoding ability of graph node features is significantly improved. The paragraph-level representation obtained through graph learning is combined with the contextual token-level representation of BERT, and finally, the multi-class classification task is carried out. We also compare against several typical graph learning classification models, such as the IA-GCN model and the GAT model, to verify the model's effectiveness. The results show that the average F1 score of the proposed HR-BGCN model for 30-day readmission of heart failure patients is 88.26%, and the average accuracy is 90.47%. The HR-BGCN model is significantly better than the graph learning classification models at predicting heart failure readmission. It can help doctors predict the 30-day readmission of patients and thereby reduce the readmission rate.

  • Article type: Journal Article
    BACKGROUND: With the outbreak and spread of COVID-19 worldwide, limited ventilators fail to meet the surging demand for mechanical ventilation in the ICU. Clinical models based on structured data that have been proposed to rationalize ventilator allocation often suffer from poor extensibility due to fixed fields and laborious normalization processes. The advent of pre-trained models and downstream fine-tuning methods allows large amounts of unstructured clinical text to be learned for different tasks. However, the hardware requirements of large-scale pre-trained models and the lack of task-specific design in downstream networks have limited their adoption in the clinical domain.
    OBJECTIVE: In this study, an innovative architecture of a task-driven predictive model is proposed and a Task-driven Gated Recurrent Attention Pool model (TGRA-P) is developed based on the architecture. TGRA-P predicts early mortality risk from the clinical notes of ICU patients on mechanical ventilation, to assist clinicians in diagnosis and decision-making.
    METHODS: Specifically, a Task-Specific Embedding Module is proposed to fine-tune the embedding with task labels and save it as static files for downstream calls. It serves the task better and prevents GPU overload. The Gated Recurrent Attention Unit (GRA) is proposed to further enhance the dependency of the information preceding and following the text sequence with fewer parameters. In addition, we propose a Residual Max Pool (RMP) to avoid ignoring words in common text classification tasks by incorporating all word-level features of the notes for prediction. Finally, we use a fully connected decoding network as a classifier to predict the mortality risk.
    RESULTS: The proposed model shows very promising results, with an AUROC of 0.8245±0.0096, an AUPRC of 0.7532±0.0115, an accuracy of 0.7422±0.0028, and an F1-score of 0.6612±0.0059 for 90-day mortality prediction using clinical notes of ICU mechanically ventilated patients in the MIMIC-III dataset, all of which are better than those of previous studies. Moreover, the superiority of the proposed model over other baseline models is also statistically validated through the calculated Cohen's d effect sizes.
    CONCLUSIONS: The experimental results show that TGRA-P, based on the innovative task-driven prognostic architecture, obtains state-of-the-art performance. In future work, we will build upon the provided code and investigate its applicability to different datasets. The model balances performance and efficiency, not only reducing the cost of early mortality risk prediction but also assisting physicians in making timely clinical interventions and decisions. By incorporating textual records that are challenging for clinicians to utilize, the model serves as a valuable complement to physicians' judgment, enhancing their decision-making process.

  • Article type: Journal Article
    The International Classification of Diseases (ICD) serves as the foundation for generating comparable global disease statistics across regions and over time. The process of ICD coding involves assigning codes to diseases based on clinical notes, so that a patient's condition can be described in a standard way. However, this process is complicated by the vast number of codes and the intricate taxonomy of ICD codes, which are hierarchically organized into various levels, including chapter, category, subcategory, and its subdivisions. Many existing studies focus solely on predicting subcategory codes, ignoring the hierarchical relationships among codes. To address this limitation, we propose a multitask learning model that trains multiple classifiers for different code levels, while also capturing the relations between coarser and finer-grained labels through a reinforcement mechanism. Our approach is evaluated on both English and Chinese benchmark datasets, and we demonstrate that our method achieves performance competitive with baseline models, particularly in terms of macro-F1 results. These findings suggest that our approach effectively leverages the hierarchical structure of ICD codes to improve disease code prediction accuracy. Analysis of the attention mechanism shows that the multigranularity attention of our model captures crucial features of the input text at different granularity levels, which can provide reasonable explanations for the prediction results.

  • Article type: Journal Article
    BACKGROUND: Severe drug hypersensitivity reactions (DHRs) refer to allergic reactions caused by drugs and usually present with severe skin rashes and internal damage as the main symptoms. Reporting of severe DHRs in hospitals now solely occurs through spontaneous reporting systems (SRSs), which clinicians in charge operate. An automatic identification system scrutinizes clinical notes and reports potential severe DHR cases.
    OBJECTIVE: The goal of the research was to develop an automatic identification system for mining severe DHR cases and to discover more DHR cases for further study. The proposed method was applied to 9 years of data in the pediatric electronic health records (EHRs) of Beijing Children's Hospital.
    METHODS: The phenotyping task was approached as a document classification problem. A DHR dataset containing tagged documents for training was prepared. Each document contains all the clinical notes generated during 1 inpatient visit in this data set. Document-level tags correspond to DHR types and a negative category. Strategies were evaluated for long document classification on the openly available National NLP Clinical Challenges 2016 smoking task. Four strategies were evaluated in this work: document truncation, hierarchy representation, efficient self-attention, and key sentence selection. In-domain and open-domain pretrained embeddings were evaluated on the DHR dataset. An automatic grid search was performed to tune statistical classifiers for the best performance over the transformed data. Inference efficiency and memory requirements of the best performing models were analyzed. The most efficient model for mining DHR cases from millions of documents in the EHR system was run.
    RESULTS: For long document classification, key sentence selection with guideline keywords achieved the best performance and was 9 times faster at inference than hierarchy representation models. The best model discovered 1155 DHR cases in the Beijing Children's Hospital EHR system. After double-checking by clinician experts, 357 cases of severe DHRs were finally identified. For the smoking challenge, our model reached performance comparable to the state-of-the-art record (94.1% vs 94.2%).
    CONCLUSIONS: The proposed method discovered 357 positive DHR cases from a large archive of EHR records, about 90% of which were missed by SRSs. SRSs reported only 36 cases during the same period. The case analysis also found more suspected drugs associated with severe DHRs in pediatrics.
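
Key sentence selection, the best-performing strategy above, can be sketched as a keyword filter that shortens a long document before classification. This is only an illustration of the general idea: the abstract does not describe the authors' selection rules, and the sentence splitter, the sample note, and the keyword list below are all hypothetical.

```python
import re
from typing import Iterable, List


def select_key_sentences(document: str, keywords: Iterable[str]) -> List[str]:
    """Keep only sentences that mention at least one keyword (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    return [s for s in sentences if any(k in s.lower() for k in lowered)]


# Hypothetical note and keyword list, for illustration only.
note = (
    "Patient admitted with fever. Diffuse rash appeared after amoxicillin. "
    "Appetite is normal. Mucosal erosion noted on day 3."
)
keywords = ["rash", "mucosal"]
key_sentences = select_key_sentences(note, keywords)
```

The shortened text would then be passed to a classifier in place of the full document, which is one plausible reason such an approach is faster at inference than models that encode every sentence.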

  • Article type: Journal Article
    The International Classification of Diseases (ICD), which is endorsed by the World Health Organization, is a diagnostic classification standard. ICD codes are used to store, retrieve, and analyze health information to make clinical decisions. Currently, ICD coding has been adopted by more than 137 countries. However, in Pakistan, very few hospitals have implemented ICD coding and conducted different epidemiological studies. Moreover, none of them have reported the spectrum of liver disease burden based on ICD coding, nor implemented automated ICD coding. In this study, we annotated ICD codes for the database of the liver transplant unit of the Pir Abdul Qadir Shah Jeelani Institute of Medical Sciences. We named this database Medical Information Mart for Liver Transplantation (MIMLT). The results revealed that the database contains 34 ICD codes, of which V70.8 is the most frequent code. Furthermore, we determined the spectrum of liver disease burden in liver recipients based on ICD coding. We found that chronic hepatitis C (070.54) is the most frequent indication for liver transplantation. Additionally, we implemented automated ICD coding utilizing the MIMLT database and proposed a novel Deep Recurrent Convolutional Neural Network with Transfer Learning through pre-trained Embeddings (DRCNNTLe) model, which is an extended version of our DRCNN-HP model. DRCNNTLe extracts robust text representations from its pre-trained embedding layer, which is trained on a large domain-specific MIMIC III database corpus. The results indicate that utilizing pre-trained word embeddings, which are trained on large domain-specific corpora, can significantly improve the performance of the DRCNNTLe model and provide state-of-the-art results when the target database is small.

  • Article type: Journal Article
    BACKGROUND: Rheumatoid arthritis (RA) is a disease of the immune system with a high rate of disability, and the clinical notes of electronic medical records contain a large amount of valuable information on disease diagnosis and treatment. Artificial intelligence methods can be used to mine useful information in clinical notes effectively. This study aimed to develop an effective method to identify and classify medical entities in the clinical notes relating to RA and to use the entity identification results in subsequent studies.
    METHODS: In this paper, we introduced the bidirectional encoder representations from transformers (BERT) pre-training model to enhance the semantic representation of word vectors. The generated word vectors were then fed into a model composed of a traditional bidirectional long short-term memory neural network and a conditional random field, which performs named entity recognition on clinical notes, to improve the model's effectiveness. The BERT method takes the combination of token embeddings, segment embeddings, and position embeddings as the model input and fine-tunes the model during training.
    RESULTS: Compared with the traditional Word2vec word vector model, the performance of the BERT pre-training model to obtain a word vector as model input was significantly improved. The best F1-score of the named entity recognition task after training using many rheumatoid arthritis clinical notes was 0.936.
    CONCLUSIONS: This paper confirms the effectiveness of using an advanced artificial intelligence method to carry out named entity recognition tasks on a corpus of a large number of clinical notes; this application is promising in the medical setting. Moreover, the extraction of results in this study provides a lot of basic data for subsequent tasks, including relation extraction, medical knowledge graph construction, and disease reasoning.
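
The "combination" of embeddings mentioned in the methods is, in standard BERT, an element-wise sum of the token, segment, and position embeddings at each position. A minimal sketch with toy values follows. The function name and the numbers are made up for illustration, and the layer normalization and dropout that BERT applies after the sum are omitted.

```python
from typing import List

Vector = List[float]


def bert_input_embeddings(
    token_emb: List[Vector], segment_emb: List[Vector], position_emb: List[Vector]
) -> List[Vector]:
    """Sum the three embeddings element-wise for every position in the sequence."""
    return [
        [t + s + p for t, s, p in zip(tok, seg, pos)]
        for tok, seg, pos in zip(token_emb, segment_emb, position_emb)
    ]


# Toy 2-dimensional embeddings for a 2-token sequence (values are made up).
token_emb = [[1.0, 2.0], [3.0, 4.0]]
segment_emb = [[0.5, 0.5], [0.5, 0.5]]
position_emb = [[0.0, 1.0], [1.0, 0.0]]
combined = bert_input_embeddings(token_emb, segment_emb, position_emb)
```

In the pipeline described above, the contextual vectors that BERT produces from this input would then be passed to the BiLSTM and CRF layers for tagging.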

  • Article type: Journal Article
    Clinical notes record the health status, clinical manifestations, and other detailed information of each patient. The International Classification of Diseases (ICD) codes are important labels for electronic health records. Automatically assigning medical codes to clinical notes with a deep learning model can not only improve work efficiency and accelerate the development of medical informatization but also help resolve many issues related to medical insurance. Recently, neural network-based methods have been proposed for automatic medical code assignment. However, in the medical field, clinical notes are usually long documents that contain many complex sentences, so most current methods cannot effectively learn representations of latent features from the document text.
    In this paper, we propose a hybrid capsule network model. Specifically, we use a bi-directional LSTM (Bi-LSTM) with forward and backward directions to merge the information from both sides of the sequence. The label embedding framework embeds the text and labels together to leverage the label information. We then use a dynamic routing algorithm in the capsule network to extract valuable features for the medical code prediction task.
    We applied our model to the task of automatic medical code assignment to clinical notes and conducted a series of experiments based on MIMIC-III data. The experimental results show that our method achieves a micro F1-score of 67.5% on the MIMIC-III dataset, which outperforms the other state-of-the-art methods.
    The proposed model, which employs the dynamic routing algorithm and the label embedding framework, can effectively capture the important features across sentences. Both capsule networks and domain knowledge are helpful for the medical code prediction task.

  • Article type: Journal Article
    Respiratory diseases, including asthma, bronchitis, pneumonia, and upper respiratory tract infection (RTI), are among the most common diseases in clinics. The similarities among the symptoms of these diseases preclude prompt diagnosis upon the patients' arrival. In pediatrics, the patients' limited ability to express their situation makes precise diagnosis even harder. This becomes worse in primary hospitals, where the lack of medical imaging devices and the doctors' limited experience further increase the difficulty of distinguishing among similar diseases. In this paper, a pediatric fine-grained diagnosis-assistant system is proposed to provide prompt and precise diagnosis using solely clinical notes upon admission, which would assist clinicians without changing the diagnostic process. The proposed system consists of two stages: a test result structuralization stage and a disease identification stage. The first stage structuralizes test results by extracting relevant numerical values from clinical notes, and the disease identification stage provides a diagnosis based on text-form clinical notes and the structured data obtained from the first stage. A novel deep learning algorithm was developed for the disease identification stage, where techniques including adaptive feature infusion and multi-modal attentive fusion were introduced to fuse structured and text data together. Clinical notes from over 12000 patients with respiratory diseases were used to train a deep learning model, and clinical notes from a non-overlapping set of about 1800 patients were used to evaluate the performance of the trained model. The average precisions (AP) for pneumonia, RTI, bronchitis, and asthma are 0.878, 0.857, 0.714, and 0.825, respectively, achieving a mean AP (mAP) of 0.819. These results demonstrate that our proposed fine-grained diagnosis-assistant system provides precise identification of the diseases.
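
The reported mAP can be checked against the four per-disease values, assuming it is the unweighted mean of the per-class APs (the abstract does not state the averaging scheme):

```latex
\[
\mathrm{mAP}
  = \frac{0.878 + 0.857 + 0.714 + 0.825}{4}
  = \frac{3.274}{4}
  = 0.8185 \approx 0.819
\]
```

The unweighted mean rounds to the stated 0.819, which is consistent with that assumption.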

  • Article type: Journal Article
    Diabetes mellitus is a prevalent metabolic disease characterized by chronic hyperglycemia. The avalanche of healthcare data is accelerating precision and personalized medicine. Artificial intelligence and algorithm-based approaches are becoming more and more vital to support clinical decision-making. These methods are able to augment health care providers by taking away some of their routine work and enabling them to focus on critical issues. However, few studies have used predictive modeling to uncover associations between comorbidities in ICU patients and diabetes. This study aimed to use Unified Medical Language System (UMLS) resources, involving machine learning and natural language processing (NLP) approaches to predict the risk of mortality.
    We conducted a secondary analysis of Medical Information Mart for Intensive Care III (MIMIC-III) data. Different machine learning modeling and NLP approaches were applied. Domain knowledge in health care is built on the dictionaries created by experts who defined the clinical terminologies such as medications or clinical symptoms. This knowledge is valuable to identify information from text notes that assert a certain disease. Knowledge-guided models can automatically extract knowledge from clinical notes or biomedical literature that contains conceptual entities and relationships among these various concepts. Mortality classification was based on the combination of knowledge-guided features and rules. UMLS entity embedding and convolutional neural network (CNN) with word embeddings were applied. Concept Unique Identifiers (CUIs) with entity embeddings were utilized to build clinical text representations.
    The best configuration of the employed machine learning models yielded a competitive AUC of 0.97. Machine learning models along with NLP of clinical notes are promising to assist health care providers to predict the risk of mortality of critically ill patients.
    UMLS resources and clinical notes are powerful and important tools to predict mortality in diabetic patients in the critical care setting. The knowledge-guided CNN model is effective (AUC = 0.97) for learning hidden features.