fine-tuning

  • Article type: Journal Article
    BACKGROUND: Clinical note section identification helps locate relevant information and could benefit downstream tasks such as named entity recognition. However, traditional supervised methods suffer from transferability issues. This study proposes a new framework that uses large language models (LLMs) for section identification to overcome these limitations.
    METHODS: We framed section identification as question-answering and provided the section definitions in free text. We evaluated multiple off-the-shelf LLMs without any training. We also fine-tuned our LLMs to investigate how the size and specificity of the fine-tuning dataset affect model performance.
    RESULTS: GPT4 achieved the highest F1 score of 0.77. The best open-source model (Tulu2-70b) achieved 0.64, on par with GPT3.5 (ChatGPT). GPT4 also obtained F1 scores greater than 0.9 for 9 of the 27 (33%) section types and greater than 0.8 for 15 of the 27 (56%) section types. For our fine-tuned models, performance plateaued as the size of the general-domain dataset increased. We also found that adding a reasonable number of section identification examples is beneficial.
    DISCUSSION: These results indicate that GPT4 is nearly production-ready for section identification: it seemingly contains both knowledge of note structure and the ability to follow complex instructions, and the best current open-source LLM is catching up.
    CONCLUSIONS: Our study shows that LLMs are promising for generalizable clinical note section identification. They can potentially be further improved by adding section identification examples to the fine-tuning dataset.
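    The question-answering framing described above can be sketched in a few lines. The snippet below is illustrative only: the prompt wording and section definitions are invented placeholders, and `call_llm` stands in for whichever chat/completion backend (GPT4, GPT3.5, Tulu2, or a fine-tuned model) is being evaluated.

        # Illustrative sketch: section identification framed as question-answering,
        # with section definitions supplied as free text in the prompt.
        SECTION_DEFINITIONS = {  # invented examples, not the paper's definitions
            "HPI": "History of present illness: narrative of the current complaint.",
            "MEDICATIONS": "Drugs the patient currently takes, with doses.",
            "ASSESSMENT_PLAN": "The clinician's impression and next steps.",
        }

        def build_prompt(note_excerpt: str) -> str:
            defs = "\n".join(f"- {name}: {text}" for name, text in SECTION_DEFINITIONS.items())
            return (
                "Here are definitions of clinical note sections:\n"
                f"{defs}\n\n"
                f"Note excerpt:\n{note_excerpt}\n\n"
                "Question: which section does this excerpt belong to? "
                "Answer with exactly one section name from the list."
            )

        def identify_section(note_excerpt: str, call_llm) -> str:
            # `call_llm` is a placeholder: any function mapping a prompt to a completion.
            return call_llm(build_prompt(note_excerpt)).strip()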

  • Article type: Journal Article
    Molecular Property Prediction (MPP) is vital for drug discovery, crop protection, and environmental science. Over the last decades, diverse computational techniques have been developed, from using simple physical and chemical properties and molecular fingerprints in statistical models and classical machine learning to advanced deep learning approaches. In this review, we aim to distill insights from current research on employing transformer models for MPP. We analyze the currently available models and explore key questions that arise when training and fine-tuning a transformer model for MPP. These questions encompass the choice and scale of the pretraining data, optimal architecture selections, and promising pretraining objectives. Our analysis highlights areas not yet covered in current research, inviting further exploration to enhance the field's understanding. Additionally, we address the challenges in comparing different models, emphasizing the need for standardized data splitting and robust statistical analysis.
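    One concrete issue the review highlights, standardized data splitting, is commonly handled in MPP with scaffold splits so that train and test sets share no molecular scaffolds. The sketch below is a minimal, generic recipe using RDKit's Bemis-Murcko scaffolds, not a protocol taken from the review itself.

        # Minimal scaffold-split sketch: molecules with the same Bemis-Murcko
        # scaffold are kept on the same side of the train/test boundary.
        from collections import defaultdict
        from rdkit.Chem.Scaffolds import MurckoScaffold

        def scaffold_split(smiles_list, test_fraction=0.2):
            groups = defaultdict(list)
            for idx, smi in enumerate(smiles_list):
                groups[MurckoScaffold.MurckoScaffoldSmiles(smiles=smi)].append(idx)
            # Fill train with the largest scaffold groups first; the rarer
            # scaffolds that remain form a harder, scaffold-disjoint test set.
            n_train = int(len(smiles_list) * (1 - test_fraction))
            train, test = [], []
            for group in sorted(groups.values(), key=len, reverse=True):
                (train if len(train) < n_train else test).extend(group)
            return train, test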

  • Article type: Journal Article
    BACKGROUND: Decoding human genomic sequences requires comprehensive analysis of DNA sequence functionality. Through computational and experimental approaches, researchers have studied the genotype-phenotype relationship and generated important datasets that help unravel complicated genetic blueprints. Thus, recently developed artificial intelligence methods can be used to interpret the functions of those DNA sequences.
    METHODS: This study explores the use of deep learning, particularly pre-trained genomic models like DNA_bert_6 and human_gpt2-v1, in interpreting and representing human genome sequences. Initially, we meticulously constructed multiple datasets linking genotypes and phenotypes to fine-tune those models for precise DNA sequence classification. Additionally, we evaluated the influence of sequence length on classification results and analyzed the impact of feature extraction in the hidden layers of our model using the HERV dataset. To enhance our understanding of phenotype-specific patterns recognized by the model, we performed enrichment, pathogenicity, and conservation analyses of specific motifs with high average local representation weight (ALRW) scores in human endogenous retrovirus (HERV) sequences.
    RESULTS: We constructed multiple genotype-phenotype datasets on which the fine-tuned models displayed commendable classification performance compared with random genomic sequences; on the HERV dataset in particular, binary and multi-class classification accuracies and F1 values exceeded 0.935 and 0.888, respectively. Notably, fine-tuning on the HERV dataset not only improved our ability to identify and distinguish diverse information types within DNA sequences but also successfully identified specific motifs associated with neurological disorders and cancers in regions with high ALRW scores. Subsequent analysis of these motifs shed light on the adaptive responses of species to environmental pressures and their co-evolution with pathogens.
    CONCLUSIONS: These findings highlight the potential of pre-trained genomic models in learning DNA sequence representations, particularly when utilizing the HERV dataset, and provide valuable insights for future research endeavors. This study represents an innovative strategy that combines pre-trained genomic model representations with classical methods for analyzing the functionality of genome sequences, thereby promoting cross-fertilization between genomics and artificial intelligence.
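    As a rough sketch of the classification setup: DNA BERT-style models consume DNA as overlapping k-mer "words". The snippet below assumes the Hugging Face checkpoint id `zhihan1996/DNA_bert_6` (an assumption; substitute the actual checkpoint) and shows only the tokenization and forward pass, not the full fine-tuning loop.

        # Sketch: DNA sequence -> overlapping 6-mers -> tokenizer -> class logits.
        import torch
        from transformers import AutoModelForSequenceClassification, AutoTokenizer

        CHECKPOINT = "zhihan1996/DNA_bert_6"  # assumed checkpoint id

        def to_kmers(seq: str, k: int = 6) -> str:
            return " ".join(seq[i:i + k] for i in range(len(seq) - k + 1))

        tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
        model = AutoModelForSequenceClassification.from_pretrained(CHECKPOINT, num_labels=2)

        inputs = tokenizer(to_kmers("ACGTACGTACGTACGT"), return_tensors="pt", truncation=True)
        with torch.no_grad():
            logits = model(**inputs).logits  # fine-tune with a standard classification loss
        print(logits.argmax(dim=-1))  # predicted genotype-phenotype class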

  • Article type: Journal Article
    Fine-tuning is an important technique in transfer learning that has achieved significant success in tasks that lack training data. However, as it is difficult to extract effective features in single-source domain fine-tuning when the data distribution difference between the source and the target domain is large, we propose a transfer learning framework based on multiple source domains, called adaptive multi-source domain collaborative fine-tuning (AMCF), to address this issue. AMCF utilizes multiple source domain models for collaborative fine-tuning, thereby improving the feature extraction capability of the model in the target task. Specifically, AMCF employs an adaptive multi-source domain layer selection strategy to customize appropriate layer fine-tuning schemes for the target task among multiple source domain models, aiming to extract more effective features. Furthermore, a novel multi-source domain collaborative loss function is designed to facilitate the precise extraction of target data features by each source domain model. Simultaneously, it works towards minimizing the output difference among the various source domain models, thereby enhancing the adaptability of the source domain models to the target data. To validate the effectiveness of AMCF, it is applied to seven public visual classification datasets commonly used in transfer learning and compared with the most widely used single-source domain fine-tuning methods. Experimental results demonstrate that, in comparison with existing fine-tuning methods, our method not only enhances the accuracy of feature extraction in the model but also provides precise layer fine-tuning schemes for the target task, thereby significantly improving fine-tuning performance.
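    The abstract describes a loss with two roles: each source-domain model fits the target labels, while the models are simultaneously pushed to agree on their outputs. The paper's exact formulation is not reproduced here; the sketch below is one plausible reading, combining per-model cross-entropy with a pairwise output-consistency penalty.

        # Plausible sketch of a multi-source collaborative loss (not the paper's
        # exact form): per-model task loss plus pairwise output agreement.
        import itertools
        import torch
        import torch.nn.functional as F

        def collaborative_loss(logits_per_model, labels, consistency_weight=0.1):
            task = sum(F.cross_entropy(logits, labels) for logits in logits_per_model)
            consistency = sum(
                F.mse_loss(F.softmax(a, dim=-1), F.softmax(b, dim=-1))
                for a, b in itertools.combinations(logits_per_model, 2)
            )
            return task + consistency_weight * consistency

        # Usage: loss = collaborative_loss([m(x) for m in source_models], y)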

  • Article type: Journal Article
    Small-molecule drug design aims to generate compounds that target specific proteins, playing a crucial role in the early stages of drug discovery. Recently, research has emerged that utilizes the GPT model, which has achieved significant success in various fields, to generate molecular compounds. However, due to the persistent challenge of small datasets in the pharmaceutical field, there has been some degradation in the performance of generating target-specific compounds. To address this issue, we propose an enhanced target-specific drug generation model, Adapt-cMolGPT, which modifies the molecular representation and optimizes the fine-tuning process. In particular, we introduce a new fine-tuning method that incorporates an adapter module into a pre-trained base model and alternates weight updates section by section. We evaluated the proposed model through multiple experiments and demonstrated performance improvements compared to previous models. In the experimental results, Adapt-cMolGPT generated a greater number of novel and valid compounds compared to other models, with these generated compounds exhibiting properties similar to those of real molecular data. These results indicate that our proposed method is highly effective in designing drugs targeting specific proteins.
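    The adapter idea mentioned here, inserting a small trainable module into a frozen pre-trained block, is commonly realized as a residual bottleneck. The class below is a generic adapter of that style, assumed for illustration rather than taken from the Adapt-cMolGPT code.

        # Generic bottleneck adapter (assumed form): a small residual MLP added
        # inside a frozen pre-trained transformer block.
        import torch
        import torch.nn as nn

        class Adapter(nn.Module):
            def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
                super().__init__()
                self.down = nn.Linear(hidden_dim, bottleneck_dim)
                self.up = nn.Linear(bottleneck_dim, hidden_dim)
                self.act = nn.GELU()

            def forward(self, x: torch.Tensor) -> torch.Tensor:
                return x + self.up(self.act(self.down(x)))  # residual keeps base behavior

        # During fine-tuning only the adapter weights train; per the abstract,
        # updates can also alternate section by section:
        # for p in base_model.parameters(): p.requires_grad = False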

  • Article type: Journal Article
    Our best current science seems to suggest the laws of physics and the initial conditions of our universe are fine-tuned for the possibility of life. A significant number of scientists and philosophers believe that the fine-tuning is evidence for the multiverse hypothesis. This paper will focus on a much-discussed objection to the inference from the fine-tuning to the multiverse: the charge that this line of reasoning commits the inverse gambler's fallacy. Despite the existence of a literature going back decades, this philosophical debate has made little contact with scientific discussion of fine-tuning and the multiverse, which mainly revolves around a specific form of the multiverse hypothesis rooted in eternal inflation combined with string theory. Because of this, potentially important implications from science to philosophy, and vice versa, have been left underexplored. In this paper, I will take a first step at joining up these two discussions, by arguing that attention to the eternal inflation + string theory conception of the multiverse supports the inverse gambler's fallacy charge. It does this by supporting the idea that our universe is contingently fine-tuned, thus addressing the concern that proponents of the inverse gambler's fallacy charge have assumed this without argument.

  • Article type: Journal Article
    BACKGROUND: Large language models (LLMs) have the potential to support promising new applications in health informatics. However, practical data on sample size considerations for fine-tuning LLMs to perform specific tasks in biomedical and health policy contexts are lacking.
    OBJECTIVE: This study aims to evaluate sample size and sample selection techniques for fine-tuning LLMs to support improved named entity recognition (NER) for a custom data set of conflicts of interest disclosure statements.
    METHODS: A random sample of 200 disclosure statements was prepared for annotation. All "PERSON" and "ORG" entities were identified by each of the 2 raters, and once appropriate agreement was established, the annotators independently annotated an additional 290 disclosure statements. From the 490 annotated documents, 2500 stratified random samples in different size ranges were drawn. The 2500 training set subsamples were used to fine-tune a selection of language models across 2 model architectures (Bidirectional Encoder Representations from Transformers [BERT] and Generative Pre-trained Transformer [GPT]) for improved NER, and multiple regression was used to assess the relationship between sample size (sentences), entity density (entities per sentence [EPS]), and trained model performance (F1-score). Additionally, single-predictor threshold regression models were used to evaluate the possibility of diminishing marginal returns from increased sample size or entity density.
    RESULTS: Fine-tuned models ranged in topline NER performance from F1-score=0.79 to F1-score=0.96 across architectures. Two-predictor multiple linear regression models were statistically significant with multiple R2 ranging from 0.6057 to 0.7896 (all P<.001). EPS and the number of sentences were significant predictors of F1-scores in all cases (P<.001), except for the GPT-2_large model, where EPS was not a significant predictor (P=.184). Model thresholds indicate points of diminishing marginal return from increased training data set sample size measured by the number of sentences, with point estimates ranging from 439 sentences for RoBERTa_large to 527 sentences for GPT-2_large. Likewise, the threshold regression models indicate a diminishing marginal return for EPS with point estimates between 1.36 and 1.38.
    CONCLUSIONS: Relatively modest sample sizes can be used to fine-tune LLMs for NER tasks applied to biomedical text, and training data entity density should representatively approximate entity density in production data. Training data quality and a model architecture's intended use (text generation vs text processing or classification) may be as important as, or more important than, training data volume and model parameter size.
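    The single-predictor threshold regression used to locate diminishing returns can be pictured as a "hockey-stick" fit: performance rises with sample size up to a breakpoint and flattens beyond it. The sketch below is a generic grid-search version of such a model, not the authors' exact estimator.

        # Fit y = b0 + b1 * min(x, tau) by grid-searching the breakpoint tau and
        # solving ordinary least squares at each candidate value.
        import numpy as np

        def fit_threshold(x: np.ndarray, y: np.ndarray) -> float:
            best_tau, best_sse = None, np.inf
            for tau in np.unique(x):
                design = np.column_stack([np.ones_like(x), np.minimum(x, tau)])
                coef, *_ = np.linalg.lstsq(design, y, rcond=None)
                sse = float(np.sum((y - design @ coef) ** 2))
                if sse < best_sse:
                    best_tau, best_sse = tau, sse
            return best_tau  # breakpoint where marginal returns diminish

        # e.g. x = training-set size in sentences, y = F1-score of the tuned model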

  • Article type: Journal Article
    Peptide- and protein-based therapeutics are becoming a promising treatment regimen for myriad diseases. Toxicity of proteins is the primary hurdle for protein-based therapies. Thus, there is an urgent need for accurate in silico methods for determining toxic proteins to filter the pool of potential candidates. At the same time, it is imperative to precisely identify non-toxic proteins to expand the possibilities for protein-based biologics. To address this challenge, we proposed an ensemble framework, called VISH-Pred, comprising models built by fine-tuning ESM2 transformer models on a large, experimentally validated, curated dataset of protein and peptide toxicities. The primary steps in the VISH-Pred framework are to efficiently estimate protein toxicities taking just the protein sequence as input, employing an undersampling technique to handle the severe class imbalance in the data, and learning representations from fine-tuned ESM2 protein language models which are then fed to machine learning techniques such as LightGBM and XGBoost. The VISH-Pred framework is able to correctly identify both peptides/proteins with potential toxicity and non-toxic proteins, achieving Matthews correlation coefficients of 0.737, 0.716 and 0.322 and F1-scores of 0.759, 0.696 and 0.713 on three non-redundant blind tests, respectively, outperforming other methods by over 10% on these quality metrics. Moreover, VISH-Pred achieved the best accuracy and area under the receiver operating characteristic curve scores on these independent test sets, highlighting the robustness and generalization capability of the framework. By making VISH-Pred available as an easy-to-use web server, we expect it to serve as a valuable asset for future endeavors aimed at discerning the toxicity of peptides and enabling efficient protein-based therapeutics.
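    The pipeline this abstract outlines, embed sequences with a fine-tuned ESM2, undersample the majority class, then train a gradient-boosted classifier, can be sketched as below. The checkpoint id, mean pooling, and toy data are assumptions for illustration; the real framework uses fine-tuned ESM2 weights and curated toxicity data.

        # Sketch of the embed -> undersample -> boosted-tree pipeline.
        import torch
        from transformers import AutoModel, AutoTokenizer
        from imblearn.under_sampling import RandomUnderSampler
        from lightgbm import LGBMClassifier

        CHECKPOINT = "facebook/esm2_t6_8M_UR50D"  # small public ESM2 checkpoint
        tok = AutoTokenizer.from_pretrained(CHECKPOINT)
        esm = AutoModel.from_pretrained(CHECKPOINT)

        def embed(seqs):
            batch = tok(seqs, return_tensors="pt", padding=True, truncation=True)
            with torch.no_grad():
                hidden = esm(**batch).last_hidden_state   # (batch, length, dim)
            return hidden.mean(dim=1).numpy()             # mean-pool over residues

        seqs = ["MKTAYIAKQR", "MALWMRLLPL", "MKVLAAGIVT", "MGDVEKGKKI"]  # toy data
        labels = [1, 0, 1, 0]                             # 1 = toxic, 0 = non-toxic
        X_bal, y_bal = RandomUnderSampler(random_state=0).fit_resample(embed(seqs), labels)
        clf = LGBMClassifier(n_estimators=10).fit(X_bal, y_bal)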

  • Article type: Journal Article
    OBJECTIVE: We aim to develop a novel method for rare disease concept normalization by fine-tuning Llama 2, an open-source large language model (LLM), using a domain-specific corpus sourced from the Human Phenotype Ontology (HPO).
    METHODS: We developed an in-house template-based script to generate two corpora for fine-tuning. The first (NAME) contains standardized HPO names, sourced from the HPO vocabularies, along with their corresponding identifiers. The second (NAME+SYN) includes HPO names and half of each concept's synonyms, as well as identifiers. Subsequently, we fine-tuned Llama 2 (Llama2-7B) on each sentence set and conducted an evaluation using a range of sentence prompts and various phenotype terms.
    RESULTS: When the phenotype terms for normalization were included in the fine-tuning corpora, both models demonstrated nearly perfect performance, averaging over 99% accuracy. In comparison, ChatGPT-3.5 achieved only ∼20% accuracy in identifying HPO IDs for phenotype terms. When single-character typos were introduced into the phenotype terms, the accuracy of NAME and NAME+SYN was 10.2% and 36.1%, respectively, but increased to 61.8% (NAME+SYN) with additional typo-specific fine-tuning. For terms sourced from the HPO vocabularies as unseen synonyms, the NAME model achieved 11.2% accuracy, while the NAME+SYN model achieved 92.7% accuracy.
    CONCLUSIONS: Our fine-tuned models demonstrate the ability to normalize phenotype terms unseen in the fine-tuning corpus, including misspellings, synonyms, terms from other ontologies, and laymen's terms. Our approach provides a solution for the use of LLMs to identify named medical entities from clinical narratives, while successfully normalizing them to standard concepts in a controlled vocabulary.
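    The template-based corpus construction can be illustrated with a small generator. The templates and HPO entries below are invented placeholders (the abstract does not publish the actual templates); the two calls mirror the NAME and NAME+SYN recipes.

        # Sketch of template-based corpus generation for concept normalization.
        HPO_ENTRIES = [  # (standard name, identifier, synonyms) -- placeholder rows
            ("Seizure", "HP:0001250", ["Epileptic seizure", "Fits"]),
            ("Scoliosis", "HP:0002650", ["Curved spine"]),
        ]

        TEMPLATES = [  # invented sentence templates
            "The patient presents with {term}.",
            "Clinical examination revealed {term}.",
        ]

        def make_corpus(include_synonyms: bool):
            examples = []
            for name, hpo_id, synonyms in HPO_ENTRIES:
                terms = [name]
                if include_synonyms:  # take roughly half the synonyms, as in NAME+SYN
                    terms += synonyms[: max(1, len(synonyms) // 2)]
                for term in terms:
                    for template in TEMPLATES:
                        examples.append(
                            {"prompt": template.format(term=term),
                             "completion": f"{name} ({hpo_id})"}
                        )
            return examples

        name_corpus = make_corpus(include_synonyms=False)     # NAME
        name_syn_corpus = make_corpus(include_synonyms=True)  # NAME+SYN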

  • Article type: Journal Article
    The newly released Segment Anything Model (SAM) is a popular tool used in image processing due to its superior segmentation accuracy, variety of input prompts, training capabilities, and efficient model design. However, its current model is trained on a diverse dataset not tailored to medical images, particularly ultrasound images. Ultrasound images tend to have a lot of noise, making it difficult to segment out important structures. In this project, we developed ClickSAM, which fine-tunes the Segment Anything Model using click prompts for ultrasound images. ClickSAM has two stages of training: the first stage is trained on single-click prompts centered in the ground-truth contours, and the second stage focuses on improving the model performance through additional positive and negative click prompts. By comparing the first stage's predictions to the ground-truth masks, true positive, false positive, and false negative segments are calculated. Positive clicks are generated using the true positive and false negative segments, and negative clicks are generated using the false positive segments. The Centroidal Voronoi Tessellation algorithm is then employed to collect positive and negative click prompts in each segment that are used to enhance the model performance during the second stage of training. With this click-based training method, ClickSAM exhibits superior performance compared to other existing models for ultrasound image segmentation.
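    The click-generation step follows directly from the mask algebra the abstract describes: compare the first-stage prediction with the ground truth, then draw positive clicks from true-positive and false-negative segments and negative clicks from false-positive segments. In the sketch below, simple random sampling stands in for the Centroidal Voronoi Tessellation step used in the paper.

        # ClickSAM-style click generation from boolean prediction/ground-truth masks.
        import numpy as np

        def generate_clicks(pred: np.ndarray, gt: np.ndarray, rng=None):
            rng = rng or np.random.default_rng(0)
            true_pos = gt & pred      # correctly segmented -> positive clicks
            false_neg = gt & ~pred    # missed regions      -> positive clicks
            false_pos = pred & ~gt    # over-segmentation   -> negative clicks

            def sample(mask):
                ys, xs = np.nonzero(mask)
                if len(ys) == 0:
                    return None
                i = rng.integers(len(ys))
                return int(ys[i]), int(xs[i])

            positives = [p for p in (sample(true_pos), sample(false_neg)) if p]
            negatives = [p for p in (sample(false_pos),) if p]
            return positives, negatives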