MeSH: Humans; Natural Language Processing; Neoplasm Metastasis; Tomography, X-Ray Computed / methods; Neoplasms / pathology, diagnostic imaging; Female; Algorithms; Data Mining / methods; Electronic Health Records; Male

Source: DOI:10.1200/CCI.23.00122

Abstract:
Purpose: To evaluate natural language processing (NLP) methods for inferring metastatic sites from radiology reports.
Methods: A set of 4,522 computed tomography (CT) reports from 550 patients with 14 types of cancer was used to fine-tune four clinical large language models (LLMs) for multilabel classification of metastatic sites. For comparison, we also developed an NLP information extraction (IE) system based on named entity recognition, assertion status detection, and relation extraction. Model performance was measured by F1 score on a test set and three external validation sets. The best model was then used to analyze metastatic frequencies in a cohort study of 6,555 patients with 53,838 CT reports.
Results: The RadBERT, BioBERT, GatorTron-base, and GatorTron-medium LLMs achieved F1 scores of 0.84, 0.87, 0.89, and 0.91, respectively, on the test set. The IE system performed best, achieving an F1 score of 0.93. F1 scores of the IE system by individual cancer type ranged from 0.89 to 0.96. The IE system attained F1 scores of 0.89, 0.83, and 0.81 on three external validation sets covering additional cancer types, positron emission tomography-CT scans, and magnetic resonance imaging scans, respectively. In our cohort study of colorectal cancer, liver-only metastases were more frequent in de novo stage IV patients than in recurrent patients (29.7% v 12.2%; P < .001). Conversely, lung-only metastases were more frequent in recurrent patients than in de novo stage IV patients (17.2% v 7.3%; P < .001).
Conclusion: We developed an IE system that accurately infers metastatic sites in multiple primary cancers from radiology reports. Its methods are explainable, and it performs better than some clinical LLMs. The inferred metastatic phenotypes could enhance cancer research databases and clinical trial matching, and could help identify potential patients for oligometastatic interventions.
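The abstract reports F1 scores for multilabel classification of metastatic sites but does not state how the score was averaged across sites. As a minimal sketch under that caveat, the Python below computes both micro-averaged and macro-averaged F1 for a toy multilabel problem. The site names in `SITES` and the four example reports are hypothetical placeholders, not data or labels from the study.

```python
from typing import Dict, List, Set

# Hypothetical label set; the study's actual site vocabulary is not given in the abstract.
SITES = ["liver", "lung", "bone", "brain"]


def per_label_counts(gold: List[Set[str]], pred: List[Set[str]]) -> Dict[str, Dict[str, int]]:
    """Count true positives, false positives, and false negatives for each site."""
    counts = {s: {"tp": 0, "fp": 0, "fn": 0} for s in SITES}
    for g, p in zip(gold, pred):
        for s in SITES:
            if s in g and s in p:
                counts[s]["tp"] += 1
            elif s in p:
                counts[s]["fp"] += 1
            elif s in g:
                counts[s]["fn"] += 1
    return counts


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_f1(counts: Dict[str, Dict[str, int]]) -> float:
    """Pool the counts over all sites, then compute a single F1."""
    tp = sum(c["tp"] for c in counts.values())
    fp = sum(c["fp"] for c in counts.values())
    fn = sum(c["fn"] for c in counts.values())
    return f1(tp, fp, fn)


def macro_f1(counts: Dict[str, Dict[str, int]]) -> float:
    """Compute F1 per site, then take the unweighted mean."""
    return sum(f1(**c) for c in counts.values()) / len(counts)


# Toy example: one set of metastatic sites per report (invented for illustration).
gold = [{"liver"}, {"lung", "bone"}, set(), {"liver", "lung"}]
pred = [{"liver"}, {"lung"}, {"bone"}, {"liver", "brain"}]

counts = per_label_counts(gold, pred)
print(f"micro F1 = {micro_f1(counts):.3f}, macro F1 = {macro_f1(counts):.3f}")
```

Micro-averaging weights every site mention equally, so frequent sites dominate the score. Macro-averaging weights every site equally, so rare sites count as much as common ones. The two can differ substantially when site frequencies are imbalanced.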