Unsupervised clustering

  • Article type: Journal Article
    Changes in the content of radiological reports at the population level could reveal emerging diseases. Herein, we developed a method to quantify similarities between consecutive temporal groupings of radiological reports using natural language processing, and we investigated whether the appearance of dissimilarities between consecutive periods correlated with the beginning of the COVID-19 pandemic in France. CT reports from 67,368 consecutive adults across 62 emergency departments throughout France between October 2019 and March 2020 were collected. Reports were vectorized using term frequency-inverse document frequency (TF-IDF) analysis on unigrams. For each successive 2-week period, we performed unsupervised clustering of the reports based on TF-IDF values and partitioning around medoids. Next, we assessed the similarity between this clustering and the clustering from two weeks before according to the average adjusted Rand index (AARI). Statistical analyses included (1) cross-correlation functions (CCFs) with the number of positive SARS-CoV-2 tests and the advanced sanitary index for flu syndromes (ASI-flu, from an open-source dataset), and (2) linear regressions of time series at different lags to understand the variations of AARI over time. Overall, 13,235 chest CT reports were analyzed. AARI was correlated with ASI-flu at lags of +1, +5, and +6 weeks (P = 0.0454, 0.0121, and 0.0042, respectively) and with SARS-CoV-2-positive tests at lags of -1 and 0 weeks (P = 0.0057 and 0.0001, respectively). In the best fit, AARI correlated with ASI-flu with a lag of 2 weeks (P = 0.0026), SARS-CoV-2-positive tests in the same week (P < 0.0001), and their interaction (P < 0.0001) (adjusted R² = 0.921). Thus, our method enables the automatic monitoring of changes in radiological reports and could help capture disease emergence.

  • Article type: Journal Article
    This study aimed to investigate the eye movement patterns of ophthalmologists with varying expertise levels during the assessment of optical coherence tomography (OCT) reports for glaucoma detection. Objectives included evaluating eye gaze metrics and patterns as a function of ophthalmic education, deriving novel features from eye tracking, and developing binary classification models for disease detection and expertise differentiation. Thirteen ophthalmology residents, fellows, and clinicians specializing in glaucoma participated in the study. Junior residents had less than 1 year of experience, while senior residents had 2-3 years of experience. The expert group consisted of fellows and faculty with 3 to more than 30 years of experience. Each participant was presented with a set of 20 Topcon OCT reports (10 healthy and 10 glaucomatous) and was asked to determine the presence or absence of glaucoma and to rate their confidence in the diagnosis. The eye movements of each participant were recorded with a Pupil Labs Core eye tracker as they diagnosed the reports. Expert ophthalmologists exhibited more refined and focused eye fixations, particularly on specific regions of the OCT reports, such as the retinal nerve fiber layer (RNFL) probability map and the circumpapillary RNFL B-scan. The binary classification models developed using the derived features achieved an accuracy of up to 94.0% in differentiating between expert and novice clinicians. The derived features and trained binary classification models hold promise for improving the accuracy of glaucoma detection and for distinguishing between expert and novice ophthalmologists. These findings have implications for enhancing ophthalmic education and for the development of effective diagnostic tools.
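
The report-monitoring pipeline in the first abstract (unigram TF-IDF, medoid-based clustering, adjusted Rand index between consecutive periods) can be illustrated with a minimal, standard-library-only sketch. Several parts are assumptions rather than the authors' method: the whitespace tokenizer, the smoothed IDF formula, the simplified alternating k-medoids (not the full partitioning-around-medoids swap search), and the hypothetical helper `period_similarity`, which compares two periods by labeling the current reports once with their own clustering and once with the previous period's medoids. The abstract does not specify how the two clusterings were aligned or averaged.

```python
import math
from collections import Counter


def tfidf(docs):
    """Unigram TF-IDF vectors as dicts, L2-normalised (smoothed IDF, an assumption)."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    df = Counter(t for toks in tokenized for t in set(toks))
    vecs = []
    for toks in tokenized:
        tf = Counter(toks)
        v = {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        norm = math.sqrt(sum(x * x for x in v.values())) or 1.0
        vecs.append({t: x / norm for t, x in v.items()})
    return vecs


def cosine_distance(a, b):
    """Cosine distance between two unit-norm sparse vectors."""
    return 1.0 - sum(w * b.get(t, 0.0) for t, w in a.items())


def k_medoids(vecs, k, n_iter=20):
    """Simplified alternating k-medoids (requires k <= len(vecs))."""
    n = len(vecs)
    dist = [[cosine_distance(vecs[i], vecs[j]) for j in range(n)] for i in range(n)]
    medoids = [0]  # deterministic farthest-first initialisation
    while len(medoids) < k:
        medoids.append(max(range(n), key=lambda i: min(dist[i][m] for m in medoids)))

    def assign(meds):
        return [min(range(k), key=lambda c: dist[i][meds[c]]) for i in range(n)]

    for _ in range(n_iter):
        labels = assign(medoids)
        new = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c] or [medoids[c]]
            # New medoid: the member with the smallest total distance to its cluster.
            new.append(min(members, key=lambda m: sum(dist[m][j] for j in members)))
        if new == medoids:
            break
        medoids = new
    return assign(medoids), medoids


def adjusted_rand_index(a, b):
    """Adjusted Rand index between two labelings of the same items."""
    n = len(a)
    if n < 2:
        return 1.0
    pairs = lambda x: x * (x - 1) / 2
    sum_ij = sum(pairs(c) for c in Counter(zip(a, b)).values())
    sum_a = sum(pairs(c) for c in Counter(a).values())
    sum_b = sum(pairs(c) for c in Counter(b).values())
    expected = sum_a * sum_b / pairs(n)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0
    return (sum_ij - expected) / (max_index - expected)


def period_similarity(prev_docs, curr_docs, k):
    """Hypothetical helper: agreement between current and previous clusterings."""
    vecs = tfidf(prev_docs + curr_docs)  # shared vocabulary and IDF (assumption)
    prev, curr = vecs[:len(prev_docs)], vecs[len(prev_docs):]
    _, prev_medoids = k_medoids(prev, k)
    own_labels, _ = k_medoids(curr, k)
    # Label current reports by their nearest medoid from the previous period.
    carried = [
        min(range(k), key=lambda c: cosine_distance(v, prev[prev_medoids[c]]))
        for v in curr
    ]
    return adjusted_rand_index(own_labels, carried)
```

Calling `period_similarity(prev_reports, curr_reports, k)` on two lists of report strings returns a value of at most 1. Values near 1 suggest that both periods partition the reports in the same way, while a drop suggests a change in report content between periods.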