Keywords: Cancer imaging; Data mining; Machine learning

Source: DOI:10.1038/s41746-020-0238-2

Abstract:
The emergence of digital pathology has opened new horizons for histopathology. Artificial intelligence (AI) algorithms are able to operate on digitized slides to assist pathologists with different tasks. Whereas AI-based classification and segmentation methods have obvious benefits for image analysis, image search represents a fundamental shift in computational pathology. Matching the pathology of new patients with already diagnosed and curated cases offers pathologists a new approach to improve diagnostic accuracy through visual inspection of similar cases and computational majority vote for consensus building. In this study, we report the results from searching the largest public repository (The Cancer Genome Atlas, TCGA) of whole-slide images from almost 11,000 patients. We successfully indexed and searched almost 30,000 high-resolution digitized slides constituting 16 terabytes of data and comprising 20 million image patches of 1000 × 1000 pixels. The TCGA image database covers 25 anatomic sites and contains 32 cancer subtypes. High-performance storage and GPU power were employed for experimentation. The results were assessed with conservative "majority voting" to build consensus for subtype diagnosis through vertical search and demonstrated high accuracy values for both frozen section slides (e.g., bladder urothelial carcinoma 93%, kidney renal clear cell carcinoma 97%, and ovarian serous cystadenocarcinoma 99%) and permanent histopathology slides (e.g., prostate adenocarcinoma 98%, skin cutaneous melanoma 99%, and thymoma 100%). The key finding of this validation study was that computational consensus appears to be possible for rendering diagnoses if a sufficiently large number of searchable cases are available for each cancer subtype.
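The "majority voting" idea in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact voting rule or how many search results are considered, so the strict-majority threshold, the default `k=5`, and the function name `majority_vote` are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter


def majority_vote(retrieved_labels, k=5):
    """Return the consensus subtype among the top-k search results, or None.

    retrieved_labels: subtype labels of retrieved cases, best match first.
    Assumption: a label wins only with a strict majority (> half) of the
    top-k hits; the paper's actual rule may differ.
    """
    top = list(retrieved_labels)[:k]
    if not top:
        return None
    label, count = Counter(top).most_common(1)[0]
    return label if count > len(top) / 2 else None


# Hypothetical search results for one query slide (TCGA-style subtype codes).
hits = ["KIRC", "KIRC", "KIRP", "KIRC", "KICH"]
consensus = majority_vote(hits, k=5)
```

Returning `None` when no label holds a strict majority keeps the vote conservative: the sketch abstains instead of guessing when the retrieved cases disagree.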