A self-supervised framework for cross-modal search in histopathology archives using scale harmonization.

Abstract:

The exponential growth of data across medical domains has created substantial demand for techniques to analyze multimodal big data. This demand is particularly pronounced in computational pathology because of the diverse nature of tissue. Cross-modal retrieval aims to identify a common latent space in which different modalities, such as image-text pairs, are closely aligned. The primary challenge, however, often lies in the representation of tissue features. While language models can be trained relatively easily, visual models frequently struggle because labeled data are scarce. To address this issue, the concept of harmonization is introduced, extending the self-supervised learning scheme known as DINO (self-distillation with no labels). Scale harmonization refines the DINO paradigm through a novel patching approach that addresses the complexities posed by gigapixel whole slide images in digital pathology. Experiments on diverse datasets demonstrate that the proposed approach significantly enhances cross-modal retrieval in tissue imaging. Moreover, it shows considerable potential for other fields that rely on gigapixel imaging.
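
To make the idea of a common latent space concrete, the following is a minimal sketch of text-to-image retrieval by cosine similarity. It is not the method of the paper: the embedding vectors are hypothetical toy values standing in for the outputs of trained image and text encoders, and the helper names `l2_normalize` and `retrieve` are illustrative assumptions.

```python
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so dot products equal cosine similarity."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)


def retrieve(query: np.ndarray, gallery: np.ndarray, k: int = 1) -> np.ndarray:
    """Return the indices of the k gallery rows most similar to the query."""
    q = l2_normalize(query[None, :])   # shape (1, d)
    g = l2_normalize(gallery)          # shape (n, d)
    scores = (g @ q.T).ravel()         # cosine similarity per gallery item
    return np.argsort(-scores)[:k]     # highest similarity first


# Hypothetical toy embeddings: three tissue-image vectors and one text query,
# assumed to already live in the same 3-dimensional latent space.
image_embeddings = np.array([
    [0.9, 0.1, 0.0],
    [0.1, 0.8, 0.2],
    [0.0, 0.2, 0.9],
])
text_embedding = np.array([0.2, 0.9, 0.1])

top = retrieve(text_embedding, image_embeddings, k=2)
print(top)  # indices of the two closest images, best match first
```

In a real system the gallery would hold embeddings of slide patches or whole slides, and the quality of retrieval would depend on how well the visual encoder represents tissue, which is the difficulty the abstract highlights.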
