Deep metric learning

  • Article type: Journal Article
    Unknown diseases often emerge with few or no samples available, so zero-shot and few-shot learning have promising applications in medical image analysis. In this paper, we propose a Cross-Modal Deep Metric Learning Generalized Zero-Shot Learning (CM-DML-GZSL) model. The proposed network consists of a visual feature extractor, a fixed semantic feature extractor, and a deep regression module, forming a two-stream network for multiple modalities. In a multi-label setting, each sample contains, on average, a small number of positive labels and a large number of negative labels. This positive-negative imbalance dominates the optimization procedure and may prevent the model from establishing an effective correspondence between visual features and semantic vectors during training, resulting in low accuracy. To address this, we introduce a novel weighted focused Euclidean distance metric loss. This loss not only dynamically increases the weight of hard samples and decreases the weight of easy samples, but also strengthens the connection between samples and the semantic vectors corresponding to their positive labels, which helps mitigate bias in predicting unseen classes in the generalized zero-shot learning setting. Experimental results on large publicly available datasets demonstrate that this loss enables zero-shot multi-label learning for chest X-ray diagnosis.
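
The idea of re-weighting a Euclidean distance loss by sample difficulty can be illustrated with a small sketch. This is not the formula from the paper, which the abstract does not give; the weighting form and the parameters `gamma`, `margin`, and `pos_weight` are assumptions chosen only to show hard samples receiving more weight than easy ones.

```python
import math


def focal_distance_loss(visual, semantic, labels,
                        gamma=2.0, margin=1.0, pos_weight=4.0):
    """Illustrative focal-style Euclidean loss for one multi-label sample.

    visual:   embedding of the image (list of floats)
    semantic: one semantic vector per label
    labels:   1 for a positive label, 0 for a negative label
    All hyperparameters are hypothetical, not taken from the paper.
    """
    total = 0.0
    for sem_vec, y in zip(semantic, labels):
        d = math.dist(visual, sem_vec)
        if y == 1:
            # Positive label: pull the image toward its semantic vector.
            # Far-away (hard) positives get a weight close to 1.
            hardness = 1.0 - math.exp(-d)
            total += pos_weight * hardness ** gamma * d ** 2
        else:
            # Negative label: push away until the margin is satisfied.
            # Negatives already beyond the margin contribute nothing.
            violation = max(0.0, margin - d)
            hardness = violation / margin
            total += hardness ** gamma * violation ** 2
    return total / len(labels)


semantic_vectors = [[0.0, 0.0], [3.0, 4.0]]
easy = focal_distance_loss([0.0, 0.0], semantic_vectors, [1, 0])
hard = focal_distance_loss([0.0, 0.0], semantic_vectors, [0, 1])
```

Here `pos_weight` scales up the few positive terms so that the many negative terms do not dominate, which is one simple way to counter the positive-negative imbalance described above.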






  • Article type: Journal Article
    Single-cell RNA sequencing (scRNA-seq) has significantly accelerated the experimental characterization of distinct cell lineages and types in complex tissues and organisms. Cell-type annotation is of great importance in most scRNA-seq analysis pipelines. However, manual cell-type annotation relies heavily on the quality of the scRNA-seq data and marker genes, and can therefore be laborious and time-consuming. Furthermore, the heterogeneity of scRNA-seq datasets, such as the batch effect induced by different scRNA-seq protocols and samples, poses another challenge for accurate cell-type annotation. To overcome these limitations, we propose a novel pipeline, termed TripletCell, for cross-species, cross-protocol and cross-sample cell-type annotation. We developed a cell embedding and dimension-reduction module for feature extraction (FE) in TripletCell, namely TripletCell-FE, which uses a deep metric learning-based algorithm to model the relationships between the reference gene expression matrix and the query cells. Our experimental studies on 21 datasets (covering nine scRNA-seq protocols, two species and three tissues) demonstrate that TripletCell outperformed state-of-the-art approaches for cell-type annotation. More importantly, regardless of protocol or species, TripletCell delivers outstanding and robust performance in annotating different types of cells. TripletCell is freely available. We believe that TripletCell is a reliable computational tool for accurately annotating various cell types using scRNA-seq data and will be instrumental in assisting the generation of novel biological hypotheses in cell biology.
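
The abstract does not spell out the loss used by TripletCell-FE, but the name suggests a triplet objective, a common choice in deep metric learning. The sketch below shows the standard triplet margin loss on toy cell embeddings; the `margin` value and the example vectors are assumptions for illustration only.

```python
import math


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss on embedding vectors.

    The loss is zero once the anchor is closer to the positive (same cell
    type) than to the negative (different cell type) by at least `margin`.
    """
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy 2-D embeddings: the first triplet is already well separated,
# the second one has positive and negative swapped and is penalized.
satisfied = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
violated = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
```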






  • Article type: Journal Article
    An increasing number of deep autoencoder-based algorithms for intelligent condition monitoring and anomaly detection have been reported in recent years to improve wind turbine reliability. However, most existing studies have only focused on the precise modeling of normal data in an unsupervised manner; few studies have utilized the information of fault instances in the learning process, which results in suboptimal detection performance and low robustness. To this end, we first developed a deep autoencoder enhanced by fault instances, that is, a triplet-convolutional deep autoencoder (triplet-Conv DAE), jointly integrating a convolutional autoencoder and deep metric learning. Aided by fault instances, triplet-Conv DAE can not only capture normal operation data patterns but also acquire discriminative deep embedding features. Moreover, to overcome the difficulty of scarce fault instances, we adopted an improved generative adversarial network-based data augmentation method to generate high-quality synthetic fault instances. Finally, we validated the performance of the proposed anomaly detection method using a multitude of performance measures. The experimental results show that our method is superior to three other state-of-the-art methods. In addition, the proposed augmentation method can efficiently improve the performance of the triplet-Conv DAE when fault instances are insufficient.
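
One way to picture the joint integration of an autoencoder with metric learning is a combined objective: a reconstruction term plus a triplet term on the latent embeddings. The sketch below is a hedged illustration, not the authors' implementation; the weighting factor `lam`, the `margin`, and the use of squared distances are assumptions.

```python
def mse(x, x_hat):
    """Mean squared reconstruction error between an input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def sq_dist(u, v):
    """Squared Euclidean distance between two embeddings."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def joint_loss(x, x_hat, z_anchor, z_normal, z_fault, margin=1.0, lam=0.5):
    """Reconstruction loss plus a triplet term on latent embeddings.

    z_anchor and z_normal are embeddings of normal operation data,
    z_fault is the embedding of a fault instance. `lam` and `margin`
    are hypothetical hyperparameters.
    """
    recon = mse(x, x_hat)
    triplet = max(0.0, sq_dist(z_anchor, z_normal)
                  - sq_dist(z_anchor, z_fault) + margin)
    return recon + lam * triplet


# Perfect reconstruction and a distant fault embedding: no penalty.
separated = joint_loss([1.0, 2.0], [1.0, 2.0],
                       [0.0, 0.0], [0.0, 0.1], [2.0, 0.0])
# Same reconstruction, but the fault embedding sits close to the normal ones.
overlapping = joint_loss([1.0, 2.0], [1.0, 2.0],
                         [0.0, 0.0], [0.0, 0.1], [0.5, 0.0])
```

The reconstruction term models normal data, while the triplet term is what lets fault instances shape the embedding space.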






  • Article type: Journal Article
    Thoracic diseases, like many other diseases, can lead to complications. Multi-label medical image learning problems typically include rich pathological information, such as images, attributes, and labels, which is crucial for supporting clinical diagnosis. However, most current approaches focus exclusively on regression from the input to binary labels, ignoring the relationship between visual features and the semantic vectors of labels. In addition, the amount of data is imbalanced across diseases, which frequently causes intelligent diagnostic systems to make erroneous disease predictions. We therefore aim to improve the accuracy of multi-label classification of chest X-ray images. The ChestX-ray14 images were used as the multi-label dataset for the experiments in this study. By fine-tuning the ConvNeXt network, we obtained visual vectors, which we combined with semantic vectors encoded by BioBERT to map the two forms of features into a common metric space, with the semantic vectors serving as the prototype of each class in that space. The metric relationship between images and labels is then considered at the image level and the disease-category level, respectively, and a new dual-weighted metric loss function is proposed. The average AUC score achieved in the experiments was 0.826, and our model outperformed the comparison models.
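
Using semantic vectors as class prototypes means that a label can be scored by how close the image embedding lies to that label's prototype. The following sketch shows this scoring for the multi-label case. The 2-D vectors, the `temperature` parameter, and the 0.5 threshold are illustrative assumptions; in the paper the vectors would come from ConvNeXt and BioBERT, and the dual-weighted loss itself is not reproduced here.

```python
import math


def label_scores(visual, prototypes, temperature=1.0):
    """Map the distance to each class prototype to a score in (0, 1]."""
    return {name: math.exp(-math.dist(visual, proto) / temperature)
            for name, proto in prototypes.items()}


def predict_labels(visual, prototypes, threshold=0.5):
    """Return every label whose prototype is close enough to the image."""
    scores = label_scores(visual, prototypes)
    return sorted(name for name, score in scores.items() if score >= threshold)


# Toy prototypes standing in for semantic vectors of three disease labels.
prototypes = {
    "Effusion": [0.1, 0.0],
    "Mass": [0.0, 0.2],
    "Nodule": [3.0, 4.0],
}
predicted = predict_labels([0.0, 0.0], prototypes)
```

Because each label is thresholded independently, an image can receive several labels at once, which matches the multi-label setting.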






  • Article type: Journal Article
    Melanoma is a highly malignant tumor arising from melanocytes, prone to local recurrence and distant metastasis, with a poor prognosis. It is also difficult for inexperienced dermatologists to detect, because skin lesions are similar in appearance, such as color, shape, and contour.
    To develop and test a new computer-aided diagnosis scheme to detect melanoma skin cancer.
    In this new scheme, unsupervised clustering based on deep metric learning is first conducted to group images with high similarity, and the corresponding model weights are used as the teacher model for the next stage. Second, through knowledge distillation, attention transfer is adopted so that the classification model learns similarity features and category information simultaneously, which improves diagnostic accuracy over the common classification method.
    The validation set included 8 categories and 2443 samples. The highest accuracy of the new scheme is 0.7253, which is about 5 percentage points higher than the baseline (0.6794). Specifically, the F1-scores of three malignant lesions, BCC (basal cell carcinoma), SCC (squamous cell carcinoma), and MEL (melanoma), increase from 0.65 to 0.73, 0.28 to 0.37, and 0.54 to 0.58, respectively. On two test sets, HAM (3844 samples) and BCN (6375 samples), the highest accuracies are 0.68 and 0.53, respectively, which are higher than the baseline (0.649 and 0.516). Additionally, the F1-scores of BCC, SCC, and MEL are 0.49, 0.2, and 0.45 on the HAM dataset and 0.6, 0.14, and 0.55 on the BCN dataset, respectively, which are also higher than the baseline F1-scores.
    This study demonstrates that the similarity clustering method can extract the feature information needed to gather similar images together. Moreover, based on attention transfer, the proposed classification framework can improve the overall accuracy and F1-score of skin lesion diagnosis.
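
Attention transfer is commonly formulated by collapsing a feature map over its channels into a spatial attention map and penalizing the difference between the student's and the teacher's normalized maps. The sketch below follows that common formulation on flattened toy feature maps. Whether the paper uses exactly this variant is an assumption, and the function names are hypothetical.

```python
import math


def attention_map(features):
    """Collapse a feature map into an L2-normalized spatial attention map.

    features: list of channels, each a flat list of spatial activations.
    """
    positions = len(features[0])
    raw = [sum(channel[p] ** 2 for channel in features)
           for p in range(positions)]
    norm = math.sqrt(sum(v * v for v in raw)) or 1.0  # guard against all zeros
    return [v / norm for v in raw]


def attention_transfer_loss(student, teacher):
    """L2 distance between the student and teacher attention maps."""
    a_s = attention_map(student)
    a_t = attention_map(teacher)
    return math.sqrt(sum((s - t) ** 2 for s, t in zip(a_s, a_t)))


# Teacher attends to the first of two spatial positions (3 channels).
teacher = [[1.0, 0.0], [1.0, 0.0], [0.0, 0.0]]
aligned = attention_transfer_loss([[2.0, 0.0]], teacher)     # same focus
misaligned = attention_transfer_loss([[0.0, 2.0]], teacher)  # opposite focus
```

Because the maps are summed over channels and then normalized, the student and teacher may have different channel counts as long as their spatial sizes agree.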






  • Article type: Journal Article
    Objectives. The cardiac-related component in chest electrical impedance tomography (EIT) measurement is of potential value to pulmonary perfusion monitoring and cardiac function measurement. In a spontaneous breathing case, cardiac-related signals experience serious interference from ventilation-related signals. Traditional cardiac-related signal-separation methods are usually based on certain features of signals. To further improve the separation accuracy, more comprehensive features of the signals should be exploited. Approach. We propose an unsupervised deep-learning method called deep feature-domain matching (DFDM), which exploits the feature-domain similarity of the desired signals and the breath-holding signals. This method is characterized by two sub-steps. In the first step, a novel Siamese network is designed and trained to learn common features of breath-holding signals; in the second step, the Siamese network is used as a feature-matching constraint between the separated signals and the breath-holding signals. Main results. The method is first tested using synthetic data, and the results show satisfactory separation accuracy. The method is then tested using the data of three patients with pulmonary embolism, and the consistency between the separated images and the radionuclide perfusion scanning images is checked qualitatively. Significance. The method uses a lightweight convolutional neural network for fast network training and inference. It is a potential method for dynamic cardiac-related signal separation in clinical settings.
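
The defining property of a Siamese network is that both branches share one set of weights, so the distance between their outputs can serve as a feature-matching penalty. The sketch below illustrates this with a tiny shared 1-D convolutional encoder. The kernels, the pooling, and the penalty form are assumptions for illustration and do not reproduce the DFDM architecture.

```python
def encode(signal, kernels):
    """Shared encoder: 1-D convolution, ReLU, then global average pooling.

    Returns one feature per kernel. The kernels play the role of the
    shared weights of both Siamese branches.
    """
    features = []
    for kernel in kernels:
        n = len(signal) - len(kernel) + 1
        responses = [
            max(0.0, sum(w * signal[i + j] for j, w in enumerate(kernel)))
            for i in range(n)
        ]
        features.append(sum(responses) / n)
    return features


def feature_matching_penalty(separated, reference, kernels):
    """Squared feature distance between a separated signal and a reference."""
    f_sep = encode(separated, kernels)
    f_ref = encode(reference, kernels)
    return sum((a - b) ** 2 for a, b in zip(f_sep, f_ref))


# Hypothetical fixed kernels: an edge detector and a smoothing filter.
kernels = [[1.0, -1.0], [0.5, 0.5]]
breath_hold = [0.0, 1.0, 0.0, 1.0]  # stand-in for a breath-holding signal
matched = feature_matching_penalty(breath_hold, breath_hold, kernels)
mismatched = feature_matching_penalty([1.0, 1.0, 1.0, 1.0], breath_hold, kernels)
```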






  • Article type: Journal Article
    Rapid identification of plant diseases is essential for effective mitigation and control of their influence on plants. For automatic plant disease identification, classification of plant leaf images based on deep learning algorithms is currently the most accurate and popular method. Existing methods rely on the collection of large amounts of annotated image data and cannot flexibly adjust recognition categories. In contrast, we develop a new image retrieval system for automated detection, localization, and identification of individual leaf diseases in an open setting, that is, one where newly added disease types can be identified without retraining. In this paper, we first optimize the YOLOv5 algorithm to enhance its ability to recognize small objects, which helps to extract leaf objects more accurately; second, we integrate classification with metric learning, jointly learning image categorization and similarity measurement, thus capitalizing on the predictive ability of available image classification models; and finally, we construct an efficient and nimble image retrieval system to quickly determine the leaf disease type. We report detailed experimental results on three publicly available leaf disease datasets that demonstrate the effectiveness of our system. This work lays the groundwork for promoting plant disease surveillance applicable to intelligent agriculture and to crop research such as nutrition diagnosis, health status surveillance, and more.
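
The open-setting property follows from doing identification by retrieval: a new disease type is supported by adding its reference embeddings to a gallery, with no change to the embedding model. The sketch below shows this with cosine similarity. The `Gallery` class, the 2-D embeddings, and the disease names are hypothetical and stand in for the system described in the abstract.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


class Gallery:
    """Reference embeddings with labels, searched by similarity."""

    def __init__(self):
        self.items = []

    def add(self, embedding, label):
        """Register a reference leaf embedding; no retraining involved."""
        self.items.append((embedding, label))

    def query(self, embedding, top_k=1):
        """Return the labels of the top_k most similar references."""
        ranked = sorted(self.items,
                        key=lambda item: cosine(embedding, item[0]),
                        reverse=True)
        return [label for _, label in ranked[:top_k]]


gallery = Gallery()
gallery.add([1.0, 0.0], "rust")
gallery.add([0.0, 1.0], "blight")
first = gallery.query([0.9, 0.1])

# A disease type unseen so far becomes recognizable by adding one reference.
gallery.add([-1.0, 0.0], "mildew")
second = gallery.query([-0.8, 0.1])
```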






  • Article type: Journal Article
    Premature ventricular contractions (PVCs), common in both the general and patient populations, are irregular heartbeats that indicate potential heart disease. Clinically, long-term electrocardiograms (ECGs) collected from wearable devices are a non-invasive and inexpensive tool widely used by physicians to diagnose PVCs. However, analyzing long-term ECGs is time-consuming and labor-intensive for cardiologists. This paper therefore proposes a simple but powerful approach to detecting PVCs from long-term ECGs. The method uses deep metric learning to extract features from each heartbeat, with compact intra-class variance and well-separated inter-class differences. A k-nearest neighbors (KNN) classifier then computes the distance between samples based on these features to detect PVCs. Unlike previous PVC detection systems, the proposed method extracts features automatically through supervised deep metric learning, which avoids the bias introduced by manual feature engineering. The method is evaluated on the MIT-BIH (Massachusetts Institute of Technology-Beth Israel Hospital) Arrhythmia Database, a widely available set of standard test material, and achieves 99.7% accuracy, 97.45% sensitivity, and 99.87% specificity. The simulation results show that deep metric learning combined with KNN is reliable for PVC recognition. More importantly, the overall approach does not rely on complicated and cumbersome preprocessing.
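
Once heartbeats are embedded so that classes form compact, well-separated clusters, classification reduces to a distance-based vote. The sketch below shows a minimal KNN classifier over toy 2-D embeddings. The embeddings, the labels, and `k=3` are illustrative assumptions; in the paper the features come from the trained metric-learning network.

```python
import math
from collections import Counter


def knn_predict(query, train_embeddings, train_labels, k=3):
    """Classify a heartbeat embedding by majority vote of its k nearest neighbors."""
    neighbors = sorted(
        (math.dist(query, emb), label)
        for emb, label in zip(train_embeddings, train_labels)
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


# Toy embeddings forming two compact clusters.
train_embeddings = [
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.2],   # normal beats
    [5.0, 5.0], [5.1, 5.0], [4.9, 5.2],   # PVC beats
]
train_labels = ["normal", "normal", "normal", "PVC", "PVC", "PVC"]

beat_a = knn_predict([4.8, 5.1], train_embeddings, train_labels)
beat_b = knn_predict([0.05, 0.05], train_embeddings, train_labels)
```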







  • Article type: Journal Article
    Convolutional neural networks (CNNs) are the mainstream solution in the field of image retrieval. Deep metric learning has been introduced into image retrieval, with a focus on constructing pair-based loss functions. However, most pair-based loss functions in metric learning consider only a common vector similarity (such as Euclidean distance) between the final image descriptors, while neglecting other distribution characteristics of these descriptors. In this work, we propose relative distribution entropy (RDE) to describe the internal distribution attributes of image descriptors. We combine relative distribution entropy with the Euclidean distance to obtain the relative distribution entropy weighted distance (RDE-distance). Moreover, the RDE-distance is fused with the contrastive loss and triplet loss to build the relative distribution entropy loss functions. The experimental results demonstrate that our method attains state-of-the-art performance on most image retrieval benchmarks.
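
For reference, the sketch below shows the standard contrastive loss, one of the pair-based losses that the RDE-distance is fused with. The abstract does not give the RDE formula, so it is not reproduced here. The `distance_fn` parameter is a hypothetical hook marking where a weighted distance would replace the plain Euclidean distance, and `margin` is an assumed hyperparameter.

```python
import math


def contrastive_loss(u, v, same, margin=1.0, distance_fn=math.dist):
    """Standard contrastive loss for one pair of image descriptors.

    same:        True if both descriptors belong to the same class.
    distance_fn: plain Euclidean distance by default; a weighted distance
                 could be passed here instead (assumption, not the RDE formula).
    """
    d = distance_fn(u, v)
    if same:
        # Similar pair: penalize any distance between the descriptors.
        return 0.5 * d ** 2
    # Dissimilar pair: penalize only if closer than the margin.
    return 0.5 * max(0.0, margin - d) ** 2


similar_far = contrastive_loss([0.0, 0.0], [3.0, 4.0], same=True)
dissimilar_far = contrastive_loss([0.0, 0.0], [3.0, 4.0], same=False)
dissimilar_near = contrastive_loss([0.0, 0.0], [0.5, 0.0], same=False)
```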







  • Article type: Journal Article
    To distinguish ambiguous images when viewing specimen slides, pathologists usually spend considerable time seeking guidance from confirmed similar images or cases, which is inefficient. Several histopathological image retrieval methods have therefore been proposed so that pathologists can easily obtain images sharing similar content with the query images. However, these methods cannot ensure a reasonable similarity metric, and some of them need large numbers of annotated images to train a feature extractor to represent images. Motivated by this, we propose the first deep metric learning-based histopathological image retrieval method and construct a deep neural network based on the mixed attention mechanism to learn an embedding function under the supervision of image category information. With the learned embedding function, original images are mapped into a predefined metric space where similar images from the same category are close to each other, so that the distance between image pairs in the metric space can be regarded as a reasonable metric for image similarity. We evaluate the proposed method on two histopathological image retrieval datasets, our self-established dataset and a public dataset called Kimia Path24, on which the proposed method achieves recall in top-1 recommendation (Recall@1) of 84.04% and 97.89%, respectively. Moreover, further experiments confirm that the proposed method can achieve performance comparable to several published methods with less training data, which mitigates the shortage of annotated medical image data to some extent. Code is available.
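
Recall@1 counts a query as a hit when its nearest neighbor in the metric space has the same category. The sketch below computes it in a leave-one-out fashion over a single set of embeddings. The paper's exact evaluation protocol (for example, a separate query and database split) is not stated in the abstract, so this setup and the toy data are assumptions.

```python
import math


def recall_at_1(embeddings, labels):
    """Fraction of items whose nearest other item shares their label."""
    hits = 0
    for i, query in enumerate(embeddings):
        # Search every item except the query itself.
        nearest = min(
            (j for j in range(len(embeddings)) if j != i),
            key=lambda j: math.dist(query, embeddings[j]),
        )
        hits += labels[nearest] == labels[i]
    return hits / len(embeddings)


# Four well-clustered items plus one item of class "B" lying near class "A".
embeddings = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0], [2.0, 0.0]]
labels = ["A", "A", "B", "B", "B"]
score = recall_at_1(embeddings, labels)
```

In this toy set only the stray item is retrieved incorrectly, so four of the five queries count as hits.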





