loss function

  • Article type: Journal Article
    The aim of our research is to enhance the calibration of machine learning models for glaucoma classification through a specialized loss function named Confidence-Calibrated Label Smoothing (CC-LS) loss. This approach is designed to refine model calibration without compromising accuracy by integrating label smoothing and confidence penalty techniques, tailored to the specifics of glaucoma detection.
    This study focuses on the development and evaluation of a calibrated deep learning model.
    The study employs fundus images from two external datasets, the Online Retinal Fundus Image Database for Glaucoma Analysis and Research (482 normal, 168 glaucoma) and the Retinal Fundus Glaucoma Challenge (720 normal, 80 glaucoma), and an extensive internal dataset (4639 images per category), aiming to bolster the model's generalizability. The model's clinical performance is validated using a comprehensive test set (47,913 normal, 1629 glaucoma) from the internal dataset.
    The CC-LS loss function integrates label smoothing, which tempers extreme predictions to avoid overfitting, with confidence-based penalties. These penalties deter the model from expressing undue confidence in incorrect classifications. We train models using the CC-LS loss and compare their performance with that of models trained using conventional loss functions.
    The model's performance is evaluated using the Brier score, sensitivity, specificity, and the false positive rate, alongside qualitative heatmap analyses for a holistic accuracy assessment.
    Preliminary findings reveal that models employing the CC-LS loss exhibit superior calibration metrics, as evidenced by a Brier score of 0.098, along with notable accuracy measures: sensitivity of 81%, specificity of 80%, and weighted accuracy of 80%. Importantly, these enhancements in calibration are achieved without sacrificing classification accuracy.
    The CC-LS loss function presents a significant advancement in the pursuit of deploying machine learning models for glaucoma diagnosis. By improving calibration, the CC-LS loss helps clinicians interpret and trust the predictive probabilities, making artificial intelligence-driven diagnostic tools more clinically viable. From a clinical standpoint, this heightened trust and interpretability can potentially lead to more timely and appropriate interventions, thereby optimizing patient outcomes and safety.
    Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
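    As a rough illustration of the two ingredients named in this abstract, the sketch below subtracts an entropy term (a confidence penalty) from a label-smoothed cross-entropy and also computes the Brier score for binary predictions. The abstract does not state the CC-LS formula, so this particular combination, the function names, and the hyperparameters `eps` and `beta` are assumptions, not the authors' definition.

```python
import math

def softmax(logits):
    # Subtract the max logit for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def smoothed_ce_with_confidence_penalty(logits, target, eps=0.1, beta=0.1):
    """Label-smoothed cross-entropy minus beta * entropy of the prediction.

    Hypothetical CC-LS-style loss: eps (smoothing) and beta (penalty weight)
    are illustrative values, not taken from the paper.
    """
    p = [max(pi, 1e-12) for pi in softmax(logits)]  # clamp to avoid log(0)
    k = len(p)
    # Smoothed target: (1 - eps) on the true class, eps spread uniformly.
    q = [(1.0 - eps) * (1.0 if i == target else 0.0) + eps / k for i in range(k)]
    cross_entropy = -sum(qi * math.log(pi) for qi, pi in zip(q, p))
    entropy = -sum(pi * math.log(pi) for pi in p)
    # Low-entropy (over-confident) outputs receive a smaller entropy bonus.
    return cross_entropy - beta * entropy

def brier_score(probs, labels):
    """Mean squared difference between predicted probability and 0/1 label."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)
```

    With `eps=0` and `beta=0` the loss reduces to ordinary cross-entropy, which makes it easy to compare against a conventional baseline.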






  • Article type: Journal Article
    Short-term precipitation forecasting is essential for agriculture, transportation, urban management, and tourism. The radar echo extrapolation method is widely used in precipitation forecasting. To address issues like forecast degradation, insufficient capture of spatiotemporal dependencies, and low accuracy in radar echo extrapolation, we propose a new model: MS-DD3D-RSTN. This model employs spatiotemporal convolutional blocks (STCBs) as spatiotemporal feature extractors and uses the spatial-temporal loss (STLoss) function to learn intra-frame and inter-frame changes for end-to-end training, thereby capturing the spatiotemporal dependencies in radar echo signals. Experiments on the Sichuan dataset and the HKO-7 dataset show that the proposed model outperforms advanced models in terms of CSI and POD evaluation metrics. For 2 h forecasts with 20 dBZ and 30 dBZ reflectivity thresholds, the CSI metrics reached 0.538, 0.386, 0.485, and 0.198, respectively, representing the best levels among existing methods. The experiments demonstrate that the MS-DD3D-RSTN model enhances the ability to capture spatiotemporal dependencies, mitigates forecast degradation, and further improves radar echo prediction performance.
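    The CSI and POD scores quoted in this abstract are computed from threshold exceedances. The sketch below shows the standard definitions on flattened reflectivity grids; the function name and the flat-list input format are illustrative assumptions.

```python
def csi_and_pod(forecast, observed, threshold):
    """Critical success index and probability of detection at one threshold.

    forecast / observed: flat lists of reflectivity values (dBZ).
    An "event" is a value at or above the threshold.
    """
    hits = misses = false_alarms = 0
    for f, o in zip(forecast, observed):
        f_event, o_event = f >= threshold, o >= threshold
        if f_event and o_event:
            hits += 1
        elif o_event:
            misses += 1
        elif f_event:
            false_alarms += 1
    denom_csi = hits + misses + false_alarms
    denom_pod = hits + misses
    csi = hits / denom_csi if denom_csi else float("nan")
    pod = hits / denom_pod if denom_pod else float("nan")
    return csi, pod
```

    CSI penalizes both misses and false alarms, while POD ignores false alarms, so the two are usually reported together.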






  • Article type: Journal Article
    In this paper, artificial intelligence (AI) technology is applied to the electromagnetic imaging of anisotropic objects. Magnetic anomaly sensing systems and electromagnetic imaging use electromagnetic principles to detect and characterize subsurface or hidden objects. We use measured multifrequency scattered fields to calculate the initial dielectric constant distribution of anisotropic objects through the backpropagation scheme (BPS). The estimated multifrequency permittivity distribution is then input to a convolutional neural network (CNN), trained with the adaptive moment estimation (Adam) method, to reconstruct a more accurate image. We also improve the definition of the loss function in the CNN. Numerical results show that the improved loss function, which combines the structural similarity index measure (SSIM) and root mean square error (RMSE), can effectively enhance image quality. In our simulation environment, noise interference is considered for both TE (transverse electric) and TM (transverse magnetic) waves to reconstruct anisotropic scatterers. Lastly, we conclude that multifrequency reconstructions are more stable and precise than single-frequency reconstructions.
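    One simple way to combine SSIM and RMSE in a single loss is a weighted sum of (1 - SSIM) and RMSE, sketched below. The abstract does not give the weighting or the SSIM window, so the weight `alpha` and the use of one global window (rather than the usual local sliding windows) are assumptions made to keep the example short.

```python
import math

def rmse(x, y):
    """Root mean square error between two flattened images."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

def global_ssim(x, y, data_range=1.0):
    """SSIM computed over the whole image as a single window (simplification)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    # Standard stabilising constants from the SSIM definition.
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx * mx + my * my + c1) * (vx + vy + c2)
    )

def ssim_rmse_loss(pred, target, alpha=0.5, data_range=1.0):
    """Weighted sum of structural dissimilarity and RMSE (alpha is assumed)."""
    structural = 1.0 - global_ssim(pred, target, data_range)
    return alpha * structural + (1.0 - alpha) * rmse(pred, target)
```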






  • Article type: Journal Article
    In the field of industrial safety, wearing helmets plays a vital role in protecting workers' health. Industrial environments have complex backgrounds, and differences in distance make many helmets appear as small targets, so helmet-wearing detection methods suffer from false detections and missed detections. An improved YOLOv8 safety helmet wearing detection network is proposed that enhances the capture of details, improves multiscale feature processing, and improves the accuracy of small target detection by introducing a Dilation-wise residual attention module, atrous spatial pyramid pooling, and a normalized Wasserstein distance loss function. Experiments were conducted on the SHWD dataset, and the results showed that the mAP of the improved network reached 92.0% and that the network exceeded the traditional target detection network in accuracy, recall, and other key metrics. These results indicate improved detection of helmet wearing in complex environments.
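    The normalized Wasserstein distance (NWD) treats each bounding box as a 2D Gaussian and compares the Gaussians instead of the box overlap, which keeps a useful signal for tiny boxes that do not overlap at all. The sketch below follows the commonly cited formulation; the constant `c` is dataset-dependent, and the default used here is only a placeholder assumption.

```python
import math

def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein distance between two boxes given as (cx, cy, w, h).

    Each box is modelled as a Gaussian with mean (cx, cy) and
    standard deviations (w / 2, h / 2). c is an assumed normalising constant.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Squared 2-Wasserstein distance between the two Gaussians.
    w2_sq = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2.0) ** 2
        + ((ha - hb) / 2.0) ** 2
    )
    return math.exp(-math.sqrt(w2_sq) / c)

def nwd_loss(box_a, box_b, c=12.8):
    """Regression loss: 0 for identical boxes, approaching 1 as they diverge."""
    return 1.0 - nwd(box_a, box_b, c)
```

    For two 4x4 boxes whose centres are 6 pixels apart the IoU is zero, yet `nwd` still returns a value strictly between 0 and 1.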






  • Article type: Journal Article
    Addressing the limitations of current railway track foreign object detection techniques, which suffer from inadequate real-time performance and diminished accuracy in detecting small objects, this paper introduces an innovative vision-based perception methodology harnessing the power of deep learning. Central to this approach is the construction of a railway boundary model utilizing a sophisticated track detection method, along with an enhanced UNet semantic segmentation network to achieve autonomous segmentation of diverse track categories. By employing equal interval division and row-by-row traversal, critical track feature points are precisely extracted, and the track linear equation is derived through the least squares method, thus establishing an accurate railway boundary model. We optimized the YOLOv5s detection model in four aspects: incorporating the SE attention mechanism into the Neck network layer to enhance the model's feature extraction capabilities, adding a prediction layer to improve the detection performance for small objects, proposing a linear size scaling method to obtain suitable anchor boxes, and utilizing Inner-IoU to refine the boundary regression loss function, thereby increasing the positioning accuracy of the bounding boxes. We conducted a detection accuracy validation for railway track foreign object intrusion using a self-constructed image dataset. The results indicate that the proposed semantic segmentation model achieved an MIoU of 91.8%, representing a 3.9% improvement over the previous model, effectively segmenting railway tracks. Additionally, the optimized detection model could effectively detect foreign object intrusions on the tracks, reducing missed and false alarms and achieving a 7.4% increase in the mean average precision (IoU = 0.5) compared to the original YOLOv5s model. The model exhibits strong generalization capabilities in scenarios involving small objects. This proposed approach represents an effective exploration of deep learning techniques for railway track foreign object intrusion detection, suitable for use in complex environments to ensure the operational safety of rail lines.
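    Two steps from this abstract lend themselves to a short sketch: the least-squares line fit through track feature points, and an Inner-IoU-style overlap computed on rescaled auxiliary boxes. Both functions are illustrative. Regressing x on y, the (cx, cy, w, h) box format, and the default `ratio` are assumptions, and the paper's full regression loss is not reproduced here.

```python
def fit_track_line(points):
    """Least-squares fit of x = a * y + b to (x, y) feature points.

    Regressing x on y (an assumption) stays well-conditioned for rails
    that run nearly vertically in the image.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    s_yy = sum((y - mean_y) ** 2 for _, y in points)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = s_xy / s_yy
    b = mean_x - a * mean_y
    return a, b

def inner_iou(box_a, box_b, ratio=0.75):
    """IoU of two (cx, cy, w, h) boxes after scaling each about its own centre."""
    def corners(box):
        cx, cy, w, h = box
        half_w, half_h = w * ratio / 2.0, h * ratio / 2.0
        return cx - half_w, cy - half_h, cx + half_w, cy + half_h

    ax1, ay1, ax2, ay2 = corners(box_a)
    bx1, by1, bx2, by2 = corners(box_b)
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0
```

    With `ratio=1.0` the second function is the ordinary IoU; a ratio below 1 shrinks both boxes and therefore demands tighter alignment before any overlap is credited.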






  • Article type: Journal Article
    BACKGROUND: Bladder cancer (BC) segmentation on MRI images is the first step to determining the presence of muscular invasion. This study aimed to assess the tumor segmentation performance of three deep learning (DL) models on multi-parametric MRI (mp-MRI) images.
    METHODS: We studied 53 patients with bladder cancer. Bladder tumors were segmented on each slice of T2-weighted (T2WI), diffusion-weighted imaging/apparent diffusion coefficient (DWI/ADC), and T1-weighted contrast-enhanced (T1WI) images acquired on a 3 Tesla MRI scanner. We trained Unet, MAnet, and PSPnet using three loss functions: cross-entropy (CE), dice similarity coefficient loss (DSC), and focal loss (FL). We evaluated the model performances using DSC, Hausdorff distance (HD), and expected calibration error (ECE).
    RESULTS: The MAnet algorithm with the CE+DSC loss function gave the highest DSC values on the ADC, T2WI, and T1WI images. PSPnet with CE+DSC obtained the smallest HDs on the ADC, T2WI, and T1WI images. The segmentation accuracy overall was better on the ADC and T1WI than on the T2WI. The ECEs were the smallest for PSPnet with FL on the ADC images, while they were the smallest for MAnet with CE+DSC on the T2WI and T1WI.
    CONCLUSIONS: Compared to Unet, MAnet and PSPnet with a hybrid CE+DSC loss function displayed better performances in BC segmentation depending on the choice of the evaluation metric.
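    A hybrid CE+DSC loss of the kind reported here is commonly formed as a weighted sum of pixel-wise cross-entropy and a soft Dice loss. The sketch below shows a binary version on flattened masks; the equal weights and the smoothing constant are assumptions, since the abstract does not specify them.

```python
import math

def binary_cross_entropy(probs, mask, tiny=1e-7):
    """Mean pixel-wise cross-entropy for foreground probabilities vs a 0/1 mask."""
    total = 0.0
    for p, y in zip(probs, mask):
        p = min(max(p, tiny), 1.0 - tiny)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probs)

def soft_dice_loss(probs, mask, smooth=1.0):
    """1 - soft Dice coefficient; smooth avoids division by zero on empty masks."""
    inter = sum(p * y for p, y in zip(probs, mask))
    return 1.0 - (2.0 * inter + smooth) / (sum(probs) + sum(mask) + smooth)

def ce_dice_loss(probs, mask, w_ce=1.0, w_dice=1.0):
    """Weighted sum of the two terms (weights are illustrative)."""
    return w_ce * binary_cross_entropy(probs, mask) + w_dice * soft_dice_loss(probs, mask)
```

    The cross-entropy term gives smooth per-pixel gradients, while the Dice term is insensitive to the large background area that dominates small-tumor slices.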






  • Article type: Journal Article
    Deep learning is the standard for medical image segmentation. However, it may encounter difficulties when the training set is small, and it may generate anatomically aberrant segmentations. Anatomical knowledge can be useful as a constraint in deep learning segmentation methods. We propose a loss function based on projected pooling to introduce soft topological constraints. Our main application is the segmentation of the red nucleus from quantitative susceptibility mapping (QSM), which is of interest in parkinsonian syndromes.
    This new loss function introduces soft constraints on the topology by magnifying small parts of the target structure so that they are not discarded in the segmentation process. To that end, we project the structure onto the three planes and then apply a series of MaxPooling operations with increasing kernel sizes. These operations are performed for both the ground truth and the prediction, and the difference is computed to obtain the loss function. As a result, it can reduce topological errors as well as defects in the structure boundary. The approach is easy to implement and computationally efficient.
    When applied to the segmentation of the red nucleus from QSM data, the approach led to a very high accuracy (Dice 89.9%) and no topological errors. Moreover, the proposed loss function improved the Dice accuracy over the baseline when the training set was small. We also studied three tasks from the medical segmentation decathlon challenge (MSD) (heart, spleen, and hippocampus). For the MSD tasks, the Dice accuracies were similar for both approaches but the topological errors were reduced.
    We propose an effective method to automatically segment the red nucleus, based on a new loss that introduces topology constraints in deep learning segmentation.
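    The projection-and-pooling idea described in this abstract can be sketched in a few lines. The version below uses plain nested lists, a max projection, non-overlapping pooling windows, and a mean absolute difference; all four choices, along with the kernel sizes, are assumptions made for illustration and may differ from the authors' implementation.

```python
def project(volume, axis):
    """Max-project a [z][y][x] nested-list volume along one axis (assumed choice)."""
    d, h, w = len(volume), len(volume[0]), len(volume[0][0])
    if axis == 0:
        return [[max(volume[z][y][x] for z in range(d)) for x in range(w)] for y in range(h)]
    if axis == 1:
        return [[max(volume[z][y][x] for y in range(h)) for x in range(w)] for z in range(d)]
    return [[max(volume[z][y][x] for x in range(w)) for y in range(h)] for z in range(d)]

def max_pool2d(img, k):
    """Non-overlapping k x k max pooling; partial windows at the border are kept."""
    h, w = len(img), len(img[0])
    return [
        [
            max(img[i][j] for i in range(r, min(r + k, h)) for j in range(c, min(c + k, w)))
            for c in range(0, w, k)
        ]
        for r in range(0, h, k)
    ]

def projected_pooling_loss(pred, truth, kernel_sizes=(1, 2, 4)):
    """Sum over planes and kernel sizes of the mean absolute pooled difference."""
    loss = 0.0
    for axis in range(3):
        pred_2d, truth_2d = project(pred, axis), project(truth, axis)
        for k in kernel_sizes:
            pooled_p, pooled_t = max_pool2d(pred_2d, k), max_pool2d(truth_2d, k)
            diffs = [
                abs(a - b)
                for row_p, row_t in zip(pooled_p, pooled_t)
                for a, b in zip(row_p, row_t)
            ]
            loss += sum(diffs) / len(diffs)
    return loss
```

    In this sketch a single missed voxel contributes 1/16 of the mean difference at kernel size 1 on a 4x4 projection but the full difference at kernel size 4, which is the magnifying effect on small parts that the abstract describes.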






  • Article type: Journal Article
    A new algorithm, Yolov8n-FADS, is proposed to improve the accuracy of miners' helmet detection in complex underground environments. By replacing the head with Attentional Sequence Fusion (ASF) and introducing the P2 detection layer, the ASF-P2 structure comprehensively extracts the global and local feature information of the image, and the improved backbone captures spatially sparse features more efficiently, which improves the model's ability to perceive complex patterns. The improved detection head, SEAMHead, built on the SEAM module, handles occlusion more effectively. The Focal Loss module improves the model's ability to detect rare target categories by adjusting the weights of positive and negative samples. Compared with the original model, the improved model achieves 29% memory compression, a 36.7% reduction in the number of parameters, and a 4.9% improvement in detection accuracy, which can improve the detection of underground helmet wearers, reduce the workload of underground video surveillance personnel, and improve monitoring efficiency.
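    The reweighting performed by focal loss can be seen in the standard binary form, sketched below for a single prediction. The defaults `alpha=0.25` and `gamma=2.0` are commonly used values, not settings reported in this abstract.

```python
import math

def binary_focal_loss(p, y, alpha=0.25, gamma=2.0, tiny=1e-7):
    """Focal loss for one predicted foreground probability p and a 0/1 label y."""
    p = min(max(p, tiny), 1.0 - tiny)            # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p               # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha   # class-balancing weight
    # (1 - p_t) ** gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

    With `gamma=0` the modulating factor disappears and the function reduces to class-weighted cross-entropy, so raising `gamma` shifts the training signal toward hard examples.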






  • Article type: Journal Article
    This article addresses the semantic segmentation of laparoscopic surgery images, placing special emphasis on the segmentation of structures with a smaller number of observations. As a result of this study, adjustment parameters are proposed for deep neural network architectures, enabling a robust segmentation of all structures in the surgical scene. The U-Net architecture with five encoder-decoders (U-Net5ed), SegNet-VGG19, and DeepLabv3+ employing different backbones are implemented. Three main experiments are conducted, working with Rectified Linear Unit (ReLU), Gaussian Error Linear Unit (GELU), and Swish activation functions. The applied loss functions include Cross Entropy (CE), Focal Loss (FL), Tversky Loss (TL), Dice Loss (DiL), Cross Entropy Dice Loss (CEDL), and Cross Entropy Tversky Loss (CETL). The performance of Stochastic Gradient Descent with momentum (SGDM) and Adaptive Moment Estimation (Adam) optimizers is compared. It is qualitatively and quantitatively confirmed that DeepLabv3+ and U-Net5ed architectures yield the best results. The DeepLabv3+ architecture with the ResNet-50 backbone, Swish activation function, and CETL loss function reports a Mean Accuracy (MAcc) of 0.976 and Mean Intersection over Union (MIoU) of 0.977. The semantic segmentation of structures with a smaller number of observations, such as the hepatic vein, cystic duct, liver ligament, and blood, shows that the obtained results are competitive with the consulted literature. The selected parameters were validated in the YOLOv9 architecture, which showed an improvement in semantic segmentation compared to the results obtained with the original architecture.
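    The Tversky loss used in the TL and CETL variants generalizes Dice by weighting false positives and false negatives separately, which is useful for rarely observed structures. The sketch below is a binary, flattened-mask version; the parameter values are illustrative, and the cross-entropy term that CETL adds is omitted.

```python
def tversky_index(probs, mask, alpha=0.5, beta=0.5, smooth=1e-6):
    """Soft Tversky index: alpha weights false positives, beta false negatives."""
    tp = sum(p * y for p, y in zip(probs, mask))
    fp = sum(p * (1 - y) for p, y in zip(probs, mask))
    fn = sum((1 - p) * y for p, y in zip(probs, mask))
    return (tp + smooth) / (tp + alpha * fp + beta * fn + smooth)

def tversky_loss(probs, mask, alpha=0.5, beta=0.5):
    """1 - Tversky index; alpha = beta = 0.5 recovers the Dice loss."""
    return 1.0 - tversky_index(probs, mask, alpha, beta)
```

    Choosing `beta` larger than `alpha` raises the cost of missed foreground pixels, which favors recall on small classes.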






  • Article type: Journal Article
    Most accidents in chemical processes are caused by abnormal values or deviations of process parameters, and existing research focuses on short-term prediction. When the early warning time is extended, many false and missed alarms occur, which causes problems for on-site personnel; ensuring the accuracy of early warning as far as possible while extending the early warning time is a technical problem requiring an urgent solution. In the present work, a bidirectional long short-term memory network (BiLSTM) model was established according to the temporal variation characteristics of process parameters, and the whale optimization algorithm (WOA) was used to optimize the model's hyperparameters automatically. The predicted values were then used to construct a Modified Inverted Normal Loss Function (MINLF), and the probability of abnormal fluctuations of process parameters was calculated using the residual time theory. Finally, the WOA-BiLSTM-MINLF process parameter prediction model with inherent risk and trend risk was established, and the fluctuation process of the process parameters was transformed into dynamic risk values. The results show that the prediction model raises an alarm 16 min ahead of distributed control systems (DCS), which gives operators enough time to take safety protection measures in advance and prevent accidents.
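    For orientation, the unmodified inverted normal loss function maps a deviation from a target value onto a bounded loss. The abstract does not describe the modification used in MINLF, so the sketch below shows only the standard form; the parameter names are assumptions.

```python
import math

def inverted_normal_loss(y, target, gamma, k=1.0):
    """Standard inverted normal loss: 0 at the target, approaching k far from it.

    gamma is a shape parameter controlling how quickly the loss saturates;
    k is the maximum loss.
    """
    return k * (1.0 - math.exp(-((y - target) ** 2) / (2.0 * gamma ** 2)))
```

    Unlike a quadratic loss, this function is bounded by `k`, so a large excursion of a process parameter translates into a risk value that saturates instead of growing without limit.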





