focal loss

  • Article type: Journal Article
    Black tea is the second most common type of tea in China. Fermentation is one of the most critical processes in its production, and both insufficient and excessive fermentation degrade the quality of the finished product. At present, the determination of the degree of black tea fermentation relies entirely on human experience, which leads to inconsistent quality. To solve this problem, we use machine vision to distinguish the fermentation degree of black tea from images: this paper proposes a lightweight convolutional neural network (CNN) combined with knowledge distillation to discriminate the fermentation degree. After comparing 12 CNN models, and taking into account model size, discrimination performance, and the principles for selecting a teacher model, Shufflenet_v2_x1.0 is chosen as the student model and Efficientnet_v2 as the teacher model. Then, the cross-entropy loss is replaced by focal loss. Finally, four knowledge distillation methods, Soft Target Knowledge Distillation (ST), Masked Generative Distillation (MGD), Similarity-Preserving Knowledge Distillation (SPKD), and Attention Transfer (AT), are tested at distillation loss ratios of 0.6, 0.7, 0.8, and 0.9 for transferring knowledge into the Shufflenet_v2_x1.0 student. The results show that discrimination performance after distillation is best when the distillation loss ratio is 0.8 and the MGD method is used. This setup effectively improves discrimination performance without increasing the number of parameters or the computational cost. The model's P, R, and F1 values reach 0.9208, 0.9190, and 0.9192, respectively, achieving precise discrimination of the fermentation degree of black tea. This meets the requirement for objective judgment of black tea fermentation and provides technical support for the intelligent processing of black tea.
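    The abstract does not give implementation details, but the core combination it describes (focal loss replacing cross-entropy as the student's hard-label term, blended with a soft-target distillation term at a fixed distillation loss ratio) can be sketched as below. This is a minimal PyTorch sketch of the ST variant only; the class count, temperature, gamma, and function names are illustrative assumptions rather than values from the paper.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0):
    """Multi-class focal loss: down-weights well-classified samples."""
    log_probs = F.log_softmax(logits, dim=1)
    log_pt = log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)  # log p of true class
    pt = log_pt.exp()                                              # p of true class
    return (-(1.0 - pt) ** gamma * log_pt).mean()

def st_distillation_loss(student_logits, teacher_logits, targets,
                         ratio=0.8, temperature=4.0, gamma=2.0):
    """Soft-target KD: ratio * KL(teacher || student) + (1 - ratio) * focal (assumed form)."""
    soft_student = F.log_softmax(student_logits / temperature, dim=1)
    soft_teacher = F.softmax(teacher_logits / temperature, dim=1)
    kd = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * temperature ** 2
    hard = focal_loss(student_logits, targets, gamma=gamma)
    return ratio * kd + (1.0 - ratio) * hard

# toy usage: 4 fermentation classes, batch of 8 (shapes are illustrative only)
student_logits = torch.randn(8, 4, requires_grad=True)
teacher_logits = torch.randn(8, 4)
labels = torch.randint(0, 4, (8,))
loss = st_distillation_loss(student_logits, teacher_logits, labels, ratio=0.8)
loss.backward()
```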

  • Article type: Journal Article
    BACKGROUND: Bladder cancer (BC) segmentation on MRI images is the first step to determining the presence of muscular invasion. This study aimed to assess the tumor segmentation performance of three deep learning (DL) models on multi-parametric MRI (mp-MRI) images.
    METHODS: We studied 53 patients with bladder cancer. Bladder tumors were segmented on each slice of the T2-weighted (T2WI), diffusion-weighted imaging/apparent diffusion coefficient (DWI/ADC), and T1-weighted contrast-enhanced (T1WI) images acquired on a 3 Tesla MRI scanner. We trained Unet, MAnet, and PSPnet using three loss functions: cross-entropy (CE), Dice similarity coefficient loss (DSC), and focal loss (FL). We evaluated model performance using DSC, Hausdorff distance (HD), and expected calibration error (ECE).
    RESULTS: The MAnet algorithm with the CE+DSC loss function gave the highest DSC values on the ADC, T2WI, and T1WI images. PSPnet with CE+DSC obtained the smallest HDs on the ADC, T2WI, and T1WI images. The segmentation accuracy overall was better on the ADC and T1WI than on the T2WI. The ECEs were the smallest for PSPnet with FL on the ADC images, while they were the smallest for MAnet with CE+DSC on the T2WI and T1WI.
    CONCLUSIONS: Compared to Unet, MAnet and PSPnet with a hybrid CE+DSC loss function displayed better performance in BC segmentation, depending on the choice of evaluation metric.
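    The hybrid CE+DSC objective referred to above is a standard combination; a minimal PyTorch sketch for a single foreground class (tumor vs. background) is given below. The equal weighting between the two terms and the binary, single-channel formulation are assumptions, not details reported in the study.

```python
import torch
import torch.nn.functional as F

def dice_loss(logits, targets, eps=1.0):
    """Soft Dice loss for a single foreground class (tumor vs. background)."""
    probs = torch.sigmoid(logits)
    dims = (1, 2, 3)  # sum over channel and spatial dimensions
    intersection = (probs * targets).sum(dims)
    union = probs.sum(dims) + targets.sum(dims)
    return (1.0 - (2.0 * intersection + eps) / (union + eps)).mean()

def ce_dice_loss(logits, targets, ce_weight=0.5):
    """Hybrid loss: weighted sum of binary cross-entropy and Dice loss."""
    ce = F.binary_cross_entropy_with_logits(logits, targets)
    return ce_weight * ce + (1.0 - ce_weight) * dice_loss(logits, targets)

# toy usage: batch of 2 single-channel 64x64 masks (shapes are illustrative)
logits = torch.randn(2, 1, 64, 64, requires_grad=True)
masks = (torch.rand(2, 1, 64, 64) > 0.5).float()
ce_dice_loss(logits, masks).backward()
```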

  • Article type: Journal Article
    As a traditional delicacy in China, preserved eggs inevitably include instances of substandard quality during production, and Chinese preserved egg production facilities can only rely on experienced workers to sort them. However, manual selection of preserved eggs presents challenges such as low efficiency, subjective judgment, and high costs, which hinder industrial production. In response to these challenges, this study acquired transmitted images of preserved eggs and refined the ConvNeXt network along four pivotal dimensions: dimensionality reduction of the model feature maps, integration of multi-scale feature fusion (MSFF), incorporation of a global attention mechanism (GAM) module, and fusion of the cross-entropy loss function with focal loss. The resulting refined model, ConvNeXt_PEgg, classifies and grades preserved eggs. Notably, the improved model achieved a classification accuracy of 92.6% across the five categories of preserved eggs and a grading accuracy of 95.9% across three levels. Moreover, compared with its predecessor, the refined model reduces the parameter volume by 24.5% while improving classification accuracy by 3.2 percentage points and grading accuracy by 2.8 percentage points. Comparative analysis shows that each enhancement contributes a measurable performance gain, and the refined model outperforms numerous classical models, underscoring its efficacy in discerning the internal quality of preserved eggs. With its potential for real-world implementation, this technology promises to improve the economic viability of manufacturing facilities.
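    The fusion of cross-entropy with focal loss mentioned above can be read as a weighted sum of the two classification losses; a brief PyTorch sketch under that assumption follows. The mixing weight, gamma, and class count are illustrative, not values from the paper.

```python
import torch
import torch.nn.functional as F

def fused_ce_focal_loss(logits, targets, focal_weight=0.5, gamma=2.0):
    """Weighted fusion of cross-entropy and focal loss for multi-class grading."""
    ce = F.cross_entropy(logits, targets)
    log_pt = F.log_softmax(logits, dim=1).gather(1, targets.unsqueeze(1)).squeeze(1)
    focal = (-(1.0 - log_pt.exp()) ** gamma * log_pt).mean()
    return (1.0 - focal_weight) * ce + focal_weight * focal

# toy usage: 5 preserved-egg categories, batch of 16 (illustrative)
logits = torch.randn(16, 5, requires_grad=True)
labels = torch.randint(0, 5, (16,))
fused_ce_focal_loss(logits, labels).backward()
```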

  • Article type: Journal Article
    BACKGROUND: Lysine crotonylation (Kcr) is a crucial protein post-translational modification found in histone and non-histone proteins. It plays a pivotal role in regulating diverse biological processes in both animals and plants, including gene transcription and replication, cell metabolism and differentiation, as well as photosynthesis. Despite the significance of Kcr, detection of Kcr sites through biological experiments is often time-consuming, expensive, and only a fraction of crotonylated peptides can be identified. This reality highlights the need for efficient and rapid prediction of Kcr sites through computational methods. Currently, several machine learning models exist for predicting Kcr sites in humans, yet models tailored for plants are rare. Furthermore, no downloadable Kcr site predictors or datasets have been developed specifically for plants. To address this gap, it is imperative to integrate existing Kcr sites detected in plant experiments and establish a dedicated computational model for plants.
    RESULTS: Most plant Kcr sites are located on non-histones. In this study, we collected non-histone Kcr sites from five plants: wheat, tobacco, rice, peanut, and papaya. We then conducted a comprehensive analysis of the amino acid distribution surrounding these sites. To develop a predictive model for plant non-histone Kcr sites, we combined a convolutional neural network (CNN), a bidirectional long short-term memory network (BiLSTM), and an attention mechanism to build a deep learning model called PlantNh-Kcr. In both five-fold cross-validation and independent tests, PlantNh-Kcr outperformed multiple conventional machine learning models and other deep learning models. Furthermore, we analyzed species-specific effects on the PlantNh-Kcr model and found that a general model trained using data from multiple species outperforms species-specific models.
    CONCLUSIONS: PlantNh-Kcr represents a valuable tool for predicting plant non-histone Kcr sites. We expect that this model will aid in addressing key challenges and tasks in the study of plant crotonylation sites.
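    As a rough illustration of the CNN + BiLSTM + attention arrangement described above, the following PyTorch sketch classifies fixed-length peptide windows centered on a lysine. The window length, embedding and hidden sizes, and the simple additive attention are assumptions; this is not the PlantNh-Kcr architecture itself.

```python
import torch
import torch.nn as nn

class CnnBiLstmAttn(nn.Module):
    """Illustrative CNN + BiLSTM + attention classifier for peptide windows."""
    def __init__(self, vocab_size=21, embed_dim=32, conv_channels=64,
                 lstm_hidden=64, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.conv = nn.Conv1d(embed_dim, conv_channels, kernel_size=7, padding=3)
        self.lstm = nn.LSTM(conv_channels, lstm_hidden, batch_first=True,
                            bidirectional=True)
        self.attn = nn.Linear(2 * lstm_hidden, 1)  # additive attention scores
        self.fc = nn.Linear(2 * lstm_hidden, num_classes)

    def forward(self, tokens):
        x = self.embed(tokens)                                          # (B, L, E)
        x = torch.relu(self.conv(x.transpose(1, 2))).transpose(1, 2)    # (B, L, C)
        h, _ = self.lstm(x)                                             # (B, L, 2H)
        weights = torch.softmax(self.attn(h), dim=1)                    # (B, L, 1)
        context = (weights * h).sum(dim=1)                              # (B, 2H)
        return self.fc(context)

# toy usage: batch of 4 peptide windows of length 31 around a lysine (illustrative)
tokens = torch.randint(0, 21, (4, 31))
logits = CnnBiLstmAttn()(tokens)
print(logits.shape)  # torch.Size([4, 2])
```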

  • Article type: Journal Article
    Hyperspectral imaging is a key technology for the non-destructive detection of seed vigor because it can capture variations in the optical properties of seeds. Because seed vigor labels depend on the actual germination rate, the data inevitably contain an imbalance between positive and negative samples. Additionally, hyperspectral images (HSI) suffer from feature redundancy and collinearity because they contain hundreds of wavelengths, which also makes it challenging to extract effective wavelength information during feature selection; together, these issues limit the ability of deep learning to extract features from HSI and accurately predict seed vigor. Accordingly, this paper proposes a Focal-WAResNet network to predict seed vigor end-to-end, improving network performance, feature representation capability, and the accuracy of seed vigor prediction. Firstly, the focal loss function is used to adjust the loss weights of different sample categories, addressing the sample imbalance problem. Secondly, a WAResNet network is proposed to select characteristic wavelengths and predict seed vigor end-to-end, focusing on wavelengths with higher network weights, which enhances seed vigor prediction. To validate the effectiveness of this method, this study collected HSI of maize seeds for experimental verification, providing a reference for plant breeding. The experimental results demonstrate a significant improvement in classification performance compared with other state-of-the-art methods, with an accuracy of up to 98.48% and an F1 score of 95.9%.
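    One plausible reading of "focusing on wavelengths with higher network weights" is a learnable per-band weighting applied to the spectral channels before the backbone, whose largest weights indicate characteristic wavelengths. The sketch below illustrates that idea only; the module name, band count, and softmax normalization are assumptions and not the published WAResNet design.

```python
import torch
import torch.nn as nn

class WavelengthAttention(nn.Module):
    """Learnable per-band weights on HSI spectral channels; the largest learned
    weights can later be read off to select characteristic wavelengths."""
    def __init__(self, num_bands):
        super().__init__()
        self.band_weights = nn.Parameter(torch.ones(num_bands))

    def forward(self, x):                       # x: (B, bands, H, W)
        w = torch.softmax(self.band_weights, dim=0)
        return x * w.view(1, -1, 1, 1)

# toy usage: 8 seed images with 200 spectral bands (shapes are illustrative)
hsi = torch.randn(8, 200, 32, 32)
wa = WavelengthAttention(num_bands=200)
weighted = wa(hsi)                               # re-weighted input for the backbone
top_bands = torch.topk(wa.band_weights, k=10).indices  # candidate wavelengths
```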

  • Article type: Journal Article
    Protein contact map prediction is a critical step in protein structure prediction, and its accuracy is highly contingent on the feature representations of protein sequence information and the efficacy of deep learning models. In this paper, we propose an algorithm, DeepMSA+, to generate protein multiple sequence alignments (MSAs) and to construct feature representations based on co-evolutionary information and sequence information derived from the MSAs. We also propose an improved deep learning model, AttCON, trained on these input features to predict protein contact maps. The model incorporates an attention module, and by comparing different attention modules, we find a parameter-free attention module suitable for contact map prediction. Additionally, we use the Focal Loss function to better address the data imbalance issue in protein contact maps. We also developed a weighted evaluation index (W score) for model evaluation, which takes a wide range of metrics into account, with a particular focus on the precision of predictions for medium-range and long-range contacts. Experimental results show that AttCON achieves good precision on datasets from CASP11 to CASP15. Compared to some state-of-the-art methods, it achieves an average improvement of over 5% in both medium-range and long-range predictions, and the W score is improved by an average of 2 points.
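    Because contacts are sparse relative to non-contacts, the focal loss mentioned above is typically applied in its binary form over the whole L x L map. The PyTorch sketch below shows that usage; the alpha and gamma values and the toy contact density are assumptions, not settings reported for AttCON.

```python
import torch
import torch.nn.functional as F

def binary_focal_loss(logits, targets, alpha=0.75, gamma=2.0):
    """Binary focal loss over an L x L contact map; contacts are rare, so alpha
    up-weights the positive (contact) class and gamma down-weights easy pairs."""
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    pt = torch.exp(-bce)                                  # probability of the true class
    alpha_t = alpha * targets + (1.0 - alpha) * (1.0 - targets)
    return (alpha_t * (1.0 - pt) ** gamma * bce).mean()

# toy usage: predicted map for a 120-residue protein (size and density illustrative)
logits = torch.randn(1, 120, 120, requires_grad=True)
contacts = (torch.rand(1, 120, 120) > 0.97).float()      # sparse positives
binary_focal_loss(logits, contacts).backward()
```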

  • Article type: Journal Article
    Timely and accurate flame detection is a very important and practical technology for preventing the occurrence of fire accidents effectively. However, the current methods of flame detection are still faced with many challenges in video surveillance scenarios due to issues such as varying flame shapes, imbalanced samples, and interference from flame-like objects. In this work, a real-time flame detection method based on deformable object detection and time sequence analysis is proposed to address these issues. Firstly, based on the existing single-stage object detection network YOLOv5s, the network structure is improved by introducing deformable convolution to enhance the feature extraction ability for irregularly shaped flames. Secondly, the loss function is improved by using Focal Loss as the classification loss function to solve the problems of the imbalance of positive (flames) and negative (background) samples, as well as the imbalance of easy and hard samples, and by using EIOU Loss as the regression loss function to solve the problems of a slow convergence speed and inaccurate regression position in network training. Finally, a time sequence analysis strategy is adopted to comprehensively analyze the flame detection results of the current frame and historical frames in the surveillance video, alleviating false alarms caused by flame shape changes, flame occlusion, and flame-like interference. The experimental results indicate that the average precision (AP) and the F-Measure index of flame detection using the proposed method reach 93.0% and 89.6%, respectively, both of which are superior to the compared methods, and the detection speed is 24-26 FPS, meeting the real-time requirements of video flame detection.
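    The time sequence analysis strategy is described only at a high level; one simple interpretation is a sliding-window vote that raises an alarm only when flames are detected in enough recent frames. The sketch below illustrates that interpretation; the window size and hit threshold are assumptions, not parameters from the paper.

```python
from collections import deque

class FlameSequenceFilter:
    """Raise an alarm only if flames are detected in at least `min_hits` of the
    last `window` frames, suppressing single-frame false positives."""
    def __init__(self, window=10, min_hits=6):
        self.history = deque(maxlen=window)
        self.min_hits = min_hits

    def update(self, frame_has_flame: bool) -> bool:
        self.history.append(frame_has_flame)
        return sum(self.history) >= self.min_hits

# toy usage over a stream of per-frame detector outputs
filt = FlameSequenceFilter(window=10, min_hits=6)
for has_flame in [True, False, True, True, True, True, True]:
    alarm = filt.update(has_flame)
print(alarm)  # True once enough recent frames contain a flame
```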

  • Article type: Journal Article
    Ensuring road safety, structural stability and durability is of paramount importance, and detecting road cracks plays a critical role in achieving these goals. We propose a GM-ResNet-based method to enhance the precision and efficacy of crack detection. Leveraging ResNet-34 as the foundational network for crack image feature extraction, we address its insufficient assimilation of global and local information by incorporating a global attention mechanism into the architecture, facilitating comprehensive feature extraction across the channel and the spatial width and height dimensions. This dynamic interaction across dimensions optimizes feature representation and generalization, resulting in more precise crack detection. Recognizing the limitations of ResNet-34 in modeling intricate data relationships, we replace its fully connected layer with a multilayer fully connected neural network built from multiple linear, batch normalization and activation function layers. This construction strengthens feature expression, stabilizes training convergence and elevates the performance of the model in complex detection tasks. Moreover, tackling class imbalance is imperative in road crack detection; introducing the focal loss function as the training loss addresses this challenge, effectively mitigating the adverse impact of class imbalance on model performance. Experimental results on a publicly available crack dataset show that GM-ResNet achieves better crack detection accuracy and better evaluation metrics than alternative methods, validating its effectiveness in achieving optimal crack detection outcomes.
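    The replacement of ResNet-34's single fully connected layer with a multilayer head of linear, batch normalization and activation layers can be sketched as below using torchvision (version 0.13 or later for the `weights` argument). The layer widths and the two-class output are illustrative assumptions.

```python
import torch
import torch.nn as nn
from torchvision import models

# Illustrative head: swap ResNet-34's single fully connected layer for a small
# multilayer classifier built from Linear + BatchNorm + activation blocks.
backbone = models.resnet34(weights=None)       # no pretrained weights in this sketch
in_features = backbone.fc.in_features          # 512 for ResNet-34
backbone.fc = nn.Sequential(
    nn.Linear(in_features, 256),
    nn.BatchNorm1d(256),
    nn.ReLU(inplace=True),
    nn.Linear(256, 64),
    nn.BatchNorm1d(64),
    nn.ReLU(inplace=True),
    nn.Linear(64, 2),                          # crack vs. no-crack
)

logits = backbone(torch.randn(4, 3, 224, 224))
print(logits.shape)  # torch.Size([4, 2])
```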

  • Article type: Journal Article
    The efficient detection and counting of pig populations is critical for the promotion of intelligent breeding. Traditional methods for pig detection and counting mainly rely on manual labor, which is either time-consuming and inefficient or lacks sufficient detection accuracy. To address these issues, a novel model for pig detection and counting based on YOLOv5 enhanced with shuffle attention (SA) and Focal-CIoU (FC) is proposed in this paper, which we call YOLOv5-SA-FC. The SA attention module in this model enables multi-channel information fusion with almost no additional parameters, enhancing the richness and robustness of feature extraction. Furthermore, the Focal-CIoU localization loss helps to reduce the impact of sample imbalance on the detection results, improving the overall performance of the model. From the experimental results, the proposed YOLOv5-SA-FC model achieved a mean average precision (mAP) and count accuracy of 93.8% and 95.6%, outperforming other methods in terms of pig detection and counting by 10.2% and 15.8%, respectively. These findings verify the effectiveness of the proposed YOLOv5-SA-FC model for pig population detection and counting in the context of intelligent pig breeding.

  • Article type: Journal Article
    The identification and localization of tea picking points is a prerequisite for automatic picking of famous tea. However, because the color of tea buds is similar to that of young and old leaves, it is difficult for the human eye to identify them accurately.
    To address the problems of segmentation, detection, and localization of tea picking points in the complex environment of mechanically picking famous tea, this paper proposes a new model, MDY7-3PTB, which combines the high-precision segmentation capability of DeepLabv3+ with the rapid detection capability of YOLOv7. The model performs segmentation first, followed by detection and finally localization of tea buds, resulting in accurate identification of the tea bud picking point. The DeepLabv3+ feature extraction network is replaced with the more lightweight MobileNetV2 network to improve computation speed. In addition, multiple convolutional block attention modules (CBAM) are fused into the feature extraction and ASPP modules to further optimize model performance. Moreover, to address class imbalance in the dataset, the focal loss function is used to correct the data imbalance and improve segmentation, detection, and positioning accuracy.
    The MDY7-3PTB model achieved a mean intersection over union (mIoU) of 86.61%, a mean pixel accuracy (mPA) of 93.01%, and a mean recall (mRecall) of 91.78% on the tea bud segmentation dataset, outperforming common segmentation models such as PSPNet, Unet, and DeeplabV3+. In terms of tea bud picking point recognition and positioning, the model achieved a mean average precision (mAP) of 93.52%, a weighted average of precision and recall (F1 score) of 93.17%, a precision of 97.27%, and a recall of 89.41%, a significant improvement over existing mainstream YOLO-series detection models, with strong versatility and robustness. The method eliminates the influence of the background and directly detects the tea bud picking points with almost no missed detections, providing accurate two-dimensional coordinates for the picking points with a positioning precision of 96.41%. This provides a strong theoretical basis for future tea bud picking.
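    The segment-then-detect-then-localize pipeline described above can be outlined as below, with the segmentation and detection models passed in as callables and the picking point taken as the bottom-center of each detected bud box. That picking-point rule and the stand-in models are assumptions for illustration, not the MDY7-3PTB implementation.

```python
import numpy as np

def locate_picking_points(image, segment_fn, detect_fn):
    """Segment tea buds first, mask out the background, then detect buds and
    take each box's bottom-center as a candidate 2-D picking point."""
    mask = segment_fn(image)                  # (H, W) binary bud mask
    masked = image * mask[..., None]          # suppress background pixels
    boxes = detect_fn(masked)                 # list of (x1, y1, x2, y2) boxes
    return [((x1 + x2) / 2.0, y2) for (x1, y1, x2, y2) in boxes]

# toy usage with stand-in models (real code would call DeepLabv3+ and YOLOv7)
image = np.random.rand(480, 640, 3)
dummy_segment = lambda img: (img.mean(axis=2) > 0.5).astype(np.float32)
dummy_detect = lambda img: [(100.0, 120.0, 140.0, 180.0)]
print(locate_picking_points(image, dummy_segment, dummy_detect))
```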
