Target detection

  • Article type: Journal Article
    Intention detection of reaching movements is important for myoelectric human-machine collaboration applications. A comprehensive set of handcrafted features was extracted from windows of the electromyogram (EMG) of the upper-limb muscles while reaching nine nearby targets, mimicking activities of daily living. A feature-selection scoring method, neighborhood component analysis (NCA), selected the relevant feature subset. Finally, the target was recognized by a support vector machine (SVM) model. Classification performance was generalized by a nested cross-validation structure that selected the optimal feature subset in the inner loop. Given the low spatial resolution of the target locations on the display and the consequently slight discrimination between signals for different targets, the best classification accuracy of 77.11% was achieved by concatenating the features of two segments with lengths of 2 s and 0.25 s. Because EMG varies only subtly across targets, a wide range of features was applied to capture additional aspects of the information contained in the EMG signals. Furthermore, since NCA selected the features with the most discriminative power, it became feasible to employ various combinations of features, and even to concatenate features extracted from different parts of the movement, to improve classification performance.
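    An illustrative sketch of the nested cross-validation structure described above, using scikit-learn. NCA-based feature scoring has no ready-made selector in scikit-learn, so a generic mutual-information scorer stands in for it, and the EMG feature matrix and target labels below are random placeholders.

```python
# Nested cross-validation sketch: the inner loop picks the feature-subset
# size, the outer loop estimates generalization. NOTE: mutual information
# stands in for NCA-based scoring purely to illustrate the structure.
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X = np.random.randn(180, 120)          # placeholder: EMG feature windows
y = np.random.randint(0, 9, size=180)  # placeholder: nine reach targets

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectKBest(mutual_info_classif)),
    ("svm", SVC(kernel="rbf")),
])
inner = GridSearchCV(                   # inner loop: choose subset size
    pipe,
    param_grid={"select__k": [10, 20, 40, 80]},
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
)
outer_scores = cross_val_score(         # outer loop: unbiased estimate
    inner, X, y,
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=1),
)
print(f"nested-CV accuracy: {outer_scores.mean():.3f}")
```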

  • Article type: Journal Article
    The cervical vertebral maturation (CVM) method is essential for determining the timing of orthodontic and orthopedic treatment. In this paper, a target detection model called DC-YOLOv5 is proposed to achieve fully automated detection and staging of CVM. A total of 1800 cephalometric radiographs were labeled and categorized according to CVM stage. We introduced a model named DC-YOLOv5, optimized for the specific characteristics of CVM on the basis of YOLOv5. This optimization includes replacing the original bounding-box regression loss with Wise-IOU to address the mutual interference between vertical and horizontal losses in Complete-IOU (CIOU), which made model convergence challenging. We incorporated the Res-dcn-head module structure to enhance the focus on small-target features, improving the model's sensitivity to subtle sample differences. Additionally, we introduced the Convolutional Block Attention Module (CBAM) dual-channel attention mechanism to strengthen the focus on and understanding of critical features, thereby improving the accuracy and efficiency of target detection. Loss functions, precision, recall, mean average precision (mAP), and F1 scores were used as the main evaluation metrics to assess the performance of these models. Furthermore, we analyzed the regions important for model predictions using gradient Class Activation Mapping (CAM) techniques. The DC-YOLOv5 model achieved a final F1 score of 0.993 for CVM identification, with an mAP@0.5 of 0.994 and an mAP@0.5:0.95 of 0.943, converging faster and detecting more accurately and robustly than the other four models. The DC-YOLOv5 algorithm shows high accuracy and robustness in CVM identification, providing strong support for fast and accurate CVM identification, with positive implications for the development of the medical field and clinical diagnosis.
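    As a reference point for the attention mechanism named above, here is a minimal PyTorch sketch of a standard CBAM block (channel attention followed by spatial attention); this follows the original CBAM formulation, not necessarily the exact variant wired into DC-YOLOv5.

```python
# CBAM: channel attention (avg- and max-pooled descriptors through a shared
# MLP) followed by spatial attention (7x7 conv over pooled channel maps).
import torch
import torch.nn as nn

class CBAM(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),
        )
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3, bias=False)

    def forward(self, x):
        # channel attention: shared MLP over global avg- and max-pooling
        ca = torch.sigmoid(self.mlp(x.mean((2, 3), keepdim=True))
                           + self.mlp(x.amax((2, 3), keepdim=True)))
        x = x * ca
        # spatial attention: 7x7 conv over channel-wise mean and max maps
        sa = torch.sigmoid(self.spatial(
            torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)))
        return x * sa

feat = torch.randn(1, 64, 40, 40)
print(CBAM(64)(feat).shape)   # torch.Size([1, 64, 40, 40])
```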

  • Article type: Journal Article
    Robust object detection in complex environments, poor visual conditions, and open scenarios presents significant technical challenges in autonomous driving. These challenges necessitate the development of advanced fusion methods for millimeter-wave (mmWave) radar point cloud data and visual images. To address these issues, this paper proposes a radar-camera robust fusion network (RCRFNet), which leverages self-supervised learning and open-set recognition to effectively utilize the complementary information from both sensors. Specifically, the network uses matched radar-camera data through a frustum association approach to generate self-supervised signals, enhancing network training. The integration of global and local depth consistencies between radar point clouds and visual images, along with image features, helps construct object-class confidence levels for detecting unknown targets. Additionally, these techniques are combined with a multi-layer feature extraction backbone and a multimodal feature detection head to achieve robust object detection. Experiments on the nuScenes public dataset demonstrate that RCRFNet outperforms state-of-the-art (SOTA) methods, particularly under low visibility and when detecting unknown-class objects.
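    A rough sketch of the frustum-association idea under stated assumptions: radar points already expressed in the camera frame are kept for a 2D detection when their pinhole projection falls inside that detection's box. The intrinsics, points, and box below are placeholders, not nuScenes values.

```python
# Frustum association sketch: keep radar points whose image projection
# lands inside a 2D detection box (camera intrinsics K assumed known).
import numpy as np

def frustum_points(points_cam: np.ndarray, box_xyxy, K: np.ndarray):
    """points_cam: (N, 3) radar points already in the camera frame."""
    pts = points_cam[points_cam[:, 2] > 0]     # keep points ahead of camera
    uvw = (K @ pts.T).T                        # pinhole projection
    u, v = uvw[:, 0] / uvw[:, 2], uvw[:, 1] / uvw[:, 2]
    x1, y1, x2, y2 = box_xyxy
    inside = (u >= x1) & (u <= x2) & (v >= y1) & (v <= y2)
    return pts[inside]

K = np.array([[1266.0, 0, 816.0],              # placeholder intrinsics
              [0, 1266.0, 491.0],
              [0, 0, 1.0]])
pts = np.random.randn(200, 3) * [10, 2, 1] + [0, 0, 20]
print(frustum_points(pts, (700, 400, 900, 600), K).shape)
```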

  • Article type: Journal Article
    We propose a visual Simultaneous Localization and Mapping (SLAM) algorithm that integrates target detection and clustering techniques in dynamic scenarios to address the vulnerability of traditional SLAM algorithms to moving targets. The proposed algorithm integrates a target detection module into the front end of the SLAM pipeline and identifies dynamic objects within the visual range using an improved YOLOv5. Feature points associated with dynamic objects are disregarded, and only those corresponding to static targets are used for frame-to-frame matching. This approach effectively addresses camera pose estimation in dynamic environments, enhances system positioning accuracy, and optimizes visual SLAM performance. Experiments on the TUM public dataset, with comparisons against the traditional ORB-SLAM3 and DS-SLAM algorithms, validate that the proposed visual SLAM algorithm improves positioning accuracy in highly dynamic scenarios by an average of 85.70% and 30.92%, respectively. Compared with the DynaSLAM system using Mask R-CNN, our system exhibits superior real-time performance while maintaining a comparable ATE index. These results highlight that our proposed SLAM algorithm effectively reduces pose estimation errors, enhances positioning accuracy, and shows enhanced robustness compared with conventional visual SLAM algorithms.
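    A minimal sketch of the core filtering step, assuming hypothetical detector output: feature points that fall inside the boxes of dynamic-class detections are dropped before frame-to-frame matching.

```python
# Drop feature points that fall inside detected dynamic-object boxes, so
# only static-scene points feed pose estimation (hypothetical detections).
import numpy as np

DYNAMIC = {"person", "car", "bicycle"}          # assumed dynamic classes

def static_keypoints(kps: np.ndarray, detections):
    """kps: (N, 2) pixel coords; detections: [(label, (x1, y1, x2, y2)), ...]"""
    keep = np.ones(len(kps), dtype=bool)
    for label, (x1, y1, x2, y2) in detections:
        if label in DYNAMIC:
            inside = ((kps[:, 0] >= x1) & (kps[:, 0] <= x2) &
                      (kps[:, 1] >= y1) & (kps[:, 1] <= y2))
            keep &= ~inside
    return kps[keep]

kps = np.random.rand(500, 2) * [640, 480]
dets = [("person", (100, 50, 220, 400)), ("chair", (400, 300, 520, 460))]
print(static_keypoints(kps, dets).shape)        # points kept for matching
```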

  • Article type: Journal Article
    In large public places such as railway stations and airports, dense pedestrian detection is important for safety and security. Deep learning methods provide relatively effective solutions but still face problems such as difficult feature extraction, multi-scale variation in images, and high missed-detection rates, which pose great challenges for research in this field. In this paper, we propose GR-yolo, an improved dense pedestrian detection algorithm based on YOLOv8. GR-yolo introduces the RepC3 module to optimize the backbone network, enhancing feature extraction, and adopts the aggregation-distribution mechanism to reconstruct the YOLOv8 neck structure, fusing multi-level information for more efficient information exchange and strengthening the detection ability of the model. Meanwhile, the GIoU loss is used to help GR-yolo converge better, improve the detection accuracy of target positions, and reduce missed detections. Experiments show that GR-yolo improves detection performance over YOLOv8, with gains in mean detection accuracy of 3.1% on the WiderPerson dataset, 7.2% on the CrowdHuman dataset, and 11.7% on the People Detection Images dataset. Therefore, the proposed GR-yolo algorithm is suitable for dense, multi-scale, and scene-variable pedestrian detection, and the improvements also provide a new idea for dense pedestrian detection in real scenes.
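    For reference, a short PyTorch sketch of the GIoU loss mentioned above, with boxes as (x1, y1, x2, y2); this is the standard formulation, not GR-yolo's own code.

```python
# Generalized IoU (GIoU) loss: IoU minus the fraction of the smallest
# enclosing box that is not covered by the union of the two boxes.
import torch

def giou_loss(b1: torch.Tensor, b2: torch.Tensor, eps: float = 1e-7):
    """b1, b2: (..., 4) boxes as (x1, y1, x2, y2)."""
    iw = (torch.min(b1[..., 2], b2[..., 2]) - torch.max(b1[..., 0], b2[..., 0])).clamp(0)
    ih = (torch.min(b1[..., 3], b2[..., 3]) - torch.max(b1[..., 1], b2[..., 1])).clamp(0)
    inter = iw * ih
    a1 = (b1[..., 2] - b1[..., 0]) * (b1[..., 3] - b1[..., 1])
    a2 = (b2[..., 2] - b2[..., 0]) * (b2[..., 3] - b2[..., 1])
    union = a1 + a2 - inter + eps
    iou = inter / union
    # smallest box enclosing both
    cw = torch.max(b1[..., 2], b2[..., 2]) - torch.min(b1[..., 0], b2[..., 0])
    ch = torch.max(b1[..., 3], b2[..., 3]) - torch.min(b1[..., 1], b2[..., 1])
    c_area = cw * ch + eps
    giou = iou - (c_area - union) / c_area
    return 1 - giou
```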

  • Article type: Journal Article
    Automated detection of cervical lesion cells/clumps in cervical cytological images is essential for computer-aided diagnosis. In this task, the shape and size of lesion cells/clumps vary considerably, reducing detection performance. To address this issue, we propose an adaptive feature extraction network for cervical lesion cell/clump detection, called AFE-Net. Specifically, we propose an adaptive module to acquire the features of cervical lesion cells/clumps, while introducing a global bias mechanism to acquire global average information, aiming to combine the adaptive features with global information, improve the representation of target features in the model, and thus enhance detection performance. Furthermore, we analyze the performance of popular bounding-box losses on this model and propose a new bounding-box loss, Tendency-IoU (TIoU). Finally, the network achieves a mean average precision (mAP) of 64.8% on the CDetector dataset with 30.7 million parameters. Compared with YOLOv7 (62.6% mAP, 34.8M parameters), the model improves mAP by 2.2% and reduces the number of parameters by 11.8%.
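    The paper's adaptive module and TIoU loss are not spelled out in the abstract, but the "global average information" idea can be sketched, as an assumption, as a squeeze-style branch: globally pooled context is projected and added back onto local features.

```python
# Hedged sketch of a "global bias" branch: global average pooling yields
# per-channel context that is projected and added back onto the local
# feature map. This is an assumption about the mechanism, not AFE-Net's code.
import torch
import torch.nn as nn

class GlobalBias(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x):                       # x: (B, C, H, W)
        g = x.mean(dim=(2, 3), keepdim=True)    # global average information
        return x + self.proj(g)                 # bias local features globally

feat = torch.randn(1, 64, 40, 40)
print(GlobalBias(64)(feat).shape)               # torch.Size([1, 64, 40, 40])
```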

  • Article type: Journal Article
    Accurate detection and counting of flax plant organs are crucial for obtaining phenotypic data and are the cornerstone of flax variety selection and management strategies. In this study, a Flax-YOLOv5 model is proposed for obtaining flax plant phenotypic data. Building on the solid foundation of the original YOLOv5x feature extraction network, the network structure was extended with the BiFormer module, which seamlessly integrates bi-directional encoders and transformers, enabling it to focus on key features in an adaptive-query manner and thereby improving the computational performance and efficiency of the model. In addition, we introduced the SIoU function to compute the regression loss, which effectively solves the problem of mismatch between predicted and ground-truth boxes. Flax plants grown in Lanzhou were collected to produce the training, validation, and test sets, and the detection results on the validation set showed a mean average precision (mAP@0.5) of 99.29%. On the test set, the correlation coefficients (R) between the model's predictions and manual measurements of the number of flax fruits, plant height, main stem length, and number of main stem divisions were 99.59%, 99.53%, 99.05%, and 92.82%, respectively. This study provides a stable and reliable method for the detection and quantification of flax phenotypic characteristics and opens up a new technical route for selecting and breeding superior varieties.
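    A short snippet showing how the reported correlation coefficients between model predictions and manual measurements could be computed; the arrays are illustrative, not the paper's data.

```python
# Pearson correlation between predicted and manually measured fruit counts
# (illustrative numbers only, not the paper's data).
import numpy as np

predicted = np.array([12, 8, 15, 10, 9, 14])
measured = np.array([12, 9, 15, 10, 8, 14])
r = np.corrcoef(predicted, measured)[0, 1]
print(f"R = {r:.4f}")
```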

  • Article type: Journal Article
    BACKGROUND: Rice field weed object detection can provide key information on weed species and locations for precise spraying, which is of great significance in actual agricultural production. However, in complex and changing real farm environments, traditional object detection methods still have difficulty identifying small, occluded, and densely distributed weed instances. To address these problems, this paper proposes a multi-scale feature-enhanced DETR network named RMS-DETR. By adding multi-scale feature extraction branches on top of DETR, the model fully utilizes information from different semantic feature layers to improve recognition of rice field weeds in real-world scenarios.
    METHODS: Introducing multi-scale feature layers on the basis of the DETR model, we design each semantic feature layer differently. The high-level semantic feature layer adopts a Transformer structure to extract contextual information between barnyard grass and rice plants, while the low-level semantic feature layer uses a CNN structure to extract local detail features of barnyard grass. Introducing multi-scale feature layers inevitably increases model computation and thus lowers inference speed, so we employ a new type of convolution, PConv (partial convolution), to replace the traditional standard convolutions in the model (sketched after this abstract).
    RESULTS: Compared with the original DETR model, our proposed RMS-DETR model achieved average recognition accuracy improvements of 3.6% and 4.4% on our rice field weed dataset and the DOTA public dataset, respectively, reaching average recognition accuracies of 0.792 and 0.851. The RMS-DETR model size is 40.8 M with an inference time of 0.0081 s. Compared with three classical DETR variants (Deformable DETR, Anchor DETR, and DAB-DETR), RMS-DETR improves average precision by 2.1%, 4.9%, and 2.4%, respectively.
    CONCLUSIONS: The model can accurately identify rice field weeds in complex real-world scenarios, providing key technical support for precision spraying and the management of variable-rate spraying systems.
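    A minimal sketch of the partial convolution (PConv) referenced in METHODS, assuming the common formulation: spatially convolve only a fraction of the channels and pass the rest through untouched, which is what cuts the extra computation.

```python
# Partial convolution (PConv) sketch: convolve only the first 1/n_div of
# the channels; the remainder passes through unchanged, reducing FLOPs
# versus a full standard convolution over all channels.
import torch
import torch.nn as nn

class PConv(nn.Module):
    def __init__(self, channels: int, n_div: int = 4, kernel_size: int = 3):
        super().__init__()
        self.c_conv = channels // n_div
        self.conv = nn.Conv2d(self.c_conv, self.c_conv, kernel_size,
                              padding=kernel_size // 2, bias=False)

    def forward(self, x):
        head, tail = x[:, :self.c_conv], x[:, self.c_conv:]
        return torch.cat([self.conv(head), tail], dim=1)

x = torch.randn(1, 64, 32, 32)
print(PConv(64)(x).shape)   # torch.Size([1, 64, 32, 32])
```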

  • Article type: Journal Article
    The identification of safflower filament targets and the precise localization of picking points are fundamental prerequisites for automated filament retrieval. In light of challenges such as severe target occlusion, low recognition accuracy, and the considerable size of models in unstructured environments, this paper introduces a novel lightweight YOLO-SaFi model. The architecture features a Backbone layer incorporating the StarNet network, a Neck layer introducing a novel ELC convolution module to refine the C2f module, and a Head layer implementing a new lightweight shared-convolution detection head, Detect_EL. Furthermore, the loss function is enhanced by upgrading CIoU to PIoUv2. These enhancements significantly augment the model's capability to perceive spatial information and facilitate multi-feature fusion, improving detection performance and making the model more lightweight. Comparative experiments show that YOLO-SaFi reduces parameters, computational load, and weight file size by 50.0%, 40.7%, and 48.2%, respectively, relative to the YOLOv8 baseline, while improving recall by 1.9%, mean average precision by 0.3%, and detection speed by 88.4 frames per second. Finally, deployment of YOLO-SaFi on the Jetson Orin Nano device corroborates the superior performance of the enhanced model, establishing a robust visual detection framework for the advancement of intelligent safflower filament retrieval robots in unstructured environments.
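    A generic way to reproduce a frames-per-second comparison on a device such as the Jetson Orin Nano; the model below is a small stand-in, not YOLO-SaFi.

```python
# Generic FPS measurement loop (the conv stack is a stand-in model, not
# YOLO-SaFi itself); on CUDA devices, synchronize before reading the clock.
import time
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(16, 16, 3, padding=1)).eval()
x = torch.randn(1, 3, 640, 640)

with torch.no_grad():
    for _ in range(10):                 # warm-up iterations
        model(x)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    t0 = time.perf_counter()
    n = 100
    for _ in range(n):
        model(x)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    print(f"{n / (time.perf_counter() - t0):.1f} FPS")
```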

  • Article type: Journal Article
    With the continuous advancement of the economy and technology, the number of cars keeps increasing, and traffic congestion on some key roads is becoming increasingly serious. This paper proposes a new vehicle information feature map (VIFM) method and a multi-branch convolutional neural network (MBCNN) model, and applies them to camera-image-based traffic congestion detection. The aim of this study is to build a deep learning model that takes traffic images as input and outputs congestion detection results, providing a new method for the automatic detection of traffic congestion. The deep learning-based method in this article can effectively utilize the existing massive camera network in the transportation system without requiring substantial additional hardware investment. This study first uses an object detection model to identify vehicles in images; then a method for extracting the VIFM is proposed; finally, a traffic congestion detection model based on MBCNN is constructed. The method is verified on the Chinese City Traffic Image Database (CCTRIB). Compared with other convolutional neural networks, other deep learning models, and baseline models, the proposed method yields superior results, achieving an F1 score of 98.61% and an accuracy of 98.62%. The experimental results show that this method effectively solves the traffic congestion detection problem and provides a powerful tool for traffic management.
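    The abstract does not spell out the VIFM construction; as a hedged guess at the idea, detected vehicle boxes can be rasterized into a coarse grid map that a downstream CNN could consume.

```python
# Hedged sketch of a vehicle-information feature map: detected vehicle
# boxes are binned into a coarse occupancy/density grid as CNN input.
# The exact VIFM recipe is an assumption, not the paper's published method.
import numpy as np

def vehicle_feature_map(boxes, img_wh=(1920, 1080), grid=(32, 18)):
    """boxes: [(x1, y1, x2, y2), ...] in pixels; returns (grid_h, grid_w)."""
    gw, gh = grid
    fmap = np.zeros((gh, gw), dtype=np.float32)
    sx, sy = gw / img_wh[0], gh / img_wh[1]
    for x1, y1, x2, y2 in boxes:
        cx, cy = int((x1 + x2) / 2 * sx), int((y1 + y2) / 2 * sy)
        fmap[min(cy, gh - 1), min(cx, gw - 1)] += 1.0   # vehicle density
    return fmap

boxes = [(100, 500, 300, 650), (350, 520, 560, 700), (380, 510, 590, 690)]
print(vehicle_feature_map(boxes).sum())   # 3 vehicles binned
```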
