object classification

  • Article type: Journal Article
    High-voltage power line insulators are crucial for safe and efficient electricity transmission. However, real-world image limitations, particularly regarding dirty insulator strings, delay the development of robust algorithms for insulator inspection. This dataset addresses the challenge with a novel synthetic high-voltage power line insulator image database, created using computer-aided design software and a game development engine. Publicly available CAD models of high-voltage towers with the most common insulator types (polymer, glass, and porcelain) were imported into the game engine. This virtual environment allowed a diverse dataset to be generated by manipulating virtual cameras, simulating various lighting conditions, and incorporating different backgrounds such as mountains, forests, plantations, rivers, cities, and deserts. The database comprises two main sets: the Image Segmentation Set, which includes 47,286 images categorized by insulator material (ceramic, polymeric, and glass) and landscape type (mountains, forests, plantations, rivers, cities, and deserts), and the Image Classification Set, which contains 14,424 images simulating common insulator string contaminants: salt, soot, bird excrement, and clean insulators. Each contaminant category has 3,606 images, divided into 1,202 images per insulator type. This synthetic database offers a valuable resource for training and evaluating machine learning algorithms for high-voltage power line insulator inspection, ultimately contributing to enhanced power grid maintenance and reliability.
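The split described above is easy to sanity-check: each contaminant class holds 1,202 images per insulator type, three types per contaminant, and four contaminant classes in total. A minimal sketch (class and type names are taken from the abstract; the function itself is purely illustrative, not part of the dataset):

```python
# Counts implied by the abstract's Image Classification Set description.
CONTAMINANTS = ["salt", "soot", "bird_excrement", "clean"]
INSULATOR_TYPES = ["polymer", "glass", "porcelain"]
IMAGES_PER_TYPE = 1202

def classification_set_counts():
    """Return (images per contaminant class, total images) implied above."""
    per_contaminant = IMAGES_PER_TYPE * len(INSULATOR_TYPES)  # 1,202 x 3
    total = per_contaminant * len(CONTAMINANTS)               # x 4 classes
    return per_contaminant, total

per_contaminant, total = classification_set_counts()
print(per_contaminant, total)  # 3606 14424, matching the abstract
```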

  • Article type: Journal Article
    This study outlines a method for using surveillance cameras and an algorithm that calls a deep learning model to generate video segments featuring salmon and trout in small streams. This automated process greatly reduces the need for human intervention in video surveillance. Furthermore, a comprehensive guide is provided on setting up and configuring surveillance equipment, along with instructions on training a deep learning model tailored to specific requirements. Access to video data and knowledge about deep learning models makes monitoring of trout and salmon dynamic and hands-on, as the collected data can be used to train and further improve deep learning models. Hopefully, this setup will encourage fisheries managers to conduct more monitoring as the equipment is relatively cheap compared with customized solutions for fish monitoring. To make effective use of the data, natural markings of the camera-captured fish can be used for individual identification. While the automated process greatly reduces the need for human intervention in video surveillance and speeds up the initial sorting and detection of fish, the manual identification of individual fish based on natural markings still requires human effort and involvement. Individual encounter data hold many potential applications, such as capture-recapture and relative abundance models, and for evaluating fish passages in streams with hydropower by spatial recaptures, that is, the same individual identified at different locations. There is much to gain by using this technique as camera captures are the better option for the fish's welfare and are less time-consuming compared with physical captures and tagging.
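As one concrete example of the capture-recapture use mentioned above, individual encounter data from two monitoring sessions can feed the classical Lincoln-Petersen abundance estimator. This estimator is standard ecology practice, not a method the paper itself implements:

```python
def lincoln_petersen(marked_first, caught_second, recaptured):
    """Classical Lincoln-Petersen abundance estimate N = (M * C) / R.

    marked_first:  individuals identified in the first session (M)
    caught_second: individuals identified in the second session (C)
    recaptured:    individuals seen in both sessions (R)
    """
    if recaptured == 0:
        raise ValueError("need at least one recaptured individual")
    return (marked_first * caught_second) / recaptured

# e.g. 40 fish identified in session 1, 50 in session 2, 10 seen in both
print(lincoln_petersen(40, 50, 10))  # 200.0 estimated individuals
```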

  • Article type: Journal Article
    In recent years, the development of intelligent sensor systems has experienced remarkable growth, particularly in the domain of microwave and millimeter wave sensing, thanks to the increased availability of affordable hardware components. With the development of a smart Ground-Based Synthetic Aperture Radar (GBSAR) system called GBSAR-Pi, we previously explored object classification applications based on raw radar data. Building upon this foundation, in this study we analyze the potential of utilizing polarization information to improve the performance of deep learning models based on raw GBSAR data. The data are obtained with a GBSAR operating at 24 GHz with both vertical (VV) and horizontal (HH) polarization, resulting in two matrices (VV and HH) per observed scene. We present several approaches demonstrating the integration of such data into classification models based on a modified ResNet18 architecture. We also introduce a novel Siamese architecture tailored to accommodate the dual-input radar data. The results indicate that a simple concatenation method is the most promising approach and underscore the importance of considering antenna polarization and merging strategies in deep learning applications based on radar data.
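The "simple concatenation" strategy found most promising can be sketched as stacking the VV and HH matrices into a two-channel input before the CNN sees them. A minimal numpy illustration (the shapes are assumptions for the sketch, not values taken from the paper):

```python
import numpy as np

def concat_polarizations(vv, hh):
    """Stack two single-channel radar matrices into one 2-channel input."""
    assert vv.shape == hh.shape, "VV and HH must cover the same scene grid"
    return np.stack([vv, hh], axis=0)  # (2, H, W), channels-first for a CNN

vv = np.zeros((30, 128))  # e.g. 30 aperture positions x 128 range bins
hh = np.ones((30, 128))
x = concat_polarizations(vv, hh)
print(x.shape)  # (2, 30, 128)
```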

  • Article type: Journal Article
    BACKGROUND: Accurate recognition and early warning of plant diseases and pests are a prerequisite for their intelligent prevention and control. Because of the phenotypic similarity of affected plants after diseases and pests occur, as well as interference from the external environment, traditional deep learning models often face the overfitting problem in phenotype recognition of plant diseases and pests, which leads not only to slow network convergence but also to low recognition accuracy.
    RESULTS: Motivated by the above problems, the present study proposes a deep learning model, EResNet-support vector machine (SVM), to alleviate overfitting in the recognition and classification of plant diseases and pests. First, the feature extraction capability of the model is improved by adding feature extraction layers to the convolutional neural network. Second, order-reduction modules are embedded and a sparsely activated function is introduced to reduce model complexity and alleviate overfitting. Finally, a classifier fusing an SVM with fully connected layers is introduced to transform the original non-linear classification problem into a linear classification problem in high-dimensional space, further alleviating overfitting and improving the recognition accuracy of plant diseases and pests. Ablation experiments further demonstrate that the fused structure can effectively alleviate overfitting and improve recognition accuracy. The experimental recognition results for typical plant diseases and pests show that the proposed EResNet-SVM model has 99.30% test accuracy for eight conditions (seven plant diseases and one normal), which is 5.90% higher than the original ResNet18. Compared with the classic AlexNet, GoogLeNet, Xception, SqueezeNet, and DenseNet201 models, the accuracy of the EResNet-SVM model is improved by 5.10%, 7.00%, 8.10%, 6.20%, and 1.90%, respectively. The testing accuracy of the EResNet-SVM model for six insect pests is 100%, which is 3.90% higher than that of the original ResNet18 model.
    CONCLUSIONS: This research provides not only a useful reference for alleviating the overfitting problem in deep learning, but also theoretical and technical support for the intelligent detection and control of plant diseases and pests. © 2024 Society of Chemical Industry.
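The final SVM stage described in the RESULTS can be illustrated in miniature: features from a CNN backbone go to a linear max-margin classifier instead of a softmax head. The toy sketch below trains a linear SVM by hinge-loss subgradient descent on random separable "features"; it is a stand-in for the idea behind the EResNet-SVM fusion, not the paper's actual implementation:

```python
import numpy as np

def train_linear_svm(X, y, lr=0.01, lam=0.001, epochs=200):
    """Subgradient descent on the regularized hinge loss; labels in {-1, +1}."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) < 1:        # point violates the margin
                w += lr * (yi * xi - lam * w)
                b += lr * yi
            else:                            # only shrink w (regularization)
                w -= lr * lam * w
    return w, b

# Two well-separated clusters stand in for CNN feature vectors.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 0.5, (20, 2)), rng.normal(2, 0.5, (20, 2))])
y = np.array([-1] * 20 + [1] * 20)

w, b = train_linear_svm(X, y)
preds = np.sign(X @ w + b)
print((preds == y).mean())  # fraction correctly classified
```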

  • Article type: Journal Article
    Among the slowest steps in the digitization of natural history collections is converting imaged labels into digital text. We present here a working solution to overcome this long-recognized efficiency bottleneck that leverages synergies between community science efforts and machine learning approaches.
    We present two new semi-automated services. The first detects and classifies typewritten, handwritten, or mixed labels from herbarium sheets. The second uses a workflow tuned for specimen labels to transcribe text using optical character recognition (OCR). The label finder and classifier were built via human-in-the-loop processes that utilize the community science Notes from Nature platform to develop training and validation data sets to feed into a machine learning pipeline.
    Our results showcase a >93% success rate for finding and classifying main labels. The OCR pipeline optimizes pre-processing, multiple OCR engines, and post-processing steps, including an alignment approach borrowed from molecular systematics. This pipeline yields >4-fold reductions in errors compared to off-the-shelf open-source solutions. The OCR workflow also allows human validation using a custom Notes from Nature tool.
    Our work showcases a usable set of tools for herbarium digitization, including a custom-built web application that is freely accessible. Further work to better integrate these services into existing toolkits can support broad community use.
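The align-and-vote idea behind the multi-engine OCR step can be illustrated with a per-character majority vote across engine outputs. The example assumes the outputs have already been aligned to equal length; the real pipeline uses an MSA-style alignment borrowed from molecular systematics, and the engine strings here are invented:

```python
from collections import Counter

def consensus(outputs):
    """Per-character majority vote across equal-length OCR outputs."""
    assert len({len(o) for o in outputs}) == 1, "align outputs first"
    return "".join(
        Counter(chars).most_common(1)[0][0] for chars in zip(*outputs)
    )

# Three hypothetical engines, each making a different single-character error.
engines = ["Herbarium of Utah", "Herbarlum of Utah", "Herbarium of Utan"]
print(consensus(engines))  # Herbarium of Utah
```

Because each engine tends to err in different places, the vote recovers the correct text even though no single engine read the label perfectly.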

  • Article type: Journal Article
    This paper is on the autonomous detection of humans in off-limits mountains, where a human rarely exists and human detection is thus an extremely rare event. Due to advances in artificial intelligence, object detection-classification algorithms based on a Convolutional Neural Network (CNN) can be used for this application. However, in off-limits mountains there should generally be no person, so it is not desirable to run object detection-classification algorithms continuously, since they are computationally heavy. This paper addresses a time-efficient human detector system based on both motion detection and object classification. The proposed scheme runs a motion detection algorithm from time to time. In the camera image, we define a feasible human space where a human can appear. Once motion is detected inside the feasible human space, object classification is enabled only inside the bounding box where the motion was detected. Since motion detection inside the feasible human space runs much faster than an object detection-classification method, the proposed approach is suitable for real-time human detection with low computational loads. As far as we know, no paper in the literature has used the feasible human space as ours does. The advantages of our human detector system are verified experimentally by comparing it with other state-of-the-art object detection-classification algorithms (the HOG detector, YOLOv7, and YOLOv7-tiny). This paper demonstrates that the accuracy of the proposed human detector system is comparable to that of other state-of-the-art algorithms while outperforming them in computational speed. Our experiments show that, in environments with no humans, the proposed human detector runs 62 times faster than the YOLOv7 method while showing comparable accuracy.
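The gating logic described above can be sketched in a few lines: cheap frame differencing runs inside the feasible human space, and the heavy classifier is invoked only when that check fires. The ROI coordinates and threshold below are made-up illustrations, not values from the paper:

```python
import numpy as np

# Region of the image where a human can plausibly appear, plus a motion
# threshold -- both are illustrative assumptions.
FEASIBLE_ROI = (slice(40, 80), slice(20, 100))  # rows, cols
MOTION_THRESHOLD = 10.0

def motion_in_feasible_space(prev_frame, frame):
    """True when mean absolute frame difference inside the ROI is large."""
    diff = np.abs(frame.astype(float) - prev_frame.astype(float))
    return diff[FEASIBLE_ROI].mean() > MOTION_THRESHOLD

prev = np.zeros((120, 160), dtype=np.uint8)
still = prev.copy()
moving = prev.copy()
moving[50:70, 40:80] = 255  # a bright blob appears inside the ROI

print(motion_in_feasible_space(prev, still))   # no motion -> skip classifier
print(motion_in_feasible_space(prev, moving))  # motion -> run classification
```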

  • Article type: Journal Article
    Hydroponic lettuce is prone to pest and disease problems after transplantation. Manually identifying the current growth status of each hydroponic lettuce not only consumes time and is prone to errors but also fails to meet the requirements of high-quality and efficient lettuce cultivation. In response to this issue, this paper proposes a method called YOLO-EfficientNet for identifying the growth status of hydroponic lettuce. Firstly, the video data of hydroponic lettuce were processed to obtain individual frame images, and 2,240 of these frames were selected as image dataset A. Secondly, the YOLO-v8n object detection model was trained using image dataset A to detect the position of each hydroponic lettuce in the video data. After selecting the targets based on the predicted bounding boxes, 12,000 individual lettuce images were obtained by cropping, which served as image dataset B. Finally, the EfficientNet-v2s object classification model was trained using image dataset B to identify three growth statuses (Healthy, Diseases, and Pests) of hydroponic lettuce. The results showed that, after training on image dataset A, the YOLO-v8n model's accuracy and recall were consistently around 99%. After training on image dataset B, the EfficientNet-v2s model achieved excellent scores of 95.78 for Val-acc, 94.68 for Test-acc, 96.02 for Recall, 96.32 for Precision, and 96.18 for F1-score. Thus, the method proposed in this paper has potential in the agricultural application of identifying and classifying the growth status of hydroponic lettuce.
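The hand-off between the two stages, detector boxes in and classifier crops out, can be sketched as follows (the (x1, y1, x2, y2) pixel box format is an assumption for the sketch; detection models emit several box formats):

```python
import numpy as np

def crop_detections(frame, boxes):
    """Cut out each detected lettuce so a classifier can label the crop."""
    return [frame[y1:y2, x1:x2] for (x1, y1, x2, y2) in boxes]

# A blank stand-in frame and two hypothetical detector boxes.
frame = np.zeros((480, 640, 3), dtype=np.uint8)
boxes = [(10, 20, 110, 140), (300, 50, 380, 150)]

crops = crop_detections(frame, boxes)
print([c.shape for c in crops])  # [(120, 100, 3), (100, 80, 3)]
```

Each crop would then be resized to the classifier's input size and labeled independently, which is how one detection dataset (A) yields a much larger classification dataset (B).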

  • Article type: Journal Article
    Orchard monitoring is a vital direction of scientific research and practical application for increasing fruit production in ecological conditions. Recently, due to the development of technology and the decrease in equipment cost, the use of unmanned aerial vehicles and artificial intelligence algorithms for image acquisition and processing has achieved tremendous progress in orchard monitoring. This paper highlights the new research trends in orchard monitoring, emphasizing neural networks, unmanned aerial vehicles (UAVs), and various concrete applications. For this purpose, papers on complex topics obtained by combining keywords from the field were selected and analyzed. In particular, the review considered papers from 2017-2022 on the use of neural networks (as an important exponent of artificial intelligence in image processing and understanding) and UAVs in orchard monitoring and production evaluation applications. Due to their complexity, the characteristics of UAV trajectories and flights in the orchard area are highlighted. The structure and implementations of the latest neural network systems used in such applications, the databases, the software, and the obtained performances are systematically analyzed. To offer suggestions to researchers and end users, the use of these new concepts and their implementations was surveyed in concrete applications, such as (a) identification and segmentation of orchards, trees, and crowns; (b) detection of tree diseases, harmful insects, and pests; (c) evaluation of fruit production; and (d) evaluation of development conditions. Finally, to show the necessity of this review, a comparison is made with review articles on related themes.

  • Article type: Journal Article
    Simultaneous localization and mapping (SLAM) technology is key to robot autonomous navigation. Most visual SLAM (VSLAM) algorithms for dynamic environments cannot achieve sufficient positioning accuracy and real-time performance simultaneously, and when the dynamic object proportion is too high, the VSLAM algorithm will collapse. To solve these problems, this paper proposes an indoor dynamic VSLAM algorithm called YDD-SLAM based on ORB-SLAM3, which introduces the YOLOv5 object detection algorithm and integrates depth information. Firstly, the objects detected by YOLOv5 are divided into eight subcategories according to their motion characteristics and depth values. Secondly, the depth ranges of dynamic objects and potentially dynamic objects in the moving state in the scene are calculated. Simultaneously, the depth value of each feature point inside a detection box is compared with the calculated depth range to determine whether the point is a dynamic feature point; if it is, the feature point is eliminated. Further, multiple feature point optimization strategies were developed for VSLAM in dynamic environments. A public dataset and an actual dynamic scenario were used for testing. The accuracy of the proposed algorithm was significantly improved compared to that of ORB-SLAM3. This work provides a theoretical foundation for the practical application of a dynamic VSLAM algorithm.
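The depth test that eliminates dynamic feature points reduces to a range check: a feature point inside a detected object's box is discarded when its depth falls within that object's computed depth range (it lies on the object rather than on the background behind it). The values below are illustrative, not from the paper:

```python
def is_dynamic_feature(point_depth, object_depth_range):
    """A feature point whose depth lies inside the detected object's depth
    range belongs to the (potentially moving) object and should be dropped."""
    lo, hi = object_depth_range
    return lo <= point_depth <= hi

person_depth_range = (2.0, 2.6)  # metres, hypothetical, from a detection box
print(is_dynamic_feature(2.3, person_depth_range))  # True  -> eliminate point
print(is_dynamic_feature(4.8, person_depth_range))  # False -> keep (background)
```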

  • Article type: Journal Article
    The presented data include two datasets named RealSAR-RAW and RealSAR-IMG. The first contains unprocessed (raw) radar data obtained using a Ground-Based Synthetic Aperture Radar (GBSAR), while the second contains images reconstructed by applying the Omega-K algorithm to the raw data from the first set. The GBSAR system moves the radar sensor along a track to virtually extend (synthesize) the antenna aperture and provides imaging data of the area in front of the system. The sensor used was a Frequency Modulated Continuous Wave (FMCW) radar with a central frequency of 24 GHz and a 700 MHz bandwidth, which in our case covered the observed scene in 30 steps with a 1 cm step size. The measured (recorded) scenes were made of combinations of three test objects (bottles) made of different materials (aluminum, glass, and plastic) in different positions. The aim was to develop a small dataset of GBSAR data useful for classification applications focused on distinguishing different materials from sparse radar data.
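The quoted 700 MHz bandwidth fixes the radar's theoretical range resolution through the standard FMCW relation ΔR = c / (2B); the quick check below follows from that textbook formula and is not part of the dataset itself:

```python
C = 299_792_458.0  # speed of light, m/s

def range_resolution(bandwidth_hz):
    """Theoretical FMCW range resolution: dR = c / (2 * B)."""
    return C / (2 * bandwidth_hz)

# 700 MHz bandwidth, as quoted for the 24 GHz sensor above.
print(round(range_resolution(700e6), 3))  # ~0.214 m per range bin
```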
