Deep learning techniques have recently found application in biodiversity research. Mayflies (Ephemeroptera), stoneflies (Plecoptera) and caddisflies (Trichoptera), often abbreviated as EPT, are frequently used for freshwater biomonitoring because of their abundance and sensitivity to environmental change. However, morphological identification of EPT species is a fundamental but challenging task: it is extremely time-consuming and costly, and it often leads to misidentifications or to datasets with low taxonomic resolution. Here, we investigated the application of deep learning to increase the efficiency and taxonomic resolution of biomonitoring programs. Our database contains 90 EPT taxa (genus or species level), with the number of images per category ranging from 21 to 300 (16,650 in total). After training, our CNN (Convolutional Neural Network) model automatically classified these taxa into their appropriate taxonomic categories with an accuracy of 98.7 %. The model achieved a perfect classification rate of 100 % for 68 of the taxa in our dataset. Classification accuracy was also noteworthy for morphologically closely related taxa within the training data (e.g., species of the genera Baetis, Hydropsyche and Perla). Gradient-weighted Class Activation Mapping (Grad-CAM) visualized the morphological features responsible for the CNN model's classification of the species. Within Ephemeroptera, the head was the most important feature, while the thorax and abdomen were equally important for the classification of Plecoptera taxa. For the order Trichoptera, the head and thorax were almost equally important. Our database is recognized as the most extensive aquatic insect database, notably distinguished by its wealth of included categories (taxa). Our approach can help solve long-standing challenges in biodiversity research and address pressing issues in monitoring programs by saving time in sample identification.
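
The two headline figures above (an overall accuracy, and the number of taxa classified perfectly) are different summaries of the same predictions. The following minimal sketch shows one way they could be derived from true and predicted labels. It is an illustration only and is not taken from the study: the function name `per_class_accuracy` and the ten toy labels are hypothetical, and the genus names are reused from the text purely as placeholder class labels.

```python
from collections import Counter


def per_class_accuracy(y_true, y_pred):
    """Return (overall accuracy, {label: share of that label's images classified correctly}).

    Hypothetical helper for illustration; not the evaluation code of the study.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one labelled image is required")

    # Number of test images per true class.
    totals = Counter(y_true)
    # Number of correctly classified images per true class.
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    per_class = {label: correct[label] / n for label, n in totals.items()}
    overall = sum(correct.values()) / len(y_true)
    return overall, per_class


# Toy data (assumed, not real results): one Baetis image is misclassified.
y_true = ["Baetis"] * 4 + ["Hydropsyche"] * 3 + ["Perla"] * 3
y_pred = ["Baetis"] * 3 + ["Hydropsyche"] + ["Hydropsyche"] * 3 + ["Perla"] * 3

overall, per_class = per_class_accuracy(y_true, y_pred)
perfect = sorted(label for label, acc in per_class.items() if acc == 1.0)

print(f"overall accuracy: {overall:.1%}")
print(f"perfectly classified taxa: {len(perfect)} of {len(per_class)}")
```

Reporting both values is useful because a high overall accuracy can hide a few poorly classified taxa, especially when the number of images per class varies as widely as it does here (21 to 300).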