MeSH: Glycine max; Agriculture; Algorithms; Cluster Analysis; Seeds; Tetrazolium Salts

Source: DOI:10.1371/journal.pone.0285566   PDF(Pubmed)

Abstract:
Soy is the main product of Brazilian agriculture and the fourth most cultivated bean globally. Because soy cultivation is expanding and the market is large, guaranteeing product quality is indispensable for enterprises to stay competitive. Industries perform vigor tests to gather information and evaluate the quality of soy planting. The tetrazolium test, for example, provides information about moisture damage, bedbug damage, and mechanical damage. However, the cause of the damage and its severity are verified by an analyst, seed by seed. Because this work is massive and exhausting, it is prone to mistakes. Proposals involving different supervised learning approaches, including active learning strategies, have already been applied with significant results. This paper instead analyzes the performance of unsupervised techniques for classifying soybeans. An extensive experimental evaluation was performed, considering nine clustering algorithms (partitional, hierarchical, and density-based) applied to five image datasets of soybean seeds submitted to the tetrazolium test, covering different types of damage and/or damage levels. To describe the images, we considered 18 traditional feature extractors. For validation, we also considered four metrics (accuracy and the Fowlkes-Mallows, Davies-Bouldin, and Calinski-Harabasz indices) and two dimensionality reduction techniques (principal component analysis and t-distributed stochastic neighbor embedding). The results make it possible to identify descriptors and clustering algorithms that can be used as preprocessing in other learning processes, accelerating and improving classification for key agricultural problems.
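To make two of the external validation metrics named in the abstract concrete, the following is a minimal sketch in pure Python. It is not the authors' code. The abstract does not say how clustering accuracy is computed, so the best one-to-one matching between clusters and classes used here is an assumption; the function names and the toy damage labels are hypothetical.

```python
from collections import Counter
from itertools import permutations
from math import comb, sqrt


def fowlkes_mallows(labels_true, labels_pred):
    """Fowlkes-Mallows index: TP / sqrt((TP + FP) * (TP + FN)), counted over sample pairs."""
    # TP: pairs placed together both in the ground truth and in the clustering.
    pairs_both = sum(comb(n, 2) for n in Counter(zip(labels_true, labels_pred)).values())
    # TP + FN: pairs that share a ground-truth class.
    pairs_true = sum(comb(n, 2) for n in Counter(labels_true).values())
    # TP + FP: pairs that share a predicted cluster.
    pairs_pred = sum(comb(n, 2) for n in Counter(labels_pred).values())
    if pairs_true == 0 or pairs_pred == 0:
        return 0.0
    return pairs_both / sqrt(pairs_true * pairs_pred)


def clustering_accuracy(labels_true, labels_pred):
    """Accuracy under the best one-to-one cluster-to-class matching (assumed definition).

    Brute force over permutations, so only suitable for a handful of clusters.
    """
    clusters = sorted(set(labels_pred))
    classes = sorted(set(labels_true))
    # Pad with None so surplus clusters can remain unmatched.
    classes += [None] * max(0, len(clusters) - len(classes))
    contingency = Counter(zip(labels_pred, labels_true))
    best = 0
    for assignment in permutations(classes, len(clusters)):
        hits = sum(contingency[(c, k)] for c, k in zip(clusters, assignment))
        best = max(best, hits)
    return best / len(labels_true)


# Hypothetical toy data: six seeds, two damage classes, two clusters.
true = ["moisture", "moisture", "moisture", "mechanical", "mechanical", "mechanical"]
pred = [0, 0, 1, 1, 1, 1]

fm = fowlkes_mallows(true, pred)        # 4 / sqrt(6 * 7)
acc = clustering_accuracy(true, pred)   # 5 of 6 seeds matched
print(f"Fowlkes-Mallows: {fm:.4f}, accuracy: {acc:.4f}")
```

Both scores need ground-truth labels. The Davies-Bouldin and Calinski-Harabasz indices, by contrast, are internal measures computed from the feature vectors and cluster assignments alone.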