Keywords: AGNES; DBSCAN; K-means; inlet water classification; unsupervised learning; water reuse

MeSH: Unsupervised Machine Learning; Bays; Reproducibility of Results; Algorithms; Cluster Analysis

Source: DOI:10.2166/wst.2024.087

Abstract:
The water reuse facilities of industrial parks face the challenge of managing a growing variety of wastewater sources as their inlet water. Typically, the grouping of these inlet streams is designed by engineers with extensive expertise. This paper presents an innovative application of unsupervised learning methods to classify inlet water in Chinese water reuse stations, aiming to reduce reliance on engineer experience. The concept of 'water quality distance' was incorporated into three unsupervised clustering algorithms (K-means, DBSCAN, and AGNES), which were validated through six case studies. Three of the six cases were used to illustrate the feasibility of the unsupervised clustering approach. The results indicated that algorithmic clustering was more stable and performed better than both manual clustering and ChatGPT-based clustering. The remaining three cases were used to demonstrate the reliability of the three clustering algorithms. The findings showed that the AGNES algorithm had the greatest potential for practical application. Across the six cases, the average purity of K-means, DBSCAN, and AGNES was 0.947, 0.852, and 0.955, respectively.
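The purity figures quoted in the abstract can be illustrated with a small sketch. It assumes the commonly used definition of clustering purity (each cluster is credited with its most frequent reference class, and the credited samples are divided by the total sample count); the abstract itself does not give the formula, so the paper's exact computation may differ. The function name `purity` and the example labels are hypothetical and do not come from the paper's data.

```python
from collections import Counter, defaultdict


def purity(cluster_labels, reference_labels):
    """Share of samples that belong to the majority reference class of their cluster."""
    if len(cluster_labels) != len(reference_labels):
        raise ValueError("label sequences must have the same length")
    if len(cluster_labels) == 0:
        raise ValueError("at least one sample is required")

    # Group the reference labels by the cluster each sample was assigned to.
    members = defaultdict(list)
    for cluster, reference in zip(cluster_labels, reference_labels):
        members[cluster].append(reference)

    # For each cluster, count the samples matching its most common reference class.
    majority_total = sum(
        Counter(refs).most_common(1)[0][1] for refs in members.values()
    )
    return majority_total / len(cluster_labels)


# Hypothetical example: six inlet streams in three clusters, one stream misplaced.
predicted = [0, 0, 1, 1, 2, 2]
reference = ["A", "A", "B", "B", "C", "A"]
score = purity(predicted, reference)
print(round(score, 3))
```

A purity of 1.0 means every cluster contains streams from a single reference class. Because purity rises as the number of clusters grows, it is usually read alongside the cluster count rather than on its own.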