Adaptive neighborhood rough set model for hybrid data processing: a case study on Parkinson's disease behavioral analysis

Abstract:

Extracting knowledge from hybrid data, which comprises both categorical and numerical attributes, is challenging because it is inherently difficult to preserve information and practical meaning when one data type is converted to the other. Hybrid data processing methods that combine complementary rough sets have emerged as a promising approach for handling this uncertainty. However, selecting an appropriate model and using it effectively in data mining requires a thorough qualitative and quantitative comparison of existing hybrid data processing models. This research contributes to the analysis of hybrid data processing models based on neighborhood rough sets by investigating the inherent relationships among these models. We propose a generic neighborhood rough set-based hybrid model designed specifically for hybrid data. It improves the efficacy of the data mining process without resorting to discretization, and so avoids the loss of information or practical meaning in the dataset. The proposed scheme dynamically adapts the threshold value of the neighborhood approximation space to the characteristics of the given dataset, maintaining performance without sacrificing accuracy. To evaluate the scheme, we develop a testbed tailored to Parkinson's patients, a domain where hybrid data processing is particularly relevant. The experimental results show that the proposed scheme consistently outperforms existing schemes in adaptively handling both numerical and categorical data, achieving 95% accuracy on the Parkinson's dataset. Overall, this research advances hybrid data processing techniques by providing a robust and adaptive solution to the challenges of handling hybrid data, particularly in the context of Parkinson's disease analysis.
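As a rough illustration of the idea described in the abstract, the sketch below builds neighborhoods over hybrid records without discretization and derives lower and upper approximations of a target class. It is a minimal sketch under stated assumptions, not the authors' method: the abstract does not specify the distance or the threshold rule, so the HEOM-style distance, the `adaptive_delta` rule (mean standard deviation of the normalized numerical attributes divided by a hypothetical parameter `lam`), the attribute meanings (a tremor score and a gait category), and the toy records are all invented for illustration.

```python
from statistics import pstdev


def hybrid_distance(x, y, numeric_idx, ranges):
    """HEOM-style distance (an assumption): range-normalized gap for
    numerical attributes, 0/1 mismatch for categorical attributes."""
    total = 0.0
    for j, (a, b) in enumerate(zip(x, y)):
        if j in numeric_idx:
            d = abs(a - b) / ranges[j] if ranges[j] else 0.0
        else:
            d = 0.0 if a == b else 1.0
        total += d * d
    return total ** 0.5


def adaptive_delta(rows, numeric_idx, ranges, lam=2.0):
    """Hypothetical data-driven threshold: mean standard deviation of the
    range-normalized numerical attributes, divided by lam."""
    stds = [pstdev([r[j] / ranges[j] for r in rows])
            for j in sorted(numeric_idx) if ranges[j]]
    return (sum(stds) / len(stds)) / lam if stds else 0.0


def approximations(rows, labels, target, numeric_idx, lam=2.0):
    """Return (delta, lower, upper) for the target class as sets of row indices."""
    ranges = {j: max(r[j] for r in rows) - min(r[j] for r in rows)
              for j in numeric_idx}
    delta = adaptive_delta(rows, numeric_idx, ranges, lam)
    lower, upper = set(), set()
    for i, x in enumerate(rows):
        # Neighborhood of x: all records within distance delta of x.
        neigh_labels = {labels[k] for k, y in enumerate(rows)
                        if hybrid_distance(x, y, numeric_idx, ranges) <= delta}
        if neigh_labels == {target}:
            lower.add(i)   # every neighbor belongs to the target class
        if target in neigh_labels:
            upper.add(i)   # at least one neighbor belongs to the target class
    return delta, lower, upper


# Toy records (invented): (tremor score, gait category)
rows = [(0.10, "normal"), (0.15, "normal"), (0.50, "shuffling"),
        (0.80, "shuffling"), (0.85, "shuffling"), (0.55, "shuffling")]
labels = ["healthy", "healthy", "healthy", "pd", "pd", "pd"]

delta, lower, upper = approximations(rows, labels, "pd", numeric_idx={0})
print(sorted(lower))  # [3, 4]
print(sorted(upper))  # [2, 3, 4, 5]
```

Records 2 and 5 share a gait category and have close tremor scores but different labels, so they fall in the boundary region (upper minus lower). Because the threshold is computed from the data rather than fixed by hand, rescaling the numerical attribute leaves the approximations unchanged in this sketch.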
