curse of dimensionality

  • Article type: Journal Article
    We reformulate and reframe a series of increasingly complex parametric statistical topics into a framework of response-vs.-covariate (Re-Co) dynamics that is described without any explicit functional structure. We then resolve the data analysis tasks of these topics by discovering the major factors underlying such Re-Co dynamics, making use only of the categorical nature of the data. The major factor selection protocol at the heart of the Categorical Exploratory Data Analysis (CEDA) paradigm is illustrated and carried out by employing Shannon's conditional entropy (CE) and mutual information (I[Re;Co]) as the two key information-theoretic measurements. Through the process of evaluating these two entropy-based measurements and resolving the statistical tasks, we acquire several computational guidelines for carrying out the major factor selection protocol in a do-and-learn fashion. Specifically, practical guidelines are established for evaluating CE and I[Re;Co] in accordance with the criterion called [C1:confirmable]. Following the [C1:confirmable] criterion, we make no attempt to obtain consistent estimates of these theoretical information measurements. All evaluations are carried out on a contingency table platform, on which the practical guidelines also provide ways of lessening the effects of the curse of dimensionality. We explicitly work through six examples of Re-Co dynamics, and within each of them several widely extended scenarios are also explored and discussed.
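
As a rough illustration of the two measurements named in the abstract, the sketch below evaluates Shannon's conditional entropy H(Re | Co) and the mutual information I[Re;Co] = H(Re) - H(Re | Co) directly from the cell counts of a contingency table. It is a plain plug-in calculation under stated assumptions, not the paper's selection protocol and not an implementation of the [C1:confirmable] criterion. The function names, the row/column layout (rows are covariate categories, columns are response categories), and the example counts are all hypothetical.

```python
from math import log


def entropy(counts):
    """Shannon entropy (in nats) of the empirical distribution of `counts`."""
    total = sum(counts)
    # Cells with zero count contribute nothing (0 * log 0 is taken as 0).
    return -sum((c / total) * log(c / total) for c in counts if c > 0)


def conditional_entropy(table):
    """H(Re | Co): row-weighted average of the entropy within each row.

    Assumed layout: each row is one covariate category, each column one
    response category, and each cell holds an observed count.
    """
    n = sum(sum(row) for row in table)
    return sum((sum(row) / n) * entropy(row) for row in table if sum(row) > 0)


def mutual_information(table):
    """I[Re;Co] = H(Re) - H(Re | Co), evaluated on the same table."""
    response_totals = [sum(col) for col in zip(*table)]
    return entropy(response_totals) - conditional_entropy(table)


# Hypothetical 3x2 table: 3 covariate categories, 2 response categories.
example = [[30, 10], [20, 20], [5, 35]]
ce = conditional_entropy(example)
mi = mutual_information(example)
print(f"H(Re|Co) = {ce:.4f} nats, I[Re;Co] = {mi:.4f} nats")
```

A lower conditional entropy (equivalently, a higher mutual information) for one covariate than for another suggests that it accounts for more of the response's uncertainty. Note that when a covariate has many categories relative to the sample size, the table becomes sparse and these plug-in values become unreliable, which is one way the curse of dimensionality shows up on a contingency table.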