Stability analysis

  • Article type: Review
    BACKGROUND: Topic models are a class of unsupervised machine learning models, which facilitate summarization, browsing and retrieval from large unstructured document collections. This study reviews several methods for assessing the quality of unsupervised topic models estimated using non-negative matrix factorization. Techniques for topic model validation have been developed across disparate fields. We synthesize this literature, discuss the advantages and disadvantages of different techniques for topic model validation, and illustrate their usefulness for guiding model selection on a large clinical text corpus.
    DATA: Using a retrospective cohort design, we curated a text corpus containing 382,666 clinical notes collected between 01/01/2017 and 12/31/2020 from primary care electronic medical records in Toronto, Canada.
    METHODS: Several topic model quality metrics have been proposed to assess different aspects of model fit. We explored the following metrics: reconstruction error, topic coherence, rank-biased overlap, Kendall's weighted tau, partition coefficient, partition entropy and the Xie-Beni statistic. Depending on context, cross-validation and/or bootstrap stability analysis were used to estimate these metrics on our corpus.
    RESULTS: Cross-validated reconstruction error favored large topic models (K ≥ 100 topics) on our corpus. Stability analysis using topic coherence and the Xie-Beni statistic also favored large models (K = 100 topics). Rank-biased overlap and Kendall's weighted tau favored small models (K = 5 topics). Few model evaluation metrics suggested mid-sized topic models (25 ≤ K ≤ 75) as being optimal. However, human judgment suggested that mid-sized topic models produced expressive low-dimensional summarizations of the corpus.
    CONCLUSIONS: Topic model quality indices are transparent quantitative tools for guiding model selection and evaluation. Our empirical illustration demonstrated that different topic model quality indices favor models of different complexity; and may not select models aligning with human judgment. This suggests that different metrics capture different aspects of model goodness of fit. A combination of topic model quality indices, coupled with human validation, may be useful in appraising unsupervised topic models.
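    Two of the metrics named in the METHODS section are simple enough to state directly. As a minimal sketch (function names and the truncated RBO form are illustrative choices, not the authors' code), rank-biased overlap and Bezdek's partition coefficient can be written as:

    ```python
    def rank_biased_overlap(s, t, p=0.9):
        """Truncated rank-biased overlap (RBO) between two ranked lists.

        p controls top-weightedness: agreement at depth d is weighted by
        p**(d-1), so disagreement deep in the lists matters less. Under this
        truncated form, identical lists of length D score 1 - p**D.
        """
        depth = min(len(s), len(t))
        score = 0.0
        for d in range(1, depth + 1):
            agreement = len(set(s[:d]) & set(t[:d])) / d
            score += (p ** (d - 1)) * agreement
        return (1 - p) * score


    def partition_coefficient(U):
        """Bezdek's partition coefficient of a fuzzy membership matrix U (n x K).

        Ranges from 1/K (maximally fuzzy assignment) to 1.0 (crisp partition);
        higher values indicate sharper document-topic assignments.
        """
        n = len(U)
        return sum(u ** 2 for row in U for u in row) / n
    ```

    In a bootstrap stability analysis of the kind described above, RBO would compare the top-term rankings of matched topics across refits on resampled corpora, while the partition coefficient would summarize how crisply documents load onto topics.
    
    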
  • Article type: Journal Article
    This paper reviews recent studies on the Particle Swarm Optimization (PSO) algorithm. The review focuses on recent high-impact articles that have analyzed and/or modified PSO algorithms. The paper also identifies potential areas for future study.
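    For readers unfamiliar with the baseline algorithm the review builds on, a minimal PSO sketch follows (parameter values and function names are illustrative assumptions, not drawn from the reviewed articles):

    ```python
    import random


    def pso_minimize(f, dim, n_particles=20, iters=100, bounds=(-5.0, 5.0),
                     w=0.7, c1=1.5, c2=1.5, seed=0):
        """Minimize f over a box using canonical inertia-weight PSO.

        w is the inertia weight; c1 and c2 weight attraction toward each
        particle's personal best and the swarm's global best, respectively.
        """
        rng = random.Random(seed)
        lo, hi = bounds
        pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
        vel = [[0.0] * dim for _ in range(n_particles)]
        pbest = [p[:] for p in pos]            # per-particle best positions
        pbest_val = [f(p) for p in pos]
        g = min(range(n_particles), key=lambda i: pbest_val[i])
        gbest, gbest_val = pbest[g][:], pbest_val[g]
        for _ in range(iters):
            for i in range(n_particles):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    vel[i][d] = (w * vel[i][d]
                                 + c1 * r1 * (pbest[i][d] - pos[i][d])
                                 + c2 * r2 * (gbest[d] - pos[i][d]))
                    pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
                val = f(pos[i])
                if val < pbest_val[i]:
                    pbest[i], pbest_val[i] = pos[i][:], val
                    if val < gbest_val:
                        gbest, gbest_val = pos[i][:], val
        return gbest, gbest_val
    ```

    The modifications surveyed in such reviews typically vary the inertia weight schedule, the neighborhood topology used for the global-best term, or the velocity update itself.
    
    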