Keywords: autoregressive networks, diffusion-generated models, flow-based models, sampling, spin glasses

Source: DOI:10.1073/pnas.2311810121

Abstract:
Recent years have witnessed the development of powerful generative models based on flows, diffusion, or autoregressive neural networks, achieving remarkable success in generating data from examples, with applications in a broad range of areas. A theoretical analysis of the performance of these methods, and an understanding of their limitations, remain challenging, however. In this paper, we take a step in this direction by analyzing the efficiency of sampling by these methods on a class of problems with a known probability distribution and comparing it with the sampling performance of more traditional methods such as Monte Carlo Markov chains and Langevin dynamics. We focus on a class of probability distributions widely studied in the statistical physics of disordered systems that relate to spin glasses, statistical inference, and constraint satisfaction problems. We leverage the fact that sampling via flow-based, diffusion-based, or autoregressive network methods can be equivalently mapped to the analysis of a Bayes-optimal denoising of a modified probability measure. Our findings demonstrate that these methods encounter difficulties in sampling stemming from the presence of a first-order phase transition along the algorithm's denoising path. Our conclusions go both ways: We identify regions of parameters where these methods are unable to sample efficiently, while efficient sampling is possible using standard Monte Carlo or Langevin approaches. We also identify regions where the opposite happens: standard approaches are inefficient while the discussed generative methods work well.
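
The mapping between diffusion-style sampling and Bayes-optimal denoising can be illustrated on a toy problem, far simpler than the spin-glass models of the paper. The sketch below is an assumption-laden illustration, not the authors' code: the target is a single biased spin x in {-1, +1} with P(x = +1) = p, observed through y_t = t x + B_t with B_t a Brownian motion. For this target the posterior mean has the closed form E[x | y_t] = tanh(y_t + h), with h = (1/2) log(p / (1 - p)), and integrating dy = E[x | y] dt + dB from y = 0 yields a sample through the sign of y at large t. The function name `sample_by_denoising` and all parameter values are hypothetical. This toy has no phase transition along the denoising path, so it shows only the mechanism, not the failure mode discussed in the abstract.

```python
import numpy as np


def sample_by_denoising(p_plus, n_samples=20000, t_max=20.0, n_steps=2000, seed=0):
    """Sample x in {-1, +1} with P(x = +1) = p_plus via a denoising SDE.

    Toy assumption: the exact Bayes-optimal denoiser is known,
    E[x | y_t = y] = tanh(y + h), so no neural network is trained.
    """
    rng = np.random.default_rng(seed)
    # Prior "field" h such that tanh(h) = 2 * p_plus - 1.
    h = 0.5 * np.log(p_plus / (1.0 - p_plus))
    dt = t_max / n_steps
    y = np.zeros(n_samples)  # every trajectory starts at y_0 = 0
    for _ in range(n_steps):
        drift = np.tanh(y + h)  # posterior mean of x given current y
        # Euler-Maruyama step for dy = E[x | y] dt + dB.
        y += drift * dt + np.sqrt(dt) * rng.standard_normal(n_samples)
    # At large t, y_t / t concentrates on the sampled spin.
    return np.where(y >= 0, 1, -1)


if __name__ == "__main__":
    samples = sample_by_denoising(0.7)
    print("empirical P(x = +1):", np.mean(samples == 1))
```

In the setting of the abstract the denoiser is not available in closed form and must be learned or approximated, which is where a first-order phase transition along the denoising path can make the sampler fail.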