Keywords: Bayesian averaging; Decision trees; Markov chain Monte Carlo; Outcome prediction; Predictive posterior distribution; Trauma; Uncertainty calibration

MeSH: Humans; Bayes Theorem; Learning; Decision Making; Uncertainty; Entropy

Source: DOI:10.1016/j.artmed.2023.102634

Abstract:
Decision tree (DT) models provide a transparent approach to predicting patient outcomes within a probabilistic framework. Under certain conditions, averaging over DT models can deliver reliable estimates of predictive posterior probability distributions, which is critically important when predicting an individual patient's outcome. Reliable estimates of the distribution can be obtained within the Bayesian framework using Markov chain Monte Carlo (MCMC) and its Reversible Jump extension, which enables DT models to grow to a reasonable size. Existing MCMC strategies, however, have limited ability to control DT structures and tend to sample overgrown DT models that make unreasonably small partitions, thus deteriorating the uncertainty calibration. This happens because the MCMC explores the DT model parameter space with limited knowledge of the distribution of data partitions. We propose a new adaptive strategy that overcomes this limitation and show that, when predicting trauma outcomes, the number of data partitions can be significantly reduced, so that unnecessary uncertainty in estimating the predictive posterior density is avoided. The proposed and existing strategies are compared in terms of entropy, which, being calculated for the predicted posterior distributions, represents the uncertainty in decisions. In this framework, the proposed method outperformed the existing sampling strategies, efficiently avoiding unnecessary uncertainty in decisions.
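As a rough illustration of the entropy comparison described in the abstract, the sketch below averages class probabilities over a handful of sampled trees and computes the Shannon entropy of the averaged distribution. It is only a minimal sketch under stated assumptions: the per-tree probabilities are made-up numbers, the samples are weighted equally (as is usual for draws from an MCMC chain), and the function names `average_posterior` and `entropy_bits` are hypothetical. The paper's own procedure may differ in detail.

```python
import math


def average_posterior(per_tree_probs):
    """Average class-probability vectors over sampled trees (equal weights)."""
    n_trees = len(per_tree_probs)
    n_classes = len(per_tree_probs[0])
    return [sum(p[c] for p in per_tree_probs) / n_trees for c in range(n_classes)]


def entropy_bits(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


# Hypothetical outputs of three sampled trees for one patient:
# each entry is [P(outcome A), P(outcome B)]. These numbers are invented.
sampled = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3]]

posterior = average_posterior(sampled)  # approximately [0.8, 0.2]
h = entropy_bits(posterior)             # lower entropy = less uncertain decision
```

For a two-class problem the entropy ranges from 0 bits (a certain prediction) to 1 bit (a 50/50 prediction), so a sampling strategy that yields lower entropy at comparable accuracy leaves less uncertainty in the decision.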