Optimal classification and generalized prevalence estimates for diagnostic settings with more than two classes

Abstract:

An accurate multiclass classification strategy is crucial to interpreting antibody tests. However, traditional methods based on confidence intervals or receiver operating characteristics lack clear extensions to settings with more than two classes. We address this problem by developing a multiclass classification based on probabilistic modeling and optimal decision theory that minimizes the convex combination of false classification rates. The classification process is challenging when the relative fraction of the population in each class, or generalized prevalence, is unknown. Thus, we also develop a method for estimating the generalized prevalence of test data that is independent of classification of the test data. We validate our approach on serological data with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) naïve, previously infected, and vaccinated classes. Synthetic data are used to demonstrate that (i) prevalence estimates are unbiased and converge to true values and (ii) our procedure applies to arbitrary measurement dimensions. In contrast to the binary problem, the multiclass setting offers wide-reaching utility as the most general framework and provides new insight into prevalence estimation best practices.
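The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a one-dimensional measurement, Gaussian conditional densities with made-up parameters (`CLASS_PARAMS`), and hypothetical function names. `classify` assigns a sample to the class with the largest prevalence-weighted density, which is the standard decision rule for minimizing the prevalence-weighted sum of false classification rates. `estimate_prevalence` never classifies individual samples: it counts how many samples fall in fixed bins of the measurement space and solves a linear system relating those counts to the class fractions.

```python
import math
from bisect import bisect_left

# Hypothetical (mean, std) of a 1-D antibody measurement for each class:
# naive, previously infected, vaccinated. These values are assumptions.
CLASS_PARAMS = [(0.0, 1.0), (3.0, 1.0), (6.0, 1.0)]


def gaussian_pdf(x, mean, std):
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


def gaussian_cdf(x, mean, std):
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2))))


def classify(x, prevalences, params=CLASS_PARAMS):
    """Return the index of the class maximizing prevalence * density at x."""
    scores = [q * gaussian_pdf(x, m, s) for q, (m, s) in zip(prevalences, params)]
    return max(range(len(scores)), key=scores.__getitem__)


def bin_matrix(edges, params=CLASS_PARAMS):
    """P[i][j] = probability that a class-j sample lands in bin i.

    Bins are (-inf, e0], (e0, e1], ..., (e_last, inf) for sorted edges.
    """
    bounds = [-math.inf, *edges, math.inf]
    return [
        [gaussian_cdf(hi, m, s) - gaussian_cdf(lo, m, s) for (m, s) in params]
        for lo, hi in zip(bounds[:-1], bounds[1:])
    ]


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def estimate_prevalence(samples, edges, params=CLASS_PARAMS):
    """Estimate class fractions from bin counts alone (no per-sample labels).

    Requires as many bins as classes so that the system is square.
    """
    counts = [0] * (len(edges) + 1)
    for x in samples:
        counts[bisect_left(edges, x)] += 1
    fractions = [c / len(samples) for c in counts]
    return solve_linear(bin_matrix(edges, params), fractions)
```

In this sketch the estimated prevalences would typically be computed first with `estimate_prevalence(samples, [1.5, 4.5])` and then passed to `classify`. Because the linear solve is unconstrained, a small sample can yield estimates slightly outside the interval [0, 1]; a practical version would need to handle that case.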
