Keywords: covariate shift; model evaluation; receiver operating characteristic curve; risk prediction; semisupervised learning; transfer learning

MeSH: Humans; Supervised Machine Learning; Machine Learning; ROC Curve; Research Design; Bias

Source: DOI:10.1093/biomtc/ujae002

Abstract:
In many modern machine learning applications, changes in covariate distributions and the difficulty of acquiring outcome information pose challenges to robust model training and evaluation. Numerous transfer learning methods have been developed to robustly adapt the model itself to unlabeled target populations using existing labeled data from a source population. However, there is a paucity of literature on transferring the performance metrics of a trained model, especially receiver operating characteristic (ROC) parameters. In this paper, we aim to evaluate the performance of a trained binary classifier on an unlabeled target population based on ROC analysis. We propose Semisupervised Transfer lEarning of Accuracy Measures (STEAM), an efficient three-step estimation procedure that employs (1) double-index modeling to construct calibrated density ratio weights and (2) robust imputation to leverage the large amount of unlabeled data and improve estimation efficiency. We establish the consistency and asymptotic normality of the proposed estimator when either the density ratio model or the outcome model is correctly specified. We also use cross-validation to correct for potential overfitting bias in the estimators in finite samples. We compare the proposed estimators to existing methods and show reductions in bias and gains in efficiency through simulations. We illustrate the practical utility of the proposed method by evaluating the prediction performance of a phenotyping model for rheumatoid arthritis (RA) on a temporally evolving electronic health record (EHR) cohort.
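
The density ratio weighting idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the STEAM procedure: it omits the double-index calibration, the imputation step, and the cross-validation correction, and shows only how weights w(x) = p_target(x) / p_source(x) reweight labeled source data when computing ROC quantities. All function names are hypothetical, and the weights are assumed to come from some separately fitted source-versus-target classifier.

```python
def density_ratio_from_prob(p_target, n_source, n_target):
    """Turn a classifier's estimate of P(sample is from target | x) into
    a density ratio estimate, using the standard odds identity
    w(x) = (n_source / n_target) * p / (1 - p).
    Assumption: p_target comes from a model fitted elsewhere."""
    return (n_source / n_target) * p_target / (1.0 - p_target)


def weighted_rates(scores, labels, weights, threshold):
    """Weighted true and false positive rates at a single threshold.
    A sample is classified positive when its score >= threshold."""
    pos_mass = sum(w for y, w in zip(labels, weights) if y == 1)
    neg_mass = sum(w for y, w in zip(labels, weights) if y == 0)
    tp_mass = sum(w for s, y, w in zip(scores, labels, weights)
                  if y == 1 and s >= threshold)
    fp_mass = sum(w for s, y, w in zip(scores, labels, weights)
                  if y == 0 and s >= threshold)
    return tp_mass / pos_mass, fp_mass / neg_mass


def weighted_auc(scores, labels, weights):
    """Weighted AUC: the weighted share of (positive, negative) pairs in
    which the positive has the higher score; ties count one half."""
    pos = [(s, w) for s, y, w in zip(scores, labels, weights) if y == 1]
    neg = [(s, w) for s, y, w in zip(scores, labels, weights) if y == 0]
    num = 0.0
    for s_pos, w_pos in pos:
        for s_neg, w_neg in neg:
            if s_pos > s_neg:
                num += w_pos * w_neg
            elif s_pos == s_neg:
                num += 0.5 * w_pos * w_neg
    den = sum(w for _, w in pos) * sum(w for _, w in neg)
    return num / den


if __name__ == "__main__":
    # Toy labeled source sample (made-up numbers for illustration only).
    scores = [0.9, 0.8, 0.3, 0.6, 0.2]
    labels = [1, 0, 1, 0, 0]
    uniform = [1.0] * 5                    # no shift adjustment
    shifted = [1.0, 1.0, 3.0, 1.0, 1.0]    # hypothetical density ratios
    print("source AUC:", weighted_auc(scores, labels, uniform))
    print("reweighted AUC:", weighted_auc(scores, labels, shifted))
    print("reweighted (TPR, FPR) at 0.5:",
          weighted_rates(scores, labels, shifted, 0.5))
```

Upweighting the low-scoring positive sample lowers the estimated AUC, which is how a shift toward harder cases in the target population would show up. A weighting-only estimator like this relies on the density ratio model being correct, whereas the abstract states that STEAM remains consistent if either the density ratio model or the outcome model is correctly specified.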