{Reference Type}: Journal Article {Title}: Machine Learning and External Validation of the IDENTIFY Risk Calculator for Patients with Haematuria Referred to Secondary Care for Suspected Urinary Tract Cancer. {Author}: Khadhouri S;Hramyka A;Gallagher K;Light A;Ippoliti S;Edison M;Alexander C;Kulkarni M;Zimmermann E;Nathan A;Orecchia L;Banthia R;Piazza P;Mak D;Pyrgidis N;Narayan P;Abad Lopez P;Nawaz F;Tran TT;Claps F;Hogan D;Gomez Rivas J;Alonso S;Chibuzo I;Gutierrez Hidalgo B;Whitburn J;Teoh J;Marcq G;Szostek A;Bondad J;Sountoulides P;Kelsey T;Kasivisvanathan V; {Journal}: Eur Urol Focus {Volume}: 0 {Issue}: 0 {Year}: 2024 Jun 20 {Factor}: 5.952 {DOI}: 10.1016/j.euf.2024.06.004 {Abstract}: BACKGROUND: The IDENTIFY study developed a model to predict urinary tract cancer using patient characteristics from a large, multicentre, international cohort of patients referred with haematuria. In addition to calculating an individual's cancer risk, it proposes thresholds to stratify patients into very-low-risk (<1%), low-risk (1-<5%), intermediate-risk (5-<20%), and high-risk (≥20%) groups.
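The risk thresholds quoted above can be illustrated with a minimal sketch. It assumes the predicted cancer risk is already available as a probability between 0 and 1; the function name `risk_group` is hypothetical, and the sketch does not reproduce the IDENTIFY model itself, only the published cut-offs.

```python
def risk_group(risk: float) -> str:
    """Map a predicted cancer probability (0 to 1) to a risk group.

    Cut-offs follow the abstract: <1%, 1-<5%, 5-<20%, >=20%.
    The function name and string labels are illustrative assumptions.
    """
    if not 0.0 <= risk <= 1.0:
        raise ValueError("risk must be a probability between 0 and 1")
    if risk < 0.01:
        return "very-low-risk"
    if risk < 0.05:
        return "low-risk"
    if risk < 0.20:
        return "intermediate-risk"
    return "high-risk"


if __name__ == "__main__":
    # Hypothetical predicted risks, one per group.
    for example in (0.004, 0.03, 0.12, 0.35):
        print(f"{example:.3f} -> {risk_group(example)}")
```

Each lower bound is inclusive and each upper bound exclusive, so a predicted risk of exactly 5% falls in the intermediate-risk group.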
OBJECTIVE: To externally validate the IDENTIFY haematuria risk calculator and compare traditional regression with machine learning algorithms.
METHODS: Prospective data were collected on patients referred to secondary care with new haematuria. Data were collected on the patient variables included in the IDENTIFY risk calculator, cancer outcome, and TNM staging. Machine learning methods were used to evaluate whether they could produce better models than traditional regression methods.
OUTCOME MEASUREMENTS AND STATISTICAL ANALYSIS: The area under the receiver operating characteristic curve (AUC) for the detection of urinary tract cancer, calibration coefficient, calibration in the large (CITL), and Brier score were determined.
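Two of these measures can be sketched in plain Python as a rough illustration. The sketch assumes binary outcomes (1 = cancer, 0 = no cancer) and predicted probabilities; the function names and toy data are hypothetical and are not taken from the study. The calibration coefficient and CITL require fitting a logistic recalibration model and are omitted here.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the probability that a random case scores above a random
    non-case, with ties counted as one half (pairwise comparison)."""
    if len(y_true) != len(y_prob):
        raise ValueError("inputs must be of equal length")
    cases = [p for y, p in zip(y_true, y_prob) if y == 1]
    non_cases = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for pc in cases:
        for pn in non_cases:
            if pc > pn:
                wins += 1.0
            elif pc == pn:
                wins += 0.5
    return wins / (len(cases) * len(non_cases))


if __name__ == "__main__":
    # Toy data for illustration only, not study data.
    outcomes = [0, 0, 1, 1]
    predicted = [0.10, 0.40, 0.35, 0.80]
    print("AUC:", auc(outcomes, predicted))
    print("Brier score:", brier_score(outcomes, predicted))
```

The pairwise AUC is quadratic in cohort size, which is acceptable for a sketch; a rank-based implementation or an established statistics library would be the usual choice for a cohort of several thousand patients.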
RESULTS: There were 3582 patients in the validation cohort. The development and validation cohorts were well matched. The AUC of the IDENTIFY risk calculator on the validation cohort was 0.78. This improved to 0.80 in a subanalysis restricted to countries where urothelial cancer is prevalent, with a calibration slope of 1.04, CITL of 0.24, and Brier score of 0.14. The best machine learning model was Random Forest, which achieved an AUC of 0.76 on the validation cohort. No cancers were stratified to the very-low-risk group in the validation cohort. Most cancers were stratified to the intermediate- and high-risk groups, with more aggressive cancers in higher-risk groups.
CONCLUSIONS: The IDENTIFY risk calculator performed well at predicting cancer in patients referred with haematuria on external validation. This tool can be used by urologists to better counsel patients on their cancer risks, to prioritise diagnostic resources on appropriate patients, and to avoid unnecessary invasive procedures in those with a very low risk of cancer.
PATIENT SUMMARY: We previously developed a calculator that predicts a patient's risk of cancer when they have blood in their urine, based on their personal characteristics. We have validated this risk calculator by testing it on a separate group of patients to ensure that it works as expected. Most patients found to have cancer were in the higher-risk groups, and cancers in the higher-risk groups tended to be more aggressive. Clinicians can use the calculator to fast-track high-risk patients and investigate them more thoroughly.