{Reference Type}: Journal Article {Title}: Improving the Detection of Potential Cases of Familial Hypercholesterolemia: Could Machine Learning Be Part of the Solution? {Author}: Stevens CAT;Vallejo-Vaz AJ;Chora JR;Barkas F;Brandts J;Mahani A;Abar L;Sharabiani MTA;Ray KK; {Journal}: J Am Heart Assoc {Volume}: 13 {Issue}: 12 {Year}: 2024 Jun 18 {Factor}: 6.106 {DOI}: 10.1161/JAHA.123.034434 {Abstract}: BACKGROUND: Familial hypercholesterolemia (FH), while highly prevalent, is a significantly underdiagnosed monogenic disorder. Improved detection could reduce the large number of cardiovascular events attributable to poor case finding. We aimed to assess whether machine learning algorithms outperform clinical diagnostic criteria (signs, history, and biomarkers) and the recommended screening criteria in the United Kingdom in identifying individuals with FH-causing variants, and to present scalable screening criteria for general populations.
RESULTS: Analysis included UK Biobank participants with whole exome sequencing, classifying them as having FH when (likely) pathogenic variants were detected in their LDLR, APOB, or PCSK9 genes. Data were stratified into 3 data sets for (1) feature importance analysis; (2) deriving state-of-the-art statistical and machine learning models; and (3) evaluating the models' predictive performance against clinical diagnostic and screening criteria: Dutch Lipid Clinic Network, Simon Broome, Make Early Diagnosis to Prevent Early Death, and Familial Case Ascertainment Tool. One thousand and three of 454 710 participants were classified as having FH. A Stacking Ensemble model yielded the best predictive performance (sensitivity, 74.93%; precision, 0.61%; accuracy, 72.80%; area under the receiver operating characteristic curve, 79.12%) and outperformed clinical diagnostic criteria and the recommended screening criteria in identifying FH variant carriers within the validation data set (figures for Familial Case Ascertainment Tool, the best baseline model, were 69.55%, 0.44%, 65.43%, and 71.12%, respectively). Our model decreased the number needed to screen compared with the Familial Case Ascertainment Tool (164 versus 227).
CONCLUSIONS: Our machine learning-derived model provides a higher pretest probability of identifying individuals with a molecular diagnosis of FH compared with current approaches. It offers a promising, cost-effective, scalable tool for implementation into electronic health records to prioritize potential FH cases for genetic confirmation.
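
The number-needed-to-screen figures reported above (164 versus 227) are consistent with taking the reciprocal of each model's precision. The sketch below reproduces that arithmetic from the percentages quoted in the abstract. It is an illustration under stated assumptions, not the authors' code: the function name is hypothetical, and the unselected-screening baseline assumes that FH prevalence in the validation data set matches the whole cohort (1003 of 454 710), which the abstract does not state.

```python
def number_needed_to_screen(precision_percent: float) -> float:
    """Flagged individuals to sequence, on average, to find one FH variant carrier.

    Assumes number needed to screen = 1 / precision (positive predictive value).
    """
    if not 0 < precision_percent <= 100:
        raise ValueError("precision must be a percentage in (0, 100]")
    return 100.0 / precision_percent


# Precision values as reported in the abstract (percent).
nns_ensemble = number_needed_to_screen(0.61)  # Stacking Ensemble model
nns_famcat = number_needed_to_screen(0.44)    # Familial Case Ascertainment Tool

# Assumption: with no prioritization, precision equals cohort prevalence.
cohort_prevalence_percent = 100.0 * 1003 / 454710
nns_unselected = number_needed_to_screen(cohort_prevalence_percent)

print(f"Stacking Ensemble:    {nns_ensemble:.1f}")
print(f"FAMCAT:               {nns_famcat:.1f}")
print(f"Unselected screening: {nns_unselected:.1f}")
```

Rounded to whole individuals, the first two values match the 164 and 227 given in the abstract. The unselected figure is only a rough reference point under the prevalence assumption noted above.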