%0 Journal Article %T Federated Learning: A Cross-Institutional Feasibility Study of Deep Learning Based Intracranial Tumor Delineation Framework for Stereotactic Radiosurgery. %A Lee WK %A Hong JS %A Lin YH %A Lu YF %A Hsu YY %A Lee CC %A Yang HC %A Wu CC %A Lu CF %A Sun MH %A Pan HC %A Wu HM %A Chung WY %A Guo WY %A You WC %A Wu YT %J J Magn Reson Imaging %V 59 %N 6 %D 2024 Jun 12 %M 37572087 %F 5.119 %R 10.1002/jmri.28950 %X BACKGROUND: Deep learning-based segmentation algorithms usually require large or multi-institute data sets to improve performance and generalizability. However, protecting patient privacy is a key concern in multi-institutional studies when conventional centralized learning (CL) is used.
OBJECTIVE: To explore the feasibility of integrating a proposed lesion delineation scheme for stereotactic radiosurgery (SRS) into a federated learning (FL) framework, which can address decentralization and privacy protection concerns.
STUDY TYPE: Retrospective.
SUBJECTS: 506 and 118 patients with vestibular schwannoma, aged 15-88 and 22-85, from the two institutes, respectively; 1069 and 256 patients with meningioma, aged 12-91 and 23-85, respectively; 574 and 705 patients with brain metastasis, aged 26-92 and 28-89, respectively.
FIELD STRENGTH/SEQUENCE: 1.5T, spin-echo, and gradient-echo [Correction added after first online publication on 21 August 2023. Field Strength has been changed to "1.5T" from "5T" in this sentence.].
ASSESSMENT: The proposed lesion delineation method was integrated into an FL framework, and CL models were established as the baseline. The effect of image standardization strategies was also explored. The Dice coefficient was used to evaluate the agreement between the predicted delineation and the ground truth, which was manually delineated by neurosurgeons and a neuroradiologist.
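NOTE: The Dice coefficient mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation; the function name, the flat binary-mask representation, and the convention for two empty masks are assumptions made only for illustration.

```python
def dice_coefficient(pred, truth):
    """Dice = 2 * |A and B| / (|A| + |B|) for two flat binary masks.

    `pred` and `truth` are hypothetical flattened voxel masks (0 or 1).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")
    # Voxels labeled as lesion in both the prediction and the manual mask.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total lesion voxels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy example: 3 predicted voxels, 3 manual voxels, 2 shared.
predicted = [0, 1, 1, 1, 0, 0]
manual = [0, 1, 1, 0, 1, 0]
score = dice_coefficient(predicted, manual)  # 2 * 2 / (3 + 3)
```

A score of 1 indicates identical masks and 0 indicates no overlap.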
STATISTICAL TESTS: The paired t-test was applied to compare the mean Dice scores, with p < 0.05 considered significant.
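NOTE: As a rough sketch of how a paired t-test compares two models scored on the same cases, the statistic can be computed from the per-case differences. The function name and the Dice values below are hypothetical and are not data from the study.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees of freedom) for paired samples.

    t = mean(d) / (sd(d) / sqrt(n)), where d are the per-case differences.
    """
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equally sized samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    sd = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    n = len(diffs)
    return mean(diffs) / (sd / math.sqrt(n)), n - 1


# Hypothetical per-case Dice scores for the same five test cases.
fl_dice = [0.80, 0.75, 0.78, 0.82, 0.77]
cl_dice = [0.81, 0.77, 0.78, 0.85, 0.78]
t, df = paired_t_statistic(fl_dice, cl_dice)
```

With 4 degrees of freedom, the two-sided critical value at the 0.05 level is about 2.776, so |t| must exceed that value for the difference to be called significant. In practice a library routine such as `scipy.stats.ttest_rel` returns the p-value directly.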
RESULTS: FL achieved a mean Dice coefficient comparable to that of CL on the testing set of Taipei Veterans General Hospital regardless of standardization and parameter; for the Taichung Veterans General Hospital data, CL significantly (p < 0.05) outperformed FL with the bi-parameter setting, but the results were comparable with the single-parameter setting. For the non-SRS data, FL achieved applicability comparable to that of CL, with a mean Dice of 0.78 versus 0.78 (without standardization), and outperformed the baseline models of the two institutes.
CONCLUSIONS: The proposed lesion delineation method was successfully implemented in an FL framework. The FL models were applicable to the SRS data of each participating institute, and FL exhibited a mean Dice coefficient comparable to that of CL on the non-SRS dataset. Standardization strategies are recommended when FL is used.
EVIDENCE LEVEL: 4. TECHNICAL EFFICACY: Stage 1.