{Reference Type}: Journal Article
{Title}: Comparing the performance of a deep convolutional neural network with orthopedic surgeons on the identification of total hip prosthesis design from plain radiographs.
{Author}: Borjali A;Chen AF;Bedair HS;Melnic CM;Muratoglu OK;Morid MA;Varadarajan KM;
{Journal}: Med Phys
{Volume}: 48
{Issue}: 5
{Year}: May 2021
{Factor}: 4.506
{DOI}: 10.1002/mp.14705
{Abstract}: OBJECTIVE: A crucial step in preoperative planning for revision total hip replacement (THR) surgery is accurate identification of the failed implant design, especially if one or more well-fixed or functioning components are to be retained. Manual identification of the implant design from preoperative radiographs can be time-consuming and inaccurate, which can ultimately lead to increased operating room time, more complex surgery, and increased healthcare costs. METHODS: In this study, we present a novel approach to identifying the design of THR femoral implants from plain radiographs using a convolutional neural network (CNN). We evaluated a total of 402 radiographs of nine different THR implant designs: Accolade II (130 radiographs), Corail (89 radiographs), M/L Taper (31 radiographs), Summit (31 radiographs), Anthology (26 radiographs), Versys (26 radiographs), S-ROM (24 radiographs), Taperloc Standard Offset (24 radiographs), and Taperloc High Offset (21 radiographs). We implemented a transfer learning approach and adapted a DenseNet-201 CNN architecture by replacing the final classifier with nine fully connected neurons. Furthermore, we used saliency maps to explain the CNN's decision-making process by visualizing the pixels in a given radiograph that most influenced the CNN's output. We also compared the CNN's performance with that of three board-certified, fellowship-trained orthopedic surgeons. RESULTS: The CNN performed as well as or better than at least one of the surgeons in identifying eight of the nine THR implant designs and underperformed all three surgeons in identifying one design (Anthology). Overall, across all nine THR implant designs, the CNN achieved a lower Cohen's kappa (0.78) than surgeon 1 (1.00), the same Cohen's kappa as surgeon 2 (0.78), and a slightly higher Cohen's kappa than surgeon 3 (0.76). Furthermore, the saliency maps showed that the CNN generally focused on each implant's unique design features to make a decision. Regarding the time spent on implant identification, the CNN accomplished this task in ~0.06 s per radiograph. The surgeons' identification time varied with the method they used. When relying on personal experience to identify the THR implant design, they spent negligible time. However, the identification time increased to an average of 8.4 min (standard deviation 6.1 min) per radiograph when they used another identification method (online search, consulting the orthopedic company representative, or using an image atlas), which occurred in about 17% of cases in the test subset (40 radiographs). CONCLUSIONS: CNNs such as the one developed in this study can be used to automatically identify the design of a failed THR femoral implant preoperatively in a fraction of a second, saving time and in some cases improving identification accuracy.
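
The results above are reported as Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The abstract does not say how the statistic was computed, so the following is only a minimal sketch of the standard unweighted definition, kappa = (p_o - p_e) / (1 - p_e). The function name `cohens_kappa` and the example labels are hypothetical and are not data from the study.

```python
from collections import Counter


def cohens_kappa(y_true, y_pred):
    """Unweighted Cohen's kappa between two equal-length label sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(y_true)

    # Observed agreement p_o: fraction of cases where both labels match.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n

    # Chance agreement p_e: sum over classes of the product of the two
    # marginal label frequencies.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)

    # Kappa is undefined when chance agreement is already 1
    # (both sequences contain a single identical label).
    if expected == 1.0:
        return float("nan")
    return (observed - expected) / (1.0 - expected)


# Invented example labels (not from the study): four radiographs, two designs.
reference = ["Corail", "Corail", "Summit", "Summit"]
predicted = ["Corail", "Corail", "Summit", "Corail"]
kappa = cohens_kappa(reference, predicted)
```

In this invented example the observed agreement is 0.75 and the chance agreement is 0.5, so kappa is 0.5. A rater who matches the reference on every case gets a kappa of 1.00, the value reported for surgeon 1.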