TY - JOUR
T1 - Development and validation of deep learning models for identifying the brand of pedicle screws on plain spine radiographs
AU - Yao, Yu Cheng
AU - Lin, Cheng Li
AU - Chen, Hung Hsun
AU - Lin, Hsi Hsien
AU - Hsiung, Wei
AU - Wang, Shih Tien
AU - Sun, Ying Chou
AU - Tang, Yu Hsuan
AU - Chou, Po Hsin
N1 - Publisher Copyright:
© 2024 The Author(s). JOR Spine published by Wiley Periodicals LLC on behalf of Orthopaedic Research Society.
PY - 2024/9
Y1 - 2024/9
N2 - Background: In spinal revision surgery, previous pedicle screws (PS) may need to be replaced with new implants. Failure to accurately identify the brand of PS-based instrumentation preoperatively may increase the risk of perioperative complications. This study aimed to develop and validate an optimal deep learning (DL) model to identify the brand of PS-based instrumentation on plain radiographs of spine (PRS) using anteroposterior (AP) and lateral images. Methods: A total of 529 patients who received PS-based instrumentation from seven manufacturers were enrolled in this retrospective study. The postoperative PRS were gathered as ground truths. The training, validation, and testing datasets contained 338, 85, and 106 patients, respectively. YOLOv5 was used to crop out the screws' trajectory, and the EfficientNet-b0 model was used to develop single models (AP, Lateral, Merge, and Concatenated) based on the different PRS images. The ensemble models were different combinations of the single models. Primary outcomes were the models' performance in accuracy, sensitivity, precision, F1-score, kappa value, and area under the curve (AUC). Secondary outcomes were the relative performance of models versus human readers and external validation of the DL models. Results: The Lateral model had the most stable performance among single models. The discriminative performance was improved by the ensemble method. The AP + Lateral ensemble model had the most stable performance, with an accuracy of 0.9434, F1 score of 0.9388, and AUC of 0.9834. The performance of the ensemble models was comparable to that of experienced orthopedic surgeons and superior to that of inexperienced orthopedic surgeons. External validation revealed that the Lat + Concat ensemble model had the best accuracy (0.9412). Conclusion: The DL models demonstrated stable performance in identifying the brand of PS-based instrumentation based on AP and/or lateral images of PRS, which may assist orthopedic spine surgeons in preoperative revision planning in clinical practice.
UR - http://www.scopus.com/inward/record.url?scp=85204524407&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85204524407&partnerID=8YFLogxK
U2 - 10.1002/jsp2.70001
DO - 10.1002/jsp2.70001
M3 - Article
AN - SCOPUS:85204524407
SN - 2572-1143
VL - 7
JO - JOR Spine
JF - JOR Spine
IS - 3
M1 - e70001
ER -