Artificial intelligence treatment decision support for complex breast cancer among oncologists with varying expertise
Abstract
PURPOSE The aim of the current study was to assess treatment concordance and adherence to National Comprehensive Cancer Network breast cancer treatment guidelines between oncologists and an artificial intelligence advisory tool.
PATIENTS AND METHODS Cases of patients (N = 1,977) who were at high risk for recurrence or who had metastatic disease, and whose cell types were among those on which the advisory tool was trained, were obtained from the Chinese Society for Clinical Oncology cancer database (2012 to 2017). A cross-sectional observational study was performed to examine treatment concordance and guideline adherence between an artificial intelligence advisory tool and 10 oncologists with varying expertise: three fellows, four attending physicians, and three chief physicians. In a blinded fashion, each oncologist provided treatment advice on an average of 198 cases, and the advisory tool provided advice on all cases (N = 1,977). Results are reported as rates and logistic regression odds ratios.
RESULTS Concordance for the recommended treatment was 0.56 for all physicians and higher for fellows compared with chief and attending physicians (0.68 v 0.54; 0.49; P = .001). Concordance differed by hormone receptor subtype–TNM stage, with the lowest for hormone receptor-positive human epidermal growth factor receptor 2/neu-positive cancers (0.48) and the highest for triple-negative breast cancers (0.71) across most TNM stages. Adherence to National Comprehensive Cancer Network guidelines was higher for oncologists compared with the advisory tool (0.96 v 0.82; P < .003) and lower for fellows compared with attending physicians (0.93 v 0.98; 0.96; P = .04).
CONCLUSION Study findings reflect a complex breast cancer case mix, the limits of medical knowledge regarding optimum treatment, clinician practice patterns, and use of a tool that reflects expertise from one cancer center.
Additional research in different practice settings is needed to understand the tool's scalability and its impact on treatment decisions and clinical and health services outcomes.
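Because the results are reported both as rates and as odds ratios, it may help to see how the two relate. The sketch below is only an illustration: it converts the adherence rates given in the abstract (0.96 for oncologists, 0.82 for the advisory tool) into a crude, unadjusted odds ratio. The function names are hypothetical, and the value it prints is not a result reported by the study, whose logistic regression odds ratios may be adjusted for covariates and would differ.

```python
def odds(p: float) -> float:
    """Convert a proportion p (0 < p < 1) into odds, p / (1 - p)."""
    if not 0.0 < p < 1.0:
        raise ValueError("proportion must lie strictly between 0 and 1")
    return p / (1.0 - p)


def crude_odds_ratio(p_a: float, p_b: float) -> float:
    """Unadjusted odds ratio of group A relative to group B.

    This is plain arithmetic on two rates, not a fitted logistic
    regression; it ignores covariates and sampling uncertainty.
    """
    return odds(p_a) / odds(p_b)


# Guideline adherence rates as stated in the abstract.
adherence_oncologists = 0.96
adherence_tool = 0.82

# Illustrative crude odds ratio implied by the two rates (an assumption
# for demonstration, not a figure reported by the study).
crude_or = crude_odds_ratio(adherence_oncologists, adherence_tool)
print(f"Crude odds ratio (oncologists v tool): {crude_or:.2f}")
```

Odds ratios exaggerate differences between rates that are close to 1, which is why a gap of 0.14 in adherence corresponds to a crude odds ratio well above 1.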