Deep mining generation of lung cancer malignancy models from chest x-ray images
Abstract
Lung cancer is the leading cause of cancer death and morbidity worldwide. Many studies have shown machine learning models to be effective in detecting lung nodules from chest X-ray images. However, these techniques have yet to be embraced by the medical community due to several practical, ethical, and regulatory constraints stemming from the “black-box” nature of deep learning models. Additionally, most lung nodules visible on chest X-rays are benign; therefore, the narrow task of computer vision-based lung nodule detection cannot be equated to automated lung cancer detection. Addressing both concerns, this study introduces a novel hybrid deep learning and decision tree-based computer vision model, which presents lung cancer malignancy predictions as interpretable decision trees. The deep learning component of this process is trained using a large publicly available dataset on pathological biomarkers associated with lung cancer. These models are then used to inference biomarker scores for chest X-ray images from two independent data sets, for which malignancy metadata is available. Next, multi-variate predictive models were mined by fitting shallow decision trees to the malignancy stratified datasets and interrogating a range of metrics to determine the best model. The best decision tree model achieved sensitivity and specificity of 86.7% and 80.0%, respectively, with a positive predictive value of 92.9%. Decision trees mined using this method may be considered as a starting point for refinement into clinically useful multi-variate lung cancer malignancy models for implementation as a workflow augmentation tool to improve the efficiency of human radiologists.