Maximizing AUC with deep learning for classification of imbalanced mammogram datasets
Abstract
Breast cancer is the second most common cause of death in women. Computer-aided diagnosis typically demands carefully annotated data, with precise tumor localization and delineation of the boundaries, which are rarely available in the medical system. In this paper we present a new deep learning approach for classification of mammograms that requires only a global binary label. Traditional deep learning methods typically employ classification error losses, which are highly biased by class imbalance, a situation that naturally arises in medical classification problems. We propose a novel loss function that directly maximizes the Area Under the ROC Curve (AUC), yielding a loss that is not biased by class imbalance. We validate the proposed model on two mammogram datasets: IMG, comprising 796 patients, of whom 80 are positive (164 images) and 716 are negative (1869 images), and the publicly available INbreast dataset. Our results are encouraging: the proposed scheme achieves an AUC of 0.76 on IMG and 0.65 on INbreast.
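To illustrate why an AUC-based objective is insensitive to class imbalance, the sketch below computes the AUC as the fraction of correctly ranked (positive, negative) score pairs, together with one common smooth surrogate that could be minimized by gradient descent. The abstract does not specify the form of the loss, so the sigmoid surrogate, the function names `pairwise_auc` and `soft_auc_loss`, and the sharpness parameter `beta` are all assumptions made for illustration, not the method proposed here.

```python
import numpy as np


def pairwise_auc(scores, labels):
    """Exact AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    # diff[i, j] = score of positive i minus score of negative j
    diff = pos[:, None] - neg[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))


def soft_auc_loss(scores, labels, beta=2.0):
    """Smooth surrogate (an assumption, not the paper's loss).

    Replaces the step function in the pairwise AUC with a sigmoid of
    sharpness beta, so the loss is differentiable in the scores.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    diff = pos[:, None] - neg[None, :]
    return float(1.0 - np.mean(1.0 / (1.0 + np.exp(-beta * diff))))


# Hypothetical scores: one positive ranked first, one ranked last.
scores = [0.9, 0.8, 0.3, 0.2, 0.1]
labels = [1, 0, 0, 0, 1]
print(pairwise_auc(scores, labels))
print(soft_auc_loss(scores, labels))
```

Because both quantities average over positive-negative pairs rather than over individual samples, replicating the negative examples leaves them unchanged, whereas a classification error rate would shift toward the majority class.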