MBET: Resilience Improvement Method for DNNs
Abstract
Deep neural network (DNN) accelerators have become a major field of study. Low-voltage DNN accelerators are designed to achieve high throughput and reduce energy consumption. However, operating at low voltage causes many bit errors in the DNN weights. One method to increase fault tolerance against random bit errors is random bit error training. In this paper, we improve this method with multiple bit error rate training (MBET). MBET aims to improve the fault tolerance of a DNN model by training with more than one bit error rate. During training, we inject bit errors at different rates and combine the corresponding loss values. Experimental results on four state-of-the-art models show that this method improves the fault tolerance of the model against random bit errors without decreasing its test accuracy.
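
To make the idea of injecting bit errors at several rates and combining the losses concrete, the following is a minimal sketch and not the implementation evaluated in the paper. It rests on several assumptions that the abstract does not state: weights are stored as 8-bit codes under a symmetric uniform quantization scheme, the model is a toy linear regressor with a mean squared error loss, and the per-rate losses are combined by a plain average. All function names (`quantize`, `dequantize`, `inject_bit_errors`, `mse_loss`, `mbet_loss`) are hypothetical.

```python
import numpy as np


def quantize(w, w_max):
    """Map floats in [-w_max, w_max] to uint8 codes (assumed quantization scheme)."""
    scaled = (np.clip(w, -w_max, w_max) + w_max) / (2 * w_max)
    return np.round(scaled * 255).astype(np.uint8)


def dequantize(q, w_max):
    """Map uint8 codes back to floats in [-w_max, w_max]."""
    return q.astype(np.float64) / 255 * 2 * w_max - w_max


def inject_bit_errors(q, rate, rng):
    """Flip every bit of every uint8 weight independently with probability `rate`."""
    # One Bernoulli draw per bit: the trailing axis holds the 8 bits of each weight.
    flips = rng.random(q.shape + (8,)) < rate
    # Pack the 8 flip flags of each weight into a single uint8 XOR mask.
    mask = np.packbits(flips, axis=-1).reshape(q.shape)
    return q ^ mask


def mse_loss(w, x, y):
    """Mean squared error of a toy linear model (stand-in for a DNN loss)."""
    return float(np.mean((x @ w - y) ** 2))


def mbet_loss(w, x, y, rates, rng, w_max=1.0):
    """Average the loss over several bit error rates (assumed combination rule)."""
    q = quantize(w, w_max)
    losses = []
    for p in rates:
        # Evaluate the model with weights corrupted at bit error rate p.
        w_faulty = dequantize(inject_bit_errors(q, p, rng), w_max)
        losses.append(mse_loss(w_faulty, x, y))
    return sum(losses) / len(losses)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(64, 10))
    w = rng.uniform(-1.0, 1.0, size=10)
    y = x @ w
    # A rate of 0.0 contributes the error-free (quantized) loss to the average.
    print(mbet_loss(w, x, y, rates=[0.0, 0.001, 0.01], rng=rng))
```

In an actual training loop, the combined value would be the quantity that is differentiated and minimized. Other combination rules, such as a weighted sum of the per-rate losses, would fit the same structure.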