Hardening Deep Neural Networks via Adversarial Model Cascades
Deepak Vijaykeerthy, Anshuman Suri, et al.
IJCNN 2019
Adversarial machine learning defenses have primarily focused on mitigating static, white-box attacks. However, it remains an open question whether such defenses are robust against an adaptive black-box adversary. In this paper, we focus specifically on the black-box threat model and make the following contributions. First, we develop an enhanced adaptive black-box attack that is experimentally shown to be ≥ 30% more effective than the original adaptive black-box attack proposed by Papernot et al. Second, we test 10 recent defenses using our new attack and propose our own black-box defense (barrier zones). We show that our defense based on barrier zones offers significant improvements in security over state-of-the-art defenses, including greater than 85% robust accuracy against black-box boundary attacks, transfer attacks, and our new adaptive black-box attack on the datasets we study. For completeness, we verify our claims through extensive experimentation with 10 other defenses using three adversarial models (14 different black-box attacks) on two datasets (CIFAR-10 and Fashion-MNIST).
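To make the "robust accuracy" figure quoted in the abstract concrete, the following minimal sketch computes it as the fraction of adversarial inputs that a defended model still labels correctly. This is an assumption about the metric, not the authors' evaluation code: the paper may, for instance, count only samples that were classified correctly before the attack. The function name `robust_accuracy` and the toy labels are hypothetical.

```python
from typing import Sequence


def robust_accuracy(true_labels: Sequence[int], adv_predictions: Sequence[int]) -> float:
    """Fraction of adversarial inputs whose predicted label matches the true label.

    Assumed definition for illustration; the paper's exact protocol may differ.
    """
    if len(true_labels) != len(adv_predictions):
        raise ValueError("label and prediction sequences must have equal length")
    if len(true_labels) == 0:
        raise ValueError("need at least one sample")
    # Count adversarial examples that failed to change the model's decision.
    correct = sum(1 for y, p in zip(true_labels, adv_predictions) if y == p)
    return correct / len(true_labels)


# Hypothetical toy data: ten samples, one of which the attack flips.
labels = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
preds_under_attack = [0, 1, 2, 3, 4, 5, 6, 7, 8, 0]
print(robust_accuracy(labels, preds_under_attack))
```

Under this assumed definition, a defense meeting the abstract's "greater than 85%" claim would return a value above 0.85 for each attack evaluated.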