Language Agnostic Code Embeddings
Saiteja Utpala, Alex Gu, et al.
NAACL 2024
Model robustness against adversarial examples of a single perturbation type, such as the ℓp-norm, has been widely studied, yet its generalization to more realistic scenarios involving multiple semantic perturbations and their composition remains largely unexplored. In this paper, we first propose a novel method for generating composite adversarial examples. Our method can find the optimal attack composition by utilizing component-wise projected gradient descent and automatic attack-order scheduling. We then propose generalized adversarial training (GAT) to extend model robustness from the ℓp-ball to composite semantic perturbations, such as the combination of Hue, Saturation, Brightness, Contrast, and Rotation. Results obtained using the ImageNet and CIFAR-10 datasets indicate that GAT can be robust not only to every tested type of single attack, but also to any combination of such attacks. GAT also outperforms baseline ℓ∞-norm bounded adversarial training approaches by a significant margin.
Reuben Tan, Arijit Ray, et al.
CVPR 2023
Megh Thakkar, Quentin Fournier, et al.
ACL 2024
Gururaj Saileshwar, Prashant J. Nair, et al.
HPCA 2018