This list contains proposed white-box defenses against adversarial examples whose code has been open-sourced, along with open-source third-party analyses / security evaluations of those defenses.
Submit a new defense or analysis.
| Defense | Dataset | Threat Model | Natural Accuracy | Claims | Analyses |
|---|---|---|---|---|---|
| Bandlimiting Neural Networks Against Adversarial Attacks (Lin et al.) (code) | ImageNet | $$\ell_\infty (\epsilon = 8/255)$$ | 77.32% | 76.06% accuracy | |
| Bandlimiting Neural Networks Against Adversarial Attacks (Lin et al.) (code) | CIFAR-10 | $$\ell_\infty (\epsilon = 8/255)$$ | 92.55% | 88.41% accuracy | |
| Adversarial Logit Pairing (Kannan et al.) (code) | ImageNet | $$\ell_\infty (\epsilon = 16/255)$$ | 72% | 27.9% accuracy | |
| Combatting and detecting FGSM and PGD adversarial noise (James Gannon) (code) | MNIST | $$\ell_\infty (\epsilon = 0.1)$$ | 98.2% | 78.2% accuracy | |
| Adversarial Defense by Restricting the Hidden Space of Deep Neural Networks (Mustafa et al.) (code) | CIFAR-10 | $$\ell_\infty (\epsilon = 8/255)$$ | 90.62% | 32.32% accuracy | |
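The threat-model column gives the perturbation budget under which each claim is made: for $$\ell_\infty (\epsilon = 8/255)$$, the adversary may change each pixel by at most 8/255. Claims like these are typically probed with a bounded PGD attack. Below is a minimal sketch assuming a PyTorch classifier with inputs in [0, 1]; the `pgd_attack` function, its step size, and its iteration count are illustrative assumptions, not taken from any listed defense or analysis.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Projected gradient descent inside an L-infinity ball of radius eps.

    x: batch of images in [0, 1], y: integer class labels.
    Returns adversarial examples satisfying ||x_adv - x||_inf <= eps.
    """
    # Random start inside the eps-ball.
    x_adv = x + torch.empty_like(x).uniform_(-eps, eps)
    x_adv = torch.clamp(x_adv, 0, 1).detach()

    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]

        # Ascend the loss, then project back into the eps-ball and valid range.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)
        x_adv = torch.clamp(x_adv, 0, 1)

    return x_adv.detach()
```

Robust accuracy under this threat model is then the model's accuracy on `pgd_attack(model, x, y)` over the test set; stronger evaluations (more steps, restarts, or adaptive attacks) often report lower numbers than the claims above.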