This list contains open-source implementations of proposed white-box defenses against adversarial examples, along with open-source third-party analyses / security evaluations of those defenses.

Submit a new defense or analysis.

| Defense | Dataset | Threat Model | Natural Accuracy | Claims | Analyses |
| --- | --- | --- | --- | --- | --- |
| Bandlimiting Neural Networks Against Adversarial Attacks (Lin et al.) (code) | ImageNet | $\ell_\infty$ ($\epsilon = 8/255$) | 77.32% | 76.06% accuracy | |
| Bandlimiting Neural Networks Against Adversarial Attacks (Lin et al.) (code) | CIFAR-10 | $\ell_\infty$ ($\epsilon = 8/255$) | 92.55% | 88.41% accuracy | |
| Adversarial Logit Pairing (Kannan et al.) (code) | ImageNet | $\ell_\infty$ ($\epsilon = 16/255$) | 72% | 27.9% accuracy | |
| Combatting and detecting FGSM and PGD adversarial noise (James Gannon) (code) | MNIST | $\ell_\infty$ ($\epsilon = 0.1$) | 98.2% | 78.2% accuracy | |
| Adversarial Defense by Restricting the Hidden Space of Deep Neural Networks (Mustafa et al.) (code) | CIFAR-10 | $\ell_\infty$ ($\epsilon = 8/255$) | 90.62% | 32.32% accuracy | |
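
The Threat Model column specifies the perturbation budget an attacker is allowed: under an $\ell_\infty$ threat model, an adversarial input $x'$ must satisfy $\|x' - x\|_\infty \le \epsilon$ for the stated $\epsilon$. As a rough illustration of how the accuracy columns are typically measured, the sketch below estimates robust accuracy with a PGD attack inside that $\ell_\infty$ ball. This is a minimal sketch, not the evaluation code of any listed defense or analysis; `model` (assumed to return logits and already in eval mode), `loader`, and the attack hyperparameters `alpha` and `steps` are hypothetical placeholders.

```python
# Minimal sketch (not the code of any listed analysis): estimate clean and
# PGD-robust accuracy under an l_inf threat model with radius eps.
import torch
import torch.nn.functional as F


def pgd_linf(model, x, y, eps=8 / 255, alpha=2 / 255, steps=20):
    """PGD attack constrained to an l_inf ball of radius eps around x."""
    # Random start inside the epsilon ball, clipped to the valid pixel range.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Ascend the loss, then project back into the ball and into [0, 1].
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)
        x_adv = x_adv.clamp(0, 1).detach()
    return x_adv


def accuracy(model, loader, eps=None, device="cpu"):
    """Clean accuracy if eps is None, otherwise PGD robust accuracy at radius eps."""
    correct = total = 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        if eps is not None:
            x = pgd_linf(model, x, y, eps=eps)
        with torch.no_grad():
            correct += (model(x).argmax(dim=1) == y).sum().item()
        total += y.numel()
    return correct / total
```

In this sketch, `accuracy(model, test_loader)` would correspond to the Natural Accuracy column and `accuracy(model, test_loader, eps=8/255)` to a baseline measurement of a claim under the $\ell_\infty$ ($\epsilon = 8/255$) threat model; published analyses typically use stronger or defense-adaptive attacks than this baseline.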