Deep neural networks have been shown to be vulnerable to adversarial examples: very small perturbations of the input that have a dramatic impact on the predictions. This package provides a TensorFlow implementation of a type of adversarial attack based on local geometric transformations: Spatially Transformed Adversarial Examples (stAdv).
Our implementation follows the procedure from the original paper:
Spatially Transformed Adversarial Examples. Chaowei Xiao, Jun-Yan Zhu, Bo Li, Warren He, Mingyan Liu, Dawn Song.
If you use this code, please cite the following paper for which this implementation was originally made:
Robustness of Rotation-Equivariant Networks to Adversarial Perturbations. Beranger Dumont, Simona Maggio, Pablo Montalvo.
First, make sure you have installed TensorFlow (CPU or GPU version).
Then, to install the `stadv` package, simply run

$ pip install stadv
A typical use of this package is as follows:
- Start with a trained network implemented in TensorFlow.
- Insert the `stadv.layers.flow_st` layer in the graph, immediately after the input layer, so that the input images are perturbed according to local differentiable geometric perturbations parameterized by input flow tensors.
- At the end of the graph, after computing the logits, insert the computation of an adversarial loss (to fool the network) and of a flow loss (to enforce local smoothness), e.g. using `stadv.losses.adv_loss` and `stadv.losses.flow_loss`, respectively. Define the final loss to be optimized as a combination of the two.
- Find the flows which minimize this loss, e.g. with the L-BFGS-B optimizer conveniently provided in `stadv.optimization.lbfgs`.
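To build intuition for the warping step above: the core idea behind a flow-based spatial transformation is that each output pixel samples the input image at a flow-displaced location via differentiable bilinear interpolation. The following is a minimal pure-Python sketch of that idea, independent of TensorFlow; the helper names (`bilinear_sample`, `flow_warp`) are hypothetical illustrations, not the `stadv` API.

```python
# Pure-Python sketch of flow-based image warping via bilinear interpolation.
# Helper names are hypothetical; stadv.layers.flow_st implements this idea
# as a differentiable TensorFlow op.

def bilinear_sample(img, y, x):
    """Sample img (list of rows) at fractional coordinates, clamping at edges."""
    h, w = len(img), len(img[0])
    y0, x0 = int(y // 1), int(x // 1)  # floor of the sampling coordinates
    dy, dx = y - y0, x - x0

    def px(r, c):
        r = min(max(r, 0), h - 1)
        c = min(max(c, 0), w - 1)
        return img[r][c]

    return ((1 - dy) * (1 - dx) * px(y0, x0)
            + (1 - dy) * dx * px(y0, x0 + 1)
            + dy * (1 - dx) * px(y0 + 1, x0)
            + dy * dx * px(y0 + 1, x0 + 1))

def flow_warp(img, flow):
    """Warp a 2-D image with a per-pixel flow field of (dy, dx) displacements."""
    h, w = len(img), len(img[0])
    return [[bilinear_sample(img, i + flow[i][j][0], j + flow[i][j][1])
             for j in range(w)] for i in range(h)]

# A zero flow field leaves the image unchanged.
img = [[0.0, 1.0], [2.0, 3.0]]
zero_flow = [[(0.0, 0.0)] * 2 for _ in range(2)]
assert flow_warp(img, zero_flow) == img
```

Because the interpolation weights are continuous in the flow values, the warped image is differentiable with respect to the flow, which is what makes gradient-based optimization of the flow possible.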
An end-to-end example use of the library is provided in the notebook `demo/simple_mnist.ipynb` (see on GitHub).
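For intuition on the flow loss mentioned in the workflow above: in the stAdv paper it is a smoothness penalty summing, over all pixels, the L2 distance between a pixel's flow vector and those of its neighbors. Here is a minimal pure-Python sketch of that quantity; the function name and the `tau` weighting are illustrative assumptions, not the `stadv.losses.flow_loss` signature.

```python
import math

def flow_smoothness_loss(flow, tau=1.0):
    """Sum, over all pixels, of the L2 distance between a pixel's flow
    vector and each of its 4-neighbors' flow vectors (illustrative sketch
    of the stAdv smoothness penalty; not the stadv API)."""
    h, w = len(flow), len(flow[0])
    total = 0.0
    for i in range(h):
        for j in range(w):
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < h and 0 <= nj < w:
                    dy = flow[i][j][0] - flow[ni][nj][0]
                    dx = flow[i][j][1] - flow[ni][nj][1]
                    total += math.sqrt(dy * dy + dx * dx)
    return tau * total

# A constant flow field is perfectly smooth, so the penalty vanishes.
constant = [[(0.3, -0.1)] * 4 for _ in range(4)]
assert flow_smoothness_loss(constant) == 0.0
```

Minimizing a weighted sum of this penalty and the adversarial loss is what keeps the optimized deformation locally smooth, and hence hard to perceive.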
The documentation of the API is available at http://stadv.readthedocs.io/en/latest/stadv.html.
You can run all unit tests with
$ make init
$ make test