1 Introduction
2 Related Works
2.1 Environment Perception for Automated Vehicles
2.2 Adversarial Attacks
3 Method
3.1 Mathematical Notations
3.2 Method Introduction and Initial Analysis
VGG-16, VGG-19 [SZ15], ResNet-18, and ResNet-152 [HZRS16], all pretrained on ImageNet [RDS+15], to investigate their extracted feature maps at different levels. Then, we first measure the similarity of the mean feature maps of a layer between all networks over the entire ImageNet [RDS+15] validation set, using the well-known and universally applicable mean squared error (MSE).1 Figure 3 displays the resulting heat maps. In addition, Fig. 4 shows the mean of feature representations \(\mathbf {A}_{\ell }(\mathbf {x})\) for these four pretrained classifiers, computed for layers \(\ell =1\) up to \(\ell =6\) (after each activation function) for a selected input image \(\mathbf {x}\). Both figures, Figs. 3 and 4, show that the respective networks share a qualitatively and quantitatively high similarity in the first layer compared to all subsequent layers. Only for close relatives, such as VGG-16 and VGG-19, is this similarity also found in later layers. We thus hypothesize that by applying the fast feature fool loss \(J_{1}^{\text {FFF}}\) (1) only to the first layer of the source model during training, we not only inject high adversarial energy into the first layer but also increase the transferability of the generated UAPs.

3.3 Non-targeted Perturbations
3.4 Targeted Perturbations
4 Experiments on Image Classification
4.1 Experimental Setup
ResNet generator from [JAFF16], which consists of several convolutional layers for downsampling, followed by several residual blocks, before performing upsampling using transposed convolutions. As topology for the source model S, we utilize the same set of pretrained image classifiers as for the target model T, i.e., VGG-16, VGG-19 [SZ15], ResNet-18, ResNet-152 [HZRS16], and also GoogleNet [SLJ+15].

4.2 Non-Targeted Universal Perturbations
Fooling rates (%), source model S = target model T:

| \(\alpha \) | VGG-16 | VGG-19 | ResNet-18 | ResNet-152 | Avg |
|---|---|---|---|---|---|
| 0 | 8.52 | 8.29 | 7.24 | 4.04 | 7.02 |
| 0.6 | 90.49 | 93.48 | 88.93 | 84.41 | 89.32 |
| 0.7 | 95.20 | 93.79 | 89.16 | 87.05 | 91.30 |
| 0.8 | 90.03 | 93.24 | 89.07 | 89.91 | 90.56 |
| 0.9 | 95.13 | 92.14 | 88.34 | 89.37 | 91.24 |
| 1 | 92.87 | 71.88 | 88.88 | 85.34 | 84.74 |
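The fooling rate reported in these tables is the fraction of validation images on which the model's predicted class changes once the UAP is added. A minimal sketch of this metric (the function name and toy data below are illustrative, not the chapter's code):

```python
import numpy as np

def fooling_rate(clean_preds, adv_preds):
    """Percentage of samples whose predicted class changes under the UAP."""
    clean_preds = np.asarray(clean_preds)
    adv_preds = np.asarray(adv_preds)
    return 100.0 * np.mean(clean_preds != adv_preds)

# Toy example: 3 of 5 predictions flip under the perturbation -> 60% fooling rate.
clean = [1, 2, 3, 4, 5]
adv   = [1, 2, 9, 9, 9]
print(fooling_rate(clean, adv))  # 60.0
```

Note that a non-targeted attack only needs the prediction to change; it does not matter which wrong class is produced.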
Fooling rates (%), source model S = target model T:

| \(p\) | \(\epsilon \) | \(\alpha \) | VGG-16 | VGG-19 | ResNet-18 | ResNet-152 |
|---|---|---|---|---|---|---|
| 2 | 2000 | 0.7 | 96.57 | 94.99 | 91.85 | 88.73 |
| \(\infty \) | 10 | 0.7 | 95.70 | 94.00 | 90.46 | 90.40 |
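The two norm constraints above (\(\Vert \mathbf {r}\Vert _2 \le 2000\) and \(\Vert \mathbf {r}\Vert _\infty \le 10\)) are commonly enforced by projecting the perturbation back onto the \(\epsilon\)-ball after each update. A hedged sketch of such a projection, not the chapter's exact implementation:

```python
import numpy as np

def project_lp(r, p, eps):
    """Project perturbation r onto the L_p ball of radius eps (p = 2 or inf)."""
    if p == np.inf:
        # L_inf: clip every pixel value of the perturbation independently.
        return np.clip(r, -eps, eps)
    # L_2: rescale the whole perturbation if its global norm exceeds eps.
    norm = np.linalg.norm(r.ravel(), ord=2)
    return r if norm <= eps else r * (eps / norm)

r = np.full((3, 224, 224), 15.0)          # illustrative oversized perturbation
r_inf = project_lp(r, np.inf, eps=10)     # all entries clipped to [-10, 10]
r_l2  = project_lp(r, 2, eps=2000)        # rescaled so ||r||_2 == 2000
```

The elementwise clip for \(L_\infty\) and the global rescaling for \(L_2\) explain why the two constraints lead to visually different perturbation patterns.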
ResNet-18 model results. Highest fooling rates are printed in boldface.

Source model S = target model T:

| \(p\) | \(\epsilon \) | Method | VGG-16 | VGG-19 | ResNet-18 | ResNet-152 | Avg\(^+\) |
|---|---|---|---|---|---|---|---|
| \(\infty \) | *10 | FFF | 47.10 | 43.62 | – | 29.78 | 40.16 |
| | | CIs | 71.59 | 72.84 | – | 60.72 | 68.38 |
| | | UAP | 78.30 | 77.80 | – | 84.00 | 80.03 |
| | | GAP | 83.70 | 80.10 | – | 82.70 | 82.16 |
| | | NAG | 77.57 | 83.78 | – | 87.24 | 82.86 |
| | | TUAP-RSI | 94.30 | 94.98 | – | 90.08 | 93.12 |
| | | Ours | 95.70 | 94.00 | 90.46 | 90.40 | 93.36 |
| 2 | 2000 | UAP | 90.30 | 84.50 | – | 88.50 | 87.76 |
| | | GAP | 93.90 | 94.90 | – | 79.50 | 89.43 |
| | | Ours | 96.57 | 94.99 | 91.85 | 88.73 | 93.43 |
Transferability: rows give the source model S, columns the target model T:

| Source model S | VGG-16 | VGG-19 | ResNet-18 | ResNet-152 | Avg\(^*\) |
|---|---|---|---|---|---|
| VGG-16 | 95.70 | 86.67 | 49.98 | 36.34 | 57.66 |
| VGG-19 | 84.77 | 94.00 | 47.24 | 36.46 | 56.15 |
| ResNet-18 | 76.49 | 72.18 | 90.46 | 50.46 | 66.37 |
| ResNet-152 | 86.19 | 82.36 | 76.04 | 90.40 | 81.53 |
GoogleNet). *Note that the results are reported from the respective paper. Highest fooling rates are printed in boldface.

VGG-16, ResNet-152, and GoogleNet are used as the source model in Table 5a–c, respectively. It turns out to be advisable to choose a deep network as the source model (ResNet-152), since then our performance on the unseen VGG-16 and VGG-19 target models is about 12% absolute better than the earlier state of the art (\(L_{\infty }\) norm).
VGG-16 as the target model T, on the ImageNet validation set for different sizes of \(\mathcal {X}^\text {train}\). The results show that with a dataset \(\mathcal {X}^\text {train}\) containing only 1000 images, our approach achieves a fooling rate of more than 60% on the ImageNet validation dataset in both the white-box and black-box settings. Additionally, the number of images in \(\mathcal {X}^\text {train}\) turns out to be more critical for the fooling rate of black-box attacks than for white-box attacks.
4.3 Targeted Universal Perturbations
street sign) and \(\mathring{m}=920\) (traffic light, traffic signal, stoplight) are 63.2% and 57.83%, respectively, which underlines the effectiveness of our approach.
VGG-16, in white-box and black-box settings, on the ImageNet validation dataset for different sizes of \(\mathcal {X}^\text {train}\) in Fig. 9. For instance, with \(\mathcal {X}^\text {train}\) containing 10,000 images, we are able to fool the target model on over 20% of the images in the ImageNet validation set. It should be noted that training the generator \(\mathbf {G}\) to produce a single UAP forcing the target model to output a specific target class \(\mathring{m}\) is an extremely challenging task. Nevertheless, we observe that 10,000 training images again seem to be sufficient for a white-box attack.
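Conceptually, a targeted UAP is obtained by minimizing the target model's classification loss toward the fixed class \(\mathring{m}\) over all training images, while keeping the shared perturbation inside the \(\epsilon\)-ball. The toy sketch below replaces the generator \(\mathbf {G}\) and the deep target model with a directly optimized perturbation and a frozen linear classifier; it illustrates the principle only, and all shapes, step sizes, and names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
num_classes, dim, eps = 5, 20, 0.5
W = rng.normal(size=(num_classes, dim))      # frozen stand-in "target model"
X = rng.normal(size=(100, dim))              # stand-in training images
target = 3                                   # target class (plays the role of m-ring)
r = np.zeros(dim)                            # single universal perturbation

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

for _ in range(200):
    probs = softmax((X + r) @ W.T)           # model outputs on perturbed inputs
    grad_logits = probs.copy()
    grad_logits[:, target] -= 1.0            # gradient of CE toward the target class
    grad_r = (grad_logits @ W).mean(axis=0)  # gradient w.r.t. the shared perturbation
    r = np.clip(r - 0.1 * grad_r, -eps, eps) # gradient step + L_inf projection

preds = ((X + r) @ W.T).argmax(axis=1)
success = np.mean(preds == target)           # targeted fooling rate on the toy data
```

Even in this linear toy setting, one shared perturbation must trade off all images at once, which hints at why the targeted setting needs considerably more training data than the non-targeted one.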
5 Experiments on Semantic Segmentation
5.1 Experimental Setup
FCN-8 [LSD15] for white-box attacks, and ERFNet [RABA18] for the black-box setting. FCN-8 consists of an encoder part, which transforms an input image into a low-resolution semantic representation, and a decoder part, which recovers the high spatial resolution of the image by fusing different levels of feature representations together. ERFNet also consists of an encoder-decoder structure, but without any bypass connections between the encoder and the decoder. Additionally, residual units are used with factorized convolutions to obtain a more efficient computation.

FCN-8 [LSD15] as our segmentation target model T, and use the \(L_{\infty }\) norm, to be comparable with [MCKBF17, PKGB18].

FCN-8 pretrained on Cityscapes. Our method is compared with GAP [PKGB18]. In these experiments, both the source model S and the target model T are either the same, i.e., \(S=T=\texttt {FCN-8}\), or different, i.e., \(S=\texttt {ERFNet}, T=\texttt {FCN-8}\). Parameters are set to \(L_{\infty }(\mathbf {r})\le \epsilon \), \(\alpha =0.7\). Best results are printed in boldface.

5.2 Non-targeted Universal Perturbations
ERFNet as the source model S and test it on the target model T being FCN-8. Table 6b reports black-box attack results for our method compared to the attack method GAP [PKGB18]. Our non-targeted UAPs decrease the mIoU of FCN-8 on the Cityscapes dataset more than GAP [PKGB18] does, across all ranges of adversarial perturbations (\(\epsilon \in \{2,5,10,20\}\)). These results illustrate the effectiveness of the generated perturbation.
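The mIoU used to quantify these segmentation attacks is the mean over classes of the intersection-over-union between the predicted and ground-truth masks; a larger drop under the UAP means a stronger non-targeted attack. A minimal sketch of the metric (illustrative, not the chapter's evaluation code):

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """Mean intersection-over-union over classes present in pred or gt."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:                        # skip classes absent from both masks
            ious.append(inter / union)
    return float(np.mean(ious))

gt   = np.array([[0, 0, 1, 1]])              # toy ground-truth segmentation mask
pred = np.array([[0, 1, 1, 1]])              # toy predicted mask
# class 0: inter 1 / union 2 = 0.5; class 1: inter 2 / union 3 = 0.667
print(mean_iou(pred, gt, num_classes=2))     # ≈ 0.583
```

Running this once on clean predictions and once on predictions for perturbed inputs, and comparing the two values, reproduces the evaluation protocol behind Table 6.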
5.3 Targeted Universal Perturbations
FCN-8 in a way that it now outputs the target segmentation mask.
FCN-8 pretrained on the Cityscapes training set. Results are reported on the Cityscapes validation set. Our method is compared to UAP-Seg [MCKBF17] and GAP [PKGB18]. In these experiments, both the source model S and the target model T are FCN-8, with \(L_{\infty }(\mathbf {r})\le \epsilon \), \(\alpha =0.7\). Best results are printed in boldface.

| Method | \(\epsilon =2\) | \(\epsilon =5\) | \(\epsilon =10\) | \(\epsilon =20\) |
|---|---|---|---|---|
| GAP | 61.2 | 79.5 | 92.1 | 97.2 |
| UAP-Seg | 60.9 | 80.3 | 91.0 | 96.3 |
| Ours | 61.0 | 81.8 | 93.1 | 97.4 |