nach oben

Multimedia Systems

Erschienen in:

04.06.2023 | Regular Paper

FDS_2D: rethinking magnitude-phase features for DeepFake detection

verfasst von: Gaoming Yang, Anxing Wei, Xianjin Fang, Ji Zhang

Erschienen in: Multimedia Systems | Ausgabe 4/2023

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Aus

Abstract

To reduce the harm of forged information, more and more detection methods use frequency domain information. They mostly take spectra as clues to identify fake content. However, the current work tends to use only one of the magnitude and phase spectra for learning. In this paper, we notice that the magnitude and phase spectrum contain different image information. Only one spectrum is easily disturbed by noise, and the robustness of the method is difficult to guarantee. Therefore, we propose the Frequency Domain Separable DeepFake Detection (FDS_2D), which is a multi-branch network to obtain features in different frequency spectra. In FDS_2D, the spectral information is divided into three categories: the magnitude spectrum, the phase spectrum, and the relationship between the two spectra. According to their characteristics, we design independent modules for feature extraction from them. Moreover, to improve the utilization efficiency of multi-features, we propose a multi-input multi-output attention mechanism for information interaction between branches. The experimental results show that each part of FDS_2D effectively extracts and applies spectral information; The comprehensive performance of our model is verified on FaceForensic + + , Celeb-DF, and DFDC. It proves that the ability of FDS_2D to detect DeepFake is not inferior to existing models.

Vorheriger Artikel Multi-modal humor segment prediction in video

Nächster Artikel HRCutBlur Augment: effectively enhancing data diversity for image super-resolution

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Juefei-Xu, F., Wang, R., Huang, Y., et al.: Countering malicious deepfakes: Survey, battleground, and horizon. Int. J. Comput. Vision 130(7), 1678–1734 (2022). https://doi.org/10.1007/s11263-022-01606-8CrossRef

Tolosana, R., Vera-Rodriguez, R., Fierrez, J., et al.: Deepfakes and beyond: a survey of face manipulation and fake detection. Information Fusion 64, 131–148 (2020). https://doi.org/10.1016/j.inffus.2020.06.014CrossRef

Goodfellow, I., Pouget-Abadie, J., Mirza, M., et al.: Generative adversarial networks. Commun. ACM 63(11), 139–144 (2020). https://doi.org/10.1145/3422622MathSciNetCrossRef

Kingma, D.P., Welling, M.: Auto-encoding variational bayes. Stat 1050, 1 (2014)MATH

Ho, J., Jain, A., Abbeel, P.: Denoising diffusion probabilistic models. Adv. Neural. Inf. Process. Syst. 33, 6840–6851 (2020)

Lin BS, Hsu DW, Shen CH, et al (2020) Using fully connected and convolutional net for GAN-based face swapping. In: 2020 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS), IEEE, pp 185–188, https://doi.org/10.1109/APCCAS50809.2020.9301665

Choi Y, Choi M, Kim M, et al (2018) StarGAN: Unified generative adversarial networks for multi-domain image-to-image translation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 8789–8797, https://doi.org/10.1109/CVPR.2018.00916

Wang SY, Wang O, Zhang R, et al (2020) CNN-generated images are surprisingly easy to spot... for now. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 8695–8704, https://doi.org/10.1109/CVPR42600.2020.00872

Marra F, Gragnaniello D, Verdoliva L, et al (2019) Do GANs leave artificial fingerprints? In: 2019 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR), IEEE, pp 506–511, https://doi.org/10.1109/MIPR.2019.00103

10.

Matern F, Riess C, Stamminger M (2019) Exploiting visual artifacts to expose deepfakes and face manipulations. In: 2019 IEEE Winter Applications of Computer Vision Workshops (WACVW), IEEE, pp 83–92, https://doi.org/10.1109/WACVW.2019.00020

11.

Zhao H, Zhou W, Chen D, et al (2021) Multi-attentional deepfake detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 2185–2194, https://doi.org/10.1109/CVPR46437.2021.00222

12.

Bondi L, Cannas ED, Bestagini P, et al (2020) Training strategies and data augmentations in CNN-based deepfake video detection. In: 2020 IEEE International Workshop on Information Forensics and Security (WIFS), IEEE, pp 1–6, https://doi.org/10.1109/WIFS49906.2020.9360901

13.

Coccomini DA, Messina N, Gennaro C, et al (2022) Combining efficientnet and vision transformers for video deepfake detection. In: Image Analysis and Processing–ICIAP 2022: 21st International Conference, Lecce, Italy, May 23–27, 2022, Proceedings, Part III, Springer, pp 219–229, https://doi.org/10.1007/978-3-031-06433-3 19

14.

Durall R, Keuper M, Pfreundt FJ, et al (2019) Unmasking deepfakes with simple features. CoRR abs/1911.00686

15.

Liu H, Li X, Zhou W, et al (2021) Spatial-phase shallow learning: Rethinking face forgery detection in frequency domain. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 772–781, https://doi.org/10.1109/CVPR46437.2021.00083

16.

Zhang X, Karaman S, Chang SF (2019) Detecting and simulating artifacts in GAN fake images. In: 2019 IEEE International Workshop on Information Forensics and Security (WIFS), IEEE, pp 1–6, https://doi.org/10.1109/WIFS47025.2019.9035107

17.

Qian Y, Yin G, Sheng L, et al (2020) Thinking in frequency: Face forgery detection by mining frequency-aware clues. In: Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XII, Springer, pp 86–103, https://doi.org/10.1007/978-3-030-58610-2 6

18.

Odena, A., Dumoulin, V., Olah, C.: Deconvolution and checkerboard artifacts. Distill 1(10), e3 (2016)CrossRef

19.

Azulay, A., Weiss, Y.: Why do deep convolutional networks generalize so poorly to small image transformations? J. Mach. Learn. Res. 20, 1–25 (2019)MathSciNetMATH

20.

Wang, B., Li, Y., Wu, X., et al.: Face forgery detection based on the improved siamese network. Secur Commun Net 2022, 1–13 (2022). https://doi.org/10.1155/2022/5169873CrossRef

21.

Yang, G., Xu, K., Fang, X., et al.: Video face forgery detection via facial motion-assisted capturing dense optical flow truncation. Visual Comput (2022). https://doi.org/10.1007/s00371-022-02683-zCrossRef

22.

Wang J, Wu Z, Ouyang W, et al (2022) M2tr: Multi-modal multi-scale transformers for deepfake detection. In: Proceedings of the 2022 International Conference on Multimedia Retrieval, pp 615–623, https://doi.org/10.1145/3512527.3531415

23.

Zhang R (2019) Making convolutional networks shift-invariant again. In: International Conference on Machine Learning, PMLR, pp 7324–7334, URL https://proceedings.mlr.press/v97/zhang19a.html

24.

Kaiser L, Gomez AN, Chollet F (2018) Depthwise separable convolutions for neural machine translation. In: International Conference on Learning Representations

25.

Sandler M, Howard A, Zhu M, et al (2018) Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 4510–4520, https://doi.org/10.1109/CVPR.2018.00474

26.

He K, Zhang X, Ren S, et al (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 770–778, https://doi.org/10.1109/CVPR.2016.90

27.

Luo Y, Zhang Y, Yan J, et al (2021) Generalizing face forgery detection with high-frequency features. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 16,317–16,326, https://doi.org/10.1109/CVPR46437.2021.01605

28.

Feichtenhofer C, Fan H, Malik J, et al (2019) Slowfast networks for video recognition. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp 6202–6211, https://doi.org/10.1109/ICCV.2019.00630

29.

Vaswani A, Shazeer N, Parmar N, et al (2017) Attention is all you need. Advances in Neural Information Processing Systems 30

30.

Dosovitskiy A, Beyer L, Kolesnikov A, et al (2020) An image is worth 16x16 words: Transformers for image recognition at scale. CoRR abs/2010.11929

31.

Rossler A, Cozzolino D, Verdoliva L, et al (2019) Faceforensics++: Learning to detect manipulated facial images. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp 1–11, https://doi.org/10.1109/ICCV.2019.00009

32.

Li Y, Yang X, Sun P, et al (2020) Celeb-df: A large-scale challenging dataset for deepfake forensics. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 3207–3216, https://doi.org/10.1109/CVPR42600.2020.00327

33.

Thies J, Zollhofer M, Stamminger M, et al (2016) Face2face: Real-time face capture and reenactment of RGB videos. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 2387–2395, https://doi.org/10.1109/CVPR.2016.262

34.

Thies, J., Zollh ofer, M., Nießner, M.: Deferred neural rendering: Image synthesis using neural textures. ACM Transact Graph (TOG). 38(4), 1–12 (2019). https://doi.org/10.1145/3306346.3323035CrossRef

35.

Zhou P, Han X, Morariu VI, et al (2017) Two-stream neural networks for tampered face detection. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), IEEE, pp 1831–1839, https://doi.org/10.1109/CVPRW.2017.229

36.

Afchar D, Nozick V, Yamagishi J, et al (2018) Mesonet: a compact facial video forgery detection network. In: 2018 IEEE International Workshop on Information Forensics and Security (WIFS), IEEE, pp 1–7, https://doi.org/10.1109/WIFS.2018.8630761

37.

Nguyen HH, Fang F, Yamagishi J, et al (2019) Multi-task learning for detecting and segmenting manipulated facial images and videos. In: 2019 IEEE 10th International Conference on Biometrics Theory, Applications and Systems (BTAS), IEEE, pp 1–8, https://doi.org/10.1109/BTAS46853.2019.9185974

38.

Chollet F (2017) Xception: Deep learning with depthwise separable convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 1251–1258, https://doi.org/10.1109/CVPR.2017.195

39.

Mehra, A., Agarwal, A., Vatsa, M., et al.: Motion magnified 3-D residual-in-dense network for DeepFake Detection[J]. IEEE Transact Biomet Behav, Identity Sci 5(1), 39–52 (2022). https://doi.org/10.1109/TBIOM.2022.3201887CrossRef

40.

Zhang, D., Zhu, W., Ding, X., et al.: SRTNet: a spatial and residual based two-stream neural network for DeepFakes detection. Multimed Tools App (2022). https://doi.org/10.1007/s11042-022-13966-xCrossRef

Titel: FDS_2D: rethinking magnitude-phase features for DeepFake detection
verfasst von: Gaoming Yang
Anxing Wei
Xianjin Fang
Ji Zhang
Publikationsdatum: 04.06.2023
Verlag: Springer Berlin Heidelberg
Erschienen in: Multimedia Systems / Ausgabe 4/2023
Print ISSN: 0942-4962
Elektronische ISSN: 1432-1882
DOI: https://doi.org/10.1007/s00530-023-01118-6

Springer Professional

Abstract

Bitte loggen Sie sich ein, um Zugang zu Ihrer Lizenz zu erhalten.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Springer Professional "Technik"

Springer Professional "Wirtschaft"

Weitere Artikel der Ausgabe 4/2023

Human pose transfer via shape-aware partial flow prediction network

Triple-level relationship enhanced transformer for image captioning

Hierarchical cross-modal contextual attention network for visual grounding

Multi-modal humor segment prediction in video

Empowering neural collaborative filtering with contextual features for multimedia recommendation

Multi-level network based on transformer encoder for fine-grained image–text matching