Multimedia Systems

01-02-2024 | Regular Paper

Object-based video anomaly detection using multi-attention and adaptive velocity attribute representation learning

Authors: Xiaopeng Ren, Huifen Xia, Yongzhao Zhan

Published in: Multimedia Systems | Issue 1/2024

Abstract

Video anomaly detection is an important topic in multimedia technology. Existing prediction models for anomaly detection do not sufficiently exploit multi-scale features or cross-learning between low-level and high-level features, resulting in inadequate learning of object appearance and motion representations. In addition, the velocity attribute of the object has not been effectively utilized, resulting in inadequate learning of object motion information. To this end, a novel method for object-based video anomaly detection using multi-attention and adaptive velocity attribute representation learning is proposed. In this method, the MA-Unet (multiple attention Unet) model, which incorporates channel attention, multi-scale spatial attention, and cross-semantic attention, is constructed to learn object features and capture more effective object appearance and motion information. Furthermore, a mechanism combining adaptive velocity attribute representation learning is proposed for anomaly discrimination, aiming to learn the velocity attributes of the object reasonably and make better use of its motion information. Experiments conducted on the publicly available datasets UCSD Ped2, Avenue, and ShanghaiTech show that our method outperforms state-of-the-art methods, which further validates its effectiveness.
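
As a rough illustration of the channel attention idea named in the abstract, the sketch below reweights the channels of a small feature map in the generic squeeze-and-excitation style. It is not the MA-Unet module itself, whose design is not given on this page. The function name `channel_attention`, the toy feature map, and the fixed weights `w1` and `w2` are all assumptions made for the example; a trained model would learn the weights.

```python
import math

def channel_attention(features, w1, w2):
    """Reweight channels of a feature map given as C channels of H x W lists.

    Generic squeeze-and-excitation style sketch (an assumption, not the
    paper's module): pool each channel, pass the pooled vector through a
    small bottleneck, and scale each channel by the resulting weight.
    """
    # Squeeze: global average pooling yields one descriptor per channel.
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
                for ch in features]
    # Excite: linear layer + ReLU, then linear layer + sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)))
              for row in w1]
    weights = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
               for row in w2]
    # Scale: multiply every value in a channel by that channel's weight.
    attended = [[[v * a for v in row] for row in ch]
                for ch, a in zip(features, weights)]
    return attended, weights

# Toy input: two 2x2 channels; hypothetical hand-picked weights.
features = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]]]
w1 = [[0.5, -0.5]]      # maps 2 channel descriptors to 1 hidden unit
w2 = [[1.0], [-1.0]]    # maps 1 hidden unit back to 2 channel weights
attended, weights = channel_attention(features, w1, w2)
print(weights)
```

In this toy case the first channel, which carries all the activation, receives the larger weight, which is the intended effect of emphasizing informative channels.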

Title: Object-based video anomaly detection using multi-attention and adaptive velocity attribute representation learning
Authors: Xiaopeng Ren, Huifen Xia, Yongzhao Zhan
Publication date: 01-02-2024
Publisher: Springer Berlin Heidelberg
Published in: Multimedia Systems / Issue 1/2024
Print ISSN: 0942-4962
Electronic ISSN: 1432-1882
DOI: https://doi.org/10.1007/s00530-023-01257-w
