Published in: Multimedia Systems 1/2024

01-02-2024 | Regular Paper

Dy-MIL: dynamic multiple-instance learning framework for video anomaly detection

Authors: Chen Li, Mo Chen

Abstract

Anomaly detection is an extremely challenging task in visual understanding because it requires identifying events that deviate significantly from normal patterns. A primary source of this difficulty is the diversity and complexity of anomalous events, which makes it impossible to collect and label every type of anomaly. In recent work, weakly supervised methods have become one of the most effective solutions for anomaly detection. In this paper, we therefore focus on weakly supervised learning and propose a dynamic multiple-instance learning framework for video anomaly detection. The framework develops a dynamic ranking method combined with the k-max-selection scheme to enlarge the inter-class distance between anomalous and normal instances using only video-level labels. Experimental results demonstrate that our framework achieves superior performance on three benchmark datasets: the ShanghaiTech dataset, the UCF Crime dataset and the NUT dataset.
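
The abstract does not spell out the dynamic ranking method, so the following is only a minimal sketch of the general idea behind k-max selection in multiple-instance learning, not the authors' formulation. It assumes each video is a bag of per-segment anomaly scores, summarises a bag by the mean of its k largest scores, and applies a generic hinge ranking loss so that an anomalous video scores higher than a normal one by some margin. The function names, the fixed `k`, and the `margin` parameter are all hypothetical choices made for illustration.

```python
from typing import Sequence


def k_max_mean(scores: Sequence[float], k: int) -> float:
    """Mean of the k largest instance scores in one bag (one video)."""
    if not 1 <= k <= len(scores):
        raise ValueError("k must be between 1 and the number of instances")
    top_k = sorted(scores, reverse=True)[:k]
    return sum(top_k) / k


def ranking_hinge_loss(
    anomalous_scores: Sequence[float],
    normal_scores: Sequence[float],
    k: int = 3,
    margin: float = 1.0,
) -> float:
    """Generic hinge ranking loss on k-max bag scores (illustrative only).

    The loss is zero once the anomalous bag's k-max score exceeds the
    normal bag's k-max score by at least `margin`. Only video-level
    labels are needed: which bag is anomalous and which is normal.
    """
    gap = k_max_mean(anomalous_scores, k) - k_max_mean(normal_scores, k)
    return max(0.0, margin - gap)


if __name__ == "__main__":
    # Hypothetical per-segment scores for one anomalous and one normal video.
    anomalous = [0.1, 0.9, 0.8, 0.2]
    normal = [0.1, 0.2, 0.1, 0.3]
    print(ranking_hinge_loss(anomalous, normal, k=2))
```

Minimising a loss of this kind pushes the highest-scoring segments of anomalous videos away from those of normal videos, which is one simple way to enlarge the inter-class distance the abstract refers to. A dynamic scheme would presumably adapt the selection or ranking during training, but the abstract leaves those details unspecified.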

Metadata
Title
Dy-MIL: dynamic multiple-instance learning framework for video anomaly detection
Authors
Chen Li
Mo Chen
Publication date
01-02-2024
Publisher
Springer Berlin Heidelberg
Published in
Multimedia Systems / Issue 1/2024
Print ISSN: 0942-4962
Electronic ISSN: 1432-1882
DOI
https://doi.org/10.1007/s00530-023-01237-0
