Published in: Multimedia Systems 1/2024

1 February 2024 | Regular Paper

Dy-MIL: dynamic multiple-instance learning framework for video anomaly detection

Authors: Chen Li, Mo Chen

Abstract

Anomaly detection is a challenging task in visual understanding because it involves identifying events that deviate significantly from normal patterns. A primary source of difficulty is the diversity and complexity of anomalous events, which makes it infeasible to collect and label every type of anomaly. Weakly supervised methods have therefore become one of the preferred solutions in recent work. In this paper, we focus on weakly supervised learning and propose a dynamic multiple-instance learning (Dy-MIL) framework for video anomaly detection. The framework combines a dynamic ranking method with a k-max-selection scheme to enlarge the inter-class distance between anomalous and normal instances using only video-level labels. Experimental results show that our framework improves performance on three benchmark datasets: the ShanghaiTech dataset, the UCF Crime dataset, and the NUT dataset.
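
The abstract does not spell out the dynamic ranking formulation, so the following is only a minimal sketch of the general idea behind k-max selection in multiple-instance learning, not the authors' method. It assumes each video is a bag of per-segment anomaly scores, takes the mean of the k highest scores per bag, and applies a generic hinge ranking loss so that an anomalous video scores higher than a normal one. The function names, the margin, and the fixed value of k are hypothetical choices made for illustration.

```python
from typing import Sequence


def k_max_mean(scores: Sequence[float], k: int) -> float:
    """Mean of the k largest instance scores in one bag (one video)."""
    if not 1 <= k <= len(scores):
        raise ValueError("k must be between 1 and the number of instances")
    top_k = sorted(scores, reverse=True)[:k]
    return sum(top_k) / k


def mil_ranking_loss(
    anomalous_bag: Sequence[float],
    normal_bag: Sequence[float],
    k: int = 3,
    margin: float = 1.0,
) -> float:
    """Generic hinge ranking loss on k-max bag scores (assumed form).

    The loss is zero once the anomalous bag's k-max score exceeds the
    normal bag's k-max score by at least `margin`. Only video-level
    labels are needed: we know which bag is anomalous, not which segments.
    """
    gap = k_max_mean(anomalous_bag, k) - k_max_mean(normal_bag, k)
    return max(0.0, margin - gap)


# Hypothetical per-segment scores for one anomalous and one normal video.
anomalous_video = [0.1, 0.2, 0.9, 0.8, 0.1, 0.7]
normal_video = [0.1, 0.2, 0.1, 0.3, 0.2, 0.1]

loss = mil_ranking_loss(anomalous_video, normal_video, k=3, margin=1.0)
print(f"ranking loss: {loss:.4f}")
```

Minimizing a loss of this shape pushes the highest-scoring segments of anomalous videos away from those of normal videos, which is one way to read "enlarging the inter-class distance". A dynamic variant would presumably adapt k or the ranking during training, but how Dy-MIL does this is not stated in the abstract.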

Metadata
Title
Dy-MIL: dynamic multiple-instance learning framework for video anomaly detection
Authors
Chen Li
Mo Chen
Publication date
1 February 2024
Publisher
Springer Berlin Heidelberg
Published in
Multimedia Systems / Issue 1/2024
Print ISSN: 0942-4962
Electronic ISSN: 1432-1882
DOI
https://doi.org/10.1007/s00530-023-01237-0
