Published in: Pattern Analysis and Applications 2/2020

4 May 2019 | Theoretical advances

Violence detection in videos for an intelligent surveillance system using MoBSIFT and movement filtering algorithm

Authors: I. P. Febin, K. Jayasree, Preetha Theresa Joy

Abstract

Action recognition is an active research area in computer vision with a wide range of applications. Within it, recognizing violent actions is of particular importance because it bears directly on safety and security. An intelligent surveillance system automatically recognizes suspicious activities in surveillance videos and thereby helps security personnel take the right action at the right time. In this area, most researchers have focused on people detection and tracking, loitering, and similar tasks, whereas detecting violent actions or fights has been studied comparatively little. Previous works relied on local spatiotemporal feature extractors; however, these carry the overhead of complex optical flow estimation. Although the temporal derivative is a fast alternative to optical flow, on its own it gives very low accuracy and scale-dependent results. Hence, we propose a cascaded method of violence detection based on motion boundary SIFT (MoBSIFT) and movement filtering. In this method, surveillance videos are first checked by a movement filtering algorithm based on the temporal derivative, which keeps most nonviolent actions from going through feature extraction. Only the frames that pass the filter proceed to feature extraction. In addition to the scale-invariant feature transform (SIFT) and histogram of optical flow (HOF) features, a motion boundary histogram (MBH) is also extracted, and the three are combined to form the MoBSIFT descriptor. The experimental results show that the proposed MoBSIFT outperforms existing methods in accuracy owing to its high tolerance to camera movements. The experiments also show that using movement filtering together with MoBSIFT reduces time complexity.
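The cascade described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact filtering criterion or threshold, so the mean absolute frame difference used here is an assumption standing in for the temporal derivative, and the names `temporal_derivative_energy`, `movement_filter`, `cascade`, and `extract_features` are hypothetical. The sketch shows only the gating idea: a cheap motion check decides which frames reach the expensive descriptor stage.

```python
from typing import Callable, Dict, List, Sequence

# A grayscale frame as rows of pixel intensities (assumed representation).
Frame = Sequence[Sequence[float]]


def temporal_derivative_energy(prev: Frame, curr: Frame) -> float:
    """Mean absolute intensity change between two consecutive frames.

    Assumption: this simple frame difference stands in for the temporal
    derivative; the paper's actual measure may differ.
    """
    total = 0.0
    count = 0
    for row_prev, row_curr in zip(prev, curr):
        for p, c in zip(row_prev, row_curr):
            total += abs(c - p)
            count += 1
    return total / count if count else 0.0


def movement_filter(frames: Sequence[Frame], threshold: float) -> List[int]:
    """Indices of frames whose change from the previous frame exceeds threshold."""
    return [
        i
        for i in range(1, len(frames))
        if temporal_derivative_energy(frames[i - 1], frames[i]) > threshold
    ]


def cascade(
    frames: Sequence[Frame],
    threshold: float,
    extract_features: Callable[[Frame, Frame], object],
) -> Dict[int, object]:
    """Run the costly extractor only on frame pairs kept by the movement filter.

    `extract_features` is a hypothetical placeholder for a descriptor stage
    such as MoBSIFT; it is not implemented here.
    """
    return {
        i: extract_features(frames[i - 1], frames[i])
        for i in movement_filter(frames, threshold)
    }
```

In this sketch, a sequence with little frame-to-frame change produces no calls to `extract_features`, which is where a reduction in computation would come from. The threshold value is a free parameter that would need tuning on real footage.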

Metadata
Title
Violence detection in videos for an intelligent surveillance system using MoBSIFT and movement filtering algorithm
Authors
I. P. Febin
K. Jayasree
Preetha Theresa Joy
Publication date
4 May 2019
Publisher
Springer London
Published in
Pattern Analysis and Applications / Issue 2/2020
Print ISSN: 1433-7541
Electronic ISSN: 1433-755X
DOI
https://doi.org/10.1007/s10044-019-00821-3
