
2021 | Original Paper | Book Chapter

Event and Activity Recognition in Video Surveillance for Cyber-Physical Systems

Authors: Swarnabja Bhaumik, Prithwish Jana, Partha Pratim Mohanta

Published in: Emergence of Cyber Physical System and IoT in Smart Automation and Robotics

Publisher: Springer International Publishing


Abstract

In this chapter, we aim to aid the development of Cyber-Physical Systems (CPS) for the automated understanding of events and activities in various video-surveillance applications. Such events are mostly captured by drones, CCTV cameras, or unskilled individuals using low-end devices. Being unconstrained in nature, these videos are immensely challenging due to a number of quality factors. We present an extensive account of the approaches taken to solve the problem over the years, ranging from early Structure from Motion (SFM) based methods to recent frameworks built on deep neural networks. We show that long-term motion patterns alone play a pivotal role in recognizing an event. Consequently, each video is compactly represented by a fixed number of key-frames selected with a graph-based approach. Only the temporal features are exploited, using a hybrid architecture that combines a Convolutional Neural Network (CNN) with a Recurrent Neural Network (RNN). The results are encouraging: they outperform standard temporal CNNs and are on par with methods that use spatial information alongside motion cues. Exploring multi-stream models further, we devise a multi-tier fusion strategy for the spatial and temporal wings of a network. A consolidated representation of the individual prediction vectors at the video and frame levels is obtained using a biased conflation technique. This fusion strategy yields a larger gain in precision at each stage than state-of-the-art methods, achieving a strong consensus in classification. Results are reported on four benchmark datasets widely used in action recognition: Columbia Consumer Videos (CCV), Human Motion Database (HMDB), UCF-101, and Kodak's Consumer Video (KCV).
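The graph-based key-frame selection mentioned above can be illustrated with a minimal sketch. This is not the chapter's exact construction: here frames are nodes of a cosine-similarity graph, and representatives are picked greedily by centrality while suppressing frames similar to those already chosen. The function name `select_key_frames` and the greedy suppression rule are assumptions for illustration.

```python
import numpy as np

def select_key_frames(frames, k):
    """Pick k representative frame indices from a video.

    Sketch of a graph-based idea: frames are nodes, edge weights are
    cosine similarities, and high-centrality nodes that are mutually
    dissimilar are kept (illustrative only)."""
    n = frames.shape[0]
    feats = frames.reshape(n, -1).astype(float)
    feats /= np.linalg.norm(feats, axis=1, keepdims=True) + 1e-12
    sim = feats @ feats.T                      # cosine-similarity adjacency
    centrality = sim.sum(axis=1)               # degree = representativeness
    chosen = []
    for _ in range(k):
        i = int(np.argmax(centrality))
        chosen.append(i)
        centrality = centrality - sim[:, i]    # suppress frames similar to i
        centrality[chosen] = -np.inf           # never re-pick a chosen frame
    return chosen

# Toy video: frames 0-4 show one pattern, frames 5-9 another;
# two key-frames should cover both groups.
frames = np.zeros((10, 4))
frames[:5, 0] = 1.0
frames[5:, 1] = 1.0
picks = select_key_frames(frames, 2)
```

With the two homogeneous groups above, the greedy pass selects one frame from each group, since picking a frame from one group drives down the centrality of its near-duplicates.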
It follows that better classification of video sequences leads to more robust actuation of a system designed for event surveillance and object and activity tracking.
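The biased conflation of prediction vectors can also be sketched compactly. A conflation combines several probability distributions into one; a common form is the normalized product, and a "biased" variant exponentiates each stream by a weight. The function name, the weighting scheme, and the toy inputs below are assumptions for illustration, not the chapter's exact formulation.

```python
import numpy as np

def biased_conflation(preds, weights):
    """Fuse per-stream class-probability vectors into one consensus
    distribution: the normalized, weight-exponentiated product of the
    individual distributions (a sketch of a biased conflation)."""
    preds = np.asarray(preds, dtype=float)        # shape: (streams, classes)
    weights = np.asarray(weights, dtype=float)    # one bias weight per stream
    fused = np.prod(preds ** weights[:, None], axis=0)
    return fused / fused.sum()                    # renormalize to a distribution

# Two streams (e.g. spatial and temporal wings) that agree on class 0:
spatial  = np.array([0.7, 0.2, 0.1])
temporal = np.array([0.6, 0.3, 0.1])
fused = biased_conflation([spatial, temporal], weights=[1.0, 1.0])
```

Because the product down-weights classes on which the streams disagree, the fused vector is sharper on the agreed class than either input, which is the "consensus" effect the fusion strategy exploits.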


Metadata
Title
Event and Activity Recognition in Video Surveillance for Cyber-Physical Systems
Authors
Swarnabja Bhaumik
Prithwish Jana
Partha Pratim Mohanta
Copyright Year
2021
DOI
https://doi.org/10.1007/978-3-030-66222-6_4
