
Neural Processing Letters

Published in:

15-05-2021

Boundary Adjusted Network Based on Cosine Similarity for Temporal Action Proposal Generation

Authors: Jingye Zheng, Dihu Chen, Haifeng Hu

Published in: Neural Processing Letters | Issue 4/2021


Abstract

Detecting temporal actions in long, untrimmed videos is a challenging and important problem in computer vision. Generating high-quality proposals is a key step in temporal action detection. A high-quality set of proposals usually has two main characteristics. First, the temporal overlap between proposals and action instances should be as large as possible. Second, the number of generated proposals should be as small as possible. Inspired by similarity comparison in face recognition and by the similarity of actions within the same action segment, we design a module that compares the similarity of visual features extracted by a visual feature encoder. We locate the time points where the feature similarity changes sharply and use them to generate candidate proposals. Then, we train a classifier to evaluate whether each candidate proposal contains an action instance. Experiments suggest that our method outperforms other temporal action proposal generation methods on the THUMOS-14 and ActivityNet-v1.3 datasets. In addition, our method still outperforms other methods when using visual features extracted from different networks.
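The boundary-finding idea in the abstract can be illustrated with a minimal sketch. This is not the authors' network: it assumes each video snippet is already encoded as a plain feature vector, treats a "sharp change" as the cosine similarity between adjacent snippets falling below a fixed cutoff, and pairs every two detected boundaries into a candidate segment. The function names, the `threshold` value, and the pairing rule are all hypothetical choices made for illustration.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention for zero vectors (assumption)
    return dot / (norm_u * norm_v)


def adjacent_similarities(features):
    """Similarity between each snippet feature and the one that follows it."""
    return [cosine_similarity(features[i], features[i + 1])
            for i in range(len(features) - 1)]


def boundary_points(features, threshold=0.5):
    """Indices t where similarity between snippets t-1 and t is below threshold.

    The fixed threshold is a hypothetical stand-in for "changes sharply".
    """
    sims = adjacent_similarities(features)
    return [i + 1 for i, s in enumerate(sims) if s < threshold]


def candidate_proposals(features, threshold=0.5):
    """Pair every two boundary points (start < end) into a candidate segment.

    Exhaustive pairing is an assumption; a trained classifier would then
    score each candidate, which this sketch does not attempt.
    """
    points = boundary_points(features, threshold)
    return list(combinations(points, 2))
```

For a toy sequence such as `[[1, 0], [1, 0], [0, 1], [0, 1], [1, 0], [1, 0]]`, the features flip direction at snippets 2 and 4, so `boundary_points` returns `[2, 4]` and `candidate_proposals` returns the single segment `(2, 4)`.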




Title: Boundary Adjusted Network Based on Cosine Similarity for Temporal Action Proposal Generation
Authors: Jingye Zheng, Dihu Chen, Haifeng Hu
Publication date: 15-05-2021
Publisher: Springer US
Published in: Neural Processing Letters / Issue 4/2021
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI: https://doi.org/10.1007/s11063-021-10500-2

