Published in: Neural Processing Letters 2/2023

13-08-2022

A Lightweight Neural Learning Algorithm for Real-Time Facial Feature Tracking System via Split-Attention and Heterogeneous Convolution

Authors: Yuandong Ma, Qing Song, Mengjie Hu, Xiaotong Zhu


Abstract

Object tracking has made remarkable progress in recent years, but most advanced trackers are computationally expensive, which limits their deployment on mobile devices with limited resources. In addition, current popular trackers realize similarity learning through feature correlation between multiple branches; some of these cross-correlation methods lose much of the face information, while others introduce a large amount of unfavorable background information. Motivated by this, this paper is committed to reducing the number of algorithm parameters while enhancing the feature-extraction capability. Heterogeneous convolution is introduced into the backbone network to reduce the convolution-kernel parameters, and a search-box mechanism is added to dynamically adjust the network's receptive field and generate more feature maps with cheap operations. Furthermore, we integrate a split-attention mechanism into the backbone network to standardize the arrangement of the heterogeneous convolutions. To evaluate the model, we conducted experiments on the challenging VTB datasets and on actual shooting datasets containing 82,351 facial features. Experimental results show that our method achieves a distance precision (DP) of 93.5% and an overlap success precision (OP) of 67.5%, which are comparable with state-of-the-art object-tracking methods while reducing the parameters by about one-third. Meanwhile, the feature mapping of each convolution module is explored, and an interpretation of the lightweight convolution is given.
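To make the parameter-saving claim concrete, the following is a minimal sketch of how heterogeneous convolution cuts kernel parameters, assuming a HetConv-style scheme in which, within each filter, a fraction 1/p of the input-channel kernels are 3×3 and the remainder are 1×1. The function names, the part ratio p, and the channel sizes are illustrative assumptions, not values taken from the paper.

```python
def standard_conv_params(in_ch: int, out_ch: int, k: int = 3) -> int:
    """Parameters of a standard k x k convolution layer (bias ignored)."""
    return out_ch * in_ch * k * k


def hetconv_params(in_ch: int, out_ch: int, p: int, k: int = 3) -> int:
    """Parameters of a heterogeneous convolution layer where, per filter,
    in_ch // p input-channel kernels are k x k and the rest are 1 x 1."""
    large = in_ch // p          # number of k x k kernels in each filter
    small = in_ch - large       # number of 1 x 1 kernels in each filter
    return out_ch * (large * k * k + small)


if __name__ == "__main__":
    in_ch, out_ch = 256, 256
    std = standard_conv_params(in_ch, out_ch)
    for p in (2, 4, 8):
        het = hetconv_params(in_ch, out_ch, p)
        print(f"p={p}: {het} vs {std} params -> {het / std:.2%} of standard")
```

With p = 1 every kernel is 3×3 and the count reduces to the standard convolution, which is a useful sanity check; larger p trades spatial coverage for fewer parameters, and the overall model-level saving reported in the paper (about one-third) depends on where such layers replace standard ones in the backbone.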


Metadata
Title
A Lightweight Neural Learning Algorithm for Real-Time Facial Feature Tracking System via Split-Attention and Heterogeneous Convolution
Authors
Yuandong Ma
Qing Song
Mengjie Hu
Xiaotong Zhu
Publication date
13-08-2022
Publisher
Springer US
Published in
Neural Processing Letters / Issue 2/2023
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-022-10951-1
