Abstract
Target tracking in unmanned aerial vehicle (UAV) scenarios is an important part of visual target tracking. Compared with tracking in general scenarios, UAV tracking is especially challenging because targets appear at small scale and are seen from an aerial view. Although discriminative correlation filter (DCF)-based trackers achieve good results in general scenarios, the boundary effect caused by their dense sampling reduces tracking accuracy, particularly in UAV-tracking scenarios. In this work, we propose learning an adaptive spatial-temporal context-aware (ASTCA) model within the DCF-based tracking framework to improve tracking accuracy and reduce the influence of the boundary effect, making our tracker better suited to UAV-tracking tasks. Specifically, the ASTCA model learns a spatial-temporal context weight that can precisely distinguish the target from the background in UAV-tracking scenarios. In addition, to account for the small target scale and the aerial view in UAV-tracking scenarios, the ASTCA model incorporates spatial context information into the DCF-based tracker, which can effectively alleviate background interference. Extensive experiments demonstrate that our ASTCA method performs favorably against state-of-the-art tracking methods on several standard UAV datasets.
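To make the DCF setting concrete, the following is a minimal single-channel sketch of a generic correlation filter trained and applied in the Fourier domain. It is not the ASTCA model: it has no spatial-temporal context weight and no context term. The function names and the `sigma` and `lam` values are illustrative assumptions. Because the FFT treats the patch as periodic, the filter is implicitly trained on circular shifts of the patch; this dense sampling is what gives rise to the boundary effect mentioned above.

```python
import numpy as np


def gaussian_label(shape, sigma=2.0):
    """Desired filter response: a Gaussian peak at the patch centre."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = h // 2, w // 2
    return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2.0 * sigma ** 2))


def train_filter(patch, label, lam=1e-2):
    """Closed-form ridge-regression filter (complex conjugate, Fourier domain).

    lam is the regularisation weight (an assumed value, not from the paper).
    """
    X = np.fft.fft2(patch)
    Y = np.fft.fft2(label)
    return (Y * np.conj(X)) / (X * np.conj(X) + lam)


def detect(filter_conj, patch):
    """Correlate the filter with a new patch and return the response peak."""
    Z = np.fft.fft2(patch)
    response = np.real(np.fft.ifft2(filter_conj * Z))
    peak = np.unravel_index(np.argmax(response), response.shape)
    return response, peak


def estimate_shift(peak, shape):
    """Convert the peak position into a (row, col) displacement from the centre."""
    return int(peak[0]) - shape[0] // 2, int(peak[1]) - shape[1] // 2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    patch = rng.standard_normal((64, 64))            # stand-in for a feature map
    filt = train_filter(patch, gaussian_label(patch.shape))
    moved = np.roll(patch, (3, -5), axis=(0, 1))     # circularly shifted "next frame"
    _, peak = detect(filt, moved)
    print(estimate_shift(peak, patch.shape))
```

In this toy setting the next frame is an exact circular shift of the training patch, which is the idealised case the Fourier-domain solution assumes. Real frames violate that assumption at the patch borders, which is why spatial and context-aware weighting schemes such as the one proposed here are studied.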
Index Terms
- Learning Adaptive Spatial-Temporal Context-Aware Correlation Filters for UAV Tracking