International Journal of Computer Vision, published 24-11-2020

Viewpoint and Scale Consistency Reinforcement for UAV Vehicle Re-Identification

Authors: Shangzhi Teng, Shiliang Zhang, Qingming Huang, Nicu Sebe

Published in: International Journal of Computer Vision | Issue 3/2021

Abstract

This paper studies vehicle ReID in aerial videos taken by Unmanned Aerial Vehicles (UAVs). Compared with existing vehicle ReID tasks performed with fixed surveillance cameras, UAV vehicle ReID is still under-explored and could be more challenging: aerial videos have dynamic and complex backgrounds, different vehicles show similar appearances, and the same vehicle commonly shows distinct viewpoints and scales. To facilitate research on UAV vehicle ReID, this paper contributes a novel dataset called UAV-VeID. UAV-VeID contains 41,917 images of 4,601 vehicles captured by UAVs, where each vehicle has multiple images taken from different viewpoints. UAV-VeID also includes a large-scale distractor set to encourage research on efficient ReID schemes. Compared with existing vehicle ReID datasets, UAV-VeID exhibits substantial variance in the viewpoints and scales of vehicles, and thus requires more robust features. To alleviate the negative effects of this variance, the paper also proposes a viewpoint adversarial training strategy and a multi-scale consensus loss to promote the robustness and discriminative power of the learned deep features. Extensive experiments on UAV-VeID show that the approach outperforms recent vehicle ReID algorithms. Moreover, the method achieves competitive performance compared with recent works on existing vehicle ReID datasets, including VehicleID, VeRi-776 and VERI-Wild.

Title: Viewpoint and Scale Consistency Reinforcement for UAV Vehicle Re-Identification
Authors: Shangzhi Teng, Shiliang Zhang, Qingming Huang, Nicu Sebe
Publication date: 24-11-2020
Publisher: Springer US
Published in: International Journal of Computer Vision / Issue 3/2021
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI: https://doi.org/10.1007/s11263-020-01402-2
