
2024 | OriginalPaper | Chapter

Multi-Query Person Search with Transformers

Authors: Ying Chen, Zhihui Li, Andy Song

Published in: Advances in Knowledge Discovery and Data Mining

Publisher: Springer Nature Singapore


Abstract

We propose a transformer-based multi-query person search (MQPS) method that jointly performs person detection and person re-identification (re-id) in an end-to-end framework. Most existing person search methods employ hand-crafted components and involve multiple steps and stages to detect and identify the target person, which makes them computationally inefficient and hard to generalise to different datasets. Recent advances in end-to-end object detection with transformers, mainly the DETR family, employ object queries to learn objects and directly predict a set of bounding boxes and object classes. However, this approach uses one object query per object, so the detected object is centred around the object's spatial location, which is not ideal for small and occluded objects during feature representation learning. Therefore, we propose a multi-query method for person detection and person feature representation learning. Specifically, MQPS utilises multiple adjacent object queries to learn a target person object with multi-scale features. Moreover, to improve the feature representation learning of intra-identity objects, we employ a margin ranking loss to pull person instances of the same identity closer together in the feature space. Experiments on the CUHK-SYSU and PRW datasets demonstrate the effectiveness of the proposed method.
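
The abstract does not give the exact form of the margin ranking loss, so the following is only a minimal sketch of a generic margin ranking loss on cosine similarities, not the authors' formulation. The function names, the margin value of 0.5, and the toy two-dimensional features are all assumptions made for illustration. The loss is zero once an anchor is more similar to a same-identity instance than to a different-identity instance by at least the margin.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def margin_ranking_loss(anchor, positive, negative, margin=0.5):
    """Generic margin ranking loss (hypothetical helper, not from the paper).

    Penalises the case where the anchor is not at least `margin` more
    similar to the same-identity instance than to the other identity.
    """
    s_pos = cosine_similarity(anchor, positive)  # same identity
    s_neg = cosine_similarity(anchor, negative)  # different identity
    return max(0.0, margin - (s_pos - s_neg))


# Toy features, chosen only to make the arithmetic easy to follow.
anchor = [1.0, 0.0]
same_identity = [1.0, 0.0]
other_identity = [0.0, 1.0]

# Well-separated case: the similarity gap already exceeds the margin.
loss_easy = margin_ranking_loss(anchor, same_identity, other_identity)

# Swapped roles: the "positive" is orthogonal to the anchor, so the loss is large.
loss_hard = margin_ranking_loss(anchor, other_identity, same_identity)
```

Minimising a loss of this shape pushes features of the same identity together and features of different identities apart, which is the effect the abstract describes for intra-identity instances.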


Metadata
Title
Multi-Query Person Search with Transformers
Authors
Ying Chen
Zhihui Li
Andy Song
Copyright Year
2024
Publisher
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-97-2238-9_9
