
2023 | OriginalPaper | Book chapter

aMLP-ReID: A Vision MLP Architecture Mixed Linear Attention for Person Reid

Written by: Guangyu Lei, Jingsheng Lei

Published in: Frontier Computing

Publisher: Springer Nature Singapore


Abstract

The Vision Transformer architecture, which is based on self-attention, has been found to extract features well compared with CNNs, but few studies have applied it to downstream detection tasks. Person re-identification consists of judging whether a specific person is present by extracting image features. Traditional methods based on convolutional networks incur an inevitable loss when convolution kernels are combined with pooling and downsampling. We therefore consider replacing the multi-head attention in the Vision Transformer with a MAL layer based on Vision MLP. The attention mechanism aggregates spatial information; specifically, the constructed MAL layer uses an MLP structure to fit spatial information. Building on this MLP structure, we propose aMLP-ReID, which mixes in linear attention, for person re-identification. In experiments on several popular datasets, it shows a clear improvement over CNNs. The experiments also demonstrate the effectiveness of the Vision MLP architecture for image matching tasks.
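
The abstract does not spell out how the MAL layer or its linear attention is built, so the following is only a rough illustration of the general idea of linear attention, not the authors' method. It assumes one common formulation in which a positive feature map (here elu(x) + 1, an assumption) replaces the softmax, so that the key-value summary is computed once and reused for every query. The function names `feature_map`, `linear_attention`, and `quadratic_attention` are hypothetical and chosen for this sketch.

```python
import math


def feature_map(x):
    """elu(x) + 1, a strictly positive feature map (assumed for this sketch)."""
    return x + 1.0 if x > 0 else math.exp(x)


def _apply_map(rows):
    return [[feature_map(x) for x in row] for row in rows]


def linear_attention(Q, K, V):
    """Attention in O(N * d * d_v) time.

    Q, K are N x d lists of floats and V is N x d_v.
    The N x N attention matrix is never formed.
    """
    phi_q, phi_k = _apply_map(Q), _apply_map(K)
    d, d_v = len(phi_k[0]), len(V[0])

    # Summaries over all tokens, computed once:
    # S = phi(K)^T V  (d x d_v)  and  z = sum_j phi(K_j)  (length d)
    S = [[0.0] * d_v for _ in range(d)]
    z = [0.0] * d
    for k_row, v_row in zip(phi_k, V):
        for a in range(d):
            z[a] += k_row[a]
            for b in range(d_v):
                S[a][b] += k_row[a] * v_row[b]

    out = []
    for q_row in phi_q:
        norm = sum(q_row[a] * z[a] for a in range(d))  # positive by construction
        out.append(
            [sum(q_row[a] * S[a][b] for a in range(d)) / norm for b in range(d_v)]
        )
    return out


def quadratic_attention(Q, K, V):
    """Same weights computed through an explicit N x N matrix, for comparison."""
    phi_q, phi_k = _apply_map(Q), _apply_map(K)
    d_v = len(V[0])
    out = []
    for q_row in phi_q:
        weights = [sum(a * b for a, b in zip(q_row, k_row)) for k_row in phi_k]
        total = sum(weights)
        out.append(
            [sum(w * v_row[b] for w, v_row in zip(weights, V)) / total
             for b in range(d_v)]
        )
    return out


if __name__ == "__main__":
    # Three toy "patch tokens" with 2-dimensional queries/keys and 3-dimensional values.
    Q = [[0.5, -1.0], [1.5, 0.2], [-0.3, 0.8]]
    K = [[0.1, 0.4], [-0.7, 1.2], [0.9, -0.5]]
    V = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0], [3.0, 2.0, 0.5]]
    print(linear_attention(Q, K, V))
```

Both functions give the same output up to floating-point error; the linear version simply reorders the matrix products so that the cost grows linearly with the number of tokens rather than quadratically.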


Metadata
Title
aMLP-ReID: A Vision MLP Architecture Mixed Linear Attention for Person Reid
Written by
Guangyu Lei
Jingsheng Lei
Copyright year
2023
Publisher
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-99-1428-9_84
