2022 | OriginalPaper | Chapter

Learning Image Representation via Attribute-Aware Attention Networks for Fashion Classification

Authors: Yongquan Wan, Cairong Yan, Bofeng Zhang, Guobing Zou

Published in: MultiMedia Modeling

Publisher: Springer International Publishing

Abstract

Attribute descriptions enrich the characterization of fashion products and play an essential role in fashion image research. We propose M2Fashion, a fashion classification model based on multi-modal data (text and image). It exploits intra-modal and inter-modal correlations to locate relevant image regions, guided by attributes through an attention mechanism. Compared with traditional single-modal feature representations, embeddings learned from multi-modal features better reflect fine-grained image features. We adopt a multi-task learning framework that combines category classification and attribute prediction. Extensive experiments on the public DeepFashion dataset show that M2Fashion outperforms state-of-the-art methods: it achieves a +1.3% improvement in top-3 accuracy on category classification and +5.6% and +3.7% improvements in top-3 recall on part and shape attribute prediction, respectively. A supplementary experiment on attribute-specific image retrieval on the DARN dataset also demonstrates the effectiveness of M2Fashion.
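
The abstract does not give the architecture in detail, so the following is only a minimal sketch of the two ideas it names: attention over image regions guided by an attribute embedding, and a multi-task objective that adds a category classification loss to an attribute prediction loss. It is not the authors' implementation. The function names, the dot-product scoring, the sigmoid cross-entropy for attributes, and the `attr_weight` parameter are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attribute_attention(regions, attr_embedding):
    """Weight image regions by similarity to an attribute embedding.

    regions: list of region feature vectors (assumed, e.g. CNN grid cells).
    attr_embedding: one text-side attribute vector of the same dimension.
    Returns (attention weights, attention-pooled feature vector).
    Dot-product scoring is an assumption, not taken from the paper.
    """
    scores = [sum(r_i * a_i for r_i, a_i in zip(r, attr_embedding))
              for r in regions]
    weights = softmax(scores)
    dim = len(attr_embedding)
    attended = [sum(w * r[d] for w, r in zip(weights, regions))
                for d in range(dim)]
    return weights, attended


def cross_entropy(logits, target_index):
    """Softmax cross-entropy for single-label category classification."""
    return -math.log(softmax(logits)[target_index])


def binary_cross_entropy(logits, targets):
    """Mean sigmoid cross-entropy for multi-label attribute prediction.

    Plain formulation for readability; very large |logit| values would
    need a numerically stable variant.
    """
    losses = []
    for z, y in zip(logits, targets):
        p = 1.0 / (1.0 + math.exp(-z))
        losses.append(-(y * math.log(p) + (1 - y) * math.log(1 - p)))
    return sum(losses) / len(losses)


def multitask_loss(cat_logits, cat_target, attr_logits, attr_targets,
                   attr_weight=1.0):
    """Category loss plus weighted attribute loss (attr_weight is hypothetical)."""
    return (cross_entropy(cat_logits, cat_target)
            + attr_weight * binary_cross_entropy(attr_logits, attr_targets))


# Toy usage: three 2-d regions and an attribute vector aligned with region 0.
regions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
weights, attended = attribute_attention(regions, [2.0, 0.0])
loss = multitask_loss([0.0, 0.0], 0, [0.0], [1])
```

In the toy usage, the region most similar to the attribute vector receives the largest attention weight, and the pooled feature is the weighted sum of the regions. In a trained model the region features, attribute embeddings, and logits would come from learned networks rather than hand-written lists.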

Metadata
Title
Learning Image Representation via Attribute-Aware Attention Networks for Fashion Classification
Authors
Yongquan Wan
Cairong Yan
Bofeng Zhang
Guobing Zou
Copyright Year
2022
DOI
https://doi.org/10.1007/978-3-030-98358-1_6
