2022 | OriginalPaper | Book chapter

Learning Image Representation via Attribute-Aware Attention Networks for Fashion Classification

Written by: Yongquan Wan, Cairong Yan, Bofeng Zhang, Guobing Zou

Published in: MultiMedia Modeling

Publisher: Springer International Publishing

Abstract

Attribute descriptions enrich the characterization of fashion products and play an essential role in fashion image research. We propose M2Fashion, a fashion classification model based on multi-modal data (text + image). It exploits intra-modal and inter-modal data correlations to locate relevant image regions under the guidance of attributes and an attention mechanism. Compared with traditional single-modal feature representations, embeddings learned from multi-modal features better reflect fine-grained image features. We adopt a multi-task learning framework that combines category classification and attribute prediction. Extensive experiments on the public DeepFashion dataset show that M2Fashion outperforms state-of-the-art methods: it achieves a +1.3% improvement in top-3 accuracy on the category classification task and +5.6%/+3.7% improvements in top-3 recall on attribute prediction for part/shape, respectively. A supplementary experiment on attribute-specific image retrieval on the DARN dataset also demonstrates the effectiveness of M2Fashion.
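
The abstract does not specify the architecture, so the following is only a minimal sketch of the two general ideas it names: attention over image regions guided by an attribute embedding, and a multi-task loss that sums a category classification term and an attribute prediction term. It uses generic scaled dot-product attention as a stand-in; the function names, the toy feature vectors, and the `attr_weight` parameter are all assumptions for illustration and are not taken from M2Fashion.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def attribute_guided_pooling(region_feats, attr_embedding):
    """Weight image regions by similarity to an attribute embedding.

    Generic scaled dot-product attention (an assumption, not the
    paper's exact mechanism). Returns the pooled feature and weights.
    """
    dim = len(attr_embedding)
    scores = [dot(r, attr_embedding) / math.sqrt(dim) for r in region_feats]
    weights = softmax(scores)
    pooled = [sum(w * r[i] for w, r in zip(weights, region_feats))
              for i in range(dim)]
    return pooled, weights

def cross_entropy(logits, target_index):
    """Single-label loss for the category classification task."""
    return -math.log(softmax(logits)[target_index])

def binary_cross_entropy(logits, targets):
    """Multi-label loss for the attribute prediction task (mean over attributes)."""
    total = 0.0
    for z, t in zip(logits, targets):
        p = 1.0 / (1.0 + math.exp(-z))
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(logits)

def multi_task_loss(cat_logits, cat_target, attr_logits, attr_targets,
                    attr_weight=1.0):
    """Weighted sum of both task losses; attr_weight is a hypothetical knob."""
    return (cross_entropy(cat_logits, cat_target)
            + attr_weight * binary_cross_entropy(attr_logits, attr_targets))

# Toy data: three image regions with 2-d features, one attribute embedding.
regions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
attr = [2.0, 0.0]
pooled, weights = attribute_guided_pooling(regions, attr)
loss = multi_task_loss([2.0, 0.5, -1.0], 0, [1.5, -0.5], [1, 0])
```

In this toy setup the first region is most aligned with the attribute embedding, so it receives the largest attention weight. In a trained model the region features and attribute embeddings would be learned, and the loss would be minimized jointly over both tasks.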


Metadata
Title
Learning Image Representation via Attribute-Aware Attention Networks for Fashion Classification
Written by
Yongquan Wan
Cairong Yan
Bofeng Zhang
Guobing Zou
Copyright year
2022
DOI
https://doi.org/10.1007/978-3-030-98358-1_6