2023 | OriginalPaper | Book chapter

Image Captioning for Nantong Blue Calico Through Stacked Local-Global Channel Attention Network

Written by: Chenyi Guo, Li Zhang, Xiang Yu

Published in: Artificial Neural Networks and Machine Learning – ICANN 2023

Publisher: Springer Nature Switzerland

Abstract

Nantong blue calico, a Chinese folk craft of hand printing and dyeing, is recognized as one of China's intangible cultural heritages (ICHs). To help inherit and promote this heritage, this study applies image captioning to the explanation of blue-calico images. For this purpose, a novel image captioning method, called the stacked local-global channel attention network (SLGCAN), is proposed. The network focuses on extracting important features from blue-calico images so that it can generate more accurate captions for them. SLGCAN consists of three parts: a residual network (ResNet), a stacked local-global channel attention module (SLGCAM), and a Transformer. First, a pre-trained ResNet-101 model extracts rough features from blue-calico images; SLGCAM then derives fine-grained information from these rough features. Finally, the Transformer encodes and decodes the fine-grained information to predict the words of the caption. Experiments are conducted on a collected blue-calico image dataset. In these experiments, we compare SLGCAN with baseline models and show that the proposed model is feasible and effective.
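The abstract does not describe the internals of SLGCAM, so the following is only a minimal, hypothetical sketch of what a stacked channel attention step with a local and a global gate could look like. It is not the authors' module: the gate formulas, the fixed (unlearned) averaging filter, the equal-weight fusion, and all function names are assumptions made for illustration, and a toy list of channels stands in for real ResNet-101 features.

```python
import math
from typing import List

FeatureMap = List[List[float]]  # [channels][spatial positions]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(features: FeatureMap, kernel_size: int = 3) -> FeatureMap:
    """One hypothetical local-global channel attention step (an assumption,
    not the paper's exact module)."""
    num_channels = len(features)
    # Squeeze: global average pooling gives one descriptor per channel.
    pooled = [sum(ch) / len(ch) for ch in features]

    # Global gate: compare each descriptor with the mean over all channels.
    global_mean = sum(pooled) / num_channels
    global_gate = [sigmoid(p - global_mean) for p in pooled]

    # Local gate: average the descriptors of neighbouring channels. This fixed
    # filter stands in for a learned 1D convolution across channels.
    half = kernel_size // 2
    local_gate = []
    for c in range(num_channels):
        window = pooled[max(0, c - half): c + half + 1]
        local_gate.append(sigmoid(sum(window) / len(window)))

    # Fuse both gates (equal weighting is an assumption) and rescale each
    # channel by its weight, which always lies strictly between 0 and 1.
    weights = [0.5 * (g + l) for g, l in zip(global_gate, local_gate)]
    return [[w * v for v in ch] for w, ch in zip(weights, features)]


def stacked_channel_attention(features: FeatureMap, depth: int = 2) -> FeatureMap:
    """Apply the attention step `depth` times in sequence (the 'stacked' part)."""
    for _ in range(depth):
        features = channel_attention(features)
    return features


# Toy input: 4 channels with 3 spatial positions each, standing in for the
# rough features a pre-trained ResNet-101 would produce.
resnet_features = [[0.1, 0.2, 0.3], [1.0, 1.2, 0.8], [0.0, 0.0, 0.1], [2.0, 1.5, 2.5]]
refined = stacked_channel_attention(resnet_features, depth=2)
```

In a full pipeline of the kind the abstract outlines, the refined feature map would then be passed to a Transformer encoder-decoder that predicts the caption words; that stage is omitted here.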

Metadata
Title
Image Captioning for Nantong Blue Calico Through Stacked Local-Global Channel Attention Network
Written by
Chenyi Guo
Li Zhang
Xiang Yu
Copyright year
2023
DOI
https://doi.org/10.1007/978-3-031-44210-0_29