2025 | OriginalPaper | Chapter

A BERT-Based Method of Named Entity Recognition for Ukiyo-e Titles

Authors: Bohao Wu, Akira Maeda

Published in: Sustainability and Empowerment in the Context of Digital Libraries

Publisher: Springer Nature Singapore


Abstract

Named entity recognition (NER) is particularly challenging for historical documents, which typically lack extensive annotated datasets [6, 14]. The titles of ukiyo-e, a genre of Japanese artworks, are short texts rich in historical information and contain many named entities; their brevity and complexity make them difficult to analyze. This paper describes the construction of an ukiyo-e NER dataset and introduces a BERT-based NER method that achieves notable success in recognizing entities in ukiyo-e titles. The study demonstrates the effectiveness of BERT and its derivative models on the ukiyo-e NER dataset and proposes a viable NER solution for historical documents: pre-trained models fine-tuned on a relatively small annotated dataset achieve an accuracy exceeding 80% on ukiyo-e titles. The paper also examines the distinct characteristics of BERT and its derivative models and optimizes their application to the ukiyo-e NER dataset.
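The approach summarized above amounts to fine-tuning a pre-trained BERT-style encoder for token classification on a small annotated set of titles. The following is a minimal sketch (not the authors' implementation) of such a setup using the Hugging Face transformers library; the checkpoint name, entity label set, and example annotation are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of BERT fine-tuning for token-level NER on short titles.
# The checkpoint, label set, and the example title/tags are assumptions for
# illustration; they are not taken from the paper or its dataset.
import torch
from transformers import (AutoTokenizer, AutoModelForTokenClassification,
                          Trainer, TrainingArguments)

MODEL_NAME = "bert-base-multilingual-cased"   # assumed multilingual checkpoint
LABELS = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC", "B-TTL", "I-TTL"]  # assumed tag set
label2id = {l: i for i, l in enumerate(LABELS)}

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForTokenClassification.from_pretrained(
    MODEL_NAME, num_labels=len(LABELS),
    id2label=dict(enumerate(LABELS)), label2id=label2id)

def encode(words, tags, max_length=32):
    """Tokenize a pre-segmented title and align word-level tags to subwords."""
    enc = tokenizer(words, is_split_into_words=True, truncation=True,
                    padding="max_length", max_length=max_length)
    labels = []
    for word_idx in enc.word_ids():
        # Special tokens are ignored by the loss (-100); subword continuations
        # simply reuse the word's tag for simplicity in this sketch.
        labels.append(-100 if word_idx is None else label2id[tags[word_idx]])
    enc["labels"] = labels
    return enc

# A single made-up training example: a segmented title with BIO tags.
train_items = [encode(["東海道", "五十三次", "日本橋"],
                      ["B-TTL", "I-TTL", "B-LOC"])]

class TitleDataset(torch.utils.data.Dataset):
    def __init__(self, items): self.items = items
    def __len__(self): return len(self.items)
    def __getitem__(self, i):
        return {k: torch.tensor(v) for k, v in self.items[i].items()}

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ukiyoe-ner", num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=TitleDataset(train_items),
)
trainer.train()
```

A real experiment would substitute a Japanese-pretrained checkpoint, the paper's own label set, and its annotated titles and data split; the structure of the fine-tuning loop would remain the same.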


Literature
3. Batjargal, B., Khaltarkhuu, G., Kimura, F., Maeda, A.: An approach to named entity extraction from historical documents in traditional Mongolian script. In: IEEE/ACM Joint Conference on Digital Libraries, JCDL 2014, London, United Kingdom, 8–12 September 2014, pp. 489–490. IEEE Computer Society (2014). https://doi.org/10.1109/JCDL.2014.6970239
5. Devlin, J., Chang, M., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, 2–7 June 2019 (Volume 1: Long and Short Papers), pp. 4171–4186. Association for Computational Linguistics (2019). https://doi.org/10.18653/V1/N19-1423
6.
7. Hamdi, A., et al.: A multilingual dataset for named entity recognition, entity linking and stance detection in historical newspapers. In: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2021, Virtual Event, Canada, 11–15 July 2021, pp. 2328–2334. ACM (2021). https://doi.org/10.1145/3404835.3463255
9. Iwakura, T., Komiya, K., Tachibana, R.: Constructing a Japanese basic named entity corpus of various genres. In: Proceedings of the Sixth Named Entity Workshop, NEWS@ACL 2016, Berlin, Germany, 12 August 2016, pp. 41–46. Association for Computational Linguistics (2016). https://doi.org/10.18653/V1/W16-2706
10. Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: ALBERT: a lite BERT for self-supervised learning of language representations. In: 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, 26–30 April 2020. OpenReview.net (2020). https://openreview.net/forum?id=H1eA7AEtvS
13. Misawa, S., Taniguchi, M., Miura, Y., Ohkuma, T.: Character-based bidirectional LSTM-CRF with words and characters for Japanese named entity recognition. In: Proceedings of the First Workshop on Subword and Character Level Models in NLP, Copenhagen, Denmark, 7 September 2017, pp. 97–102. Association for Computational Linguistics (2017). https://doi.org/10.18653/V1/W17-4114
15. Sasano, R., Kurohashi, S.: Japanese named entity recognition using structural natural language processing. In: Third International Joint Conference on Natural Language Processing, IJCNLP 2008, Hyderabad, India, 7–12 January 2008, pp. 607–612. The Association for Computer Linguistics (2008). https://aclanthology.org/I08-2080/
16. Tomori, S., Ninomiya, T., Mori, S.: Domain specific named entity recognition referring to the real world by deep neural networks. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016, Berlin, Germany, 7–12 August 2016, Volume 2: Short Papers. The Association for Computer Linguistics (2016). https://doi.org/10.18653/V1/P16-2039
17. Yamada, I., Asai, A., Shindo, H., Takeda, H., Matsumoto, Y.: LUKE: deep contextualized entity representations with entity-aware self-attention. In: Webber, B., Cohn, T., He, Y., Liu, Y. (eds.) Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020, Online, 16–20 November 2020, pp. 6442–6454. Association for Computational Linguistics (2020). https://doi.org/10.18653/V1/2020.EMNLP-MAIN.523
18. Yang, Z., Dai, Z., Yang, Y., Carbonell, J.G., Salakhutdinov, R., Le, Q.V.: XLNet: generalized autoregressive pretraining for language understanding. In: Wallach, H.M., Larochelle, H., Beygelzimer, A., d'Alché-Buc, F., Fox, E.B., Garnett, R. (eds.) Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, Vancouver, BC, Canada, 8–14 December 2019, pp. 5754–5764 (2019). arXiv:1906.08237
20. Yuan, G., Li, K., Goto, M., Kimura, F., Maeda, A.: Extraction of information on historical figures from a Japanese biographical dictionary: an attempt to extract named entities from classical Japanese by few-shot learning. In: DEIM Forum 2022, pp. 187–192 (2022). (in Japanese)
Metadata
Title
A BERT-Based Method of Named Entity Recognition for Ukiyo-e Titles
Authors
Bohao Wu
Akira Maeda
Copyright Year
2025
Publisher
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-96-0865-2_1
