2025 | OriginalPaper | Chapter

BioLinkerAI: Capturing Knowledge Using LLMs to Enhance Biomedical Entity Linking

Authors: Ahmad Sakor, Kuldeep Singh, Maria-Esther Vidal

Published in: Web Information Systems Engineering – WISE 2024

Publisher: Springer Nature Singapore

Abstract

In this paper, we introduce BioLinkerAI, a neuro-symbolic approach tailored for biomedical entity linking. Traditional domain-specific entity linking approaches require substantial labeled training datasets and are difficult to adapt to each new dataset or setting. BioLinkerAI integrates symbolic methodologies with sub-symbolic models to mitigate the constraints of limited training data availability. The symbolic component uses a rule-based entity extraction mechanism, underpinned by an extensive set of linguistic and domain-specific rules. Concurrently, the sub-symbolic component employs a Large Language Model (LLM) to achieve precise candidate disambiguation. This mechanism enhances entity linking accuracy, especially when a single entity in a knowledge base such as UMLS aligns with multiple terms, leveraging the contextual intelligence encapsulated within the LLM's embeddings. Empirical evaluations conducted on a range of biomedical benchmarks demonstrate the superior performance of BioLinkerAI. Notably, it surpasses existing baselines in entity linking accuracy (e.g., improving from 65.4 (best baseline) to 78.5 (our model) on unseen-data accuracy, the most stringent evaluation paradigm). Additionally, BioLinkerAI performs consistently well on both structured sentences and individual keywords. To facilitate broader utilization, the source code, datasets, and a public API are available (https://github.com/SDM-TIB/BioLinkerAI).
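To make the two-stage design concrete, the sketch below illustrates the kind of pipeline the abstract describes: a symbolic, rule/dictionary-driven mention extraction step followed by embedding-based candidate disambiguation against UMLS concept identifiers (CUIs). This is a minimal illustration only; the toy alias table, the embed() stand-in (replacing an actual LLM embedding model), and all function names are hypothetical and are not taken from BioLinkerAI's implementation.

import re
import numpy as np

# Toy alias table: UMLS CUI -> surface forms (placeholder data, not real UMLS coverage).
UMLS_ALIASES = {
    "C0011849": ["diabetes mellitus", "diabetes"],
    "C0020538": ["hypertension", "high blood pressure"],
}

def embed(text: str) -> np.ndarray:
    """Stand-in for an LLM embedding model (hypothetical).
    A deterministic bag-of-characters vector so the sketch runs end to end."""
    vec = np.zeros(256)
    for ch in text.lower():
        vec[ord(ch) % 256] += 1.0
    return vec / (np.linalg.norm(vec) + 1e-9)

def extract_mentions(sentence: str) -> list[str]:
    """Symbolic step: match known aliases with simple rules (longest alias first)."""
    mentions = []
    for aliases in UMLS_ALIASES.values():
        for alias in sorted(aliases, key=len, reverse=True):
            if re.search(rf"\b{re.escape(alias)}\b", sentence, re.IGNORECASE):
                mentions.append(alias)
                break
    return mentions

def disambiguate(mention: str, context: str) -> str:
    """Sub-symbolic step: rank candidate CUIs by cosine similarity between the
    mention-in-context embedding and the embeddings of each candidate's aliases."""
    query = embed(f"{mention} | {context}")
    best_cui, best_score = None, -1.0
    for cui, aliases in UMLS_ALIASES.items():
        score = max(float(query @ embed(a)) for a in aliases)
        if score > best_score:
            best_cui, best_score = cui, score
    return best_cui

sentence = "The patient has high blood pressure and diabetes."
for mention in extract_mentions(sentence):
    print(mention, "->", disambiguate(mention, sentence))

In practice, the embedding step would use a biomedical LLM encoder and the candidate set would come from a full UMLS index rather than a toy dictionary; the sketch only shows how the symbolic extraction and embedding-based disambiguation stages fit together.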

Metadata
Title: BioLinkerAI: Capturing Knowledge Using LLMs to Enhance Biomedical Entity Linking
Authors: Ahmad Sakor, Kuldeep Singh, Maria-Esther Vidal
Copyright Year: 2025
Publisher: Springer Nature Singapore
DOI: https://doi.org/10.1007/978-981-96-0573-6_19
