
2025 | Original Paper | Book Chapter

PreAdapter: Pre-training Language Models on Knowledge Graphs

Authors: Janna Omeliyanenko, Andreas Hotho, Daniel Schlör

Published in: The Semantic Web – ISWC 2024

Publisher: Springer Nature Switzerland


Abstract

Pre-trained language models have demonstrated state-of-the-art performance in various downstream tasks such as summarization, sentiment classification, and question answering. Leveraging vast amounts of textual data during training, these models inherently hold a certain amount of factual knowledge, which is particularly beneficial for knowledge-driven tasks such as question answering. However, the knowledge implicitly contained within the language models is not complete. Consequently, many studies incorporate additional knowledge from Semantic Web resources such as knowledge graphs, which provide an explicit representation of knowledge in the form of triples.
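
To make the triple representation concrete, the following minimal Python sketch shows one common way knowledge-graph triples can be verbalized into plain-text statements suitable for language model training; the relation names, templates, and example triples are illustrative assumptions and are not taken from the chapter.

    # Illustrative sketch only (not the chapter's pipeline): turning knowledge-graph
    # triples into short textual statements a language model can be trained on.
    # The relation templates below are hypothetical examples.
    TEMPLATES = {
        "IsA": "{head} is a {tail}.",
        "UsedFor": "{head} is used for {tail}.",
        "PartOf": "{head} is part of {tail}.",
    }

    def verbalize(head: str, relation: str, tail: str) -> str:
        """Map a (head, relation, tail) triple to a plain-text sentence."""
        template = TEMPLATES.get(relation, "{head} {relation} {tail}.")
        return template.format(head=head, relation=relation, tail=tail)

    print(verbalize("guitar", "IsA", "musical instrument"))
    # -> guitar is a musical instrument.
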
Seamless integration of this knowledge into language models remains an active research area. Direct pre-training of language models on knowledge graphs followed by fine-tuning on downstream tasks has proven ineffective, primarily due to the catastrophic forgetting effect. Many approaches suggest fusing language models with graph embedding models to enrich language models with information from knowledge graphs, showing improvement over solutions that lack knowledge graph integration in downstream tasks. However, these methods often require additional computational overhead, for instance, by training graph embedding models.
In our work, we propose a novel adapter-based method for integrating knowledge graphs into language models through pre-training. This approach effectively mitigates catastrophic forgetting that can otherwise affect both the original language modeling capabilities and the access to pre-trained knowledge. Through this scheme, our approach ensures access to both the original capabilities of the language model and the integrated Semantic Web knowledge during fine-tuning on downstream tasks. Experimental results on multiple-choice question answering tasks demonstrate performance improvements compared to baseline models without knowledge graph integration and to other pre-training-based knowledge integration methods.
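
As a rough illustration of the general adapter idea, the following PyTorch sketch adds a small trainable bottleneck module (in the spirit of Houlsby et al.'s adapters) on top of a frozen pre-trained model, so that only the adapter parameters are updated when training on knowledge-graph data. This is a hypothetical sketch of the underlying mechanism; the actual PreAdapter architecture is described only in the full chapter and may differ.

    # Hedged sketch of a generic bottleneck adapter (after Houlsby et al., 2019);
    # the actual PreAdapter architecture may differ. Only the adapter is trained,
    # while the pre-trained language model stays frozen, which is what limits
    # catastrophic forgetting of the original language modeling capabilities.
    import torch
    import torch.nn as nn

    class BottleneckAdapter(nn.Module):
        def __init__(self, hidden_size: int, bottleneck_size: int = 64):
            super().__init__()
            self.down = nn.Linear(hidden_size, bottleneck_size)  # project down
            self.act = nn.GELU()
            self.up = nn.Linear(bottleneck_size, hidden_size)    # project back up

        def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
            # Residual connection keeps the original hidden states accessible.
            return hidden_states + self.up(self.act(self.down(hidden_states)))

    def trainable_adapter_params(base_model: nn.Module, adapters: list[nn.Module]):
        """Freeze the base model; return only the adapter parameters for the optimizer."""
        for p in base_model.parameters():
            p.requires_grad = False
        return [p for adapter in adapters for p in adapter.parameters()]

During fine-tuning on a downstream task, both the frozen base weights and the trained adapter remain available, which matches the access to original capabilities and injected knowledge described in the abstract.
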


Footnotes
1. Our code is publicly available at: https://professor-x.de/code-preadapter.
Metadata
Title
PreAdapter: Pre-training Language Models on Knowledge Graphs
Authors
Janna Omeliyanenko
Andreas Hotho
Daniel Schlör
Copyright Year
2025
DOI
https://doi.org/10.1007/978-3-031-77850-6_12