
2019 | OriginalPaper | Chapter

A New Fine-Tuning Architecture Based on Bert for Word Relation Extraction

Authors: Fanyu Meng, Junlan Feng, Danping Yin, Min Hu

Published in: Natural Language Processing and Chinese Computing

Publisher: Springer International Publishing


Abstract

We introduce a new attention-based neural architecture to fine-tune Bidirectional Encoder Representations from Transformers (BERT) for semantic and grammatical relationship classification at the word level. BERT has been widely adopted as a base for state-of-the-art models on sentence-level and token-level natural language processing tasks via a fine-tuning process, which typically takes the final hidden states as input to a classification layer. Inspired by the Residual Net, we propose in this paper a new architecture that augments the final hidden states with multi-head attention weights from all Transformer layers for fine-tuning. We explain the rationale for this proposal and compare it with recent models for word-level relation tasks such as dependency tree parsing. The resulting model shows a clear improvement over the standard BERT fine-tuning model on the dependency parsing task with the English TreeBank data and on the semantic relation extraction task of SemEval-2010 Task 8.
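The exact combination layer is defined in the paper itself; as a rough illustration only, the following PyTorch sketch (using the Hugging Face transformers BertModel, which is an assumption, not the authors' released code) concatenates the final hidden states of a word pair with the attention weights between those two positions from every layer and head, and feeds the result to a linear classification layer. The class name, feature combination, and label count are illustrative choices, not taken from the paper.

```python
# Minimal sketch: augmenting BERT final hidden states with multi-head attention
# weights from all Transformer layers for word-pair relation classification.
# All names and the exact feature combination are assumptions for illustration.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizerFast


class AttentionAugmentedRelationClassifier(nn.Module):
    def __init__(self, num_relations: int, model_name: str = "bert-base-uncased"):
        super().__init__()
        self.bert = BertModel.from_pretrained(model_name)
        cfg = self.bert.config
        hidden = cfg.hidden_size                                        # 768 for bert-base
        attn_feats = cfg.num_hidden_layers * cfg.num_attention_heads    # 12 * 12 = 144
        # Classifier input: final hidden states of the two words plus the attention
        # weights (in both directions) between them from every layer and head.
        self.classifier = nn.Linear(2 * hidden + 2 * attn_feats, num_relations)

    def forward(self, input_ids, attention_mask, head_idx, dep_idx):
        out = self.bert(input_ids=input_ids,
                        attention_mask=attention_mask,
                        output_attentions=True)
        h = out.last_hidden_state                      # (B, T, hidden)
        attn = torch.stack(out.attentions, dim=1)      # (B, layers, heads, T, T)
        b = torch.arange(h.size(0), device=h.device)
        h_head = h[b, head_idx]                        # (B, hidden)
        h_dep = h[b, dep_idx]                          # (B, hidden)
        # Attention weight from the head token to the dependent token (and the
        # reverse direction), flattened over all layers and heads: (B, layers*heads)
        a_hd = attn[b, :, :, head_idx, dep_idx].flatten(1)
        a_dh = attn[b, :, :, dep_idx, head_idx].flatten(1)
        features = torch.cat([h_head, h_dep, a_hd, a_dh], dim=-1)
        return self.classifier(features)


if __name__ == "__main__":
    tok = BertTokenizerFast.from_pretrained("bert-base-uncased")
    model = AttentionAugmentedRelationClassifier(num_relations=19)  # SemEval-2010 Task 8 label count
    enc = tok("The pollution was caused by the shipwreck.", return_tensors="pt")
    # Hypothetical word-piece positions of the two related nominals; real positions
    # depend on the tokenization of the sentence.
    logits = model(enc["input_ids"], enc["attention_mask"],
                   head_idx=torch.tensor([2]), dep_idx=torch.tensor([7]))
    print(logits.shape)  # torch.Size([1, 19])
```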


Literature
1. Chen, D., Manning, C.: A fast and accurate dependency parser using neural networks. In: EMNLP, pp. 740–750 (2014)
2. Dai, A.M., Le, Q.V.: Semi-supervised sequence learning. CoRR abs/1511.01432 (2015)
3. Devlin, J., Chang, M., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. CoRR abs/1810.04805 (2018)
4. Dozat, T., Manning, C.D.: Deep biaffine attention for neural dependency parsing. CoRR abs/1611.01734 (2016)
5. Hashimoto, K., Xiong, C., Tsuruoka, Y., Socher, R.: A joint many-task model: growing a neural network for multiple NLP tasks. CoRR abs/1611.01587 (2016)
6. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. CoRR abs/1512.03385 (2015)
7. Hendrickx, I., Kim, S., Kozareva, Z., Nakov, P., Padó, S., Pennacchiotti, M., Romano, L., Szpakowicz, S.: SemEval-2010 task 8: multi-way classification of semantic relations between pairs of nominals, pp. 33–38 (2010)
8. Howard, J., Ruder, S.: Fine-tuned language models for text classification. CoRR abs/1801.06146 (2018)
9. Kiperwasser, E., Goldberg, Y.: Simple and accurate dependency parsing using bidirectional LSTM feature representations. CoRR abs/1603.04351 (2016)
10. de Marneffe, M.C., Manning, C.: Stanford typed dependencies manual (2008)
11. McDonald, R., Pereira, F., Ribarov, K., Hajič, J.: Non-projective dependency parsing using spanning tree algorithms. In: Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, HLT 2005, pp. 523–530. Association for Computational Linguistics, Stroudsburg, PA, USA (2005). https://doi.org/10.3115/1220575.1220641
12. Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. CoRR abs/1802.05365 (2018)
13. Radford, A., Narasimhan, K., Salimans, T., Sutskever, I.: Improving language understanding by generative pre-training (2018)
14. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017)
15. Wang, L., Cao, Z., de Melo, G., Liu, Z.: Relation classification via multi-level attention CNNs, pp. 1298–1307 (2016). https://doi.org/10.18653/v1/P16-1123
Metadata
Title
A New Fine-Tuning Architecture Based on Bert for Word Relation Extraction
Authors
Fanyu Meng
Junlan Feng
Danping Yin
Min Hu
Copyright Year
2019
DOI
https://doi.org/10.1007/978-3-030-32236-6_29
