2019 | OriginalPaper | Book Chapter

Fast Neural Chinese Named Entity Recognition with Multi-head Self-attention

Authors: Tao Qi, Chuhan Wu, Fangzhao Wu, Suyu Ge, Junxin Liu, Yongfeng Huang, Xing Xie

Published in: Knowledge Graph and Semantic Computing: Knowledge Computing and Language Understanding

Publisher: Springer Singapore

Abstract

Named entity recognition (NER) is an important task in natural language processing. It is an essential step for many downstream tasks, such as relation extraction and entity linking, which are important for building and applying knowledge graphs. Existing neural NER methods are usually based on the LSTM-CRF framework and its variants. However, because the LSTM network is computationally expensive, the efficiency of these LSTM-CRF based NER methods is usually unsatisfactory. In this paper, we propose a fast neural NER model for Chinese texts. Our approach is based on the CNN-SelfAttention-CRF architecture: a convolutional neural network (CNN) learns contextual character representations from local contexts, a multi-head self-attention network learns contextual character representations from global contexts, and a conditional random field (CRF) jointly decodes the labels of the characters in a sentence. Since both the CNN and the self-attention network can be computed in parallel, our approach can be more efficient than LSTM-CRF based methods. Extensive experiments on two benchmark datasets validate that our approach is more efficient than existing neural NER methods and can achieve comparable or even better performance on Chinese NER.
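The three components named in the abstract can be illustrated with a minimal forward-pass sketch in plain Python. This is not the authors' implementation: the weights are random stand-ins for trained parameters, and the layer sizes, the window width of 3, the ReLU activation, the way the CNN output feeds the attention heads, and all function names (`conv1d`, `multi_head`, `viterbi`, `tag`) are assumptions chosen only to show how local context, global context, and joint decoding fit together.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def rand_matrix(rows, cols, rng):
    """Random weights standing in for trained parameters (assumption)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def conv1d(X, W, window=3):
    """Local context: each position sees a window of neighbouring characters."""
    dim, half = len(X[0]), window // 2
    pad = [[0.0] * dim] * half
    P = pad + X + pad
    out = []
    for i in range(len(X)):
        # Concatenate the window, apply one linear map, then ReLU.
        flat = [v for vec in P[i:i + window] for v in vec]
        out.append([max(0.0, h) for h in matvec(W, flat)])
    return out

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention_head(H, Wq, Wk, Wv):
    """Global context: every position attends to every position."""
    Q = [matvec(Wq, h) for h in H]
    K = [matvec(Wk, h) for h in H]
    V = [matvec(Wv, h) for h in H]
    scale = math.sqrt(len(Q[0]))
    out = []
    for q in Q:
        weights = softmax([sum(a * b for a, b in zip(q, k)) / scale for k in K])
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

def multi_head(H, heads):
    """Concatenate the outputs of independent attention heads."""
    outs = [self_attention_head(H, *params) for params in heads]
    return [[v for head in outs for v in head[i]] for i in range(len(H))]

def viterbi(emissions, transitions):
    """CRF decoding: best label sequence under emission + transition scores."""
    n_labels = len(transitions)
    score = list(emissions[0])
    back = []
    for emit in emissions[1:]:
        prev, score, ptr = score, [], []
        for j in range(n_labels):
            best = max(range(n_labels), key=lambda i: prev[i] + transitions[i][j])
            ptr.append(best)
            score.append(prev[best] + transitions[best][j] + emit[j])
        back.append(ptr)
    # Follow the back-pointers from the best final label.
    path = [max(range(n_labels), key=lambda j: score[j])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]

def tag(sentence_len, emb_dim=8, conv_dim=8, n_heads=2, head_dim=4,
        n_labels=3, seed=0):
    """Run CNN -> multi-head self-attention -> CRF decoding on random inputs."""
    rng = random.Random(seed)
    X = rand_matrix(sentence_len, emb_dim, rng)  # hypothetical character embeddings
    H = conv1d(X, rand_matrix(conv_dim, 3 * emb_dim, rng))
    heads = [tuple(rand_matrix(head_dim, conv_dim, rng) for _ in range(3))
             for _ in range(n_heads)]
    G = multi_head(H, heads)
    W_out = rand_matrix(n_labels, n_heads * head_dim, rng)
    emissions = [matvec(W_out, g) for g in G]
    transitions = rand_matrix(n_labels, n_labels, rng)
    return viterbi(emissions, transitions)
```

Calling `tag(5)` returns one label index per character of a five-character sentence. In `conv1d` and `self_attention_head`, the computation at each position does not depend on the result at any other position, which is the property the abstract relies on for parallel computation. An LSTM, by contrast, must process positions one after another. Only the decoding step in `viterbi` walks the sequence in order.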


Metadata
Title
Fast Neural Chinese Named Entity Recognition with Multi-head Self-attention
Authors
Tao Qi
Chuhan Wu
Fangzhao Wu
Suyu Ge
Junxin Liu
Yongfeng Huang
Xing Xie
Copyright Year
2019
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-15-1956-7_9