Published in: Neural Processing Letters 4/2022

22.04.2022

TFM: A Triple Fusion Module for Integrating Lexicon Information in Chinese Named Entity Recognition

Authors: Haitao Liu, Jihua Song, Weiming Peng, Jingbo Sun, Xianwei Xin

Abstract

Due to the characteristics of the Chinese writing system, character-based Chinese named entity recognition models ignore the word information in sentences, which harms their performance. Recently, many works have tried to alleviate this problem by integrating lexicon information into character-based models. These models, however, either simply concatenate word embeddings or rely on complex structures that lead to low efficiency. Furthermore, word information is treated as the only resource the lexicon offers, so the value of the lexicon is not fully explored. In this work, we identify another neglected source of information, namely the position of a character within a word, which is beneficial for identifying character meanings. To fuse character, word, and character-position information, we modify the key-value memory network and propose a triple fusion module, termed TFM. TFM is neither limited to simple concatenation nor burdened by complicated computation, and it works compatibly with the general sequence labeling model. Experimental evaluations show that our model achieves superior performance: the F1-scores on Resume, Weibo, and MSRA are 96.19%, 71.12%, and 95.63%, respectively.
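The abstract describes fusing a character representation with lexicon-derived word and character-position information through a modified key-value memory network. The paper's exact equations are not reproduced on this page, so the following is only a minimal NumPy sketch of the general idea: word embeddings serve as keys, word-plus-position embeddings serve as values, and the character vector queries the memory via softmax attention. The function name `triple_fusion`, the additive value composition, and the final concatenation are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def triple_fusion(char_vec, word_embs, pos_embs):
    """Fuse one character vector with the embeddings of lexicon words
    that contain it, plus embeddings of the character's position
    (e.g. begin/middle/end) in each matched word.

    char_vec:  (d,)        character representation (the query)
    word_embs: (n_words, d) embeddings of matched lexicon words (the keys)
    pos_embs:  (n_words, d) position embeddings for this character in each word
    """
    keys = word_embs                       # keys come from word information
    values = word_embs + pos_embs          # values carry word + position info
    scores = keys @ char_vec               # (n_words,) attention logits
    weights = softmax(scores)              # attention distribution over words
    context = weights @ values             # (d,) lexicon-aware context vector
    return np.concatenate([char_vec, context])  # (2d,) fused representation

rng = np.random.default_rng(0)
d = 8
char_vec = rng.standard_normal(d)
word_embs = rng.standard_normal((3, d))   # 3 lexicon words match this character
pos_embs = rng.standard_normal((3, d))    # B/M/E-style position embeddings
fused = triple_fusion(char_vec, word_embs, pos_embs)
print(fused.shape)  # (16,)
```

In a full sequence labeling model, such a fused vector would replace the plain character embedding fed to the encoder and CRF layer; how TFM actually composes the three signals should be taken from the paper itself.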


Metadata
Title
TFM: A Triple Fusion Module for Integrating Lexicon Information in Chinese Named Entity Recognition
Authors
Haitao Liu
Jihua Song
Weiming Peng
Jingbo Sun
Xianwei Xin
Publication date
22.04.2022
Publisher
Springer US
Published in
Neural Processing Letters / Issue 4/2022
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-022-10768-y
