Published in: Neural Processing Letters 2/2020

04.07.2020

Radical and Stroke-Enhanced Chinese Word Embeddings Based on Neural Networks

Written by: Shirui Wang, Wenan Zhou, Qiang Zhou


Abstract

The internal structural information of words has proven very effective for learning Chinese word embeddings. However, most previous approaches extracted only a single form of internal feature to learn representations, ignoring the comprehensive combination of such information. Moreover, they focused only on the explicit features of internal structures, even though these structures also carry implicit word semantics. In this paper, we propose Radical and Stroke-enhanced Word Embeddings (RSWE), a novel neural-network-based method for learning Chinese word embeddings with joint guidance from semantic and morphological internal information. RSWE enables an embedding model to learn simultaneously from (1) implicit semantic information exploited from radicals, and (2) stroke n-gram information that can be explicitly obtained from Chinese words. In the learning process, RSWE uses stroke n-grams to capture the local structural features of words and integrates the implicit information exploited from radicals to enhance the semantics of the embeddings. Through this combination, the semantics of Chinese words are effectively transferred into the learned embeddings. We evaluate the effectiveness of RSWE on word similarity computation, word analogy reasoning, performance over embedding dimensions, performance over learning corpus size, and named entity recognition tasks. The experimental results show that our model outperforms existing state-of-the-art approaches.
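To illustrate the stroke n-gram feature the abstract refers to, the sketch below extracts stroke n-grams from a word by concatenating per-character stroke codes and sliding windows over the result. The five-digit stroke encoding (1 horizontal, 2 vertical, 3 left-falling, 4 right-falling, 5 turning) follows the cw2vec scheme cited by the paper; the tiny `STROKES` dictionary is a hypothetical stand-in for a full character-to-stroke database, and the n-gram range is an assumed example, not the paper's exact configuration.

```python
# Minimal sketch of stroke n-gram extraction for a Chinese word.
# STROKES is an illustrative stand-in; a real system would use a
# complete character-to-stroke-sequence database.
STROKES = {
    "大": "134",  # horizontal, left-falling, right-falling
    "人": "34",   # left-falling, right-falling
}

def stroke_ngrams(word, n_min=3, n_max=12):
    """Concatenate the stroke codes of each character in `word`,
    then return every n-gram with n_min <= n <= n_max."""
    seq = "".join(STROKES[ch] for ch in word)
    return [seq[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(seq) - n + 1)]

# The word "大人" yields the stroke sequence "13434", whose n-grams
# serve as the local structural features of the word.
print(stroke_ngrams("大人"))  # ['134', '343', '434', '1343', '3434', '13434']
```

Each n-gram would then be assigned its own vector, and a word's structural representation built by combining the vectors of its stroke n-grams.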

Metadata
Title
Radical and Stroke-Enhanced Chinese Word Embeddings Based on Neural Networks
Written by
Shirui Wang
Wenan Zhou
Qiang Zhou
Publication date
04.07.2020
Publisher
Springer US
Published in
Neural Processing Letters / Issue 2/2020
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-020-10289-6
