Published in: International Journal of Machine Learning and Cybernetics 12/2019

26.02.2019 | Original Article

Word-character attention model for Chinese text classification

Authors: Xue Qiao, Chen Peng, Zhen Liu, Yanfeng Hu


Abstract

Recent progress in applying neural networks to image classification has motivated exploration of their application to text classification tasks. Unlike the majority of this research, which is devoted to English corpora, this paper focuses on Chinese text, whose semantic representation is more intricate. As the basic unit of Chinese words, the character plays a vital role in Chinese linguistics. However, most existing Chinese text classification methods treat word features as the basic unit of text representation and ignore the benefits of character features. Moreover, existing approaches compress the entire set of word features into a single semantic representation, without an attention mechanism that could capture salient features. To tackle these issues, we propose the word-character attention model (WCAM) for Chinese text classification. WCAM integrates two levels of attention: a word-level attention model captures salient words that have a close semantic relationship to the text's meaning, and a character-level attention model selects discriminative characters. Both are jointly employed to learn text representations. Meanwhile, a word-character constraint model and character alignment are introduced to ensure that the selected characters are highly representative and to enhance their discriminative power; together they exploit subtle, local differences to distinguish text classes. Extensive experiments on two benchmark datasets demonstrate that WCAM achieves comparable or even better performance than state-of-the-art methods for Chinese text classification.
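The two-level attention described in the abstract can be sketched in miniature: each level scores its units (words or characters), normalizes the scores with a softmax, and pools the unit vectors into a context vector; the word-level and character-level contexts are then combined into one text representation. The sketch below is illustrative only, not the authors' implementation: `attend`, `word_char_representation`, and the query vectors are hypothetical names, and plain dot-product scoring stands in for the paper's learned attention.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of attention scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(vectors, query):
    # Score each vector against a query (dot product), normalize with
    # softmax, and return the attention-weighted sum plus the weights.
    scores = [sum(v_i * q_i for v_i, q_i in zip(v, query)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    context = [sum(w * v[d] for w, v in zip(weights, vectors))
               for d in range(dim)]
    return context, weights

def word_char_representation(word_vecs, char_vecs, word_query, char_query):
    # Word-level attention highlights salient words; character-level
    # attention highlights discriminative characters. The two pooled
    # contexts are concatenated into a single text representation.
    word_ctx, _ = attend(word_vecs, word_query)
    char_ctx, _ = attend(char_vecs, char_query)
    return word_ctx + char_ctx

# Toy example: 3 word embeddings and 4 character embeddings, 2 dims each.
words = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
chars = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0], [0.2, 0.8]]
rep = word_char_representation(words, chars, [1.0, 0.0], [0.0, 1.0])
print(len(rep))  # → 4 (2 word dims + 2 char dims)
```

In the paper the attention weights and queries are learned jointly with the classifier, and the word-character constraint ties the two levels together; here they are fixed vectors purely to show the data flow.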

Metadata
Title
Word-character attention model for Chinese text classification
Authors
Xue Qiao
Chen Peng
Zhen Liu
Yanfeng Hu
Publication date
26.02.2019
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 12/2019
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-019-00942-5
