2018 | Original Paper | Book Chapter

Co-training Based on Multi-type Text Features

Written by: Wenting Liu, Xiaojun Jing, Yaqin Chen, Jia Li

Published in: Signal and Information Processing, Networking and Computers

Publisher: Springer Singapore

Abstract

Sentiment classification assigns a text to the sentiment category it expresses. This paper addresses semi-supervised sentiment classification, which aims to improve performance by exploiting unlabeled data, and proposes a novel co-training-style semi-supervised learning algorithm. The algorithm trains three classifiers on the original labeled data, each with a different text representation: unigram, bigram, and word2vec. The classifiers then use unlabeled data to update themselves: when any two classifiers assign the same label to an unlabeled example, that newly labeled example is added to the training set of the third classifier. By combining different types of features, the algorithm extracts text information from multiple views, each of which contributes to sentiment classification. In addition, the algorithm does not require sufficient and redundant views. Experiments show that the algorithm outperforms the traditional co-training algorithm and some other semi-supervised learning algorithms.
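
The update rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions made only to keep the example self-contained: the base learner is a small multinomial Naive Bayes, a character-trigram view stands in for the word2vec view, the final label is taken by majority vote, and all names (`co_train`, `vote`, `NaiveBayes`, the `*_view` functions) are hypothetical. The abstract does not say whether the original method also applies confidence thresholds or stopping criteria, so none are modeled here.

```python
import math
from collections import Counter, defaultdict


def unigram_view(text):
    """View 1: word unigrams."""
    return text.lower().split()


def bigram_view(text):
    """View 2: adjacent word pairs."""
    tokens = text.lower().split()
    return [a + " " + b for a, b in zip(tokens, tokens[1:])]


def char_trigram_view(text):
    """View 3. ASSUMPTION: a stand-in for the word2vec view of the paper."""
    s = text.lower()
    return [s[i:i + 3] for i in range(len(s) - 2)]


class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one smoothing (assumed base learner)."""

    def __init__(self, view):
        self.view = view

    def fit(self, examples):
        self.label_counts = Counter(label for _, label in examples)
        self.feature_counts = defaultdict(Counter)
        for text, label in examples:
            self.feature_counts[label].update(self.view(text))
        self.vocab = {f for c in self.feature_counts.values() for f in c}
        return self

    def predict(self, text):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in sorted(self.label_counts.items()):
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab) + 1
            score = math.log(count / total)
            for f in self.view(text):
                score += math.log((counts[f] + 1) / denom)
            if score > best_score:  # ties go to the smallest label
                best_label, best_score = label, score
        return best_label


def co_train(labeled, unlabeled,
             views=(unigram_view, bigram_view, char_trigram_view), rounds=3):
    """Train three classifiers; when two agree on an unlabeled text,
    add that text with the agreed label to the third one's training set."""
    assert len(views) == 3
    train_sets = [list(labeled) for _ in views]
    models = [NaiveBayes(v).fit(s) for v, s in zip(views, train_sets)]
    pool = list(unlabeled)
    for _ in range(rounds):
        used = set()
        for idx, text in enumerate(pool):
            preds = [m.predict(text) for m in models]
            for i in range(3):
                j, k = [m for m in range(3) if m != i]
                if preds[j] == preds[k]:
                    train_sets[i].append((text, preds[j]))
                    used.add(idx)
        if not used:
            break
        # Drop texts that were pseudo-labeled, then retrain on the grown sets.
        pool = [t for idx, t in enumerate(pool) if idx not in used]
        models = [NaiveBayes(v).fit(s) for v, s in zip(views, train_sets)]
    return models


def vote(models, text):
    """ASSUMPTION: final prediction by majority vote of the three classifiers."""
    return Counter(m.predict(text) for m in models).most_common(1)[0][0]


if __name__ == "__main__":
    labeled = [("good great film", 1), ("great fun movie", 1),
               ("bad boring film", 0), ("boring awful movie", 0)]
    unlabeled = ["great good fun", "awful bad boring"]
    models = co_train(labeled, unlabeled)
    print(vote(models, "great fun movie"))
```

With only two sentiment classes, at least two of the three classifiers always agree, so every unlabeled text is pseudo-labeled for at least one classifier in the first round. A practical variant would likely add a confidence threshold before accepting a pseudo-label.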

Metadata
Title
Co-training Based on Multi-type Text Features
Written by
Wenting Liu
Xiaojun Jing
Yaqin Chen
Jia Li
Copyright year
2018
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-10-7521-6_26
