2017 | OriginalPaper | Book chapter

Weakly Supervised Feature Compression Based Topic Model for Sentiment Classification

Authors: Yan Hu, Xiaofei Xu, Li Li

Published in: Knowledge Science, Engineering and Management

Publisher: Springer International Publishing

Abstract

Sentiment classification aims to use automatic tools to extract subjective information, such as opinions and attitudes, from user comments. Most existing methods focus on semantic relationships and the extraction of syntactic features, while document topic features are ignored. In this paper, we propose a weakly supervised hierarchical model, external knowledge-based Latent Dirichlet Allocation (ELDA), to extract document topic features. First, we use ELDA to compress the document features and increase the polarity weight of the document topic features. Then, we train an SVM classifier on the topic features. Experimental results on one English dataset and one Chinese dataset show that our method outperforms state-of-the-art models by at least \(4\%\) in accuracy.
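
The two-stage pipeline described in the abstract (compress each document into topic features, then classify with an SVM) can be sketched roughly as follows. This is not the authors' ELDA, whose details are not given on this page. As a stand-in, the sketch uses ordinary LDA fitted by collapsed Gibbs sampling, with the word prior boosted for words from a small sentiment lexicon; treating the "external knowledge" as such a lexicon is an assumption. The classifier is a linear SVM trained by Pegasos-style subgradient descent. The toy reviews, the lexicon, and all function and parameter names (`gibbs_lda`, `train_linear_svm`, `boost`, and so on) are hypothetical.

```python
import random

def gibbs_lda(docs, n_topics, seed_words, alpha=0.5, beta=0.01,
              boost=5.0, iters=200, seed=0):
    """Collapsed Gibbs LDA. seed_words[k] is a set of words whose prior
    weight under topic k is raised by `boost` (assumed form of weak supervision)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    # Asymmetric topic-word prior: lexicon words are favoured in their topic.
    prior = [{w: beta + (boost if w in seed_words.get(k, ()) else 0.0)
              for w in vocab} for k in range(n_topics)]
    prior_sum = [sum(p.values()) for p in prior]
    n_dk = [[0] * n_topics for _ in docs]                      # doc-topic counts
    n_kw = [dict.fromkeys(vocab, 0) for _ in range(n_topics)]  # topic-word counts
    n_k = [0] * n_topics                                       # tokens per topic
    z = []
    for d, doc in enumerate(docs):                             # random initial assignment
        z.append([rng.randrange(n_topics) for _ in doc])
        for w, k in zip(doc, z[d]):
            n_dk[d][k] += 1; n_kw[k][w] += 1; n_k[k] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]                                    # remove current token
                n_dk[d][k] -= 1; n_kw[k][w] -= 1; n_k[k] -= 1
                weights = [(n_dk[d][t] + alpha) * (n_kw[t][w] + prior[t][w])
                           / (n_k[t] + prior_sum[t]) for t in range(n_topics)]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k                                    # add it back under new topic
                n_dk[d][k] += 1; n_kw[k][w] += 1; n_k[k] += 1
    # Compressed representation: one topic-proportion vector per document.
    return [[(n_dk[d][t] + alpha) / (len(doc) + n_topics * alpha)
             for t in range(n_topics)] for d, doc in enumerate(docs)]

def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Pegasos-style subgradient descent on the hinge loss; labels in {-1, +1}."""
    rng = random.Random(seed)
    w, b, t = [0.0] * len(X[0]), 0.0, 0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            w = [(1 - eta * lam) * wj for wj in w]             # regularisation shrink
            if margin < 1:                                     # hinge loss is active
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b

def predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Hypothetical toy data: six tokenised reviews with polarity labels.
docs = [s.split() for s in [
    "great phone love the screen", "excellent battery great value",
    "love this excellent camera", "terrible battery awful screen",
    "awful camera poor value", "poor phone terrible support"]]
labels = [1, 1, 1, -1, -1, -1]
lexicon = {0: {"great", "love", "excellent"}, 1: {"terrible", "awful", "poor"}}

theta = gibbs_lda(docs, n_topics=2, seed_words=lexicon)  # feature compression
w, b = train_linear_svm(theta, labels)                   # classifier on topic features
preds = [predict(w, b, x) for x in theta]
```

Each document is reduced from a bag of words over the whole vocabulary to a vector of `n_topics` proportions, which is the sense in which the topic model compresses the feature space before classification.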

Metadata
Title
Weakly Supervised Feature Compression Based Topic Model for Sentiment Classification
Authors
Yan Hu
Xiaofei Xu
Li Li
Copyright year
2017
DOI
https://doi.org/10.1007/978-3-319-63558-3_3