2016 | OriginalPaper | Book chapter

PerSent: A Freely Available Persian Sentiment Lexicon

Written by: Kia Dashtipour, Amir Hussain, Qiang Zhou, Alexander Gelbukh, Ahmad Y. A. Hawalah, Erik Cambria

Published in: Advances in Brain Inspired Cognitive Systems

Publisher: Springer International Publishing

Abstract

People need to know other people’s opinions to make well-informed decisions about buying products or services. Companies and organizations need to understand people’s attitudes towards their products and services and use customer feedback to improve them. Sentiment analysis techniques address these needs. Although the majority of Internet users are not English speakers, most research in the sentiment-analysis field focuses on English, and resources for other languages are scarce. In this paper, we introduce a Persian sentiment lexicon, which consists of 1500 words along with their part-of-speech tags and polarity scores. We used two machine-learning algorithms to evaluate the performance of this resource on a sentiment analysis task. The lexicon is freely available and can be downloaded from our website.
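
To make the shape of such a resource concrete, the following is a minimal sketch of how a lexicon of words with part-of-speech tags and polarity scores might be represented and used for a simple lookup-based polarity estimate. Everything in it is an assumption for illustration: the three entries are invented and are not taken from PerSent, the tag names and the score range of -1 to 1 are guesses, and `LexiconEntry`, `TOY_LEXICON`, `average_polarity`, and `classify` are hypothetical names. It is a plain lexicon-lookup baseline, not the machine-learning evaluation the abstract mentions.

```python
from typing import Dict, List, NamedTuple


class LexiconEntry(NamedTuple):
    pos: str         # part-of-speech tag (tag set is an assumption)
    polarity: float  # polarity score (range [-1, 1] is an assumption)


# Hypothetical entries for illustration only; NOT taken from the PerSent lexicon.
TOY_LEXICON: Dict[str, LexiconEntry] = {
    "خوب": LexiconEntry("ADJ", 0.5),    # "good"
    "بد": LexiconEntry("ADJ", -0.5),    # "bad"
    "عالی": LexiconEntry("ADJ", 1.0),   # "excellent"
}


def average_polarity(tokens: List[str], lexicon: Dict[str, LexiconEntry]) -> float:
    """Mean polarity of the tokens found in the lexicon; 0.0 if none match."""
    scores = [lexicon[t].polarity for t in tokens if t in lexicon]
    return sum(scores) / len(scores) if scores else 0.0


def classify(tokens: List[str], lexicon: Dict[str, LexiconEntry]) -> str:
    """Label a tokenized text by the sign of its average polarity."""
    score = average_polarity(tokens, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    # Tokens are assumed to be pre-segmented; Persian tokenization is out of scope here.
    print(classify(["خوب", "عالی"], TOY_LEXICON))
    print(classify(["بد"], TOY_LEXICON))
```

In practice the polarity scores would more likely be fed as features into a trained classifier than thresholded directly; the sign-based rule here is only the simplest way to show the lookup.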

Metadata
Title
PerSent: A Freely Available Persian Sentiment Lexicon
Written by
Kia Dashtipour
Amir Hussain
Qiang Zhou
Alexander Gelbukh
Ahmad Y. A. Hawalah
Erik Cambria
Copyright year
2016
DOI
https://doi.org/10.1007/978-3-319-49685-6_28