2015 | OriginalPaper | Book chapter

IIT-TUDA: System for Sentiment Analysis in Indian Languages Using Lexical Acquisition

Authors: Ayush Kumar, Sarah Kohail, Asif Ekbal, Chris Biemann

Published in: Mining Intelligence and Knowledge Exploration

Publisher: Springer International Publishing

Abstract

Social networking platforms such as Facebook and Twitter have become very popular communication tools that online users rely on to share opinions and sentiment about the world around them. The availability of such opinionated text has drawn much attention in the field of Natural Language Processing. Compared to other languages, such as English, little work has been done for Indian languages in this domain. In this paper, we present our contribution to classifying the sentiment polarity of Indian tweets as part of the shared task on Sentiment Analysis in Indian Languages (SAIL 2015). With the support of distributional thesauri (DTs) and sentence-level co-occurrences, we expand existing Indian sentiment lexicons to achieve higher coverage of sentiment words. Our system achieves an accuracy of 43.20% and 49.68% for the constrained submission, and an accuracy of 42.0% and 46.25% for the unconstrained setup, for Bengali and Hindi, respectively. This puts our system in first position for Bengali and in third position for Hindi among six participating teams.
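The lexicon expansion mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the authors' procedure, so everything here is an assumption made for illustration: the function name `expand_lexicon`, the `min_votes` threshold, the majority-vote rule, and the English toy data (the actual system works with Hindi and Bengali lexicons). The thesaurus is assumed to be a plain mapping from a word to its most similar words.

```python
from collections import Counter


def expand_lexicon(seed_lexicon, thesaurus, min_votes=2):
    """Hypothetical helper: label unseen words by the polarity of the
    seed words that list them as distributionally similar."""
    votes = {}
    for seed_word, polarity in seed_lexicon.items():
        for neighbour in thesaurus.get(seed_word, []):
            if neighbour in seed_lexicon:
                continue  # words already in the lexicon keep their label
            votes.setdefault(neighbour, Counter())[polarity] += 1

    expanded = dict(seed_lexicon)
    for word, counts in votes.items():
        ranked = counts.most_common(2)
        top_label, top_count = ranked[0]
        tied = len(ranked) > 1 and ranked[1][1] == top_count
        # Accept a candidate only with enough agreeing seeds and no tie.
        if top_count >= min_votes and not tied:
            expanded[word] = top_label
    return expanded


# Toy data (assumed, English for readability).
seed = {
    "good": "positive",
    "great": "positive",
    "bad": "negative",
    "awful": "negative",
}
dt = {
    "good": ["great", "nice", "fine"],
    "great": ["nice", "fine", "terrible"],
    "bad": ["terrible", "poor"],
    "awful": ["terrible", "poor", "nice"],
}

expanded = expand_lexicon(seed, dt)
```

Requiring at least two agreeing seed words is one simple way to limit noise from a single misleading neighbour; a real system would more likely weight votes by similarity score.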

Metadata
Title
IIT-TUDA: System for Sentiment Analysis in Indian Languages Using Lexical Acquisition
Authors
Ayush Kumar
Sarah Kohail
Asif Ekbal
Chris Biemann
Copyright year
2015
DOI
https://doi.org/10.1007/978-3-319-26832-3_65