2018 | OriginalPaper | Chapter

Multilingual Sentiment Mapping Using Twitter, Open Source Tools, and Dictionary Based Machine Translation Approach

Author: David Kocich

Published in: Dynamics in GIscience

Publisher: Springer International Publishing

Abstract

Online social networks are a popular communication tool for internet users. Millions of users share opinions on different aspects of everyday life, which makes microblogging websites rich sources of data for opinion mining and sentiment analysis. Our current research, which analyses migration using various social networks, required a tool for automated multilingual sentiment analysis covering as many languages as possible. Most available tools work only with texts written in English, the most common language on social media. A few open source tools can also process French, German and Spanish texts, but reimplementing and combining different approaches is not optimal. Another requirement is the ability to process both dynamic data streams and static historical datasets with high efficiency. Lower accuracy and completeness of the evaluated messages is an acceptable trade-off for these general requirements. The paper presents a sample data collection from Twitter for opinion mining purposes. We perform multilingual sentiment analysis of the collected data and briefly explain the experimental results. The analysis uses a custom-built solution based on AFINN-165, a manually evaluated dictionary of English words. This dictionary was translated into other languages using the Google Translate API, which was tested during the process. It is then possible to determine positive, negative and neutral sentiment. The results bring new insights, offer a possibility for wider use, and allow optimisation of the wordlists and the tool, which should improve the results of future research. Geospatial analysis of the first experimental results uncovers an interesting relation between time, location and sentiment, which invites readers to consider various use cases.
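
The dictionary-based scoring described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the wordlists, their valence values, and the function names `score_message` and `classify` are hypothetical and stand in for the translated AFINN wordlists. The sketch assumes that a message's sentiment is the sum of the valences of its known words and that the sign of that sum gives the class.

```python
import re

# Toy per-language wordlists. The words and valence values are illustrative
# assumptions, not entries copied from AFINN or from its translations.
LEXICONS = {
    "en": {"good": 3, "happy": 3, "bad": -3, "terrible": -3},
    "de": {"gut": 3, "glücklich": 3, "schlecht": -3, "schrecklich": -3},
}

# \w matches Unicode letters for str patterns, so "glücklich" stays one token.
TOKEN_RE = re.compile(r"\w+")


def score_message(text, lang):
    """Sum the valence of every known word; unknown words count as 0."""
    lexicon = LEXICONS.get(lang, {})
    tokens = TOKEN_RE.findall(text.lower())
    return sum(lexicon.get(token, 0) for token in tokens)


def classify(text, lang):
    """Map the summed score to positive, negative, or neutral."""
    score = score_message(text, lang)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    print(classify("What a good day", "en"))
    print(classify("Das Wetter ist schlecht", "de"))
    print(classify("The train leaves at noon", "en"))
```

A plain lookup like this is cheap enough to run over both streaming and historical messages, which matches the efficiency requirement stated above. The cost is that negation, sarcasm, and inflected word forms go unhandled, consistent with the lower accuracy the abstract accepts.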

Metadata
Title
Multilingual Sentiment Mapping Using Twitter, Open Source Tools, and Dictionary Based Machine Translation Approach
Author
David Kocich
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-61297-3_16
