
2018 | OriginalPaper | Book chapter

Opinion Mining in Social Networks for Algerian Dialect

Written by: Mehdi Bettiche, Moncef Zakaria Mouffok, Chahnez Zakaria

Published in: Information Processing and Management of Uncertainty in Knowledge-Based Systems. Applications

Publisher: Springer International Publishing


Abstract

The volume of Arabic dialect messages on social networks has increased significantly, providing a rich source for opinion mining research. Most work on Arabic dialects focuses on messages written in Arabic script, with very limited attention to Latin script. In this paper, we classify social network messages retrieved from Twitter, Facebook and YouTube, written in Algerian dialect in Latin script, into positive or negative classes using existing opinion mining approaches (lexicon-based, machine learning, and hybrid). We also apply a regrouping process in the preprocessing step to overcome issues specific to the Algerian dialect, such as the many orthographic variants used to write the same word. Furthermore, we focus on the hybrid approach, which consists of automatically annotating the training corpus with the lexicon-based approach and then applying the machine learning approach to this corpus to create the classification model. This approach classifies messages into positive or negative classes without requiring a manually annotated training corpus.
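
The hybrid idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tiny lexicon, the romanized example messages, the edit-distance threshold used for regrouping spelling variants, and the choice of a multinomial Naive Bayes classifier are all assumptions made for illustration. The sketch regroups spelling variants, labels messages automatically with the lexicon, and then trains a classifier on those automatic labels.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy lexicon (romanized placeholders; polarities are illustrative).
LEXICON = {"mlih": 1, "khayeb": -1}

def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def regroup(token, vocabulary, max_dist=1):
    """Map a spelling variant to the closest known form if it is near enough.
    The threshold max_dist=1 is an assumption, not a value from the paper."""
    best = min(vocabulary, key=lambda w: levenshtein(token, w))
    return best if levenshtein(token, best) <= max_dist else token

def preprocess(message):
    """Lowercase, split on whitespace, and regroup variants of lexicon words."""
    return [regroup(t, LEXICON) for t in message.lower().split()]

def lexicon_label(tokens):
    """Sum word polarities; return 'pos', 'neg', or None when undecided."""
    score = sum(LEXICON.get(t, 0) for t in tokens)
    if score > 0:
        return "pos"
    if score < 0:
        return "neg"
    return None

def train_naive_bayes(labelled):
    """Collect per-class word counts from (tokens, label) pairs."""
    word_counts, class_counts = defaultdict(Counter), Counter()
    for tokens, label in labelled:
        class_counts[label] += 1
        word_counts[label].update(tokens)
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocab

def classify(model, tokens):
    """Pick the class with the highest log-probability (add-one smoothing)."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        logp = math.log(class_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for t in tokens:
            if t in vocab:  # words never seen in training are ignored
                logp += math.log((word_counts[label][t] + 1) / denom)
        scores[label] = logp
    return max(scores, key=scores.get)

# Step 1: automatic annotation of an unlabelled corpus with the lexicon.
corpus = ["film mlih bezaf", "service khayeb", "had film mleh", "khayb service", "film"]
labelled = []
for message in corpus:
    tokens = preprocess(message)
    label = lexicon_label(tokens)
    if label is not None:  # undecided messages are left out of training
        labelled.append((tokens, label))

# Step 2: train on the automatically labelled corpus, then classify a message
# that contains no lexicon word at all.
model = train_naive_bayes(labelled)
print(classify(model, preprocess("service")))
```

In this toy run the lexicon alone cannot label the message "service", whereas the classifier trained on the automatically annotated corpus can, because the word co-occurred with negative lexicon entries during training. That is the motivation for the hybrid step: no manual annotation is needed, at the cost of inheriting any labelling errors the lexicon makes.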

Metadata
Title
Opinion Mining in Social Networks for Algerian Dialect
Written by
Mehdi Bettiche
Moncef Zakaria Mouffok
Chahnez Zakaria
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-319-91479-4_52