
2015 | Original Paper | Book Chapter

Detecting Automatically-Generated Arabic Tweets

Authors: Hind Almerekhi, Tamer Elsayed

Published in: Information Retrieval Technology

Publisher: Springer International Publishing


Abstract

Twitter, one of the most widely known social media platforms, has recently been infiltrated by automation programs commonly known as “bots”. Bots are easily abused to spread spam, and they hinder information extraction applications by posting large numbers of automatically generated tweets that occupy a sizable portion of the continuous stream of tweets. The problem weighs heavily on users in the Arab region because of recent political developments: automated tweets can disrupt communication, and filtering them out wastes time.
To mitigate this problem, this work addresses the classification of Arabic tweets as automated or manual. We propose four categories of features: formality, structural, tweet-specific, and temporal. Our experimental evaluation over about 3.5k randomly sampled Arabic tweets shows that classification based on individual categories of features outperforms the baseline unigram-based classifier in terms of classification accuracy. Additionally, combining tweet-specific and unigram features improves classification accuracy to 92%, a significant improvement over the baseline classifier and a very strong reference baseline for future studies.
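
As a rough illustration of what combining unigram and tweet-specific features could look like, the sketch below builds one sparse feature dictionary per tweet. The abstract does not list the paper's actual features, so the specific counts (URLs, hashtags, mentions, a retweet flag), the regular expressions, and all function names here are assumptions chosen for illustration, not the authors' feature set.

```python
import re
from collections import Counter

# Hypothetical patterns: the abstract does not specify the real features.
URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#\w+")   # \w is Unicode-aware, so Arabic hashtags match
MENTION_RE = re.compile(r"@\w+")


def tweet_specific_features(text):
    """Count Twitter-specific markers in a tweet (illustrative feature set)."""
    return {
        "num_urls": len(URL_RE.findall(text)),
        "num_hashtags": len(HASHTAG_RE.findall(text)),
        "num_mentions": len(MENTION_RE.findall(text)),
        "is_retweet": int(text.startswith("RT ")),
    }


def unigram_features(text):
    """Bag-of-words counts over whitespace tokens, with URLs removed first."""
    tokens = URL_RE.sub(" ", text).split()
    # The "w=" prefix keeps word features from colliding with named features.
    return Counter("w=" + token for token in tokens)


def combined_features(text):
    """Merge unigram and tweet-specific features into one sparse dict."""
    features = dict(unigram_features(text))
    features.update(tweet_specific_features(text))
    return features


sample = combined_features("RT @user check #a #b http://x.co")
```

A dictionary of this shape could then be vectorized and passed to any standard classifier; which learner the authors used is not stated in the abstract.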


Metadata
Title
Detecting Automatically-Generated Arabic Tweets
Authors
Hind Almerekhi
Tamer Elsayed
Copyright Year
2015
DOI
https://doi.org/10.1007/978-3-319-28940-3_10
