Published in: The Journal of Supercomputing 15/2023

29-04-2023

An online and highly-scalable streaming platform for filtering trolls with transfer learning

Authors: Chun-Ming Lai, Ting-Wei Chang, Chao-Tung Yang

Abstract

The internet has reached a mature stage of development, and Online Social Media (OSM) platforms such as Twitter and Facebook have become vital channels for public communication and discussion on matters of public interest. However, these platforms are often plagued by improper statements or content, propagated by anonymous users and trolls, which harm both the platforms and their users. Existing methods for dealing with inappropriate information rely on manual or semi-manual offline assessments, which do not fully account for the streaming nature of OSM feeds. In this paper, we implement a robust and decoupled system that treats social media data as streaming data. Built on Apache Kafka with a publisher and consumer model, our system can process more than 179 MB of data per second with a latency of only 166.3 ms. Within this pipeline, we deploy a trained transfer learning model that classifies incoming data streams with an accuracy of 0.836. Our proposed architecture has the potential to assist online communities in building more constructive OSM platforms. We believe that our contribution will help address the challenges associated with improper content on OSM platforms and pave the way for more effective and efficient solutions.
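The publisher and consumer pattern described in the abstract can be illustrated with a minimal in-process sketch. Everything in it is an assumption made for illustration: a standard-library `queue.Queue` stands in for a Kafka topic, the `classify` function is a trivial keyword rule rather than the paper's transfer learning model, and the names `publisher`, `consumer`, and `run_pipeline` are hypothetical. It only shows how decoupling the two sides lets each message carry a timestamp from which per-message latency can be measured.

```python
import queue
import threading
import time

SENTINEL = None  # end-of-stream marker used only in this sketch


def classify(text: str) -> str:
    """Placeholder classifier (assumption): a keyword rule, not the paper's model."""
    flagged = {"idiot", "stupid"}
    return "troll" if any(w in flagged for w in text.lower().split()) else "ok"


def publisher(topic: queue.Queue, posts) -> None:
    """Publish each post with its send time, then signal end of stream."""
    for post in posts:
        topic.put((time.perf_counter(), post))
    topic.put(SENTINEL)


def consumer(topic: queue.Queue, results: list) -> None:
    """Consume posts in order, classify them, and record latency in milliseconds."""
    while True:
        item = topic.get()
        if item is SENTINEL:
            break
        sent_at, post = item
        label = classify(post)
        latency_ms = (time.perf_counter() - sent_at) * 1000
        results.append((post, label, latency_ms))


def run_pipeline(posts) -> list:
    """Run one publisher and one consumer over an in-memory queue (stand-in for a topic)."""
    topic: queue.Queue = queue.Queue()
    results: list = []
    worker = threading.Thread(target=consumer, args=(topic, results))
    worker.start()
    publisher(topic, posts)
    worker.join()
    return results


if __name__ == "__main__":
    for post, label, latency_ms in run_pipeline(["have a nice day", "you are an idiot"]):
        print(f"{label:5s} {latency_ms:8.3f} ms  {post}")
```

In a real deployment the queue would be replaced by a message broker and the keyword rule by a fine-tuned language model; the figures reported in the abstract come from the authors' own system and cannot be reproduced with this sketch.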

Metadata
Title
An online and highly-scalable streaming platform for filtering trolls with transfer learning
Authors
Chun-Ming Lai
Ting-Wei Chang
Chao-Tung Yang
Publication date
29-04-2023
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 15/2023
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-023-05312-1
