World Wide Web

Publication date: 16-03-2022

Identifying informative tweets during a pandemic via a topic-aware neural language model

Authors: Wang Gao, Lin Li, Xiaohui Tao, Jing Zhou, Jun Tao

Published in: World Wide Web | Issue 1/2023

Abstract

Every epidemic affects the lives of many people around the world and can have severe consequences. Recently, many tweets about the COVID-19 pandemic have been shared publicly on social media platforms. Analyzing these tweets helps emergency response organizations prioritize their tasks and make better decisions. However, most of these tweets are non-informative, which makes it challenging to build an automated system that detects useful information on social media. Furthermore, existing methods ignore unlabeled data and topic background knowledge, both of which can provide additional semantic information. In this paper, we propose a novel Topic-Aware BERT (TABERT) model to address these challenges. First, TABERT uses a topic model to extract the latent topics of tweets. Second, a flexible framework combines the topic information with the output of BERT. Finally, we adopt adversarial training to achieve semi-supervised learning, so that a large amount of unlabeled data can be used to improve the internal representations of the model. Experimental results on a dataset of COVID-19 English tweets show that our model outperforms classic and state-of-the-art baselines.
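The second step in the abstract, combining topic information with the output of BERT, can be illustrated with a minimal sketch. The abstract does not say how the "flexible framework" performs the combination, so plain concatenation followed by a linear layer is an assumption made here for illustration only, and it is not presented as the authors' method. The names `fuse` and `classify`, the vector sizes, and the random stand-in vectors are all hypothetical; in practice the text vector would come from BERT and the topic vector from a topic model.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(text_vec, topic_vec):
    """Assumed fusion: concatenate the text vector and the topic vector."""
    return list(text_vec) + list(topic_vec)


def classify(fused, weights, bias):
    """Linear layer plus softmax over two classes (informative or not)."""
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


rng = random.Random(0)

# Hypothetical stand-in for a BERT sentence representation (8 dimensions here).
text_vec = [rng.uniform(-1.0, 1.0) for _ in range(8)]

# Hypothetical stand-in for a topic distribution over 4 latent topics.
topic_vec = softmax([rng.uniform(-1.0, 1.0) for _ in range(4)])

fused = fuse(text_vec, topic_vec)

# Untrained, randomly initialised classifier parameters (illustration only).
weights = [[rng.uniform(-0.1, 0.1) for _ in fused] for _ in range(2)]
bias = [0.0, 0.0]

probs = classify(fused, weights, bias)
print(len(fused), probs)
```

The sketch only shows where topic information would enter the classifier. It leaves out the topic model itself, BERT fine-tuning, and the adversarial semi-supervised training described in the abstract.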

Title: Identifying informative tweets during a pandemic via a topic-aware neural language model
Authors: Wang Gao, Lin Li, Xiaohui Tao, Jing Zhou, Jun Tao
Publication date: 16-03-2022
Publisher: Springer US
Published in: World Wide Web / Issue 1/2023
Print ISSN: 1386-145X
Electronic ISSN: 1573-1413
DOI: https://doi.org/10.1007/s11280-022-01034-1
