
2023 | OriginalPaper | Chapter

Text Summarization for Big Data Analytics: A Comprehensive Review of GPT 2 and BERT Approaches

Authors: G. Bharathi Mohan, R. Prasanna Kumar, Srinivasan Parathasarathy, S. Aravind, K. B. Hanish, G. Pavithria

Published in: Data Analytics for Internet of Things Infrastructure

Publisher: Springer Nature Switzerland


Abstract

Automatic text summarization aims to construct summaries that capture the essential information of one or more input texts. The Transformer's ability to process text non-sequentially made it possible to train very large models, which has made it the best-known architecture in NLP. Big data methodologies are frequently used to store and process the massive volumes of text involved. This chapter examines big data methodologies and methods such as Bidirectional Encoder Representations from Transformers (BERT) and Generative Pre-trained Transformer 2 (GPT-2) for multi-document summarization. Because the Transformer, BERT, and GPT-2 models achieve very similar accuracy on text summarization, a direct comparison is needed to determine which performs better. In this chapter, the two models are compared, and the results show that BERT outperforms GPT-2. This finding is based on ROUGE scores computed on a news-article dataset of 100 text files.
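The comparison above rests on ROUGE scores, which measure n-gram overlap between a generated summary and a human reference. As an illustration of how such a score is computed, here is a minimal, self-contained ROUGE-N sketch; the function name `rouge_n` is illustrative, and real evaluations would normally use an established package (e.g. the `rouge-score` library) rather than this simplified version.

```python
from collections import Counter

def rouge_n(candidate: str, reference: str, n: int = 1) -> dict:
    """Compute ROUGE-N precision, recall, and F1 between a candidate
    summary and a reference summary via n-gram overlap."""
    def ngrams(text: str, n: int) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    # Clipped overlap: each n-gram counts at most as often as in the reference.
    overlap = sum((cand & ref).values())
    recall = overlap / max(sum(ref.values()), 1)
    precision = overlap / max(sum(cand.values()), 1)
    f1 = 2 * precision * recall / max(precision + recall, 1e-9)
    return {"precision": precision, "recall": recall, "f1": f1}

scores = rouge_n("the cat sat on the mat", "the cat is on the mat", n=1)
print(scores)  # five of six unigrams overlap, so P = R = F1 = 5/6
```

ROUGE-1 and ROUGE-2 (unigram and bigram overlap) are the variants most often reported for summarization; higher values indicate closer agreement with the reference summary.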


Metadata
Copyright Year
2023
DOI
https://doi.org/10.1007/978-3-031-33808-3_14
