2019 | OriginalPaper | Chapter

Application of Extractive Text Summarization Algorithms to Speech-to-Text Media

Authors: Domínguez M. Victor, Fidalgo F. Eduardo, Rubel Biswas, Enrique Alegre, Laura Fernández-Robles

Published in: Hybrid Artificial Intelligent Systems

Publisher: Springer International Publishing

Abstract

This paper shows how speech can be summarized by applying extractive text summarization algorithms to automatic transcripts. Our objective is to recommend which of the six text summarization algorithms evaluated in the study is the most suitable for audio summarization. First, we selected six text summarization algorithms: Luhn, TextRank, LexRank, LSA, SumBasic, and KLSum. We then evaluated them on two datasets, DUC2001 and OWIDSum, using six ROUGE metrics. Next, we selected five speech documents from the ISCI Corpus dataset and transcribed them with the Automatic Speech Recognition (ASR) service of the Google Cloud Speech API. Finally, we applied the six extractive summarization algorithms to these five transcripts to obtain a text summary of each original audio file. Experimental results showed that Luhn and TextRank performed best at extractive speech-to-text summarization on the samples evaluated.
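
The abstract does not say which six ROUGE metrics were used or how they were configured. As a rough illustration of what a ROUGE-N score measures, the sketch below computes clipped n-gram overlap between a candidate summary and a reference summary. The function names, the lowercase whitespace tokenization, and the absence of stemming or stop-word removal are assumptions made for this example, not details taken from the chapter.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Illustrative ROUGE-N: clipped n-gram overlap between two texts.

    Assumption: tokens are lowercase, whitespace-separated words.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)

    # Counter intersection keeps the minimum count of each shared n-gram,
    # so a repeated n-gram is only credited as often as it appears in both.
    overlap = sum((cand & ref).values())
    cand_total = sum(cand.values())
    ref_total = sum(ref.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    candidate = "the cat sat on the mat"
    reference = "the cat lay on the mat"
    print(rouge_n(candidate, reference, n=1))
    print(rouge_n(candidate, reference, n=2))
```

In an evaluation like the one described, the candidate would be the summary produced by one of the algorithms and the reference a human-written summary from the dataset; scores are typically averaged over all documents.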

Title: Application of Extractive Text Summarization Algorithms to Speech-to-Text Media
Authors: Domínguez M. Victor, Fidalgo F. Eduardo, Rubel Biswas, Enrique Alegre, Laura Fernández-Robles
Publisher: Springer International Publishing
Book: Hybrid Artificial Intelligent Systems
Print ISBN: 978-3-030-29858-6
Electronic ISBN: 978-3-030-29859-3
Copyright Year: 2019
DOI: https://doi.org/10.1007/978-3-030-29859-3_46
