2022 | OriginalPaper | Book chapter

Comparative Analysis of NLP Text Embedding Techniques with Neural Network Layered Architecture on Online Movie Reviews

Written by: Hemlata Goyal, Amar Sharma, Ranu Sewada, Devansh Arora, Sunita Singhal

Published in: Artificial Intelligence and Speech Technology

Publisher: Springer International Publishing

Abstract

In NLP, text must be converted into numerical form before a machine learning architecture can use it, and text embedding is the means of doing so. This research compares four text embedding methods, Binary Term Frequency, Count Vector, Term Frequency-Inverse Document Frequency (TF-IDF), and Word2Vec, for converting text into meaningful numerical vector representations. To analyze the performance of these techniques, a layered neural network architecture is designed for polarity classification of movie reviews. It consists of an input layer, dense layers followed by ReLU (Rectified Linear Unit) activation layers, and a Sigmoid activation function that makes the classification; the embeddings are compared on the basis of training and testing performance. Word2Vec scored the highest training and testing accuracy of the four techniques on the online movie reviews, with 89.75% and 86.94% respectively, with \(\pm 1.0\) error.
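As a rough illustration of the pipeline the abstract describes, the sketch below builds Binary Term Frequency, Count Vector, and TF-IDF features for a toy corpus and passes one feature vector through a dense layer with ReLU followed by a sigmoid output. Everything specific here is an assumption made for illustration: the three-document corpus, the plain log(N/df) form of IDF, the layer sizes, and the hand-picked, untrained weights. None of it is taken from the chapter, and Word2Vec is left out because it requires a trained embedding model.

```python
import math
from collections import Counter

# Toy corpus (hypothetical; not the movie-review data used in the chapter).
docs = ["great movie great cast", "boring movie", "great fun"]
tokenized = [d.split() for d in docs]
vocab = sorted({w for doc in tokenized for w in doc})


def count_vector(tokens):
    """Count Vector: raw number of occurrences of each vocabulary word."""
    counts = Counter(tokens)
    return [counts[w] for w in vocab]


def binary_vector(tokens):
    """Binary Term Frequency: 1 if the word occurs in the document, else 0."""
    present = set(tokens)
    return [1 if w in present else 0 for w in vocab]


def idf(word):
    """Textbook IDF, log(N / df); libraries often use smoothed variants."""
    df = sum(1 for doc in tokenized if word in doc)
    return math.log(len(tokenized) / df)


def tfidf_vector(tokens):
    """TF-IDF: relative term frequency scaled by inverse document frequency."""
    counts = Counter(tokens)
    return [counts[w] / len(tokens) * idf(w) for w in vocab]


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(i * w for i, w in zip(inputs, row)) + b
            for row, b in zip(weights, biases)]


def predict(features, w1, b1, w2, b2):
    """Dense -> ReLU -> Dense -> Sigmoid; returns a score in (0, 1)."""
    hidden = [relu(h) for h in dense(features, w1, b1)]
    (logit,) = dense(hidden, w2, b2)
    return sigmoid(logit)


# Hypothetical, untrained weights: 5 inputs -> 2 hidden units -> 1 output.
w1 = [[0.5, -0.2, 0.1, 0.8, 0.0], [-0.7, 0.3, 0.2, -0.4, 0.1]]
b1 = [0.0, 0.1]
w2 = [[1.0, -1.0]]
b2 = [0.0]

features = binary_vector(tokenized[0])
score = predict(features, w1, b1, w2, b2)
print(vocab, features, score)
```

In a real experiment the weights would be learned from labelled reviews, and each embedding would feed the same architecture so that training and testing accuracy can be compared like for like.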

Metadata
Title
Comparative Analysis of NLP Text Embedding Techniques with Neural Network Layered Architecture on Online Movie Reviews
Written by
Hemlata Goyal
Amar Sharma
Ranu Sewada
Devansh Arora
Sunita Singhal
Copyright year
2022
DOI
https://doi.org/10.1007/978-3-030-95711-7_20