Published in: Artificial Intelligence Review 9/2023

22.02.2023

Impact of word embedding models on text analytics in deep learning environment: a review

Authors: Deepak Suresh Asudani, Naresh Kumar Nagwani, Pradeep Singh


Abstract

Selecting suitable word embedding and deep learning models is vital for achieving good results. Word embeddings are n-dimensional distributed representations of text that attempt to capture the meanings of words. Deep learning models use multiple computing layers to learn hierarchical representations of data. Word embedding combined with deep learning has received much attention and is applied in various natural language processing (NLP) tasks, such as text classification, sentiment analysis, named entity recognition, and topic modeling. This paper reviews representative methods of the most prominent word embedding and deep learning models. It presents an overview of recent research trends in NLP and a detailed account of how to use these models to achieve efficient results on text analytics tasks. The review summarizes, contrasts, and compares numerous word embedding and deep learning models and includes a list of prominent datasets, tools, APIs, and popular publications. Based on a comparative analysis of different techniques, it offers a reference for selecting a suitable word embedding and deep learning approach for text analytics tasks. The paper can serve as a quick reference on the basics, benefits, and challenges of various word representation approaches and deep learning models, their application to text analytics, and the future outlook for research. The findings of this study indicate that domain-specific word embeddings and the long short-term memory (LSTM) model can be employed to improve overall text analytics task performance.
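To make the review's concluding recommendation concrete, the sketch below shows one common way to combine a domain-specific word embedding with an LSTM classifier: train a Word2Vec model on an in-domain corpus (here with gensim) and use the resulting vectors to initialize the embedding layer of a small LSTM text classifier (here in PyTorch). This is a minimal illustration, not the authors' code; the toy corpus, labels, library choices, and hyperparameters are assumptions made for the example.

```python
# Minimal sketch: domain-specific Word2Vec embeddings feeding an LSTM classifier.
# All data and hyperparameters below are illustrative placeholders.
import numpy as np
import torch
import torch.nn as nn
from gensim.models import Word2Vec

# Toy in-domain corpus: each document is a list of tokens.
corpus = [
    ["deep", "learning", "improves", "text", "classification"],
    ["word", "embeddings", "capture", "word", "meaning"],
    ["lstm", "models", "handle", "long", "sequences"],
]
labels = torch.tensor([1.0, 0.0, 1.0])  # illustrative binary labels

# 1. Domain-specific embedding: train Word2Vec on the in-domain corpus.
emb_dim = 50
w2v = Word2Vec(sentences=corpus, vector_size=emb_dim, window=2, min_count=1, epochs=100)

# 2. Build a vocabulary (index 0 reserved for padding) and an embedding matrix.
vocab = {w: i + 1 for i, w in enumerate(w2v.wv.index_to_key)}
emb_matrix = np.zeros((len(vocab) + 1, emb_dim), dtype="float32")
for w, i in vocab.items():
    emb_matrix[i] = w2v.wv[w]

# 3. Encode documents as fixed-length, zero-padded index sequences.
max_len = 5
X = torch.zeros(len(corpus), max_len, dtype=torch.long)
for d, doc in enumerate(corpus):
    for t, w in enumerate(doc[:max_len]):
        X[d, t] = vocab[w]

# 4. LSTM classifier initialized with the domain-specific embeddings.
class LSTMClassifier(nn.Module):
    def __init__(self, emb_matrix, hidden=32):
        super().__init__()
        self.emb = nn.Embedding.from_pretrained(
            torch.from_numpy(emb_matrix), freeze=False, padding_idx=0)
        self.lstm = nn.LSTM(emb_matrix.shape[1], hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)

    def forward(self, x):
        _, (h_n, _) = self.lstm(self.emb(x))   # final hidden state per document
        return self.out(h_n[-1]).squeeze(-1)   # one logit per document

model = LSTMClassifier(emb_matrix)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()
for _ in range(20):                            # tiny illustrative training loop
    optimizer.zero_grad()
    loss = loss_fn(model(X), labels)
    loss.backward()
    optimizer.step()
```

In practice the same structure scales up: the in-domain corpus replaces the toy sentences, and the embedding layer can be kept trainable (as here) or frozen, depending on how much labeled data is available.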


Metadata
Title
Impact of word embedding models on text analytics in deep learning environment: a review
Authors
Deepak Suresh Asudani
Naresh Kumar Nagwani
Pradeep Singh
Publication date
22.02.2023
Publisher
Springer Netherlands
Published in
Artificial Intelligence Review / Issue 9/2023
Print ISSN: 0269-2821
Electronic ISSN: 1573-7462
DOI
https://doi.org/10.1007/s10462-023-10419-1
