Soft Computing

12-11-2019 | Focus

A deep extraction model for an unseen keyphrase detection

Authors: Amin Ghazi Zahedi, Morteza Zahedi, Mansoor Fateh

Published in: Soft Computing | Issue 11/2020

Abstract

Keyphrases represent the basic concepts of a text, and many natural language processing tasks depend on extracting high-quality keyphrases. Previous studies on text modeling have not treated the meanings and concepts associated with a text as particularly significant. According to recent research, documents in the same cluster overlap considerably, especially in keyphrases that do not appear directly in a given document. The proposed model is therefore built around keyphrases that are absent from the document, which we call unseen keyphrases. The model extracts the basic concepts of a text using estimates of the same text and by adding keyphrases to the hidden layers of the deep network during training. The purpose of this structure is first to make unseen keyphrases visible and then to use an RNN to predict them. The proposed method substantially addresses both the failure to represent basic concepts and the problem of unseen keyphrases. The study provides new insight into the concept of a text by highlighting the role of unseen keyphrases, which are made to appear directly without the need for external knowledge. The method is tested on four public datasets in this field. The results show an average improvement of 12% over widely used methods such as TF-IDF, KEA, and RNN.
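To make the notion of an unseen keyphrase concrete, the following is a minimal Python sketch that splits a list of reference keyphrases into those that occur verbatim in a document and those that do not. It is an illustration only, not the authors' model: the function names (`tokenize`, `contains_phrase`, `split_keyphrases`), the sample text, and the matching rule (lowercased, contiguous token match without stemming) are all assumptions made for this example.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def contains_phrase(doc_tokens, phrase_tokens):
    """Return True if phrase_tokens occurs as a contiguous run inside doc_tokens."""
    n = len(phrase_tokens)
    if n == 0:
        return False
    return any(
        doc_tokens[i:i + n] == phrase_tokens
        for i in range(len(doc_tokens) - n + 1)
    )


def split_keyphrases(document, keyphrases):
    """Partition keyphrases into (present, unseen) relative to the document text."""
    doc_tokens = tokenize(document)
    present, unseen = [], []
    for phrase in keyphrases:
        if contains_phrase(doc_tokens, tokenize(phrase)):
            present.append(phrase)
        else:
            unseen.append(phrase)
    return present, unseen


# Hypothetical sample data, not taken from the paper's datasets.
document = "Recurrent neural networks predict keyphrases for scientific text."
keyphrases = ["neural networks", "text mining", "keyphrases"]
present, unseen = split_keyphrases(document, keyphrases)
```

Because the match is exact, a phrase such as "neural network" (singular) would also count as unseen here; a real evaluation would likely stem or lemmatize both sides before comparing.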

Title: A deep extraction model for an unseen keyphrase detection
Authors: Amin Ghazi Zahedi, Morteza Zahedi, Mansoor Fateh
Publication date: 12-11-2019
Publisher: Springer Berlin Heidelberg
Published in: Soft Computing / Issue 11/2020
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI: https://doi.org/10.1007/s00500-019-04486-2
