Published in: Soft Computing 22/2020

6 May 2020 | Methodologies and Application

Semantic analysis-based relevant data retrieval model using feature selection, summarization and CNN

Authors: Antony Rosewelt, Arokia Renjit

Abstract

Semantic analysis plays a major role in text mining because the Internet and other resources contain a huge amount of both relevant and irrelevant data. Successful extraction of relevant data through data classification therefore requires semantic-based text summarization. Recently, accurate classification has been performed with deep learning techniques. However, no existing model achieves reasonable relevancy accuracy. To overcome these drawbacks, we propose an effective semantic analysis-based relevant data retrieval model for retrieving relevant data from local repositories or web applications on the Internet. The new model consists of (i) semantic similarity-based feature selection, (ii) an enrichment technique, (iii) a data summarization technique and (iv) a text relationship-based deep neural network classifier. We propose a new semantic analysis-based feature selection algorithm to select the similarity-indexed relevant data from local repositories or web applications. In addition, a new semantic-based data summarization technique is introduced for summarizing the text available in online resources. Finally, a new semantic similarity-based deep neural network classifier is introduced for categorizing the data according to their semantic relations. The effectiveness of the proposed model in data retrieval is demonstrated through various experiments on relevant data extraction from Internet resources, and the model is also tested on recognized datasets.
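The abstract does not specify the similarity measure or the algorithms themselves, so the following is only a rough sketch of the general idea of similarity-based selection followed by extractive summarization. It assumes a plain bag-of-words cosine similarity as a stand-in for the paper's semantic similarity, and all names (`tokenize`, `cosine_similarity`, `select_relevant`, `summarize`) and the `threshold` value are hypothetical. The enrichment step and the deep neural network classifier are not covered.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_relevant(query, documents, threshold=0.2):
    """Keep documents whose similarity to the query meets the threshold, best first.

    Assumption: lexical overlap stands in for a real semantic similarity measure.
    """
    q = Counter(tokenize(query))
    scored = [(cosine_similarity(q, Counter(tokenize(d))), d) for d in documents]
    kept = [(s, d) for s, d in scored if s >= threshold]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)


def summarize(document, query, k=1):
    """Extractive summary: the k sentences most similar to the query, in original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    q = Counter(tokenize(query))
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: cosine_similarity(q, Counter(tokenize(sentences[i]))),
        reverse=True,
    )
    return [sentences[i] for i in sorted(ranked[:k])]


if __name__ == "__main__":
    docs = [
        "Deep learning for text classification",
        "Recipes for chocolate cake",
        "Text mining and semantic classification of documents",
    ]
    for score, doc in select_relevant("semantic text classification", docs):
        print(f"{score:.3f}  {doc}")
```

In a pipeline of the kind the abstract outlines, the selected and summarized text would then be passed to a classifier; replacing the term-frequency vectors with word or sentence embeddings would be one way to move from lexical to semantic similarity.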

Metadata
Title
Semantic analysis-based relevant data retrieval model using feature selection, summarization and CNN
Authors
Antony Rosewelt
Arokia Renjit
Publication date
6 May 2020
Publisher
Springer Berlin Heidelberg
Published in
Soft Computing / Issue 22/2020
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-020-04990-w