Published in: Knowledge and Information Systems, Issue 2/2024

9 October 2023 | Regular Paper

Ontology-based soft computing and machine learning model for efficient retrieval

Authors: Sanjay Kumar Anand, Suresh Kumar

Abstract

Unstructured and unorganized data degrade the performance of search techniques: they produce irrelevant results in response to a query and slow down retrieval. In the semantic web (SW), ontology offers an adequate way to represent knowledge because it captures the backbone knowledge of an application or domain. However, retrieving useful knowledge from a domain ontology raises three basic problems: (a) structuring and arrangement, (b) reduction of unnecessary knowledge, selection, and extraction, and (c) speeding up the retrieval process. To resolve these problems, we propose a multi-level k-means clustering approach with a rough set and Bayesian network model for ontology (MLK-rBO). The proposed model works in four phases: clustering, knowledge discovery, building a probabilistic network, and model evaluation. It combines three techniques, namely clustering, rough set (RS), and Bayesian network (BN). Finally, the model is tested with statistical parameters and compared with other models, namely decision tree (DT), random forest (RF), and support vector machine (SVM), to evaluate its performance. In our experiments, MLK-rBO achieves better accuracy than the compared models: 98.36% on survey data (fever) and 86% on Wine quality data.
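
The abstract names rough sets as the knowledge-discovery component but does not say how they are applied. As a purely illustrative sketch, the following Python computes the standard Pawlak lower and upper approximations of a target concept. The toy records, the attribute names, and the helper functions `indiscernibility_classes` and `approximations` are hypothetical and are not taken from the paper or its survey data.

```python
from collections import defaultdict


def indiscernibility_classes(rows, attributes):
    """Group row ids whose values agree on every attribute in `attributes`."""
    classes = defaultdict(set)
    for row_id, row in rows.items():
        key = tuple(row[a] for a in attributes)
        classes[key].add(row_id)
    return list(classes.values())


def approximations(rows, attributes, target):
    """Return the (lower, upper) rough-set approximations of `target`."""
    lower, upper = set(), set()
    for block in indiscernibility_classes(rows, attributes):
        if block <= target:   # block lies entirely inside the concept
            lower |= block
        if block & target:    # block overlaps the concept
            upper |= block
    return lower, upper


# Hypothetical toy records (assumption: NOT the paper's fever survey data).
rows = {
    1: {"temperature": "high", "headache": "yes", "fever": "yes"},
    2: {"temperature": "high", "headache": "yes", "fever": "no"},
    3: {"temperature": "high", "headache": "no", "fever": "yes"},
    4: {"temperature": "normal", "headache": "no", "fever": "no"},
    5: {"temperature": "normal", "headache": "yes", "fever": "no"},
}
condition_attributes = ["temperature", "headache"]
fever_cases = {rid for rid, r in rows.items() if r["fever"] == "yes"}

lower, upper = approximations(rows, condition_attributes, fever_cases)
boundary = upper - lower  # records the attributes cannot classify with certainty

print("lower:", sorted(lower))
print("upper:", sorted(upper))
print("boundary:", sorted(boundary))
```

Records 1 and 2 share the same condition values but differ in outcome, so they fall into the boundary region. Under this reading, the lower approximation holds the records that can be labeled with certainty, which is one plausible way a rough set step could prune knowledge before a probabilistic network is built.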

Metadata
Title
Ontology-based soft computing and machine learning model for efficient retrieval
Authors
Sanjay Kumar Anand
Suresh Kumar
Publication date
9 October 2023
Publisher
Springer London
Published in
Knowledge and Information Systems / Issue 2/2024
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-023-01990-8
