Published in: Soft Computing 3/2020

27.05.2019 | Methodologies and Application

An effective quality analysis of XML web data using hybrid clustering and classification approach

Authors: M. Gopianand, P. Jaganathan

Abstract

We propose a method for analyzing the quality of XML web data using a hybrid clustering and classification approach. Because XML is becoming a standard for data representation, supporting keyword search over XML databases is attractive. A keyword search looks for words anywhere in a record, and it has developed into a leading paradigm for finding information on the web. The most important requirement for keyword search is to rank the results of a query so that the most relevant results are shown. In the proposed method, we first collect XML documents and extract features from them. Since the extracted features include both relevant and irrelevant ones, the irrelevant features must be filtered out; a probability-based feature selection method is used to select the relevant features. The relevant features are then clustered on the basis of keywords using a weighted fuzzy c-means clustering algorithm. To assess XML data quality, an optimal neural network (ONN) classifier is used, with its weights selected by the whale optimization algorithm. In this way, the web pages are ranked effectively. The performance of the proposed method is assessed using clustering and classification accuracy, RMSE, and search time. The proposed method is implemented in Java.
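
The abstract names fuzzy c-means clustering and RMSE without giving formulas, so the following is only a minimal sketch of the textbook versions of both, not the authors' implementation. It assumes the standard (unweighted) fuzzy c-means membership update with Euclidean distance; the paper's weighting scheme, feature selection, and ONN classifier are not reproduced here. The class name `XmlQualitySketch`, the method names, and the sample feature vectors are all hypothetical.

```java
public class XmlQualitySketch {

    /** Euclidean distance between two equal-length feature vectors. */
    public static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /**
     * Standard fuzzy c-means membership of one point in each cluster:
     * u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1)), with fuzzifier m > 1.
     * Assumption: this is the plain update, not the weighted variant
     * mentioned in the abstract.
     */
    public static double[] memberships(double[] point, double[][] centers, double m) {
        int c = centers.length;
        double[] dist = new double[c];
        int zeroCount = 0;
        for (int i = 0; i < c; i++) {
            dist[i] = distance(point, centers[i]);
            if (dist[i] == 0.0) {
                zeroCount++;
            }
        }
        double[] u = new double[c];
        if (zeroCount > 0) {
            // The point coincides with at least one center: share full
            // membership among those centers to avoid dividing by zero.
            for (int i = 0; i < c; i++) {
                u[i] = (dist[i] == 0.0) ? 1.0 / zeroCount : 0.0;
            }
            return u;
        }
        double exponent = 2.0 / (m - 1.0);
        for (int i = 0; i < c; i++) {
            double sum = 0.0;
            for (int k = 0; k < c; k++) {
                sum += Math.pow(dist[i] / dist[k], exponent);
            }
            u[i] = 1.0 / sum;
        }
        return u;
    }

    /** Root mean squared error between predicted and actual values. */
    public static double rmse(double[] predicted, double[] actual) {
        double sum = 0.0;
        for (int i = 0; i < predicted.length; i++) {
            double e = predicted[i] - actual[i];
            sum += e * e;
        }
        return Math.sqrt(sum / predicted.length);
    }

    public static void main(String[] args) {
        // Hypothetical two-dimensional keyword features and cluster centers.
        double[][] centers = {{0.0, 0.0}, {4.0, 0.0}};
        double[] u = memberships(new double[] {1.0, 0.0}, centers, 2.0);
        System.out.println("memberships: " + u[0] + ", " + u[1]);
        System.out.println("rmse: " + rmse(new double[] {1.0, 2.0}, new double[] {1.0, 4.0}));
    }
}
```

With fuzzifier m = 2, a point three times closer to the first center than to the second receives nine times the membership in the first cluster, and the memberships sum to one. Each document therefore belongs to every keyword cluster to some degree rather than to exactly one.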

Metadata
Title
An effective quality analysis of XML web data using hybrid clustering and classification approach
Authors
M. Gopianand
P. Jaganathan
Publication date
27.05.2019
Publisher
Springer Berlin Heidelberg
Published in
Soft Computing / Issue 3/2020
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-019-04045-9
