Published in: Datenbank-Spektrum 2/2013

1 July 2013 | Focus article

Improving RDF Data Through Association Rule Mining

Authors: Ziawasch Abedjan, Felix Naumann


Abstract

Linked Open Data comprises many, often large, public data sets, most of which are represented in the RDF triple structure of subject, predicate, and object. However, the heterogeneity of the available open data requires significant integration steps before it can be used in applications. A promising and novel technique for exploring such data is association rule mining. We introduce “mining configurations”, which allow us to mine RDF data sets in various ways. Different configurations enable us to identify schema and value dependencies that, in combination, give rise to interesting use cases. We present rule-based approaches for predicate suggestion, data enrichment, ontology improvement, and query relaxation. On the one hand, we prevent inconsistencies in the data through predicate suggestion, enrichment with missing facts, and alignment of the corresponding ontology. On the other hand, we help users handle inconsistencies during query formulation through predicate expansion techniques. Based on these approaches, we show that association rule mining benefits the integration and usability of RDF data.
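The abstract does not spell out how a mining configuration is defined, so the following is only a minimal sketch under stated assumptions: a configuration is taken to be a choice of one triple position as the grouping context and another as the mined target. With subjects as context and predicates as target, each subject becomes a transaction of predicates, and pairwise rules with sufficient support and confidence can drive a simple form of predicate suggestion. The toy triples, the thresholds, and the function names `build_transactions`, `mine_pair_rules`, and `suggest_predicates` are hypothetical and not taken from the article.

```python
from collections import defaultdict
from itertools import permutations

# Toy triples (hypothetical data, not from the article).
triples = [
    ("Berlin", "type", "City"),
    ("Berlin", "population", "3500000"),
    ("Berlin", "country", "Germany"),
    ("Paris", "type", "City"),
    ("Paris", "population", "2200000"),
    ("Paris", "country", "France"),
    ("Potsdam", "type", "City"),
    ("Potsdam", "country", "Germany"),
]


def build_transactions(triples, context=0, target=1):
    """Group the `target` position of each triple by its `context` position.

    Assumption: positions are 0 = subject, 1 = predicate, 2 = object.
    """
    transactions = defaultdict(set)
    for triple in triples:
        transactions[triple[context]].add(triple[target])
    return dict(transactions)


def mine_pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return rules (a, b, support, confidence), each meaning a -> b."""
    n = len(transactions)
    item_count = defaultdict(int)
    pair_count = defaultdict(int)
    for items in transactions.values():
        for item in items:
            item_count[item] += 1
        # Count every ordered pair of distinct items in the transaction.
        for a, b in permutations(sorted(items), 2):
            pair_count[(a, b)] += 1
    rules = []
    for (a, b), count in pair_count.items():
        support = count / n                 # share of transactions with a and b
        confidence = count / item_count[a]  # share of a-transactions that have b
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return sorted(rules)


def suggest_predicates(present, rules):
    """Suggest predicates implied by rules whose antecedent is already present."""
    return sorted({b for a, b, _, _ in rules if a in present and b not in present})


transactions = build_transactions(triples)
rules = mine_pair_rules(transactions)
suggestions = suggest_predicates(transactions["Potsdam"], rules)
print(suggestions)
```

Passing different `context` and `target` positions to `build_transactions` (for example, grouping objects by predicate) yields other transaction views of the same triples. Which of these views correspond to the use cases named in the abstract is not stated there and is left open here.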

Metadata
Title
Improving RDF Data Through Association Rule Mining
Authors
Ziawasch Abedjan
Felix Naumann
Publication date
1 July 2013
Publisher
Springer-Verlag
Published in
Datenbank-Spektrum / Issue 2/2013
Print ISSN: 1618-2162
Electronic ISSN: 1610-1995
DOI
https://doi.org/10.1007/s13222-013-0126-x
