Published in: Artificial Intelligence Review 2/2013

1 August 2013

Discovering meaning on the go in large heterogenous data

Authors: Harry Halpin, Fiona McNeill



Abstract

The world is increasingly full of data. Organisations, governments and individuals are creating increasingly large data sources, and in many cases making them publicly available. This offers massive potential for interaction and mutual collaboration. But using this data often creates problems. Those creating the data will use their own terminology, structure and formats for the data, meaning that data from one source will be incompatible with data from another source. When presented with a large, unknown data source, it is very difficult to ascribe meaning to the terms of that data source, and to understand what is being conveyed. Much effort has been invested in data interpretation prior to run-time, with large data sources being matched against each other off-line. But data is often used dynamically, and so to maximise the value of the data it is necessary to extract meaning from it dynamically. We therefore postulate that an essential component of utilising the world of data in which we increasingly live is the development of the ability to discover meaning on the go in large, heterogenous data. This paper provides an overview of the current state-of-the-art, reviewing the aims and achievements in different fields which can be applied to this problem. We take a brief look at cutting-edge research in this field, summarising four papers published in the special issue of the AI Review on Discovering Meaning on the go in Large Heterogenous Data, and conclude with our thoughts about where research in this field is going, and what our priorities must be to enable us to move closer to achieving this goal.


Footnotes
5
These are the papers included in the Special Issue of the AI Review on Discovering Meaning on the go in Large Heterogeneous Data.
 
9
The size of the index of Sindice, the largest Linked Data search engine, as of September 2012. See http://sindice.com/.
 
10
JavaScript Object Notation, a simple key-value pair notation, see www.json.org/ for details.
 
13
Usually called “vocabularies” to emphasize their social nature and lack of use of inference, so as to distinguish them from heavy-weight description logic-based formalisms.
 
15
Schema.org deploys an HTML5 feature known as “microdata” to put markup into web-pages (Hickson 2012). Microdata is structurally similar to JSON insofar as it consists of markup that lets parts of web-pages be labeled as types of “item” that have key-value pair “item properties.” After much debate, schema.org also adopted a subset of RDFa, a way to embed RDF directly into web-pages (Adida et al. 2008). Although RDFa is much more flexible, it comes at the cost of being more confusing for web-masters.
 
18
This list of techniques is adapted from the figure on p. 65 of Euzenat and Shvaiko (2007).
 
References
Angles R, Gutierrez C (2008) Survey of graph database models. ACM Comput Surv 40(1):1–39
Auer S, Bizer C, Lehmann J, Kobilarov G, Cyganiak R, Ives Z (2007) DBpedia: a nucleus for a web of open data. In: Proceedings of the international and Asian semantic web conference (ISWC/ASWC2007), Busan, Korea, pp 718–728
Aurnhammer M, Hanappe P, Steels L (2006) Augmenting navigation for collaborative tagging with emergent semantics. In: Proceedings of the 5th international conference on the semantic web, ISWC’06, Springer, Berlin, Heidelberg, pp 58–71
Baeza-Yates R, Ribeiro-Neto B (1999) Modern information retrieval. Addison Wesley-Longman, New York City
Baeza-Yates RA, Ciaramita M, Mika P, Zaragoza H (2008) Towards semantic search. In: Proceedings of conference on applications of natural language to information systems (NLDB), pp 4–11
Bizer C (2004) D2rq: treating non-rdf databases as virtual rdf graphs. In: Proceedings of the 3rd international semantic web conference (ISWC2004)
Blanco R, Halpin H, Herzig D, Mika P, Pound J, Thompson H, Duc TT (2011) Entity search evaluation over structured web data. In: Proceedings of the 1st international workshop on entity-oriented search (SIGIR 2011), ACM, New York, NY, USA
Choi N, Song IY, Han H (2006) A survey on ontology mapping. SIGMOD Rec 35:34–41
Crestani F, Dominich S, Lalmas M, van Rijsbergen CJ (2003) Mathematical, logical and formal methods in information retrieval: an introduction to the special issue. J Am Soc Inf Sci Technol 54(4):281–284
Cudré-Mauroux P, Haghani P, Jost M, Aberer K, De Meer H (2009) idmesh: graph-based disambiguation of linked data. In: Proceedings of the 18th international conference on world wide web, ACM, New York, NY, USA, WWW ’09, pp 591–600
Euzenat J, Shvaiko P (2007) Ontology Matching. Springer, Berlin
Euzenat J, Valtchev P (2004) Similarity-based ontology alignment in owl-lite. In: ECAI, pp 333–337
Euzenat J, Meilicke C, Stuckenschmidt H, Shvaiko P, dos Santos CT (2011) Ontology alignment evaluation initiative: six years of experience. J Data Semant 15:158–192
Fensel D (2001) Ontologies: a silver bullet for knowledge management and electronic commerce. Springer, London
Franklin MJ, Kossmann D, Kraska T, Ramesh S, Xin R (2011) Crowddb: answering queries with crowdsourcing. In: Proceedings of the 2011 ACM SIGMOD international conference on management of data, ACM, New York, NY, USA, SIGMOD ’11, pp 61–72
Gruber T (2004) Every ontology is a treaty. SIGSEMIS, Bulletin 1
Guha RV, Lenat D (1993) Language, representation and contexts. J Inf Process 15(3):340–349
Halpin H (2012) Social semantics: the search for meaning on the web. Springer, London
Halpin H, Lavrenko V (2011) Relevance feedback between web search and the semantic web. In: Proceedings of the international joint conference on artificial intelligence (IJCAI), Barcelona, Spain, pp 2250–2255
Halpin H, Hayes PJ, McCusker JP, McGuinness DL, Thompson HS (2010) When owl:sameAs isn’t the same: an analysis of identity in linked data. In: Proceedings of the 9th international semantic web conference on the semantic web, vol Part I, Springer, Berlin, Heidelberg, ISWC’10, pp 305–320. http://dl.acm.org/citation.cfm?id=1940281.1940302
Havely A (2005) Why your data won’t mix. ACM Queue 3(8):50–58
Horrocks I, Patel-Schneider P, van Harmelen F (2003) From SHIQ and RDF to OWL: the making of a web ontology language. J Web Semant 1(1):17–26
Horrocks I, Parsia B, Patel-Schneider P, Hendler J (2005) Semantic web architecture: stack or two towers? In: Proceedings of the third international conference on principles and practice of semantic web reasoning, Springer, Berlin, Heidelberg, PPSWR’05, pp 37–41
Jones KS (2004) What’s new about the semantic web?: some questions. SIGIR Forum 38(2):18–23
Kalfoglou Y, Schorlemmer M (2003) Ontology mapping: the state of the art. Knowl Eng Rev 18(1):1–31
Kwok C, Etzioni O, Weld DS (2001) Scaling question answering to the web. ACM Trans Inf Syst 19(3):242–262
Mazzieri M, Dragoni AF, Marche UPD (2005) A fuzzy semantics for semantic web languages. In: Proceedings of workshop on uncertainty reasoning for the semantic web (URSW) at the 4th international semantic web conference (ISWC), pp 12–22
McCarthy J, Hayes P (1969) Some philosophical problems from the standpoint of artificial intelligence. In: Meltzer B, Michie D (eds) Machine intelligence, vol 4. Edinburgh University Press, Edinburgh, pp 463–502
Mika P (2008) Microsearch: an interface for semantic search. In: Proceedings of semantic search workshop at the European semantic web conference
Moraru A, Mladenic D, Vucnik M, Porcius M, Fortuna C, Mohorcic M (2011) Exposing real world information for the web of things. In: Proceedings of the 8th International Workshop on Information Integration on the Web: in conjunction with WWW 2011, ACM, New York, NY, USA, IIWeb ’11, pp 6:1–6:6
Noy NF (2004) Semantic integration: a survey of ontology-based approaches. SIGMOD Rec 33:65–70
Pentland A (2012) Society’s nervous system: building effective government, energy, and public health systems. IEEE Comput 45(1):31–38
Putnam H (1975) The meaning of meaning. In: Gunderson K (ed) Language, mind, and knowledge. University of Minnesota Press, Minneapolis
Reiter R (1978) Logic and data bases. In: On closed world data bases. Plenum Publishing, New York City, New York
Shvaiko P, Euzenat J (2008) Ten challenges for ontology matching. In: OTM conferences (2), pp 1164–1182
Shvaiko P, Euzenat J (2012) Ontology matching: state of the art and future challenges. IEEE Trans Knowl Data Eng (to appear)
Silverstein C, Marais H, Henzinger M, Moricz M (1999) Analysis of a very large web search engine query log. SIGIR Forum 33(1):6–12
Simperl E, Acosta M, Norton B (2012) A semantically enabled architecture for crowdsourced linked data management. In: CrowdSearch WWW2012 workshop proceedings, pp 9–14
Von Ahn L (2005) Human computation. PhD thesis, Carnegie Mellon University, Pittsburgh, PA, USA, aAI3205378
Weinberger D (2007) Everything is miscellaneous: the power of the new digital disorder. Times Books, New York City
Wilks Y (2007) Karen Spärck Jones (1935–2007). IEEE Intell Syst 22(3):8–9
Wittgenstein L (1953) Philosophical investigations. Blackwell Publishers, London, United Kingdom (republished 2001)
Wun A, Petrović M, Jacobsen HA (2007) A system for semantic data fusion in sensor networks. In: Proceedings of the 2007 inaugural international conference on distributed event-based systems, ACM, New York, NY, USA, DEBS ’07, pp 75–79
Metadata
Title
Discovering meaning on the go in large heterogenous data
Authors
Harry Halpin
Fiona McNeill
Publication date
1 August 2013
Publisher
Springer Netherlands
Published in
Artificial Intelligence Review / Issue 2/2013
Print ISSN: 0269-2821
Electronic ISSN: 1573-7462
DOI
https://doi.org/10.1007/s10462-012-9377-4
