Published in: Artificial Intelligence Review 2/2013

01-08-2013

Discovering meaning on the go in large heterogenous data

Authors: Harry Halpin, Fiona McNeill

Abstract

The world is increasingly full of data. Organisations, governments and individuals are creating increasingly large data sources, and in many cases making them publicly available. This offers massive potential for interaction and mutual collaboration. But using this data often creates problems. Those creating the data will use their own terminology, structure and formats for the data, meaning that data from one source will be incompatible with data from another source. When presented with a large, unknown data source, it is very difficult to ascribe meaning to the terms of that data source, and to understand what is being conveyed. Much effort has been invested in data interpretation prior to run-time, with large data sources being matched against each other off-line. But data is often used dynamically, and so to maximise the value of the data it is necessary to extract meaning from it dynamically. We therefore postulate that an essential component of utilising the world of data in which we increasingly live is the development of the ability to discover meaning on the go in large, heterogenous data. This paper provides an overview of the current state-of-the-art, reviewing the aims and achievements in different fields which can be applied to this problem. We take a brief look at cutting-edge research in this field, summarising four papers published in the special issue of the AI Review on Discovering Meaning on the go in Large Heterogenous Data, and conclude with our thoughts about where research in this field is going, and what our priorities must be to enable us to move closer to achieving this goal.


Footnotes
5
These are the papers included in the Special Issue of the AI Review on Discovering Meaning on the go in Large Heterogeneous Data.
 
9
The size of the index of Sindice, the largest Linked Data search engine, as of September 2012. See http://sindice.com/.
 
10
JavaScript Object Notation, a simple key-value pair notation; see www.json.org/ for details.
 
13
Usually called “vocabularies” to emphasize their social nature and lack of use of inference, and to distinguish them from heavy-weight description logic-based formalisms.
 
15
Schema.org deploys an HTML5 feature known as “microdata” to put markup into web-pages (Hickson 2012). Microdata is structurally similar to JSON insofar as it consists of markup that lets parts of web-pages be labeled as types of “item” that have key-value pair “item properties.” After much debate, schema.org also adopted a subset of RDFa, a way to embed RDF directly into web-pages (Adida et al. 2008). Although RDFa is much more flexible, it comes at the cost of being more confusing for web-masters.
 
18
This list of techniques is adapted from the figure on p. 65 of Euzenat and Shvaiko (2007).
 
Literature
Angles R, Gutierrez C (2008) Survey of graph database models. ACM Comput Surv 40(1):1–39
Auer S, Bizer C, Lehmann J, Kobilarov G, Cyganiak R, Ives Z (2007) DBpedia: a nucleus for a web of open data. In: Proceedings of the international and Asian semantic web conference (ISWC/ASWC2007), Busan, Korea, pp 718–728
Aurnhammer M, Hanappe P, Steels L (2006) Augmenting navigation for collaborative tagging with emergent semantics. In: Proceedings of the 5th international conference on the semantic web, ISWC’06, Springer, Berlin, Heidelberg, pp 58–71
Baeza-Yates R, Ribeiro-Neto B (1999) Modern information retrieval. Addison Wesley-Longman, New York City
Baeza-Yates RA, Ciaramita M, Mika P, Zaragoza H (2008) Towards semantic search. In: Proceedings of conference on applications of natural language to information systems (NLDB), pp 4–11
Bizer C (2004) D2rq: treating non-rdf databases as virtual rdf graphs. In: Proceedings of the 3rd international semantic web conference (ISWC2004)
Blanco R, Halpin H, Herzig D, Mika P, Pound J, Thompson H, Duc TT (2011) Entity search evaluation over structured web data. In: Proceedings of the 1st international workshop on entity-oriented search (SIGIR 2011), ACM, New York, NY, USA
Choi N, Song IY, Han H (2006) A survey on ontology mapping. SIGMOD Rec 35:34–41
Crestani F, Dominich S, Lalmas M, van Rijsbergen CJ (2003) Mathematical, logical and formal methods in information retrieval: an introduction to the special issue. J Am Soc Inf Sci Technol 54(4):281–284
Cudré-Mauroux P, Haghani P, Jost M, Aberer K, De Meer H (2009) idmesh: graph-based disambiguation of linked data. In: Proceedings of the 18th international conference on world wide web, ACM, New York, NY, USA, WWW ’09, pp 591–600
Euzenat J, Shvaiko P (2007) Ontology Matching. Springer, Berlin
Euzenat J, Valtchev P (2004) Similarity-based ontology alignment in owl-lite. In: ECAI, pp 333–337
Euzenat J, Meilicke C, Stuckenschmidt H, Shvaiko P, dos Santos CT (2011) Ontology alignment evaluation initiative: six years of experience. J Data Semant 15:158–192
Fensel D (2001) Ontologies: a silver bullet for knowledge management and electronic commerce. Springer, London
Franklin MJ, Kossmann D, Kraska T, Ramesh S, Xin R (2011) Crowddb: answering queries with crowdsourcing. In: Proceedings of the 2011 ACM SIGMOD international conference on management of data, ACM, New York, NY, USA, SIGMOD ’11, pp 61–72
Gruber T (2004) Every ontology is a treaty. SIGSEMIS, Bulletin 1
Guha RV, Lenat D (1993) Language, representation and contexts. J Inf Process 15(3):340–349
Halpin H (2012) Social semantics: the search for meaning on the web. Springer, London
Halpin H, Lavrenko V (2011) Relevance feedback between web search and the semantic web. In: Proceedings of the international joint conference on artificial intelligence (IJCAI), Barcelona, Spain, pp 2250–2255
Halpin H, Hayes PJ, McCusker JP, McGuinness DL, Thompson HS (2010) When owl:sameAs isn’t the same: an analysis of identity in linked data. In: Proceedings of the 9th international semantic web conference on the semantic web, vol Part I, Springer, Berlin, Heidelberg, ISWC’10, pp 305–320. http://dl.acm.org/citation.cfm?id=1940281.1940302
Halevy A (2005) Why your data won’t mix. ACM Queue 3(8):50–58
Horrocks I, Patel-Schneider P, van Harmelen F (2003) From SHIQ and RDF to OWL: the making of a web ontology language. J Web Semant 1(1):17–26
Horrocks I, Parsia B, Patel-Schneider P, Hendler J (2005) Semantic web architecture: stack or two towers? In: Proceedings of the third international conference on principles and practice of semantic web reasoning, Springer, Berlin, Heidelberg, PPSWR’05, pp 37–41
Jones KS (2004) What’s new about the semantic web?: some questions. SIGIR Forum 38(2):18–23
Kalfoglou Y, Schorlemmer M (2003) Ontology mapping: the state of the art. Knowl Eng Rev 18(1):1–31
Kwok C, Etzioni O, Weld DS (2001) Scaling question answering to the web. ACM Trans Inf Syst 19(3):242–262
Mazzieri M, Dragoni AF, Marche UPD (2005) A fuzzy semantics for semantic web languages. In: Proceedings of workshop on uncertainty reasoning for the semantic web (URSW) at the 4th international semantic web conference (ISWC), pp 12–22
McCarthy J, Hayes P (1969) Some philosophical problems from the standpoint of artificial intelligence. In: Meltzer B, Michie D (eds) Machine intelligence, vol 4. Edinburgh University Press, Edinburgh, pp 463–502
Mika P (2008) Microsearch: an interface for semantic search. In: Proceedings of semantic search workshop at the European semantic web conference
Moraru A, Mladenic D, Vucnik M, Porcius M, Fortuna C, Mohorcic M (2011) Exposing real world information for the web of things. In: Proceedings of the 8th International Workshop on Information Integration on the Web: in conjunction with WWW 2011, ACM, New York, NY, USA, IIWeb ’11, pp 6:1–6:6
Noy NF (2004) Semantic integration: a survey of ontology-based approaches. SIGMOD Rec 33:65–70
Pentland A (2012) Society’s nervous system: building effective government, energy, and public health systems. IEEE Comput 45(1):31–38
Putnam H (1975) The meaning of meaning. In: Gunderson K (ed) Language, mind, and knowledge. University of Minnesota Press, Minneapolis
Reiter R (1978) Logic and data bases. In: On closed world data bases. Plenum Publishing, New York City, New York
Shvaiko P, Euzenat J (2008) Ten challenges for ontology matching. In: OTM conferences (2), pp 1164–1182
Shvaiko P, Euzenat J (2012) Ontology matching: state of the art and future challenges. IEEE Trans Knowl Data Eng (to appear)
Silverstein C, Marais H, Henzinger M, Moricz M (1999) Analysis of a very large web search engine query log. SIGIR Forum 33(1):6–12
Simperl E, Acosta M, Norton B (2012) A semantically enabled architecture for crowdsourced linked data management. In: CrowdSearch WWW2012 workshop proceedings, pp 9–14
Von Ahn L (2005) Human computation. PhD thesis, Carnegie Mellon University, Pittsburgh, PA, USA, aAI3205378
Weinberger D (2007) Everything is miscellaneous: the power of the new digital disorder. Times Books, New York City
Wilks Y (2007) Karen Spärck Jones (1935–2007). IEEE Intell Syst 22(3):8–9
Wittgenstein L (1953) Philosophical investigations. Blackwell Publishers, London, United Kingdom (republished 2001)
Wun A, Petrović M, Jacobsen HA (2007) A system for semantic data fusion in sensor networks. In: Proceedings of the 2007 inaugural international conference on distributed event-based systems, ACM, New York, NY, USA, DEBS ’07, pp 75–79
Metadata
Title: Discovering meaning on the go in large heterogenous data
Authors: Harry Halpin, Fiona McNeill
Publication date: 01-08-2013
Publisher: Springer Netherlands
Published in: Artificial Intelligence Review, Issue 2/2013
Print ISSN: 0269-2821
Electronic ISSN: 1573-7462
DOI: https://doi.org/10.1007/s10462-012-9377-4
