ABSTRACT
Over the past years, Semantic Web and Linked Data technologies have reached the backend of a considerable number of applications. Consequently, large amounts of RDF data are constantly being made available across the planet. While experts can easily gather information from this wealth of data by using the W3C standard query language SPARQL, most lay users lack the expertise necessary to proficiently interact with these applications. Non-expert users therefore usually have to rely on forms, query builders, question answering or keyword search tools to access RDF data. However, these tools have so far been unable to explicate the queries they generate to lay users, making it difficult for these users to i) assess the correctness of the query generated from their input, ii) adapt their queries, or iii) choose in an informed manner between possible interpretations of their input. This paper addresses this drawback by presenting SPARQL2NL, a generic approach for verbalizing SPARQL queries, i.e., converting them into natural language. Our framework can be integrated into applications where lay users are required to understand SPARQL or to generate SPARQL queries in a direct (forms, query builders) or an indirect (keyword search, question answering) manner. We evaluate our approach on the DBpedia question set provided by QALD-2 within a survey setting with both SPARQL experts and lay users. The results of the 115 completed surveys show that SPARQL2NL can generate complete and easily understandable natural language descriptions. In addition, our results suggest that even SPARQL experts can process the natural language representation of SPARQL queries computed by our approach more efficiently than the corresponding SPARQL queries. Moreover, the descriptions enable non-experts to reliably understand the content of SPARQL queries.
SPARQL2NL: verbalizing SPARQL queries. In WWW '13 Companion: Proceedings of the 22nd International Conference on World Wide Web.