skip to main content
research-article

gStore: answering SPARQL queries via subgraph matching

Authors Info & Claims
Published:01 May 2011Publication History
Skip Abstract Section

Abstract

Due to the increasing use of RDF data, efficient processing of SPARQL queries over RDF datasets has become an important issue. However, existing solutions suffer from two limitations: 1) they cannot answer SPARQL queries with wildcards in a scalable manner; and 2) they cannot handle frequent updates in RDF repositories efficiently. Thus, most of them have to reprocess the dataset from scratch. In this paper, we propose a graph-based approach to store and query RDF data. Rather than mapping RDF triples into a relational database as most existing methods do, we store RDF data as a large graph. A SPARQL query is then converted into a corresponding subgraph matching query. In order to speed up query processing, we develop a novel index, together with some effective pruning rules and efficient search algorithms. Our method can answer exact SPARQL queries and queries with wildcards in a uniform manner. We also propose an effective maintenance algorithm to handle online updates over RDF repositories. Extensive experiments confirm the efficiency and effectiveness of our solution.

References

  1. D. J. Abadi, A. Marcus, S. Madden, and K. Hollenbach. Sw-store: a vertically partitioned dbms for semantic web data management. VLDB J., 18(2):385--406, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. D. J. Abadi, A. Marcus, S. Madden, and K. J. Hollenbach. Scalable semantic web data management using vertical partitioning. In VLDB, pages 411--422, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. C. Bizer, J. Lehmann, G. Kobilarov, S. Auer, C. Becker, R. Cyganiak, and S. Hellmann. Dbpedia - a crystallization point for the web of data. J. Web Sem., 7(3):154--165, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. V. Bönström, A. Hinze, and H. Schweppe. Storing RDF as a graph. In LA-WEB, pages 27--36, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. J. Broekstra, A. Kampman, and F. van Harmelen. Sesame: A generic architecture for storing and querying RDF and RDF schema. In ISWC, pages 54--68, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms. The MIT Press, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. U. Deppisch. S-tree: A dynamic balanced signature index for office retrieval. In SIGIR, pages 77--87, 1986. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. C. Faloutsos and S. Christodoulakis. Signature files: An access method for documents and its analytical performance evaluation. ACM Trans. Inf. Syst., 2(4):267--288, 1984. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. L. Gravano, P. G. Ipeirotis, H. V. Jagadish, N. Koudas, S. Muthukrishnan, L. Pietarinen, and D. Srivastava. Using q-grams in a DBMS for approximate string processing. IEEE Data Eng. Bull., 24(4):28--34, 2001.Google ScholarGoogle Scholar
  10. L. Gravano, P. G. Ipeirotis, N. Koudas, and D. Srivastava. Text joins in an RDBMS for web data integration. In WWW, pages 90--101, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. A. Harth, J. Umbrich, A. Hogan, and S. Decker. Yars2: A federated repository for querying graph structured data from the web. In ISWC/ASWC, pages 211--224, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. T. Neumann and G. Weikum. RDF-3X: a risc-style engine for RDF. PVLDB, 1(1):647--659, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. T. Neumann and G. Weikum. Scalable join processing on very large RDF graphs. In SIGMOD, pages 627--640, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. T. Neumann and G. Weikum. The RDF-3X engine for scalable management of RDF data. VLDB J., 19(1):91--113, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. T. Neumann and G. Weikum. X-RDF-3X: Fast querying, high update rates, and consistency for RDF databases. PVLDB, 1(1):256--263, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. J. Pérez, M. Arenas, and C. Gutierrez. Semantics and complexity of SPARQL. ACM Trans. Database Syst., 34(3):16:1--16:45, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. D. Shasha, J. T.-L. Wang, and R. Giugno. Algorithmics and applications of tree and graph searching. In PODS, pages 39--52, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. M. Stocker, A. Seaborne, A. Bernstein, C. Kiefer, and D. Reynolds. SPARQL basic graph pattern optimization using selectivity estimation. In WWW, pages 595--604, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. F. M. Suchanek, G. Kasneci, and G. Weikum. Yago: a core of semantic knowledge. In WWW, pages 697--706, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. E. Tousidou, P. Bozanis, and Y. Manolopoulos. Signature-based structures for objects with set-valued attributes. Inf. Syst., 27(2):93--121, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. O. Udrea, A. Pugliese, and V. S. Subrahmanian. Grin: A graph based RDF index. In AAAI, pages 1465--1470, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. C. Weiss, P. Karras, and A. Bernstein. Hexastore: sextuple indexing for semantic web data management. PVLDB, 1(1):1008--1019, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. K. Wilkinson, C. Sayers, H. A. Kuno, and D. Reynolds. Efficient RDF storage and retrieval in jena2. In SWDB, pages 131--150, 2003.Google ScholarGoogle Scholar
  24. X. Yan, P. S. Yu, and J. Han. Graph indexing: A frequent structure-based approach. In SIGMOD, pages 335--346, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Y. Yan, C. Wang, A. Zhou, W. Qian, L. Ma, and Y. Pan. Efficient indices using graph partitioning in RDF triple stores. In ICDE, pages 1263--1266, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. L. Zou, L. Chen, J. X. Yu, and Y. Lu. A novel spectral coding in a large graph database. In EDBT, pages 181--192, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. gStore: answering SPARQL queries via subgraph matching

          Recommendations

          Comments

          Login options

          Check if you have access through your login credentials or your institution to get full access on this article.

          Sign in

          Full Access

          • Published in

            cover image Proceedings of the VLDB Endowment
            Proceedings of the VLDB Endowment  Volume 4, Issue 8
            May 2011
            58 pages

            Publisher

            VLDB Endowment

            Publication History

            • Published: 1 May 2011
            Published in pvldb Volume 4, Issue 8

            Qualifiers

            • research-article

          PDF Format

          View or Download as a PDF file.

          PDF

          eReader

          View online with eReader.

          eReader