DOI: 10.1145/2723372.2723732
research-article
Open Access

SQLGraph: An Efficient Relational-Based Property Graph Store

Published: 27 May 2015

ABSTRACT

We show that existing, mature relational optimizers can be exploited with a novel schema to give better performance for property graph storage and retrieval than popular NoSQL graph stores. The schema combines relational storage for adjacency information with JSON storage for vertex and edge attributes. We demonstrate that this particular schema design has benefits compared to a purely relational or purely JSON solution. The query translation mechanism translates Gremlin queries with no side effects into SQL queries so that one can leverage relational query optimizers. We also conduct an empirical evaluation of our schema design and query translation mechanism against two existing popular property graph stores. We show that our system is 2-8 times better on query performance, and 10-30 times better in throughput, on graphs with 4.3 billion edges compared to existing stores.
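To make the idea concrete, here is a minimal sketch of a hybrid layout: adjacency in plain relational columns, attributes as JSON text, and a side-effect-free traversal compiled into a single SQL query. It is an illustration only and not the schema or translator from the paper. The table names (`vertex_attrs`, `edges`), the helper functions, the toy data, and the use of SQLite are all assumptions made for the example.

```python
import json
import sqlite3


def build_toy_graph():
    """Create a tiny in-memory graph (hypothetical schema, not the paper's)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Attributes live in a JSON text column.
        CREATE TABLE vertex_attrs (vid INTEGER PRIMARY KEY, attrs TEXT NOT NULL);
        -- Adjacency lives in ordinary relational columns that can be indexed.
        CREATE TABLE edges (src INTEGER, label TEXT, dst INTEGER, attrs TEXT NOT NULL);
        CREATE INDEX edges_out ON edges (src, label);
    """)
    vertices = [(1, {"name": "ann"}), (2, {"name": "bob"}), (3, {"name": "cy"})]
    edges = [
        (1, "knows", 2, {"since": 2010}),
        (2, "knows", 3, {"since": 2012}),
        (1, "likes", 3, {}),
    ]
    conn.executemany(
        "INSERT INTO vertex_attrs VALUES (?, ?)",
        [(vid, json.dumps(attrs)) for vid, attrs in vertices],
    )
    conn.executemany(
        "INSERT INTO edges VALUES (?, ?, ?, ?)",
        [(s, label, d, json.dumps(attrs)) for s, label, d, attrs in edges],
    )
    return conn


def out_names(conn, start_vid, labels):
    """Follow one outgoing hop per label, then return the 'name' attribute.

    Each hop becomes one self-join on the edges table, so the whole traversal
    is a single SQL statement that the database is free to plan and optimize.
    `labels` must contain at least one edge label.
    """
    sql = "SELECT v.attrs FROM edges e0 "
    for i in range(1, len(labels)):
        sql += f"JOIN edges e{i} ON e{i}.src = e{i - 1}.dst "
    last = len(labels) - 1
    sql += f"JOIN vertex_attrs v ON v.vid = e{last}.dst WHERE e0.src = ? "
    params = [start_vid]
    for i, label in enumerate(labels):
        sql += f"AND e{i}.label = ? "
        params.append(label)
    sql += "ORDER BY v.vid"
    rows = conn.execute(sql, params).fetchall()
    # Attributes are decoded from JSON only for the vertices that survive the joins.
    return [json.loads(row[0])["name"] for row in rows]


conn = build_toy_graph()
print(out_names(conn, 1, ["knows"]))
print(out_names(conn, 1, ["knows", "knows"]))
```

The two calls correspond roughly to a one-hop and a two-hop "knows" traversal starting from vertex 1. Keeping adjacency in indexed columns lets the joins use the `(src, label)` index, while the JSON column leaves the attribute set open-ended per vertex or edge.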


    Published in

      SIGMOD '15: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data
      May 2015
      2110 pages
      ISBN: 9781450327589
      DOI: 10.1145/2723372

      Copyright © 2015 Owner/Author

      Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      SIGMOD '15 paper acceptance rate: 106 of 415 submissions, 26%. Overall acceptance rate: 785 of 4,003 submissions, 20%.
