A survey of RDF management technologies and benchmark datasets

  • Original Research
  • Journal of Ambient Intelligence and Humanized Computing

Abstract

With the rapid development of the Semantic Web and related areas, the amount of Resource Description Framework (RDF) data has increased significantly. Managing this volume of RDF data efficiently has become a challenging task and has attracted considerable research attention. This paper surveys the state of the art in RDF storage and query technologies, organized according to several classification criteria. In addition, several prevailing benchmark datasets are introduced and compared. Finally, future research challenges and opportunities are discussed.

Notes

  1. GraphDB is the newest version of OWLIM.

  2. Blazegraph is the newest version of Bigdata.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. 61471035, 61601129) and the Double First-Class Construction Program of USC (No. 2017SYL16).

Author information

Corresponding authors

Correspondence to Tao Zhu or Huansheng Ning.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Pan, Z., Zhu, T., Liu, H. et al. A survey of RDF management technologies and benchmark datasets. J Ambient Intell Human Comput 9, 1693–1704 (2018). https://doi.org/10.1007/s12652-018-0876-2

  • DOI: https://doi.org/10.1007/s12652-018-0876-2
