2015 | OriginalPaper | Chapter
MapReduce-Based Bulk-Loading Algorithm for Fast Search for Billions of Triples
Authors: Jung-Ho Um, Seungwoo Lee, Tae-Hong Kim, Chang-Hoo Jeong, Kwangik Seo, Joonho Park, Hanmin Jung
Published in: Computer Science and its Applications
Publisher: Springer Berlin Heidelberg
Due to advances in IT and science, huge amounts of data are continuously being created, and the big data era has arrived. Triple stores, which insert data into and query knowledge bases, must therefore scale to handle such large data sources. To this end, we propose a triple store system built on a distributed database that bulk-loads billions of triples and responds quickly to user queries. To achieve this, we introduce a bulk-loading algorithm based on the MapReduce framework and a SPARQL query processing engine that connects to a large distributed database. Experimental results show that the proposed bulk-loading algorithm loads approximately 33 billion triples at a rate of 101K triples per second. This suggests that the system can handle billions of triples.
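The abstract does not describe the algorithm's internals, so the following is only a minimal in-memory sketch of how a MapReduce-style bulk load of triples might be organized. It is not the authors' implementation. It assumes a simplified N-Triples input, a hypothetical row layout keyed by subject, and a reduce step that emits rows in sorted key order (a common precondition for writing bulk-load files to a distributed store). All function names (`parse_ntriple`, `map_phase`, `shuffle`, `reduce_phase`, `bulk_load`, `estimated_load_hours`) are illustrative.

```python
from collections import defaultdict
from typing import Iterable, Iterator

Triple = tuple[str, str, str]
Row = tuple[str, list[tuple[str, str]]]


def parse_ntriple(line: str) -> Triple:
    """Split a simplified N-Triples line into (subject, predicate, object)."""
    # Assumption: terms are separated by single spaces and the line ends with " ."
    subject, predicate, obj = line.rstrip(" .\n").split(" ", 2)
    return subject, predicate, obj


def map_phase(lines: Iterable[str]) -> Iterator[tuple[str, tuple[str, str]]]:
    """Map: emit (row_key, (predicate, object)) for every input triple."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        subject, predicate, obj = parse_ntriple(line)
        yield subject, (predicate, obj)  # hypothetical layout: row key = subject


def shuffle(pairs: Iterable[tuple[str, tuple[str, str]]]) -> dict[str, list[tuple[str, str]]]:
    """Shuffle: group mapper output by row key (the framework does this in a real job)."""
    groups: dict[str, list[tuple[str, str]]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups: dict[str, list[tuple[str, str]]]) -> Iterator[Row]:
    """Reduce: emit one de-duplicated row per key, in sorted key order."""
    for key in sorted(groups):
        yield key, sorted(set(groups[key]))


def bulk_load(lines: Iterable[str]) -> list[Row]:
    """Run the three phases in memory and return the rows that would be written."""
    return list(reduce_phase(shuffle(map_phase(lines))))


def estimated_load_hours(total_triples: float, triples_per_second: float) -> float:
    """Wall-clock hours needed to load total_triples at a constant rate."""
    return total_triples / triples_per_second / 3600


if __name__ == "__main__":
    sample = [
        "<b> <knows> <a> .",
        "<a> <knows> <b> .",
        "<a> <name> \"Alice\" .",
        "<a> <knows> <b> .",  # duplicate, removed in the reduce step
    ]
    for row in bulk_load(sample):
        print(row)
    print(round(estimated_load_hours(33e9, 101e3), 1), "hours")
```

As a rough check on the reported figures, `estimated_load_hours(33e9, 101e3)` comes out at about 91 hours, or just under four days, if the 101K triples-per-second rate is sustained for the whole load.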