2015 | OriginalPaper | Chapter
Efficient Indexing for OLAP Query Processing with MapReduce
Authors: Woo Lam Kang, Hyeon Gyu Kim, Yoon Joon Lee
Published in: Computer Science and its Applications
Publisher: Springer Berlin Heidelberg
As in conventional databases, an index can improve the performance of OLAP query processing in MapReduce. To this end, Hadoop++ proposed the Trojan index, which reduces network I/O by storing partitioned data and its index together in the same data block, the unit of data storage in MapReduce. However, this approach requires complex computation to place the data and index in the same block, which can significantly increase index generation time. In this paper, we propose a new indexing method to resolve this issue. The basic idea of the proposed method is to insert the data and index into separate blocks and force them to be co-located on the same node. Our experimental results show that the proposed method provides better performance than existing indexing schemes, including the Trojan index.
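
The co-location idea can be illustrated with a toy placement model. This is a minimal sketch under stated assumptions, not the authors' implementation: the class `ToyCluster`, its methods, the round-robin policy for data blocks, and the block naming scheme are all hypothetical and do not correspond to any Hadoop or Hadoop++ API. It shows only that an index kept in its own block can still be read locally if its placement follows that of the data block it describes.

```python
class ToyCluster:
    """Hypothetical in-memory model of block placement on cluster nodes."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.location = {}  # block id -> node name
        self._next = 0      # round-robin cursor for data blocks

    def put_data_block(self, block_id):
        # Assumed policy: spread data blocks over nodes round-robin.
        node = self.nodes[self._next % len(self.nodes)]
        self._next += 1
        self.location[block_id] = node
        return node

    def put_index_block(self, index_id, data_block_id):
        # The index lives in a separate block, but is forced onto
        # the node that already holds the matching data block.
        node = self.location[data_block_id]
        self.location[index_id] = node
        return node

    def is_local_lookup(self, index_id, data_block_id):
        # True when reading the index and then the data needs no network hop.
        return self.location[index_id] == self.location[data_block_id]


def build(num_partitions, nodes):
    """Place one data block and one index block per partition."""
    cluster = ToyCluster(nodes)
    for i in range(num_partitions):
        cluster.put_data_block(f"data-{i}")
        cluster.put_index_block(f"index-{i}", f"data-{i}")
    return cluster


if __name__ == "__main__":
    cluster = build(5, ["n1", "n2", "n3"])
    for block_id, node in sorted(cluster.location.items()):
        print(block_id, "->", node)
```

Because the index block is written independently, the sketch needs no logic for packing data and index into a single block, which is the step the abstract identifies as costly in the Trojan index.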