2010 | OriginalPaper | Chapter
Distributed SLCA-Based XML Keyword Search by Map-Reduce
Authors: Chenjing Zhang, Qiang Ma, Xiaoling Wang, Aoying Zhou
Published in: Database Systems for Advanced Applications
Publisher: Springer Berlin Heidelberg
Large volumes of XML data arrive continually from new Web applications, and SLCA (Smallest Lowest Common Ancestor)-based XML keyword search is one of the most important information retrieval approaches. Previous approaches focus on building indexes for XML documents. However, in an information dissemination scenario, it is impossible to build an index in advance for continuous XML document streams. This paper addresses SLCA-based keyword search over continuous XML documents using the Map-Reduce mechanism. We use parallel algorithms to process large numbers of XML documents in a Hadoop environment. A distributed SLCA computation method is designed in which each network node computes SLCAs independently and only a small amount of information needs to be transmitted. We build a real Hadoop environment and demonstrate the efficiency of our algorithms analytically and experimentally.
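To make the idea of index-free, per-node SLCA computation concrete, the following is a minimal sketch in Python. It is not the algorithm from the chapter: it assumes each document fits in memory, uses a brute-force single traversal, and simulates the map and reduce steps in-process without any Hadoop API. All names (`tokens`, `slca`, `map_phase`, `reduce_phase`, `run`, `DOCS`) and the sample documents are hypothetical. A node is reported as an SLCA when its subtree contains every keyword and no child subtree already does; nodes are identified by Dewey labels with the root labeled `0`.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def tokens(*parts):
    """Lowercased whitespace-separated words from tag names or text."""
    return {w for p in parts if p for w in p.lower().split()}


def slca(xml_text, keywords):
    """Return Dewey labels of the SLCA nodes of one document (no index)."""
    wanted = tokens(*keywords)
    results = []

    def visit(elem, dewey):
        # Keywords matched by this element's own tag and leading text.
        found = wanted & tokens(elem.tag, elem.text)
        child_covers = False
        for i, child in enumerate(elem):
            sub = visit(child, dewey + (i,))
            child_covers = child_covers or sub == wanted
            # Text after a child (its tail) belongs to this element.
            found |= sub | (wanted & tokens(child.tail))
        # SLCA: covers every keyword, and no child subtree already does.
        if found == wanted and not child_covers:
            results.append(".".join(map(str, dewey)))
        return found

    visit(ET.fromstring(xml_text), (0,))
    return results


def map_phase(doc_id, xml_text, keywords):
    """Map step (simulated): emit one (doc_id, Dewey label) pair per SLCA."""
    for label in slca(xml_text, keywords):
        yield doc_id, label


def reduce_phase(pairs):
    """Reduce step (simulated): group the emitted labels by document."""
    grouped = defaultdict(list)
    for doc_id, label in pairs:
        grouped[doc_id].append(label)
    return dict(grouped)


def run(docs, keywords):
    """Run the simulated map and reduce steps over a dict of documents."""
    pairs = [kv for doc_id, text in docs.items()
             for kv in map_phase(doc_id, text, keywords)]
    return reduce_phase(pairs)


# Hypothetical sample documents, for illustration only.
DOCS = {
    "d1": "<bib><book><title>XML search</title><author>Wang</author></book>"
          "<book><title>Hadoop</title><author>Wang</author></book></bib>",
    "d2": "<bib><title>XML</title><note>Wang</note></bib>",
    "d3": "<a><b>Hadoop</b></a>",
}

if __name__ == "__main__":
    print(run(DOCS, ["XML", "Wang"]))
```

In this sketch each mapper needs only its own document and the query keywords, and it emits nothing but short Dewey labels, which mirrors the abstract's point that nodes work independently and transmit little data. A real deployment would split large documents across nodes and would need a way to combine partial results at fragment boundaries, which this sketch does not attempt.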