Indexing is one of the key factors in efficient XML information retrieval, and inappropriate indexing may produce poor search results. This paper presents a multi-indexing system that considers not only the structural information of documents but also the characteristics of the pertinent elements in XML documents. The system extracts semantic elements from an XML document corpus and identifies the characteristics of those elements. Using these characteristics, document structures are classified, and a particular indexing method is assigned to each class of structure. The efficiency of our system is confirmed on an XML dataset of news stories with relatively accurate formats. The results indicate that the system achieves notably high precision in element-based search.
- Multi-indexing System for News Stories Based on XML Documents
- Springer Berlin Heidelberg