Much research has been done on storing XML data, and some of it proposes methods that improve database performance by reducing the number of joins between tables. Those methods are very efficient at deriving and optimizing tables from a DTD or XML schema in which elements and attributes are defined. Nevertheless, they are not effective for an XML schema for biological information such as microarray data, because even though microarray data have complex hierarchies, just a few core values appear repeatedly in those hierarchies. In this paper, we propose a new algorithm to extract core features that occur repeatedly in an XML schema for biological information, and we elucidate how to improve classification speed and efficiency by using a decision tree rather than pattern matching to classify structural similarities. We designed a database for storing biological information using the features extracted by our algorithm. Experiments showed that the proposed classification algorithm also reduced the number of joins between tables.
- Structural Similarity Mining in Semi-structured Microarray Data for Efficient Storage Construction
- Springer Berlin Heidelberg