2000 | OriginalPaper | Chapter

A New Sampling Strategy for Building Decision Trees from Large Databases

Authors: J. H. Chauchat, R. Rakotomalala

Published in: Data Analysis, Classification, and Related Methods

Publisher: Springer Berlin Heidelberg

Included in: Professional Book Archive

We propose a fast and efficient sampling strategy for building decision trees from very large databases, even when they contain many numerical attributes that must be discretized at each step. Successive samples are drawn, one at each tree node. Applying the method to a simulated database of virtually infinite size confirms that, when the database is large and contains many numerical attributes, fast sampling at each node (with a sample size of about n = 300 or n = 500) speeds up the mining process while maintaining the accuracy of the classifier.
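The idea of sampling at each node can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the split criterion, the discretization procedure, or the stopping rules, so the Gini impurity, the midpoint thresholds, and the names `gini`, `best_split`, `build_tree`, `predict`, `sample_size`, `min_size`, and `max_depth` are all assumptions made for illustration. The sketch also holds the data in memory, whereas the chapter targets databases too large for that.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (assumed criterion)."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return (weighted impurity, attribute index, threshold), or None."""
    best = None
    n = len(rows)
    for attr in range(len(rows[0])):
        values = sorted(set(r[attr] for r in rows))
        # Candidate cut points: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[attr] <= thr]
            right = [y for r, y in zip(rows, labels) if r[attr] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, attr, thr)
    return best


def build_tree(rows, labels, sample_size=300, min_size=20, max_depth=5, rng=None):
    """Grow a binary tree, choosing each split on a fresh sample of the node."""
    if rng is None:
        rng = random.Random(0)
    majority = Counter(labels).most_common(1)[0][0]
    if max_depth == 0 or len(rows) < min_size or len(set(labels)) == 1:
        return {"leaf": majority}

    # Draw a fresh sample of at most sample_size records reaching this node.
    idx = range(len(rows))
    if len(rows) > sample_size:
        idx = rng.sample(idx, sample_size)
    s_rows = [rows[i] for i in idx]
    s_labels = [labels[i] for i in idx]

    # The expensive threshold search runs on the sample only.
    split = best_split(s_rows, s_labels)
    if split is None or split[0] >= gini(s_labels):
        return {"leaf": majority}
    _, attr, thr = split

    # Route every record at this node with the split found on the sample.
    left = [(r, y) for r, y in zip(rows, labels) if r[attr] <= thr]
    right = [(r, y) for r, y in zip(rows, labels) if r[attr] > thr]
    if not left or not right:
        return {"leaf": majority}
    return {
        "attr": attr,
        "thr": thr,
        "left": build_tree([r for r, _ in left], [y for _, y in left],
                           sample_size, min_size, max_depth - 1, rng),
        "right": build_tree([r for r, _ in right], [y for _, y in right],
                            sample_size, min_size, max_depth - 1, rng),
    }


def predict(tree, row):
    """Follow the splits from the root down to a leaf label."""
    while "leaf" not in tree:
        tree = tree["left"] if row[tree["attr"]] <= tree["thr"] else tree["right"]
    return tree["leaf"]
```

In this sketch the cost of the threshold search at a node depends on `sample_size` rather than on the number of records reaching the node, which is the source of the speed-up the abstract describes. Only the final routing pass touches every record.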
