2000 | OriginalPaper | Chapter
A New Sampling Strategy for Building Decision Trees from Large Databases
Authors: J. H. Chauchat, R. Rakotomalala
Published in: Data Analysis, Classification, and Related Methods
Publisher: Springer Berlin Heidelberg
Included in: Professional Book Archive
We propose a fast and efficient sampling strategy for building decision trees from a very large database, even when it contains many numerical attributes that must be discretized at each step. Successive samples are drawn, one at each tree node. Applying the method to a simulated database of virtually infinite size confirms that, when the database is large and contains many numerical attributes, fast sampling at each node (with a sample size of about n = 300 or 500) speeds up the mining process while maintaining the accuracy of the classifier.
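The idea of drawing a fresh sample at each node can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the split criterion, the discretization method, or the sampling scheme, so the sketch assumes Gini impurity, binary threshold splits on numerical attributes, and simple random sampling without replacement. All function names (`gini`, `best_split`, `build_tree`, `predict`) and the toy data are hypothetical. Only the split search is restricted to the sample of size `n`; for simplicity the sketch still partitions the full set of row indices at each node.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return the (feature, threshold) with the lowest weighted Gini, or None.

    Candidate thresholds are midpoints between consecutive distinct values,
    which is one simple way to discretize a numerical attribute (assumed here).
    """
    total = len(rows)
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / total
            if score < best_score - 1e-12:
                best, best_score = (f, t), score
    return best


def build_tree(rows, labels, idx, n=300, max_depth=5, rng=random):
    """Grow a tree, choosing each split from a sample of at most n rows.

    idx lists the rows that reach the current node. A leaf is a class label;
    an internal node is a tuple (feature, threshold, left_subtree, right_subtree).
    """
    # Draw a new sample at this node (assumption: uniform, without replacement).
    sample = idx if len(idx) <= n else rng.sample(idx, n)
    s_rows = [rows[i] for i in sample]
    s_labels = [labels[i] for i in sample]
    majority = Counter(s_labels).most_common(1)[0][0]
    if max_depth == 0 or len(set(s_labels)) == 1:
        return majority
    split = best_split(s_rows, s_labels)
    if split is None:
        return majority
    f, t = split
    # Route every row reaching this node to a child (a linear pass, no sorting).
    left = [i for i in idx if rows[i][f] <= t]
    right = [i for i in idx if rows[i][f] > t]
    if not left or not right:
        return majority
    return (f, t,
            build_tree(rows, labels, left, n, max_depth - 1, rng),
            build_tree(rows, labels, right, n, max_depth - 1, rng))


def predict(tree, row):
    """Follow the thresholds down to a leaf and return its class label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


# Toy data (hypothetical): two numerical attributes, class 1 in one rectangle.
rng = random.Random(0)
rows = [(rng.random(), rng.random()) for _ in range(5000)]
labels = [int(x0 > 0.5 and x1 > 0.3) for x0, x1 in rows]

tree = build_tree(rows, labels, list(range(len(rows))), n=300, rng=rng)
accuracy = sum(predict(tree, r) == y for r, y in zip(rows, labels)) / len(rows)
print(f"training accuracy with n=300 per node: {accuracy:.3f}")
```

In this sketch the cost of evaluating candidate thresholds depends on `n` rather than on the number of rows reaching the node, which is the kind of saving the abstract describes for databases with many numerical attributes.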