Abstract
An algorithm for record clustering is presented. It is capable of detecting sudden changes in users' access patterns and then suggesting an appropriate assignment of records to blocks. It is conceptually simple, highly intuitive, does not need to classify queries into types, and avoids collecting individual query statistics. Experimental results indicate that it converges rapidly; its performance is about 50 percent better than that of the total sort method, and about 100 percent better than that of randomly assigning records to blocks.