Abstract
Spatial joins are one of the most important operations for combining spatial objects of several relations. The efficient processing of a spatial join is extremely important since its execution time is superlinear in the number of spatial objects of the participating relations, and this number of objects may be very high. In this paper, we present a first detailed study of spatial join processing using R-trees, particularly R*-trees. R-trees are very suitable for supporting spatial queries and the R*-tree is one of the most efficient members of the R-tree family. Starting from a straightforward approach, we present several techniques for improving its execution time with respect to both, CPU- and I/O-time. Eventually, we end up with an algorithm whose total execution time is improved over the first approach by an order of magnitude. Using a buffer of reasonable size, I/O-time is almost optimal, i.e. it almost corresponds to the time for reading each required page of the relations exactly once. The performance of the various approaches is investigated in an experimental performance comparison where several large data sets from real applications are used.
- 1 Becker, L. A.: 'A New Algorithm and a Cost Model for Join Processing with Grid Files', PhD-thesis, University of Siegen, 1992.Google Scholar
- 2 Beckmann N., Kriegel H.-P., Schneider R., Seeger B.: 'The R*-tree: An Efficient and Robust Access Method for Points and Rectangles', Proc. ACM SIGMOD Int. Conf. on Management of Data, Atlantic City, N.J., 1990, pp. 322-331. Google ScholarDigital Library
- 3 Burrough P. A.: 'Principles of Geographical Information Systems for Land Resources Assessment', Oxford University Press, 1986.Google Scholar
- 4 Bureau of the Census: "Tiger/Line Precensus Files: 1990 technical documentation', Bureau of the Census, Washington, DC, 1989.Google Scholar
- 5 Bentley J.L., Wood D.: 'An Optimal Worst Case Algorithm for Reporting Intersections of Rectangles', IEEE Trans. on Computers, Vol. C- 29, No. 7, 1980, pp. 571-577.Google Scholar
- 6 Faloutsos, C.: 'Gray Codes for Partial Match and Range Queries', IEEE Trans. on Software Engineering, Vol. 14, No. 10, 1988, pp. 1381- 1393. Google ScholarDigital Library
- 7 Fotouhi F., Pramanik S.: 'Optimal Secondary Storage Access Sequence for Performing Relational Join', IEEE Trans. on Knowledge and Data Engineering, Vol. 1, No. 3, 1989, pp. 318-328. Google ScholarDigital Library
- 8 Gargantini, I. : "An Effective Way to Represent Quadtrees', Comm. of the ACM, Vol. 25, No. 12, 1982, pp. 905-910. Google ScholarDigital Library
- 9 Gfinther, O.: "Efficient Computations of Spatial Joins', Proc. 9th Int. (?oaf. on Data Engineering, Vienna, Austria, 1993. Google ScholarDigital Library
- 10 Guttman A.: 'R-trees: A Dynamic Index Structure for Spatial Searching', Proc. ACM SIGMOD Int. Conf. on Management of Data, Boston, MA., 1984, pp. 47-57. Google ScholarDigital Library
- 11 Harada L., Nakano M., Kitsuregawa M., Takagi M.: 'Query Processing Methods for Multt-Anribute Clustered Relations', Proc. 16th Int. Conf. on Very Large Data Bases, Brisbane, 1990, pp. 59-70. Google ScholarDigital Library
- 12 Hoel E. G., Samet H.: "A Qualitative Comparison Study of Data Structures for Large Line Segment Databases', Proc. ACM SIGMOD Int. Conf. on Management of Data, San Diego, CA., 1992, pp. 205-214. Google ScholarDigital Library
- 13 Kriegel H.-P., Brinkhoff T., Schneider R.: "An Efficient Map Overlay Algorithm based on Spatial Access Methods and Computational Geometry', Proc. Int. Workshop on Database Management Systems for Geographical Applications, Capri, Italy, 1991, in: Geographic Database Management Systems, Springer, 1992, pp. 194-211.Google Scholar
- 14 Kamel, I., Faloutsos, C.: 'Parallel R-Trees', Proc. ACM SIGMOD Int. Conf. on Management of Data, San Diego, CA., 1992, pp. 195-204. Google ScholarDigital Library
- 15 Mishra P., Eich M.H.: 'Join Processing in Relational Databases', ACM Computing Surveys, Vol. 24, No. i, 1992, pp. 63-113. Google ScholarDigital Library
- 16 Merret T., Kambayashi Y., Yasuura H.: "Scheduling of Page-Fetches in Join-Operations', Proc. 7th Int. Conf. on Very Large Data Bases, Cannes, 1981, pp. 488-498.Google Scholar
- 17 Orenstein J. A., Merrett T. H.: "A Class of Data Structures for Associative Searching' Proc. 3rd ACM SIGACT/SIGMOD Symp. on Principles of Database Systems, 1984, pp. 181-190. Google ScholarDigital Library
- 18 Orenstein J. A.: 'Spatial Query Processing in an Object-Oriented Da. tabase System' Proc. ACM SIGMOD Int. Conf. on Management of Data, Washington D.C., 1986, pp. 326-333. Google ScholarDigital Library
- 19 Orenstein J. A.: 'Redundancy in Spatial Databases" Proe. ACM SIG- MOD Int. Conf. on Management of Data, Portland, Oreg., 1989, pp. 294-305. Google ScholarDigital Library
- 20 Preparata F. P., Shamos M. I.: "Computational Geometry; Springer, 1988. Google ScholarDigital Library
- 21 Rotem D.; "Spatial Join Indices', Proe. Int. Conf. on Data Ensineerins, 1991, pp. 500-509. Google ScholarDigital Library
- 22 Samet H.: 'The Design and Analysis of Spatial Data Structures', Addison Wesley, 1990. Google ScholarDigital Library
- 23 Stonebraker M., Rowe L., Hirohama M.: "The Implementation of POSTGRES', IEEE Trans. on Knowledge and Data Engineering, Vol. 2, No. 1, 1990, pp. 125-142. Google ScholarDigital Library
- 24 Statistical Office of the European Communities: 'Regions', 1990.Google Scholar
Index Terms
- Efficient processing of spatial joins using R-trees
Recommendations
Efficient processing of spatial joins using R-trees
SIGMOD '93: Proceedings of the 1993 ACM SIGMOD international conference on Management of dataSpatial joins are one of the most important operations for combining spatial objects of several relations. The efficient processing of a spatial join is extremely important since its execution time is superlinear in the number of spatial objects of the ...
Multi-way Spatial Joins Using R-Trees: Methodology and Performance Evaluation
SSD '99: Proceedings of the 6th International Symposium on Advances in Spatial DatabasesWe propose a new multi-way spatial join algorithm called M-way R-tree join which synchronously traverses M R-trees. The M-way R-tree join can be considered as a generalization of the 2-way R-tree join. Although a generalization of the 2-way R-tree join ...
Efficient processing of spatial joins with DOT-based indexing
A spatial join is a query that searches for a set of object pairs satisfying a given spatial relationship from a database. It is one of the most costly queries, and thus requires an efficient processing algorithm that fully exploits the features of the ...
Comments