Abstract
This paper describes PBSM (Partition Based Spatial-Merge), a new algorithm for performing spatial join operation. This algorithm is especially effective when neither of the inputs to the join have an index on the joining attribute. Such a situation could arise if both inputs to the join are intermediate results in a complex query, or in a parallel environment where the inputs must be dynamically redistributed. The PBSM algorithm partitions the inputs into manageable chunks, and joins them using a computational geometry based plane-sweeping technique. This paper also presents a performance study comparing the the traditional indexed nested loops join algorithm, a spatial join algorithm based on joining spatial indices, and the PBSM algorithm. These comparisons are based on complete implementations of these algorithms in Paradise, a database system for handling GIS applications. Using real data sets, the performance study examines the behavior of these spatial join algorithms in a variety of situations, including the cases when both, one, or none of the inputs to the join have an suitable index. The study also examines the effect of clustering the join inputs on the performance of these join algorithms. The performance comparisons demonstrates the feasibility, and applicability of the PBSM join algorithm.
- Arc95 ESRI, Redlands, CA. "ARC/INFO: The World's GIS. An ESRI White Paper", March 1995.Google Scholar
- Ben75 J. L. Bentley. "Multidimensional Binary Search Trees Used for Associative Searching". In Communication ofthe A CM, volume 18(9), September 1975. Google ScholarDigital Library
- Ben79 J. L. Bentley. "Multidlmensionat Binary Search Trees in Database Applications". In IEEE Trat~sactions on Software Engineering, volume 5(4), 1979.Google Scholar
- BHF93 L. Becker, K. Hinnchs, and U. Finke. "'A New Algorithm for Computing Joans With Grid Files". In IEEE Transactions on Knowledge and Dam Engineering, 1993. Google ScholarDigital Library
- BKS93 T. Brinkhoff, H. P. Kriegel, and B. Seeger. "Efficient Processing of Spatial Joins Using R-trees". In Proceedings of the 1993 A CM-SIGMOD Conference, Washington, DC, May 1993. Google ScholarDigital Library
- BKSS90 N. Beckmann, H. R Kriegel, R. Schneider, and B. Seeger. "'The R*-tree: An Efficient and Robust Access Method for Points and Rectangles". in Proceedbtgs of the 1990 A CM-SIGMOD Conference, June 1990. Google ScholarDigital Library
- BKSS94 T. Brinkhoff, H. R Kriegel, R. Schneider, and B. Seeger. "'Multi-step Processing of Spatial Joins". In Proceedings of the 1994 A CM-SIGMOD Colference, Minneapolis, May 1994. Google ScholarDigital Library
- Bur86 R A. Burrough. "Principles of Geographic hzformation Systems for Land Resources Assessment". Oxford University Press, 1986.Google Scholar
- CDF+94 M. J. Carey, D. J. DeWitt, M. J. Franklin, N. E. Hail, M. McAuliffe, J. F. Naughton, D. T. Schuh, M. H. Solomon, C. K. Tan, O. Tsatatos, S. White, and M. J. Zwilling. "Shoring up Persistent Applications". In P~vceedings of the 1994 ACM- SIGMOD Conference, "Minneapolis, Minnesota", May 1994. Google ScholarDigital Library
- CFR87 T. Sellis C. Faloutsos and N. Roussopoulos. "Analysis of Object Oriented Spatial Access Methods". In Proceedings of the 1987 A CM-SIGMOD Conferelzce, San Francisco, May 1987. Google ScholarDigital Library
- Cor95 Intergraph Corporation. "GIS/AM/FM Information". "http://wwm intergraph'c~m/utihnap'shtmt ", 1995.Google Scholar
- DKL+94 D. J. DeWitt, N. Kabra, J. Luo, J. M. Patel, and J, Yu. "Claent-Server Paradise". In Proceedings of the 20th VLDB Colf, Santiago, Chile, September 1994. Google ScholarDigital Library
- DLPY93 D. J. DeWitt, J. Luo, J. M. PateI, and J. Yu. "Paradise -A Parallel Geographic Information System". In Proceedings of the A CM Workshop on Advance~ in Geographic Information Systems, Arlington, Virginia, November 1993.Google Scholar
- DNSS92 D. J. DeWitt, J. R Naughton, D. A. Schneider, and S. Seshadri. "'Practical Skew Handling in Parallel Joins". In Proceedbzgs of the 19th VLDB Coil, August 1992. Google ScholarDigital Library
- GS87 R. H. Gtiting and W. Shilling. "A Practical Diwde-and- Conquer Algorithm for the Rectangle Intersection Problem". In bzformation Sciences, volume 42, 1987. Google ScholarDigital Library
- Gün93 O. G~nther. "Efficient Computation of Spatial Joins". In IEEE Transactions on Knowledge and Data E1zgineering, 1993. Google ScholarDigital Library
- Gut84 A. Gutman. "R-trees: A Dynamic Index Structure for Spatial Searching". In Proceedings of the 1984 A CM-SIGMOD Conference, Boston, Mass, June 1984. Google ScholarDigital Library
- HNKT90 L. Harada, M. Nakano, M. Kitsuregawa, and M. Takagi. "Query Processing Methods for Multi-Attribute Clustered Relations". In Proceedings of the 16th VLDB Conf, Brisbane, Australia, 1990. Google ScholarDigital Library
- HS95 E. G. Hoel and H. Samet. "Benchmarking Spatial Join Operations with Spatial Output". In Proceedings of the 21st VLDB Conf., Zurich, Switzerland, September 1995. Google ScholarDigital Library
- KHT89 M. Kitsuregawa, L. Harada, and M. Takagi. ~'Join Strategies on KD-Tree Indexed Relations". In IEEE Transactions on Knowledge and Data Engineering, 1989. Google ScholarDigital Library
- LR94 M. L. Lo and C. V. Ravishankar. "Spatial Joins Using Seeded Trees". In Proceedings of the 1994 A CM-SIGMOD Conference, Minneapolis, May 1994. Google ScholarDigital Library
- LR95 M.L. Lo and C. V. Ravishankar. "Generating Seeded Trees From Data Sets". In Proceedings of the Fourth International Symposium on Large Spatial Databases, Portland, ME, August 1995. Google ScholarDigital Library
- LR96 M.L. Lo and C. V. Ravishankar. "Spatial Hash-Joins". In Proceedings of the 1996 A CM-SIGMOD Conference, Montreal, Canada, June 1996. Google ScholarDigital Library
- MC80 C. Mead and L. Conway. "bTtroduction to VLSI Systems". Addison-Wesley, Reading, Mass., 1980. Google ScholarDigital Library
- MGR91 D.J. Maguire, M. E Goodchild, and D. W. Rhind. "Geographic h~formation Systems" volume 1. Longman Scientific & Technical, copublishect in the US with John Wiley & Sons, Inc. New York, 1991.Google Scholar
- NHS84 J. Nievergelt, H. Hinterberger, and K. C. Sevcik. "The Grid File: An Adaptable, Symmetric Multikey File Structure". A CM Tra~zsactions on Database Systems, March 1984. Google ScholarDigital Library
- NS86 R. C. Nelson and H. Samet. "A Consistent Hierarchical Representation for Vector Data". In Computer Graphics, volume 20(4), August 1986. Google ScholarDigital Library
- OM84 J. A. Orenstein and T. H. Merrett. "A Class of Data Structures for Associative Searching". In Proceedings of the 3rd A CM SIGA CT-SIGMOD Symposium on Principles of Database Systems, 1984. Google ScholarDigital Library
- OM88 J. A. Orenstein and F. A. Manola. "PROBE Spatial Data Modeling and Query Processing in an Image Database Application". In IEEE Transactions on Software Engineering, volume 14(5), May I988. Google ScholarDigital Library
- Ore86 J, A. Orenstein. "Spatial Query Processing in an Object- Oriented Database System". In Proceedings of the I986 ACM- SIGMOD Conference, 1986. Google ScholarDigital Library
- Ore89 J. A. Orenstein. "'Redundancy in Spatml Databases". In Proceedings of the 1989 A CM-SIGMOD Conference, 1989. Google ScholarDigital Library
- Ore90 J. A. Orenstein. "A Comparison of Spatial Query Processing Techniques for Native and Parameter Spaces". In Proceedings of the 1990 ACM-SIGMOD Colference, 1990. Google ScholarDigital Library
- PD J. M. Patel and D. J. DeWitt. "Partition Based Spatial-Merge Join". http://www.cs.wisc.edu/paradise/paradise.papers.html. Google ScholarDigital Library
- PS88 F. R Preparata and M. I. Shamos, editors. "Computational Geometry" Springer, 1988. Google ScholarDigital Library
- Rot91 D. Rotem. "Spatial Join Indices". in IEEE Transactions on Knowledge and Data Engineering, Kobe, April 1991.Google Scholar
- SFGM93 M. Stonebraker, J. Frew, K. Gardels, and J. Meredith. "The SEQUOIA 2000 Storage Benchmark". In Proceedings of the 1993 ACM-SIGMOD Conference, Washington, D.C., May 1993. Google ScholarDigital Library
- Sto86 M. Stonebraker. "The Case for Shared Nothing". Database Engineering, 9( 1 ), 1986.Google Scholar
- Tig U.S. Bureau of the Census, Washington, DC. "TIGER~Line Files(TM). 1992 Technical Documentation".Google Scholar
- TY95 K L. Tan and J. X. Yu. "A Performance Study of Declusterlng Strategies for Parallel Spatial Databases". In The 6th hzternational Conference on Database and Expert Systems Applications (DEXA), London, United Kingdom, September 1995. Google ScholarDigital Library
- Ube94 M. Ubell. "The Montage Extensible DataBlade Architecture". In Proceedings of the 1994 A CM-SIGMOD Conference, May 1994. Google ScholarDigital Library
- Val87 R Valduriez. "'Join Indices". InACMTODS, volume t2(2), 1987. Google ScholarDigital Library
- ZG90 H. Zeller and J. Gray. "An Adaptive Hash Join Algorithm for Multiuser Environments". In Proceedings of the 16th VLDB Co~, Brisbane, Australia, 1990. Google ScholarDigital Library
Index Terms
- Partition based spatial-merge join
Recommendations
Partition based spatial-merge join
SIGMOD '96: Proceedings of the 1996 ACM SIGMOD international conference on Management of dataThis paper describes PBSM (Partition Based Spatial-Merge), a new algorithm for performing spatial join operation. This algorithm is especially effective when neither of the inputs to the join have an index on the joining attribute. Such a situation ...
Multi-way spatial join selectivity for the ring join graph
Efficient spatial query processing is very important since the applications of the spatial DBMS (e.g. GIS, CAD/CAM, LBS) handle massive amount of data and consume much time. Many spatial queries contain the multi-way spatial join due to the fact that ...
The Sort-Merge-Shrink join
One of the most common operations in analytic query processing is the application of an aggregate function to the result of a relational join. We describe an algorithm called the Sort-Merge-Shrink (SMS) Join for computing the answer to such a query over ...
Comments