ABSTRACT
Spatial applications frequently need to join two data sets based on some spatial relationship between objects in the two data sets. This operation, called a spatial join, is an expensive operation and in the past many algorithms have been proposed for evaluating the spatial join operation on a single processor system. However, the use of parallelism for handling queries involving large volumes of spatial data has received little attention. In this paper, we explore the use of parallelism for evaluating the spatial join operation. We first propose two strategies for storing spatial data in a parallel database system. We propose a number of spatial join algorithms based on these declustering strategies. Two algorithms are identified as the key algorithms in this design space. We analyze these two algorithms both analytically and experimentally. The experimental evaluation uses real data sets and is based on an actual implementation in a parallel database system. The experiments show that both algorithms can effectively exploit parallelism.
- {1} D. Abel, B. C. Ooi, K. Tan, R. Power, and J. X. Yu. "Spatial Join Strategies in Distributed Spatial DBMS". In Proceedings of 4th Intl. Symp. On Large Spatial Databases, pages 348--367, 1995. Google ScholarDigital Library
- {2} T. Brinkhoff, H.-P. Kriegel, and B. Seeger. "Efficient Processing of Spatial Joins Using R-trees". In Proceedings of the 1993 ACM-SIGMOD Conference, Washington, DC, May 1993. Google ScholarDigital Library
- {3} T. Brinkhoff, H.-P. Kriegel, and B. Seeger. Parallel processing of spatial joins using R-trees. In Proceedings of the 12th International Conference on Data Engineering, pages 258-265, Washington - Brussels - Tokyo, Feb. 1996. IEEE Computer Society. Google ScholarDigital Library
- {4} "VPFView 1.0 Users Manual for the Digital Chart of the World". Defense Mapping Agency, July 1992.Google Scholar
- {5} D. J. DeWitt and J. Gray, "Parallel Database Systems: The Future of Database Processing or a Passing Fad?". Communication of the ACM, June, 1992. Google ScholarDigital Library
- {6} O. Günther, "Efficient Computation of Spatial Joins", In IEEE TKDE, 1993. Google ScholarDigital Library
- {7} R. H. Güting and W. Shilling. "A Practical Divide-and-Conquer Algorithm for the Rectangle Intersection Problem". In Information Sciences, volume 42, 1987. Google ScholarDigital Library
- {8} A. Gutman. "R-trees: A Dynamic Index Structure for Spatial Searching". In Proceedings of the 1984 ACM-SIGMOD Conference, Boston, Mass, June 1984. Google ScholarDigital Library
- {9} E. G. Hoel and H. Samet, "Performance of Data-Parallel Spatial Operations". In Proceedings of the 20th VLDB Conf., pages 156-167, Santiago, Chile, Sept. 1994. Google ScholarDigital Library
- {10} E. G. Hoel and H. Samet. "Benchmarking Spatial Join Operations with Spatial Output". In Proceedings of the 21st VLDB Conf., Zurich, Switzerland, Sept. 1995. Google ScholarDigital Library
- {11} Y.-W. Huang, N. Jing, and E. A. Rundensteiner, "Spatial Joins Using R-trees: Breadth-First Traversal with Global Optimizations". In Proceedings of the 23st VLDB Conf., pages 396-405, Athens, Greece, Aug. 1997. Google ScholarDigital Library
- {12} I. Kamel and C. Faloutsos, "Parallel R-Trees". In Proceedings of the 1992 ACM-SIGMOD Conference, San Diego, California, June 1992. Google ScholarDigital Library
- {13} M. Kitsuregawa and Y. Ogawa. "Bucket Spreading Parallel Hash: A New, Robust, Parallel Hash Join Method for Data Skew in the Super Database Computer (SDC) ". In Proceedings of the 16th VLDB Conf., pages 210-221, 1990. Google ScholarDigital Library
- {14} N. Koudas, C. Faloutsos, and I. Kamel, "Declustering Spatial Databases on a Multi--Computer Architecture". In EDBT, pages 592-614, 1996. Google ScholarDigital Library
- {15} M. L. Lo and C. V. Ravishankar, "Spatial Joins Using Seeded Trees". In Proceedings of the 1994 ACM-SIGMOD Conference, Minneapolis, May 1994. Google ScholarDigital Library
- {16} M. L. Lo and C. V. Ravishankar, "Spatial Hash-Joins". In Proceedings of the 1996 ACM-SIGMOD Conference, Montreal, Canada, June 1996. Google ScholarDigital Library
- {17} F. Olken and D. Rotem, "Sampling from Spatial Databases". In Proc. of the 9th International Conference on Data Engineering, pages 199-208, Apr. 1993. Google ScholarDigital Library
- {18} J. A. Orenstein, "Spatial Query Processing in an Object-Oriented Database System". In Proceedings of the 1986 ACM-SIGMOD Conference, 1986. Google ScholarDigital Library
- {19} J. M. Patel and D. J. DeWitt, Partition based spatial-merge join. In Proceedings of ACM SIGMOD 1996, pages 259-270, June 1996. Google ScholarDigital Library
- {20} J. M. Patel and D. J. DeWitt, "Clone Join and Shadow Join: Two Parallel Algorithms for Executing Spatial Join Operations". Technical Report, University of Wisconsin, CS-TR-99-1403, Aug. 1999.Google Scholar
- {21} J. M. Patel, J. Yu, N. Kabra, K. Tufte, B. Nag, J. Burger, N. E. Hall, K. Ramasamy, R. Lueder, C. Ellman, J. Kupsch, S. Guo, D. J. DeWitt, and J. F. Naughton. "Building a Scaleable Geo--Spatial DBMS: Technology, Implementation, and Evaluation." In Proceedings of the 1997 ACM-SIGMOD Conference, pages 336-347, Tucson, Arizona, USA, May 1997. Google ScholarDigital Library
- {22} S. Shekhar, S. Ravada, V. Kumar, D. Chubb, and G. Turner, "Declustering and Load-Balancing Methods for Parallelizing Geographic Information Systems". In IEEE TKDE, volume 10(4), pages 632-655, 1998. Google ScholarDigital Library
- {23} SpaceImaging Corp., Lanham, MD. Space Imaging Catalog of Products and Services, Feb. 1999.Google Scholar
- {24} K.-L. Tan and J. X. Yu, "A Performance Study of Declustering Strategies for Parallel Spatial Databases". In DEXA, pages 157-166, London, United Kingdom, Sept. 1995. Google ScholarDigital Library
- {25} C. B. Walton, A. G. Dale, and R. Jenevein, "A Taxonomy and Performance Model of Data Skew Effects in Parallel Joins". In Proceedings of the 17th VLDB Conf., pages 537--548, Barcelona, Spain, Sept. 1991. Google ScholarDigital Library
- {26} X. Zhou, D. J. Abel, and D. Truffet, "Data Partitioning For Parallel Spatial Join Processing". In Proceedings of 5th Intl. Symp. On Large Spatial Databases, pages 178-196, Berlin, Germany, July 1997. Google ScholarDigital Library
Index Terms
- Clone join and shadow join: two parallel spatial join algorithms
Recommendations
Iterative spatial join
The key issue in performing spatial joins is finding the pairs of intersecting rectangles. For unindexed data sets, this is usually resolved by partitioning the data and then performing a plane sweep on the individual partitions. The resulting join can ...
Multi-way spatial join selectivity for the ring join graph
Efficient spatial query processing is very important since the applications of the spatial DBMS (e.g. GIS, CAD/CAM, LBS) handle massive amount of data and consume much time. Many spatial queries contain the multi-way spatial join due to the fact that ...
Spatial Join Processing Using Corner Transformation
Spatial join finds pairs of spatial objects having a specific spatial relationship in spatial database systems. Since spatial join is a fairly expensive operation, we need an efficient algorithm taking advantage of the characteristics of available ...
Comments