skip to main content
article
Free Access

Avoiding Cartesian products for multiple joins

Published:15 January 1997Publication History
Skip Abstract Section

Abstract

Computing the natural join of a set of relations is an important operation in relational database systems. The ordering of joins determines to a large extent the computation time of the join. Since the number of possible orderings could be very large, query optimizers first reduce the search space by using various heuristics and then try to select an optimal ordering of joins. Avoiding Cartesian products is a common heuristic for reducing the search space, but it cannot guarantee optimal ordering in its search space, because the cheapest Cartesian-product-free (CPF for short) ordering could be significantly worse than an optimal non-CPF ordering by a factor of an arbitrarily large number. In this paper, we use programs consisting of joins, semijoins, and projections for computing the join of some relations, and we introduce a novel algorithm that derives programs from CPF orderings of joins. We show that there exists a CPF ordering from which our algorithm derives a program whose cost is within a constant factor of the cost of an optimal ordering. Thus, our result demonstrates the effectiveness of avoiding Cartesian products as a heuristic for restricting the search space of orderings of joins.

References

  1. BEERI, C. R., FAGIN, R., MAIER, D., AND YANNAKAKIS, M. 1983. On the desirability of acyclic database schemas. J. ACM 30, 3 (July), 479-513. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. BERNSTEIN, P. A., AND GOODMAN, N. 1981. The power of natural semijoins. SIAM J. Comput. 10, 4 (Nov.) 751-771.Google ScholarGoogle ScholarCross RefCross Ref
  3. CHANDRA, A. K., AND MERLIN, P. M. 1977. Optimal implementation of conjunctive queries in relational data bases. In Proceedings of the 9th ACM Symposium on the Theory of Computing, (Boulder, Colo., May 2-4). ACM, New York, pp. 77-90. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. GOODMAN, N., AND SHMUELI, O. 1984. The tree projection theorem and relational query processing. J. Comput. Syst. Sci. 28, 1 (Feb.), 60-79.Google ScholarGoogle ScholarCross RefCross Ref
  5. SAGIV, Y., AND SHMUELI, O. 1993. Solving queries by tree projections. ACM Trans. Database Syst. 18, 3 (Sept.), 487-511. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. SELINGER, P. G., ASTRAHAN, M. M., CHAMBERLIN, D. D., LORIE, P. A., AND PRICE, f. G. 1979. Access path selection in a relational database system. In Proceedings of the 1979 ACM-SIGMOD International Conference on Management of Data (Boston, Mass., May 30-June 1). ACM, New York, pp. 23-34. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. SMITH, D. E., AND GENESERETH, M. R. 1985. Ordering conjunctive queries. Artif. Intell. 26, 171-215. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. SWAMI, A., AND GUPTA, A. 1988. Optimization of large join queries. In Proceedings of the 1988 ACM-SIGMOD International Conference on Management of Data (Chicago, Ill., June 1-3). ACM, New York, 1988, pp. 8-17. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. SWAMI, A. 1989. Optimization of large join queries: Combining heuristics and combinatorial techniques. In Proceedings of the 1989 ACM-SIGMOD International Conference on Management of Data (Portland, Ore., May 31-June 2). ACM, New York, 367-376. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. fAY, Y. C. 1993. On the optimality of strategies for multiple joins. J. ACM 40, 5 (Nov.), 1067-1086. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. ULLMAN, J.D. 1989. Principles of Database and Knowledge-Base Systems, vol. 2. Computer Science Press, New York. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. WONG, E., AND YOUSEFFI, K. 1976. Decomposition-A strategy for query processing. ACM Trans. Database Syst. 1, 3, (Sept.), 223-241. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. YANNAKAKIS, M. 1981. Algorithms for acyclic database schemes. In Proceedings of the International Conference on Very Large Data Bases (Cannes, France, Sept. 9-11). ACM, New York, pp. 82-94.Google ScholarGoogle Scholar

Index Terms

  1. Avoiding Cartesian products for multiple joins

              Recommendations

              Reviews

              Jack Minker

              An important task in relational databases is syntactic optimization of a query. To develop an efficient implementation, it is essential to find an optimum method of computing multiple join operations. This is a difficult task, because the state space of possible orderings of joins can be very large. One heuristic that has been used is to avoid Cartesian products. Morishita introduces a clever algorithm that derives programs from Cartesian-product-free orderings of joins. The resulting programs for computing the join of a set of relations consist of joins, semijoins, and projections. He also shows that there exists a Cartesian-product-free program whose cost is within a constant factor of the cost of an optimal ordering. The paper is important for implementors of database systems. It demonstrates the importance of avoiding Cartesian products as a heuristic for restricting the search space of ordering of joins. It also shows how to develop an effective implementation of a Cartesian-product-free program.

              Access critical reviews of Computing literature here

              Become a reviewer for Computing Reviews.

              Comments

              Login options

              Check if you have access through your login credentials or your institution to get full access on this article.

              Sign in

              Full Access

              • Published in

                cover image Journal of the ACM
                Journal of the ACM  Volume 44, Issue 1
                Jan. 1997
                199 pages
                ISSN:0004-5411
                EISSN:1557-735X
                DOI:10.1145/256292
                Issue’s Table of Contents

                Copyright © 1997 ACM

                Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

                Publisher

                Association for Computing Machinery

                New York, NY, United States

                Publication History

                • Published: 15 January 1997
                Published in jacm Volume 44, Issue 1

                Permissions

                Request permissions about this article.

                Request Permissions

                Check for updates

                Qualifiers

                • article

              PDF Format

              View or Download as a PDF file.

              PDF

              eReader

              View online with eReader.

              eReader