skip to main content
research-article

Optimal algorithms for evaluating rank joins in database systems

Published:15 February 2008Publication History
Skip Abstract Section

Abstract

In the rank join problem, we are given a set of relations and a scoring function, and the goal is to return the join results with the top k scores. It is often the case in practice that the inputs may be accessed in ranked order and the scoring function is monotonic. These conditions allow for efficient algorithms that solve the rank join problem without reading all of the input. In this article, we present a thorough analysis of such rank join algorithms. A strong point of our analysis is that it is based on a more general problem statement than previous work, making it more relevant to the execution model that is employed by database systems. One of our results indicates that the well-known HRJN algorithm has shortcomings, because it does not stop reading its input as soon as possible. We find that it is NP-hard to overcome this weakness in the general case, but cases of limited query complexity are tractable. We prove the latter with an algorithm that infers provably tight bounds on the potential benefit of reading more input in order to stop as soon as possible. As a result, the algorithm achieves a cost that is within a constant factor of optimal.

References

  1. Agrawal, P. and Widom, J. 2009. Confidence-Aware join algorithms. In Proceedings of the IEEE International Conference on Data Engineering. IEEE Computer Society, 628--639. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Bast, H., Majumdar, D., Schenkel, R., Theobald, M., and Weikum, G. 2006. IO-Top-k: Index-Access optimized top-k query processing. In Proceedings of the 32nd International Conference on Very Large Data Bases. VLDB Endowment, 475--486. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Bruno, N., Gravano, L., and Marian, A. 2002. Evaluating top-k queries over Web-accessible databases. In Proceedings of the 18th International Conference on Data Engineering. IEEE Computer Society, 369--380. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Chang, K. C.-C. and Hwang, S.-w. 2002. Minimal probing: Supporting expensive predicates for top-k queries. In Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, New York, 346--357. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Chaudhuri, S. and Gravano, L. 1996. Optimizing queries over multimedia repositories. In Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, New York, 91--102. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Chaudhuri, S. and Gravano, L. 1999. Evaluating top-k selection queries. In Proceedings of the 25th International Conference on Very Large Data Bases. Morgan Kaufmann, San Francisco, CA, 397--410. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Fagin, R. 1999. Combining fuzzy information from multiple systems. J. Comput. Syst. Sci. 58, 1, 83--99. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Fagin, R., Lotem, A., and Naor, M. 2003. Optimal aggregation algorithms for middleware. J. Comput. Syst. Sci. 66, 4, 614--656. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Finger, J. and Polyzotis, N. 2009. Robust and efficient algorithms for rank join evaluation. In Proceedings of the 35th SIGMOD International Conference on Management of Data. ACM, New York, 415--428. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Graefe, G. 1993. Query evaluation techniques for large databases. ACM Comput. Surv. 25, 2, 73--170. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Ilyas, I. F., Aref, W. G., and Elmagarmid, A. K. 2002. Joining ranked inputs in practice. In Proceedings of the 28th International Conference on Very Large Data Bases. VLDB Endowment, 950--961. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Ilyas, I. F., Aref, W. G., and Elmagarmid, A. K. 2004a. Supporting top-k join queries in relational databases. VLDB J. 13, 3, 207--221. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Ilyas, I. F., Aref, W. G., Elmagarmid, A. K., Elmongui, H. G., Shah, R., and Vitter, J. S. 2006. Adaptive rank-aware query optimization in relational databases. ACM Trans. Datab. Syst. 31, 4, 1257--1304. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Ilyas, I. F., Shah, R., Aref, W. G., Vitter, J. S., and Elmagarmid, A. K. 2004b. Rank-Aware query optimization. In Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, New York, 203--214. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Kimelfeld, B. and Sagiv, Y. 2006. Incrementally computing ordered answers of acyclic conjunctive queries. In Next Generation Information Technologies and Systems. Springer, 141--152. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Li, C., Chang, K. C.-C., and Ilyas, I. F. 2006. Supporting ad-hoc ranking aggregates. In Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, New York, 61--72. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Li, C., Chang, K. C.-C., Ilyas, I. F., and Song, S. 2005. RankSQL: Query algebra and optimization for relational top-k queries. In Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, New York, 131--142. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Mamoulis, N., Yiu, M. L., Cheng, K. H., and Cheung, D. W. 2007. Efficient top-k aggregation of ranked inputs. ACM Trans. Datab. Syst. 32, 3, 19. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Natsev, A., Chang, Y.-C., Smith, J. R., Li, C.-S., and Vitter, J. S. 2001. Supporting incremental join queries on ranked inputs. In Proceedings of the 27th International Conference on Very Large Data Bases. Morgan Kaufmann, San Francisco, CA, 281--290. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Schnaitter, K. and Polyzotis, N. 2008. Evaluating rank joins with optimal cost. In Proceedings of the 27th Symposium on Principles of Database Systems. ACM, New York, 43--52. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Schnaitter, K., Spiegel, J., and Polyzotis, N. 2007. Depth estimation for ranking query optimization. In Proceedings of the 33rd International Conference on Very Large Data Bases. VLDB Endowment, 902--913. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Yan, W. P. and Larson, P.-A. 1995. Eager aggregation and lazy aggregation. In Proceedings of the 21st International Conference on Very Large Data Bases. Morgan Kaufmann, San Francisco, CA, 345--357. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Optimal algorithms for evaluating rank joins in database systems

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in

      Full Access

      • Published in

        cover image ACM Transactions on Database Systems
        ACM Transactions on Database Systems  Volume 35, Issue 1
        February 2010
        310 pages
        ISSN:0362-5915
        EISSN:1557-4644
        DOI:10.1145/1670243
        Issue’s Table of Contents

        Copyright © 2008 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Accepted: 1 September 2009
        • Revised: 1 May 2009
        • Received: 1 November 2008
        • Published: 15 February 2008
        Published in tods Volume 35, Issue 1

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • research-article
        • Research
        • Refereed

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader