skip to main content
research-article

Efficient sort-based skyline evaluation

Authors Info & Claims
Published:12 December 2008Publication History
Skip Abstract Section

Abstract

Skyline queries compute the set of Pareto-optimal tuples in a relation, that is, those tuples that are not dominated by any other tuple in the same relation. Although several algorithms have been proposed for efficiently evaluating skyline queries, they either necessitate the relation to have been indexed or have to perform the dominance tests on all the tuples in order to determine the result. In this article we introduce salsa, a novel skyline algorithm that exploits the idea of presorting the input data so as to effectively limit the number of tuples to be read and compared. This makes salsa also attractive when skyline queries are executed on top of systems that do not understand skyline semantics, or when the skyline logic runs on clients with limited power and/or bandwidth. We prove that, if one considers symmetric sorting functions, the number of tuples to be read is minimized by sorting data according to a “minimum coordinate,” minC, criterion, and that performance can be further improved if data distribution is known and an asymmetric sorting function is used. Experimental results obtained on synthetic and real datasets show that salsa consistently outperforms state-of-the-art sequential skyline algorithms and that its performance can be accurately predicted.

Skip Supplemental Material Section

Supplemental Material

References

  1. Acharya, S., Poosala, V., and Ramaswamy, S. 1999. Selectivity estimation in spatial databases. In Proceedings of the 1999 ACM SIGMOD International Conference on Management of Data (SIGMOD). Philadelphia, PA, ACM, New York, NY, 13--24. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Balke, W.-T., Güntzer, U., and Zheng, J. X. 2004. Efficient distributed skylining for web information systems. In Proceedings of the 6th International Conference on Extending Database Technology (EDBT). Lecture Notes in Computer Science, vol. 2992, Springer, Berlin, Heidelberg, New York, 256--273.Google ScholarGoogle Scholar
  3. Bartolini, I., Ciaccia, P., Oria, V., and Özsu, T. 2004. Integrating the results of multimedia sub-queries using qualitative preferences. In Proceedings of the 10th International Workshop on Multimedia Information Systems (MIS). College Park, MD. Lecture Notes in Computer Science, vol. 2992, Springer, Berlin, Heidelberg, New York, 66--75.Google ScholarGoogle Scholar
  4. Bartolini, I., Ciaccia, P., Oria, V., and Özsu, T. 2007. Flexible integration of multimedia sub-queries with qualitative preferences. Multimed. Tools Appl. 33, 3, 275--300. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Bartolini, I., Ciaccia, P., and Patella, M. 2006. SaLSa: computing the skyline without scanning the whole sky. In Proceedings of the 2006 ACM CIKM International Conference on Information and Knowledge Management. Arlington, VA, ACM, New York, NY, 405--414. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Börzsönyi, S., Kossmann, D., and Stocker, K. 2001. The skyline operator. In Proceedings of the 17th International Conference on Data Engineering (ICDE). Heidelberg, Germany, IEEE Computer Society, Washington, DC, 421--430. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Buchta, C. 1989. On the average number of maxima in a set of vectors. Inform. Process. Lett. 33, 2, 63--65. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Chaudhuri, S., Dalvi, N., and Kaushik, R. 2006. Robust cardinality and cost estimation for skyline operator. In Proceedings of the 22nd International Conference on Data Engineering (ICDE). Atlanta, GA. IEEE Computer Society, Washington, DC, 64. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Chomicki, J. 2003. Preference Formulas in Relational Queries. ACM Trans. Database Syst. 28, 4, 427--466. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Chomicki, J., Godfrey, P., Gryz, J., and Liang, D. 2002. Skyline with presorting. Tech. Rep. CS-2002-04, York University, Toronto, ON.Google ScholarGoogle Scholar
  11. Chomicki, J., Godfrey, P., Gryz, J., and Liang, D. 2003. Skyline with presorting. In Proceedings of the 19th International Conference on Data Engineering (ICDE). Bangalore, India. IEEE Computer Society, Washington, DC, 717--816.Google ScholarGoogle Scholar
  12. Garey, M. R. and Johnson, D. S. 1979. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Godfrey, P. 2004. Skyline cardinality for relational processing. In Proceedings of the 3rd International Symposium on Foundations of Information and Knowledge Systems (FoIKS). Lecture Notes in Computer Science, vol. 2942, Springer, Berlin, Heidelberg, New York, 78--97.Google ScholarGoogle Scholar
  14. Godfrey, P., Shipley, R., and Gryz, J. 2005. Maximal vector computation in large data sets. In Proceedings of the 31st International Conference on Very Large Data Bases (VLDB). Trondheim, Norway. Morgan Kaufmann, San Francisco, CA, 229--240. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Kiessling, W. 2002. Foundations of preferences in database systems. In Proceedings of the 28th International Conference on Very Large Data Bases (VLDB). Hong Kong, China. Morgan Kaufmann, San Francisco, CA, 311--322. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Kossmann, D., Ramsak, F., and Rost, S. 2002. Shooting stars in the sky: an online algorithm for skyline queries. In Proceedings of the 28th International Conference on Very Large Data Bases (VLDB). Hong Kong, China. Morgan Kaufmann, San Francisco, CA, 275--286. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Papadias, D., Tao, Y., Fu, G., and Seeger, B. 2003. An optimal and progressive algorithm for skyline queries. In Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data (SIGMOD). San Diego, CA. ACM, New York, NY, 467--478. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Papadias, D., Tao, Y., Fu, G., and Seeger, B. 2005. Progressive skyline computation in database systems. ACM Trans. Database Syst. 30, 1, 41--82. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Paredes, R. and Navarro, G. 2006. Optimal incremental sorting. In Proceedings of the 8th Workshop on Algorithm Engineering and Experiments (ALENEX) and the 3rd Workshop on Analytic Algorithmics and Combinatorics (ANALCO). Miami, FL. SIAM Press, Philadelphia, PA, 171--182.Google ScholarGoogle Scholar
  20. Preparata, F. P. and Shamos, M. I. 1985. Computational Geometry—An Introduction. Springer. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Sakamoto, H. 1943. On the distributions of the product and the quotient of the independent and uniformly distributed random variables. Tohoku Math. J. 49, 243--260.Google ScholarGoogle Scholar
  22. Tan, K.-L., Eng, P.-K., and Ooi, B. C. 2001. Efficient progressive skyline computation. In Proceedings of the 27th International Conference on Very Large Data Bases (VLDB). Rome, Italy, Morgan Kaufmann, San Francisco, CA, 301--310. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Tao, Y., Xiao, X., and Pei, J. 2006. SUBSKY: efficient computation of skylines in subspaces. In Proceedings of the 22nd International Conference on Data Engineering (ICDE). Atlanta, GA. IEEE, Computer Society, Washington, DC, 65. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Efficient sort-based skyline evaluation

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in

      Full Access

      • Published in

        cover image ACM Transactions on Database Systems
        ACM Transactions on Database Systems  Volume 33, Issue 4
        November 2008
        379 pages
        ISSN:0362-5915
        EISSN:1557-4644
        DOI:10.1145/1412331
        Issue’s Table of Contents

        Copyright © 2008 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 12 December 2008
        • Accepted: 1 July 2008
        • Revised: 1 March 2008
        • Received: 1 June 2007
        Published in tods Volume 33, Issue 4

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • research-article
        • Research
        • Refereed

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader