Abstract
Decision making is challenging when there is more than one criterion to consider. In such cases, it is common to assign a goodness score to each item as a weighted sum of its attribute values and rank them accordingly. Clearly, the ranking obtained depends on the weights used for this summation. Ideally, one would want the ranked order not to change if the weights are changed slightly. We call this property stability of the ranking. A consumer of a ranked list may trust the ranking more if it has high stability. A producer of a ranked list prefers to choose weights that result in a stable ranking, both to earn the trust of potential consumers and because a stable ranking is intrinsically likely to be more meaningful.
In this paper, we develop a framework that can be used to assess the stability of a provided ranking and to obtain a stable ranking within an "acceptable" range of weight values (called "the region of interest"). We address the case where the user cares about the rank order of the entire set of items, and also the case where the user cares only about the top-k items. Using a geometric interpretation, we propose algorithms that produce stable rankings. In addition to theoretical analyses, we conduct extensive experiments on real datasets that validate our proposal.
- M. Gladwell. The order of things: What college rankings really tell us. The New Yorker Magazine, Feb 14, 2011.Google Scholar
- N. A. Bowman and M. N. Bastedo. Anchoring effects in world university rankings: exploring biases in reputation scores. Higher Education, 61(4):431--444, 2011.Google ScholarCross Ref
- J. Monks and R. G. Ehrenberg. The impact of us news and world report college rankings on admission outcomes and pricing decisions at selective private institutions. Technical report, National Bureau of Economic Research, 1999.Google ScholarCross Ref
- A. Langville and C. Meyer. Who's #1? The Science of Rating and Ranking. Princeton University Press, 2012. Google ScholarDigital Library
- K. Yang, J. Stoyanovich, A. Asudeh, B. Howe, H. Jagadish, and G. Miklau. A nutritional label for rankings. In SIGMOD, pages 1773--1776. ACM, 2018. Google ScholarDigital Library
- CSMetrics. www.csmetrics.org/. {Online; accessed April 2018}.Google Scholar
- H. Edelsbrunner. Algorithms in combinatorial geometry, volume 10. Springer Science & Business Media, 2012. Google ScholarDigital Library
- S. Borzsony, D. Kossmann, and K. Stocker. The skyline operator. In ICDE, pages 421--430. IEEE, 2001. Google ScholarDigital Library
- X. Lin, Y. Yuan, Q. Zhang, and Y. Zhang. Selecting stars: The k most representative skyline operator. In Data Engineering, 2007. ICDE 2007. IEEE 23rd International Conference on, pages 86--95. IEEE, 2007.Google ScholarCross Ref
- D. Nanongkai, A. D. Sarma, A. Lall, R. J. Lipton, and J. Xu. Regret-minimizing representative databases. PVLDB, 3(1--2):1114--1124, 2010. Google ScholarDigital Library
- S. Chester, A. Thomo, S. Venkatesh, and S. Whitesides. Computing k-regret minimizing sets. PVLDB, 7(5):389--400, 2014. Google ScholarDigital Library
- I.-F. Su, Y.-C. Chung, and C. Lee. Top-k combinatorial skyline queries. In International Conference on Database Systems for Advanced Applications, pages 79--93. Springer, 2010. Google ScholarDigital Library
- A. Asudeh, H. Jagadish, J. Stoyanovich, and G. Das. Designing fair ranking schemes. In SIGMOD. ACM, 2019.Google ScholarDigital Library
- A. Asudeh, S. Thirumuruganathan, N. Zhang, and G. Das. Discovering the skyline of web databases. PVLDB, 9(7):600--611, 2016. Google ScholarDigital Library
- A. Asudeh, G. Zhang, N. Hassan, C. Li, and G. V. Zaruba. Crowdsourcing pareto-optimal object finding by pairwise comparisons. In CIKM, pages 753--762. ACM, 2015. Google ScholarDigital Library
- A. Asudeh, H. Jagadish, G. Miklau, and J. Stoyanovich. On obtaining stable rankings. CoRR, abs/1804.10990, 2018. Google ScholarDigital Library
- M. E. Dyer and A. M. Frieze. On the complexity of computing the volume of a polyhedron. SIAM Journal on Computing, 17(5):967--974, 1988. Google ScholarDigital Library
- P. Orlik and H. Terao. Arrangements of hyperplanes, volume 300. Springer, 2013.Google Scholar
- B. Grünbaum. Arrangements of hyperplanes. In Convex Polytopes. Springer, 2003.Google ScholarCross Ref
- V. V. Schechtman and A. N. Varchenko. Arrangements of hyperplanes and lie algebra homology. Inventiones mathematicae, 106(1), 1991.Google Scholar
- P. K. Agarwal and M. Sharir. Arrangements and their applications. Handbook of computational geometry, 2000.Google Scholar
- I. F. Ilyas, G. Beskales, and M. A. Soliman. A survey of top-k query processing techniques in relational database systems. CSUR, 40(4), 2008. Google ScholarDigital Library
- R. Fagin, A. Lotem, and M. Naor. Optimal aggregation algorithms for middleware. Journal of Computer and System Sciences, 66(4), 2003. Google ScholarDigital Library
- A. Asudeh, N. Zhang, and G. Das. Query reranking as a service. PVLDB, 9(11):888--899, 2016. Google ScholarDigital Library
- Y. D. Gunasekaran, A. Asudeh, S. Hasani, N. Zhang, A. Jaoua, and G. Das. QR2: A third-party query reranking service over web databases. In ICDE, pages 1653--1656. IEEE, 2018.Google ScholarCross Ref
- H. Blaker. Confidence curves and improved exact confidence intervals for discrete distributions. Canadian Journal of Statistics, 28(4):783--798, 2000.Google ScholarCross Ref
- F. J. Hickernell, L. Jiang, Y. Liu, and A. B. Owen. Guaranteed conservative fixed width confidence intervals via monte carlo sampling. In Monte Carlo and Quasi-Monte Carlo Methods 2012, pages 105--128. Springer, 2013.Google Scholar
- W. Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American statistical association, 58(301):13--30, 1963.Google Scholar
- H. Chernoff. A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. The Annals of Mathematical Statistics, pages 493--507, 1952.Google ScholarCross Ref
- C. P. Robert. Monte carlo methods. Wiley Online Library, 2004.Google Scholar
- R. Durrett. Probability: theory and examples. Cambridge university press, 2010. Google ScholarDigital Library
- A. Asudeh, A. Nazi, N. Zhang, G. Das, and H. Jagadish. RRR: Rank-regret representative. In SIGMOD. ACM, 2019.Google ScholarDigital Library
- M. E. Muller. A note on a method for generating points uniformly on n-dimensional spheres. Communications of the ACM, 2(4), 1959. Google ScholarDigital Library
- G. Marsaglia et al. Choosing a point from the surface of a sphere. The Annals of Mathematical Statistics, 43(2), 1972.Google ScholarCross Ref
- H. Cramér. Mathematical methods of statistics (PMS-9), volume 9. Princeton university press, 2016.Google Scholar
- S. Lucidl and M. Piccioni. Random tunneling by means of acceptance-rejection sampling for global optimization. Journal of optimization theory and applications, 62(2):255--277, 1989.Google Scholar
- L. Devroye. Sample-based non-uniform random variate generation. In Proceedings of the 18th conference on Winter simulation, pages 260--265. ACM, 1986. Google ScholarDigital Library
- K. Fischer, B. Gärtner, and M. Kutz. Fast smallest-enclosing-ball computation in high dimensions. In European Symposium on Algorithms, pages 630--641. Springer, 2003.Google ScholarCross Ref
- S. Li. Concise formulas for the area and volume of a hyperspherical cap. Asian Journal of Mathematics and Statistics, 4(1):66--70, 2011.Google ScholarCross Ref
- T. F. I. de Football Association (FIFA). Fifa rankings. www.fifa.com/fifa-world-ranking/ranking-table/men/index.html. {Online; accessed April 2018}.Google Scholar
- FIFA. Fifa/coca-cola world ranking procedure. http://www.fifa.com/fifa-world-ranking/procedure/men.html, 28 March 2008.Google Scholar
- BlueNile. www.bluenile.com/diamond-search? {Online; accessed Feb. 2018}.Google Scholar
- US Department of Transportation's dataset. http://www.transtats.bts.gov/DL_SelectFields.asp?Table_ID=236&DB_Short_Name=On-Time.Google Scholar
- F. Geerts, H. Mannila, and E. Terzi. Relational link-based ranking. In Proceedings of the Thirtieth international conference on Very large data bases-Volume 30, pages 552--563. VLDB Endowment, 2004. Google ScholarDigital Library
- S. Chaudhuri and G. Das. Keyword querying and ranking in databases. PVLDB, 2(2):1658--1659, 2009. Google ScholarDigital Library
- R. Agrawal, R. Rantzau, and E. Terzi. Context-sensitive ranking. In SIGMOD, pages 383--394. ACM, 2006. Google ScholarDigital Library
- P. K. Agarwal, L. Arge, J. Erickson, P. G. Franciosa, and J. S. Vitter. Efficient searching with linear constraints. JCSS, 61(2), 2000. Google ScholarDigital Library
- A. Asudeh, A. Nazi, N. Zhang, and G. Das. Efficient computation of regret-ratio minimizing set: A compact maxima representative. In SIGMOD, pages 821--834. ACM, 2017. Google ScholarDigital Library
- J. Stoyanovich, W. Mee, and K. A. Ross. Semantic ranking and result visualization for life sciences publications. In ICDE, pages 860--871, 2010.Google ScholarCross Ref
- S. Chaudhuri, G. Das, V. Hristidis, and G. Weikum. Probabilistic ranking of database query results. In Proceedings of the Thirtieth international conference on Very large data bases-Volume 30, pages 888--899. VLDB Endowment, 2004. Google ScholarDigital Library
- J. Li, B. Saha, and A. Deshpande. A unified approach to ranking in probabilistic databases. PVLDB, 2(1):502--513, 2009. Google ScholarDigital Library
- J. Li and A. Deshpande. Ranking continuous probabilistic datasets. PVLDB, 3(1--2):638--649, 2010. Google ScholarDigital Library
- V. Hristidis and Y. Papakonstantinou. Algorithms and applications for answering ranked queries using ranked views. The VLDB Journal, 13(1):49--70, 2004. Google ScholarDigital Library
- G. Das, D. Gunopulos, N. Koudas, and D. Tsirogiannis. Answering top-k queries using views. In Proceedings of the 32nd international conference on Very large data bases, pages 451--462. VLDB Endowment, 2006. Google ScholarDigital Library
- Y.-C. Chang, L. Bergman, V. Castelli, C.-S. Li, M.-L. Lo, and J. R. Smith. The onion technique: indexing for linear optimization queries. In SIGMOD, 2000. Google ScholarDigital Library
- G. R. Hjaltason and H. Samet. Ranking in spatial databases. In SSTD. Springer, 1995. Google ScholarDigital Library
- M. F. Rahman, A. Asudeh, N. Koudas, and G. Das. Efficient computation of subspace skyline over categorical domains. In CIKM, pages 407--416. ACM, 2017. Google ScholarDigital Library
- M. De Berg, O. Cheong, M. Van Kreveld, and M. Overmars. Computational Geometry: Introduction. Springer, 2008. Google ScholarDigital Library
- K. Mouratidis. Geometric approaches for top-k queries. PVLDB, 10(12):1985--1987, 2017. Google ScholarDigital Library
Recommendations
User rankings of search engine results
In this study, we investigate the similarities and differences between rankings of search results by users and search engines. Sixty-seven students took part in a 3-week-long experiment, during which they were asked to identify and rank the top 10 ...
Authority Rankings from HITS, PageRank, and SALSA: Existence, Uniqueness, and Effect of Initialization
Algorithms such as Kleinberg's HITS algorithm, the PageRank algorithm of Brin and Page, and the SALSA algorithm of Lempel and Moran use the link structure of a network of web pages to assign weights to each page in the network. The weights can then be ...
What's Going on in Search Engine Rankings?
AINAW '08: Proceedings of the 22nd International Conference on Advanced Information Networking and Applications - WorkshopsMany people use search engines every day to retrieve documents from the Web. Although the social influence of search engine rankings has become significant, ranking algorithms are not disclosed. In this paper, we have investigated three major search ...
Comments