DOI: 10.1145/2911451.2914689
short-paper

Load-Balancing in Distributed Selective Search

Published: 07 July 2016

ABSTRACT

Simulation and analysis have shown that selective search can reduce the cost of large-scale distributed information retrieval. By partitioning the collection into small topical shards, and then using a resource ranking algorithm to choose a subset of shards to search for each query, fewer postings are evaluated. Here we extend the study of selective search using a fine-grained simulation investigating: selective search efficiency in a parallel query processing environment; the difference in efficiency when term-based and sample-based resource selection algorithms are used; and the effect of two policies for assigning index shards to machines. Results obtained for two large datasets and four large query logs confirm that selective search is significantly more efficient than conventional distributed search. In particular, we show that selective search is capable of both higher throughput and lower latency in a parallel environment than is exhaustive search.
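
To make the cost argument in the abstract concrete, here is a minimal sketch of how searching only a few ranked shards evaluates fewer postings than searching every shard. It is an illustration under stated assumptions, not the simulator used in the paper: the shard names, the posting-list lengths, and the scoring rule (sum of posting-list lengths for the query terms, a crude stand-in for a term-based resource selection algorithm) are all invented for this example.

```python
from collections import Counter

# Hypothetical topical shards: each maps a term to its posting-list length.
# All names and numbers are invented for illustration.
SHARDS = {
    "sports": Counter({"football": 900, "score": 400, "election": 10}),
    "politics": Counter({"election": 800, "vote": 600, "score": 50}),
    "cooking": Counter({"recipe": 700, "score": 5}),
}


def rank_shards(query_terms, shards):
    """Rank shards by a crude term-based score (assumed, not the paper's):
    the summed posting-list length of the query terms in each shard."""
    scores = {
        name: sum(stats[term] for term in query_terms)
        for name, stats in shards.items()
    }
    # Highest score first; ties broken by shard name for determinism.
    return sorted(scores, key=lambda name: (-scores[name], name))


def postings_evaluated(query_terms, shards, selected):
    """Count the postings touched when only the selected shards are searched."""
    return sum(shards[name][term] for name in selected for term in query_terms)


def selective_cost(query_terms, shards, k):
    """Search only the top-k ranked shards and report the postings evaluated."""
    selected = rank_shards(query_terms, shards)[:k]
    return selected, postings_evaluated(query_terms, shards, selected)


def exhaustive_cost(query_terms, shards):
    """Search every shard, as conventional distributed search would."""
    return postings_evaluated(query_terms, shards, list(shards))


if __name__ == "__main__":
    query = ["score"]
    print(selective_cost(query, SHARDS, k=1))
    print(exhaustive_cost(query, SHARDS))
```

The sketch deliberately ignores the parallel query processing and the shard-to-machine assignment policies that the paper studies; it only shows where the reduction in evaluated postings comes from.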


Published in

SIGIR '16: Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval
July 2016
1296 pages
ISBN: 9781450340694
DOI: 10.1145/2911451

Copyright © 2016 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

SIGIR '16 Paper Acceptance Rate: 62 of 341 submissions, 18%
Overall Acceptance Rate: 792 of 3,983 submissions, 20%
