2015 | OriginalPaper | Chapter
ShRkC: Shard Rank Cutoff Prediction for Selective Search
Author: Anagha Kulkarni
Published in: String Processing and Information Retrieval
Publisher: Springer International Publishing
In search environments where large document collections are partitioned into smaller subsets (shards), processing the query against only the relevant shards improves search efficiency. The problem of ranking the shards based on their estimated relevance to the query has been studied extensively. However, a related and important task has received little attention: identifying how many of the top-ranked shards should be searched for the query, so as to balance the competing objectives of effectiveness and efficiency. This task of shard rank cutoff estimation is the focus of the presented work. The central premise of the proposed solution is that the number of top shards searched should depend on three factors: (1) the query, (2) the given ranking of shards, and (3) the type of search need being served (a precision-oriented versus a recall-oriented task). An array of features that capture these three factors is defined, and a regression model is induced from these features to learn a query-specific shard rank cutoff estimator. An empirical evaluation using two large datasets demonstrates that the learned shard rank cutoff estimator provides substantial improvement in search efficiency compared to strong baselines, without degrading search effectiveness.
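As a rough illustration of how a query-specific cutoff estimator of this kind might be applied at query time, the sketch below maps three groups of features (query, shard ranking, search need) to a predicted cutoff with a linear model. Everything in it is an assumption made for illustration: the feature definitions, the function names `extract_features` and `predict_cutoff`, the shard names and scores, and the hand-set weights are hypothetical and are not taken from the chapter, whose actual features and regression model are not described in this abstract.

```python
from typing import List, Sequence


def extract_features(query: str, shard_scores: Sequence[float],
                     recall_oriented: bool) -> List[float]:
    """Build a hypothetical feature vector from the query, the shard
    ranking, and the search need. Assumes shard_scores is non-empty and
    sorted from most to least relevant."""
    total = sum(shard_scores) or 1.0
    top_share = shard_scores[0] / total  # how strongly the top shard dominates
    return [
        1.0,                              # bias term
        float(len(query.split())),        # query feature: length in terms
        top_share,                        # ranking feature: score skew
        1.0 if recall_oriented else 0.0,  # search-need indicator
    ]


def predict_cutoff(features: Sequence[float], weights: Sequence[float],
                   num_shards: int) -> int:
    """Apply a linear model, then round and clamp to [1, num_shards]."""
    raw = sum(f * w for f, w in zip(features, weights))
    return max(1, min(num_shards, round(raw)))


# Illustrative weights only. In practice they would be fit by regression
# on training queries with known good cutoffs.
WEIGHTS = [2.0, 0.5, -3.0, 4.0]

# Hypothetical shard ranking: (shard name, estimated relevance score).
ranking = [("shard7", 0.60), ("shard2", 0.25), ("shard9", 0.10), ("shard4", 0.05)]
scores = [score for _, score in ranking]

features = extract_features("obama family tree", scores, recall_oriented=False)
k = predict_cutoff(features, WEIGHTS, len(ranking))
selected = [name for name, _ in ranking[:k]]  # only these shards are searched
print(k, selected)
```

With these made-up weights, a recall-oriented request raises the predicted cutoff and a ranking dominated by its top shard lowers it, which mirrors the premise that the cutoff should vary with the query, the ranking, and the search need.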