ABSTRACT
Type-ahead search can on-the-fly find answers as a user types in a keyword query. A main challenge in this search paradigm is the high-efficiency requirement that queries must be answered within milliseconds. In this paper we study how to answer top-k queries in this paradigm, i.e., as a user types in a query letter by letter, we want to efficiently find the k best answers. Instead of inventing completely new algorithms from scratch, we study challenges when adopting existing top-k algorithms in the literature that heavily rely on two basic list-access methods: random access and sorted access. We present two algorithms to support random access efficiently. We develop novel techniques to support efficient sorted access using list pruning and materialization. We extend our techniques to support fuzzy type-ahead search which allows minor errors between query keywords and answers. We report our experimental results on several real large data sets to show that the proposed techniques can answer top-k queries efficiently in type-ahead search.
- H. Bast, A. Chitea, F. M. Suchanek, and I. Weber. Ester: efficient search on text, entities, and relations. In SIGIR, pages 671--678, 2007. Google ScholarDigital Library
- H. Bast and I. Weber. Type less, find more: fast autocompletion search with a succinct index. In SIGIR, pages 364--371, 2006. Google ScholarDigital Library
- H. Bast and I. Weber. The completesearch engine: Interactive, efficient, and towards ir & db integration. In CIDR, pages 88--95, 2007.Google Scholar
- S. Chaudhuri, V. Ganti, and R. Kaushik. A primitive operator for similarity joins in data cleaning. In ICDE, pages 5--16, 2006. Google ScholarDigital Library
- S. Chaudhuri and R. Kaushik. Extending autocompletion to tolerate errors. In SIGMOD Conference, pages 707--718, 2009. Google ScholarDigital Library
- R. Fagin, A. Lotem, and M. Naor. Optimal aggregation algorithms for middleware. In PODS, pages 102--113, 2001. Google ScholarDigital Library
- J. Fan, G. Li, and L. Zhou. Interactive SQL query suggestion: Making databases user-friendly. ICDE, pages 351--362, 2011. Google ScholarDigital Library
- J. Feng, and G. Li. Efficient Fuzzy Type-Ahead Search in XML Data. IEEE TKDE, 24(5):882--895, 2012. Google ScholarDigital Library
- K. Grabski and T. Scheffer. Sentence completion. In SIGIR, pages 433--439, 2004. Google ScholarDigital Library
- L. Gravano, P. G. Ipeirotis, H. V. Jagadish, N. Koudas, S. Muthukrishnan, and D. Srivastava. Approximate string joins in a database (almost) for free. In VLDB, pages 491--500, 2001. Google ScholarDigital Library
- M. Hadjieleftheriou, A. Chandel, N. Koudas, and D. Srivastava. Fast indexes and algorithms for set similarity selection queries. In ICDE, pages 267--276, 2008. Google ScholarDigital Library
- I. F. Ilyas, G. Beskales, and M. A. Soliman. A survey of top-k query processing techniques in relational database systems. ACM Comput. Surv., 40(4), 2008. Google ScholarDigital Library
- S. Ji, G. Li, C. Li, and J. Feng. Efficient interactive fuzzy keyword search. In WWW, pages 371--380, 2009. Google ScholarDigital Library
- N. Khoussainova, Y. Kwon, M. Balazinska, and D. Suciu. Snipsuggest: Context-aware autocompletion for sql. PVLDB, 4(1):22--33, 2010. Google ScholarDigital Library
- K. Kukich. Techniques for automatically correcting words in text. ACM Comput. Surv., 24(4):377--439, 1992. Google ScholarDigital Library
- H. Lee, R. T. Ng, and K. Shim. Extending q-grams to estimate selectivity of string matching with low edit distance. In VLDB, pages 195--206, 2007. Google ScholarDigital Library
- C. Li, J. Lu, and Y. Lu. Efficient merging and filtering algorithms for approximate string searches. In ICDE, pages 257--266, 2008. Google ScholarDigital Library
- G. Li, J. Feng, and C. Li. Supporting search-as-you-type using sql in databases. IEEE TKDE, 2012.Google Scholar
- G. Li, S. Ji, C. Li, and J. Feng. Efficient type-ahead search on relational data: a tastier approach. In SIGMOD Conference, pages 695--706, 2009. Google ScholarDigital Library
- G. Li, S. Ji, C. Li, and J. Feng. Efficient fuzzy full-text type-ahead search. VLDB J., 20(4):617--640, 2011. Google ScholarDigital Library
- N. Mamoulis, K. H. Cheng, M. L. Yiu, and D. W. Cheung. Efficient aggregation of ranked inputs. In ICDE, page 72--83, 2006. Google ScholarDigital Library
- H. Motoda and K. Yoshida. Machine learning techniques to make computers easier to use. Artif. Intell., 103(1--2):295--321, 1998. Google ScholarDigital Library
- A. Nandi and H. V. Jagadish. Effective phrase prediction. In VLDB, pages 219--230, 2007. Google ScholarDigital Library
- J. Qin, W. Wang, Y. Lu, C. Xiao, and X. Lin. Efficient exact edit similarity query processing with the asymmetric signature scheme. In SIGMOD Conference, pages 1033--1044, 2011. Google ScholarDigital Library
- M. Theobald, R. Schenkel, and G. Weikum. Efficient and self-tuning incremental query expansion for top-k query processing. In SIGIR, pages 242--249, 2005. Google ScholarDigital Library
- H. E. Williams, J. Zobel, and D. Bahle. Fast phrase querying with combined indexes. ACM TOIS, 22(4):573--594, 2004. Google ScholarDigital Library
Index Terms
- Supporting efficient top-k queries in type-ahead search
Recommendations
Efficient type-ahead search on relational data: a TASTIER approach
SIGMOD '09: Proceedings of the 2009 ACM SIGMOD International Conference on Management of dataExisting keyword-search systems in relational databases require users to submit a complete query to compute answers. Often users feel "left in the dark" when they have limited knowledge about the data, and have to use a try-and-see approach for ...
Efficient fuzzy full-text type-ahead search
Traditional information systems return answers after a user submits a complete query. Users often feel "left in the dark" when they have limited knowledge about the underlying data and have to use a try-and-see approach for finding information. A recent ...
Automatic URL completion and prediction using fuzzy type-ahead search
SIGIR '09: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrievalType-ahead search is a new information-access paradigm, in which systems can find answers to keyword queries "on-the-fly" as a user types in a query. It improves traditional autocomplete search by allowing query keywords to appear at different places in ...
Comments