Abstract
Search engines are continuously employing advanced techniques that aim to capture user intentions and provide results that go beyond the data that simply satisfy the query conditions. Examples include the personalized results, related searches, similarity search, popular and relaxed queries. In this work we introduce a novel query paradigm that considers a user query as an example of the data in which the user is interested. We call these queries exemplar queries and claim that they can play an important role in dealing with the information deluge. We provide a formal specification of the semantics of such queries and show that they are fundamentally different from notions like queries by example, approximate and related queries. We provide an implementation of these semantics for graph-based data and present an exact solution with a number of optimizations that improve performance without compromising the quality of the answers. We also provide an approximate solution that prunes the search space and achieves considerably better time-performance with minimal or no impact on effectiveness. We experimentally evaluate the effectiveness and efficiency of these solutions with synthetic and real datasets, and illustrate the usefulness of exemplar queries in practice.
- R. Agrawal, S. Gollapudi, A. Halverson, and S. Ieong. Diversifying search results. In WSDM, pages 5--14, 2009. Google ScholarDigital Library
- A. Anagnostopoulos, L. Becchetti, C. Castillo, and A. Gionis. An optimization framework for query recommendation. In WSDM, pages 161--170, 2010. Google ScholarDigital Library
- R. Baeza-Yates, P. Boldi, and C. Castillo. Generalizing pagerank: damping functions for link-based ranking algorithms. In SIGIR, pages 308--315, 2006. Google ScholarDigital Library
- I. Bedini, B. Elser, and Y. Velegrakis. The trento big data platform for public administration and large companies: Use cases and opportunities. PVLDB, 6(11):1166--1167, 2013. Google ScholarDigital Library
- S. Bergamaschi, E. Domnori, F. Guerra, R. Trillo Lado, and Y. Velegrakis. Keyword search over relational databases: a meta-data approach. In SIGMOD, pages 565--576, 2011. Google ScholarDigital Library
- S. Bergamaschi, F. Guerra, S. Rota, and Y. Velegrakis. A hidden markov model approach to keyword-based search over relational databases. In ER, pages 411--420, 2011. Google ScholarDigital Library
- S. Bhatia, D. Majumdar, and P. Mitra. Query suggestions in the absence of query logs. In SIGIR, pages 795--804, 2011. Google ScholarDigital Library
- P. Boldi, F. Bonchi, C. Castillo, and S. Vigna. Query reformulation mining: models, patterns, and applications. IR, 14(3):257--289, 2011. Google ScholarDigital Library
- I. Bordino, G. De Francisci Morales, I. Weber, and F. Bonchi. From machu_picchu to rafting the urubamba river: anticipating information needs via the entity-query graph. In WSDM, pages 275--284, 2013. Google ScholarDigital Library
- S. Chakrabarti. Dynamic personalized pagerank in entity-relation graphs. In WWW, pages 571--580, 2007. Google ScholarDigital Library
- X. Dong, A. Y. Halevy, and J. Madhavan. Reference Reconciliation in Complex Information Spaces. In SIGMOD, pages 85--96, 2005. Google ScholarDigital Library
- Z. Dou, S. Hu, Y. Luo, R. Song, and J. Wen. Finding dimensions for queries. In CIKM, pages 1311--1320, 2011. Google ScholarDigital Library
- X. Gao, B. Xiao, D. Tao, and X. Li. A survey of graph edit distance. Pattern Analysis and applications, 13(1):113--129, 2010. Google ScholarDigital Library
- S. Gauch and J. B. Smith. Search improvement via automatic query reformulation. TOIS, 9(3):249--280, 1991. Google ScholarDigital Library
- T. H. Haveliwala. Topic-sensitive pagerank. In WWW, pages 517--526, 2002. Google ScholarDigital Library
- M. R. Henzinger, T. A. Henzinger, and P. W. Kopke. Computing simulations on finite and infinite graphs. In FOCS, pages 453--462. IEEE, 1995. Google ScholarDigital Library
- B. Jansen, D. Booth, and A. Spink. Determining the informational, navigational, and transactional intent of web queries. Information Processing & Management, 44(3):1251--1266, 2008. Google ScholarDigital Library
- G. Jeh and J. Widom. Scaling personalized web search. In WWW, pages 271--279, 2003. Google ScholarDigital Library
- A. Khan, N. Li, X. Yan, Z. Guan, S. Chakraborty, and S. Tao. Neighborhood based fast graph search in large networks. In SIGMOD, pages 901--912, 2011. Google ScholarDigital Library
- A. Khan, Y. Wu, C. C. Aggarwal, and X. Yan. Nema: Fast graph search with label similarity. In PVLDB, pages 181--192, 2013. Google ScholarDigital Library
- N. Lao and W. W. Cohen. Fast query execution for retrieval models based on path-constrained random walks. In KDD, pages 881--888, 2010. Google ScholarDigital Library
- C. Mishra and N. Koudas. Interactive query refinement. In EDBT, pages 862--873, 2009. Google ScholarDigital Library
- D. Mottin, A. Marascu, S. B. Roy, G. Das, T. Palpanas, and Y. Velegrakis. A probabilistic optimization framework for the empty-answer problem. PVLDB, 6(14):1762--1773, 2013. Google ScholarDigital Library
- D. Mottin, T. Palpanas, and Y. Velegrakis. Entity Ranking Using Click-Log Information. IDA Journal, 17(5):837--856, 2013. Google ScholarDigital Library
- V. M. Ngo and T. H. Cao. Ontology-based query expansion with latently related named entities for semantic text search. In IJIIDS, volume 283, pages 41--52. 2010.Google ScholarCross Ref
- L. Page, S. Brin, R. Motwani, and T. Winograd. The pagerank citation ranking: Bringing order to the web. Technical Report 1999-66, Stanford InfoLab, November 1999.Google Scholar
- D. Park. Concurrency and automata on infinite sequences. Springer, 1981.Google ScholarCross Ref
- Y. Qiu and H.-P. Frei. Concept based query expansion. In SIGIR, pages 160--169, 1993. Google ScholarDigital Library
- C. E. Shannon. A mathematical theory of communication. SIG-MOBILE Mob. Comput. Commun. Rev., 5(1):3--55, Jan. 2001. Google ScholarDigital Library
- D. Vallet and H. Zaragoza. Inferring the most important types of a query: a semantic approach. In SIGIR, pages 857--858, 2008. Google ScholarDigital Library
- X. Wang, X. Ding, A. K. H. Tung, S. Ying, and H. Jin. An efficient graph indexing method. In ICDE, pages 210--221, 2012. Google ScholarDigital Library
- X. Wang and C. Zhai. Mining term association patterns from search logs for effective query reformulation. In CIKM, pages 479--488, 2008. Google ScholarDigital Library
- X. Yan, P. S. Yu, and J. Han. Graph indexing: a frequent structure-based approach. In SIGMOD, pages 335--346, 2004. Google ScholarDigital Library
- P. Zhao and J. Han. On graph query optimization in large networks. VLDB J., 3(1-2):340--351, 2010. Google ScholarDigital Library
- M. M. Zloof. Query by example. In AFIPS NCC, pages 431--438, 1975. Google ScholarDigital Library
Index Terms
- Exemplar queries: give me an example of what you need
Recommendations
Exemplar queries: a new way of searching
Modern search engines employ advanced techniques that go beyond the structures that strictly satisfy the query conditions in an effort to better capture the user intentions. In this work, we introduce a novel query paradigm that considers a user query ...
Approximating expressive queries on graph-modeled data
We present GeX for the approximate matching of complex queries on graph-modeled data.GeX generalizes existing approaches and allows for querying any graph-based datasets.GeX query language supports queries ranging from keyword-based to complex ones.GeX ...
Nearest group queries
SSDBM '13: Proceedings of the 25th International Conference on Scientific and Statistical Database Managementk nearest neighbor (kNN) search is an important problem in a vast number of applications, including clustering, pattern recognition, image retrieval and recommendation systems. It finds k elements from a data source D that are closest to a given query ...
Comments