ABSTRACT
An enterprise information worker is often aware of a few example tuples (but not the entire result) that should be present in the output of the query. We study the problem of discovering the minimal project join query that contains the given example tuples in its output. Efficient discovery of such queries is challenging. We propose novel algorithms to solve this problem. Our experiments on real-life datasets show that the proposed solution is significantly more efficient compared with na\"{i}ve adaptations of known techniques.
- S. Agrawal, S. Chaudhuri, and G. Das. Dbxplorer: A system for keyword-based search over relational databases. In ICDE, 2002.Google ScholarDigital Library
- G. Bhalotia, A. Hulgeri, C. Nakhe, S. Chakrabarti, and S. Sudarshan. Keyword searching and browsing in databases using banks. In ICDE, 2002. Google ScholarDigital Library
- L. Blunschi, C. Jossen, D. Kossmann, M. Mori, and K. Stockinger. Soda: Generating sql for business users. PVLDB, 2012. Google ScholarDigital Library
- B. Ding, J. X. Yu, S. Wang, L. Qin, X. Zhang, and X. Lin. Finding top-k min-cost connected trees in databases. In ICDE, 2007.Google ScholarCross Ref
- J. Fan, G. Li, and L. Zhou. Interactive sql query suggestion: Making databases user-friendly. In ICDE, 2011. Google ScholarDigital Library
- K. Golenberg, B. Kimelfeld, and Y. Sagiv. Keyword proximity search in complex data graphs. In SIGMOD, 2008. Google ScholarDigital Library
- D. Golovin and A. Krause. Adaptive submodularity: A new approach to active learning and stochastic optimization. In COLT, 2010.Google Scholar
- D. Golovin and A. Krause. Adaptive submodularity: Theory and applications in active learning and stochastic optimization. JAIR, 2011. Google ScholarDigital Library
- H. He, H. Wang, J. Yang, and P. S. Yu. Blinks: ranked keyword searches on graphs. In SIGMOD, 2007. Google ScholarDigital Library
- V. Hristidis and Y. Papakonstantinou. Discover: Keyword search in relational databases. In VLDB, 2002. Google ScholarDigital Library
- H. V. Jagadish, A. Chapman, A. Elkiss, M. Jayapandian, Y. Li, A. Nandi, and C. Yu. Making database systems usable. In SIGMOD, 2007. Google ScholarDigital Library
- G. Li, B. C. Ooi, J. Feng, J. Wang, and L. Zhou. Ease: an effective 3-in-1 keyword search method for unstructured, semi-structured and structured data. In SIGMOD, 2008. Google ScholarDigital Library
- Z. Liu, S. Parthasarathy, A. Ranganathan, and H. Yang. Near-optimal algorithms for shared filter evaluation in data stream systems. In SIGMOD, 2008. Google ScholarDigital Library
- Y. Luo, X. Lin, W. Wang, and X. Zhou. Spark: top-k keyword query in relational databases. In SIGMOD, 2007. Google ScholarDigital Library
- A. Markowetz, Y. Yang, and D. Papadias. Keyword search on relational data streams. In SIGMOD, 2007. Google ScholarDigital Library
- K. Munagala, U. Srivastava, and J. Widom. Optimization of continuous queries with shared expensive filters. In PODS, 2007. Google ScholarDigital Library
- R. Pimplikar and S. Sarawagi. Answering table queries on the web using column keywords. PVLDB, 2012. Google ScholarDigital Library
- L. Qian, M. J. Cafarella, and H. V. Jagadish. Sample-driven schema mapping. In SIGMOD, 2012. Google ScholarDigital Library
- L. Qin, J. X. Yu, and L. Chang. Keyword search in databases: the power of rdbms. In SIGMOD, 2009. Google ScholarDigital Library
- P. Roy, S. Seshadri, S. Sudarshan, and S. Bhobe. Efficient and extensible algorithms for multi query optimization. In SIGMOD, 2000. Google ScholarDigital Library
- T. K. Sellis. Multiple-query optimization. ACM Trans. Database Syst., 1988. Google ScholarDigital Library
- Q. T. Tran, C.-Y. Chan, and S. Parthasarathy. Query by output. In SIGMOD, 2009. Google ScholarDigital Library
- M. Zhang, H. Elmeleegy, C. M. Procopiuc, and D. Srivastava. Reverse engineering complex join queries. In SIGMOD, 2013. Google ScholarDigital Library
Index Terms
- Discovering queries based on example tuples
Recommendations
S4: Top-k Spreadsheet-Style Search for Query Discovery
SIGMOD '15: Proceedings of the 2015 ACM SIGMOD International Conference on Management of DataAn enterprise information worker is often aware of a few example tuples that should be present in the output of the query. Query discovery systems have been developed to discover project-join queries that contain the given example tuples in their ...
Reverse engineering complex join queries
SIGMOD '13: Proceedings of the 2013 ACM SIGMOD International Conference on Management of DataWe study the following problem: Given a database D with schema G and an output table Out, compute a join query Q that generates OUT from D. A simpler variant allows Q to return a superset of Out. This problem has numerous applications, both by itself, ...
Learning Join Queries from User Examples
Special Issue: Invited 2014 PODS and EDBT Revised ArticlesWe investigate the problem of learning join queries from user examples. The user is presented with a set of candidate tuples and is asked to label them as positive or negative examples, depending on whether or not she would like the tuples as part of ...
Comments