research-article

SODA: generating SQL for business users

Published: 01 June 2012

Abstract

The purpose of data warehouses is to enable business analysts to make better decisions. Over the years the technology has matured and data warehouses have become extremely successful. As a consequence, more and more data has been added to the data warehouses and their schemas have become increasingly complex. These systems still work well for generating pre-canned reports. However, with their current complexity, they tend to be a poor match for non-tech-savvy business analysts who need answers to ad-hoc queries that were not anticipated.

This paper describes the design and implementation of the SODA system (Search over DAta Warehouse) and our experience with it. SODA bridges the gap between the business needs of analysts and the technical complexity of current data warehouses. SODA enables a Google-like search experience for data warehouses by taking keyword queries of business users and automatically generating executable SQL. The key idea is to use a graph pattern matching algorithm that uses the metadata model of the data warehouse. Our results with real data from a global player in the financial services industry show that SODA produces queries with high precision and recall, and makes it much easier for business users to interactively explore highly complex data warehouses.
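
The general idea of turning keywords into SQL with the help of schema metadata can be illustrated with a toy sketch. This is not SODA's algorithm: the abstract does not give its details, so the code below swaps in a plain breadth-first search over a hand-written foreign-key graph. The tables, join conditions, keyword mapping, and function names are all hypothetical.

```python
from collections import deque

# Hypothetical metadata: business keywords mapped to table names.
KEYWORD_TO_TABLE = {
    "customers": "customer",
    "orders": "orders",
    "products": "product",
}

# Hypothetical foreign-key edges: (table_a, table_b) -> join condition.
JOIN_EDGES = {
    ("customer", "orders"): "customer.id = orders.customer_id",
    ("orders", "product"): "orders.product_id = product.id",
}


def neighbors(table):
    """Yield (adjacent_table, join_condition) pairs for a table."""
    for (a, b), cond in JOIN_EDGES.items():
        if a == table:
            yield b, cond
        elif b == table:
            yield a, cond


def join_path(start, goal):
    """Breadth-first search for the shortest chain of joins from start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        table, path = queue.popleft()
        if table == goal:
            return path
        for nxt, cond in neighbors(table):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(nxt, cond)]))
    return None


def keywords_to_sql(query):
    """Map keywords to tables, connect them through the graph, emit SQL text."""
    tables = []
    for word in query.lower().split():
        table = KEYWORD_TO_TABLE.get(word)
        if table and table not in tables:
            tables.append(table)
    if not tables:
        raise ValueError("no keyword matched the metadata")

    joined = [tables[0]]
    conditions = []
    for target in tables[1:]:
        if target in joined:
            continue
        path = join_path(joined[0], target)
        if path is None:
            raise ValueError(f"no join path to {target}")
        for table, cond in path:
            if table not in joined:
                joined.append(table)
            if cond not in conditions:
                conditions.append(cond)

    sql = "SELECT * FROM " + ", ".join(joined)
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql


print(keywords_to_sql("customers products"))
```

In this sketch the query "customers products" names two tables that are not directly related, so the search pulls in the intermediate orders table to connect them. A real system would also need ranking, filters, and aggregations, none of which are modeled here.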

Published in

Proceedings of the VLDB Endowment, Volume 5, Issue 10
June 2012
180 pages

Publisher: VLDB Endowment
