research-article

Keyword++: a framework to improve keyword search over entity databases

Published: 01 September 2010

Abstract

Keyword search over entity databases (e.g., product or movie databases) is an important problem. Current techniques for keyword search on databases often return incomplete and imprecise results. On the one hand, they require either that relevant entities contain all (or most) of the query keywords, or that relevant entities and the query keywords occur together in several documents from a known collection. Many user queries satisfy neither requirement, so their results are likely to be incomplete: highly relevant entities may not be returned. On the other hand, even when a returned entity contains all (or most) of the query keywords, the keywords may be used with a different intent in the query than in the entity, so the results can also be imprecise.

To remedy these problems, we propose a general framework that improves an existing search interface by translating a keyword query into a structured query. Specifically, we leverage the associations between keywords and attribute values that can be discovered in the results returned by the original search interface. We show empirically that the translated structured queries alleviate both problems.
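The idea of translating keywords into attribute-value predicates can be illustrated with a small sketch. It is not the algorithm from the paper, whose details are not given in this abstract. It assumes a toy in-memory entity table, a stand-in `baseline_search` that requires every keyword to appear in the entity name, and a simple scoring rule (how much an attribute value's share rises among a keyword's results compared with the whole table). All names, data, and the `min_lift` threshold are hypothetical.

```python
from collections import Counter

# Hypothetical toy entity table; attributes and values are illustrative only.
ENTITIES = [
    {"name": "Aero 13 slim laptop", "brand": "Aero", "weight": "light"},
    {"name": "Aero 15 laptop", "brand": "Aero", "weight": "heavy"},
    {"name": "Bolt small laptop", "brand": "Bolt", "weight": "light"},
    {"name": "Bolt pro laptop", "brand": "Bolt", "weight": "heavy"},
    {"name": "Cirrus travel small laptop", "brand": "Cirrus", "weight": "light"},
]
ATTRIBUTES = ("brand", "weight")


def name_contains(entity, words):
    """True if every word occurs as a token of the entity name."""
    tokens = entity["name"].lower().split()
    return all(w in tokens for w in words)


def baseline_search(query):
    """Stand-in for the existing interface: all keywords must match the name."""
    return [e for e in ENTITIES if name_contains(e, query.lower().split())]


def value_share(entities, attribute):
    """Fraction of the given entities that carry each value of the attribute."""
    counts = Counter(e[attribute] for e in entities)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


def associate(keyword, min_lift=0.25):
    """Pick the (attribute, value) whose share rises most in the keyword's results.

    Assumed scoring rule: share among results minus share in the whole table.
    Returns None when no value rises by more than min_lift.
    """
    results = baseline_search(keyword)
    if not results:
        return None
    best, best_lift = None, min_lift
    for attribute in ATTRIBUTES:
        background = value_share(ENTITIES, attribute)
        for value, share in value_share(results, attribute).items():
            lift = share - background[value]
            if lift > best_lift:
                best, best_lift = (attribute, value), lift
    return best


def translate(query):
    """Split a keyword query into attribute predicates and leftover keywords."""
    predicates, leftover = {}, []
    for keyword in query.lower().split():
        association = associate(keyword)
        if association is None:
            leftover.append(keyword)
        else:
            predicates[association[0]] = association[1]
    return predicates, leftover


def structured_search(query):
    """Run the translated query: predicates on attributes, leftovers on the name."""
    predicates, leftover = translate(query)
    return [
        e for e in ENTITIES
        if all(e[a] == v for a, v in predicates.items()) and name_contains(e, leftover)
    ]


if __name__ == "__main__":
    print(translate("small laptop"))
    print([e["name"] for e in structured_search("small laptop")])
```

In this toy setting the keyword "small" never appears in the name of the slim Aero model, so the baseline misses it, while the translated query reaches it through the `weight` attribute. A real system would need a more robust association measure than a raw share difference.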


Published in

Proceedings of the VLDB Endowment, Volume 3, Issue 1-2
September 2010
1658 pages

Publisher

VLDB Endowment
