DOI: 10.1145/2623330.2623677
research-article

Open question answering over curated and extracted knowledge bases

Published: 24 August 2014

ABSTRACT

We consider the problem of open-domain question answering (Open QA) over massive knowledge bases (KBs). Existing approaches use either manually curated KBs like Freebase or KBs automatically extracted from unstructured text. In this paper, we present OQA, the first approach to leverage both curated and extracted KBs.

A key technical challenge is designing systems that are robust to the high variability in both natural language questions and massive KBs. OQA achieves robustness by decomposing the full Open QA problem into smaller sub-problems including question paraphrasing and query reformulation. OQA solves these sub-problems by mining millions of rules from an unlabeled question corpus and across multiple KBs. OQA then learns to integrate these rules by performing discriminative training on question-answer pairs using a latent-variable structured perceptron algorithm. We evaluate OQA on three benchmark question sets and demonstrate that it achieves up to twice the precision and recall of a state-of-the-art Open QA system.
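The training idea named above, a latent-variable structured perceptron, can be illustrated with a minimal sketch. This is not OQA's implementation: the derivation dictionaries, the feature names, and the toy question are hypothetical, and the abstract does not specify the feature set or the inference procedure. The sketch assumes each question comes with a small list of candidate derivations (the latent variable), each carrying an answer and a sparse feature vector. When the top-scoring derivation gives a wrong answer, the weights move toward the best-scoring derivation that gives a correct one.

```python
from collections import defaultdict


def score(weights, features):
    """Dot product of a sparse feature dict with the weight vector."""
    return sum(weights[name] * value for name, value in features.items())


def best(weights, derivations):
    """Highest-scoring derivation; ties go to the earliest candidate."""
    return max(derivations, key=lambda d: score(weights, d["features"]))


def train(examples, epochs=5):
    """Latent-variable perceptron over (derivations, gold_answers) pairs."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for derivations, gold_answers in examples:
            predicted = best(weights, derivations)
            if predicted["answer"] in gold_answers:
                continue  # already correct, no update
            correct = [d for d in derivations if d["answer"] in gold_answers]
            if not correct:
                continue  # no derivation reaches a gold answer; skip
            # The derivation is latent, so pick the best correct one
            # under the current weights as the update target.
            target = best(weights, correct)
            for name, value in target["features"].items():
                weights[name] += value
            for name, value in predicted["features"].items():
                weights[name] -= value
    return weights


# Hypothetical toy data: one question with two candidate derivations.
# Feature names are invented for illustration only.
TOY_EXAMPLES = [
    (
        [
            {"answer": "Lyon",
             "features": {"rule:paraphrase_a": 1.0, "kb:extracted": 1.0}},
            {"answer": "Paris",
             "features": {"rule:paraphrase_b": 1.0, "kb:curated": 1.0}},
        ],
        {"Paris"},
    ),
]

if __name__ == "__main__":
    learned = train(TOY_EXAMPLES)
    print(best(learned, TOY_EXAMPLES[0][0])["answer"])
```

In this toy run the wrong derivation wins the initial tie, a single update shifts weight onto the features of the correct derivation, and later epochs make no further changes.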

Supplemental Material

p1156-sidebyside.mp4 (mp4, 306.6 MB)


Published in

KDD '14: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining
August 2014
2028 pages
ISBN: 9781450329569
DOI: 10.1145/2623330

Copyright © 2014 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

KDD '14 paper acceptance rate: 151 of 1,036 submissions (15%)
Overall acceptance rate: 1,133 of 8,635 submissions (13%)
