DOI: 10.1145/2736277.2741651
research-article

Open Domain Question Answering via Semantic Enrichment

Published: 18 May 2015

ABSTRACT

Most recent question answering (QA) systems answer a question by querying large-scale knowledge bases (KBs), after parsing the natural language question and transforming it into a KB-executable form (e.g., a logical form). It is well known that KBs are far from complete, so the information required to answer a question may not always exist in a KB. In this paper, we develop a new QA system that mines answers directly from the Web and employs KBs as a significant auxiliary resource to further boost QA performance. Specifically, to the best of our knowledge, we make the first attempt to link answer candidates to entities in Freebase during answer candidate generation. Several advantages follow: (1) Redundancy among answer candidates is automatically reduced. (2) The types of an answer candidate can be determined directly from those of its corresponding entity in Freebase. (3) Capitalizing on the rich information about entities in Freebase, we can develop semantic features for each answer candidate after linking it to Freebase. In particular, we construct answer-type related features with two novel probabilistic models, which directly evaluate the appropriateness of an answer candidate's types under a given question. Overall, such semantic features turn out to play a significant role in identifying the true answers within the large answer candidate pool. The experimental results show that, across two testing datasets, our QA system achieves an 18% to 54% improvement under the F_1 metric compared with various existing QA systems.
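As an illustration of advantages (1) and (2), the following is a minimal sketch of how linking candidate mentions to KB entities can merge redundant surface forms and attach types. It is not the paper's implementation: the lookup table `TOY_KB`, the entity ids, the type names, and the function `link_and_merge` are all hypothetical stand-ins for a real entity linker over Freebase.

```python
from collections import defaultdict

# Hypothetical surface-form -> (entity id, types) table standing in for an
# entity linker backed by a knowledge base. Ids and types are made up.
TOY_KB = {
    "barack obama": ("E1", {"person", "politician"}),
    "obama": ("E1", {"person", "politician"}),
    "president obama": ("E1", {"person", "politician"}),
    "honolulu": ("E2", {"location", "city"}),
}


def link_and_merge(mentions):
    """Group candidate mentions by linked entity and count their support."""
    merged = defaultdict(lambda: {"count": 0, "types": set(), "mentions": []})
    for mention in mentions:
        entry = TOY_KB.get(mention.lower().strip())
        if entry is None:
            continue  # unlinkable mentions are simply dropped in this sketch
        entity_id, types = entry
        slot = merged[entity_id]
        slot["count"] += 1            # redundant surface forms pool their evidence
        slot["types"] |= types        # types come from the linked entity
        slot["mentions"].append(mention)
    return dict(merged)


candidates = ["Obama", "Barack Obama", "Honolulu", "President Obama", "Hawaii"]
result = link_and_merge(candidates)
for entity_id, info in result.items():
    print(entity_id, info["count"], sorted(info["types"]), info["mentions"])
```

In this toy run the three "Obama" variants collapse into one candidate whose support count can feed a ranker, while the unlinked mention "Hawaii" is discarded; a real system would need a policy for candidates that have no matching entity.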

Published in

WWW '15: Proceedings of the 24th International Conference on World Wide Web
May 2015
1460 pages
ISBN: 9781450334693

Copyright © 2015. Copyright is held by the International World Wide Web Conference Committee (IW3C2).

Publisher

International World Wide Web Conferences Steering Committee
Republic and Canton of Geneva, Switzerland

Acceptance Rates

WWW '15 paper acceptance rate: 131 of 929 submissions (14%)
Overall acceptance rate: 1,899 of 8,196 submissions (23%)
