ABSTRACT
Most recent question answering (QA) systems query large-scale knowledge bases (KBs) to answer a question, after parsing and transforming the natural language question into a KB-executable form (e.g., a logical form). It is well known that KBs are far from complete, so the information required to answer a question may not always exist in a KB. In this paper, we develop a new QA system that mines answers directly from the Web while employing KBs as a significant auxiliary resource to further boost QA performance. Specifically, we make what is, to the best of our knowledge, the first attempt to link answer candidates to entities in Freebase during answer candidate generation. Several advantages follow: (1) Redundancy among answer candidates is automatically reduced. (2) The types of an answer candidate can be determined directly from those of its corresponding entity in Freebase. (3) Capitalizing on the rich information about entities in Freebase, we can develop semantic features for each answer candidate after linking it to Freebase. In particular, we construct answer-type related features with two novel probabilistic models, which directly evaluate the appropriateness of an answer candidate's types for a given question. Overall, such semantic features turn out to play a significant role in identifying the true answers in the large answer candidate pool. Experimental results show that, across two test datasets, our QA system achieves an 18% to 54% improvement in F_1 over various existing QA systems.