
Open Domain Question Answering via Semantic Enrichment

Authors:
Huan Sun (University of California, Santa Barbara, Goleta, CA, USA)
Hao Ma (Microsoft Research, Redmond, WA, USA)
Wen-tau Yih (Microsoft Research, Redmond, WA, USA)
Chen-Tse Tsai (University of Illinois at Urbana-Champaign, Urbana-Champaign, IL, USA)
Jingjing Liu (Microsoft Research, Redmond, WA, USA)
Ming-Wei Chang (Microsoft Research, Redmond, WA, USA)
WWW '15: Proceedings of the 24th International Conference on World Wide Web, May 2015, Pages 1045–1055
https://doi.org/10.1145/2736277.2741651

Published: 18 May 2015

ABSTRACT

Most recent question answering (QA) systems answer a question by querying large-scale knowledge bases (KBs), after parsing the natural language question and transforming it into a KB-executable form (e.g., a logical form). KBs are well known to be far from complete, so the information required to answer a question may not exist in them. In this paper, we develop a new QA system that mines answers directly from the Web while employing KBs as a significant auxiliary resource to further boost QA performance. Specifically, to the best of our knowledge, we make the first attempt to link answer candidates to entities in Freebase during answer candidate generation. Several advantages follow: (1) Redundancy among answer candidates is automatically reduced. (2) The types of an answer candidate can be determined directly from those of its corresponding entity in Freebase. (3) Capitalizing on the rich information about entities in Freebase, we can develop semantic features for each answer candidate once it is linked. In particular, we construct answer-type related features with two novel probabilistic models, which directly evaluate the appropriateness of an answer candidate's types for a given question. Overall, these semantic features play a significant role in identifying the true answers in the large answer candidate pool. Experimental results show that, across two test datasets, our QA system achieves an 18% to 54% improvement in F_1 compared with various existing QA systems.
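To make the first two advantages concrete, here is a minimal sketch of how linking candidate strings to entities can merge duplicates and expose entity types for scoring. Everything in it is an assumption for illustration: the `TOY_KB` dictionary, the entity ids, the type labels, and the scoring formula are invented, and the simple type-overlap score merely stands in for the paper's two probabilistic answer-type models, which the abstract does not specify.

```python
from collections import defaultdict

# Hypothetical toy lookup: lower-cased surface form -> (entity id, entity types).
# These entries are invented for illustration and are not real Freebase data.
TOY_KB = {
    "barack obama": ("e1", {"person", "politician"}),
    "obama": ("e1", {"person", "politician"}),
    "honolulu": ("e2", {"location", "city"}),
    "hawaii": ("e3", {"location", "state"}),
}


def link_candidates(mentions):
    """Merge (surface string, count) pairs that link to the same entity."""
    counts = defaultdict(int)
    types = {}
    for surface, count in mentions:
        entry = TOY_KB.get(surface.lower())
        if entry is None:
            continue  # this sketch simply drops strings it cannot link
        entity_id, entity_types = entry
        counts[entity_id] += count
        types[entity_id] = entity_types
    return dict(counts), types


def type_overlap(entity_types, expected_types):
    """Share of the expected types that the entity carries (0.0 to 1.0)."""
    if not expected_types:
        return 0.0
    return len(entity_types & expected_types) / len(expected_types)


def rank_candidates(mentions, expected_types):
    """Rank entities by count, boosted by type overlap (an ad hoc score)."""
    counts, types = link_candidates(mentions)
    scored = [
        (entity_id, count * (1.0 + type_overlap(types[entity_id], expected_types)))
        for entity_id, count in counts.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


mentions = [("Obama", 3), ("Barack Obama", 2), ("Honolulu", 4), ("Hawaii", 1)]
ranking = rank_candidates(mentions, expected_types={"person"})
```

In this toy run the two surface strings for the same entity are merged into a single candidate before ranking, which is the redundancy reduction described above, and the entity's types are read off the linked entry rather than inferred from the string.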

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Knowledge representation and reasoning
      1. Logic programming and answer set programming
2. Information systems
  1. Information retrieval
    1. Retrieval models and ranking

Published in
WWW '15: Proceedings of the 24th International Conference on World Wide Web
May 2015
1460 pages
ISBN: 9781450334693
General Chairs:
Aldo Gangemi (National Research Council, Italy & Paris 13 University-CNRS, France)
Stefano Leonardi (Sapienza University of Rome, Italy)
Alessandro Panconesi (Sapienza University of Rome, Italy)
Copyright © 2015 held by the International World Wide Web Conference Committee (IW3C2)
Publisher
International World Wide Web Conferences Steering Committee
Republic and Canton of Geneva, Switzerland
Author Tags
knowledge bases
question answering
web search
Qualifiers
- research-article
Acceptance Rates
WWW '15 paper acceptance rate: 131 of 929 submissions (14%)
Overall acceptance rate: 1,899 of 8,196 submissions (23%)