Table Cell Search for Question Answering

Authors:
Huan Sun (University of California, Santa Barbara, Santa Barbara, CA, USA)
Hao Ma (Microsoft Research, Redmond, WA, USA)
Xiaodong He (Microsoft Research, Redmond, WA, USA)
Wen-tau Yih (Microsoft Research, Redmond, WA, USA)
Yu Su (University of California, Santa Barbara, Santa Barbara, CA, USA)
Xifeng Yan (University of California, Santa Barbara, Santa Barbara, CA, USA)

WWW '16: Proceedings of the 25th International Conference on World Wide Web, April 2016, Pages 771–782. https://doi.org/10.1145/2872427.2883080

Published: 11 April 2016

ABSTRACT

Tables are pervasive on the Web. Informative web tables span a wide variety of topics and can naturally serve as a significant resource for satisfying user information needs. Motivated by these observations, in this paper we investigate an important yet largely under-addressed problem: given millions of tables, how can we precisely retrieve the table cells that answer a user question? We propose a novel table cell search framework to attack this problem. We first formulate the concept of a relational chain, which connects two cells in a table and represents the semantic relation between them. With the help of search engine snippets, our framework generates a set of relational chains pointing to potentially correct answer cells. We then employ deep neural networks to perform finer-grained inference on which relational chains best match the input question, and finally extract the corresponding answer cells. Using millions of tables crawled from the Web, we evaluate our framework in the open-domain question answering (QA) setting, on both the well-known WebQuestions dataset and user queries mined from Bing search engine logs. On WebQuestions, our framework is comparable to state-of-the-art QA systems based on knowledge bases (KBs), while on Bing queries it outperforms other systems with a 56.7% relative gain. Moreover, when combined with results from our framework, KB-based QA performance obtains a relative improvement of 28.1% to 66.7%, demonstrating that web tables supply rich knowledge that might not exist, or is difficult to identify, in existing KBs.
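
To make the idea of a relational chain more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a chain links a topic cell to another cell in the same row and is described by the two column headers; the abstract does not spell out this representation. The names `RelationalChain`, `generate_chains`, `overlap_score`, and `answer` are hypothetical, and the token-overlap score is only a simple stand-in for the deep neural network matching described above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RelationalChain:
    """Hypothetical chain: topic cell -> topic column -> answer column -> answer cell."""
    topic_cell: str
    topic_column: str
    answer_column: str
    answer_cell: str


def generate_chains(header: List[str], rows: List[List[str]], topic: str) -> List[RelationalChain]:
    """Enumerate chains from every cell equal to the topic to the other cells in its row."""
    chains = []
    for row in rows:
        for i, cell in enumerate(row):
            if cell.lower() != topic.lower():
                continue
            for j, other in enumerate(row):
                if j != i:
                    chains.append(RelationalChain(cell, header[i], header[j], other))
    return chains


def overlap_score(question: str, chain: RelationalChain) -> float:
    """Fraction of answer-column tokens found in the question (stand-in for a neural matcher)."""
    q_tokens = set(question.lower().replace("?", "").split())
    c_tokens = set(chain.answer_column.lower().split())
    return len(q_tokens & c_tokens) / len(c_tokens) if c_tokens else 0.0


def answer(question: str, header: List[str], rows: List[List[str]], topic: str) -> Optional[str]:
    """Return the answer cell of the best-scoring chain, or None if the topic is absent."""
    chains = generate_chains(header, rows, topic)
    if not chains:
        return None
    best = max(chains, key=lambda c: overlap_score(question, c))
    return best.answer_cell


# Toy table, invented purely for illustration.
header = ["Country", "Capital", "Currency"]
rows = [["France", "Paris", "Euro"], ["Japan", "Tokyo", "Yen"]]
print(answer("what is the capital of japan?", header, rows, "Japan"))  # Tokyo
```

In this toy setting the topic entity is given by hand. In the framework described above, candidate chains are generated with the help of search engine snippets over millions of tables, which this sketch does not attempt to reproduce.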

Published in
WWW '16: Proceedings of the 25th International Conference on World Wide Web
April 2016, 1482 pages
ISBN: 9781450341431
General Chairs: Jacqueline Bourdeau (Tele-university (TELUQ), Montreal, QC, Canada), Jim A. Hendler (Rensselaer Polytechnic Institute, Troy, NY, USA), Roger Nkambou (Université du Québec à Montréal, Montreal, QC, Canada)
Program Chairs: Ian Horrocks (University of Oxford, UK), Ben Y. Zhao (University of California at Santa Barbara, CA, USA)
Copyright © 2016 Copyright is held by the International World Wide Web Conference Committee (IW3C2)
Publisher: International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland

Author Tags: knowledge bases, question answering, table cell search, web search

Qualifiers: research-article

Acceptance Rates
WWW '16 Paper Acceptance Rate: 115 of 727 submissions, 16%
Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%