DOI: 10.1145/2872427.2883080
research-article

Table Cell Search for Question Answering

Published: 11 April 2016

ABSTRACT

Tables are pervasive on the Web. Informative web tables span a wide variety of topics and can naturally serve as a significant resource for satisfying user information needs. Driven by these observations, in this paper we investigate an important yet largely under-addressed problem: given millions of tables, how can we precisely retrieve table cells to answer a user question? This work proposes a novel table cell search framework to attack this problem. We first formulate the concept of a relational chain, which connects two cells in a table and represents the semantic relation between them. With the help of search engine snippets, our framework generates a set of relational chains pointing to potentially correct answer cells. We then employ deep neural networks to conduct more fine-grained inference on which relational chains best match the input question, and finally extract the corresponding answer cells. Based on millions of tables crawled from the Web, we evaluate our framework in the open-domain question answering (QA) setting, using both the well-known WebQuestions dataset and user queries mined from Bing search engine logs. On WebQuestions, our framework is comparable to state-of-the-art QA systems based on knowledge bases (KBs), while on Bing queries it outperforms other systems with a 56.7% relative gain. Moreover, when combined with results from our framework, KB-based QA performance obtains a relative improvement of 28.1% to 66.7%, demonstrating that web tables supply rich knowledge that might not exist, or is difficult to identify, in existing KBs.
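The abstract does not spell out how a relational chain is represented, so the following is only a minimal sketch of one plausible reading. It assumes that the two connected cells lie in the same row and that their column headers stand in for the semantic relation; the `RelationalChain` class, the `candidate_chains` function, and the toy table are hypothetical names and data introduced for illustration. The snippet-based filtering and neural ranking steps are not shown.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RelationalChain:
    """Hypothetical container linking a topic cell to a candidate answer cell."""
    topic_cell: str
    topic_header: str
    answer_header: str
    answer_cell: str


def candidate_chains(header: List[str], rows: List[List[str]], topic: str) -> List[RelationalChain]:
    """Enumerate chains from each cell matching `topic` to the other cells in its row.

    Assumption: matching is a case-insensitive exact string comparison, which is
    far simpler than whatever entity matching a real system would need.
    """
    chains = []
    for row in rows:
        for i, cell in enumerate(row):
            if cell.lower() != topic.lower():
                continue
            for j, other in enumerate(row):
                if j != i:
                    chains.append(RelationalChain(cell, header[i], header[j], other))
    return chains


# Toy table, invented for illustration only.
header = ["Country", "Capital", "Currency"]
rows = [["France", "Paris", "Euro"], ["Japan", "Tokyo", "Yen"]]

chains = candidate_chains(header, rows, "japan")
for chain in chains:
    print(chain)
```

Under these assumptions, a question about Japan yields one candidate chain per remaining column in the matching row; a ranking model would then have to decide which chain (for example, Country to Capital versus Country to Currency) best matches the question.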

Published in

WWW '16: Proceedings of the 25th International Conference on World Wide Web
April 2016
1482 pages
ISBN: 9781450341431

Copyright © 2016 Copyright is held by the International World Wide Web Conference Committee (IW3C2)

Publisher

International World Wide Web Conferences Steering Committee
Republic and Canton of Geneva, Switzerland

Acceptance Rates

WWW '16 Paper Acceptance Rate: 115 of 727 submissions, 16%
Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
