ABSTRACT
Many researchers have used lexical networks and ontologies to mitigate synonymy and polysemy problems in Question Answering (QA) systems, coupled with taggers, query classifiers, and answer extractors in complex and ad hoc ways. We seek to make QA systems reproducible with shared and modest human effort, carefully separating knowledge from algorithms. To this end, we propose an aesthetically "clean" Bayesian inference scheme for exploiting lexical relations in passage scoring for QA. Three factors contribute to the efficacy of Bayesian inference on lexical relations: soft word sense disambiguation; parameter smoothing, which ameliorates the data sparsity problem; and estimation of a joint probability over words, which overcomes the deficiency of naive-Bayes-like approaches. Our system is superior to vector-space ranking techniques from IR, and its accuracy approaches that of the top contenders at the TREC QA tasks in recent years.