Published in: Journal of Intelligent Information Systems 2/2010

1 April 2010

Answering questions with an n-gram based passage retrieval engine

Authors: Davide Buscaldi, Paolo Rosso, José Manuel Gómez-Soriano, Emilio Sanchis

Abstract

In this paper, we present a Question Answering system based on redundancy and a Passage Retrieval method that is specifically oriented to Question Answering. We assume that, in a sufficiently large document collection, the answer to a given question may appear in several different forms. Therefore, it is possible to find one or more sentences that contain the answer and that also include tokens from the original question. The Passage Retrieval engine is almost language-independent since it is based on n-gram structures. The question classification and answer extraction modules are based on shallow patterns.
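
To make the n-gram idea in the abstract concrete, the following is a minimal sketch of ranking passages by the longest word n-gram they share with the question. It is not the weighting scheme used by JIRS, which the abstract does not describe. The function names (`tokenize`, `longest_shared_ngram`, `rank_passages`), the length-ratio score, and the example question and passages are all assumptions made for illustration.

```python
import re
from typing import List, Set, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase and split into word tokens; no stemming or stopword list,
    which keeps the sketch independent of any particular language."""
    return re.findall(r"\w+", text.lower())


def ngrams(tokens: List[str], n: int) -> Set[Tuple[str, ...]]:
    """All contiguous n-grams of a token list (empty if n > len(tokens))."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def longest_shared_ngram(question: str, passage: str) -> int:
    """Length of the longest question n-gram that also occurs in the passage."""
    q, p = tokenize(question), tokenize(passage)
    for n in range(len(q), 0, -1):  # try the longest n-grams first
        if ngrams(q, n) & ngrams(p, n):
            return n
    return 0


def rank_passages(question: str, passages: List[str]) -> List[Tuple[float, str]]:
    """Score = longest shared n-gram length / question length (an assumed,
    simplified score), sorted from best to worst."""
    q_len = len(tokenize(question))
    scored = [
        (longest_shared_ngram(question, p) / q_len if q_len else 0.0, p)
        for p in passages
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    # Illustrative data only, not taken from the paper.
    question = "Who wrote Don Quixote?"
    passages = [
        "Rain fell all day.",
        "Don Quixote is a Spanish novel.",
        "Cervantes wrote Don Quixote in 1605.",
    ]
    for score, passage in rank_passages(question, passages):
        print(f"{score:.2f}  {passage}")
```

Under this score, a passage that repeats a long stretch of the question verbatim ranks above one that only shares isolated words, which is the intuition behind retrieving sentences that contain both the answer and tokens from the question.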

Footnotes
4
The passage retrieval engine JIRS can be obtained at the following URL: http://sourceforge.net/projects/jirs/.
5
Note that the comma (,) is included in the position count.

Metadata
Title
Answering questions with an n-gram based passage retrieval engine
Authors
Davide Buscaldi
Paolo Rosso
José Manuel Gómez-Soriano
Emilio Sanchis
Publication date
1 April 2010
Publisher
Springer US
Published in
Journal of Intelligent Information Systems / Issue 2/2010
Print ISSN: 0925-9902
Electronic ISSN: 1573-7675
DOI
https://doi.org/10.1007/s10844-009-0082-y