Published in: Information Systems Frontiers, Issue 1/2021

28.02.2020

Towards End-to-End Multilingual Question Answering

Authors: Ekaterina Loginova, Stalin Varanasi, Günter Neumann

Abstract

Multilingual question answering (MLQA) is a critical part of an accessible natural language interface. However, current solutions perform far below monolingual systems. We believe that deep learning approaches are likely to improve MLQA performance drastically. This work discusses the current state of the art and the remaining challenges. We outline requirements and suggestions for practical parallel data collection and describe existing methods, benchmarks and datasets. We also demonstrate that simply translating the texts can be inadequate for Arabic, English and German (on the InsuranceQA and SemEval datasets), and that more sophisticated models are therefore required. We hope that our overview will re-ignite interest in multilingual question answering, especially with regard to neural approaches.


Literatur
Zurück zum Zitat Aceves-Pérez, R.M., Montes-y Gómez, M., Villaseñor-Pineda, L. (2007) Enhancing cross-language question answering by combining multiple question translations. In: International Conference on Intelligent Text Processing and Computational Linguistics, Springer, pp 485–493. Aceves-Pérez, R.M., Montes-y Gómez, M., Villaseñor-Pineda, L. (2007) Enhancing cross-language question answering by combining multiple question translations. In: International Conference on Intelligent Text Processing and Computational Linguistics, Springer, pp 485–493.
Zurück zum Zitat Almarwani, N., Diab, M. (2017) GW\_QA at SemEval-2017 task 3: question answer re-ranking on Arabic fora. In: Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pp 344–348. Almarwani, N., Diab, M. (2017) GW\_QA at SemEval-2017 task 3: question answer re-ranking on Arabic fora. In: Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pp 344–348.
Zurück zum Zitat Attia, M., Samih, Y., Elkahky, A., Kallmeyer, L. (2018) Multilingual multi-class sentiment classification using convolutional neural networks. In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC-2018), European Language Resource Association, URL http://aclweb.org/anthology/L18-1101. Attia, M., Samih, Y., Elkahky, A., Kallmeyer, L. (2018) Multilingual multi-class sentiment classification using convolutional neural networks. In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC-2018), European Language Resource Association, URL http://​aclweb.​org/​anthology/​L18-1101.
Zurück zum Zitat Bahdanau, D., Cho, K., Bengio, Y. (2014) Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:14090473. Bahdanau, D., Cho, K., Bengio, Y. (2014) Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:14090473.
Zurück zum Zitat Banerjee, S., Chakma, K., Naskar, S. K., Das, A., Rosso, P., Bandyopadhyay, S., & Choudhury, M. (2016a). Overview of the mixed script information retrieval (msir) at fire-2016. Organization (ORG), 67, 24. Banerjee, S., Chakma, K., Naskar, S. K., Das, A., Rosso, P., Bandyopadhyay, S., & Choudhury, M. (2016a). Overview of the mixed script information retrieval (msir) at fire-2016. Organization (ORG), 67, 24.
Zurück zum Zitat Banerjee, S., Naskar, S. K., Rosso, P., Bandyopadhyay, S. (2016b) The first cross-script code-mixed question answering corpus. In: MultiLingMine@ ECIR, pp 56–65. Banerjee, S., Naskar, S. K., Rosso, P., Bandyopadhyay, S. (2016b) The first cross-script code-mixed question answering corpus. In: MultiLingMine@ ECIR, pp 56–65.
Zurück zum Zitat Barman, U., Das, A., Wagner, J., Foster, J. (2014) Code mixing: A challenge for language identification in the language of social media. In: Proceedings of the first workshop on computational approaches to code switching, pp 13–23. Barman, U., Das, A., Wagner, J., Foster, J. (2014) Code mixing: A challenge for language identification in the language of social media. In: Proceedings of the first workshop on computational approaches to code switching, pp 13–23.
Zurück zum Zitat Bender, E. M. (2009) Linguistically naïve!= language independent: why nlp needs linguistic typology. In: Proceedings of the EACL 2009 Workshop on the Interaction between Linguistics and Computational Linguistics: Virtuous, Vicious or Vacuous?, pp 26–32. Bender, E. M. (2009) Linguistically naïve!= language independent: why nlp needs linguistic typology. In: Proceedings of the EACL 2009 Workshop on the Interaction between Linguistics and Computational Linguistics: Virtuous, Vicious or Vacuous?, pp 26–32.
Zurück zum Zitat Boldrini, E., Ferrández, S., Izquierdo, R., Tomás, D., Vicedo, J. L. (2009) A parallel corpus labeled using open and restricted domain ontologies. In: International Conference on Intelligent Text Processing and Computational Linguistics, Springer, pp 346–356. Boldrini, E., Ferrández, S., Izquierdo, R., Tomás, D., Vicedo, J. L. (2009) A parallel corpus labeled using open and restricted domain ontologies. In: International Conference on Intelligent Text Processing and Computational Linguistics, Springer, pp 346–356.
Zurück zum Zitat Bouma, G., Kloosterman, G., Mur, J., Van Noord, G., Van Der Plas, L., Tiedemann, J. (2007) Question answering with joost at clef 2007. In: Workshop of the Cross-Language Evaluation Forum for European Languages, Springer, pp 257–260. Bouma, G., Kloosterman, G., Mur, J., Van Noord, G., Van Der Plas, L., Tiedemann, J. (2007) Question answering with joost at clef 2007. In: Workshop of the Cross-Language Evaluation Forum for European Languages, Springer, pp 257–260.
Zurück zum Zitat Cabrio, E., Cojan, J., Aprosio, A. P., Magnini, B., Lavelli, A., Gandon, F. (2012) Qakis: an open domain qa system based on relational patterns. In: International Semantic Web Conference, ISWC 2012. Cabrio, E., Cojan, J., Aprosio, A. P., Magnini, B., Lavelli, A., Gandon, F. (2012) Qakis: an open domain qa system based on relational patterns. In: International Semantic Web Conference, ISWC 2012.
Zurück zum Zitat Cabrio, E., Cimiano, P., Lopez, V., Ngomo, A. C. N., Unger, C., Walter, S. (2013) Qald-3: Multilingual question answering over linked data. CLEF (Working Notes) 38. Cabrio, E., Cimiano, P., Lopez, V., Ngomo, A. C. N., Unger, C., Walter, S. (2013) Qald-3: Multilingual question answering over linked data. CLEF (Working Notes) 38.
Zurück zum Zitat Chakma, K., & Das, A. (2016). Cmir: A corpus for evaluation of code mixed information retrieval of hindi-english tweets. Computación y Sistemas, 20(3), 425–434.CrossRef Chakma, K., & Das, A. (2016). Cmir: A corpus for evaluation of code mixed information retrieval of hindi-english tweets. Computación y Sistemas, 20(3), 425–434.CrossRef
Zurück zum Zitat Chandu, K. R., Chinnakotla, M., Black, A. W., Shrivastava, M. (2017) Webshodh: A code mixed factoid question answering system for web. In: International Conference of the Cross-Language Evaluation Forum for European Languages, Springer, pp 104–111. Chandu, K. R., Chinnakotla, M., Black, A. W., Shrivastava, M. (2017) Webshodh: A code mixed factoid question answering system for web. In: International Conference of the Cross-Language Evaluation Forum for European Languages, Springer, pp 104–111.
Zurück zum Zitat Chandu, K., Loginova, E., Gupta, V., van Genabith, J., Neuman, G., Chinnakotla, M., Nyberg E, Black AW (2018) Code-mixed question answering challenge: Crowd-sourcing data and techniques. In: Proceedings of the Third Workshop on Computational Approaches to Linguistic Code-Switching, pp 29–38. Chandu, K., Loginova, E., Gupta, V., van Genabith, J., Neuman, G., Chinnakotla, M., Nyberg E, Black AW (2018) Code-mixed question answering challenge: Crowd-sourcing data and techniques. In: Proceedings of the Third Workshop on Computational Approaches to Linguistic Code-Switching, pp 29–38.
Zurück zum Zitat Chen, M., Zaniolo, C. (2017) Learning multi-faceted knowledge graph embeddings for natural language processing. In: IJCAI, pp 5169–5170. Chen, M., Zaniolo, C. (2017) Learning multi-faceted knowledge graph embeddings for natural language processing. In: IJCAI, pp 5169–5170.
Zurück zum Zitat Chen, G., Chen, C., Xing, Z., Xu, B. (2016) Learning a dual-language vector space for domain-specific cross-lingual question retrieval. In: 2016 31st IEEE/ACM International Conference on Automated Software Engineering (ASE), IEEE, pp 744–755. Chen, G., Chen, C., Xing, Z., Xu, B. (2016) Learning a dual-language vector space for domain-specific cross-lingual question retrieval. In: 2016 31st IEEE/ACM International Conference on Automated Software Engineering (ASE), IEEE, pp 744–755.
Zurück zum Zitat Chen, D., Fisch, A., Weston, J., Bordes, A. (2017) Reading wikipedia to answer open-domain questions. arXiv preprint arXiv:170400051. Chen, D., Fisch, A., Weston, J., Bordes, A. (2017) Reading wikipedia to answer open-domain questions. arXiv preprint arXiv:170400051.
Zurück zum Zitat Choudhury, M., Chittaranjan, G., Gupta, P., Das, A. (2014) Overview of fire 2014 track on transliterated search. Proceedings of FIRE pp 68–89. Choudhury, M., Chittaranjan, G., Gupta, P., Das, A. (2014) Overview of fire 2014 track on transliterated search. Proceedings of FIRE pp 68–89.
Zurück zum Zitat Cimiano, P. (2009) Flexible semantic composition with dudes. In: Proceedings of the Eighth International Conference on Computational Semantics, Association for Computational Linguistics, pp 272–276. Cimiano, P. (2009) Flexible semantic composition with dudes. In: Proceedings of the Eighth International Conference on Computational Semantics, Association for Computational Linguistics, pp 272–276.
Zurück zum Zitat Devlin, J., Chang, M. W., Lee, K., Toutanova, K. (2018) Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:181004805. Devlin, J., Chang, M. W., Lee, K., Toutanova, K. (2018) Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:181004805.
Zurück zum Zitat Du, X., Shao, J., Cardie, C. (2017) Learning to ask: Neural question generation for reading comprehension. arXiv preprint arXiv:170500106. Du, X., Shao, J., Cardie, C. (2017) Learning to ask: Neural question generation for reading comprehension. arXiv preprint arXiv:170500106.
Zurück zum Zitat Feng, M., Xiang, B., Glass, M. R., Wang, L., Zhou, B. (2015) Applying deep learning to answer selection: A study and an open task. In: Automatic Speech Recognition and Understanding (ASRU), 2015 IEEE Workshop on, IEEE, pp 813–820. Feng, M., Xiang, B., Glass, M. R., Wang, L., Zhou, B. (2015) Applying deep learning to answer selection: A study and an open task. In: Automatic Speech Recognition and Understanding (ASRU), 2015 IEEE Workshop on, IEEE, pp 813–820.
Zurück zum Zitat Ferrández, S., Ferrández, A. (2007) The negative effect of machine translation on cross–lingual question answering. In: International Conference on Intelligent Text Processing and Computational Linguistics, Springer, pp 494–505. Ferrández, S., Ferrández, A. (2007) The negative effect of machine translation on cross–lingual question answering. In: International Conference on Intelligent Text Processing and Computational Linguistics, Springer, pp 494–505.
Zurück zum Zitat Ferrandez, O., Spurk, C., Kouylekov, M., Dornescu, I., Ferrandez, S., Negri, M., Izquierdo, R., Tomas, D., Orasan, C., Neumann, G., et al. (2011). The qall-me framework: A specifiable-domain multilingual question answering architecture. Web Semantics: Science, Services and Agents on the World Wide Web, 9(2), 137–145.CrossRef Ferrandez, O., Spurk, C., Kouylekov, M., Dornescu, I., Ferrandez, S., Negri, M., Izquierdo, R., Tomas, D., Orasan, C., Neumann, G., et al. (2011). The qall-me framework: A specifiable-domain multilingual question answering architecture. Web Semantics: Science, Services and Agents on the World Wide Web, 9(2), 137–145.CrossRef
Zurück zum Zitat Forner, P., Peñas, A., Agirre, E., Alegria, I., Forăscu, C., Moreau, N., Osenova, P., Prokopidis, P., Rocha, P., Sacaleanu, B. et al. (2008) Overview of the clef 2008 multilingual question answering track. In: Workshop of the Cross-Language Evaluation Forum for European Languages, Springer, pp 262–295. Forner, P., Peñas, A., Agirre, E., Alegria, I., Forăscu, C., Moreau, N., Osenova, P., Prokopidis, P., Rocha, P., Sacaleanu, B. et al. (2008) Overview of the clef 2008 multilingual question answering track. In: Workshop of the Cross-Language Evaluation Forum for European Languages, Springer, pp 262–295.
Zurück zum Zitat Ganin, Y., Ustinova, E., Ajakan, H., Germain, P., Larochelle, H., Laviolette, F., Marchand, M., & Lempitsky, V. (2016). Domain-adversarial training of neural networks. The Journal of Machine Learning Research, 17(1), 2096–2030. Ganin, Y., Ustinova, E., Ajakan, H., Germain, P., Larochelle, H., Laviolette, F., Marchand, M., & Lempitsky, V. (2016). Domain-adversarial training of neural networks. The Journal of Machine Learning Research, 17(1), 2096–2030.
Zurück zum Zitat Ghosh, S., Ghosh, S., Das, D. (2017) Complexity metric for code-mixed social media text. arXiv preprint arXiv:170701183. Ghosh, S., Ghosh, S., Das, D. (2017) Complexity metric for code-mixed social media text. arXiv preprint arXiv:170701183.
Zurück zum Zitat Glavas, G., Litschko, R., Ruder, S., Vulic, I. (2019) How to (properly) evaluate cross-lingual word embeddings: On strong baselines, comparative analyses, and some misconceptions. arXiv preprint arXiv:190200508. Glavas, G., Litschko, R., Ruder, S., Vulic, I. (2019) How to (properly) evaluate cross-lingual word embeddings: On strong baselines, comparative analyses, and some misconceptions. arXiv preprint arXiv:190200508.
Zurück zum Zitat Gong, Y., Bowman, S. R. (2017) Ruminating reader: Reasoning with gated multi-hop attention. arXiv preprint arXiv:170407415. Gong, Y., Bowman, S. R. (2017) Ruminating reader: Reasoning with gated multi-hop attention. arXiv preprint arXiv:170407415.
Zurück zum Zitat Gupta, P., Bali, K., Banchs, R. E., Choudhury, M., Rosso, P. (2014) Query expansion for mixed-script information retrieval. In: Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval, ACM, pp 677–686. Gupta, P., Bali, K., Banchs, R. E., Choudhury, M., Rosso, P. (2014) Query expansion for mixed-script information retrieval. In: Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval, ACM, pp 677–686.
Zurück zum Zitat Haas, C., Riezler, S. (2015) Response-based learning for machine translation of open-domain database queries. In: Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp 1339–1344. Haas, C., Riezler, S. (2015) Response-based learning for machine translation of open-domain database queries. In: Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp 1339–1344.
Zurück zum Zitat Hadla, L. S., Hailat, T. M., Al-Kabi, M. N. (2014) Evaluating arabic to english machine translation. Editorial Preface 5(11). Hadla, L. S., Hailat, T. M., Al-Kabi, M. N. (2014) Evaluating arabic to english machine translation. Editorial Preface 5(11).
Zurück zum Zitat Hakimov, S., Jebbara, S., Cimiano, P. (2017) Amuse: Multilingual semantic parsing for question answering over linked data. In: International Semantic Web Conference, Springer, pp 329–346. Hakimov, S., Jebbara, S., Cimiano, P. (2017) Amuse: Multilingual semantic parsing for question answering over linked data. In: International Semantic Web Conference, Springer, pp 329–346.
Zurück zum Zitat Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–1780.CrossRef Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–1780.CrossRef
Zurück zum Zitat Höffner, K., Walter, S., Marx, E., Usbeck, R., Lehmann, J., & Ngonga Ngomo, A. C. (2017). Survey on challenges of question answering in the semantic web. Semantic Web, 8(6), 895–920.CrossRef Höffner, K., Walter, S., Marx, E., Usbeck, R., Lehmann, J., & Ngonga Ngomo, A. C. (2017). Survey on challenges of question answering in the semantic web. Semantic Web, 8(6), 895–920.CrossRef
Zurück zum Zitat Honnibal, M., Johnson, M. (2015) An improved non-monotonic transition system for dependency parsing. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp 1373–1378. Honnibal, M., Johnson, M. (2015) An improved non-monotonic transition system for dependency parsing. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp 1373–1378.
Zurück zum Zitat Jaech, A., Mulcaire, G., Hathi, S., Ostendorf, M., Smith, N. A. (2016) Hierarchical character-word models for language identification. arXiv preprint arXiv:160803030. Jaech, A., Mulcaire, G., Hathi, S., Ostendorf, M., Smith, N. A. (2016) Hierarchical character-word models for language identification. arXiv preprint arXiv:160803030.
Zurück zum Zitat Jauhiainen, T., Lui, M., Zampieri, M., Baldwin, T., Lindén, K. (2018) Automatic language identification in texts: A survey. arXiv preprint arXiv:180408186. Jauhiainen, T., Lui, M., Zampieri, M., Baldwin, T., Lindén, K. (2018) Automatic language identification in texts: A survey. arXiv preprint arXiv:180408186.
Zurück zum Zitat Jia, R., Liang, P. (2017) Adversarial examples for evaluating reading comprehension systems. arXiv preprint arXiv:170707328. Jia, R., Liang, P. (2017) Adversarial examples for evaluating reading comprehension systems. arXiv preprint arXiv:170707328.
Zurück zum Zitat Johnson, M., Schuster, M., Le, Q. V., Krikun, M., Wu, Y., Chen, Z., Thorat, N., Viégas, F., Wattenberg, M., Corrado, G. et al. (2016) Google’s multilingual neural machine translation system: enabling zero-shot translation. arXiv preprint arXiv:161104558. Johnson, M., Schuster, M., Le, Q. V., Krikun, M., Wu, Y., Chen, Z., Thorat, N., Viégas, F., Wattenberg, M., Corrado, G. et al. (2016) Google’s multilingual neural machine translation system: enabling zero-shot translation. arXiv preprint arXiv:161104558.
Zurück zum Zitat Joty, S., Nakov, P., Màrquez, L., Jaradat, I. (2017) Cross-language learning with adversarial neural networks: Application to community question answering. arXiv preprint arXiv:170606749. Joty, S., Nakov, P., Màrquez, L., Jaradat, I. (2017) Cross-language learning with adversarial neural networks: Application to community question answering. arXiv preprint arXiv:170606749.
Zurück zum Zitat Joulin, A., Grave, E., Bojanowski, P., Mikolov, T. (2016) Bag of tricks for efficient text classification. arXiv preprint arXiv:160701759. Joulin, A., Grave, E., Bojanowski, P., Mikolov, T. (2016) Bag of tricks for efficient text classification. arXiv preprint arXiv:160701759.
Zurück zum Zitat Kalouli, A. L., Kaiser, K., Hautli-Janisz, A., Kaiser, G. A., Butt, M. (2018) A multilingual approach to question classification. In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC-2018). Kalouli, A. L., Kaiser, K., Hautli-Janisz, A., Kaiser, G. A., Butt, M. (2018) A multilingual approach to question classification. In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC-2018).
Zurück zum Zitat Kim, Y. (2014) Convolutional neural networks for sentence classification. arXiv preprint arXiv:14085882. Kim, Y. (2014) Convolutional neural networks for sentence classification. arXiv preprint arXiv:14085882.
Zurück zum Zitat King, B., Abney, S. (2013) Labeling the languages of words in mixed-language documents using weakly supervised methods. In: Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp 1110–1119. King, B., Abney, S. (2013) Labeling the languages of words in mixed-language documents using weakly supervised methods. In: Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp 1110–1119.
Zurück zum Zitat Kingma, D. P., Ba, J. (2014) Adam: A method for stochastic optimization. arXiv preprint arXiv:14126980. Kingma, D. P., Ba, J. (2014) Adam: A method for stochastic optimization. arXiv preprint arXiv:14126980.
Zurück zum Zitat Lai, S, Xu, L., Liu, K., Zhao, J. (2015) Recurrent convolutional neural networks for text classification. In: Twenty-ninth AAAI conference on artificial intelligence. Lai, S, Xu, L., Liu, K., Zhao, J. (2015) Recurrent convolutional neural networks for text classification. In: Twenty-ninth AAAI conference on artificial intelligence.
Zurück zum Zitat Lample, G., Conneau, A., Denoyer, L., Ranzato, M. (2017) Unsupervised machine translation using monolingual corpora only. arXiv preprint arXiv:171100043. Lample, G., Conneau, A., Denoyer, L., Ranzato, M. (2017) Unsupervised machine translation using monolingual corpora only. arXiv preprint arXiv:171100043.
Zurück zum Zitat Lee, K., Yoon, K., Park, S., Hwang, S. W. (2018) Semi-supervised training data generation for multilingual question answering. In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC-2018). Lee, K., Yoon, K., Park, S., Hwang, S. W. (2018) Semi-supervised training data generation for multilingual question answering. In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC-2018).
Zurück zum Zitat Lui, M., Baldwin, T. (2011) Cross-domain feature selection for language identification. In: Proceedings of 5th international joint conference on natural language processing, pp 553–561. Lui, M., Baldwin, T. (2011) Cross-domain feature selection for language identification. In: Proceedings of 5th international joint conference on natural language processing, pp 553–561.
Zurück zum Zitat Luong, T., Pham, H., Manning, C. D. (2015) Bilingual word representations with monolingual quality in mind. In: Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing, pp 151–159. Luong, T., Pham, H., Manning, C. D. (2015) Bilingual word representations with monolingual quality in mind. In: Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing, pp 151–159.
Zurück zum Zitat Magnini, B., Romagnoli, S., Vallin, A., Herrera, J., Peñas, A., Peinado, V., Verdejo, F., de Rijke, M. (2003a) Creating the disequa corpus: a test set for multilingual question answering. In: Workshop of the Cross-Language Evaluation Forum for European Languages, Springer, pp 487–500. Magnini, B., Romagnoli, S., Vallin, A., Herrera, J., Peñas, A., Peinado, V., Verdejo, F., de Rijke, M. (2003a) Creating the disequa corpus: a test set for multilingual question answering. In: Workshop of the Cross-Language Evaluation Forum for European Languages, Springer, pp 487–500.
Zurück zum Zitat Magnini, B., Romagnoli, S., Vallin, A., Herrera, J., Penas, A., Peinado, V., Verdejo, F., de Rijke, M. (2003b) The multiple language question answering track at clef 2003. In: Workshop of the Cross-Language Evaluation Forum for European Languages, Springer, pp 471–486. Magnini, B., Romagnoli, S., Vallin, A., Herrera, J., Penas, A., Peinado, V., Verdejo, F., de Rijke, M. (2003b) The multiple language question answering track at clef 2003. In: Workshop of the Cross-Language Evaluation Forum for European Languages, Springer, pp 471–486.
Zurück zum Zitat Magnini, B., Vallin, A., Ayache, C., Erbach, G., Peñas, A., De Rijke, M., Rocha, P., Simov, K., Sutcliffe, R. (2004) Overview of the clef 2004 multilingual question answering track. In: Workshop of the Cross-Language Evaluation Forum for European Languages, Springer, pp 371–391. Magnini, B., Vallin, A., Ayache, C., Erbach, G., Peñas, A., De Rijke, M., Rocha, P., Simov, K., Sutcliffe, R. (2004) Overview of the clef 2004 multilingual question answering track. In: Workshop of the Cross-Language Evaluation Forum for European Languages, Springer, pp 371–391.
Zurück zum Zitat Martino, G. D. S., Romeo, S., Barrón-Cedeno, A., Joty, S., Marquez, L., Moschitti, A., Nakov, P. (2017) Cross-language question re-ranking. arXiv preprint arXiv:171001487. Martino, G. D. S., Romeo, S., Barrón-Cedeno, A., Joty, S., Marquez, L., Moschitti, A., Nakov, P. (2017) Cross-language question re-ranking. arXiv preprint arXiv:171001487.
Zurück zum Zitat Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., Dean, J. (2013) Distributed representations of words and phrases and their compositionality. In: Advances in neural information processing systems, pp 3111–3119. Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., Dean, J. (2013) Distributed representations of words and phrases and their compositionality. In: Advances in neural information processing systems, pp 3111–3119.
Zurück zum Zitat Mohammad, S. M., Salameh, M., & Kiritchenko, S. (2016). How translation alters sentiment. Journal of Artificial Intelligence Research, 55, 95–130.CrossRef Mohammad, S. M., Salameh, M., & Kiritchenko, S. (2016). How translation alters sentiment. Journal of Artificial Intelligence Research, 55, 95–130.CrossRef
Zurück zum Zitat Molina, G., AlGhamdi, F., Ghoneim, M., Hawwari, A., Rey-Villamizar, N., Diab, M., Solorio, T. (2016) Overview for the second shared task on language identification in code-switched data. In: Proceedings of the Second Workshop on Computational Approaches to Code Switching, pp 40–49. Molina, G., AlGhamdi, F., Ghoneim, M., Hawwari, A., Rey-Villamizar, N., Diab, M., Solorio, T. (2016) Overview for the second shared task on language identification in code-switched data. In: Proceedings of the Second Workshop on Computational Approaches to Code Switching, pp 40–49.
Zurück zum Zitat Nakov, P., Màrquez, L., Magdy, W., Moschitti, A., Glass, J., Randeree, B. (2015) Semeval-2015 task 3: Answer selection in community question answering. In: Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015), pp 269–281. Nakov, P., Màrquez, L., Magdy, W., Moschitti, A., Glass, J., Randeree, B. (2015) Semeval-2015 task 3: Answer selection in community question answering. In: Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015), pp 269–281.
Zurück zum Zitat Nakov, P., Mãrquez, L., Moschitti, A., Magdy, W., Mubarak, H., Abed Alhakim Freihat, Glass, J., Randeree, B. (2016) Semeval-2016 task 3: Community question answering. In: Bethard S, Cer DM, Carpuat M, Jurgens D, Nakov P, Zesch T (eds) SemEval@NAACL-HLT, The Association for Computer Linguistics, pp 525–545, URL http://dblp.uni-trier.de/db/conf/semeval/semeval2016.htmlNak ovMMMMFGR16. Nakov, P., Mãrquez, L., Moschitti, A., Magdy, W., Mubarak, H., Abed Alhakim Freihat, Glass, J., Randeree, B. (2016) Semeval-2016 task 3: Community question answering. In: Bethard S, Cer DM, Carpuat M, Jurgens D, Nakov P, Zesch T (eds) SemEval@NAACL-HLT, The Association for Computer Linguistics, pp 525–545, URL http://​dblp.​uni-trier.​de/​db/​conf/​semeval/​semeval2016.​htmlNak ovMMMMFGR16.
Zurück zum Zitat Nakov, P., Hoogeveen, D., Màrquez, L., Moschitti, A., Mubarak, H., Baldwin, T., Verspoor, K. (2017) Semeval-2017 task 3: Community question answering. In: Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pp 27–48. Nakov, P., Hoogeveen, D., Màrquez, L., Moschitti, A., Mubarak, H., Baldwin, T., Verspoor, K. (2017) Semeval-2017 task 3: Community question answering. In: Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pp 27–48.
Zurück zum Zitat Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lin, Z., Desmaison, A., Antiga, L., Lerer, A. (2017) Automatic differentiation in pytorch. Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lin, Z., Desmaison, A., Antiga, L., Lerer, A. (2017) Automatic differentiation in pytorch.
Zurück zum Zitat Raghavi, K. C., Chinnakotla, M. K., Shrivastava, M. (2015) Answer ka type kya he?: Learning to classify questions in code-mixed language. In: Proceedings of the 24th International Conference on World Wide Web, ACM, pp 853–858. Raghavi, K. C., Chinnakotla, M. K., Shrivastava, M. (2015) Answer ka type kya he?: Learning to classify questions in code-mixed language. In: Proceedings of the 24th International Conference on World Wide Web, ACM, pp 853–858.
Zurück zum Zitat Rahman, M. M., Hisamoto, S., Duh, K. (2019) Query expansion for cross-language question re-ranking. arXiv preprint arXiv:190407982. Rahman, M. M., Hisamoto, S., Duh, K. (2019) Query expansion for cross-language question re-ranking. arXiv preprint arXiv:190407982.
Zurück zum Zitat Rajpurkar, P., Zhang, J., Lopyrev, K., Liang, P. (2016) Squad: 100,000+ questions for machine comprehension of text. arXiv preprint arXiv:160605250. Rajpurkar, P., Zhang, J., Lopyrev, K., Liang, P. (2016) Squad: 100,000+ questions for machine comprehension of text. arXiv preprint arXiv:160605250.
Zurück zum Zitat Rajpurkar, P., Jia, R., Liang, P. (2018) Know what you don’t know: Unanswerable questions for squad. arXiv preprint arXiv:180603822. Rajpurkar, P., Jia, R., Liang, P. (2018) Know what you don’t know: Unanswerable questions for squad. arXiv preprint arXiv:180603822.
Zurück zum Zitat Riedl, M., Biemann, C. (2016) Unsupervised compound splitting with distributional semantics rivals supervised methods. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp 617–622. Riedl, M., Biemann, C. (2016) Unsupervised compound splitting with distributional semantics rivals supervised methods. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp 617–622.
Zurück zum Zitat Rücklé, A., Swarnkar, K., Gurevych, I. (2019) Improved cross-lingual question retrieval for community question answering. In: The World Wide Web Conference, ACM, pp 3179–3186. Rücklé, A., Swarnkar, K., Gurevych, I. (2019) Improved cross-lingual question retrieval for community question answering. In: The World Wide Web Conference, ACM, pp 3179–3186.
Ruder, S. (2017) A survey of cross-lingual embedding models. arXiv preprint arXiv:1706.04902.
Sacaleanu, B., Neumann, G. (2006) Cross-cutting aspects of cross-language question answering systems. In: Proceedings of the Workshop on Multilingual Question Answering-MLQA’06.
Sasaki, Y., Lin, C. J., Chen, K. h, Chen, H. H. (2007) Overview of the NTCIR-6 cross-lingual question answering task. In: Proceedings of the 6th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, Question Answering and Cross-Lingual Information Access, May 15–18, Citeseer, pp 153–163.
Seo, M., Kembhavi, A., Farhadi, A., Hajishirzi, H. (2016) Bidirectional attention flow for machine comprehension. arXiv preprint arXiv:1611.01603.
Sequiera, R., Choudhury, M., Gupta, P., Rosso, P., Kumar, S., Banerjee, S., Naskar, S. K., Bandyopadhyay, S., Chittaranjan, G., Das, A. et al. (2015) Overview of FIRE-2015 shared task on mixed script information retrieval. In: FIRE Workshops, vol 1587, pp 19–25.
Solorio, T., Blair, E., Maharjan, S., Bethard, S., Diab, M., Ghoneim, M., Hawwari, A., AlGhamdi, F., Hirschberg, J., Chang, A. et al. (2014) Overview for the first shared task on language identification in code-switched data. In: Proceedings of the First Workshop on Computational Approaches to Code Switching, pp 62–72.
Sugiyama, K., Mizukami, M., Neubig, G., Yoshino, K., Sakti, S., Toda, T., Nakamura, S. (2015) An investigation of machine translation evaluation metrics in cross-lingual question answering. In: Proceedings of the Tenth Workshop on Statistical Machine Translation, pp 442–449.
Tan, M., dos Santos, C., Xiang, B., Zhou, B. (2016) Improved representation learning for question answer matching. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), vol 1, pp 464–473.
Tuggener, D. (2016) Incremental coreference resolution for German. PhD thesis, Universität Zürich.
Ture, F., Boschee, E. (2016) Learning to translate for multilingual question answering. arXiv preprint arXiv:1609.08210.
Unger, C., Forascu, C., Lopez, V., Ngomo, A. C. N., Cabrio, E., Cimiano, P., Walter, S. (2014) Question answering over linked data (QALD-4).
Unger, C., Forascu, C., Lopez, V., Ngomo, A. C. N., Cabrio, E., Cimiano, P., Walter, S. (2015) Answering over linked data (QALD-5). In: Working notes for CLEF 2015 conference.
Unger, C., Ngomo, A. C. N., Cabrio, E. (2016) 6th open challenge on question answering over linked data (QALD-6). In: Semantic Web Evaluation Challenge, Springer, pp 171–177.
Upadhyay, S., Faruqui, M., Dyer, C., Roth, D. (2016) Cross-lingual models of word embeddings: An empirical comparison. arXiv preprint arXiv:1604.00425.
Usbeck, R., Ngomo, A. C. N., Haarmann, B., Krithara, A., Röder, M., Napolitano, G. (2017) 7th open challenge on question answering over linked data (QALD-7). In: Semantic Web Evaluation Challenge, Springer, pp 59–69.
Usbeck, R., Gusmita, R. H., Ngomo, A. C. N., Saleem, M. (2018a) 9th challenge on question answering over linked data (QALD-9). In: Semdeep/NLIWoD@ ISWC, pp 58–64.
Usbeck, R., Ngomo, A. C. N., Conrads, F., Röder, M., Napolitano, G. (2018b) 8th challenge on question answering over linked data (QALD-8). language 7:1.
Vallin, A., Magnini, B., Giampiccolo, D., Aunimo, L., Ayache, C., Osenova, P., Peñas, A., De Rijke, M., Sacaleanu, B., Santos, D. et al. (2005) Overview of the CLEF 2005 multilingual question answering track. In: Workshop of the Cross-Language Evaluation Forum for European Languages, Springer, pp 307–331.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., Polosukhin, I. (2017) Attention is all you need. In: Advances in Neural Information Processing Systems, pp 5998–6008.
Veyseh, A. P. B. (2016) Cross-lingual question answering using common semantic space. In: Proceedings of TextGraphs-10: the Workshop on Graph-based Methods for Natural Language Processing, pp 15–19.
Vinyals, O., Fortunato, M., Jaitly, N. (2015) Pointer networks. In: Advances in Neural Information Processing Systems, pp 2692–2700.
Vulić, I., Moens, M. F. (2015) Monolingual and cross-lingual information retrieval models based on (bilingual) word embeddings. In: Proceedings of the 38th international ACM SIGIR conference on research and development in information retrieval, ACM, pp 363–372.
Wang, S., Jiang, J. (2016) Machine comprehension using Match-LSTM and answer pointer. arXiv preprint arXiv:1608.07905.
Wang, Z., Mi, H., Hamza, W., Florian, R. (2016) Multi-perspective context matching for machine comprehension. arXiv preprint arXiv:1612.04211.
Weissenborn, D., Wiese, G., Seiffe, L. (2017) Making neural QA as simple as possible but not simpler. arXiv preprint arXiv:1703.04816.
Xiong, C., Zhong, V., Socher, R. (2016) Dynamic coattention networks for question answering. arXiv preprint arXiv:1611.01604.
Yang, Z., Yang, D., Dyer, C., He, X., Smola, A., Hovy, E. (2016) Hierarchical attention networks for document classification. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp 1480–1489.
Zhang, Y., Riesa, J., Gillick, D., Bakalov, A., Baldridge, J., Weiss, D. (2018) A fast, compact, accurate model for language identification of codemixed text. arXiv preprint arXiv:1810.04142.
Zhou, G., Xie, Z., He, T., Zhao, J., & Hu, X. T. (2016). Learning the multilingual translation representations for question retrieval in community question answering via non-negative matrix factorization. IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), 24(7), 1305–1314.
Metadata
Title
Towards End-to-End Multilingual Question Answering
Authors
Ekaterina Loginova
Stalin Varanasi
Günter Neumann
Publication date
28 February 2020
Publisher
Springer US
Published in
Information Systems Frontiers / Issue 1/2021
Print ISSN: 1387-3326
Electronic ISSN: 1572-9419
DOI
https://doi.org/10.1007/s10796-020-09996-1