
2019 | OriginalPaper | Chapter

Efficient Answer-Annotation for Frequent Questions

Authors : Markus Zlabinger, Navid Rekabsaz, Stefan Zlabinger, Allan Hanbury

Published in: Experimental IR Meets Multilinguality, Multimodality, and Interaction

Publisher: Springer International Publishing


Abstract

Ground truth is a crucial resource for building effective question-answering (Q-A) systems. When no suitable ground truth is available, as is often the case for domain-specific Q-A systems (e.g. customer support, tourism) or for languages other than English, new ground truth can be created through human annotation. However, the annotation process in which a human annotator looks up the corresponding answer label for each question from an answer catalog (the Sequential approach) is usually time-consuming and costly. In this paper, we propose a new approach in which the annotator first manually groups questions that share the same intent as a candidate question, and then labels the entire group in one step (the Group-Wise approach). To retrieve same-intent questions effectively, we evaluate various unsupervised semantic similarity methods from the recent literature and implement the most effective one in our annotation approach. We then compare the Group-Wise approach with the Sequential approach with respect to answer look-ups, annotation time, and label quality. Based on 500 German customer-support questions, we show that the Group-Wise approach requires 51% fewer answer look-ups, is 41% more time-efficient, and retains the same label quality as the Sequential approach. Note that the described approach is limited to Q-A systems in which frequently asked questions occur.
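The retrieval step at the heart of the Group-Wise approach (finding same-intent candidates for a given question) can be sketched as follows. This is a minimal, hypothetical stand-in using token-overlap cosine similarity; the abstract does not say which of the evaluated unsupervised semantic similarity methods was the most effective, so the `cosine` scoring function and the `threshold` value here are illustrative assumptions, not the paper's method.

```python
import math
from collections import Counter

def cosine(a, b):
    """Token-overlap cosine similarity between two questions (a crude
    stand-in for the unsupervised semantic similarity methods evaluated
    in the paper)."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    num = sum(ca[t] * cb[t] for t in set(ca) & set(cb))
    den = (math.sqrt(sum(v * v for v in ca.values()))
           * math.sqrt(sum(v * v for v in cb.values())))
    return num / den if den else 0.0

def same_intent_candidates(query_idx, pool, threshold=0.3):
    """Return indices of questions in `pool` similar enough to
    pool[query_idx] to be offered to the annotator for grouping,
    ranked by descending similarity."""
    q = pool[query_idx]
    ranked = sorted(
        ((j, cosine(q, p)) for j, p in enumerate(pool) if j != query_idx),
        key=lambda t: -t[1],
    )
    return [j for j, s in ranked if s >= threshold]

questions = [
    "how do i reset my password",
    "i forgot my password how can i reset it",
    "what are your opening hours",
]
```

The annotator would then confirm or reject each retrieved candidate and assign one answer label to the whole confirmed group, replacing one answer look-up per question with one look-up per group.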


Footnotes

2. Questions are translated from German.
Metadata
Title
Efficient Answer-Annotation for Frequent Questions
Authors
Markus Zlabinger
Navid Rekabsaz
Stefan Zlabinger
Allan Hanbury
Copyright Year
2019
DOI
https://doi.org/10.1007/978-3-030-28577-7_8
