Published in: Neural Computing and Applications 7/2024

6 December 2023 | Original Article

Similar question retrieval with incorporation of multi-dimensional quality analysis for community question answering

Authors: Yue Liu, Weize Tang, Zitu Liu, Aihua Tang, Lipeng Zhang

Abstract

Semantic question retrieval is an important method for finding similar questions in community question answering (CQA). Its major challenges are polysemy and lexical gaps between questions. In addition, the similar questions returned by a semantic retrieval model may not be of high enough quality to resolve the asker's doubts. To address these challenges, a high-quality, multi-level semantic analysis-based framework for similar question retrieval, named HQML-QR, is proposed. It consists of two parts: question retrieval based on tag-level and sentence-level semantic representations (TS-QR) and multi-dimensional quality analysis (MDQQ). First, TS-QR extracts multi-level semantic features from the question content. A graph embedding model learns the coarse-grained semantics of questions at the tag level, while a pre-trained language model based on the self-attention mechanism identifies polysemy and extracts the fine-grained sentence-level semantics of questions, which keeps question retrieval accurate. Second, based on the quality factors in CQA (i.e., popularity, question, answer and user), MDQQ constructs a multi-dimensional quality evaluation model that provides a reasonable quality measure for questions. Guided by question quality, the similarity score obtained by semantic vector matching is updated so that the retrieved questions are both high-quality and semantically similar. Finally, experiments on the CQADupStack dataset from Stack Overflow show that the P@N of HQML-QR is on average 5.65%, 4.44% and 4.34% higher than that of LDA-VSM-SEM, WET-QR and RCM-QR, respectively.
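The quality-guided update of the similarity score and the P@N metric mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact update formula, so the linear blend below, the weight `alpha`, the assumption that quality scores are normalized to [0, 1], and all function names are hypothetical choices made for illustration, not the authors' method.

```python
import math
from typing import Dict, List, Sequence, Set, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rerank(
    query_vec: Sequence[float],
    candidates: Dict[str, Tuple[Sequence[float], float]],
    alpha: float = 0.7,
) -> List[str]:
    """Rank candidate question ids by a blend of similarity and quality.

    `candidates` maps a question id to (semantic vector, quality score).
    ASSUMPTION: the blend alpha * similarity + (1 - alpha) * quality is
    illustrative only; the abstract does not give the actual formula.
    """
    scored = []
    for qid, (vec, quality) in candidates.items():
        score = alpha * cosine(query_vec, vec) + (1.0 - alpha) * quality
        scored.append((score, qid))
    # Highest score first; ties are broken by id for a deterministic order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [qid for _, qid in scored]


def precision_at_n(ranked: List[str], relevant: Set[str], n: int) -> float:
    """P@N: the fraction of the top-n ranked ids that are relevant."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sum(1 for qid in ranked[:n] if qid in relevant) / n


if __name__ == "__main__":
    # Toy data: 2-d vectors and made-up quality scores.
    query = [1.0, 0.0]
    pool = {
        "a": ([1.0, 0.0], 0.0),  # identical to the query, but low quality
        "b": ([0.8, 0.6], 1.0),  # slightly less similar, high quality
        "c": ([0.0, 1.0], 0.5),  # unrelated to the query
    }
    ranking = rerank(query, pool, alpha=0.7)
    print(ranking)
    print(precision_at_n(ranking, {"b"}, 1))
```

With `alpha=1.0` the ranking reduces to pure semantic matching; lowering `alpha` lets a high-quality question overtake a marginally more similar but low-quality one.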


Metadata
Title
Similar question retrieval with incorporation of multi-dimensional quality analysis for community question answering
Authors
Yue Liu
Weize Tang
Zitu Liu
Aihua Tang
Lipeng Zhang
Publication date
6 December 2023
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 7/2024
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-023-09266-6
