Published in: Automated Software Engineering 2/2022

1 November 2022

Grouping related Stack Overflow comments for software developer recommendation

Authors: Viral Sheth, Kostadin Damevski

Abstract

Stack Overflow is a question and answer forum widely used by developers all over the world. Contributors share their knowledge on this platform not only in the form of answers, but also as comments on those answers. With millions of developer-contributed comments, the valuable knowledge contained within them remains difficult for readers to locate. Moreover, Stack Overflow's comment hiding mechanism, which shows only the five most highly voted comments and hides the rest, leads to wealth condensation: votes concentrate on the comments that are already visible. Recently, researchers have observed that Stack Overflow's comment display mechanism hides important and relevant comments and makes it difficult for readers to understand the conversational context, because many comments relate to other, hidden comments. In this paper, we propose a set of features and a machine learning-based technique to identify whether a pair of comments is related. Further, we extend pairwise relatedness to comment clustering, since clusters give readers the entire context of a set of comments that form a single conversational thread. We evaluate our methods against several baselines and show that they provide strong improvements, although the problem in general is made difficult by the short text and narrow topic of discussion in the comments.
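The two steps named in the abstract, scoring pairs of comments for relatedness and then grouping related comments into threads, can be illustrated with a minimal sketch. This is not the authors' method: the paper's features and classifier are not given here, so the sketch assumes a plain token-overlap (Jaccard) score in place of the learned model and groups comments by connected components. The function names, the threshold value, and the sample comments are all hypothetical.

```python
import re
from itertools import combinations


def tokens(text):
    """Lowercased word tokens; a crude stand-in for real features."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def relatedness(a, b):
    """Jaccard overlap of token sets, in [0, 1] (assumed scoring function)."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def cluster_comments(comments, threshold=0.2):
    """Group comment indices into threads.

    Two comments land in the same cluster when a chain of pairwise
    scores at or above `threshold` connects them (union-find).
    """
    parent = list(range(len(comments)))

    def find(i):
        # Follow parent links to the root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(comments)), 2):
        if relatedness(comments[i], comments[j]) >= threshold:
            parent[find(j)] = find(i)

    groups = {}
    for i in range(len(comments)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    sample = [
        "Does this work with Python 3?",
        "Yes, it works with Python 3 since version 2.1",
        "The link in the answer is dead",
        "Here is an archived copy of the dead link",
    ]
    print(cluster_comments(sample))
```

In a learned variant, `relatedness` would be replaced by a classifier's predicted probability for the pair; the clustering step would stay the same. Token overlap alone is likely to be weak on comments this short, which is the difficulty the abstract points out.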

Metadata
Title
Grouping related Stack Overflow comments for software developer recommendation
Authors
Viral Sheth
Kostadin Damevski
Publication date
1 November 2022
Publisher
Springer US
Published in
Automated Software Engineering / Issue 2/2022
Print ISSN: 0928-8910
Electronic ISSN: 1573-7535
DOI
https://doi.org/10.1007/s10515-022-00339-9