Published in: Empirical Software Engineering 4/2023

1 July 2023

BTLink: automatic link recovery between issues and commits based on pre-trained BERT model

Written by: Jinpeng Lan, Lina Gong, Jingxuan Zhang, Haoxiang Zhang

Abstract

Data traceability in software development connects different software artifacts to enhance the observability of developer practices. In particular, traceability links between issues and commits (i.e., issue-commit links) play a key role in software maintenance tasks (e.g., bug localization and bug prediction). In practice, developers typically create issue-commit links manually by adding the issue identifier to the message of the corresponding commit, so missing issue-commit links are prevalent in software projects. To recover the missing issue-commit links, previous studies have proposed automatic approaches. However, because of the gap between heuristic rules and real-world behavior, as well as insufficient semantic understanding, these approaches cannot achieve the expected performance. Since the text in issues and commits contains highly related information, thorough text understanding can improve the recovery of traceability links. Meanwhile, pre-trained models (PTMs) have been successfully used to explore the semantic information of text in various software engineering tasks (e.g., code generation). Therefore, our study proposes a novel BERT-based method (i.e., BTLink) that employs pre-trained models to automatically recover issue-commit links. BTLink comprises a BERT embedding layer, a fusion layer, and a classifier layer. First, we build two pre-trained BERT encoders that explore the feature representation of the issue text in combination with the commit code and the commit text, respectively. Then we build the fusion layer to examine the joint feature vector. Finally, we build the classifier layer to identify the links between issues and commits.
In addition, to further our investigation and verify the effectiveness of BTLink, we conduct an extensive case study on 12 issue-commit link datasets from open source software projects and observe that: (i) compared to state-of-the-art approaches, BTLink improves the performance of automatic issue-commit link recovery on all studied measures; (ii) both text and code information in the issues and commits help recover more accurate issue-commit links; (iii) BTLink is more applicable to the cross-project context than state-of-the-art approaches.
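
The three-layer data flow described in the abstract (two encoders, a fusion layer, a classifier layer) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: `toy_encode` is a hashing stand-in for a pre-trained BERT encoder, fusion is assumed to be plain concatenation, the classifier is assumed to be a single logistic unit, and the weights are untrained placeholders. All function names are hypothetical.

```python
import hashlib
import math

DIM = 8  # toy embedding size, chosen only to keep the example small


def toy_encode(text_a: str, text_b: str, dim: int = DIM) -> list[float]:
    """Stand-in for a pre-trained BERT encoder applied to a text pair.

    Hashes each whitespace token into one of `dim` buckets and L2-normalises
    the counts. This is NOT BERT; it only produces a fixed-size vector.
    """
    vec = [0.0] * dim
    for token in (text_a + " [SEP] " + text_b).lower().split():
        bucket = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def fuse(text_vec: list[float], code_vec: list[float]) -> list[float]:
    """Fusion layer, assumed here to be simple concatenation."""
    return text_vec + code_vec


def classify(joint_vec: list[float], weights: list[float], bias: float) -> float:
    """Classifier layer, assumed here to be one logistic unit."""
    z = sum(w * x for w, x in zip(weights, joint_vec)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def link_probability(issue_text: str, commit_message: str, commit_diff: str,
                     weights: list[float], bias: float) -> float:
    """Score one (issue, commit) pair as a candidate link."""
    # Encoder 1: issue text paired with the commit text (message).
    text_vec = toy_encode(issue_text, commit_message)
    # Encoder 2: issue text paired with the commit code (diff).
    code_vec = toy_encode(issue_text, commit_diff)
    return classify(fuse(text_vec, code_vec), weights, bias)


if __name__ == "__main__":
    weights = [0.1] * (2 * DIM)  # untrained placeholder weights
    p = link_probability(
        "NullPointerException when config file is missing",
        "Guard against missing config file",
        "+ if (config == null) { return defaults(); }",
        weights,
        0.0,
    )
    print(f"link probability (untrained): {p:.3f}")
```

In a real setting the two encoders would be fine-tuned pre-trained models and the classifier weights would be learned from labelled issue-commit pairs; the sketch only shows how the two pair encodings are combined into one score.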

Zurück zum Zitat Qasim R, Bangyal WH, Alqarni MA, Ali Almazroi A (2022) A fine-tuned bert-based transfer learning approach for text classification. J Healthc Eng 2022 Qasim R, Bangyal WH, Alqarni MA, Ali Almazroi A (2022) A fine-tuned bert-based transfer learning approach for text classification. J Healthc Eng 2022
Zurück zum Zitat Qiu X, Sun T, Xu Y, Shao Y, Dai N, Huang X (2020) Pre-trained models for natural language processing: A survey. Sci China Technol Sci 63(10):1872–1897CrossRef Qiu X, Sun T, Xu Y, Shao Y, Dai N, Huang X (2020) Pre-trained models for natural language processing: A survey. Sci China Technol Sci 63(10):1872–1897CrossRef
Zurück zum Zitat Rahman F, Posnett D, Herraiz I, Devanbu P (2013) Sample size vs. bias in defect prediction. In: Proceedings of the 2013 9th joint meeting on foundations of software engineering, pp 147–157 Rahman F, Posnett D, Herraiz I, Devanbu P (2013) Sample size vs. bias in defect prediction. In: Proceedings of the 2013 9th joint meeting on foundations of software engineering, pp 147–157
Zurück zum Zitat Ramos J, et al (2003) Using tf-idf to determine word relevance in document queries. In: Proceedings of the first instructional conference on machine learning, vol 242. Citeseer, pp 29–48 Ramos J, et al (2003) Using tf-idf to determine word relevance in document queries. In: Proceedings of the first instructional conference on machine learning, vol 242. Citeseer, pp 29–48
Zurück zum Zitat Raulji JK, Saini JR (2016) Stop-word removal algorithm and its implementation for sanskrit language. Int J Comput Appl 150(2):15–17 Raulji JK, Saini JR (2016) Stop-word removal algorithm and its implementation for sanskrit language. Int J Comput Appl 150(2):15–17
Zurück zum Zitat Ruan H, Chen B, Peng X, Zhao W (2019) Deeplink: Recovering issue-commit links based on deep learning. J Syst Softws 158:110406CrossRef Ruan H, Chen B, Peng X, Zhao W (2019) Deeplink: Recovering issue-commit links based on deep learning. J Syst Softws 158:110406CrossRef
Zurück zum Zitat Santos EA, Hindle A (2016) Judging a commit by its cover; or can a commit message predict build failure? PeerJ Prepr 4:e1771v1 Santos EA, Hindle A (2016) Judging a commit by its cover; or can a commit message predict build failure? PeerJ Prepr 4:e1771v1
Zurück zum Zitat Scanniello G, Marcus A, Pascale D (2015) Link analysis algorithms for static concept location: an empirical assessment. Empir Softw Eng 20(6):1666–1720CrossRef Scanniello G, Marcus A, Pascale D (2015) Link analysis algorithms for static concept location: an empirical assessment. Empir Softw Eng 20(6):1666–1720CrossRef
Zurück zum Zitat Sellam T, Yadlowsky S, Wei J, Saphra N, D’Amour A, Linzen T, Bastings J, Turc I, Eisenstein J, Das D, et al (2021) The multiberts: Bert reproductions for robustness analysis. arXiv preprint arXiv:2106.16163 Sellam T, Yadlowsky S, Wei J, Saphra N, D’Amour A, Linzen T, Bastings J, Turc I, Eisenstein J, Das D, et al (2021) The multiberts: Bert reproductions for robustness analysis. arXiv preprint arXiv:​2106.​16163
Zurück zum Zitat Selva Birunda S, Kanniga Devi R (2021) A review on word embedding techniques for text classification. Innovative Data Communication Technologies and Application, pp 267–281 Selva Birunda S, Kanniga Devi R (2021) A review on word embedding techniques for text classification. Innovative Data Communication Technologies and Application, pp 267–281
Zurück zum Zitat Shi E, Wang Y, Du L, Chen J, Han S, Zhang H, Zhang D, Sun H (2022) On the evaluation of neural code summarization. In: Proceedings of the 44th International Conference on Software Engineering, pp 1597–1608 Shi E, Wang Y, Du L, Chen J, Han S, Zhang H, Zhang D, Sun H (2022) On the evaluation of neural code summarization. In: Proceedings of the 44th International Conference on Software Engineering, pp 1597–1608
Zurück zum Zitat Song Y, Wang J, Liang Z, Liu Z, Jiang T (2020) Utilizing bert intermediate layers for aspect based sentiment analysis and natural language inference. arXiv preprint arXiv:2002.04815 Song Y, Wang J, Liang Z, Liu Z, Jiang T (2020) Utilizing bert intermediate layers for aspect based sentiment analysis and natural language inference. arXiv preprint arXiv:​2002.​04815
Zurück zum Zitat Spadini D, Aniche M, Bacchelli A (2018) Pydriller: Python framework for mining software repositories. In: Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Association for Computing Machinery, New York, NY, USA, ESEC/FSE 2018, p 908–911 Spadini D, Aniche M, Bacchelli A (2018) Pydriller: Python framework for mining software repositories. In: Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Association for Computing Machinery, New York, NY, USA, ESEC/FSE 2018, p 908–911
Zurück zum Zitat Su J, Cao J, Liu W, Ou Y (2021) Whitening sentence representations for better semantics and faster retrieval. arXiv preprint arXiv:2103.15316 Su J, Cao J, Liu W, Ou Y (2021) Whitening sentence representations for better semantics and faster retrieval. arXiv preprint arXiv:​2103.​15316
Zurück zum Zitat Sun Y, Wang Q, Yang Y (2017) Frlink: Improving the recovery of missing issue-commit links by revisiting file relevance. Inf Softw Technol 84:33–47CrossRef Sun Y, Wang Q, Yang Y (2017) Frlink: Improving the recovery of missing issue-commit links by revisiting file relevance. Inf Softw Technol 84:33–47CrossRef
Zurück zum Zitat Sun Y, Chen C, Wang Q, Boehm B (2017b) Improving missing issue-commit link recovery using positive and unlabeled data. In: 2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE), IEEE, pp 147–152 Sun Y, Chen C, Wang Q, Boehm B (2017b) Improving missing issue-commit link recovery using positive and unlabeled data. In: 2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE), IEEE, pp 147–152
Zurück zum Zitat Sun C, Qiu X, Xu Y, Huang X (2019) How to fine-tune bert for text classification? In: China national conference on Chinese computational linguistics, Springer, pp 194–206 Sun C, Qiu X, Xu Y, Huang X (2019) How to fine-tune bert for text classification? In: China national conference on Chinese computational linguistics, Springer, pp 194–206
Zurück zum Zitat Sun Y, Wang Q, Li M (2016) Understanding the contribution of non-source documents in improving missing link recovery: An empirical study. In: Proceedings of the 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, pp 1–10 Sun Y, Wang Q, Li M (2016) Understanding the contribution of non-source documents in improving missing link recovery: An empirical study. In: Proceedings of the 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, pp 1–10
Zurück zum Zitat Tantithamthavorn C, McIntosh S, Hassan AE, Matsumoto K (2018) The impact of automated parameter optimization on defect prediction models. IEEE Trans Softw Eng 45(7):683–711CrossRef Tantithamthavorn C, McIntosh S, Hassan AE, Matsumoto K (2018) The impact of automated parameter optimization on defect prediction models. IEEE Trans Softw Eng 45(7):683–711CrossRef
Zurück zum Zitat Tao W, Wang Y, Shi E, Du L, Han S, Zhang H, Zhang D, Zhang W (2022) A large-scale empirical study of commit message generation: models, datasets and evaluation. Empir Softw Eng 27(7):198CrossRef Tao W, Wang Y, Shi E, Du L, Han S, Zhang H, Zhang D, Zhang W (2022) A large-scale empirical study of commit message generation: models, datasets and evaluation. Empir Softw Eng 27(7):198CrossRef
Zurück zum Zitat Tian H, Liu K, Kaboré AK, Koyuncu A, Li L, Klein J, Bissyandé TF (2020) Evaluating representation learning of code changes for predicting patch correctness in program repair. In: 2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE), IEEE, pp 981–992 Tian H, Liu K, Kaboré AK, Koyuncu A, Li L, Klein J, Bissyandé TF (2020) Evaluating representation learning of code changes for predicting patch correctness in program repair. In: 2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE), IEEE, pp 981–992
Zurück zum Zitat Vasilescu B, Filkov V, Serebrenik A (2015) Perceptions of diversity on git hub: A user survey. In: 2015 IEEE/ACM 8th International Workshop on Cooperative and Human Aspects of Software Engineering, IEEE, pp 50–56 Vasilescu B, Filkov V, Serebrenik A (2015) Perceptions of diversity on git hub: A user survey. In: 2015 IEEE/ACM 8th International Workshop on Cooperative and Human Aspects of Software Engineering, IEEE, pp 50–56
Zurück zum Zitat Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. Adv Neural Inf Processing Syst 30 Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. Adv Neural Inf Processing Syst 30
Zurück zum Zitat Vieira R, da Silva A, Rocha L, Gomes JP (2019) From reports to bug-fix commits: A 10 years dataset of bug-fixing activity from 55 apache’s open source projects. In: Proceedings of the Fifteenth International Conference on Predictive Models and Data Analytics in Software Engineering, pp 80–89 Vieira R, da Silva A, Rocha L, Gomes JP (2019) From reports to bug-fix commits: A 10 years dataset of bug-fixing activity from 55 apache’s open source projects. In: Proceedings of the Fifteenth International Conference on Predictive Models and Data Analytics in Software Engineering, pp 80–89
Zurück zum Zitat Viera AJ, Garrett JM et al (2005) Understanding interobserver agreement: the kappa statistic. Fam Med 37(5):360–363 Viera AJ, Garrett JM et al (2005) Understanding interobserver agreement: the kappa statistic. Fam Med 37(5):360–363
Zurück zum Zitat Wang Y, Sun Y, Ma Z, Gao L, Xu Y, Sun T (2020) Application of pre-training models in named entity recognition. In: 2020 12th International Conference on Intelligent Human-Machine Systems and Cybernetics (IHMSC), vol 1. IEEE, pp 23–26 Wang Y, Sun Y, Ma Z, Gao L, Xu Y, Sun T (2020) Application of pre-training models in named entity recognition. In: 2020 12th International Conference on Intelligent Human-Machine Systems and Cybernetics (IHMSC), vol 1. IEEE, pp 23–26
Zurück zum Zitat Wang Y, Wang W, Joty S, Hoi SC (2021) Codet5: Identifier-aware unified pre-trained encoder-decoder models for code understanding and generation. arXiv preprint arXiv:2109.00859 Wang Y, Wang W, Joty S, Hoi SC (2021) Codet5: Identifier-aware unified pre-trained encoder-decoder models for code understanding and generation. arXiv preprint arXiv:​2109.​00859
Zurück zum Zitat Woolson RF (2007) Wilcoxon signed-rank test. Wiley encyclopedia of clinical trials, pp 1–3 Woolson RF (2007) Wilcoxon signed-rank test. Wiley encyclopedia of clinical trials, pp 1–3
Zurück zum Zitat Wu R, Wen M, Cheung SC, Zhang H (2018) Changelocator: locate crash-inducing changes based on crash reports. Empir Softw Eng 23:2866–2900CrossRef Wu R, Wen M, Cheung SC, Zhang H (2018) Changelocator: locate crash-inducing changes based on crash reports. Empir Softw Eng 23:2866–2900CrossRef
Zurück zum Zitat Wu R, Zhang H, Kim S, Cheung SC (2011) Relink: recovering links between bugs and changes. In: Proceedings of the 19th ACM SIGSOFT symposium and the 13th European conference on Foundations of software engineering, pp 15–25 Wu R, Zhang H, Kim S, Cheung SC (2011) Relink: recovering links between bugs and changes. In: Proceedings of the 19th ACM SIGSOFT symposium and the 13th European conference on Foundations of software engineering, pp 15–25
Zurück zum Zitat Yang G, Zhou Y, Yu C, Chen X (2021) Deepscc: Source code classification based on fine-tuned roberta. arXiv preprint arXiv:2110.00914 Yang G, Zhou Y, Yu C, Chen X (2021) Deepscc: Source code classification based on fine-tuned roberta. arXiv preprint arXiv:​2110.​00914
Zurück zum Zitat Yogish D, Manjunath T, Hegadi RS (2018) Review on natural language processing trends and techniques using nltk. In: International Conference on Recent Trends in Image Processing and Pattern Recognition, Springer, pp 589–606 Yogish D, Manjunath T, Hegadi RS (2018) Review on natural language processing trends and techniques using nltk. In: International Conference on Recent Trends in Image Processing and Pattern Recognition, Springer, pp 589–606
Zurück zum Zitat Zhang Z, Li Y, Wang J, Liu B, Li D, Guo Y, Chen X, Liu Y (2022) Remos: Reducing defect inheritance in transfer learning via relevant model slicing. In: 2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE), IEEE, pp 1856–1868 Zhang Z, Li Y, Wang J, Liu B, Li D, Guo Y, Chen X, Liu Y (2022) Remos: Reducing defect inheritance in transfer learning via relevant model slicing. In: 2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE), IEEE, pp 1856–1868
Zurück zum Zitat Zhang Y, Wallace B (2015) A sensitivity analysis of (and practitioners’ guide to) convolutional neural networks for sentence classification. arXiv preprint arXiv:1510.03820 Zhang Y, Wallace B (2015) A sensitivity analysis of (and practitioners’ guide to) convolutional neural networks for sentence classification. arXiv preprint arXiv:​1510.​03820
Zurück zum Zitat Zhang C, Yamana H (2020) Wuy at semeval-2020 task 7: Combining bert and naïve bayes-svm for humor assessment in edited news headlines. In: Proceedings of the Fourteenth Workshop on Semantic Evaluation, pp 1071–1076 Zhang C, Yamana H (2020) Wuy at semeval-2020 task 7: Combining bert and naïve bayes-svm for humor assessment in edited news headlines. In: Proceedings of the Fourteenth Workshop on Semantic Evaluation, pp 1071–1076
Zurück zum Zitat Zhang X, Zhu C, Li Y, Guo J, Liu L, Gu H (2020) Precfix: Large-scale patch recommendation by mining defect-patch pairs. In: Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering: Software Engineering in Practice, pp 41–50 Zhang X, Zhu C, Li Y, Guo J, Liu L, Gu H (2020) Precfix: Large-scale patch recommendation by mining defect-patch pairs. In: Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering: Software Engineering in Practice, pp 41–50
Zurück zum Zitat Zolkifli NN, Ngah A, Deraman A (2018) Version control system: A review. Procedia Comput Sci 135:408–415CrossRef Zolkifli NN, Ngah A, Deraman A (2018) Version control system: A review. Procedia Comput Sci 135:408–415CrossRef
Metadata
Title: BTLink: automatic link recovery between issues and commits based on pre-trained BERT model
Authors: Jinpeng Lan, Lina Gong, Jingxuan Zhang, Haoxiang Zhang
Publication date: 1 July 2023
Publisher: Springer US
Published in: Empirical Software Engineering, Issue 4/2023
Print ISSN: 1382-3256
Electronic ISSN: 1573-7616
DOI: https://doi.org/10.1007/s10664-023-10342-7
