Published in: Empirical Software Engineering 5/2023

01-09-2023

EnCoSum: enhanced semantic features for multi-scale multi-modal source code summarization

Authors: Yuexiu Gao, Hongyu Zhang, Chen Lyu


Abstract

Code summarization aims to generate concise natural language descriptions for a piece of code, helping developers comprehend the source code. Analysis of current work shows that extracting the syntactic and semantic features of source code is crucial for generating high-quality summaries. To provide a more comprehensive feature representation of source code from different perspectives, we propose EnCoSum, an approach that enhances the semantic features of the multi-scale multi-modal code summarization method. It builds on our previously proposed M2TS approach (a multi-scale multi-modal approach based on the Transformer for source code summarization), which uses a multi-scale method to capture the structural information of Abstract Syntax Trees (ASTs) more completely and accurately at multiple local and global levels. In addition, we devise a new cross-modal fusion method to fuse source code and AST features, which highlights the key features in each modality that help generate summaries. To obtain richer semantic information, we improve M2TS in two ways. First, we add data flow and control flow edges to ASTs; we call the resulting added-edge ASTs Enhanced-ASTs (E-ASTs). Second, we introduce method name sequences extracted from the source code, which contain more knowledge about the critical tokens in the corresponding summaries and can help the model generate higher-quality summaries. We conduct extensive experiments on processed Java and Python datasets and evaluate our approach with the four most commonly used machine translation metrics. The experimental results demonstrate that EnCoSum is effective and outperforms current state-of-the-art methods. Furthermore, ablation experiments on each of the model's key components show that they all contribute to the performance of EnCoSum.
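The core of the E-AST idea is to augment ordinary parent-child AST edges with flow edges. As a minimal illustration (not the authors' implementation, and using only last-write-to-read data flow for simplicity), one can sketch the edge construction for a Python snippet with the standard `ast` module:

```python
import ast


def build_east_edges(source: str):
    """Sketch of E-AST-style edge construction: collect AST parent-child
    edges, plus simple data-flow edges linking a variable's last write
    to a later read. Control-flow edges are omitted for brevity."""
    tree = ast.parse(source)
    ast_edges, flow_edges = [], []
    last_write = {}  # variable name -> node that last assigned it
    for node in ast.walk(tree):
        # Structural (syntactic) edges: parent -> child node types.
        for child in ast.iter_child_nodes(node):
            ast_edges.append((type(node).__name__, type(child).__name__))
        # Data-flow edges: record writes, link subsequent reads back.
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                last_write[node.id] = node
            elif isinstance(node.ctx, ast.Load) and node.id in last_write:
                flow_edges.append((node.id, "dataflow"))
    return ast_edges, flow_edges
```

For `"x = 1\ny = x + 1"`, the sketch yields the usual syntactic edges plus a data-flow edge for `x`, illustrating how the added edges carry semantic information the bare AST lacks.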
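Method names are useful precisely because they are composed of sub-tokens that often reappear in the summary (e.g. `getFileName` and "gets the file name"). A hedged sketch of the kind of sub-token splitting such a pipeline typically relies on (the exact tokenizer used by EnCoSum is not specified here):

```python
import re


def split_method_name(name: str):
    """Split a camelCase or snake_case method name into lowercase
    sub-tokens, e.g. 'getFileName' -> ['get', 'file', 'name']."""
    parts = re.split(
        r"_"                          # snake_case boundary
        r"|(?<=[a-z0-9])(?=[A-Z])"    # lower/digit -> Upper boundary
        r"|(?<=[A-Z])(?=[A-Z][a-z])", # acronym -> Word boundary
        name,
    )
    return [p.lower() for p in parts if p]
```

Splitting this way lets the model align method-name sub-tokens with summary words, which is the intuition behind feeding method name sequences to the decoder.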


Metadata
Title
EnCoSum: enhanced semantic features for multi-scale multi-modal source code summarization
Authors
Yuexiu Gao
Hongyu Zhang
Chen Lyu
Publication date
01-09-2023
Publisher
Springer US
Published in
Empirical Software Engineering / Issue 5/2023
Print ISSN: 1382-3256
Electronic ISSN: 1573-7616
DOI
https://doi.org/10.1007/s10664-023-10384-x
