Published in: Empirical Software Engineering 3/2024

01-05-2024

CoRT: Transformer-based code representations with self-supervision by predicting reserved words for code smell detection

Authors: Amal Alazba, Hamoud Aljamaan, Mohammad Alshayeb



Abstract

Context

Code smell detection is the process of identifying poorly designed and implemented code fragments. Machine learning-based approaches require large amounts of manually labeled data, which is costly and difficult to scale. Unsupervised semantic feature learning, that is, learning without manual annotation, is vital for effectively harvesting the enormous amount of available data.

Objective

The objective of this study is to propose a new code smell detection approach that utilizes self-supervised learning to learn intermediate representations without the need for labels and then fine-tune these representations on multiple tasks.

Method

We propose Code Representation with Transformers (CoRT), which learns the semantic and structural features of source code by training transformers to recognize masked reserved words in the input code. We empirically demonstrate that this proxy task provides a powerful method for learning semantic and structural features. We exhaustively evaluated our approach on four downstream tasks: detecting the Data Class, God Class, Feature Envy, and Long Method code smells. Moreover, we compared our results with those of two paradigms: supervised learning and a feature-based approach. Finally, we conducted a cross-project experiment to evaluate the generalizability of our method to unseen labeled data.
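The masking proxy task can be sketched as follows. This is a minimal Python illustration, not the paper's implementation: the reserved-word set is a small subset of Java's keywords, and the `mask_reserved_words` helper, the crude regex tokenizer, and the `[MASK]` token are assumptions made for demonstration.

```python
import re

# Small subset of Java reserved words; the full keyword set is assumed larger.
RESERVED_WORDS = {"public", "class", "return", "if", "else",
                  "for", "while", "int", "void", "static"}

def mask_reserved_words(code, mask_token="[MASK]"):
    """Replace each reserved word with a mask token and record the labels.

    Returns the masked code and the list of masked-out words, which serve
    as prediction targets for the self-supervised proxy task.
    """
    tokens = re.findall(r"\w+|\S", code)  # crude tokenizer: words or single symbols
    masked, labels = [], []
    for tok in tokens:
        if tok in RESERVED_WORDS:
            labels.append(tok)
            masked.append(mask_token)
        else:
            masked.append(tok)
    return " ".join(masked), labels

snippet = "public static int add(int a, int b) { return a + b; }"
masked_code, targets = mask_reserved_words(snippet)
```

A model trained to recover `targets` from `masked_code` must learn how syntax and structure constrain which reserved word fits each position, which is the intuition behind using this as a pretext task.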

Results

The results indicate that the proposed method detects code smells with high performance. For instance, on Data Class, CoRT achieved an F1 score between 88.08 and 99.4, an Area Under the Curve (AUC) between 89.62 and 99.88, and a Matthews Correlation Coefficient (MCC) between 75.28 and 98.8, while on God Class it achieved an F1 between 86.32 and 99.03, an AUC between 92.1 and 99.85, and an MCC between 76.15 and 98.09. Compared with the baseline model and the feature-based approach, CoRT achieved better detection performance and showed a strong ability to detect code smells in unseen datasets.
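The reported metrics can be computed from a binary confusion matrix. The sketch below is a stdlib-only illustration with toy labels, not the paper's evaluation code; AUC is omitted since it requires ranking scores rather than hard predictions.

```python
import math

def confusion(y_true, y_pred):
    """Count true/false positives and negatives for binary labels."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn

def f1(y_true, y_pred):
    """F1: harmonic mean of precision and recall."""
    tp, tn, fp, fn = confusion(y_true, y_pred)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def mcc(y_true, y_pred):
    """Matthews Correlation Coefficient over the full confusion matrix."""
    tp, tn, fp, fn = confusion(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom

# Toy example: 1 = smelly, 0 = clean (illustrative labels only)
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
```

Unlike F1, MCC uses all four confusion-matrix cells, which makes it a more balanced summary when smelly and clean classes are imbalanced.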

Conclusions

The proposed method has been shown to be effective in detecting both class-level and method-level code smells.


Metadata
Title
CoRT: Transformer-based code representations with self-supervision by predicting reserved words for code smell detection
Authors
Amal Alazba
Hamoud Aljamaan
Mohammad Alshayeb
Publication date
01-05-2024
Publisher
Springer US
Published in
Empirical Software Engineering / Issue 3/2024
Print ISSN: 1382-3256
Electronic ISSN: 1573-7616
DOI
https://doi.org/10.1007/s10664-024-10445-9
