Abstract
Context: Code smells in software systems are indications that usually correspond to deeper problems that can negatively influence software quality characteristics. This review is part of an R&D project aiming to improve the existing codebeat platform, which helps developers avoid code smells and deliver quality code.
Objective: This study aims to identify and investigate the current state of the art with respect to: (1) predictors used in prediction models to detect code smells, (2) machine learning/artificial intelligence (ML/AI) methods used in prediction models to detect code smells, (3) code smells analyzed in scientific literature. Our secondary objectives were to identify (4) data sets and projects used in research papers to predict code smells, (5) performance measures used to assess prediction models and (6) improvement ideas with regard to code smell detection using ML/AI.
Method: We conducted a systematic review using a database search in Scopus, evaluated with the quasi-gold standard procedure, to identify relevant studies. In the data sheet used to extract data from publications, we factored the research questions into finer-grained ones, which were then answered on a per-publication basis. The answers were then merged across the set of publications using an automated script to answer the posed research questions.
Results: We identified 45 primary studies relevant to the primary objectives of this research. The results show the capability of ML/AI techniques to predict code smells.
Conclusion: Only a few smells (Blob, Feature Envy, Long Method and Data Class) have received the vast majority of interest in the research community. The usage of deep learning techniques is increasing. Most researchers still use source code metrics as predictors. Precision, recall and F-measure are the go-to performance metrics. There seems to be a need for modern reference data sets and project sets that reflect modern constructs of programming languages. We identified various promising paths of research that have the potential to advance the state of the art in the area of code smell prediction.
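The three performance measures named in the conclusion can be illustrated with a minimal sketch. The function name and the sample labels below are hypothetical and are not taken from any primary study; the sketch only assumes a binary detector that marks each code element as smelly or not.

```python
def precision_recall_f1(actual, predicted):
    """Return (precision, recall, F-measure) for binary smell labels.

    `actual` and `predicted` are equally long sequences of booleans,
    where True means "this element exhibits the smell".
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)        # smelly, flagged
    fp = sum(1 for a, p in pairs if not a and p)    # clean, flagged
    fn = sum(1 for a, p in pairs if a and not p)    # smelly, missed

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-measure (F1) is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical manual labels and detector output for eight methods.
actual = [True, True, True, False, False, False, True, False]
predicted = [True, False, True, True, False, False, True, False]

p, r, f = precision_recall_f1(actual, predicted)
print(f"precision={p:.2f} recall={r:.2f} F-measure={f:.2f}")
```

In this made-up sample the detector flags four methods, three of them correctly, and misses one smelly method, so all three measures come out equal.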
Notes
1. codequest.com.
2. codebeat.co.
3.
4. Apply only if recall does not reach the required threshold.
5. Zhang et al. [29] suggest that a sensitivity (recall) threshold (i.e., a completeness target) of between 70% and 80% might be used to decide whether to go to Step 3 (and to refine the search terms) or whether to proceed to the next stage of the review.
6. We focus on the main full-paper research tracks of the conferences, and do not cover collocated conferences or workshops.
7.
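The sensitivity check described in note 5 can be sketched as follows. This is an illustration under stated assumptions, not the script used in the review: the function names and paper identifiers are hypothetical, and sensitivity is taken to be the share of quasi-gold standard studies that the database search retrieved.

```python
def search_sensitivity(retrieved_ids, quasi_gold_ids):
    """Fraction of quasi-gold standard studies found by the search."""
    quasi_gold = set(quasi_gold_ids)
    if not quasi_gold:
        raise ValueError("quasi-gold standard must not be empty")
    found = quasi_gold & set(retrieved_ids)
    return len(found) / len(quasi_gold)


def needs_refinement(sensitivity, threshold=0.7):
    """True if the search terms should be refined before proceeding.

    The default of 0.7 is the lower end of the 70%-80% range
    mentioned in note 5; choosing it here is an assumption.
    """
    return sensitivity < threshold


# Hypothetical identifiers: ten manually selected relevant papers,
# eight of which also appear in the database search results.
quasi_gold = {f"P{i:02d}" for i in range(1, 11)}
retrieved = {f"P{i:02d}" for i in range(1, 9)} | {"X11", "X12"}

s = search_sensitivity(retrieved, quasi_gold)
print(f"sensitivity={s:.0%}, refine search: {needs_refinement(s)}")
```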
References
Al-Shaaby, A., Aljamaan, H., Alshayeb, M.: Bad smell detection using machine learning techniques: a systematic literature review. Arabian J. Sci. Eng. 45, 2341–2369 (2020). https://doi.org/10.1007/s13369-019-04311-w
Azeem, M.I., Palomba, F., Shi, L., Wang, Q.: Machine learning techniques for code smell detection: a systematic literature review and meta-analysis. Inf. Softw. Technol. 108, 115–138 (2019). https://doi.org/10.1016/j.infsof.2018.12.009
Buenen, M., Muthukrishnan, G.: World quality report 2016–17. Technical report, Sogeti and Hewlett Packard Enterprise, Capgemini (2016)
Caram, F., de Oliveira Rodrigues, B.R., Campanelli, A., Silva Parreiras, F.: Machine learning techniques for code smells detection: a systematic mapping study. Int. J. Softw. Eng. Knowl. Eng. 29, 285–316 (2019). https://doi.org/10.1142/S021819401950013X
Chen, B., Jiang, Z.M.: Characterizing and detecting anti-patterns in the logging code. In: Proceedings—2017 IEEE/ACM 39th International Conference on Software Engineering, ICSE 2017, pp. 71–81 (2017). https://doi.org/10.1109/ICSE.2017.15
Di Nucci, D., Palomba, F., Tamburri, D.A., Serebrenik, A., De Lucia, A.: Detecting code smells using machine learning techniques: Are we there yet? In: 2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER), pp. 612–621 (2018). https://doi.org/10.1109/SANER.2018.8330266
Dieste, O., Grimán, A., Juristo, N.: Developing search strategies for detecting relevant experiments. Empirical Softw. Eng. 14(5), 513–539 (2009). https://doi.org/10.1109/ESEM.2007.19
Dybå, T., Dingsøyr, T.: Empirical studies of agile software development: a systematic review. Inf. Softw. Technol. 50(9–10), 833–859 (2008). https://doi.org/10.1016/j.infsof.2008.01.006
Fleiss, J.L.: Measuring nominal scale agreement among many raters. Psychol. Bull. 76(5), 378–382 (1971). https://doi.org/10.1037/h0031619
Fontana, F.A., Pigazzini, I., Roveda, R., Zanoni, M.: Automatic detection of instability architectural smells. In: Proceedings—2016 IEEE International Conference on Software Maintenance and Evolution, ICSME 2016, pp. 433–437 (2017). https://doi.org/10.1109/ICSME.2016.33
Fontana, F.A., Zanoni, M.: Code smell severity classification using machine learning techniques. Knowl.-Based Syst. 128, 43–58 (2017). https://doi.org/10.1016/j.knosys.2017.04.014
Fowler, M., Beck, K., Brant, J., Opdyke, W., Roberts, D.: Refactoring: Improving the Design of Existing Code. Addison-Wesley, Boston, MA, USA (1999)
Gartner: Gartner says worldwide software market grew 4.8 percent in 2013 (2014)
Kitchenham, B., Budgen, D., Brereton, P.: Evidence-Based Software Engineering and Systematic Reviews. CRC Press (2016). https://doi.org/10.1007/11767718_3
Madeyski, L., Lewowski, T.: MLCQ: Industry-relevant code smell data set. In: Proceedings of the Evaluation and Assessment in Software Engineering, EASE ’20, pp. 342–347. Association for Computing Machinery, New York, NY, USA (2020). https://doi.org/10.1145/3383219.3383264
Malhotra, R.: A systematic review of machine learning techniques for software fault prediction. Appl. Soft Comput. 27, 504–518 (2015). https://doi.org/10.1016/j.asoc.2014.11.023
Palomba, F., Bavota, G., Di Penta, M., Fasano, F., Oliveto, R., De Lucia, A.: On the diffuseness and the impact on maintainability of code smells: a large scale empirical investigation. Empirical Softw. Eng. pp. 1–34 (2017). https://doi.org/10.1007/s10664-017-9535-z
Palomba, F., Di Nucci, D., Panichella, A., Zaidman, A., De Lucia, A.: Lightweight detection of android-specific code smells: the adoctor project. In: SANER 2017—24th IEEE International Conference on Software Analysis, Evolution, and Reengineering, pp. 487–491 (2017). https://doi.org/10.1109/SANER.2017.7884659
Palomba, F., Di Nucci, D., Tufano, M., Bavota, G., Oliveto, R., Poshyvanyk, D., De Lucia, A.: Landfill: an open dataset of code smells with public evaluation. In: 2015 IEEE/ACM 12th Working Conference on Mining Software Repositories, pp. 482–485 (2015). https://doi.org/10.1109/MSR.2015.69
Palomba, F., Panichella, A., Zaidman, A., Oliveto, R., De Lucia, A.: The scent of a smell: an extensive comparison between textual and structural smells. IEEE Trans. Softw. Eng. (2017). https://doi.org/10.1109/TSE.2017.2752171
Romano, S., Scanniello, G., Sartiani, C., Risi, M.: A graph-based approach to detect unreachable methods in java software. In: Proceedings of the 31st Annual ACM Symposium on Applied Computing, SAC ’16, pp. 1538–1541. Association for Computing Machinery, New York, NY, USA (2016). https://doi.org/10.1145/2851613.2851968
Santos, J.A.M., Rocha-Junior, J.B., Prates, L.C.L., do Nascimento, R.S., Freitas, M.F., de Mendonca, M.G.: A systematic review on the code smell effect. J. Syst. Softw. 144, 450–477 (2018). https://doi.org/10.1016/j.jss.2018.07.035
Sharma, T., Spinellis, D.: A survey on software smells. J. Syst. Softw. 138, 158–173 (2018). https://doi.org/10.1016/j.jss.2017.12.034
Singh, S., Kaur, S.: A systematic literature review: Refactoring for disclosing code smells in object oriented software. Ain Shams Eng. J. (2017). https://doi.org/10.1016/j.asej.2017.03.002
Tempero, E., Anslow, C., Dietrich, J., Han, T., Li, J., Lumpe, M., Melton, H., Noble, J.: Qualitas corpus: a curated collection of java code for empirical studies. In: 2010 Asia Pacific Software Engineering Conference (APSEC2010), pp. 336–345 (2010). https://doi.org/10.1109/APSEC.2010.46
Wasylkowski, A., Zeller, A., Lindig, C.: Detecting object usage anomalies. In: 6th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering, ESEC/FSE 2007, pp. 35–44 (2007). https://doi.org/10.1145/1287624.1287632
Wen, J., Li, S., Lin, Z., Hu, Y., Huang, C.: Systematic literature review of machine learning based software development effort estimation models. Inf. Softw. Technol. 54(1), 41–59 (2012). https://doi.org/10.1016/j.infsof.2011.09.002
Wohlin, C.: Guidelines for snowballing in systematic literature studies and a replication in software engineering. In: Proceedings of the 18th International Conference on Evaluation and Assessment in Software Engineering EASE’14 (2014). https://doi.org/10.1145/2601248.2601268
Zhang, H., Babar, M.A., Tell, P.: Identifying relevant studies in software engineering. Inf. Softw. Technol. 53(6), 625–637 (2011). https://doi.org/10.1016/j.infsof.2010.12.010
Zhang, M., Hall, T., Baddoo, N.: Code Bad Smells: a review of current knowledge. J. Softw. Mainten. Evolut. Res. Pract. 23(3), 179–202 (2011). https://doi.org/10.1002/smr.521
Systematic Literature Review References
Amorim, L., Costa, E., Antunes, N., Fonseca, B., Ribeiro, M.: Experience report: evaluating the effectiveness of decision trees for detecting code smells. In: 2015 IEEE 26th International Symposium on Software Reliability Engineering, ISSRE 2015, pp. 261–269 (2016). https://doi.org/10.1109/ISSRE.2015.7381819
Barbez, A., Khomh, F., Guéhéneuc, Y.G.: A machine-learning based ensemble method for anti-patterns detection. J. Syst. Softw. 161 (2020). https://doi.org/10.1016/j.jss.2019.110486
Barbez, A., Khomh, F., Gueheneuc, Y.G.: Deep learning anti-patterns from code metrics history. In: Proceedings—2019 IEEE International Conference on Software Maintenance and Evolution, ICSME 2019, pp. 114–124 (2019). https://doi.org/10.1109/ICSME.2019.00021
Boussaa, M., Kessentini, W., Kessentini, M., Bechikh, S., Ben Chikha, S.: Competitive coevolutionary code-smells detection. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 8084 LNCS, 50–65 (2013). https://doi.org/10.1007/978-3-642-39742-4_6
Bryton, S., Brito e Abreu, F., Monteiro, M.: Reducing subjectivity in code smells detection: Experimenting with the long method. In: Proceedings—7th International Conference on the Quality of Information and Communications Technology, QUATIC 2010, pp. 337–342 (2010). https://doi.org/10.1109/QUATIC.2010.60
Chen, Z., Chen, L., Ma, W., Zhou, X., Zhou, Y., Xu, B.: Understanding metric-based detectable smells in python software: a comparative study. Inf. Softw. Technol. 94, 14–29 (2018). https://doi.org/10.1016/j.infsof.2017.09.011
Fakhoury, S., Arnaoudova, V., Noiseux, C., Khomh, F., Antoniol, G.: Keep it simple: Is deep learning good for linguistic smell detection? In: 25th IEEE International Conference on Software Analysis, Evolution and Reengineering, SANER 2018—Proceedings, vol. 2018-March, pp. 602–611 (2018). https://doi.org/10.1109/SANER.2018.8330265
Fontana, F.A., Zanoni, M., Marino, A., Mäntylä, M.V.: Code smell detection: towards a machine learning-based approach. In: IEEE International Conference on Software Maintenance, ICSM, pp. 396–399 (2013). https://doi.org/10.1109/ICSM.2013.56
Fontana, F.A., Mäntylä, M.V., Zanoni, M., Marino, A.: Comparing and experimenting machine learning techniques for code smell detection. Empirical Softw. Eng. 21(3), 1143–1191 (2016). https://doi.org/10.1007/s10664-015-9378-4
Fu, S., Shen, B.: Code bad smell detection through evolutionary data mining. In: International Symposium on Empirical Software Engineering and Measurement, vol. 2015-November, pp. 41–49 (2015). https://doi.org/10.1109/ESEM.2015.7321194
Gauthier, F., Merlo, E.: Semantic smells and errors in access control models: a case study in PHP. In: Proceedings—International Conference on Software Engineering, pp. 1169–1172 (2013). https://doi.org/10.1109/ICSE.2013.6606670
Grodzicka, H., Ziobrowski, A., Łakomiak, Z., Kawa, M., Madeyski, L.: Code smell prediction employing machine learning meets emerging Java Language constructs. In: Poniszewska-Marańda, A., Kryvinska, N., Jarząbek, S., Madeyski, L. (eds.) Data-Centric Business and Applications: Towards Software Development, vol. 40 of book series Lecture Notes on Data Engineering and Communications Technologies, pp. 137–167. Springer International Publishing, Cham (2020). https://doi.org/10.1007/978-3-030-34706-2_8
Guggulothu, T., Moiz, S.A.: Code smell detection using multi-label classification approach. Softw. Qual. J. (2020). https://doi.org/10.1007/s11219-020-09498-y
Guo, X., Shi, C., Jiang, H.: Deep semantic-based feature envy identification. ACM International Conference Proceeding Series (2019). https://doi.org/10.1145/3361242.3361257
Hadj-Kacem, M., Bouassida, N.: A hybrid approach to detect code smells using deep learning. In: ENASE 2018—Proceedings of the 13th International Conference on Evaluation of Novel Approaches to Software Engineering, vol. 2018-March, pp. 137–146 (2018). https://doi.org/10.5220/0006709801370146
Hadj-Kacem, M., Bouassida, N.: Deep representation learning for code smells detection using variational auto-encoder. In: Proceedings of the International Joint Conference on Neural Networks, vol. 2019-July (2019). https://doi.org/10.1109/IJCNN.2019.8851854
Hassaine, S., Khomh, F., Guéhéneuc, Y.G., Hamel, S.: IDS: An immune-inspired approach for the detection of software design smells. In: Proceedings—7th International Conference on the Quality of Information and Communications Technology, QUATIC 2010, pp. 343–348 (2010). https://doi.org/10.1109/QUATIC.2010.61
Hozano, M., Antunes, N., Fonseca, B., Costa, E.: Evaluating the accuracy of machine learning algorithms on detecting code smells for different developers. In: Proceedings of the 19th International Conference on Enterprise Information Systems, Vol. 2: ICEIS, pp. 474–482. INSTICC, SciTePress (2017). https://doi.org/10.5220/0006338804740482
James Benedict Felix, S., Vinod, V.: Design and analysis of improvised genetic algorithm with particle swarm optimization for code smell detection. Int. J. Innov. Technol. Explor. Eng. 9(1), 5327–5330 (2019). https://doi.org/10.35940/ijitee.A5328.119119
Jesudoss, A., Maneesha, S., Lakshmi Naga Durga, T.: Identification of code smell using machine learning. In: 2019 International Conference on Intelligent Computing and Control Systems, ICCS 2019, pp. 54–58 (2019). https://doi.org/10.1109/ICCS45141.2019.9065317
Karaduzovic-Hadziabdic, K., Spahic, R.: Comparison of machine learning methods for code smell detection using reduced features. In: UBMK 2018—3rd International Conference on Computer Science and Engineering, pp. 670–672 (2018). https://doi.org/10.1109/UBMK.2018.8566561
Kaur, A., Jain, S., Goel, S.: A support vector machine based approach for code smell detection. In: Proceedings—2017 International Conference on Machine Learning and Data Science, MLDS 2017, vol. 2018-January, pp. 9–14 (2018). https://doi.org/10.1109/MLDS.2017.8
Kaur, A., Jain, S., Goel, S.: SP-J48: a novel optimization and machine-learning-based approach for solving complex problems: special application in software engineering for detecting code smells. Neural Comput. Appl. (2019). https://doi.org/10.1007/s00521-019-04175-z
Kessentini, W., Kessentini, M., Sahraoui, H., Bechikh, S., Ouni, A.: A cooperative parallel search-based software engineering approach for code-smells detection. IEEE Trans. Softw. Eng. 40(9), 841–861 (2014). https://doi.org/10.1109/TSE.2014.2331057
Kessentini, M., Ouni, A.: Detecting android smells using multi-objective genetic programming. In: 2017 IEEE/ACM 4th International Conference on Mobile Software Engineering and Systems (MOBILESoft), pp. 122–132 (2017). https://doi.org/10.1109/MOBILESoft.2017.29
Khomh, F., Vaucher, S., Guéhéneuc, Y.G., Sahraoui, H.: A Bayesian approach for the detection of code and design smells. In: Proceedings—International Conference on Quality Software, pp. 305–314 (2009). https://doi.org/10.1109/QSIC.2009.47
Kiyak, E.O., Birant, D., Birant, K.U.: Comparison of multi-label classification algorithms for code smell detection. In: 3rd International Symposium on Multidisciplinary Studies and Innovative Technologies, ISMSIT 2019—Proceedings (2019). https://doi.org/10.1109/ISMSIT.2019.8932855
Kreimer, J.: Adaptive detection of design flaws. Electronic Notes in Theoretical Computer Science 141(4 SPEC. ISS.), 117–136 (2005). https://doi.org/10.1016/j.entcs.2005.02.059
Liu, H., Jin, J., Xu, Z., Bu, Y., Zou, Y., Zhang, L.: Deep learning based code smell detection. IEEE Trans. Softw. Eng. (2019). https://doi.org/10.1109/TSE.2019.2936376
Liu, H., Liu, Q., Niu, Z., Liu, Y.: Dynamic and automatic feedback-based threshold adaptation for code smell detection. IEEE Trans. Softw. Eng. 42(6), 544–558 (2016). https://doi.org/10.1109/TSE.2015.2503740
Liu, H., Xu, Z., Zou, Y.: Deep learning based feature envy detection. In: ASE 2018—Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, pp. 385–396 (2018). https://doi.org/10.1145/3238147.3238166
Maiga, A., Ali, N., Bhattacharya, N., Sabané, A., Guéhéneuc, Y.G., Aimeur, E.: SMURF: A SVM-based incremental anti-pattern detection approach. In: Proceedings—Working Conference on Reverse Engineering, WCRE, pp. 466–475 (2012). https://doi.org/10.1109/WCRE.2012.56
Mansoor, U., Kessentini, M., Maxim, B.R., Deb, K.: Multi-objective code-smells detection using good and bad design examples. Softw. Qual. J. 25(2), 529–552 (2017). https://doi.org/10.1007/s11219-016-9309-7
Merzah, B.M.: Software quality prediction using data mining techniques. In: 2019 International Conference on Information and Communications Technology, ICOIACT 2019, pp. 394–397 (2019). https://doi.org/10.1109/ICOIACT46704.2019.8938487
Mkaouer, M.W.: Interactive code smells detection: an initial investigation. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 9962 LNCS, 281–287 (2016). https://doi.org/10.1007/978-3-319-47106-8_24
Ocariza, F.S., Pattabiraman, K., Mesbah, A.: Detecting unknown inconsistencies in web applications. In: ASE 2017—Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering, pp. 566–577 (2017). https://doi.org/10.1109/ASE.2017.8115667
Özkalkan, Z., Aydin, K.S., Tetik, H.Y., Belen Saglam, R.: Automatic detection of feature envy using machine learning techniques. In: CEUR Workshop Proceedings, vol. 2201 (2018). http://ceur-ws.org/Vol-2201/UYMS_2018_paper_80.pdf
Palomba, F., Bavota, G., Di Penta, M., Oliveto, R., De Lucia, A., Poshyvanyk, D.: Detecting bad smells in source code using change history information. In: 2013 28th IEEE/ACM International Conference on Automated Software Engineering, ASE 2013—Proceedings, pp. 268–278 (2013). https://doi.org/10.1109/ASE.2013.6693086
Palomba, F.: Alternative sources of information for code smell detection: postcards from far away. In: Proceedings—2016 IEEE International Conference on Software Maintenance and Evolution, ICSME 2016, pp. 636–640 (2017). https://doi.org/10.1109/ICSME.2016.26
Palomba, F.: Textual analysis for code smell detection. Proc. Int. Conf. Softw. Eng. 2, 769–771 (2015). https://doi.org/10.1109/ICSE.2015.244
Pradel, M., Heiniger, S., Gross, T.R.: Static detection of brittle parameter typing. In: Proceedings of the 2012 International Symposium on Software Testing and Analysis, ISSTA 2012, pp. 265–275. Association for Computing Machinery, New York, NY, USA (2012). https://doi.org/10.1145/2338965.2336785
Rubin, J., Henniche, A.N., Moha, N., Bouguessa, M., Bousbia, N.: Sniffing android code smells: an association rules mining-based approach. In: Proceedings—2019 IEEE/ACM 6th International Conference on Mobile Software Engineering and Systems, MOBILESoft 2019, pp. 123–127 (2019). https://doi.org/10.1109/MOBILESoft.2019.00025
Sahin, D., Kessentini, M., Bechikh, S., Deb, K.: Code-smell detection as a bilevel problem. ACM Trans. Softw. Eng. Methodol. 24(1) (2014). https://doi.org/10.1145/2675067
Sharma, P., Kaur, E.A.: Design of testing framework for code smell detection (OOPS) using BFO algorithm. Int. J. Eng. Technol. (UAE) 7(2.27 Special Issue 27), 161–166 (2018). https://doi.org/10.14419/ijet.v7i2.27.14635
Tummalapalli, S., Kumar, L., Neti, L.B.M.: An empirical framework for web service anti-pattern prediction using machine learning techniques. In: IEMECON 2019—9th Annual Information Technology, Electromechanical Engineering and Microelectronics Conference, pp. 137–143 (2019). https://doi.org/10.1109/IEMECONX.2019.8877008
Acknowledgements
This research was partly financed by the Polish National Centre for Research and Development grant POIR.01.01.01-00-0792/16: “Codebeat—wykorzystanie sztucznej inteligencji w statycznej analizie jakości oprogramowania.”
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Lewowski, T., Madeyski, L. (2022). Code Smells Detection Using Artificial Intelligence Techniques: A Business-Driven Systematic Review. In: Kryvinska, N., Poniszewska-Marańda, A. (eds) Developments in Information & Knowledge Management for Business Applications. Studies in Systems, Decision and Control, vol 377. Springer, Cham. https://doi.org/10.1007/978-3-030-77916-0_12
DOI: https://doi.org/10.1007/978-3-030-77916-0_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-77915-3
Online ISBN: 978-3-030-77916-0
eBook Packages: Intelligent Technologies and Robotics (R0)