ABSTRACT
Software refactoring is widely employed to improve software quality. A key step in refactoring is identifying which parts of the software should be refactored. To facilitate this identification, a number of approaches have been proposed to detect certain structures in the code (called code smells) that suggest the possibility of refactoring. Most such approaches rely on manually designed heuristics that map manually selected source code metrics to predictions. However, it is challenging to manually select the best features, especially textual features, and it is also difficult to manually construct optimal heuristics. To this end, in this paper we propose a novel deep learning based approach to detecting feature envy, one of the most common code smells. The key insight is that deep neural networks and advanced deep learning techniques can automatically select features of source code (especially textual features) for feature envy detection, and can automatically build the complex mapping between such features and predictions. We also propose an automatic approach to generating labeled training data for the neural network based classifier, which requires no human intervention. Evaluation results on open-source applications suggest that the proposed approach significantly improves on the state of the art in both detecting feature envy smells and recommending destinations for the identified smelly methods.
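For readers unfamiliar with the smell, the following is a minimal sketch of the kind of hand-written, metric-based heuristic that the abstract contrasts with learned models. It is not the approach proposed in the paper. The function name `detect_feature_envy`, the access-count metric, and the `threshold` parameter are all hypothetical and chosen only for illustration. The sketch assumes the common informal reading of feature envy: a method that uses members of another class more than members of its own.

```python
from collections import Counter


def detect_feature_envy(own_class, accessed_members, threshold=1.0):
    """Return a suggested destination class, or None if the method is not flagged.

    own_class: name of the class that currently encloses the method.
    accessed_members: list of (class_name, member_name) pairs the method touches.
    threshold: hypothetical hand-tuned ratio; the method is flagged when the
        most-used foreign class is accessed more than `threshold` times as
        often as the enclosing class.
    """
    # Count how many accesses go to each class.
    counts = Counter(cls for cls, _ in accessed_members)
    own = counts.get(own_class, 0)
    foreign = {cls: n for cls, n in counts.items() if cls != own_class}
    if not foreign:
        return None
    # Pick the foreign class the method is most "interested" in.
    target, n = max(foreign.items(), key=lambda kv: kv[1])
    return target if n > threshold * own else None


# Illustrative input: a method in Order that mostly reads Customer data.
accesses = [
    ("Order", "total"),
    ("Customer", "name"),
    ("Customer", "address"),
    ("Customer", "email"),
]
print(detect_feature_envy("Order", accesses))
```

Both the metric (access counts) and the threshold here are picked by hand. Those are the manual choices that the proposed approach aims to replace with features and mappings learned from automatically labeled data.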
ASE ’18, September 3–7, 2018, Montpellier, France. Hui Liu, Zhifeng Xu, and Yanzhen Zou.