An empirical study on clone stability

Published: 01 September 2012

Abstract

Code cloning is a controversial software engineering practice due to contradictory claims regarding its effect on software maintenance. Code stability is a recently introduced measurement technique that has been used to determine the impact of code cloning by quantifying the changeability of a code region. Although most existing stability analysis studies agree that cloned code is more stable than non-cloned code, the studies have two major flaws: (i) each study only considered a single stability measurement (e.g., lines of code changed, frequency of change, age of change); and, (ii) only a small number of subject systems were analyzed and these were of limited variety.

In this paper, we present a comprehensive empirical study on code stability using four different stability measurement methods. We use a recently introduced hybrid clone detection tool, NiCAD, to detect the clones and analyze their stability in different dimensions: by clone type, by measuring method, by programming language, and by system size and age. Our in-depth investigation of 12 diverse subject systems, written in three programming languages and covering three types of clones, reveals that: (i) cloned code is generally less stable than non-cloned code, and more specifically both Type-1 and Type-2 clones show higher instability than Type-3 clones; (ii) clones in both Java and C systems exhibit higher instability than clones in C# systems; (iii) a system's development strategy might play a key role in defining its comparative code stability scenario; and (iv) cloned and non-cloned regions of a subject system do not follow any consistent change pattern.
