
Leveraging the Defects Life Cycle to Label Affected Versions and Defective Classes

Published: 10 February 2021

Abstract

Two recent studies explicitly recommend labeling defective classes in releases using the affected versions (AV) available in issue trackers (e.g., Jira). This practice has been coined the realistic approach. However, no study has investigated whether it is feasible to rely on AVs. For example, how available and consistent is the AV information in existing issue trackers? Moreover, no study has attempted to retrieve AVs when they are unavailable. The aim of our study is threefold: (1) to measure the proportion of defects for which the realistic method is usable, (2) to propose a method for retrieving the AVs of a defect, thus making the realistic approach usable when AVs are unavailable, and (3) to compare the accuracy of the proposed method against three SZZ implementations. The proposed method assumes that defects of the same project have a stable life cycle, in terms of the proportion of versions affected by a defect before it is discovered and fixed. Results related to 212 open-source projects from the Apache ecosystem, featuring a total of about 125,000 defects, reveal that the realistic method cannot be used for the majority (51%) of defects. Therefore, it is important to develop automated methods for retrieving AVs. Results related to 76 open-source projects from the Apache ecosystem, featuring a total of about 6,250,000 classes, affected by 60,000 defects, and spread over 4,000 versions and 760,000 commits, reveal that the proportion of versions between defect discovery and fix is fairly stable (standard deviation < 2) across the defects of the same project. Moreover, the proposed method proved significantly more accurate than all three SZZ implementations in (i) retrieving AVs, (ii) labeling classes as defective, and (iii) building defect repositories for feature selection. Thus, when the realistic method is unusable, the proposed method is a valid automated alternative to SZZ for retrieving the origin of a defect. Finally, given the low accuracy of SZZ, researchers should consider re-executing studies that have used SZZ as an oracle and, in general, should prefer projects with a high proportion of available and consistent AVs.
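The abstract's core assumption, that the proportion of versions a defect remains undiscovered before being fixed is stable within a project, can be illustrated with a small sketch. The snippet below is a hypothetical, simplified illustration rather than the authors' implementation: the function names, the representation of versions as ordinal indices, and the edge-case handling are all assumptions, and the proportion is taken as (FV − IV) / (FV − OV), where IV is the injected (earliest affected) version, OV the version in which the defect was reported, and FV the fixed version.

```python
# Sketch of a proportion-based estimate of a defect's injected version (IV).
# Versions are ordinal indices in the project's release timeline; all names
# and numbers below are illustrative, not from the paper's replication package.
from statistics import mean

def proportion(iv: int, ov: int, fv: int) -> float:
    """Fraction of the defect's life cycle spent undiscovered:
    (FV - IV) / (FV - OV). Defined only when FV > OV."""
    return (fv - iv) / (fv - ov)

def estimate_iv(ov: int, fv: int, p: float) -> int:
    """Estimate the injected version of a defect lacking AV information,
    assuming its life cycle follows the project's typical proportion p."""
    predicted = fv - (fv - ov) * p
    return max(1, min(ov, round(predicted)))  # IV cannot exceed OV

# Hypothetical usage: learn p from defects whose AVs are recorded in the
# issue tracker, then apply it to defects whose AV field is missing.
labeled = [(2, 4, 7), (1, 3, 5), (6, 8, 12)]   # (IV, OV, FV) triples
p = mean(proportion(iv, ov, fv) for iv, ov, fv in labeled)
print(estimate_iv(ov=9, fv=13, p=p))            # estimated earliest affected version
```

Once an IV is estimated, every version from IV up to (but excluding) FV can be labeled as affected, which is what makes the realistic labeling approach usable when the AV field is empty or inconsistent.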



• Published in

  ACM Transactions on Software Engineering and Methodology, Volume 30, Issue 2
  Continuous Special Section: AI and SE
  April 2021, 463 pages
  ISSN: 1049-331X
  EISSN: 1557-7392
  DOI: 10.1145/3446657
  Editor: Mauro Pezzè

      Copyright © 2021 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 10 February 2021
      • Accepted: 1 November 2020
      • Revised: 1 October 2020
      • Received: 1 March 2020
Published in TOSEM Volume 30, Issue 2

