nach oben

Empirical Software Engineering

Erschienen in:

01.09.2021

Evaluating the impact of falsely detected performance bug-inducing changes in JIT models

verfasst von: Sophia Quach, Maxime Lamothe, Bram Adams, Yasutaka Kamei, Weiyi Shang

Erschienen in: Empirical Software Engineering | Ausgabe 5/2021

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Aus

Abstract

Performance bugs bear a heavy cost on both software developers and end-users. Tools to reduce the occurrence, impact, and repair time of performance bugs, can therefore provide key assistance for software developers racing to fix these bugs. Classification models that focus on identifying defect-prone commits, referred to as Just-In-Time (JIT) Quality Assurance are known to be useful in allowing developers to review risky commits. These commits can be reviewed while they are still fresh in developers’ minds, reducing the costs of developing high-quality software. JIT models, however, leverage the SZZ approach to identify whether or not a change is bug-inducing. The fixes to performance bugs may be scattered across the source code, separated from their bug-inducing locations. The nature of performance bugs may make SZZ a sub-optimal approach for identifying their bug-inducing commits. Yet, prior studies that leverage or evaluate the SZZ approach do not distinguish performance bugs from other bugs, leading to potential bias in the results. In this paper, we conduct an empirical study on the JIT defect prediction for performance bugs. We concentrate on SZZ’s ability to identify the bug-inducing commits of performance bugs in two open-source projects, Cassandra, and Hadoop. We verify whether the bug-inducing commits found by SZZ are truly bug-inducing commits by manually examining these identified commits. Our manual examination includes cross referencing fix commits and JIRA bug reports. We evaluate model performance for JIT models by using them to identify bug-inducing code commits for performance related bugs. Our findings show that JIT defect prediction classifies non-performance bug-inducing commits better than performance bug-inducing commits, i.e., the SZZ approach does introduce errors when identifying bug-inducing commits. However, we find that manually correcting these errors in the training data only slightly improves the models. In the absence of a large number of correctly labelled performance bug-inducing commits, our findings show that combining all available training data (i.e., truly performance bug-inducing commits, non-performance bug-inducing commits, and non-bug-inducing commits) yields the best classification results.

Vorheriger Artikel Understanding shared links and their intentions to meet information needs in modern code review:

Nächster Artikel Task estimation for software company employees based on computer interaction logs

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Our data files and scripts used are publicly available and can be found at: https://github.com/senseconcordia/Perf-JIT-Models

Agrawal A, Menzies T (2018) Is ”better data” better than ”better data miners”? on the benefits of tuning smote for defect prediction. In: Proceedings of the 40th International Conference on Software Engineering, ser. ICSE ’18. Association for Computing Machinery, New York, , pp 1050–1061. [Online]. Available: https://doi.org/10.1145/3180155.3180197

Apache apache/cassandra (2019) [Online]. Available: https://github.com/apache/cassandra

Apache hadoop (2020) [Online]. Available: https://hadoop.apache.org/

Bryant RE, O’Hallaron DR (2015) Computer Systems: A Programmer’s Perspective, 3rd ed. Pearson

Catolino G (2017) Just-in-time bug prediction in mobile applications: The domain matters! pp 05

Catolino G, Di Nucci D, Ferrucci F (2019) Cross-project just-in-time bug prediction for mobile apps: An empirical assessment. In: 2019 IEEE/ACM 6th International Conference on Mobile Software Engineering and Systems (MOBILESoft), pp 99–110

Chen J, Shang W (2017) An exploratory study of performance regression introducing code changes. In: 2017 IEEE International Conference on Software Maintenance and Evolution (ICSME), pp 341–352

Chen T.-H., Shang W, Jiang ZM, Hassan AE, Nasser M, Flora P (2014) Detecting performance anti-patterns for applications developed using object-relational mapping. In: Proceedings of the 36th International Conference on Software Engineering, ser. ICSE 2014. Association for Computing Machinery, New York, pp 1001–1012. [Online]. Available: https://doi.org/10.1145/2568225.2568259

Chen J, Shang W, Shihab E (2020) Perfjit: Test-level just-in-time prediction for performance regression introducing commits. IEEE Trans Softw Eng:1–1

Correlation (pearson kendall, spearman) (2020) [Online]. Available: https://www.statisticssolutions.com/correlation-pearson-kendall-spearman/

da Costa DA, McIntosh S, Shang W, Kulesza U, Coelho R, Hassan AE (2017) A framework for evaluating the results of the szz approach for identifying bug-introducing changes. IEEE Trans Softw Eng 43(7):641–657CrossRef

Davies S, Roper M, Wood M (2014) Comparing text-based and dependence-based approaches for determining the origins of bugs. J Softw Evol Process 26:01CrossRef

Ding Z, Chen J, Shang W (2020) Towards the use of the readily available tests from the release pipeline as performance tests. are we there yet? In: 42nd International Conference on Software Engineering, Seoul

Dmwr (2020) [Online]. Available: https://www.rdocumentation.org/packages/DMwR/versions/0.4.1/topics/SMOTE

Fawcett T (2006) An introduction to roc analysis. Pattern Recogn Lett 27(8):861–874MathSciNetCrossRef

Fukushima T, Kamei Y, McIntosh S, Yamashita K, Ubayashi N (2014) Studying just-in-time defect prediction using cross-project models. Empir Softw Eng 21:05

Ghotra B, McIntosh S, Hassan AE (2015) Revisiting the impact of classification techniques on the performance of defect prediction models. In: Proceedings of the 37th International Conference on Software Engineering - Vol 1, ser. ICSE ’15. IEEE Press, pp 789–800

Guindon C Swt: The standard widget toolkit. [Online]. Available: https://www.eclipse.org/swt/

Guo PJ, Zimmermann T, Nagappan N, Murphy B (2010) Characterizing and Predicting Which Bugs Get Fixed: An Empirical Study of Microsoft Windows. Association for Computing Machinery, New York, pp 495–504. [Online]. Available: https://doi.org/10.1145/1806799.1806871

Gyimothy T, Ferenc R, Siket I (2005) Empirical validation of object-oriented metrics on open source software for fault prediction. IEEE Trans Softw Eng 31(10):897–910CrossRef

Hamill M, Goseva-Popstojanova K (2014) Exploring the missing link: An empirical study of software fixes. Softw Test Verif Reliab 24(8):684–705. [Online]. Available: https://doi.org/10.1002/stvr.1518

Hassan AE (2009) Predicting faults using the complexity of code changes. In: 2009 IEEE 31st International Conference on Software Engineering, pp. 78–88

Jin G, Song L, Shi X, Scherpelz J, Lu S (2012) Understanding and detecting real-world performance bugs. SIGPLAN Not. 47(6):77–88CrossRef

Jpace jpace/diffj [Online]. Available: https://github.com/jpace/diffj

Kamei Y, Shihab E, Adams B, Hassan AE, Mockus A, Sinha A, Ubayashi N (2013) A large-scale empirical study of just-in-time quality assurance. IEEE Trans Softw Eng 39(6):757–773CrossRef

Kim S, Zimmermann T, Pan K, Whitehead Jr. E. J. (2006) Automatic identification of bug-introducing changes. In: 21st IEEE/ACM International Conference on Automated Software Engineering (ASE’06), pp 81–90

Kondo M, German D, Mizuno O, Choi E (2020) The impact of context metrics on just-in-time defect prediction. Empir Softw Eng 25:01CrossRef

LaToza TD, Venolia G, DeLine R (2006) Maintaining mental models: A study of developer work habits. In: Proceedings of the 28th International Conference on Software Engineering, ser. ICSE ’06. ACM, New York, pp 492–501

Li H, Shang W, Zou Y, Hassan AE (2018) Towards just-in-time suggestions for log changes (journal-first abstract). In: 2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER), pp 467–467

McDonald JH (2014) Handbook of biological statistics. sparky house publishing Baltimore. MD 3:186–189

McHugh M (2012) Interrater reliability: The kappa statistic, Biochemia medica : č,asopis Hrvatskoga društva medicinskih biokemičara / HDMB, vol 22, pp 276–82, 10

McIntosh S., Kamei Y (2018) Are fix-inducing changes a moving target? a longitudinal case study of just-in-time defect prediction. IEEE Trans Softw Eng 44(5):412–428CrossRef

Molyneaux I (2009) The Art of Application Performance Testing: Help for Programmers and Quality Assurance, 1st ed. O’Reilly Media, Inc.

Nayrolles M, Hamou-Lhadj A (2018) Clever: Combining code metrics with clone detection for just-in-time fault prevention and resolution in large industrial projects, pp 03

Neto E, Costa D, Kulesza U (2018) The impact of refactoring changes on the szz algorithm: An empirical study, pp 03

Nistor A, Jiang T, Tan L (2013) Discovering, reporting, and fixing performance bugs. in: 2013 10th working conference on mining software repositories (MSR), pp 237–246

Ohira M, Kashiwa Y, Yamatani Y, Yoshiyuki H, Maeda Y, Limsettho N, Fujino K, Hata H, Ihara A, Matsumoto K (2015)

Radu A, Nadi S (2019) A dataset of non-functional bugs. In: Proceedings of the 16th International Conference on Mining Software Repositories, ser. MSR ’19. IEEE Press, Piscataway, pp 399–403

Rodrıguez-Perez G., Nagappan M, Robles G (2020) Watch out for extrinsic bugs! a case study of their impact in just-in-time bug prediction models on the openstack project. IEEE Transactions on Software Engineering

Rosen C, Grawi B, Shihab E (2015) Commit guru: Analytics and risk prediction of software commits. In: Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering, ser. ESEC/FSE 2015. ACM, New York, pp 966–969. [Online]. Available: https://doi.org/10.1145/2786805.2803183

Sawilowsky SS (2009) New effect size rules of thumb. J Modern Appl Stat Methods 8(2):26MathSciNetCrossRef

Śliwerski J, Zimmermann T, Zeller A (2005) When do changes induce fixes?. SIGSOFT Softw Eng Notes 30(4):1–5CrossRef

Syer MD, Shang W, Jiang ZM, Hassan AE (2017) Continuous validation of performance test workloads. Autom Softw Engg 2(1):189–231. [Online]. Available: https://doi.org/10.1007/s10515-016-0196-8

Tabassum S (2020) An investigation of cross-project learning in online just-in-time software defect prediction, pp 06

Tantithamthavorn C, Hassan AE, Matsumoto K (2020) The impact of class rebalancing techniques on the performance and interpretation of defect prediction models. IEEE Trans Softw Eng 46(11):1200–1219CrossRef

Team J Eclipse java development tools (jdt). [Online]. Available: https://www.eclipse.org/jdt/

Tsakiltsidis S, Miranskyy A, Mazzawi E (2016) On automatic detection of performance bugs. In: 2016 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW), pp 132–139

Williams C, Spacco J (2008) Szz revisited: Verifying when changes induce fixes. In: Proceedings of the 2008 Workshop on Defects in Large Software Systems, ser. DEFECTS ’08. ACM, New York, pp 32–36

Yang Y, Zhou Y, Liu J, Zhao Y, Lu H, Xu L, Xu B, Leung H (2016) Effort-aware just-in-time defect prediction: Simple unsupervised models could be better than supervised models. In: Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering, ser. FSE 2016. Association for Computing Machinery, New York, pp 157–168. [Online]. Available: https://doi.org/10.1145/2950290.2950353

Zaman S, Adams B, Hassan AE (2011) Security versus performance bugs: A case study on firefox. In: Proceedings of the 8th Working Conference on Mining Software Repositories, ser. MSR ’11. ACM, New York, pp 93–102

Zaman S, Adams B, Hassan AE (2012) A qualitative study on performance bugs, 2012 9th IEEE Working Conference on Mining Software Repositories (MSR), pp 199–208

Titel: Evaluating the impact of falsely detected performance bug-inducing changes in JIT models
verfasst von: Sophia Quach
Maxime Lamothe
Bram Adams
Yasutaka Kamei
Weiyi Shang
Publikationsdatum: 01.09.2021
Verlag: Springer US
Erschienen in: Empirical Software Engineering / Ausgabe 5/2021
Print ISSN: 1382-3256
Elektronische ISSN: 1573-7616
DOI: https://doi.org/10.1007/s10664-021-10004-6

Springer Professional

Abstract

Bitte loggen Sie sich ein, um Zugang zu Ihrer Lizenz zu erhalten.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft"

Springer Professional "Technik"

Springer Professional "Wirtschaft+Technik"

Weitere Artikel der Ausgabe 5/2021

Does class size matter? An in-depth assessment of the effect of class size in software defect prediction

Measuring affective states from technical debt

Task estimation for software company employees based on computer interaction logs

Automated driver management for selenium WebDriver

The entrepreneurial logic of startup software development: A study of 40 software startups

Can Offline Testing of Deep Neural Networks Replace Their Online Testing?

Premium Partner