Empirical Software Engineering

Published: 2 March 2019

Characterizing and identifying reverted commits

Authors: Meng Yan, Xin Xia, David Lo, Ahmed E. Hassan, Shanping Li

Published in: Empirical Software Engineering | Issue 4/2019

Abstract

In practice, a popular, coarse-grained way to recover from a problematic commit is to revert it (i.e., undo the change). However, reverted commits can cause problems for software development, such as impeding development progress and making maintenance more difficult. To mitigate these issues, we explore the following central question: can we characterize and identify which commits will be reverted? In this paper, we characterize commits using 27 commit features and build an identification model to identify commits that will be reverted. We first identify reverted commits by analyzing commit messages and comparing the changed content, and extract 27 commit features that fall into three dimensions: change, developer, and message. Then, we build an identification model (e.g., random forest) based on the extracted features. To evaluate the effectiveness of our proposed model, we perform an empirical study on ten open source projects comprising a total of 125,241 commits. Our experimental results show that our model outperforms two baselines in terms of AUC-ROC and cost-effectiveness (i.e., the percentage of reverted commits detected when inspecting 20% of the total changed LOC). Averaged across the ten studied projects, our model achieves an AUC-ROC of 0.756 and a cost-effectiveness of 0.746, significantly improving on the baselines by substantial margins. In addition, we find that “developer” is the most discriminative of the three feature dimensions for identifying reverted commits. However, using all three dimensions of commit features leads to better performance.
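The cost-effectiveness measure described in the abstract can be illustrated with a minimal sketch. The abstract defines only the quantity itself (the share of reverted commits found within 20% of the total changed LOC); the inspection order used here (highest predicted score first), the stopping rule at the budget boundary, and the name `cost_effectiveness` are assumptions made for illustration and may differ from the authors' procedure.

```python
def cost_effectiveness(commits, budget=0.20):
    """Share of reverted commits found within a changed-LOC budget.

    commits: list of (score, changed_loc, is_reverted) tuples, where
    score is a model's predicted likelihood that the commit is reverted.
    Assumption: commits are inspected in descending score order, and
    inspection stops before the first commit that would exceed the budget.
    """
    total_loc = sum(loc for _, loc, _ in commits)
    total_reverted = sum(1 for _, _, reverted in commits if reverted)
    if total_loc == 0 or total_reverted == 0:
        return 0.0

    limit = budget * total_loc  # LOC an inspector is willing to read
    inspected_loc = 0
    found = 0
    for _, loc, reverted in sorted(commits, key=lambda c: c[0], reverse=True):
        if inspected_loc + loc > limit:
            break
        inspected_loc += loc
        if reverted:
            found += 1
    return found / total_reverted


# Hypothetical data: three commits, two of which were later reverted.
example = [(0.9, 10, True), (0.8, 5, False), (0.7, 80, True)]
print(cost_effectiveness(example))
```

With this toy input the budget is 19 of 95 changed lines, so only the first two commits are inspected and one of the two reverted commits is found.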

Title: Characterizing and identifying reverted commits
Authors: Meng Yan, Xin Xia, David Lo, Ahmed E. Hassan, Shanping Li
Publication date: 2 March 2019
Publisher: Springer US
Published in: Empirical Software Engineering / Issue 4/2019
Print ISSN: 1382-3256
Electronic ISSN: 1573-7616
DOI: https://doi.org/10.1007/s10664-019-09688-8
