Published in: Empirical Software Engineering 4/2014

01-08-2014

Correlations between bugginess and time-based commit characteristics

Authors: Jon Eyolfson, Lin Tan, Patrick Lam

Abstract

Modern software is often developed over many years, accumulating hundreds of thousands of commits. Commit metadata is a rich source of time-based characteristics, including the commit’s time of day and the commit frequency and seniority of its author. The “bugginess” of a commit is also a critical property of that commit. In this paper, we investigate the correlation between a commit’s time-based characteristics and its bugginess; such results can be useful for software developers and software engineering researchers. For instance, developers or code reviewers might be well-advised to thoroughly verify commits that are more likely to be buggy. Specifically, we study the correlation between a commit’s bugginess and the time of day of the commit, the day of week of the commit, the commit frequency and seniority of the commit’s author, and whether or not the developers have marked the commit as “stable”. We survey three widely used open source projects: the Linux kernel, PostgreSQL, and the Xorg server. Our main findings include: (1) commits between midnight and 4 AM (referred to as late-night commits) are significantly buggier and commits between 8 AM and noon are less buggy, implying that developers may want to double-check their own late-night commits; (2) daily-committing developers produce less-buggy commits, indicating that we may want to promote the practice of daily-committing developers reviewing other developers’ commits; (3) the bugginess of commits versus day of week varies across software projects; and (4) stable commits are significantly less buggy than commits in general.
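To make the time-of-day comparison concrete, the following is a minimal sketch of how per-block bugginess rates could be computed. It is an illustration only and not the authors' tooling: the function name `bugginess_by_time_block`, the 4-hour block width, and the sample data are all assumptions, and the input is assumed to already carry each commit's hour in the author's local time together with a buggy/not-buggy label.

```python
from collections import defaultdict


def bugginess_by_time_block(commits, block_hours=4):
    """Return the fraction of buggy commits per time-of-day block.

    `commits` is an iterable of (hour_of_day, is_buggy) pairs, where
    hour_of_day is 0-23 in the author's local time (an assumption of
    this sketch) and is_buggy is a boolean label supplied by the caller.
    Blocks are keyed by their starting hour: 0 means 00:00-03:59, etc.
    """
    totals = defaultdict(int)
    buggy = defaultdict(int)
    for hour, is_buggy in commits:
        block = (hour // block_hours) * block_hours
        totals[block] += 1
        buggy[block] += int(is_buggy)
    return {block: buggy[block] / totals[block] for block in sorted(totals)}


# Hypothetical, hand-made sample; not data from the paper.
sample = [
    (1, True), (2, True), (3, False),     # late-night block (00:00-03:59)
    (9, False), (10, False), (11, True),  # morning block (08:00-11:59)
    (15, False),                          # afternoon block (12:00-15:59)
]
rates = bugginess_by_time_block(sample)
for block, rate in rates.items():
    print(f"{block:02d}:00-{block + 3:02d}:59  buggy fraction = {rate:.2f}")
```

A rate table like this only describes the sample; judging whether a block differs significantly from the rest would need a statistical test on top of it, which this sketch does not attempt.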

Footnotes

1. Following common practice, we drop the trailing digits of each commit id: the first 8 digits of every commit id in our data are unique.

3. We were able to identify one move of a committer from Ontario, Canada to California, and incorporated that move into our adjustments, but did not find evidence of many such moves in our set of PostgreSQL contributors.

5. Stephen Frost, a PostgreSQL committer, writes “We depend on the committers to do final review and commit, but they are a very finite resource.” in a presentation about PostgreSQL patch reviewing, found at http://www.pgcon.org/2011/schedule/events/368.en.html.

6. Table 6 in the Appendix presents a complete set of p-values evaluating the statistical significance of the per-hour commit bugginess for the Linux kernel, PostgreSQL, and Xorg.

7. Table 8 in the Appendix presents p-values evaluating the statistical significance of the per-class commit bugginess for the Linux kernel and Xorg.

8. See Table 9 in the Appendix for p-values for the Linux kernel, PostgreSQL, and Xorg.

9. Table 7 in the Appendix presents p-values evaluating the statistical significance of the combined seniority-and-hour commit bugginess for the Linux kernel and Xorg.

10. Table 10 in the Appendix presents p-values evaluating the statistical significance of the per-day commit bugginess for the Linux kernel.
 
Metadata
Title
Correlations between bugginess and time-based commit characteristics
Authors
Jon Eyolfson
Lin Tan
Patrick Lam
Publication date
01-08-2014
Publisher
Springer US
Published in
Empirical Software Engineering / Issue 4/2014
Print ISSN: 1382-3256
Electronic ISSN: 1573-7616
DOI
https://doi.org/10.1007/s10664-013-9245-0
