Skip to main content
Erschienen in: Empirical Software Engineering 1/2015

01.02.2015

Mining software repair models for reasoning on the search space of automated program fixing

verfasst von: Matias Martinez, Martin Monperrus

Erschienen in: Empirical Software Engineering | Ausgabe 1/2015

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

This paper is about understanding the nature of bug fixing by analyzing thousands of bug fix transactions of software repositories. It then places this learned knowledge in the context of automated program repair. We give extensive empirical results on the nature of human bug fixes at a large scale and a fine granularity with abstract syntax tree differencing. We set up mathematical reasoning on the search space of automated repair and the time to navigate through it. By applying our method on 14 repositories of Java software and 89,993 versioning transactions, we show that not all probabilistic repair models are equivalent.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Fußnoten
1
Most statistical tables of Spearman’s ρ stop at N=60, however since the critical values decreases with N, if ρ > 0.301 the null hypothesis is still rejected.
 
2
Spearman correlation is based on ranks, a value of 0.85 means either that most change actions are ranked similarly or that a single change action has a really different rank.
 
3
Note that our goal is not to have a good classification in terms of precision or recall.
 
4
Some degree of agreement is expected when the ratings are purely random (Cohen et al. 1960; Joseph 1971).
 
5
Since a bug fix may contain several instances of the same repair actions (e.g. several statement insertions), the repair shape may contain several times the same repair action.
 
6
Equation (1) holds if and only if we consider them as independent. If they are not, it means that we under-estimate the deep structure of the repair space, hence we over-approximate the time to navigate in the space to find the correct shape. In other words, even if the repair actions are not independent (which is likely for some of them) our conclusions are sound.
 
7
“Fix for 19346 integrating changes from Sebastian Davids” http://​goo.​gl/​d4OSi.
 
8
In more recent versions of GenProg, swapping has been replaced by “replacing”.
 
Literatur
Zurück zum Zitat Alali A, Kagdi H, Maletic J (2008) What’s a typical commit? a characterization of open source software repositories. In: Proceedings of the IEEE international conference on program comprehension Alali A, Kagdi H, Maletic J (2008) What’s a typical commit? a characterization of open source software repositories. In: Proceedings of the IEEE international conference on program comprehension
Zurück zum Zitat Arcuri A (2011) Evolutionary repair of faulty software. Appl Soft Comput 11(4):3494–3514CrossRef Arcuri A (2011) Evolutionary repair of faulty software. Appl Soft Comput 11(4):3494–3514CrossRef
Zurück zum Zitat Arcuri A, Briand L (2011) A practical guide for using statistical tests to assess randomized algorithms in software engineering. In: Proceedings of the 33rd international conference on software engineering. ACM, pp 1–10 Arcuri A, Briand L (2011) A practical guide for using statistical tests to assess randomized algorithms in software engineering. In: Proceedings of the 33rd international conference on software engineering. ACM, pp 1–10
Zurück zum Zitat Bird C, Bachmann A, Aune E, Duffy J, Bernstein A, Filkov V, Devanbu P (2009) Fair and balanced?: bias in bug-fix datasets. In: Proceedings of the 7th joint meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering, ESEC/FSE ’09. ACM, pp 121–130 Bird C, Bachmann A, Aune E, Duffy J, Bernstein A, Filkov V, Devanbu P (2009) Fair and balanced?: bias in bug-fix datasets. In: Proceedings of the 7th joint meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering, ESEC/FSE ’09. ACM, pp 121–130
Zurück zum Zitat Bóna M (2011) A walk through combinatorics: an introduction to enumeration and graph theory. World Scientific Bóna M (2011) A walk through combinatorics: an introduction to enumeration and graph theory. World Scientific
Zurück zum Zitat Carzaniga A, Gorla A, Perino N, Pezzè M (2010) Automatic workarounds for web applications. In: Proceedings of the 2010 foundations of software engineering conference. ACM, pp 237–246 Carzaniga A, Gorla A, Perino N, Pezzè M (2010) Automatic workarounds for web applications. In: Proceedings of the 2010 foundations of software engineering conference. ACM, pp 237–246
Zurück zum Zitat Cohen J et al (1960) A coefficient of agreement for nominal scales. Educ Psychol Meas 20(1):37–46CrossRef Cohen J et al (1960) A coefficient of agreement for nominal scales. Educ Psychol Meas 20(1):37–46CrossRef
Zurück zum Zitat Dallmeier V, Zeller A, Meyer B (2009) Generating fixes from object behavior anomalies. In: Proceedings of the international conference on automated software engineering Dallmeier V, Zeller A, Meyer B (2009) Generating fixes from object behavior anomalies. In: Proceedings of the international conference on automated software engineering
Zurück zum Zitat Debroy V, Wong W (2010) Using mutation to automatically suggest fixes for faulty programs. In: Proceedings of the International Conference on Software Testing, vERIFICATION AND vALIDAtion (ICST). IEEE, pp 65–74 Debroy V, Wong W (2010) Using mutation to automatically suggest fixes for faulty programs. In: Proceedings of the International Conference on Software Testing, vERIFICATION AND vALIDAtion (ICST). IEEE, pp 65–74
Zurück zum Zitat Fluri B, Wursch M, Pinzger M, Gall H (2007) Change distilling: tree differencing for fine-grained source code change extraction. IEEE Trans Softw Eng 33:725–743CrossRef Fluri B, Wursch M, Pinzger M, Gall H (2007) Change distilling: tree differencing for fine-grained source code change extraction. IEEE Trans Softw Eng 33:725–743CrossRef
Zurück zum Zitat Fluri B, Giger E, Gall HC (2008) Discovering patterns of change types. In: Proceedings of the international conference on automated software engineering Fluri B, Giger E, Gall HC (2008) Discovering patterns of change types. In: Proceedings of the international conference on automated software engineering
Zurück zum Zitat German DM (2006) An empirical study of fine-grained software modifications. Empir Software Eng 11(3):369–393CrossRef German DM (2006) An empirical study of fine-grained software modifications. Empir Software Eng 11(3):369–393CrossRef
Zurück zum Zitat Giger E, Pinzger M, Gall H, Xie T, Zimmermann T (2011) Comparing fine-grained source code changes and code churn for bug prediction. In: Working conference on mining software repositories Giger E, Pinzger M, Gall H, Xie T, Zimmermann T (2011) Comparing fine-grained source code changes and code churn for bug prediction. In: Working conference on mining software repositories
Zurück zum Zitat Giger E, Pinzger M, Gall HC (2012) Can we predict types of code changes? an empirical analysis. In: Proceedings of the working conference on mining software repositories, pp 217–226 Giger E, Pinzger M, Gall HC (2012) Can we predict types of code changes? an empirical analysis. In: Proceedings of the working conference on mining software repositories, pp 217–226
Zurück zum Zitat Gopinath D, Malik MZ, Khurshid S (2011) Specification-based program repair using sat. In: Proceedings of the international conference on tools and algorithms for the construction and analysis of systems Gopinath D, Malik MZ, Khurshid S (2011) Specification-based program repair using sat. In: Proceedings of the international conference on tools and algorithms for the construction and analysis of systems
Zurück zum Zitat Hattori L, Lanza M (2008) On the nature of commits. In: Proceedings of 4th international ERCIM workshop on software evolution and evolvabillity (EVOL), pp 63–71 Hattori L, Lanza M (2008) On the nature of commits. In: Proceedings of 4th international ERCIM workshop on software evolution and evolvabillity (EVOL), pp 63–71
Zurück zum Zitat Hindle A, German DM, Holt R (2008) What do large commits tell us?: a taxonomical study of large commits. In: Proceedings of the international working conference on mining software repositories Hindle A, German DM, Holt R (2008) What do large commits tell us?: a taxonomical study of large commits. In: Proceedings of the international working conference on mining software repositories
Zurück zum Zitat Hindle A, German D, Godfrey M, Holt R (2009) Automatic classication of large changes into maintenance categories. In: Proceedings of the debroInternational conference on program comprehension Hindle A, German D, Godfrey M, Holt R (2009) Automatic classication of large changes into maintenance categories. In: Proceedings of the debroInternational conference on program comprehension
Zurück zum Zitat Joseph FL (1971) Measuring nominal scale agreement among many raters. Psychol Bull 76(5):378–382CrossRef Joseph FL (1971) Measuring nominal scale agreement among many raters. Psychol Bull 76(5):378–382CrossRef
Zurück zum Zitat Kim D, Nam J, Song J, Kim S (2013) Automatic patch generation learned from human-written patches. In: Proceedings of the 2013 international conference on software engineering. IEEE Press, pp 802–811 Kim D, Nam J, Song J, Kim S (2013) Automatic patch generation learned from human-written patches. In: Proceedings of the 2013 international conference on software engineering. IEEE Press, pp 802–811
Zurück zum Zitat Kim S, Pan K, Whitehead EJ (2006) Memories of bug fixes. In: Proceedings of the 14th ACM SIGSOFT international symposium on foundations of software engineering Kim S, Pan K, Whitehead EJ (2006) Memories of bug fixes. In: Proceedings of the 14th ACM SIGSOFT international symposium on foundations of software engineering
Zurück zum Zitat Landis JR, Koch GG (1977) The measurement of observer agreement for categorical data. Biometrics 33(1):159–174CrossRef Landis JR, Koch GG (1977) The measurement of observer agreement for categorical data. Biometrics 33(1):159–174CrossRef
Zurück zum Zitat Le Goues C,Weimer W, Forrest S (2012) Representations and operators for improving evolutionary software repair. In: Proceedings of GECCO, pp 959–966 Le Goues C,Weimer W, Forrest S (2012) Representations and operators for improving evolutionary software repair. In: Proceedings of GECCO, pp 959–966
Zurück zum Zitat Livshits B, Zimmermann T (2005) Dynamine: finding common error patterns by mining software revision histories. In: Proceedings of the European software engineering conference held jointly with International Symposium on Foundations of Software Engineering Livshits B, Zimmermann T (2005) Dynamine: finding common error patterns by mining software revision histories. In: Proceedings of the European software engineering conference held jointly with International Symposium on Foundations of Software Engineering
Zurück zum Zitat Martinez M, Monperrus M (2012a) Mining repair actions for guiding automated program fixing. Tech. rep., INRIA Martinez M, Monperrus M (2012a) Mining repair actions for guiding automated program fixing. Tech. rep., INRIA
Zurück zum Zitat Martinez M, Monperrus M (2012b) Appendix of “On Mining Software Repair Models and their Relations to the Search Space of Automated Program Fixing.” Tech. Rep. hal-00903804, INRIA, Martinez M, Monperrus M (2012b) Appendix of “On Mining Software Repair Models and their Relations to the Search Space of Automated Program Fixing.” Tech. Rep. hal-00903804, INRIA,
Zurück zum Zitat Monperrus M, Martinez M (2012) Cvs-vintage: a dataset of 14 cvs repositories of java software. Tech. rep., INRIA Monperrus M, Martinez M (2012) Cvs-vintage: a dataset of 14 cvs repositories of java software. Tech. rep., INRIA
Zurück zum Zitat Murgia A, Concas G, Marchesi M, Tonelli R (2010) A machine learning approach for text categorization of fixing-issue commits on CVS. In: Proceedings of the international symposium on empirical software engineering and measurement Murgia A, Concas G, Marchesi M, Tonelli R (2010) A machine learning approach for text categorization of fixing-issue commits on CVS. In: Proceedings of the international symposium on empirical software engineering and measurement
Zurück zum Zitat Neamtiu I, Foster JS, Hicks M (2005) Understanding source code evolution using abstract syntax tree matching. In: Proceedings of the international workshop on mining software repositories Neamtiu I, Foster JS, Hicks M (2005) Understanding source code evolution using abstract syntax tree matching. In: Proceedings of the international workshop on mining software repositories
Zurück zum Zitat Pan K, Kim S, Whitehead EJ (2008) Toward an understanding of bug fix patterns. Empir Softw Eng 14(3):286–315CrossRef Pan K, Kim S, Whitehead EJ (2008) Toward an understanding of bug fix patterns. Empir Softw Eng 14(3):286–315CrossRef
Zurück zum Zitat Purushothaman R, Perry D (2005) Toward understanding the rhetoric of small source code changes. IEEE Trans Softw Eng 31:511–526CrossRef Purushothaman R, Perry D (2005) Toward understanding the rhetoric of small source code changes. IEEE Trans Softw Eng 31:511–526CrossRef
Zurück zum Zitat Raghavan S, Rohana R, Leon D, Podgurski A, Augustine V (2004) Dex: a semantic-graph differencing tool for studying changes in large code bases. In: Proceedings of the 20th IEEE international conference on software maintenance Raghavan S, Rohana R, Leon D, Podgurski A, Augustine V (2004) Dex: a semantic-graph differencing tool for studying changes in large code bases. In: Proceedings of the 20th IEEE international conference on software maintenance
Zurück zum Zitat Robbes R (2008) Of change and software. PhD thesis, University of Lugano Robbes R (2008) Of change and software. PhD thesis, University of Lugano
Zurück zum Zitat Vaucher S, Sahraoui H, Vaucher J (2008) Discovering new change patterns in object-oriented systems. In: Proceedings of the working conference on reverse engineering Vaucher S, Sahraoui H, Vaucher J (2008) Discovering new change patterns in object-oriented systems. In: Proceedings of the working conference on reverse engineering
Zurück zum Zitat Wei Y, Pei Y, Furia CA, Silva LS, Buchholz S, Meyer B, Zeller A (2010) Automated fixing of programs with contracts. In: Proceedings of the international symposium on software testing and analysis. AC Wei Y, Pei Y, Furia CA, Silva LS, Buchholz S, Meyer B, Zeller A (2010) Automated fixing of programs with contracts. In: Proceedings of the international symposium on software testing and analysis. AC
Zurück zum Zitat Weimer W (2006) Patches as better bug reports. In: Proceedings of the international conference on generative programming and component engineering Weimer W (2006) Patches as better bug reports. In: Proceedings of the international conference on generative programming and component engineering
Zurück zum Zitat Weimer W, Nguyen T, Goues CL, Forrest S (2009) Automatically finding patches using genetic programming. In: Proceedings of the international conference on software engineering Weimer W, Nguyen T, Goues CL, Forrest S (2009) Automatically finding patches using genetic programming. In: Proceedings of the international conference on software engineering
Zurück zum Zitat Williams CC, Hollingsworth JK (2005) Automatic mining of source code repositories to improve bug finding techniques. IEEE Trans Softw Eng 31(6):466–480CrossRef Williams CC, Hollingsworth JK (2005) Automatic mining of source code repositories to improve bug finding techniques. IEEE Trans Softw Eng 31(6):466–480CrossRef
Zurück zum Zitat Wu R, Zhang H, Kim S, Cheung S-C (2011) Relink: recovering links between bugs and changes. In: Proceedings of the 2011 foundations of software engineering conference, pp 15–25 Wu R, Zhang H, Kim S, Cheung S-C (2011) Relink: recovering links between bugs and changes. In: Proceedings of the 2011 foundations of software engineering conference, pp 15–25
Metadaten
Titel
Mining software repair models for reasoning on the search space of automated program fixing
verfasst von
Matias Martinez
Martin Monperrus
Publikationsdatum
01.02.2015
Verlag
Springer US
Erschienen in
Empirical Software Engineering / Ausgabe 1/2015
Print ISSN: 1382-3256
Elektronische ISSN: 1573-7616
DOI
https://doi.org/10.1007/s10664-013-9282-8

Weitere Artikel der Ausgabe 1/2015

Empirical Software Engineering 1/2015 Zur Ausgabe