nach oben

Empirical Software Engineering

Erschienen in:

01.02.2015

Mining software repair models for reasoning on the search space of automated program fixing

verfasst von: Matias Martinez, Martin Monperrus

Erschienen in: Empirical Software Engineering | Ausgabe 1/2015

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Aus

Abstract

This paper is about understanding the nature of bug fixing by analyzing thousands of bug fix transactions of software repositories. It then places this learned knowledge in the context of automated program repair. We give extensive empirical results on the nature of human bug fixes at a large scale and a fine granularity with abstract syntax tree differencing. We set up mathematical reasoning on the search space of automated repair and the time to navigate through it. By applying our method on 14 repositories of Java software and 89,993 versioning transactions, we show that not all probabilistic repair models are equivalent.

Vorheriger Artikel An experimental investigation comparing individual and collaborative work productivity when using desktop and cloud modeling tools

Nächster Artikel UML model refactoring: a systematic literature review

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Most statistical tables of Spearman’s ρ stop at N=60, however since the critical values decreases with N, if ρ > 0.301 the null hypothesis is still rejected.

Spearman correlation is based on ranks, a value of 0.85 means either that most change actions are ranked similarly or that a single change action has a really different rank.

Note that our goal is not to have a good classification in terms of precision or recall.

Some degree of agreement is expected when the ratings are purely random (Cohen et al. 1960; Joseph 1971).

Since a bug fix may contain several instances of the same repair actions (e.g. several statement insertions), the repair shape may contain several times the same repair action.

Equation (1) holds if and only if we consider them as independent. If they are not, it means that we under-estimate the deep structure of the repair space, hence we over-approximate the time to navigate in the space to find the correct shape. In other words, even if the repair actions are not independent (which is likely for some of them) our conclusions are sound.

“Fix for 19346 integrating changes from Sebastian Davids” http://goo.gl/d4OSi.

In more recent versions of GenProg, swapping has been replaced by “replacing”.

Alali A, Kagdi H, Maletic J (2008) What’s a typical commit? a characterization of open source software repositories. In: Proceedings of the IEEE international conference on program comprehension

Arcuri A (2011) Evolutionary repair of faulty software. Appl Soft Comput 11(4):3494–3514CrossRef

Arcuri A, Briand L (2011) A practical guide for using statistical tests to assess randomized algorithms in software engineering. In: Proceedings of the 33rd international conference on software engineering. ACM, pp 1–10

Bird C, Bachmann A, Aune E, Duffy J, Bernstein A, Filkov V, Devanbu P (2009) Fair and balanced?: bias in bug-fix datasets. In: Proceedings of the 7th joint meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering, ESEC/FSE ’09. ACM, pp 121–130

Bóna M (2011) A walk through combinatorics: an introduction to enumeration and graph theory. World Scientific

Carzaniga A, Gorla A, Perino N, Pezzè M (2010) Automatic workarounds for web applications. In: Proceedings of the 2010 foundations of software engineering conference. ACM, pp 237–246

Cohen J et al (1960) A coefficient of agreement for nominal scales. Educ Psychol Meas 20(1):37–46CrossRef

Dallmeier V, Zeller A, Meyer B (2009) Generating fixes from object behavior anomalies. In: Proceedings of the international conference on automated software engineering

Debroy V, Wong W (2010) Using mutation to automatically suggest fixes for faulty programs. In: Proceedings of the International Conference on Software Testing, vERIFICATION AND vALIDAtion (ICST). IEEE, pp 65–74

Department of Mathematics of the University of York (2013) Statistical tables. http://www.york.ac.uk/depts/maths/tables/. Accessed 9 Apr 2013

Fluri B, Wursch M, Pinzger M, Gall H (2007) Change distilling: tree differencing for fine-grained source code change extraction. IEEE Trans Softw Eng 33:725–743CrossRef

Fluri B, Giger E, Gall HC (2008) Discovering patterns of change types. In: Proceedings of the international conference on automated software engineering

German DM (2006) An empirical study of fine-grained software modifications. Empir Software Eng 11(3):369–393CrossRef

Giger E, Pinzger M, Gall H, Xie T, Zimmermann T (2011) Comparing fine-grained source code changes and code churn for bug prediction. In: Working conference on mining software repositories

Giger E, Pinzger M, Gall HC (2012) Can we predict types of code changes? an empirical analysis. In: Proceedings of the working conference on mining software repositories, pp 217–226

Gopinath D, Malik MZ, Khurshid S (2011) Specification-based program repair using sat. In: Proceedings of the international conference on tools and algorithms for the construction and analysis of systems

Hattori L, Lanza M (2008) On the nature of commits. In: Proceedings of 4th international ERCIM workshop on software evolution and evolvabillity (EVOL), pp 63–71

Hindle A, German DM, Holt R (2008) What do large commits tell us?: a taxonomical study of large commits. In: Proceedings of the international working conference on mining software repositories

Hindle A, German D, Godfrey M, Holt R (2009) Automatic classication of large changes into maintenance categories. In: Proceedings of the debroInternational conference on program comprehension

Joseph FL (1971) Measuring nominal scale agreement among many raters. Psychol Bull 76(5):378–382CrossRef

Kim D, Nam J, Song J, Kim S (2013) Automatic patch generation learned from human-written patches. In: Proceedings of the 2013 international conference on software engineering. IEEE Press, pp 802–811

Kim S, Pan K, Whitehead EJ (2006) Memories of bug fixes. In: Proceedings of the 14th ACM SIGSOFT international symposium on foundations of software engineering

Landis JR, Koch GG (1977) The measurement of observer agreement for categorical data. Biometrics 33(1):159–174CrossRef

Le Goues C,Weimer W, Forrest S (2012) Representations and operators for improving evolutionary software repair. In: Proceedings of GECCO, pp 959–966

Livshits B, Zimmermann T (2005) Dynamine: finding common error patterns by mining software revision histories. In: Proceedings of the European software engineering conference held jointly with International Symposium on Foundations of Software Engineering

Martinez M, Monperrus M (2012a) Mining repair actions for guiding automated program fixing. Tech. rep., INRIA

Martinez M, Monperrus M (2012b) Appendix of “On Mining Software Repair Models and their Relations to the Search Space of Automated Program Fixing.” Tech. Rep. hal-00903804, INRIA,

Monperrus M, Martinez M (2012) Cvs-vintage: a dataset of 14 cvs repositories of java software. Tech. rep., INRIA

Murgia A, Concas G, Marchesi M, Tonelli R (2010) A machine learning approach for text categorization of fixing-issue commits on CVS. In: Proceedings of the international symposium on empirical software engineering and measurement

Neamtiu I, Foster JS, Hicks M (2005) Understanding source code evolution using abstract syntax tree matching. In: Proceedings of the international workshop on mining software repositories

Pan K, Kim S, Whitehead EJ (2008) Toward an understanding of bug fix patterns. Empir Softw Eng 14(3):286–315CrossRef

Purushothaman R, Perry D (2005) Toward understanding the rhetoric of small source code changes. IEEE Trans Softw Eng 31:511–526CrossRef

Raghavan S, Rohana R, Leon D, Podgurski A, Augustine V (2004) Dex: a semantic-graph differencing tool for studying changes in large code bases. In: Proceedings of the 20th IEEE international conference on software maintenance

Robbes R (2008) Of change and software. PhD thesis, University of Lugano

Vaucher S, Sahraoui H, Vaucher J (2008) Discovering new change patterns in object-oriented systems. In: Proceedings of the working conference on reverse engineering

Wei Y, Pei Y, Furia CA, Silva LS, Buchholz S, Meyer B, Zeller A (2010) Automated fixing of programs with contracts. In: Proceedings of the international symposium on software testing and analysis. AC

Weimer W (2006) Patches as better bug reports. In: Proceedings of the international conference on generative programming and component engineering

Weimer W, Nguyen T, Goues CL, Forrest S (2009) Automatically finding patches using genetic programming. In: Proceedings of the international conference on software engineering

Williams CC, Hollingsworth JK (2005) Automatic mining of source code repositories to improve bug finding techniques. IEEE Trans Softw Eng 31(6):466–480CrossRef

Wu R, Zhang H, Kim S, Cheung S-C (2011) Relink: recovering links between bugs and changes. In: Proceedings of the 2011 foundations of software engineering conference, pp 15–25

Titel: Mining software repair models for reasoning on the search space of automated program fixing
verfasst von: Matias Martinez
Martin Monperrus
Publikationsdatum: 01.02.2015
Verlag: Springer US
Erschienen in: Empirical Software Engineering / Ausgabe 1/2015
Print ISSN: 1382-3256
Elektronische ISSN: 1573-7616
DOI: https://doi.org/10.1007/s10664-013-9282-8

Springer Professional

Abstract

Bitte loggen Sie sich ein, um Zugang zu Ihrer Lizenz zu erhalten.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft"

Springer Professional "Technik"

Springer Professional "Wirtschaft+Technik"

Weitere Artikel der Ausgabe 1/2015

An experimental investigation comparing individual and collaborative work productivity when using desktop and cloud modeling tools

UML model refactoring: a systematic literature review

A practical guide to controlled experiments of software engineering tools with human participants

Understanding the Influence of User Participation and Involvement on System Success – a Systematic Mapping Study

HAZOP-based identification of events in use cases

Studying the relationship between logging characteristics and the code quality of platform software