DOI: 10.1145/2483760.2483785

Using automated program repair for evaluating the effectiveness of fault localization techniques

Published: 15 July 2013

ABSTRACT

Many automated fault localization (AFL) techniques have been introduced to assist developers in debugging. Prior studies evaluate these techniques from the developer's viewpoint, measuring how much benefit developers gain from a localization technique while debugging. However, such evaluations are not always suitable, because the complex debugging behavior of developers makes that benefit difficult to quantify precisely. In addition, recent user studies have shown that developers working with AFL do not correct defects more efficiently than developers using only traditional debugging techniques such as breakpoints, even when the effectiveness of AFL is artificially improved. In this paper we propose a new research direction: developing AFL techniques from the viewpoint of fully automated debugging, including automated program repair, for which fault localization is a necessary step. We also introduce the NCP score as a measurement for assessing and comparing techniques from this perspective. Our experiment on 15 popular AFL techniques and 11 subject programs shipping with real-life field failures provides evidence that AFL techniques that perform well in prior studies do not necessarily have better localization effectiveness according to the NCP score. We also observe that Jaccard outperforms the other techniques in our experiment.
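Several of the AFL techniques compared in this line of work, including Jaccard, are spectrum-based: each assigns a suspiciousness score to every program statement from the coverage of passing and failing tests, and the resulting ranking tells an automated repair tool where to attempt candidate patches first. The following is a minimal illustrative sketch of ranking statements with the Jaccard formula, a_ef / (a_ef + a_nf + a_ep); it is not taken from the paper, and the names coverage, failed, and jaccard_ranking are hypothetical.

    # Minimal sketch (assumption, not the paper's implementation): rank statements
    # by the Jaccard suspiciousness formula, a_ef / (a_ef + a_nf + a_ep).
    # 'coverage' maps each test name to the set of statements it executes;
    # 'failed' is the set of failing test names.
    def jaccard_ranking(coverage, failed):
        statements = set().union(*coverage.values())
        total_failed = len(failed)              # a_ef + a_nf
        scores = {}
        for s in statements:
            # a_ef: failing tests that execute s; a_ep: passing tests that execute s
            a_ef = sum(1 for t, cov in coverage.items() if t in failed and s in cov)
            a_ep = sum(1 for t, cov in coverage.items() if t not in failed and s in cov)
            denom = total_failed + a_ep
            scores[s] = a_ef / denom if denom else 0.0
        return sorted(statements, key=scores.get, reverse=True)

    # Example: only the failing test t2 covers s3, so s3 is ranked most suspicious.
    coverage = {"t1": {"s1", "s2"}, "t2": {"s2", "s3"}}
    failed = {"t2"}
    print(jaccard_ranking(coverage, failed))    # ['s3', 's2', 's1']

In the paper's setting, such a ranking guides where an automated repair tool attempts candidate patches, and the NCP score assesses and compares localization techniques from that fully automated perspective rather than from a developer's inspection effort.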


Published in
ISSTA 2013: Proceedings of the 2013 International Symposium on Software Testing and Analysis
July 2013, 381 pages
ISBN: 9781450321594
DOI: 10.1145/2483760
Copyright © 2013 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
