ABSTRACT
Empirical studies in software testing research may not be comparable, reproducible, or characteristic of practice. One reason is that real bugs are too infrequently used in software testing research. Extracting and reproducing real bugs is challenging and as a result hand-seeded faults or mutants are commonly used as a substitute. This paper presents Defects4J, a database and extensible framework providing real bugs to enable reproducible studies in software testing research. The initial version of Defects4J contains 357 real bugs from 5 real-world open source pro- grams. Each real bug is accompanied by a comprehensive test suite that can expose (demonstrate) that bug. Defects4J is extensible and builds on top of each program’s version con- trol system. Once a program is configured in Defects4J, new bugs can be added to the database with little or no effort. Defects4J features a framework to easily access faulty and fixed program versions and corresponding test suites. This framework also provides a high-level interface to common tasks in software testing research, making it easy to con- duct and reproduce empirical studies. Defects4J is publicly available at http://defects4j.org.
- V. Dallmeier and T. Zimmermann. Extraction of bug localization benchmarks from history. In Proceedings of the International Conference on Automated Software Engineering (ASE), pages 433–436, 2007. Google ScholarDigital Library
- H. Do, S. Elbaum, and G. Rothermel. Supporting controlled experimentation with testing techniques: An infrastructure and its potential impact. Empirical Software Engineering, 10(4):405–435, 2005. Google ScholarDigital Library
- G. Fraser and A. Arcuri. Evosuite: Automatic test suite generation for object-oriented software. In Proceedings of the Joint Meeting of the European Software Engineering Conference and the Symposium on the Foundations of Software Engineering (ESEC/FSE), pages 416–419, 2011. Google ScholarDigital Library
- M. Hutchins, H. Foster, T. Goradia, and T. Ostrand. Experiments of the effectiveness of dataflow- and controlflow-based test adequacy criteria. In Proceedings of the International Conference on Software Engineering (ICSE), pages 191–200, 1994. Google ScholarDigital Library
- R. Just. The Major mutation framework: Efficient and scalable mutation analysis for Java. In Proceedings of the International Symposium on Software Testing and Analysis (ISSTA), 2014. To appear. Google ScholarDigital Library
- R. Just, D. Jalali, L. Inozemtseva, M. D. Ernst, R. Holmes, and G. Fraser. Are mutants a valid substitute for real faults in software testing? Technical Report UW-CSE-14-02-02, University of Washington, 2014.Google ScholarDigital Library
- R. Just, G. M. Kapfhammer, and F. Schweiggert. Using non-redundant mutation operators and test suite prioritization to achieve efficient and scalable mutation analysis. In Proceedings of the International Symposium on Software Reliability Engineering (ISSRE), pages 11–20, 2012.Google ScholarDigital Library
- A. J. Ko and B. A. Myers. A framework and methodology for studying the causes of software errors in programming systems. Journal of Visual Languages & Computing, 16(1):41––84, 2005. Google ScholarDigital Library
Index Terms
- Defects4J: a database of existing faults to enable controlled testing studies for Java programs
Recommendations
BugsInPy: a database of existing bugs in Python programs to enable controlled testing and debugging studies
ESEC/FSE 2020: Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software EngineeringThe 2019 edition of Stack Overflow developer survey highlights that, for the first time, Python outperformed Java in terms of popularity. The gap between Python and Java further widened in the 2020 edition of the survey. Unfortunately, despite the rapid ...
Searching for Multi-fault Programs in Defects4J
Search-Based Software EngineeringAbstractDefects4J has enabled numerous software testing and debugging research work since its introduction. A large part of its contribution, and the resulting popularity, lies in the clear separation and distillation of the root cause of each individual ...
GBGallery : A benchmark and framework for game testing
AbstractSoftware bug database and benchmark are the wheels of advancing automated software testing. In practice, real bugs often occur sparsely relative to the amount of software code, the extraction and curation of which are quite labor-intensive but can ...
Comments