survey

Assessing Dependability with Software Fault Injection: A Survey

Published: 08 February 2016

Abstract

With the rise of software complexity, software-related accidents represent a significant threat for computer-based systems. Software Fault Injection is a method to anticipate worst-case scenarios caused by faulty software through the deliberate injection of software faults. This survey provides a comprehensive overview of the state of the art on Software Fault Injection to support researchers and practitioners in the selection of the approach that best fits their dependability assessment goals, and it discusses how these approaches have evolved to achieve fault representativeness, efficiency, and usability. The survey includes a description of relevant applications of Software Fault Injection in the context of fault-tolerant systems.

Published in

ACM Computing Surveys, Volume 48, Issue 3
February 2016
619 pages
ISSN: 0360-0300
EISSN: 1557-7341
DOI: 10.1145/2856149
Editor: Sartaj Sahni

Copyright © 2016 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 8 February 2016
• Accepted: 1 October 2015
• Revised: 1 July 2015
• Received: 1 June 2013

Qualifiers

• survey
• Research
• Refereed
