Published in: Empirical Software Engineering 4/2020

13 March 2020

An empirical analysis of error propagation in critical software systems

Authors: Marcello Cinque, Raffaele Della Corte, Antonio Pecchia

Abstract

Error propagation analysis is a consolidated practice for gaining insights into the error modes and effects that follow the activation of faults in software systems. A variety of approaches, such as architecture-based analysis, source code instrumentation and variable tracing, have been proposed so far to address software error propagation analysis. Although valuable, existing approaches require substantial knowledge of system internals, visibility and code manipulation, which makes them ill-suited to real-life production environments. This paper proposes an empirical analysis of error propagation. We specifically address the challenges in using fault data and error events in the logs, which are a convenient byproduct of the system’s execution. The approach centers on the construction of error reporting graphs. We apply the approach to 2,042 failure data points from two real-world critical systems in the Air Traffic Control domain from a top industry provider. The approach helps develop a deep understanding of error modes and propagation paths, which practitioners can leverage to make informed decisions on the placement of error detection mechanisms.
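As a rough illustration of how an error reporting graph might be derived from log events, the sketch below builds a weighted directed graph per failure run and then merges the runs. It is not the authors' construction: the record layout (`ErrorEvent`), the component names, and the rule that an edge links two components whose error entries are consecutive in time are all assumptions made for this example.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class ErrorEvent(NamedTuple):
    """Hypothetical log record; field names are assumptions, not the paper's schema."""
    timestamp: float  # seconds since the start of the run
    component: str    # component that reported the error
    message: str


def build_error_reporting_graph(events: Iterable[ErrorEvent]) -> Counter:
    """Return weighted directed edges as a Counter of (src, dst) -> count.

    Assumption: an edge src -> dst is added whenever dst reports an error
    immediately after src in timestamp order and the two components differ.
    """
    ordered = sorted(events, key=lambda e: e.timestamp)
    edges: Counter = Counter()
    for prev, curr in zip(ordered, ordered[1:]):
        if prev.component != curr.component:
            edges[(prev.component, curr.component)] += 1
    return edges


def merge_graphs(graphs: Iterable[Counter]) -> Counter:
    """Sum edge weights across the graphs of several failure runs."""
    total: Counter = Counter()
    for graph in graphs:
        total.update(graph)
    return total


# Two made-up failure runs with hypothetical component names.
run1 = [
    ErrorEvent(0.1, "parser", "bad input"),
    ErrorEvent(0.4, "tracker", "null track"),
    ErrorEvent(0.9, "display", "timeout"),
]
run2 = [
    ErrorEvent(0.2, "parser", "bad input"),
    ErrorEvent(0.3, "parser", "retry failed"),
    ErrorEvent(0.5, "tracker", "null track"),
]

graph1 = build_error_reporting_graph(run1)
graph2 = build_error_reporting_graph(run2)
merged = merge_graphs([graph1, graph2])

# Heaviest edges first: candidate propagation paths to inspect.
for (src, dst), weight in merged.most_common():
    print(f"{src} -> {dst}: {weight}")
```

In a sketch like this, heavily weighted edges point to component pairs where errors are frequently reported in sequence, which is one possible input when deciding where to place additional error detection mechanisms.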

Footnotes
1
In this study, we follow the notion that a software fault is a development fault originated during the coding phase. Faults can be activated by the computation process or environmental conditions and cause errors. An error is the part of the total state of the system that may lead to its subsequent service failure. A failure occurs when the delivered service deviates from correct service (Avizienis et al. 2004).
 
2
The evaluation version of these systems, testing applications and workloads are provided by the industry partner within the MINIMINDS Project (n. B21C12000710005).
 
3
Consistent with software engineering terminology, by component we mean a software unit encompassing a cohesive subset of the functionality provided by a given system; a subcomponent is a subset of functionality within the component (Lau and Wang 2007).
 
5
An assertion checks invariant properties holding in correct executions; an alert is generated if an invariant is violated at runtime (Rosenblum 1995).
 
References
Bondy JA, Murty USR, et al. (1976) Graph theory with applications, vol 290. Citeseer
Calhoun J, Snir M, Olson LN, Gropp WD (2017) Towards a more complete understanding of SDC propagation. In: Proceedings of the 26th international symposium on high-performance parallel and distributed computing, HPDC ’17. ISBN 978-1-4503-4699-3. ACM, New York, pp 131–142. https://doi.org/10.1145/3078597.3078617
Chan A, Winter S, Saissi H, Pattabiraman K, Suri N (2017) IPA: Error propagation analysis of multi-threaded programs using likely invariants. In: IEEE international conference on software testing, verification and validation (ICST), pp 184–195. https://doi.org/10.1109/ICST.2017.24
Cortellessa V, Grassi V (2007) Component-based software engineering: 10th International Symposium, CBSE 2007, Medford, MA, USA, July 9-11, 2007. Proceedings, chapter A Modeling Approach to Analyze the Impact of Error Propagation on Reliability of Component-Based Systems, pages 140–156. Springer Berlin Heidelberg, Berlin, Heidelberg. ISBN 978-3-540-73551-9. https://doi.org/10.1007/978-3-540-73551-9_10
Filieri A, Ghezzi C, Grassi V, Mirandola R (2010) Reliability analysis of component-based systems with multiple failure modes. In: Grunske L, Reussner R, Plasil F (eds) Component-Based Software Engineering. Springer, Berlin, pp 1–20
Hiller M, Jhumka A, Suri N (2002a) Propane: An environment for examining the propagation of errors in software. In: Proceedings of the 2002 ACM SIGSOFT International Symposium on Software Testing and Analysis, ISSTA, pp 81–85, New York, NY, USA. ACM. ISBN 1-58113-562-9. https://doi.org/10.1145/566172.566184
Hsueh MC, Tsai TK, Iyer R (1997) Fault injection techniques and tools. IEEE Computer 30(4):75–82
Kalyanakrishnam M, Kalbarczyk Z, Iyer R (1999) Failure data analysis of a LAN of windows NT based computers. In: Proceedings of the international symposium on reliable distributed systems (SRDS). IEEE Computer Society, pp 178–187
Lattner C, Adve V (2004) LLVM: a compilation framework for lifelong program analysis & transformation. In: Proceedings of the international symposium on code generation and optimization: feedback-directed and runtime optimization, CGO ’04. ISBN 0-7695-2102-9. IEEE Computer Society, Washington, pp 75–. http://dl.acm.org/citation.cfm?id=977395.977673
Lyu MR, et al. (1996) Handbook of software reliability engineering, vol 222. IEEE Computer Society Press, CA
Popic P, Desovski D, Abdelmoez W, Cukic B (2005) Error propagation in the reliability analysis of component based systems. In: 16th IEEE international symposium on software reliability engineering, 2005. ISSRE 2005, pp 10–62. https://doi.org/10.1109/ISSRE.2005.18
Rosenblum DS (1995) A practical approach to programming with assertions. IEEE Trans Softw Eng, p 21
Tucek J, Lu S, Huang C, Xanthos S, Zhou Y (2007) Triage: Diagnosing production run failures at the user’s site. In: Proceedings of Twenty-first ACM SIGOPS symposium on operating systems principles, SOSP ’07, pp 131–144, New York, NY, USA. ACM. ISBN 978-1-59593-591-5. https://doi.org/10.1145/1294261.1294275
Wohlin C, Runeson P, Höst M, Ohlsson MC, Regnell B, Wesslén A (2000) Experimentation in software engineering: an introduction. Kluwer Academic Publishers, Norwell. ISBN 0-7923-8682-5
Metadata
Title
An empirical analysis of error propagation in critical software systems
Authors
Marcello Cinque
Raffaele Della Corte
Antonio Pecchia
Publication date
13 March 2020
Publisher
Springer US
Published in
Empirical Software Engineering / Issue 4/2020
Print ISSN: 1382-3256
Electronic ISSN: 1573-7616
DOI
https://doi.org/10.1007/s10664-020-09801-2