Skip to main content
Erschienen in:
Buchtitelbild

2017 | OriginalPaper | Buchkapitel

Analysis of Relationship Between Modes of Intercomputer Communications and Fault Types in Redundant Computer Systems

verfasst von : Refik Samet, Nermin Samet

Erschienen in: Transactions on Computational Science XXIX

Verlag: Springer Berlin Heidelberg

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

This paper analyzes the reasons of appearance of non - Byzantine and Byzantine fault types in redundant computer systems. The proposed approach is based on analysis of the relationship between the modes of intercomputer communications and fault types. This analysis allows the users to design the redundant computer systems in such a way that Byzantine faults cannot appear. Consequently, designing the redundant computer systems, in which Byzantine faults cannot appear, allows the designers to increase the degree of reliability by preventing the masking of any forms of appearance of faults and by decreasing the time period of checkpoints. In addition, this approach decreases the cost of software and hardware involved in the execution of fault-tolerant procedures.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Anhänge
Nur mit Berechtigung zugänglich
Literatur
1.
Zurück zum Zitat Avizienis, A., Laprie, J.C., Randell, B., Landwehr, C.: Basic concepts and taxonomy of dependable and secure computing. IEEE Trans. Dependable Secur. Comput. 1(1), 11–33 (2004)CrossRef Avizienis, A., Laprie, J.C., Randell, B., Landwehr, C.: Basic concepts and taxonomy of dependable and secure computing. IEEE Trans. Dependable Secur. Comput. 1(1), 11–33 (2004)CrossRef
2.
Zurück zum Zitat Bentley, J.: Introduction to Reliability and Quality Engineering. Addison-Wesley, Reading (1999) Bentley, J.: Introduction to Reliability and Quality Engineering. Addison-Wesley, Reading (1999)
3.
Zurück zum Zitat Pradhan, D.K. (ed.): Fault-tolerant Computer System Design. Prentice-Hall Inc., Upper Saddle River (1996) Pradhan, D.K. (ed.): Fault-tolerant Computer System Design. Prentice-Hall Inc., Upper Saddle River (1996)
4.
Zurück zum Zitat Kwak, S.W., Choi, B.J., Kim, B.K.: An optimal checkpointing-strategy for real-time control systems under transient faults. IEEE Trans. Reliab. 50(3), 293–301 (2001)CrossRef Kwak, S.W., Choi, B.J., Kim, B.K.: An optimal checkpointing-strategy for real-time control systems under transient faults. IEEE Trans. Reliab. 50(3), 293–301 (2001)CrossRef
5.
Zurück zum Zitat Zhang, Y., Jiang, J.: Integrated active fault-tolerant control using IMM approach. IEEE Trans. Aerosp. Electron. Syst. 37(4), 1221–1235 (2001)CrossRef Zhang, Y., Jiang, J.: Integrated active fault-tolerant control using IMM approach. IEEE Trans. Aerosp. Electron. Syst. 37(4), 1221–1235 (2001)CrossRef
6.
Zurück zum Zitat Alvisi, L., Malkhi, D., Pierce, E., Reiter, M.K.: Fault detection for Byzantine quorum systems. IEEE Trans. Parallel Distrib. Syst. 12(9), 996–1007 (2001)CrossRef Alvisi, L., Malkhi, D., Pierce, E., Reiter, M.K.: Fault detection for Byzantine quorum systems. IEEE Trans. Parallel Distrib. Syst. 12(9), 996–1007 (2001)CrossRef
7.
Zurück zum Zitat Lamport, L., Shostak, R., Pease, M.: The Byzantine generals problem. ACM Trans. Program. Lang. Syst. 4(3), 382–401 (1982)CrossRefMATH Lamport, L., Shostak, R., Pease, M.: The Byzantine generals problem. ACM Trans. Program. Lang. Syst. 4(3), 382–401 (1982)CrossRefMATH
8.
Zurück zum Zitat Lima, G.M., Burns, A.: A consensus protocol for CAN-based systems. In: 24th IEEE Real-Time Systems Symposium, RTSS 2003, pp. 420–429. IEEE (2003) Lima, G.M., Burns, A.: A consensus protocol for CAN-based systems. In: 24th IEEE Real-Time Systems Symposium, RTSS 2003, pp. 420–429. IEEE (2003)
9.
Zurück zum Zitat Cristian, F., Aghili, H., Strong, R., Dolev, D.: Atomic broadcast: from simple message diffusion to Byzantine agreement. Inf. Comput. 118(1), 158–179 (1995)MathSciNetCrossRefMATH Cristian, F., Aghili, H., Strong, R., Dolev, D.: Atomic broadcast: from simple message diffusion to Byzantine agreement. Inf. Comput. 118(1), 158–179 (1995)MathSciNetCrossRefMATH
10.
11.
Zurück zum Zitat Fitzi, M., Gottesman, D., Hirt, M., Holenstein, T., Smith, A.: Detectable Byzantine agreement secure against faulty majorities. In: Proceedings of the Twenty-First Annual Symposium on Principles of Distributed Computing, pp. 118–126. ACM (2002) Fitzi, M., Gottesman, D., Hirt, M., Holenstein, T., Smith, A.: Detectable Byzantine agreement secure against faulty majorities. In: Proceedings of the Twenty-First Annual Symposium on Principles of Distributed Computing, pp. 118–126. ACM (2002)
12.
Zurück zum Zitat Fitzi, M., Hirt, M.: Optimally efficient multi-valued Byzantine agreement. In: Proceedings of the Twenty-Fifth Annual ACM Symposium on Principles of Distributed Computing, pp. 163–168. ACM (2006) Fitzi, M., Hirt, M.: Optimally efficient multi-valued Byzantine agreement. In: Proceedings of the Twenty-Fifth Annual ACM Symposium on Principles of Distributed Computing, pp. 163–168. ACM (2006)
13.
Zurück zum Zitat Bao, F., Igarishi, Y.: Reliable broadcasting in product networks with Byzantine faults. In: Proceedings of Annual Symposium on Fault Tolerant Computing, pp. 262–271. IEEE (1996) Bao, F., Igarishi, Y.: Reliable broadcasting in product networks with Byzantine faults. In: Proceedings of Annual Symposium on Fault Tolerant Computing, pp. 262–271. IEEE (1996)
14.
Zurück zum Zitat Keichafer, R.M., Walter, C.J., Finn, A.M., Thambidurai, P.M.: The MAFT architecture for distributed fault tolerance. IEEE Trans. Comput. 37(4), 398–404 (1988)CrossRef Keichafer, R.M., Walter, C.J., Finn, A.M., Thambidurai, P.M.: The MAFT architecture for distributed fault tolerance. IEEE Trans. Comput. 37(4), 398–404 (1988)CrossRef
15.
Zurück zum Zitat Powell, D., Arlat, J., Beus-Dukic, L., Bondavalli, A., Coppola, P., Fantechi, A., Jenn, E., Rabejac, C., Wellings, A.: GUARDS: a generic upgradable architecture for real-time dependable systems. IEEE Trans. Parallel Distrib. Syst. 10(6), 580–599 (1999)CrossRef Powell, D., Arlat, J., Beus-Dukic, L., Bondavalli, A., Coppola, P., Fantechi, A., Jenn, E., Rabejac, C., Wellings, A.: GUARDS: a generic upgradable architecture for real-time dependable systems. IEEE Trans. Parallel Distrib. Syst. 10(6), 580–599 (1999)CrossRef
16.
Zurück zum Zitat Totel, E., Beus-Dukic, L., Blanquart, J.P., Deswarte, Y., Powell, D., Wellings, A.: Integrity management in GUARDS. In: Davies, N., Jochen, S., Raymond, K. (eds.) Middleware 1998, pp. 105–122. Springer, London (1998) Totel, E., Beus-Dukic, L., Blanquart, J.P., Deswarte, Y., Powell, D., Wellings, A.: Integrity management in GUARDS. In: Davies, N., Jochen, S., Raymond, K. (eds.) Middleware 1998, pp. 105–122. Springer, London (1998)
17.
Zurück zum Zitat Palumbo, D.L., Butler, R.W.: A performance evaluation of the software-implemented fault-tolerance computer. J. Guidance Control Dyn. 9(2), 175–180 (1986)CrossRef Palumbo, D.L., Butler, R.W.: A performance evaluation of the software-implemented fault-tolerance computer. J. Guidance Control Dyn. 9(2), 175–180 (1986)CrossRef
18.
Zurück zum Zitat Hopkins, A.L., Smith, T.B., Lala, J.H.: FTMP: a highly reliable fault-tolerant multiprocess for aircraft. Proc. IEEE 66(10), 1221–1239 (1978)CrossRef Hopkins, A.L., Smith, T.B., Lala, J.H.: FTMP: a highly reliable fault-tolerant multiprocess for aircraft. Proc. IEEE 66(10), 1221–1239 (1978)CrossRef
19.
Zurück zum Zitat Han, S., Shin, K.G.: Experimental evaluation of failure-detection schemes in real-time communication networks. In: Twenty-Seventh Annual International Symposium on Fault-Tolerant Computing, FTCS-27, Digest of Papers, pp. 122–131. IEEE (1997) Han, S., Shin, K.G.: Experimental evaluation of failure-detection schemes in real-time communication networks. In: Twenty-Seventh Annual International Symposium on Fault-Tolerant Computing, FTCS-27, Digest of Papers, pp. 122–131. IEEE (1997)
20.
Zurück zum Zitat Rufino, J., Verissimo, P., Arroz, G., Almeida, C., Rodrigues, L.: Fault-tolerant broadcasts in CAN. In: Twenty-Eighth Annual International Symposium on Fault-Tolerant Computing, Digest of Papers, pp. 150–159. IEEE (1998) Rufino, J., Verissimo, P., Arroz, G., Almeida, C., Rodrigues, L.: Fault-tolerant broadcasts in CAN. In: Twenty-Eighth Annual International Symposium on Fault-Tolerant Computing, Digest of Papers, pp. 150–159. IEEE (1998)
21.
Zurück zum Zitat AlMohammad, B., Bose, B.: Fault-tolerant communication algorithms in toroidal networks. IEEE Trans. Parallel Distrib. Syst. 10(10), 976–983 (1999)CrossRef AlMohammad, B., Bose, B.: Fault-tolerant communication algorithms in toroidal networks. IEEE Trans. Parallel Distrib. Syst. 10(10), 976–983 (1999)CrossRef
22.
Zurück zum Zitat Hsieh, H.C., Chiang, M.L.: A new solution for the Byzantine agreement problem. J. Parallel Distrib. Comput. 71(10), 1261–1277 (2011)CrossRefMATH Hsieh, H.C., Chiang, M.L.: A new solution for the Byzantine agreement problem. J. Parallel Distrib. Comput. 71(10), 1261–1277 (2011)CrossRefMATH
23.
Zurück zum Zitat Saini, P., Singh, A.K.: An efficient Byzantine fault tolerant agreement. In: AIP Conference Proceedings, vol. 1324, no. 1 (2010) Saini, P., Singh, A.K.: An efficient Byzantine fault tolerant agreement. In: AIP Conference Proceedings, vol. 1324, no. 1 (2010)
24.
Zurück zum Zitat Wang, S.S., Yan, K.Q., Wang, S.C.: An optimal solution for Byzantine agreement under a hierarchical cluster-oriented mobile ad hoc network. Comput. Electr. Eng. 36(1), 100–113 (2010)CrossRefMATH Wang, S.S., Yan, K.Q., Wang, S.C.: An optimal solution for Byzantine agreement under a hierarchical cluster-oriented mobile ad hoc network. Comput. Electr. Eng. 36(1), 100–113 (2010)CrossRefMATH
25.
Zurück zum Zitat Moniz, H., Neves, N.F., Correia, M.: Byzantine fault-tolerant consensus in wireless ad hoc networks. IEEE Trans. Mobile Comput. 12(12), 2441–2454 (2013)CrossRef Moniz, H., Neves, N.F., Correia, M.: Byzantine fault-tolerant consensus in wireless ad hoc networks. IEEE Trans. Mobile Comput. 12(12), 2441–2454 (2013)CrossRef
26.
Zurück zum Zitat Veronese, G.S., Correia, M., Bessani, A.N., Lung, L.C., Verissimo, P.: Efficient Byzantine fault-tolerance. IEEE Trans. Comput. 62(1), 16–30 (2013)MathSciNetCrossRef Veronese, G.S., Correia, M., Bessani, A.N., Lung, L.C., Verissimo, P.: Efficient Byzantine fault-tolerance. IEEE Trans. Comput. 62(1), 16–30 (2013)MathSciNetCrossRef
27.
Zurück zum Zitat Kotla, R., Clement, A., Wong, E., Alvisi, L., Dahlin, M.: Zyzzyva: speculative Byzantine fault tolerance. Commun. ACM 51(11), 86–95 (2008)CrossRef Kotla, R., Clement, A., Wong, E., Alvisi, L., Dahlin, M.: Zyzzyva: speculative Byzantine fault tolerance. Commun. ACM 51(11), 86–95 (2008)CrossRef
28.
Zurück zum Zitat Keidar, I., Rajsbaum, S.: On the cost of fault-tolerant consensus when there are no faults: preliminary version. SIGACT News 32(2), 45–63 (2001)CrossRef Keidar, I., Rajsbaum, S.: On the cost of fault-tolerant consensus when there are no faults: preliminary version. SIGACT News 32(2), 45–63 (2001)CrossRef
29.
Zurück zum Zitat Banu, N., Izumi, T., Wada, K.: Adaptive and doubly-expedited one-step consensus in Byzantine asynchronous systems. Parallel Process. Lett. 21(04), 461–477 (2011)MathSciNetCrossRefMATH Banu, N., Izumi, T., Wada, K.: Adaptive and doubly-expedited one-step consensus in Byzantine asynchronous systems. Parallel Process. Lett. 21(04), 461–477 (2011)MathSciNetCrossRefMATH
30.
Zurück zum Zitat Patra, A., Choudhury, A., Rangan, C.P.: Asynchronous Byzantine agreement with optimal resilience. Distrib. Comput. 27(2), 111–146 (2014)MathSciNetCrossRefMATH Patra, A., Choudhury, A., Rangan, C.P.: Asynchronous Byzantine agreement with optimal resilience. Distrib. Comput. 27(2), 111–146 (2014)MathSciNetCrossRefMATH
31.
Zurück zum Zitat Xu, X., Lin, Y.: Checkpoint selection in fault recovery based on Byzantine fault model. In: Fourth International Conference on Computational Intelligence and Communication Networks (CICN), pp. 582–587, November 2012 Xu, X., Lin, Y.: Checkpoint selection in fault recovery based on Byzantine fault model. In: Fourth International Conference on Computational Intelligence and Communication Networks (CICN), pp. 582–587, November 2012
32.
Zurück zum Zitat Widder, J., Biely, M., Gridling, G., Weiss, B., Blanquart, J.P.: Consensus in the presence of mortal Byzantine faulty processes. Distrib. Comput. 24(6), 299–321 (2012)CrossRefMATH Widder, J., Biely, M., Gridling, G., Weiss, B., Blanquart, J.P.: Consensus in the presence of mortal Byzantine faulty processes. Distrib. Comput. 24(6), 299–321 (2012)CrossRefMATH
33.
Zurück zum Zitat Wang, S.C., Yan, K.Q., Ho, C.L., Wang, S.S.: The optimal generalized Byzantine agreement in cluster-based wireless sensor networks. Comput. Stan. Interfaces 36(5), 821–830 (2014)CrossRef Wang, S.C., Yan, K.Q., Ho, C.L., Wang, S.S.: The optimal generalized Byzantine agreement in cluster-based wireless sensor networks. Comput. Stan. Interfaces 36(5), 821–830 (2014)CrossRef
34.
Zurück zum Zitat Abdelhakim, M., Lightfoot, L.E., Ren, J., Li, T.: Distributed detection in mobile access wireless sensor networks under Byzantine attacks. IEEE Trans. Parallel Distrib. Syst. 25(4), 950–959 (2014)CrossRef Abdelhakim, M., Lightfoot, L.E., Ren, J., Li, T.: Distributed detection in mobile access wireless sensor networks under Byzantine attacks. IEEE Trans. Parallel Distrib. Syst. 25(4), 950–959 (2014)CrossRef
35.
Zurück zum Zitat Duran, A., Ferrer, R., Costa, J.J., Gonzàlez, M., Martorell, X., Ayguadé, E., Labarta, J.: A proposal for error handling in OpenMP. Int. J. Parallel Prog. 35(4), 393–416 (2007)CrossRefMATH Duran, A., Ferrer, R., Costa, J.J., Gonzàlez, M., Martorell, X., Ayguadé, E., Labarta, J.: A proposal for error handling in OpenMP. Int. J. Parallel Prog. 35(4), 393–416 (2007)CrossRefMATH
36.
Zurück zum Zitat Bronevetsky, G., Marques, D., Pingali, K., Szwed, P., Schulz, M.: Application-level checkpointing for shared memory programs. In: Proceedings of the 11th International Conference on Architectural Support for Programming Languages and Operating Systems. ASPLOS XI, pp. 235–247. ACM, New York (2004) Bronevetsky, G., Marques, D., Pingali, K., Szwed, P., Schulz, M.: Application-level checkpointing for shared memory programs. In: Proceedings of the 11th International Conference on Architectural Support for Programming Languages and Operating Systems. ASPLOS XI, pp. 235–247. ACM, New York (2004)
37.
Zurück zum Zitat Bronevetsky, G., Pingali, K., Stodghill, P.: Experimental evaluation of application-level checkpointing for OpenMP programs. In: Proceedings of the 20th Annual International Conference on Supercomputing, ICS 2006, pp. 2–13. ACM, New York (2006) Bronevetsky, G., Pingali, K., Stodghill, P.: Experimental evaluation of application-level checkpointing for OpenMP programs. In: Proceedings of the 20th Annual International Conference on Supercomputing, ICS 2006, pp. 2–13. ACM, New York (2006)
38.
Zurück zum Zitat Fu, H., Ding, Y.: Using redundant threads for fault tolerance of OpenMP programs. In: 2010 International Conference on Information Science and Applications, pp. 1–8, April 2010 Fu, H., Ding, Y.: Using redundant threads for fault tolerance of OpenMP programs. In: 2010 International Conference on Information Science and Applications, pp. 1–8, April 2010
39.
Zurück zum Zitat Li, M., Hsiao, M.S.: 3-D parallel fault simulation with GPGPU. IEEE Trans. Comput. Aided Design Integr. Circuits Syst. 30(10), 1545–1555 (2011)CrossRef Li, M., Hsiao, M.S.: 3-D parallel fault simulation with GPGPU. IEEE Trans. Comput. Aided Design Integr. Circuits Syst. 30(10), 1545–1555 (2011)CrossRef
40.
Zurück zum Zitat Guo, X., Jiang, H., Li, K.C.: A checkpoint/restart scheme for CUDA applications with complex memory hierarchy. In: 14th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD), pp. 247–252, July 2013 Guo, X., Jiang, H., Li, K.C.: A checkpoint/restart scheme for CUDA applications with complex memory hierarchy. In: 14th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD), pp. 247–252, July 2013
41.
Zurück zum Zitat Carlo, S.D., Gambardella, G., Martella, I., Prinetto, P., Rolfo, D., Trotta, P.: Fault mitigation strategies for CUDA GPUs. In: 2013 IEEE International Test Conference (ITC), pp. 1–8, September 2013 Carlo, S.D., Gambardella, G., Martella, I., Prinetto, P., Rolfo, D., Trotta, P.: Fault mitigation strategies for CUDA GPUs. In: 2013 IEEE International Test Conference (ITC), pp. 1–8, September 2013
42.
Zurück zum Zitat Xu, X.H., Yang, X.J., Xue, J.L., Lin, Y.F., Lin, Y.S.: PartialRC: a partial recomputing method for efficient fault recovery on GPGPUs. J. Comput. Sci. Technol. 27(2), 240–255 (2012)CrossRef Xu, X.H., Yang, X.J., Xue, J.L., Lin, Y.F., Lin, Y.S.: PartialRC: a partial recomputing method for efficient fault recovery on GPGPUs. J. Comput. Sci. Technol. 27(2), 240–255 (2012)CrossRef
43.
Zurück zum Zitat Laosooksathit, S., Nassar, R., Leangsuksun, C., Paun, M.: Reliability-aware performance model for optimal GPU-enabled cluster environment. J. Supercomputing 68(3), 1630–1651 (2014)CrossRef Laosooksathit, S., Nassar, R., Leangsuksun, C., Paun, M.: Reliability-aware performance model for optimal GPU-enabled cluster environment. J. Supercomputing 68(3), 1630–1651 (2014)CrossRef
44.
Zurück zum Zitat Demchik, V., Kolomoyets, N.: QCDGPU: open-source package for Monte Carlo lattice simulations on OpenCL-compatible multi-GPU systems (2013) Demchik, V., Kolomoyets, N.: QCDGPU: open-source package for Monte Carlo lattice simulations on OpenCL-compatible multi-GPU systems (2013)
45.
Zurück zum Zitat Avizienis, A.: Fault-tolerance: a property that ensures constant availability of digital system. IEEE Trans. Comput. 66(10), 5–25 (1978)MathSciNet Avizienis, A.: Fault-tolerance: a property that ensures constant availability of digital system. IEEE Trans. Comput. 66(10), 5–25 (1978)MathSciNet
46.
47.
Zurück zum Zitat Mamedli, È.M., Samedov, R.Y., Sobolev, N.: A method for localization of Byzantine and nonbyzantine faults. Avtomatika i Telemekhanika 5, 126–138 (1992)MATH Mamedli, È.M., Samedov, R.Y., Sobolev, N.: A method for localization of Byzantine and nonbyzantine faults. Avtomatika i Telemekhanika 5, 126–138 (1992)MATH
48.
Zurück zum Zitat Samet, R.: Recovery device for real-time dual-redundant computer systems. IEEE Trans. Dependable Secure Comput. 8(3), 391–403 (2011)CrossRef Samet, R.: Recovery device for real-time dual-redundant computer systems. IEEE Trans. Dependable Secure Comput. 8(3), 391–403 (2011)CrossRef
49.
Zurück zum Zitat Samet, R.: Choosing between design options for real-time computers tolerating a single fault. J. Circuits Syst. Comput. 19(05), 1041–1068 (2010)CrossRef Samet, R.: Choosing between design options for real-time computers tolerating a single fault. J. Circuits Syst. Comput. 19(05), 1041–1068 (2010)CrossRef
50.
Zurück zum Zitat Sivencrona, H., Johannessen, P., Persson, M., Torin, J.: Heavy-ion fault injections in the time-triggered communication protocol. In: Lemos, R., Weber, T.S., Camargo, J.B. (eds.) LADC 2003. LNCS, vol. 2847, pp. 69–80. Springer, Heidelberg (2003). doi:10.1007/978-3-540-45214-0_8 CrossRef Sivencrona, H., Johannessen, P., Persson, M., Torin, J.: Heavy-ion fault injections in the time-triggered communication protocol. In: Lemos, R., Weber, T.S., Camargo, J.B. (eds.) LADC 2003. LNCS, vol. 2847, pp. 69–80. Springer, Heidelberg (2003). doi:10.​1007/​978-3-540-45214-0_​8 CrossRef
51.
Zurück zum Zitat Driscoll, K., Hall, B., Sivencrona, H., Zumsteg, P.: Byzantine fault tolerance, from theory to reality. In: Anderson, S., Felici, M., Littlewood, B. (eds.) SAFECOMP 2003. LNCS, vol. 2788, pp. 235–248. Springer, Heidelberg (2003). doi:10.1007/978-3-540-39878-3_19 CrossRef Driscoll, K., Hall, B., Sivencrona, H., Zumsteg, P.: Byzantine fault tolerance, from theory to reality. In: Anderson, S., Felici, M., Littlewood, B. (eds.) SAFECOMP 2003. LNCS, vol. 2788, pp. 235–248. Springer, Heidelberg (2003). doi:10.​1007/​978-3-540-39878-3_​19 CrossRef
52.
Zurück zum Zitat Tanenbaum, A.S.: Computer Networks, vol. 3. Prentice Hall, New Jersey (1996)MATH Tanenbaum, A.S.: Computer Networks, vol. 3. Prentice Hall, New Jersey (1996)MATH
53.
Zurück zum Zitat Stallings, W.: Data and computer communications. Pearson/Prentice Hall (2007) Stallings, W.: Data and computer communications. Pearson/Prentice Hall (2007)
54.
Zurück zum Zitat Mullender, S.: Distributed Systems. ACM Press/Addison-Wesley Publishing Co. (1993) Mullender, S.: Distributed Systems. ACM Press/Addison-Wesley Publishing Co. (1993)
55.
Zurück zum Zitat Coulouris, G.F., Dollimore, J., Kindberg, T.: Distributed Systems: Concepts and Design. Pearson education (2005) Coulouris, G.F., Dollimore, J., Kindberg, T.: Distributed Systems: Concepts and Design. Pearson education (2005)
56.
Zurück zum Zitat Mitra, S., Saxena, N.R., McCluskey, E.J.: A design diversity metric and analysis of redundant systems. IEEE Trans. Comput. 51(5), 498–510 (2002)CrossRef Mitra, S., Saxena, N.R., McCluskey, E.J.: A design diversity metric and analysis of redundant systems. IEEE Trans. Comput. 51(5), 498–510 (2002)CrossRef
57.
Zurück zum Zitat Samedov, R.: An approach to the support of the fault-tolerance of the double redundant computer control systems. Math. Comput. Appl. 4(2), 175–184 (1999) Samedov, R.: An approach to the support of the fault-tolerance of the double redundant computer control systems. Math. Comput. Appl. 4(2), 175–184 (1999)
58.
Zurück zum Zitat Kim, H., Jeon, H.J., Lee, K., Lee, H.: The design and evaluation of all voting triple modular redundancy system. In: Proceedings. Annual Reliability and Maintainability Symposium, pp. 439–444. IEEE (2002) Kim, H., Jeon, H.J., Lee, K., Lee, H.: The design and evaluation of all voting triple modular redundancy system. In: Proceedings. Annual Reliability and Maintainability Symposium, pp. 439–444. IEEE (2002)
59.
Zurück zum Zitat Smith, T.B.: Fault tolerant processor concepts and operation. In: Digest of Papers, FTCS-14, Kissimmee, USA, pp. 158–163 (1984) Smith, T.B.: Fault tolerant processor concepts and operation. In: Digest of Papers, FTCS-14, Kissimmee, USA, pp. 158–163 (1984)
60.
Zurück zum Zitat Laprie, J.C.: Dependable computing and fault-tolerance. In: Digest of Papers FTCS-15, pp. 2–11 (1985) Laprie, J.C.: Dependable computing and fault-tolerance. In: Digest of Papers FTCS-15, pp. 2–11 (1985)
61.
Zurück zum Zitat Mamedli, È.M., Samedov, R.Y., Sobolev, N.: A method for localization of Byzantine and NonByzantine faults. J. Autom. Remote Control 53(5), 734–744 (1992)MATH Mamedli, È.M., Samedov, R.Y., Sobolev, N.: A method for localization of Byzantine and NonByzantine faults. J. Autom. Remote Control 53(5), 734–744 (1992)MATH
62.
Zurück zum Zitat Oh, N., Mitra, S., McCluskey, E.J.: ED4I: error detection by diverse data and duplicated instructions. IEEE Trans. Comput. 51(2), 180–199 (2002)CrossRef Oh, N., Mitra, S., McCluskey, E.J.: ED4I: error detection by diverse data and duplicated instructions. IEEE Trans. Comput. 51(2), 180–199 (2002)CrossRef
63.
Zurück zum Zitat Siewiorek, D.P., Swarz, R.S.: Reliable Computer Systems: Design and Evaluation, 3rd edn. A.K. Peters Ltd., Natick (1998)MATH Siewiorek, D.P., Swarz, R.S.: Reliable Computer Systems: Design and Evaluation, 3rd edn. A.K. Peters Ltd., Natick (1998)MATH
64.
Zurück zum Zitat Samet, R.: Fault-tolerant procedures for redundant computer systems. Qual. Reliab. Eng. Int. 25(1), 41–68 (2009)CrossRef Samet, R.: Fault-tolerant procedures for redundant computer systems. Qual. Reliab. Eng. Int. 25(1), 41–68 (2009)CrossRef
65.
Zurück zum Zitat Hurst, S.L.: VLSI Testing: digital and mixed analogue/digital techniques, vol. 9. IET (1998) Hurst, S.L.: VLSI Testing: digital and mixed analogue/digital techniques, vol. 9. IET (1998)
66.
Zurück zum Zitat Lala, P.K.: Self-checking and fault-tolerant digital design. Morgan Kaufmann (2001) Lala, P.K.: Self-checking and fault-tolerant digital design. Morgan Kaufmann (2001)
67.
Zurück zum Zitat Powell, D.: Failure mode assumptions and assumption coverage. In: Randell, B., Laprie, J.C., Kopetz, H., Littlewood, B. (eds.) Predictably Dependable Computing Systems, pp. 123–140. Springer, Heidelberg (1995)CrossRef Powell, D.: Failure mode assumptions and assumption coverage. In: Randell, B., Laprie, J.C., Kopetz, H., Littlewood, B. (eds.) Predictably Dependable Computing Systems, pp. 123–140. Springer, Heidelberg (1995)CrossRef
68.
Zurück zum Zitat Laprie, J.C., Arlat, J., Blanquart, J., Costes, A., Crouzet, Y., Deswarte, Y., Fabre, J., Guillermain, H., Kaâniche, M., Kanoun, K., et al.: Guide de la sûreté de fonctionnement (dependability handbook). Cépaduès, Toulouse (1995) Laprie, J.C., Arlat, J., Blanquart, J., Costes, A., Crouzet, Y., Deswarte, Y., Fabre, J., Guillermain, H., Kaâniche, M., Kanoun, K., et al.: Guide de la sûreté de fonctionnement (dependability handbook). Cépaduès, Toulouse (1995)
69.
Zurück zum Zitat Ziv, A., Bruck, J.: An on-line algorithm for checkpoint placement. IEEE Trans. Comput. 46(9), 976–985 (1997)MathSciNetCrossRef Ziv, A., Bruck, J.: An on-line algorithm for checkpoint placement. IEEE Trans. Comput. 46(9), 976–985 (1997)MathSciNetCrossRef
70.
Zurück zum Zitat Ling, Y., Mi, J., Lin, X.: A variational calculus approach to optimal checkpoint placement. IEEE Trans. Comput. 50(7), 699–708 (2001)CrossRef Ling, Y., Mi, J., Lin, X.: A variational calculus approach to optimal checkpoint placement. IEEE Trans. Comput. 50(7), 699–708 (2001)CrossRef
71.
Zurück zum Zitat Lincoln, P., Rushby, J.: A formally verified algorithm for interactive consistency under a hybrid fault model. In: The Twenty-Third International Symposium on Fault-Tolerant Computing, FTCS-23, Digest of Papers, pp. 402–411. IEEE (1993) Lincoln, P., Rushby, J.: A formally verified algorithm for interactive consistency under a hybrid fault model. In: The Twenty-Third International Symposium on Fault-Tolerant Computing, FTCS-23, Digest of Papers, pp. 402–411. IEEE (1993)
72.
Zurück zum Zitat Meyer, F.J., Pradhan, D.K.: Consensus with dual failure modes. IEEE Trans. Parallel Distrib. Syst. 2(2), 214–222 (1991)CrossRef Meyer, F.J., Pradhan, D.K.: Consensus with dual failure modes. IEEE Trans. Parallel Distrib. Syst. 2(2), 214–222 (1991)CrossRef
73.
Zurück zum Zitat Thambidurai, P., Park, Y.K.: Interactive consistency with multiple failure modes. In: Proceedings, Seventh Symposium on Reliable Distributed Systems, pp. 93–100. IEEE (1988) Thambidurai, P., Park, Y.K.: Interactive consistency with multiple failure modes. In: Proceedings, Seventh Symposium on Reliable Distributed Systems, pp. 93–100. IEEE (1988)
74.
Zurück zum Zitat Chor, B., Coan, B.A.: A simple and efficient randomized Byzantine agreement algorithm. IEEE Trans. Softw. Eng. 6, 531–539 (1985)MathSciNetCrossRef Chor, B., Coan, B.A.: A simple and efficient randomized Byzantine agreement algorithm. IEEE Trans. Softw. Eng. 6, 531–539 (1985)MathSciNetCrossRef
75.
Zurück zum Zitat Kopetz, H.: Real-Time Systems: Design Principles for Distributed Embedded Applications. Springer Science & Business Media, London (2011)CrossRefMATH Kopetz, H.: Real-Time Systems: Design Principles for Distributed Embedded Applications. Springer Science & Business Media, London (2011)CrossRefMATH
Metadaten
Titel
Analysis of Relationship Between Modes of Intercomputer Communications and Fault Types in Redundant Computer Systems
verfasst von
Refik Samet
Nermin Samet
Copyright-Jahr
2017
Verlag
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-662-54563-8_1