Published in: Journal of Network and Systems Management 3/2014

1 July 2014

An Iterative Approach to Trustable Systems Management Automation and Fault Handling

Authors: Barry McLarnon, Philip Robinson, Peter Milligan, Paul Sage


Abstract

Automated systems management solutions aim to reduce the pressure on administrators of complex, large-scale, distributed systems by automating many common management tasks. However, this creates a level of abstraction that can act as a barrier between the administrator and the elements being controlled. This can impede the transition to new management paradigms required by the growth of off-premise resources and hybrid cloud systems. The resulting loss of control over the managed environment can contribute to a loss of trust in automated systems management solutions and limit their broader use. This paper proposes a novel approach in which the administrator can control the automation level on a per-task basis. Administrators define a management task as they would perform it directly and allow the solution to identify the triggers that cause the task to be enacted. The solution also allows administrators to define relevant task output that can be analyzed for fault states, enabling error recovery without manual intervention. This approach reduces management effort for the administrator while retaining controllability, keeping automation costs low, and reducing the incidence of errors.
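To make the idea of a per-task automation level more concrete, the following is a minimal sketch in Python. It is not the authors' implementation: the names `AutomationLevel`, `ManagementTask`, and `handle`, the three levels, and the regular-expression fault check are all hypothetical choices made for illustration. In particular, the abstract says the solution identifies triggers itself, whereas this sketch simply assumes a trigger predicate is already available.

```python
import re
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List, Optional


class AutomationLevel(Enum):
    """Hypothetical levels; the paper may define them differently."""
    MANUAL = 0      # administrator is only notified and runs the task by hand
    SUGGEST = 1     # task runs only after the administrator confirms
    AUTOMATIC = 2   # task runs as soon as its trigger fires


@dataclass
class ManagementTask:
    name: str
    action: Callable[[], str]                    # performs the task, returns its output
    trigger: Callable[[Dict[str, float]], bool]  # assumed given here, not learned
    level: AutomationLevel = AutomationLevel.MANUAL
    fault_patterns: List[str] = field(default_factory=list)  # regexes marking a fault state
    recovery: Optional[Callable[[], str]] = None              # optional recovery step


def handle(task: ManagementTask,
           metrics: Dict[str, float],
           confirm: Callable[[ManagementTask], bool] = lambda t: False) -> str:
    """Process one monitoring sample for one task and return a status string."""
    if not task.trigger(metrics):
        return "idle"
    if task.level is AutomationLevel.MANUAL:
        return "notify"
    if task.level is AutomationLevel.SUGGEST and not confirm(task):
        return "declined"
    output = task.action()
    # Scan the administrator-defined output for fault states.
    if any(re.search(pattern, output) for pattern in task.fault_patterns):
        if task.recovery is not None:
            task.recovery()
            return "recovered"
        return "fault"
    return "ok"
```

In this sketch an administrator could start a task at `MANUAL`, observe when it would have fired, and raise it to `AUTOMATIC` once the behavior is trusted, which is one plausible reading of the iterative approach the abstract describes.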


Metadata
Title
An Iterative Approach to Trustable Systems Management Automation and Fault Handling
Authors
Barry McLarnon
Philip Robinson
Peter Milligan
Paul Sage
Publication date
1 July 2014
Publisher
Springer US
Published in
Journal of Network and Systems Management / Issue 3/2014
Print ISSN: 1064-7570
Electronic ISSN: 1573-7705
DOI
https://doi.org/10.1007/s10922-013-9295-z
