Holistic approach to management of IT infrastructure for environmental monitoring and decision support systems with urgent computing capabilities

https://doi.org/10.1016/j.future.2016.08.007

Highlights

  • IT infrastructures for environmental monitoring systems are investigated.

  • Holistic management of their configuration to maintain QoS is proposed.

  • The approach optimizes the system as a whole rather than isolated subsystems.

  • Optimization goals are different in urgent and normal modes of operation.

  • The approach is validated with a levee monitoring use case.

Abstract

Modern environmental monitoring and decision support systems are based on complex IT infrastructures comprising multiple hardware and software subsystems that need to provide a variety of Quality of Service (QoS) guarantees required for urgent computing services, which are essential in emergency situations. Such IT infrastructures need to be managed in order to maintain the quality of service, which, especially when operating in the urgent mode, involves optimizing multiple, often conflicting, objectives and making trade-offs between them. Existing approaches do not solve this issue optimally because they focus on delivering quality of service within individual subsystems in isolation. We propose a holistic approach to system management which takes into account knowledge about the system as a whole, in particular the interplay of conflicting objectives and configuration options across all subsystems. We argue that such an approach produces a better configuration of the involved subsystems and improves the resolution of trade-offs between cost, energy and performance objectives, leading to their better overall fulfillment in comparison with the non-holistic approach in which individual subsystems are managed in isolation. We validate our approach by applying a prototype implementation of the holistic optimization algorithm, the Holistic Computing Controller, to a smart levee monitoring and flood decision support system.

Introduction

Environmental monitoring and decision support systems increasingly rely on modern IT technologies, notably the so-called Internet of Things (IoT) integrated with cloud computing [1]. Together, these technologies enable real-time monitoring of natural phenomena, provide advance warning of approaching disasters and help mitigate their impact through results of data analyses and resource-intensive simulations. However, these results must be delivered in a timely fashion, and therefore the IT infrastructure must provide urgent computing (UC) [2] capabilities in order to support applications subject to soft deadlines. On the other hand, infrastructure limitations such as the use of resource-constrained devices (e.g. environmental sensors) must be taken into account. Such requirements imply that the infrastructure needs to be managed in order to adjust its configuration to changing conditions and maintain the required quality of service.

The IoT system architecture [3] defines seven layers dealing with data acquisition, transmission, processing and applications (Fig. 1). However, only the four topmost layers have been addressed in existing UC systems. Such systems focus on quality of service through on-demand resource allocation using e-Infrastructures [4], high-throughput data stream processing [5], or provisioning of storage resources for data-intensive urgent applications [6]. There is a notable lack of a holistic approach addressing the management of the entire IoT-Cloud stack, including its three bottom layers, i.e. physical devices (sensors), connectivity and edge computing. Implementing such an approach poses a challenge for two reasons. First, environmental monitoring systems may operate in several modes (typically a ‘normal’ and an ‘urgent’ mode) characterized by different resource and QoS requirements. Second, the QoS requirements of different IT subsystems conflict with one another, and almost all of them are at odds with the need to maintain energy efficiency and reduce operating costs. Consequently, the holistic approach requires knowledge about the system as a whole and a mechanism to calculate and orchestrate execution policies for individual subsystems in order to manage the operation of the entire system and resolve trade-offs between conflicting objectives, preferring some of them at the expense of others depending on the current mode of operation.

Existing approaches do not solve this problem optimally because they focus on delivering quality of service within individual subsystems in isolation [4], [5], [6], where each subsystem acts separately in accordance with its own internal policy and QoS requirements defined for that subsystem alone. We propose a holistic approach to system management that (i) addresses all layers of the IoT-Cloud system stack, and (ii) optimizes and adapts the configuration of the system as a whole rather than its individual subsystems in isolation. We introduce a new component complementing the IoT stack, called the Holistic Computing Controller, which adapts the system configuration in two steps: (1) calculation of Pareto-optimal configurations for the entire IT infrastructure based on cost-of-operation and quality-of-service (QoS) requirements of individual IT subsystems; (2) resolution of trade-offs between conflicting optimization objectives in order to select the single best configuration to be deployed in the system.
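The two adaptation steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the decision variables (`sensor_interval_s`, `cloud_vms`), the toy objective model, and the mode-dependent weights are all hypothetical, and the weighted-sum selection in step 2 is only one possible way to resolve trade-offs, assumed here for illustration.

```python
from itertools import product

# Hypothetical decision space: one configurable option per subsystem.
DECISION_SPACE = {
    "sensor_interval_s": [60, 300, 900],  # sensor measurement period
    "cloud_vms": [1, 2, 4],               # number of processing VMs
}

def objectives(cfg):
    """Toy objective model (assumption); every value is to be minimized."""
    energy = 900 / cfg["sensor_interval_s"]   # frequent sampling drains sensors
    cost = float(cfg["cloud_vms"])            # each VM adds operating cost
    # Assume processing cannot exploit more than two VMs in parallel.
    latency = cfg["sensor_interval_s"] / 60 + 8 / min(cfg["cloud_vms"], 2)
    return (energy, cost, latency)

def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(configs):
    """Step 1: keep only configurations that no other configuration dominates."""
    scored = [(cfg, objectives(cfg)) for cfg in configs]
    return [(c, s) for c, s in scored
            if not any(dominates(t, s) for _, t in scored)]

def select(front, weights):
    """Step 2: pick one configuration by a weighted sum of normalized objectives."""
    n = len(weights)
    lo = [min(s[i] for _, s in front) for i in range(n)]
    hi = [max(s[i] for _, s in front) for i in range(n)]

    def score(s):
        return sum(
            w * ((s[i] - lo[i]) / (hi[i] - lo[i]) if hi[i] > lo[i] else 0.0)
            for i, w in enumerate(weights)
        )

    return min(front, key=lambda cs: score(cs[1]))[0]

# Enumerate the whole system's configurations, not each subsystem separately.
configs = [dict(zip(DECISION_SPACE, values))
           for values in product(*DECISION_SPACE.values())]
front = pareto_front(configs)

# Hypothetical weights for (energy, cost, latency) in each operating mode.
urgent = select(front, (0.1, 0.1, 0.8))     # urgent mode favors low latency
normal = select(front, (0.45, 0.45, 0.1))   # normal mode favors energy and cost

print("Pareto-optimal configurations:", len(front))
print("urgent mode:", urgent)
print("normal mode:", normal)
```

Under this toy model the four-VM configurations are dominated and drop out in step 1. The urgent weights then select the 60 s sensor interval with two VMs, whereas the normal weights select the 900 s interval with a single VM, which mirrors the idea that the preferred trade-off depends on the current mode of operation.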

The proposed holistic approach is validated in the context of the ISMOP system for smart levee monitoring and flood decision support. The ISMOP project operates a research site featuring an experimental smart levee (Fig. 2) in order to conduct controlled flooding experiments and support comprehensive research on smart levees, including the design of wireless sensors for levee monitoring, development of efficient data acquisition and transmission tools [7], modeling of levee behavior [8], and development of a data management and processing system leveraging cloud infrastructures [9].

The obtained results confirm that the holistic approach leads to improved configuration settings and, consequently, better fulfillment of the system’s cost and QoS requirements than would have otherwise been possible had the configuration of all subsystems been managed in isolation.

The paper is organized as follows. Section 2 presents related work. Section 3 explains the holistic approach to system management. Section 4 outlines the architecture and quality-related objectives of a smart levee monitoring and flood decision support system, while Section 5 presents its practical implementation leveraging the holistic approach: the ISMOP system. Section 6 describes a case study and discusses the results of experiments. Section 7 concludes the paper.

Related work

Monitoring and decision support systems dealing with natural disasters typically require urgent computing services in order to ensure sufficient supply of computer resources and the required quality of service during a crisis event. Urgent computing systems are designed for applications characterized by the presence of a firm deadline, unpredictability of the urgent event’s occurrence, and the possibility to mitigate the event’s impact through resource-intensive computations [2]. Illustrative

Holistic approach to system management

The operation of an environmental monitoring and decision support system is driven by Service-Level Agreements (SLAs) specific to each subsystem and dependent on the operating mode (urgent or normal). Each SLA contains a set of Quality of Service (QoS) requirements that need to be

Smart levee monitoring and decision support system

A typical disaster scenario which lies at the root of the presented research involves a flood wave passing down a river. The flood wave may last from a few hours up to several weeks, and affect a large area comprising hundreds of kilometers of levees. A flood will typically occur due to the failure of a levee resulting from its long-term infiltration.

The business requirements stemming from this emergency scenario dictate that the decision support system should provide regular flood threat

ISMOP flood decision support system

We have developed ISMOP, a practical implementation of the smart levee monitoring and decision support system described in Section 4. To present its operation, the abstract architecture (Fig. 6) introduced in Section 4 is mapped to the hardware and software subsystems of the ISMOP system, as presented in Fig. 7. The following sections describe implementation details of the two main subsystems of ISMOP.

Case study

In order to validate the proposed holistic approach to system management in practice, we have performed a series of experiments using prototype implementations of hardware and software components of the ISMOP IT infrastructure. The validation involved the following steps:

  1. We have identified the decision space, i.e. key configurable properties for all subsystems of the ISMOP IT platform and their possible values.

  2. We have identified the objective space, i.e. the objective functions, and developed

Conclusion

We presented a holistic approach to management of an IT infrastructure for environmental monitoring and decision support systems based on the Internet of Things and cloud computing technologies. We introduced the Holistic Computing Controller, a component complementing the IoT-Cloud stack, which calculates and deploys a globally optimal configuration for all subsystems based on knowledge of the system as a whole. The approach was experimentally validated using the hardware and software

Acknowledgments

This work is partially supported by the National Centre for Research and Development (NCBiR), Poland, project PBS1/B9/18/2013; AGH statutory research Grant No. 11.11.230.124 is also acknowledged. The authors are grateful to Prof. Robert Meijer (UvA and TNO, The Netherlands) and to the entire ISMOP project team for fruitful discussions and suggestions.

References (47)

  • B. Balis, Hyperflow: A model of computation, programming approach and enactment engine for complex distributed workflows, Future Gener. Comput. Syst. (2016)
  • T. Szydlo et al., Predictive power consumption adaptation for future generation embedded devices powered by energy harvesting sources, Microprocess. Microsyst. (2015)
  • CISCO, The Internet of Things Reference Model, 2014. Available online:...
  • R. Tolosana-Calasanz, J. Banares, O. Rana, C. Pham, E. Xydas, C. Marmaras, P. Papadopoulos, L. Cipcigan, Enforcing...
  • J. Cope, H. Tufo, Supporting storage resources in urgent computing environments, in: 2008 IEEE International Conference...
  • A. Pieta, J. Bala, M. Dwornik, K. Krawiec, Stability of the levees in case of high level of the water, in: 14th SGEM...
  • K.K. Droegemeier, V. Ch, R. Clark, D. Gannon, S. Graves, M. Ramamurthy, R. Wilhelmson, K. Brewster, B. Domenico, T....
  • Y. Cui et al., Enabling very-large scale earthquake simulations on parallel machines
  • P.S. Bogden et al., Architecture of a community infrastructure for predicting and analyzing coastal inundation, Mar. Technol. Soc. J. (2007)
  • J. Mandel et al., A dynamic data driven wildland fire model
  • R. Strijkers et al., Amos: Using the cloud for on-demand execution of e-science applications
  • Open Geospatial Consortium, OpenGIS SWE Service Model - Implementation Standard, version 2.0 (3...
  • P. Beckman et al., Spruce: A system for supporting urgent high-performance computing

    Bartosz Balis, Ph.D. in Computer Science, is an Assistant Professor at the Department of Computer Science, AGH University of Science and Technology. He is co-author of more than 75 international publications including journal articles, conference papers, and book chapters. His research interests include environments for eScience, scientific workflows, grid and cloud computing. He has participated in international research projects including EU-IST CrossGrid, CoreGRID, K-Wf Grid, ViroLab, Gredia, UrbanFlood and PaaSage. He has served as a Program Committee member for conferences including e-Science 2006, ICCS 2007–2016, ITU Kaleidoscope 2013–2016, SC16 Workshops, and SIMULTECH 2016.

    Robert Brzoza-Woch studied Electronics and Telecommunication with a major in Sensors and Microsystems. He received his M.Sc. degree in 2009 and his Ph.D. in computer science in 2013 from the AGH University of Science and Technology in Krakow, Poland. Currently, he works as an assistant professor at the Department of Computer Science at the same university. He has broad experience in hardware and software design for distributed embedded systems, wireless sensor networks, telemetry systems, and home automation. His work is centered around embedded systems based on microcontrollers and FPGAs, with particular emphasis on ARM cores and FPGA-based microprocessor systems.

    Marian Bubak has an M.Sc. degree in Technical Physics and a Ph.D. in Computer Science. He is an adjunct at the Institute of Computer Science and ACC Cyfronet AGH University of Science and Technology, Kraków, Poland, and a Professor of Distributed System Engineering at the University of Amsterdam. His research interests include collaborative environments, parallel and distributed computing, and eScience. He is the author of about 230 papers in this area, co-editor of about 30 proceedings of international conferences and a member of the editorial boards of 3 journals. He has held key roles in a series of EU-funded projects, including CrossGrid (the Architecture Team leader), K-Wf Grid (the Scientific Coordinator), CoreGRID (member of the Monitoring Committee), and ViroLab, GREDIA, UrbanFlood, MAPPER and VPH-Share (WP leader).

    Marek Kasztelnik graduated from the University of Science and Technology in Kraków, Poland, where he received his M.Sc. in Computer Science. In 2005 he joined the Technologies Group in the Research and Development Department at ComArch (2005–2010), where he was responsible for designing and developing solutions for telecommunication management and meta-modeling. In 2006 he joined the DICE team at the Academic Computing Centre Cyfronet AGH, where he has been involved in EU-funded projects. He is the author of more than 40 scientific articles. His current work addresses areas such as distributed resource discovery and the design and implementation of distributed cloud-based applications.

    Bartosz Kwolek received M.Sc. degrees in Computer Science and Architecture. He works for the Department of Computer Science at the AGH University of Science and Technology, Krakow, Poland, where he received his Ph.D. degree in 2014. His interests focus on computer networks, multimedia systems and web application technologies as well as user interface design. He is an active Cisco CNAP trainer. He has participated in several EU research projects (6WINIT, ProAccess) and national projects (IT-SOA, NGOSS, ISMOP).

    Piotr Nawrocki, Ph.D., is an Assistant Professor in the Department of Computer Science at the AGH University of Science and Technology, Krakow, Poland. His research interests include distributed systems, computer networks, mobile systems, mobile cloud computing, Internet of Things and service-oriented architectures. He has participated in several EU research projects including MECCANO, 6WINIT, UniversAAL and national projects including IT-SOA and ISMOP. He is a member of the Polish Information Processing Society (PTI).

    Piotr Nowakowski received his M.Sc. in Computer Science from the AGH University of Science and Technology. He works at the Academic Computer Center AGH in Kraków as a scientific programmer, delivering custom IT solutions for various branches of e-science. He is involved in many research projects, including CrossGrid, ViroLab, VPH-Share and PL-Grid. His primary research interests focus on managing and optimizing access to large datasets processed with HPC resources.

    Tomasz Szydlo received his Ph.D. in Computer Science from the University of Science and Technology in Kraków, Poland in 2010. Since 2011 he has been an assistant professor at the Department of Computer Science at AGH-UST. His interests focus on Internet of Things as well as mobile and SOA systems. He has participated in several EU research projects including CrossGrid, Ambient Networks and UniversAAL and national projects including IT-SOA and ISMOP.

    Krzysztof Zielinski is a full professor and head of the Department of Computer Science at the AGH University of Science and Technology, Krakow, Poland. His interests focus on networking, mobile and wireless systems, service-oriented distributed systems engineering and Cloud computing. He is the author of over 200 papers in this area. He has served as an expert with the Ministry of Science and Education. He is a member of IEEE, ACM and the Polish Academy of Sciences, Computer Science Chapter. He has served as a program committee member, chairman and organizer of several international conferences including MobiSys, ICCS, ICWS, IEEE SCC and many others.
