Adjusting middleware knobs to assess scalability limits of distributed cyber-physical systems

https://doi.org/10.1016/j.csi.2016.11.003

Highlights

  • Analysis of challenges and performance issues of middleware for CPS domains.

  • Light-weight design of a middleware for CPS.

  • Provision of a set of primitives for the basic functions in a dynamic CPS domain.

  • Validation of a design implementation shows that it is possible to obtain a stable configuration.

Abstract

The traditional development paradigm for time-sensitive distributed systems (and even more so for real-time domains) has typically relied on inflexible low-level schemes, based in turn on low-level programming of improved medium access control protocols to obtain deterministic network schedules. Such a technique does not scale well in the context of cyber-physical systems (CPS), whose complexity is several orders of magnitude higher. In this paper, we explore the current trend of CPS being progressively integrated with Internet technologies to fulfill their requirement of being highly connected systems. The integration pillar is the communication middleware; however, communication middleware technologies are prone to introducing delays at different levels, increasing the potential uncertainty of the system execution. Therefore, individual analysis of middleware technologies is needed to assess the bounds on the type and scale of CPS that they can support. We analyse the behavior of a specific middleware technology and its cost for handling the interaction of distributed nodes in the context of cyber-physical systems. We propose the design of a reliable middleware infrastructure for distributed embedded systems integrated in a CPS environment. Our approach considers time requirements specified by the system nodes or units, and fine-tunes a few specific parameters of the middleware that we refer to as knobs. We validate our solution implementation in an experimental setting and show the system scale that can be supported in a stable way.

Introduction

The vision of cyber-physical systems is extremely challenging, as they are highly dynamic systems immersed in ultra-large-scale deployments and prone to suffering interference from other subsystems [26]. Although they have inherent real-time requirements, their large-scale structure often contains subsystems with different levels of temporal requirements; these may range from hard safety-critical real-time subsystems to time-sensitive domains such as cloud computing [13] or best-effort ones. Fig. 1 shows a typical deployment of a cyber-physical system for the remote monitoring and control of a factory floor, which involves the monitoring of physical processes. Gathered information may be sent to the cloud to be analysed, resulting in the fine-tuning of the factory floor processes.

One challenging aspect of cyber-physical systems is their development cost at both the hardware and software levels, due to the number of techniques, paradigms, and technologies involved. On the software side alone, the development of cyber-physical systems [26] requires mastering a number of different modeling, design, and verification paradigms, as well as the associated platform technologies such as operating systems and kernels, network protocol run-time software, middleware technology, and the specific application-level logic. All these technologies have to cooperate to ensure functional correctness as well as essential non-functional properties such as timeliness. Therefore, easing their programming becomes essential, so techniques and technologies that support platform abstraction and reusability have to be integrated in their development.

Middleware solves part of this problem, as it clearly favors programmability by providing a platform abstraction that allows highly heterogeneous subsystems (or nodes) to interoperate effectively. It is also the layer in the software stack where enhanced functionality can be put in place to address requirements such as adaptation and dynamic reconfiguration.

Nevertheless, the cyber-physical system community (mainly, real-time systems) has been reluctant to use middleware, as it has typically been seen as a black-box software layer prone to unpredictable behavior. To guarantee predictability, low-level network programming has usually been employed, making direct use of the media access protocols in order to ensure timely delivery and the calculation of a network schedule for real-time communications. However, this has led to inflexible designs in which dynamic behavior could hardly be accommodated. Addition or removal of a functional piece or node would typically result in the redesign of the system and of the network transmission schedule.

In this paper, we describe an approach to support adaptivity and dynamic behavior in distributed systems in the context of cyber-physical environments by providing a middleware that performs active monitoring and ensures the service time contracted by the system nodes (called units). The middleware is validated in a soft real-time environment with dynamic updating of its internal parameters, such as the thread pool size. This enables dynamic resizing of the middleware, i.e., on-line modification of the number of threads in the thread pool. This is a key element in cyber-physical systems, as they must face changing situations such as supporting varying numbers of clients. We validate our solution by providing performance results over a specific implementation scenario to assess scalability and timeliness. It is not the intention of our contribution to compete with time-deterministic network schedules or to target real-time properties, but to provide a simple middleware model and assess the type of system that it is appropriate for, according to the temporal values (i.e., service time) that it is capable of delivering. We show that the temporal interference caused by the middleware is negligible and does not prevent units from meeting their timing requirements.
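The idea of on-line thread pool resizing driven by a contracted service time can be illustrated with a minimal sketch. It is not the middleware described in this paper: the class `ResizablePool`, the function `next_pool_size`, the one-thread step, and the 0.5 shrink threshold are all hypothetical choices made for illustration, and the elapsed time measured here is a local stand-in that excludes network transmission.

```python
import queue
import threading
import time


class ResizablePool:
    """Hypothetical worker pool whose thread count can change while running."""

    _STOP = object()  # sentinel that tells exactly one worker to exit

    def __init__(self, size):
        self._tasks = queue.Queue()
        self._lock = threading.Lock()
        self._size = 0
        self.resize(size)

    @property
    def size(self):
        return self._size

    def resize(self, new_size):
        """Grow by starting threads; shrink by queueing stop sentinels."""
        with self._lock:
            delta = new_size - self._size
            for _ in range(max(delta, 0)):
                threading.Thread(target=self._worker, daemon=True).start()
            for _ in range(max(-delta, 0)):
                self._tasks.put(self._STOP)
            self._size = new_size

    def submit(self, fn, *args):
        """Queue a request; the returned queue yields (result, elapsed_seconds)."""
        reply = queue.Queue(maxsize=1)
        self._tasks.put((fn, args, time.perf_counter(), reply))
        return reply

    def _worker(self):
        while True:
            item = self._tasks.get()
            if item is self._STOP:
                return
            fn, args, t0, reply = item
            try:
                result = fn(*args)
            except Exception as exc:  # keep the worker alive on request errors
                result = exc
            # Elapsed time covers queueing plus processing, not the network.
            reply.put((result, time.perf_counter() - t0))


def next_pool_size(current, observed, contracted, max_size):
    """Assumed tuning rule: add a thread when the observed time exceeds the
    contracted one, drop a thread when it is below half of it."""
    if observed > contracted and current < max_size:
        return current + 1
    if observed < 0.5 * contracted and current > 1:
        return current - 1
    return current
```

A monitoring loop could call `pool.resize(next_pool_size(pool.size, observed, contracted, max_size))` after each measurement window. Shrinking through sentinels lets in-flight requests finish before a worker exits, at the cost of the reduction taking effect only once the sentinel reaches the head of the queue.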

This paper is structured as follows. Section 2 presents the state of the art and the most closely related work, organized around the challenges that middleware faces in providing predictability for cyber-physical systems. Section 3 analyses the specific middleware design issues that affect its performance. Section 4 presents the proposed middleware design for cyber-physical domains, based on servers that service a dynamic number of units or clients. Section 5 presents the experiments carried out on the middleware implementation, which show the validity of the framework. Section 6 draws some conclusions and presents lines of future work to improve the support offered to cyber-physical domains.

Challenges and approaches

There is no perfect solution for cyber-physical systems design and development, especially when approaching the problem from the middleware level. This section describes the main challenges faced by a distributed system in a cyber-physical environment in which communication is enabled by middleware. State-of-the-art solutions are presented, classified by the challenge that they address.

Network unpredictability. Cyber-physical systems are intensive in the usage of network communications of

Middleware performance

Most off-the-shelf communication middleware technologies are designed for general-purpose domains that are not real-time. Examples are RMI (Java Remote Method Invocation), Jini or River for service-based programming, AMQP (Advanced Message Queuing Protocol), JMS (Java Messaging Service), or web services [30], among others. These technologies are silent about the timing requirements on the operation of units, and some of them, e.g., Jini, introduce high communication overhead due to the discovery logic.

In a

Processes and data models

The overall behavior and functions of the middleware are presented in Fig. 3. To build a distributed system in a cyber-physical system environment, an initial off-line phase is proposed for tuning the middleware; this phase makes it possible to assess the execution limits of the middleware. The inputs to be considered in the off-line tuning phase are: (1) the specification and requirements of the units in the temporal domain; and (2) the execution platform characteristics such as the hardware

Setting description and baseline experiment

This section presents the validation of the proposed middleware design. The driving goal is to show the stability of the middleware operation with a performance analysis on different scenarios with varying conditions and progressive load increase. The empirical results present the following parameters:

  • The service time (s_time), that is, the total request time, which includes the network transmission time, the server processing time, and the protocol processing at the nodes for a unit request.
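As an illustration of how a service time of this kind can be observed from the requesting side, the sketch below timestamps a request just before it is sent and just after the reply arrives, so the interval covers transmission, server processing, and protocol processing together. It is only a sketch under stated assumptions: the helpers `serve` and `measure_s_time`, the one-byte request format, the use of a local socket pair in place of a real network, and the simulated processing delay are all hypothetical and do not reproduce the experimental setup of the paper.

```python
import socket
import statistics
import threading
import time


def serve(conn, processing_s):
    """Hypothetical server: read 1-byte requests, simulate work, then reply."""
    with conn:
        while conn.recv(1):  # an empty read means the client closed
            time.sleep(processing_s)
            conn.sendall(b"k")


def measure_s_time(n_requests=20, processing_s=0.001):
    """Return one end-to-end request time (in seconds) per request."""
    client, server = socket.socketpair()
    threading.Thread(target=serve, args=(server, processing_s), daemon=True).start()
    samples = []
    with client:
        for _ in range(n_requests):
            t0 = time.perf_counter()
            client.sendall(b"r")  # request leaves the unit
            client.recv(1)        # reply is back at the unit
            samples.append(time.perf_counter() - t0)
    return samples


if __name__ == "__main__":
    samples = measure_s_time()
    print("mean s_time:", statistics.mean(samples))
    print("max s_time:", max(samples))
```

Measuring at the requesting unit keeps both timestamps on one clock, which avoids the need for clock synchronization between nodes; the individual components of the service time are not separated by this approach.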

Conclusion

Cyber-physical systems have numerous sources of unpredictability. These sources are mainly related to the inherently changing nature of the environment in which they are immersed, but others derive from the characteristics of the technologies that they use. Cyber-physical systems make intensive use of heterogeneous networks, which strike at the cornerstone of predictable execution. In this paper, we have focused on the middleware level where massive work to fulfill the cyber-physical

Acknowledgement

This research was supported, in part, by REM4VSS (TIN2011-28339) and M2C2 (TIN2014-56158-C4-3-P) project grants of the Spanish Ministry of Economy and Competitiveness.

References (41)

  • M. García-Valls et al., Analyzing point-to-point DDS communication over desktop virtualization software, Comput. Stand. Interfaces (2017)
  • M. García-Valls et al., A real-time perspective of service composition: key concepts and some contributions, J. Syst. Arch. (2013)
  • K. An et al., A cloud middleware for assuring performance and high availability of soft real-time applications, J. Syst. Arch. (2014)
  • Apache Software Foundation, Jini™ Network Technologies Specification, Apache River v2.2.0,...
  • M. Azab, M. Eltoweissy, Towards a cooperative autonomous resilient defense platform for cyber-physical systems,...
  • J. Balasubramanian, et al., NetQoPE: a model-driven network QoS provisioning engine for distributed real-time and...
  • M.M. Bersani, M. García-Valls, The cost of formal verification in adaptive CPS, An example of a virtualized server...
  • H. Cui, J. Simsa, Y.H. Lin, H. Li, B. Blum, X. Xu, J. Yang, G.A. Gibson, R.E. Bryant, Parrot: a practical runtime for...
  • A. Dabholkar, A. Gokhale, An approach to middleware specialization for cyber physical systems, Proceedings of the IEEE...
  • N. Deakin, JSR 343: Java™ Message Service 2.0,...
  • I. Delamer et al., Service-oriented architecture for distributed publish/subscribe middleware in electronics production, IEEE Trans. Ind. Inform. (2006)
  • G. Denker et al., Resilient dependable cyber-physical systems: a middleware perspective, J. Int. Serv. Appl. (2012)
  • D. Ferrari et al., A scheme for real-time channel establishment in wide-area networks, IEEE J. Select. Areas Commun. (1992)
  • M. García-Valls et al., Using DDS in distributed partitioned systems, ACM Sigbed Rev. (2017)
  • M. García Valls et al., Challenges in real-time virtualization and predictable cloud computing, J. Syst. Arch. (2014)
  • M. García Valls, R. Baldoni, Adaptive middleware design for CPS: Considerations on the OS, resource managers, and the...
  • M. García-Valls et al., iLAND: an enhanced middleware for real-time reconfiguration of service oriented distributed real-time systems, IEEE Trans. Ind. Inf. (2013)
  • M. García-Valls, A. Alonso Munoz, J. Ruíz, A. Groba, An architecture of a quality of service resource manager...
  • M. García-Valls, D. Perez-Palacin, R. Mirandola, Time sensitive adaptation in CPS through run-time configuration...
  • M. García-Valls, C. Calva-Urrego, A. Alonso, J.A. de la Puente, Adjusting middleware knobs to suit CPS domains....