Adjusting middleware knobs to assess scalability limits of distributed cyber-physical systems
Introduction
The vision of cyber-physical systems is extremely challenging: they are highly dynamic systems immersed in ultra-large-scale deployments and prone to interference from other subsystems [26]. Although they have inherent real-time requirements, their large-scale structure often contains subsystems with different levels of temporal requirements; these may range from hard safety-critical real-time subsystems to time-sensitive domains such as cloud computing [13] or best-effort ones. Fig. 1 shows a typical deployment of a cyber-physical system for the remote monitoring and control of a factory floor, which involves the monitoring of physical processes. Gathered information may be sent to the cloud to be analysed, resulting in the fine tuning of the factory floor processes.
One challenging aspect of cyber-physical systems is their development cost at both the hardware and software levels, due to the number of techniques, paradigms, and technologies involved. On the software side alone, the development of cyber-physical systems [26] requires mastering a number of different modeling, design, and verification paradigms as well as the associated platform technologies such as operating systems and kernels, network protocol run-time software, middleware technology, and the specific application-level logic. All these technologies have to cooperate to ensure functional correctness as well as essential non-functional properties such as timeliness. Therefore, easing their programming becomes essential, so techniques and technologies that support platform abstraction and reusability have to be integrated in their development.
Middleware solves part of this problem as it clearly favors programmability by providing platform abstraction allowing highly heterogeneous subsystems (or nodes) to effectively interoperate. Also, it is the layer in the software stack where enhanced functionality can be put in place to address requirements such as adaptation and dynamic reconfiguration.
Nevertheless, the cyber-physical systems community (mainly the real-time systems community) has been reluctant to use middleware, as it has typically been seen as a black-box software layer prone to unpredictable behavior. To guarantee predictability, low-level network programming has usually been employed, making direct use of the media access protocols to ensure timely delivery and to calculate a network schedule for real-time communications. However, this has led to inflexible designs in which dynamic behavior can hardly be accommodated. Adding or removing a functional piece or node would typically require redesigning the system and the network transmission schedule.
In this paper, we describe an approach to support adaptivity and dynamic behavior in distributed systems in the context of cyber-physical environments by providing a middleware that performs active monitoring and ensures the service time contracted by the system nodes (called units). The middleware is validated in a soft real-time environment with the dynamic updating of its internal parameters such as the thread pool size. This enables dynamic resizing of the middleware, i.e., on-line modification of the number of threads in the thread pool. This is a key element in cyber-physical systems, as they must face changing situations such as supporting varying numbers of clients. We validate our solution by providing performance results over a specific implementation scenario to assess scalability and timeliness. It is not the intention of our contribution to compete with time-deterministic network schedules or to target real-time properties, but to provide a simple middleware model and assess the type of system that it is appropriate for, according to the temporal values (i.e., service time) that it is able to deliver. We show that the temporal interference caused by the middleware is negligible and does not prevent units from meeting their timing requirements.
This paper is structured as follows. Section 2 presents the state of the art and the most closely related work, organized around the challenges that middleware faces in providing predictability for cyber-physical systems. Section 3 provides an analysis of the specific middleware design issues that affect its performance. Section 4 presents the proposed middleware design to suit cyber-physical domains based on servers that service a dynamic number of units or clients. Section 5 presents the experiments that have been carried out on the middleware implementation and that show the validity of the framework. Section 6 draws some conclusions and presents lines of future work to improve the support offered to cyber-physical domains.
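The on-line resizing of a thread pool mentioned above can be illustrated with a minimal Java sketch. This is not the paper's implementation: the class name `ResizablePool` and its methods are hypothetical and introduced here only for illustration, and the sketch assumes that the standard `java.util.concurrent.ThreadPoolExecutor` is an acceptable stand-in for the middleware's own pool.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical wrapper (not from the paper): a pool whose size can change at run time. */
final class ResizablePool {
    private final ThreadPoolExecutor executor;

    ResizablePool(int threads) {
        // core == max gives a fixed-size pool; pending requests wait in an unbounded queue.
        executor = new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>());
    }

    /** Changes the number of worker threads without stopping the pool. */
    synchronized void resize(int threads) {
        if (threads < 1) {
            throw new IllegalArgumentException("pool needs at least one thread");
        }
        if (threads > executor.getMaximumPoolSize()) {
            // Growing: raise the maximum first so that core never exceeds max.
            executor.setMaximumPoolSize(threads);
            executor.setCorePoolSize(threads);
        } else {
            // Shrinking: lower core first; surplus threads exit once they become idle.
            executor.setCorePoolSize(threads);
            executor.setMaximumPoolSize(threads);
        }
    }

    int size() {
        return executor.getCorePoolSize();
    }

    void submit(Runnable request) {
        executor.execute(request);
    }

    void shutdown() {
        // Already queued requests are still served before the workers terminate.
        executor.shutdown();
    }

    public static void main(String[] args) {
        ResizablePool pool = new ResizablePool(2);
        pool.resize(8); // e.g. more units have connected
        for (int i = 0; i < 16; i++) {
            final int id = i;
            pool.submit(() -> System.out.println(
                    "request " + id + " served by " + Thread.currentThread().getName()));
        }
        pool.resize(4); // load has dropped again
        pool.shutdown();
    }
}
```

The order of the two setter calls matters: `ThreadPoolExecutor` rejects a core size larger than the maximum, and a maximum smaller than the core size, so the sketch raises the maximum first when growing and lowers the core size first when shrinking.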
Challenges and approaches
There is no perfect solution for cyber-physical systems design and development, especially when approaching the problem from the middleware level. This section describes the main challenges faced by a distributed system in a cyber-physical environment in which communication is enabled by middleware. State-of-the-art solutions are presented, classified by the challenge they address.
Network unpredictability. Cyber-physical systems are intensive in the usage of network communications of
Middleware performance
Most off-the-shelf communication middleware is designed for general-purpose domains that are not real-time. Examples are RMI (Java Remote Method Invocation), Jini or River for service-based programming, AMQP (Advanced Message Queuing Protocol), JMS (Java Messaging Service), or web services [30], among others. These technologies are silent about the timing requirements on the operation of units, and some of them, e.g., Jini, introduce high communication overhead due to the discovery logic.
In a
Processes and data models
The overall behavior and functions of the middleware are presented in Fig. 3. To build a distributed system in a cyber-physical systems environment, an initial off-line phase is proposed for tuning the middleware; this phase makes it possible to assess the execution limits of the middleware. The inputs to be considered in the off-line tuning phase are: (1) the specification and requirements of the units in the temporal domain; and (2) the execution platform characteristics such as the hardware
Setting description and baseline experiment
This section presents the validation of the proposed middleware design. The driving goal is to show the stability of the middleware operation through a performance analysis of different scenarios with varying conditions and a progressive load increase. The empirical results present the following parameters:
- The service time (s_time), which is the total request time for a unit request; it includes the network transmission time, the server processing time, and the protocol processing at the nodes.
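As a reading aid, the definition of the service time can be written as the sum of its three contributions. The symbols t_net (network transmission time), t_srv (server processing time), and t_proto (protocol processing at the nodes) are labels introduced here for illustration and do not appear in the original text; treating the contributions as strictly additive is likewise an assumption.

```latex
s\_time = t_{net} + t_{srv} + t_{proto}
```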
Conclusion
Cyber-physical systems have numerous sources of unpredictability. These sources are mainly related to the inherently changing nature of the environment in which they are immersed, but others derive from the characteristics of the technologies that they use. Cyber-physical systems are intensive in the usage of heterogeneous networks, which strikes at the cornerstone of predictable execution. In this paper, we have focused on the middleware level where massive work to fulfill the cyber-physical
Acknowledgement
This research was supported, in part, by REM4VSS (TIN2011-28339) and M2C2 (TIN2014-56158-C4-3-P) project grants of the Spanish Ministry of Economy and Competitiveness.
References (41)
- et al., Analyzing point-to-point DDS communication over desktop virtualization software, Comput. Stand. Interfaces (2017)
- et al., A real-time perspective of service composition: key concepts and some contributions, J. Syst. Arch. (2013)
- et al., A cloud middleware for assuring performance and high availability of soft real-time applications, J. Syst. Arch. (2014)
- Apache Software Foundation, Jini™ Network Technologies Specification, Apache River v2.2.0,...
- M. Azab, M. Eltoweissy, Towards a cooperative autonomous resilient defense platform for cyber-physical systems,...
- J. Balasubramanian, et al., NetQoPE: a model-driven network QoS provisioning engine for distributed real-time and...
- M.M. Bersani, M. García-Valls, The cost of formal verification in adaptive CPS, An example of a virtualized server...
- H. Cui, J. Simsa, Y.H. Lin, H. Li, B. Blum, X. Xu, J. Yang, G.A. Gibson, R.E. Bryant, Parrot: a practical runtime for...
- A. Dabholkar, A. Gokhale, An approach to middleware specialization for cyber physical systems, Proceedings of the IEEE...
- N. Deakin, JSR 343: Java™ Message Service 2.0,...
- Service-oriented architecture for distributed publish/subscribe middleware in electronics production, IEEE Trans. Ind. Inform.
- Resilient dependable cyber-physical systems: a middleware perspective, J. Int. Serv. Appl.
- A scheme for real-time channel establishment in wide-area networks, IEEE J. Select. Areas Commun.
- Using DDS in distributed partitioned systems, ACM Sigbed Rev.
- Challenges in real-time virtualization and predictable cloud computing, J. Syst. Arch.
- iLAND: an enhanced middleware for real-time reconfiguration of service oriented distributed real-time systems, IEEE Trans. Ind. Inf.