Elsevier

Measurement

Volume 108, October 2017, Pages 152-162

Architecture for hybrid modelling and its application to diagnosis and prognosis with missing data

https://doi.org/10.1016/j.measurement.2017.02.003

Highlights

  • An architecture for hybrid modelling is proposed oriented to diagnosis and prognosis.

  • Physics-based and data-driven modelling are combined to deal with missing data.

  • Context-driven services are identified as the key to increase the results’ accuracy.

  • A test case is presented for rotating machinery, determining the state of a bearing.

  • A multi-body model and a semi-supervised learning algorithm are used in this work.

Abstract

The advances in technology involving internet of things, cloud computing and big data mean a new perspective in the calculation of reliability, maintainability, availability and safety by combining physics-based modelling with data-driven modelling. This paper proposes an architecture to implement hybrid modelling based on the fusion of real data and synthetic data obtained in simulations using a physics-based model. This architecture has two levels of analysis: an online process carried out locally and virtual commissioning performed in the cloud. The former results in failure detection analysis to avoid upcoming failures whereas the latter leads to both diagnosis and prognosis. The proposed hybrid modelling architecture is validated in the field of rotating machinery using time-domain and frequency-domain analysis. A multi-body model and a semi-supervised learning algorithm are used to perform the hybrid modelling. The state of a rolling element bearing is analysed and accurate results for fault detection, localisation and quantification are obtained. The contextual information increases the accuracy of the results; the results obtained by the model can help improve maintenance decision making and production scheduling. Future work includes a prescriptive analysis approach.

Introduction

Information and communication technologies (ICTs) have made considerable progress in sensing, data storage, data mining, simulation capabilities and online computing. When these ICTs are integrated with physical assets, the result is a cyber-physical system. The internet of things and big data have transformed the connectivity between the elements of an enterprise, leading to the fourth stage of industrialisation, known as Industry 4.0 [1], characterised by smart assets that dynamically interact with each other and with all levels of the business model.

The self-awareness and self-capabilities of cyber-physical systems have an impact on maintenance. Since its first application in the middle of the 20th century to the automotive, military and aerospace industries, condition based maintenance (CBM) has proved to be an efficient maintenance policy from economic and safety points of view. It has many advantages over the two traditional approaches, corrective and scheduled maintenance, including the ability to anticipate fatal failure thanks to updated information on the state of assets. This results in better maintenance planning: tasks are performed only when strictly necessary, the risk of reaching a faulty state is reduced, and the use of the asset is maximised. Thus, despite the need for an initial investment when first applying this maintenance approach, it reduces the cost of maintenance [2].

CBM, combined with proper decision support systems, leads to a maximisation of resources and increased productivity and, therefore, to business efficiency. Taking this to the next level of analysis and opening the door to 21st century realities, Lee et al. [3] propose a five-level structure (named 5C architecture) for the development of cyber-physical systems within a manufacturing environment. These five levels, shown in Fig. 1, have the following functions: acquiring and storing data using a sensor network; transforming the acquired data into valuable information; connecting the assets to get more knowledge about the individuals using the network information; developing proper human-machine interfaces; and allowing the assets themselves to make decisions about their operation. With this intelligent equipment, the maintenance processes can be automated in such a way as to optimise resources. These tasks require a distributed system framework, as centralising the information decreases the performance [4].

The application of CBM has a number of challenges: defining the problem for which the maintenance policy is to be applied, identifying the application level, measuring performance, selecting the method to apply diagnosis and prognosis, defining the sensing strategy, defining the monitoring strategy, allocating time and resources to conduct experiments, selecting the solution, and, finally, analysing the costs and benefits [5]. This paper focuses on one of these challenges: selecting the methods for carrying out diagnosis and prognosis. The objective of the diagnosis process is to examine symptoms and syndromes to determine the nature of faults or failures (kind, situation, extent), whereas prognosis deals with the analysis of the symptoms of faults to predict future condition and residual life within design parameters [6].

A new trend in modelling combines the two traditional methods, i.e. physics-based modelling and data-driven modelling [7]. Physics-based modelling relies on first principles to construct a set of ordinary or partial differential equations representing the dynamics of the system in certain conditions, even those difficult to achieve by testing. Simulations of different component faults can be done without any cost associated with damage seeding [8]. In contrast, data-driven modelling is based on the construction of a set of equations without any knowledge of the system, simply by relating the inputs to a set of outputs by means of a learning process using a large amount of data obtained from the monitored asset. The combination of these modelling approaches, known as hybrid modelling, uses the advantages of each approach. Its ability to fuse data from different sources could be extremely helpful to maintenance decision making and production scheduling [9].

The new technologies mentioned at the beginning can be used to good effect in hybrid modelling. For one thing, the connectivity of cyber-physical assets allows them to share data and receive information about the required tasks. As a result, they become self-aware and can self-actuate. For another, the computational cost related to the use of physics-based modelling for virtual commissioning is reduced by the use of supercomputers in a cloud computing framework [10]. Moreover, big data capabilities lead to the proper management of the high volume of data stored over the life of the machines of an industry and to improved data mining. In short, hybrid modelling can be used to create smart assets, thereby facilitating and improving CBM.

Despite its obvious promise, at this point, little work discusses hybrid modelling. The diagnosis process is tackled by Matei et al. [11]. They propose a hybrid framework for a railway switch, obtaining accurate results in detection but succeeding only partially in fault identification. Medjaher and Zerhouni [12] present a two-phase methodology for hybrid prognosis. The first phase develops a physics-based model in both healthy and damaged conditions; the second phase computes the residuals when comparing the measurements with the simulation results. These residuals are indicative of the state of the monitored asset; its remaining useful life (RUL) can be computed by comparing the residuals with a predefined performance. A framework called hybrid mathematical informational modelling (HMIM) in which a neural network is used to analyse the differences between the measurements and the mathematical model’s response is proposed by Ghaboussi et al. [13]. They apply the HMIM framework to a beam-to-column connection and conclude that the hybrid approach is capable of representing the issues the mathematical model cannot capture by itself. In contrast, Didona and Romano [14] propose a data fusion framework for measurements and synthetic data generated by a physics-based model and study its implementation in computer systems. Other authors present a data fusion strategy for the prognosis of rolling element bearings (REBs) in flight operating conditions [15] and the degradation of a battery [16]. Data fusion can also relate continuous data with categorical data; the latter type are very common in the real world. Working in this area, Otey et al. [4] suggest a model for outliers and anomaly detection. Although some authors propose frameworks for hybrid modelling [11], [12], [13], there are no clear architectures for this purpose in the research literature.
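The residual-based idea described above for hybrid prognosis can be illustrated with a minimal sketch. It assumes a scalar health indicator, a healthy-condition simulation sampled at the same instants as the measurements, and a straight-line extrapolation of the residual up to a predefined threshold. The linear trend, the function names and the data are illustrative assumptions and are not taken from [12].

```python
from statistics import mean


def residuals(measured, simulated):
    """Absolute difference between each measurement and the healthy-model output."""
    if len(measured) != len(simulated):
        raise ValueError("series must have the same length")
    return [abs(m - s) for m, s in zip(measured, simulated)]


def linear_trend(times, values):
    """Ordinary least-squares slope and intercept of values against times.

    Assumes at least two distinct time stamps.
    """
    t_bar, v_bar = mean(times), mean(values)
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, v_bar - slope * t_bar


def remaining_useful_life(times, measured, simulated, threshold):
    """Time left until the extrapolated residual reaches `threshold`.

    Returns None when the residual shows no upward trend and 0.0 when the
    extrapolated residual has already reached the threshold.
    """
    res = residuals(measured, simulated)
    slope, intercept = linear_trend(times, res)
    if slope <= 0:
        return None
    t_cross = (threshold - intercept) / slope
    return max(0.0, t_cross - times[-1])


# Hypothetical data: the healthy model output is constant, the measurements drift.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
simulated = [1.0, 1.0, 1.0, 1.0, 1.0]
measured = [1.0, 1.5, 2.0, 2.5, 3.0]
rul = remaining_useful_life(times, measured, simulated, threshold=4.0)
print(rul)
```

In practice the degradation trend is rarely linear, so the extrapolation step is the part most likely to be replaced by a model fitted to the asset in question.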

There are two important things to consider when implementing a CBM strategy for an asset: how to deal with missing data on its reliability evolution and the role of contextual information in its operation.

Maintenance data are formed by pieces of information very different in nature. Data can be acquired from sensors placed at different points of the asset while the operators produce information, either handwritten or digitalised, including work orders, maintenance reports, information about stocks and maintenance planning, among others. These latter records often have poor quality because of inappropriate reporting equipment, missing or lost information during data migration from paper documents to digital sources, unrecorded events, etc. [17].

Another common scenario is a lack of data when assets cannot be operated to their maintenance limit. This situation occurs in many industries, such as the transport, energy or chemistry sectors, in which safety is more important than other factors of efficiency and reliability. Only those elements of low criticality are operated until failure; all components and subsystems affecting the safety of the systems are replaced in early stages of degradation, even far from the maintenance limit, because of strong regulatory conditions. A lack of data also occurs with early replacements of components as a result of opportunistic maintenance [9]. Overprotection and excessive maintenance tasks lead to a situation in which maintainers have little historical information about the behaviour of the assets – a handicap when trying to estimate their future response.

Other causes of missing or incomplete data are sensor failure, communication failure and storage size restrictions. Thus, there is interest in preparing the models in advance to overcome these limitations. The literature contains some approaches based on data-driven modelling, in which artificial neural networks are used to improve diagnosis in systems such as wind turbines or cutting machines [18], [19].

It should be highlighted that the aforementioned assets are considered to be systems of systems, characterised by nonlinear structures, with a large scale spatial scope, dynamic and responsive behaviour, and going beyond a single scientific discipline [20]. This implies complexity when implementing a maintenance approach as the interaction between components and subsystems means it is difficult to obtain a complete fault catalogue corresponding to all the individual systems and to the different combinations when they work together and produce new faults.

To summarise, there is a lack of data on the operation of these assets. The amount of data available for maintenance planning is schematically represented in Fig. 2. As the figure shows, few components can be operated until failure (i.e., minimum criticality) or have no degradation. Thus, in the majority of cases, data are obtained until intermediate points are reached between the operating start points and the maintenance threshold. These data points are called suspensions.
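One common way to make use of suspensions in reliability estimation is the Kaplan-Meier estimator, which treats them as right-censored observations. The sketch below is offered only as an illustration of how such records can contribute information; the estimator is not part of the method described in this paper, and the record format and operating hours are hypothetical.

```python
def kaplan_meier(records):
    """Kaplan-Meier survival estimate from failure and suspension records.

    records: list of (time, failed) pairs, where failed=False marks a
    suspension (the unit was removed before reaching failure).
    Returns a list of (time, survival probability) at each failure time.
    """
    ordered = sorted(records)
    at_risk = len(ordered)
    survival = 1.0
    curve = []
    i = 0
    while i < len(ordered):
        t = ordered[i][0]
        same_time = [r for r in ordered if r[0] == t]
        failures = sum(1 for _, failed in same_time if failed)
        if failures:
            survival *= 1 - failures / at_risk
            curve.append((t, survival))
        # Failures and suspensions both leave the population at risk.
        at_risk -= len(same_time)
        i += len(same_time)
    return curve


# Hypothetical operating hours: two failures and three suspensions.
records = [(100, False), (150, True), (200, False), (250, True), (300, False)]
curve = kaplan_meier(records)
for hours, probability in curve:
    print(hours, probability)
```

Suspensions never lower the survival estimate directly, but they shrink the population at risk, which is why each later failure weighs more heavily.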

Given the lack of data, the use of synthetic data generated by physics-based models describing the operation of the assets is a must. Those scenarios involving common operating conditions can be simulated, as well as those difficult to reproduce in real operation, such as extreme operating conditions or damage situations that cannot be seeded to the system to learn about its behaviour because of safety, economic or environmental reasons. The fusion of synthetic data and acquired data from sensors placed in the assets combining physics-based modelling techniques with purely data-driven methods results in a hybrid modelling approach.
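As a rough illustration of this fusion, the sketch below labels unlabelled sensor features using labelled synthetic features as the starting point, then refines the class centroids with the newly labelled measurements. It is a simple self-training nearest-centroid scheme chosen for brevity, not the semi-supervised algorithm used in this work. The feature pairs (here imagined as RMS and kurtosis values), class names and data are hypothetical.

```python
def centroid(points):
    """Component-wise mean of a non-empty list of feature tuples."""
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dims))


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_label(point, centroids):
    """Label of the centroid closest to `point`."""
    return min(centroids, key=lambda label: squared_distance(point, centroids[label]))


def fuse_and_label(synthetic, measured, iterations=10):
    """Label measured data using labelled synthetic data as seeds.

    synthetic: dict mapping a class label to a list of feature tuples
               produced by simulation.
    measured:  list of unlabelled feature tuples acquired from sensors.
    Returns (labels assigned to the measured points, final centroids).
    """
    centroids = {label: centroid(points) for label, points in synthetic.items()}
    assigned = []
    for _ in range(iterations):
        new_assigned = [nearest_label(p, centroids) for p in measured]
        if new_assigned == assigned:
            break  # assignments are stable
        assigned = new_assigned
        # Recompute each centroid from synthetic and newly labelled real data.
        for label, points in synthetic.items():
            members = points + [p for p, a in zip(measured, assigned) if a == label]
            centroids[label] = centroid(members)
    return assigned, centroids


# Hypothetical (RMS, kurtosis) features.
synthetic = {
    "healthy": [(0.5, 3.0), (0.6, 3.1)],
    "faulty": [(1.5, 7.0), (1.6, 7.5)],
}
measured = [(0.55, 3.2), (1.4, 6.8), (0.7, 3.5)]
labels, centroids = fuse_and_label(synthetic, measured)
print(labels)
```

The synthetic data supply the fault classes that are missing from the field data, while the measurements pull the centroids towards the real operating conditions.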

When combining the modelling strategies and fusing data to improve maintenance performance, the concept of context is a key to efficient diagnosis and prognosis [21]. Context is defined as “any information that can be used to characterise the situation of entities (i.e., whether a person, place, or object) that are considered relevant to the interaction between a user and an application, including the user and the application themselves” [22]. This definition, applied to the field of maintenance, means having the appropriate information on the conditions of the operation of an asset. The context includes information such as working temperature, humidity, applied loads, operating speed and information about any other system with which the asset interacts, among others. The context has a great influence on the behaviour of physical assets and should not be omitted. Sensors must be added to obtain this information and provide context-driven services, as shown in Fig. 3.

As mentioned, this paper proposes a framework for hybrid modelling combining the physics-based and data-driven modelling approaches and using both data fusion and context awareness. It suggests a two-level architecture. One level is responsible for analysing the condition monitoring (CM) data acquired from an asset for early failure detection. The second level carries out the virtual emulation of the behaviour of the asset and performs deeper analysis using data fusion. The architecture is validated by being applied to a rotating machine; more specifically, the response of a gearbox’s REBs is monitored.
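The division of work between the two levels can be sketched as follows, assuming that the local level computes inexpensive time-domain features and escalates to the second level only when a limit is exceeded. The chosen features (RMS and kurtosis), the limits, the context fields and the placeholder second level are illustrative assumptions rather than the implementation described in Section 2.

```python
import math


def rms(signal):
    """Root mean square of the signal."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def kurtosis(signal):
    """Fourth standardised moment (about 3 for Gaussian noise)."""
    n = len(signal)
    mu = sum(signal) / n
    m2 = sum((x - mu) ** 2 for x in signal) / n
    m4 = sum((x - mu) ** 4 for x in signal) / n
    if m2 == 0:
        return 0.0  # constant signal: no spread to characterise
    return m4 / (m2 * m2)


def local_level(signal, rms_limit, kurtosis_limit):
    """Level 1: cheap online check on condition monitoring data."""
    features = {"rms": rms(signal), "kurtosis": kurtosis(signal)}
    alarm = features["rms"] > rms_limit or features["kurtosis"] > kurtosis_limit
    return alarm, features


def cloud_level(features, context):
    """Level 2 placeholder: stands in for simulation and data fusion."""
    return {"status": "queued for hybrid analysis",
            "features": features,
            "context": context}


def monitor(signal, context, rms_limit=1.0, kurtosis_limit=4.0):
    """Run the local check and escalate only when it raises an alarm."""
    alarm, features = local_level(signal, rms_limit, kurtosis_limit)
    if not alarm:
        return {"status": "healthy", "features": features}
    return cloud_level(features, context)


# Hypothetical vibration samples and operating context.
context = {"speed_rpm": 1500, "load_N": 200}
smooth = [0.5, -0.5] * 8
impulsive = [0.1, -0.1] * 7 + [3.0, -3.0]
print(monitor(smooth, context)["status"])
print(monitor(impulsive, context)["status"])
```

Keeping the first level this simple is what allows it to run continuously next to the asset, while the costlier analysis is deferred to the second level and supplied with the context it needs.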

The paper has the following structure. The architecture for hybrid modelling for CBM is proposed in Section 2; Section 3 validates the architecture by applying it to a rotating machine; finally, Section 4 offers concluding remarks.

Section snippets

Proposed architecture for hybrid modelling

The proposed hybrid modelling architecture seeks to perform diagnosis and prognosis to provide the information required to optimise operation and maintenance based on the RUL, assuring the reliability and safety of the monitored asset. The workflow of this architecture is depicted in Fig. 4. In this scenario, a physical asset is given the technology necessary for it to acquire smart capabilities. Meeting this goal requires the use of complementary processes and tools.

The architecture is based

Validation of the architecture

This section explains the validation of the proposed architecture, specifically, the hybrid methodology of data fusion and the combination of modelling techniques. The section begins by describing the physical asset used for the validation process. It goes on to explain the physics-based model developed to represent the behaviour of the asset. Next, it introduces the data-driven modelling and the data generation process. It concludes by presenting and discussing the results.

Conclusions

Optimising both maintenance resources and maintenance costs is a key concern of maintainers. In recent decades, CBM has proved to be a useful tool to achieve these goals. New technologies involving big data, cloud computing and virtual commissioning are now being used in Industry 4.0 to strengthen maintenance. Hybrid modelling is still in its infancy, but it has great potential to improve diagnosis and prognosis and, consequently, to optimise maintenance.

This paper proposes an

Acknowledgements

This study is partially funded by the Ministry of Economy and Competitiveness of the Spanish Government under the Retos-Colaboración Program (LEMA project, RTC-2014-1768-4). Any opinions, findings and conclusions expressed in this article are those of the authors and do not necessarily reflect the views of funding agencies. The authors would also like to thank Fundación de Centros Tecnológicos – Iñaki Goenaga.

References (51)

  • J. Antoni et al.

    Unsupervised noise cancellation for vibration signals: part II – a novel frequency-domain algorithm

    Mech. Syst. Signal Process.

    (2004)
  • H. Endo et al.

    Enhancement of autoregressive model based gear tooth fault detection technique by the use of minimum entropy deconvolution filter

    Mech. Syst. Signal Process.

    (2007)
  • J. Antoni

    The spectral kurtosis: a useful tool for characterising non-stationary signals

    Mech. Syst. Signal Process.

    (2006)
  • J. Antoni

    Fast computation of the kurtogram for the detection of transient faults

    Mech. Syst. Signal Process.

    (2007)
  • Z. Zeng et al.

    Semi-supervised feature selection based on local discriminative information

    Neurocomputing

    (2016)
  • W. Pedrycz

    Algorithms of fuzzy clustering with partial supervision

    Pattern Recogn. Lett.

    (1985)
  • A. Prajapati et al.

    Condition based maintenance: a survey

    J. Quality Maintenance Eng.

    (2012)
  • M.E. Otey et al.

    Fast distributed outlier detection in mixed-attribute data sets

    Data Mining Knowl. Discovery

    (2006)
  • J. Lee et al.

    A systematic approach for predictive maintenance service design: methodology and applications

    Int. J. Internet Manufact. Services

    (2009)
  • ISO13372:2012

    Condition Monitoring and Diagnostics of Machines – Vocabulary

    (2012)
  • F. Ahmadzadeh et al.

    Remaining useful life estimation: review

    Int. J. Syst. Assurance Eng. Manage.

    (2014)
  • M. Mishra et al.

    Modelización híbrida para el diagnóstico y pronóstico de fallos en el sector del transporte: Datos adquiridos y datos sintéticos

    Dyna

    (2015)
  • I. Matei et al.

    The case for a hybrid approach to diagnosis: a railway switch

  • K. Medjaher et al.

    Framework for a hybrid prognostics

    Chem. Eng. Trans.

    (2013)
  • J. Ghaboussi et al.

    Hybrid modelling framework by using mathematics-based and information-based methods

    IOP Conf. Ser.: Mater. Sci. Eng.

    (2010)

Cited by (38)

    • Visualization methodology of the health state for wind turbines based on dimensionality reduction techniques

      2022, Sustainable Energy Technologies and Assessments
      Citation Excerpt :

      In this context, it is necessary and of great importance to devise methods to track the performance degradation process of the WTs and visualize their health state by extracting effective information from SCADA data based on the data-driven method. The data-driven method includes model-based and similarity metric methods [8–11]. The former, also known as the residual method, is a “black box” machine learning approach that has disadvantages such as lack of physical explanation, sample size dependency, and under/over-fitting.

    • Prognostics and Health Management (PHM): Where are we and where do we (need to) go in theory and practice

      2022, Reliability Engineering and System Safety
      Citation Excerpt :

      A model based on Auto-Regressive Moving Average (ARMA) and Auto-Associative Neural Networks (AANN), has been developed for fault diagnostics and prognostics of water process systems with incomplete data [130]. An integrated Extreme Learning Machine (ELM)-based imputation-prediction scheme for prognostics of battery data with missing data [125] and an hybrid architecture of physics-based and data-driven approaches have been proposed to deal with missing data in a rotating machinery prognostic application [131]. In the medical field, a Bayesian simulator has been used to generate missing data for developing prognostic models [132] and a Multiple Imputation approach has been embedded within a prognostic model for assessing overall survival of ovarian cancer in presence of missing covariate data [133].

    • Semi-supervised data modeling and analytics in the process industry: Current research status and challenges

      2021, IFAC Journal of Systems and Control
      Citation Excerpt :

      Potocnik and Govekar (2017) proposed a semi-supervised vibration-based classification and condition monitoring method for compressors, which combines feature extraction, principal component analysis, and statistical analysis for the extraction of initial class representatives, and compares the capability of various classification methods. Leturiondo et al. (2017) built an architecture for hybrid modeling and fault diagnosis and prognosis applications with missing data. In this framework, A multi-body model and a semi-supervised learning algorithm have been used to perform the hybrid modeling.

    • Fault prognostics by an ensemble of Echo State Networks in presence of event based measurements

      2020, Engineering Applications of Artificial Intelligence
      Citation Excerpt :

      To the best of our knowledge, few research works have considered fault prognostics in presence of missing data. A model based on Auto-Regressive Moving Average (ARMA) and an auto-associative neural networks, is developed for fault diagnostics and prognostics of water processes with incomplete data (Xiao et al., 2017) and an hybrid architecture including physics-based and data-driven approaches are proposed to deal with missing data in case of rotating machinery (Leturiondo et al., 2017). In the medical field, a Bayesian simulator is used to generate missing data for developing prognostic models (Marshall et al., 2010) and a Multiple Imputation approach is used within a prognostic model for assessing overall survival of ovarian cancer in presence of missing covariate data (Clark and Altman, 2003).
