Data classification and MTBF prediction with a multivariate analysis approach

https://doi.org/10.1016/j.ress.2011.09.010

Abstract

The paper presents a multivariate statistical approach that supports the classification of mechanical components, subjected to specific operating conditions, in terms of the Mean Time Between Failures (MTBF). Assessing the influence of working conditions and/or environmental factors on the MTBF is a prerequisite for the development of an effective preventive maintenance plan. However, this task may be demanding and it is generally performed with ad-hoc experimental methods that lack statistical rigor. To solve this common problem, a step-by-step multivariate data classification technique is proposed. Specifically, a set of structured failure data is classified in a meaningful way by means of: (i) cluster analysis, (ii) multivariate analysis of variance, (iii) feature extraction and (iv) predictive discriminant analysis. This makes it possible not only to define the MTBF of the analyzed components, but also to identify the working parameters that explain most of the variability of the observed data.

Finally, the approach is demonstrated on 126 centrifugal pumps installed in an oil refinery plant; the results obtained demonstrate the quality of the final discrimination, in terms of data classification and failure prediction.

Introduction

Nowadays, the increasing demand for productivity and equipment availability, together with shrinking profit margins, makes it necessary to enhance reliability and boost performance while keeping operating costs low. In a scenario where maintenance costs can make or break a business, the development of a Total Productive Maintenance (TPM) plan and/or the use of preventive maintenance (PM) should be considered as promising solutions [1]. Indeed, PM makes it possible to minimize maintenance and failure costs by scheduling standard maintenance activities just before a failure occurs. To this end, a reliable estimate of the equipment's hazard rate is needed. Unfortunately, understanding the underlying failure processes and predicting when a piece of equipment might fail is challenging, since failure depends on several parameters whose relevance is almost impossible to quantify [2], [3], [4].

To simplify the analysis, most PM models are based on the assumption of a constant hazard rate [5], [6], although this hypothesis is seldom appropriate in an industrial setting. Attempts to link the working conditions with the hazard rate have been made for electrical devices only [7], [8]. Even in this case, the hazard rate was computed from experiments run under specific operating conditions, and it is almost impossible to prove a relation between theoretical results and field performance, as formally demonstrated by the theory of roller-coaster models [9], [10], [11], [12], [13].
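
As a reminder of what the constant hazard rate assumption implies, the relations below give the standard exponential model. The symbols \lambda (constant hazard rate) and R(t) (reliability function) are notation introduced here for illustration and do not come from the paper.

```latex
h(t) = \lambda, \qquad
R(t) = \exp\!\left(-\int_0^t h(u)\,du\right) = e^{-\lambda t}, \qquad
\mathrm{MTBF} = \int_0^{\infty} R(t)\,dt = \frac{1}{\lambda}
```

Under this model the MTBF is fully determined by a single number, which is why the assumption is convenient and also why it cannot capture wear-out or the dependence on working conditions.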

Due to the above-mentioned issues, it is common industrial practice to estimate component lifetimes using ad-hoc experimental methods [3]. For instance, the periodic inspection intervals of PM activities are frequently planned using the reliability estimates made by the original equipment manufacturers (OEMs) as a guideline [14], even though such judgments are conservative and precautionary and may lead to an excessive intervention frequency. To counteract this drawback, the hazard rate can be estimated empirically from the results of inspection activities: any time a maintenance task is performed, the repaired or replaced item is disassembled and its wear is analyzed in detail. If the inspected item is still capable of performing its primary function, the periodic inspection interval is increased and the reliability estimate is progressively refined. Nonetheless, this practical approach is questionable not only for its empirical nature, but also because it takes a long time to find the optimal inspection frequency. Furthermore, owing to higher system complexity and higher demand for system availability, the above-mentioned time-based approaches have become inefficient in many cases [15].

With new developments in both informatics and technology, a more advanced approach is to collect off-line and on-line data in order to define a warning limit and/or to build a statistical model for the time to failure [16]. This can be done with different techniques, such as parametric or logistic regression [17], [18], but the most widely accepted one is Cox's Proportional Hazards Model (PHM) [19], [20], which allows condition variables and/or technical features to be used as covariates in order to find their effects on the lifetime of a component [21]. Several contributions have been made to adapt PHM to repairable systems [22], [23], and most of them propose using time-dependent covariates and a Weibull baseline hazard function. Briefly, what differentiates the alternative models is the probabilistic function used to describe the evolution of the covariates over time, which is generally obtained using non-homogeneous Poisson processes [24], [25] or non-homogeneous Markov chains [26]. Recently, some interesting extensions have also been made to include additional elements, such as multi-component systems [27], [28] and imperfect preventive maintenance tasks [16], [29].
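
For reference, a PHM with time-dependent covariates and a Weibull baseline is usually written in the form below. This is the textbook formulation rather than the specific model of any cited work; the symbols are assumptions introduced here: z(t) is the covariate vector, \gamma the vector of regression coefficients, and \beta and \eta the Weibull shape and scale parameters.

```latex
h\bigl(t \mid z(t)\bigr) = h_0(t)\,\exp\!\bigl(\gamma^{\top} z(t)\bigr),
\qquad
h_0(t) = \frac{\beta}{\eta}\left(\frac{t}{\eta}\right)^{\beta - 1}
```

The covariates scale the baseline hazard multiplicatively, so a positive coefficient means that larger values of the corresponding covariate shorten the expected lifetime.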

Notwithstanding their merits, PHM approaches have some drawbacks that limit their applicability in industry [3], [5], [15]. The first one is due to the use of Markov chains to describe the evolution of the time-dependent covariates. As a result, the number of possible system states grows exponentially with the number of covariates: if there are k covariates and each of them can take n values, then there are up to n^k possible states to be defined. This fact, together with the need for frequent monitoring of the covariates (i.e. installation of sophisticated sensors and extensive data collection and analysis), may lead to unaffordable computational costs [15].
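
To make the state-space growth concrete, the short sketch below counts and enumerates the joint states of k covariates with n discrete levels each, which number n multiplied by itself k times. The function names and the chosen values of k and n are hypothetical and serve only as an illustration.

```python
from itertools import product


def count_states(k: int, n: int) -> int:
    """Upper bound on joint states: k covariates, n levels each -> n**k."""
    return n ** k


def enumerate_states(k: int, n: int) -> list:
    """Explicitly list every joint state as a tuple of covariate levels."""
    return list(product(range(n), repeat=k))


if __name__ == "__main__":
    # With n fixed, each additional covariate multiplies the state count by n.
    for k in (2, 4, 8):
        print(f"k={k}, n=5 -> {count_states(k, 5)} states")
```

Even a modest number of discretized covariates therefore yields a transition structure that is impractical to estimate from field data.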

A second issue is the so-called collinearity problem: if some linear combinations of covariates are highly correlated, a PHM model could lead to unrealistic results. Although possible solutions have been proposed [15], [30], they increase the complexity of the model and require statistical tools and competencies that are rarely available in the industry.

Starting from these considerations, some non-parametric approaches to hazard rate analysis have been proposed by Bevilacqua et al. [31], [32] and by Bellandi et al. [33]. These simpler but more practical models are based on Classification/Regression Trees and on Neural Networks and have proved their validity in several industrial applications. The present work belongs to this field of research and proposes a new approach for (i) the determination of the MTBF of mechanical items subjected to different working conditions and (ii) the discrimination of the working parameters responsible for the difference in MTBF. Specifically, since the MTBF of mechanical equipment depends on several variables, a set of maintenance data (stored in a Computerized Maintenance Management System, CMMS) is analyzed by means of a step-by-step procedure based on multivariate discriminating techniques. The approach grants statistical rigor and ensures ease of use, since all the adopted techniques are fully implemented in most commercial statistical packages.

Section snippets

MTBF classification framework

As detailed above, our interest is in classification and prediction based on reliability data. The underlying idea is to partition a set of items into classes with respect to the MTBF and to associate a different maintenance policy with each class. Next, once the MTBF classes have been created, a classification procedure is used to assign new entries to a specific class, using technical features and the expected working conditions as input. For instance, for those equipments
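
The two-stage idea described in this snippet (first partition items into MTBF classes, then assign a new item to a class from its features) can be illustrated with a minimal pure-Python sketch. It is not the authors' procedure: a one-dimensional k-means stands in for the cluster analysis, a nearest-centroid rule is a deliberately simplified stand-in for predictive discriminant analysis, and the MANOVA and feature-extraction steps are omitted. All function names, MTBF values and features are hypothetical.

```python
from statistics import mean


def kmeans_1d(values, k, iters=100):
    """Lloyd's algorithm on scalar values; returns (centroids, labels)."""
    srt = sorted(values)
    # Deterministic start: spread the initial centroids over the sorted data.
    centroids = [srt[(2 * i + 1) * len(srt) // (2 * k)] for i in range(k)]
    labels = []
    for _ in range(iters):
        # Assignment step: each value goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c]))
                  for v in values]
        # Update step: each centroid becomes the mean of its members.
        new = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new.append(mean(members) if members else centroids[c])
        if new == centroids:
            break
        centroids = new
    return centroids, labels


def fit_class_centroids(features, labels):
    """Mean feature vector of each MTBF class."""
    by_class = {}
    for x, lab in zip(features, labels):
        by_class.setdefault(lab, []).append(x)
    return {lab: [mean(col) for col in zip(*rows)]
            for lab, rows in by_class.items()}


def classify(x, class_centroids):
    """Assign x to the class whose feature centroid is closest (Euclidean)."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(class_centroids, key=lambda lab: sq_dist(x, class_centroids[lab]))


if __name__ == "__main__":
    # Hypothetical data: MTBF in months and (temperature, speed) per item.
    mtbf = [6, 7, 8, 30, 32, 35]
    feats = [(90, 3000), (95, 2900), (88, 3100),
             (40, 1500), (45, 1450), (38, 1550)]
    _, labels = kmeans_1d(mtbf, k=2)
    centroids = fit_class_centroids(feats, labels)
    print("MTBF class of new item:", classify((85, 2800), centroids))
```

In practice the features would be standardized before computing distances, and a proper discriminant analysis would also account for the within-class covariance; both are left out here to keep the sketch short.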

Case study

This section of the paper presents an industrial application and shows the quality and the practical potential of the proposed framework.

Conclusions and future works

The paper presented an innovative framework, based on multivariate statistics, which makes it possible to classify items in terms of the MTBF and to identify the operating parameters that influence their reliability. The model explicitly refers to electromechanical equipment that, in terms of reliability, is highly affected by the operating conditions and for which the development of a mathematical formulation of the hazard rate would be challenging, if not impossible. Specifically, the

References (40)

  • G. Waeyenbergh et al. Maintenance concept development: a case study. International Journal of Production Economics (2004)
  • Uckun S, Goebel K, Lucas PJF. Standardizing research methods for prognostics. In: Proceeding of the international...
  • R.S. Sayles. The use of discriminant function techniques in reliability assessment and classification. Reliability Engineering (2003)
  • D.N.P. Murthy et al. Weibull model selection for reliability modeling. Reliability Engineering and System Safety (2004)
  • A.H. Christer. Developments in delay time analysis for modelling plant maintenance. Journal of the Operational Research Society (1999)
  • W. Wang et al. Reliability data analysis and modelling of offshore oil platform plants. Journal of Quality in Maintenance Engineering (2000)
  • P. O'Connor. Practical reliability engineering (2002)
  • Reliability Prediction of Electronic Equipment. Military Handbook MIL-HDBK-217;...
  • A.C. Brombacher. Maturity index on reliability: covering non-technical aspects of IEC61508 reliability certification. Reliability Engineering and System Safety (1999)
  • Wong KL, Lindstrom DL. Off the bathtub onto the roller-coaster curve. In: Proceeding of the IEEE annual reliability and...
  • L. Bekker et al. Shape and crossing properties of mean residual life functions. Statistics & Probability Letters (2003)
  • Y. Lu et al. Accelerated stress testing in a time driven product development process. International Journal of Production Economics (2000)
  • M. Bebbington et al. A flexible Weibull extension. Reliability Engineering and System Safety (2007)
  • R. Kumar et al. Maintenance of machinery, negotiating service contracts in business-to-business marketing. International Journal of Service Industry Management (2004)
  • D. Lin et al. Using principal components in a proportional hazards model with applications in condition based maintenance. Journal of the Operational Research Society (2006)
  • M.Y. You et al. Control limit preventive maintenance policies for components subjected to imperfect preventive maintenance and variable operational conditions. Reliability Engineering and System Safety (2011)
  • P. Temsuwanpanich et al. Reducing machining downtime in head gimbal assembly industry. Asia International Journal of Science and Technology in Production and Manufacturing Engineering (2010)
  • Liao A, Zhao W, Guo H. Predicting remaining useful life of an individual unit using proportional hazard models and...
  • D.R. Cox. Regression models and life-tables. Journal of the Royal Statistical Society (1972)
  • D. Kumar et al. Proportional hazards model: a review. Reliability Engineering and System Safety (1994)

Cited by (33)

    • Performability evaluation, validation and optimization for the steam generation system of a coal-fired thermal power plant

      2022, MethodsX
      Citation Excerpt :

      Bahl et al. [5] gave a simulation modeling to find out the behavior of the distillery plant by considering the different input parameters for various components and determining the highest critical components, which greatly influenced the availability of the plant. Braglia et al. [6] discussed an oil refinery plant with a multivariate statistical approach to support the classification of mechanical components working in a specific environment. An effective preventive maintenance plan was formulated by assessing the impact of working conditions on the mean time between failures.

    • Data fusion and machine learning for industrial prognosis: Trends and perspectives towards Industry 4.0

      2019, Information Fusion
      Citation Excerpt :

      Intelligent monitoring of equipment by using sensors is essential to acquire relevant data containing the characterization of operational faults in physical signals; acoustic and ultrasonic sensors, accelerometers, current measurements or thermocouples are usually employed for this purpose [133,134]. In addition to these data, environmental conditions and contextual information, such as temperature, pressure or humidity, provide very useful information to enrich the modeling process [135]. From such information, specific KPIs are calculated and analysed to discover trends that can lead to a potential critical fault.

    • Data-driven prognostics using a combination of constrained K-means clustering, fuzzy modeling and LOF-based score

      2017, Neurocomputing
      Citation Excerpt :

      Intelligent monitoring of equipment by means of sensors is essential in order to acquire relevant data, containing the characterization of operational faults in physical signals: acoustic and ultrasonic sensors, accelerometers, current measurements or thermocouples are usually employed [3,4]. In addition to this data, environmental conditions and contextual information, such as temperature, pressure or humidity also provide very useful additional information to enrich the modeling process [5]. From such information, specific Key Performance Indicators (KPIs) are calculated and analysed to discover trends and knowledge of interest that can lead to a potential critical fault.
