Skip to main content
Erschienen in: European Transport Research Review 1/2024

Open Access 01.12.2024 | Original Paper

A data-driven prioritisation framework to mitigate maintenance impact on passengers during metro line operation

verfasst von: Alice Consilvio, Giulia Vignola, Paula López Arévalo, Federico Gallo, Marco Borinato, Carlo Crovetto

Erschienen in: European Transport Research Review | Ausgabe 1/2024

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

The application of artificial intelligence (AI) techniques may lead to significant improvements in different aspects of rail sector. Considering asset management and maintenance, AI can improve data analysis and asset status forecasting and decision-making processes, fostering predictive and prescriptive maintenance strategies. A prescriptive approach should be able to predict future scenarios as well as to suggest a course of actions. Nevertheless, the decision-making in rail asset management is often based on the classical asset-oriented approach, concentrating on the function of the asset itself as a main key performance indicator (KPI), whereas a user-oriented approach could lead to improved performance in terms of level of service. This paper is aimed at integrating the passengers’ perspective in the decision-making process for asset management to mitigate the impact that service interruptions may have on the final users. A data-driven prioritisation framework is developed to prioritise maintenance interventions taking into account asset status and criticality. In particular, a three-step approach is proposed, which focuses on the analysis of passenger data to evaluate the failure impact on the service, the analysis of alarms and anomalies to evaluate the asset status, and the suggestion of maintenance interventions. The proposed approach is applied to the maintenance of the metro line M5 in the Italian city of Milan. Results show the usefulness of the proposed approach to support infrastructure managers and maintenance operators in making decisions regarding the priority of maintenance activities, reducing the risk of critical failures and service interruptions.
Hinweise

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

1 Introduction

With the digitalisation of rail infrastructure, an increasingly amount of data is becoming available, and automated tools based on artificial intelligence (AI) techniques are under development to extract information from them [15].
Different definitions of AI exist [6], since the definitions of artificial intelligence evolve based upon the goals that are trying to be achieved with an AI system, e.g. to imitate the human behavior, to use human reasoning as a model, etc..[7]. AI can be defined as “the ability of a digital computer or computer-controlled robot to perform tasks commonly associated with intelligent beings”. AI techniques represent methods, algorithms and approaches enabling systems to perform tasks commonly associated with intelligent behaviour (e.g. machine learning, evolutionary computing [8, 9].) In the literature Bésinovic et al. [10], Ghofrani et al. [11], Yin et al. [12], aspects considered crucial when considering AI applications in the railway domain are: 1) the capability to accomplish tasks that would require critical intelligence that can be done by a human (e.g. decision-making); 2) the capability of taking into account uncertainties and/or unexpected scenarios (e.g. machine learning models for data-driven predictive maintenance); 3) Being able to learn from experience and take autonomous decisions in uncertain scenarios.
Machine learning is at the core of many approaches to artificial intelligence; according to Yong et al. [13] a large part of AI for railways today is based on machine learning (ML). In particular, ML is a branch of the broader field of AI that uses statistical models to identify anomalies and develop predictions [14, 15]. However, AI tools include also other techniques such as search algorithms, mathematical optimisation, evolutionary computation, logic programming, automated reasoning, probabilistic methods such as bayesian network and Markov model [4].
In this context, there is a high demand for a step change in asset management (AM) [16] to be delivered through innovative data-driven technologies and AI techniques [1722].
This innovation of asset management strategy and techniques is particularly challenging in the railway sector where complex systems are integrated to achieve high safety and reliability standards. Kumari et al. [19, 20] develop and propose a concept for augmented asset management for railway assets, which involves the augmentation of AM with advanced analytics, based on digitalization and AI techniques, to provide augmented decision support for fleet management. McMahon et al. [21] analyse requirements and challenges for big data analytics applications to railway asset management and recommend that the research efforts should be directed to define potential data-driven analytics frameworks, and to integrate different data-driven approaches for condition and failure monitoring and decision support.
As stated by [22], detecting defects of the railway infrastructure in the early stage of defect development, can reduce the risk of railway operation, the cost of maintenance, and make the asset management more efficient.
To this aim, condition-based [2325] and predictive maintenance approaches [2630] were studied to evaluate rail asset current and future status, exploiting data collected in real-time by new monitoring technologies and sensors, installed on board trains and wayside.
Many initiatives for the innovation of AM approaches, through digitalisation and AI, are ongoing in the rail sector within the Shift2Rail and Europe’s Rail research framework with projects such as IN2SMART [31], IN2SMART2 [32], IAM4RAIL [33], IN2DREAMS [34], DAYDREAMS [35], and RAILS [36].
Moreover, a railway system is usually composed by numbers of interdependently linked subsystems, and a failed component or subsystem may differently affect the system performance, according to its function. Therefore, the evaluation of subsystems and components’ status and criticality is crucial for maintenance managers and operators [37]. The choice of maintaining the most degraded asset is not always the best one if, for example, the asset is redundant asset or its failure does not affect the service availability. For this reason, AM decision support systems should consider criticality considerations, besides the asset status evaluation, to identify the best AM strategy.
The goal is to achieve a prescriptive approach which is not only able to answer questions like “What is happening?” (the condition-based approach), or “What will happen?” (the predictive approach), but it can also provide answers to questions like “What could be done?” and “What are the best options?”, optimising, under context-specific asset management constraints, preferences and targets of railway stakeholders [11, 3840].
Nevertheless, the decision-making in rail asset management is still too often based on the classical asset-oriented approach, which concentrates on the function of the asset itself as a main key performance indicator (KPI), whereas a user-oriented approach could lead to improved performance in terms of level of service.
Service-based asset management concentrates on provided services and serviceability of assets by applying service-level KPIs in order to reflect users’ perspectives in decision-making [41].
The availability of data on passengers’ transport demand collected by different systems, such as automated passenger counting (APC) and automated fare collection (AFC) systems, allows to exploit these data to extract important information to be used in many service-based decision-making processes.
Recent studies [42, 43] focus on the impact on passengers of service interruptions and delays caused by rail asset failures and maintenance to investigate and quantify how disruptions to rail services are perceived by passengers [42], and to propose mitigation measures [43]. However, the passengers’ perspective is usually not integrated in the decision-making framework for asset management. This leads to strategies that may keep assets in a high-quality condition but significantly affecting the passengers and the service, such as planning maintenance during days characterised by a peak of transport demand. Including passenger flow prediction as an input of the decision-making process would allow to avoid these situations, since maintenance tasks, which require temporary service disruptions or track closures, can be scheduled during off-peak time intervals. In addition, it would be possible to reduce the number of critical failures during a peak demand period, by planning preventive maintenance before the peak, ensuring a good condition of the assets when they are needed the most.​ Finally, maintenance strategies could be updated based on changing passenger behaviour, guaranteeing that maintenance plans could be always aligned with the most up-to-date rail line usage patterns.
The purpose of this paper is to fill this gap and move some steps towards a service-based asset management in the rail sector. In addition, the present study is aimed at answering to the research needs related to the definition of data-driven analytics frameworks and the integration of different data-driven approaches for condition and failure detection and maintenance decision support.
The addressed problem is the definition of a prescriptive maintenance approach able to suggest the needed predictive maintenance activities and the best order to perform those interventions according to selected KPIs and targets for the infrastructure manager.
The objective of the study is to develop a data-driven framework to prioritise predictive maintenance interventions according to asset status and criticality.
The prediction of rail assets’ status is performed based on an anomaly detection technique, and the capability of the proposed approach to predict failure and avoid corrective interventions, suggesting predictive interventions, is evaluated.
The prioritisation approach is able to represent different decision-making criteria, including service-level targets, and to assign different weights to the criteria according to their importance for the infrastructure manager.
The assumption is that the asset criticality can be decomposed in two terms:
  • a static term related to the type of asset and its specific function and types of failure;
  • a dynamic term related to the service condition in the considered time period.
While the static term is usually provided by the maintenance expert or the rail system designer, the dynamic term is evaluated considering the impact of failure on the service and the involved passengers according to the asset position and the utilisation rate of the different line sections over time. To achieve this objective, a model for the prediction of passenger flow at stations is proposed. This information is exploited in combination with the prediction of asset status to prescribe the needed predictive maintenance activities and to suggest the optimal sequence of interventions to the railway infrastructure manager. The ranking of the possible maintenance options is performed according to defined KPIs. Different options of maintenance ordering can be proposed to the infrastructure manager and the related KPIs can support the infrastructure manager in making the final decision.
The approach deals with a tactical level of decision making, providing a prioritisation of predictive maintenance interventions. The detailed scheduling of the maintenance activities with their allocation to maintenance time windows and work teams is out of the scope of this paper and represents a consecutive decisional phase that will use the prioritisation as an input.
The considered maintenance activities are predictive maintenance work orders generated by the anomaly detection model. The anomaly can lead to a predictive maintenance work order if it is detected sufficiently in advance, leaving enough time to organize the intervention.
Corrective maintenance interventions are neglected, but they can be integrated in the prioritisation list as activities with the highest priority to be executed as soon as possible.
Moreover, it is worth saying that the scope of the work is to show how different data-driven models and AI tools could be integrated in a data-driven framework for decision-making, including the passengers’ perspective in the maintenance planning. The identification of the best data-driven models for the considered data sets and the comparison with other existing methods is out of the scope of the paper.
The approach is suitable to estimate the predictive maintenance interventions needed in the upcoming week, considering a weekly time horizon.

2 Literature review

The proposed approach exploits different sources and types of data, linking three different models within a unique data-driven framework, to extract important knowledge to support maintenance decision-making.
In this section, the analysis of existing studies is presented considering the aims of the three models developed in this work: passenger flow prediction, rail assets status evaluation, and maintenance planning and decision support.

2.1 Passenger flow prediction

Considering the prediction of passenger flows and vehicle occupancy, several studies have addressed the topic of predicting the number of passengers on public transport vehicles, including trains and metro lines; they mainly differ in the used type of data and the adopted prediction model.
Regarding the prediction framework, the cyclical component of transport demand concerning the time of the day and the day of the week favored classical statistical prediction techniques based on time-series forecasting methods, such as autoregressive integrated moving average (ARIMA) and Kalman filters [4446].
However, since the changes in traffic data are nonlinear in nature, all the above-mentioned models are limited in their performance and application due to the assumption of linearity [47].
Therefore, research has shifted more towards machine learning and deep-learning techniques that can model the non-linearity and the feature interactions. Liu et al. [47], Liu and Chen [48], Wang et al. [49], and Baek and Sohn [50] have successfully applied methods based on neural networks and deep learning, whereas Samaras et al. [51], Ding et al. [52], Vandewiele et al. [53], and Gallo et al. [54] have found that neural networks need large quantities of data to perform well, and are outperformed in their cases by tree-based algorithms like random forests, gradient boosted decision trees, and Bayesian based models.
Jenelius [55] applied lasso, stepwise regression, and boosted tree ensembles to predict passenger numbers on metro lines both at the stations and on the trains, testing the effectiveness of different typologies of data.
Few papers tried to solve the passenger prediction problem using Markov chain-based methods, even though they have been proven to provide promising results, since they can capture memoryless dependencies between the crowding of temporally close public transport services [56]. In addition, this prediction method has the advantage of requiring only recent data, without the need for large historical datasets.
In this paper, a Markov chain Monte Carlo (MCMC) is applied to estimate the passenger flow exploiting ticketing data. The technique is exploited within a wide framework, in which the final aim is the evaluation of asset criticality.
The choice of using MCMC technique is motivated by its adequate performance with data sets of limited size, with respect to approaches such as Bayesian networks methods and neural networks, which need a higher volume of data to provide good results.
Table 1 compares existing studies on passenger flow prediction.
Table 1
Comparison of the selected passenger flow prediction studies
Study
Objective
Type of data
Methodology
Ni et al. [44]
Station flows prediction
Twitter data and turnstile usage
Linear regression and seasonal autoregressive integrated moving average
Xue et al. [45]
Onboard passenger prediction
Passenger boarding data
Algorithm based on interactive multiple model (IMM) filter and different types of moving averages
Zhang et al. [46]
Onboard passenger estimation and prediction
Smart card data and GPS data
Extended Kalman filter
Liu et al. [47]
Station flows prediction
Passenger measured flow
Deep learning
Liu & Chen [48]
Station flows prediction
Passenger flow data from automated passenger counting (APC)
Deep learning
Wang et al. [49]
Incoming flows prediction at station
Automated fare collection (AFC) data
Dynamic spatiotemporal hypergraph neural networks
Baek and Sohn [50]
Bus stop flows and onboard passenger prediction
Smart card data
Deep learning
Samaras et al. [51]
Onboard passenger prediction
Automated vehicle location (AVL) and APC data
Several machine learning algorithms
Ding et al. [52]
Onboard passenger prediction
AFC data
Gradient boosting decision trees
Vandewiele et al. [53]
Onboard passenger prediction
Crowd-sourcing data
Neural networks and XGBoost
Gallo et al. [54]
Onboard passenger prediction
APC data
Linear regression and LightGBM
Jenelius [55]
Onboard passenger prediction
Weight measurements in the air suspension system of the train cars
Lasso and stepwise regression, boosted trees ensembles
Wiecek et al. [56]
Onboard passenger prediction
APC data
Markov chains
This paper
Station flows prediction
AFC data
Markov chain Monte Carlo

2.2 Rail assets status evaluation

Regarding the second main goal of the present work, the asset status evaluation, different studies exist in the literature. In particular, several machine learning methods, including artificial neural networks, support vector machines and random forests have been used to evaluate the asset degradation status by analysing data coming from the equipment [35]. These data can be used both to identify equipment faults and to predict potential asset failures.
Thanks to the digitalisation of the sector, analytics can be used to manage large quantities of data and derive a prediction of the asset status [11, 57]. Pipe and Culkin [58] developed a data-driven model able to forecast the status of rail assets in order to achieve predictive maintenance strategies.
Machine learning can be divided into supervised, semi-supervised and unsupervised learning approaches [59]. Supervised models, such as artificial neural networks, support vector machines (SVM) and Bayesian networks, applied to fault detection, diagnosis and prediction, stand as an interesting option in operations with high maturity, whereby most possible faults are already mapped, measured and available for use to train models.
For example, Li and He [60] used a random forest based supervised methodology to predict the status and the remaining useful life of railcars combining multiple data sources. Niu et al. [61] used an adaptive pyramid graph method to detect anomalies in rail surface using images. Shim et al. [62] used deep learning to detect anomalies of wheel flats exploiting processed flat wheel signals. Li et al. [63] applied a supervised SVM technique that effectively uses large-scale data and provides valuable tools for operational sustainability and alarm prediction in railways. Considering signaling assets and, in particular, track circuits, a solution based on SVM is proposed by Sun et al. [64].
However, in the real world, new faults can happen in unmapped forms. This means that even a known fault might manifest in different ways. Therefore, even if supervised learning might bring trustful results for known cases, these models are limited in the case of unexpected events.
Semi-supervised models for fault detection can be seen as a variation of supervised models, since it is assumed that the training data have labeled instances for only the normal class. Therefore, any observation that deviates from the training data might be classified as a fault.
Conversely, unsupervised learning models do not consider any class variable. Unsupervised models focus on identifying similarities and discriminating clusters of observations with common characteristics [65].
In this paper, a one-class support vector machine (OCSVM) is applied. The choice of using OCSVM technique is motivated by the fact that it is one of the most well-established algorithms for outliner detection and a popular semi-supervised model already applied in rail sector [66]. The OCSVM model is here integrated within a framework for maintenance prioritisation and planning.
Table 2 summarises existing studies on rail assets status evaluation.
Table 2
Comparison of the selected rail assets status evaluation studies
Study
Objective
Type of data
Methodology
Li and He [60]
Prediction of the remaining useful life of train wheels and trucks
Data from three types of detectors (wheel impact load detector, machine vision systems, and optical geometry detectors)
Random forest
Niu et al. [61]
Anomaly detection of rail surface
Rail surface images
Adaptive pyramid graph and variation residuals
Shim et al. [62]
Anomaly detection of wheel flats
Wheel flat signals
Signal processing and deep learning
Li et al. [63]
Early fault detection of trucks and train wheels
Data from multiple types of detectors
Support vector machine
Sun et al. [64]
Fault diagnosis in railway track circuits
Short-circuit current signals
Multi-class support vector machine
Wan et al. [66]
Anomaly detection of train wheels
Vibration signal collected with a pair of fiber Bragg grating sensors
Unsupervised learning algorithms
This paper
Asset status evaluation
Automatic train supervision (ATS) logs, assets parameters, and maintenance data
One-class support vector machine

2.3 Maintenance planning and decision support

In the literature, several methods and works have been proposed to schedule railway maintenance interventions, focusing on predictive maintenance.
Scheduling methods for predictive maintenance are expected to reduce both the maintenance cost and the risk of service disruptions; thus they should consider both the asset status and the asset criticality to identify the interventions priority [67]. In doing so, the most widely used minimisation targets are intervention costs and time duration, to maximise the system’s reliability and availability [6870]. Solution approaches are based on heuristic approaches like tabu search algorithms, simpler greedy heuristic and genetic algorithms [71, 72].
Lopes Gerum et al. [73], Hamshari et al. [74], Chang et al. [75] include, in the scheduling method, data related to the asset condition, collected by diagnostic trains and sensors, and degradation models to predict the probability of asset failure. Mira et al. [76] integrates the maintenance scheduling into a fleet assignment model to schedule train services. Baglietto et al. [37] and Carretero et al. [77] include risk-based methodologies and take into account the asset criticalities, but they do not incorporate asset condition prediction methods.
However, scheduling optimisation models are NP-hard problems, which can be optimally solved within an acceptable computational time only for small instances, while heuristic algorithms, such as genetic algorithms, can provide sub optimal solutions also for bigger instances, but need long time to converge [78].
The ad-hoc prioritisation algorithm proposed in this paper allows to model the specific optimisation criteria of the infrastructure manager and to compare different solutions in a short computational time.
In addition, the ad-hoc optimisation algorithm has been formulated in order to include the outputs from the data-driven models for the evaluation of asset status and asset criticality.
Table 3 compares the existing studies on rail maintenance planning and scheduling.
Table 3
Comparison of the selected maintenance planning studies
Study
Objective
Methodology
Optimised variables
Consilvio et al. [67]
Maintenance planning
Mixed integer linear programming problem solved with a Matheuristic approach
Tardiness with respect to the soft deadlines and the total completion time
Khalouli et al. [68]
Maintenance scheduling
Mixed-integer linear programming model solved with an ant colony optimisation method
Possession costs, maintenance costs, and penalty costs
Macedo et al. [69]
Maintenance scheduling
Mixed-integer linear programming model solved with a variable neighborhood search
Total operational costs
Zhao et al. [71]
Maintenance scheduling
Mixed-integer linear programming model solved with a genetic algorithm
Cost benefit obtained by combining renewal activities for a track section
Quiroga et al. [72]
Maintenance scheduling
Heuristic algorithm that maximises the defined maintenance process’s objectives
Total reduction track geometry deviation, expected time to failure, and expected geometry deviation at next campaign
Lopes Gerum et al. [73]
Maintenance scheduling
Markov decision process
Maintenance costs
Hamshari et al. [74]
Maintenance scheduling
Gradient descent optimisation
Risk of asset failure and total maintenance costs
Chang et al. [75]
Maintenance scheduling
Hybrid multi-objective optimisation algorithm based on a quantum-behaved particle swarm optimisation
Total cost, window leveling, and resource leveling
Mira et al. [76]
Maintenance scheduling
Integer linear programming mathematical model
Dead-headings, turning times robustness, and number of shuntings to the depot needed to be executed
This paper
Maintenance prioritisation
Ad-hoc prioritisation algorithm
Cost of executing the maintenance intervention after the due date, cost related to postponing in the sequence the maintenance of an asset with a high criticality, and cost of executing in a consecutive order the maintenance of assets located far from each other
Therefore, in this paper, a model for prioritising predictive maintenance interventions and mitigating the impact on service is developed. The goal is two-fold: on one hand, this study is aimed at predicting future failures based on the detected anomalies and, on the other hand, the focus is on the identification of asset criticality based on the impact that its failure would imply on the service. The detected anomaly is considered a true anomaly if it would be followed by a failure. Therefore, if a true anomaly is detected sufficiently in advance, a predictive maintenance intervention can be performed, having sufficient time to organize the intervention. The objective is to avoid critical failures and corrective interventions that would imply higher costs and service disruptions, as much as possible.
In this regard, the passenger flow along the line is estimated to identify the line sections with the higher utilisation rate and its variation over time, taking into account periodic fluctuations.
The asset status (i.e., functioning, low degraded, medium degraded, high degraded) is evaluated through a machine learning technique, the one-class support vector machine and a threshold-based approach.
Finally, a prioritisation algorithm is proposed to find the best compromise between different infrastructure manager’s targets and KPIs.
In summary, the innovative aspects of this paper are:
  • addressing a service-based asset management in the rail sector, with the introduction of the impact on final users in the evaluation of maintenance priority;
  • answering to the research needs related to the definition of a data-driven analytics framework for maintenance decision-making and the integration of different data-driven approaches;
  • linking different sources of data and different models for data analysis in the decision-making process, moving towards prescriptive maintenance strategies;
  • the evaluation of assets’ status by machine learning algorithms and their clustering according to degradation thresholds, defining predictive maintenance work orders;
  • the representation, in a prioritisation approach, of different decision-making criteria, including service-level targets;
  • the consideration of criticality concept in maintenance prioritisation, in addition to the asset status, to reduce the risk of critical failures and service disruptions;
  • the prediction of passenger flow at stations in order to estimate the involved passengers in case of failure.

3 Methodology

The proposed approach consists of three steps:
  • a Markov chain Monte Carlo technique is applied to evaluate the criticality of the assets according to the impact of their failure on the passengers, taking into account their position along the line, and exploiting the passengers data from the ticketing system of each station;
  • a machine learning algorithm, the one-class support vector machine, is applied to cluster the assets according to their status, based on data collected from the field on events and alarms logs, asset parameters, and maintenance data;
  • an ad-hoc ordering algorithm is developed to prioritise the interventions considering as inputs the results of the previous steps.
The proposed framework is described in Fig. 1. The “asset criticality evaluation” module computes the average passenger flow trend in the different line sections. By considering the asset position and the transport demand, a dynamic criticality value \({\pi p}_{i}^{\tau }\) is computed for each asset \(i\) in the time horizon \(\tau\).
In addition, given the type of failure and the impact caused by the failure in terms of duration of service interruption, a static criticality value \({\pi }_{i}\) is defined for each asset \(i\).
The “asset status evaluation” module exploits the data about alarms, parameters, and the maintenance data to assess the asset condition. The output is the list of the assets with an anomalous status, the related maintenance interventions to be done, and their due dates \({DD}_{i}^{\tau }\), computed in the considered time horizon \(\tau\).
Finally, the “asset maintenance prioritisation” module evaluates the best \(m\) sequences with the related KPIs values to be shown to the operator.

3.1 Asset criticality in terms of impact on passengers

As mentioned, a static term of criticality \({\pi }_{i}\) is considered, related to the type of asset and its specific function. It is usually provided by the maintenance expert or the rail system designer, given the data about the types of failures for each asset and their time to repair.
The dynamic term of asset criticality \({\pi p}_{i}^{\tau }\) is, instead, evaluated in terms of impact that the asset failure may cause to the service, taking into account the asset position and the transport demand over time.
In order to evaluate the impact on passengers, a Markov chain Monte Carlo (MCMC) approach has been applied to analyse the data of the passengers entering and exiting each station, which are available from the ticketing systems. The model estimates the passengers in the different sections of the line and their evolution over time.
Monte Carlo method is a family of computational techniques based on random sampling to obtain numerical results. The goal is to solve problems that might be deterministic using randomness. They are mainly used for optimisation, numerical integration and generating draws from probability distribution.
A Markov chain or process is a stochastic model describing a sequence of possible events where the probability of each event depends only on the state attained in the previous event. In such a process, predictions of future outcomes can be made by considering the present state, and such predictions are just as good as the ones that could be made knowing the process's full history.
MCMC methods [79, 80] comprise a class of algorithms for sampling from a probability distribution. By constructing a Markov chain that has the desired distribution as its equilibrium distribution, one can obtain a sample of the desired distribution by recording states from the chain. The more steps are included, the more closely the distribution of the sample matches the actual desired distribution.
In this study, the MCMC methodology was implemented to create the model that fits the data. A decomposable time series model is used. The three main model components are trend, seasonality, and holidays, combined as it is shown in Eq. (1):
$$y\left(t\right)=g\left(t\right)+s\left(t\right)+h\left(t\right)+{\epsilon }_{t}$$
(1)
where:
  • \(g\left(t\right)\) is the trend function which models non-periodic changes in the value of the time series;
  • \(s\left(t\right)\) represents periodic changes (e.g., weekly and yearly seasonality);
  • \(h\left(t\right)\) represents the effects of holidays which occur on potentially irregular schedules over one or more days;
  • \({\epsilon }_{t}\) is the error term which represents any idiosyncratic changes that are not accommodated by the model.
Two different models for the trend are considered: nonlinear saturation growth in Eq. (2), and linear trend with changepoints in Eq. (3).
$$g(t)=\frac{C\left(t\right)}{1+\mathit{exp}\left(-{\left(k+a\left(t\right)\right)}^{T}\delta \right)\left(t-\left(m+a{\left(t\right)}^{T}\gamma \right)\right)}$$
(2)
$$g\left(t\right)=(k+a{\left(t\right))}^{T}\delta +(m+a{\left(t\right)}^{T}\gamma )$$
(3)
where:
  • \(C\) is the carrying capacity;
  • \(k\) the growth rate;
  • \(m\) an offset parameter;
  • \(a(t)\) defined in Eq. (4) and \(\gamma\) in Eq. (5) model the trend changes.
Trend changes are incorporated in the growth model by explicitly defining changepoints where the growth rate is allowed to change. Suppose there are \(S\) changepoints at times \({s}_{j}\),\(j = 1,\dots , S\). The vector of rate adjustments \(\delta\) is defined, where \({\delta }_{j}\) is the change in rate that occurs at time\({s}_{j}\).
$${a}_{j}(t)\left\{\begin{array}{c}1, if \, t\ge {s}_{j}\\ 0, otherwise\end{array}\right.$$
(4)
$$\gamma _{j} = \left( {s_{j} - m - \sum\limits_{{l < j}} {\gamma _{l} } } \right)\left( {1 - \frac{{k + \sum\limits_{{l < j}} {\delta _{l} } }}{{k + \sum _{{l \le j}} \delta _{l} }}} \right)$$
(5)
Seasonality trends are modelled as Fourier series (Eq. (6)). \(P\) is the regular period and \(n\) the Fourier series order. The periodic changes are modelled as Eq. (7) shows.
$$X\left(t\right)=\sum_{n=1}^{N}\left({a}_{n}\mathit{cos}\left(\frac{2 \pi n t}{P}\right)+{b}_{n}\mathit{sin}\left(\frac{2\pi nt}{P}\right)\right)$$
(6)
$$s\left(t\right)=X\left(t\right)\beta$$
(7)
Holidays are modelled using a matrix of regressors, being \({D}_{i}\) the set of past and future dates for that holiday for each holiday, as shown in Eq. (8). The effect of holidays is modelled in a similar way than the periodic changes following Eq. (9).
$$Z\left(t\right)=[1\left(t\in {D}_{1}\right),\dots ,l(t\in {D}_{L})]$$
(8)
$$h\left(t\right)=Z(t)\kappa$$
(9)
A prior distribution representing our beliefs prior to the application of the algorithm was selected. The selected distribution for \(\delta\) is Laplace to obtain a sparse prior. The rest of priors are selected as Normal due to computational simplicity and central limit theory, where large number of independent and identically distributed random variables tends to follow a normal distribution. Nevertheless, in the tuning process different distributions such as Student’s t or exponential were used, and no significant changes on the results were found.
The prior distributions are represented in Eq. (1014).
$$K\sim Normal\left(0,{\sigma }_{k}^{2}\right)$$
(10)
$$m\sim Normal\left(0,{\sigma }_{m}^{2}\right)$$
(11)
$$\beta \sim Normal\left(0,{\sigma }_{\beta }^{2}\right)$$
(12)
$$\kappa \sim Normal\left(0,{\sigma }_{\kappa }^{2}\right)$$
(13)
$$\delta \sim Laplace\left(0,\tau \right)$$
(14)
The approach has been developed by using the Prophet Python library [81] which is an open-source library for univariate time series forecasting that allows to estimate trends over time considering fluctuations due to seasonality. In this study, as described in Sect. 4, seasonality represents periodic changes relative to week and day due to the considered time horizon and the available data set.
Figure 2 shows, in blue colour, the data related to the passengers entering a given station of the line (collected through the ticketing system) that were used to train the algorithm, whereas the forecasted values are depicted in orange. The model is fitted using Stan’s L-BFGS [82] to find a maximum a posteriori estimate.
Given the average passenger flow trend in the different line sections and knowing the asset position, a criticality value \({\pi p}_{i}^{\tau }\) is assigned to each asset \(i\) and time horizon \(\tau\), according to three criticality thresholds, as depicted in Fig. 7. In addition, a static criticality term \({\pi }_{i}\) is given by considering the type of assets and the impact caused by its failure, in terms of service interruption.

3.2 Asset status evaluation and suggestion of interventions

The asset status evaluation is based on support vector machines, a technique based on statistical learning theory with its root in structural risk minimisation (SRM) principle. It represents a well-known state-of-art machine learning tool which has been widely used in the last few decades. The one-class SVM algorithm, as defined in Schölkopf et al. [83, 84], is an extension of the original SVM; this technique can be used in an unsupervised setting for anomaly detection. Basically. the OCSVM algorithm separates all the data points from the origin (in the feature space) and maximises the distance from this hyperplane to the origin. This results in a binary function which captures regions in the input space where the probability density of the data lives. Thus, the function returns \(+1\) in a “small” region (capturing the training data points) and \(-1\) elsewhere.
The algorithm, trained with positive examples only (i.e. data points from the target class), allows only a small part of the dataset to lie on the other side of the decision boundary (the outliers). The quadratic programming minimisation function is slightly different from the original one presented by Cortes and Vapnik [85] and has the form:
$$\underset{w, {\mathit{ }\xi }_{i},\mathit{ }\rho }{{\text{min}}}\frac{1}{2}{\Vert w\Vert }^{2}+\frac{1}{\nu n} \sum_{i=1}^{n}{\xi }_{i}-\rho$$
(15)
$$\text{subject to:}$$
$$\left( {w \cdot \phi \left( {x_{i} } \right)} \right) \ge \rho - \xi_{i} \quad \forall i = 1, \ldots , n$$
(16)
$$\xi_{i} \ge 0\quad \forall i = 1, \ldots , n$$
(17)
In this different formulation, the solution is characterised by the parameter \(\nu\), which affects the smoothness of the decision boundary. The tuning of this parameter is two-fold: on one hand, it sets an upper bound on the fraction of identified anomalies (training examples regarded out-of-class) and, on the other hand, it represents a lower bound for the number of training examples used as support vector. Due to the importance of this parameter, this approach is often referred to as \(\nu\)-SVM.
By using Lagrange techniques and using a kernel function for the dot-product calculations, the decision function becomes:
$$f\left(x\right)={\text{sgn}}(\left(w\cdot \phi \left({x}_{i}\right)-\rho \right)={\text{sgn}}(\sum_{i=1}^{n}{\alpha }_{i}K\left(x,{x}_{i}\right)-\rho )$$
(18)
Thus, this method creates a hyperplane characterised by \(w\) and \(\rho\) that has maximal distance from the origin in the feature space and separates all the data points from the origin.
It was of interest to study and understand if OCSVM can perform well in the proposed case study. Since OCSVM should only be trained on healthy data to perform the training, only data points whose features fell in predefined ranges of values were considered. A hyper-parameter tuning step has been performed to find the best values to fit our dataset and the best kernel found is the Gaussian Radial Basis Function (RBF) reported in Eq. (19):
$$K\left(x,{x}^{\prime}\right)={\text{exp}}\left(-\frac{{\Vert x-{x}^{\prime}\Vert }^{2}}{2{\sigma }^{2}}\right)$$
(19)
where \(\sigma \in {\mathbb{R}}\) is a term of the kernel parameter and \(\| x-x^{\prime}\|\) is the dissimilarity measure.
Therefore, after setting the kernel, the hyper-parameters to be tuned are the smoothing parameter \(\nu\)  and the kernel parameter \(\gamma\), defined as \(1/2{\sigma }^{2}\).
The result of the training is the creation of a separating hyperplane, which can be visualised in Fig. 3, where points that fall within the plane are considered healthy, whereas points that fall outside are faulty.
The OCSVM model classify the assets in “functioning” and “degraded”. An asset is functioning if it is behaving normally. An asset is degraded if it presents anomalies in its behaviour. When the asset status is degraded, it is classified in different levels of degradation through a threshold-based model that considers the number of anomalies detected for each asset and assign a level of degradation based on that number.
The threshold-based model defines a specific threshold for each asset, as reported in Table 4. The threshold is estimated according to the asset behaviour and represents the percentage of anomalies to assign to the asset a critical level of degradation.
Table 4
Anomaly detection model’s parameters and performance
 
Selected parameters
Performance measures
Asset ID
ν
γ
Threshold (%)
ρ
Anomaly detection precision (%)
True anomaly rate (%)
PPV
TNR
1
0.05
0.00322
10
0.053799
67
67
0.564956
0.964856
2
0.05
0.00458
15
0.054133
67
100
0.581549
0.968125
3
0.01
0.000001
40
0.011452
100
100
0.822188
0.996917
4
0.01
0.00112
20
0.011158
75
75
0.733333
0.995465
5
0.05
0.00000142
20
0.052545
100
100
0.669077
0.975273
6
0.02
0.00159
35
0.021406
100
60
0.815661
0.995486
7
0.05
0.00000579
15
0.049346
67
100
0.717913
0.983427
9
0.02
0.000001
35
0.022559
67
67
0.649317
0.989856
10
0.01
0.000001
20
0.010219
100
67
0.813356
0.997402
The space of the hyper-parameters tested for the OCSVM module is \(\nu \in \left[0.01, 0.02, 0.05, 0.1\right]\) and \(\gamma\) in a logarithmic distributed space of 60 instances ranging from \({10}^{-6}\) to \({10}^{3}\), on the other side, thresholds have been tested in a range from 1 to 99%. An example of validation results for a subset of 10 assets is presented in Table 4.
The considered indicators of model performance are:
  • anomaly detection precision: \(\frac{\text{True anomalies}}{\text{Total occurred failures}}\);
  • true anomaly rate: \(\frac{\text{True anomalies}}{\text{Total identified anomalies}}\);
  • precision or positive predictive value \(\text{(}\mathrm{PPV)}=\frac{TP}{TP+FP}\);
  • specificity or true negative rate \(\text{(}\mathrm{TNR)}=\frac{TN}{FP+TN}\).
where:
  • true anomaly is an anomaly event which occurs at most 1,5 months before a reported failure;
  • false anomaly is an anomaly event which is not followed by a reported failure in the following 1,5 months;
  • true positive (TP) is the sum of anomalous patterns correctly predicted as anomalies;
  • false positive (FP) is the sum of normal patterns predicted as anomalies;
  • true negative (TN) is the sum of normal patterns correctly predicted as normal;
  • false negative (FN) is the sum of anomalous patterns identified as normal.
The data used to evaluate the assets status are:
  • ATS logs: collection of events and alarms log from the Automatic Train Supervision (ATS). These large set of logs allow to extract information about alarms and events related to every asset and information about trains movements;
  • asset parameters: meaningful asset parameters collected from the field;
  • maintenance data: collection of corrective and preventive maintenance activities for all the assets.
Based on the level of degradation of an asset, estimated according to the detected anomalies that contribute to the definition of the asset status, a specific maintenance intervention should be done within a suggested due date \({DD}_{i}^{\tau }\), for the considered time horizon \(\tau\). The output of the asset status evaluation is the list of the monitored assets with their status, the related predictive maintenance interventions to be done, and their due dates, as shown in Table 5.
Table 5
Asset status evaluation—output example
Asset ID
Asset status
Maintenance intervention
Due Date \({DD}_{i}^{\tau }\)
A001
Functioning
None
 
A002
Low Degraded
Intervention X
M days
A003
Medium Degraded
Intervention Y
L days
A004
High Degraded
Intervention Z
P days

3.3 Asset ordering and intervention prioritisation

In this section, the predictive maintenance interventions, identified by the OCSVM model, are prioritised. To order the assets and the related predictive maintenance interventions, an ad-hoc prioritisation method is developed with the objective of mathematically representing the infrastructure manager’s prioritisation criteria and including in the approach the outputs from the data-driven models for the evaluation of asset status and asset criticality.
Therefore, the ordering algorithm uses as inputs the data received from the previous steps; in particular, the due date \({DD}_{i}^{\tau }\) of the maintenance intervention on asset \(i\) and the asset criticality \(\left({\pi }_{i}+{\pi p}_{i}^{\tau }\right).\)
The algorithm is based on an iterative approach aimed at finding the best sequence of assets to be maintained in a given time period \(\tau\). The main idea of the algorithm consists of computing, for each asset \(i,\) the cost \({d}_{i}^{k,r}\) for maintaining that asset in a given position \(p\left(k,r,i\right)\) within the sequence computed at iteration \(k\) of run \(r\). The term \({d}_{i}^{k,r}\) represents the weighted sum of the main infrastructure manager’s targets or KPIs for a given asset \(i\): \({d}_{1i}^{k,r}\) is the KPI related to the status of the asset \(i\) derived from the OCSVM model, \({d}_{2i}^{k,r}\) is the KPI related to the criticality of the asset \(i\) derived from the MCMC model, and \({d}_{3i}^{k,r}\) is the KPI related to the distance to be covered to execute the maintenance interventions according to the position of the asset \(i\) along the line.
In detail, considering a given time horizon τ and assuming, therefore, τ as constant, \({d}_{i}^{k,r}\) is expressed by the following equation:
$$d_{i}^{k,r} = \frac{1}{{b^{r} }}\left[ {\alpha_{1} d_{1i}^{k,r} + \alpha_{2} d_{2i}^{k,r} + \alpha_{3} d_{3i}^{k,r} } \right]$$
(20)
where:
$$d_{1i}^{k,r} = max\left\{ {0,\left( {t_{i}^{k,r} - DD_{i}^{\tau } } \right)} \right\}\quad \quad \forall i,\forall k,\forall r$$
(21)
$$t_{i}^{k,r} = \left( {\mathop \sum \limits_{j \prec i} S_{j - 1,j} + e_{j} } \right) + { }S_{j,i} + e_{i} \quad \quad \forall i,j,\forall k,\forall r$$
(22)
$$d_{2i}^{k,r} = \frac{{\pi_{i} + \pi p_{i}^{\tau } { }}}{{n + 1 - p\left( {k,r,i} \right)}}\quad \quad \forall i,\forall k,\forall r$$
(23)
$$d_{3i}^{k,r} = S_{{j_{{\left( {k - 1} \right),r}} ,i}} + S_{{i,h_{{\left( {k - 1} \right),r}} }} \quad \quad \forall i,j,\forall k,\forall r$$
(24)
$$b^{r} = \left\{ {\begin{array}{*{20}c} {k, if \, k \le \tilde{k}} \\ {k^{0.8}, if \, k > \tilde{k}} \\ \end{array} } \right.\quad \quad \forall r$$
(25)
The cost terms are represented by Eq. (21), Eq. (23) and Eq. (24).
Equation (21) represents the cost of executing the maintenance intervention after the due date, defined according to the asset status; Eq. (23) represents the cost related to postponing in the sequence the maintenance of an asset with a high criticality; Eq. (24) is the cost of executing in a consecutive order the maintenance of assets located far from each other. Equation (22) evaluates the time instant of maintenance execution for each asset in the sequence, while Eq. (25) defines the value of the parameter \({b}^{r}\) which changes value after \(\widetilde{k}\) iterations. The relevant notation can be summarised in Table 6. It is worth noting that only costs, affected by the order in which the maintenance interventions are executed, are considered and that the infrastructure manager can impose different weights \({\alpha }_{l}\) in the objective function according to the importance of the different KPIs.
Table 6
Notation of the ordering algorithm
Notation
Description
\(i\)
Asset index
\(n\)
Total number of considered assets
\(r\)
Run index
\(k\)
Index of the iteration for each run \(r\)
\({t}_{i}^{k,r}\)
Time of execution of the maintenance of asset \(i\) in the sequence defined at iteration \(k\) of run \(r\)
\(\widetilde{k}\)
Number of iterations after which the parameter \(b\) changes its value
\(K\)
Maximum number of the considered iterations for each run \(r\)
\(R\)
Total number of runs
\({DD}_{i}^{\tau }\)
Due date of the maintenance intervention according to asset status
\({\pi }_{i}\)
Static asset criticality related to the type of asset
\({\pi p}_{i}^{\tau }\)
Asset criticality in the considered time horizon τ related to the passenger flow
\(p\left(k,r,i\right)\)
Position of asset \(i\) in the sequence at iteration \(k\) of run \(r\)
\({e}_{i}\)
Time penalty for postponing asset \(i\) in the sequence, that is, the average duration of maintenance execution
\({S}_{{j}_{\left(k-1\right),r}i}\)
Distance between the asset and the previous one at the iteration \(k-1\) of run \(r\)
\({S}_{{ih}_{\left(k-1\right),r}}\)
Distance between the asset and the following one at the iteration \(k-1\) of run \(r\)
\({\alpha }_{l}\)
Weights to be calibrated
\({t}_{i}^{k,r}-{DD}_{i}\)
Indication of asset degradation/status
The ordering algorithm is described in Table 7. At the beginning of each run \(r\), a feasible initial solution is built by randomly generating a sequence of assets. After that, the iterative steps of the algorithm follow. In doing so, at each iteration \(k\) of the algorithm, the parameters \({d}_{1i}^{k,r}, {d}_{2i}^{k,r}, {d}_{3i}^{k,r}\) for each asset \(i\) are computed and the assets \(i=1\dots N\) are sorted in descending order according to the maintenance cost \({d}_{i}^{k,r}\) to generate a new sequence. The iterations stop when \({d}^{k,r}\) is lower than a given threshold \(\delta\) or when the maximum number of iterations \(K\) are reached. At the end of the \(R\) runs, the best \(m\) sequences are shown to the operator with the related KPIs values.
Table 7
Algorithm
https://static-content.springer.com/image/art%3A10.1186%2Fs12544-023-00631-z/MediaObjects/12544_2023_631_Tab7_HTML.png
Since \({d}_{1i}^{k,r}, {d}_{2i}^{k,r}, {d}_{3i}^{k,r}\) are related to a specific asset \(i\), the final KPIs values associated to each sequence \(m\) that are shown to the operator are the total values:
$${d}_{1}^{k,r}={\sum }_{i}{d}_{1i}^{k,r}$$
(26)
$${d}_{2}^{k,r}={\sum }_{i}{d}_{2i}^{k,r}$$
(27)
$${d}_{3}^{k,r}={\sum }_{i}{d}_{3i}^{k,r}$$
(28)
The output of the prioritisation algorithm is reported in Sect. 4 Fig. 9, which shows for the two best sequences, the related KPIs values.
It is worth noting that the prioritisation considers the predictive maintenance work orders generated by the OCSVM anomaly detection model; however, a corrective maintenance intervention can be integrated as first in the ordered list assigning to it the highest level of priority.
To test its effectiveness and performance, the proposed approach has been applied to a real-world case study. Its details and the results are discussed in Sect. 4.

4 Case study and results

This section describes the results of the application of the approach described in Sect. 3 to a real-world case study, consisting of the metro line M5 of the Italian city of Milan (Fig. 4), a completely automated line composed by 19 stations, three of which are transfer stations to lines M1, M2 and M3.
The total number of the considered assets is 850, belonging to these categories: track circuits, switches, platform doors, signalling equipment rooms, wayside antennas. These assets were selected according to their relevance for the infrastructure manager and the availability of data.
The asset status evaluation is performed using the ATS logs, the assets parameters and the maintenance data.
The ATS logs include a large number of records for each day, which mainly refer to two different categories: events and alarms. The typical log record is composed of three main parts: timestamp, involved system, and event description.
In Table 8, the list of alarms with their description is shown for each asset.
Table 8
Critical events and alarms
Alarm code
Description
Asset
W010
Vital Sequencing Error Detected, Track Occupancy Lost
Track circuit
A046
Track Occupancy with no Tracking ID
Track circuit
W103
On-Line AF902 Failure
Track circuit
W104
Off-Line AF902 Reserve Failure
Track circuit
W103
Vital MLK to AF902 Link Failed
Track circuit
W201
Battery (for Internal SER Equipment) Blown Fuse
Signalling equipment room
W202
Battery (for Internal SER Equipment) Ground Detect On
Signalling equipment room
W205
Battery (for External SER Equipment) Power Failure
Signalling equipment room
W030
Switch Never Came into Correspondence
Switch
W031
Switch Lost Correspondence without a Switch Move Request
Switch
V023
Vehicle TWC Link Failure—Master ATP in Non-Critical Fault
Wayside antenna
P023
Door Obstructed
Platform door
P026
Door Out of Order
Platform door
P053
Major Fault—Isolate Door
Platform door
P054
Medium Fault—Cycle Door
Platform door
The maintenance data include both the preventive maintenance interventions, scheduled based on assets manuals, and the corrective maintenance interventions, scheduled after the occurrence of a failure during the normal train operations. In Table 9, an example of the maintenance data is shown.
Table 9
Maintenance data example
Work order code
Asset ID
Start date
End date
Intervention code
WO001
A001
13/04/22 12:00:00
13/04/22 18:00:00
I004
WO002
A001
15/04/22 11:00:00
15/04/22 22:00:00
I005
WO003
A002
14/04/22 03:00:00
14/04/22 06:00:00
I001
WO004
A003
13/04/22 17:00:00
13/04/22 23:00:00
I010
WO005
A004
14/04/22 22:00:00
15/04/22 02:00:00
I005
Following the methodology explained in Sect. 3.1, the real data sets of June 2020 of passengers entering and exiting each station are modelled, considering weekly and daily trends.
Before forecasting future values, in-sample predictions are made dividing the dataset in training/testing data using 80% of the data for the training set and 20% for the test set.
The information from the training data is used to train the model to forecast the testing data. The differences between the testing data and the forecasted testing data are used to calculate errors.
The obtained final results are:
  • train-test prediction: graph showing the training and testing data evolution with respect to time as well as the forecasted values of the testing dataset with their correspondent absolute error (Fig. 5);
  • forecasted values for particular days: different dates have been chosen to be forecasted for the different stations;
  • different seasonalities and trend obtained: plot of the previously defined model components with their uncertainty (Fig. 6). The increase of uncertainty around 3:00 am is due to the closing hours of the metro station, closing hours (from midnight until 6 am) are not taken into account to model and make predictions.
Finally, the estimation of the average passenger flow at the different line stations is reported in Fig. 7, which shows the most critical sections of the line according to the passengers’ flow level. From this diagram, the criticality \({\pi p}_{i}^{\tau }\) is evaluated for each asset \(i\) according to its position along the line considering three criticality thresholds, represented in green, orange and red in Fig. 7, correspondent to high, medium and low criticality.
Cross-validation results for eight different stations are shown, as example, in Table 10. For each station, the mean error values are reported.
Table 10
Cross-validation results
 
Mean error [Number of passengers]
Mean error [%]
ISOLA M5
4.32
19.04
GERUSALEMME M5
4.09
13.13
GARIBALDI M5
6.86
23.79
DOMODOSSOLA FNM M5
5.52
14.89
CENISIO M5
5.88
15.54
CA GRANDA M5
6.09
13.94
BIGNAMI M5
5.64
15.74
BICOCCA M5
5.73
18.90
The mean errors are in all cases lower than 8 passengers. The total mean error is 18.56% which is an acceptable value for the considered application of supporting maintenance decision-making.
The OCSVM model, instead, identifies the list of degraded assets among the 850 considered assets, with the indication of their current degradation and due date for maintenance execution. In Fig. 8, an example of asset status evaluation is depicted for a specific asset, the track circuit. Four different levels of degradation are considered to represent the asset condition. In particular, track circuits are represented in the track layout, coloured based on their status:
  • green colour indicates a functioning condition;
  • yellow colour indicates a low degraded condition;
  • orange colour corresponds to a medium degraded condition;
  • red colour corresponds to the high degraded condition.
The results show an average anomaly detection precision of 67% and a true anomaly rate of 76% for track circuits, reaching the 75% and the 80%, respectively, considering all the assets.
The availability of data about corrective maintenance activities allows to estimate if, with the application of the proposed approach, corrective interventions could have been avoided. In order to avoid a corrective maintenance intervention, the anomaly detection model should be able to detect the anomaly sufficiently in advance and with a sufficient level of precision.
The OCSVM identified 50% of track circuits’ anomalies sufficiently in advance to avoid a corrective intervention. Considering all the types of monitored assets, the percentage of avoided corrective interventions reaches 54%.
Therefore, even if the model identifies the anomaly with a high precision, it is not always possible to avoid the failure and the corrective intervention.
Other machine learning algorithms and statistics models are currently under evaluation to compare their performance such as the simple non regressive informed machine learning model applied in [86] for track circuits.
As mentioned, the asset status and criticality results are then used as input by the prioritisation algorithm.
A comparison between the scenario in which the maintenance planning is performed without considering passengers data and the scenario in which the estimation of criticality is done according to the number of passengers, is performed. The results show a reduction of the number of passengers affected by service interruptions is of around 37% in comparison to the scenario without passenger prediction.
Figure 9 shows the comparison of two prioritisation options (solution 1 and solution 2) according to the three considered KPIs: \({d}_{1}\)(asset status), \({d}_{2}\) (criticality), and \({d}_{3}\) (covered distance). Solution 1 performs better in terms of asset status (- 43% of the cost) but presents a worse performance in terms of asset criticality, while the cost related to the covered distance is comparable in the two solutions.
It is worth noting that the decision support suggests the two best options, and the infrastructure manager can choose the preferred one according to its specific constraints or needs.
A sensitivity analysis has been conducted to test the robustness of the solution. The results show that the solution is robust since the choice of the weights \({\alpha }_{l}\) does not significantly affect the costs.
As an example, Fig. 10 depicts the relative variation of the costs \({d}_{1}\), \({d}_{2}\) and \({d}_{3}\) with the criticality weight \({\alpha }_{2}\), considering as reference values for \({d}_{1}\), \({d}_{2}\) and \({d}_{3}\) those corresponding to \({\alpha }_{2}=1\).
Finally, to clarify how the passenger flow affects the solution, two scenarios of transport demand are compared:
  • a reference scenario, with the nominal transport demand;
  • a passenger flow peak scenario, which has, in addition to the nominal demand, a predicted increase of passenger flow at station Monumentale M5.
In Fig. 11, the dynamic criticality \({\pi p}_{i}^{\tau }\) for the assets at the different stations of the line is reported. In detail, four discrete dynamic criticality values are assigned for high, medium, low and very low passenger flow.
The dynamic criticality corresponding to the nominal transport demand is reported in blue, and the increase in the dynamic criticality, due to the passenger flow peak at the station Monumentale M5, is depicted in red. These values are assigned to each asset \(i\) according to the station of the line where the asset is located.
The prioritisation, in the reference scenario and in the peak demand scenario, for a subset of 20 assets is reported in Table 11. With respect to the solution obtained in the reference scenario, the decision support system suggests to maintain earlier the asset located at the Monumentale M5 station (highlighted in bold in Table 11), which is moved from position 20 to position 9 in the maintenance activities’ prioritised list. As a consequence, the ordering of the other assets in the proposed prioritisation changes as well, to optimise also the KPIs values \({d}_{1}\) (asset status), and \({d}_{3}\) (covered distance).
Table 11
Comparison of maintenance activities’ order in the reference and passenger flow peak scenarios
ID asset
Asset category
Station
Maintenance order (reference scenario)
Maintenance order (passenger flow peak scenario)
A060
Track circuit
BIGNAMI M5
1
8
A070
Track circuit
GARIBALDI M5
8
5
A065
Track circuit
ISTRIA M5
11
13
A088
Track circuit
SAN SIRO STADIO M5
10
3
A071
Track circuit
MONUMENTALE M5
20
9
A076
Track circuit
TRE TORRI M5
6
15
A059
Switch
BIGNAMI M5
9
7
A598
Switch
BIGNAMI M5
18
11
A701
Switch
GARIBALDI M5
12
2
A653
Switch
ISTRIA M5
3
6
A605
Wayside antenna
BIGNAMI M5
4
18
A083
Wayside antenna
SAN SIRO STADIO M5
19
20
A319
Platform door
BICOCCA M5
14
14
A608
Platform door
BIGNAMI M5
13
17
A644
Platform door
CA GRANDA M5
2
1
A746
Platform door
DOMODOSSOLA M5
5
10
A785
Platform door
LOTTO M5
16
16
A673
Platform door
MARCHE M5
15
12
A829
Signalling equipment room
SAN SIRO STADIO M5
17
19
A758
Signalling equipment room
TRE TORRI M5
7
4
In particular, the asset status cost \({d}_{1}\) increases of 4.9%, while the covered distance cost \({d}_{3}\) increases of 23% with respect to the reference scenario results. This shows the capability of the model to deal with passenger flows peaks while keeping a good performance in terms of asset status, with a small increase of the covered distance. The variation of the covered distance is usually acceptable since it is deemed less relevant for the infrastructure manager in comparison to the asset status.

5 Conclusions

This paper proposes a novel data-driven prioritisation framework to prioritise maintenance interventions on railway lines taking into account the asset status and criticality. More in detail, a dynamic criticality term related to the service condition in the relevant time period is considered, which is updated on the basis of the passenger flow trend over time at the different stations. The proposed three-step approach includes the analysis of passenger data to evaluate the failure impact on the service, the analysis of alarms and anomalies to evaluate the asset status, and the suggestion of maintenance interventions. The application to the maintenance of the metro line M5 in the Italian city of Milan shows the usefulness of the proposed approach to support infrastructure managers and maintenance operators in making decisions regarding the priority of maintenance activities, reducing the risk of critical failures and service interruptions, and paving the way towards the adoption of prescriptive maintenance strategies.
Based on the asset status and criticality, a list of predictive interventions on track circuits, switches, platform doors, signalling equipment rooms and wayside antennas is calculated. In this way, these interventions are planned more efficiently and have a lower impact on service quality.
The results show a good precision in the detection of the anomaly with an anomaly detection precision of 75% and a true anomaly rate of 80%.
The percentage of avoided corrective interventions that are identified through the data-driven model is around 54%, which represents the corrective interventions that are detected sufficiently in advance, and replaced by predictive interventions that can be planned in advance more efficiently.
The reduction of the number of passengers affected by service interruptions is around 37% in comparison to the scenario without passengers prediction.
This paper does not focus on the identification of the best data-driven models for each considered data set and the comparison with other existing methods. For this reason, future developments will consist of testing and comparing the performance of various machine learning algorithms and statistics models and improving the accuracy and the prediction horizon of the forecasting model of asset status. The aim is to predict the failure more in advance, avoiding a higher percentage of corrective interventions.
The presented system represents the backbone of the intelligent asset management system that was developed, implemented, and validated by the IN2SMART2 project.

Acknowledgements

None.

Declarations

Competing interests

The authors have no competing interests to declare.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://​creativecommons.​org/​licenses/​by/​4.​0/​.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Literatur
1.
Zurück zum Zitat UIC Rail System Department. (2021). Artificial Intelligence. Case of the Railway Sector. State of Play and perspectives. 1–28, ISBN 978-2-7461-3065-4. UIC Rail System Department. (2021). Artificial Intelligence. Case of the Railway Sector. State of Play and perspectives. 1–28, ISBN 978-2-7461-3065-4.
2.
Zurück zum Zitat Tang, R., De Donato, L., Bes̆inović, N., Flammini, F., Goverde, R. M. P., Lin, Z., Liu, R., Tang,T., Vittorini, V., & Wang, Z. (2022). A literature review of Artificial Intelligence applications in railway systems. Transportation Research Part C: Emerging Technologies, 140, 103679. https://doi.org/10.1016/j.trc.2022.103679 Tang, R., De Donato, L., Bes̆inović, N., Flammini, F., Goverde, R. M. P., Lin, Z., Liu, R., Tang,T., Vittorini, V., & Wang, Z. (2022). A literature review of Artificial Intelligence applications in railway systems. Transportation Research Part C: Emerging Technologies, 140, 103679. https://​doi.​org/​10.​1016/​j.​trc.​2022.​103679
3.
Zurück zum Zitat Mulongo, N. Y., Mnkandla, E., & Kanakana-Katumba, G. (2021). Artificial Intelligence as key driver for competitiveness in the railway industry: Review. In 62nd International scientific conference on information technology and management science of Riga Technical University (ITMS), Riga, Latvia (pp. 1–6). https://doi.org/10.1109/ITMS52826.2021.9615314. Mulongo, N. Y., Mnkandla, E., & Kanakana-Katumba, G. (2021). Artificial Intelligence as key driver for competitiveness in the railway industry: Review. In 62nd International scientific conference on information technology and management science of Riga Technical University (ITMS), Riga, Latvia (pp. 1–6). https://​doi.​org/​10.​1109/​ITMS52826.​2021.​9615314.
9.
Zurück zum Zitat Davenport, T. H. (2018). From analytics to artificial intelligence. Journal of. Business Analytics, 1(2), 73–80.CrossRef Davenport, T. H. (2018). From analytics to artificial intelligence. Journal of. Business Analytics, 1(2), 73–80.CrossRef
10.
Zurück zum Zitat Bešinović, N., De Donato, L., Flammini, F., Goverde, R. M. P., Lin, Z., Liu, R., Marrone, S., Tang, T., & Vittorini, V. (2022). Artificial intelligence in railway transport: taxonomy, regulations and applications. IEEE Transactions on Intelligent Transportation Systems, 23(9), 14011–14024. https://doi.org/10.1109/TITS.2021.3131637CrossRef Bešinović, N., De Donato, L., Flammini, F., Goverde, R. M. P., Lin, Z., Liu, R., Marrone, S., Tang, T., & Vittorini, V. (2022). Artificial intelligence in railway transport: taxonomy, regulations and applications. IEEE Transactions on Intelligent Transportation Systems, 23(9), 14011–14024. https://​doi.​org/​10.​1109/​TITS.​2021.​3131637CrossRef
15.
16.
Zurück zum Zitat ISO. (2014). ISO 55000: Asset Management. ISO. (2014). ISO 55000: Asset Management.
18.
Zurück zum Zitat Consilvio, A., Solis-Hernandez, J., Jimenez-Redondo, N., Sanetti, P., Papa, F., & Mingolarra-Garaizar, I. (2020). On applying machine learning and simulative approaches to railway asset management: The earthworks and track circuits case studies. Sustainability, 12, 2544–2567. https://doi.org/10.3390/su12062544CrossRef Consilvio, A., Solis-Hernandez, J., Jimenez-Redondo, N., Sanetti, P., Papa, F., & Mingolarra-Garaizar, I. (2020). On applying machine learning and simulative approaches to railway asset management: The earthworks and track circuits case studies. Sustainability, 12, 2544–2567. https://​doi.​org/​10.​3390/​su12062544CrossRef
19.
Zurück zum Zitat Kumari, J., Karim, R., Thaduri, A., & Castano, M. (2021). Augmented asset management in railways – Issues and challenges in rolling stock. Proceedings of the Institution of Mechanical Engineers, Part F: Journal of Rail and Rapid Transit, 236(7), 850–862.CrossRef Kumari, J., Karim, R., Thaduri, A., & Castano, M. (2021). Augmented asset management in railways – Issues and challenges in rolling stock. Proceedings of the Institution of Mechanical Engineers, Part F: Journal of Rail and Rapid Transit, 236(7), 850–862.CrossRef
22.
Zurück zum Zitat Sresakoolchai, J., & Kaewunruen, S. (2022). Integration of building information modeling (BIM) and artificial intelligence (AI) to detect combined defects of infrastructure in the railway system. In: Kolathayar, S., Ghosh, C., Adhikari, B. R., Pal, I., & Mondal, A. (eds) Resilient infrastructure. Lecture Notes in Civil Engineering, 2022. Springer, Singapore. https://doi.org/10.1007/978-981-16-6978-1_30 Sresakoolchai, J., & Kaewunruen, S. (2022). Integration of building information modeling (BIM) and artificial intelligence (AI) to detect combined defects of infrastructure in the railway system. In: Kolathayar, S., Ghosh, C., Adhikari, B. R., Pal, I., & Mondal, A. (eds) Resilient infrastructure. Lecture Notes in Civil Engineering, 2022. Springer, Singapore. https://​doi.​org/​10.​1007/​978-981-16-6978-1_​30
25.
Zurück zum Zitat Su, Z., Núñez, A., Baldi, S., & De Schutter, B. (2016). Model predictive control for rail condition-based maintenance: A multilevel approach. In 2016 IEEE 19th international conference on intelligent transportation systems (ITSC), Rio de Janeiro, Brazil (pp. 354–359). https://doi.org/10.1109/ITSC.2016.7795579. Su, Z., Núñez, A., Baldi, S., & De Schutter, B. (2016). Model predictive control for rail condition-based maintenance: A multilevel approach. In 2016 IEEE 19th international conference on intelligent transportation systems (ITSC), Rio de Janeiro, Brazil (pp. 354–359). https://​doi.​org/​10.​1109/​ITSC.​2016.​7795579.
31.
Zurück zum Zitat Bornia, O., & Vignola, G. et al. (2019). Anomalies Detection Prototype and Validation Report; Deliverable 8.2, s.l. In2Smart EU Project. Bornia, O., & Vignola, G. et al. (2019). Anomalies Detection Prototype and Validation Report; Deliverable 8.2, s.l. In2Smart EU Project.
32.
Zurück zum Zitat Vignola, G., & Consilvio, A. et al. (2021). Data Analytics and DSS Framework Design; D4.2 IN2SMART2 EU Project. Vignola, G., & Consilvio, A. et al. (2021). Data Analytics and DSS Framework Design; D4.2 IN2SMART2 EU Project.
46.
51.
Zurück zum Zitat Samaras, P., Fachantidis, A., Tsoumakas, G., & Vlahavas, I. (2015). A prediction model of passenger demand using AVL and APC data from a bus fleet. In Proceedings of the 19th panhellenic conference on informatics (pp. 129–134). https://doi.org/10.1145/2801948.2801984 Samaras, P., Fachantidis, A., Tsoumakas, G., & Vlahavas, I. (2015). A prediction model of passenger demand using AVL and APC data from a bus fleet. In Proceedings of the 19th panhellenic conference on informatics (pp. 129–134). https://​doi.​org/​10.​1145/​2801948.​2801984
53.
Zurück zum Zitat Vandewiele, G., Colpaert, P., Janssens, O., Van Herwegen, J., Verborgh, R., Mannens, E., Ongenae, F., & De Turck, F. (2017). Predicting train occupancies based on query logs and external data sources. In Proceedings of the 26th International conference on world wide web companion - WWW ’17 Companion (pp. 1469–1474). https://doi.org/10.1145/3041021.3051699 Vandewiele, G., Colpaert, P., Janssens, O., Van Herwegen, J., Verborgh, R., Mannens, E., Ongenae, F., & De Turck, F. (2017). Predicting train occupancies based on query logs and external data sources. In Proceedings of the 26th International conference on world wide web companion - WWW ’17 Companion (pp. 1469–1474). https://​doi.​org/​10.​1145/​3041021.​3051699
59.
Zurück zum Zitat Oliveira D. F.N. et al. (2019). Evaluating unsupervised anomaly detection models to detect faults in heavy haul railway operations. In 2019 18th IEEE international conference on machine learning and applications (ICMLA), Boca Raton, FL, USA, 2019 (pp. 1016–1022). https://doi.org/10.1109/ICMLA.2019.00172 Oliveira D. F.N. et al. (2019). Evaluating unsupervised anomaly detection models to detect faults in heavy haul railway operations. In 2019 18th IEEE international conference on machine learning and applications (ICMLA), Boca Raton, FL, USA, 2019 (pp. 1016–1022). https://​doi.​org/​10.​1109/​ICMLA.​2019.​00172
64.
Zurück zum Zitat Shangpeng, S., & Zhao, H. (2013). Fault diagnosis in railway track circuits using support vector machines. In 2013 12th International conference on machine learning and applications (ICMLA), 2. IEEE. Shangpeng, S., & Zhao, H. (2013). Fault diagnosis in railway track circuits using support vector machines. In 2013 12th International conference on machine learning and applications (ICMLA), 2. IEEE.
65.
Zurück zum Zitat Bouman, R., Bukhsh, Z., & Heskes, T. (2023). Unsupervised anomaly detection algorithms on real-world data: How many do we need? 2305.00735, arXiv, cs.LG. Bouman, R., Bukhsh, Z., & Heskes, T. (2023). Unsupervised anomaly detection algorithms on real-world data: How many do we need? 2305.00735, arXiv, cs.LG.
68.
Zurück zum Zitat Khalouli, S., Benmansour, R., & Hanafi, S. (2016). An ant colony algorithm based on opportunities for scheduling the preventive railway maintenance. In 2016 international conference on control, decision and information technologies (CoDIT) (pp. 594–599). https://doi.org/10.1109/CoDIT.2016.7593629 Khalouli, S., Benmansour, R., & Hanafi, S. (2016). An ant colony algorithm based on opportunities for scheduling the preventive railway maintenance. In 2016 international conference on control, decision and information technologies (CoDIT) (pp. 594–599). https://​doi.​org/​10.​1109/​CoDIT.​2016.​7593629
70.
Zurück zum Zitat Soh, S. S., Radzi, Nor. H. M., & Haron, H. (2012). Review on scheduling techniques of preventive maintenance activities of railway. In 2012 Fourth international conference on computational intelligence, modelling and simulation (pp. 310–315). https://doi.org/10.1109/CIMSim.2012.56 Soh, S. S., Radzi, Nor. H. M., & Haron, H. (2012). Review on scheduling techniques of preventive maintenance activities of railway. In 2012 Fourth international conference on computational intelligence, modelling and simulation (pp. 310–315). https://​doi.​org/​10.​1109/​CIMSim.​2012.​56
71.
Zurück zum Zitat Zhao, J., Chan, A. H. C., & Burrow, M. P. N. (2009). A genetic-algorithm-based approach for scheduling the renewal of railway track components. Proceedings of the Institution of Mechanical Engineers, Part F: Journal of Rail and Rapid Transit, 223(6), 533–541.CrossRef Zhao, J., Chan, A. H. C., & Burrow, M. P. N. (2009). A genetic-algorithm-based approach for scheduling the renewal of railway track components. Proceedings of the Institution of Mechanical Engineers, Part F: Journal of Rail and Rapid Transit, 223(6), 533–541.CrossRef
77.
Zurück zum Zitat Carretero, J., Pérez, J. M., & Garcı́a-Carballeira, F., Calderón, A., Fernández, J., Garcı́a, J. D., Lozano, A., Cardona, L., Cotaina, N., & Prete, P. (2003). Applying RCM in large scale systems: A case study with railway networks. Reliability Engineering & System Safety, 82(3), 257–273. https://doi.org/10.1016/S0951-8320(03)00167-4CrossRef Carretero, J., Pérez, J. M., & Garcı́a-Carballeira, F., Calderón, A., Fernández, J., Garcı́a, J. D., Lozano, A., Cardona, L., Cotaina, N., & Prete, P. (2003). Applying RCM in large scale systems: A case study with railway networks. Reliability Engineering & System Safety, 82(3), 257–273. https://​doi.​org/​10.​1016/​S0951-8320(03)00167-4CrossRef
81.
Zurück zum Zitat Taylor, S. J., & Letham, B. (2018). Forecasting at scale. The American Statistician, 37–45, 2018.MathSciNet Taylor, S. J., & Letham, B. (2018). Forecasting at scale. The American Statistician, 37–45, 2018.MathSciNet
82.
Zurück zum Zitat Liu, D. C., & Nocedal, J. (1989). On the limited memory method for large scale optimization. Mathematical Programming B., 45(3), 503–528.MathSciNetCrossRef Liu, D. C., & Nocedal, J. (1989). On the limited memory method for large scale optimization. Mathematical Programming B., 45(3), 503–528.MathSciNetCrossRef
83.
Zurück zum Zitat Schölkopf, B., Burges, C. J. C., & Smola, A. J. (1999). Introduction to support vector learning. Advances in kernel methods. MIT Press, 327–352. Schölkopf, B., Burges, C. J. C., & Smola, A. J. (1999). Introduction to support vector learning. Advances in kernel methods. MIT Press, 327–352.
84.
Zurück zum Zitat Scholkopf, B., & Smola, A. J. (2002). Support vector machines and kernel algorithms. MIT Press, 1119–1125. Scholkopf, B., & Smola, A. J. (2002). Support vector machines and kernel algorithms. MIT Press, 1119–1125.
86.
Zurück zum Zitat Garrone, A., et al. (2023). Simple non regressive informed machine learning model for prescriptive maintenance of track circuits in a subway environment. In: Valle, M., et al. Advances in system-integrated intelligence. SYSINT 2022. Lecture Notes in Networks and Systems, 546. Springer, Cham. https://doi.org/10.1007/978-3-031-16281-7_8 Garrone, A., et al. (2023). Simple non regressive informed machine learning model for prescriptive maintenance of track circuits in a subway environment. In: Valle, M., et al. Advances in system-integrated intelligence. SYSINT 2022. Lecture Notes in Networks and Systems, 546. Springer, Cham. https://​doi.​org/​10.​1007/​978-3-031-16281-7_​8
Metadaten
Titel
A data-driven prioritisation framework to mitigate maintenance impact on passengers during metro line operation
verfasst von
Alice Consilvio
Giulia Vignola
Paula López Arévalo
Federico Gallo
Marco Borinato
Carlo Crovetto
Publikationsdatum
01.12.2024
Verlag
Springer International Publishing
Erschienen in
European Transport Research Review / Ausgabe 1/2024
Print ISSN: 1867-0717
Elektronische ISSN: 1866-8887
DOI
https://doi.org/10.1186/s12544-023-00631-z

Weitere Artikel der Ausgabe 1/2024

European Transport Research Review 1/2024 Zur Ausgabe