Complex & Intelligent Systems

Open Access 11.04.2023 | Original Article

Flight risk evaluation based on flight state deep clustering network

Authors: Guozhi Wang, Haojun Xu, Binbin Pei, Haoyu Cheng

Published in: Complex & Intelligent Systems | Issue 5/2023

Abstract

Flight risk evaluation based on a data-driven approach is an essential topic in aviation safety management. Existing risk analysis methods ignore the coupling and time-variant characteristics of flight parameters and cannot accurately establish the mapping between flight state and loss-of-control risk. To address this problem, a flight state deep clustering network (FSDCN) model is proposed to mine the latent loss-of-control risk information implicit in raw flight parameters. FSDCN integrates feature extraction and clustering into an end-to-end deep hybrid network that extracts latent risk features from multivariate time-series flight parameters and clusters them. In the FSDCN model, a sequential multi-attention encoder–decoder network is designed to extract embedded risk features, and a feature clustering layer is designed to iteratively refine both the clustering and the feature extraction. In addition, a loss-of-control classifier is added to optimize the risk feature vector representation and ensure that the features are discriminative enough to facilitate clustering. A multi-task joint learning strategy is adopted to further improve the clustering performance of the model. According to the extracted risk features and similarity metrics, the optimal number of flight state clusters is set to 5. Comparative experiments show that FSDCN performs significantly better than other clustering models, with a performance percentage error below $6\%$. Through statistical analysis of the clustering results, the risk level is quantified for each cluster. Three high-difficulty maneuver cases are presented to demonstrate FSDCN for flight risk evaluation. The flight parameter sequences of the maneuver cases are input into the well-trained FSDCN to obtain the risk prediction results. The spatiotemporal distribution of the risk-quantized results is consistent with the situations in which flight parameters exceed their limits, which demonstrates the effectiveness of FSDCN in clustering flight states. The experimental results on flight maneuver cases show that FSDCN can find potential loss-of-control risk features from multivariate time-series flight data and provide support for in-flight risk warnings.

Introduction

Flight risk early warning has always been the focus of flight safety research, and its core is to evaluate the aircraft’s performance in advance objectively [1, 2]. When the aircraft falls into complex conditions, accurate and objective risk evaluation for the aircraft’s performance will help the crew take corresponding manipulation strategies to operate the aircraft away from the loss-of-control (LOC) states [3]. In this way, flight crashes can be reduced effectively. Actually, the time-variant characteristics of flight parameters directly determine the aircraft’s performance and contain important risk information reflecting the LOC occurrence. Hence, mining the latent risk features of the flight parameters based on the data-driven method is significant for flight risk evaluation and prediction.

Clustering analysis is a standard data-mining method based on unsupervised machine learning [4]. Without prior knowledge, the method can separate data into clusters according to the captured features, then find the valuable information hidden in the data [5]. Due to the apparent advantages in unsupervised learning, the clustering analysis method has been widely used in the field of medicine [6, 7], industry [8], and aviation [9‐11]. The flight data recording the airplane path, speed, attitude and operation have some typical characteristics, including high-dimensional, redundant, and time-series. Directly clustering the flight parameters in the original data space is the simplest cluster procedure [12]. However, the classic clustering algorithms used to detect abnormal flight status lack efficient representation learning for the coupling and time-variant characteristics of flight parameters. The large-scale, unbalanced-distribution and high-dimensional data can greatly increase the clustering difficulty. Any special or abnormal data could affect clustering analysis results and result in the low reliability of the results. Deep clustering algorithms based on feature extraction can effectively solve this problem by mapping the higher-dimensional data space to an embedded features space.

Inspired by the deep clustering model, a flight state deep clustering network (FSDCN) model is proposed to jointly perform feature extraction and clustering assignment for multivariate time-series flight data. In the FSDCN model, a sequential multi-attention encoder–decoder network is designed to extract embedded risk features, and the feature clustering layer is designed to iteratively refine the cluster centroids by using an auxiliary target distribution. The Kullback–Leibler (KL) divergence between the soft assignments of the embedded features and the auxiliary target distribution is minimized to improve clustering assignment and feature representation simultaneously. In addition, a multi-task joint learning strategy is adopted to further improve the clustering performance of the model. Several clustering models, including a random forest K-means model, an autoencoder K-means model and the deep embedding clustering model, are compared with FSDCN. The comparative experiments show that FSDCN performs better than the other clustering models. Through statistical analysis of the clustering results, the risk level is finally quantified for each cluster. Three high-difficulty maneuver cases are presented to demonstrate the effectiveness of FSDCN in clustering flight states. The experimental results confirm that a well-trained FSDCN can effectively define the LOC risk level of the flight state according to multivariate time-series flight data. The main contributions of this work are summarized as follows:

An unsupervised FSDCN is proposed to realize flight state clustering and flight risk evaluation. The end-to-end multi-dimensional learning structure integrates feature extraction and unsupervised clustering procedure, and it can separate the flight state into different clusters according to the multivariate time-series flight parameters.

The multi-task joint learning strategy is designed by deriving a loss function containing reconstruction loss, classification loss and clustering loss. In network training, the shared network parameters are jointly adjusted to ensure convergence at different stages.

The LOC risk levels of clusters are defined, and the experimental results confirm the cluster ability of FSDCN and the effectiveness of flight risk evaluation.

This paper is organized as follows: The related works in terms of clustering analysis are delineated in “Related works”. The flight state clustering problem is defined in “Flight state clustering and splitting”. The flight state deep clustering network is illustrated in “Flight state deep clustering network”. Experiments and results analysis are shown in “Experiments”. Finally, some conclusions are presented in “Conclusions”.

Classic clustering

In recent years, some classic clustering algorithms have been developed and expanded, such as modified k-means clustering algorithm [13], density spatial clustering algorithm [14], K-nearest neighbor decision clustering algorithm [15], density deviation multi-peaks automatic clustering algorithm [16], and synchronization-inspired clustering algorithm [17]. The classic clustering algorithms are effective when the data structures are the combination of simple form and the features are representative. Clustering analysis has grown in popularity, given the need for detecting abnormal flight characteristics. Jiang et al. [15] used the density-based spatial clustering algorithm to identify abnormal spatiotemporal flight trajectories on the basis of the pilot’s operation. To address the negative impacts of outliers during clustering, Liu et al. [18] proposed a clustering with the outlier removal algorithm and used it to detect abnormal flight trajectories. Gao et al. [19] combined principal component analysis and hierarchical clustering algorithm to locate known exceedance flight data fragments and potential abnormal flight data fragments. Aslaner et al. [20] used the dynamic time-warping method to evaluate the similarities of the flight parameters in clusters. These clustering algorithms focus on anomaly detection for historical flight data, which causes its inability to track the cluster changes in flight. Zhao et al. [21] developed an online clustering algorithm to achieve cluster adjustment as onboard flight data update. Because these models do not consider multi-parameter coupling and time-varying characteristics from the perspective of flight safety, it is difficult to accurately describe the relationship between flight state and LOC risk.

Deep clustering

Recently, neural networks have developed rapidly and have been widely applied in many fields of modern technology owing to their approximation properties and feature learning capabilities. Deep clustering algorithms based on neural networks are promising methods for both feature extraction and clustering assignment. Siłka et al. [22] developed a long short-term memory neural network model with hyperbolic tangent activations in the hidden layers and used it to predict potential vibrations of high-speed trains from time series of recorded vibrations. Mittal et al. [23] introduced a Levenberg–Marquardt neural network into routing protocols for detecting malicious attacks over wireless sensor networks. Wieczorek et al. [24] proposed a complex solution for network training and used a custom neural network to flexibly predict virus spread. Moreover, some researchers have attempted to combine neural networks with clustering algorithms. A neural network can improve the performance of a clustering algorithm by mapping the higher-dimensional data space to an embedded feature space. Qin et al. [25] proposed a deep hybrid model to detect anomalous flights. In the model, a time-feature attention-based convolutional neural network extracts flight features, and a hierarchical density-based spatial clustering algorithm detects anomalous flights according to the extracted features. The deep hybrid model has limitations because it decouples feature extraction from clustering assignment, which could result in a mismatch between the extracted features and the clustering target. Xie et al. [26] proposed an unsupervised deep embedding clustering (DEC) model. The DEC model integrates feature extraction and clustering assignment into a deep neural network, which can simultaneously improve feature expression and optimize the clustering objective. The DEC model and its improved versions have been successfully applied to graph classification [27‐29], signal processing [30], document retrieval [31], and traffic crash prediction [32].

Flight state clustering and splitting

Flight state clustering aims to establish the mapping relationship between flight state and LOC risk. The time-variant characteristics of flight parameters are the primary basis for unsupervised cluster analysis and risk classification. Unlike the traditional risk classification method based on single flight parameter limitation, FSDCN first extracts the risk features implicit in multivariate time-series flight parameters, abstracting the original data space into a feature space. Then, unsupervised clustering is completed in the risk feature space.

In this paper, eight flight parameters are selected to be combined as the model input $\varvec{{\hat{x}}}$, as shown in Eq. (1). $\varvec{{\hat{x}}}$ is the multivariate time-series that is extracted from the flight data stream. The extraction range of the flight parameters is $\Delta t = 10$ s, and the detector collection frequency is $fs=50$ Hz:

$$\begin{aligned} {\varvec{{\hat{x}}}} = f(V,\dot{H},\alpha ,\theta ,\phi ,p,q,r). \end{aligned}$$

(1)
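With $\Delta t = 10$ s and $fs = 50$ Hz, each input sample spans 500 time steps of the eight parameters. A minimal sketch of cutting such windows from a recorded stream follows. The function name `split_windows`, the parameter names and the non-overlapping stride are assumptions for illustration, since the text does not state how consecutive windows are positioned.

```python
# Window geometry from the text: 10 s of data sampled at 50 Hz.
WINDOW_SECONDS = 10
SAMPLE_RATE_HZ = 50
STEPS_PER_WINDOW = WINDOW_SECONDS * SAMPLE_RATE_HZ  # 500 time steps
PARAMETERS = ("V", "H_dot", "alpha", "theta", "phi", "p", "q", "r")


def split_windows(stream, steps=STEPS_PER_WINDOW, stride=None):
    """Cut a list of per-time-step records into fixed-length windows.

    `stream` is a list of 8-tuples ordered as PARAMETERS. The stride
    defaults to the window length (no overlap), which is an assumption.
    """
    stride = steps if stride is None else stride
    return [stream[start:start + steps]
            for start in range(0, len(stream) - steps + 1, stride)]


# 25 s of dummy data gives two complete 10 s windows; the 5 s tail is dropped.
dummy_stream = [(0.0,) * len(PARAMETERS)] * (25 * SAMPLE_RATE_HZ)
windows = split_windows(dummy_stream)
```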

The flight parameters have different scale characteristics, which affects the initialization of network parameters and the efficiency of network training. The flight parameters usually vary within a specific range: their available limitations have obvious boundaries, and their behavior is continuous. Hence, a normalization method for multi-scaled variables is proposed based on the available limitations of the individual flight parameters. The model input $\varvec{{\hat{x}}}$ is preprocessed by the maximum normalization method. For each dimensional parameter $\hat{x} \in \{V,\dot{H},\alpha ,\theta ,\phi ,p,q,r\}$, its normalized value x is calculated as shown in Eq. (2):

$$\begin{aligned} x = \left\{ {\begin{array}{*{20}{c}} {\frac{{\hat{x} - {\text {mid}}(\hat{x})}}{{\hat{x}_{{\text {limit}}}^{{\text {up}}} - {\text {mid}}(\hat{x})}}}&{}{{\text {mid}}(\hat{x}) \leqslant \hat{x}< \hat{x}_{{\text {limit}}}^{{\text {up}}}} \\ {\frac{{\hat{x} - {\text {mid}}(\hat{x})}}{{\hat{x}_{{\text {limit}}}^{{\text {down}}} - {\text {mid}}(\hat{x})}}}&{}{\hat{x}_{{\text {limit}}}^{{\text {down}}} \leqslant \hat{x} < {\text {mid}}(\hat{x})} \end{array}} \right. \end{aligned}$$

(2)

where, $\hat{x}_{{\text {limit}}}^{{\text {up}}}$ is the upper limit of the parameter, $\hat{x}_{{\text {limit}}}^{{\text {down}}}$ is the lower limit of the parameter, as shown in Table 1, and ${{\text {mid}}(\hat{x})}$ is the median value of the parameter $\hat{x}$.

Table 1

The limitations of the flight parameters

Parameter	$\hat{x}_{{\text {limit}}}^{{\text {down}}}$	$\hat{x}_{{\text {limit}}}^{{\text {up}}}$	Parameter	$\hat{x}_{{\text {limit}}}^{{\text {down}}}$	$\hat{x}_{{\text {limit}}}^{{\text {up}}}$
Airspeed V	70 m/s	400 m/s	Roll angle $\phi $	$-150^{\circ }$	$150^{\circ }$
Climb rate $\dot{H}$	$-150$ m/s	150 m/s	Roll rate p	$-50^{\circ }$	$50^{\circ }$
Angle of attack $\alpha $	$- 20^{\circ }$	$30^{\circ }$	Pitch rate q	$-25^{\circ }$	$25^{\circ }$
Pitch angle $\theta $	$-90^{\circ }$	$90^{\circ }$	Yaw rate r	$-45^{\circ }$	$45^{\circ }$
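The normalization of Eq. (2) with the limits of Table 1 can be sketched as follows. The dictionary keys and the function name `normalize` are hypothetical, and taking ${\text {mid}}(\hat{x})$ as the midpoint of the two limits when no value is supplied is an assumption; the text only calls it the median value of the parameter.

```python
# Limits from Table 1 as (lower, upper); values are kept in the table's units.
LIMITS = {
    "V": (70.0, 400.0), "H_dot": (-150.0, 150.0),
    "alpha": (-20.0, 30.0), "theta": (-90.0, 90.0),
    "phi": (-150.0, 150.0), "p": (-50.0, 50.0),
    "q": (-25.0, 25.0), "r": (-45.0, 45.0),
}


def normalize(value, name, mid=None):
    """Apply Eq. (2) to one raw value of the parameter `name`.

    `mid` stands for mid(x_hat); when omitted, the midpoint of the two
    limits is used, which is an assumption made for this sketch.
    """
    lower, upper = LIMITS[name]
    if mid is None:
        mid = 0.5 * (lower + upper)
    if value >= mid:
        return (value - mid) / (upper - mid)   # upper branch of Eq. (2)
    return (value - mid) / (lower - mid)       # lower branch of Eq. (2)
```

As written, both branches of Eq. (2) return a non-negative value that grows from 0 at ${\text {mid}}(\hat{x})$ to 1 at the corresponding limit. Eq. (2) is defined only inside the limits; the sketch simply keeps applying the same branch beyond them, which yields values above 1.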

The FSDCN model automatically learns the nonlinear mapping relationship ${F_{\Omega } }$ between the original data space ${\varvec{X}}$ and the risk feature space ${\varvec{Z}}$ from the training data. Flight parameters $\{ {{\varvec{x}}_i} \in {\varvec{X}}\} _{i = 1}^N$ are first transformed to risk feature vectors $\{ {\varvec{z}_i} \in {\varvec{Z}}\} _{i = 1}^N$ with the nonlinear mapping relationship ${F_{\Omega } }$. The risk feature vectors are clustered into K clusters whose centroids are $\{ {{\varvec{\mu }}_j} \in {\varvec{Z}}\} _{j = 1}^K$ in the risk feature space ${\varvec{Z}}$. The risk level of each cluster is determined according to the number of LOC states:

$$\begin{aligned} {F_{\Omega } }:{\varvec{X}} \rightarrow {\varvec{Z}} \end{aligned}$$

(3)

where, $\Omega $ represents learned parameters in the deep clustering network.

Flight state deep clustering network

FSDCN adopts an end-to-end multi-task learning structure to simultaneously accomplish the feature extraction and clustering tasks for multi-dimensional flight parameters with time-series features. The framework of FSDCN is shown in Fig. 1. According to the priority of tasks, the training process of FSDCN is divided into three stages: the risk feature extraction stage, the risk feature clustering stage and the network joint training stage. A multi-task joint learning strategy is designed to jointly adjust the shared network parameters and ensure convergence at different stages.

Risk feature extraction with sequential multi-attention encoder–decoder network

Encoder–decoder is a general network framework, which is widely used for machine learning. Hence, a sequential multi-attention encoder–decoder network is designed to extract effective risk features from multivariate time-series flight parameters, as shown in Fig. 2.

GRU neural network is an improvement of the recurrent neural network (RNN), which has excellent long-term memory ability [33]. It not only contains the cyclic network structure but also introduces the gating mechanism to control the accumulation and update of information, which makes the GRU network suitable for processing time-series data. The GRU neural network extracts sequential features of corresponding parameters, and the multi-head attention mechanism is adopted to adjust the feature weight at different time nodes. According to the many-to-one mapping criterion, the risk feature vector ${f_c}$ is obtained. Next, the fully connected layer is used for further dimension reduction and obtains the low-dimensional risk feature vector ${f_p}$.

The decoder layer is composed of flight parameters reconstructor and LOC classifier. Specifically, the reverse network layer is adopted to realize reconstructed input vector $\varvec{{\tilde{x}}}$ from the risk feature vector ${f_p}$. The reconstruction loss ${L_{{\text {reconstruct }}}}$ is shown in Eq. (4). Since normal flight samples far exceed LOC samples, a dropout layer is adopted to design the LOC classifier to avoid network over-fitting. The LOC classifier can distinguish normal data and LOC data based on extracted low-dimension risk feature vector ${f_p}$, which can ensure ${f_p}$ retains sufficient latent LOC risk features. Moreover, focal loss [34] is introduced to address the training scenario in which there is an extreme imbalance between normal samples and LOC samples, as shown in Eq. (5). This can significantly increase the loss contribution of the LOC samples to make the model tend to learn from these samples.

$$\begin{aligned} {L_{{\text {reconstruct}}}} = \sum _{x \in \{ V, \ldots ,r\} } {\int _0^{\Delta t} {fs*{{\left( {{\tilde{x}}(t) - x(t)} \right) }^2}{\textrm{d}}t}} \end{aligned}$$

(4)

where, x(t) is the real flight parameter, ${{\tilde{x}}(t)}$ is the reconstructed parameter.
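For sampled data, the integral in Eq. (4) can be approximated by a sum. As a sketch, assuming a rectangle-rule discretization with time step $1/fs$ and sample times $t_n = n/fs$ (both assumptions, not stated in the text), the factor $fs$ cancels the time step and the loss reduces to a sum of squared errors over the $fs\,\Delta t = 500$ samples of each parameter:

$$\begin{aligned} {L_{{\text {reconstruct}}}} \approx \sum _{x \in \{ V, \ldots ,r\} } \sum _{n = 1}^{fs\,\Delta t} fs\,{\left( {{\tilde{x}}(t_n) - x(t_n)} \right) ^2}\,\frac{1}{fs} = \sum _{x \in \{ V, \ldots ,r\} } \sum _{n = 1}^{500} {\left( {{\tilde{x}}(t_n) - x(t_n)} \right) ^2} \end{aligned}$$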

$$\begin{aligned} {L_{{\text {classify}}}} = \left\{ {\begin{array}{*{20}{l}} { - \zeta {{\left( {1 - {p_y}} \right) }^2}\log {p_y}},&{}\quad y = 1 \\ { - (1 - \zeta ){p_y}^2\log \left( {1 - {p_y}} \right) },&{}\quad y = 0 \end{array}} \right. \end{aligned}$$

(5)

where, ${p_y}$ is the estimated probability for the LOC label y, and $\zeta $ represents the balance coefficient of positive and negative samples.
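The effect of Eq. (5) on easy and hard samples can be checked with a small sketch. Here `p` is read as the predicted probability of the LOC class ($y = 1$), which is one interpretation of ${p_y}$, and the function name `focal_loss` is hypothetical; the value $\zeta = 0.65$ is the one listed in Table 3.

```python
import math

ZETA = 0.65  # balance coefficient, value listed for Eq. (5) in Table 3


def focal_loss(p, y, zeta=ZETA):
    """Eq. (5) with focusing exponent 2 for one sample.

    `p` is the predicted probability of the LOC class (y = 1); it must
    lie strictly between 0 and 1 so that the logarithms are finite.
    """
    if y == 1:
        return -zeta * (1.0 - p) ** 2 * math.log(p)
    return -(1.0 - zeta) * p ** 2 * math.log(1.0 - p)


# A confidently correct LOC prediction contributes far less than a missed one.
easy_loc = focal_loss(0.9, 1)
hard_loc = focal_loss(0.1, 1)
```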

Eventually, the sequential multi-attention encoder–decoder network is trained to update the parameters $\Omega $ by minimizing the loss ${L_{FE}}$ shown in Eq. (6). The well-trained sequential multi-attention encoder–decoder network can summarize nonlinear mapping relationships and build an effective mapping from the original data space ${\varvec{X}}$ to the risk feature space ${\varvec{Z}}$.

$$\begin{aligned} {L_{FE}} = \delta {L_{{\text {reconstruct}}}} + (1 - \delta ){L_{{\text {classify}}}} \end{aligned}$$

(6)

where, $\delta $ is the weighting factor.

Risk feature clustering with feature clustering layer

Based on the obtained low-dimensional risk feature vector ${f_p}$, the feature clustering layer aims to learn K cluster centers in the risk feature space and determine the risk label of each data sample according to the similarity between the feature vector and the cluster center. The conventional clustering method updates the cluster centers and modifies the labels of the data samples by minimizing the value of the similarity measure; it does not change the location and distribution of the data samples in the process. In contrast, FSDCN combines the clustering process with feature extraction. The feature clustering layer introduces the soft assignment ${\varvec{Q}}$ and the auxiliary target distribution ${\varvec{P}}$, and updates the layer parameters by minimizing the KL divergence, which measures the difference between ${\varvec{Q}}$ and ${\varvec{P}}$.

As shown in Fig. 3, K cluster centers $\{ {{\varvec{\mu }}_j} \in {\varvec{Z}}\} _{j = 1}^K$ are initialized by the K-means algorithm in the risk feature space. The similarity between the risk feature vector ${{{\varvec{z}}_i}}$ and the cluster center ${{{\varvec{\mu }}_j}}$ is calculated as in Eq. (7):

$$\begin{aligned} {q_{i,j}} = \frac{{{{\left( {1 + {{\left\| {{{\varvec{z}}_i} - {{\varvec{\mu }}_j}} \right\| }^2}} \right) }^{ - 1}}}}{{\sum _{j' = 1}^K {{{\left( {1 + {{\left\| {{{\varvec{z}}_i} - {{\varvec{\mu }}_{j'}}} \right\| }^2}} \right) }^{ - 1}}} }} \end{aligned}$$

(7)

where, ${q_{i,j}}$ is the soft assignment, i.e., the probability that the feature point ${{{\varvec{z}}_i}}$ is assigned to the cluster centroid ${{{\varvec{\mu }}_j}}$.
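The soft assignment can be sketched in NumPy, assuming the Student's t kernel $(1 + \Vert {\varvec{z}}_i - {\varvec{\mu }}_j \Vert ^2)^{-1}$ used by the DEC model [26]. The function name `soft_assign` and the toy data are hypothetical.

```python
import numpy as np


def soft_assign(z, mu):
    """Soft assignment of N feature vectors `z` (N, d) to K centroids `mu` (K, d)."""
    # Squared Euclidean distance between every feature vector and every centroid.
    sq_dist = ((z[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
    kernel = 1.0 / (1.0 + sq_dist)          # Student's t kernel, one degree of freedom
    return kernel / kernel.sum(axis=1, keepdims=True)   # each row sums to 1


z = np.array([[0.0, 0.0], [0.9, 1.1], [5.0, 5.0]])   # toy feature vectors
mu = np.array([[0.0, 0.0], [5.0, 5.0]])              # toy centroids
q = soft_assign(z, mu)
```

Each row of `q` is a probability distribution over the clusters, and the largest entry belongs to the nearest centroid.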

After obtaining the soft assignment ${\varvec{Q}} = \{ \{ {q_{i,j}}\} _{j = 1}^K\} _{i = 1}^N$ of the data sample set, the auxiliary target distribution ${\varvec{P}}$ is calculated by first squaring ${\varvec{Q}}$ and then normalizing by the frequency per cluster, as shown in Eq. (8). This operation gives ${\varvec{P}}$ a sharper probability distribution (its values are closer to 0 or 1) than ${\varvec{Q}}$, which facilitates learning from risk feature data with high-confidence assignments.

$$\begin{aligned} {p_{i,j}} = \frac{{{{q_{i,j}^2} \bigg / {\sum _{i = 1}^N {{q_{i,j}}} }}}}{{\sum _{j = 1}^K {\left( {{{q_{i,j}^2} \bigg / {\sum _{i = 1}^N {{q_{i,j}}} }}} \right) } }}. \end{aligned}$$

(8)
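The sharpening effect of Eq. (8) can be seen on a small example. This is a sketch only; the function name `target_distribution` and the toy values are hypothetical.

```python
import numpy as np


def target_distribution(q):
    """Auxiliary target P from soft assignment Q (N, K), following Eq. (8)."""
    weight = q ** 2 / q.sum(axis=0)                    # square, divide by cluster frequency
    return weight / weight.sum(axis=1, keepdims=True)  # renormalize each sample


q = np.array([[0.7, 0.3], [0.6, 0.4], [0.2, 0.8]])   # toy soft assignments
p = target_distribution(q)
```

For these values, the dominant entry of every row of `p` is larger than the corresponding entry of `q`, which is the push toward high-confidence assignments described in the text.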

The clustering loss ${L_{{\text {cluster}}}}$ is defined as the KL divergence that measures the difference between soft assignment ${\varvec{Q}}$ and auxiliary target distribution ${\varvec{P}}$, as shown in Eq. (9). The clusters are iteratively refined by learning from high-confidence assignments. The clustering results can guide feature extraction, and that in turn optimizes the clustering effect:

$$\begin{aligned} {L_{{\text {cluster }}}} = KL({\varvec{P}}\parallel {\varvec{Q}}) = \sum _{i = 1}^N {\sum _{j = 1}^K {{p_{i,j}}\log \frac{{{p_{i,j}}}}{{{q_{i,j}}}}} }. \end{aligned}$$

(9)
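A direct NumPy transcription of Eq. (9) is sketched below. The function name `clustering_loss`, the small constant `eps` and the toy matrices are assumptions added for illustration and numerical safety.

```python
import numpy as np


def clustering_loss(p, q, eps=1e-12):
    """KL(P || Q) summed over all samples and clusters, as in Eq. (9).

    `eps` guards the logarithm against zeros and is not part of Eq. (9).
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(p * np.log((p + eps) / (q + eps))))


q = np.array([[0.7, 0.3], [0.2, 0.8]])        # toy soft assignments
p = np.array([[0.85, 0.15], [0.06, 0.94]])    # toy sharper targets
loss = clustering_loss(p, q)
```

The loss is zero when the two distributions coincide and positive otherwise, so minimizing it pulls ${\varvec{Q}}$ toward the sharper target ${\varvec{P}}$.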

Network joint training

Flight risk state clustering relies on extracting the risk features contained in the parameter data set. FSDCN integrates risk feature extraction and cluster analysis into an end-to-end network structure. At the network joint training stage, a new loss function is formed from the extraction loss and the clustering loss, as shown in Eq. (10), so that the risk feature vectors are updated in a direction more conducive to clustering.

$$\begin{aligned} L(\Omega ) = \gamma {L_{FE}}(\Omega ) + (1 - \gamma ){L_{{\text {cluster }}}}(\Omega ) \end{aligned}$$

(10)

where, $\gamma $ is the weighting factor.
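Substituting Eq. (6) into Eq. (10) makes the relative weight of each task explicit. With the values $\delta = 0.5$ and $\gamma = 0.5$ listed in Table 3, the worked expansion below suggests that the clustering loss carries half of the total weight and the reconstruction and classification losses a quarter each:

$$\begin{aligned} L(\Omega ) &= \gamma \delta {L_{{\text {reconstruct}}}} + \gamma (1 - \delta ){L_{{\text {classify}}}} + (1 - \gamma ){L_{{\text {cluster}}}} \\ &= 0.25\,{L_{{\text {reconstruct}}}} + 0.25\,{L_{{\text {classify}}}} + 0.5\,{L_{{\text {cluster}}}} \end{aligned}$$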

Multi-task joint learning strategy

According to the multi-task joint learning strategy, meaningful and well-separated feature representations are first produced by training the custom encoder–decoder network, which lays the foundation for efficient data clustering. Then, clustering assignment and improved feature representations are simultaneously realized by the feature clustering layer. Finally, the parameters of the whole FSDCN are jointly updated in a direction more conducive to clustering. According to the priority of tasks, different loss gradients are used to update the network parameters to ensure the convergence of FSDCN training. The back-propagation updates of the network parameters at each training stage are given in Table 2.

(1) Risk feature extraction. Update the parameters of the sequential multi-attention encoder–decoder network to learn the nonlinear mapping function ${F_{\Omega } }$. The network initially completes flight risk feature extraction:

$$\begin{aligned} {\Omega _{FE}} \leftarrow - {\nabla _{{\Omega _{FE}}}}\left( {{L_{FE}}} \right) . \end{aligned}$$

(11)

(2) Risk feature clustering. Update the parameters of the feature clustering layer by minimizing the KL divergence representing the similarity between Q and P:

$$\begin{aligned} {\Omega _{{\text {cluster}}}} \leftarrow - {\nabla _{{\Omega _{{\text {cluster}}}}}}\left( {{L_{{\text {cluster}}}}} \right) . \end{aligned}$$

(12)

(3) Network joint training. Jointly fine tune the parameters of the overall FSDCN and the parameters of the FSDCN are updated in the direction conducive to clustering tasks:

$$\begin{aligned} \Omega \leftarrow - {\nabla _{\Omega } }(L). \end{aligned}$$

(13)

Table 2

Multi-task joint learning strategy

Implementation of FSDCN training
Input: Flight parameter data ${\varvec{x}}$
Normalization: Maximum normalization of the data
Initialization: Initialize FSDCN parameters
For epoch = 1 to Epochs
  Case epoch in (0, 0.2 Epochs]:
    Update the parameters of the sequential multi-attention encoder–decoder network by $ - {\nabla _{{\Omega _{FE}}}}\left( {{L_{FE}}} \right) $
  Case epoch in (0.2 Epochs, 0.6 Epochs]:
    Update the parameters of the feature clustering layer by $ - {\nabla _{{\Omega _{{\text {cluster}}}}}}\left( {{L_{{\text {cluster}}}}} \right) $
  Case epoch in (0.6 Epochs, Epochs]:
    Update the parameters of the overall FSDCN by $ - {\nabla _{\Omega } }(L)$
End
Output: The class label of the corresponding input
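The epoch ranges of Table 2 can be expressed as a small scheduling helper. This is a sketch only: the function name `training_stage` and the stage labels are hypothetical, and the actual parameter updates are left out.

```python
def training_stage(epoch, total_epochs):
    """Return the stage of Table 2 that a 1-based epoch index falls into."""
    # Integer arithmetic avoids floating-point edge cases at the boundaries.
    if 5 * epoch <= total_epochs:          # epoch in (0, 0.2 * Epochs]
        return "feature_extraction"        # stage driven by L_FE
    if 5 * epoch <= 3 * total_epochs:      # epoch in (0.2 * Epochs, 0.6 * Epochs]
        return "feature_clustering"        # stage driven by L_cluster
    return "joint_training"                # stage driven by the joint loss L


schedule = [training_stage(e, 10) for e in range(1, 11)]
```

With ten epochs, the first two fall into feature extraction, the next four into feature clustering and the last four into joint training.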

Experiments

Owing to the development of aviation technology, flying has become the safest mode of transportation according to accident statistics. Over the past 20 years, accident rates have declined dramatically while the number of flights has risen steadily. LOC events rarely happen in actual flight. Hence, few flight data are available for complex situations, especially LOC accidents. The flight data used in this paper mainly come from aerobatic flight training with a simulator. The flight simulator and part of an aerobatic flight trajectory are shown in Fig. 4. The implementation of the FSDCN algorithm in this paper is based on Python 3.7 and the PyTorch 1.10.2 deep learning framework.

Model parameters setting and evaluation metrics

Flight states are related to the number of clusters. The reconstruction loss, classification loss and clustering loss are weighted by the weighting factors $\delta $ and $\gamma $. The role of the weighted loss function is to guide network training in a direction more conducive to clustering. The optimal number of clusters depends on the risk features extracted for partitioning, as well as on the similarity measure. The sum of squared errors (SSE) and the gap value are common indexes used to select the optimal cluster number in the K-means algorithm, as shown in Eqs. (14) and (15). Both indexes provide valuable information for cluster analysis by fitting the model over a range of values of the cluster number K. Hence, both indexes are used to determine the optimal cluster number.

The elbow method finds the elbow point by drawing a line plot of SSE against K. As shown in Fig. 5a, the elbow point occurs at cluster number $K = 5$. Gap statistics (GS) measures the difference in clustering between the observed data and reference data drawn from a reference distribution. The optimal number of clusters can be chosen as the smallest value of K such that its gap value is within one standard deviation of the gap at $K + 1$. As shown in Fig. 5b, when $K = 5$, the gap value is greater than the value at $K = 6$, which satisfies this criterion. Hence, the optimal number of clusters is set as 5:

$$\begin{aligned} \text {SSE} = \sum _{k = 1}^K {\sum _{i = 1}^N {{{\left( {p_i^k - {\mu ^k}} \right) }^2}} } \end{aligned}$$

(14)

$$\begin{aligned} \left\{ \begin{array}{l} {{\text {Gap}} _n}(K) = E_n^*\left\{ {\log {W_K}} \right\} - \log {W_K} \\ E_n^*\left\{ {\log {W_K}} \right\} = \frac{1}{B}\sum _{b = 1}^B {\log \left( {W_{Kb}^*} \right) }. \end{array}\right. \end{aligned}$$

(15)
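The selection rule described above (the smallest K whose gap is within one standard deviation of the gap at $K + 1$) can be sketched as follows. The function name `choose_k_by_gap` is hypothetical, and the gap values and standard deviations are made-up numbers for illustration, not the values behind Fig. 5b.

```python
def choose_k_by_gap(gap, std):
    """Smallest K whose gap is within one standard deviation of the gap at K + 1.

    `gap` and `std` map each candidate K to its gap value and to the
    standard deviation of the reference log-dispersions at that K.
    """
    for k in sorted(gap):
        if k + 1 in gap and gap[k] >= gap[k + 1] - std[k + 1]:
            return k
    return max(gap)  # no K met the rule; fall back to the largest candidate


# Made-up numbers for illustration only.
gap = {2: 0.30, 3: 0.45, 4: 0.62, 5: 0.80, 6: 0.78}
std = {2: 0.02, 3: 0.02, 4: 0.02, 5: 0.02, 6: 0.02}
best_k = choose_k_by_gap(gap, std)
```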

Meanwhile, Table 3 gives the other primary parameter configurations of FSDCN. The flight data are divided into a training dataset and a testing dataset to verify the performance of FSDCN. The k-fold cross-validation method is prevalent for evaluating classification algorithm performance [35]. The experiment in this paper uses ten-fold cross-validation. The flight data are randomly divided into ten disjoint datasets of approximately equal size, so the ratio of the training dataset to the testing dataset is 9:1. Each disjoint dataset is used in turn as the testing dataset to evaluate the flight state clustering effect, and the other nine disjoint datasets are used as the training dataset to learn feature representations and cluster assignments.
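The ten-fold partition can be sketched with the standard library. The function name `ten_fold_splits`, the fixed seed and the sample count of 1000 are assumptions for illustration; the text does not describe how the random partition was generated.

```python
import random


def ten_fold_splits(n_samples, n_folds=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)                   # random, reproducible order
    folds = [indices[i::n_folds] for i in range(n_folds)]  # near-equal disjoint folds
    for held_out in range(n_folds):
        test = folds[held_out]
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train, test


splits = list(ten_fold_splits(1000))
```

Every sample appears in exactly one testing fold, and each round trains on the remaining nine folds, which gives the 9:1 ratio.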

Three metrics are introduced to evaluate the quality of clustering: Davies–Bouldin (DB), Silhouette coefficient (SC), and Calinski–Harabasz (CH). DB represents the similarity measurement between clusters, and SC represents how tightly grouped the samples in the clusters are. CH represents a ratio of the sum of inter-cluster dispersion and the sum of intra-cluster dispersion for all clusters.

SC combines cluster cohesion a(i) and cluster separation b(i). SC is calculated as shown in Eq. (16). SC is essentially the difference between a(i) and b(i) divided by the maximum of the two. Hence, the score of SC ranges within $[ - 1,1]$, and its value getting closer to 1 indicates a better clustering effect.

$$\begin{aligned} \text {SC}(i) = \frac{{b(i) - a(i)}}{{\max \{ a(i),b(i)\} }} = \left\{ {\begin{array}{*{20}{c}} {1 - \frac{{a(i)}}{{b(i)}}},&{}\quad {a(i) < b(i)} \\ {\frac{{b(i)}}{{a(i)}} - 1},&{}\quad {a(i) \geqslant b(i)} \end{array}} \right. \end{aligned}$$

(16)

where, a(i) refers to the average distance between an instance and all other samples within the same cluster, and b(i) refers to the average distance between an instance and all samples in other clusters. Their formulas are shown in Eq. (17):

$$\begin{aligned} \left\{ \begin{array}{l} a(i) = \frac{1}{{n - 1}}\sum _{j \ne i}^n {\left\| {{\varvec{z}_i} - {\varvec{z}_j}} \right\| } \\ b(i) = \frac{1}{{N - 1}}\sum _{j \ne i}^N {\left\| {{\varvec{z}_i} - {\varvec{z}_j}} \right\| } \end{array} \right. \end{aligned}$$

(17)

where, n represents the number of all other samples within the same cluster, and N represents the number of all samples in other clusters.
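A per-sample sketch of Eq. (16) follows, with a(i) and b(i) computed as described in the text: the mean distance to the other samples of the same cluster, and the mean distance to all samples in other clusters. The function name `silhouette` and the toy data are hypothetical.

```python
import numpy as np


def silhouette(z, labels, i):
    """SC(i) of Eq. (16) for sample `i`, with a(i) and b(i) as described in the text."""
    dist = np.linalg.norm(z - z[i], axis=1)   # distances from sample i to every sample
    same = labels == labels[i]
    same[i] = False                           # exclude the sample itself
    other = labels != labels[i]
    a = dist[same].mean()                     # cohesion: own cluster
    b = dist[other].mean()                    # separation: all other clusters
    return (b - a) / max(a, b)


z = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
labels = np.array([0, 0, 1, 1])
sc0 = silhouette(z, labels, 0)
```

Library implementations such as scikit-learn's `silhouette_score` take b(i) as the mean distance to the nearest other cluster only, so their values can differ from this sketch when there are more than two clusters. The sketch also assumes that the cluster of sample `i` has at least two members.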

CH combines the sum of inter-cluster dispersion ${\text {GS}\left( {{B_k}} \right) }$ and the sum of intra-cluster dispersion for all clusters ${\text {GS}\left( {{W_k}} \right) }$. CH is calculated as shown in Eq. (18). Unlike SC, CH has no bound, and its high score means a better clustering effect.

$$\begin{aligned} \text {CH} = \frac{{\text {GS}\left( {{B_k}} \right) }}{{\text {GS}\left( {{W_k}} \right) }}*\frac{{N - K}}{{K - 1}} \end{aligned}$$

(18)

where,

$$\begin{aligned} \left\{ \begin{array}{l} \text {GS}({B_k}) = \sum _{k = 1}^K {{n_k} \bullet {{\left\| {{\varvec{\mu }_k} - {\varvec{\mu }_0}} \right\| }^2}} \\ \text {GS}({W_k}) = \sum _{k = 1}^K {\sum _{j = 1}^{{n_k}} {{{\left\| {{\varvec{z}_{kj}} - {\varvec{\mu }_k}} \right\| }^2}} } \end{array} \right. \end{aligned}$$

(19)

where, ${{\varvec{\mu }_0}}$ is the centroid of all samples, ${{\varvec{\mu }_k}}$ is the centroid of the k-th cluster, ${{\varvec{z}_{kj}}}$ is the j-th sample of the k-th cluster, and ${{n_k}}$ is the number of samples in the k-th cluster.
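Eqs. (18) and (19) translate directly into NumPy. In the sketch below, the function name `calinski_harabasz` and the toy data are hypothetical.

```python
import numpy as np


def calinski_harabasz(z, labels):
    """CH of Eq. (18) from the dispersions of Eq. (19)."""
    n_samples = len(z)
    clusters = np.unique(labels)
    mu_0 = z.mean(axis=0)                            # centroid of all samples
    between, within = 0.0, 0.0
    for k in clusters:
        members = z[labels == k]
        mu_k = members.mean(axis=0)                  # centroid of cluster k
        between += len(members) * np.sum((mu_k - mu_0) ** 2)
        within += np.sum((members - mu_k) ** 2)
    n_clusters = len(clusters)
    return between / within * (n_samples - n_clusters) / (n_clusters - 1)


z = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])
labels = np.array([0, 0, 1, 1])
ch = calinski_harabasz(z, labels)
```

For this toy data the inter-cluster dispersion is 100 and the intra-cluster dispersion is 4, so CH = (100 / 4) × (4 − 2) / (2 − 1) = 50.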

DB describes the average similarity of each cluster with a cluster most similar to it, which combines intra-cluster dispersion ${{C_i}},{{C_j}}$ and separation measure ${{M_{ij}}}$.

$$\begin{aligned} \text {DB} = \frac{1}{K}\sum _{i = 1}^K {\max _{j \ne i}} \left( {\frac{{{C_i} + {C_j}}}{{{M_{ij}}}}} \right) \end{aligned}$$

(20)

where,

$$\begin{aligned} \left\{ \begin{array}{l} {C_i} = {\left\{ {\frac{1}{{{n_i}}}\sum _{l = 1}^{{n_i}} {{{\left\| {{\varvec{z}_{il}} - {\varvec{\mu } _i}} \right\| }^2}} } \right\} ^{\frac{1}{2}}} \\ {M_{ij}} = \left\| {{\varvec{\mu } _i} - {\varvec{\mu } _j}} \right\| . \end{array} \right. \end{aligned}$$

(21)
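The DB index can be sketched in the same way. The example below follows Eqs. (20) and (21) under two stated assumptions: $C_i$ is the root-mean-square Euclidean distance of the members of cluster $i$ to their centroid, and $M_{ij}$ is the Euclidean distance between the centroids of clusters $i$ and $j$. The function names and toy data are illustrative only.

```python
from math import dist, sqrt

def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def davies_bouldin(clusters):
    """DB index following Eqs. (20)-(21); `clusters` is a list of lists of vectors."""
    mus = [centroid(c) for c in clusters]
    # C_i: root-mean-square distance of cluster members to their centroid
    scatter = [
        sqrt(sum(dist(z, mu) ** 2 for z in c) / len(c))
        for c, mu in zip(clusters, mus)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Worst-case (largest) similarity of cluster i to any other cluster j
        total += max(
            (scatter[i] + scatter[j]) / dist(mus[i], mus[j])
            for j in range(k)
            if j != i
        )
    return total / k

# Toy data (hypothetical): two tight clusters that are far apart.
toy = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 0.0), (10.0, 2.0)]]
db = davies_bouldin(toy)
print(db)
```

Here each cluster has a dispersion of 1 and the centroids are 10 apart, so DB is 0.2; unlike CH, the value shrinks as the clusters separate.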

Table 3

Other primary parameter settings of FSDCN

Parameter	Setting value
GRU layer	3
GRU node number	128
Fully connected layer node number	64
Network learning rate	0.001
Maximum iteration number	10,000
Mini-batch size	64
$\zeta $ in Eq. (5)	0.65
$\delta $ in Eq. (6)	0.5
$\gamma $ in Eq. (10)	0.5

The FSDCN proposed in this paper is an unsupervised clustering model whose input has no true labels, so no independent labeled validation data is available to evaluate its performance. Instead, statistics of the metrics (DB, SC and CH) are used to measure the performance of the clustering model. Because k-fold cross-validation partitions the data randomly, the mean and standard deviation of each metric across the folds can be used to verify the performance of unsupervised learning algorithms: the mean measures the quality of the cluster assignment, and the standard deviation measures the stability of the model. However, the metrics have different scales, which hinders direct comparison between clustering models. To compare the performance of different clustering models directly, the percentage error is introduced, as shown in Eq. (22).

$$\begin{aligned} {\text {Percentage error}} = \frac{{\left| {{V_{{\text {train}}}} - {V_{{\text {test}}}}} \right| }}{{{V_{{\text {train}}}}}} \times 100\% \end{aligned}$$

(22)

where $V_{{\text {train}}}$ is the metric value on the training dataset, and $V_{{\text {test}}}$ is the metric value on the testing dataset.
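As a worked example of Eq. (22), the sketch below applies the formula to the mean FSDCN metric values reported in Table 4. Applying it to the means rather than to each fold is an assumption made here for illustration; the function name `percentage_error` is likewise illustrative.

```python
def percentage_error(v_train, v_test):
    """Relative train/test gap of a metric in percent, as in Eq. (22)."""
    return abs(v_train - v_test) / v_train * 100.0

# Mean (training, testing) values for FSDCN, copied from Table 4.
table4_fsdcn = {
    "DB": (0.1661, 0.1583),
    "SC": (0.8954, 0.8601),
    "CH": (198756.56, 187844.67),
}

errors = {
    name: round(percentage_error(train, test), 2)
    for name, (train, test) in table4_fsdcn.items()
}
print(errors)
```

Rounded to two decimals, the results are 4.70 for DB, 3.94 for SC and 5.49 for CH, which agree with the mean percentage errors listed for FSDCN in Table 4.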

Clustering results analysis

T-distributed stochastic neighbor embedding (t-SNE) is a popular tool for high-dimensional data visualization that maps data from a high-dimensional space to a low-dimensional space. Here, t-SNE is applied to map the high-dimensional latent risk features to three-dimensional vectors for visualization. The progression of the embedded representation of the risk features is shown in Fig. 6. The clusters become increasingly well separated as training proceeds, which means that FSDCN extracts more distinguishable risk features and thus better serves flight state clustering. Figure 6a shows the embedded representation in the initial stage: FSDCN starts from a chaotic state in which the clusters overlap too closely to reveal any separability in the data. Figure 6b shows the risk feature extraction stage, in which the sequential multi-attention encoder–decoder network is updated to improve clustering performance. All data move closer to their cluster centers, but the cluster boundaries still overlap. Figure 6c shows the risk feature clustering stage, in which the deep cluster layer is updated to enhance clustering performance, and the distance between clusters grows. Figure 6d shows the multi-task joint learning stage, in which the parameters of the overall FSDCN are updated simultaneously. The distribution of the clusters changes with the training iterations, the embedded representations of the risk features group together more distinctly, and the cluster assignments become more reasonable. Figure 6e shows the clustering result for the embedded representation: the elements within each cluster are highly concentrated and the gaps between clusters are apparent. This demonstrates the effectiveness of the proposed multi-task joint learning strategy.

The evolution of the evaluation metrics for FSDCN is shown in Fig. 7. As training proceeds, the DB index, which represents the separation between clusters, decreases gradually, and the SC index, which indicates the cohesion of the clusters, increases gradually. The clustering effect is continuously improved by risk feature extraction. In the risk feature clustering stage, the DB and SC indexes change little, whereas the CH index, which contains both separation and cohesion information, increases significantly. A higher CH value means that the clusters are dense and well separated. In the multi-task joint learning stage, the three evaluation metrics (DB, SC and CH) oscillate slightly with the training iterations while the clustering effect continues to be optimized. As can be observed from Fig. 7, all three metrics finally converge to a stable state: FSDCN finds a stable clustering solution that extracts latent risk features and clusters them reasonably.

The results of k-fold cross-validation for FSDCN are shown in Table 4. On both the training dataset and the testing dataset, FSDCN has high SC and CH indexes and a low DB index, which indicates that FSDCN has strong unsupervised clustering ability. Moreover, the average values of DB, CH and SC are similar across the two datasets, and their standard deviations are small. The mean percentage error does not exceed $6\%$, indicating that FSDCN has good generalization ability.

Table 4

The results of k-fold cross validation for clustering algorithms

Algorithms	Metrics	Training dataset	Testing dataset	Percentage error
FSDCN	DB	$0.1661 \pm 0.020$	$0.1583 \pm 0.019$	$4.70\% \pm 5.00\% $
	SC	$0.8954 \pm 0.023$	$0.8601 \pm 0.022$	$3.94\% \pm 4.35\% $
	CH	$198756.56 \pm 5598.33$	$187844.67 \pm 5854.16$	$5.49\% \pm 4.57\% $
Random forest K-means [36]	DB	$0.4150 \pm 0.043$	$0.4263 \pm 0.046$	$2.72\% \pm 6.98\% $
	SC	$0.4994 \pm 0.034$	$0.5246 \pm 0.040$	$5.05\% \pm 17.65\% $
	CH	$92478.44 \pm 10534.64$	$97804.07 \pm 9714.15$	$5.76\% \pm 7.79\% $
Autoencoder K-means [37]	DB	$0.3525 \pm 0.026$	$0.4060 \pm 0.035$	$15.18\% \pm 34.62\% $
	SC	$0.6159 \pm 0.024$	$0.4873 \pm 0.033$	$20.88\% \pm 37.50\% $
	CH	$119360.76 \pm 8377.33$	$103854.93 \pm 9844.38$	$12.99\% \pm 17.51\% $
Deep embedding cluster [26]	DB	$0.2971 \pm 0.021$	$0.3962 \pm 0.019$	$33.36\% \pm 9.52\% $
	SC	$0.6543 \pm 0.025$	$0.5365 \pm 0.027$	$18.00\% \pm 8.00\% $
	CH	$138923.35 \pm 8560.33$	$127362.45 \pm 9334.57$	$8.32\% \pm 9.05\% $

To validate the effectiveness of FSDCN, we compare its clustering performance with that of three other models: the random forest K-means model [36], in which features extracted by a random forest are used for K-means clustering; the autoencoder K-means model [37], in which features extracted by an autoencoder are used for K-means clustering; and the deep embedding cluster model [26], in which feature extraction and unsupervised cluster assignment are integrated into a deep embedding network.

Figure 8 presents the visualization results of the other clustering algorithms on the flight datasets, and Table 4 reports their performance on the training and testing datasets. Figure 8 shows that the autoencoder has considerable potential for feature extraction: compared with the random forest, the autoencoder significantly improves the performance of the clustering algorithm. Among the three baseline algorithms, the deep embedding cluster algorithm achieves the best metric values in Table 4, and its clustering result in Fig. 8c is better than those of the other two. This means that a deep neural network integrating feature extraction and cluster assignment can further improve clustering performance. However, overlapping areas between its clusters remain, and its percentage error is large (up to $33.36\%$ for DB), which means that the generalization ability of the deep embedding cluster model is poor. Hence, inspired by the deep embedding cluster model, FSDCN is developed.

Comparing Figs. 6e and 8c, the boundaries between the clusters obtained by FSDCN are more apparent than those obtained by the deep embedding cluster algorithm. This means that the latent risk features contain more information that is useful for dividing flight states. Moreover, according to Table 4, the metric values of FSDCN are better than those of the other clustering algorithms, and the mean percentage error of every metric is below $6\%$. This supports its better clustering performance. In summary, FSDCN is a competitive algorithm that can simultaneously learn feature representations and cluster assignments.

Flight risk evaluation

According to the FSDCN clustering results, all flight states are divided into five clusters, from $\text {RL}_{\text {A}}$ to $\text {RL}_{\text {E}}$. Every cluster includes both normal flight states and LOC states. Hence, several metrics are defined to characterize the clusters. Based on the relationship between flight parameters and latent LOC information, the risk level of each cluster is determined through statistical analysis.

Proportion of cluster data (PCD): the ratio of the number of samples in a cluster to the total number of samples. The value of PCD is related to the complexity and danger of flight maneuvers. Its expression is shown in Eq. (23).

$$\begin{aligned} {p_{{\text {PCD}}}} = \frac{{n_c^i}}{{{n_{{\text {all}}}}}} \end{aligned}$$

(23)

where, ${n_c^i}$ is the sample number of the i-th cluster, and ${{n_{{\text {all}}}}}$ is the sample number of all data.

Proportion of LOC data (PLD): the ratio of the number of LOC samples contained in a cluster to the total number of LOC samples. The larger the PLD value, the larger the share of LOC samples that fall in the cluster. Its expression is shown in Eq. (24).

$$\begin{aligned} {p_{{\text {PLD}}}} = \frac{{n_{{\text {loss}}}^i}}{{n_{{\text {loss}}}^{{\text {all}}}}} \end{aligned}$$

(24)

where ${n_{{\text {loss}}}^i}$ is the number of LOC samples contained in the i-th cluster, and ${n_{{\text {loss}}}^{{\text {all}}}}$ is the total number of LOC samples.

Average LOC correlation (ALC): the average correlation between the LOC samples and the other samples in the cluster. The larger the ALC value, the stronger the correlation between the flight states and LOC in the cluster. Its expression is shown in Eq. (25).

$$\begin{aligned} {p_{{\text {ALC}}}} = \frac{{\sum \nolimits _{k = 0}^{n_c^i} {\sum \nolimits _{j = 0}^{n_{{\text {loss}}}^i} {\text {Pearson}({c_k},{l_j})} } }}{{n_c^i}} \end{aligned}$$

(25)

where $\text {Pearson}(*)$ is the Pearson correlation coefficient function; see [38].
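The three statistics can be sketched for a single cluster as follows. This is a minimal illustration that applies Eqs. (23)–(25) literally and assumes that ${c_k}$ ranges over all samples of the cluster and ${l_j}$ over its LOC samples, each represented as a feature vector; this reading, the function names and the toy data are assumptions, not details given in the paper.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def cluster_statistics(cluster, is_loc, n_all, n_loc_all):
    """Return (PCD, PLD, ALC) for one cluster.

    cluster   -- list of feature vectors in the cluster
    is_loc    -- parallel list of booleans marking LOC samples
    n_all     -- number of samples in the whole dataset
    n_loc_all -- number of LOC samples in the whole dataset
    """
    loc = [v for v, flag in zip(cluster, is_loc) if flag]
    pcd = len(cluster) / n_all                       # Eq. (23)
    pld = len(loc) / n_loc_all                       # Eq. (24)
    # Eq. (25): sum of pairwise correlations divided by the cluster size
    alc = sum(pearson(c, l) for c in cluster for l in loc) / len(cluster)
    return pcd, pld, alc

# Toy data (hypothetical): three samples, the first one flagged as LOC.
cluster = [(1.0, 2.0, 3.0), (2.0, 4.0, 6.0), (3.0, 2.0, 1.0)]
pcd, pld, alc = cluster_statistics(cluster, [True, False, False], n_all=6, n_loc_all=2)
print(pcd, pld, alc)
```

In this toy cluster the LOC sample correlates perfectly with itself and with the second sample and inversely with the third, so the pairwise correlations sum to 1 and ALC is 1/3.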

The statistical results of the clusters are shown in Table 5. The $\text {RL}_{\text {B}}$ cluster accounts for $35.33\%$ of the whole dataset, but its LOC samples account for only $1.56\%$ of the total LOC samples. ${p_{{\text {ALC}}}}$ of the $\text {RL}_{\text {B}}$ cluster is less than 0.1, which indicates that the correlation between the flight states and LOC states in this cluster is extremely weak or absent. Hence, the risk level of the $\text {RL}_{\text {B}}$ cluster is relatively low. The $\text {RL}_{\text {C}}$ and $\text {RL}_{\text {E}}$ clusters together account for $34.41\%$ of the whole dataset, and their LOC samples account for $17.31\%$ of the total LOC samples. ${p_{{\text {ALC}}}}$ of the $\text {RL}_{\text {C}}$ and $\text {RL}_{\text {E}}$ clusters is 0.2567 and 0.3177, respectively. ${p_{{\text {ALC}}}} \in [0.2,0.4]$ indicates that the correlation between the flight states and LOC states in these clusters is medium. Hence, the risk level of the $\text {RL}_{\text {C}}$ and $\text {RL}_{\text {E}}$ clusters is moderate. The $\text {RL}_{\text {A}}$ and $\text {RL}_{\text {D}}$ clusters together account for $30.26\%$ of the whole dataset, and their LOC samples account for $81.13\%$ of the total LOC samples. ${p_{{\text {ALC}}}}$ of the $\text {RL}_{\text {A}}$ and $\text {RL}_{\text {D}}$ clusters is 0.6489 and 0.8357, respectively. ${p_{{\text {ALC}}}} \in [0.6,0.9]$ indicates that the correlation between the flight states and LOC states in these clusters is strong. Hence, the risk level of the $\text {RL}_{\text {A}}$ and $\text {RL}_{\text {D}}$ clusters is high. Finally, the risk level of each cluster is determined according to the PLD and ALC metrics, as shown in Table 5. The higher the risk level, the stronger the LOC correlation.

Table 5

Flight state statistics and risk level evaluation

Cluster	PCD	PLD	ALC	Risk level ${{\text {C}}_{\text {R}}}$
$\text {RL}_{\text {A}}$	$16.85\%$	$23.74\%$	0.6489	4
$\text {RL}_{\text {B}}$	$35.33\%$	$1.56\%$	0.0936	1
$\text {RL}_{\text {C}}$	$16.34\%$	$5.83\%$	0.2567	2
$\text {RL}_{\text {D}}$	$13.41\%$	$57.39\%$	0.8357	5
$\text {RL}_{\text {E}}$	$18.07\%$	$11.48\%$	0.3177	3
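The risk levels in Table 5 coincide with the rank order of the clusters by ALC, and equally by PLD. The sketch below assumes that the level ${{\text {C}}_{\text {R}}}$ is simply this rank, which the text does not state explicitly; the helper name `rank_levels` is illustrative.

```python
# (PLD, ALC) per cluster, copied from Table 5 (PLD written as a fraction).
table5 = {
    "RL_A": (0.2374, 0.6489),
    "RL_B": (0.0156, 0.0936),
    "RL_C": (0.0583, 0.2567),
    "RL_D": (0.5739, 0.8357),
    "RL_E": (0.1148, 0.3177),
}

def rank_levels(stats, column):
    """Assign level 1 to the cluster with the smallest value in `column`."""
    ordered = sorted(stats, key=lambda name: stats[name][column])
    return {name: level for level, name in enumerate(ordered, start=1)}

by_pld = rank_levels(table5, 0)
by_alc = rank_levels(table5, 1)
print(by_alc)
```

Both rankings give level 1 to RL_B, 2 to RL_C, 3 to RL_E, 4 to RL_A and 5 to RL_D, matching the last column of Table 5.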

To verify the validity of FSDCN in classifying flight states, the flight parameter sequences of several flight maneuver cases are input into the well-trained FSDCN, and the risk level ${{{C}}_{\text {R}}}$ of each input sequence is output. The risk level ${{{C}}_{\text {R}}}$ and the flight parameters are presented together to compare their temporal distribution characteristics. In addition, the trajectory of each flight maneuver case, annotated with the risk level ${{{C}}_{\text {R}}}$, is provided to show the spatial distribution characteristics.

The flight cases include some high-difficulty maneuvers, such as loops, barrel rolls, s-turns, wingovers and nosedives. Figures 9, 10 and 11 show the risk heatmaps of the spatiotemporal distribution for the flight maneuver cases. The spatial risk heatmaps show that LOC risk is mainly concentrated in the loop maneuver stage. Figures 9a, 10a and 11a show that the distribution of flight parameter over-limit events is basically consistent with that of LOC risk. This means that the quantified risk level can effectively characterize the flight states.

Conclusions

In this paper, a flight state deep clustering network (FSDCN) was proposed for flight risk evaluation. FSDCN has an end-to-end multi-task learning structure that integrates feature extraction and unsupervised clustering. In FSDCN, a sequential multi-attention encoder–decoder network is designed to extract effective risk features from the original flight parameters, and a clustering layer is constructed to iteratively refine the clusters by measuring clustering performance, which in turn facilitates feature extraction. A multi-task joint learning strategy is adopted to optimize the clustering performance of FSDCN. Compared with other deep clustering models, FSDCN has better clustering performance and obtains the most apparent boundaries between clusters, which improves the accuracy of risk evaluation. Using self-defined metrics that evaluate the relationship between flight parameters and latent LOC information, the LOC risk level of each cluster is quantified through statistical analysis. Three high-difficulty maneuver cases are presented to illustrate FSDCN for flight risk evaluation. The spatiotemporal risk distributions for these cases confirm the effectiveness of the proposed FSDCN in clustering flight states.

However, our work has some limitations, and the FSDCN model needs further improvement in future work. Since the flight data used in this paper mainly came from aerobatic flight training in a simulator, the data are free of anomalies such as noise, missing values and bias. Such near-perfect flight data may limit the practical application of FSDCN in risk alarm systems. Hence, future studies will focus on data cleaning to handle noise, missing values and deviations in the data. In addition, other factors associated with the shape of the input variables (e.g. parameter combination, extraction range and collection interval) need further study to clarify their effects on the clustering performance of FSDCN. These efforts will lay the foundation for successful, accurate and efficient data analysis. Finally, the display form of the risk information also needs attention, to clarify its suitability for improving the crew’s situational awareness.

Acknowledgements

This work was supported partly by the National Natural Science Foundation of China under grant number 62103440, and partly by the National Program on Key Basic Research Project (973 Project) under grant number 2015CB755800. The authors would like to thank the editors and the anonymous reviewers for their constructive comments and suggestions that have improved the quality of the note.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

1. Li S, Fan L (2015) Research on risk early-warning model in airport flight area based on information entropy attribute reduction and BP neural network. Int J Secur Appl 9(10):313–322
2. Balachandran S, Atkins EM (2015) Flight safety assessment and management for takeoff using deterministic Moore machines. J Aerosp Inf Syst 12(9):599–615
3. Wei Y, Xu H, Xue Y et al (2020) Quantitative assessment and visualization of flight risk induced by coupled multi-factor under icing conditions. Chin J Aeronaut 33(8):2146–2161
4. Yasue K (2020) Extraction of monophasic data from flight test data via cluster analysis. J Aircr 57(3):399–407
5. Abou-Nasr M, Lessmann S, Stahlbock R et al (2015) Real world data mining applications. Springer, Cham
6. Sharifi F, Mohammed E, Crump T et al (2019) A cluster-based machine learning model for large healthcare data analysis. In: Proceedings of the 5th international joint conference on big data innovations and applications, pp 92–106
7. Hadipour H, Liu C, Davis R et al (2022) Deep clustering of small molecules at large-scale via variational autoencoder embedding and K-means. BMC Bioinform 23(4):132–153
8. Yan Y, He M, Song L (2021) Evaluation of regional industrial cluster innovation capability based on particle swarm clustering algorithm and multi-objective optimization. Complex Intell Syst
9. Li L, Das S, John Hansman R et al (2015) Analysis of flight data using clustering techniques for detecting abnormal operations. J Aerosp Inf Syst 12(9):587–598
10. Nguyen M-H, Alam S (2018) Airspace collision risk hot-spot identification using clustering models. IEEE Trans Intell Transp Syst 19(1):48–57
11. Lishuai L, Gariel M, Hansman RJ et al (2011) Anomaly detection in onboard-recorded flight data using cluster analysis. In: Proceedings of IEEE/AIAA 30th digital avionics systems conference, pp 1–11
12. Rao W, Xia J, Liu W et al (2019) Interval data-based k-means clustering method for traffic state identification at urban intersections. IET Intell Transp Syst 13(7):1106–115
13. Yang W, Li X, Deng Y (2022) A clustering based method to complete frame of discernment. Chin J Aeronaut
14. Sheridan K, Puranik TG, Mangortey E et al (2020) An application of DBSCAN clustering for flight anomaly detection during the approach phase. In: Proceedings of AIAA scitech forum, pp 1851–1871
15. Jiang Q, Liu Y, Ding Z et al (2022) Behavior pattern mining based on spatiotemporal trajectory multidimensional information fusion. Chin J Aeronaut
16. Zhou W, Wang L, Han X et al (2022) A novel density deviation multi-peaks automatic clustering algorithm. Complex Intell Syst
17. Chen L, Guo Q, Liu Z et al (2021) Enhanced synchronization-inspired clustering for high-dimensional data. Complex Intell Syst 7(1):203–223
18. Liu H, Li J, Wu Y et al (2021) Clustering with outlier removal. IEEE Trans Knowl Data Eng 33(6):2369–2379
19. Gao X, Cheng Z, Huo W (2020) Anomaly location method for QAR data based on principal component analysis hierarchical clustering. In: Proceedings of the 1st international conference on materials science and engineering, pp 12085–12095
20. Aslaner HE, Unal C, Iyigun C (2016) Applying data mining techniques to detect abnormal flight characteristics. In: Proceedings of the international conference on society for optical engineering, pp 1–18
21. Zhao W, He F, Li L et al (2018) An adaptive online learning model for flight data cluster analysis. In: Proceedings of IEEE/AIAA 37th digital avionics systems conference, pp 1–7
22. Siłka J, Wieczorek M, Wozniak M (2022) Recurrent neural network model for high-speed train vibration prediction from time series. Neural Comput Appl 34:13305–13318
23. Mittal M, Kobielnik M, Gupta S et al (2022) An efficient quality of services based wireless sensor network for anomaly detection using soft computing approaches. JoCCASA 11(70):1–21
24. Wieczorek M, Siłka J, Wozniak M (2020) Neural network powered COVID-19 spread forecasting model. Chaos Soliton Fract 140:110203–110218
25. Qin K, Wang Q, Lu B et al (2022) Flight anomaly detection via a deep hybrid model. Aerospace 9(6):329
26. Xie J, Girshick R, Farhadi A (2016) Unsupervised deep embedding for clustering analysis. In: Proceedings of the 33rd international conference on machine learning, pp 1–10
27. Zhong H, Wu J, Chen C et al (2021) Graph contrastive clustering. In: Proceedings of IEEE international conference on computer vision and pattern recognition, pp 9204–9213
28. Huang J, Gong S, Zhu X (2020) Deep semantic clustering by partition confidence maximisation. In: Proceedings of IEEE international conference on computer vision and pattern recognition, pp 8846–8855
29. Ren Y, Hu K, Dai X et al (2019) Semi-supervised deep embedded clustering. Neurocomputing 325:121–130
30. Ienco D, Interdonato R (2020) Deep multivariate time series embedding clustering via attentive-gated autoencoder. In: Proceedings of the 24th international conference on advances in knowledge discovery and data mining, pp 318–329
31. Diallo B, Hu J, Li T et al (2021) Deep embedding clustering based on contractive autoencoder. Neurocomputing 433:96–107
32. Li H, Bai Q, Zhao Y et al (2021) TSDCN: traffic safety state deep clustering network for real-time traffic crash-prediction. IET Intell Transp Syst 15(1):132–146
33. Xia H, Luo Y, Liu Y (2021) Attention neural collaboration filtering based on GRU for recommender systems. Complex Intell Syst 7(3):1367–1379
34. Lin T, Goyal P, Girshick R et al (2021) Focal loss for dense object detection. IEEE Trans Pattern Anal Mach Intell 42(2):318–327
35. Xia H, Huang K, Liu Y (2022) Unexpected interest recommender system with graph neural network. Complex Intell Syst
36. Sun J, Sun JA (2016) Real-time crash prediction on urban expressways: identification of key variables and a hybrid support vector machine model. IET Intell Transp Syst 10(5):331–337
37. Xiang J, Chen Z (2018) Traffic state estimation of signalized intersections based on stacked denoising auto-encoder model. Wirel Pers Commun 103(1):625–638
38. Liu H, Chen C, Li Y (2022) Smart metro station systems: data science and engineering. Elsevier, Amsterdam

Title: Flight risk evaluation based on flight state deep clustering network
Authors: Guozhi Wang, Haojun Xu, Binbin Pei, Haoyu Cheng
Publication date: 11 April 2023
Publisher: Springer International Publishing
Published in: Complex & Intelligent Systems / Issue 5/2023
Print ISSN: 2199-4536
Electronic ISSN: 2198-6053
DOI: https://doi.org/10.1007/s40747-023-01053-z

Parameter	\(\hat{x}_{{\text {limit}}}^{{\text {down}}}\)	\(\hat{x}_{{\text {limit}}}^{{\text {up}}}\)	Parameter	\(\hat{x}_{{\text {limit}}}^{{\text {down}}}\)	\(\hat{x}_{{\text {limit}}}^{{\text {up}}}\)
Airspeed V	70 m/s	400 m/s	Roll angle \(\phi \)	\(-150^{\circ }\)	\(150^{\circ }\)
Climb rate \(\dot{H}\)	\(-150\) m/s	150 m/s	Roll rate p	\(-50^{\circ }\)	\(50^{\circ }\)
Angle of attack \(\alpha \)	\(- 20^{\circ }\)	\(30^{\circ }\)	Pitch rate q	\(-25^{\circ }\)	\(25^{\circ }\)
Pitch angle \(\theta \)	\(-90^{\circ }\)	\(90^{\circ }\)	Yaw rate r	\(-45^{\circ }\)	\(45^{\circ }\)

Implementation of FSDCN training
Input: Flight parameters data \({\varvec{x}}\)
Normalization: Data maximum normalization
Initialization: Initialize FSDCN parameters
For epoch = 1 to Epochs
  Case epoch in (0, 0.2Epochs]:
    Update parameters of the sequential multi-attention encoder–decoder network by \( - {\nabla _{{\Omega _{FE}}}}\left( {{L_{FE}}} \right) \)
  Case epoch in (0.2Epochs, 0.6Epochs]:
    Update parameters of the deep cluster layer by \( - {\nabla _{{\Omega _{{\text {cluster}}}}}}\left( {{L_{{\text {cluster}}}}} \right) \)
  Case epoch in (0.6Epochs, Epochs]:
    Update parameters of the overall FSDCN by \( - {\nabla _{\Omega } }(L)\)
End
Output: the class label of the corresponding input
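The staged schedule in the training procedure above can be sketched as a small helper that reports which parameter group is updated in a given epoch. This is only an illustration of the 20%/40%/40% split: the function name, the stage labels and the total of 10 epochs are assumptions, and the actual gradient updates are omitted.

```python
def training_stage(epoch: int, total_epochs: int) -> str:
    """Return the training stage for a 1-based epoch index."""
    if not 1 <= epoch <= total_epochs:
        raise ValueError("epoch must lie in [1, total_epochs]")
    # Integer comparisons avoid floating-point edge cases at the boundaries.
    if 5 * epoch <= total_epochs:        # epoch in (0, 0.2 * Epochs]
        return "feature_extraction"      # update encoder-decoder with L_FE
    if 5 * epoch <= 3 * total_epochs:    # epoch in (0.2 * Epochs, 0.6 * Epochs]
        return "feature_clustering"      # update cluster layer with L_cluster
    return "joint_learning"              # update all parameters with L

# Hypothetical run with 10 epochs in total.
schedule = [training_stage(epoch, 10) for epoch in range(1, 11)]
print(schedule)
```

With 10 epochs, the first two are spent on feature extraction, the next four on feature clustering and the last four on multi-task joint learning.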
