1 Introduction
2 State of the art
3 Machine learning pipeline
The MLPL is configured via the PipelineConfig.json file.

3.1 Feature extraction

The features to be extracted are specified in the features.json file, in which the available features are categorized by domain into statistical, temporal, and spectral features. In addition, the combined domain temporal-spectral is implemented. This extension applies the discrete wavelet transform (DWT) and the Hilbert-Huang transform (HHT) to the input time series data. The DWT decomposes the input signal into individual frequency bands by repeated high-pass and low-pass filtering. The decomposition level determines the number of transformation steps. At each decomposition level, the high-pass-filtered signal components are encoded as wavelet coefficients, while the low-pass-filtered signal components serve as the basis for the subsequent decomposition step. Within the DWT, the wavelet coefficients \(c \left( \tau , s \right)\) are calculated for discrete values of the scaling parameter \(s\) and the shifting parameter \(\tau\) according to Eq. (1).

3.2 Feature selection
A univariate statistical method (uniStat) is based on the determination of the mutual information between the feature vector and the target variable. In addition, logistic regression (LogisRe) with elastic net regularization, which combines L1 and L2 penalty terms, is utilized for feature selection. This regularization technique is particularly suited for a large number of features and a small number of training samples [39]. Lasso regression (Lasso), a shrinkage method, uses an L1 penalty term to shrink the coefficients of irrelevant features to 0. This model-based selection method allows a straightforward selection of the influential features by analyzing the model coefficients [13, 19]. For Lasso feature selection, the Least Angle Regression (LARS) algorithm developed by Efron et al. [9] is used to compute the coefficients, as it calculates all Lasso estimates at high computational efficiency; in particular, LARS shows its advantages on high-dimensional datasets. The fourth method for feature selection consists of the efficient wrapper approach Boruta. It aims at the identification of all relevant features for the prediction task. For this purpose, shadow features exhibiting random values are taken into account in addition to the real features. Finally, the feature selection is performed by comparing the feature importance, given by the underlying random forest, between the real and the shadow features [25].

3.3 ML algorithms and optimization
3.4 Interim conclusion
- feature selection: S = \(\left[ \text {uniStat, LogisRe} \right]\) with \(SW_{\text {fs,prop}} = 2\) (cf. Sect. 3.2)
- scaling method: standardisation
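To illustrate how such a configuration could be persisted, the following sketch serializes the interim-conclusion settings to a hypothetical PipelineConfig.json; all key names are illustrative assumptions, not the authors' actual schema:

```python
import json

# Hypothetical PipelineConfig.json content reflecting the interim
# conclusion above; the key names are assumptions for illustration.
pipeline_config = {
    "feature_selection": {
        "methods": ["uniStat", "LogisRe"],  # selection set S (cf. Sect. 3.2)
        "sw_fs_prop": 2,                    # SW_fs,prop
    },
    "scaling": "standardisation",
}

# Persist the configuration so subsequent MLPL runs can read it back.
with open("PipelineConfig.json", "w") as f:
    json.dump(pipeline_config, f, indent=2)
```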
These settings are stored in the PipelineConfig.json to obtain and select the best-suited models for the quality prediction task.

4 Algorithm for automated parametrization of MLPL
4.1 Algorithm description
- Optimization of feature extraction
  1.1 optStep\(_{\text {DWT}}\): optimization of DWT hyperparameters (basis-wavelet, decomposition level)
  1.2 optStep\(_{\text {HHT}}\): optimization of HHT hyperparameters (number of IMFs)
  2. Optimization of the domain temporal-spectral
  3. Optimization of the domains and window function to use
- Optimization of feature selection
  4. Optimization of the feature selection algorithms, \(SW_{\text {fs,prop}}\), and scaling method to use
- Final run with optimized parameter configuration of the MLPL
  5. Final run and selection of the best-performing models per geomElem and \(SW_{\text {fs,prop}}\)
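The sequence of optimization steps above amounts to iterating the MLPL over all parameter combinations of each step and keeping the best-performing one. A minimal sketch of this loop follows; the table contents, the run_mlpl placeholder, and the single-metric selection are simplifying assumptions, not the authors' implementation:

```python
from itertools import product

# Hypothetical stand-ins for configTable_optSteps and the per-step
# configTable_optStep_runs described in Sect. 4.1.
CONFIG_TABLE_OPT_STEPS = {
    "optStep_DWT": {"basis_wavelet": ["db4", "coif3", "sym5"],
                    "decomposition_level": [3, 4, 5, 6, 7]},
    "optStep_HHT": {"n_imfs": list(range(1, 8))},
}

def run_mlpl(params):
    # Placeholder for one MLPL run: a real run would extract features,
    # train models, and report validation metrics. Here a deterministic
    # toy value stands in for the number of false positives.
    return {"params": params,
            "fp": sum(v for v in params.values() if isinstance(v, int))}

def run_opt_step(step_name):
    """Iterate the MLPL over every parameter combination of one optStep."""
    table = CONFIG_TABLE_OPT_STEPS[step_name]
    names = sorted(table)
    reports = [run_mlpl(dict(zip(names, values)))
               for values in product(*(table[n] for n in names))]
    # Simplified selection: keep the run with the fewest false positives.
    best = min(reports, key=lambda r: r["fp"])
    return best, reports

best_dwt, dwt_reports = run_opt_step("optStep_DWT")
```

The modular table structure mirrors the extensibility described below: adding a new optimization step only requires a further entry with its parameter grid.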
The optimization steps to be executed are specified in the configTable_optSteps. For each optStep, a separate configTable_optStep_runs provides the specified parameter values run_param to be set for the MLPL runs. Accordingly, the MLPL is iterated within an optStep according to the number of parameter combinations of the configTable_optStep_runs, with individually adjusted parameter values. The results for each run are stored as a metrics report, which contains the resulting prediction metrics. Additionally, the MLPL provides the trained models for the corresponding run. The main concept consists of a modular and extensible design, which allows the algorithm to run through additional optimization steps by extending the configTable_optSteps with the corresponding configTable_optStep_runs.

The metricsReports are read for each run to rank and sort the results per geomElem following the developed scoring approach (cf. Sect. 4.2). These results are subsequently used to determine which parameter values lead to the highest performance metrics. Finally, the PipelineConfig is reparameterized for the next optStep based on the identified parameters.

optStep\(_{\text {DWT}}\) thus envisages runs with altered basis-wavelets as well as decomposition levels. Included are the wavelet families Daubechies, Coiflet, Symlet, and biorthogonal with 9 shapes, and the decomposition levels 3–7. Owing to the different processes resulting from the individual geometry of each geomElem, individual DWT hyperparameters are selected for each geomElem, which leads to the parameterization of the individually created features.json configuration files. The same applies to the HHT, in which the number of IMFs (in this case 1–7) is selected individually for each geomElem. Following the determination of the individual hyperparameters, a subsequent decision is required on whether features extracted using the HHT should be considered in addition to DWT-based features. Preliminary tests showed that the DWT is considerably more powerful than the HHT in terms of prediction performance, and the combination of both time-frequency methods yielded no improvement in the results. Since it cannot be excluded that in particular cases the additional use of HHT features may achieve better performance, the described optStep 2 is included as well. The final step of optimizing the feature extraction is to select which domains should be included in the model building process. For the spectral domain, the Hanning, Hamming, and Blackman window functions are additionally examined to improve the quality of the spectral analysis [32]. The selection of the domains to be considered is based on different subsets of the available domains. The subsets consist of each domain individually (4 subsets), the combinations of temporal-spectral with the other domains (3 subsets), as well as all domains together (1 subset).

4.2 Scoring and parameter value selection
The scoring is based on the metricsReports created for each geomElem, which contain the classification performance measures accuracy (ACC), precision (PREC), recall (REC), and specificity (SPEC) [12]. Additionally, the ROC AUC and the number of false positive predictions (FP) are reported. The metrics within the metricsReports are determined on the validation set. To identify the best parameters, a rank is assigned to each model. The ranks are determined using the number of FP predictions, the values for specificity and accuracy, as well as the ROC AUC. A lower number of FP as well as higher values of the remaining metrics lead to a better rank and thus to a preferred selection as the final model of a geomElem. Due to the intended application in quality prediction, FP predictions are considered particularly critical in a production environment. They lead to further processing and assembly of a workpiece that has been manufactured in violation of its tolerances. As a result, it may not fulfill its function and may not be able to withstand the operating loads acting on it. For this reason, the ranks belonging to the number of FP, sorted in ascending order, form the first criterion for selecting the best-performing models. If the number of FP is equal for several models, the subsequent sorting criterion considers the sum of ranks across all metrics. If this is still insufficient for an unambiguous identification, the sorting procedure takes into account the ranks of ROC AUC, specificity, and accuracy for decision-making. This individual analysis per geomElem allows the hyperparameters of the time-frequency feature extraction methods to be determined and adjusted. Preliminary tests have shown that this individualized consideration yields significant improvements in model performance, whereas no improvements were obtained by an individualized evaluation of the other optSteps.
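The rank-based selection just described can be sketched as follows; the metricsReports field names and the exact rank aggregation are assumptions, and the sample values are invented for illustration:

```python
# Sketch of the rank-based model selection of Sect. 4.2 (simplified).
def select_best_model(reports):
    """Pick the best model report: fewest FP first, then the sum of
    ranks across all metrics, then ROC AUC, specificity, and accuracy
    ranks as tie-breakers."""
    def rank(metric, reverse):
        # Map each distinct metric value to its rank (0 = best).
        ordered = sorted({r[metric] for r in reports}, reverse=reverse)
        return {v: i for i, v in enumerate(ordered)}

    # Lower FP is better; higher is better for the remaining metrics.
    fp_rank = rank("fp", reverse=False)
    metric_ranks = {m: rank(m, reverse=True)
                    for m in ("roc_auc", "spec", "acc")}

    def sort_key(r):
        ranks = [metric_ranks[m][r[m]] for m in ("roc_auc", "spec", "acc")]
        return (fp_rank[r["fp"]],           # 1st: FP rank
                sum(ranks) + fp_rank[r["fp"]],  # 2nd: sum of all ranks
                *ranks)                     # 3rd: ROC AUC, SPEC, ACC
    return min(reports, key=sort_key)

# Invented example reports for three candidate models of one geomElem.
reports = [
    {"model": "RF",  "fp": 2, "roc_auc": 0.97, "spec": 0.95, "acc": 0.96},
    {"model": "SVM", "fp": 1, "roc_auc": 0.95, "spec": 0.93, "acc": 0.94},
    {"model": "kNN", "fp": 1, "roc_auc": 0.96, "spec": 0.92, "acc": 0.95},
]
best = select_best_model(reports)
```

In this example, SVM and kNN tie on the FP criterion, so the sum of ranks decides between them.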
To reduce complexity and preserve comprehensibility, the subsequent optSteps are evaluated globally across all geometric elements. For each parameter combination, the resulting sum of FPs on the validation set, predicted by the previously determined best-performing models per run, is calculated across the entire workpiece, i.e., across all geometric elements. The parameter combination that leads to the lowest number of FPs is finally selected.

5 Results
5.1 Datasets
ID | Machine tool | Number of workpieces | Production period |
---|---|---|---|
\(\text {DS}_\text {DMG1}\) | DMC 850V | 200 | 07.2020–08.2020 (summer) |
\(\text {DS}_\text {GROB}\) | G350 | 200 | 02.2021 (winter) |
\(\text {DS}_\text {DMG2}\) | DMC 850V | 200 | 03.2022 (winter/spring) |
\(\text {DS}_\text {DMG3}\) | DMC 850V | 392 | 09.2020–10.2020 (summer) |
5.2 Results of the automated parametrization
As a result of the automated parametrization, solely the temporal-spectral domain without the HHT is considered for the prediction model development. The basis-wavelets and decomposition levels selected individually for each geomElem thus provide the highest information density for quality prediction regarding the analyzed datasets. These best-performing hyperparameter values used for the DWT furthermore differ among the geomElems and datasets. This supports the design of the algorithm toward individual parameter identification. In addition, the optimization algorithm selects different model algorithms for each geometric element, which leads to the conclusion that multiple model algorithms are required for the most accurate prediction.

ID | ACC in % | PREC in % | REC in % | SPEC in % | ROC AUC in % |
---|---|---|---|---|---|
\(\text {DS}_\text {DMG1}\) | 0.38 | 0.55 | − 0.09 | 3.81 | 1.62 |
\(\text {DS}_\text {DMG2}\) | 0.75 | 1.25 | − 0.45 | 10.51 | 2.12 |
\(\text {DS}_\text {GROB}\) | 2.63 | 2.40 | 0.35 | 7.74 | 3.30 |
\(\text {DS}_\text {DMG3}\) | 3.21 | 1.47 | 2.60 | 5.13 | 3.14 |
ID | ACC in %, opt. (non-opt.) | PREC in %, opt. (non-opt.) | SPEC in %, opt. (non-opt.) | ROC AUC in %, opt. (non-opt.) | FP, opt. (non-opt.) |
---|---|---|---|---|---|
\(\text {DS}_\text {DMG1}\) | 97.47 (95.60) | 98.77 (97.17) | 94.79 (87.95) | 97.32 (93.79) | 0.47 (1.13) |
\(\text {DS}_\text {DMG2}\) | 98.00 (90.00) | 99.66 (96.77) | 98.56 (93.75) | 98.85 (97.43) | 0.13 (1.00) |
\(\text {DS}_\text {GROB}\) | 92.00 (78.00) | 98.49 (73.33) | 96.74 (68.00) | 94.92 (86.40) | 0.40 (2.07) |
\(\text {DS}_\text {DMG3}\) | 93.73 (77.84) | 99.45 (98.58) | 97.74 (96.57) | 97.12 (91.24) | 0.43 (0.71) |