
Open Access 09.09.2024 | Original Article

Efficient inverse design optimization through multi-fidelity simulations, machine learning, and boundary refinement strategies

Written by: Luka Grbcic, Juliane Müller, Wibe Albert de Jong

Published in: Engineering with Computers | Issue 6/2024


Abstract

This paper introduces a methodology designed to augment the inverse design optimization process in scenarios constrained by limited compute, through the strategic synergy of multi-fidelity evaluations, machine learning models, and optimization algorithms. The proposed methodology is analyzed on two distinct engineering inverse design problems: airfoil inverse design and the scalar field reconstruction problem. In each optimization cycle, it leverages a machine learning model trained with low-fidelity simulation data to predict a target variable and to decide whether a high-fidelity simulation is necessary, which notably conserves computational resources. Additionally, the machine learning model is deployed prior to optimization to compress the design space boundaries, thereby further accelerating convergence toward the optimal solution. The methodology has been employed to enhance two optimization algorithms, namely Differential Evolution and Particle Swarm Optimization. Comparative analyses illustrate performance improvements across both algorithms. Notably, this method is adaptable to any inverse design application, facilitating a synergy between a representative low-fidelity ML model and high-fidelity simulation, and can be applied to any population-based optimization algorithm.
Notes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

1 Introduction

Inverse design problems represent a frontier in the field of engineering and science, where the objective is to discover the necessary system inputs to achieve a desired known output. Rather than following the traditional forward design process, which starts with given parameters and attempts to predict the outcome, inverse design reverses the procedure: it begins with the desired outcome and works backward to determine the optimal parameters to realize it. Particularly in scenarios with computationally expensive or hierarchical simulations, multi-fidelity evaluations play a pivotal role, offering a trade-off between accuracy and computational cost.
Multi-fidelity (MF) methods, which range from fast, approximate low-fidelity (LF) objective function evaluations to detailed, computationally intensive high-fidelity (HF) ones, have been explored in depth for optimization purposes [5, 19, 21, 49]. In the context of inverse design optimization, which is the focus of this work, coupled with a multi-fidelity approach, an additional layer of complexity is introduced when the target output is a distribution. Bayesian and surrogate-based optimization methods have provided great insights in this specific domain, especially when the inverse design problem is rooted in uncertainty or when prior knowledge is available [18, 44, 55]. However, due to the curse of dimensionality, these optimization approaches can encounter computational challenges.
Variable-fidelity methods have further enhanced the efficiency of inverse design optimization by adapting the fidelity level dynamically based on the current stage of the optimization process [19]. This ensures a balance between computational efficiency and solution accuracy, leading to faster convergence rates and reduced computational costs. The success of variable-fidelity methods is evident in their widespread application across various engineering disciplines [5, 22, 24]. These methods usually either employ LF models during the initial exploratory phases and gradually transition to HF models as the solution converges, or use an adaptive mechanism for fidelity selection. Such mechanisms include monitoring convergence conditions [36, 40], correction techniques [13, 20], space mapping [54], and model error monitoring [32, 45]. Furthermore, the dominant surrogate model algorithms used in variable-fidelity optimization approaches are Kriging (or Gaussian Process Regression), Co-Kriging, Polynomial Chaos Expansion (PCE), and Moving Least Squares [19]. Additionally, Deep Neural Networks (DNN) are commonly used for multi-fidelity inverse design as a surrogate model for optimization purposes ([14, 17]), but not as a part of the variable-fidelity optimization mechanism.
LF warm-start optimization techniques have also demonstrated their efficacy in improving the convergence of optimization algorithms [25, 51]. By initializing the optimization process with solutions generated with LF machine learning (ML) models, warm starting leverages prior knowledge to reduce the number of evaluations required to reach optimal solutions. This approach is particularly useful in scenarios where optimization problems or targets share similarities, such as in iterative design processes or when dealing with parametric variations [10, 12, 37].
The essence of MF simulations is to harmoniously integrate models of varying accuracy and computational expense. By leveraging the strengths of both HF simulations and LF ML models, it is possible to achieve accurate solutions while conserving computational resources. This is especially pivotal in scenarios where computational budgets are limited, but the accuracy cannot be compromised. Kriging, Co-Kriging, and PCE have the major benefit of providing reliable uncertainty estimates; however, their computational complexity grows with the amount of data, so they do not scale well ([29, 38]), and they require retraining when additional data is available. Furthermore, most methods switch between LF and HF simulations, yet the execution time of the LF simulation could also be non-trivial.
Hence, in this paper, in order to tackle the aforementioned issues, an innovative inverse design framework is presented and investigated. The framework integrates metaheuristic algorithms with a pre-trained LF ML model used for design approximation and decision making in order to discern whether there is a need for an HF simulation. This decision is made by evaluating the discrepancy between the model's predicted approximation and the inverse design target value. This innovation stands out from prior research in that the ML model, trained on LF data, predicts whether an HF simulation is necessary. Additionally, the other, equally important task of the LF ML model is design space boundary refinement before the optimization process starts. This can be considered a form of optimization process warm starting, and the purpose is to enhance the rate of convergence of the inverse design optimization algorithms. Two different strategies for boundary refinement are investigated and separately applied to two different problems. Finally, the ultimate benefit of this framework is that the LF ML model can enhance the optimization for any target design within its applicable domain, thereby substantially extending its reach and impact. This is possible since the ML models that are investigated are DNNs and Gradient Boosting (GB) algorithms. Both of these algorithm types can be continually trained (provided there is not a large data distribution shift), and scale well with additional data.
The metaheuristic algorithms used within the inverse design framework are Particle Swarm Optimization (PSO) and Differential Evolution (DE). Even though any kind of optimization algorithm could be incorporated into the framework, PSO and DE were chosen since they generally require a high number of evaluations, and they have been previously used for similar tasks [7, 50]. To the best of the authors’ knowledge, no approaches in the literature combine metaheuristic algorithms with ML techniques for both boundary refinement and MF optimization within a single framework.
The importance of this research is emphasized by its application to airfoil inverse design (AID) and scalar field reconstruction (SFR) challenges. The AID problem is chosen since it occupies a pivotal role in engineering, particularly in the realm of aeronautics [4, 27] and wind energy generation [56], and it has been extensively studied in the field of multi-fidelity inverse design optimization [16, 27, 36, 39, 41, 50, 59, 66]. The SFR problem emerges as an inverse design challenge across various scientific and engineering domains, representing a specific variant of the inverse boundary value problem [60]. This problem centers on deducing the distribution of a scalar field from sparse measurements [3, 23, 28, 46, 53, 63–65]. Solving the SFR problem utilizing optimization algorithms has been of research interest [8, 9, 33, 57].
Finally, the goals of this research are to: (i) show that the proposed framework can accelerate the rate of convergence of both optimization algorithms, and on both inverse design tasks, (ii) show that the LF ML models can be re-used when the inverse design target is changed, without retraining, and (iii) quantify the amount of data needed for the LF ML models to be of use through detailed analysis.
The manuscript is structured to offer a thorough understanding of the research. Following the introduction, Sect. 2 delves into the ML-enhanced inverse design framework. Sections 3 and 4 provide an in-depth examination of the AID and the SFR problems, respectively, as well as their boundary refinement strategies. Finally, Sect. 5 presents a comprehensive discussion of the results of the ML model, techniques for boundary refinement, and a meticulous analysis of the ML-enhanced framework, contrasting it with traditional optimization algorithms.

2 ML-enhanced inverse design framework

In this section we introduce our ML-enhanced inverse design framework. The methodology consists of two primary stages: first, training an ML model and applying it to refine the boundaries of the optimization problem, which accelerates the search and improves the rate of convergence; and second, executing the ML-enhanced optimization process to find the design corresponding to the target performance. The framework uniquely combines LF simulation data for ML model training with HF simulations for optimization, creating a versatile MF system. Once trained, the ML model can be utilized to augment the inverse design for a given problem. This, however, is applicable only to solutions that lie within the boundaries of the dataset used for training the model. The general workflow of the inverse design framework and its components is displayed in Fig. 1.

2.1 Inverse design optimization and objective function definition

The inverse design problem can be mathematically articulated as
$$\begin{aligned} \textbf{x} = f^{-1}(y) \end{aligned}$$
(1)
Here, \(\textbf{x} \in \mathbb {R}^m\) represents the design parameters and \(y \in \mathbb {R}^q\) is the known target value. The objective is to ascertain the design parameters \(\textbf{x}\) that produce \(y\) under the objective function \(f: \mathbb {R}^m \rightarrow \mathbb {R}^q\), which amounts to evaluating the inverse of \(f\) at \(y\). This is in stark contrast to forward problems, where \(y\) is typically unknown. Inverse design problems tend to be ill-posed, commonly encountering the problem of multiple viable solutions, which complicates the process of identifying a unique and optimal solution.
In Eq. (2) the inverse design optimization problem is defined as
$$\begin{aligned} \begin{aligned}&\underset{\textbf{x}}{\text {minimize}} & \varepsilon (f(\textbf{x}), y) \\&\text {subject to} & \textbf{x}_{lb} \le \textbf{x} \le \textbf{x}_{ub} \end{aligned} \end{aligned}$$
(2)
Here, \(\varepsilon : \mathbb {R}^q \times \mathbb {R}^q \rightarrow \mathbb {R}\) is the error-based objective function that returns a scalar value in \(\mathbb {R}\). The goal of this optimization problem is to minimize the discrepancy between the desired output \(y\) and the outcome derived from the proposed design \(\textbf{x}\). The design parameters \(\textbf{x}\) are constrained within a compact design space, defined by the lower and upper boundaries \(\textbf{x}_{lb} \in \mathbb {R}^m\) and \(\textbf{x}_{ub} \in \mathbb {R}^m\) respectively, which represent the feasible range of the design variables.
In this study, the root mean square error is used as \(\varepsilon\):
$$\begin{aligned} \begin{aligned}&\underset{\textbf{x}}{\text {minimize}} & \varepsilon (\textbf{x}) = \sqrt{\frac{1}{q} \Vert \textbf{P}^{C}(\textbf{x}) - \textbf{T}\Vert _2^2} \\&\text {subject to} & \textbf{x}_{lb} \le \textbf{x} \le \textbf{x}_{ub}, \end{aligned} \end{aligned}$$
(3)
where \(\textbf{x} = (x_1, \ldots , x_m)^T\) is the optimization design vector in the decision space \(\mathbb {R}^m\), \(m\) is the dimension of the general optimization design vector, \(\textbf{P}^{C}(\textbf{x}) = (P_1^C(\textbf{x}), \ldots , P_q^C(\textbf{x}))^T \in \mathbb {R}^q\) denotes the computed performance vector based on the design vector \(\textbf{x}\), while \(\textbf{T} = (T_1, \ldots , T_q)^T \in \mathbb {R}^q\) signifies the user-defined target performance vector. Both \(\textbf{P}^{C}(\textbf{x})\) and \(\textbf{T}\) are of dimension \(q\). Ideally, an exact match between these values would result in an objective function value of zero.
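As a minimal sketch of Eq. (3), the objective can be written as a short Python function. The function name `rmse_objective` and the sample vectors are illustrative assumptions; in the framework, the computed vector would come from a simulation rather than a hard-coded list.

```python
import math

def rmse_objective(computed, target):
    """Eq. (3): root mean square error between P^C(x) and T (both of length q)."""
    if len(computed) != len(target):
        raise ValueError("computed and target vectors must both have length q")
    q = len(target)
    squared_norm = sum((p - t) ** 2 for p, t in zip(computed, target))
    return math.sqrt(squared_norm / q)

# Hypothetical values for illustration only (q = 4).
target = [1.0, 0.5, -0.2, 0.0]
computed = [1.0, 0.5, -0.2, 0.0]
print(rmse_objective(computed, target))  # an exact match gives zero
```

An exact match between the computed and target vectors yields an objective value of zero, as stated above.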

2.2 ML-enhanced optimization

As shown in Fig. 1, the requirement for the ML-enhanced optimization process is to train an ML model using simulation data generated with Latin Hypercube Sampling (LHS). Each simulation yields an input design vector and a simulation result vector. These design vectors, which are of the same size and within the same bounds as those evaluated during the optimization process, are used as inputs to the ML model. The outputs are statistical measures (mean, minimum, maximum, etc.) derived from the simulation result vectors. Once trained, this ML model can be reused within the inverse design framework when the target performance changes. The details of the ML model can be found in Sect. 2.4.
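The sampling step can be illustrated with a small, dependency-free Latin Hypercube sketch. This is not the sampler used by the authors (the source does not name an implementation); the function name `latin_hypercube` and the example bounds are assumptions made for illustration.

```python
import random

def latin_hypercube(n_samples, lower, upper, seed=None):
    """Draw n_samples design vectors so that, in every dimension,
    each of the n_samples equal-width strata contains exactly one point."""
    rng = random.Random(seed)
    m = len(lower)
    samples = [[0.0] * m for _ in range(n_samples)]
    for j in range(m):
        strata = list(range(n_samples))
        rng.shuffle(strata)  # random pairing of strata across dimensions
        width = (upper[j] - lower[j]) / n_samples
        for i, s in enumerate(strata):
            # uniform draw inside stratum s of dimension j
            samples[i][j] = lower[j] + (s + rng.random()) * width
    return samples

# Hypothetical two-dimensional design space.
designs = latin_hypercube(5, lower=[0.0, -1.0], upper=[1.0, 1.0], seed=0)
for x in designs:
    print(x)
```

Each sampled design vector would then be passed to the LF simulation, and the chosen statistical measure of the result vector becomes the training label.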
The ML model has two main tasks: (i) refine the lower and upper optimization boundaries (denoted as \(\textbf{lb}_{R}\) and \(\textbf{ub}_{R}\), respectively), and (ii) decide whether to run an HF simulation based on a design \(\textbf{x}\) being evaluated during the optimization process. The details of the boundary refinement procedure are given in Sect. 2.3. Furthermore, the pseudo-code of the ML-enhanced optimization process (ii), and all the necessary parameters are detailed in Alg. 1.
Once the ML model is trained, the optimization process begins by defining a target vector (\(\textbf{T}\)) and deriving a target scalar value (\(T _{info} \in \mathbb {R}\)), where \(T _{info} = \frac{1}{q} \sum _{i=1}^{q} T_i\) (the mean of \(\textbf{T}\)) or \(T _{info} = \max (\textbf{T})\) (the maximum of \(\textbf{T}\)), depending on the application case in this study. During each evaluation of the objective function \(\varepsilon\) (defined in Eq. 2), the pre-trained ML model (\(M(\textbf{x})\)) predicts the value (\(M _{info} \in \mathbb {R}\)) for a given optimization design vector \(\textbf{x}\). The scalar values \(M _{info}\) and \(T _{info}\) must represent the same statistically derived quantities in the space \(\mathbb {R}\), ensuring consistency in the comparison of predicted and target performance metrics. The absolute error (\(\Delta\)) between the ML predicted value (\(M _{info}\)) and the target scalar value (\(T _{info}\)) is then computed. If \(\Delta\) exceeds a pre-established threshold (\(\omega\)), the HF simulation is skipped and the objective function (\(\varepsilon\)) is assigned the penalty value \(\lambda \cdot e^\Delta\) (\(\lambda = 2\)). If \(\Delta\) is less than or equal to \(\omega\), \(\varepsilon\) is evaluated with an HF simulation and the result is compared with \(\textbf{T}\) through a discrepancy metric.
The threshold parameter (\(\omega\)) is calculated using a user-defined scaling factor (c) and the error of the ML model (\(\epsilon _ M\)), such as root mean square error or mean absolute error obtained through ML model analysis. In regions of the design space where the ML model is less accurate, the framework tends to focus more on exploitation rather than exploration. The initial design vector \(\textbf{x}_{init} \in \mathbb {R}^m\) is a vector randomly initialized within the optimization boundaries (\(\textbf{lb}_{R}\) and \(\textbf{ub}_{R}\)) by the optimization algorithms. The remaining budget (RB) denotes the remaining HF simulation budget, which is used as a comparison metric with unenhanced optimization algorithms. A higher RB value indicates enhanced performance, reflecting increased computational efficiency and savings. The optimization process stops when the simulation budget is exceeded.
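The gating logic described above can be sketched as follows. Alg. 1 is not reproduced in this text, so the sketch makes two labeled assumptions: the threshold is taken as the product \(\omega = c \cdot \epsilon _M\), and the names `target_info`, `gated_objective`, `ml_model`, and `hf_objective` are hypothetical stand-ins rather than the authors' code.

```python
import math

def target_info(target, mode="mean"):
    """Derive the scalar T_info from the target vector T (mean or maximum)."""
    return sum(target) / len(target) if mode == "mean" else max(target)

def gated_objective(x, ml_model, hf_objective, t_info, eps_m, c=1.0, lam=2.0):
    """Return (objective value, True if an HF simulation was run)."""
    m_info = ml_model(x)            # LF-trained ML prediction M_info
    delta = abs(m_info - t_info)    # absolute error Delta
    omega = c * eps_m               # ASSUMPTION: threshold = scaling factor * ML error
    if delta > omega:
        # Prediction is too far from the target: skip the HF simulation
        # and return the penalty lambda * exp(Delta).
        return lam * math.exp(delta), False
    # Prediction is close enough: spend one HF simulation.
    return hf_objective(x), True

# Toy stand-ins for illustration only.
toy_model = lambda x: sum(x)
toy_hf = lambda x: 0.123
t_info = target_info([0.5, 1.0, 1.5])  # mean of the target vector = 1.0
print(gated_objective([0.5, 0.6], toy_model, toy_hf, t_info, eps_m=0.1, c=2.0))
print(gated_objective([2.0, 0.0], toy_model, toy_hf, t_info, eps_m=0.1, c=2.0))
```

A caller that tracks the remaining budget (RB) would decrement it only when the second return value is true, since penalized evaluations consume no HF simulation.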
The proposed ML-enhanced inverse design framework can be used in conjunction with any population-based global optimization algorithm. In this study, it is investigated how the ML model enhances two population-based algorithms, namely, DE and PSO. The fundamental goal of this framework is to enhance the robustness and efficiency of the optimization algorithms through the use of ML-generated boundary refinement and ML-guided evaluation of HF simulations.

2.3 ML-generated boundary refinement

The ML model is used to narrow down the optimization boundaries through a boundary refinement method, shown in Alg. 2. This approach aims to significantly minimize the demand for computational resources, an essential factor when operating within a stringent computational budget. The requirements for the boundary refinement are \(T _{info}\) and \(M _{info}\) values. The objective of each of the \(N\) optimization runs in the algorithm is to minimize the absolute difference between these two values. The \(N\) value is predefined by the user and ultimately will determine the number of optimization solutions that will be used to refine the boundaries for the ML-enhanced inverse design framework. More specifically, to narrow down the boundaries through the boundary refinement method, it is necessary to determine the optimal solution defined as in Eq. (4):
$$\begin{aligned} \begin{aligned}&\textbf{x}^* = \underset{\textbf{x}}{\text {argmin}} \, | M ({\textbf {x}}) - T _{info}| \\&\text {subject to} \quad \textbf{x}_{lb} \le \textbf{x} \le \textbf{x}_{ub} \end{aligned} \end{aligned}$$
(4)
The solution vector \(\textbf{x}^* \in \mathbb {R}^m\) represents an optimized design based on the absolute difference between \(T _{info}\) and \(M _{info}\) (that is predicted by \(M ({\textbf {x}})\)).
Given the inherent ill-posedness of most inverse design problems, this optimization procedure is repeatedly executed, resulting in a matrix of optimal solutions \(\textbf{S}\). Repeating the optimization process N times yields various solutions due to the multi-modal landscape and the stochastic nature of the used optimizer (DE), which converges to different local optima. The condition in Eq. (4) is especially sensitive to this because it relies on partial information (ML prediction of a single scalar value instead of a complete array), further reducing the fidelity of the ML model trained with LF simulation data.
More specifically, each row of \(\textbf{S}\) represents one of the \(N\) optimal solutions, while each column corresponds to one of the \(m\) design variables, as defined in:
$$\begin{aligned} \textbf{S} = \begin{bmatrix} \textbf{x}^{*}_{1} \\ \textbf{x}^{*}_{2} \\ \vdots \\ \textbf{x}^{*}_{N} \end{bmatrix} = \begin{bmatrix} x^{*}_{1,1} & x^{*}_{1,2} & \cdots & x^{*}_{1,m} \\ x^{*}_{2,1} & x^{*}_{2,2} & \cdots & x^{*}_{2,m} \\ \vdots & \vdots & \ddots & \vdots \\ x^{*}_{N,1} & x^{*}_{N,2} & \cdots & x^{*}_{N,m} \end{bmatrix} \end{aligned}$$
(5)
where \(x^*_{N,m}\) is the value of design variable \(m\) in the \(N^{th}\) optimized solution, and \(\textbf{x}^{*}_{N}\) is the \(N^{th}\) solution vector.
The obtained solutions within the matrix \(\textbf{S}\) are then subjected to statistical processing that depends on the specific inverse design problem being solved (see Sects. 3 and 4), which yields the compressed lower and upper boundaries \(\textbf{lb}_{R}\) and \(\textbf{ub}_{R}\), respectively.
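The construction of \(\textbf{S}\) can be sketched as repeated minimization of \(| M ({\textbf {x}}) - T _{info}|\). The source uses DE for these runs; to keep the example short and dependency-free, a plain random search is swapped in here, and the names `minimize_gap` and `solution_matrix` as well as the toy model are assumptions.

```python
import random

def minimize_gap(ml_model, t_info, lower, upper, rng, n_trials=2000):
    """One run of Eq. (4), using random search as a stand-in for DE."""
    best_x, best_gap = None, float("inf")
    for _ in range(n_trials):
        x = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
        gap = abs(ml_model(x) - t_info)
        if gap < best_gap:
            best_x, best_gap = x, gap
    return best_x

def solution_matrix(ml_model, t_info, lower, upper, n_runs, seed=0):
    """Stack N optimized solutions row-wise to form the matrix S of Eq. (5)."""
    rng = random.Random(seed)
    return [minimize_gap(ml_model, t_info, lower, upper, rng) for _ in range(n_runs)]

# Toy model with infinitely many solutions (every x with x1 + x2 = 1),
# which mimics the ill-posedness discussed above.
toy_model = lambda x: x[0] + x[1]
S = solution_matrix(toy_model, t_info=1.0, lower=[0.0, 0.0], upper=[2.0, 2.0], n_runs=5)
for row in S:
    print(row)
```

Because the toy problem has many equally good solutions, the rows of `S` differ from run to run, which is exactly the spread that the subsequent statistical processing exploits.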

2.4 ML model

The ML model \(M ({\textbf {x}})\) takes as input the optimization design vector \(\textbf{x}\) and maps it to \(M _{info}\), which is then compared with the \(T _{info}\) value, thereby minimizing the necessity for HF simulations. The performance of three different ML algorithms is analyzed within this methodology: a DNN and two GB algorithms, LightGBM (LGB) [34] and XGBoost (XGB) [11]. XGBoost (eXtreme Gradient Boosting) is a scalable tree boosting framework that effectively integrates a sparsity-aware algorithm alongside a weighted quantile sketch, thereby facilitating an approximate tree learning process. The combination of cache access patterns, elevated data compression, and sharding empowers XGBoost to construct an efficient and powerful tree boosting system.
LightGBM (LGB) is a robust and efficient gradient boosting framework aimed at enhanced performance and speed. It incorporates innovative strategies such as gradient-based one-side sampling and exclusive feature bundling to expedite processing and improve efficiency. LightGBM has a unique leaf-wise tree growth strategy, which deviates from the conventional level-wise approach seen in other boosting algorithms, and contributes to improved model accuracy by minimizing loss, thereby achieving faster convergence.
A DNN configured as an MLP is fundamentally composed of three distinct types of layers: the input, hidden, and output layers. These layers are constituted by artificial neuron nodes. The MLP model can incorporate multiple hidden layers as part of its neural architecture. Each neuron residing within the hidden and output layers utilizes a nonlinear activation function, echoing the complex processing mechanisms observed in the human brain [48]. This structure effectively facilitates the MLP’s ability to model and solve intricate nonlinear problems.
The accuracy of all trained ML models was assessed using the RMSE (Eq. (6)) since it is used to evaluate the \(\omega\) value within the ML-enhanced framework (as shown in Alg. 1).
$$\begin{aligned} \textit{RMSE} = \sqrt{\frac{\sum _{l=1}^{L} (y_{l} - \hat{y}_{l})^2}{L}} \end{aligned}$$
(6)
The variables \(y_l\), \(\hat{y}_l\), and L represent the \(l^{th}\) actual value, the \(l^{th}\) ML model prediction, and the test set size, respectively. More specifically, the variable \(y_l\) represents the \(l^{th}\) data point that is the result of the \(l^{th}\) LF simulation, and \(y_l\) must represent the same statistically derived information as the \(T _{info}\) value. The K-Fold cross-validation procedure (\(k=5\)) was used to evaluate the accuracy and uncertainty of all three investigated algorithms. For the K-Fold analysis of the ML model, the test set size L is varied as 500, 1000, 5000, and 15000.
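A compact sketch of the RMSE of Eq. (6) combined with a 5-fold split is given below. The helper names `rmse`, `kfold_rmse`, and `fit_mean`, and the toy mean-predictor used in place of a DNN or GB model, are illustrative assumptions; the spread of the per-fold scores serves as a simple uncertainty indicator.

```python
import math
import random
import statistics

def rmse(actual, predicted):
    """Eq. (6): root mean square error over L test points."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def kfold_rmse(X, y, fit, k=5, seed=0):
    """Return one RMSE per fold; fit(X_train, y_train) must return a predictor."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint index sets
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        preds = [model(X[i]) for i in test_idx]
        scores.append(rmse([y[i] for i in test_idx], preds))
    return scores

def fit_mean(X_train, y_train):
    """Toy stand-in model: always predicts the training mean."""
    mean = sum(y_train) / len(y_train)
    return lambda x: mean

# Hypothetical data for illustration only.
X = [[float(i)] for i in range(20)]
y = [0.1 * i for i in range(20)]
scores = kfold_rmse(X, y, fit_mean)
print(statistics.mean(scores), statistics.stdev(scores))
```

The mean of the fold scores is the kind of error estimate that could be used as \(\epsilon _M\) when setting the threshold \(\omega\).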

2.5 Metaheuristic optimization algorithms

Two distinct metaheuristic optimization algorithms will be compared: Particle Swarm Optimization (PSO) and Differential Evolution (DE). These algorithms belong to the broader categories of swarm intelligence and evolutionary algorithms, respectively. Fundamentally, these categories rely on populations of agents that abide by specific rules to identify optimal solutions. Using both PSO and DE will demonstrate the general applicability of the ML-enhancement.
PSO is a population-based stochastic optimization algorithm, inspired by the social behavior of bird flocking or fish schooling [35]. In PSO, each individual particle in the swarm population represents a solution in the design space. Every particle updates its position based on its local best position, as well as the global best solution of the swarm. This cooperative search process, conducted through the iterative adjustment of velocities and positions increases the rate of convergence of the swarm towards the local or global optimum.
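The position update described above can be sketched with the textbook PSO rule. This is a generic formulation and not necessarily what the Indago module implements; the function name `pso_step` is an assumption, while the default inertia of 0.8 and the cognitive and social rates of 1 follow the settings reported for this study.

```python
import random

def pso_step(position, velocity, personal_best, global_best, rng,
             inertia=0.8, cognitive=1.0, social=1.0):
    """Advance one particle by one iteration of the textbook PSO update."""
    new_position, new_velocity = [], []
    for x, v, p, g in zip(position, velocity, personal_best, global_best):
        r1, r2 = rng.random(), rng.random()
        # inertia term + pull toward the particle's own best + pull toward the swarm's best
        v_next = inertia * v + cognitive * r1 * (p - x) + social * r2 * (g - x)
        new_velocity.append(v_next)
        new_position.append(x + v_next)
    return new_position, new_velocity

# Hypothetical particle in a two-dimensional design space.
rng = random.Random(0)
x, v = pso_step([0.0, 0.0], [0.1, -0.1], [0.5, 0.5], [1.0, 1.0], rng)
print(x, v)
```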
DE is a population-based stochastic search technique, commonly used for global optimization problems over continuous optimization design vectors [58]. In DE, the potential solutions are evolved over time via a simple arithmetic operation: a combination of mutation, crossover, and selection operations. Each individual in the population is a potential solution, and the evolution of these individuals is performed based on the differences between randomly sampled pairs of individuals within the population. The differential evolution of the population ensures a good rate of convergence; however, converging to a global optimum is not guaranteed. The success-history-based parameter adaptation (SHADE) variant of DE is used in this investigation. In the SHADE variant, the scaling factor and crossover rate are adaptively adjusted for each individual in the population based on a history of successful parameters. This dynamic adaptation allows for more effective exploration and exploitation of the design space, potentially improving the performance of the algorithm.
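The mutation, crossover, and selection steps can be sketched with the classic DE/rand/1/bin scheme. Note that this is deliberately the basic variant with a fixed scaling factor and crossover rate, not the SHADE variant used in the study; the function name `de_generation` and the values `f=0.5` and `cr=0.9` are assumptions for illustration.

```python
import random

def de_generation(population, objective, lower, upper, rng, f=0.5, cr=0.9):
    """One generation of classic DE/rand/1/bin (fixed F and CR, not SHADE)."""
    m = len(population[0])
    next_population = []
    for i, target in enumerate(population):
        # Mutation: combine three distinct individuals other than the target.
        others = [p for j, p in enumerate(population) if j != i]
        a, b, c = rng.sample(others, 3)
        j_rand = rng.randrange(m)  # guarantees at least one mutated component
        trial = []
        for j in range(m):
            if j == j_rand or rng.random() < cr:
                value = a[j] + f * (b[j] - c[j])
                value = min(max(value, lower[j]), upper[j])  # clip to the bounds
            else:
                value = target[j]  # binomial crossover keeps the parent's component
            trial.append(value)
        # Selection: keep the trial only if it is at least as good as the target.
        # (A real implementation would cache objective values of the parents.)
        next_population.append(trial if objective(trial) <= objective(target) else target)
    return next_population

# Toy run on a sphere function with a population of 10.
rng = random.Random(0)
lower, upper = [-5.0, -5.0], [5.0, 5.0]
sphere = lambda x: sum(v * v for v in x)
pop = [[rng.uniform(lo, hi) for lo, hi in zip(lower, upper)] for _ in range(10)]
for _ in range(30):
    pop = de_generation(pop, sphere, lower, upper, rng)
print(min(sphere(p) for p in pop))
```

The greedy selection step means the best objective value in the population never gets worse from one generation to the next, although convergence to a global optimum is still not guaranteed.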
For the investigated problems, the swarm size of PSO and the population size of DE were both set to 10. The DE and PSO implementations in the Indago 0.4.5 Python module for numerical optimization were used [30]. For the PSO algorithm, the inertia parameter was set to 0.8, and the cognitive and social rates of the swarm were both set to 1, mainly based on the default recommendations of the Python module. For DE, the key hyperparameters, such as the archive size factor, historical memory size, and mutation rate, were configured to 2.6, 4, and 0.11, respectively, following recommendations from the utilized Python module.

3 Airfoil inverse design

This section defines the AID problem through the optimization design vector, constraints, and the boundary refinement strategy.

3.1 AID problem description

The goal of the AID problem is to determine the optimal geometry of an airfoil given a set of target pressure coefficients on the surface of the airfoil. The parameters used for the AID and the boundary refinement in the context of Eq. (3) and Eq. (4) are presented in Table 1. Each evaluation of the function \(\varepsilon\) (Eq. (3)) necessitates executing a flow simulation over a generated design.
Table 1 Mapping of problem-specific parameters for the AID to their corresponding general parameters used in the objective function and the boundary refinement process

| General parameter | Problem-specific parameter |
| --- | --- |
| \(T _{info}\) | \(C^T_{p_{min}}\) |
| \(\textbf{T}\) | \(\textbf{C}_{p}^{T}\) |
| \(\textbf{P}^{C}({\textbf {x}})\) | \(\textbf{C}_{p}^{C}(\textbf{x})\) |
| \(M _{info}\) | \(C_{p_{min}}\) |
| \(m\) | \(N_c\) |

\(\textbf{C}_{p}^{C}(\textbf{x}) \in \mathbb {R}^q\) denotes the computed pressure distribution around an airfoil based on the design vector \(\textbf{x}\), while \(\textbf{C}_{p}^{T} \in \mathbb {R}^q\) signifies the user-defined target pressure coefficient distribution measured at the same locations on the surface of the airfoil. For all target cases, \(q = 300\). For each evaluation of \(\textbf{x} \in \mathbb {R}^{N_c}\), the computed pressure distribution \(\textbf{C}_{p}^{C}(\textbf{x})\) is linearly interpolated to match the target pressure distribution in both the size and airfoil surface location for each individual component q. \(C^T_{p_{min}} \in \mathbb {R}\) denotes the target minimum pressure coefficient obtained from \(\textbf{C}_{p}^{T}\), and because the minimum pressure coefficient is used, the ML model’s task is to map the optimization design vector \(\textbf{x}\) to the minimum pressure coefficient \(C_{p_{min}} \in \mathbb {R}\) measured on the surface of the airfoil.
The AID optimization problem uses B-Spline approximation of the NACA0012 airfoil geometry for the lower and upper boundary definition, as this method offers superior shape parametrization [52]. Utilizing the splrep function from the scipy 1.9.1 module [61], the B-Splines are defined with coefficients, knots, a maximum degree of 5, and a smoothness parameter set at 0.00001. The knots generated by the splrep function as well as the degree and the smoothness of the splines remain unchanged throughout the optimization procedure.
The optimization design vector for the AID problem \(\mathbf {x_A}\) is defined as:
$$\begin{aligned} \mathbf {x_A} = [\mathbf {c_l}, \mathbf {c_u}]^T \in \mathbb {R}^{N_c}, \quad {\left\{ \begin{array}{ll} \gamma \cdot (\mathbf {c_{L0012}} - 1\cdot 10^{-5}) \le \mathbf {c_l} \le \textbf{0} \\ \textbf{0} \le \mathbf {c_u} \le \gamma \cdot (\mathbf {c_{U0012}} + 1\cdot 10^{-5}) \\ \end{array}\right. } \end{aligned}$$
(7)
where \(\mathbf {x_A}\) comprises the 15 lower and 15 upper surface B-spline coefficients, collected in the vectors \(\mathbf {c_l}\) and \(\mathbf {c_u}\), so that \(N_c=30\). In detail, \(\mathbf {x_A}\) with its individual components (\(c_{li}\) and \(c_{ui}\)) is defined as:
$$\begin{aligned} \begin{aligned} \mathbf {x_A}&= \begin{bmatrix} c_{l1},&c_{l2},&\ldots ,&c_{l\frac{N_c}{2}},&c_{u\frac{N_c}{2} + 1},&c_{u\frac{N_c}{2} + 2},&\ldots ,&c_{uN_c} \end{bmatrix}^T \in \mathbb {R}^{N_c}, \\&{\left\{ \begin{array}{ll} \gamma \cdot (c_{L0012_i} - 1 \cdot 10^{-5}) \le c_{li} \le 0 & \text {for} \quad i = 1, 2, \ldots , \frac{N_c}{2} \\ 0 \le c_{ui} \le \gamma \cdot (c_{U0012_i} + 1 \cdot 10^{-5}) & \text {for} \quad i = \frac{N_c}{2} + 1, \frac{N_c}{2} + 2, \ldots , N_c \end{array}\right. } \end{aligned} \end{aligned}$$
(8)
The bounds of the lower and upper B-spline coefficients are determined by the NACA0012 lower and upper B-Spline coefficient vectors denoted as \(\mathbf {c_{L0012}}\) and \(\mathbf {c_{U0012}}\), respectively, scaled by a multiplication factor \(\gamma =3\). The extracted lower boundary B-Spline coefficients in \(\mathbf {c_{L0012}}\) are negative. Subsequently, there is an overlap in the initial lower and upper B-Spline coefficients (\(\mathbf {c_{L0012}}\) and \(\mathbf {c_{U0012}}\)) generated by B-Spline interpolation of the NACA0012 geometry, where some coefficients in these vectors converge to zero. To address this potential overlap and to enable optimization in the near-zero space, we adjust the values of \(\mathbf {c_{L0012}}\) and \(\mathbf {c_{U0012}}\) by subtracting and adding a small value of \(1 \times 10^{-5}\) to these vectors, respectively.
The constraints on \(\mathbf {c_l}\) and \(\mathbf {c_u}\) in Eq. (7) represent the initial lower and upper bounds (\(\textbf{x}_{lb}\) and \(\textbf{x}_{ub}\)) used for the ML-generated boundary refinement (Eq. (4)) as well as the lower and upper boundaries used by the unenhanced optimization algorithms (stand-alone DE and PSO). The lower and upper boundaries of the optimization design vector based on the scale factor \(\gamma\) and the B-Splines of NACA0012 are shown in Fig. 2. Details of the airfoil targets, flow simulation solver, and the dataset used to train the ML models are provided in Appendix A.
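The bound construction of Eq. (7) can be sketched as below. The coefficient values are hypothetical placeholders (three per surface instead of 15) and do not come from an actual B-Spline fit of NACA0012; the function name `aid_bounds` is likewise an assumption.

```python
def aid_bounds(c_l0012, c_u0012, gamma=3.0, offset=1e-5):
    """Build x_lb and x_ub for x_A = [c_l, c_u] following Eq. (7)."""
    # Lower-surface coefficients lie in [gamma * (c_L0012 - offset), 0].
    lower = [gamma * (c - offset) for c in c_l0012] + [0.0] * len(c_u0012)
    # Upper-surface coefficients lie in [0, gamma * (c_U0012 + offset)].
    upper = [0.0] * len(c_l0012) + [gamma * (c + offset) for c in c_u0012]
    return lower, upper

# HYPOTHETICAL coefficient vectors; the leading zeros mimic the
# coefficients that converge to zero and would otherwise overlap.
c_l0012 = [0.0, -0.03, -0.06]
c_u0012 = [0.0, 0.03, 0.06]
x_lb, x_ub = aid_bounds(c_l0012, c_u0012)
for lo, hi in zip(x_lb, x_ub):
    print(lo, hi)
```

The small offset guarantees a non-degenerate interval even where a NACA0012 coefficient is exactly zero, which is the purpose of the \(1 \times 10^{-5}\) adjustment described above.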

3.2 AID boundary refinement

After the boundary refinement technique is applied, a matrix of solutions \(\textbf{S}\) is obtained, as detailed in Sect. 2.3. This matrix of optimal solutions derived from the ML model is analyzed statistically: its column-averaged values provide a meaningful representation of the refined design space. More specifically, the matrix of solutions for the AID problem \(\textbf{S}_A\) is defined as:
$$\begin{aligned} \textbf{S}_A = \begin{bmatrix} \textbf{x}^{opt}_{A_{1}} \\ \textbf{x}^{opt}_{A_{2}} \\ \vdots \\ \textbf{x}^{opt}_{A_{N}} \end{bmatrix} = \begin{bmatrix} c_{l1}^{opt_1} & c_{l2}^{opt_1} & \ldots & c_{l\frac{N_c}{2}}^{opt_1} & c_{u\frac{N_c}{2} + 1}^{opt_1} & c_{u\frac{N_c}{2} + 2}^{opt_1} & \ldots & c_{uN_c}^{opt_1} \\ c_{l1}^{opt_2} & c_{l2}^{opt_2} & \ldots & c_{l\frac{N_c}{2}}^{opt_2} & c_{u\frac{N_c}{2} + 1}^{opt_2} & c_{u\frac{N_c}{2} + 2}^{opt_2} & \ldots & c_{uN_c}^{opt_2} \\ \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\ c_{l1}^{opt_N} & c_{l2}^{opt_N} & \ldots & c_{l\frac{N_c}{2}}^{opt_N} & c_{u\frac{N_c}{2} + 1}^{opt_N} & c_{u\frac{N_c}{2} + 2}^{opt_N} & \ldots & c_{uN_c}^{opt_N} \end{bmatrix} \end{aligned}$$
(9)
where \(\textbf{x}^{opt}_{A_{N}}\) is the \(N^{th}\) optimized solution of the AID problem, and it consists of optimal lower and upper B-Spline coefficients (Eq. (8)). Subsequently, the averaged design vector \({\bar{\textbf{x}}}_A\) is defined by column-averaging the matrix \(\textbf{S}_A\):
$$\begin{aligned} {\bar{\textbf{x}}}_A = \begin{bmatrix} \frac{1}{N} \sum _{i=1}^{N} c_{l1}^{opt_i},&\ldots ,&\frac{1}{N} \sum _{i=1}^{N} c_{l\frac{N_c}{2}}^{opt_i},&\frac{1}{N} \sum _{i=1}^{N} c_{u\frac{N_c}{2} + 1}^{opt_i},&\ldots ,&\frac{1}{N} \sum _{i=1}^{N} c_{uN_c}^{opt_i} \end{bmatrix} \end{aligned}$$
(10)
Furthermore, the averaged design vector \(\bar{\textbf{x}}_A\) is scaled by a safety factor \(\eta\) to ensure that the target design is within the new boundaries:
$$\begin{aligned} \mathbf {x_{A_\eta }} = \eta \cdot \bar{\textbf{x}}_A \end{aligned}$$
(11)
Finally, the design vector \(\mathbf {x_{A_\eta }}\) represents an airfoil shape itself, i.e., its design variables are the lower and upper B-Spline coefficients. Hence, new lower and upper boundaries (\(\textbf{lb}_R\) and \(\textbf{ub}_R\)) are constructed from this design vector for the AID problem. Different values of the hyperparameter \(\eta\) are investigated to assess their impact on the performance of the ML-enhanced optimization, specifically \(\eta \in \{1, 1.1, 1.2, 1.3\}\). These values span a range from less efficient to more efficient performance, providing a comprehensive view of the method’s effectiveness.
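The column averaging of Eq. (10) and the scaling of Eq. (11) can be illustrated with a minimal sketch. The function name `refine_aid_bounds`, the synthetic solution matrix, and the way the scaled vector is split into a lower-surface and an upper-surface bound are assumptions made for illustration only; the text above states only that the new boundaries are constructed from \(\mathbf {x_{A_\eta }}\).

```python
import numpy as np


def refine_aid_bounds(S_A, eta):
    """Column-average S_A (Eq. (10)) and scale the result by eta (Eq. (11))."""
    S_A = np.asarray(S_A, dtype=float)
    x_bar = S_A.mean(axis=0)   # averaged design vector, Eq. (10)
    x_eta = eta * x_bar        # scaled design vector, Eq. (11)
    half = x_eta.size // 2
    # Assumption: the first N_c/2 entries are lower-surface coefficients and
    # the remaining entries are upper-surface coefficients, as laid out in Eq. (9).
    lower_surface_bound = x_eta[:half]
    upper_surface_bound = x_eta[half:]
    return x_bar, lower_surface_bound, upper_surface_bound


# Synthetic 2 x 4 solution matrix (N = 2 solutions, N_c = 4 coefficients).
S_demo = [[-0.02, -0.04, 0.03, 0.05],
          [-0.04, -0.06, 0.05, 0.07]]
x_bar, lb_demo, ub_demo = refine_aid_bounds(S_demo, eta=1.3)
print(x_bar, lb_demo, ub_demo)
```

Because the lower-surface coefficients are negative and the upper-surface coefficients are positive, a factor \(\eta > 1\) moves both bounds outward, which is why larger \(\eta\) values are the more conservative choice.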

4 Scalar field reconstruction

This section defines the SFR problem through the optimization design vector, constraints, and the boundary refinement strategy.

4.1 SFR problem description

The goal of the SFR problem is to determine the scalar boundary values based on a set of target scalar measurements on a given domain. The essence lies in optimizing the boundary conditions for a diffusion partial differential equation (PDE). This mathematical model describes how a scalar quantity spreads within a given domain. Instead of prescribing boundary conditions outright, the problem aims to find the ideal boundary conditions that, when applied to the diffusion PDE, result in a reconstructed scalar field that closely aligns with measured data. The diffusion PDE is defined as:
$$\begin{aligned} \frac{\partial s}{\partial t} = D \nabla ^2 s \quad \text {in } \Omega , \quad t \in [0, t_{max}] \end{aligned}$$
(12)
where s is the non-dimensional scalar value, D is the diffusion coefficient set to 1 (m\(^2\)/s), t denotes the time (s), while \(t_{max}\) denotes the maximum or end time of the simulation, and \(\Omega\) is the domain. For the purposes of demonstrating the ML-enhanced inverse design framework, \(t_{max}\) is set to 0.1 s and is treated as a converged state, i.e. the scalar diffusion is treated as a quasi-transient problem.
The parameters used for the SFR and the boundary refinement in the context of Eq. (3) and Eq. (4) are presented in Table 2.
Table 2
Mapping of problem-specific parameters for the SFR problem to their corresponding general parameters used in the objective function and the boundary refinement process
General parameter | Problem-specific parameter
\(T_{info}\) | \(s^T_{max}\)
\(\textbf{T}\) | \(\textbf{s}^{T}\)
\(\textbf{P}^{C}(\textbf{x})\) | \(\textbf{s}^{C}(\textbf{x})\)
\(M_{info}\) | \(s_{max}\)
m | I
\(\textbf{s}^{C}(\textbf{x})\) denotes the computed scalar distribution in \(\mathbb {R}^q\) (with \(q = 30\)) on a given domain \(\Omega\) based on the design vector \(\textbf{x}\) (denoted as \(\mathbf {x_s}\) for the SFR problem), which is used to define the boundary condition, while \(\textbf{s}^{T}\), also in \(\mathbb {R}^q\), signifies the user-defined target scalar field measured at the same locations. \(s^T_{max} \in \mathbb {R}\) denotes the target maximum scalar value obtained from \(\textbf{s}^{T}\). The design vector constraint for the SFR problem is defined in Eq. (13). The ML model maps the design vector \(\mathbf {x_s}\) to the maximum scalar value \(s_{max} \in \mathbb {R}\) observed in the domain.
$$\begin{aligned} \begin{aligned} \mathbf {x_s} = [s_1, s_2, \ldots , s_{I}]^T \in \mathbb {R}^{I}, \\ \quad 0 \le s_i \le s_{ub} \quad \text {for} \quad i = 1, 2, \ldots , I. \end{aligned} \end{aligned}$$
(13)
The scalar design vector \(\mathbf {x_s}\) consists of scalar values \(s_i\) that collectively form a boundary condition for a given domain. The minimum scalar value at the boundary is defined as 0 and the maximum value \(s_{ub}\) is set to 30. For a HF simulation, the number of scalar values \(s_i\) defined at the top of the domain \(\Omega\) is 80 (\(I=80\)). However, each evaluation of \(\mathbf {x_s}\) requires the \(M_{info}\) value, and a discrepancy arises because the ML model is trained with LF data in which the number of scalar values is 20 (\(I=20\)). To evaluate \(\mathbf {x_s}\) with \(I=80\) using the ML model, \(\mathbf {x_s}\) is linearly interpolated. The scalar values at the LF boundary points (\(I=20\)) are then extracted and used by the ML model to predict \(s_{max}\). In accordance with the diffusion PDE (Eq. (12)), the scalar boundary values are defined as the Dirichlet boundary condition (Eq. (14)) for the top part of the domain \(\partial \Omega _{top}\):
$$\begin{aligned} s = g(\mathbf {x_s}, t) \quad \text {on } \partial \Omega _{top}, \quad t \in [0, t_{max}]. \end{aligned}$$
(14)
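The HF-to-LF mapping described before Eq. (14) can be sketched with `numpy.interp`. This is a minimal illustration that assumes both discretizations use equally spaced boundary points on a normalized coordinate in [0, 1]; the function name `hf_to_lf` and the linear test profile are hypothetical.

```python
import numpy as np

I_HF, I_LF = 80, 20
# Assumption: boundary points are equally spaced on a normalized [0, 1] axis.
x_hf = np.linspace(0.0, 1.0, I_HF)
x_lf = np.linspace(0.0, 1.0, I_LF)


def hf_to_lf(x_s_hf):
    """Linearly interpolate an HF design vector and sample it at the LF points."""
    return np.interp(x_lf, x_hf, np.asarray(x_s_hf, dtype=float))


# Illustrative HF design vector: a linear profile from 0 to s_ub = 30.
x_s_hf = 30.0 * x_hf
x_s_lf = hf_to_lf(x_s_hf)
print(x_s_lf.shape)
```

The resulting 20-entry vector is what an LF-trained model would receive as input when the optimizer works on the 80-dimensional HF design vector.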
Other parts of the domain \(\partial \Omega _{other}\) (left, right, bottom) are defined as the Neumann boundary condition:
$$\begin{aligned} \frac{\partial s}{\partial \textbf{n}} = 0 \quad \text {on } \partial \Omega _{other}, \quad t \in [0, t_{max}], \end{aligned}$$
(15)
where \(\textbf{n}\) is the unit normal vector pointing outward from the domain. Finally, the mathematical domain \(\Omega\) for the given SFR problem along with the appropriate boundary conditions is shown in Fig. 3. Details of the solver, inverse design target parameters for the SFR problem, and the dataset used to train the ML models are provided in Appendix B.
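To make the forward model of Eqs. (12), (14), and (15) concrete, the following is a minimal explicit finite-difference sketch. It is not the solver used in this study (which is described in Appendix B). It assumes a unit square domain, a zero initial field, a uniform grid with one column per boundary value, and a time-independent Dirichlet profile; the function name `solve_diffusion` is hypothetical.

```python
import numpy as np


def solve_diffusion(x_s, D=1.0, t_max=0.1, length=1.0):
    """Forward-time, centered-space solve of ds/dt = D * laplacian(s)."""
    x_s = np.asarray(x_s, dtype=float)
    n = x_s.size
    dx = length / (n - 1)
    dt = 0.2 * dx**2 / D                 # explicit scheme is stable for dt <= dx^2 / (4 D)
    n_steps = int(np.ceil(t_max / dt))
    dt = t_max / n_steps                 # shrink dt slightly so the steps end exactly at t_max
    s = np.zeros((n, n))                 # assumed zero initial field; row 0 is the top boundary
    s[0, :] = x_s                        # Dirichlet condition, Eq. (14)
    for _ in range(n_steps):
        # Edge padding copies boundary values into ghost cells, which gives a
        # zero normal gradient on the left, right, and bottom sides, Eq. (15).
        p = np.pad(s, 1, mode="edge")
        lap = (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * s) / dx**2
        s = s + dt * D * lap
        s[0, :] = x_s                    # re-impose the Dirichlet values on the top row
    return s


bc_demo = np.linspace(0.0, 30.0, 20)     # illustrative LF boundary condition (I = 20)
field = solve_diffusion(bc_demo)
print(field.shape, field.max())
```

With these assumptions the LF and HF variants differ only in the grid resolution, so the same routine could be called with 20 or 80 boundary values.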

4.2 SFR boundary refinement

In addressing the SFR problem, the design space size presents a significant challenge. Compared to the AID problem, the SFR problem is less constrained and more ill-posed. This means that, based on the objective function and the lower and upper boundaries of the design space, the optimization landscape is more multi-modal for the SFR problem. This difference requires a more robust strategy for boundary refinement. Firstly, the LF solution matrix \(\textbf{S}_{s_{LF}}\) of optimized design vectors (obtained through Eq. (4)) that contain boundary condition scalar values is defined as:
$$\begin{aligned} \textbf{S}_{s_{LF}} = \begin{bmatrix} \textbf{x}^{opt}_{s_{1}} \\ \textbf{x}^{opt}_{s_{2}} \\ \vdots \\ \textbf{x}^{opt}_{s_{N}} \end{bmatrix} = \begin{bmatrix} s_{1}^{opt_1} & s_{2}^{opt_1} & \ldots & s_{20}^{opt_1} \\ s_{1}^{opt_2} & s_{2}^{opt_2} & \ldots & s_{20}^{opt_2} \\ \vdots & \vdots & \ddots & \vdots \\ s_{1}^{opt_N} & s_{2}^{opt_N} & \ldots & s_{20}^{opt_N} \end{bmatrix} \end{aligned}$$
(16)
where \(\textbf{x}^{opt}_{s_{N}}\) is the \(N^{th}\) optimized design vector. Each row of \(\textbf{S}_{s_{LF}}\) contains an optimized scalar value for each point on the LF domain boundary (\(I=20\)). Subsequently, the design vectors in each row of matrix \(\textbf{S}_{s_{LF}}\) are subjected to regression model fitting. Since the shape of the BC is unknown, and considering the number of possible solutions, in order to cover a variety of BC shapes, this fitting utilizes polynomials of degree d \(\in \{1,2,3,4\}\) for each \(n = 1,2,\ldots ,N\) design vector, forming the new regression model solution matrix \(\textbf{S}_{sreg}\) as:
$$\begin{aligned} \textbf{S}_{sreg} = \begin{bmatrix} \textbf{P}_1(\textbf{x}^{opt}_{s_1}) \\ \textbf{P}_2(\textbf{x}^{opt}_{s_1}) \\ \textbf{P}_3(\textbf{x}^{opt}_{s_1}) \\ \textbf{P}_4(\textbf{x}^{opt}_{s_1}) \\ \vdots \\ \textbf{P}_1(\textbf{x}^{opt}_{s_N}) \\ \textbf{P}_2(\textbf{x}^{opt}_{s_N}) \\ \textbf{P}_3(\textbf{x}^{opt}_{s_N}) \\ \textbf{P}_4(\textbf{x}^{opt}_{s_N}) \end{bmatrix} = \begin{bmatrix} \sum _{k=0}^{1} a_{k,1}^{opt_1} \hat{s}^k \\ \sum _{k=0}^{2} a_{k,2}^{opt_1} \hat{s}^k \\ \sum _{k=0}^{3} a_{k,3}^{opt_1} \hat{s}^k \\ \sum _{k=0}^{4} a_{k,4}^{opt_1} \hat{s}^k \\ \vdots \\ \sum _{k=0}^{1} a_{k,1}^{opt_N} \hat{s}^k \\ \sum _{k=0}^{2} a_{k,2}^{opt_N} \hat{s}^k \\ \sum _{k=0}^{3} a_{k,3}^{opt_N} \hat{s}^k \\ \sum _{k=0}^{4} a_{k,4}^{opt_N} \hat{s}^k \end{bmatrix} \end{aligned}$$
(17)
where \(\textbf{P}_4(\textbf{x}^{opt}_{s_N})\) is the 4\(^{th}\) degree polynomial of the optimized \(N^{th}\) vector \(\textbf{x}^{opt}_{s_{N}}\). More specifically, the term \(\sum _{k=0}^{4} a_{k,4}^{opt_N} \hat{s}^k\) represents the 4\(^{th}\) degree polynomial regression model for the \(N^{th}\) optimized design vector, where \(a_{k,4}^{opt_N}\) are the polynomial regression coefficients and \(\hat{s}\) is the unknown variable. As the matrix \(\textbf{S}_{sreg}\) is defined, each regression model is utilized to evaluate the HF discretized space (\(I=80\)) with equally spaced points between 0 and 1, resulting in the final \(\textbf{S}_{s_{HF}}\) matrix:
$$\begin{aligned} \textbf{S}_{s_{HF}} = \begin{bmatrix} s_{1,1}^{opt_1,1} & s_{1,2}^{opt_1,1} & \ldots & s_{1,80}^{opt_1,1} \\ s_{1,1}^{opt_1,2} & s_{1,2}^{opt_1,2} & \ldots & s_{1,80}^{opt_1,2} \\ s_{1,1}^{opt_1,3} & s_{1,2}^{opt_1,3} & \ldots & s_{1,80}^{opt_1,3} \\ s_{1,1}^{opt_1,4} & s_{1,2}^{opt_1,4} & \ldots & s_{1,80}^{opt_1,4} \\ \vdots & \vdots & \ddots & \vdots \\ s_{N,1}^{opt_N,1} & s_{N,2}^{opt_N,1} & \ldots & s_{N,80}^{opt_N,1} \\ s_{N,1}^{opt_N,2} & s_{N,2}^{opt_N,2} & \ldots & s_{N,80}^{opt_N,2} \\ s_{N,1}^{opt_N,3} & s_{N,2}^{opt_N,3} & \ldots & s_{N,80}^{opt_N,3} \\ s_{N,1}^{opt_N,4} & s_{N,2}^{opt_N,4} & \ldots & s_{N,80}^{opt_N,4} \end{bmatrix} \end{aligned}$$
(18)
where \(s_{N,1}^{opt_N,4}\) is the first scalar value at the boundary of the HF domain obtained from the \(N^{th}\) design vector using the 4\(^{th}\) degree polynomial model, and \(s_{N,80}^{opt_N,4}\) is the last scalar value obtained from the same regression model of the same degree.
Polynomial regression coefficients are determined using the numpy 1.24.3 function polyfit for 20 equally spaced points between 0 and 1 (which corresponds to the LF discretization). The resulting regression model, generated by the function poly1d, is subsequently evaluated for the HF discretization (\(I=80\)) with equally spaced points between 0 and 1, yielding the rows of the final \(\textbf{S}_{s_{HF}}\) matrix. Finally, to determine the new optimization boundaries for the SFR problem, the maximum scalar value is extracted from the flattened matrix \(\textbf{S}_{s_{HF}}\): \(max(\textbf{S}_{s_{HF}})\) defines the upper optimization boundary value for each of the \(I=80\) dimensions, thus forming the new upper boundary \(\textbf{ub}_R\). For safety reasons, the lower boundary \(\textbf{lb}_R\) remains as specified in Eq. (13), i.e., \(\textbf{0}\).
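The regression-based refinement of Eqs. (16) to (18) can be sketched with the polyfit and poly1d functions named above. The function name `refine_sfr_bounds` and the two synthetic LF solutions are assumptions made for illustration; in the study, the rows of the LF solution matrix come from solving Eq. (4).

```python
import numpy as np


def refine_sfr_bounds(S_lf, degrees=(1, 2, 3, 4), n_hf=80):
    """Fit polynomials to each LF solution, evaluate on the HF grid, take the max."""
    S_lf = np.asarray(S_lf, dtype=float)
    x_lf = np.linspace(0.0, 1.0, S_lf.shape[1])   # LF boundary points
    x_hf = np.linspace(0.0, 1.0, n_hf)            # HF boundary points
    rows = []
    for x_opt in S_lf:                            # one optimized design vector per row
        for d in degrees:
            model = np.poly1d(np.polyfit(x_lf, x_opt, d))
            rows.append(model(x_hf))              # one row of the HF matrix, Eq. (18)
    S_hf = np.vstack(rows)
    ub_R = np.full(n_hf, S_hf.max())              # same upper bound in every dimension
    lb_R = np.zeros(n_hf)                         # lower bound stays at zero, Eq. (13)
    return S_hf, lb_R, ub_R


# Two synthetic LF solutions (I = 20): a linear and a quadratic profile.
x_demo = np.linspace(0.0, 1.0, 20)
S_lf_demo = np.vstack([2.0 + 10.0 * x_demo, 4.0 + 6.0 * x_demo**2])
S_hf_demo, lb_demo, ub_demo = refine_sfr_bounds(S_lf_demo)
print(S_hf_demo.shape, ub_demo[0])
```

Each LF solution contributes four rows (one per polynomial degree), so \(N\) solutions give a matrix with \(4N\) rows and 80 columns.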
Figure 4 illustrates an instance of the boundary refinement process. The green line represents the true solution, the blue line shows the reduction of the design space (approximately 50% pruned) as described above and the black line represents the average of N optimized boundary conditions for comparison. The green curve lies below the black curve, suggesting that the boundary refinement methodology used for the AID problem might be similarly effective here.

5 Results and discussion

In this section, the results and analyses for the ML model, boundary refinement, and ML-enhanced framework for both demonstration problems are detailed. An in-depth hyperparameter analysis of the ML-enhanced framework is showcased, followed by overarching recommendations for optimal utilization. The section concludes by highlighting the advantages and limitations of the proposed technique. The details of all ML model hyperparameters, the hyperparameter tuning procedure, and the Python modules used are given in Appendix C.

5.1 ML models results

In this subsection, the ML model results for both the AID and SFR problems are presented through the accuracy metrics given in Sect. 2.4.

5.1.1 AID ML model results

Figure 5 presents the RMSE scores for the three ML algorithms applied to the AID problem for varying dataset sizes. The dataset size was varied in order to assess the influence it has on the ML-enhanced framework, and to obtain the learning curve for each algorithm. All models show that the larger the dataset size, the better (lower) the resulting RMSE. It can be seen that for smaller datasets, XGB has better performance than LGB and MLP, but its accuracy marginally lags when leveraging all 15000 data instances for training and cross-validation. Given its overall top performance, XGB (trained with four different dataset sizes) was selected as the ML model for the inverse design framework, and the K-fold cross-validation RMSE values used for further analysis are presented in Table 3.
Table 3
XGB RMSE values used for calculating the \(\omega\) threshold parameter for each scenario and dataset size
Dataset size | RMSE
500 | 0.81
1000 | 0.61
5000 | 0.39
15000 | 0.34

5.1.2 SFR ML model results

Figure 6 shows the RMSE values of the three ML algorithms when applied to the SFR problem using different dataset sizes. The results show the MLP’s superior performance over both LGB and XGB across all dataset sizes. As a result, the MLP was selected as the ML model within the inverse design framework for the SFR problem. The specific K-fold cross-validation RMSE values for the MLP, which were used to compute the \(\omega\) parameter for the SFR problem, are detailed in Table 4. Given the minimal performance difference between the models trained with 5000 and 15000 instances, only three distinct models trained on three different amounts of data were compared within the ML-enhanced framework.
Table 4
MLP RMSE values used for calculating the \(\omega\) threshold parameter for each of the two SFR BC scenarios and dataset size
Dataset size | RMSE
500 | 7.63
1000 | 4.27
5000 | 0.78

5.2 Boundary refinement with the ML model

In this subsection, the results of the boundary refinement technique for both investigated problems are presented. To generate the new boundaries \(\textbf{lb}_R\) and \(\textbf{ub}_R\), models trained on different dataset sizes were compared. The DE was employed to solve Eq. (4) 150 times (\(N=150\) solutions). The DE algorithm was configured with a maximum of 800 function evaluations, and the population size was set to equal the dimensionality of the optimization vector i.e., 20 for SFR and 30 for AID. A comparative analysis of different N values is provided in Appendix D.

5.2.1 AID boundary refinement

In Sect. 5.1, the choice of the XGB algorithm was justified by its marginal superiority over other algorithms, especially when various training data sizes are taken into account. For the purpose of boundary refinement, the XGB was trained using datasets of sizes 500, 1000, 5000, and 15000. The results of the edge cases of the XGB-produced boundaries \(\textbf{lb}_R\) and \(\textbf{ub}_R\) are illustrated in Fig. 7. Since every solution in the matrix \(\textbf{S}_A\) represents an airfoil itself with lower and upper shape coefficients, the new lower and upper boundaries were derived solely by averaging the solution matrix \(\textbf{S}_A\); with \(\eta =1\), these boundaries encompass the genuine target designs. A notable overlap is observed in a section of the upper trailing edge between the target and the new boundary (\(\zeta _x > 0.8\)). When the safety factor is increased to its maximum investigated value of \(\eta =1.3\), this overlap at \(\zeta _x > 0.8\) significantly diminishes, and a noticeable distinction is achieved between the new and the original boundaries (Figs. 7b and 7d).
When training the XGB model with different numbers of instances, only minor variations in results emerge. This suggests that the ML model trained with a small dataset suffices to prune a segment of the design space for such problems. This observation holds for both NACA2410 and RAE2822 boundary refinement procedures, as illustrated in Fig. 7. Finally, an analysis of how the number of solutions N affects the change in the airfoil shape and the boundary refinement is shown in Fig. 19 (Appendix D).

5.2.2 SFR boundary refinement

For the SFR problem, the MLP outperformed the other investigated algorithms in modeling \(s_{max}\). Figure 8 displays the MLP results of the boundary refinement. The MLP was trained with dataset sizes 500, 1000, and 5000. Across both BC scenarios, all three MLP models significantly reduce the size of the design space, confining the \(\textbf{ub}_R\) value between \(s=13\) and \(s=18\), i.e., the upper bound is reduced to 43% to 60% of its original value \(s_{ub}=30\), which prunes roughly 40% to 57% of the range in each dimension. Compared to the airfoil problem, these newly produced boundaries exhibit greater sensitivity to changes in dataset size, but all three can be reliably incorporated into the ML-enhanced inverse design framework without losing the true solutions. Additionally, Fig. 20 (Appendix D) presents an analysis of how the number of solutions, N, impacts both cases of the SFR problem.

5.3 ML-enhanced inverse design framework results

This section provides a comprehensive analysis of the ML-enhanced inverse design framework, detailing the hyperparameters (c values, \(\eta\), ML dataset size) for both problem categories. Following this, a meticulous comparison is presented between the conventional inverse design approach which employs classic optimization algorithms like DE and PSO, and the ML-enhanced optimization algorithms. Both strategies aim to minimize the objective function described in Eq. (3) subject to the constraints specified in Eq. (7) and Eq. (13) for the AID and SFR challenges, respectively. Given the stringent computational budget, both methodologies are restricted to 200 HF simulations for each problem. Furthermore, to account for the inherent randomness of the population-based algorithms in use, all hyperparameter combinations are subjected to 30 runs, facilitating a robust uncertainty analysis. The term fitness is introduced to align with the conventions of PSO and DE, and it is equal to RMSE, which is the optimization objective used to evaluate the quality of solutions.
Within the ML-enhanced inverse design framework, the boundaries resulting from the boundary refinement techniques are utilized, i.e., when the ML model is trained with a particular dataset size, the \(\textbf{lb}_R\) and \(\textbf{ub}_R\) corresponding to that ML model are applied. The user-defined hyperparameter c is utilized to scale the K-fold cross-validation RMSE values of the ML models. For the AID problem, the explored values are \(c \in \{1, 2, 4, 6, 8\}\), while for the SFR problem, they are \(c \in \{0.25, 0.5, 1, 2, 4\}\). The differing ranges for c between the two problems arise from the variance in magnitude of their RMSE values. However, there is an overlap in the sets, which aids in formulating a generalized recommendation. The RMSE metric of the ML models was used to calculate the \(\omega\) threshold as defined in Sect. 2. This decision is motivated by the intuitiveness and interpretability offered by the RMSE value. By reflecting the degree of discrepancy in the model’s predictions, it provides a clear and meaningful measure of the model’s performance.
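The scaling of the RMSE by c can be illustrated with a short sketch using the Table 3 values. The exact definition of \(\omega\) and of the decision rule is given in Sect. 2 and is not reproduced here. The sketch assumes \(\omega = c \cdot RMSE\) and assumes that a HF simulation is triggered only when the ML prediction \(M_{info}\) lies within \(\omega\) of the target \(T_{info}\). The function names `omega` and `needs_hf` are hypothetical.

```python
# K-fold cross-validation RMSE of the XGB model per dataset size (Table 3).
xgb_rmse = {500: 0.81, 1000: 0.61, 5000: 0.39, 15000: 0.34}
c_values = (1, 2, 4, 6, 8)   # values of c explored for the AID problem


def omega(c, rmse):
    """Assumed threshold: the model RMSE scaled by the user-defined factor c."""
    return c * rmse


def needs_hf(m_info, t_info, c, rmse):
    """Assumed gate: run the HF simulation only for promising candidates."""
    return abs(m_info - t_info) <= omega(c, rmse)


for size, rmse in xgb_rmse.items():
    print(size, [round(omega(c, rmse), 2) for c in c_values])

# The same candidate passes the gate for c = 2 but not for c = 1.
print(needs_hf(1.0, 1.5, 2, xgb_rmse[5000]), needs_hf(1.0, 1.5, 1, xgb_rmse[5000]))
```

Under this assumed rule, a larger c widens the acceptance band, so more candidates are sent to the HF solver and less of the simulation budget remains unused.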

5.3.1 AID results

The results of the ML-enhanced inverse design framework utilizing the XGB model and the boundary refinement technique (\(\eta\) = 1 and \(\eta\) = 1.3) applied to the AID problem for the NACA2410 and RAE2822 airfoils are shown in Figs. 9 and 10. \(PSO_{ML-EN}\) and \(DE_{ML-EN}\) denote the ML-enhanced versions of the optimization algorithms. For comparison, the average and standard deviation over the results of 30 runs of the unenhanced PSO and DE algorithms are shown as horizontal black and grey lines, respectively. The markers indicate the dataset size used to train the XGB model, which was then incorporated into the ML-enhanced optimization algorithm and used to form \(\textbf{lb}_R\) and \(\textbf{ub}_R\). These markers are color-coded based on the remaining budget (RB) values, implying that the fitness values were obtained from a number of HF simulations defined as \(TSB - RB\), where TSB represents the total simulation budget, specifically set at 200. The full results, which include the \(\textbf{lb}_R\) and \(\textbf{ub}_R\) formed with \(\eta\) = 1.1 and \(\eta\) = 1.2, are given in Figs. 21 and 22 (Appendix E), respectively.
First, a clear observation is that the DE algorithm, in both its unenhanced and ML-enhanced forms, outperforms the PSO algorithm. Moreover, across most tested hyperparameters, airfoil types, and optimization algorithms, the ML-enhanced variant consistently surpasses the performance of its unenhanced counterpart. There are a few instances where DE or PSO exhibit competitive performance in terms of raw fitness (RMSE) value, particularly when the c value is set to 1 and the XGB models trained with dataset sizes of 5000 and 15000 are employed. However, note that both \(PSO_{ML-EN}\) and \(DE_{ML-EN}\) have consumed only about 60% of their HF simulation budgets (remaining budget \(RB\sim 70-80\)), whereas their unenhanced versions have fully exhausted theirs.
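The budget figures quoted above follow directly from the relation \(TSB - RB\). A short worked check, with hypothetical helper names, is:

```python
TSB = 200  # total HF simulation budget


def hf_used(rb, tsb=TSB):
    """Number of HF simulations actually run for a given remaining budget RB."""
    return tsb - rb


def budget_fraction_used(rb, tsb=TSB):
    """Fraction of the HF budget consumed."""
    return (tsb - rb) / tsb


for rb in (70, 80):
    print(rb, hf_used(rb), f"{budget_fraction_used(rb):.0%}")
```

A remaining budget of 70 to 80 simulations therefore corresponds to 120 to 130 HF runs, i.e., roughly 60% to 65% of the 200 available.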
Once the user-defined RMSE scaling parameter c reaches or exceeds 4, the RB value becomes zero for most dataset sizes and \(\eta\) values. An RB value of zero indicates that only HF simulations were utilized for assessing the design vector. Consequently, it can be inferred that, in this particular scenario, employing unenhanced algorithms alongside the refined boundaries would yield equivalent results. ML-enhanced algorithms, especially when employing models trained on dataset sizes of 5000 and 15000 and when c = 2 (observable in Figs. 9 and 10), not only converge to a better solution but also economize on the total HF computational budget (\(RB\sim 30-50\)) when compared with the unenhanced versions.
For a general recommendation on the use of ML-enhanced optimization algorithms for the AID problem within a limited HF computational budget, any of the investigated \(\eta\) factors can be employed. However, to ensure the target design falls within the refined boundaries, an \(\eta\) value of 1.3 is preferable. This choice allows for convergence across all configurations. In terms of achieving optimal fitness and conserving the computational budget, the c value of 2 appears to be the best across all dataset sizes and algorithm combinations. Furthermore, a c value of 1 can be considered for exploratory inverse designs, as it requires fewer HF simulation runs to attain comparable or superior results to the unenhanced algorithms.
The ML models trained on smaller datasets (500 and 1000) suffice to expedite the inverse design process, achieving \(RB\sim 20-50\) for c = 1 and c = 2. These ML models also lead to effective boundary refinement when the entire simulation budget is used up in pursuit of the optimal design.
Figure 11 offers a comparison between selected ML-enhanced algorithm configurations and their unenhanced optimization counterparts. The first column displays the optimal achieved airfoil geometry, while the second presents the optimal set of pressure coefficients, both set against the target values. The third column illustrates the convergence graphs of all 30 runs for both algorithm variants. The first row corresponds to the PSO algorithm and the NACA2410 airfoil, while the second shows an example of the DE algorithm and the RAE2822 airfoil. Considering all three visual metrics, both \(DE_{ML-EN}\) and \(PSO_{ML-EN}\) surpass their unenhanced counterparts. Yet, neither algorithm achieves an exact alignment with the target designs in terms of geometry and pressure coefficient sets. This discrepancy arises because the framework is assessed under a strict computational budget of only 200 HF simulations. Further improvements for both approaches are likely with larger computational budgets.

5.3.2 SFR results

Figure 12 presents the hyperparameter analysis for the ML-framework applied to the SFR problem. It also provides a comparison with the unenhanced algorithms showing the average and standard deviation of the fitness, indicated by the horizontal black and grey lines, respectively. The ML-enhanced algorithms consistently outshine their traditional counterparts. Drawing parallels with the AID problem, it is observed that while elevating the c parameter allows the framework to focus on improving the target performance approximation (reducing the fitness value), it does so at the expense of fully utilizing the entire simulation budget.
The ML-enhanced optimizers with the MLP model trained on the dataset size 1000 exhibit superior performance in terms of fitness value compared to their counterparts trained on dataset sizes 500 and 5000. This difference can be attributed to the more effective boundary refinement achieved by the 1000-instance model, as evidenced by Figs. 8a and 8b. The applied boundary refinement notably contributes to reducing fitness uncertainty across all hyperparameter combinations, as observed through the standard deviation lines corresponding to each marker. This advantage becomes even more pronounced when compared to the standard deviation observed in the unenhanced algorithms.
For dataset sizes of 500 and 1000, a c value of 1 or greater causes the ML-enhanced algorithms to consume the entire budget of HF simulations. Notably, when optimizers are paired with the MLP model trained on 5000 instances, the fitness scales almost linearly with the c value. The 5000-instance trained ML-enhanced optimizers strike a good trade-off between achieving low fitness (RMSE) values and conserving HF simulations. Considering results from both BC scenarios, the hyperparameter settings that would achieve a trade-off between a good target performance approximation and simulation budget would be the 1000 dataset model at c = 0.25 or the 5000 dataset model at c = 1. If the goal is to substantially narrow down the design space, a noisy model, like the one trained on 500 instances, proves sufficient.
Figure 13 displays examples of the optimized BCs for both test instances. The results from PSO\(_{ML-EN}\) for the sinusoidal BC employed an MLP trained on 1000 instances with c = 1, while for the linear BC an MLP trained on 500 instances with c = 1 was used. Different reconstructed averaged BCs are depicted for both instances and algorithms. This variety arises because the final optimized average design vectors, which contained raw scalar values for each \(\Omega _{top}\) coordinate, underwent regression model fitting ranging from degrees 1 to 4, described in Sect. 4.2. In both cases presented in Fig. 13, both the average reconstructed BCs (for all regression model degrees) and the convergence plot clearly demonstrate the superiority of PSO\(_{ML-EN}\) over PSO.
In Fig. 14, the reconstructed scalar fields generated by the BCs presented in Fig. 13 are shown. The top row shows the fields for the sinusoidal BC, while the bottom row shows the fields for the linear BC. The first column shows the ground truth, while the second and third columns show the \(PSO_{ML-EN}\) and PSO reconstructed scalar fields. It is obvious that the BCs generated by the ML-enhanced algorithm align much more closely with the true solution. Finally, Fig. 15 shows the absolute error between the true scalar fields (for both BC cases) and those obtained by \(PSO_{ML-EN}\) and PSO-optimized boundary conditions. The absolute error was calculated for every point in the HF domain, and with the range of the absolute error being the same for both results shown, it is apparent that the \(PSO_{ML-EN}\) generated BC is more accurate.

5.4 Advantages and limitations of the ML-enhanced framework

While the ML-enhanced inverse design method shows improved performance, it is not without limitations. Primarily, the framework requires a pre-trained ML model to estimate the \(M_{info}\) value. To harness this model effectively for boundary refinement and to cut back on the number of HF simulations, it is vital to understand and determine the pertinent reduced-order information related to the optimization challenge.
The main advantage of the proposed method is that an ML model is trained independently of the optimization loop using LF data only, and can then be exploited for different inverse design instances of the same type of problem (e.g., one ML model for airfoils enables the efficient optimization of multiple different types of airfoils). Furthermore, the ML model does not have to be highly accurate as highlighted in the detailed hyperparameter analysis for both investigated problems, which is advantageous in cases where obtaining LF data is computationally non-trivial.
Another limitation of the framework lies in its dependence on multiple hyperparameters. Both the boundary refinement technique, as applied to AID, and the ML-enhanced framework itself require safety hyperparameters (\(\eta\) and c respectively). Although this study demonstrates that the c value can correlate with the RMSE of the model suggesting values of \(c \in \{0.5, 1, 2\}\) for both problems, a more exhaustive analysis encompassing a broader set of similar problems is essential. However, pinpointing the appropriate c parameter could be accomplished through an exploratory analysis leveraging an ML model and exclusively LF simulations. Finally, investigating the error metric of the ML model in the ML-enhanced framework is a potential research direction, as it would be beneficial to remove the error scaling hyperparameter.

6 Conclusion

The paper presents an ML-enhanced inverse design framework for problems with stringent simulation budgets. This framework, applied to two distinct engineering challenges–AID and SFR–leveraged a pre-trained ML model. The goal was to reduce the size of the optimization design space and decrease the need for costly HF simulations to arrive at an optimal design. In this ML-enhanced framework, both the DE and PSO optimization algorithms, which have an extensive demand for objective function evaluations, were enhanced with the ML model and contrasted with their conventional versions.
The main contributions of the study can be summarized in several points:
  • An ML model trained on a small set of LF data effectively narrows the optimization design space. This facilitates a better rate of convergence of both PSO and DE towards a better approximation of the target performance within a predefined HF simulation budget.
  • The ML-framework proves highly effective for both minimizing the number of HF simulations and approximating user-defined target designs. A relationship between the ML model’s error metric (RMSE) and the mechanism for minimizing HF simulations has been established and explored. For the AID and SFR problems, the hyperparameter c, which is used to multiply the RMSE, is recommended to be chosen from \(c \in \{0.5, 1, 2\}\).
  • The solutions obtained with population-based stochastic global optimization algorithms, such as DE and PSO, can be significantly improved when guided by ML models.
  • The effectiveness of the ML-enhanced inverse design framework was demonstrated on two conceptually different engineering challenges.
For the AID problem, future studies could delve into the integration of sophisticated computational fluid dynamics models like RANS or LES as the main HF simulators in the optimization loop, complemented by the ML model. Regarding the SFR problem, research emphasis should be on increasing the problem complexity, e.g., by employing a fully transient simulation model, integrating the diffusion coefficient value into both the ML model and inverse design, and potentially utilizing the RANS model for flow field reconstruction [6]. Furthermore, an analysis of the influence of the number of field measurements should be conducted.
Generally, the ML-enhanced framework proposed here could find application in any problem where meaningful reduced-order information can be obtained and approximated using an ML model. Multiple scientific applications fall into this problem category including simulations in climate and combustion that can be run with different grid resolutions and time step sizes. The proposed framework could be implemented within a larger hybrid metaheuristic-Bayesian optimization framework to further minimize the number of HF function evaluations, and it could be further investigated with other derivative-free optimization algorithms.

Acknowledgements

This work was supported by the Laboratory Directed Research and Development Program of Lawrence Berkeley National Laboratory under U.S. Department of Energy Contract No. DE-AC02-05CH11231. Müller’s time was supported under U.S. Department of Energy Contract No. DE-AC36-08GO28308. Funding for math developments was provided by U.S. Department of Energy Office of Science, Office of Advanced Scientific Computing Research, Scientific Discovery through Advanced Computing (SciDAC) program through the FASTMath Institute. Funding for analysis of applications was provided by the Laboratory Directed Research and Development Program of the National Renewable Energy Laboratory.

Declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

Appendices

Appendix A: AID numerical experiments and dataset

This section details the target airfoil designs, flow conditions, and the flow simulation solver. Additionally, it describes the dataset used to train the ML models for the AID problem.

A.1 Airfoils and aerodynamic flow analysis

The RAE2822 and NACA2410 airfoils are used for the numerical analysis. Both are asymmetric about the chord line, unlike the base NACA0012 airfoil, which was used to construct the original lower and upper boundaries of the decision vector. The NACA2410 airfoil is a member of the same family as the NACA0012, which serves as a reference for defining the B-Spline coefficient constraints. The RAE2822 airfoil is one of the most widely used benchmark airfoils in aerodynamic shape optimization and inverse design [14, 26, 42, 43]. The shapes of both airfoils are shown in the top row of Fig. 16, and the differences between the HF and LF simulation results, in terms of the \(C_p\) distribution, are shown in the bottom row of Fig. 16.
For both investigated airfoils, the flow simulation parameters, namely the Reynolds number (Re), angle of attack (AoA), and Mach number (Ma), were set to \(5\cdot 10^7\), 4, and 0, respectively. The target pressure coefficients, including the \(C^T_{p_{min}}\) values, were derived from these flow conditions and the specific airfoils. Specifically, \(C^T_{p_{min}}=-2.27\) for the RAE2822 and \(C^T_{p_{min}}=-1.58\) for the NACA2410. The aerodynamic flow analysis was conducted using XFOIL. This software package, specifically designed for subsonic airfoil analysis, served as the primary tool for assessing the pressure coefficients around the airfoil [15]. The Python wrapper for XFOIL, xfoil 1.1.1 [62], was used.
XFOIL is based on a numerical panel method coupled with a boundary layer model, which enables accurate predictions of the flow behavior around an airfoil. Through an iterative process, XFOIL solves the potential flow equation for inviscid flows and the integral boundary layer equations for momentum and energy in viscous flows. It is best suited to incompressible flow scenarios with a Reynolds number between \(10^6\) and \(10^8\). It has been shown in [47] that XFOIL is more accurate than other methods for high-lift, low-Reynolds-number airfoils. The number of discretization panels determines the fidelity of an XFOIL simulation: HF simulations use 300 panels, while LF simulations use 100.
For each analysis, XFOIL takes as input the airfoil design, represented by coordinates generated from the optimization variables (the B-Spline coefficients) together with the B-Spline degree and knots. Additional parameters, such as Re, AoA, and Ma, must be specified for each simulation. Each XFOIL evaluation outputs the pressure coefficients computed around the airfoil, which are compared with the target pressure coefficients. The number of iterations was set to 400 for every simulation, the panel bunching parameter was set to 1, the trailing and leading edge density ratio was set to 0.15, and the refined-area-leading edge panel density ratio was set to 0.2.
While the proposed inverse design framework can leverage various computational fluid dynamics (CFD) analysis tools, XFOIL was selected for its computational efficiency and as a proof-of-concept. The difference in execution time between the HF and LF XFOIL simulations is not significant; however, the quality of the solution does differ (shown in Fig. 16). In the future, this methodology can be extended to more sophisticated approaches, such as Reynolds-averaged Navier–Stokes (RANS) or Large Eddy Simulation (LES), both of which have significantly higher computational demands.

A.2 Minimum pressure coefficient dataset

To train the ML model, a suitable dataset must be generated. As defined in Table 1, the \(M_{info}\) value corresponds to the minimum pressure coefficient, denoted \(C_{p_{min}}\), which requires simulating the aerodynamic properties of a wide array of geometries and mapping them to their respective \(C_{p_{min}}\) values. The dataset was assembled using the LHS design of experiments technique. The input features for training the ML model, the B-Spline coefficients (\(\mathbf {c_u}\) and \(\mathbf {c_l}\) in Eq. (7)), were sampled with LHS. Each sampled set of B-Spline coefficients was then transformed into an airfoil geometry to obtain the corresponding \(C_{p_{min}}\) value. All data were generated with LF simulations using 100 discretization panels and the flow parameters defined in Sect. A.1. A total of 15000 LF simulations were conducted, yielding 15000 pairs of B-Spline coefficients and \(C_{p_{min}}\) values.
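The LHS sampling step can be sketched in a few lines of standard-library Python. This is an illustrative sampler, not the implementation used in the study (which is not named in the text); the function `latin_hypercube` and the three-dimensional bounds `lb` and `ub` are hypothetical, and the LF simulation that would label each sample is omitted.

```python
import random


def latin_hypercube(n_samples, lower, upper, seed=None):
    """Draw n_samples points in the box [lower, upper] with Latin hypercube
    sampling: each dimension is split into n_samples equal strata and every
    stratum is hit exactly once."""
    rng = random.Random(seed)
    dim = len(lower)
    samples = [[0.0] * dim for _ in range(n_samples)]
    for j in range(dim):
        strata = list(range(n_samples))
        rng.shuffle(strata)  # random pairing of strata across dimensions
        width = (upper[j] - lower[j]) / n_samples
        for i, k in enumerate(strata):
            # uniform draw inside stratum k of dimension j
            samples[i][j] = lower[j] + (k + rng.random()) * width
    return samples


# Hypothetical bounds for three coefficients; the real decision vector is
# longer and its bounds are derived from the NACA0012 airfoil.
lb = [0.0, -0.1, 0.05]
ub = [0.2, 0.1, 0.15]
designs = latin_hypercube(10, lb, ub, seed=0)
print(len(designs), len(designs[0]))
```

Each row of `designs` would then be converted to an airfoil geometry and passed to an LF simulation to obtain its \(C_{p_{min}}\) label.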

Appendix B: SFR numerical experiments and dataset

This section provides details on the inverse design targets and the solver used for simulating the scalar diffusion process. It also includes information on the scalar measurement locations and the random generator algorithm for the scalar boundary values. Additionally, it describes the specifics of the ML dataset generated for the SFR problem.

B.1 Scalar diffusion boundary conditions and solver

Two distinct boundary conditions (BC) were investigated to demonstrate the versatility of a single ML model across various scenarios. As depicted in Fig. 17, one BC exhibits a sinusoidal pattern, whereas the other adheres to a linear trend. Both BCs were used to generate \(\textbf{s}^{T}\) arrays. The values were measured at locations given in Sect. B.3 (Fig. 18 and Table 6). The \(s^T_{{max}}\) (\(T_{info}\)) value for the sinusoidal BC (Fig. 17a) was 5.67, while for the linear BC (Fig. 17b) it was 9.1.
To simulate the BCs over the domain, the open-source computational fluid dynamics library OpenFOAM 9 was used [31], specifically its laplacianFoam diffusion PDE solver. The details of the LF and HF domains are presented in Table 5, together with the difference between the LF and HF modeled values of \(s^T_{{max}}\). The LF scalar diffusion equation is solved on a computational grid with 16 times fewer finite volume cells than the HF grid. The LF and HF OpenFOAM simulation execution times are similar; however, because the obtained results differ, the cases can serve as a proof-of-concept for the ML-enhanced framework.
Table 5
“Total cells” refers to the number of cells within the computational domain. “Top BC cells” signifies the quantity of cells along the \(\Omega _{x}\)-axis direction, where the Dirichlet boundary condition is applied. Scalar values at the boundary are set in the cell centers. \(\delta \Omega _{x}\) and \(\delta \Omega _{y}\) represent the cell sizes in the \(\Omega _{x}\) and \(\Omega _{y}\) axis directions, respectively

| Type | Total cells | Top BC cells | \(\delta \Omega _{x}\) | \(\delta \Omega _{y}\) | \(s^T_{{max}}\) Sinusoidal BC | \(s^T_{{max}}\) Linear BC |
|---|---|---|---|---|---|---|
| LF | 400 | 20 | 0.05 | 0.025 | 5.72 | 8.67 |
| HF | 6400 | 80 | 0.0125 | 0.00625 | 5.67 | 9.10 |
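The entries of Table 5 can be cross-checked with a few lines of arithmetic. The sketch below copies the table values; the number of cells in the \(\Omega _{y}\) direction is not given in the table and is inferred here as total cells divided by top BC cells, which assumes a structured rectangular grid.

```python
# Values copied from Table 5; the number of cells in the y-direction is
# inferred as total / top (an assumption of a structured grid).
grids = {
    "LF": {"total": 400, "top": 20, "dx": 0.05, "dy": 0.025,
           "s_max": {"sinusoidal": 5.72, "linear": 8.67}},
    "HF": {"total": 6400, "top": 80, "dx": 0.0125, "dy": 0.00625,
           "s_max": {"sinusoidal": 5.67, "linear": 9.10}},
}

for name, g in grids.items():
    ny = g["total"] // g["top"]
    print(f"{name}: {g['top']} x {ny} cells, "
          f"domain {g['top'] * g['dx']:.2f} x {ny * g['dy']:.2f}")

print("cell ratio HF/LF:", grids["HF"]["total"] // grids["LF"]["total"])

for bc in ("sinusoidal", "linear"):
    lf, hf = grids["LF"]["s_max"][bc], grids["HF"]["s_max"][bc]
    print(f"{bc}: LF deviates from HF by {100 * (lf - hf) / hf:+.1f}%")
```

Under this assumption both grids cover the same 1 by 0.5 domain, the HF grid has 16 times as many cells, and the LF value of \(s^T_{{max}}\) deviates from the HF value by roughly +0.9% for the sinusoidal BC and -4.7% for the linear BC.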

B.2 Maximum scalar dataset

The \(M_{info}\) for the SFR problem is the maximum scalar value, \(s_{max}\), within the domain. To build the dataset, the BC scalar values were varied. For every distinct set of scalar values, a simulation was executed and the corresponding \(s_{max}\) value was recorded. The method used to generate diverse BCs for this reconstruction problem is described in Alg. 3, presented in Sect. B.4. For dataset creation, 15000 LF simulations were executed using OpenFOAM 9, with full details provided in Sect. B.1.

B.3 Scalar field domain probe locations

This section presents the scalar measurement locations within the domain \(\Omega\), which serve as post-processing points for each OpenFOAM simulation. The probe locations depicted in Fig. 18 and listed in Table 6 are used both for evaluating the target performance (scalar distribution) in the two BC scenarios and for training the ML models.
Table 6
\(\Omega _x\)-axis and \(\Omega _y\)-axis values for all probe locations within the domain \(\Omega\) for the SFR problem

| Probe | \(\Omega _x\) | \(\Omega _y\) |
|---|---|---|
| 1 | 0.168 | 0.263 |
| 2 | 0.063 | 0.043 |
| 3 | 0.867 | 0.445 |
| 4 | 0.711 | 0.292 |
| 5 | 0.412 | 0.329 |
| 6 | 0.593 | 0.193 |
| 7 | 0.096 | 0.227 |
| 8 | 0.670 | 0.104 |
| 9 | 0.814 | 0.064 |
| 10 | 0.109 | 0.083 |
| 11 | 0.666 | 0.216 |
| 12 | 0.024 | 0.399 |
| 13 | 0.560 | 0.276 |
| 14 | 0.322 | 0.374 |
| 15 | 0.250 | 0.009 |
| 16 | 0.210 | 0.343 |
| 17 | 0.277 | 0.128 |
| 18 | 0.957 | 0.136 |
| 19 | 0.933 | 0.496 |
| 20 | 0.151 | 0.175 |
| 21 | 0.461 | 0.409 |
| 22 | 0.385 | 0.470 |
| 23 | 0.785 | 0.032 |
| 24 | 0.511 | 0.091 |
| 25 | 0.488 | 0.458 |
| 26 | 0.619 | 0.307 |
| 27 | 0.355 | 0.361 |
| 28 | 0.865 | 0.425 |
| 29 | 0.976 | 0.163 |
| 30 | 0.765 | 0.249 |

B.4 Scalar boundary condition generator

Alg. 3 illustrates the method used to generate the BC for SFR. The method begins with the values \(LF_n\) and \(s_{\Omega _{top}}\), which represent the number of LF discretization points at the top of the boundary and the maximum scalar value that can be set at the top of the domain \(\Omega\), respectively. The vector \(\textbf{BC}_{init}\) represents the initial BC. It consists of \(LF_n\) equally spaced points obtained by linear interpolation between 1 and \(LF_n\). A value R is randomly chosen from a uniform distribution and selects one of three BC variations: linear, parabolic, or sinusoidal. \(\textbf{G}_{noise}\) is a random vector of length \(LF_n\) generated from a normal distribution whose standard deviation, \(\sigma\), is drawn from a uniform distribution. This vector is added to the transformed \(\textbf{BC}\) to enhance model robustness, simulate real-world scenarios, and ensure better generalization in imperfect or noisy environments.
For each run, one of the three BC types is chosen and \(\textbf{BC}_{init}\) is transformed using the corresponding equation (linear, parabolic, or sinusoidal) incorporating randomly generated values \(rand_1\), \(rand_2\), and \(rand_3\) from a uniform distribution. If the resultant \(\textbf{BC}\) with added noise \(\textbf{G}_{noise}\) has values exceeding \(s_{max}\), they are substituted with a random value between 0 and \(s_{max}\). To further diversify the generated BCs, if a random value between 0 and 1 is less than 0.5, the \(\textbf{BC}\) is reversed.
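The generator can be sketched as follows. Only the overall steps follow the description above; the functional forms of the three BC types, the ranges of `r1`, `r2`, `r3` and of \(\sigma\), and the value used for \(s_{\Omega _{top}}\) are assumptions made for illustration, since Alg. 3 itself is not reproduced here, and the function name `generate_bc` is hypothetical.

```python
import math
import random


def generate_bc(lf_n, s_top, rng):
    """Random boundary-condition generator in the spirit of Alg. 3.
    The three transforms and all random ranges are illustrative assumptions."""
    # BC_init: lf_n equally spaced points from 1 to lf_n, scaled to (0, 1]
    x = [i / lf_n for i in range(1, lf_n + 1)]
    kind = rng.choice(("linear", "parabolic", "sinusoidal"))  # the state R
    r1, r2, r3 = (rng.uniform(0.0, 1.0) for _ in range(3))
    if kind == "linear":
        bc = [s_top * r1 * xi + r2 for xi in x]
    elif kind == "parabolic":
        bc = [s_top * r1 * xi ** 2 + r2 * xi + r3 for xi in x]
    else:
        bc = [s_top * r1 * math.sin(math.pi * r2 * xi) + r3 for xi in x]
    # Gaussian noise with a uniformly drawn standard deviation
    sigma = rng.uniform(0.0, 0.1 * s_top)
    bc = [b + rng.gauss(0.0, sigma) for b in bc]
    # Values above the allowed maximum are redrawn uniformly in [0, s_top]
    bc = [rng.uniform(0.0, s_top) if b > s_top else b for b in bc]
    # Reverse the profile with probability 0.5 to diversify the dataset
    if rng.random() < 0.5:
        bc.reverse()
    return kind, bc


rng = random.Random(42)
kind, bc = generate_bc(lf_n=20, s_top=10.0, rng=rng)  # 20 = LF top BC cells
print(kind, [round(b, 2) for b in bc[:5]])
```

As in the description, only values above the maximum are resampled; negative values produced by the noise are left untouched in this sketch.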

Appendix C: ML algorithms hyperparameters

In this section, the optimized hyperparameters of all investigated ML algorithms are presented. The best-performing algorithm was then used as part of the enhanced inverse design framework. All three ML algorithms were tuned with Optuna 3.1.0 [2], a Python framework for hyperparameter optimization. Each algorithm was tuned for 100 trials, with the goal of minimizing the average RMSE of a shuffled K-Fold (\(k=3\)) cross-validation procedure. The ML algorithms were tuned separately for the two investigated problems/datasets (described in Sect. A.2 and Sect. B.2), and 15000 LF data instances were used for the tuning. For each algorithm, the best set of hyperparameters out of the 100 Optuna trials was selected.
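The quantity minimized in each tuning trial, the average RMSE over a shuffled 3-fold split, can be sketched in plain Python. This is a simplified stand-in: the helper `kfold_rmse` and the placeholder `mean_predictor` are hypothetical, and in the study the model would be XGB, LGB, or the MLP, with Optuna proposing the hyperparameters.

```python
import math
import random


def kfold_rmse(X, y, fit_predict, k=3, seed=0):
    """Average RMSE over a shuffled k-fold split.

    fit_predict(X_train, y_train, X_test) must return predictions for X_test.
    """
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint, near-equal folds
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        preds = fit_predict([X[i] for i in train_idx],
                            [y[i] for i in train_idx],
                            [X[i] for i in test_idx])
        mse = sum((p - y[i]) ** 2 for p, i in zip(preds, test_idx)) / len(test_idx)
        scores.append(math.sqrt(mse))
    return sum(scores) / k


def mean_predictor(X_train, y_train, X_test):
    # Placeholder model: always predicts the mean of the training targets
    mean = sum(y_train) / len(y_train)
    return [mean] * len(X_test)


X = [[float(i)] for i in range(12)]
y = [2.0 * row[0] for row in X]
print(kfold_rmse(X, y, mean_predictor))
```

A tuning loop would call such a function once per trial with a model built from the trial's hyperparameters and keep the configuration with the lowest returned value.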
The XGB algorithm hyperparameters used are presented in Table 7. The max_depth parameter controls the depth of each tree, the n_estimators defines the total number of gradient boosted trees in the model, learning_rate scales the contribution of each tree when it is added to the ensemble of trees, colsample_bytree and subsample parameters specify the fraction of the randomly sampled features and data instances used to construct each tree, respectively, and gamma, reg_alpha and reg_lambda are regularization parameters. The Python module xgboost 1.7.4 was used.
Table 7
XGB model hyperparameters tuned with the Optuna Python framework. The best solution of 100 trials is shown. The first column denotes the names of the tuned hyperparameters, while the second column shows the values obtained for the AID problem, and the third column shows the parameter values for the SFR problem

| Hyperparameter | AID | SFR |
|---|---|---|
| max_depth | 4 | 5 |
| n_estimators | 500 | 500 |
| learning_rate | 0.07 | 0.06 |
| colsample_bytree | 0.94 | 0.19 |
| subsample | 0.54 | 0.36 |
| gamma | 0.42 | 1.33 |
| reg_alpha | 2.44 | 0.51 |
| reg_lambda | 4.16 | 0.20 |
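For reuse, the values of Table 7 can be collected in a small configuration dictionary. The sketch below only stores the numbers; passing them to the scikit-learn style `XGBRegressor` class of xgboost, as hinted in the comment, is an assumption about how the model might be constructed and is not stated in the text. The name `XGB_PARAMS` is hypothetical.

```python
# Tuned XGB hyperparameters copied from Table 7, keyed by problem.
XGB_PARAMS = {
    "AID": {"max_depth": 4, "n_estimators": 500, "learning_rate": 0.07,
            "colsample_bytree": 0.94, "subsample": 0.54, "gamma": 0.42,
            "reg_alpha": 2.44, "reg_lambda": 4.16},
    "SFR": {"max_depth": 5, "n_estimators": 500, "learning_rate": 0.06,
            "colsample_bytree": 0.19, "subsample": 0.36, "gamma": 1.33,
            "reg_alpha": 0.51, "reg_lambda": 0.20},
}

# With xgboost installed, a regressor could be built as (not executed here):
#   from xgboost import XGBRegressor
#   model = XGBRegressor(**XGB_PARAMS["AID"])
for problem, params in XGB_PARAMS.items():
    print(problem, params)
```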
The LGB model hyperparameters are presented in Table 8. The num_iterations parameter controls the number of boosting iterations performed. Each iteration builds a new tree that boosts the performance of the model. As in XGB, the learning_rate scales the contribution of each tree when it is added to the model, while lambda_l1 and lambda_l2 are L1 and L2 regularization parameters added to reduce overfitting. The parameter num_leaves controls the complexity of the model, and min_child_samples is the minimum number of data instances a leaf node must have after a split, which acts as a form of regularization. The feature_fraction parameter defines the fraction of features used at each training iteration, while bagging_fraction determines the fraction of data instances used at each iteration. Both parameters also act as a form of regularization. The Python module LightGBM 3.3.5 was used.
Table 8
LGB model hyperparameters tuned with the Optuna Python framework (the best solution out of 100 for each problem). The first column denotes the names of the tuned hyperparameters, the second column shows the values obtained for the AID problem, and the third column shows the parameter values for the SFR problem

| Hyperparameter | AID | SFR |
|---|---|---|
| num_iterations | 2500 | 2500 |
| learning_rate | 0.0187 | 0.0190 |
| lambda_l1 | 1.78 | 0.40 |
| lambda_l2 | 7.43 | 6.71 |
| num_leaves | 41 | 68 |
| min_child_samples | 37 | 99 |
| feature_fraction | 0.77 | 0.56 |
| bagging_fraction | 0.45 | 0.52 |
The hyperparameters for the MLP are detailed in Table 9. The LeakyReLU activation function was applied to all hidden layers in the AID problem, whereas ReLU was used for the SFR problem. Monte Carlo dropout layers were integrated into the architecture to reduce overfitting. During training for both problems, 30% of the data was reserved for validation. An early stopping criterion with a patience value of 20, based on the validation loss, was used to further combat overfitting. The MLP was implemented in TensorFlow 2.11.0 [1].
Table 9
MLP hyperparameters tuned with the Optuna Python framework (the best solution out of 100 trials is shown for each problem). The first column denotes the names of the tuned hyperparameters, the second column shows the values obtained for the AID problem, and the third column shows the parameter values for the SFR problem

| Hyperparameter | AID | SFR |
|---|---|---|
| Layers | 3 | 2 |
| Neurons per layer | 92, 116, 34 | 388, 322 |
| Dropout per layer | 0.1, 0.1, 0.0 | 0.1, 0.0 |
| Activation function | LeakyReLU | ReLU |
| Optimizer | Adam | Adam |
| Epochs | 100 | 500 |
| Batch size | 128 | 64 |
| Learning rate | 0.00083 | 0.00060 |

Appendix D: Boundary refinement convergence

Figure 19 illustrates the impact of both the dataset size used for training the XGB model and the number of solutions, N, derived from the boundary refinement technique on the formation of the new lower and upper boundaries, \(\textbf{lb}_R\) and \(\textbf{ub}_R\), respectively. Since these new boundaries can be interpreted as an airfoil shape, the effect of the dataset size and the N value is articulated through the average and standard deviation of the \(\zeta _y\) values (the chord length-normalized y-coordinates of the airfoil defined by \(\textbf{lb}_R\) and \(\textbf{ub}_R\)).
For 10, 50, and 150 runs (or solutions), the average \(\zeta _y\) and its standard deviation bandwidth exhibit only minor variations as the dataset size increases. This trend is discernible for both NACA2410 and RAE2822 in Figs. 19a and 19b. This implies that even a boundary refinement formulated by an XGB model trained with just 500 data instances and merely 10 repeated runs could be beneficial, as the boundaries remain relatively consistent despite increases in both parameters.
Figure 20 demonstrates the impact of the number of solutions, N, and the dataset size on the average scalar value \(\overline{s}\) of the BC. Mirroring the observations from the AID boundary refinement, neither the dataset size nor the number of solutions exerts a significant effect on \(\overline{s}\).

Appendix E: AID results for \(\eta\) = 1.1 and \(\eta\) = 1.2

The results of the ML-enhanced inverse design framework utilizing the XGB model and the boundary refinement technique (\(\eta\) = 1.1 and \(\eta\) = 1.2) applied to the AID problem for the NACA2410 and RAE2822 airfoils are shown in Figs. 21 and 22.
References
2.
3. Aster RC, Borchers B, Thurber CH (2018) Parameter estimation and inverse problems. Elsevier, Amsterdam
5. Beran PS, Bryson D, Thelen AS, et al (2020) Comparison of multi-fidelity approaches for military vehicle design. In: AIAA Aviation 2020 Forum, p 3158
7. Chakraborty S, Chatterjee T, Chowdhury R et al (2017) A surrogate based multi-fidelity approach for robust design optimization. Appl Math Model 47:726–744
10. Chen Q, Wang J, Pope P et al (2022) Inverse design of two-dimensional airfoils using conditional generative models and surrogate log-likelihoods. J Mech Des 144(2):021712
12. Chen W, Chiu K, Fuge MD (2020) Airfoil design parameterization and optimization using Bézier generative adversarial networks. AIAA J 58(11):4723–4735
13. Demange J, Savill AM, Kipouros T (2016) Multifidelity optimization for high-lift airfoils. In: 54th AIAA Aerospace Sciences Meeting, p 0557
15. Drela M (1989) XFOIL: An analysis and design system for low Reynolds number airfoils. In: Low Reynolds Number Aerodynamics: Proceedings of the Conference Notre Dame, Indiana, USA, 5–7 June 1989, Springer, pp 1–12
18. Eldred M, Dunlavy D (2006) Formulations for surrogate-based optimization with data fit, multifidelity, and reduced-order models. In: 11th AIAA/ISSMO multidisciplinary analysis and optimization conference, p 7117, https://doi.org/10.2514/6.2006-7117
20. Fischer CC, Grandhi RV (2016) Multi-fidelity design optimization via low-fidelity correction technique. In: 17th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference, p 4293
22. Fusi F, Guardone A, Quaranta G et al (2015) Multifidelity physics-based method for robust optimization applied to a hovering rotor airfoil. AIAA J 53(11):3448–3465
24. Guo Q, Hang J, Wang S et al (2021) Design optimization of variable stiffness composites by using multi-fidelity surrogate models. Struct Multidiscip Optim 63:439–461
25. Habibi M, Wang J, Fuge M (2023) When is it actually worth learning inverse design? In: International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, American Society of Mechanical Engineers, p V03AT03A025
31. Jasak H, Jemcov A, Tukovic Z, et al (2007) OpenFOAM: A C++ library for complex physics simulations. In: International workshop on coupled methods in numerical dynamics, pp 1–20
32. Jo Y, Yi S, Choi S et al (2016) Adaptive variable-fidelity analysis and design using dynamic fidelity indicators. AIAA J 54(11):3564–3579
34. Ke G, Meng Q, Finley T et al (2017) LightGBM: a highly efficient gradient boosting decision tree. Adv Neural Inf Process Syst 30:5
37. Kudyshev ZA, Kildishev AV, Shalaev VM et al (2020) Machine learning-assisted global optimization of photonic devices. Nanophotonics 10(1):371–383
38. Lederer A, Conejo AJO, Maier K, et al (2020) Real-time regression with dividing local Gaussian processes. arXiv preprint arXiv:2006.09446
40. Leifsson L, Koziel S (2010) Multi-fidelity design optimization of transonic airfoils using physics-based surrogate modeling and shape-preserving response prediction. J Comput Sci 1(2):98–106
44. Marzouk Y, Xiu D (2009) A stochastic collocation approach to Bayesian inference in inverse problems. Commun Comput Phys 6(4):826–847
45. Mehmani A, Chowdhury S, Messac A (2014) Managing variable fidelity models in population-based optimization using adaptive model switching. In: 15th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference, p 2436
48. Nielsen MA (2015) Neural networks and deep learning, vol 25. Determination Press, San Francisco
51. Poloczek M, Wang J, Frazier PI (2016) Warm starting Bayesian optimization. In: 2016 Winter simulation conference (WSC), IEEE, pp 770–781
54. Robinson T, Willcox K, Eldred M, et al (2006) Multifidelity optimization for variable-complexity design. In: 11th AIAA/ISSMO multidisciplinary analysis and optimization conference, p 7114
60. Tarantola A (2005) Inverse problem theory and methods for model parameter estimation. SIAM. https://doi.org/10.1137/1.9780898717921
Metadata
Title: Efficient inverse design optimization through multi-fidelity simulations, machine learning, and boundary refinement strategies
Authors: Luka Grbcic, Juliane Müller, Wibe Albert de Jong
Publication date: 9 September 2024
Publisher: Springer London
Published in: Engineering with Computers, Issue 6/2024
Print ISSN: 0177-0667
Electronic ISSN: 1435-5663
DOI: https://doi.org/10.1007/s00366-024-02053-4