Abstract
Topology optimization is a structural design methodology widely utilized to address engineering challenges. However, sensitivity-based topology optimization methods struggle to solve optimization problems characterized by strong non-linearity. Leveraging the sensitivity-free nature and high capacity of deep generative models, the data-driven topology design (DDTD) methodology is considered an effective solution to this problem. Despite this, the training effectiveness of deep generative models diminishes when the input size exceeds a threshold, while maintaining a high degree of freedom is crucial for accurately characterizing complex structures. To resolve the conflict between the two, we propose DDTD based on principal component analysis (PCA). Its core idea is to replace the direct training of deep generative models on material distributions with training on a principal component score matrix obtained from PCA computation, and to obtain the generated material distributions with new features through the restoration process. We apply the proposed PCA-based DDTD to the problem of minimizing the maximum stress in 3D structural mechanics and demonstrate that it can effectively address the current challenge faced by DDTD, namely its inability to handle 3D structural design problems. Various experiments are conducted to demonstrate the effectiveness and practicability of the proposed PCA-based DDTD.
Notes
Communicated by Jun Wu.
1 Introduction
Topology optimization (TO) aims at designing structures with optimal performance to meet specific engineering requirements and functional demands. Since TO was proposed by Bendsøe and Kikuchi (1988), it has been applied to various engineering problems with tremendous success. However, with the further development and application of TO methods, some researchers have gradually noticed the difficulty of mainstream sensitivity-based methods in coping with strongly nonlinear problems. That is, due to the presence of multiple local optima in the solution space of strongly nonlinear problems, sensitivity-based methods may fall into local optima that have low performance from an engineering viewpoint, making it challenging to solve such problems, e.g., turbulent flow channel design (Dilgen et al. 2018) and compliant mechanism design considering maximum stress (De Leon et al. 2020). Although researchers have proposed some approaches to improve sensitivity-based methods for dealing with strongly nonlinear problems, these are usually implemented indirectly, potentially leading to issues such as loss of accuracy. Hence, sensitivity-free methods have gained widespread attention due to their advantages of not requiring sensitivity analysis and offering higher generality.
Since Hajela et al. (1993) presented an approach to updating the shape and topology of structures using genetic algorithms, researchers have proposed a series of sensitivity-free methods (Wang and Tai 2004; Tai and Akhtar 2005; Wang et al. 2023) applied to structural design problems in a given design domain. Although these methods demonstrate the advantages of sensitivity-free approaches in solving strongly nonlinear problems, they also reveal the difficulty of finding optimal or satisfactory solutions with a high degree of freedom (DOF). Tai and Prasad (2007) attempted to represent material distributions with parametric models to reduce the number of design variables, but the reduction in the DOF of the design variables also makes it difficult to represent complex structures. Therefore, it is extremely challenging to solve strongly nonlinear problems in a sensitivity-free manner while maintaining the ability to represent material distributions with a high DOF.
With the rapid development of artificial intelligence (AI), some researchers are increasingly recognizing that deep generative models (a type of AI) (Kingma and Welling 2013) have the potential to address the aforementioned problems. Deep generative models utilize unsupervised learning to extract features from training data and generate novel yet similar data by sampling within the latent space. Thanks to the capabilities of deep neural networks, deep generative models can generate diverse material distributions with significant flexibility using a limited set of latent variables.
On the basis of this point of view, Yamasaki et al. (2021) proposed a sensitivity-free data-driven topology design (DDTD) methodology for efficiently solving strongly nonlinear multi-objective problems with a high DOF using a deep generative model. In their research, elite material distributions are selected from already obtained material distributions with a high DOF on the basis of the non-dominated rank (Deb et al. 2002). They are fed into the variational autoencoder (VAE) for training, and their features are extracted into a small-sized latent space. Then, latent variables are sampled in the latent space and decoded back to the original DOF. Due to the nature of deep generative models, the newly generated material distributions are diverse and inherit features from the training data, that is, the elite material distributions. The newly generated material distributions are merged into the training data, and after that, new elite material distributions are selected from the merged data. They are then used as the inputs for the next round of VAE training. By repeating the above processes, the performance of elite material distributions is enhanced while preserving a representation with a high DOF.
Although DDTD is promising for strongly nonlinear structural design problems, application studies of DDTD have revealed that limiting the number of DOFs for representing material distributions to a suitable value (approximately tens of thousands in our experience) is crucial for successful VAE training. However, this limitation significantly impedes the application of DDTD to 3D optimization problems.
On the basis of the above discussions, in this paper, we propose integrating principal component analysis (PCA) into data-driven topology design as a solution to this challenge. In other words, we train the deep generative model indirectly by utilizing principal component score data obtained via PCA instead of using the original material distribution data. In this way, the deep generative model generates new principal component score data that inherits the original features while gaining diversity. The new material distribution data are subsequently derived by the process of restoration from the new principal component scores. With the aid of PCA, the DOFs of the material distribution can be reduced from the original DOFs to the number of the elite material distributions, thereby addressing the challenges faced by DDTD when applied to 3D structural design problems.
In the following, related work is introduced in Sect. 2, the proposed framework and its implementation are described in Sect. 3, and its effectiveness is confirmed using numerical examples in Sect. 4. Finally, conclusions are provided in Sect. 5.
2 Related works
2.1 Topology optimization
TO has the potential to provide high-performance structural designs that are widely used in industrial manufacturing, 3D printing, medical, and many other fields. Since the homogenization method, which transforms the TO problem for a macrostructure into a size optimization problem for the material microstructure, was proposed by Bendsøe and Kikuchi (1988), TO has attracted great attention and motivated more researchers to devote themselves to the TO field. Subsequently, the solid isotropic material with penalization (SIMP) method was introduced by Bendsøe (1989) and later developed by Zhou and Rozvany (1991), Mlejnek (1992), as well as Bendsoe and Sigmund (2003). In this method, the material distribution within the design domain is represented by a scalar field, and the material properties are related to relative densities by a power-law interpolation, while the intermediate densities provide the basis for filling porous microstructures, as described in Wu et al. (2021).
In addition, Xie and Steven (1996) proposed the evolutionary structural optimization (ESO) method, i.e., the TO of the structure is achieved by removing the materials in the lower stress region. In contrast to the above density-based TO methods, researchers have proposed a series of methods for TO of structures by controlling the deformation of the structure via boundary evolution, such as the level set (Allaire et al. 2002), moving morphable component (MMC) (Guo et al. 2014), and moving morphable void (MMV) (Zhang et al. 2017). The level-set method describes the boundaries of the structures using the equivalence surfaces of the level-set function and performs TO via the evolution of that function. The MMC and MMV methods enable topological changes in structures by defining explicit components or holes and controlling their movement and integration. Other widely developed optimization methods have also brought fresh ideas into the TO field, such as the topological derivative (Novotny and Sokołowski 2012) and phase field (Takezawa et al. 2010).
It should be noted that these sensitivity-based methods build on repeated analysis and design update steps, mostly guided by gradient computation. However, the reliance on gradient information means that sensitivity-based methods may be trapped in locally optimal solutions, making it hard for them to solve strongly nonlinear problems. Although sensitivity-free TO methods can escape this issue and possess stronger generality, they also face the challenge of being hardly applicable to TO problems with a high DOF. Hence, in recent years, some researchers have sought to utilize machine learning (ML) techniques to solve the above challenges. Not limited to these challenges, many ML-based TO methods have been proposed, as reviewed in the works of Woldseth et al. (2022) and Regenwetter et al. (2022). We briefly introduce them as related works in the next section.
2.2 Machine learning-based TO methods
In recent years, many works have combined ML with TO to attempt to improve the quality of the solution and reduce the computational cost. Banga et al. (2018) proposed a deep learning approach based on an encoder–decoder architecture for accelerating the TO process. Zhang et al. (2019a) introduced a deep convolutional neural network with a strong generalization ability for TO. Chandrasekhar and Suresh (2021) demonstrated that one can directly execute TO using neural networks. The primary concept is to use the network’s activation functions to represent the density in a given design domain. Zhang et al. (2021) conducted an in-depth study on the method of directly using neural networks (NN) to carry out TO. The core idea is reparameterization, which means that the update of the design variables in the conventional TO method is transformed into the update of the network’s parameters. Jeong et al. (2023) proposed a novel TO framework: Physics-Informed Neural Network-based TO (PINN-based TO). It employs an energy-based PINN to replace finite-element analysis in conventional TO for numerically determining the displacement field.
It should be noted that the optimization problems discussed in most ML-based TO methods (e.g., minimizing compliance) can already be solved effectively by traditional sensitivity-based TO methods. Although ML-based TO methods have improved efficiency and effectiveness, they do not address optimization problems that are difficult to solve with sensitivity-based TO methods (e.g., strongly nonlinear problems).
As previously mentioned, sensitivity-free TO methods can effectively address strongly nonlinear problems. However, they encounter difficulties when applied to large-scale optimization problems with a high DOF. With the development and application of ML in TO, some researchers have noticed the potential of deep generative modeling to address this problem. Guo et al. (2018) presented a novel approach to TO using an indirect design representation. This method combines a variational autoencoder (VAE) to encode material distributions and a style transfer technique to reduce noise, allowing for efficient exploration of the design space and discovery of optimized structures. In this research, deep generative models were proven capable of handling large-scale optimization problems by encoding data into a latent space. Oh et al. (2019) proposed a framework that integrates TO and generative models in an iterative manner to explore new design options, thereby generating a wide array of designs from a limited set of initial design data. Zhang et al. (2019b) explored the 3D shape of a glider for conceptual design and optimization using a VAE. These approaches find the optimal design by employing genetic algorithms (GA) to explore the latent space of the trained VAE. Nonetheless, employing randomly generated initial individuals for GA operations results in considerable computational overhead and poses challenges for the VAE to learn meaningful features from entirely irregular data.
In addition to addressing large-scale optimization problems, another critical aspect in engineering design is the ability to handle multiple conflicting objectives. Traditional TO typically follows a single-objective formulation, where one objective (e.g., minimizing compliance) is optimized under predefined constraints. However, many real-world applications, such as lightweight yet high-stiffness structures, require balancing multiple design criteria simultaneously. Multi-objective optimization (MOO) formulates the design problem as a Pareto optimization task, providing a diverse set of optimal solutions that capture trade-offs between competing objectives (Deb et al. 2016; Emmerich and Deutz 2018). This enables engineers to explore a range of optimal solutions and select the most appropriate design based on practical considerations.
On the other hand, Yamasaki et al. (2021) proposed a sensitivity-free methodology called DDTD, incorporating a policy to provide the initial material distributions with a certain regularity to ensure that the VAE can effectively capture meaningful features. In addition, DDTD exclusively utilizes high-performance material distributions to train the VAE, which distinguishes it significantly from other methods that incorporate diverse material distributions as inputs. With the advantage of being sensitivity-free and the capability of solving large-scale problems, DDTD makes it possible to address strongly nonlinear optimization problems that are difficult or even impossible to solve with mainstream TO methods, and it has been applied in various research fields. Yaji et al. (2022) proposed data-driven multi-fidelity topology design (MFTD), which enables gradient-free optimization even when tackling a complex optimization problem with a high DOF, and applied it to forced convection heat transfer problems. Kato et al. (2023) tackled a bi-objective problem of exact maximum stress and volume minimization by data-driven MFTD, incorporating initial solutions composed of the optimized designs derived by solving gradient-based TO using the p-norm stress measure. Kii et al. (2024) proposed a new sampling method in the latent space, called the latent crossover, for improving the efficiency of DDTD.
3 Framework and implementation of PCA-based DDTD

3.1 Formulation

Consider D as the design domain, which is a fixed, nonempty, and sufficiently regular subset of \(\mathbb {R}^{d}(d=3)\) in this paper. The proposed PCA-based DDTD focuses on the following multi-objective optimization problem in the continuous system:
$$\begin{aligned} \underset{\rho }{\text {minimize}}&\quad \left[ J_{1}(\rho ),\, J_{2}(\rho ),\, \ldots ,\, J_{N_{\text {obj}}}(\rho ) \right] \\ \text {subject to}&\quad G_{j}(\rho ) \le 0, \quad j = 1, 2, \ldots , N_{\text {cns}},\\&\quad 0 \le \rho (\textbf{x}) \le 1, \quad \forall \, \textbf{x} \in D. \end{aligned}$$
Here, \(J_{i}\) is the i-th objective function and \(G_{j}(\rho )\) is the j-th constraint function. The design variable field, the so-called density field, \(\rho (\textbf{x})\) takes values from 0 to 1 at an arbitrary point \(\textbf{x}\) in D. \(\rho (\textbf{x}) = 1\) means that the material exists at that point, whereas \(\rho (\textbf{x}) = 0\) means the void. \(\rho (\textbf{x})\) has been relaxed according to the manner of the density method. \(N_{\text {obj}}, N_{\text {cns}}\) are the number of the objective and constraint functions, respectively.
For the implementation, the design domain is discretized using a finite-element mesh. On this finite-element mesh, the nodal densities serve as the design variables to characterize the corresponding material distribution. When calculating the objective and constraint functions in Eq. 1, a body-fitted mesh along with the iso-contour of \(\rho = 0.5\) is generated for each material distribution.
3.2 Data process flow

DDTD leverages a data-driven strategy, integrating deep generative models to explore and enhance the performance of solutions. The workflow of DDTD consists of three key stages: (i) evaluating and selecting high-performance material distributions (elite data) from the dataset based on predefined performance criteria, (ii) training a generative model using the selected elite data, and (iii) generating new, diverse material distributions and integrating them into the dataset for iterative improvement. By repeating this process, the overall performance of the elite data can be enhanced continuously. The effectiveness of DDTD lies in its ability to break free from the limitations of traditional TO methods, which often rely on deterministic gradient-based updates and can easily be trapped in local optima. The iterative learning mechanism of DDTD allows the deep generative model to progressively enhance the quality and diversity of generated solutions.
As a sensitivity-free methodology, DDTD uses a deep generative model to generate diverse data that differ from the input data. However, the capacity of the deep generative model (a VAE in this paper) is not infinite. This necessitates limiting the DOFs of the input data to a certain value (approximately tens of thousands in our experience) to ensure effective training of the VAE. Meanwhile, representing the designed structure using material distributions with a high DOF is important for characterizing shape and morphological changes, particularly in 3D structural design problems. To address the conflict between the two, a data dimensionality reduction method is employed to preprocess the input dataset of the VAE. Using PCA (a data dimensionality reduction method) and the DDTD methodology, material distributions with higher performance under the 3D optimization problem are iteratively selected while maintaining their high-DOF representation. Figure 1 shows the data process flow of the proposed PCA-based DDTD, and the details of each step are explained here.
Initial data generation Since deep generative models typically struggle to extract meaningful features from highly irregular material distributions, it is essential for the initial dataset to exhibit a certain degree of regularity. In this paper, we construct a pseudo-problem (low-fidelity problem) that is easily and directly solvable, yet relevant to the original multi-objective optimization problem (high-fidelity problem). The solution to this low-fidelity problem is used as the initial dataset. For example, if the high-fidelity problem is a compliant mechanism design problem (an example is in Sect. 4.4) that considers geometric non-linearity, we solve a compliant mechanism design problem under the assumption of linear strain as the low-fidelity problem. It is important to note that although all numerical examples in this paper utilize material distributions obtained by solving low-fidelity problems, DDTD is not restricted to this approach. Initial material distributions can also be derived from a parametric model including random numbers, as demonstrated by Zhou et al. (2025). Therefore, multi-fidelity problems are not mandatory for DDTD, including PCA-based DDTD.
Performance evaluation and data selection We evaluate the performance of the material distributions on the basis of the high-fidelity multi-objective problem, thereby obtaining the evaluated data. Subsequently, we perform the non-dominated sorting (Deb et al. 2002) to obtain the rank-one material distributions as the elite data (see Appendix A for details of the non-dominated sorting). Here, we limit the maximum number of the elite data to \(m_{\text {max}}\). When the number of the rank-one material distributions exceeds \(m_{\text {max}}\), the elite data are chosen based on their crowding distance within the objective function space, as described by Deb et al. (2002).
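As a minimal illustration of this selection step (not the implementation used to produce the results in this paper; helper names are hypothetical), the sketch below keeps the rank-one front and, when it exceeds \(m_{\text {max}}\), thins it by crowding distance following Deb et al. (2002).

```python
# Illustrative sketch of elite-data selection: rank-one (non-dominated) solutions,
# capped at m_max by crowding distance. Helper names are placeholders.
import numpy as np

def rank_one_front(J):
    """Indices of non-dominated rows of J, shape (N, N_obj); all objectives minimized."""
    keep = []
    for i in range(J.shape[0]):
        others = np.delete(J, i, axis=0)
        dominated = np.any(np.all(others <= J[i], axis=1) & np.any(others < J[i], axis=1))
        if not dominated:
            keep.append(i)
    return np.array(keep, dtype=int)

def crowding_distance(J):
    """Crowding distance of each point within a single front."""
    N, M = J.shape
    d = np.zeros(N)
    for m in range(M):
        order = np.argsort(J[:, m])
        d[order[0]] = d[order[-1]] = np.inf        # boundary points are always kept
        span = J[order[-1], m] - J[order[0], m]
        if span > 0:
            d[order[1:-1]] += (J[order[2:], m] - J[order[:-2], m]) / span
    return d

def select_elites(J, m_max=400):
    """Elite indices: rank-one solutions, thinned to m_max by largest crowding distance."""
    front = rank_one_front(J)
    if len(front) <= m_max:
        return front
    d = crowding_distance(J[front])
    return front[np.argsort(-d)[:m_max]]
```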
Data compression by PCA If the given convergence criterion is satisfied, we terminate the calculation and obtain the current elite data as the final results. Otherwise, we conduct data compression by PCA. Here, we denote the material distributions of the elite data, \(\textbf{X} \in \mathbb {R}^{m \times n}\), as follows:
$$\begin{aligned} \textbf{X} = \begin{bmatrix} \hat{\varvec{\rho }}_{1} \\ \hat{\varvec{\rho }}_{2} \\ \vdots \\ \hat{\varvec{\rho }}_{m} \end{bmatrix}, \end{aligned}$$
where \(\hat{\varvec{\rho }}_{i} \in \mathbb {R}^{1 \times n}\) is the nodal density vector of the ith material distribution, m is the number of the material distributions, and n is the number of DOF for each material distribution.
Using PCA, \(\textbf{X}\) is processed as follows:
$$\begin{aligned} \bar{\textbf{X}} = \textbf{S}\,\textbf{C}^{\top }, \end{aligned}$$
where \(\bar{\textbf{X}}\) is \(\textbf{X}\) after the centering, \(\textbf{C} \in \mathbb {R}^{n\times m}\) is the principal component coefficient matrix, and \(\textbf{S} \in \mathbb {R}^{m \times m}\) is the principal component score matrix. Here, it is important that m is independent of n. Therefore, if we set \(m_{\text {max}}\) to hundreds or thousands (400 in this paper), we can resolve the issue of the input and output size of the VAE by feeding \(\textbf{S}\) into the VAE as the training data.
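In matrix terms, this compression step can be sketched as follows. The snippet is a minimal NumPy illustration (the helper name pca_compress is introduced only for this sketch) that computes \(\textbf{C}\) and \(\textbf{S}\) from the centered elite data via a thin singular value decomposition.

```python
# Sketch of the compression step: X is (m, n) with m elite distributions and n nodal
# densities. A thin SVD of the centered data yields at most m non-trivial components,
# so C is (n, m) and the score matrix S = X_bar @ C is (m, m).
import numpy as np

def pca_compress(X):
    mean = X.mean(axis=0)                      # centering vector, kept for later restoration
    X_bar = X - mean
    U, sv, Vt = np.linalg.svd(X_bar, full_matrices=False)   # thin SVD of the centered data
    C = Vt.T                                   # (n, m) principal component coefficient matrix
    S = X_bar @ C                              # (m, m) principal component score matrix
    return S, C, mean
```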
Data generation Further, we train the VAE using the principal component score matrix \(\textbf{S}\) instead of the original material distribution \(\textbf{X}\). Here, if m is less than \(m_{\text {max}}\), we complement the training data by data augmentation (copying data). By doing so, we provide \(m_{\text {max}}\) training data at every iteration.
Using the trained VAE, we newly generate a principal component score matrix \(\textbf{S}_{\text {gen}}\). The principal component scores in \(\textbf{S}_{\text {gen}}\) are diverse and inherit features from those in \(\textbf{S}\) because of the generation process of the VAE. The architecture of the VAE used in this method is presented in Sect. 3.4.
Data restoration by PCA Then, we restore material distributions from \(\textbf{S}_{\text {gen}}\). First, we calculate \(\bar{\textbf{X}}_{\text {gen}}\) using the following equation:
$$\begin{aligned} \bar{\textbf{X}}_{\text {gen}} = \textbf{S}_{\text {gen}}\,\textbf{C}^{\top }. \end{aligned}$$
Next, we conduct the inverse operation of the centering to \(\bar{\textbf{X}}_{\text{ gen } }\). By doing so, we obtain the generated data \(\textbf{X}_{\text{ gen } }\). After that, \(\textbf{X}_{\text{ gen } }\) is normalized, such that the material distributions have clear and fixed-width transition zones between the solid and void phases. The details of the normalization are presented in Sect. 3.3.
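A companion sketch of this restoration step (again an illustrative helper, not the actual implementation): the generated scores are mapped back to nodal-density space with \(\textbf{C}\) and the centering is inverted; the normalization of Sect. 3.3 is applied afterwards.

```python
import numpy as np

def pca_restore(S_gen, C, mean):
    """Map generated principal component scores back to nodal densities."""
    X_bar_gen = S_gen @ C.T      # (m_gen, n) centered material distributions
    return X_bar_gen + mean      # inverse operation of the centering
```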
Performance evaluation and merging The performances of the generated data are evaluated on the basis of the high-fidelity multi-objective problem. Subsequently, the generated data, along with their performance values, are merged into the current elite data. In this manner, we update the evaluated data.
Through the above steps, we solve the challenge of applying DDTD to 3D structural design by addressing the conflict between the DOFs of the material distribution and the input constraints of the deep generative model. To conclude the overview, we summarize the entire procedure in Algorithm 1.
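Because Algorithm 1 itself is not reproduced here, the following pseudocode-style sketch summarizes one possible reading of the overall loop; evaluate, converged, train_vae, sample_latent_with_crossover, and normalize are placeholders for the steps described above (and in Sects. 3.3 and 3.4), while select_elites, pca_compress, and pca_restore refer to the earlier sketches.

```python
# Outline (one possible reading) of the PCA-based DDTD loop; several functions are
# placeholders for steps described in the text, not real APIs.
import numpy as np

def pca_based_ddtd(X_init, m_max=400, max_iter=50):
    X = X_init                                       # material distributions, shape (N, n)
    J = evaluate(X)                                  # high-fidelity objective values, shape (N, N_obj)
    for _ in range(max_iter):
        elite = select_elites(J, m_max)              # non-dominated sorting + crowding distance
        X_elite, J_elite = X[elite], J[elite]
        if converged(J_elite):                       # e.g., hypervolume stagnation
            break
        S, C, mean = pca_compress(X_elite)           # (m x m) scores replace raw nodal densities
        vae = train_vae(S)                           # VAE trained on principal component scores
        S_gen = sample_latent_with_crossover(vae)    # latent crossover sampling (Kii et al. 2024)
        X_gen = normalize(pca_restore(S_gen, C, mean))   # restoration + normalization (Sect. 3.3)
        J_gen = evaluate(X_gen)                      # evaluate generated data on the high-fidelity problem
        X = np.vstack([X_elite, X_gen])              # merge generated data into the elite data
        J = np.vstack([J_elite, J_gen])
    return X_elite, J_elite
```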
3.3 Normalization
As described in Sects. 3.1 and 3.2, we represent material distributions using the nodal density vector \(\hat{\varvec{\rho }}\). Each component of \(\hat{\varvec{\rho }}\) has to take values from 0 to 1 according to the basic concept of TO. In addition, it should be 0 or 1 except at the boundaries between the solid and void phases. However, these conditions are not guaranteed for the material distributions generated by the VAE. Therefore, we apply the following normalization to the generated material distributions.
First, we give the nodal level-set function \(\hat{\varvec{\phi }}\) corresponding to \(\hat{\varvec{\rho }}\) generated by the VAE, as follows:
where \(\hat{\phi }_i\) and \(\hat{\rho }_i\) are the ith components of \(\hat{\varvec{\phi }}\) and \(\hat{\varvec{\rho }}\), respectively. Next, we re-initialize \(\hat{\varvec{\phi }}\) as the signed distance function, using a geometry-based reinitialization scheme (Yamasaki et al. 2010). Finally, the nodal density vector is updated using the following equation:
where \(\hat{\rho }_{\text {u},i}\) is the ith component of the nodal density vector after the update, h is the parameter for the bandwidth of the transition zone between the solid and void phases, and \(H(\hat{\phi }_i)\) is defined as follows:
By the above calculation, each component of the nodal density vector after the update becomes 0 or 1 except within the transition zone between the solid and void phases. In addition, the bandwidth of the transition zone is fixed at 2h. In the numerical examples of this paper, we set h to 0.01, which is the element length of the finite-element mesh discretizing the design domain, unless otherwise stated.
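For illustration on a structured grid, the following sketch approximates this normalization using a Euclidean distance transform in place of the geometry-based reinitialization (Yamasaki et al. 2010) and a linear ramp over \([-h, h]\) as the smoothed Heaviside; both choices are assumptions made only for this sketch, so the result will differ in detail from the actual implementation.

```python
# Rough stand-in for the normalization step on a regular grid: threshold at rho = 0.5,
# build a signed distance field, then map it through a smoothed Heaviside of bandwidth 2h.
import numpy as np
from scipy import ndimage

def normalize_density(rho, h=0.01, dx=0.01):
    """rho: nodal densities on a regular grid (any shape); dx: grid spacing (assumed isotropic)."""
    solid = rho >= 0.5                                  # iso-contour at rho = 0.5
    d_out = ndimage.distance_transform_edt(~solid) * dx  # distance to solid, in the void
    d_in = ndimage.distance_transform_edt(solid) * dx    # distance to void, in the solid
    phi = d_in - d_out                                   # signed distance: positive inside the solid
    # linear-ramp Heaviside: 0 for phi <= -h, 1 for phi >= h, transition bandwidth 2h
    return np.clip(0.5 + phi / (2.0 * h), 0.0, 1.0)
```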
3.4 Architecture of the VAE

To demonstrate the generality of the proposed approach, we constructed a simple VAE architecture to obtain all the results in this paper, avoiding shifting the focus to how to build the optimal network architecture. The VAE consists of two main parts, the encoder and the decoder, as shown in Fig. 2. The encoder consists of input and hidden layers, and the number of neurons depends on the amount of input data.
Fig. 3
Boundary conditions and design domains: a 2D compliant mechanism design problem, b maximum von Mises stress (MVMS) minimization problem, and c 3D compliant mechanism design problem
It should be noted that a higher number of neurons in the hidden layers implies greater fitting and generating power, but also results in a larger computational cost. Therefore, users usually need to make a trade-off between effectiveness and efficiency; in the case of this paper, two hidden layers are set up with 10,000 and 500 neurons, respectively. After activating these neurons in the hidden layers using the ReLU function, the last hidden layer is also fully connected to two layers having eight neurons each: one corresponds to \(\varvec{\mu }\), which is the mean value vector of the latent variables \(\textbf{z}\), and the other corresponds to \(\log (\varvec{\sigma } \circ \varvec{\sigma })\), where \(\varvec{\sigma }\) is the variance vector of \(\textbf{z}\) and \(\circ\) represents the element-wise product. We then obtain the latent variables \(\textbf{z} \in \mathbb {R}^{N_{\textrm{ltn}}}\) (\(N_{\textrm{ltn}}\) is the number of the latent variables) as follows:
$$\begin{aligned} \textbf{z} = \varvec{\mu } + \varvec{\sigma } \circ \varvec{\epsilon }, \end{aligned}$$
where \(\varvec{\epsilon }\) is a random vector drawn from the standard normal distribution. The layer of the latent variables \(\textbf{z}\) is further fully connected to two hidden layers with 500 and 10,000 neurons, respectively. The last hidden layer is further fully connected to the output layer without any activation such as the sigmoid activation. This is because the range of the principal component scores is not limited to [0, 1]. The VAE can generate meaningful new data using the decoder by imposing a regularization such that the compressed data are continuously distributed on a Gaussian in the latent space. The VAE with the above architecture is trained using the elite data as the input data, and the latent space composed of the latent variables is constructed through training. In more detail, the training is conducted by minimizing the following loss function L using the Adam optimizer (Kingma and Ba 2014):
$$\begin{aligned} L = L_{\textrm{rcn}} + \beta L_{\textrm{KL}}, \end{aligned}$$
where \(L_{\textrm{rcn}}\) is the reconstruction loss measured by the mean-squared error, and \(L_{\textrm{KL}}\) is a term corresponding to the Kullback–Leibler (KL) divergence. \(\beta\) is the weighting coefficient for the KL divergence loss. \(L_{\textrm{KL}}\) is computed as follows:
$$\begin{aligned} L_{\textrm{KL}} = -\frac{1}{2} \sum _{i=1}^{N_{\textrm{ltn}}} \left( 1 + \log \sigma _{i}^{2} - \mu _{i}^{2} - \sigma _{i}^{2} \right) , \end{aligned}$$
where \(\mu _{i}\) and \(\sigma _{i}\) are the ith components of \(\mu\) and \(\sigma\), respectively.
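A compact PyTorch sketch of the VAE described above is given below for illustration. It follows the textual description (hidden layers with 10,000 and 500 neurons, eight latent variables, no output activation, and the loss \(L = L_{\textrm{rcn}} + \beta L_{\textrm{KL}}\)); the class name, the use of ReLU in the decoder hidden layers, and the mean reductions are assumptions of this sketch rather than details stated in the text.

```python
# Minimal VAE sketch for principal component score vectors (length at most m_max = 400).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ScoreVAE(nn.Module):
    def __init__(self, input_dim=400, latent_dim=8):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(input_dim, 10000), nn.ReLU(),
                                 nn.Linear(10000, 500), nn.ReLU())
        self.fc_mu = nn.Linear(500, latent_dim)
        self.fc_logvar = nn.Linear(500, latent_dim)        # corresponds to log(sigma * sigma)
        self.dec = nn.Sequential(nn.Linear(latent_dim, 500), nn.ReLU(),
                                 nn.Linear(500, 10000), nn.ReLU(),
                                 nn.Linear(10000, input_dim))  # no sigmoid: scores are not in [0, 1]

    def forward(self, s):
        h = self.enc(s)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)   # reparameterization trick
        return self.dec(z), mu, logvar

def vae_loss(s_rec, s, mu, logvar, beta=4.0):
    rcn = F.mse_loss(s_rec, s)                                     # reconstruction loss L_rcn
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())  # KL divergence term L_KL
    return rcn + beta * kl
```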
4 Experiment
In this section, we conduct several numerical experiments to validate the effectiveness of the proposed PCA-based DDTD. All experiments are conducted on a computer with Linux x86_64 architecture and 128 cores.
4.1 Problem settings
TO methods that target geometrically linear problems under the assumption of small deformations are not applicable to more complex practical applications involving large deformations, such as energy-absorbing structures and compliant mechanisms. Thus, geometric non-linearity is considered in this paper to obtain more realistic designs for applications with large deformations.
More specifically, in Sects. 4.2 and 4.4, we select the following objective functions in Eq. (1) to obtain geometrically nonlinear compliant mechanism structures:
where \(\sigma _{\text {v}}\) is the von Mises stress in the structure, V is the volume of the structure, and \(F_{\text {r}}\) is the reaction force yielded by an artificial spring set on the output port. The negative sign of \(F_{\text {r}}\) is needed to convert the maximization problem to a minimization problem. Hereafter, we represent the reaction force with \(J_3\), that is, reaction force values are displayed in negative values. We do not consider constraint functions here; therefore, \(N_{\text {cns}}\) is 0.
In Sect. 4.3, we select the following objective functions to obtain structures whose maximum von Mises stress (MVMS) is low:
It should be noted that, due to the localization, singularity, non-linearity, and metric accuracy of the stress, traditional TO methods circumvent solving this problem directly by approximating the original problem. In this research, all objective functions, including the MVMS, are accurately calculated using a body-fitted mesh, avoiding problems such as the accuracy loss caused by approximation approaches.
Fig. 4
Elite material distributions at iteration 0 in comparison with non-PCA-based DDTD

Fig. 5
Performances of elite material distributions in comparison with non-PCA-based DDTD: iteration 0 (blue), iteration 50 in PCA-based DDTD (red), and iteration 50 in non-PCA-based DDTD (green)

Fig. 7
Comparison of elite material distributions at iteration 50 in PCA-based and non-PCA-based DDTDs. Here, the left and right sides of the bracket indicate the reaction force and volume values, respectively

Fig. 9
Performances of elite material distributions in the MVMS minimization problem without considering large deformation: iteration 0 (blue) and iteration 50 (red). Here, vol means volume

Fig. 11
History of hypervolumes: a MVMS minimization problem without considering large deformation, b MVMS minimization problem with considering large deformation, and c 3D compliant mechanism design problem

Fig. 13
Performances of elite material distributions in the MVMS minimization problem with considering large deformation: iteration 0 (blue) and iteration 50 (red). Here, vol means volume

Fig. 18
Comparison of elite material distributions at iterations 0 (blue) and 50 (red) in the 3D compliant mechanism design problem: a comparison under MVMS \(\in 0.4800 \pm 0.01\), b comparison under MVMS \(\in 0.5891 \pm 0.01\), and c comparison under MVMS \(\in 0.8252 \pm 0.01\). Here, RF and vol mean reaction force and volume, respectively
The termination condition is given as follows: if no improvement of the hypervolume is observed through 5 iterations, or if the iteration count reaches 50, the calculation is terminated. The hypervolume is a widely used evaluation indicator in multi-objective optimization (Kii et al. 2024). It measures the volume dominated by an elite solution set in the objective function space bounded by a given reference point. A larger hypervolume indicates a superior elite solution set. We determined this termination condition through a preliminary study while referring to a previous paper (Yamasaki et al. 2021). We also set the maximum number of elite data, \(m_{\text {max}}\), to 400, by referring to that paper.
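For illustration, one possible implementation of this termination check is sketched below. The hypervolume computation actually used is not specified here, so the sketch estimates it by simple Monte Carlo sampling (applicable to any number of objectives); the function names are hypothetical.

```python
# Illustrative hypervolume-based stopping rule: stop after 5 iterations without
# hypervolume improvement, or after 50 iterations.
import numpy as np

def hypervolume_mc(J, ref, n_samples=50_000, seed=0):
    """J: (N, M) objective values (minimized); ref: (M,) reference point worse than all solutions."""
    rng = np.random.default_rng(seed)
    lo = J.min(axis=0)
    pts = rng.uniform(lo, ref, size=(n_samples, J.shape[1]))
    # a sample point is dominated if some elite solution is <= it in every objective
    dominated = (J[None, :, :] <= pts[:, None, :]).all(axis=2).any(axis=1)
    return dominated.mean() * np.prod(ref - lo)

def should_stop(hv_history, patience=5, max_iter=50):
    """hv_history: hypervolume value recorded at each iteration."""
    if len(hv_history) >= max_iter:
        return True
    if len(hv_history) > patience and hv_history[-1] <= hv_history[-1 - patience]:
        return True
    return False
```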
For the training of the VAE, we use the following common settings:
learning rate: \(1.0\times 10^{-4}\).
mini-batch size: 20.
number of epochs: 400.
number of latent variables, \(N_{\text {ltn}}\): 8.
weighting coefficient for KL divergence, \(\beta\): 4.
It should be noted that different \(\beta\) settings can affect the overall performance of the final solutions. In the \(\beta\)-related experiment described in Appendix B, the smaller the \(\beta\) value, the better the overall performance of the final solutions obtained. However, whether this observation applies to all optimization problems requires further investigation. To avoid discussion of excessive parameter tuning dependent on \(\beta\), we chose \(\beta =4\) for all the numerical examples. In addition, we use the latent crossover proposed by Kii et al. (2024) for the sampling in the latent space.
4.2 Comparison
To verify the effectiveness of the proposed PCA-based DDTD, we compare our results with DDTD without PCA (non-PCA-based DDTD), in which the nodal densities are directly used as the training data for the VAE. Due to the limitation on input data size for deep generative models, it is challenging to apply non-PCA-based DDTD to optimization problems with massive DOFs, particularly in 3D structural design. Therefore, we compare the proposed PCA-based DDTD with non-PCA-based DDTD in the setting of a 2D compliant mechanism design problem shown in Fig. 3a. As shown here, the numerical analysis is done using the half model, a horizontal load of 0.08 is applied on the input port, an artificial spring of 10 is set on the output port, and the displacement is fixed on the fixed end. Young’s modulus and Poisson’s ratio are set to 1 and 0.3, respectively.
To ensure fairness, we compare PCA-based and non-PCA-based DDTDs under the same conditions except for two points. First, we changed the neural network architecture of the VAE used in non-PCA-based DDTD, because the sizes of the input and output layers are quite different. Their size is equal to the number of nodal densities n in non-PCA-based DDTD, and a single intermediate layer with 500 neurons is used for both the encoder and decoder. Further, we use the sigmoid activation before the output layer, because the outputs are nodal densities in this case. Second, in non-PCA-based DDTD, material distributions are normalized with \(h = 0.02\) for training the VAE. This is because it is suggested to normalize the material distributions with a large h in one representative implementation of non-PCA-based DDTD (Yamasaki et al. 2021).
Initial material distributions are obtained by solving a low-fidelity problem using a density-based TO method, which is easily and directly solvable, yet relevant to the original multi-objective optimization problem. More specifically, the low-fidelity problem does not consider the large deformation and replaces minimizing the MVMS with limiting the amount of the deformation on the input port to reduce computational complexity. For these initial material distributions, we evaluate the objective function values of the original problem. By doing so, we obtain 29 initial elite material distributions shown in Fig. 4.
Figure 5 shows how the performances of the elite material distributions are improved in this example. As shown in this figure, the movement of the elite solutions toward minimization in the objective function space is more pronounced in PCA-based DDTD than in non-PCA-based DDTD, which demonstrates the better effectiveness of PCA-based DDTD.
To compare the difference in effectiveness between PCA-based and non-PCA-based DDTDs more directly, we evaluate the performance of the whole set of elite solutions at each iteration using the hypervolume indicator (Kii et al. 2024), normalized by its initial value. When the hypervolume indicator is greater than 1, the elite solutions have progressed relative to the reference point, and a larger value indicates better performance. As shown in Fig. 6, the blue and red lines represent the hypervolume indicators of non-PCA-based DDTD and PCA-based DDTD, respectively. We can observe that the hypervolume indicator improves throughout the entire iteration process, and both values are greater than 1, indicating that the results of both DDTDs outperform the initial elite material distributions. It should be noted that the hypervolume indicator of PCA-based DDTD outperforms that of non-PCA-based DDTD throughout, which suggests that PCA-based DDTD is significantly more effective than non-PCA-based DDTD under a fair comparison.
Part of the finally obtained elite material distributions in PCA-based and non-PCA-based DDTDs are shown in Fig. 7. Since this 2D compliant mechanism design problem contains three optimization objectives (volume, reaction force, and MVMS), we chose three pairs of results with almost the same MVMS (difference within \(5 \times 10^{-4}\)) for a straightforward comparison. As shown there, the structural performance of the results of PCA-based DDTD is significantly better than that of non-PCA-based DDTD (i.e., smaller objective function values in both reaction force and volume). In this 2D case, the average computation times per iteration for PCA-based and non-PCA-based DDTDs are 13.2 and 11.88 min, respectively.
4.3 Numerical example 1
Here, we examine the popular L-shaped beam benchmark, both without and with considering large deformation. In this optimization problem, as shown in Fig. 3b, the number of design variables is reduced to half of that in the original design domain, i.e., 138,621, due to the symmetric boundary condition about the xz plane. As mentioned earlier, material distributions at this scale cannot be learned efficiently by the VAE due to the input size limitation. With the benefit of PCA, the proposed PCA-based DDTD can resolve the conflict between the input size limitation of the VAE and the representation of complex structures. For the other problem settings, a vertical load of 0.002 is applied on the tip of the L-shaped design domain, and Young’s modulus and Poisson’s ratio are set to 1 and 0.3, respectively.
In the optimization problem without considering large deformation, the stiffness of the structure is considered independent of the magnitude of the applied loading, and the displacement field is uniquely determined. We first obtained the initial material distributions by solving a low-fidelity optimization problem using a density-based TO method, which utilizes the P-norm-based approximation and is essentially the same as the problem solved by Kii et al. (2024). After that, we selected 41 initial elite material distributions shown in Fig. 8.
Figure 9 shows how the performances of the elite material distributions are improved through the DDTD iterative process. As shown in this figure, the elite solutions migrate toward minimization in the objective function space, which proves that the performances of the elite material distributions are improving via the DDTD process. Here, we chose two structures at iteration 0, \(S_1\) and \(S_2\) (blue boxes), and one structure at iteration 50, \(S_{3}\) (red box), in the volume range of [0.01, 0.015], to explain the changes in the shape and topology of material distributions during the generation process of DDTD.
Compared to \(S_1\), which has a lower volume, \(S_2\) has more branches to spread out the stresses, thus leading to the advantages of \(S_1\) and \(S_2\) in terms of volume and stress, respectively. \(S_{3}\) inherits the multiple branches in \(S_2\) and its shape is modified to widely distribute the von Mises stress through the DDTD process, thus outperforming \(S_1\) and \(S_2\) in terms of the volume (\(S_1\), \(S_2\), \(S_{3}\) are 1.173, 1.311, 1.123 \(\times 10^{-2}\), respectively) and the stress (\(S_1\), \(S_2\), \(S_{3}\) are 8.118, 5.439, 5.315 \(\times 10^{-2}\), respectively). Figure 10 shows a part of the elite material distributions at iteration 50. As shown in Fig. 11a, the hypervolume indicator is progressing throughout the iterations, which indicates that the whole performance of the solutions is improving.
In the optimization problem with considering large deformation, the deformation of the structure under the applied load cannot be neglected, which means that the stiffness matrix of the structure changes with the amount of deformation and the nodal displacements. It should be noted that considering large deformations is more meaningful in engineering problems, but it also incurs greater computational complexity.
From the same initial material distributions as in the case without considering large deformation, we select 37 initial elite material distributions shown in Fig. 12. Figure 13 shows how the performances of the elite material distributions are improved through the DDTD iterative process. In this figure, there is a significant stress concentration in the structure at iteration 0, \(S_{4}\), which can easily lead to extreme fragility of the structure. While it seems that the multiple branches appearing in \(S_{4}\) are effective in spreading the stresses, the high-stiffness branches in the lower half of the design domain are conversely the most significant factor contributing to the localized high stress concentration. In the structure at iteration 50, \(S_{5}\), this localized stress concentration is avoided by removing those seemingly effective high-stiffness members.
Part of the elite material distributions at iteration 50 are shown in Fig. 14. It should be noted that the new features appearing in the elite material distributions at iteration 50 do not exist in those at iteration 0, which proves that the DDTD process not only inherits the features of the initial data but also introduces entirely new features that contribute to the optimization. As a result, \(S_{5}\) is significantly better than \(S_{4}\) in terms of both volume (\(S_{4}\), \(S_{5}\) are 1.443, 1.205 \(\times 10^{-2}\), respectively) and stress (\(S_{4}\), \(S_{5}\) are 4.578, 1.704 \(\times 10^{-2}\), respectively), and even achieves a 62.9% decrease in stress. As shown in Fig. 11b, the hypervolume indicator is approximately 1.122 at iteration 50, which indicates that the whole performance of the elite material distributions is improving. In this 3D case, the average computation time per iteration for the MVMS minimization problem without/with considering large deformation is 51.1 and 60.0 min, respectively.
4.4 Numerical example 2
To further verify the validity of the proposed PCA-based DDTD, we chose the 3D compliant mechanism design problem shown in Fig. 3c. In this optimization problem, the number of design variables is reduced to one-fourth of that in the original design domain, i.e., 130,078, due to the symmetric boundary conditions about the xy and xz planes. As previously mentioned, non-PCA-based DDTD cannot directly use material distribution data of this scale to train the VAE due to the limitation on the maximum input size. For the other problem settings, a horizontal load of 0.08 is applied on the input port, an artificial spring of 10 is set on the output port, and the displacement is fixed on the fixed end. Young’s modulus and Poisson’s ratio are set to 1 and 0.3, respectively.
Similar to the case of the 2D compliant mechanism design problem, initial material distributions are obtained by solving a low-fidelity problem using a density-based TO method. After that, we evaluate the objective function values of the original problem. By doing so, we obtain 29 initial elite material distributions shown in Fig. 15.
Starting from these elite material distributions, we finally obtain the elite material distributions shown in Fig. 16. Figure 17 also shows how the performances of the elite material distributions are improved through the DDTD iterative process. In this figure, the elite material distributions are migrating toward the minimization in the objective function space, which proves that the performances of the elite material distributions are improving during the iterations. As shown in Fig. 11c, the hypervolume indicator is approximately 1.31 at iteration 50, which indicates that the whole performance of the elite material distributions has improved significantly.
To provide a more obvious comparison of the performance difference between the elite material distributions at iterations 0 and 50, we choose three structures at iteration 0 as benchmarks, denoted as \(S_6\), \(S_7\), and \(S_8\), respectively. As shown in Fig. 18a, PCA-based DDTD generates diverse material distributions with higher performances than the benchmark \(S_6\) at similar MVMS (MVMS \(\in 0.4800\pm 0.01\); \(S_6\) and \(S_9\) are 0.4800 and 0.4731, respectively). The newly generated structure \(S_9\) slightly outperforms \(S_6\) in terms of the reaction force (\(S_6\) and \(S_9\) are \(-3.521 \times 10^{-4}\) and \(-3.596 \times 10^{-4}\), respectively) while significantly outperforming \(S_6\) in terms of the volume (\(S_6\) and \(S_9\) are \(6.733 \times 10^{-2}\) and \(4.538 \times 10^{-2}\), respectively), with a decrease of \(32.5\%\), proving that lightweighting is effectively achieved while the other properties remain similar. Compared to \(S_6\), whose stress is concentrated at the ends of the bar on the input side, the stress in \(S_9\) is relatively uniformly distributed over the entire bar, which demonstrates that the stress concentration of the structure is effectively resolved in PCA-based DDTD; this is the main reason why the volume of the material decreases significantly while the structure’s performance remains unchanged.
In Fig. 18b, we selected material distributions whose MVMS \(\in 0.5891\pm 0.01\) for comparison. As shown in this figure, PCA-based DDTD generates new material distributions with diverse structural properties compared to the benchmark \(S_7\). We choose \(S_{10}\), which has properties similar to \(S_7\) in terms of the reaction force (\(S_7\) and \(S_{10}\) are \(-4.379 \times 10^{-4}\) and \(-4.400 \times 10^{-4}\), respectively), for illustration. As a result, we reach the same conclusion as in Fig. 18a, i.e., by eliminating the stress concentration of the initial structure, PCA-based DDTD can keep the other structural properties unchanged while effectively reducing material usage (\(S_7\) and \(S_{10}\) are \(6.727 \times 10^{-2}\) and \(5.855 \times 10^{-2}\), respectively, a decrease of \(12.9\%\)).
In Fig. 18c, we selected material distributions whose MVMS \(\in 0.8252\pm 0.01\) for comparison. To better show the stress concentration, we changed the viewpoint from that of Fig. 18a, b. We selected structure \(S_{11}\) because it has a similar value in terms of the reaction force (\(S_8\) and \(S_{11}\) are \(-2.570 \times 10^{-4}\) and \(-2.730 \times 10^{-4}\), respectively). The volumes of the two are \(3.636 \times 10^{-2}\) and \(2.818 \times 10^{-2}\) (a decrease of \(22.5\%\)), which again validates the aforementioned conclusions.
It should be noted that the focus on the volume comparison does not mean that DDTD only improves the initial material distributions in terms of material usage (lightweighting); rather, DDTD can improve the initial material distributions on all the set optimization objectives to generate new material distributions with diverse structural properties. For example, in Fig. 18a, the newly generated structure \(S_{12}\) clearly outperforms the benchmark \(S_6\) in both the reaction force and the volume, owing to the smaller values of the optimization objectives. In this 3D case, the average computation time per iteration is 60.5 min.
5 Conclusions
In this paper, we proposed PCA-based DDTD to solve the input limitation problem of the VAE. In the proposed PCA-based DDTD, the VAE is trained on the principal component score matrix obtained using PCA instead of on the original material distributions, and material distributions with new features are generated through the restoration process. This ensures that a high DOF of the material distribution representation is maintained while still satisfying the maximum input limitation of the VAE, thus addressing the difficulty of applying the original (non-PCA-based) DDTD to 3D strongly nonlinear optimization problems. Furthermore, we demonstrated that the proposed PCA-based DDTD achieves elite material distributions with superior performance compared to non-PCA-based DDTD. By solving MVMS minimization problems with/without considering large deformation and a 3D compliant mechanism design problem, we validated the effectiveness of the proposed PCA-based DDTD.
Declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Replication of results
The manuscript provides sufficient information for the replication of results. Readers interested in further details regarding the implementation or access to numerical codes are encouraged to contact the corresponding author with a reasonable request. The data in this study are also available upon request.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix A

In multi-objective optimization, the non-dominated sorting ranks solutions based on Pareto dominance, ensuring that the selected elite data represent optimal trade-offs among conflicting objectives. Given a set of evaluated material distributions \(\mathcal {S} = \{\rho ^{(1)}, \rho ^{(2)},\ldots , \rho ^{(e)}\}\) (e is the number of the evaluated material distributions) with objective function values \(\left[ J_1, J_2,\ldots , J_{N_{\text {obj}}} \right]\), a solution \(\rho ^{(i)}\) is considered to dominate another solution \(\rho ^{(j)}\) if it satisfies
$$\begin{aligned} \begin{array}{l} \forall k \in \left\{ 1, \ldots , N_{\text {obj}}\right\} , \quad J_{k}\left( \rho ^{(i)}\right) \le J_{k}\left( \rho ^{(j)}\right) ,\\ \exists k \text { such that } J_{k}\left( \rho ^{(i)}\right) < J_{k}\left( \rho ^{(j)}\right) , \end{array} \end{aligned}$$
where \(J_k(\rho ^{(i)})\) is the k-th objective function value of the material distribution \(\rho ^{(i)}\). Based on this criterion, the algorithm identifies the first non-dominated front, which consists of the rank-one material distributions.
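Expressed in code, the dominance relation above and the extraction of the rank-one front read roughly as follows (a brief illustrative transcription of the definition, not the implementation used in this work):

```python
# Pareto dominance and the first non-dominated front, for objective values that are minimized.
import numpy as np

def dominates(J_i, J_j):
    """True if J_i is no worse in every objective and strictly better in at least one."""
    return bool(np.all(J_i <= J_j) and np.any(J_i < J_j))

def first_front(J):
    """Indices of the rank-one (non-dominated) material distributions in J, shape (e, N_obj)."""
    return [i for i in range(len(J))
            if not any(dominates(J[j], J[i]) for j in range(len(J)) if j != i)]
```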
Appendix B
To investigate the impact of the weight parameter \(\beta\) in the loss function of the VAE, we conducted experiments with different \(\beta\) values and analyzed their influence on the final results of PCA-based DDTD in the 2D case introduced in Sect. 4.2. The experiments were performed by discretely varying \(\beta\) while keeping all other settings unchanged, and the performance of the final elite data was evaluated using the hypervolume metric, which provides a quantitative measure of the performance of the elite solutions.
Figure 19 shows that different \(\beta\) values lead to variations in the hypervolume values of the final results. In addition, a smaller \(\beta\) leads to a better hypervolume. However, it is not clear whether the same tendency appears in other optimization problems, and further research is needed on the influence of \(\beta\) on the optimization results. In the numerical examples of Sect. 4, we avoided discussion that relies on excessive parameter tuning of \(\beta\); instead, \(\beta = 4\), which leads to a relatively poor result in Fig. 19, was used.
Fig. 19
Effect of different \(\beta\) values on hypervolume. The x-axis represents the hyperparameter \(\beta\) in the VAE, while the y-axis denotes the hypervolume value, which serves as a performance indicator for the elite solutions