A sparse grid based method for generative dimensionality reduction of high-dimensional data
Introduction
In real-world applications, nominally high-dimensional data often resides on a lower-dimensional manifold. Based on this observation, one of the main topics in mathematical learning theory, see e.g. [1], [2], [3], and in machine learning algorithms, see e.g. [4], [5], is the dimensionality reduction of high-dimensional data sets. To this end, it is assumed that there exists an m-dimensional manifold 𝓜 embedded in ℝ^d on which the data resides. The task of dimensionality reduction is to build an approximate representation of the low-dimensional manifold from the available data in ℝ^d and thereby to obtain a description of the data in the new coordinate system governed by the manifold.
While there is a vast number of algorithms to achieve this, the class of generative algorithms in this context is quite small, see [5], [6]. Most algorithms result in a mapping P: ℝ^d → T which allows one to describe the available data in terms of the manifold coordinates T ⊂ ℝ^m. A generative approach, however, provides two maps: a projection P: ℝ^d → T and a reconstruction f: T → ℝ^d. This way, given a generative algorithm, we can project new data points from ℝ^d into T by means of P, and we can interpolate between two data points in the low-dimensional representation of the manifold and thus generate new, meaningful data on 𝓜 by means of f.
While principal component analysis (PCA) is the method of choice for linear manifolds, the nonlinear case is more relevant for practical applications but also more involved, since the explicit determination of general manifolds is quite complicated. Generative methods therefore usually assume that there exists a bijective map ζ: T → 𝓜, i.e. the manifold can be described by a single chart. The generative algorithm then approximates ζ by f. A generative model is of special interest when the high-dimensional data space represents simulation results, for instance the coefficient vectors of finite element solutions of a partial differential equation. Here, the comparison of original data and reconstructed surrogates of the d-dimensional simulations is essential for the quality control of the resulting mapping f.
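As a concrete illustration of such a pair of maps, PCA with latent dimension m yields a projection P and an affine reconstruction f. The following sketch (our own illustration, not code from the paper) builds both maps via an SVD of the centered data:

```python
import numpy as np

def fit_pca_generative(X, m):
    """Fit a linear generative model to data X (n x d) with latent dimension m.

    Returns the pair of maps (P, f): P projects data points onto the
    m-dimensional principal subspace, f reconstructs d-dimensional points
    from latent coordinates.
    """
    mu = X.mean(axis=0)
    # principal directions = right singular vectors of the centered data
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    V = Vt[:m].T                  # d x m orthonormal basis of the subspace

    P = lambda x: (x - mu) @ V    # P: R^d -> T = R^m
    f = lambda t: mu + t @ V.T    # f: T -> R^d (image = affine subspace)
    return P, f

# toy check: data on a 1-dimensional affine subspace of R^3 is recovered exactly
rng = np.random.default_rng(0)
t = rng.standard_normal((50, 1))
X = t @ np.array([[1.0, 2.0, -1.0]]) + np.array([0.5, 0.0, 1.0])
P, f = fit_pca_generative(X, m=1)
print(np.allclose(f(P(X)), X))    # reconstruction is exact for linear data
```

For data that exactly lies on a linear (affine) manifold, f ∘ P is the identity on the data; for nonlinear manifolds, this is precisely where PCA fails and nonlinear methods such as PML become necessary.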
Two well-known examples of nonlinear generative methods are the generative topographic mapping (GTM), see [7], and the regularized principal manifold learning (PML) algorithm, see [8]. Here, the question arises how the mappings f and P are computed. In the original GTM [7], f is discretized by a full tensor product grid approach. This naturally suffers from the curse of dimensionality, see [9], i.e. already moderate intrinsic dimensions m cannot be treated computationally. In the PML algorithm [8], a kernel-based ansatz centered in the data points is chosen for f. This relies on the representer theorem for reproducing kernel Hilbert spaces, see e.g. [4]. While this approach is well-suited if the number n of data points is moderate (up to several hundred), a grid-based approach is more favorable for large n. However, as for the GTM, a full tensor product grid does not work for larger intrinsic dimensions m.
To circumvent the curse of dimensionality at least to some extent, sparse grid approaches have been introduced for both the GTM, see [10], [11], and the PML, see [12]. A sparse grid discretization makes it possible to exploit higher regularity of ζ: if ζ has Sobolev regularity with bounded mixed derivatives, we obtain almost the same approximation quality as with the full tensor product grid approach but at significantly lower computational complexity, see [13].
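To get a feeling for the cost savings, the following sketch (our own illustration; it assumes the common convention of interior hat functions with 2^(l−1) basis functions on 1D level l) counts the basis functions of a regular sparse grid versus a full tensor product grid of the same level:

```python
from itertools import product

def grid_sizes(level, dim):
    """Compare the number of basis functions of a regular sparse grid and a
    full tensor product grid of the same level.  Convention (interior hat
    functions): 1D level l contributes 2^(l-1) functions; the sparse grid
    keeps multi-levels l with |l|_1 <= level + dim - 1, the full grid keeps
    all l with max(l) <= level."""
    sparse = 0
    for l in product(range(1, level + 1), repeat=dim):
        if sum(l) <= level + dim - 1:
            sparse += 2 ** (sum(l) - dim)   # prod_i 2^(l_i - 1)
    full = (2 ** level - 1) ** dim          # closed form of the full sum
    return sparse, full

for dim in (2, 3, 4):
    s, f = grid_sizes(level=6, dim=dim)
    print(dim, s, f)
```

Already for level 6 in two dimensions the sparse grid needs 321 basis functions instead of 3969, and the gap widens exponentially with the dimension.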
In this paper we introduce an adaptive sparse grid PML approach which is an enhancement of [12]. To this end, we present an alternating minimization scheme that solves the PML optimization problem over a generalized sparse grid to obtain a discrete approximation f. We suggest error indicators which rely on the so-called hierarchical surplus of a sparse grid basis function. These error indicators are then used to refine the underlying sparse grid space in those spatial directions in which the current approximation of f varies the most. The alternating minimization and the refinement of the underlying sparse grid space are iterated until a given computational complexity threshold or a maximum refinement level is reached.
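The surplus-based refinement idea can be sketched in one dimension (a simplification of the paper's dimension-adaptive scheme; the function names and the simple threshold criterion are our own):

```python
def hat(l, i, x):
    """1D hierarchical hat function on level l at odd index i."""
    return max(0.0, 1.0 - abs(2**l * x - i))

def adaptive_interpolate(func, tol=1e-2, max_level=10):
    """Compute hierarchical surpluses v_{l,i} = f(x_{l,i}) - u_coarse(x_{l,i})
    and refine (add the two children of) every point whose surplus magnitude
    exceeds `tol`.  This mimics the surplus-based error indicator used to
    drive adaptive sparse grid refinement."""
    grid = {}                              # (l, i) -> hierarchical surplus
    active = [(1, 1)]                      # start with the root point x = 1/2
    while active:
        l, i = active.pop()
        x = i / 2**l
        # at x_{l,i} only hierarchical ancestors contribute, so summing over
        # the current grid evaluates the coarser interpolant at x
        u = sum(v * hat(lj, ij, x) for (lj, ij), v in grid.items())
        surplus = func(x) - u
        grid[(l, i)] = surplus
        if abs(surplus) > tol and l < max_level:
            active += [(l + 1, 2 * i - 1), (l + 1, 2 * i + 1)]
    return grid

g = adaptive_interpolate(lambda x: x * (1 - x), tol=1e-3)
print(len(g))
```

For the smooth test function x(1−x) the surpluses decay like 4^(−l), so refinement stops uniformly at level 5; for functions with localized features, only the points near those features would be refined further.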
We will apply our method to a high-dimensional data set of finite element simulations. This scenario is based on virtual product development as it arises in typical engineering applications; in particular, we consider the automotive industry. Here, as part of the development cycle for a new car model, the influence of design parameters on the crash behavior of a car is analyzed by an engineer with the help of numerical crash simulations [14]. Each simulation consists of approximately one million finite element nodes and up to several hundred snapshots in time, and is therefore very high-dimensional: each simulation comprises roughly 10^8 values (time steps × nodes). The resulting huge databases of finite element simulations can be used for sophisticated data-driven analysis steps, see [15], [16]. For example, the detection of the number of different effects, such as tearing or bending behavior, in a set of crash simulation data is of special interest. As an example, we will study the deformation of several beams in the front part of the car with the help of our dimension-adaptive PML method.
The outline of this paper is as follows: In Section 2 we introduce the idea of generative dimensionality reduction in its most general setting, followed by a more detailed view of two common generative approaches – the PCA and the PML method – and how they fit into our general setting. In Section 3 we briefly review the concept of sparse grids and introduce an adaptive version of the sparse grid PML method. Section 4 deals with a state-of-the-art big-data manifold learning problem which stems from current demands in the automotive industry. We apply our adaptive sparse grid PML algorithm to this problem and compare the results to the linear PCA method. Section 5 contains some concluding remarks.
Section snippets
Generative dimensionality reduction
Let us assume that we are given n data points in ℝ^d which have been drawn i.i.d. according to an unknown probability measure ρ which is absolutely continuous with respect to the Lebesgue measure. Furthermore, let ρ be supported on an unknown compact manifold 𝓜 of dimension m. Since the treatment of general manifolds is very involved, we impose an additional condition on the structure of 𝓜 and assume that there exists a bijective map ζ: T → 𝓜, i.e. the compact manifold can be described by a single chart.
Discretized PML with adaptive sparse grids
As we showed in (9), the application of the representer theorem leads to a finite expansion of g in terms of the kernel functions evaluated at the data points. Then it is easy to see that, for fixed latent variables, a minimizer of (8) solves a system of linear equations with a kernel block matrix. Here, I denotes the nd-dimensional identity matrix, α is the coefficient vector and the right-hand side represents the data vector. Since the kernel matrix is usually
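The structure of such a kernel system can be sketched with a plain kernel ridge regression fit (the Gaussian kernel and the scaling of the regularization term are assumptions made here for illustration, not taken from the paper):

```python
import numpy as np

def kernel_ridge_fit(T, Y, lam, sigma=0.5):
    """Sketch of the kernel-based ansatz: for fixed latent points T (n x m)
    and data Y (n x d), each component of the map g is expanded in Gaussian
    kernels centered at the latent points, and the coefficient matrix A
    solves the regularized linear system (K + lam * n * I) A = Y."""
    sq = ((T[:, None, :] - T[None, :, :]) ** 2).sum(-1)
    K = np.exp(-sq / (2 * sigma**2))          # n x n kernel Gram matrix
    n = len(T)
    A = np.linalg.solve(K + lam * n * np.eye(n), Y)   # coefficients, n x d
    # g maps a latent point t in R^m to a data-space point in R^d
    g = lambda t: np.exp(-((t[None, :] - T) ** 2).sum(-1) / (2 * sigma**2)) @ A
    return g

rng = np.random.default_rng(1)
T = rng.uniform(size=(40, 1))
Y = np.c_[np.cos(3 * T[:, 0]), np.sin(3 * T[:, 0])]
g = kernel_ridge_fit(T, Y, lam=1e-8)
print(g(T[0]).shape)
```

Note that the system couples all n data points through a dense n × n kernel matrix, which is exactly why this ansatz becomes unattractive for large n and a grid-based discretization of g is preferable there.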
Application of the dimension-adaptive PML
In this section we present results for the application of the PML method from Section 3. We use a semi-norm for regularization, i.e. we omit one of the terms in the sum in (5), in the dimension-adaptive sparse grid PML approach. We start with a toy problem – a two-dimensional S-shaped manifold embedded in three dimensions – where we discuss the performance of the adaptivity. Subsequently, we consider a state-of-the-art engineering problem where we aim to build a generative model for numerical crash simulations.
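A standard parametrization of such an S-shaped benchmark manifold (the paper's exact setup is not given in this excerpt, so the parametrization below is the common S-curve convention) is:

```python
import numpy as np

def sample_s_curve(n, seed=0):
    """Sample n points from a two-dimensional S-shaped manifold embedded in
    R^3.  Latent coordinates (t, h) map to (sin t, h, sign(t)(cos t - 1)),
    which traces an "S" in the x-z plane extruded along the y axis."""
    rng = np.random.default_rng(seed)
    t = rng.uniform(-1.5 * np.pi, 1.5 * np.pi, n)
    h = rng.uniform(0.0, 2.0, n)
    X = np.c_[np.sin(t), h, np.sign(t) * (np.cos(t) - 1.0)]
    return np.c_[t, h], X   # latent coordinates (n x 2), embedded data (n x 3)

T, X = sample_s_curve(500)
print(X.shape)
```

Since the latent coordinates (t, h) are known here, the quality of a learned projection P and reconstruction f can be measured directly against the ground truth, which is what makes this toy problem useful for studying the adaptivity.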
Conclusion
In this paper, we discussed the idea of generative dimensionality reduction for two examples: principal component analysis (PCA) and principal manifold learning (PML). We explained how the PML method can be discretized in terms of sparse grids and introduced a dimension-adaptive generalization of the algorithm in [12]. Furthermore, we pointed out the relation to the ANOVA decomposition of multivariate functions. As a typical field of application for generative dimensionality reduction, we considered numerical crash simulation data from the automotive industry.
Acknowledgements
The authors were partially supported by the Hausdorff Center for Mathematics in Bonn, by the Sonderforschungsbereich 1060 The Mathematics of Emergent Effects funded by the Deutsche Forschungsgemeinschaft and by the project SIMDATA-NL funded by the German Federal Ministry of Education and Research under the grant numbers 05M10AAB, 05M10KTD, and 05M10PDA. We cordially thank Rodrigo Iza-Teran for generating the car crash simulation data.
References (27)
- et al., Analysis of car crash simulation data with nonlinear machine learning methods
- et al., Spatially adaptive sparse grids for high-dimensional data-driven problems, J. Complex. (2010)
- et al., Learning Theory (2007)
- et al., A Distribution-Free Theory of Nonparametric Regression (2002)
- Statistical Learning Theory (1998)
- et al., Learning with Kernels – Support Vector Machines, Regularization, Optimization, and Beyond (2002)
- et al., Nonlinear Dimensionality Reduction (2007)
- et al., Dimensionality reduction and principal surfaces via kernel map manifolds
- et al., GTM: the generative topographic mapping, Neural Comput. (1998)
- et al., Regularized principal manifolds, J. Mach. Learn. Res. (2001)
- Adaptive Control Processes: A Guided Tour
- Dimensionality reduction of high-dimensional data with a nonlinear principal component aligned generative topographic mapping, SIAM J. Sci. Comput.
- A sparse grid based generative topographic mapping for the dimensionality reduction of high-dimensional data