On the Representation and Learning of Monotone Triangular Transport Maps

Authors: Ricardo Baptista, Youssef Marzouk, Olivier Zahm

Published in: Foundations of Computational Mathematics | Issue 6/2024


Abstract

Transportation of measure provides a versatile approach for modeling complex probability distributions, with applications in density estimation, Bayesian inference, generative modeling, and beyond. Monotone triangular transport maps—approximations of the Knothe–Rosenblatt (KR) rearrangement—are a canonical choice for these tasks. Yet the representation and parameterization of such maps have a significant impact on their generality and expressiveness, and on properties of the optimization problem that arises in learning a map from data (e.g., via maximum likelihood estimation). We present a general framework for representing monotone triangular maps via invertible transformations of smooth functions. We establish conditions on the transformation such that the associated infinite-dimensional minimization problem has no spurious local minima, i.e., all local minima are global minima; and we show for target distributions satisfying certain tail conditions that the unique global minimizer corresponds to the KR map. Given a sample from the target, we then propose an adaptive algorithm that estimates a sparse semi-parametric approximation of the underlying KR map. We demonstrate how this framework can be applied to joint and conditional density estimation, likelihood-free inference, and structure learning of directed graphical models, with stable generalization performance across a range of sample sizes.
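To make the framework concrete, the following is a minimal sketch (not the authors' implementation; the function names and the choice of softplus for the positive bijection g are assumptions for illustration) of representing a monotone map component through an invertible transformation of a smooth function f:
S_k(x_{<k}, x_k) = f(x_{<k}, 0) + \int_0^{x_k} g(\partial_k f(x_{<k}, t)) dt.
Since g maps onto (0, \infty), the component is strictly increasing in x_k for any smooth f, so learning can optimize over f without monotonicity constraints.

```python
# A minimal sketch (assumed names; not the authors' code) of rectifying a
# smooth function f into a map component that is monotone in its last argument.
import numpy as np

def softplus(a):
    # g(a) = log(1 + exp(a)): a smooth bijection from R onto (0, inf), so the
    # integrand below is strictly positive and S_k is increasing in x_k.
    return np.logaddexp(0.0, a)

def monotone_component(f, df_last, x_prev, x_k, n_quad=64):
    """Evaluate S_k(x_prev, x_k) with Gauss-Legendre quadrature on [0, x_k].

    f       : callable (x_prev, t) -> f(x_prev, t), vectorized in t
    df_last : callable (x_prev, t) -> derivative of f in its last argument
    """
    nodes, weights = np.polynomial.legendre.leggauss(n_quad)
    t = 0.5 * x_k * (nodes + 1.0)            # map quadrature nodes to [0, x_k]
    integrand = softplus(df_last(x_prev, t)) # strictly positive by construction
    return f(x_prev, 0.0) + 0.5 * x_k * np.dot(weights, integrand)

# Example: even a non-monotone f yields a monotone S_k in x_k.
x_prev = np.array([0.7])
f = lambda xp, t: xp[0] * t**2
df = lambda xp, t: 2.0 * xp[0] * t
values = [monotone_component(f, df, x_prev, xk) for xk in (-1.0, 0.0, 1.0)]
assert values[0] < values[1] < values[2]     # increasing in x_k
```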


Footnotes
1
For any \(\boldsymbol{z}\in\mathbb{R}^d\), \(\boldsymbol{x}=S^{-1}(\boldsymbol{z})\) can be computed recursively as \(x_k=T^k(\boldsymbol{x}_{<k},z_k)\) for \(k=1,\dots,d\), where the function \(T^k(\boldsymbol{x}_{<k},\cdot)\) is the inverse of \(x_k\mapsto S_k(\boldsymbol{x}_{<k},x_k)\). In practice, evaluating \(T^k\) requires solving a root-finding problem that is guaranteed to have a unique (real) root, and for which the bisection method converges geometrically fast; see the sketch after these footnotes. Therefore, \(S^{-1}(\boldsymbol{z})\) can be evaluated to machine precision in negligible computational time.
 
2
That is, \(\Vert v_1\otimes \cdots \otimes v_k\Vert_{V_k} = \Vert v_1\Vert_{L^2_{\eta_1}}\Vert v_2\Vert_{L^2_{\eta_2}}\cdots \Vert v_{k-1}\Vert_{L^2_{\eta_{k-1}}} \Vert v_k\Vert_{H^1_{\eta_k}}\) for any \(v_j\in L^2_{\eta_j}\), \(j<k\), and \(v_k\in H^1_{\eta_k}\).
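As a companion to footnote 1, here is a minimal sketch (assumed interface and bracket; not the authors' code) of the recursive inversion: each component equation \(S_k(\boldsymbol{x}_{<k},x_k)=z_k\) is solved for \(x_k\) by bisection, which is valid because each \(S_k\) is strictly increasing in its last argument.

```python
# A minimal sketch (assumed interface) of footnote 1's recursion:
# invert a lower-triangular map S componentwise by bisection.
import numpy as np

def invert_triangular(S_comps, z, lo=-1e2, hi=1e2, tol=1e-12):
    """Compute x = S^{-1}(z), where S_comps[k](x_prev, x_k) evaluates the
    (k+1)-th map component S_{k+1}(x_{<k+1}).

    Each component is strictly increasing in x_k, so the root is unique, and
    bisection halves the bracket [lo, hi] at every step (geometric
    convergence). The bracket is assumed to contain the root.
    """
    x = np.empty(len(z))
    for k, (S_k, z_k) in enumerate(zip(S_comps, z)):
        a, b = lo, hi
        while b - a > tol:
            m = 0.5 * (a + b)
            if S_k(x[:k], m) < z_k:
                a = m            # root lies to the right of m
            else:
                b = m            # root lies to the left of m
        x[k] = 0.5 * (a + b)
    return x

# Example with a linear triangular map: S_1(x) = 2 x_1, S_2(x) = x_1 + 3 x_2.
S = [lambda xp, t: 2.0 * t, lambda xp, t: xp[0] + 3.0 * t]
z = np.array([2.0, 7.0])
x = invert_triangular(S, z)
assert np.allclose(x, [1.0, 2.0], atol=1e-8)  # indeed S(x) = z
```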
 
Metadata
Title
On the Representation and Learning of Monotone Triangular Transport Maps
Authors
Ricardo Baptista
Youssef Marzouk
Olivier Zahm
Publication date
16-11-2023
Publisher
Springer US
Published in
Foundations of Computational Mathematics / Issue 6/2024
Print ISSN: 1615-3375
Electronic ISSN: 1615-3383
DOI
https://doi.org/10.1007/s10208-023-09630-x
