1 Introduction
2 Several characterizations of quantiles
2.1 Quantiles
-
\(\alpha =Q_Y(t)\) is a solution of the convex minimization problem$$\begin{aligned} \min _{\alpha } \{{\mathbb {E}}((Y-\alpha )^+)+ \alpha (1-t)\} \end{aligned}$$(2.2)
-
there exists a uniformly distributed random variable U such that \( Y=Q_Y(U)\). Moreover, among uniformly distributed random variables, U is maximally correlated to Y in the sense that it solves$$\begin{aligned} \max \{ {\mathbb {E}}(VY), \; V\sim \mu \} \end{aligned}$$(2.3)where \(\mu :={\mathcal {U}}([0,1])\) is the uniform measure on [0, 1]. Of course, when \({\mathscr {L}}(Y)\) has no atom, i.e., when \(F_{Y}\) is continuous, U is unique and given by \(U=F_{Y}(Y)\). Problem (2.3) is the simplest example of an optimal transport problem one can think of. The decomposition of a random variable Y as the composition of a monotone nondecreasing function and a uniformly distributed random variable is called a polar factorization of Y. The existence of such decompositions goes back to Ryff (1970), and the extension to the multivariate case (by optimal transport) is due to Brenier (1991).
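Both characterizations are easy to verify on an empirical sample. The following sketch (plain NumPy, with the empirical measure standing in for \({\mathscr {L}}(Y)\); sample size and level t are arbitrary choices) checks that the minimizer of (2.2) is the t-quantile, and that the rank-based U gives a polar factorization and attains the maximal correlation in (2.3):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2_000
y = rng.exponential(size=n)      # a sample of Y (continuous law, no atoms)
t = 0.317

# (2.2): alpha -> E[(Y - alpha)^+] + alpha (1 - t).  The empirical objective is
# piecewise linear, so the minimum is attained at an order statistic and a
# search over the sorted sample suffices.
def objective(alpha):
    return np.maximum(y - alpha, 0.0).mean() + alpha * (1.0 - t)

ys = np.sort(y)
alpha_star = ys[np.argmin([objective(a) for a in ys])]
# alpha_star coincides (up to interpolation) with the empirical t-quantile

# (2.3): the comonotone U (normalized ranks) satisfies Y = Q_Y(U) and maximizes
# E(VY) among uniformly distributed V (rearrangement inequality).
ranks = np.argsort(np.argsort(y))
u = (ranks + 0.5) / n                    # "uniform" on the grid {(k + 1/2)/n}
assert np.allclose(ys[ranks], y)         # polar factorization Y = Q_Y(U)

v = u[rng.permutation(n)]                # any other uniform pairing does worse
assert (u * y).mean() >= (v * y).mean()
```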
-
the local or “t by t” approach, which consists, for a fixed probability level t, in directly using formula (2.1) or the minimization problem (2.2) (or some approximation of it); this can be done very efficiently in practice, but it has the disadvantage of forgetting the fundamental global property of the quantile function: it should be monotone in t,
-
the global approach (or polar factorization approach), where quantiles of Y are defined as all nondecreasing functions Q for which one can write \(Y=Q(U)\) with U uniformly distributed. In this approach, one instead tries to recover directly the whole monotone function Q (or the uniform variable U that is maximally correlated to Y). This is a global approach, for which the natural tool is the optimal transport problem (2.3).
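On a finite sample the two approaches can be compared directly: minimize the empirical version of (2.2) level by level, or read the whole quantile function off the sorted sample at once. In the sketch below (hypothetical grid and sample sizes), the two coincide, and monotonicity in t holds automatically for the global construction:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 97
y = rng.normal(size=n)
ys = np.sort(y)                              # global approach: one sort gives all of Q

def local_quantile(t):
    """The "t by t" approach: minimize the empirical version of (2.2)."""
    obj = [np.maximum(y - a, 0.0).mean() + a * (1.0 - t) for a in ys]
    return ys[np.argmin(obj)]

t_grid = np.linspace(0.05, 0.95, 19)
q_local = np.array([local_quantile(t) for t in t_grid])

# Here the local minimizers trace out exactly the order statistics, so the
# local curve is monotone; richer local procedures can lose this property.
assert np.all(np.diff(q_local) >= 0)
assert np.array_equal(q_local, ys[np.ceil(t_grid * n).astype(int) - 1])
```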
2.2 Conditional quantiles
3 Specified and quasi-specified quantile regression
3.1 Specified quantile regression
3.2 Quasi-specified quantile regression
-
\(U^{QR}\) is uniformly distributed,
-
X is mean-independent of \(U^{QR}\), i.e., \({\mathbb {E}}(X\vert U^{QR})={\mathbb {E}}(X)=0\),
-
\(Y=\alpha ^{QR}(U^{QR})+ {\beta ^{QR}}(U^{QR})^{\top } X\) almost surely.
-
both U and \(\overline{U}\) are uniformly distributed,
-
X is mean-independent of U and \(\overline{U}\): \({\mathbb {E}} (X\vert U)={\mathbb {E}}(X\vert \overline{U})=0\),
-
\(\alpha , \beta , \overline{\alpha }, \overline{\beta }\) are continuous on [0, 1],
-
\((\alpha , \beta )\) and \((\overline{\alpha }, \overline{\beta })\) satisfy the monotonicity condition (3.1),
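The structure shared by these lists (a uniform latent variable, mean-independence of X, and the representation \(Y=\alpha (U)+\beta (U)^{\top }X\)) can be illustrated on a simulated model. The sketch below uses hypothetical choices \(\alpha (u)=2u\), \(\beta (u)=1+u/2\), and X uniform on \([-1,1]\), so that the monotonicity condition holds on the whole support of X:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000
U = rng.uniform(size=n)                  # uniform latent "rank" variable
X = rng.uniform(-1.0, 1.0, size=n)       # bounded regressor, independent of U

alpha = lambda u: 2.0 * u                # hypothetical specification;
beta = lambda u: 1.0 + 0.5 * u           # d/du [alpha(u) + beta(u) x] = 2 + x/2 > 0 on [-1, 1]
Y = alpha(U) + beta(U) * X               # the a.s. representation, by construction

# U is (empirically) uniform: its order statistics track the uniform grid
assert np.max(np.abs(np.sort(U) - (np.arange(n) + 0.5) / n)) < 0.03

# mean-independence E(X | U) = 0: bin U into deciles; bin means of X vanish
bins = np.digitize(U, np.linspace(0.0, 1.0, 11)[1:-1])
bin_means = np.array([X[bins == k].mean() for k in range(10)])
assert np.max(np.abs(bin_means)) < 0.1

assert np.allclose(Y, alpha(U) + beta(U) * X)    # Y = alpha(U) + beta(U)^T X a.s.
```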
4 Quantile regression without specification
5 Vector quantiles, vector quantile regression and optimal transport
5.1 Brenier’s map as a vector quantile
5.2 Conditional vector quantiles
5.3 Vector quantile regression
-
the first marginal of \(\pi \) is \(\mu _d\), i.e., for every \(\varphi \in C([0,1]^d, {\mathbb {R}})\):$$\begin{aligned} \int _{ [0,1]^d \times {\mathbb {R}}^N\times {\mathbb {R}}^d} \varphi (u)\text {d} \pi (u,x,y)=\int _{[0,1]^d} \varphi (u) \text {d} \mu _d(u), \end{aligned}$$
-
the second marginal of \(\pi \) is \(\nu \), i.e., for every \(\psi \in C_b({ \mathbb {R}}^N\times {\mathbb {R}}^d, {\mathbb {R}})\):$$\begin{aligned} \begin{aligned} \int _{[0,1]^d \times {\mathbb {R}}^N\times {\mathbb {R}}^d} \psi (x,y)\text {d} \pi (u,x,y)&=\int _{{\mathbb {R}}^N\times {\mathbb {R}}^d} \psi (x,y) \text {d} \nu (x,y) \\&={\mathbb {E}}(\psi (X,Y)), \end{aligned} \end{aligned}$$
-
the conditional expectation of x given u is 0, i.e., for every \( b\in C([0,1]^d, {\mathbb {R}}^N)\):$$\begin{aligned} \int _{[0,1]^d \times {\mathbb {R}}^N\times {\mathbb {R}}^d} b(u)^{\top } x \text {d} \pi (u,x,y)=0. \end{aligned}$$
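Once \(\mu _d\) and \(\nu \) are discretized, the three constraints translate directly into a finite linear program. A minimal sketch (hypothetical sizes, d = N = 1, using `scipy.optimize.linprog`): maximize the correlation \(\sum _{ij}\pi _{ij} u_i y_j\) over couplings \(\pi \) subject to the two marginal constraints and the mean-independence constraint. Centering x makes the product coupling \(\mu \otimes \nu \) feasible.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
m, n = 5, 20                           # u-grid points; sample points (x_j, y_j)
u = (np.arange(m) + 0.5) / m           # discretization of [0, 1]   (d = 1)
mu = np.full(m, 1.0 / m)               # uniform weights for mu_d
x = rng.normal(size=n); x -= x.mean()  # centered regressors          (N = 1)
y = x + rng.normal(size=n)
nu = np.full(n, 1.0 / n)               # empirical weights of (X, Y)

# Variables pi_ij >= 0, flattened row-major.  Maximize sum_ij pi_ij u_i y_j,
# i.e., minimize its negative.
c = -np.outer(u, y).ravel()

A_eq, b_eq = [], []
for i in range(m):                     # first marginal:  sum_j pi_ij = mu_i
    row = np.zeros(m * n); row[i * n:(i + 1) * n] = 1.0
    A_eq.append(row); b_eq.append(mu[i])
for j in range(n):                     # second marginal: sum_i pi_ij = nu_j
    row = np.zeros(m * n); row[j::n] = 1.0
    A_eq.append(row); b_eq.append(nu[j])
for i in range(m):                     # mean independence: sum_j pi_ij x_j = 0
    row = np.zeros(m * n); row[i * n:(i + 1) * n] = x
    A_eq.append(row); b_eq.append(0.0)

res = linprog(c, A_eq=np.array(A_eq), b_eq=np.array(b_eq),
              bounds=(0, None), method="highs")
pi = res.x.reshape(m, n)               # optimal constrained coupling
```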
6 Discretization, regularization, numerical minimization
6.1 Discrete optimal transport with a mean independence constraint
6.2 The regularized vector quantile regression (RVQR) problem
-
\((b,\psi ) \leftarrow (b+c, \psi -c^{\top } x)\), where \(c\in {\mathbb {R}}^N \) is a constant translation vector,
-
\(\psi \leftarrow \psi +\lambda \) where \(\lambda \in {\mathbb {R}}\) is a constant.
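The first invariance can be checked mechanically: assuming the objective depends on \((b,\psi )\) only through the combination \(b(u)^{\top }x+\psi (x,y)\), the transformed pair gives the same value at every point (the maps b and \(\psi \) below are hypothetical stand-ins):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 3
u, x, y = 0.4, rng.normal(size=N), 1.7       # arbitrary test point
c = rng.normal(size=N)                       # constant translation vector

b = lambda u: np.array([u, u ** 2, 1.0])     # hypothetical map b: [0,1] -> R^N
psi = lambda x, y: 0.5 * x @ x + y           # hypothetical potential psi(x, y)

before = b(u) @ x + psi(x, y)
after = (b(u) + c) @ x + (psi(x, y) - c @ x)  # (b, psi) <- (b + c, psi - c^T x)
assert np.isclose(before, after)             # the combination is unchanged
```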
6.3 Gradient descent
\(\varepsilon \) | 0.05 | 0.1 | 0.5 | 1 |
---|---|---|---|---|
\(||Q_{soft}-Q_{hard} ||_2/||Q_{soft} ||_2\), \(X=10\%\) | \(3.8\cdot 10^{-3}\) | \(1.5\cdot 10^{-2}\) | \(6.7\cdot 10^{-2}\) | \(9.2\cdot 10^{-2}\) |
\(||Q_{soft}-Q_{hard} ||_2/||Q_{soft} ||_2\), \(X=30\%\) | \(6.8\cdot 10^{-3}\) | \(1.9\cdot 10^{-2}\) | \(7.0\cdot 10^{-2}\) | \(9.3\cdot 10^{-2}\) |
\(||Q_{soft}-Q_{hard} ||_2/||Q_{soft} ||_2\), \(X=60\%\) | \(1.2\cdot 10^{-2}\) | \(2.0\cdot 10^{-2}\) | \(6.9\cdot 10^{-2}\) | \(9.5\cdot 10^{-2}\) |
\(||Q_{soft}-Q_{hard} ||_2/||Q_{soft} ||_2\), \(X=90\%\) | \(1.6\cdot 10^{-2}\) | \(2.3\cdot 10^{-2}\) | \(6.8\cdot 10^{-2}\) | \(9.5\cdot 10^{-2}\) |
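The qualitative behavior in the table above (the gap between soft and hard quantiles growing with \(\varepsilon \)) can be reproduced in spirit with a few lines of entropic optimal transport. A sketch under stated assumptions: a hypothetical sample, quadratic cost, plain Sinkhorn iterations; \(Q_{hard}\) is the sorted sample and \(Q_{soft}\) its barycentric projection under the entropic plan.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
y = np.sort(rng.uniform(size=n))         # sample of Y; sorted values = hard quantiles
u = (np.arange(n) + 0.5) / n             # uniform grid on [0, 1]
a = np.full(n, 1.0 / n)                  # marginal weights (both uniform)

def soft_quantiles(eps, iters=500):
    """Entropic OT between the u-grid and the sample; the soft quantile is the
    barycentric projection Q_soft(u_i) = sum_j pi_ij y_j / sum_j pi_ij."""
    K = np.exp(-((u[:, None] - y[None, :]) ** 2) / eps)   # Gibbs kernel
    v = np.ones(n)
    for _ in range(iters):               # Sinkhorn fixed-point iterations
        w = a / (K @ v)
        v = a / (K.T @ w)
    pi = w[:, None] * K * v[None, :]
    return (pi @ y) / pi.sum(axis=1)

errs = {}
for eps in (0.05, 1.0):
    q = soft_quantiles(eps)
    errs[eps] = np.linalg.norm(q - y) / np.linalg.norm(q)
# as in the table, the relative gap between soft and hard quantiles grows with eps
assert errs[0.05] < errs[1.0]
```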
\(\varepsilon \) | 0.05 | 0.1 | 0.5 | 1 |
---|---|---|---|---|
\(||Q_{QR}-Q_{VQR} ||_2/||Q_{QR} ||_2\), \(X=10\%\) | \(9.8\cdot 10^{-3}\) | \(9.8\cdot 10^{-3}\) | \(2.8\cdot 10^{-2}\) | \(3.8\cdot 10^{-2}\) |
\(||Q_{QR}-Q_{VQR} ||_2/||Q_{QR} ||_2\), \(X=30\%\) | \(8.5\cdot 10^{-3}\) | \(1.1\cdot 10^{-2}\) | \(3.3\cdot 10^{-2}\) | \(4.3\cdot 10^{-2}\) |
\(||Q_{QR}-Q_{VQR} ||_2/||Q_{QR} ||_2\), \(X=60\%\) | \(7.7\cdot 10^{-3}\) | \(9.3\cdot 10^{-3}\) | \(3.1\cdot 10^{-2}\) | \(4.4\cdot 10^{-2}\) |
\(||Q_{QR}-Q_{VQR} ||_2/||Q_{QR} ||_2\), \(X=90\%\) | \(8.2\cdot 10^{-3}\) | \(1.0\cdot 10^{-2}\) | \(3.5\cdot 10^{-2}\) | \(4.9\cdot 10^{-2}\) |