Abstract
We consider two classes of asset price models where either the price or the volatility dynamics are described by a linear function of the (time extended) signature of a primary process, in general a multidimensional continuous semimartingale. These model classes are universal in the sense that classical models can be approximated arbitrarily well or are simply nested in our setup. Under the additional assumption that the primary process is polynomial, we obtain tractable option pricing formulas for so-called sig-payoffs in the first class and closed form expressions for the VIX squared and the log-price in the second one. In both cases the signature samples can be easily precomputed, hence the calibration task can be split into an offline sampling and a standard optimization. We present several applications, in particular the successfully solved joint SPX/VIX calibration problem.
The present work was initiated when the second author was affiliated to École des Ponts ParisTech, CERMICS Lab, Marne la Vallée Cedex 2, France.
1 Introduction
The present chapter is based on the two articles [13] and [11].
In recent years, the traditional approach of calibrating a few well-interpretable parameters has given way to learning the model’s characteristics as a whole, by leveraging all available sources of data. Consequently, overparametrized models have become increasingly important. This shift has opened the door to more robust and data-driven model selection mechanisms. At the same time the choice of admissible models must, however, be limited to those adhering to first principles such as “no-arbitrage”. Finding classes of dynamic processes that are data-driven and satisfy such well-established theoretical principles can be achieved by relying on different universal approximation theorems.
One class of such financial models are so-called signature SDEs, as considered in [12, 37]. These are Itô-diffusions, where the characteristics are linear or real-analytic functions of the signature of some (properly extended) driving Brownian motion or the process itself. The models that we consider in this chapter can be embedded in the framework of signature SDEs and serve as particularly tractable examples thereof.
In the first part (Sect. 3) of the present chapter our goal is to provide a data-driven, universal, tractable and easy to calibrate asset price model. For the sake of exposition we shall assume throughout that the asset S is one-dimensional. We model S as a linear function of the signature of what we call the primary process. This primary process can be a classical driving signal, e.g., a Brownian motion or a polynomial process [9], but also a general d-dimensional continuous semimartingale \(X=(X_{t}^{1},\dots ,X_{t}^{d})_{t \geq 0}\), usually augmented with time t. The corresponding time-extended process is denoted by \((\widehat {X}_{t})_{t \geq 0}=(t,X_{t}^{1},\dots ,X_{t}^{d})_{t \geq 0}\) and its signature by \(\widehat {\mathbb {X}}\). The asset S is then modeled/approximated via a process \(S_{n}(\ell )\) defined as
$$\displaystyle \begin{aligned} S_n(\ell)_t := \ell_\emptyset + \sum_{0<|I|\leq n} \ell_I \langle e_I, \widehat{\mathbb{X}}_t\rangle, \end{aligned}$$
(1)
where \(\ell \) is a linear map of the signature of \(\widehat {X}\), truncated at some level \(n\in \mathbb {N}\), and needs to be inferred from data (see Definition 3.2 for further details). Note that the parameters of X are specified beforehand and can thus be seen, in analogy to machine learning terminology, as hyperparameters. This is crucial as it allows us to split the calibration task into the precomputation of signature samples and a standard optimization to find the parameters of the linear map. This is one of the attractive features of the model; further ones are summarized below.
No Arbitrage
We show in Sect. 3.2 that (1) can also be expressed in terms of stochastic integrals, from which no-arbitrage conditions are straightforward to deduce.
Universality
We refer the reader to [11, Example 3.15], where we show that classical stochastic volatility models with sufficiently regular coefficients can be arbitrarily well approximated by models of form (1).
Option Pricing Formulas for Sig-Payoffs
By approximating (path dependent) payoffs via so-called sig-payoffs of the form \(L(\widehat {\mathbb {S}}_T ({\ell }) )\) (see also [34]), where \(\widehat {\mathbb {S}}\) denotes the signature of \((\widehat {S}_t)_t:=(t, S_t)_t\) and L a linear map, we show that this kind of approximate option pricing reduces to the computation of the expected signature of \(\widehat {X}\). These formulas lead to high tractability whenever \(\mathbb {E}[\widehat {\mathbb {X}}_T]\) is easy to compute. This is the case for all polynomial processes, as analyzed in [12] and in Sect. 2 for the truncated signature.
Calibration to Options
As mentioned above, when calibrating to the market’s volatility surface (see Sect. 3.5), we precompute Monte Carlo samples of \(\widehat {\mathbb {X}}\) and are then only left with finding the parameters \(\ell \), which amounts to a standard optimization. We perform a full calibration to the volatility surface for both simulated and real market data (S&P 500 index), and show that it is not only highly accurate but also very fast (see [11] for details, in particular in the case of time dependent parameters).
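The split between offline sampling and online optimization can be illustrated with a minimal sketch. It assumes a one-dimensional Brownian motion as primary process, truncation level 2, and a plain Monte Carlo call price; the function names `sample_signature` and `call_price` and all parameter values are hypothetical and not taken from [11] or [13].

```python
import math
import random


def sample_signature(T, steps, rng):
    """One sample of the level-2 signature of (t, W_t) on [0, T].

    Index 0 is time and index 1 is the Brownian component.
    """
    dt = T / steps
    w = 0.0
    int_w_dt = 0.0  # <e_(1,0), X_T> = int_0^T W_s ds (trapezoidal rule)
    for _ in range(steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        int_w_dt += (w + 0.5 * dw) * dt
        w += dw
    return {
        (): 1.0,
        (0,): T,
        (1,): w,
        (0, 0): T * T / 2.0,
        (1, 1): w * w / 2.0,        # Stratonovich iterated integral
        (1, 0): int_w_dt,
        (0, 1): T * w - int_w_dt,   # shuffle identity with (1, 0)
    }


def call_price(ell, samples, strike):
    """Monte Carlo call price for S_T = sum_I ell_I <e_I, X_T>."""
    total = 0.0
    for sig in samples:
        s_T = sum(coef * sig[word] for word, coef in ell.items())
        total += max(s_T - strike, 0.0)
    return total / len(samples)


# Offline step: sample the signature once, independently of ell.
rng = random.Random(0)
samples = [sample_signature(1.0, 50, rng) for _ in range(2000)]

# Online step: each candidate ell only needs scalar products.
candidates = (
    {(): 1.0, (1,): 0.2},
    {(): 1.0, (1,): 0.2, (1, 1): 0.1, (0,): -0.05},
)
for ell in candidates:
    print(call_price(ell, samples, strike=1.0))
```

An optimizer would call `call_price` repeatedly with the same `samples`, so the cost of one parameter update is that of the scalar products only. The second candidate pairs the coefficient of \(e_{(1,1)}\) with minus half of it on \(e_{(0)}\), so that the resulting process has constant expectation.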
The second part (Sect. 4) treats a slightly different class of signature-based models and constitutes a novel contribution to joint VIX and S&P 500 modeling and calibration. In the context of volatility modeling, the joint calibration to SPX and VIX options is still considered a rather hard problem, which has become increasingly important over the past years. It is worth mentioning, however, that significant progress has been made recently (see [13] and [17]). We refer the reader to [13, 17, 28] for an extensive literature review.
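Inspired by the previous model, we now study a stochastic volatility model for the discounted price process \(S=(S_{t})_{t\ge 0}\), namely
$$\displaystyle \begin{aligned} \mathrm{d} S_t(\ell) = S_t(\ell)\, \sigma_t^S(\ell)\, \mathrm{d} B_t, \qquad \sigma_t^S(\ell) := \ell_\emptyset + \sum_{0<|I|\leq n} \ell_I \langle e_I, \widehat{\mathbb{X}}_t\rangle, \end{aligned}$$
for a one-dimensional Brownian motion B and a linear map \(\ell \) of the signature \(\widehat {\mathbb {X}}_{t}\) of a time extended d-dimensional continuous semimartingale X, which takes here the role of the primary process. We thus assume that the signature of \(\widehat {X}\) serves as a linear regression basis for the volatility process, while the parameters of the linear map \(\ell \) have to be learned from (option price) data.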
Also in this case the modeling framework can be seen as universal in the class of continuous non-rough stochastic volatility models, which is a consequence of the universal approximation properties of the signature. Besides that, it truly nests several classical models (see Remark 4.3) and incorporates both purely Markovian (in \((S,X)\)) and path-dependent ones. Moreover, it provides for the first time a signature-based approach for pricing VIX options and highly accurate joint calibration results, as illustrated in Sect. 4.2. For the latter we exploit the following mathematical and numerical properties.
Setting \(Z:= (X, B)\), not only \(\sigma ^S(\ell )\) but also the log-price \(\log (S(\ell ))\) is a linear function of the signature of \(\widehat {Z}\). Therefore no (Euler) simulation scheme is needed to sample the price process, leading to immediate computational advantages.
If we additionally assume X to be a polynomial process (see Definition 2.1 and [9, 19]), we obtain an analytic expression for the VIX, which only involves the computation of a matrix exponential. This follows from the property that the truncated signature of a polynomial process is again a polynomial process (see Sect. 2).
We can apply a Monte Carlo approach for option pricing and calibration where we are able to generate the signature samples of \(\widehat {Z}\) offline and independently of the model parameters to be optimized. As in the previous model the calibration task can thus be split into an offline sampling procedure and a standard optimization, since the latter does not require re-sampling for updated model parameters \(\ell \).
Let us first fix the notation used throughout the chapter. Recall the notions introduced in chapter “A Primer on the Signature Method in Machine Learning”, in particular the algebra of formal power series which we will also refer to as extended tensor algebra and denote by \(T((\mathbb {R}^d))\). For a multi-index \(I:=(i_1,\ldots ,i_n)\in \{1, \dots , d\}^n\) we set \(|I|:=n\). We also consider the empty index \(I:=\emptyset \) and set \(|I|:=0\). For each index I we then define
Similarly we set \(I'':=(I')'\) if \(|I|\neq 0\) and \(I''=0\) if \(|I|=0\). Let us denote by \(e_\emptyset \) the basis element corresponding to \((\mathbb {R}^d)^{\otimes 0}\), then each element \(\mathbf {a}\in T((\mathbb {R}^d))\) can be expressed as
$$\displaystyle \begin{aligned} \mathbf{a} = \sum_{|I|\geq 0} {\mathbf{a}}_I e_I \end{aligned}$$
for a collection of \({\mathbf {a}}_I\in \mathbb {R}\). Moreover a vectorization of the elements of the truncated tensor algebra \(T^{(n)}(\mathbb {R}^d)\), whose elements are of the form
$$\displaystyle \begin{aligned} \mathbf{a} = \sum_{|I|\leq n} {\mathbf{a}}_I e_I, \end{aligned}$$
will prove useful. To this end, we introduce the isomorphism \({\mathbf {vec}}: T^{(n)}(\mathbb {R}^d)\to \mathbb {R}^{d_{n}}\) and an arbitrary but fixed injective labeling function \(\mathscr {L}:\{I: |I|\le n\}\longrightarrow \{1,\dots , d_{n}\}\), such that
$$\displaystyle \begin{aligned} {\mathbf{vec}}(\mathbf{a})_{\mathscr{L}(I)} = {\mathbf{a}}_I. \end{aligned}$$
We stick to the convention of mathematical finance, that the process \(S=(S_t)_{t\geq 0}\) denotes the price process. Therefore, we denote the signature of a process X by \(\mathbb {X}\). Moreover, for a multi-index I and \(0\leq s \leq t\), we denote the corresponding increment of a component of the signature by its projection on \(e_I\) that is \(\langle e_I, \mathbb {X}_{s,t}\rangle \). To make the connection to the notation introduced in chapter “A Primer on the Signature Method in Machine Learning”: for a multi-index I and times \(0\leq s \leq t\) we denote the corresponding element of the signature of a process X by
and write \(\mathbb {X}_{t}:= \mathbb {X}_{0,t}\). We emphasize again that from now on we will only use the letter S to denote the price process and never the signature of a process. Recall that the signature of an \(\mathbb {R}^d\)-valued process \(X= (X_t)_{t\geq 0}\) satisfies \(\mathbb {X}_t \in T((\mathbb {R}^d))\) for any \(t\geq 0\). We denote its canonical projection to \(T^{(n)}(\mathbb {R}^d)\) by \(\mathbb {X}^n\) and refer to it as the truncated signature.
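In this chapter, we assume the stochastic processes to be continuous semimartingales. Recall that the elements of the signature of a continuous semimartingale \(X=(X_t)_{t\geq 0}\) can be defined recursively as
$$\displaystyle \begin{aligned} \langle e_\emptyset, \mathbb{X}_{s,t}\rangle := 1, \qquad \langle e_I, \mathbb{X}_{s,t}\rangle := \int_s^t \langle e_{I'}, \mathbb{X}_{s,r}\rangle \circ \mathrm{d} X_r^{i_n}, \end{aligned}$$
for each \(I=(i_1,\ldots , i_n)\), \(I'=(i_1,\ldots , i_{n-1})\) and \(0\leq s\leq t\), where \(\circ \) denotes the Stratonovich integral.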
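For a path of bounded variation the Stratonovich integrals in this recursion are ordinary iterated integrals, so the truncated signature of a piecewise-linear path can be computed exactly. The following is a minimal sketch under that assumption, combining the signature of a straight segment with Chen's identity; the function names are hypothetical, and indices are 0-based so that component 0 can play the role of time.

```python
from itertools import product
from math import factorial


def segment_signature(delta, n):
    """Signature up to level n of a straight line with increment `delta`."""
    sig = {(): 1.0}
    for k in range(1, n + 1):
        for word in product(range(len(delta)), repeat=k):
            term = 1.0
            for i in word:
                term *= delta[i]
            sig[word] = term / factorial(k)
    return sig


def chen(sig_a, sig_b):
    """Chen's identity: signature of a path piece followed by another one."""
    return {
        word: sum(sig_a[word[:cut]] * sig_b[word[cut:]]
                  for cut in range(len(word) + 1))
        for word in sig_a
    }


def signature(path, n):
    """Truncated signature of the piecewise-linear path through `path`."""
    sig = segment_signature([0.0] * len(path[0]), n)  # unit element
    for p, q in zip(path, path[1:]):
        increment = [b - a for a, b in zip(p, q)]
        sig = chen(sig, segment_signature(increment, n))
    return sig


# Time-extended toy path (t, X_t): component 0 is time, component 1 is X.
path = [(0.0, 0.0), (0.5, 1.0), (1.0, -0.5)]
sig = signature(path, 2)
print(sig[(1,)])                 # first level: the increment of X
print(sig[(0, 1)], sig[(1, 0)])  # the two mixed second-level terms
```

The number of stored components grows like \(d^n\), which is why the truncation level is kept small in practice.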
Definition 1.1 (Shuffles)
Set \(I=(i_1,\ldots ,i_n)\) and \(J=(j_1,\ldots ,j_m)\) for some \(n\in \mathbb {N}\) and \(m>0\). We define the shuffle \(\mathbin{\sqcup\!\sqcup}\) and half-shuffle \(\mathbin{\widetilde{\sqcup\!\sqcup}}\) by \(e_I \mathbin{\sqcup\!\sqcup} e_\emptyset := e_\emptyset \mathbin{\sqcup\!\sqcup} e_I := e_I\), \(e_I \mathbin{\widetilde{\sqcup\!\sqcup}} e_\emptyset := 0\), respectively, and
$$\displaystyle \begin{aligned} e_I \mathbin{\sqcup\!\sqcup} e_J := (e_{I'} \mathbin{\sqcup\!\sqcup} e_J)\otimes e_{i_n} + (e_I \mathbin{\sqcup\!\sqcup} e_{J'})\otimes e_{j_m}, \qquad e_I \mathbin{\widetilde{\sqcup\!\sqcup}} e_J := (e_I \mathbin{\sqcup\!\sqcup} e_{J'})\otimes e_{j_m}. \end{aligned}$$
Remark 1.2
Using shuffle and half-shuffle one can recover linear representations of more complex operations on the signature. In particular, for any pair of multi-indices \(I,J\) and each \(t\geq 0\) it holds that (see e.g. [11])
$$\displaystyle \begin{aligned} \langle e_I, \mathbb{X}_t\rangle \langle e_J, \mathbb{X}_t\rangle = \langle e_I \mathbin{\sqcup\!\sqcup} e_J, \mathbb{X}_t\rangle, \qquad \int_0^t \langle e_I, \mathbb{X}_s\rangle \circ \mathrm{d} \langle e_J, \mathbb{X}_s\rangle = \langle e_I \mathbin{\widetilde{\sqcup\!\sqcup}} e_J, \mathbb{X}_t\rangle. \end{aligned}$$
Choosing for example \(I=(1)\) and \(J=(2)\) we get that
$$\displaystyle \begin{aligned} e_{(1)} \mathbin{\sqcup\!\sqcup} e_{(2)} = (e_{\emptyset} \mathbin{\sqcup\!\sqcup} e_{(2)})\otimes e_{(1)} + (e_{(1)} \mathbin{\sqcup\!\sqcup} e_{\emptyset})\otimes e_{(2)} = e_{(2,1)} + e_{(1,2)} \end{aligned}$$
and
$$\displaystyle \begin{aligned} e_{(1)} \mathbin{\widetilde{\sqcup\!\sqcup}} e_{(2)} = (e_{(1)} \mathbin{\sqcup\!\sqcup} e_{\emptyset})\otimes e_{(2)} = e_{(1,2)}. \end{aligned}$$
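The recursive definition of the shuffle and the half-shuffle translates directly into code. The sketch below represents a basis element \(e_I\) by the tuple I and a linear combination by a `Counter` mapping tuples to integer coefficients; the function names `shuffle` and `half_shuffle` are hypothetical.

```python
from collections import Counter


def shuffle(I, J):
    """Shuffle of e_I and e_J as a Counter {multi-index: coefficient}."""
    if not I:
        return Counter({tuple(J): 1})
    if not J:
        return Counter({tuple(I): 1})
    out = Counter()
    # Words ending with the last letter of I.
    for word, coef in shuffle(I[:-1], J).items():
        out[word + (I[-1],)] += coef
    # Words ending with the last letter of J.
    for word, coef in shuffle(I, J[:-1]).items():
        out[word + (J[-1],)] += coef
    return out


def half_shuffle(I, J):
    """Half-shuffle of e_I and e_J: only words ending with the last letter of J."""
    out = Counter()
    if not J:
        return out  # the half-shuffle with the empty word is zero
    for word, coef in shuffle(I, J[:-1]).items():
        out[word + (J[-1],)] += coef
    return out


print(dict(shuffle((1,), (2,))))
print(dict(half_shuffle((1,), (2,))))
print(dict(shuffle((1,), (1, 2))))
```

The first two calls reproduce the example with \(I=(1)\) and \(J=(2)\). The plain recursion is exponential in the word lengths, which is acceptable for the low truncation levels used here.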
Consider an \(\mathbb {R}^d\)-valued process \(Y=(Y_{t})_{t\ge 0}\) satisfying
$$\displaystyle \begin{aligned} \mathrm{d} Y_t = b(Y_t)\, \mathrm{d} t + \sqrt{a(Y_t)}\, \mathrm{d} W_t \end{aligned}$$
(5)
for some d-dimensional Brownian motion W and some maps \(a:\mathbb {R}^d\to \mathbb {S}^d_+\) (where \(\mathbb {S}^d_+\) denotes the set of symmetric positive definite \(d\times d\) matrices) and \(b:\mathbb {R}^d\to \mathbb {R}^d\). Set \(\sigma (Y_t):= \sqrt { a(Y_t)}\). Assume then that \(Y=(Y_{t})_{t\ge 0}\) is a polynomial process as defined in the next definition and denote by \(\mathbb {Y}\) the corresponding signature.
Definition 2.1 (Polynomial Process)
A process \(Y=(Y_{t})_{t\ge 0}\) satisfying (5) is called polynomial process if \(a_{ij}\) is a polynomial of degree at most 2 and \(b_j\) is a polynomial of degree at most 1 for each \(i,j\in \{1,\ldots ,d\}\).
Various representations of the conditional expected signature of Y and analogous quantities, in particular for Brownian motion, can be found in [3, 5, 18, 32, 33]. We also refer to [15, 20] for the case of Lévy processes. Our approach aligns with [12] and is grounded in the classical theory of polynomial processes (see [9] and [19]). Although the framework of general signature SDEs considered in [12] requires results for infinite-dimensional stochastic processes (see, for example, [8, 14]), the current assumption of Y being a polynomial process allows us to remain within the finite-dimensional setting.
Lemma 2.2
Let \((Y_t)_{t\geq 0}\) be a polynomial process satisfying (5). The corresponding drift and diffusion coefficients b and a can then be written as
$$\displaystyle \begin{aligned} b_j(y)=b_j^c+\sum_{k=1}^db_j^{k}y_k \qquad \text{ and }\qquad a_{ij}(y)=a_{ij}^c+\sum_{k=1}^da_{ij}^ky_k+\sum_{k,h=1}^da_{ij}^{kh}y_ky_h,\end{aligned}$$
for some \(b_j^c\), \(b_j^{k}\), \(a_{ij}^c\), \(a_{ij}^k\), \(a_{ij}^{kh}=a_{ij}^{hk}\in \mathbb {R}\), with \(i,j=1,\dots ,d\). Moreover, \(b_j(Y_t)=\langle {\mathbf {b}}_j,\mathbb {Y}_t\rangle \) and \(a_{ij}(Y_t)=\langle {\mathbf {a}}_{ij},\mathbb {Y}_t\rangle \) for
$$\displaystyle \begin{aligned} {\mathbf{b}}_j &:= \Big( b_j^c + \sum_{k=1}^{d} b_j^k Y_0^k \Big) e_{\emptyset} + \sum_{k=1}^{d} b_j^k e_k, \\ {\mathbf{a}}_{ij} &:= \Big( a_{ij}^c + \sum_{k=1}^{d} a_{ij}^k Y_0^k + \sum_{k,h=1}^{d} a_{ij}^{kh} Y_0^k Y_0^h \Big) e_{\emptyset} + \sum_{k=1}^{d} \Big( a_{ij}^k + 2 \sum_{h=1}^{d} a_{ij}^{kh} Y_0^h \Big) e_k + \sum_{k,h=1}^{d} a_{ij}^{kh}\, e_k \mathbin{\sqcup\!\sqcup} e_h. \end{aligned}$$
Note that the upper index on \(Y_0^k\) and \(Y_0^h\) refers to the components of Y and not to powers.
Proof
The first representation follows by the definition of polynomial processes, according to which b and a are polynomials of degree at most 1 and 2, respectively. For the second representation it then suffices to note that \(\langle e_\emptyset ,\mathbb {Y}_t^1\rangle =\langle e_\emptyset ,\mathbb {Y}_t^2\rangle =1,\ \langle e_k,\mathbb {Y}_t^1\rangle = \langle e_k,\mathbb {Y}_t^2\rangle =(Y_t^k-Y_0^k),\) and
$$\displaystyle \begin{aligned} \langle e_k \mathbin{\sqcup\!\sqcup} e_h, \mathbb{Y}^2_t \rangle = (Y^k_t - Y^k_0)(Y^h_t - Y^h_0). \end{aligned}$$
□
Lemma 2.3
Let \((Y_t)_{t\geq 0}\) be a polynomial process satisfying (5). Then the truncated signature \(( \mathbb {Y}_t^n)_{t\geq 0}\) is a polynomial process and for each \(|I|\leq n\) it holds that
where \(L:T((\mathbb {R}^d))\to T((\mathbb {R}^d))\) satisfies \(L(T^{(n)}(\mathbb {R}^d))\subseteq T^{(n)}(\mathbb {R}^d)\) and is given by
$$\displaystyle \begin{aligned} Le_I = e_{I'} \mathbin{\sqcup\!\sqcup} {\mathbf{b}}_{i_{|I|}} + \frac{1}{2}\, e_{I''} \mathbin{\sqcup\!\sqcup} {\mathbf{a}}_{i_{|I|-1} i_{|I|}} \end{aligned}$$
(6)
for \({\mathbf {b}}\) and \({\mathbf {a}}\) as in Lemma 2.2.
Proof
Let \(\sigma _j(Y_t)\) denote the j-th row of \(\sigma (Y_t)\). Using the definition of the signature, the Stratonovich integral and the shuffle property for each \(|I|\geq 0\) we can compute
$$\displaystyle \begin{aligned} \langle e_I, \mathbb{Y}_t\rangle &= \int_0^t \langle e_{I'}, \mathbb{Y}_s\rangle \circ \mathrm{d} Y_s^{i_{|I|}} \\ &= \int_0^t \langle e_{I'}, \mathbb{Y}_s\rangle\, \mathrm{d} Y_s^{i_{|I|}} + \frac{1}{2}\int_0^t \langle e_{I''}, \mathbb{Y}_s\rangle\, \mathrm{d} [Y^{i_{|I|-1}}, Y^{i_{|I|}}]_s \\ &= \int_0^t \langle e_{I'}, \mathbb{Y}_s\rangle\, b_{i_{|I|}}(Y_s)\, \mathrm{d} s + \frac{1}{2}\int_0^t \langle e_{I''}, \mathbb{Y}_s\rangle\, a_{i_{|I|-1} i_{|I|}}(Y_s)\, \mathrm{d} s + \int_0^t \langle e_{I'}, \mathbb{Y}_s\rangle\, \sigma_{i_{|I|}}(Y_s)\, \mathrm{d} W_s \\ &= \int_0^t \langle Le_I, \mathbb{Y}_s\rangle\, \mathrm{d} s + \int_0^t \langle e_{I'}, \mathbb{Y}_s\rangle\, \sigma_{i_{|I|}}(Y_s)\, \mathrm{d} W_s. \end{aligned}$$
Fix \(|I|\leq n\). Since
\(e_I \mathbin{\sqcup\!\sqcup} e_J\) is a linear combination of basis elements \(e_H\) with \(|H| = |I| + |J|\),
we get that \(L(T^{(n)}(\mathbb {R}^d))\subseteq T^{(n)}(\mathbb {R}^d)\) and thus that the corresponding drift components are linear maps in \(\mathbb {Y}^n\). Similarly, fixing \(|I|,|J|\leq n\) and using that
$$\displaystyle \begin{aligned} {\mathbf{a}}_{ij} = \sum_{|I|, |J| \leq 1} \lambda_{ij}^{IJ}\, e_I \mathbin{\sqcup\!\sqcup} e_J \end{aligned}$$
for some \(\lambda _{ij}^{IJ}\in \mathbb {R}\) we get that
$$\displaystyle \begin{aligned} \langle e_{I'}, \mathbb{Y}_s \rangle\, \sigma_{i_{|I|}}(Y_s) \big( \langle e_{J'}, \mathbb{Y}_s \rangle\, \sigma_{j_{|J|}}(Y_s) \big)^{\top} &= \langle e_{I'}, \mathbb{Y}_s \rangle \langle e_{J'}, \mathbb{Y}_s \rangle \langle {\mathbf{a}}_{i_{|I|} j_{|J|}}, \mathbb{Y}_s \rangle \\ &= \sum_{|H_1|, |H_2| \leq 1} \lambda_{i_{|I|} j_{|J|}}^{H_1 H_2} \langle e_{I'} \mathbin{\sqcup\!\sqcup} e_{H_1}, \mathbb{Y}_s \rangle \langle e_{J'} \mathbin{\sqcup\!\sqcup} e_{H_2}, \mathbb{Y}_s \rangle. \end{aligned}$$
This shows that the components of the corresponding diffusion matrix are polynomials of degree 2 in \(\mathbb {Y}^n\). The polynomial property follows from Lemma 2.2 in [19]. □
As the linear operator L maps the finite-dimensional vector space \(T^{(n)}(\mathbb {R}^d)\) into itself, it can be represented by a matrix.
Definition 2.4
We call the operator L defined in (6) the dual operator corresponding to \(\mathbb {Y}\) and denote by G the \(d_n\)-dimensional matrix representative of L. Explicitly, for each \(|I|\leq n\) we consider coefficients \(\eta _{IJ}\in \mathbb {R}\) such that
$$\displaystyle \begin{aligned} Le_I = \sum_{|J|\leq n} \eta_{IJ}\, e_J \end{aligned}$$
and fix an injective labeling function \({\mathscr {L}}:\{I\colon |I|\leq n\}\to \{1,\ldots ,d_n\}\). The matrix \(G\in \mathbb {R}^{d_n\times d_n}\) is then given by
The results of the following theorem have been implemented and the code is available at the repository github.com/sarasvaluto/AffPolySig presented in [13].
Theorem 2.5
Let \((Y_t)_{t\geq 0}\) be a polynomial process satisfying (5), G be the \(d_n\)-dimensional matrix representative of the dual operator corresponding to \(\mathbb {Y}\), and \((\mathcal {F}_t)_{t\geq 0}\) be the filtration generated by \((Y_t)_{t\geq 0}\). Then for each \(T,t\geq 0\) and each \(|I|\leq n\) it holds
where \(e^{(\cdot )}\) denotes the matrix exponential.
Proof
Lemma 2.3 yields that \({\mathbf {vec}}(\mathbb {Y}^n)\) is a polynomial process and the claim follows by Theorem 3.1 in [19] for polynomials of degree 1. □
In the special case given by \(Y=W\) the matrix G is nilpotent and we obtain a more explicit representation. Loosely speaking, the signature element \(\langle e_I, \widehat {\mathbb {W}}_t \rangle \) has nonzero expectation only if all the indices \(i_j \neq 0\) in I are grouped in blocks of even length.
Corollary 2.6
Let W be a vector of d correlated Brownian motions with correlation matrix \(\rho \). Consider a multi-index I admitting the representation
where \(\rho (J):=\prod _{k=1}^{|J|/2} \rho _{j_{2k-1},j_{2k}}\). If I does not admit representation (8) then \(\mathbb {E}[\langle e_I,\widehat {\mathbb {W}}_t\rangle ]=0\). Additionally, for each \(s, t\geq 0\) it holds
for each I of the form \(e_I=e_{j_0}^{\otimes k_0}\otimes \cdots \otimes e_{j_m}^{\otimes k_m}\) with \(k_i\) even whenever \(j_i>0\).
In order to better understand Theorem 2.5 we propose two examples. The first concerns a vector of correlated Brownian motions and constitutes the core of the proof of Corollary 2.6. The second one leads to the formulas that will be applied later on in the chapter.
Example 2.8
Let W be a d-dimensional Brownian motion and a process \((X_t)_{t\geq 0}\) given by
for some matrix a of the form \(a_{ij}(X_{t})=\sigma ^i\sigma ^j\rho _{ij}\) and some constant \(\sigma ^i>0\) and \(\rho _{ij}\in [-1,1]\). Observe that \(\widehat {X}\) satisfies (5) in \(d+1\) dimensions for
The corresponding \({\mathbf {b}}\) and \({\mathbf {a}}\) are given by \( {\mathbf {b}}_{j}=e_{\emptyset } 1_{\{j=0\}}\) and \( {\mathbf {a}}_{ij}=e_{\emptyset }\sigma ^i\sigma ^j\rho _{ij}1_{\{i,j\neq 0\}} \) and we thus get
\(L(e_I \otimes e_0) = e_I \mathbin{\sqcup\!\sqcup} {\mathbf{b}}_0 + \frac{1}{2}\, e_{I'} \mathbin{\sqcup\!\sqcup} {\mathbf{a}}_{i_{|I|}0} = e_I\)
, and \(L(e_{0}\otimes e_{1}\otimes e_{2})=\frac {1}{2}e_{0}\sigma ^{1}\sigma ^{2}\rho _{12}\). Letting \((\mathcal {F}_t)_{t\geq 0}\) be the filtration generated by \((\widehat {X}_t)_{t\geq 0}\), by Theorem 2.5 we can thus conclude that
where G denotes the \((d+1)_n\)-dimensional matrix representative of L.
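Since L shortens every basis element in this example, the expected signature can also be evaluated without forming G explicitly. The sketch below assumes that \(m_I(t):=\mathbb {E}[\langle e_I,\widehat {\mathbb {X}}_t\rangle ]\) solves \(m_I'(t)=m_{Le_I}(t)\) with \(m_\emptyset \equiv 1\), and that L acts as in the two instances computed above: it removes a trailing time index, or removes a trailing pair of non-time indices \((i,j)\) at the price of the factor \(\frac {1}{2}\sigma ^i\sigma ^j\rho _{ij}\), and is zero otherwise. Under these assumptions each application of L contributes one time integration, which gives a factor \(t^k/k!\) after k steps. The function name and the dictionary-based inputs are hypothetical.

```python
from math import factorial


def expected_signature(word, t, sigma, rho):
    """E<e_I, X_t> for X = (t, sigma^1 W^1, ..., sigma^d W^d).

    `word` is the multi-index I, with index 0 denoting time;
    `sigma[i]` and `rho[(i, j)]` are the volatilities and correlations.
    """
    word = tuple(word)
    coef = 1.0
    steps = 0
    while word:
        if word[-1] == 0:
            # Trailing time index: L(e_I (x) e_0) = e_I.
            word = word[:-1]
        elif len(word) >= 2 and word[-2] != 0:
            # Trailing pair (i, j) of non-time indices.
            i, j = word[-2], word[-1]
            coef *= 0.5 * sigma[i] * sigma[j] * rho[(i, j)]
            word = word[:-2]
        else:
            # An isolated non-time index: L e_I = 0.
            return 0.0
        steps += 1
    return coef * t ** steps / factorial(steps)


sigma = {1: 1.0, 2: 2.0}
rho = {(1, 1): 1.0, (2, 2): 1.0, (1, 2): 0.5, (2, 1): 0.5}
print(expected_signature((1, 1), 2.0, sigma, rho))
print(expected_signature((0, 1, 2), 1.0, sigma, rho))
print(expected_signature((1, 0, 1), 1.0, sigma, rho))
```

For \(I=(1,1)\) the result is \(t/2\), in line with \(\langle e_{(1,1)},\widehat {\mathbb {X}}_t\rangle =(W_t^1)^2/2\), and the index \((1,0,1)\) has zero expectation because its non-time indices are not grouped in blocks of even length.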
Example 2.9
In a setting similar to that of the previous example, let W be a d-dimensional Brownian motion and set \(Z_t:=(X_t,B_t)\) for another one-dimensional Brownian motion \((B_t)_{t\geq 0}\). Suppose that
where \(\kappa , \theta \in \mathbb {R}^d\) and \(\operatorname {diag}(\kappa )\) denotes a diagonal matrix consisting of the components of \(\kappa \) and a is as before. Observe that \(\widehat {Z}\) satisfies (5) in \(d+2\) dimensions for
where \(\kappa ^{d+1}:=0\), \(\sigma ^{d+1}:=1\), and \(\rho _{j(d+1)}\) is the correlation between \(X^j\) and B. The corresponding \({\mathbf {b}}\) is then given by \( {\mathbf {b}}_{j}=e_{\emptyset } (1_{\{j=0\}}+\kappa ^{j}(\theta ^{j}-\widehat {Z}_{0}^{j})1_{\{j\neq 0\}})-e_{j}\kappa ^{j}1_{\{j\neq 0\}}\) and we thus get
$$\displaystyle \begin{aligned} Le_{I} ={}& e_{I'}\Big(1_{\{i_{|I|}=0\}} + \kappa^{i_{|I|}}\big(\theta^{i_{|I|}} - \widehat{Z}_{0}^{i_{|I|}}\big)1_{\{i_{|I|}\neq 0\}}\Big) - \big(e_{I'} \mathbin{\sqcup\!\sqcup} e_{i_{|I|}}\big)\kappa^{i_{|I|}}1_{\{i_{|I|}\neq 0\}} \\ &+ \frac{1}{2}\, e_{I''}\, \sigma^{i_{|I|-1}}\sigma^{i_{|I|}}\rho_{i_{|I|-1}i_{|I|}}1_{\{i_{|I|-1},\, i_{|I|}\neq 0\}}. \end{aligned}$$
3 Linear Signature Models
Let us now introduce a framework for signature-based asset price models. We fix a time horizon \(T>0\), consider a d-dimensional continuous semimartingale \( X:=( X^1,\ldots , X^{d})\) and its matrix-valued quadratic covariation \([X]\). We suppose that \( X\) encodes all the information to represent the market asset S. Recall that for notational convenience we assume that S is one-dimensional.
Furthermore, we suppose that X has some tractability properties which are made precise below and which are for instance satisfied by a d-dimensional Brownian motion and, more generally, by all polynomial diffusion processes.
3.1 Definition and First Properties
We regard the continuous semimartingale X as the primary process, being the main modeling building block. Its time extension is denoted by
We use \(e_0\) for the component of \(\widehat {X}\) corresponding to time and \(e_k\) for its component corresponding to \( X^k\). Its signature is denoted by \(\widehat {\mathbb {X}}_{t}\).
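Throughout the section we make the following standing assumption about the diffusion matrix of X: for all \(i,j\in \{1,\ldots ,d\}\) there are coefficients \(a_{ij}^J\in \mathbb {R}\) such that
$$\displaystyle \begin{aligned} [X^i, X^j]_t = \int_0^t \sum_{|J|\leq m} a_{ij}^J \langle e_J, \widehat{\mathbb{X}}_s\rangle\, \mathrm{d} s \end{aligned}$$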
for some \(m\in \mathbb {N}\). Observe that this assumption is always satisfied when X is given by a polynomial process.
Remark 3.1
It could be of interest to consider the extension \(\widehat {X}_t:=(t, X_t,[ X]_t)\), which includes the \(d^2\)-dimensional process given by the quadratic covariation of the primary process X. In this case the components of the diffusion matrix do not need to be linear functions of the time extended signature, but could for instance be general path-dependent functionals. Moreover, the map \(t \mapsto [X]_t\) does not need to be absolutely continuous with respect to the Lebesgue measure, hence X does not need to be an Itô-semimartingale. For ease of exposition we focus here on the simple time extension and refer to [11] for the slightly more general setup.
Our goal consists in approximating the dynamics of \( S\) with a signature model.
Definition 3.2 (Signature Model)
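A signature model is a stochastic process of the form
$$\displaystyle \begin{aligned} S_n(\ell)_t := \ell_\emptyset + \sum_{0<|I|\leq n} \ell_I \langle e_I, \widehat{\mathbb{X}}_t\rangle, \end{aligned}$$
(13)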
where \(n\in \mathbb {N}\) and \(\ell :=\{\ell _\emptyset , \ell _I\colon 0<|I|\leq n\}\).
Remark 3.3
By Proposition 3.4 below, the class of Sig-SDE models considered in [37] can be embedded in our framework by choosing a properly extended one-dimensional Brownian motion as primary process.
In the following we list several important properties which make signature models a tractable framework for stochastic finance.
For each \(t \in [0,T]\), \( S_n(\ell )_t\) is linear in \(\widehat {\mathbb {X}}_t\). This in particular implies that having precomputed \(\widehat {\mathbb {X}}\) an update of the parameters \(\ell \) boils down to evaluating the scalar product in (13).
The quadratic variation of processes of form (13) is again of the form (13).
Representation (13) remains invariant under polynomial transformations.
Itô-integrals of processes of form (13) with respect to processes of form (13) are again processes of form (13). This includes in particular the signature \(\widehat {\mathbb {S}}_n(\ell )\) of \(\widehat {S}_n(\ell )_t:=(t, S_n(\ell )_t)\) or expressions of the form \(\int _0^\cdot S_n( \ell )_s \mathrm {d} X_s^i.\)
The latter point implies that the expected signature of \(\widehat {S}_n(\ell )\) can be expressed as
for some \(P_J\) such that \(P_J(\cdot ,\mathbb {E}[\widehat {\mathbb {X}}_t])\) is a polynomial of degree \(|J|\) and \(P_J(\ell , \cdot )\) is a linear map for each \(\ell \) (see Theorem 3.9 and Remark 3.11 for more details).
Due to the universal approximation theorem (see e.g. [7, 15]) this yields approximations
for each map f, which is continuous in a suitable sense, where \(P_f\) is given by a finite linear combination of maps \(P_J\) as above and with \(\widehat {\mathbb {S}}^2\) we denote here the second order lift. This includes representations for
$$\displaystyle \begin{aligned} \tilde{e}_I^k := e_I \otimes e_k - \sum_{|J| \leq m} \frac{a_{i_{|I|} k}^J}{2}\, (e_{I'} \mathbin{\sqcup\!\sqcup} e_J) \otimes e_0. \end{aligned}$$
Proof
The representation of \(\int _0^t \langle e_I,\widehat {\mathbb {X}}_s\rangle \mathrm {d} s\) follows from the definition of the signature. We proceed with the proof of the representation of \(\int _0^t \langle e_I,\widehat {\mathbb {X}}_s\rangle \mathrm {d} X_s^k\). For \(I=\emptyset \) the claim follows. By the definition of the Stratonovich integral,
The shuffle property and the definition of the signature yield then
$$\displaystyle \begin{aligned} \int_0^t \langle e_I, \widehat{\mathbb{X}}_s \rangle\, \mathrm{d}[X^{k_1}, X^{k_2}]_s &= \int_0^t \sum_{|J| \leq m} a_{k_1 k_2}^J \langle e_I, \widehat{\mathbb{X}}_s \rangle \langle e_J, \widehat{\mathbb{X}}_s \rangle\, \mathrm{d}s \\ &= \sum_{|J| \leq m} a_{k_1 k_2}^J \langle (e_I \mathbin{\sqcup\!\sqcup} e_J) \otimes e_0, \widehat{\mathbb{X}}_t \rangle \end{aligned}$$
and the claim follows. □
Suppose that \( X\) is a vector of correlated Brownian motions with correlation matrix \(\rho \). Then, the transformation introduced in Lemma 3.5 reads
which is of the form given by (13). Conversely, by Lemma 3.5 we also have
$$\displaystyle \begin{aligned} \langle e_I \otimes e_0, \widehat{\mathbb{X}}_t \rangle &= \int_0^t \langle e_I, \widehat{\mathbb{X}}_s \rangle\, \mathrm{d}s, \\ \langle e_I \otimes e_k, \widehat{\mathbb{X}}_t \rangle &= \int_0^t \langle e_I, \widehat{\mathbb{X}}_s \rangle\, \mathrm{d}X_s^k + \frac{1}{2} \sum_{|J| \leq m} \int_0^t \langle a_{i_{|I|}k}^J\, (e_{I'} \mathbin{\sqcup\!\sqcup} e_J), \widehat{\mathbb{X}}_s \rangle\, \mathrm{d}s, \end{aligned}$$
yielding the claim. □
3.2 Absence of Arbitrage and Universality Properties
The results obtained in the representation above will enable us to establish conditions ensuring absence of arbitrage. More precisely, we shall make the assumption that the principle of “no free lunch with vanishing risk” [16] holds. This assumption is equivalent to the existence of an equivalent local martingale measure \(\mathbb {Q}\) as \(S_{n}(\ell )\) has continuous sample paths. Throughout this section, we will assume zero interest rates and that the asset has already been discounted. To formulate a precise condition for the absence of arbitrage, we rely on the following corollary, which is a direct consequence of Proposition 3.4 and Lemma 3.5.
Corollary 3.6
Suppose that \(X\) is a local martingale. Then \(S_{n}(\ell )\) is a local martingale if and only if it admits a representation of the form
for some \(D\in \{1,\ldots , d\}\) and \(\ell :=\{\ell _\emptyset , \ell _I^{k}\colon 0\leq |I|\leq n-1 \text{ and } k\in \{1,\ldots ,d\}\}.\)
Proof
Since X is a local martingale, \(S_n(\ell )\) in the representation of Proposition 3.4 is a local martingale if and only if all integrals with respect to time and the quadratic variation process vanish. This means that \(S_n(\ell )\) is of form
Let us now formulate sufficient no-arbitrage conditions.
Corollary 3.7
Suppose that there is an equivalent measure \(\mathbb {Q} \sim \mathbb {P}\) such that X is a local \(\mathbb {Q}\)-martingale. Then the following holds.
(i)
The model \(S_{n}(\ell )\) is free of arbitrage if it admits a representation of the form (14).
(ii)
If \(\mathbb {Q}\) is an equivalent local martingale measure for \(S_{n}(\ell )\), then \(S_{n}(\ell )\) is necessarily of the form (14).
Proof
The first assertion is a direct consequence of Corollary 3.6, as form (14) implies that \(S_{n}(\ell )\) is a local \(\mathbb {Q}\)-martingale and thus \(\mathbb {Q}\) is an equivalent local martingale measure. The assumption of the second assertion implies that \(S_{n}(\ell )\) is a local martingale under \(\mathbb {Q}\), whence by Corollary 3.6 it has to be of form (14). □
Remark 3.8
(i)
It is important to recognize that \(S_{n}(\ell )\) could be free of arbitrage without being of the form in (14). This scenario arises when the primary process X is not a local martingale under any of the equivalent local martingale measures.
(ii)
Additionally, note that if \(S_n(\ell )\) is of the form (14) and a local martingale under \(\mathbb {Q}\), it does not necessarily imply that X is a local \(\mathbb {Q}\)-martingale. This situation arises when the drift terms in (15) cancel each other out. It is worth mentioning that in the one-dimensional case (\(d=1\)), such a scenario is impossible, and consequently, X is inevitably a local \(\mathbb {Q}\)-martingale. Moreover, \(\mathbb {Q}\) is unique and the model thus complete.
3.3 The Expected Signature of \(S_n(\ell )\)
For the pricing of so-called sig-payoffs discussed in Sect. 3.4, it is crucial to be able to compute the expected signature of \(\widehat {S}_n(\ell )_t=(t,S_n(\ell )_{t})\). In this section, we present formulas that link this computation to the calculation of the expected signature of \(\widehat {X}_t\), which can often be explicitly computed. Specifically, when X is a Brownian motion, this is a well-known result (refer to, e.g., [18]). For the case where X is a polynomial process, the computation can be carried out using polynomial processes techniques (see Sect. 2). In a general setting, it can be computed by solving an infinite-dimensional system of linear PDEs corresponding to the Kolmogorov forward equation of the signature process (see [36]). For a comprehensive treatment of signature cumulants, i.e., the logarithm of the expected signature, we direct the readers to [21].
Theorem 3.9
Fix \(n\in \mathbb {N}\), a multi-index J, and \(D\in \{1,\ldots , d\}\), and denote by \(\widehat {\mathbb {S}}_n(\ell )_{t}\) the signature of \(\widehat {S}_n(\ell )_{t}\). Let \(e_0\) be the component of \(\widehat {S}_n(\ell )\) corresponding to time and \(e_1\) its component corresponding to \(S_n(\ell )\). Define \(e(\emptyset ,\ell ):=\tilde e(\emptyset ,\ell ):=e_\emptyset \) and
$$\displaystyle \begin{aligned} e(J,\ell ) &= \widetilde{\sqcup\!\sqcup}_{i=1}^{|J|}\bigg( e_0{\boldsymbol 1_{\{j_i=0\}}}+\Big(\sum_{0<|I|\leq n}\ell _I e_I\Big){\boldsymbol 1_{\{j_i=1\}}}\bigg),\\ \tilde e(J,\ell ) &= \widetilde{\sqcup\!\sqcup}_{i=1}^{|J|}\bigg( e_0{\boldsymbol 1_{\{j_i=0\}}}+\sum_{k=1}^{D}\Big(\ell _{\emptyset }^{k}e_k+\sum_{0<|I|\leq n-1}\ell _I^{k}\tilde e_I^{k}\Big){\boldsymbol 1_{\{j_i=1\}}}\bigg), \end{aligned}$$
for \(|J|>0\), with \(\widetilde{\sqcup\!\sqcup}\) the half-shuffle introduced in Definition 1.1. Then the following representations hold.
\( \langle e_{J},{\widehat {\mathbb {S}}_n(\ell )}_{t}\rangle =\langle e(J,\ell ),\widehat {\mathbb {X}}_{t}\rangle \), if \(S_n(\ell )\) is given by (13), and
\(\langle e_{J},{\widehat {\mathbb {S}}_n(\ell )}_{t}\rangle =\langle \tilde e(J,\ell ),\widehat {\mathbb {X}}_{t}\rangle \), if \( S_{n}(\ell )\) is given as in Corollary 3.6.
Proof
We proceed by induction to prove the claim. Fix \(S_n(\ell )\) as in (13). For \(J=\emptyset \) the claim follows by the definition of signature. Suppose the claim holds true for each J such that \(|J|=m-1\), and fix J with \(|J|\leq m\). Then
The induction hypothesis and Remark 1.2 yield the first claim and the second one is analogous. □
Remark 3.10
The linear combinations of multiindices given by \(e(J,\ell )\) might look very abstract and we thus provide some additional details. The intuition behind its construction is as follows. For each 0 in J we are integrating with respect to time and we thus add \(\widetilde{\sqcup\!\sqcup}\, e_0\) to the current linear combination of multiindices. For each 1 in J we are instead integrating with respect to \(S_n(\ell )\) and we thus need to add \(\widetilde{\sqcup\!\sqcup}\,\big(\sum _{0<|I|\leq n}\ell _I e_I\big)\). Since integrating with respect to \(S_n(\ell )\) corresponds to integrating with respect to \(S_n(\ell )-S_n(\ell )_0\) the term \(\ell _{\emptyset }e_{\emptyset }\) can be omitted. Choosing \(J=(0,1)\) and \(J=(1,1)\) we would for instance get
$$\displaystyle \begin{aligned} e((0,1),\ell ) &= e_0\mathrel{\widetilde{\sqcup\!\sqcup}}\sum_{0<|I|\leq n}\ell _I e_I=\sum_{0<|I|\leq n}\ell _I\, e_0\mathrel{\widetilde{\sqcup\!\sqcup}} e_I,\\ e((1,1),\ell ) &= \Big(\sum_{0<|I_1|\leq n}\ell _{I_1}e_{I_1}\Big)\mathrel{\widetilde{\sqcup\!\sqcup}}\Big(\sum_{0<|I_2|\leq n}\ell _{I_2}e_{I_2}\Big)=\sum_{0<|I_1|,|I_2|\leq n}\ell _{I_1}\ell _{I_2}\, e_{I_1}\mathrel{\widetilde{\sqcup\!\sqcup}} e_{I_2}. \end{aligned}$$
The intuition behind \(\tilde e(J,\ell )\) is similar.
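The word calculus behind \(e(J,\ell )\) can be illustrated with a short Python sketch. It is only meant as an illustration under stated assumptions: the function names are hypothetical, words are stored as tuples of letters, and the half-shuffle is assumed to act as \(e_I\mathrel{\widetilde{\sqcup\!\sqcup}} e_J=(e_I\mathrel{\sqcup\!\sqcup} e_{J'})\otimes e_{j_{|J|}}\), a convention that should be checked against Definition 1.1.

```python
from collections import Counter

def shuffle(u, v):
    """Shuffle product of two words (tuples of letters), returned as {word: multiplicity}."""
    if not u:
        return Counter({tuple(v): 1})
    if not v:
        return Counter({tuple(u): 1})
    out = Counter()
    # The last letter of every shuffled word comes either from u or from v.
    for w, c in shuffle(u[:-1], v).items():
        out[w + (u[-1],)] += c
    for w, c in shuffle(u, v[:-1]).items():
        out[w + (v[-1],)] += c
    return out

def half_shuffle(A, B):
    """Bilinear half-shuffle of two linear combinations {word: coefficient}.

    Assumed convention: e_u half-shuffled with e_v equals (e_u shuffle e_v') followed by
    the last letter of v, where v' is v without its last letter.
    """
    out = {}
    for u, a in A.items():
        for v, b in B.items():
            if not v:
                raise ValueError("the right factor must not contain the empty word")
            for w, c in shuffle(u, v[:-1]).items():
                key = w + (v[-1],)
                out[key] = out.get(key, 0.0) + a * b * c
    return out

def e_of(J, ell):
    """e(J, ell): fold the letters of J from left to right with the half-shuffle.

    `ell` maps non-empty words to coefficients (the empty-word coefficient is omitted,
    as explained in Remark 3.10). A letter 0 in J appends time, a letter 1 appends S_n(ell).
    """
    time_letter = {(0,): 1.0}
    result = {(): 1.0}  # start from the empty word
    for j in J:
        result = half_shuffle(result, time_letter if j == 0 else ell)
    return result
```

With these conventions, `e_of((0, 1), {(1,): 2.0, (2,): 3.0})` returns `{(0, 1): 2.0, (0, 2): 3.0}`, in line with the first formula of the remark.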
Remark 3.11
(i)
Let \( S_n(\ell )\) be as in (13) and observe that
$$\displaystyle \begin{aligned} e(J,\ell )=\widetilde{\sqcup\!\sqcup}_{i=1}^{|J|}\sum_{0<|I|\leq n}\Big(e_I{\boldsymbol 1_{\{j_i=0\}}}{\boldsymbol 1_{\{I=(0)\}}}+\ell _I e_I{\boldsymbol 1_{\{j_i=1\}}}\Big). \end{aligned}$$
Setting \(c(j,I,\ell ):={\boldsymbol 1_{\{j=0\}}}{\boldsymbol 1_{\{I=(0)\}}}+\ell _{I}{\boldsymbol 1_{\{j=1\}}}\) we thus obtain that
$$\displaystyle \begin{aligned} \mathbb{E}[\langle e_J,\widehat{\mathbb{S}}_n(\ell )_t\rangle ] &=\mathbb{E}[\langle e(J,\ell ),\widehat{\mathbb{X}}_t\rangle ]\\ &=\sum_{|I_1|,\ldots ,|I_{|J|}|=1}^{n}\mathbb{E}\Big[\Big\langle \widetilde{\sqcup\!\sqcup}_{i=1}^{|J|}e_{I_i},\widehat{\mathbb{X}}_t\Big\rangle \Big]\prod_{i=1}^{|J|}c(j_i,I_i,\ell ). \end{aligned}$$
Although this representation may seem intricate at first glance, it is, in fact, quite convenient as the expectations
$$\displaystyle \begin{aligned} \mathbb{E}\Big[\Big\langle \widetilde{\sqcup\!\sqcup}_{i=1}^{|J|}e_{I_i},\widehat{\mathbb{X}}_t\Big\rangle \Big] \end{aligned}$$
can be computed just once in advance. Since \(c(j,I,\cdot )\) is affine, we also immediately obtain that the map
$$\displaystyle \begin{aligned} P_J(\ell ,\mathbb{E}[\widehat{\mathbb{X}}_t]):=\mathbb{E}[\langle e_J,\widehat{\mathbb{S}}_n(\ell )_t\rangle ] \end{aligned}$$
is polynomial of degree \(|J|\) in its first argument and linear in the second one.
(ii)
Similarly, let \( S_{n}(\ell )\) be as in Corollary 3.6, set \(\tilde e_\emptyset ^k:=e_k\), and observe that
$$\displaystyle \begin{aligned} \tilde e(J,\ell )=\widetilde{\sqcup\!\sqcup}_{i=1}^{|J|}\sum_{k=0}^{D}\sum_{0\leq |I|\leq n-1}\Big(\tilde e_I^{k}{\boldsymbol 1_{\{j_i=0\}}}{\boldsymbol 1_{\{I=\emptyset \}}}{\boldsymbol 1_{\{k=0\}}}+\ell _I^{k}\tilde e_I^{k}{\boldsymbol 1_{\{j_i=1\}}}{\boldsymbol 1_{\{k\neq 0\}}}\Big). \end{aligned}$$
Setting \(\tilde c(j,I,k,\ell ):={\boldsymbol 1_{\{j=0\}}}{\boldsymbol 1_{\{I=\emptyset \}}}{\boldsymbol 1_{\{k=0\}}}+\ell _{I}^{k}{\boldsymbol 1_{\{j=1\}}}{\boldsymbol 1_{\{k\neq 0\}}}\) we thus obtain that
$$\displaystyle \begin{aligned} \mathbb{E}[\langle e_J,\widehat{\mathbb{S}}_n(\ell )_t\rangle ]=\sum_{k_1,\ldots ,k_{|J|}=0}^{D}\ \sum_{|I_1|,\ldots ,|I_{|J|}|=0}^{n-1}\mathbb{E}\Big[\Big\langle \widetilde{\sqcup\!\sqcup}_{i=1}^{|J|}\tilde e_{I_i}^{k_i},\widehat{\mathbb{X}}_t\Big\rangle \Big]\prod_{i=1}^{|J|}\tilde c(j_i,I_i,k_i,\ell ), \end{aligned}$$
and the map \(\widetilde P_J(\ell ,\mathbb {E}[\widehat {\mathbb {X}}_t]):=\mathbb {E}[\langle e_J,\widehat {\mathbb {S}}_n(\ell )_t\rangle ]\) is polynomial of degree \(|J|\) in its first argument and linear in the second one.
The expressions above also provide a formula for the variance of \( S_{n}(\ell )\).
Corollary 3.12
Let \( S_{n}(\ell )\) be as in Corollary 3.6 and assume that it is a true martingale. Then
$$\displaystyle \begin{aligned} \mbox{Var}(S_n(\ell )_t)=2\sum_{k_1,k_2=1}^{D}\ \sum_{|I_1|,|I_2|=0}^{n-1}\mathbb{E}\big[\big\langle \tilde e_{I_1}^{k_1}\mathrel{\widetilde{\sqcup\!\sqcup}}\tilde e_{I_2}^{k_2},\widehat{\mathbb{X}}_t\big\rangle \big]\,\ell _{I_1}^{k_1}\ell _{I_2}^{k_2}. \end{aligned}$$
(16)
Proof
The martingale property guarantees that \(\mathbb {E}[S_{n}(\ell )_{t}]=S_{n}(\ell )_0\) and hence, by the shuffle product, \(\mbox{Var} (S_{n}(\ell )_{t})=2\mathbb {E}[\langle e_{1}\otimes e_{1},\widehat {\mathbb {S}}_{n}(\ell )_{t}\rangle ]\). By Remark 3.11 we can conclude that
$$\displaystyle \begin{aligned} \mbox{Var}(S_n(\ell )_t)=2\sum_{k_1,k_2=1}^{D}\ \sum_{|I_1|,|I_2|=0}^{n-1}\mathbb{E}\big[\big\langle \tilde e_{I_1}^{k_1}\mathrel{\widetilde{\sqcup\!\sqcup}}\tilde e_{I_2}^{k_2},\widehat{\mathbb{X}}_t\big\rangle \big]\,\tilde c(1,I_1,k_1,\ell )\,\tilde c(1,I_2,k_2,\ell ), \end{aligned}$$
and the claim follows. □
3.4 Pricing of Sig-Payoffs
We recall here the notion of sig-payoffs as introduced in [34].
Definition 3.13 (Sig-Payoff)
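Suppose that the price process S is given by a continuous semimartingale. A payoff \(F:\Omega \to \mathbb {R}\) is said to be a sig-payoff if there exists \(m\in \mathbb {N}\), and \(f:=\{f_\emptyset , f_J\colon 0<|J|\leq m\},\) such that, for a maturity \(T>0\),
$$\displaystyle \begin{aligned} F=f_{\emptyset }+\sum_{0<|J|\leq m}f_J\langle e_J,\widehat{\mathbb{S}}_T\rangle , \end{aligned}$$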
where \(\widehat {\mathbb {S}}\) denotes the signature of \(\widehat {S}_t=(t,S_t)\).
Example 3.14
Let \(K>0\) be a strike price and \(T>0\) a maturity time. Then, Asian forwards written on S are payoffs of the form
$$\displaystyle \begin{aligned} \frac{1}{T}\int_0^T S_t \mathrm{d} t -K =\frac{1}{T}\int_0^T (S_t -S_{0}) \mathrm{d} t -K+S_{0} =\frac{1}{T}\langle e_{1}\otimes e_{0},\widehat{\mathbb{S}}_T\rangle+(S_{0}-K)\langle e_{\emptyset},\widehat{\mathbb{S}}_T\rangle,\end{aligned}$$
and are thus sig-payoffs.
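The identity in Example 3.14 can be checked numerically on a discretized path. The following Python sketch is illustrative only: the path, the strike and all variable names are hypothetical, and the iterated integral \(\langle e_{1}\otimes e_{0},\widehat{\mathbb{S}}_T\rangle =\int_0^T (S_t-S_0)\,\mathrm{d}t\) is approximated with the trapezoidal rule.

```python
import numpy as np

rng = np.random.default_rng(0)
T, K, n_steps = 1.0, 95.0, 1000
t = np.linspace(0.0, T, n_steps + 1)
# Hypothetical price path; the identity holds pathwise, so any continuous path works.
S = 100.0 + np.concatenate([[0.0], np.cumsum(rng.normal(0.0, 0.5, n_steps))])

def trapezoid(y, x):
    """Trapezoidal approximation of the integral of y with respect to x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

# <e_1 (x) e_0, S_hat_T>: inner integral against S, outer integral against time.
sig_10 = trapezoid(S - S[0], t)

payoff_direct = trapezoid(S, t) / T - K   # (1/T) int_0^T S_t dt - K
payoff_sig = sig_10 / T + (S[0] - K)      # sig-payoff form; <e_emptyset, .> = 1
```

Both numbers agree up to floating-point error, so the Asian forward corresponds to the coefficients \(f_{(1,0)}=1/T\) and \(f_{\emptyset }=S_0-K\).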
While standard vanilla derivatives like call and put options do not fall under the category of sig-payoffs, approximate sig-payoffs can still serve as efficient control variates in Monte Carlo pricing, as explained in Sect. 3.5.2.
In this section, we consistently denote \(\widehat {S}_t\) as \((t,S_t)\) and focus on pricing sig-payoffs when S follows a signature model, as specified in the subsequent corollary.
Corollary 3.15
Let the dynamics of \(S_n(\ell )\) under a local martingale measure \(\mathbb {Q}\) be specified as in Corollary 3.6. Consider a sig-payoff
Then, using the notation of Remark 3.11 we can write the corresponding price as
$$\displaystyle \begin{aligned} \mathbb{E}_{\mathbb{Q}}[F] &=f_{\emptyset }+\sum_{0<|J|\leq m}f_J\,\widetilde P_J(\ell ,\mathbb{E}_{\mathbb{Q}}[\widehat{\mathbb{X}}_T])\\ &=f_{\emptyset }+\sum_{0<|J|\leq m}\ \sum_{k_1,\ldots ,k_{|J|}=0}^{D}\ \sum_{|I_1|,\ldots ,|I_{|J|}|=0}^{n-1}\mathbb{E}_{\mathbb{Q}}\Big[\Big\langle \widetilde{\sqcup\!\sqcup}_{i=1}^{|J|}\tilde e_{I_i}^{k_i},\widehat{\mathbb{X}}_T\Big\rangle \Big]\,f_J\prod_{i=1}^{|J|}\tilde c(j_i,I_i,k_i,\ell ). \end{aligned}$$
(17)
This is a direct consequence of Theorem 3.9 and Remark 3.11. □
Remark 3.16
(i)
Expression (17) admits also a second representation that turns out to be useful for coding:
$$\displaystyle \begin{aligned} \mathbb{E}_{\mathbb{Q}}[F] &=f_{\emptyset }+\sum_{|J|=1}^{m}\ \sum_{k_1,\ldots ,k_{|J|}=0}^{D}\ \sum_{|I_1|,\ldots ,|I_{|J|}|=0}^{n-1}\mathbb{E}_{\mathbb{Q}}\Big[\Big\langle \widetilde{\sqcup\!\sqcup}_{i=1}^{|J|}\tilde e_{I_i}^{k_i},\widehat{\mathbb{X}}_T\Big\rangle \Big]\\ &\quad \times f_{({\boldsymbol 1_{\{k_1\neq 0\}}},\ldots ,{\boldsymbol 1_{\{k_{|J|}\neq 0\}}})}\prod_{i=1}^{|J|}\Big({\boldsymbol 1_{\{I_i=\emptyset ,\,k_i=0\}}}+\ell _{I_i}^{k_i}{\boldsymbol 1_{\{k_i\neq 0\}}}\Big), \end{aligned}$$
where the outer sum runs over the lengths \(|J|=1,\ldots ,m\) only and the multi-index J is recovered from the indicators \({\boldsymbol 1_{\{k_i\neq 0\}}}\).
(ii)
With a similar procedure it is also possible to provide a representation of \(\mbox{Var}_{\mathbb {Q}}(F)\). By the shuffle product we know that
$$\displaystyle \begin{aligned} (F-f_{\emptyset })^2=\sum_{|J_1|,|J_2|=1}^{m}f_{J_1}f_{J_2}\langle e_{J_1}\mathrel{\sqcup\!\sqcup} e_{J_2},\widehat{\mathbb{S}}_n(\ell )_T\rangle . \end{aligned}$$
Setting
$$\displaystyle \begin{aligned} \widetilde P_{I_1\sqcup\!\sqcup I_2}:=\sum_{i=1}^{K}\widetilde P_{J_i} \end{aligned}$$
for \(J_i\) and K satisfying
$$\displaystyle \begin{aligned} e_{I_1}\mathrel{\sqcup\!\sqcup} e_{I_2}=\sum_{i=1}^{K}e_{J_i}, \end{aligned}$$
we thus obtain
$$\displaystyle \begin{aligned} \mbox{Var}_{\mathbb{Q}}(F) &=\mathbb{E}_{\mathbb{Q}}[(F-f_{\emptyset })^2]-\mathbb{E}_{\mathbb{Q}}[F-f_{\emptyset }]^2\\ &=\sum_{|J_1|,|J_2|=1}^{m}f_{J_1}f_{J_2}\Big(\widetilde P_{J_1\sqcup\!\sqcup J_2}(\ell ,\mathbb{E}_{\mathbb{Q}}[\widehat{\mathbb{X}}_T])-\widetilde P_{J_1}(\ell ,\mathbb{E}_{\mathbb{Q}}[\widehat{\mathbb{X}}_T])\,\widetilde P_{J_2}(\ell ,\mathbb{E}_{\mathbb{Q}}[\widehat{\mathbb{X}}_T])\Big). \end{aligned}$$
(iii)
From these representations we can see that the maps \(P_F(f,\ell ,\mathbb {E}_{\mathbb {Q}}[\widehat {\mathbb {X}}_T]):=\mathbb {E}_{\mathbb {Q}}[F]\) and \(P_{\mbox{Var} (F)}(f,\ell ,\mathbb {E}_{\mathbb {Q}}[\widehat {\mathbb {X}}_T]):=\mbox{Var}_{\mathbb {Q}} (F)\) inherit good properties from \(\widetilde P_J\). In particular, \(P_F\) is linear in f and polynomial of degree m in \(\ell \), and \(P_{\mbox{Var} (F)}\) is quadratic in f and polynomial of degree 2m in \(\ell \). Both maps are linear in \(\mathbb {E}[\widehat {\mathbb {X}}_T]\).
3.5 Calibration to Option Prices
In this section our objective is to calibrate models to call (and put) option prices, aiming to closely match their implied volatilities.
Throughout this section, we assume the existence of a pricing measure \(\mathbb {Q}\) and that the primary process X is a vector of correlated \(\mathbb {Q}\)-Brownian motions. The essence of this approach lies in selecting the parameters \(\ell \) to reproduce the market prices of options on S. Given the prices of N call options \(\pi ^{\ast }(T_{1},K_{1}),\dots ,\pi ^{\ast }(T_{N},K_{N})\) characterized by maturities \(T_{i}\) and strikes \(K_{i}\), our goal is to determine these parameters for an optimal fit. In the spirit of [6], we translate the calibration task into finding \(\ell \) such that
with \(\gamma _{i}\) Vega weights and \(\pi ^{\text{model}}(\ell ,T_{i},K_{i})\) denoting the price of the option under the signature model with parameters \(\ell \), maturity \(T_{i}\) and strike \(K_{i}\).
Remark 3.17
The loss function described in (18) can be adapted depending on the available data. A common approach is for instance to weigh the options by their bid-ask spreads, see for instance [6]. This reflects the relative importance of reproducing different option prices precisely. In the present work we shall employ Vega weights since bid-ask spreads are not always provided in the data and since they are well-suited to match the implied volatility surface from option prices, see also Remark 5.8 in [13].
For notational convenience, let us here briefly recall the definition of implied volatility, which serves as our criterion for goodness of fit. Notice that for liquidity reasons we only calibrate to vanilla options.
Definition 3.18
Let \(\pi (T,K)\) be the price of a call option with maturity T and strike price K written on the asset S. The implied volatility of \(\pi (T,K)\) is defined as the volatility \(\sigma _{\text{IV}}(T,K)\) that solves the equation
$$\displaystyle \begin{aligned} \pi (T,K)=\pi ^{BS}(T,K,\sigma _{\text{IV}}(T,K)), \end{aligned}$$
(19)
where \(\pi ^{BS}\) denotes the Black-Scholes option price. Then \((\sigma _{\text{IV}}(T,K))_{T,K}\) is called (implied) volatility surface, and \((\sigma _{\text{IV}}(T,K))_{K}\) is called (implied) volatility smile for each fixed maturity T.
Once the optimal parameters \(\ell ^{\ast }\) are found via (18) we then need to solve (19) numerically for \(\pi (T,K)=\pi ^{\text{model}}(\ell ^\ast ,T,K)\) to assess the goodness of fit. Indeed we measure it in terms of the absolute relative error (in percentage) between the implied volatilities from the market and the model. Another goodness of fit test is to check whether the calibrated implied volatilities lie in the market’s bid-ask spread (if available), see Sect. 4.2.
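As an illustration of how the implied volatility equation might be inverted numerically, the following Python sketch uses plain bisection. It assumes zero interest rates and no dividends, consistent with the standing assumptions of Sect. 3.2; the function names and the bracketing interval are hypothetical choices, not part of the chapter.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S0, K, T, sigma):
    """Black-Scholes call price with zero rates and no dividends."""
    if sigma <= 0.0 or T <= 0.0:
        return max(S0 - K, 0.0)
    d1 = (math.log(S0 / K) + 0.5 * sigma**2 * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S0 * norm_cdf(d1) - K * norm_cdf(d2)

def implied_vol(price, S0, K, T, lo=1e-6, hi=5.0, tol=1e-10):
    """Solve bs_call(S0, K, T, sigma) = price for sigma by bisection."""
    if not bs_call(S0, K, T, lo) <= price <= bs_call(S0, K, T, hi):
        raise ValueError("price is outside the range attainable on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # The call price is increasing in sigma, so keep the half that brackets the root.
        if bs_call(S0, K, T, mid) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def relative_error_pct(iv_model, iv_market):
    """Absolute relative error between two implied volatilities, in percent."""
    return 100.0 * abs(iv_model - iv_market) / iv_market
```

Bisection is slower than Newton's method but cannot fail for deep in- or out-of-the-money strikes where Vega is tiny, which is why it is used in this sketch.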
3.5.1 Monte Carlo Pricing
We now turn our attention to the task of computing the price \(\pi ^{\text{model}}(\ell ,T,K)\) for a given strike \(K\in \mathbb {R}\) and maturity \(T>0\) under the signature model.
Before presenting our Monte Carlo-based method, it is worth recalling that Sect. 3.4 provides a closed-form formula for computing sig-payoffs without relying on Monte Carlo techniques. Additionally, thanks to the universal approximation theorem, observe that call options payoffs can be approximated arbitrarily well by sig-payoffs, or even just by polynomials, on compact sets. The combination of these two properties forms the basis of the approach employed by [37], which involves breaking down the computation of \(\pi ^{\text{model}}(\ell ,T,K)\) into two components: an approximation of call (put) payoffs using sig-payoffs and the pricing of the latter through the closed-form formula derived in Sect. 3.4. However, achieving an acceptable error between the original call payoff and the sig-payoff requires selecting a sig-payoff with signature terms of sufficiently high order. This approximation must hold for all sets of coefficients \(\ell \) involved in the optimization procedure, encompassing a large compact set K.
It is important to note that while sig-payoffs may be used as control variates in Monte Carlo pricing techniques, as we will discuss in Sect. 3.5.2, we opted for a Monte Carlo approach. The linearity of the model makes this approach particularly manageable, rendering the entire calibration computationally feasible within a reasonable time while delivering highly accurate results.
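For the Monte Carlo price we thus fix a number of samples \(N_{MC}>0\) and approximate \(\pi ^{\text{model}}(\ell ,T,K)\) via
$$\displaystyle \begin{aligned} \pi ^{\text{model}}(\ell ,T,K)\approx \frac{1}{N_{MC}}\sum_{i=1}^{N_{MC}}\big(S_n(\ell )_T(\omega _i)-K\big)^{+}. \end{aligned}$$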
We stress again that this can be computed fast. Indeed, by the linearity of the model simulating \((S_{n}(\ell )_{T}(\omega _{i}))_{i=1}^{N_{MC}}\) boils down to the following steps:
simulate \((X_{t}(\omega _{i}))_{t\in [0,T]}\), which in the current setting are just trajectories for correlated Brownian motions, for each \(i\in \{1,\ldots ,N_{MC}\}\);
compute \(\langle e_{I},\widehat {\mathbb {X}}_{T}(\omega _{i})\rangle \) for all \(i=1,\dots ,N_{MC}\) and for all multi-indices I such that \(\lvert I\rvert \le n\);
take linear combinations to compute \(\langle \tilde {e}_{I},\widehat {\mathbb {X}}_{T}(\omega _{i})\rangle \) for all \(i=1,\dots ,N_{MC}\) and for all multi-indices \(|I|\leq n-1\) as described in Lemma 3.5;
retrieve \((S_{n}(\ell )_{T}(\omega _i))_{i=1}^{N_{MC}}\) via (14).
It is worth noting that the parameters \(\ell \) come into play only in the final step, enabling the precomputation and storage of all other quantities. This contrasts with other models where the calibration parameters are involved at each time point during the simulation steps, such as in an Euler scheme for classical Markovian models or more complex schemes employed in rough volatility models.
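The split into an offline sampling step and a cheap pricing step can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the implementation used for the numerical results: it takes \(n=2\), \(d=2\) and \(D=1\), approximates the Itô integrals \(\int _0^T\langle e_I,\widehat {\mathbb {X}}_s\rangle \mathrm {d}X_s^1\) directly by left-point sums instead of going through the signature and Lemma 3.5, and uses hypothetical function names.

```python
import numpy as np

def ito_features(n_paths, n_steps, T, rho, seed=0):
    """Offline step: samples of int_0^T <e_I, X_hat_s> dX^1_s for |I| <= 1."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    z = rng.standard_normal((2, n_paths, n_steps))
    dX1 = np.sqrt(dt) * z[0]
    dX2 = np.sqrt(dt) * (rho * z[0] + np.sqrt(1.0 - rho**2) * z[1])
    # Values at the left endpoint of each step (Ito convention).
    X1_left = np.cumsum(dX1, axis=1) - dX1
    X2_left = np.cumsum(dX2, axis=1) - dX2
    t_left = dt * np.arange(n_steps)
    return np.stack(
        [
            dX1.sum(axis=1),              # I = emptyset: X^1_T
            (t_left * dX1).sum(axis=1),   # I = (0): int s dX^1_s
            (X1_left * dX1).sum(axis=1),  # I = (1): int X^1_s dX^1_s
            (X2_left * dX1).sum(axis=1),  # I = (2): int X^2_s dX^1_s
        ],
        axis=1,
    )

def mc_call_price(features, ell_empty, ell, K):
    """Online step: the parameters enter only through one matrix-vector product."""
    S_T = ell_empty + features @ ell
    return float(np.mean(np.maximum(S_T - K, 0.0)))

# The feature matrix is sampled once and reused for every candidate ell in the optimizer.
Phi = ito_features(n_paths=20_000, n_steps=50, T=1.0, rho=-0.5)
price = mc_call_price(Phi, ell_empty=1.0, ell=np.array([0.2, 0.0, 0.0, 0.0]), K=1.0)
```

With only the coefficient of \(X^1_T\) switched on, the model reduces to \(S_T=1+0.2\,X^1_T\), so the at-the-money price should be close to the Bachelier value \(0.2/\sqrt {2\pi }\), which gives a quick sanity check of the sampling step.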
3.5.2 Variance Reduction with Sig-Payoffs
Even though Monte Carlo pricing is fast since all essential quantities can be precomputed as explained above, we here discuss variance reduction techniques (see e.g. [26]) that can speed up the procedure even further. The idea is to introduce a control variate, i.e., a random variable \(\Phi ^{cv}\) such that:
An example of control variates used for pricing and calibrating neural SDE models can be found in [10, 25], where \(\Phi ^{cv}\) is constructed from the Delta hedge. A possible other choice of control variates for signature models are sig-payoffs. Indeed, one can use the pricing formula derived in Sect. 3.4 to define:
for a wide range of \(\ell \) and with high probability. This can be done by performing a linear regression to obtain the coefficients \(f=(f_{J})_{|J|\le m}\). Alternatively, a polynomial approximation of the payoff’s function can also be employed.
The properties of \(\Phi ^{cv}\) then guarantee the accuracy of the approximation
already for smaller values of \(N_{MC}\). In the following example we report a numerical experiment in this regard.
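The mechanism can be illustrated with a small Python sketch. It is a toy stand-in rather than the signature model: the terminal price is assumed Gaussian, \(S_T=S_0+v Z\) with \(Z\sim N(0,1)\), so that the expectation of a polynomial of \(S_T\) is available exactly via Gauss-Hermite quadrature; in the signature model this role is played by the sig-payoff pricing formula of Sect. 3.4. All numerical values and names are hypothetical.

```python
import numpy as np
from numpy.polynomial import Polynomial
from numpy.polynomial.hermite_e import hermegauss

S0, K, v = 100.0, 100.0, 5.0  # toy model: S_T = S0 + v * Z with Z standard normal
rng = np.random.default_rng(1)
S_T = S0 + v * rng.standard_normal(50_000)
payoff = np.maximum(S_T - K, 0.0)

# Degree-4 least-squares polynomial approximating the call payoff on a compact set.
grid = np.linspace(80.0, 120.0, 401)
p = Polynomial.fit(grid, np.maximum(grid - K, 0.0), deg=4)

# Exact mean of p(S_T): 20 Gauss-Hermite nodes integrate polynomials of degree <= 39 exactly.
nodes, weights = hermegauss(20)
mean_p = float(np.sum(weights * p(S0 + v * nodes)) / np.sqrt(2.0 * np.pi))

phi_cv = p(S_T) - mean_p        # control variate with known mean zero
controlled = payoff - phi_cv    # same expectation as the payoff, smaller variance

price_plain = float(payoff.mean())
price_cv = float(controlled.mean())
var_plain, var_cv = float(payoff.var()), float(controlled.var())
```

Here the control variate enters with coefficient one; estimating an optimal coefficient by regression is a common refinement that this sketch does not include.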
Example 3.19
Fix \(n, d =2\) and consider the parameters \(\ell ^{\ast }\in \mathbb {R}^{13}\) calibrated to the first smile of the market data shown in Fig. 4 (left). Consider as example a call option with maturity \(T=30\) days and strike \(K=85\%\) of the spot price. Let p be a polynomial of order \(m=4\) approximating the function \(f: x \mapsto (x-K)^{+}\) on the compact \([70\%S_0,120\%S_0]\). Recall that \(\mathbb {E}_{\mathbb {Q}}[p(S_n(\ell )_T)]\) can be computed analytically as polynomials are sig-payoffs. Using \(\Phi ^{cv}(\ell ):=p(S_n(\ell )_T)-\mathbb {E}_{\mathbb {Q}}[p(S_n(\ell )_T)]\) as control variate we can reduce the sample variance of the Monte Carlo estimator from approximately \(3.89\cdot 10^{-5}\) to \(6.5\cdot 10^{-7}\). Observe that \(p(S_n(\ell )_T)\) here coincides with the sig-payoff without time-augmentation i.e.,
In the following we discuss the problem of minimizing the functional (18) using the Monte Carlo method as described in Sect. 3.5.1 to compute the model prices. We consider the model described in Corollary 3.6 for two correlated Brownian motions B and W with correlation coefficient \(\rho =-0.5\), and \(D=1\).
As a first example we consider synthetic data, where the implied volatility surface to fit is generated by a Heston model whose dynamics are given by
where in both cases \({\mathrm {d}}[B^{\mathbb {P}},W^{\mathbb {P}}]_{t}=\rho {\mathrm {d}}t\) where \(\rho \in [-1,1]\).
We consider 7 maturities \((T_{k})_{k=1}^{7}\) ranging from 30 days to 2 years and 13 strikes \((K_{j})_{j=1}^{13}\) ranging from 80\(\%\) to 120\(\%\) of the spot price. The truncation parameter is fixed to \(n=3\) and the number of Monte Carlo samples to \(N_{MC}=10^6\). The results for the following two sets of parameters under a risk neutral measure
| \(\kappa \) | \(\theta \) | \(\sigma \) | \(\rho \) | \(V_{0}\) |
|---|---|---|---|---|
| 0.2 | 0.3 | 0.5 | \(-\)0.5 | 0.08 |
are displayed in the first and in the second row of Fig. 1, respectively.
Fig. 1
On the left: blue stars correspond to the implied volatilities of the Heston models, red dots denote the calibrated implied volatilities of \(S_{n}(\ell )\) with \(n=3\) (13 estimated parameters). On the right: absolute relative errors between the two surfaces are expressed in percentages
Fig. 2
On the left: the upper surface represents the implied volatility of the S\(\&\)P 500 index as of 17-03-2021, the lower one is the calibrated implied volatility of \(S_{n}(\ell ^*)\) with \(n=3\) (13 parameters). On the right: absolute relative error between the two surfaces in percentages
We stress that these calibrations to Heston-generated implied volatility surfaces take between 4 and 15 minutes on a standard laptop. We now consider implied volatility data as of 17/03/2021 for call options written on the S\(\&\)P 500 index. Our dataset, provided by Bloomberg, consists of 7 maturities \((T_{k})_{k=1}^{7}\), ranging from 30 days to 2 years, and 9 strike prices \((K_{j})_{j=1}^{9}\) for each maturity, which vary between 80\(\%\) and 120\(\%\) of the spot price. Again, the truncation parameter is fixed to \(n=3\) and the number of Monte Carlo samples to \(N_{MC}=10^6\). The results are displayed in Fig. 2.
Fig. 3
Comparison between the calibrated implied volatility smiles for different parameters and the S\(\&\)P 500 index as of 17-03-2021 smile (in blue) at maturity \(T_{1}=30\) days (on the left) and for maturities ranging from 60 days to 2 years (on the right). Calibration has been performed using (21)
Figure 2 indicates that the volatility smiles for short maturities have not been captured adequately. Specifically, as with many continuous models, the challenge lies in fitting the shortest maturities, primarily due to the high at-the-money skew in the market (refer to Chapter 3 and Chapter 7 of [22]).
To assess the model’s ability to replicate short maturity smiles, we conduct calibrations using different loss functions that penalize outliers more severely. In the initial calibration procedure described earlier, we denote by \(w_i\) the absolute error between the target implied volatility and the approximated one for maturity \(T_i\) and strike \(K_i\). Subsequently, inspired by generative-adversarial distances as considered in, e.g., [10], we define a new loss function
depending on parameters p and \(\alpha \) that need to be chosen. By taking high values for p and \(\alpha \) we can approximate the sup-distance between the two price surfaces, i.e.,
without compromising differentiability with respect to \(\ell \). The results for different choices of the parameters \(\alpha \) and p, as well as of the truncation level n, are displayed in Fig. 3. As the figure shows, although the maximal absolute relative error for maturities larger than 60 days is (almost) acceptable (5.6\(\%\) for \(n=3\), \(p=1000\), \(\alpha =500\) and 2.3\(\%\) for \(n=4\), \(p=1000\), \(\alpha =500\)), the absolute relative errors for the shortest maturity at the far in-the-money and far out-of-the-money strikes are still above 12\(\%\) and 18\(\%\), respectively. Observe that the performance for \(n=4\) is the best for every maturity larger than 60 days as well as for the at-the-money region of the shortest maturity.
Fig. 4
Comparison between the calibrated implied volatility smiles and the S\(\&\)P 500 index as of 17-03-2021 smile (in blue) at maturity \(T_{1}=30\) days (on the left) and for maturities ranging from 60 days to 2 years (on the right). Two different calibrations have been performed using (21)
In a final experiment, we conduct two separate calibrations: one to the shortest maturity alone and another to every other maturity combined. The first calibration is performed for \(T_1=30\) days, and the second calibration encompasses 6 maturities ranging from 60 days to 2 years. In both cases, we consider 9 strikes ranging from 80\(\%\) to 120\(\%\) of the spot price. For the first calibration, we fix the parameters at \(n=2\), \(p=2\), and \(\alpha =0\), while for the subsequent maturities, the parameters are set to \(n=4\), \(p=300\), and \(\alpha =500\). The results are displayed in Fig. 4. Notably, the fit for the first maturity is remarkably accurate, as well as for the entire implied volatility surface. The computational time required for the calibration of the first smile is approximately 2 minutes, while the calibration to the remaining surface might take longer, mainly due to the higher value of n and the calibration to all maturities.
These results suggest that introducing maturity-dependent parameters and performing a slice-wise calibration to the individual smiles, as for instance in [10] or [25], can be of interest to obtain both excellent accuracy and low computational time. For details in this regard and the out-of-sample performance see [11].
4 A Signature Model for Index Options and Volatility Derivatives
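This section is dedicated to defining in detail the second model \((S_{t})_{t\geq 0}\) for the S&P 500 index. Under a risk-neutral probability measure \(\mathbb {Q}\) the model’s dynamics are given by
$$\displaystyle \begin{aligned} \mathrm{d}S_t(\ell )=S_t(\ell )\,\sigma _t^{S}(\ell )\,\mathrm{d}B_t. \end{aligned}$$
(22)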
Here, \(S_{0}\in \mathbb {R}^{+}\), \(\sigma ^{S}=(\sigma _{t}^{S})_{t\geq 0}\) is the volatility process, and \(B=(B_{t})_{t\geq 0}\) is a one-dimensional Brownian motion that is correlated with the volatility process. Note that the instantaneous variance is given by \(V_t:=(\sigma _t^S)^2\) for every \(t\geq 0\). Choosing a functional form of the volatility process is the crucial modelling choice. We set
$$\displaystyle \begin{aligned} {} \sigma_{t}^{S}(\ell):=\ell_{\emptyset}+\sum_{0<\lvert I \rvert \le n } \ell_{I} \langle e_{I},\widehat{\mathbb{X}}_{t}\rangle, \end{aligned} $$
(23)
i.e. the volatility process is determined by a linear function of the (time-extended) signature of a primary process X. Moreover, we assume that X is a polynomial process (recall Definition 2.1) and that the model parameters are \(\ell :=\{\ell _{I}\in \mathbb {R}: |I|\le n\}\in \mathbb {R}^{(d+1)_{n}}\).
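To make the definition (23) concrete, the following Python sketch evaluates a volatility of this form for \(n=2\) on a discretized path. It is a minimal illustration under stated assumptions: the path is treated as piecewise linear, so the iterated integrals are of Stratonovich type, the primary process is a simulated Brownian motion, and the coefficient values and function names are hypothetical.

```python
import numpy as np

def signature_level2(path):
    """Signature up to level 2 of a piecewise-linear path of shape (n_points, dim)."""
    dx = np.diff(path, axis=0)
    level1 = dx.sum(axis=0)
    start = path[:-1] - path[0]  # increment accumulated before each linear segment
    # level2[i, j] = <e_i (x) e_j, signature>: inner integral in i, outer integral in j.
    level2 = start.T @ dx + 0.5 * dx.T @ dx
    return level1, level2

rng = np.random.default_rng(2)
n_steps, T = 500, 1.0
t = np.linspace(0.0, T, n_steps + 1)
X = np.concatenate([[0.0], np.cumsum(rng.normal(0.0, np.sqrt(T / n_steps), n_steps))])
X_hat = np.column_stack([t, X])  # time-extended path, letter 0 = time, letter 1 = X

level1, level2 = signature_level2(X_hat)

# Hypothetical coefficients: ell_emptyset, (ell_(0), ell_(1)), and ell_(i,j) for |I| = 2.
ell_empty = 0.2
ell_1 = np.array([0.0, 0.1])
ell_2 = np.array([[0.0, 0.0], [0.0, 0.05]])
sigma_T = ell_empty + ell_1 @ level1 + np.sum(ell_2 * level2)
```

Since \(\langle e_{1}\otimes e_{1},\widehat{\mathbb{X}}_T\rangle =X_T^2/2\) for a path started at zero, this particular choice of coefficients gives \(\sigma _T=0.2+0.1X_T+0.025X_T^2\), a polynomial in the primary process.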
For later convenience, we now introduce the process \((Z_t)_{t\geq 0}\) where \(Z_t=(X_t,B_t)\) and denote by \((\widehat {\mathbb {Z}}_t)_{t\geq 0}\) its time-extended signature. The correlation of the components of \((Z_t)_{t\geq 0}\) is given by \(\rho = (\rho _{ij})_{i,j=1,\dots ,d+1}\), i.e.,
Recall that we use \(\lbrack \cdot ,\cdot \rbrack \) for the quadratic variation. We will often write \((\sigma _{t}^{S})_{t\geq 0}= (\sigma _{t}^{S}(\ell ))_{t\geq 0}\) as in (22) to keep the notation light and only mention the dependence on \(\ell \) when it is explicitly needed.
Remark 4.1 (Interest Rates and Dividends)
To calibrate our model to option prices we need both the discounted, dividend-adjusted prices and the undiscounted, unadjusted prices. This is because the CBOE defines the VIX via the discounted, dividend-adjusted prices, whereas the claims on the S&P 500 itself are written on the undiscounted, unadjusted prices. Recall that \((S_t)_{t\geq 0}\) given by (22) is the discounted, dividend-adjusted price process. Including an interest rate r and dividend q, the corresponding undiscounted, unadjusted price process is given by
where we used that \(\tilde {S}_t(\ell ) = e^{(r-q)t}S_t (\ell )\).
Remark 4.2
Recall our assumption that \(\widehat {X}\) is a polynomial process. This is important since then \(\widehat {\mathbb {X}}^{n}\) is a finite-dimensional polynomial process in the sense of [19] and [9]. Therefore the expected signature and the conditional expected signature of \(\widehat {X}\) can be found by solving a finite-dimensional ODE; more specifically, they are given by a finite-dimensional matrix exponential as described in Sect. 2.
This assumption still leaves us with a broad class of admissible primary processes. Some prominent examples are correlated Brownian motions, Cox-Ingersoll-Ross (CIR) processes, geometric Brownian motions, Jacobi processes, OU processes, and all continuous affine processes.
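The matrix-exponential mechanism behind this remark can be illustrated on a much smaller object than the signature. The sketch below is an assumption-laden toy example: it takes a one-dimensional OU process \(\mathrm{d}X_t=\kappa(\theta-X_t)\,\mathrm{d}t+\sigma\,\mathrm{d}W_t\) with hypothetical parameter values, represents its generator on the monomial basis \((1,x,x^2)\) by a \(3\times 3\) matrix, and recovers the first two conditional moments from a matrix exponential. The basis convention (columns hold the coefficients of the generator applied to each monomial) is chosen here for convenience and need not coincide with the convention of Sect. 2, and the helper `mat_exp` is a simple truncated Taylor series written for this example.

```python
import math


def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def mat_exp(a, squarings=8, terms=20):
    """Matrix exponential via scaling-and-squaring with a truncated Taylor series."""
    n = len(a)
    scaled = [[x / 2 ** squarings for x in row] for row in a]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


# Hypothetical OU parameters and horizon (30 days in years).
kappa, theta, vol, x0 = 2.0, 0.3, 0.5, 0.1
delta = 30.0 / 365.25

# Generator on the basis (1, x, x^2): column j holds the coefficients of A(x^j).
G = [
    [0.0, kappa * theta, vol ** 2],
    [0.0, -kappa, 2.0 * kappa * theta],
    [0.0, 0.0, -2.0 * kappa],
]
exp_G = mat_exp([[delta * g for g in row] for row in G])

# E[X_delta^j | X_0 = x0] = (1, x0, x0^2) exp(delta * G) e_j.
basis = [1.0, x0, x0 ** 2]
moments = [sum(basis[i] * exp_G[i][j] for i in range(3)) for j in range(3)]

# Closed-form OU moments for comparison.
mean_exact = theta + (x0 - theta) * math.exp(-kappa * delta)
var_exact = vol ** 2 / (2.0 * kappa) * (1.0 - math.exp(-2.0 * kappa * delta))
second_exact = mean_exact ** 2 + var_exact
```

The same idea applies to the truncated signature, only with a considerably larger matrix, which can be computed once and reused.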
Remark 4.3
Several stochastic volatility models are encapsulated in our modelling framework (23). Let us elaborate on the following:
The Stein-Stein model, as introduced in [40], is obtained if we choose a one-dimensional OU process as our primary process \((X_t)_{t\geq 0}\) and set \(n=1\), \(\ell _{\emptyset }=\ell _{(0)}=0\) and \(\ell _{(1)}\neq 0\).
The SABR model with \(\beta =1\), as introduced initially in [31], is obtained if we choose \((X_t)_{t\geq 0}\) to be a one-dimensional geometric Brownian motion without drift and again let \(n=1\), with \(\ell _{\emptyset }=\ell _{(0)}=0\) and \(\ell _{(1)}\neq 0\).
Let \((X_t)_{t\geq 0}\) be a one-dimensional OU process with \(n=5\), \(\ell _{\emptyset }\), \(\ell _{(1)}\), \(\ell _{(1,1,1)}\), \(\ell _{(1,1,1,1,1)}\) non-zero and \(\ell _{I}=0\) otherwise. This results in the model described in [1], except for the fact that therein a deterministic input curve is added. Moreover, we can embed the entire set of Gaussian polynomial volatility models introduced in [1], if we allow X not to be a semimartingale and do not add a time-augmentation.
In the next section we discuss the pricing of VIX options with (23) as well as the nature of the log-price process.
4.1 Explicit Formulas for the VIX and the Log-Price
The CBOE Volatility Index (VIX) measures the market’s expected volatility of the S&P 500 index. More specifically, its current value corresponds to the expected annualized change in the S&P 500 index over the 30 days ahead. Indeed, the index value is computed by
where \(\Delta =30\) days, \(S_{T}\) denotes the price process and \(\mathcal {F}_T\) the filtration at time \(T>0\). Note that for any stochastic volatility model of the form
see e.g., [35]. Although there are of course put and call options written on the VIX, we from now on consider without loss of generality only call options and will simply refer to them as VIX options.
We shall now derive an analytical expression for the VIX index value (24) under our model, i.e. if S is defined as in (22) and (23) and for a polynomial process X. More precisely, we state in Theorem 4.4 that the value of the VIX is given by the square-root of a quadratic function in the parameters \(\ell \). This is achieved via polynomial technology and the computation of a matrix exponential, as we showed in Sect. 2.
Theorem 4.4
Let the price process \(S=(S_{t})_{t\ge 0}\) be given by
$$\displaystyle \begin{aligned} \mathrm{d}S_{t}(\ell)=S_{t}(\ell)\sigma_{t}^{S}(\ell)\,\mathrm{d}B_{t}, \end{aligned} $$
with volatility process \(\sigma ^{S}(\ell )=(\sigma _{t}^S(\ell ))_{t\ge 0}\) and \(B=(B_{t})_{t\ge 0}\) denoting a one-dimensional Brownian motion. Recall our modeling assumption, namely that \(\sigma ^S(\ell )\) and X satisfy (23), and note that under this assumption (25) is fulfilled. As introduced in Sect. 2 we fix an injective labeling function \(\mathscr {L}:\{I: |I|\le n\}\to \{1,\dots , (d+1)_{2n+1}\}\) and denote by G the \((d+1)_{(2n+1)}\)-dimensional matrix representative of the dual operator corresponding to \(\widehat {\mathbb {X}}\). Then, the following expression for the VIX at time \(T>0\) holds
$$\displaystyle \begin{aligned} \text{VIX}_{T}(\ell)=\sqrt{\frac{1}{\Delta}\,\ell^{\top}Q(T,\Delta)\,\ell}, \end{aligned} $$
(27)
where the matrix \(Q(T,\Delta )\) has components
$$\displaystyle \begin{aligned} Q_{\mathscr{L}(I)\mathscr{L}(J)}(T,\Delta)=\operatorname{vec}\big((e_{I}\mathbin{\sqcup\!\sqcup} e_{J})\otimes e_{0}\big)^{\top}\big(e^{\Delta G^{\top}}-\operatorname{Id}\big)\operatorname{vec}\big(\widehat{\mathbb{X}}_{T}^{2n+1}\big), \end{aligned} $$
(28)
and \(\operatorname {Id}\in \mathbb {R}^{(d+1)_{2n+1} \times (d+1)_{2n+1}}\) denoting the identity matrix. Note that Q is positive semidefinite and symmetric and hence allows for a Cholesky decomposition.
Proof
Note that \(V_t(\ell )=(\sigma _t^S(\ell ))^2\) can be expressed as
$$\displaystyle \begin{aligned} V_{t}(\ell)=\Big(\sum_{|I|\le n}\ell_{I}\langle e_{I},\widehat{\mathbb{X}}_{t}\rangle\Big)^{2}=\sum_{|I|,|J|\le n}\ell_{I}\ell_{J}\langle e_{I}\mathbin{\sqcup\!\sqcup} e_{J},\widehat{\mathbb{X}}_{t}\rangle \end{aligned} $$
by the shuffle property. Moreover, recall that continuous polynomial processes have finite moments of every degree and hence by Remark 1.2 it follows that (25) holds. Plugging in the above equation for \(V_t(\ell )\), the value of the VIX is for each \(T>0\) given by
$$\displaystyle \begin{aligned} \text{VIX}^{2}_{T}(\ell)=\frac{1}{\Delta}\sum_{|I|,|J|\le n}\ell_{I}\ell_{J}\,\mathbb{E}\Big[\int_{T}^{T+\Delta}\langle e_{I}\mathbin{\sqcup\!\sqcup} e_{J},\widehat{\mathbb{X}}_{t}\rangle\,\mathrm{d}t\,\Big|\,\mathcal{F}_{T}\Big]=\frac{1}{\Delta}\,\ell^{\top}Q(T,\Delta)\,\ell, \end{aligned} $$
where for each \(T>0\) the matrix Q is given by
$$\displaystyle \begin{aligned} Q_{\mathscr{L}(I)\mathscr{L}(J)}(T,\Delta)&:=\mathbb{E}\Big[\int_{T}^{T+\Delta}\langle e_{I}\mathbin{\sqcup\!\sqcup} e_{J},\widehat{\mathbb{X}}_{t}\rangle\,\mathrm{d}t\,\Big|\,\mathcal{F}_{T}\Big]\\ &=\mathbb{E}\Big[\int_{0}^{T+\Delta}\langle e_{I}\mathbin{\sqcup\!\sqcup} e_{J},\widehat{\mathbb{X}}_{t}\rangle\,\mathrm{d}t-\int_{0}^{T}\langle e_{I}\mathbin{\sqcup\!\sqcup} e_{J},\widehat{\mathbb{X}}_{t}\rangle\,\mathrm{d}t\,\Big|\,\mathcal{F}_{T}\Big]\\ &=\mathbb{E}\Big[\langle (e_{I}\mathbin{\sqcup\!\sqcup} e_{J})\otimes e_{0},\widehat{\mathbb{X}}_{T+\Delta}\rangle-\langle (e_{I}\mathbin{\sqcup\!\sqcup} e_{J})\otimes e_{0},\widehat{\mathbb{X}}_{T}\rangle\,\Big|\,\mathcal{F}_{T}\Big]\\ &=\mathbb{E}\Big[\langle (e_{I}\mathbin{\sqcup\!\sqcup} e_{J})\otimes e_{0},\widehat{\mathbb{X}}_{T+\Delta}\rangle\,\Big|\,\mathcal{F}_{T}\Big]-\langle (e_{I}\mathbin{\sqcup\!\sqcup} e_{J})\otimes e_{0},\widehat{\mathbb{X}}_{T}\rangle. \end{aligned} $$
$$\displaystyle \begin{aligned} Q_{\mathscr{L}(I)\mathscr{L}(J)}(T,\Delta)&=\operatorname{vec}\big((e_{I}\mathbin{\sqcup\!\sqcup} e_{J})\otimes e_{0}\big)^{\top}e^{\Delta G^{\top}}\operatorname{vec}\big(\widehat{\mathbb{X}}_{T}^{2n+1}\big)-\operatorname{vec}\big((e_{I}\mathbin{\sqcup\!\sqcup} e_{J})\otimes e_{0}\big)^{\top}\operatorname{vec}\big(\widehat{\mathbb{X}}_{T}^{2n+1}\big)\\ &=\operatorname{vec}\big((e_{I}\mathbin{\sqcup\!\sqcup} e_{J})\otimes e_{0}\big)^{\top}\big(e^{\Delta G^{\top}}-\operatorname{Id}\big)\operatorname{vec}\big(\widehat{\mathbb{X}}_{T}^{2n+1}\big), \end{aligned} $$
which proves our claim. The fact that Q is positive semidefinite and symmetric follows from the shuffle property. Under these properties, it is well known that a Cholesky decomposition exists, meaning that there is an upper triangular matrix \(U_{T}\in \mathbb {R}^{(d+1)_{n}\times (d+1)_{n}}\) such that \(Q(T,\Delta )=U_{T}U_{T}^{\mathsf {T}}\).
Moreover, an explicit expression of the log price in terms of a linear function of the signature of \(\widehat {Z}\) can be obtained, which we present in the next proposition.
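As a small numerical illustration of how a Cholesky factor of \(Q(T,\Delta )\) turns the quadratic form in Theorem 4.4 into a Euclidean norm, consider the following sketch. The matrix `Q` and the parameter vector `ell` are made-up placeholders, not output of the model. The sketch also uses the lower-triangular factor \(L\) with \(Q=LL^{\top}\) instead of the upper-triangular factor mentioned in the proof, an assumption made only because it is the more common textbook recursion; it requires \(Q\) to be strictly positive definite.

```python
import math


def cholesky_lower(q):
    """Lower-triangular factor L with q = L L^T (q must be positive definite)."""
    n = len(q)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = q[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


# Placeholder positive definite matrix, built as A A^T from an invertible A.
A = [[1.0, 0.2, 0.0], [0.1, 0.8, 0.3], [0.0, 0.4, 0.6]]
Q = [[sum(A[i][k] * A[j][k] for k in range(3)) for j in range(3)] for i in range(3)]

ell = [0.15, 0.30, -0.10]   # illustrative parameters only
delta = 30.0 / 365.25       # 30 days expressed in years

# Direct evaluation of sqrt(ell^T Q ell / delta).
quad = sum(ell[i] * Q[i][j] * ell[j] for i in range(3) for j in range(3))
vix_direct = math.sqrt(quad / delta)

# Same quantity via the factor: ell^T Q ell = |L^T ell|^2.
L = cholesky_lower(Q)
Lt_ell = [sum(L[i][j] * ell[i] for i in range(3)) for j in range(3)]
vix_cholesky = math.sqrt(sum(x * x for x in Lt_ell) / delta)
```

Writing the VIX as a norm makes the dependence on \(\ell \) explicit and keeps the argument of the square root non-negative by construction.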
Proposition 4.5
Let S be given by (22) with \(S_0=1\), and let \(\sigma ^S\) and X satisfy (23). Under these assumptions the log-price at time \(t\geq 0\) is given by
$$\displaystyle \begin{aligned} \log(S_{t}(\ell))=-\frac{1}{2}\,\ell^{\top}Q^{0}(t)\,\ell+\sum_{|I|\le n}\ell_{I}\langle\tilde{e}_{I}^{B},\widehat{\mathbb{Z}}_{t}\rangle, \end{aligned} $$
where
$$\displaystyle \begin{aligned} \tilde{e}_{\emptyset}^{B}&:=e_{d+1},\\ \tilde{e}_{I}^{B}&:=e_{I}\otimes e_{d+1}-\sum_{|J|< m}\frac{a^{J}_{i_{|I|}(d+1)}}{2}\,(e_{I'}\mathbin{\sqcup\!\sqcup} e_{J})\otimes e_{0} \end{aligned} $$
for each \(|I|>0\), and the matrix \(Q^{0}(t)\in \mathbb {R}^{(d+1)_{n}\times (d+1)_{n}}\) has components
$$\displaystyle \begin{aligned} Q^{0}_{\mathscr{L}(I)\mathscr{L}(J)}(t)=\langle (e_{I}\mathbin{\sqcup\!\sqcup} e_{J})\otimes e_{0},\widehat{\mathbb{X}}_{t}\rangle \end{aligned} $$
for an arbitrary but fixed labeling function \(\mathscr {L}:\{I: |I|\le n\}\to \{1,\dots ,(d+1)_{n}\}\).
Proof
Using Itô’s lemma and the form of S, \(\sigma ^S\) and X under our model, we obtain
$$\displaystyle \begin{aligned} \log(S_{t}(\ell))&=-\frac{1}{2}\int_{0}^{t}V_{s}(\ell)\,\mathrm{d}s+\int_{0}^{t}\sigma_{s}^{S}(\ell)\,\mathrm{d}B_{s}\\ &=-\frac{1}{2}\sum_{|I|,|J|\le n}\ell_{I}\ell_{J}\int_{0}^{t}\langle e_{I}\mathbin{\sqcup\!\sqcup} e_{J},\widehat{\mathbb{X}}_{s}\rangle\,\mathrm{d}s+\sum_{|I|\le n}\ell_{I}\int_{0}^{t}\langle e_{I},\widehat{\mathbb{X}}_{s}\rangle\,\mathrm{d}B_{s}\\ &=-\frac{1}{2}\sum_{|I|,|J|\le n}\ell_{I}\ell_{J}\langle (e_{I}\mathbin{\sqcup\!\sqcup} e_{J})\otimes e_{0},\widehat{\mathbb{X}}_{t}\rangle+\sum_{|I|\le n}\ell_{I}\langle\tilde{e}_{I}^{B},\widehat{\mathbb{Z}}_{t}\rangle\\ &=-\frac{1}{2}\,\ell^{\top}Q^{0}(t)\,\ell+\sum_{|I|\le n}\ell_{I}\langle\tilde{e}_{I}^{B},\widehat{\mathbb{Z}}_{t}\rangle, \end{aligned} $$
where we used that \(\int _{0}^{t}\langle e_{I},\widehat {\mathbb {X}}_{s}\rangle \mathrm {d}B_{s}=\langle \tilde {e}_{I}^{B}, \widehat {\mathbb {Z}}_{t}\rangle \) (Lemma 3.5) in the second to last equality. □
4.2 Joint Calibration of SPX and VIX Options
We point out again that we only work with call options in the following. This choice was made merely for simplicity, and it is straightforward to include any other liquid options. The times to maturity and the strike prices available on the market may differ for VIX and SPX. Accordingly, we denote by \(\mathcal {T}^{\mathrm {SPX}}, \mathcal {T}^{\mathrm {VIX}}\) the set of maturities and by \(\mathcal {K}^{\mathrm {SPX}}\), \(\mathcal {K}^{\mathrm {VIX}}\) the set of strikes for SPX and VIX options respectively.
Proposition 4.5 implies that the SPX call option payoff is given by
for a given maturity \(T>0\) and a strike price \(K>0\). Recall from Remark 4.1 that \(\tilde {S}\) stands for the undiscounted, unadjusted price process, r is the interest rate and q the dividend. Similarly, according to Theorem 4.4, the payoff of the VIX call option reads
where we applied the Cholesky decomposition to \(Q(T,\Delta )=U_TU_T^{\mathsf {T}}\). To evaluate the expectations in the pricing formula for the above options as well as for the VIX futures, we apply a Monte Carlo procedure. However, we want to point out that we do not need an additional Monte Carlo simulation to evaluate the conditional expectation in the definition of the VIX. This is a significant computational benefit of our model thanks to the polynomial technology.
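The computational point made here can be sketched as follows: once samples of \(Q(T,\Delta )(\omega _i)\) are available, each evaluation of the VIX call and futures prices for a new parameter vector \(\ell \) only requires quadratic forms and an average, with no nested simulation. In the sketch below the samples are synthetic random positive semidefinite matrices produced by the hypothetical helper `synthetic_q`, standing in for the precomputed model quantities. Discounting is ignored and all numerical values are illustrative assumptions.

```python
import math
import random


def vix_sample(ell, q, delta):
    """VIX for one sample: sqrt(ell^T q ell / delta)."""
    size = len(ell)
    quad = sum(ell[i] * q[i][j] * ell[j] for i in range(size) for j in range(size))
    return math.sqrt(max(quad, 0.0) / delta)


def vix_call_and_future(ell, q_samples, strike, delta):
    """Monte Carlo averages of the call payoff and of the VIX itself."""
    vix = [vix_sample(ell, q, delta) for q in q_samples]
    future = sum(vix) / len(vix)
    call = sum(max(v - strike, 0.0) for v in vix) / len(vix)
    return call, future


def synthetic_q(rng, size, delta):
    """Stand-in for one offline sample of Q(T, Delta): a random PSD matrix."""
    a = [[rng.gauss(0.0, 0.2 * math.sqrt(delta)) for _ in range(size)] for _ in range(size)]
    return [[sum(a[i][k] * a[j][k] for k in range(size)) for j in range(size)]
            for i in range(size)]


rng = random.Random(1)
delta = 30.0 / 365.25
size, n_samples = 4, 2000

# Offline step: sample once, independently of the parameters ell.
q_samples = [synthetic_q(rng, size, delta) for _ in range(n_samples)]

# Online step: cheap re-evaluation for any candidate ell during optimization.
ell = [0.2, 0.5, -0.3, 0.1]
call_low, future = vix_call_and_future(ell, q_samples, 0.10, delta)
call_high, _ = vix_call_and_future(ell, q_samples, 0.30, delta)
call_zero, _ = vix_call_and_future(ell, q_samples, 0.0, delta)
```

Since `q_samples` does not depend on `ell`, an optimizer can call `vix_call_and_future` repeatedly on the same samples.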
The loss function to be minimized in the joint calibration is simply a convex combination of the individual loss functions of VIX and SPX respectively:
where \(\pi _{VIX}^{model}\) and \(F_{VIX}^{model}\) are the Monte Carlo option- and futures-prices under our model (i.e. for \({\text{VIX}}_T(\ell ,\omega _i)\) defined as in (27));
Note that in the case of the SPX no future prices need to be calibrated (because the index value can directly be used as spot price) and we therefore do not add them as inputs to \(\mathcal {L}^{\beta }\) by slight abuse of notation.
For some \(\beta \in (0,1)\), we define \(\mathcal {L}^{\beta }\) to be the loss function
\(\upsilon ^{mkt}\), \(\delta ^{mkt}\) denoting the Vega and Delta of the option under the Black-Scholes model (but note that they depend on both maturity and strike price);
\(\pi ^{mkt, b, a}=[\pi ^{mkt,b},\pi ^{mkt,a}] \),\(\sigma ^{mkt, b, a}=[\sigma ^{mkt,b},\sigma ^{mkt,a}]\), with \(\pi ^{mkt,b},\pi ^{mkt,a}\), \(\sigma ^{mkt,b}\) and \(\sigma ^{mkt,a}\) denoting the market bid and ask prices and implied volatilities respectively.
F and \(F^{mkt}\) standing for the model’s and the market’s futures prices, respectively (with maturity T);
\(\tilde {1}_{\{x \notin [y^{b},y^{a}]\}}:=s(y^b-x)+s(x-y^a)\) for \(s(x):=\frac {1}{2}\tanh (100x)+\frac {1}{2}\) a smooth approximation of the indicator function, which penalizes implied volatilities that lie out of the bid-ask spread.
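The smoothed indicator in the last item is simple enough to write down directly. The following sketch implements \(s\) and \(\tilde {1}\) exactly as defined above; the function names and the bid, ask and model values are invented for illustration.

```python
import math


def smooth_step(x, steepness=100.0):
    """s(x) = 0.5 * tanh(steepness * x) + 0.5, a smooth version of 1_{x > 0}."""
    return 0.5 * math.tanh(steepness * x) + 0.5


def outside_spread_penalty(x, bid, ask):
    """Smooth approximation of the indicator that x lies outside [bid, ask]."""
    return smooth_step(bid - x) + smooth_step(x - ask)


# Illustrative implied volatilities: one inside and one outside the spread.
penalty_inside = outside_spread_penalty(0.20, bid=0.15, ask=0.25)
penalty_outside = outside_spread_penalty(0.30, bid=0.15, ask=0.25)
penalty_at_bid = outside_spread_penalty(0.15, bid=0.15, ask=0.25)
```

Because the transition has a width of roughly \(1/100\), the penalty is not negligible for values close to the boundary of the spread: at the bid itself it is approximately one half.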
4.3 Numerical Results
Before presenting the numerical results for the joint calibration problem within our framework, it is worth mentioning that two main approaches are taken in the literature when choosing the maturities to fit.
1.
One approach is to choose the first maturity to be the same for VIX and SPX, \(T_{1}^{{\text{SPX}}}=T_{1}^{{\text{VIX}}}\), and then for higher maturities \(j\ge 2\) to set \(T_{j}^{{\text{SPX}}}=T_{j-1}^{{\text{VIX}}}+\Delta \), see for instance [24, 27, 29, 30]. Sometimes, one may also choose the first maturity to differ by a few days between SPX and VIX, if the same maturity is not available for both on the market.
2.
Another approach consists of taking the same set of maturities for SPX and VIX \(\mathcal {T}^{{\text{SPX}}}=\mathcal {T}^{{\text{VIX}}}\) (or close together) as it was done for example by [4, 23, 39].
In the following we will adhere to the first approach and refer the reader to [13] for results regarding the second approach.
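The first approach amounts to a simple shift rule, sketched below for VIX maturities of 14 and 28 days, the values used later in this section. The conversion to year fractions assumes the convention \(T=1\equiv 365.25\) days stated with the maturity table; the function name `spx_maturities_from_vix` is introduced only for this example.

```python
DAYS_PER_YEAR = 365.25
DELTA_DAYS = 30  # the VIX window Delta


def spx_maturities_from_vix(vix_days, delta_days=DELTA_DAYS):
    """First approach: T_1^SPX = T_1^VIX and T_j^SPX = T_{j-1}^VIX + Delta for j >= 2."""
    return [vix_days[0]] + [t + delta_days for t in vix_days]


vix_days = [14, 28]
spx_days = spx_maturities_from_vix(vix_days)

# Year fractions rounded to four decimals.
vix_years = [round(t / DAYS_PER_YEAR, 4) for t in vix_days]
spx_years = [round(t / DAYS_PER_YEAR, 4) for t in spx_days]
```

With these inputs the rule gives SPX maturities of 14, 44 and 58 days, so that each later SPX maturity coincides with the end of the 30-day window of a VIX maturity.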
The trading day we consider for our calibration is 02/06/2021, the same as was used in [24, 30], and we use call options for both VIX and SPX.
In the table below, we report the maturities used in the calibrations (where \(T=1\equiv 365.25\) days) together with the corresponding moneyness range (i.e. strike price normalized by the spot price or the market’s futures price) within which we calibrated our model.
\(T_{1}^{{\text{VIX}}}=0.0383\): moneyness range (90\(\%\), 220\(\%\))
\(T_{2}^{{\text{VIX}}}=0.0767\): moneyness range (90\(\%\), 220\(\%\))
\(T_{1}^{{\text{SPX}}}=0.0383\): moneyness range (92\(\%\), 105\(\%\))
\(T_{2}^{{\text{SPX}}}=0.1205\): moneyness range (70\(\%\), 105\(\%\))
\(T_{3}^{{\text{SPX}}}=0.1588\): moneyness range (80\(\%\), 120\(\%\))
The maturities in days are \(\mathcal {T}^{SPX}= (14, 44, 58)\) and \(\mathcal {T}^{VIX}= (14,28)\). Let us also point out that the high moneyness region (up to 220\(\%\)) for \({\text{VIX}}\) options is known to be challenging to fit. In our model, we need to choose both a primary process and a level of truncation of the signature a priori. Indeed, we treat these choices as hyperparameters and do not train them. As primary process we choose a three-dimensional OU process (see Example 2.9) with randomly chosen parameters
and truncate the signature at \(n=3\). In this setting, our model has 85 trainable parameters, meaning that \(\ell \in \mathbb {R}^{85}\). Note that choosing instead a two-dimensional Brownian motion as primary process does not lead to good fits at low truncation levels of the signature (\(n=3\)), which is shown in [13, Appendix A]. The OU process is therefore a natural choice for a polynomial process which is tractable but exhibits richer dynamics than correlated Brownian motions alone. Moreover, OU processes have qualified for volatility modeling as shown in various articles, see e.g. [1, 2, 38, 40]. In terms of loss function, we choose the parameters \(\lambda =0.35\) and \(\beta =1\). For the evaluation of the pricing functional we simulate \(N_{MC}=80{,}000\) Monte Carlo samples.
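The parameter count can be checked by enumerating the words of length at most \(n\) over the alphabet \(\{0,1,\dots ,d\}\), where letter 0 stands for time. The sketch below does this for \(d=3\) and \(n=3\) and builds one possible labeling of words by consecutive integers. The labeling starts at 0 instead of 1 and orders words by length, which is merely one convenient choice of injective labeling; the function names are introduced for illustration.

```python
from itertools import product


def signature_dimension(alphabet_size, level):
    """Number of words of length <= level, i.e. sum_k alphabet_size**k."""
    return sum(alphabet_size ** k for k in range(level + 1))


def words_up_to(alphabet_size, level):
    """All words of length <= level as tuples, ordered by length."""
    return [w for k in range(level + 1)
            for w in product(range(alphabet_size), repeat=k)]


d, n = 3, 3  # three-dimensional primary process, truncation level 3

n_params = signature_dimension(d + 1, n)
labels = {word: idx for idx, word in enumerate(words_up_to(d + 1, n))}

# Size of the truncated signature at level 2n + 1 for the same d and n.
g_dimension = signature_dimension(d + 1, 2 * n + 1)
```

Here `n_params` equals \(1+4+16+64=85\), in line with \(\ell \in \mathbb {R}^{85}\), while the truncation at level \(2n+1=7\) already has 21845 components, which indicates why precomputing the matrix exponential offline is attractive.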
The calibrated implied volatility smiles are depicted in Fig. 5, and the absolute values of the relative errors between the model’s and the market’s futures prices for each VIX maturity are reported in the table below.
Fig. 5
The implied volatility smiles correspond to \(T_{1}^{{\text{SPX}}}, T_{1}^{{\text{VIX}}}, T_{2}^{\text{SPX}}, { T_{2}^{{\text{VIX}}}, T_{3}^{{\text{SPX}}}}\) (from top left to bottom right). The blue dots correspond to the values given by our calibrated model and the red stars are the market’s bid-ask implied volatilities (from market call options). For the VIX, we additionally show the market’s futures price by the red dashed line and the one of the calibrated model by a blue dashed line
\(\varepsilon _{T_{1}}^{{\text{VIX}}}=9.8\cdot 10^{-6}\)
\(\varepsilon _{T_{2}}^{{\text{VIX}}}=6.6\cdot 10^{-8}\)
Indeed, the model does seem to fit the market implied volatilities of the VIX and SPX options at the given maturities rather well. As shown in Fig. 5, the model’s implied volatilities lie within the market’s bid-ask spreads and the calibrated futures prices coincide with the ones of the market. The latter is confirmed by the small relative errors reported in the table. However, it is also worth noting that this numerical experiment only consists of a small set of maturities and a single trading day. We refer the reader to [13] for a more thorough empirical study with calibration results for various sets of maturities and tests for parameter stability under re-calibration over multiple weeks.
Acknowledgements
The first three authors gratefully acknowledge financial support through grant Y 1235 and grant I 3852 of the Austrian Science Fund. All authors acknowledge financial support through the OEAD WTZ project FR.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Note that the SPX is a theoretical index tracking the value of the S&P 500 without holding an actual portfolio of the constituent stocks. In the following we shall use the terms SPX and S&P 500 interchangeably.
1.
E. Abi Jaber, C. Illand, S. Li, Joint SPX-VIX calibration with Gaussian polynomial volatility models: deep pricing with quantization hints. Preprint. arXiv:2212.08297 (2022)
2.
E. Abi Jaber, C. Illand, S. Li, The quintic Ornstein-Uhlenbeck volatility model that jointly calibrates SPX & VIX smiles. Preprint. arXiv:2212.10917 (2022)
3.
H. Boedihardjo, J. Diehl, M. Mezzarobba, H. Ni, The expected signature of Brownian motion stopped on the boundary of a circle has finite radius of convergence. Bull. Lond. Math. Soc. 53(1), 285–299 (2021)
4.
A. Bondi, S. Pulido, S. Scotti, The rough Hawkes Heston stochastic volatility model. Math. Finance 1, 1–45 (2024)
5.
T. Cass, E. Ferrucci, On the Wiener chaos expansion of the signature of a Gaussian process, in Probability Theory and Related Fields (2024), pp. 1–39
6.
R. Cont, S. Ben Hamida, Recovering volatility from option prices by evolutionary optimization. J. Comput. Finance 8(4), 43–76 (2005)
7.
C. Cuchiero, J. Möller, Signature methods in stochastic portfolio theory. Preprint. arXiv:2310.02322 (2023)
8.
C. Cuchiero, S. Svaluto-Ferro, Infinite-dimensional polynomial processes. Finance Stoch. 25(2), 383–426 (2021)
9.
C. Cuchiero, M. Keller-Ressel, J. Teichmann, Polynomial processes and their applications to mathematical finance. Finance Stoch. 16, 711–740 (2012)
10.
C. Cuchiero, W. Khosrawi, J. Teichmann, A generative adversarial network approach to calibration of local stochastic volatility models. Risks 8(4), 101 (2020)
11.
C. Cuchiero, G. Gazzani, S. Svaluto-Ferro, Signature-based models: Theory and calibration. SIAM J. Financ. Math. 14(3), 910–957 (2023)
12.
C. Cuchiero, S. Svaluto-Ferro, J. Teichmann, Signature SDEs from an affine and polynomial perspective. Preprint. arXiv:2302.01362 (2023)
13.
C. Cuchiero, G. Gazzani, J. Möller, S. Svaluto-Ferro, Joint calibration to SPX and VIX options with signature-based models. Math. Finance 1, 1–53 (2024)
14.
C. Cuchiero, F. Guida, L. Di Persio, S. Svaluto-Ferro, Measure-valued affine and polynomial diffusions. Stoch. Process. Appl. 175, 104392 (2024)
15.
C. Cuchiero, F. Primavera, S. Svaluto-Ferro, Universal approximation theorems for continuous functions of càdlàg paths and Lévy-type signature models. Finance Stoch. (2025), to appear
16.
F. Delbaen, W. Schachermayer, A general version of the fundamental theorem of asset pricing. Math. Ann. 300(1), 463–520 (1994)
17.
G. Di Nunno, K. Kubilius, Y. Mishura, A. Yurchenko-Tytarenko, From constant to rough: a survey of continuous volatility modeling. Mathematics 11(19), 4201 (2023)
18.
T. Fawcett, Problems in stochastic analysis. Connections between rough paths and non-commutative harmonic analysis. PhD Thesis, Univ. Oxford, 2003
19.
D. Filipović, M. Larsson, Polynomial diffusions and applications in finance. Finance Stoch. 20(4), 931–972 (2016)
20.
P.K. Friz, A. Shekhar, General rough integration, Lévy rough paths and a Lévy–Kintchine-type formula. Ann. Probab. 45(4), 2707–2765 (2017)
21.
P.K. Friz, P.P. Hager, N. Tapia, Unified signature cumulants and generalized magnus expansions, in Forum of Mathematics, Sigma, vol. 10 (Cambridge University Press, Cambridge, 2022), p. e42
22.
J. Gatheral, The Volatility Surface: A Practitioner’s Guide (Wiley, 2011)
23.
J. Gatheral, T. Jaisson, M. Rosenbaum, Volatility is rough. Quant. Finance 18(6), 933–949 (2018)
24.
G. Gazzani, J. Guyon, Pricing and calibration in the 4-factor path-dependent volatility model. Preprint. arXiv:2406.02319 (2024)
25.
P. Gierjatowicz, M. Sabate-Vidales, D. Siska, L. Szpruch, Z. Zuric, Robust pricing and hedging via neural SDEs. J. Comput. Finance 3(26), 1–32 (2022)
26.
P. Glasserman, Monte Carlo Methods in Financial Engineering, vol. 53 (Springer, 2004)
27.
I. Guo, G. Loeper, J. Obloj, S. Wang, Optimal transport for model calibration. Preprint. arXiv:2107.01978 (2021)
28.
J. Guyon, The joint S&P 500/VIX smile calibration puzzle solved. Risk, April (2020)
29.
J. Guyon, Dispersion-constrained martingale Schrödinger problems and the exact joint S&P 500/VIX smile calibration puzzle. Finance Stoch. 28, 1–53 (2023)
30.
J. Guyon, J. Lekeufack, Volatility is (mostly) path-dependent. Quant. Finance 23, 1–38 (2023)
31.
P.S. Hagan, D. Kumar, A.S. Lesniewski, D. Woodward, Managing smile risk. Best Wilmott 1, 249–296 (2002)
32.
T. Lyons, H. Ni, Expected signature of Brownian motion up to the first exit time from a bounded domain. Ann. Probab. 43(5), 2729–2762 (2015)
33.
T. Lyons, N. Victoir, Cubature on Wiener space. Proc. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 460(2041), 169–198 (2004)
34.
T. Lyons, S. Nejad, I. Perez Arribas, Non-parametric pricing and hedging of exotic derivatives. Appl. Math. Finance 27(6), 457–494 (2020)
35.
A. Neuberger, The log contract. J. Portfolio Manag. 20(2), 74 (1994)
36.
H. Ni, The expected signature of a stochastic process. Ph.D. thesis, University of Oxford, 2012
37.
I. Perez Arribas, C. Salvi, L. Szpruch, Sig-SDEs model for quantitative finance, in Proceedings of the First ACM International Conference on AI in Finance (2020), pp. 1–8
38.
S. Rømer, Empirical analysis of rough and classical stochastic volatility models to the SPX and VIX markets. Quant. Finance 22, 1–34 (2022)
39.
M. Rosenbaum, J. Zhang, Deep calibration of the quadratic rough Heston model. Preprint. arXiv:2107.01611 (2021)
40.
E.M. Stein, J.C. Stein, Stock price distributions with stochastic volatility: an analytic approach. Rev. Financ. Stud. 4(4), 727–752 (1991)