
Signature-Based Models in Finance

  • Open Access
  • 2026
  • OriginalPaper
  • Chapter

Abstract

This chapter delves into the world of signature-based models in finance, highlighting the shift from traditional, parameter-calibrated models to data-driven, overparametrized approaches. The focus is on signature SDEs, which offer a robust and universal framework for asset price modeling. The text explores the universality and no-arbitrage conditions of these models, providing explicit formulas for option pricing and calibration to market data. The chapter also presents a novel approach for joint SPX and VIX modeling, demonstrating the model's ability to capture market dynamics accurately. Through detailed mathematical derivations and numerical results, the chapter showcases the model's performance in fitting implied volatility surfaces and calibrating to both SPX and VIX options. The practical applications and empirical results make this chapter a valuable resource for professionals seeking to understand and implement advanced financial modeling techniques.
The present work was initiated when the second author was affiliated to École des Ponts ParisTech, CERMICS Lab, Marne la Vallée Cedex 2, France.

1 Introduction

The present chapter is based on the two articles [13] and [11].
In recent years, the traditional approach of calibrating a few well-interpretable parameters has given way to learning the model’s characteristics as a whole, by leveraging all available sources of data. Consequently, overparametrized models have become increasingly important. This shift has opened the door to more robust and data-driven model selection mechanisms. The choice of admissible models must, however, be limited to those adhering to first principles such as “no-arbitrage”. Classes of dynamic processes that are both data-driven and consistent with such well-established theoretical principles can be found by relying on universal approximation theorems.
One class of such financial models are so-called signature SDEs, as considered in [12, 37]. These are Itô-diffusions, where the characteristics are linear or real-analytic functions of the signature of some (properly extended) driving Brownian motion or the process itself. The models that we consider in this chapter can be embedded in the framework of signature SDEs and serve as particularly tractable examples thereof.
In the first part (Sect. 3) of the present chapter our goal is to provide a data-driven, universal, tractable and easy-to-calibrate asset price model. For the sake of exposition we shall assume throughout that the asset S is one-dimensional. We model S as a linear function of the signature of what we call the primary process. This primary process can be a classical driving signal, e.g., a Brownian motion or a polynomial process [9], or more generally a d-dimensional continuous semimartingale \(X=(X_{t}^{1},\dots ,X_{t}^{d})_{t \geq 0}\), usually augmented with time t. The corresponding time-extended process is denoted by \((\widehat {X}_{t})_{t \geq 0}=(t,X_{t}^{1},\dots ,X_{t}^{d})_{t \geq 0}\) and its signature by \(\widehat {\mathbb {X}}\). The asset S is then modeled/approximated via a process \(S_{n}(\ell )\) defined as
$$\displaystyle \begin{aligned} {} S_n(\ell)_t:=\ell(\widehat{\mathbb{X}}_t), \end{aligned} $$
(1)
where \(\ell \) is a linear map of the signature of \(\widehat {X}\), truncated at some level \(n\in \mathbb {N}\), and needs to be inferred from data (see Definition 3.2 for further details). Note that the parameters of X are fixed beforehand and can thus be seen, in analogy to machine learning terminology, as hyperparameters. This is crucial as it allows us to split the calibration task into a precomputation of signature samples and a standard optimization to find the parameters of the linear map. This is one of the attractive features of the model; further ones are summarized below.
No Arbitrage
We will show in Sect. 3.2 that (1) can also be expressed in terms of stochastic integrals, from which no-arbitrage conditions are straightforward to deduce.
Universality
We refer the reader to [11, Example 3.15], where we show that classical stochastic volatility models with sufficiently regular coefficients can be arbitrarily well approximated by models of form (1).
Option Pricing Formulas for Sig-Payoffs
By approximating (path dependent) payoffs via so-called sig-payoffs of the form \(L(\widehat {\mathbb {S}}_T ({\ell }) )\) (see also [34]), where \(\widehat {\mathbb {S}}\) denotes the signature of \((\widehat {S}_t)_t:=(t, S_t)_t\) and L a linear map, we show that this kind of approximate option pricing reduces to the computation of the expected signature of \(\widehat {X}\). These formulas lead to high tractability whenever \(\mathbb {E}[\widehat {\mathbb {X}}_T]\) is easy to compute. This is the case for all polynomial processes, as analyzed in [12] and, for the truncated signature, in Sect. 2.
Calibration to Options
As mentioned above, when calibrating to the market’s volatility surface (see Sect. 3.5), we precompute Monte Carlo samples of \(\widehat {\mathbb {X}}\) and are then only left with finding the parameters \(\ell \), which amounts to a standard optimization problem. For both simulated and real market data (S&P 500 index) we perform a full calibration to the volatility surface and show that it is not only highly accurate but also very fast (see [11] for details, in particular in the case of time dependent parameters).
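The split between offline sampling and online optimization can be sketched as follows. This is only a minimal illustration under our own simplifying assumptions, not the implementation from the repositories cited in this chapter: the primary process is a one-dimensional Brownian motion, the signature of \((t, W_t)\) is truncated at level 2, interest rates are zero, and the names `sample_signatures`, `call_price` and `calibration_loss` as well as all numerical values are hypothetical.

```python
import random

# Words of length <= 2 over the alphabet {0 (time), 1 (Brownian motion)}.
WORDS = [(), (0,), (1,), (0, 0), (0, 1), (1, 0), (1, 1)]

def sample_signatures(n_paths, n_steps, T, seed=0):
    """Offline step: level-2 signature samples of (t, W_t) at time T, one row per path."""
    rng = random.Random(seed)
    dt = T / n_steps
    rows = []
    for _ in range(n_paths):
        w = t = int_t_dw = int_w_dt = 0.0
        for _ in range(n_steps):
            dw = rng.gauss(0.0, dt ** 0.5)
            # Midpoint rule: exact iterated integrals of the piecewise-linear path.
            int_t_dw += (t + 0.5 * dt) * dw
            int_w_dt += (w + 0.5 * dw) * dt
            w += dw
            t += dt
        # Entries follow the order of WORDS.
        rows.append([1.0, T, w, 0.5 * T * T, int_t_dw, int_w_dt, 0.5 * w * w])
    return rows

def call_price(ell, samples, strike):
    """Monte Carlo price of a call on S_T = ell(signature), reusing the fixed samples."""
    total = 0.0
    for row in samples:
        s_T = sum(l * x for l, x in zip(ell, row))
        total += max(s_T - strike, 0.0)
    return total / len(samples)

def calibration_loss(ell, samples, strikes, market_prices):
    """Online step: squared pricing error; no re-sampling is needed when ell changes."""
    return sum((call_price(ell, samples, k) - p) ** 2
               for k, p in zip(strikes, market_prices))

samples = sample_signatures(n_paths=2000, n_steps=50, T=1.0)
strikes = [0.9, 1.0, 1.1]
ell_true = [1.0, 0.0, 0.2, 0.0, 0.0, 0.0, 0.0]   # hypothetical: S_T = 1 + 0.2 W_T
market = [call_price(ell_true, samples, k) for k in strikes]
ell_trial = [1.0, 0.0, 0.3, 0.0, 0.0, 0.0, 0.0]
print(calibration_loss(ell_trial, samples, strikes, market))
```

Any standard optimizer can then be applied to `calibration_loss` as a function of `ell`, since `samples` stays fixed throughout.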
The second part (Sect. 4) treats a slightly different class of signature-based models and constitutes a novel contribution for joint VIX and S&P 500 modeling as well as calibration. In the context of volatility modeling, the joint calibration to SPX and VIX options is still considered a rather hard problem, and one which has become increasingly important over the past years. It is worth mentioning, however, that significant progress has been made recently (see [13] and [17]). We refer the reader to [13, 17, 28] for an extensive literature review.
Inspired by the previous model, we now study a stochastic volatility model for the discounted price process \(S=(S_{t})_{t\ge 0}\), namely
$$\displaystyle \begin{aligned} \mathrm{d} S_{t}(\ell) = S_{t}(\ell)\sigma_{t}^S(\ell) \mathrm{d} B_{t}, \end{aligned}$$
with some initial condition \(S_0 \in \mathbb {R}_+\), standard Brownian motion B, and volatility process \(\sigma ^S\) satisfying
$$\displaystyle \begin{aligned} {} \sigma_{t}^S(\ell) := \ell(\widehat{\mathbb{X}}_{t}), \end{aligned} $$
(2)
for a linear map \(\ell \) of the signature \(\widehat {\mathbb {X}}_{t}\) of a time extended d-dimensional continuous semimartingale X, which takes here the role of the primary process. We thus assume that the signature of \(\widehat {X}\) serves as a linear regression basis for the volatility process, while the parameters of the linear map \(\ell \) have to be learned from (option price) data.
Also in this case the modeling framework can be seen as universal in the class of continuous non-rough stochastic volatility models, which is a consequence of the universal approximation properties of the signature. Besides that, it truly nests several classical models (see Remark 4.3) and incorporates both purely Markovian (in \((S,X)\)) and path-dependent ones. Moreover, it provides for the first time a signature-based approach for pricing VIX options and highly accurate joint calibration results, as illustrated in Sect. 4.2. For the latter we exploit the following mathematical and numerical properties.
  • Setting \(Z:= (X, B)\), not only \(\sigma ^S(\ell )\) but also the log-price \(\log (S(\ell ))\) is a linear function of the signature of \(\widehat {Z}\). Therefore no (Euler) simulation scheme is needed to sample the price process, leading to immediate computational advantages.
  • If we additionally assume X to be a polynomial process (see Definition 2.1 and [9, 19]), we obtain an analytic expression for the VIX, which only involves the computation of a matrix exponential. This follows from the property that the truncated signature of a polynomial process is again a polynomial process (see Sect. 2).
  • We can apply a Monte Carlo approach for option pricing and calibration where we are able to generate the signature samples of \(\widehat {Z}\) offline and independently of the model parameters to be optimized. As in the previous model the calibration task can thus be split into an offline sampling procedure and a standard optimization, since the latter does not require re-sampling for updated model parameters \(\ell \).
The code for the first part is available at GuidoGazzani-ai/sigsde_calibration, while an implementation of the model for the joint calibration can be found in GuidoGazzani-ai/_jointcalib_sigsde or janka-moeller/joint_calib_SPX_VIX. Moreover, another implementation of the expected signature of polynomial processes is available at github.com/sarasvaluto/AffPolySig. The first repository refers to [11], the other three were introduced in [13]. We additionally share a Colab notebook where the reader can benefit from an implementation of the building blocks of the signature-based models introduced here.

1.1 Notation

Let us first fix the notation used throughout the chapter. Recall the notions introduced in chapter “A Primer on the Signature Method in Machine Learning”, in particular the algebra of formal power series which we will also refer to as extended tensor algebra and denote by \(T((\mathbb {R}^d))\). For a multi-index \(I:=(i_1,\ldots ,i_n)\in \{1, \dots , d\}^n\) we set \(|I|:=n\). We also consider the empty index \(I:=\emptyset \) and set \(|I|:=0\). For each index I we then define
$$\displaystyle \begin{aligned} {} I':= \begin{cases} (i_1,\ldots,i_{|I|-1}) &\text{if }|I|\geq2,\\ \emptyset&\text{if }|I|=1,\\ 0&\text{if }|I|=0. \end{cases} \end{aligned} $$
(3)
Similarly we set \(I'':=(I')'\) if \(|I|\neq 0\) and \(I''=0\) if \(|I|=0\). Let us denote by \(e_\emptyset \) the basis element corresponding to \((\mathbb {R}^d)^{\otimes 0}\), then each element \(\mathbf {a}\in T((\mathbb {R}^d))\) can be expressed as
$$\displaystyle \begin{aligned} \mathbf{a}=\sum_{|I|\geq 0}{\mathbf{a}}_I e_I,\end{aligned}$$
for a collection of \({\mathbf {a}}_I\in \mathbb {R}\). Moreover a vectorization of the elements of the truncated tensor algebra \(T^{(n)}(\mathbb {R}^d)\) whose elements are of the form
$$\displaystyle \begin{aligned} \mathbf{u}=\sum_{0\leq |I| \leq n}{\mathbf{u}}_I e_I\end{aligned}$$
will prove useful. To this end, we introduce the isomorphism \({\mathbf {vec}}: T^{(n)}(\mathbb {R}^d)\to \mathbb {R}^{d_{n}}\) and an arbitrary but fixed injective labeling function \(\mathscr {L}:\{I: |I|\le n\}\longrightarrow \{1,\dots , d_{n}\}\), such that
$$\displaystyle \begin{aligned} {} {\mathbf{vec}}({\mathbf{u}}):=\sum_{|I|\le n}e_{\mathscr{L}(I)}{\mathbf{u}}_{I}, \end{aligned} $$
(4)
with \(d_{n}:=\frac {d^{n+1}-1}{d-1}\).
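To make the vectorization concrete, the following sketch enumerates the multi-indices of length at most n and builds one possible labeling. The ordering used here (by length, then lexicographically) and the helper names `multi_indices`, `labeling` and `vec` are our own choices for illustration; the text allows any fixed injective labeling.

```python
from itertools import product

def multi_indices(d, n):
    """All multi-indices I over {1,...,d} with |I| <= n, by length, then lexicographically."""
    words = []
    for length in range(n + 1):
        words.extend(product(range(1, d + 1), repeat=length))
    return words

def labeling(d, n):
    """One possible injective labeling I -> {1,...,d_n}."""
    return {word: pos + 1 for pos, word in enumerate(multi_indices(d, n))}

def vec(u, d, n):
    """Coordinates of u = sum_I u_I e_I, given as a dict {I: u_I}, in the labeling order."""
    label = labeling(d, n)
    out = [0.0] * len(label)
    for word, coeff in u.items():
        out[label[word] - 1] = coeff
    return out

d, n = 2, 3
label = labeling(d, n)
d_n = (d ** (n + 1) - 1) // (d - 1)   # closed form, valid for d >= 2
print(len(label), d_n)
print(vec({(): 1.0, (2,): 3.0}, 2, 1))
```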
We stick to the convention of mathematical finance, that the process \(S=(S_t)_{t\geq 0}\) denotes the price process. Therefore, we denote the signature of a process X by \(\mathbb {X}\). Moreover, for a multi-index I and \(0\leq s \leq t\), we denote the corresponding increment of a component of the signature by its projection on \(e_I\) that is \(\langle e_I, \mathbb {X}_{s,t}\rangle \). To make the connection to the notation introduced in chapter “A Primer on the Signature Method in Machine Learning”: for a multi-index I and times \(0\leq s \leq t\) we denote the corresponding element of the signature of a process X by
$$\displaystyle \begin{aligned} \langle e_I, \mathbb{X}_{s,t} \rangle = S(X)^I_{s, t}, \end{aligned}$$
and write \(\mathbb {X}_{t}:= \mathbb {X}_{0,t}\). We emphasize again that from now on we will only use the letter S to denote the price process and never the signature of a process. Recall that the signature of an \(\mathbb {R}^d\)-valued process \(X= (X_t)_{t\geq 0}\) satisfies \(\mathbb {X}_t \in T((\mathbb {R}^d))\) at any time \(t\geq 0\). We denote its canonical projection to \(T^{(n)}(\mathbb {R}^d)\) by \(\mathbb {X}^n\) and refer to it as the truncated signature.
In this chapter, we assume the stochastic processes to be continuous semimartingales. Recall that the elements of the signature of a continuous semimartingale \(X=(X_t)_{t\geq 0}\) can be defined recursively as
$$\displaystyle \begin{aligned} \langle e_{\emptyset},\mathbb{X}_{s,t}\rangle:=\mbox{1}, \qquad \langle e_{I}, \mathbb{X}_{s,t}\rangle:=\int_{s}^{t}\langle e_{I'},\mathbb{X}_{s,r}\rangle\circ \mathrm{d} X_{r}^{i_{n}}, \end{aligned}$$
for each \(I=(i_1,\ldots , i_n)\), \(I'=(i_1,\ldots , i_{n-1})\) and \(0\leq s\leq t\), where \(\circ \) denotes the Stratonovich integral.
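For a piecewise-linear path the iterated integrals above are ordinary Riemann-Stieltjes integrals and can be computed exactly. The sketch below does this by multiplying the tensor exponentials of the individual segments (Chen's identity); for a semimartingale one would apply it to a fine discretization, which is our assumption about how such samples are typically approximated and not a statement about the cited implementations. The function names are hypothetical.

```python
from itertools import product
from math import factorial

def segment_signature(delta, n):
    """Signature up to level n of a straight line with increment `delta` (tensor exponential)."""
    d = len(delta)
    sig = {}
    for k in range(n + 1):
        for word in product(range(1, d + 1), repeat=k):
            coeff = 1.0 / factorial(k)
            for i in word:
                coeff *= delta[i - 1]
            sig[word] = coeff
    return sig

def chen_product(a, b):
    """Truncated tensor product: <e_I, a (x) b> = sum over splits I = (I1, I2) of a_I1 * b_I2."""
    return {
        word: sum(a[word[:k]] * b[word[k:]] for k in range(len(word) + 1))
        for word in a
    }

def path_signature(points, n):
    """Truncated signature of the piecewise-linear path through `points` (list of d-tuples)."""
    d = len(points[0])
    sig = segment_signature([0.0] * d, n)   # unit element: 1 at the empty word
    for p, q in zip(points, points[1:]):
        delta = [qi - pi for pi, qi in zip(p, q)]
        sig = chen_product(sig, segment_signature(delta, n))
    return sig

path = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0), (3.0, 1.0)]
sig = path_signature(path, 2)
print(sig[(1,)], sig[(2,)], sig[(1, 2)], sig[(2, 1)])
```

The first level recovers the total increment of the path, and the two mixed second-level terms add up to the product of the first-level terms.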
Definition 1.1 (Shuffles)
Set \(I=(i_1,\ldots ,i_n)\) and \(J=(j_1,\ldots ,j_m)\) for some \(n\in \mathbb {N}\) and \(m>0\). We define the shuffle \(\sqcup \!\sqcup \) and half-shuffle \(\widetilde {\sqcup \!\sqcup }\) as \(e_I\sqcup \!\sqcup e_\emptyset :=e_\emptyset \sqcup \!\sqcup e_I:=e_I\), \(e_I\mathbin {\widetilde {\sqcup \!\sqcup }}e_\emptyset :=0\), respectively and
$$\displaystyle \begin{aligned} e_I\sqcup\!\sqcup e_J&:=(e_{I'}\sqcup\!\sqcup e_J)\otimes e_{i_n}+(e_I\sqcup\!\sqcup e_{J'})\otimes e_{j_m},\\ e_I\mathbin{\widetilde{\sqcup\!\sqcup}} e_J&:=(e_I\sqcup\!\sqcup e_{J'})\otimes e_{j_m}. \end{aligned}$$
Remark 1.2
Using shuffle and half-shuffle one can recover linear representations of more complex operations on the signature. In particular, for any pair of multi-indices \(I,J\) and each \(t\geq 0\) it holds that (see e.g. [11])
$$\displaystyle \begin{aligned} \langle e_I,\mathbb{X}_t\rangle\langle e_J,\mathbb{X}_t\rangle=\langle e_I\sqcup\!\sqcup e_J,\mathbb{X}_t\rangle \qquad \text{and}\qquad \int_0^t\langle e_I,\mathbb{X}_s\rangle\circ \mathrm{d}\langle e_J,\mathbb{X}_s\rangle=\langle e_I\mathbin{\widetilde{\sqcup\!\sqcup}} e_J,\mathbb{X}_t\rangle. \end{aligned}$$
Choosing for example \(I=(1)\) and \(J=(2)\) we get that
$$\displaystyle \begin{aligned} e_{(1)}\sqcup\!\sqcup e_{(2)}=(e_{\emptyset}\sqcup\!\sqcup e_{(2)})\otimes e_{(1)}+(e_{(1)}\sqcup\!\sqcup e_{\emptyset})\otimes e_{(2)}=e_{(2,1)}+e_{(1,2)} \end{aligned}$$
and \(e_{(1)}\mathbin {\widetilde {\sqcup \!\sqcup }}e_{(2)}=(e_{(1)}\sqcup \!\sqcup e_{\emptyset })\otimes e_{(2)}=e_{(1,2)}\).
Setting \(X_0=0\) this corresponds to
$$\displaystyle \begin{aligned} \begin{aligned} &\langle e_{(1)}, \mathbb{X}_t\rangle \langle e_{(2)},\mathbb{X}_t\rangle {=} {X}_t^1{X}_t^2 {=}\int_0^t{X}_s^2\circ d{X}_s^1{+}\int_0^t{X}_s^1\circ d{X}_s^2 {=}\langle e_{(2,1)}+e_{(1,2)},\mathbb{X}_t\rangle \end{aligned}\end{aligned}$$
and \(\int _0^t\langle e_{(1)}, \mathbb {X}_s\rangle \circ d \langle e_{(2)},\mathbb {X}_s\rangle =\int _0^t{X}_s^1\circ d{X}_s^2 =\langle e_{(1,2)},\mathbb {X}_t\rangle \), respectively.
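The recursive definitions of the shuffle and half-shuffle translate directly into code. The following sketch represents a linear combination of basis elements as a `Counter` mapping words (tuples of indices) to integer multiplicities; this representation and the function names are our own illustrative choices.

```python
from collections import Counter

def shuffle(I, J):
    """Shuffle product of e_I and e_J as a Counter {word: multiplicity}."""
    if not I:
        return Counter({J: 1})
    if not J:
        return Counter({I: 1})
    out = Counter()
    # (e_{I'} shuffle e_J) tensor e_{i_n}
    for word, mult in shuffle(I[:-1], J).items():
        out[word + (I[-1],)] += mult
    # (e_I shuffle e_{J'}) tensor e_{j_m}
    for word, mult in shuffle(I, J[:-1]).items():
        out[word + (J[-1],)] += mult
    return out

def half_shuffle(I, J):
    """Half-shuffle of e_I and e_J: only the terms ending in the last letter of J."""
    if not J:
        return Counter()   # half-shuffle with the empty word is 0
    out = Counter()
    for word, mult in shuffle(I, J[:-1]).items():
        out[word + (J[-1],)] += mult
    return out

print(shuffle((1,), (2,)))
print(half_shuffle((1,), (2,)))
```

For \(I=(1)\) and \(J=(2)\) this reproduces the two words \((2,1)\) and \((1,2)\) for the shuffle and the single word \((1,2)\) for the half-shuffle, as in the remark above.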

2 Expected Signature of Polynomial Processes

Consider the \(\mathbb {R}^d\)-valued process \(Y=(Y_{t})_{t\ge 0}\) given by
$$\displaystyle \begin{aligned} {} \mathrm{d} Y_{t} = b(Y_t)\mathrm{d} t+ \sqrt{ a(Y_t)}\mathrm{d} W_{t},\qquad Y_0=y_0, \end{aligned} $$
(5)
for some d-dimensional Brownian motion W and some maps \(a:\mathbb {R}^d\to \mathbb {S}^d_+\) (where \(\mathbb {S}^d_+\) denotes the set of symmetric positive semidefinite \(d\times d\) matrices) and \(b:\mathbb {R}^d\to \mathbb {R}^d\). Set \(\sigma (Y_t):= \sqrt { a(Y_t)}\). Assume that \(Y=(Y_{t})_{t\ge 0}\) is a polynomial process in the sense of Definition 2.1 below and denote by \(\mathbb {Y}\) the corresponding signature.
Definition 2.1 (Polynomial Process)
A process \(Y=(Y_{t})_{t\ge 0}\) satisfying (5) is called a polynomial process if \(a_{ij}\) is a polynomial of degree at most 2 and \(b_j\) is a polynomial of degree at most 1 for each \(i,j\in \{1,\ldots ,d\}\).
Various representations of the conditional expected signature of Y  and analogous quantities, in particular for Brownian motion, can be found in [3, 5, 18, 32, 33]. We also refer to [15, 20] for the case of Lévy processes. Our approach aligns with [12] and is grounded in the classical theory of polynomial processes (see [9] and [19]). Although the framework of general signature SDEs considered in [12] requires results for infinite-dimensional stochastic processes (see, for example, [8, 14]), the current assumption of Y  being a polynomial process allows us to remain within the finite-dimensional setting.
Lemma 2.2
Let \((Y_t)_{t\geq 0}\) be a polynomial process satisfying (5). The corresponding drift and diffusion coefficients b and a can then be written as
$$\displaystyle \begin{aligned} b_j(y)=b_j^c+\sum_{k=1}^db_j^{k}y_k \qquad \mathit{\text{ and }}\qquad a_{ij}(y)=a_{ij}^c+\sum_{k=1}^da_{ij}^ky_k+\sum_{k,h=1}^da_{ij}^{kh}y_ky_h,\end{aligned}$$
for some \(b_j^c\), \(b_j^{k}\), \(a_{ij}^c\), \(a_{ij}^k\), \(a_{ij}^{kh}=a_{ij}^{hk}\in \mathbb {R}\), with \(i,j=1,\dots ,d\). Moreover,
$$\displaystyle \begin{aligned} b_j(Y_t)=\langle {\mathbf{b}}_j, \mathbb{Y}_t^{1}\rangle \qquad \mathit{\text{and}}\qquad a_{ij}(Y_t)=\langle {\mathbf{a}}_{ij}, \mathbb{Y}_t^{2}\rangle \end{aligned}$$
for
$$\displaystyle \begin{aligned} {\mathbf{b}}_j&=\Big(b_j^c+\sum_{k=1}^db_j^{k}Y_0^k\Big)e_\emptyset+\sum_{k=1}^db_j^{k}e_k,\\ {\mathbf{a}}_{ij}&=\Big(a_{ij}^c+\sum_{k=1}^da_{ij}^kY_0^k+\sum_{k,h=1}^da_{ij}^{kh}Y_0^kY_0^h\Big)e_\emptyset+\sum_{k=1}^d\Big(a_{ij}^k+2\sum_{h=1}^da_{ij}^{kh}Y_0^h\Big)e_k+\sum_{k,h=1}^da_{ij}^{kh}e_k\sqcup\!\sqcup e_h. \end{aligned}$$
Note that the upper index on \(Y_0^k\) and \(Y_0^h\) refers to the components of Y and not to powers.
Proof
The first representation follows by the definition of polynomial processes, according to which b and a are polynomials of degree at most 1 and 2, respectively. For the second representation it then suffices to note that \(\langle e_\emptyset ,\mathbb {Y}_t^1\rangle =\langle e_\emptyset ,\mathbb {Y}_t^2\rangle =1,\ \langle e_k,\mathbb {Y}_t^1\rangle = \langle e_k,\mathbb {Y}_t^2\rangle =(Y_t^k-Y_0^k),\) and \(\langle e_k\sqcup \!\sqcup e_h,\mathbb {Y}_t^2\rangle =(Y_t^k-Y_0^k)(Y_t^h-Y_0^h)\). □
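As a quick illustration of Lemma 2.2, consider our own worked example of a one-dimensional Ornstein-Uhlenbeck process with hypothetical parameters \(\kappa ,\theta \in \mathbb {R}\) and \(\sigma >0\), namely
$$\displaystyle \begin{aligned} \mathrm{d} Y_t=\kappa(\theta-Y_t)\mathrm{d} t+\sigma\mathrm{d} W_t, \qquad \text{so that}\qquad b_1(y)=\kappa\theta-\kappa y \qquad \text{and}\qquad a_{11}(y)=\sigma^2. \end{aligned}$$
In the notation of the lemma this corresponds to \(b_1^c=\kappa \theta \), \(b_1^1=-\kappa \), \(a_{11}^c=\sigma ^2\) and \(a_{11}^1=a_{11}^{11}=0\), which gives
$$\displaystyle \begin{aligned} {\mathbf{b}}_1=\kappa(\theta-Y_0)e_\emptyset-\kappa e_1 \qquad \text{and}\qquad {\mathbf{a}}_{11}=\sigma^2e_\emptyset. \end{aligned}$$
Indeed, \(\langle {\mathbf {b}}_1,\mathbb {Y}_t^1\rangle =\kappa (\theta -Y_0)-\kappa (Y_t-Y_0)=\kappa (\theta -Y_t)=b_1(Y_t)\).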
Lemma 2.3
Let \((Y_t)_{t\geq 0}\) be a polynomial process satisfying (5). Then the truncated signature \(( \mathbb {Y}_t^n)_{t\geq 0}\) is a polynomial process and for each \(|I|\leq n\) it holds that
$$\displaystyle \begin{aligned} \langle e_I, \mathbb{Y}_t^n\rangle =\int_0^t\langle Le_I, \mathbb{Y}_s^n\rangle\mathrm{d} s + \int_0^t \langle e_{I'}, \mathbb{Y}_s^n\rangle \sigma_{i_{|I|}}(Y_s) \mathrm{d} W_s, \end{aligned}$$
where \(L:T((\mathbb {R}^d))\to T((\mathbb {R}^d))\) satisfies \(L(T^{(n)}(\mathbb {R}^d))\subseteq T^{(n)}(\mathbb {R}^d)\) and is given by
$$\displaystyle \begin{aligned} {} Le_I=e_{I'}\sqcup\!\sqcup {\mathbf{b}}_{i_{|I|}}+\frac{1}{2}e_{I''}\sqcup\!\sqcup {\mathbf{a}}_{i_{|I|-1}i_{|I|}}, \end{aligned} $$
(6)
for \({\mathbf {b}}\) and \({\mathbf {a}}\) as in Lemma 2.2.
Proof
Let \(\sigma _j(Y_t)\) denote the j-th row of \(\sigma (Y_t)\). Using the definition of the signature, the Stratonovich integral and the shuffle property for each \(|I|\geq 0\) we can compute
$$\displaystyle \begin{aligned} \langle e_I,\mathbb{Y}_t\rangle&=\int_0^t\langle e_{I'},\mathbb{Y}_s\rangle\circ\mathrm{d} Y_s^{i_{|I|}}\\ &=\int_0^t\langle e_{I'},\mathbb{Y}_s\rangle\mathrm{d} Y_s^{i_{|I|}}+\frac 1 2\int_0^t\langle e_{I''},\mathbb{Y}_s\rangle\mathrm{d} [Y^{i_{|I|-1}},Y^{i_{|I|}}]_s\\ &=\int_0^t\langle e_{I'},\mathbb{Y}_s\rangle b_{i_{|I|}}(Y_s)\mathrm{d} s+\frac 1 2\int_0^t\langle e_{I''},\mathbb{Y}_s\rangle a_{i_{|I|-1}i_{|I|}}(Y_s)\mathrm{d} s+\int_0^t\langle e_{I'},\mathbb{Y}_s\rangle\sigma_{i_{|I|}}(Y_s)\mathrm{d} W_s\\ &=\int_0^t\Big\langle e_{I'}\sqcup\!\sqcup {\mathbf{b}}_{i_{|I|}}+\frac 1 2e_{I''}\sqcup\!\sqcup {\mathbf{a}}_{i_{|I|-1}i_{|I|}},\mathbb{Y}_s\Big\rangle\mathrm{d} s+\int_0^t\langle e_{I'},\mathbb{Y}_s\rangle\sigma_{i_{|I|}}(Y_s)\mathrm{d} W_s. \end{aligned}$$
Fix \(|I|\leq n\). Since \(e_I\sqcup \!\sqcup e_J\) is a linear combination of basis elements of length \(|I|+|J|\), we get that \(L(T^{(n)}(\mathbb {R}^d))\subseteq T^{(n)}(\mathbb {R}^d)\) and thus that the corresponding drift components are linear maps in \(\mathbb {Y}^n\). Similarly, fixing \(|I|,|J|\leq n\) and using that \({\mathbf {a}}_{ij}=\sum _{|H_1|,|H_2|\leq 1}\lambda _{ij}^{H_1H_2}e_{H_1}\sqcup \!\sqcup e_{H_2}\)
for some \(\lambda _{ij}^{H_1H_2}\in \mathbb {R}\) we get that
$$\displaystyle \begin{aligned} \langle e_{I'},\mathbb{Y}_s\rangle\sigma_{i_{|I|}}(Y_s)\big(\langle e_{J'},\mathbb{Y}_s\rangle\sigma_{j_{|J|}}(Y_s)\big)^\top&=\langle e_{I'},\mathbb{Y}_s\rangle\langle e_{J'},\mathbb{Y}_s\rangle\langle {\mathbf{a}}_{i_{|I|}j_{|J|}},\mathbb{Y}_s\rangle\\ &=\sum_{|H_1|,|H_2|\leq 1}\lambda_{i_{|I|}j_{|J|}}^{H_1H_2}\langle e_{I'}\sqcup\!\sqcup e_{H_1},\mathbb{Y}_s\rangle\langle e_{J'}\sqcup\!\sqcup e_{H_2},\mathbb{Y}_s\rangle. \end{aligned}$$
This shows that the components of the corresponding diffusion matrix are polynomials of degree 2 in \(\mathbb {Y}^n\). The polynomial property follows from Lemma 2.2 in [19]. □
As the linear operator L maps the finite-dimensional vector space \(T^{(n)}(\mathbb {R}^d)\) into itself, it can be represented by a matrix.
Definition 2.4
We call the operator L defined in (6) the dual operator corresponding to \(\mathbb {Y}\) and denote by G the \(d_n\)-dimensional matrix representative of L. Explicitly, for each \(|I|\leq n\) we consider coefficients \(\eta _{IJ}\in \mathbb {R}\) such that
$$\displaystyle \begin{aligned} Le_I=\sum_{|J|\leq n}\eta_{IJ}e_J, \end{aligned}$$
and fix an injective labeling function \({\mathscr {L}}:\{I\colon |I|\leq n\}\to \{1,\ldots ,d_n\}\). The matrix \(G\in \mathbb {R}^{d_n\times d_n}\) is then given by
$$\displaystyle \begin{aligned} {} G_{{\mathscr{L}}(J){\mathscr{L}}(I)}:=\eta_{IJ}. \end{aligned} $$
(7)
Note that using the notation of (4), for each \({\mathbf {u}}\in T^{(n)}(\mathbb {R}^d)\) it holds
$$\displaystyle \begin{aligned} {\mathbf{vec}}(L{\mathbf{u}})=G{\mathbf{vec}}({\mathbf{u}}).\end{aligned}$$
The results of the following theorem have been implemented and the code is available at the repository github.com/sarasvaluto/AffPolySig presented in [13].
Theorem 2.5
Let \((Y_t)_{t\geq 0}\) be a polynomial process satisfying (5), G be the \(d_n\)-dimensional matrix representative of the dual operator corresponding to \(\mathbb {Y}\), and \((\mathcal {F}_t)_{t\geq 0}\) be the filtration generated by \((Y_t)_{t\geq 0}\). Then for each \(T,t\geq 0\) and each \(|I|\leq n\) it holds
$$\displaystyle \begin{aligned} \mathbb{E}[ \langle e_I,\mathbb{Y}_{T+t}^n\rangle |\mathcal{F}_{T}]= \sum_{|J|\leq n} (e^{tG^\top})_{{\mathscr{L}}(I){\mathscr{L}}(J)}\langle e_J,\mathbb{Y}_T^n\rangle, \end{aligned}$$
or equivalently,
$$\displaystyle \begin{aligned} \mathbb{E}[ {\mathbf{vec}}(\mathbb{Y}_{T+t}^n) |\mathcal{F}_{T}]= e^{tG^\top}{\mathbf{vec}}(\mathbb{Y}_T^n),\end{aligned}$$
where \(e^{(\cdot )}\) denotes the matrix exponential.
Proof
Lemma 2.3 yields that \({\mathbf {vec}}(\mathbb {Y}^n)\) is a polynomial process and the claim follows by Theorem 3.1 in [19] for polynomials of degree 1. □
In the special case given by \(Y=W\) the matrix G is nilpotent and we obtain a more explicit representation. Loosely speaking, the signature element \(\langle e_I, \widehat {\mathbb {W}}_t \rangle \) has nonzero expectation only if all the indices \(i_j \neq 0\) in I are grouped in blocks of even length.
Corollary 2.6
Let W be a vector of d correlated Brownian motions with correlation matrix \(\rho \). Consider a multi-index I admitting the representation
$$\displaystyle \begin{aligned} {} e_{I}=e_{0}^{\otimes k_{0}}\otimes e_{J_{1}}\otimes e_{0}^{\otimes k_{1}}\otimes e_{J_{2}}\otimes\cdots\otimes e_{0}^{\otimes k_{m}}, \end{aligned} $$
(8)
for some \(J_i\in \{1,\dots ,d\}^{2h_{i}}\) and \(h_i,k_i\in \mathbb {N}_0\). Then,
$$\displaystyle \begin{aligned} {} \mathbb{E}[\langle e_{I},\widehat{\mathbb{W}}_{t}\rangle]=\frac{t^{\sum_{i=0}^{m}k_{i}+\sum_{i=1}^{m}h_{i}}}{\left(\sum_{i=0}^{m}k_{i}+\sum_{i=1}^{m}h_{i}\right)!}\bigg(\frac{1}{2}\bigg)^{\sum_{i=1}^{m}h_{i}}\prod_{i=1}^{m} \rho(J_{i}), \end{aligned} $$
(9)
where \(\rho (J):=\prod _{k=1}^{|J|/2} \rho _{j_{2k-1},j_{2k}}\). If I does not admit representation (8) then \(\mathbb {E}[\langle e_I,\widehat {\mathbb {W}}_t\rangle ]=0\). Additionally, for each \(s, t\geq 0\) it holds
$$\displaystyle \begin{aligned} \mathbb{E}[\langle e_{I}, \widehat{\mathbb{W}}_{s+t}\rangle|\mathcal{F}_s] =\sum_{e_{I_1}\otimes e_{I_2}=e_I}\langle e_{I_1}, \widehat{\mathbb{W}}_{s}\rangle \mathbb{E}[ \langle e_{I_2}, \widehat{\mathbb{W}}_{t}\rangle].\end{aligned}$$
Note that letting \(I_0\) be the number of 0s in I and setting \(I_{>0}:=|I|-I_0\) we can rewrite (9) as
$$\displaystyle \begin{aligned} \mathbb{E}[\langle e_{I},\widehat{\mathbb{W}}_{t}\rangle] =\frac{t^{I_{0}+I_{>0}/2}}{\left(I_0+I_{>0}/2\right)!}\bigg(\frac{1}{2}\bigg)^{\frac{I_{>0}}2}\prod_{i=1}^{m} \rho(J_{i}). \end{aligned}$$
Example 2.7
Note that in the special case where W is a d-dimensional (standard) Brownian motion we get that \(\rho _{ij}=1_{\{i=j\}}\). Corollary 2.6 then yields
$$\displaystyle \begin{aligned} \mathbb{E}[\langle e_{I},\widehat{\mathbb{W}}_{t}\rangle]=\frac{t^{I_{0}+I_{>0}/2}}{\left(I_0+I_{>0}/2\right)!}\bigg(\frac{1}{2}\bigg)^{\frac{I_{>0}}2}, \end{aligned}$$
for each I of the form \(e_I=e_{j_0}^{\otimes k_0}\otimes \cdots \otimes e_{j_m}^{\otimes k_m}\) with \(k_i\) even whenever \(j_i>0\).
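Formula (9) is simple enough to evaluate directly. The sketch below parses a word over \(\{0,1,\dots ,d\}\) into time letters and blocks of nonzero indices, returns 0 if some block has odd length, and otherwise applies (9). The function name `expected_sig_bm` and the representation of \(\rho \) as a nested list are assumptions made for this illustration.

```python
from math import factorial

def expected_sig_bm(word, t, rho):
    """E[<e_I, W-hat_t>] via formula (9); index 0 is time, 1..d are Brownian components.

    rho[i-1][j-1] is the correlation between components i and j.
    """
    n_zeros = word.count(0)
    n_pairs = 0
    corr = 1.0
    pos = 0
    while pos < len(word):
        if word[pos] == 0:
            pos += 1
            continue
        # Find the maximal block of nonzero indices starting at pos.
        end = pos
        while end < len(word) and word[end] != 0:
            end += 1
        if (end - pos) % 2 == 1:
            return 0.0   # odd block: the word does not admit representation (8)
        # Pair consecutive indices inside the block, as in the definition of rho(J).
        for k in range(pos, end, 2):
            corr *= rho[word[k] - 1][word[k + 1] - 1]
            n_pairs += 1
        pos = end
    m = n_zeros + n_pairs
    return t ** m / factorial(m) * 0.5 ** n_pairs * corr

rho = [[1.0, 0.3], [0.3, 1.0]]
print(expected_sig_bm((1, 1), 2.0, rho))
print(expected_sig_bm((0, 1, 2), 2.0, rho))
```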
In order to better understand Theorem 2.5 we present two examples. The first concerns a vector of correlated Brownian motions and constitutes the core of the proof of Corollary 2.6. The second one leads to the formulas that will be applied later on in the chapter.
Example 2.8
Let W be a d-dimensional Brownian motion and let \((X_t)_{t\geq 0}\) be the process given by
$$\displaystyle \begin{aligned} \mathrm{d} X_t=\sqrt{a(X_{t})}\mathrm{d} W_t,\qquad X_0=x_0, \end{aligned} $$
for some matrix a of the form \(a_{ij}(X_{t})=\sigma ^i\sigma ^j\rho _{ij}\) with constants \(\sigma ^i>0\) and \(\rho _{ij}\in [-1,1]\). Observe that \(\widehat {X}\) satisfies (5) in \(d+1\) dimensions for
$$\displaystyle \begin{aligned} b_j(\widehat{X}_t)=1_{\{j=0\}} \qquad \text{and}\qquad a_{ij}(\widehat{X}_t)=\sigma^i\sigma^j\rho_{ij}1_{\{i,j\neq0\}}.\end{aligned}$$
The corresponding \({\mathbf {b}}\) and \({\mathbf {a}}\) are given by \( {\mathbf {b}}_{j}=e_{\emptyset } 1_{\{j=0\}}\) and \( {\mathbf {a}}_{ij}=e_{\emptyset }\sigma ^i\sigma ^j\rho _{ij}1_{\{i,j\neq 0\}} \) and we thus get
$$\displaystyle \begin{aligned} Le_I&=e_{I'}1_{\{{i_{|I|}}=0\}}+\frac 1 2e_{I''}\sigma^{i_{|I|-1}}\sigma^{i_{|I|}}\rho_{{i_{|I|-1}}{i_{|I|}}}1_{\{{i_{|I|-1}},{i_{|I|}}\neq0\}}. \end{aligned} $$
This for instance implies that \(L(e_{1})=0\), \(L(e_I\otimes e_0)=e_I\sqcup \!\sqcup {\mathbf {b}}_0+\frac {1}{2}e_{I'}\sqcup \!\sqcup {\mathbf {a}}_{i_{|I|}0}=e_I\), and \(L(e_{0}\otimes e_{1}\otimes e_{2})=\frac {1}{2}e_{0}\sigma ^{1}\sigma ^{2}\rho _{12}\). Letting \((\mathcal {F}_t)_{t\geq 0}\) be the filtration generated by \((\widehat {X}_t)_{t\geq 0}\), by Theorem 2.5 we can thus conclude that
$$\displaystyle \begin{aligned} {} \mathbb{E}[ {\mathbf{vec}}(\widehat{\mathbb{X}}_{T+t}^n) |\mathcal{F}_{T}]= e^{tG^\top}{\mathbf{vec}}(\widehat{\mathbb{X}}_T^n), \end{aligned} $$
(10)
or equivalently,
$$\displaystyle \begin{aligned} {} \mathbb{E}[\langle e_{I}, \widehat{\mathbb{X}}_{T+t}^n \rangle|\mathcal{F}_{T}]=\sum_{|J|\leq n} (e^{tG^\top})_{{\mathscr{L}}(I){\mathscr{L}}(J)}\langle e_J,\widehat{\mathbb{X}}_T^n\rangle, \end{aligned} $$
(11)
where G denotes the \((d+1)_n\)-dimensional matrix representative of L.
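The computation in this example can be sketched in a few lines. The code below stores the coefficients \(\eta _{IJ}\) of \(Le_I\) in a dictionary and evaluates the exponential series applied to the signature at time 0 (which equals \(e_\emptyset \)); since L strictly shortens words here, the series stops after n terms. This is a self-contained illustration with hypothetical function names, not the implementation from github.com/sarasvaluto/AffPolySig.

```python
from itertools import product
from math import factorial

def L_coeffs(word, sigma, rho):
    """Coefficients eta_{IJ} of L e_I = sum_J eta_{IJ} e_J in Example 2.8, as {J: eta_IJ}."""
    out = {}
    if len(word) >= 1 and word[-1] == 0:
        # Drift part: e_{I'} if the last letter is the time index 0.
        out[word[:-1]] = out.get(word[:-1], 0.0) + 1.0
    if len(word) >= 2 and word[-1] != 0 and word[-2] != 0:
        # Diffusion part: (1/2) sigma^i sigma^j rho_ij e_{I''} for the last two letters i, j.
        i, j = word[-2], word[-1]
        out[word[:-2]] = (out.get(word[:-2], 0.0)
                          + 0.5 * sigma[i - 1] * sigma[j - 1] * rho[i - 1][j - 1])
    return out

def expected_signature(t, n, sigma, rho):
    """E[<e_I, X-hat_t>] for all |I| <= n via the (terminating) exponential series."""
    d = len(sigma)
    words = [w for k in range(n + 1) for w in product(range(d + 1), repeat=k)]
    eta = {w: L_coeffs(w, sigma, rho) for w in words}
    term = {w: 1.0 if w == () else 0.0 for w in words}   # signature at time 0
    total = dict(term)
    for k in range(1, n + 1):
        # term becomes the k-th power of the coefficient matrix applied to the initial vector.
        term = {w: sum(c * term[J] for J, c in eta[w].items()) for w in words}
        for w in words:
            total[w] += t ** k / factorial(k) * term[w]
    return total

E = expected_signature(2.0, 3, [1.0], [[1.0]])
print(E[(0,)], E[(1, 1)], E[(0, 1, 1)], E[(1, 0, 1)])
```

For a standard one-dimensional Brownian motion the output agrees with the closed-form expression of Example 2.7.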
Example 2.9
In a setting similar to that of the previous example, let W be a d-dimensional Brownian motion and set \(Z_t:=(X_t,B_t)\) for another one-dimensional Brownian motion \((B_t)_{t\geq 0}\). Suppose that
$$\displaystyle \begin{aligned} \mathrm{d} X_t=\operatorname{diag}(\kappa)(\theta-X_t) \mathrm{d} t+ \sqrt{a(X_{t})}\mathrm{d} W_t,\qquad X_0=x_0, \end{aligned} $$
where \(\kappa , \theta \in \mathbb {R}^d\) and \(\operatorname {diag}(\kappa )\) denotes a diagonal matrix consisting of the components of \(\kappa \) and a is as before. Observe that \(\widehat {Z}\) satisfies (5) in \(d+2\) dimensions for
$$\displaystyle \begin{aligned} b_j(\widehat{Z}_t)=1_{\{j=0\}}+\kappa^j(\theta^j-\widehat{Z}_t^j)1_{\{j\neq0\}} \qquad \text{and}\qquad a_{ij}(\widehat{Z}_t)=\sigma^i\sigma^j\rho_{ij}1_{\{i,j\neq0\}},\end{aligned}$$
where \(\kappa ^{d+1}:=0\), \(\sigma ^{d+1}:=1\), and \(\rho _{j(d+1)}\) is the correlation between \(X^j\) and B. The corresponding \({\mathbf {b}}\) is then given by \( {\mathbf {b}}_{j}=e_{\emptyset } (1_{\{j=0\}}+\kappa ^{j}(\theta ^{j}-\widehat {Z}_{0}^{j})1_{\{j\neq 0\}})-e_{j}\kappa ^{j}1_{\{j\neq 0\}}\) and we thus get
$$\displaystyle \begin{aligned} Le_{I}&=e_{I'}\Big(1_{\{i_{|I|}=0\}}+\kappa^{i_{|I|}}(\theta^{i_{|I|}}-\widehat{Z}_{0}^{i_{|I|}})1_{\{i_{|I|}\neq0\}}\Big)-(e_{I'}\mathbin{\sqcup\!\sqcup}e_{i_{|I|}})\kappa^{i_{|I|}}1_{\{i_{|I|}\neq0\}}\\ &\qquad +\frac 1 2e_{I''}\sigma^{i_{|I|-1}}\sigma^{i_{|I|}}\rho_{{i_{|I|-1}}{i_{|I|}}}1_{\{{i_{|I|-1}},{i_{|I|}}\neq0\}}. \end{aligned} $$

3 Linear Signature Models

Let us now introduce a framework for signature-based asset price models. We fix a time horizon \(T>0\), consider a d-dimensional continuous semimartingale \( X:=( X^1,\ldots , X^{d})\) and its matrix-valued quadratic covariation \([X]\). We suppose that \( X\) encodes all the information to represent the market asset S. Recall that for notational convenience we assume that S is one-dimensional.
Furthermore, we suppose that X has some tractability properties, which are made precise below and which are satisfied, for instance, by a d-dimensional Brownian motion and more generally by all polynomial diffusion processes.

3.1 Definition and First Properties

We regard the continuous semimartingale X as the primary process, being the main modeling building block. Its time extension is denoted by
$$\displaystyle \begin{aligned} \widehat{X}_t:=(t, X_t).\end{aligned}$$
We use \(e_0\) for the component of \(\widehat {X}\) corresponding to time and \(e_k\) for its component corresponding to \( X^k\). Its signature is denoted by \(\widehat {\mathbb {X}}_{t}\).
Throughout the section we make the following standing assumption on the diffusion matrix of X.
For all \(i,j\in \{1,\ldots , d\}\) it holds
$$\displaystyle \begin{aligned} {} \mathrm{d}[ X^i, X^j]_t=\sum_{ |I|\leq m} a_{ij}^I\langle e_I,\widehat{\mathbb{X}}_t\rangle \mathrm{d} t \end{aligned} $$
(12)
for some \(m\in \mathbb {N}\). Observe that this assumption is always satisfied when X is given by a polynomial process.
Remark 3.1
It could be of interest to consider the extension \(\widehat {X}_t:=(t, X_t,[ X]_t)\), which includes the \(d^2\)-dimensional process given by the quadratic covariation of the primary process X. In this case the components of the diffusion matrix do not need to be linear functions of the time extended signature, but could for instance be general path-dependent functionals. Moreover, the map \(t \mapsto [X]_t\) does not need to be absolutely continuous with respect to the Lebesgue measure, hence X does not need to be an Itô-semimartingale. For ease of exposition we focus here on the simple time extension and refer to [11] for the slightly more general setup.
Our goal consists in approximating the dynamics of \( S\) with a signature model.
Definition 3.2 (Signature Model)
A signature model is a stochastic process of the form
$$\displaystyle \begin{aligned} {} S_n(\ell)_t:=\ell_{\emptyset} +\sum_{0<|I|\leq n} \ell_I \langle e_I,\widehat{\mathbb{X}}_t\rangle, \end{aligned} $$
(13)
where \(n\in \mathbb {N}\) and \(\ell :=\{\ell _\emptyset , \ell _I\colon 0<|I|\leq n\}\).
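Once the signature of the time-extended primary process is available, evaluating a signature model is a single scalar product. The following minimal sketch assumes that the path is observed on a grid and interpolated linearly, so the iterated integrals are those of the piecewise-linear path; this is a discretization choice made here and not part of the definition. The function names and the encoding of \(\ell\) as a dictionary from words to coefficients are illustrative.

```python
from itertools import product
from math import factorial, prod

def truncated_signature(path, n):
    """Signature up to level n of the piecewise-linear path through `path`
    (a list of points, each a tuple of coordinates), as {word: value}."""
    dim = len(path[0])
    all_words = [w for k in range(n + 1) for w in product(range(dim), repeat=k)]
    sig = {w: (1.0 if not w else 0.0) for w in all_words}
    for p, q in zip(path, path[1:]):
        delta = [b - a for a, b in zip(p, q)]
        # one linear segment: <e_v, exp(delta)> = prod(delta_v) / |v|!
        seg = {w: prod(delta[i] for i in w) / factorial(len(w)) for w in all_words}
        # Chen's identity: concatenate the path so far with the new segment
        sig = {w: sum(sig[w[:k]] * seg[w[k:]] for k in range(len(w) + 1))
               for w in all_words}
    return sig

def signature_model(ell, sig):
    """S_n(ell)_t = sum_I ell_I <e_I, X_t>; the empty word () carries ell_emptyset."""
    return sum(coeff * sig[word] for word, coeff in ell.items())

# Time-extended path (t, X_t) on three grid points; coordinate 0 is time
path = [(0.0, 0.0), (1.0, 2.0), (2.0, 1.0)]
sig = truncated_signature(path, n=2)
ell = {(): 100.0, (1,): 2.0, (0, 1): 1.0}
print(signature_model(ell, sig))
```

Because `sig` does not depend on `ell`, recalibrating the parameters only repeats the final scalar product.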
Remark 3.3
By Proposition 3.4 below, the class of Sig-SDE models considered in [37] can be embedded in our framework by choosing a properly extended one-dimensional Brownian motion as primary process.
In the following we list several important properties which make signature models a tractable framework for stochastic finance.
  • For each \(t \in [0,T]\), \( S_n(\ell )_t\) is linear in \(\widehat {\mathbb {X}}_t\). This in particular implies that having precomputed \(\widehat {\mathbb {X}}\) an update of the parameters \(\ell \) boils down to evaluating the scalar product in (13).
  • The quadratic variation of processes of form (13) is again of the form (13).
  • Representation (13) remains invariant under polynomial transformations.
  • Itô-integrals of processes of form (13) with respect to processes of form (13) are again processes of form (13). This includes in particular the signature \(\widehat {\mathbb {S}}_n(\ell )\) of \(\widehat {S}_n(\ell )_t:=(t, S_n(\ell )_t)\) or expressions of the form \(\int _0^\cdot S_n( \ell )_s \mathrm {d} X_s^i.\)
  • The latter point implies that the expected signature of \(\widehat {S}_n(\ell )\) can be expressed as
    $$\displaystyle \begin{aligned} \mathbb{E}[\langle e_J,\widehat{\mathbb{S}}_n(\ell)_t\rangle]=P_J(\ell,\mathbb{E}[\widehat{\mathbb{X}}_t]),\end{aligned}$$
    for some \(P_J\) such that \(P_J(\cdot ,\mathbb {E}[\widehat {\mathbb {X}}_t])\) is a polynomial of degree \(|J|\) and \(P_J(\ell , \cdot )\) is a linear map for each \(\ell \) (see Theorem 3.9 and Remark 3.11 for more details).
  • Due to the universal approximation theorem (see e.g. [7, 15]) this yields approximations
    $$\displaystyle \begin{aligned} \mathbb{E}\big[f\big((\widehat{\mathbb{S}}^2_n(\ell)_t)_{t\in[0,T]}\big)\big]\approx P_f(\ell,\mathbb{E}[\widehat{\mathbb{X}}_T])\end{aligned}$$
    for each map f, which is continuous in a suitable sense, where \(P_f\) is given by a finite linear combination of maps \(P_J\) as above and with \(\widehat {\mathbb {S}}^2\) we denote here the second order lift. This includes representations for
    $$\displaystyle \begin{aligned} \mathbb{E}\big[\tilde f(\widehat{S}_n(\ell)_T)\big]\text{\qquad and \qquad } \mathbb{E}\bigg[\tilde f\bigg(\int_0^T\widehat{S}_n(\ell)_t \mathrm{d} t\bigg)\bigg]\end{aligned}$$
    for maps \(\tilde f\) being payoff functions and where the expectation is taken with respect to a risk-neutral measure.
We now prove several representation results for the signature model of form (13).
Proposition 3.4
Fix \(n\in \mathbb {N}\). Then, there is a one-to-one correspondence between representations of the form (13) and representations of the form
$$\displaystyle \begin{aligned} S_n(\ell)_t&=\ell_{\emptyset}+\int_0^t\Big(\ell^0_\emptyset+\sum_{0<|I|\leq n-1} \ell^0_I \langle e_I,\widehat{\mathbb{X}}_s\rangle\Big) \mathrm{d} s\\ &\qquad +\sum_{k=1}^{d}\int_0^t\Big(\ell^{k}_\emptyset+\sum_{0<|I|\leq n-1} \ell^{k}_I \langle e_I,\widehat{\mathbb{X}}_s\rangle \Big)\mathrm{d} X_s^k, \end{aligned} $$
for \(\ell :=\{\ell _\emptyset , \ell _I^{k}\colon |I|\leq n-1\mathit{\text{ and }} k\in \{0,\ldots ,d\}\}.\)
Before presenting the proof, let us formulate the following technical lemma.
Lemma 3.5
For each \(|I|>0\) it holds
$$\displaystyle \begin{aligned} \int_0^t \langle e_I,\widehat{\mathbb{X}}_s\rangle \mathrm{d} s &= \langle e_I\otimes e_0,\widehat{\mathbb{X}}_t\rangle,\mathit{\text{\qquad and\qquad }} \int_0^t \langle e_I,\widehat{\mathbb{X}}_s\rangle \mathrm{d} X_s^k= \langle \tilde e_I^k,\widehat{\mathbb{X}}_t\rangle, \end{aligned} $$
where
$$\displaystyle \begin{aligned} \tilde e_I^k:=e_I\otimes e_k-\sum_{|J|\leq m}\frac{a_{i_{|I|}k}^J}{2}(e_{I'}\mathbin{\sqcup\!\sqcup}e_J)\otimes e_0. \end{aligned} $$
Proof
The representation of \(\int _0^t \langle e_I,\widehat {\mathbb {X}}_s\rangle \mathrm {d} s\) follows from the definition of the signature. We proceed with the proof of the representation of \(\int _0^t \langle e_I,\widehat {\mathbb {X}}_s\rangle \mathrm {d} X_s^k\). For \(I=\emptyset \) the claim is immediate. For \(|I|>0\), by the definition of the Stratonovich integral,
$$\displaystyle \begin{aligned} \int_0^t\langle e_{I},\widehat{\mathbb{X}}_s\rangle \mathrm{d} X^k_s &=\int_0^t\langle e_{I},\widehat{\mathbb{X}}_s\rangle \circ \mathrm{d} X^k_s -\frac 1 2 [\langle e_{I},\widehat{\mathbb{X}}\rangle,X^k]_t\\ &=\langle e_{I}\otimes e_k,\widehat{\mathbb{X}}_t\rangle -\frac 1 2 \int_0^t\langle e_{I'},\widehat{\mathbb{X}}_s\rangle \mathrm{d} [X^{i_{|I|}},X^k]_s. \end{aligned} $$
The shuffle property and the definition of the signature yield then
$$\displaystyle \begin{aligned} \int_0^t\langle e_{I},\widehat{\mathbb{X}}_s\rangle \mathrm{d} [X^{k_1},X^{k_2}]_s &=\int_0^t\sum_{|J|\leq m}a_{k_1k_2}^J\langle e_{I},\widehat{\mathbb{X}}_s\rangle\langle e_{J},\widehat{\mathbb{X}}_s\rangle \mathrm{d} s\\ &=\sum_{|J|\leq m}a_{k_1k_2}^J\langle (e_{I}\mathbin{\sqcup\!\sqcup}e_J)\otimes e_0,\widehat{\mathbb{X}}_t\rangle, \end{aligned} $$
and the claim follows. □
Suppose that \( X\) is a vector of correlated Brownian motions with correlation matrix \(\rho \). Then, the transformation introduced in Lemma 3.5 reads
$$\displaystyle \begin{aligned} \tilde e^k_\emptyset=e_{k}\qquad \text{and}\qquad \tilde e^k_I=e_{I}\otimes e_{k} -\frac {\rho_{i_{|I|},k}} 2 {\boldsymbol 1_{\{i_{|I|}\neq 0\}}}e_{I'}\otimes e_0,\end{aligned}$$
for each \(|I|>0\).
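The transformation for correlated Brownian motions is easy to implement on words. The sketch below encodes a word as a tuple of letters (0 for time, 1 to d for the Brownian components) and returns \(\tilde e^k_I\) as a dictionary from words to coefficients; this encoding and the function name are assumptions made for illustration.

```python
def tilde_e(word, k, rho):
    """Ito-integral word for correlated Brownian motions:
    tilde e^k_I = e_I (x) e_k - (rho_{i_|I|,k} / 2) 1_{i_|I| != 0} e_{I'} (x) e_0,
    returned as {word: coefficient}. `rho` is the d x d correlation matrix."""
    out = {word + (k,): 1.0}
    if word and word[-1] != 0:
        # Ito-Stratonovich correction: drop the last letter and append time
        corr = rho[word[-1] - 1][k - 1]
        key = word[:-1] + (0,)
        out[key] = out.get(key, 0.0) - 0.5 * corr
    return out

# d = 1: int_0^t W dW = <e_1 (x) e_1, X_t> - (1/2) <e_0, X_t>
print(tilde_e((1,), 1, [[1.0]]))
```

For \(d=1\) and \(I=(1)\) this recovers the usual Itô formula \(\int_0^t W_s\,\mathrm{d}W_s=W_t^2/2-t/2\), since \(\langle e_1\otimes e_1,\widehat{\mathbb{X}}_t\rangle=W_t^2/2\) and \(\langle e_0,\widehat{\mathbb{X}}_t\rangle=t\).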
We are now ready to provide the proof of Proposition 3.4.
Proof of Proposition 3.4
Let \( S_n(\ell )\) be as in the statement of the proposition. By Lemma 3.5 it holds
$$\displaystyle \begin{aligned} S_n(\ell)_t&=\ell_{\emptyset}+\Big(\ell^0_\emptyset\langle e_0,\widehat{\mathbb{X}}_t\rangle+\sum_{0<|I|\leq n-1} \ell^0_I \langle e_I\otimes e_0,\widehat{\mathbb{X}}_t\rangle\Big) \\ &\qquad +\sum_{k=1}^{d}\Big(\ell^{k}_\emptyset\langle \tilde e ^k_\emptyset,\widehat{\mathbb{X}}_t\rangle+\sum_{0<|I|\leq n-1} \ell^{k}_I \langle \tilde e^k_I,\widehat{\mathbb{X}}_t\rangle \Big), \end{aligned} $$
which is of the form given by (13). Conversely, by Lemma 3.5 we also have
$$\displaystyle \begin{aligned} \langle e_I\otimes e_0,\widehat{\mathbb{X}}_t\rangle&=\int_0^t\langle e_I,\widehat{\mathbb{X}}_s\rangle \mathrm{d} s,\\ \langle e_I\otimes e_k,\widehat{\mathbb{X}}_t\rangle&=\int_0^t\langle e_I,\widehat{\mathbb{X}}_s\rangle \mathrm{d} X_s^k+\frac 1 2\sum_{|J|\leq m}\int_0^t\langle a_{i_{|I|}k}^J(e_{I'}\mathbin{\sqcup\!\sqcup}e_J),\widehat{\mathbb{X}}_s\rangle \mathrm{d} s, \end{aligned} $$
yielding the claim. □

3.2 Absence of Arbitrage and Universality Properties

The results obtained in the representation above will enable us to establish conditions ensuring absence of arbitrage. More precisely, we shall make the assumption that the principle of “no free lunch with vanishing risk” [16] holds. This assumption is equivalent to the existence of an equivalent local martingale measure \(\mathbb {Q}\) as \(S_{n}(\ell )\) has continuous sample paths. Throughout this section, we will assume zero interest rates and that the asset has already been discounted. To formulate a precise condition for the absence of arbitrage, we rely on the following corollary, which is a direct consequence of Proposition 3.4 and Lemma 3.5.
Corollary 3.6
Suppose that \( X\) is a local martingale. Then \( S_{n}(\ell )\) is a local martingale if and only if it admits a representation of the form
$$\displaystyle \begin{aligned} {} S_{n}(\ell)_t=\ell_{\emptyset} +\sum_{k=1}^{D}\bigg(\ell_\emptyset^k\langle e_k, \widehat{\mathbb{X}}_t\rangle+\sum_{0<|I|\leq n-1} \ell^{k}_I \langle \tilde e^k_I,\widehat{\mathbb{X}}_t\rangle\bigg), \end{aligned} $$
(14)
for some \(D\in \{1,\ldots , d\}\) and \(\ell :=\{\ell _\emptyset , \ell _I^{k}\colon 0\leq |I|\leq n-1\mathit{\text{ and }} k\in \{1,\ldots ,d\}\}.\)
Proof
Since X is a local martingale, \(S_n(\ell )\) in the representation of Proposition 3.4 is a local martingale if and only if all integrals with respect to time and the quadratic variation process vanish. This means that \(S_n(\ell )\) is of form
$$\displaystyle \begin{aligned} {} S_n(\ell)_t=\ell_{\emptyset}+\sum_{k=1}^{d}\int_0^t\Big(\ell^{k}_\emptyset+\sum_{0<|I|\leq n-1} \ell^{k}_I \langle e_I,\widehat{\mathbb{X}}_s\rangle \Big)\mathrm{d} X_s^k \end{aligned} $$
(15)
and Lemma 3.5 yields the assertion. □
Let us now formulate sufficient no-arbitrage conditions.
Corollary 3.7
Suppose that there is an equivalent measure \(\mathbb {Q} \sim \mathbb {P}\) such that X is a local \(\mathbb {Q}\)-martingale. Then the following holds.
(i)
The model \(S_{n}(\ell )\) is free of arbitrage if it admits a representation as in (14).
 
(ii)
If \(\mathbb {Q}\) is an equivalent local martingale measure for \(S_{n}(\ell )\), then \(S_{n}(\ell )\) is necessarily of form (14).
 
Proof
The first assertion is a direct consequence of Corollary 3.6, as form (14) implies that \(S_{n}(\ell )\) is a local \(\mathbb {Q}\)-martingale and thus \(\mathbb {Q}\) is an equivalent local martingale measure. The assumption of the second assertion implies that \(S_{n}(\ell )\) is a local martingale under \(\mathbb {Q}\), whence by Corollary 3.6 it has to be of form (14). □
Remark 3.8
(i)
It is important to recognize that \(S_{n}(\ell )\) could be free of arbitrage without being of the form in (14). This scenario arises when the primary process X is not a local martingale under any of the equivalent local martingale measures.
 
(ii)
Additionally, note that if \(S_n(\ell )\) is of the form (14) and a local martingale under \(\mathbb {Q}\), it does not necessarily imply that X is a local \(\mathbb {Q}\)-martingale. This situation arises when the drift terms in (15) cancel each other out. It is worth mentioning that in the one-dimensional case (\(d=1\)), such a scenario is impossible, and consequently, X is inevitably a local \(\mathbb {Q}\)-martingale. Moreover, \(\mathbb {Q}\) is unique and the model thus complete.
 

3.3 The Expected Signature of \(S_n(\ell )\)

For the pricing of so-called sig-payoffs discussed in Sect. 3.4, it is crucial to be able to compute the expected signature of \(\widehat {S}_n(\ell )_t=(t,S_n(\ell )_{t})\). In this section, we present formulas that link this computation to the calculation of the expected signature of \(\widehat {X}_t\), which can often be explicitly computed. Specifically, when X is a Brownian motion, this is a well-known result (refer to, e.g., [18]). For the case where X is a polynomial process, the computation can be carried out using polynomial processes techniques (see Sect. 2). In a general setting, it can be computed by solving an infinite-dimensional system of linear PDEs corresponding to the Kolmogorov forward equation of the signature process (see [36]). For a comprehensive treatment of signature cumulants, i.e., the logarithm of the expected signature, we direct the readers to [21].
Theorem 3.9
Fix \(n\in \mathbb {N}\), a multi-index J, and \(D\in \{1,\ldots , d\}\), and denote by \(\widehat {\mathbb {S}}_n(\ell )_{t}\) the signature of \(\widehat {S}_n(\ell )_{t}\). Let \(e_0\) be the component of \(\widehat {S}_n(\ell )\) corresponding to time and \(e_1\) its component corresponding to \(S_n(\ell )\). Define \(e(\emptyset ,\ell ):=\tilde e(\emptyset ,\ell ):=e_\emptyset \) and
$$\displaystyle \begin{aligned} e(J,\ell)&:=\mathop{\widetilde{\sqcup\!\sqcup}}\limits_{i=1}^{|J|}\bigg(e_0{\boldsymbol 1_{\{j_i=0\}}}+\Big(\sum_{0<|I|\leq n}\ell_{I}e_{I}\Big){\boldsymbol 1_{\{j_i=1\}}}\bigg),\\ \tilde e(J,\ell)&:=\mathop{\widetilde{\sqcup\!\sqcup}}\limits_{i=1}^{|J|}\bigg(e_0{\boldsymbol 1_{\{j_i=0\}}}+\sum_{k=1}^{D}\Big(\ell^k_{\emptyset}e_{k}+\sum_{0<|I|\leq n-1}\ell^k_{I}\tilde e^k_{I}\Big){\boldsymbol 1_{\{j_i=1\}}}\bigg), \end{aligned} $$
for \(|J|>0\), with \(\widetilde{\sqcup\!\sqcup}\) the half-shuffle introduced in Definition 1.1. Then the following representations hold.
  • \( \langle e_{J},{\widehat {\mathbb {S}}_n(\ell )}_{t}\rangle =\langle e(J,\ell ),\widehat {\mathbb {X}}_{t}\rangle \), if \(S_n(\ell )\) is given by (13), and
  • \(\langle e_{J},{\widehat {\mathbb {S}}_n(\ell )}_{t}\rangle =\langle \tilde e(J,\ell ),\widehat {\mathbb {X}}_{t}\rangle \), if \( S_{n}(\ell )\) is given as in Corollary 3.6.
Proof
We proceed by induction to prove the claim. Fix \(S_n(\ell )\) as in (13). For \(J=\emptyset \) the claim follows by the definition of the signature. Suppose the claim holds true for each J such that \(|J|=m-1\), and fix J with \(|J|=m\). Then
$$\displaystyle \begin{aligned} \langle e_{J}, \widehat{\mathbb{S}}_n(\ell)_t\rangle&=\int_{0}^{t}\langle e_{J'},\widehat{\mathbb{S}}_n(\ell)_{s}\rangle\circ {\mathrm{d}}\langle e_{j_{m}}, \widehat{S}_n(\ell)_s\rangle\\ &=\int_{0}^{t}\langle e_{J'},\widehat{\mathbb{S}}_n(\ell)_{s}\rangle\circ{\mathrm{d}}\langle e_0{\boldsymbol 1_{\{j_m=0\}}} +\Big(\sum_{0<\lvert I\lvert\le n}\ell_{I}e_{I}\Big){\boldsymbol 1_{\{j_m=1\}}},\widehat{\mathbb{X}}_s\rangle\\ &={\boldsymbol 1_{\{j_m=0\}}}\int_{0}^{t}\langle e_{J'},\widehat{\mathbb{S}}_n(\ell)_{s}\rangle\circ{\mathrm{d}}\langle e_0,\widehat{\mathbb{X}}_s\rangle\\ &\qquad +{\boldsymbol 1_{\{j_m=1\}}}\sum_{0<\lvert I\lvert\le n}\ell_{I}\int_{0}^{t}\langle e_{J'},\widehat{\mathbb{S}}_n(\ell)_{s}\rangle\circ{\mathrm{d}} \langle e_{I},\widehat{\mathbb{X}}_s\rangle. \end{aligned} $$
The induction hypothesis and Remark 1.2 yield the first claim and the second one is analogous. □
Remark 3.10
The linear combinations of multiindices given by \(e(J,\ell )\) might look very abstract and we thus provide some additional details. The intuition behind its construction is as follows. For each 0 in J we are integrating with respect to time and we thus add \(\mathbin{\widetilde{\sqcup\!\sqcup}}\,e_0\) to the current linear combination of multiindices. For each 1 in J we are instead integrating with respect to \(S_n(\ell )\) and we thus need to add \(\mathbin{\widetilde{\sqcup\!\sqcup}}\big(\sum_{0<|I|\leq n}\ell_{I}e_{I}\big)\). Since integrating with respect to \(S_n(\ell )\) corresponds to integrating with respect to \(S_n(\ell )-S_n(\ell )_0\) the term \(\ell _{\emptyset }e_{\emptyset }\) can be omitted. Choosing \(J=(0,1)\) and \(J=(1,1)\) we would for instance get
$$\displaystyle \begin{aligned} e((0,1),\ell)&=e_0\mathbin{\widetilde{\sqcup\!\sqcup}}\sum_{0<|I|\leq n}\ell_{I}e_{I}=\sum_{0<|I|\leq n}\ell_{I}\,e_0\mathbin{\widetilde{\sqcup\!\sqcup}}e_{I},\\ e((1,1),\ell)&=\Big(\sum_{0<|I_1|\leq n}\ell_{I_1}e_{I_1}\Big)\mathbin{\widetilde{\sqcup\!\sqcup}}\sum_{0<|I_2|\leq n}\ell_{I_2}e_{I_2}=\sum_{0<|I_1|,|I_2|\leq n}\ell_{I_1}\ell_{I_2}\,e_{I_1}\mathbin{\widetilde{\sqcup\!\sqcup}}e_{I_2}. \end{aligned} $$
The intuition behind \(\tilde e(J,\ell )\) is similar.
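The construction of \(e(J,\ell )\) can be sketched in a few lines. The code below assumes the common convention that the half-shuffle of two words is \((e_I\sqcup\!\sqcup e_{J'})\otimes e_{j_{|J|}}\), so that it encodes the Stratonovich integral \(\int \langle e_I,\widehat{\mathbb{X}}_s\rangle \circ \mathrm{d}\langle e_J,\widehat{\mathbb{X}}_s\rangle\); Definition 1.1 is not reproduced in this section, so this convention is an assumption, as are the function names and the encoding of linear combinations of words as dictionaries.

```python
def shuffle(u, v):
    """Shuffle product of two words (tuples), returned as {word: multiplicity}."""
    if not u:
        return {v: 1}
    if not v:
        return {u: 1}
    out = {}
    # the last letter of a shuffled word comes either from u or from v
    for w, c in shuffle(u[:-1], v).items():
        out[w + (u[-1],)] = out.get(w + (u[-1],), 0) + c
    for w, c in shuffle(u, v[:-1]).items():
        out[w + (v[-1],)] = out.get(w + (v[-1],), 0) + c
    return out

def half_shuffle(a, b):
    """Bilinear extension of (u, v) -> (u shuffle v') (x) v_last for two
    linear combinations a, b given as {word: coeff}; words in b are non-empty."""
    out = {}
    for u, cu in a.items():
        for v, cv in b.items():
            for w, mult in shuffle(u, v[:-1]).items():
                key = w + (v[-1],)
                out[key] = out.get(key, 0.0) + cu * cv * mult
    return out

def e_of_J(J, ell):
    """e(J, ell) for a model of form (13); the empty word in `ell` is ignored."""
    factor = {0: {(0,): 1.0}, 1: {w: c for w, c in ell.items() if w}}
    acc = {(): 1.0}
    for j in J:  # each letter of J appends one integration
        acc = half_shuffle(acc, factor[j])
    return acc

print(e_of_J((0, 1), {(1,): 2.0}))
print(e_of_J((1, 1), {(1, 1): 1.0}))
```

As a sanity check under this convention, for \(d=1\), \(X_0=0\) and \(S_n(\ell )-S_n(\ell )_0=\langle e_1\otimes e_1,\widehat{\mathbb{X}}\rangle =X^2/2\), the sketch returns \(e((1,1),\ell )=3\,e_1\otimes e_1\otimes e_1\otimes e_1\), which matches \((X^2/2)^2/2=3X^4/24\).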
Remark 3.11
(i)
Let \( S_n(\ell )\) be as in (13) and observe that
$$\displaystyle \begin{aligned} e(J,\ell)=\mathop{\widetilde{\sqcup\!\sqcup}}\limits_{i=1}^{|J|}\sum_{0<|I|\leq n}\Big(e_{I}{\boldsymbol 1_{\{j_i=0\}}}{\boldsymbol 1_{\{I=(0)\}}}+\ell_{I}e_{I}{\boldsymbol 1_{\{j_i=1\}}}\Big). \end{aligned} $$
Setting \(c(j,I,\ell ):={\boldsymbol 1_{\{j=0\}}}{\boldsymbol 1_{\{I=(0)\}}}+\ell _{I}{\boldsymbol 1_{\{j=1\}}}\) we thus obtain that
$$\displaystyle \begin{aligned} \mathbb{E}[\langle e_{J},\widehat{\mathbb{S}}_n(\ell)_{t}\rangle]&=\mathbb{E}[\langle e(J,\ell),\widehat{\mathbb{X}}_{t}\rangle]\\ &=\sum_{|I_1|,\ldots,|I_{|J|}|=1}^{n}\mathbb{E}\Big[\Big\langle \mathop{\widetilde{\sqcup\!\sqcup}}\limits_{i=1}^{|J|}e_{I_i},\widehat{\mathbb{X}}_{t}\Big\rangle\Big]\prod_{i=1}^{|J|}c(j_i,I_i,\ell). \end{aligned} $$
Although this representation may seem intricate at first glance, it is, in fact, quite convenient as the expectations \(\mathbb{E}\big[\big\langle \mathop{\widetilde{\sqcup\!\sqcup}}_{i=1}^{|J|}e_{I_i},\widehat{\mathbb{X}}_{t}\big\rangle\big]\) can be computed just once in advance. Since \(c(j,I,\cdot )\) is affine, we also immediately obtain that the map
$$\displaystyle \begin{aligned} (\ell,\mathbb{E}[\widehat{\mathbb{X}}_t])\mapsto P_J(\ell,\mathbb{E}[\widehat{\mathbb{X}}_t]):=\mathbb{E}[\langle e_{J},{\widehat{\mathbb{S}}_n(\ell)}_{t}\rangle]\end{aligned}$$
is polynomial of degree \(|J|\) in its first argument and linear in the second one.
 
(ii)
Similarly, let \( S_{n}(\ell )\) be as in Corollary 3.6, set \(\tilde e_\emptyset ^k:=e_k\), and observe that
$$\displaystyle \begin{aligned} \tilde e(J,\ell)=\mathop{\widetilde{\sqcup\!\sqcup}}\limits_{i=1}^{|J|}\sum_{k=1}^{D}\sum_{0\leq|I|\leq n-1}\Big(\tilde e^k_{I}{\boldsymbol 1_{\{j_i=0\}}}{\boldsymbol 1_{\{I=(0)\}}}{\boldsymbol 1_{\{k=1\}}}+\ell^k_{I}\tilde e^k_{I}{\boldsymbol 1_{\{j_i=1\}}}\Big). \end{aligned} $$
Setting \(\tilde c(j,I,k,\ell ):={\boldsymbol 1_{\{j=0\}}}{\boldsymbol 1_{\{I=(0)\}}}{\boldsymbol 1_{\{k=1\}}}+\ell _{I}^k {\boldsymbol 1_{\{j=1\}}}\) yields
$$\displaystyle \begin{aligned} \mathbb{E}[\langle e_{J},\widehat{\mathbb{S}}_n(\ell)_{t}\rangle]=\sum_{k_1,\ldots,k_{|J|}=1}^{D}\ \sum_{|I_1|,\ldots,|I_{|J|}|=0}^{n-1}\mathbb{E}\Big[\Big\langle \mathop{\widetilde{\sqcup\!\sqcup}}\limits_{i=1}^{|J|}\tilde e^{k_i}_{I_i},\widehat{\mathbb{X}}_{t}\Big\rangle\Big]\prod_{i=1}^{|J|}\tilde c(j_i,I_i,k_i,\ell). \end{aligned} $$
We obtain that the map
$$\displaystyle \begin{aligned} (\ell,\mathbb{E}[\widehat{\mathbb{X}}_t])\mapsto \widetilde P_J(\ell,\mathbb{E}[\widehat{\mathbb{X}}_t]):=\mathbb{E}[\langle e_{J},{\widehat{\mathbb{S}}_{n}(\ell)}_{t}\rangle]\end{aligned}$$
is polynomial of degree \(|J|\) in its first argument and linear in the second one.
 
The expressions above also provide a formula of the variance of \( S_{n}(\ell )\).
Corollary 3.12
Let \( S_{n}(\ell )\) be as in Corollary 3.6 and assume that it is a true martingale. Then
$$\displaystyle \begin{aligned} {} \mbox{Var}(S_{n}(\ell)_{t})=2\sum_{k_1,k_2=1}^{D}\ \sum_{|I_1|,|I_2|=0}^{n-1}\mathbb{E}[\langle \tilde e^{k_1}_{I_1}\mathbin{\widetilde{\sqcup\!\sqcup}}\tilde e^{k_2}_{I_2},\widehat{\mathbb{X}}_{t}\rangle]\,\ell^{k_1}_{I_1}\ell^{k_2}_{I_2}. \end{aligned} $$
(16)
Proof
The martingale property guarantees that \(\mathbb {E}[S_{n}(\ell )_{t}]=S_{n}(\ell )_0\) and hence, by the shuffle product, \(\mbox{Var} (S_{n}(\ell )_{t})=2\mathbb {E}[\langle e_{1}\otimes e_{1},\widehat {\mathbb {S}}_{n}(\ell )_{t}\rangle ]\). By Remark 3.11 we can conclude that
$$\displaystyle \begin{aligned} \mbox{Var}(S_{n}(\ell)_{t})=2\sum_{k_1,k_2=1}^{D}\ \sum_{|I_1|,|I_2|=0}^{n-1}\mathbb{E}[\langle \tilde e^{k_1}_{I_1}\mathbin{\widetilde{\sqcup\!\sqcup}}\tilde e^{k_2}_{I_2},\widehat{\mathbb{X}}_{t}\rangle]\,\tilde c(1,I_1,k_1,\ell)\,\tilde c(1,I_2,k_2,\ell), \end{aligned} $$
and the claim follows. □

3.4 Pricing of Sig-Payoffs

We recall here the notion of sig-payoffs as introduced in [34].
Definition 3.13 (Sig-Payoff)
Suppose that the price process S is given by a continuous semimartingale. A payoff \(F:\Omega \to \mathbb {R}\) is said to be a sig-payoff if there exists \(m\in \mathbb {N}\), and \(f:=\{f_\emptyset , f_J\colon 0<|J|\leq m\},\) such that
$$\displaystyle \begin{aligned} F:=f_\emptyset +\sum_{0<|J|\leq m}f_J\langle e_J, \widehat{\mathbb{S}}_{T}\rangle, \end{aligned} $$
where \(\widehat {\mathbb {S}}\) denotes the signature of \(\widehat {S}_t=(t,S_t)\).
Example 3.14
Let \(K>0\) be a strike price and \(T>0\) a maturity time. Then, Asian forwards written on S are payoffs of the form
$$\displaystyle \begin{aligned} \frac{1}{T}\int_0^T S_t \mathrm{d} t -K =\frac{1}{T}\int_0^T (S_t -S_{0}) \mathrm{d} t -K+S_{0} =\frac{1}{T}\langle e_{1}\otimes e_{0},\widehat{\mathbb{S}}_T\rangle+(S_{0}-K)\langle e_{\emptyset},\widehat{\mathbb{S}}_T\rangle,\end{aligned}$$
and are thus sig-payoffs.
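The Asian forward can be evaluated from a single second-level signature term. The following minimal sketch assumes that prices are observed on a time grid starting at 0 and interpolated linearly, and it accumulates \(\langle e_{1}\otimes e_{0},\widehat{\mathbb{S}}_T\rangle =\int_0^T(S_t-S_0)\mathrm{d}t\) segment by segment via Chen's identity. The function name and arguments are illustrative.

```python
def asian_forward_from_signature(times, prices, strike):
    """Asian forward payoff (1/T) int_0^T (S_t - S_0) dt - K + S_0, with the
    integral read off the level-2 signature term <e_1 (x) e_0, S_T> of the
    piecewise-linear path t -> (t, S_t). Assumes times[0] == 0."""
    s0, horizon = prices[0], times[-1] - times[0]
    level2 = 0.0  # running value of <e_1 (x) e_0, .>
    grid = list(zip(times, prices))
    for (t0, p0), (t1, p1) in zip(grid, grid[1:]):
        dt, dp = t1 - t0, p1 - p0
        # Chen's identity at level 2: <e_1>_old * dt + (dp * dt) / 2
        level2 += (p0 - s0) * dt + 0.5 * dp * dt
    return level2 / horizon - strike + s0

payoff = asian_forward_from_signature([0.0, 0.5, 1.0], [100.0, 104.0, 98.0], 100.0)
print(payoff)
```

On a piecewise-linear path this coincides with the trapezoidal average of the observed prices minus the strike.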
While standard vanilla derivatives like call and put options do not fall under the category of sig-payoffs, approximate sig-payoffs can still serve as efficient control variates in Monte Carlo pricing, as explained in Sect. 3.5.2.
In this section, we consistently denote \(\widehat {S}_t\) as \((t,S_t)\) and focus on pricing sig-payoffs when S follows a signature model, as specified in the subsequent corollary.
Corollary 3.15
Let the dynamics of \(S_n(\ell )\) under a local martingale measure \(\mathbb {Q}\) be specified as in Corollary 3.6. Consider a sig-payoff
$$\displaystyle \begin{aligned} F=f_\emptyset +\sum_{0<|J|\leq m}f_J\langle e_J, \widehat{\mathbb{S}}_{n}(\ell)_{T}\rangle. \end{aligned}$$
Then, using the notation of Remark 3.11, we can write the corresponding price as
$$\displaystyle \begin{aligned} {} \mathbb{E}_{\mathbb{Q}}[F]&=f_\emptyset +\sum_{0<|J|\leq m}f_J\widetilde P_J(\ell,\mathbb{E}_{\mathbb{Q}}[\widehat{\mathbb{X}}_T])\\ &=f_\emptyset +\sum_{|J|=1}^{m}\ \sum_{k_1,\ldots,k_{|J|}=1}^{D}\ \sum_{|I_1|,\ldots,|I_{|J|}|=0}^{n-1}\mathbb{E}_{\mathbb{Q}}\Big[\Big\langle \mathop{\widetilde{\sqcup\!\sqcup}}\limits_{i=1}^{|J|}\tilde e^{k_i}_{I_i},\widehat{\mathbb{X}}_{T}\Big\rangle\Big]f_J\prod_{i=1}^{|J|}\tilde c(j_i,I_i,k_i,\ell), \end{aligned} $$
(17)
for \(\tilde c(j,I,k,\ell ):={\boldsymbol 1_{\{j=0\}}}{\boldsymbol 1_{\{I=(0)\}}}{\boldsymbol 1_{\{k=1\}}}+\ell _{I}^k{\boldsymbol 1_{\{j=1\}}}\).
Proof
This is a direct consequence of Theorem 3.9 and Remark 3.11. □
Remark 3.16
(i)
Expression (17) admits also a second representation that turns out to be useful for coding:
https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-97239-3_7/MediaObjects/617302_1_En_7_Equbb_HTML.png
The image displays a complex mathematical formula involving summations and products. The formula is as follows: [ mathbb{E}_Q[F] = f_{emptyset} + sum_{|J|=1}^{m} sum_{k_J=1}^{D} sum_{I_J=1}^{I} mathbb{E}_Q left[ prod_{i=1}^{|J|} tilde{e}_{I_i}^{k_{I_i}}; hat{X}_T right] ] [ times f left( 1_{{I_1 neq I_1^t, k_1 neq 0}}, ldots, 1_{{I_{|J|} neq I_{|J|}^t, k_{|J|} neq 0}} right) prod_{i=1}^{|J|} left( 1_{{I_i = I_i^t, k_i = 0}} + rho_{I_i}^{k_i} 1_{{I_i neq I_i^t, k_i neq 0}} right) ] The formula includes Greek letters such as rho and mathematical symbols like summation sum, product prod, and expectation mathbb{E}.
 
(ii)
With a similar procedure it is also possible to provide a representation of \(\mbox{Var}_{\mathbb {Q}}(F)\). By the shuffle product we know that
$$\displaystyle \begin{aligned} (F-f_\emptyset)^2=\sum_{|J_1|,|J_2|=1}^{m}f_{J_1}f_{J_2}\langle e_{J_1}\mathbin{\sqcup\!\sqcup}e_{J_2},\widehat{\mathbb{S}}_{n}(\ell)_{T}\rangle. \end{aligned} $$
Setting \(\widetilde P_{I_1\sqcup\!\sqcup I_2}:=\sum_{i=1}^{K}\widetilde P_{J_i}\) for \(J_i\) and K satisfying \(e_{I_1}\mathbin{\sqcup\!\sqcup}e_{I_2}=\sum_{i=1}^{K}e_{J_i}\), we thus obtain
$$\displaystyle \begin{aligned} \mbox{Var}_{\mathbb{Q}}(F)&=\mathbb{E}_{\mathbb{Q}}[(F-f_\emptyset)^2]-\mathbb{E}_{\mathbb{Q}}[F-f_\emptyset]^2\\ &=\sum_{|J_1|,|J_2|=1}^{m}f_{J_1}f_{J_2}\Big(\widetilde P_{J_1\sqcup\!\sqcup J_2}(\ell,\mathbb{E}_{\mathbb{Q}}[\widehat{\mathbb{X}}_T])-\widetilde P_{J_1}(\ell,\mathbb{E}_{\mathbb{Q}}[\widehat{\mathbb{X}}_T])\widetilde P_{J_2}(\ell,\mathbb{E}_{\mathbb{Q}}[\widehat{\mathbb{X}}_T])\Big). \end{aligned} $$
 
(iii)
From these representations we can see that the maps \(P_F(f,\ell ,\mathbb {E}_{\mathbb {Q}}[\widehat {\mathbb {X}}_T]):=\mathbb {E}_{\mathbb {Q}}[F]\) and \(P_{\mbox{Var} (F)}(f,\ell ,\mathbb {E}_{\mathbb {Q}}[\widehat {\mathbb {X}}_T]):=\mbox{Var}_{\mathbb {Q}} (F)\) inherit good properties from \(\widetilde P_J\). In particular, \(P_F\) is linear in f and polynomial of degree m in \(\ell \), and \(P_{\mbox{Var} (F)}\) is quadratic in f and polynomial of degree 2m in \(\ell \). Both maps are linear in \(\mathbb {E}[\widehat {\mathbb {X}}_T]\).
 

3.5 Calibration to Option Prices

In this section our objective is to calibrate models to call (and put) option prices, aiming to closely match their implied volatilities.
Throughout this section, we assume the existence of a pricing measure \(\mathbb {Q}\) and that the primary process X is a vector of correlated \(\mathbb {Q}\)-Brownian motions. The essence of this approach lies in selecting the parameters \(\ell \) to reproduce the market prices of options on S. Given the prices of N call options \(\pi ^{\ast }(T_{1},K_{1}),\dots ,\pi ^{\ast }(T_{N},K_{N})\) characterized by maturities \(T_{i}\) and strikes \(K_{i}\), our goal is to determine these parameters for an optimal fit. Then, in the spirit of [6], we translate the current calibration task into finding \(\ell \) such that
$$\displaystyle \begin{aligned} \ell\in \mathop{\operatorname*{\mathrm{argmin}}}\nolimits_{\ell}\ L_{\text{options}}(\ell), \end{aligned}$$
where
$$\displaystyle \begin{aligned} {} L_{\text{options}}(\ell)=\sum_{i=1}^{N}\gamma_{i}\Big(\pi^{\ast}(T_{i},K_{i})-\pi^{\text{model}}(\ell,T_{i},K_{i})\Big)^{2}, \end{aligned} $$
(18)
with \(\gamma _{i}\) Vega weights and \(\pi ^{\text{model}}(\ell ,T_{i},K_{i})\) denoting the price of the option under the signature model with parameters \(\ell \), maturity \(T_{i}\) and strike \(K_{i}\).
Remark 3.17
The loss function in (18) can be adapted depending on the available data. A common approach is, for instance, to weight the options by their bid-ask spreads, see [6]. This reflects the relative importance of reproducing different option prices precisely. In the present work we employ Vega weights, since bid-ask spreads are not always provided in the data and since Vega weights are well suited to matching the implied volatility surface from option prices, see also Remark 5.8 in [13].
For notational convenience, let us here briefly recall the definition of implied volatility, which serves as our criterion for goodness of fit. Notice that for liquidity reasons we only calibrate to vanilla options.
Definition 3.18
Let \(\pi (T,K)\) be the price of a call option with maturity T and strike price K written on the asset S. The implied volatility of \(\pi (T,K)\) is defined as the volatility \(\sigma _{\text{IV}}(T,K)\) that solves the equation
$$\displaystyle \begin{aligned} {} \pi^{BS}(T,K,\sigma_{\text{IV}}(T,K))=\pi(T,K), \end{aligned} $$
(19)
where \(\pi ^{BS}\) denotes the Black-Scholes option price. Then \((\sigma _{\text{IV}}(T,K))_{T,K}\) is called (implied) volatility surface, and \((\sigma _{\text{IV}}(T,K))_{K}\) is called (implied) volatility smile for each fixed maturity T.
Once the optimal parameters \(\ell ^{\ast }\) are found via (18) we then need to solve (19) numerically for \(\pi (T,K)=\pi ^{\text{model}}(\ell ^\ast ,T,K)\) to assess the goodness of fit. Indeed we measure it in terms of the absolute relative error (in percentage) between the implied volatilities from the market and the model. Another goodness of fit test is to check whether the calibrated implied volatilities lie in the market’s bid-ask spread (if available), see Sect. 4.2.
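The inversion of (19) and the error measure just described can be sketched as follows. This is a minimal illustration rather than the procedure used for the reported results: the function names, the bisection bracket `[lo, hi]`, the interest rate argument `r` and the convention that T is measured in years are all assumptions made for the example.

```python
import math

def bs_call(S0, K, T, sigma, r=0.0):
    """Black-Scholes price of a European call (no dividends, T in years)."""
    if sigma <= 0.0 or T <= 0.0:
        return max(S0 - K * math.exp(-r * T), 0.0)
    d1 = (math.log(S0 / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return S0 * cdf(d1) - K * math.exp(-r * T) * cdf(d2)

def implied_vol(price, S0, K, T, r=0.0, lo=1e-6, hi=5.0, tol=1e-10):
    """Solve bs_call(S0, K, T, sigma) = price for sigma by bisection.

    The call price is increasing in sigma, so bisection on [lo, hi] converges
    whenever the target price lies between the two bracket prices.
    """
    if not bs_call(S0, K, T, lo, r) <= price <= bs_call(S0, K, T, hi, r):
        raise ValueError("price is not attainable for sigma in [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call(S0, K, T, mid, r) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def abs_rel_error_pct(iv_market, iv_model):
    """Absolute relative error between two implied volatilities, in percent."""
    return 100.0 * abs(iv_model - iv_market) / iv_market

# Round trip: price an option at 25% volatility and recover that volatility.
target = bs_call(100.0, 110.0, 0.5, 0.25)
print(implied_vol(target, 100.0, 110.0, 0.5))
```

Bisection is chosen here only because it needs no derivative and cannot leave the bracket; a Newton iteration using Vega would typically converge faster.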

3.5.1 Monte Carlo Pricing

We now turn our attention to the task of computing the price \(\pi ^{\text{model}}(\ell ,T,K)\) for a given strike \(K\in \mathbb {R}\) and maturity \(T>0\) under the signature model.
Before presenting our Monte Carlo-based method, it is worth recalling that Sect. 3.4 provides a closed-form formula for computing sig-payoffs without relying on Monte Carlo techniques. Additionally, thanks to the universal approximation theorem, call option payoffs can be approximated arbitrarily well by sig-payoffs, or even just by polynomials, on compact sets. The combination of these two properties forms the basis of the approach employed by [37], which involves breaking down the computation of \(\pi ^{\text{model}}(\ell ,T,K)\) into two components: an approximation of call (put) payoffs using sig-payoffs and the pricing of the latter through the closed-form formula derived in Sect. 3.4. However, achieving an acceptable error between the original call payoff and the sig-payoff requires selecting a sig-payoff with signature terms of sufficiently high order. This approximation must hold for all sets of coefficients \(\ell \) involved in the optimization procedure, encompassing a large compact set K.
It is important to note that while sig-payoffs may be used as control variates in Monte Carlo pricing techniques, as we will discuss in Sect. 3.5.2, we opted for a Monte Carlo approach. The linearity of the model makes this approach particularly manageable, rendering the entire calibration computationally feasible within a reasonable time while delivering highly accurate results.
For the Monte Carlo price we thus fix a number of samples \(N_{MC}>0\) and approximate \(\pi ^{\text{model}}(\ell ,T,K)\) via
$$\displaystyle \begin{aligned} \pi^{\text{model}}(\ell,T,K)\approx \frac{1}{N_{MC}}\sum_{i=1}^{N_{MC}}(S_{n}(\ell)_{T}(\omega_i)-K)^{+}. \end{aligned}$$
We stress again that this can be computed fast. Indeed, by the linearity of the model simulating \((S_{n}(\ell )_{T}(\omega _{i}))_{i=1}^{N_{MC}}\) boils down to the following steps:
  • simulate \((X_{t}(\omega _{i}))_{t\in [0,T]}\), which in the current setting are just trajectories for correlated Brownian motions, for each \(i\in \{1,\ldots ,N_{MC}\}\);
  • compute \(\langle e_{I},\widehat {\mathbb {X}}_{T}(\omega _{i})\rangle \) for all \(i=1,\dots ,N_{MC}\) and for all multi-indices I such that \(\lvert I\lvert \le n\);
  • take linear combinations to compute \(\langle \tilde {e}_{I},\widehat {\mathbb {X}}_{T}(\omega _{i})\rangle \) for all \(i=1,\dots ,N_{MC}\) and for all multi-indices \(|I|\leq n-1\) as described in Lemma 3.5;
  • retrieve \((S_{n}(\ell )_{T}(\omega _i))_{i=1}^{N_{MC}}\) via (14).
It is worth noting that the parameters \(\ell \) come into play only in the final step, enabling the precomputation and storage of all other quantities. This contrasts with other models where the calibration parameters are involved at each time point during the simulation steps, such as in an Euler scheme for classical Markovian models or more complex schemes employed in rough volatility models.
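The precompute-once, reprice-often structure described above can be sketched on a toy example. The sketch does not implement formula (14): as a labeled assumption it uses a one-dimensional Brownian motion W and the terminal value \(\ell_0+\ell_1 W_T+\ell_2\int_0^T W_s\,\mathrm{d}W_s\), which is linear in \(\ell\) like the signature model. The function names and the sample size are illustrative.

```python
import math
import random

def precompute_features(n_mc, T, seed=0):
    """Simulate once and store, per path, the Ito integrals
    [1, W_T, int_0^T W_s dW_s] of a one-dimensional Brownian motion W.
    The last integral equals (W_T**2 - T) / 2 in closed form."""
    rng = random.Random(seed)
    feats = []
    for _ in range(n_mc):
        w = rng.gauss(0.0, math.sqrt(T))
        feats.append((1.0, w, 0.5 * (w * w - T)))
    return feats

def mc_call_price(ell, feats, K):
    """Monte Carlo call price for a terminal value that is linear in ell.

    Only this step depends on ell, so the stored features are reused
    unchanged at every iteration of an optimizer.
    """
    total = 0.0
    for f in feats:
        s_T = sum(l * x for l, x in zip(ell, f))
        total += max(s_T - K, 0.0)
    return total / len(feats)

feats = precompute_features(n_mc=20000, T=30 / 365)   # done once
for ell in [(1.0, 0.2, 0.0), (1.0, 0.2, -0.1)]:       # candidate parameters
    print(ell, mc_call_price(ell, feats, K=1.0))
```

In a vectorized implementation the features would be stored as a matrix, so that each evaluation of the model price reduces to a single matrix-vector product followed by an average.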

3.5.2 Variance Reduction with Sig-Payoffs

Even though Monte Carlo pricing is fast since all essential quantities can be precomputed as explained above, we here discuss variance reduction techniques (see e.g. [26]) that can speed up the procedure even further. The idea is to introduce a control variate, i.e., a random variable \(\Phi ^{cv}\) such that:
$$\displaystyle \begin{aligned} \mathbb{E}_{\mathbb{Q}}[\Phi^{cv}]=0, \qquad \qquad \mbox{Var} \big((S_{n}(\ell)_{T}-K)^{+}-\Phi^{cv}\big)< \mbox{Var} \big((S_{n}(\ell)_{T}-K)^{+}\big). \end{aligned}$$
An example of control variates used for pricing and calibrating neural SDE models can be found in [10, 25], where \(\Phi ^{cv}\) is constructed from the Delta hedge. Another possible choice of control variates for signature models are sig-payoffs. Indeed, one can use the pricing formula derived in Sect. 3.4 to define:
$$\displaystyle \begin{aligned} \Phi^{cv}(\ell,T,K):=f_{\emptyset}+\sum_{0<\lvert J \lvert \le m} f_{J}\langle e_{J},\widehat{\mathbb{S}}_{n}(\ell)_{T}\rangle-\widetilde{P}_f(\ell,\mathbb{E}_{\mathbb{Q}}[\widehat{\mathbb{X}}_T]), \end{aligned}$$
for some fixed \(m>0\) where
$$\displaystyle \begin{aligned} (S_{n}(\ell)_{T}-K)^{+}\approx f_{\emptyset}+\sum_{0<\lvert J \lvert \le m} f_{J}\langle e_{J},\widehat{\mathbb{S}}_{n}(\ell)_{T}\rangle, \end{aligned}$$
for a wide range of \(\ell \) and with high probability. This can be done by performing a linear regression to obtain the coefficients \(f=(f_{J})_{|J|\le m}\). Alternatively, a polynomial approximation of the payoff function can also be employed.
The properties of \(\Phi ^{cv}\) then guarantee the accuracy of the approximation
$$\displaystyle \begin{aligned} \pi^{\text{model}}(\ell,T,K)\approx\frac{1}{N_{MC}}\sum_{i=1}^{N_{MC}}\Big((S_{n}(\ell)_{T}(\omega_i)-K)^{+}-\Phi^{cv}(\ell,T,K)(\omega_i)\Big), \end{aligned}$$
already for smaller values of \(N_{MC}\). In the following example we report a numerical experiment in this regard.
Example 3.19
Fix \(n=d=2\) and consider the parameters \(\ell ^{\ast }\in \mathbb {R}^{13}\) calibrated to the first smile of the market data shown in Fig. 4 (left). Consider as an example a call option with maturity \(T=30\) days and strike \(K=85\%\) of the spot price. Let p be a polynomial of order \(m=4\) approximating the function \(f: x \mapsto (x-K)^{+}\) on the compact interval \([70\%S_0,120\%S_0]\). Recall that \(\mathbb {E}_{\mathbb {Q}}[p(S_n(\ell )_T)]\) can be computed analytically as polynomials are sig-payoffs. Using \(\Phi ^{cv}(\ell ):=p(S_n(\ell )_T)-\mathbb {E}_{\mathbb {Q}}[p(S_n(\ell )_T)]\) as control variate we can reduce the sample variance of the Monte Carlo estimator from approximately \(3.89\cdot 10^{-5}\) to \(6.5\cdot 10^{-7}\). Observe that \(p(S_n(\ell )_T)\) here coincides with the sig-payoff without time-augmentation, i.e.,
$$\displaystyle \begin{aligned} p(S_n(\ell)_T)=\sum_{|I|\le 4} f_{I}\langle e_{I},\mathbb{S}_{n}(\ell)_T\rangle. \end{aligned}$$
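The mechanism of a polynomial control variate can be illustrated on a toy model. The sketch below does not reproduce the numbers of Example 3.19: it assumes a Gaussian terminal value \(S_T=S_0+aZ\) with \(Z\sim N(0,1)\), for which a degree-4 polynomial with exactly known mean can be written in the Hermite basis. The pilot-sample regression, the sample sizes and all function names are assumptions made for the illustration.

```python
import math
import random

def hermite(z):
    """Probabilists' Hermite polynomials He_1..He_4; each has mean zero under N(0,1)."""
    return (z, z * z - 1.0, z ** 3 - 3.0 * z, z ** 4 - 6.0 * z * z + 3.0)

FACTORIALS = (1.0, 2.0, 6.0, 24.0)  # E[He_k(Z)^2] = k!

def payoff(z, s0, a, K):
    """Call payoff for the toy terminal value S_T = s0 + a * z."""
    return max(s0 + a * z - K, 0.0)

def fit_coefficients(s0, a, K, n_pilot, rng):
    """Estimate beta_k = E[payoff * He_k] / k! on an independent pilot sample."""
    sums = [0.0] * 4
    for _ in range(n_pilot):
        z = rng.gauss(0.0, 1.0)
        g = payoff(z, s0, a, K)
        for k, h in enumerate(hermite(z)):
            sums[k] += g * h
    return [s / (n_pilot * fact) for s, fact in zip(sums, FACTORIALS)]

def mean_var(xs):
    """Sample mean and unbiased sample variance."""
    m = sum(xs) / len(xs)
    return m, sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def price_with_and_without_cv(s0=1.0, vol=0.2, T=30 / 365, K=1.0,
                              n_pilot=5000, n_mc=20000, seed=1):
    rng = random.Random(seed)
    a = vol * math.sqrt(T)
    beta = fit_coefficients(s0, a, K, n_pilot, rng)
    plain, controlled = [], []
    for _ in range(n_mc):
        z = rng.gauss(0.0, 1.0)
        g = payoff(z, s0, a, K)
        # Degree-4 polynomial in S_T with exactly zero mean: the control variate.
        phi = sum(b * h for b, h in zip(beta, hermite(z)))
        plain.append(g)
        controlled.append(g - phi)
    return mean_var(plain), mean_var(controlled)

(m_plain, v_plain), (m_cv, v_cv) = price_with_and_without_cv()
print("plain:   mean %.6f  variance %.3e" % (m_plain, v_plain))
print("with cv: mean %.6f  variance %.3e" % (m_cv, v_cv))
```

Because the coefficients are fitted on a separate pilot sample and the Hermite polynomials have mean zero, the controlled estimator remains unbiased in this toy setting. In the signature model the analogous mean is supplied by the closed-form formula of Sect. 3.4.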

3.5.3 Model Performance

In the following we discuss the problem of minimizing the functional (18) using the Monte Carlo method as described in Sect. 3.5.1 to compute the model prices. We consider the model described in Corollary 3.6 for two correlated Brownian motions B and W with correlation coefficient \(\rho =-0.5\), and \(D=1\).
As a first example we consider synthetic data, where the implied volatility surface to fit is generated by a Heston model whose dynamics are given by
$$\displaystyle \begin{aligned} {} \begin{aligned} {\mathrm{d}}S_{t}&=\mu S_{t}{\mathrm{d}}t+S_{t}\sqrt{V_{t}}{\mathrm{d}}B_{t}^{\mathbb{P}}\\ {\mathrm{d}}V_{t}&=\kappa(\theta-V_{t}){\mathrm{d}}t+\sigma \sqrt{V_{t}}{\mathrm{d}}W_{t}^{\mathbb{P}}, \end{aligned} \end{aligned} $$
(20)
where \({\mathrm {d}}[B^{\mathbb {P}},W^{\mathbb {P}}]_{t}=\rho {\mathrm {d}}t\) with \(\rho \in [-1,1]\).
We consider 7 maturities \((T_{k})_{k=1}^{7}\) ranging from 30 days to 2 years and 13 strikes \((K_{j})_{j=1}^{13}\) ranging from 80\(\%\) to 120\(\%\) of the spot price. The truncation parameter is fixed to \(n=3\) and the number of Monte Carlo samples to \(N_{MC}=10^6\). The results for the following two sets of parameters under a risk neutral measure
$$\displaystyle \begin{array}{ccccc} \kappa & \theta & \sigma & \rho & V_{0}\\ \hline 0.2 & 0.3 & 0.5 & -0.5 & 0.08 \end{array}$$
are displayed in the first and in the second row of Fig. 1, respectively.
Fig. 1
On the left: blue stars correspond to the implied volatilities of the Heston models, red dots denote the calibrated implied volatilities of \(S_{n}(\ell )\) with \(n=3\) (13 estimated parameters). On the right: absolute relative errors between the two surfaces are expressed in percentages
Fig. 2
On the left: the upper surface represents the implied volatility of the S\(\&\)P 500 index as of 17-03-2021, the lower one is the calibrated implied volatility of \(S_{n}(\ell ^*)\) with \(n=3\) (13 parameters). On the right: absolute relative error between the two surfaces in percentages
We stress that these calibrations to Heston generated implied volatility surfaces can take between 4 and 15 minutes on a standard laptop. We now consider implied volatility data as of 17/03/2021 for call options written on the S\(\&\)P 500 index. Our dataset provided by Bloomberg consists of 7 maturities \((T_{k})_{k=1}^{7}\), ranging from 30 days to 2 years, and 9 strike prices \((K_{j})_{j=1}^{9}\) for each maturity which vary between 80\(\%\) and 120\(\%\) of the spot price. Again, the truncation parameter is fixed to \(n=3\) and the number of Monte Carlo samples to \(N_{MC}=10^6\). The results are displayed in Fig. 2.
Fig. 3
Comparison between the calibrated implied volatility smiles for different parameters and the S\(\&\)P 500 index smile as of 17-03-2021 (in blue) at maturity \(T_{1}=30\) days (on the left) and for maturities ranging from 60 days to 2 years (on the right). Calibration has been performed using (21)
Figure 2 indicates that the volatility smiles for short maturities have not been captured adequately. Specifically, as with many continuous models, the challenge lies in fitting the shortest maturities, primarily due to the high at-the-money skew in the market (refer to Chapter 3 and Chapter 7 of [22]).
To assess the model’s ability to replicate short maturity smiles, we conduct calibrations using different loss functions that penalize outliers more severely. In the initial calibration procedure described earlier, we denote by \(w_i\) the absolute error between the target implied volatility and the approximated one for maturity \(T_i\) and strike \(K_i\). Subsequently, inspired by generative-adversarial distances as considered in, e.g., [10], we define a new loss function
$$\displaystyle \begin{aligned} {} L_{p,\alpha}(\ell)=\sum_{i=1}^{N}(\gamma_{i}+\alpha w_i)|\pi^{\ast}(T_{i},K_{i})-\pi^{\text{model}}(\ell,T_{i},K_{i})|^{p}, \end{aligned} $$
(21)
depending on parameters p and \(\alpha \) that need to be chosen. By taking high values for p and \(\alpha \) we can approximate the sup-distance between the two price surfaces, i.e.,
$$\displaystyle \begin{aligned} L_{\infty}(\ell):=\sup_{i=1,\ldots,N}|\pi^{\ast}(T_{i},K_{i})-\pi^{\text{model}}(\ell,T_{i},K_{i})|,\end{aligned}$$
without compromising differentiability with respect to \(\ell \). The results for different choices of the parameters \(\alpha \) and p, as well as of the truncation level n, are displayed in Fig. 3. As the figure suggests, the maximal absolute relative error for maturities larger than 60 days is (almost) acceptable (5.6\(\%\) for \(n=3\), \(p=1000\), and \(\alpha =500\), and 2.3\(\%\) for \(n=4\), \(p=1000\), and \(\alpha =500\)). However, the absolute relative errors for the shortest maturity and for the far in- and out-of-the-money strikes are still above 12\(\%\) and 18\(\%\), respectively. Observe that the performance for \(n=4\) is the best for every maturity larger than 60 days as well as for the at-the-money region of the shortest maturity.
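The loss (21) and its relation to the sup-distance can be sketched as follows. The function names are illustrative, and the second function reflects an assumption about the numerics rather than the implementation behind the reported results: for large p the powers \(|\cdot|^{p}\) underflow in floating point, so the sketch evaluates the p-th root of \(L_{p,\alpha}\) after factoring out the largest error. This p-th root tends to the largest absolute error as p grows.

```python
def loss_p_alpha(market, model, gamma, w, p, alpha):
    """L_{p,alpha} as in (21), evaluated for given market and model prices."""
    return sum((g + alpha * wi) * abs(m - q) ** p
               for m, q, g, wi in zip(market, model, gamma, w))

def loss_p_alpha_root(market, model, gamma, w, p, alpha):
    """p-th root of L_{p,alpha}, rescaled by the largest error so that
    abs(error) ** p does not underflow to zero for large p."""
    errs = [abs(m - q) for m, q in zip(market, model)]
    e_max = max(errs)
    if e_max == 0.0:
        return 0.0
    scaled = sum((g + alpha * wi) * (e / e_max) ** p
                 for e, g, wi in zip(errs, gamma, w))
    return e_max * scaled ** (1.0 / p)

market = [1.0, 2.0, 3.0]
model = [1.1, 2.0, 2.5]
gamma = [1.0, 1.0, 1.0]   # Vega weights (placeholder values)
w = [0.0, 0.0, 0.0]       # implied volatility errors from a first calibration

print(loss_p_alpha(market, model, gamma, w, p=2, alpha=0.0))
print(loss_p_alpha_root(market, model, gamma, w, p=1000, alpha=0.0))
```

With the placeholder data above, the largest absolute price error is 0.5, and the rescaled p-th root recovers it for \(p=1000\).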
Fig. 4
Comparison between the calibrated implied volatility smiles and the S\(\&\)P 500 index smile as of 17-03-2021 (in blue) at maturity \(T_{1}=30\) days (on the left) and for maturities ranging from 60 days to 2 years (on the right). Two different calibrations have been performed using (21)
In a final experiment, we conduct two separate calibrations: one to the shortest maturity alone and another to every other maturity combined. The first calibration is performed for \(T_1=30\) days, and the second calibration encompasses 6 maturities ranging from 60 days to 2 years. In both cases, we consider 9 strikes ranging from 80\(\%\) to 120\(\%\) of the spot price. For the first calibration, we fix the parameters at \(n=2\), \(p=2\), and \(\alpha =0\), while for the subsequent maturities, the parameters are set to \(n=4\), \(p=300\), and \(\alpha =500\). The results are displayed in Fig. 4. Notably, the fit for the first maturity is remarkably accurate, as well as for the entire implied volatility surface. The computational time required for the calibration of the first smile is approximately 2 minutes, while the calibration to the remaining surface might take longer, mainly due to the higher value of n and the calibration to all maturities.
These results suggest that introducing maturity-dependent parameters and performing a slice-wise calibration to the individual smiles, as for instance in [10] or [25], can be of interest to obtain both excellent accuracy and low computational time. For details in this regard and the out-of-sample performance see [11].

4 A Signature Model for Index Options and Volatility Derivatives

This section is dedicated to defining the second model \((S_{t})_{t\geq 0}\) for the S&P 500 index in detail. Under a risk-neutral probability measure \(\mathbb {Q}\) the model’s dynamics are given by:
$$\displaystyle \begin{aligned} {} \mathrm{d}S_{t}=S_{t}\sigma_{t}^{S}\mathrm{d}B_{t}. \end{aligned} $$
(22)
Here, \(S_{0}\in \mathbb {R}^{+}\), \(\sigma ^{S}=(\sigma _{t}^{S})_{t\geq 0}\) is the volatility process, and \(B=(B_{t})_{t\geq 0}\) is a one-dimensional Brownian motion that is correlated with the volatility process. Note that the instantaneous variance is given by \(V_t:=(\sigma _t^S)^2\) for every \(t\geq 0\). Choosing a functional form of the volatility process is the crucial modelling choice. We set
$$\displaystyle \begin{aligned} {} \sigma_{t}^{S}(\ell):=\ell_{\emptyset}+\sum_{0<\lvert I \lvert \le n } \ell_{I} \langle e_{I},\widehat{\mathbb{X}}_{t}\rangle, \end{aligned} $$
(23)
i.e. the volatility process is determined by a linear function of the (time-extended) signature of a primary process X. Moreover, we assume that X is a polynomial process (recall Definition 2.1) and that the model parameters are \(\ell :=\{\ell _{I}\in \mathbb {R}: |I|\le n\}\in \mathbb {R}^{(d+1)_{n}}\).
For later convenience, we now introduce the process \((Z_t)_{t\geq 0}\) where \(Z_t=(X_t,B_t)\) and denote by \((\widehat {\mathbb {Z}}_t)_{t\geq 0}\) its time-extended signature. The correlation of the components of \((Z_t)_{t\geq 0}\) is given by \(\rho = (\rho _{ij})_{i,j=1,\dots ,d+1}\), i.e.,
$$\displaystyle \begin{aligned} \rho_{ij}=\frac{[Z^{i},Z^{j}]}{\sqrt{[Z^{i}]}\sqrt{[Z^{j}]}}\in [-1,1]. \end{aligned}$$
Recall that we use \(\lbrack \cdot ,\cdot \rbrack \) for the quadratic variation. We will often write \((\sigma _{t}^{S})_{t\geq 0}= (\sigma _{t}^{S}(\ell ))_{t\geq 0}\) as in (22) to keep the notation light and only mention the dependence on \(\ell \) when it is explicitly needed.
Remark 4.1 (Interest Rates and Dividends)
To calibrate our model to option prices we need both the discounted, dividend-adjusted prices and the undiscounted, unadjusted prices. This is because the CBOE defines the VIX via the discounted, dividend-adjusted prices, whereas the claims on the S&P 500 itself are written on the undiscounted, unadjusted prices. Recall that \((S_t)_{t\geq 0}\) given by (22) is the discounted, dividend-adjusted price process. Including an interest rate r and dividend q, the corresponding undiscounted, unadjusted price process is given by
$$\displaystyle \begin{aligned} \mathrm{d} \tilde{S}_t = (r-q)\tilde{S}_t \mathrm{d} t + \tilde{S}_t \sigma_t^S(\ell) \mathrm{d} B_t.\end{aligned}$$
The price of a call option with time to maturity \(T>0\) and strike price \(K\in \mathbb {R}\) written on the S&P 500 index, is therefore
$$\displaystyle \begin{aligned} C(T,K) = \mathbb{E}\left[e^{-rT}(\tilde{S}_T(\ell)- K)^+\right]=\mathbb{E}[e^{- rT} (e^{(r-q)T}S_T(\ell)- K)^+], \end{aligned}$$
where we used that \(\tilde {S}_t(\ell ) = e^{(r-q)t}S_t (\ell )\).
Remark 4.2
Recall our assumption that \(\widehat {X}\) is a polynomial process. This is important since then \(\widehat {\mathbb {X}}^{n}\) is a finite-dimensional polynomial process in the sense of [19] and [9]. Therefore the expected signature and the conditional expected signature of \(\widehat {X}\) can be found by solving a finite-dimensional ODE; more specifically, they are given by a finite-dimensional matrix exponential as described in Sect. 2.
This assumption still leaves us with a broad class of admissible primary processes. Some prominent examples are correlated Brownian motions, Cox-Ingersoll-Ross (CIR) processes, geometric Brownian motions, Jacobi processes, OU processes, and all continuous affine processes.
Remark 4.3
Several stochastic volatility models are encapsulated in our modelling framework (23). Let us elaborate on the following:
  • The Stein-Stein model, as introduced in [40], is obtained if we choose a one-dimensional OU process as our primary process \((X_t)_{t\geq 0}\) and set \(n=1\), \(\ell _{\emptyset }=\ell _{(0)}=0\) and \(\ell _{(1)}\neq 0\).
  • The SABR model with \(\beta =1\), as introduced initially in [31], is obtained if we choose \((X_t)_{t\geq 0}\) to be a one-dimensional geometric Brownian motion without drift and again let \(n=1\), with \(\ell _{\emptyset }=\ell _{(0)}=0\) and \(\ell _{(1)}\neq 0\).
  • Let \((X_t)_{t\geq 0}\) be a one-dimensional OU process with \(n=5\), \(\ell _{\emptyset },\ell _{(1)},\ell _{(1,1,1)},\ell _{(1,1,1,1,1)}\) non-zero and \(\ell _{I}=0\) otherwise. This results in the model described in [1], except for the fact that therein a deterministic input curve is added. Moreover, we can embed the entire set of Gaussian polynomial volatility models introduced in [1], if we allow X not to be a semimartingale and do not add a time-augmentation.
In the next section we discuss the pricing of VIX options with (23) as well as the nature of the log-price process.

4.1 Explicit Formulas for the VIX and the Log-Price

The CBOE Volatility Index (VIX) measures the market’s expected volatility of the S\(\&\)P 500 index. More specifically, its current value corresponds to the expected annualized change in the S&P 500 index over the 30 days ahead. Indeed, the index value is computed by
$$\displaystyle \begin{aligned} {} {\text{VIX}}_{T}=\sqrt{\mathbb{E}\left\lbrack -\frac{2}{\Delta}\log\left(\frac{S_{T+\Delta}}{S_{T}}\right)\Big\lvert \mathcal{F}_{T}\right\rbrack}, \end{aligned} $$
(24)
where \(\Delta =30\) days, \(S_{T}\) denotes the price process and \(\mathcal {F}_T\) the filtration at time \(T>0\). Note that for any stochastic volatility model of the form
$$\displaystyle \begin{aligned} dS_t = S_t\sqrt{V_t}dB_t,\end{aligned}$$
where \((B_t)_{t\geq 0}\) is a Brownian motion and \((V_t)_{t\geq 0}\) satisfies
$$\displaystyle \begin{aligned} {} \mathbb{E}\left\lbrack\int_0^T V_s \mathrm{d} s\right\rbrack< \infty, \end{aligned} $$
(25)
the VIX index at \(T>0\) is given by
$$\displaystyle \begin{aligned} {} {\text{VIX}}_{T}=\sqrt{\frac{1}{\Delta}\mathbb{E}\left\lbrack\int_{T}^{T+\Delta}V_{t}\mathrm{d}t\lvert \mathcal{F}_{T}\right \rbrack}, \end{aligned} $$
(26)
see e.g., [35]. Although there are of course put and call options written on the VIX, we from now on consider without loss of generality only call options and will simply refer to them as VIX options.
We shall now derive an analytical expression for the VIX index value (24) under our model, i.e. if S is defined as in (22) and (23) and for a polynomial process X. More precisely, we state in Theorem 4.4 that the value of the VIX is given by the square-root of a quadratic function in the parameters \(\ell \). This is achieved via polynomial technology and the computation of a matrix exponential, as we showed in Sect. 2.
Theorem 4.4
Let the price process \(S=(S_{t})_{t\ge 0}\) be given by
$$\displaystyle \begin{aligned} \mathrm{d} S_{t}=S_{t}\sigma_t^{S}(\ell)\mathrm{d} B_{t}, \end{aligned}$$
with volatility process \(\sigma ^{S}(\ell )=(\sigma _{t}^S(\ell ))_{t\ge 0}\) and \(B=(B_{t})_{t\ge 0}\) denoting a one-dimensional Brownian motion. Recall our modeling assumption, namely that \(\sigma ^S(\ell )\) and X satisfy (23), and note that under this assumption (25) is fulfilled. As introduced in Sect. 2 we fix an injective labeling function \(\mathscr {L}:\{I: |I|\le n\}\to \{1,\dots , (d+1)_{2n+1}\}\) and denote by G the \((d+1)_{(2n+1)}\)-dimensional matrix representative of the dual operator corresponding to \(\widehat {\mathbb {X}}\). Then, the following expression for the VIX at time \(T>0\) holds
$$\displaystyle \begin{aligned} {} {\mathit{\text{VIX}}}_{T}(\ell)=\sqrt{\frac{1}{\Delta}\ell^\top Q(T,\Delta)\ell}, \end{aligned} $$
(27)
with
$$\displaystyle \begin{aligned} {} Q_{\mathscr{L}(I)\mathscr{L}(J)}(T,\Delta)=\operatorname{vec}\big((e_{I}\sqcup\!\sqcup e_{J})\otimes e_{0}\big)^{\top}\big(e^{\Delta G^{\top}}-\operatorname{Id}\big)\operatorname{vec}\big(\widehat{\mathbb{X}}_{T}^{2n+1}\big), \end{aligned} $$
(28)
and \(\operatorname {Id}\in \mathbb {R}^{(d+1)_{2n+1} \times (d+1)_{2n+1}}\) denoting the identity matrix. Note that Q is positive semidefinite and symmetric and hence allows for a Cholesky decomposition.
Proof
Note that \(V_t(\ell )=(\sigma _t^S(\ell ))^2\) can be expressed as
$$\displaystyle \begin{aligned} V_{t}(\ell)=\Big(\sum_{|I|\le n}\ell_{I}\langle e_{I},\widehat{\mathbb{X}}_{t}\rangle\Big)^{2}=\sum_{|I|,|J|\le n}\ell_{I}\ell_{J}\langle e_{I}\sqcup\!\sqcup e_{J},\widehat{\mathbb{X}}_{t}\rangle \end{aligned}$$
by the shuffle property. Moreover, recall that continuous polynomial processes have finite moments of every degree and hence by Remark 1.2 it follows that (25) holds. Plugging in the above equation for \(V_t(\ell )\), the value of the VIX is for each \(T>0\) given by
$$\displaystyle \begin{aligned} {\text{VIX}}_{T}^{2}(\ell)=\frac{1}{\Delta}\sum_{|I|,|J|\le n}\ell_{I}\ell_{J}\,\mathbb{E}\Big\lbrack\int_{T}^{T+\Delta}\langle e_{I}\sqcup\!\sqcup e_{J},\widehat{\mathbb{X}}_{t}\rangle\mathrm{d}t\,\Big\lvert\,\mathcal{F}_{T}\Big\rbrack=\frac{1}{\Delta}\ell^{\top}Q(T,\Delta)\ell, \end{aligned}$$
where for each \(T>0\) the matrix Q is given by
$$\displaystyle \begin{aligned} Q_{\mathscr{L}(I)\mathscr{L}(J)}(T,\Delta)&:=\mathbb{E}\Big\lbrack\int_{T}^{T+\Delta}\langle e_{I}\sqcup\!\sqcup e_{J},\widehat{\mathbb{X}}_{t}\rangle\mathrm{d}t\,\Big\lvert\,\mathcal{F}_{T}\Big\rbrack\\ &=\mathbb{E}\Big\lbrack\int_{0}^{T+\Delta}\langle e_{I}\sqcup\!\sqcup e_{J},\widehat{\mathbb{X}}_{t}\rangle\mathrm{d}t-\int_{0}^{T}\langle e_{I}\sqcup\!\sqcup e_{J},\widehat{\mathbb{X}}_{t}\rangle\mathrm{d}t\,\Big\lvert\,\mathcal{F}_{T}\Big\rbrack\\ &=\mathbb{E}\Big\lbrack\langle (e_{I}\sqcup\!\sqcup e_{J})\otimes e_{0},\widehat{\mathbb{X}}_{T+\Delta}\rangle-\langle (e_{I}\sqcup\!\sqcup e_{J})\otimes e_{0},\widehat{\mathbb{X}}_{T}\rangle\,\Big\lvert\,\mathcal{F}_{T}\Big\rbrack\\ &=\mathbb{E}\Big\lbrack\langle (e_{I}\sqcup\!\sqcup e_{J})\otimes e_{0},\widehat{\mathbb{X}}_{T+\Delta}\rangle\,\Big\lvert\,\mathcal{F}_{T}\Big\rbrack-\langle (e_{I}\sqcup\!\sqcup e_{J})\otimes e_{0},\widehat{\mathbb{X}}_{T}\rangle. \end{aligned}$$
By Theorem 2.5 we can also express Q as
$$\displaystyle \begin{aligned} Q_{\mathscr{L}(I)\mathscr{L}(J)}(T,\Delta)&=\operatorname{vec}\big((e_{I}\sqcup\!\sqcup e_{J})\otimes e_{0}\big)^{\top}e^{\Delta G^{\top}}\operatorname{vec}\big(\widehat{\mathbb{X}}_{T}^{2n+1}\big)-\operatorname{vec}\big((e_{I}\sqcup\!\sqcup e_{J})\otimes e_{0}\big)^{\top}\operatorname{vec}\big(\widehat{\mathbb{X}}_{T}^{2n+1}\big)\\ &=\operatorname{vec}\big((e_{I}\sqcup\!\sqcup e_{J})\otimes e_{0}\big)^{\top}\big(e^{\Delta G^{\top}}-\operatorname{Id}\big)\operatorname{vec}\big(\widehat{\mathbb{X}}_{T}^{2n+1}\big), \end{aligned}$$
which proves our claim. The fact that Q is positive semidefinite and symmetric follows from the shuffle property. Under these properties, it is well-known that a Cholesky decomposition exists, meaning that there is an upper triangular matrix \(U_{T}\in \mathbb {R}^{(d+1)_{n}\times (d+1)_{n}}\) such that
$$\displaystyle \begin{aligned} {\text{VIX}}_{T}(\ell)=\sqrt{\frac{1}{\Delta}\ell^\top U_T U_T^\top \ell}=\frac{1}{\sqrt{\Delta}}\sqrt{(U_T^{\top}\ell)^2}=\frac{1}{\sqrt{\Delta}}\lVert U_T^\top \ell \lVert. \end{aligned}$$
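The equivalence between the quadratic form in (27) and the norm obtained from a Cholesky factor can be checked numerically. In the sketch below the matrix `Q` is a made-up symmetric positive definite placeholder, not one computed from (28), and the factor is taken lower triangular with \(Q=LL^{\top}\). The value of the norm is the same for any factor A with \(Q=AA^{\top}\), so this choice is a convenience rather than the convention of the theorem. A matrix that is only positive semidefinite would need a pivoted or regularized factorization, which is not covered here.

```python
import math

def cholesky_lower(Q):
    """Lower-triangular L with Q = L L^T for a symmetric positive definite Q."""
    n = len(Q)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = Q[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L

def vix_quadratic(ell, Q, delta):
    """VIX as the square root of ell^T Q ell / delta, as in (27)."""
    n = len(ell)
    quad = sum(ell[i] * Q[i][j] * ell[j] for i in range(n) for j in range(n))
    return math.sqrt(quad / delta)

def vix_cholesky(ell, L, delta):
    """VIX as the Euclidean norm of L^T ell divided by sqrt(delta)."""
    n = len(ell)
    v = [sum(L[i][j] * ell[i] for i in range(n)) for j in range(n)]
    return math.sqrt(sum(x * x for x in v) / delta)

Q = [[4.0, 2.0], [2.0, 3.0]]   # placeholder for Q(T, Delta)
ell = [0.2, -0.1]              # placeholder parameters
delta = 30 / 365
L = cholesky_lower(Q)          # computed once per maturity, independent of ell
print(vix_quadratic(ell, Q, delta), vix_cholesky(ell, L, delta))
```

Since the factor does not depend on \(\ell\), it can be stored per maturity and reused throughout a calibration.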
Moreover, an explicit expression of the log price in terms of a linear function of the signature of \(\widehat {Z}\) can be obtained, which we present in the next proposition.
Proposition 4.5
Let S be given by (22) with \(S_0=1\), and let \(\sigma ^S\) and X satisfy (23). Under these assumptions the log-price at time \(t\geq 0\) is given by
$$\displaystyle \begin{aligned} {} \log(S_{t}(\ell))=-\frac{1}{2}\ell^\top Q^{0}(t)\ell+\sum_{|I|\le n}\ell_{I}\langle \tilde{e}_{I}^{B},\widehat{\mathbb{Z}}_{t}\rangle, \end{aligned} $$
(29)
with
$$\displaystyle \begin{aligned} \tilde{e}_{\emptyset}^{B}:=e_{d+1},\qquad \tilde{e}_{I}^{B}:=e_{I}\otimes e_{d+1}-\sum_{|J|<m}\frac{a_{i_{|I|}(d+1)}^{J}}{2}(e_{I'}\sqcup\!\sqcup e_{J})\otimes e_{0} \end{aligned}$$
for each \(|I|>0\), and the matrix \(Q^{0}(t)\in \mathbb {R}^{(d+1)_{n}\times (d+1)_{n}}\) has components
$$\displaystyle \begin{aligned} Q_{\mathscr{L}(I)\mathscr{L}(J)}^{0}(t)=\langle (e_{I}\sqcup\!\sqcup e_{J})\otimes e_{0},\widehat{\mathbb{X}}_{t}\rangle \end{aligned}$$
for an arbitrary but fixed labeling function \(\mathscr {L}:\{I: |I|\le n\}\to \{1,\dots ,(d+1)_{n}\}\).
Proof
Using Itô’s lemma and the form of S, \(\sigma ^S\) and X under our model, we obtain
$$\displaystyle \begin{aligned} \log(S_{t}(\ell))&=-\frac{1}{2}\int_{0}^{t}V_{s}(\ell)\,\mathrm{d}s+\int_{0}^{t}\sigma_{s}^{S}(\ell)\,\mathrm{d}B_{s}\\ &=-\frac{1}{2}\sum_{|I|,|J|\le n}\ell_{I}\ell_{J}\int_{0}^{t}\langle e_{I}\sqcup\!\sqcup e_{J},\widehat{\mathbb{X}}_{s}\rangle\,\mathrm{d}s+\sum_{|I|\le n}\ell_{I}\int_{0}^{t}\langle e_{I},\widehat{\mathbb{X}}_{s}\rangle\,\mathrm{d}B_{s}\\ &=-\frac{1}{2}\sum_{|I|,|J|\le n}\ell_{I}\ell_{J}\langle (e_{I}\sqcup\!\sqcup e_{J})\otimes e_{0},\widehat{\mathbb{X}}_{t}\rangle+\sum_{|I|\le n}\ell_{I}\langle \tilde{e}_{I}^{B},\widehat{\mathbb{Z}}_{t}\rangle\\ &=-\frac{1}{2}\ell^{\top}Q^{0}(t)\ell+\sum_{|I|\le n}\ell_{I}\langle \tilde{e}_{I}^{B},\widehat{\mathbb{Z}}_{t}\rangle, \end{aligned}$$
where we used that \(\int _{0}^{t}\langle e_{I},\widehat {\mathbb {X}}_{s}\rangle \mathrm {d}B_{s}=\langle \tilde {e}_{I}^{B}, \widehat {\mathbb {Z}}_{t}\rangle \) (Lemma 3.5) in the second to last equality. □
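As a consistency check, consider the simplest case \(n=0\), in which \(\sigma _t^S(\ell )=\ell _{\emptyset }\) is constant. The computation below is a worked illustration, not part of the original derivation. It assumes that the letter 0 corresponds to the time component, so that \(\langle e_{0},\widehat {\mathbb {X}}_{t}\rangle =t\), and that \(B_0=0\), so that \(\langle e_{d+1},\widehat {\mathbb {Z}}_{t}\rangle =B_t\). Since \(e_{\emptyset }\sqcup\!\sqcup e_{\emptyset }=e_{\emptyset }\), the matrix \(Q^{0}(t)\) reduces to the scalar t and (29) becomes the Black-Scholes log-price:
$$\displaystyle \begin{aligned} \log(S_{t}(\ell))=-\frac{1}{2}\ell_{\emptyset}^{2}\langle e_{0},\widehat{\mathbb{X}}_{t}\rangle+\ell_{\emptyset}\langle e_{d+1},\widehat{\mathbb{Z}}_{t}\rangle=-\frac{1}{2}\ell_{\emptyset}^{2}t+\ell_{\emptyset}B_{t}. \end{aligned}$$
In the same case the defining conditional expectation of Q gives \(Q(T,\Delta )=\mathbb {E}[\int _{T}^{T+\Delta }1\,\mathrm {d}t\,\lvert \,\mathcal {F}_{T}]=\Delta \), so that (27) yields \({\text{VIX}}_{T}(\ell )=\lvert \ell _{\emptyset }\rvert \), the constant volatility, as expected.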

4.2 Joint Calibration of SPX and VIX Options

We point out again that we only work with call options in the following. This choice was made merely for simplicity, and it is straightforward to include any other liquid options. The times to maturity and the strike prices available on the market may differ for VIX and SPX options. Accordingly, we denote by \(\mathcal {T}^{\mathrm {SPX}}, \mathcal {T}^{\mathrm {VIX}}\) the sets of maturities and by \(\mathcal {K}^{\mathrm {SPX}}\), \(\mathcal {K}^{\mathrm {VIX}}\) the sets of strikes for SPX and VIX options, respectively.
Proposition 4.5 implies that the SPX call option payoff is given by
$$\displaystyle \begin{aligned} e^{-rT}(\tilde{S}_{T}(\ell)-K)^{+}{=}e^{-rT}\biggl(\exp\biggl\{(r-q)T-\frac{1}{2}\ell^\top Q^{0}(T)\ell {+}\sum_{|I|\le n}\ell_{I}\langle \tilde{e}_{I}^{B},\widehat{\mathbb{Z}}_{T}\rangle\biggr\}-K\biggr)^{+}, \end{aligned}$$
for a given maturity \(T>0\) and a strike price \(K>0\). Recall from Remark 4.1 that \(\tilde {S}\) stands for the undiscounted, unadjusted price process, r is the interest rate and q the dividend. Similarly, according to Theorem 4.4, the VIX call option payoff reads
$$\displaystyle \begin{aligned} e^{-rT}({\text{VIX}}_{T}(\ell)-K)^{+} &= e^{-rT}\biggl(\sqrt{\frac{1}{\Delta}\ell^{\top}Q(T,\Delta)\ell}-K\biggr)^{+}\\ &= e^{-rT}\left(\frac{1}{\sqrt{\Delta}}\lVert U_{T}^{\top}\ell \rVert- K\right)^{+}, \end{aligned} $$
where we applied the Cholesky decomposition \(Q(T,\Delta )=U_TU_T^{\top }\). To evaluate the expectations in the pricing formulas for the above options as well as for the VIX futures, we apply a Monte Carlo procedure. We point out, however, that no additional Monte Carlo simulation is needed to evaluate the conditional expectation in the definition of the VIX. This is a significant computational benefit of our model, owing to the polynomial technology.
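The two expressions for the VIX payoff can be compared numerically. The sketch below is only an illustration under stated assumptions: the matrix `Q` is a random positive definite stand-in for \(Q(T,\Delta )\) (not computed from a signature), \(\Delta \) is taken to be 30 days, and the function names are hypothetical.

```python
import numpy as np

def vix_from_q(ell, Q, delta):
    """VIX_T(ell) = sqrt(ell^T Q ell / delta), evaluated via Q = U U^T."""
    U = np.linalg.cholesky(Q)  # lower-triangular factor with Q = U @ U.T
    return np.linalg.norm(U.T @ ell) / np.sqrt(delta)

def vix_call_payoff(ell, Q, delta, strike, r, T):
    """Discounted call payoff on the VIX for one sample of Q(T, Delta)."""
    return np.exp(-r * T) * max(vix_from_q(ell, Q, delta) - strike, 0.0)

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5))
Q = A @ A.T + 1e-8 * np.eye(5)   # hypothetical stand-in, positive definite
ell = rng.normal(size=5)
delta = 30 / 365.25              # assumed VIX horizon of 30 days

direct = np.sqrt(ell @ Q @ ell / delta)
via_cholesky = vix_from_q(ell, Q, delta)
print(direct, via_cholesky)
```

A possible reason to prefer the factorized form is that \(U_T\) does not depend on \(\ell \), so it can be computed once per sample and reused while \(\ell \) changes during the optimization. Note that `np.linalg.cholesky` requires a positive definite input, so a small diagonal shift may be needed in practice.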
The loss function to be minimized in the joint calibration is simply a convex combination of the individual loss functions of VIX and SPX respectively:
$$\displaystyle \begin{aligned} {} L_{\text{joint}}(\ell,\lambda):=\lambda L_{{\text{SPX}}}(\ell)+(1-\lambda) L_{{\text{VIX}}}(\ell), \end{aligned} $$
(30)
for \(\lambda \in (0,1)\). In particular, for some loss function \(\mathcal {L}^{\beta }\) to be specified below,
  • the VIX loss function is given by
    $$\displaystyle \begin{aligned} &L_{{\text{VIX}}}(\ell):=\\ &\;\; \!\!\!\sum_{\substack{T\in \mathcal{T}^{VIX}\\ K\in\mathcal{K}^{VIX}}}\!\!\!\mathcal{L}^{\beta}\!\left(\pi_{{\text{VIX}}}^{\text{model}}(\ell,T,K),\pi_{\text{{\text{VIX}}}}^{b,a}(T,K),\sigma_{\text{{\text{VIX}}}}^{b,a}(T,K), F_{{\text{VIX}}}^{\text{model}}(\ell,T)\!,F_{{\text{VIX}}}^{mkt}(T)\right)\!, \end{aligned} $$
    where \(\pi _{VIX}^{model}\) and \(F_{VIX}^{model}\) are the Monte Carlo option and futures prices under our model (i.e. for \({\text{VIX}}_T(\ell ,\omega _i)\) defined as in (27));
  • the SPX loss function is given by
    $$\displaystyle \begin{aligned} L_{{\text{SPX}}}(\ell):=\!\!\!\sum_{T\in\mathcal{T}^{{\text{SPX}}},K\in\mathcal{K}^{{\text{SPX}}}}\!\!\!\mathcal{L}^{\beta}(\pi_{{\text{SPX}}}^{\text{model}}(\ell,T,K),\pi_{\text{{\text{SPX}}}}^{b,a}(T,K),\sigma_{\text{{\text{SPX}}}}^{b,a}(T,K)). \end{aligned} $$
    Note that in the case of the SPX no futures prices need to be calibrated (because the index value can directly be used as spot price); by a slight abuse of notation, we therefore do not pass them as inputs to \(\mathcal {L}^{\beta }\).
For some \(\beta \in (0,1]\), we define \(\mathcal {L}^{\beta }\) to be the loss function
$$\displaystyle \begin{aligned} {} &\mathcal{L}^{\beta}(\pi, \pi^{mkt,b,a}, \sigma^{mkt,b,a}, F, F^{mkt})=\\ \notag &\;\left(\!\frac{\big(\beta\tilde{1}_{\{\pi \notin \pi^{mkt,b,a}\}}{+} (1{-}\beta)\big)\big|\pi{-}(\pi^{mkt,a}\,{+}\,\pi^{mkt,b})/2\big|\,{+}\,\big|\delta^{mkt}e^{-rT}(F\,{-}\,F^{mkt})\big|}{{\upsilon^{mkt} (\sigma^{mkt,a}{-}\sigma^{mkt,b})}}\right)^2\!, \end{aligned} $$
(31)
with
  • \(\upsilon ^{mkt}\), \(\delta ^{mkt}\) denoting the Vega and Delta of the option under the Black-Scholes model (but note that they depend on both maturity and strike price);
  • \(\pi ^{mkt, b, a}=[\pi ^{mkt,b},\pi ^{mkt,a}]\), \(\sigma ^{mkt, b, a}=[\sigma ^{mkt,b},\sigma ^{mkt,a}]\), with \(\pi ^{mkt,b},\pi ^{mkt,a}\), \(\sigma ^{mkt,b}\) and \(\sigma ^{mkt,a}\) denoting the market bid and ask prices and implied volatilities, respectively;
  • F and \(F^{mkt}\) standing for the model's and the market's futures prices, respectively (with maturity T);
  • \(\tilde {1}_{\{x \notin [y^{b},y^{a}]\}}:=s(y^b-x)+s(x-y^a)\) for \(s(x):=\frac {1}{2}\tanh (100x)+\frac {1}{2}\) a smooth approximation of the indicator function, which penalizes implied volatilities that lie outside the bid-ask spread.
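The loss (31) and the convex combination (30) can be sketched for a single option as follows. This is an illustration only: the function and argument names are hypothetical, the futures term is switched off by default (as for the SPX), and the numerical inputs are made up.

```python
import numpy as np

def s(x):
    """Smooth step s(x) = tanh(100 x)/2 + 1/2."""
    return 0.5 * np.tanh(100.0 * x) + 0.5

def soft_outside(x, bid, ask):
    """Smooth approximation of the indicator of x lying outside [bid, ask]."""
    return s(bid - x) + s(x - ask)

def loss_beta(pi, pi_bid, pi_ask, sig_bid, sig_ask, vega, beta,
              F=0.0, F_mkt=0.0, delta_bs=0.0, r=0.0, T=0.0):
    """Single-option loss following (31); defaults drop the futures term."""
    mid = 0.5 * (pi_bid + pi_ask)
    weight = beta * soft_outside(pi, pi_bid, pi_ask) + (1.0 - beta)
    futures_term = np.abs(delta_bs * np.exp(-r * T) * (F - F_mkt))
    numerator = weight * np.abs(pi - mid) + futures_term
    return (numerator / (vega * (sig_ask - sig_bid))) ** 2

def joint_loss(loss_spx, loss_vix, lam):
    """Convex combination (30) of the SPX and VIX losses."""
    return lam * loss_spx + (1.0 - lam) * loss_vix

# Made-up quotes: a model price inside vs. outside the bid-ask interval.
inside = loss_beta(1.05, 1.0, 1.2, 0.18, 0.22, vega=2.0, beta=1.0)
outside = loss_beta(1.50, 1.0, 1.2, 0.18, 0.22, vega=2.0, beta=1.0)
print(inside, outside, joint_loss(2.0, 4.0, 0.35))
```

With \(\beta =1\), a model price inside the bid-ask interval contributes almost nothing, while a price outside is penalized by its distance to the mid price, scaled by Vega times the implied volatility spread.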

4.3 Numerical Results

Before presenting the numerical results concerning the joint calibration problem within our framework, it is worth mentioning that the literature follows two main approaches when choosing the maturities to fit.
1. One approach is to choose the first maturity to be the same for VIX and SPX, \(T_{1}^{{\text{SPX}}}=T_{1}^{{\text{VIX}}}\), and then to set \(T_{j}^{{\text{SPX}}}=T_{j-1}^{{\text{VIX}}}+\Delta \) for \(j\ge 2\); see for instance [24, 27, 29, 30]. Sometimes the first maturities of SPX and VIX are chosen to differ by a few days, if the same maturity is not available for both on the market.
2. Another approach consists of taking the same (or nearly the same) set of maturities for SPX and VIX, \(\mathcal {T}^{{\text{SPX}}}=\mathcal {T}^{{\text{VIX}}}\), as was done for example in [4, 23, 39].
In the following we will adhere to the first approach and refer the reader to [13] for results regarding the second approach.
The trading day we consider for our calibration is 02/06/2021, the same as was used in [24, 30], and we use call options for both VIX and SPX. In the table below, we report the maturities used in the calibration (where \(T=1\equiv 365.25\) days) together with the corresponding moneyness ranges (i.e. strike price normalized by the spot price or the market's futures price, in percent) within which we calibrated our model.
$$\displaystyle \begin{array}{l|ccccc} \text{Maturity} & T_{1}^{\text{VIX}}=0.0383 & T_{2}^{\text{VIX}}=0.0767 & T_{1}^{\text{SPX}}=0.0383 & T_{2}^{\text{SPX}}=0.1205 & T_{3}^{\text{SPX}}=0.1588 \\ \hline \text{Moneyness range} & (90\%,220\%) & (90\%,220\%) & (92\%,105\%) & (70\%,105\%) & (80\%,120\%) \end{array}$$
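The reported year fractions are consistent with the first approach described above. The following sketch reproduces them from the day counts of this calibration (14 and 28 days for the VIX), assuming a VIX horizon of \(\Delta =30\) days; the variable names are hypothetical.

```python
DAYS_PER_YEAR = 365.25   # convention T = 1 <-> 365.25 days
DELTA_DAYS = 30          # assumed VIX horizon Delta, in days

vix_days = [14, 28]
# First approach: T_1^SPX = T_1^VIX and T_j^SPX = T_{j-1}^VIX + Delta for j >= 2.
spx_days = [vix_days[0]] + [d + DELTA_DAYS for d in vix_days]

def to_years(days):
    """Convert a day count to a year fraction, rounded to four decimals."""
    return round(days / DAYS_PER_YEAR, 4)

vix_T = [to_years(d) for d in vix_days]
spx_T = [to_years(d) for d in spx_days]
print(spx_days, vix_T, spx_T)
```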
The maturities in days are \(\mathcal {T}^{SPX}= (14, 44, 58)\) and \(\mathcal {T}^{VIX}= (14,28)\). Let us also point out that the high-moneyness region (up to 220\(\%\)) for \({\text{VIX}}\) options is known to be challenging to fit. In our model, both a primary process and a truncation level of the signature have to be chosen a priori; we treat these choices as hyperparameters and do not train them. As primary process we choose a three-dimensional OU process (see Example 2.9) with randomly chosen parameters
$$\displaystyle \begin{aligned} \kappa=(0.1,25,10)^\top, \qquad \theta=(0.1,4,0.08)^\top,\qquad \sigma=(0.7, 10,5)^\top, \end{aligned}$$
$$\displaystyle \begin{aligned} \rho= \begin{pmatrix} 1 & 0.213 & -0.576 & 0.329 \\ \cdot & 1 & -0.044 & -0.549 \\ \cdot & \cdot & 1 & -0.539 \\ \cdot & \cdot & \cdot & 1 \\ \end{pmatrix}, \qquad X_{0}=(1,0.08,2)^\top, \end{aligned}$$
and truncate the signature at \(n=3\). In this setting, our model has 85 trainable parameters, meaning that \(\ell \in \mathbb {R}^{85}\). Note that choosing instead a two-dimensional Brownian motion as primary process does not lead to good fits at low truncation levels of the signature (\(n=3\)), as shown in [13, Appendix A]. The OU process is therefore a natural choice: it is a tractable polynomial process that exhibits richer dynamics than correlated Brownian motions alone. Moreover, OU processes have proven suitable for volatility modeling, as shown in various articles; see e.g. [1, 2, 38, 40]. For the loss function, we choose the parameters \(\lambda =0.35\) and \(\beta =1\). To evaluate the pricing functional, we simulate \(N_{MC}=80{,}000\) Monte Carlo samples.
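One way to recover the count of 85 parameters is to count the words of length at most \(n\) over an alphabet of \(d+1\) letters. The sketch below assumes that the signature is taken of the time-extended primary process (so \(d+1=4\) letters for \(d=3\)) and that \(\ell \) carries one coefficient per word, including the empty word; the function name is hypothetical.

```python
def num_words(alphabet_size, n):
    """Number of words of length 0, 1, ..., n over the given alphabet."""
    return sum(alphabet_size ** k for k in range(n + 1))

d, n = 3, 3
print(num_words(d + 1, n))  # 1 + 4 + 16 + 64
```

Under the same assumptions, a two-dimensional primary process truncated at \(n=3\) would give `num_words(3, 3)`, i.e. 40 coefficients.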
The calibrated implied volatility smiles are depicted in Fig. 5, and the absolute values of the relative errors between the model's and the market's futures prices for each VIX maturity are reported in the table below.
Fig. 5
The implied volatility smiles correspond to \(T_{1}^{{\text{SPX}}}, T_{1}^{{\text{VIX}}}, T_{2}^{\text{SPX}}, T_{2}^{{\text{VIX}}}, T_{3}^{{\text{SPX}}}\) (from top left to bottom right). The blue dots correspond to the values given by our calibrated model and the red stars are the market's bid-ask implied volatilities (from market call options). For the VIX, we additionally show the market's futures price by a red dashed line and that of the calibrated model by a blue dashed line.
$$\displaystyle \begin{array}{cc} T_{1}^{\text{VIX}}=0.0383 & T_{2}^{\text{VIX}}=0.0767 \\ \hline \varepsilon_{T_{1}}^{\text{VIX}}=9.8\cdot 10^{-6} & \varepsilon_{T_{2}}^{\text{VIX}}=6.6\cdot 10^{-8} \end{array}$$
Indeed, the model does seem to fit the market implied volatilities of the VIX and SPX options at the given maturities rather well. As shown in Fig. 5, the model’s implied volatilities lie within the market’s bid-ask spreads and the calibrated futures prices coincide with those of the market. The latter is confirmed by the small relative errors reported in the table. However, it is also worth noting that this numerical experiment covers only a small set of maturities and a single trading day. We refer the reader to [13] for a more thorough empirical study with calibration results for various sets of maturities and tests for parameter stability under re-calibration over multiple weeks.

Acknowledgements

The first three authors gratefully acknowledge financial support through grant Y 1235 and grant I 3852 of the Austrian Science Fund. All authors acknowledge financial support through the OEAD WTZ project FR.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Title
Signature-Based Models in Finance
Authors
Christa Cuchiero
Guido Gazzani
Janka Möller
Sara Svaluto-Ferro
Copyright Year
2026
DOI
https://doi.org/10.1007/978-3-031-97239-3_7
1
Note that the SPX is a theoretical index tracking the value of the S&P 500 without holding an actual portfolio of the constituent stocks. In the following we shall use the terms SPX and S&P 500 interchangeably.
 
1. E. Abi Jaber, C. Illand, S. Li, Joint SPX-VIX calibration with Gaussian polynomial volatility models: deep pricing with quantization hints. Preprint. arXiv:2212.08297 (2022)
2. E. Abi Jaber, C. Illand, S. Li, The quintic Ornstein-Uhlenbeck volatility model that jointly calibrates SPX & VIX smiles. Preprint. arXiv:2212.10917 (2022)
3. H. Boedihardjo, J. Diehl, M. Mezzarobba, H. Ni, The expected signature of Brownian motion stopped on the boundary of a circle has finite radius of convergence. Bull. Lond. Math. Soc. 53(1), 285–299 (2021)
4. A. Bondi, S. Pulido, S. Scotti, The rough Hawkes Heston stochastic volatility model. Math. Finance 1, 1–45 (2024)
5. T. Cass, E. Ferrucci, On the Wiener chaos expansion of the signature of a Gaussian process, in Probability Theory and Related Fields (2024), pp. 1–39
6. R. Cont, S. Ben Hamida, Recovering volatility from option prices by evolutionary optimization. J. Comput. Finance 8(4), 43–76 (2005)
7. C. Cuchiero, J. Möller, Signature methods in stochastic portfolio theory. Preprint. arXiv:2310.02322 (2023)
8. C. Cuchiero, S. Svaluto-Ferro, Infinite-dimensional polynomial processes. Finance Stoch. 25(2), 383–426 (2021)
9. C. Cuchiero, M. Keller-Ressel, J. Teichmann, Polynomial processes and their applications to mathematical finance. Finance Stoch. 16, 711–740 (2012)
10. C. Cuchiero, W. Khosrawi, J. Teichmann, A generative adversarial network approach to calibration of local stochastic volatility models. Risks 8(4), 101 (2020)
11. C. Cuchiero, G. Gazzani, S. Svaluto-Ferro, Signature-based models: Theory and calibration. SIAM J. Financ. Math. 14(3), 910–957 (2023)
12. C. Cuchiero, S. Svaluto-Ferro, J. Teichmann, Signature SDEs from an affine and polynomial perspective. Preprint. arXiv:2302.01362 (2023)
13. C. Cuchiero, G. Gazzani, J. Möller, S. Svaluto-Ferro, Joint calibration to SPX and VIX options with signature-based models. Math. Finance 1, 1–53 (2024)
14. C. Cuchiero, F. Guida, L. Di Persio, S. Svaluto-Ferro, Measure-valued affine and polynomial diffusions. Stoch. Process. Appl. 175, 104392 (2024)
15. C. Cuchiero, F. Primavera, S. Svaluto-Ferro, Universal approximation theorems for continuous functions of càdlàg paths and Lévy-type signature models. Finance Stoch. (2025), to appear
16. F. Delbaen, W. Schachermayer, A general version of the fundamental theorem of asset pricing. Math. Ann. 300(1), 463–520 (1994)
17. G. Di Nunno, K. Kubilius, Y. Mishura, A. Yurchenko-Tytarenko, From constant to rough: a survey of continuous volatility modeling. Mathematics 11(19), 4201 (2023)
18. T. Fawcett, Problems in stochastic analysis. Connections between rough paths and non-commutative harmonic analysis. PhD Thesis, Univ. Oxford, 2003
19. D. Filipović, M. Larsson, Polynomial diffusions and applications in finance. Finance Stoch. 20(4), 931–972 (2016)
20. P.K. Friz, A. Shekhar, General rough integration, Lévy rough paths and a Lévy–Kintchine-type formula. Ann. Probab. 45(4), 2707–2765 (2017)
21. P.K. Friz, P.P. Hager, N. Tapia, Unified signature cumulants and generalized Magnus expansions, in Forum of Mathematics, Sigma, vol. 10 (Cambridge University Press, Cambridge, 2022), p. e42
22. J. Gatheral, The Volatility Surface: A Practitioner’s Guide (Wiley, 2011)
23. J. Gatheral, T. Jaisson, M. Rosenbaum, Volatility is rough. Quant. Finance 18(6), 933–949 (2018)
24. G. Gazzani, J. Guyon, Pricing and calibration in the 4-factor path-dependent volatility model. Preprint. arXiv:2406.02319 (2024)
25. P. Gierjatowicz, M. Sabate-Vidales, D. Siska, L. Szpruch, Z. Zuric, Robust pricing and hedging via neural SDEs. J. Comput. Finance 3(26), 1–32 (2022)
26. P. Glasserman, Monte Carlo Methods in Financial Engineering, vol. 53 (Springer, 2004)
27. I. Guo, G. Loeper, J. Obloj, S. Wang, Optimal transport for model calibration. Preprint. arXiv:2107.01978 (2021)
28. J. Guyon, The joint S&P 500/VIX smile calibration puzzle solved. Risk, April (2020)
29. J. Guyon, Dispersion-constrained martingale Schrödinger problems and the exact joint S&P 500/VIX smile calibration puzzle. Finance Stoch. 28, 1–53 (2023)
30. J. Guyon, J. Lekeufack, Volatility is (mostly) path-dependent. Quant. Finance 23, 1–38 (2023)
31. P.S. Hagan, D. Kumar, A.S. Lesniewski, D. Woodward, Managing smile risk. Best Wilmott 1, 249–296 (2002)
32. T. Lyons, H. Ni, Expected signature of Brownian motion up to the first exit time from a bounded domain. Ann. Probab. 43(5), 2729–2762 (2015)
33. T. Lyons, N. Victoir, Cubature on Wiener space. Proc. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 460(2041), 169–198 (2004)
34. T. Lyons, S. Nejad, I. Perez Arribas, Non-parametric pricing and hedging of exotic derivatives. Appl. Math. Finance 27(6), 457–494 (2020)
35. A. Neuberger, The log contract. J. Portfolio Manag. 20(2), 74 (1994)
36. H. Ni, The expected signature of a stochastic process. Ph.D. thesis, University of Oxford, 2012
37. I. Perez Arribas, C. Salvi, L. Szpruch, Sig-SDEs model for quantitative finance, in Proceedings of the First ACM International Conference on AI in Finance (2020), pp. 1–8
38. S. Rømer, Empirical analysis of rough and classical stochastic volatility models to the SPX and VIX markets. Quant. Finance 22, 1–34 (2022)
39. M. Rosenbaum, J. Zhang, Deep calibration of the quadratic rough Heston model. Preprint. arXiv:2107.01611 (2021)
40. E.M. Stein, J.C. Stein, Stock price distributions with stochastic volatility: an analytic approach. Rev. Financ. Stud. 4(4), 727–752 (1991)