Abstract
The signature transform, a Cartan type development, translates paths into high-dimensional feature vectors, capturing their intrinsic characteristics. Under natural conditions, the expectation of the signature determines the law of the signature, providing a statistical summary of the data distribution. This property facilitates robust modeling and inference in machine learning and stochastic processes. Building on previous work by the present authors (Friz et al., Unified signature cumulants and generalized Magnus expansions. In Forum of Mathematics, Sigma, vol. 10, p. e42, 2022), we here revisit the actual computation of expected signatures, in a general semimartingale setting. Several new formulae are given. A log-transform of (expected) signatures leads to log-signatures (signature cumulants), offering a significant reduction in complexity.
1 Introduction
Signatures and expected signatures have become increasingly important in recent years, offering a top-down description of sequential data and its statistics, respectively. To formalize this, recall the tensor series over \(\mathbb {R}^d\), denoted \(\mathcal {T} := T\big (\big (\mathbb {R}^d\big )\big )\), defined as the infinite direct product of tensor powers of \(\mathbb {R}^d\), with elements of the form \(\mathbf {x} = \big ({\mathbf {x}}^{(0)}, {\mathbf {x}}^{(1)}, {\mathbf {x}}^{(2)}, \dots \big )\), \({\mathbf {x}}^{(k)} \in (\mathbb {R}^d)^{\otimes k}, k \in {\mathbb {N}}\), with both \((\mathbb {R}^d)^{\otimes 0} \cong \mathbb {R}\) and \((\mathbb {R}^d)^{\otimes 1} \cong \mathbb {R}^d\) embedded in \(\mathcal {T}\). Tensor series can be multiplied, so that \(\mathcal {T}\) carries a natural algebra structure; \(\exp \) and \(\log \) are then defined by their usual power series. Given a smooth \(\mathbb {R}^d\)-valued path X, its signature on \([0,T]\) is obtained by solving the universal linear differential equation
which just rephrases the common definition in terms of iterated integrals. We then set \(\mathrm {Sig} (X)_{0,T} := {\mathbf {S}}_T\). Here and below, write \(\mathcal {T}_\lambda \) for series starting with (scalar) \(\lambda \). It is known, essentially as a consequence of the chain rule, that the log-signature \(\log {\mathbf {S}}_T\) takes values in \(\mathcal {L} \equiv \mathrm {Lie} \big (\big (\mathbb {R}^d\big )\big ) (\subset \mathcal {T}_0)\), the (generically infinite) Lie series over \(\mathbb {R}^d\); cf. [24, 29, 55].
Equation (1) remains meaningful when X is a sufficiently smooth \(\mathcal {T}_0\)-valued path, including the important case of \(\mathcal {L}\)-valued path; in which case \(\mathbf {S}\) is known as smooth geometric rough path, of recent importance in signature kernel methods [6, 44].
Now, many paths of interest arise as sample paths of stochastic processes and are far from continuously differentiable, let alone smooth. A natural class of stochastic processes for which (1) still makes sense - as a (Stratonovich) stochastic differential equation - is given by semimartingales. Let accordingly \(\mathscr {S}^c(\mathbb {R}^d)\) denote the class of continuous, d-dimensional semimartingales on some filtered probability space \((\Omega , (\mathcal {F}_t)_{t \ge 0}, \mathbb {P})\). We recall that this is essentially the decisive class of stochastic processes that allows for a reasonable stochastic integration theory, notably the one required by no-arbitrage based continuous-time finance. Classic texts include [37, 42, 54, 56]; as a concise introduction we recommend Chapter 1 of [2]. Similar to the above one considers, for \(X \in \mathscr {S}^c(\mathbb {R}^d)\),
the solution of which is expressed in terms of iterated (Stratonovich) integrals. As before, (2) remains meaningful in greater generality, with X replaced by a \(\mathcal {T}_0\)-valued semimartingale, say \(\mathbf {X} \in \mathscr {S}^c(\mathcal {T}_0)\). We then set \(\mathrm {Sig} (\mathbf {X})_{0,T} (\omega ) := {\mathbf {S}}_T (\omega )\), a (random) element in \(\mathcal {T}_1\). Since \(\mathcal {T}_0\) (resp. \(\mathcal {T}_1\)) are simple examples of Lie algebras (resp. groups), the linear Stratonovich SDE (2) then falls into a classical setting [33, 52].
Whenever the (generalized) signature \(\mathrm {Sig} (\mathbf {X})_{0, T}\) is component-wise integrable, we define the expected signature and signature cumulants by
Already in the case where \(\mathbf {X}\) is deterministic and sufficiently regular, this leads to an interesting (ordinary differential) equation for \(\boldsymbol {\kappa }\) with accompanying (Magnus) expansion, well understood as an effective computational tool [7, 36]. Lyons [47] is an excellent survey with a variety of applications, ranging from machine learning to numerical algorithms on Wiener space known as cubature [48]; signature cumulants were named and first studied in their own right in [9], providing, in particular, formulas for converting signature moments to signature cumulants and vice versa.
In the special case of \(d=1\) and \(\mathbf {X}=(0,X,0,\dots )\) where X is a scalar semimartingale, \(\boldsymbol {\mu } (T)\) and \(\boldsymbol {\kappa } (T)\) are nothing but the sequence of moments and cumulants of the real-valued random variable \(X_T-X_0\). When \(d > 1\), signature moments/cumulants provide an effective way to describe the process X on \([0,T]\), see [15, 43, 47]. The question arises how to compute them. If one takes \(\mathbf {X}\) as d-dimensional Brownian motion, the signature cumulant \(\boldsymbol {\kappa }(T)\) equals \((T/2) {\mathbf {I}}_d\), where \({\mathbf {I}}_d\) is the identity 2-tensor over \(\mathbb {R}^d\). This is known as Fawcett’s formula, [21, 48].
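In the scalar case just described, \(\boldsymbol {\kappa } = \log \boldsymbol {\mu }\) reduces to the classical moment-to-cumulant conversion, i.e. taking the logarithm of the exponential generating function of the moments. The following sketch (our own illustration, with hypothetical helper names, not code from the text) performs this conversion and checks it on the standard Gaussian, whose only nonzero cumulant is \(\kappa _2 = 1\):

```python
from math import factorial

def poly_mul(p, q, n):
    """Multiply polynomials (coefficient lists in t), truncating beyond degree n."""
    r = [0.0] * (n + 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            if i + j <= n:
                r[i + j] += pi * qj
    return r

def moments_to_cumulants(moments):
    """kappa_1..kappa_n from mu_1..mu_n, via log of the moment EGF
    M(t) = 1 + sum_k mu_k t^k / k!."""
    n = len(moments)
    u = [0.0] + [moments[k - 1] / factorial(k) for k in range(1, n + 1)]
    c, p = [0.0] * (n + 1), [1.0] + [0.0] * n
    for j in range(1, n + 1):            # log(1+u) = sum_j (-1)^{j+1} u^j / j
        p = poly_mul(p, u, n)
        for k in range(n + 1):
            c[k] += (-1) ** (j + 1) / j * p[k]
    return [c[k] * factorial(k) for k in range(1, n + 1)]

# moments 0, 1, 0, 3, 0, 15 of N(0,1) -> cumulants 0, 1, 0, 0, 0, 0
gauss = moments_to_cumulants([0.0, 1.0, 0.0, 3.0, 0.0, 15.0])
```

The same power-series logarithm, applied coefficientwise on tensor levels instead of on scalar degrees, is what produces signature cumulants from signature moments.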
Exposing and building on [28], loosely speaking, our main functional equations (Theorems 3.4 and 3.6) are a vast generalization of Fawcett’s formula. Projecting these equations to tensor levels, we obtain recursions for signature moments and signature cumulants respectively (Theorems 4.1 and 4.4). These recursions represent \(\boldsymbol {\mu }^{(n)}\) in terms of \(\boldsymbol {\mu }^{(1)}, \dots , \boldsymbol {\mu }^{(n-1)}\) and \(\boldsymbol {\kappa }^{(n)}\) in terms of \(\boldsymbol {\kappa }^{(1)}, \dots , \boldsymbol {\kappa }^{(n-1)}\) using the underlying dynamic structure of the probability space.
This chapter is organized as follows: Sect. 2 introduces the mathematical preliminaries, including tensor algebra, continuous semimartingales taking values therein, generalized signatures, and their conditional expectations. In Sect. 3, we derive functional equations for the expected signature and signature cumulants, transitioning from discrete-time processes to continuous semimartingales. Section 4 develops the resulting recursive formulas for signature moments and cumulants and demonstrates how they relate to the so-called diamond products of semimartingales. In Sect. 5, we examine the implications of the general functional equations for multivariate moments and cumulants by projecting to the symmetric algebra. In Sect. 6, we apply the main results to time-inhomogeneous Lévy processes and Brownian rough paths. Finally, Sect. 7 summarizes all main results and corollaries and gives an outlook to potential applications and directions for future research.
2 Preliminaries
2.1 The Tensor Algebra and Tensor Series
Denote by \(T({\mathbb {R}^d})\) the tensor algebra over \({\mathbb {R}^d}\), i.e.
$$\displaystyle \begin{aligned} T({\mathbb{R}^d}) := \bigoplus_{k \ge 0} ({\mathbb{R}^d})^{\otimes k} \ni \mathbf{x} = \sum_{k \ge 0} {\mathbf{x}}^{(k)} = \sum_{w \in \mathcal{W}_d} {\mathbf{x}}^w e_w, \end{aligned}$$
with \({\mathbf {x}}^{(k)} \in ({\mathbb {R}^d})^{\otimes k}, {\mathbf {x}}^w \in \mathbb {R}\) and linear basis vectors \(e_w := e_{i_1}\dotsm e_{i_k}\in ({\mathbb {R}^d})^{\otimes k}\) where w ranges over all words \(w=i_1\dotsm i_k\in \mathcal {W}_d\) over the alphabet \(\{1,\dots ,d\}\). Note \({\mathbf {x}}^{(k)} = \sum _{|w|=k} {\mathbf {x}}^w e_w\) where \(|w|\) denotes the length of a word w. The element \(e_\emptyset = 1 \in ({\mathbb {R}^d})^{\otimes 0} \cong \mathbb {R}\) is the neutral element of the concatenation (tensor) product, which is obtained by linear extension of \(e_we_{w'}=e_{ww'}\) where \(ww' \in \mathcal {W}_d\) denotes concatenation of two words. We thus have, for \(\mathbf {x},\mathbf {y} \in T({\mathbb {R}^d})\),
Note that we reserve the usual product symbol \(\otimes \) for another product that will be introduced below. This will also prevent an overflow of product symbols throughout the text.
This extends naturally to infinite sums, i.e., tensor series, elements of the “completed” tensor algebra
$$\displaystyle \begin{aligned} \mathcal{T} := T\big(\big({\mathbb{R}^d}\big)\big) = \prod_{k \ge 0} ({\mathbb{R}^d})^{\otimes k}, \end{aligned}$$
which are written as in (3), but now as formal infinite sums with identical notation and multiplication rules; the resulting algebra \(\mathcal {T}\) obviously extends \(T(\mathbb {R}^d)\). Denote by \(\mathcal {T}_0\) and \(\mathcal {T}_1\) the subspaces of tensor series starting with 0 and 1 respectively; that is, \(\mathbf {x} \in \mathcal {T}_0\) (resp. \(\mathcal {T}_1\)) if and only if \({\mathbf {x}}^\emptyset =0\) (resp. \({\mathbf {x}}^\emptyset =1\)). Restricted to \(\mathcal {T}_0\) and \(\mathcal {T}_1\) respectively, the exponential and logarithm in \(\mathcal {T}\), defined by the power series,
are globally defined and inverse to each other. We will usually abbreviate \(e^{\mathbf {x}} = \exp (\mathbf {x})\). The vector space \(\mathcal {T}_0\) becomes a Lie algebra with the commutator bracket \([-,-]\colon \mathcal {T}_0\otimes \mathcal {T}_0\to \mathcal {T}_0\) given by
The exponential image \(\mathcal {T}_1=\exp (\mathcal {T}_0)\) is a Lie group, at least formally so. We refrain from equipping the infinite-dimensional \(\mathcal {T}_1\) with a differentiable structure, not necessary in view of the “locally finite” nature of the group law \((\mathbf {x},\mathbf {y}) \mapsto \mathbf {x} \mathbf {y}\), by which we mean that the coefficients of the product \(\mathbf {x}\mathbf {y}\) can be computed with a finite number of operations as is clear from Eq. (4).
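Because the group law is “locally finite” in this sense, every coefficient of a product, of \(\exp \) and of \(\log \) can be computed exactly after truncating beyond some fixed level. The following minimal sketch (our own illustration, not code from the text; names are hypothetical) represents truncated tensor series as dictionaries keyed by words:

```python
N = 4  # truncation level; words are tuples over the alphabet {1, ..., d}

def mul(x, y):
    """Concatenation product of tensor series, truncated beyond level N."""
    z = {}
    for w1, c1 in x.items():
        for w2, c2 in y.items():
            if len(w1) + len(w2) <= N:
                w = w1 + w2
                z[w] = z.get(w, 0.0) + c1 * c2
    return z

def add_scaled(x, y, s):
    z = dict(x)
    for w, c in y.items():
        z[w] = z.get(w, 0.0) + s * c
    return z

def texp(x):
    """exp on T_0: the power series terminates after N steps modulo level N."""
    assert x.get((), 0.0) == 0.0
    r, p, f = {(): 1.0}, {(): 1.0}, 1.0
    for k in range(1, N + 1):
        p, f = mul(p, x), f * k
        r = add_scaled(r, p, 1.0 / f)
    return r

def tlog(y):
    """log on T_1 via log(1 + u) = u - u^2/2 + u^3/3 - ..."""
    assert abs(y.get((), 0.0) - 1.0) < 1e-12
    u = {w: c for w, c in y.items() if w != ()}
    r, p = {}, {(): 1.0}
    for k in range(1, N + 1):
        p = mul(p, u)
        r = add_scaled(r, p, (-1.0) ** (k + 1) / k)
    return r

# exp and log are inverse to each other on T_0 / T_1:
x = {(1,): 1.0, (2,): 0.5, (1, 2): -0.3}
back = tlog(texp(x))
```

Here `texp` and `tlog` play the role of the truncated maps \(\exp _n\) and \(\log _n\); since coefficients at level \(\le N\) of a product depend only on levels \(\le N\) of the factors, the truncation loses nothing at those levels.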
For two tensor series \(\mathbf {x},\mathbf {y}\) we define their outer tensor product
We emphasize that here \(\otimes \) does not denote the (inner tensor) product in \(\mathcal {T}\), for which we did not reserve a symbol, but it denotes another (outer) tensor product. In particular, this expression is not the same as Eq. (4).
Given linear maps \(g,f\colon \mathcal {T}\to \mathcal {T}\) we define
whenever \(\mathbf {x},\mathbf {y}\in \mathcal {T}\), where the multiplication on the right-hand side is performed in \(\mathcal {T}\) according to Eq. (4). It then extends by linearity to the linear span of outer tensors which we will denote by \(\mathcal {T}\otimes \mathcal {T}\).
The subspace \(\mathcal {I}_n := \big \{ \mathbf {x} \in \mathcal {T} \,:\, {\mathbf {x}}^{(0)} = \dots = {\mathbf {x}}^{(n)} = 0 \big \}\) of series with tensors of level \(n+1\) and higher is a two-sided ideal of \(\mathcal {T}\), meaning that \(\mathcal {T}\mathcal {I}_n+\mathcal {I}_n\mathcal {T}\subseteq \mathcal {I}_n\), which turns out to be the right condition under which the quotient space \(\mathcal {T}/\mathcal {I}_n\) has a natural algebra structure. We can identify \(\mathcal {T}/\mathcal {I}_n\) with
which formally is the same as Eq. (4) after setting \({\mathbf {x}}^{(k)}=0\) for all \(k>n\) as the above definition suggests. We denote the canonical projection map by \(\pi _{(0,n)}\colon \mathcal {T}\to \mathcal {T}^n\). The usual power series in \(\mathcal {T}^{n}\) define \(\exp _n\colon \mathcal {T}^n_0 \to \mathcal {T}^n_1\) with inverse \(\log _n\colon \mathcal {T}^n_1 \to \mathcal {T}^n_0\), and we may again abuse notation and write \(\exp \) and \(\log \) when no confusion arises. As before, \(\mathcal {T}^n_0\) has a natural Lie algebra structure, and \(\mathcal {T}^n_1\) (now finite dimensional) is a bona fide Lie group.
where \(|\cdot |_{({\mathbb {R}^d})^{\otimes k}}\) is the Euclidean norm on \(({\mathbb {R}^d})^{\otimes k}\cong \mathbb {R}^{d^k}\), which makes it a Banach space. The same norm makes sense in \(\mathcal {T}^n\), and the definition is consistent in the sense that \(|a|_{\mathcal {T}^k} = |a|_{\mathcal {T}^n}\) for any \(a \in \mathcal {T}^{n}\) and \(k \ge n\), and \(|a|_{\mathcal {T}^n} = |a|_{({\mathbb {R}^d})^{\otimes n}}\) for any \(a \in ({\mathbb {R}^d})^{\otimes n}\). We will drop the index whenever possible and write simply \(|a|\). For a word \(w \in \mathcal {W}_d\) with \(|w|>0\) we define the directional derivative for a function \(f\colon \mathcal {T} \to \mathbb {R}\) by \((\partial _w f)(\mathbf {x}):=\partial _t f(\mathbf {x} + t e_w)\big \vert _{t=0},\) whenever the derivative exists. In particular, for the exponential function we have
where the operators \(G( \operatorname {\mathrm {ad}}\mathbf {x})\colon \mathcal {T}_0\to \mathcal {T}_0\) and \(Q( \operatorname {\mathrm {ad}}\mathbf {x})\colon \mathcal {T}_0\otimes \mathcal {T}_0\to \mathcal {T}_0\) are defined by the power series
The first identity is well known and can be traced back to Schur [57]. With these identities one also obtains the following formal Taylor expansion of the exponential map:
where we recall that the symbol \(\otimes \) is meant as an outer tensor product. Naturally, similar formulas hold in the truncated tensor algebra where \(\exp \) is replaced by \(\exp _n\).
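We do not reproduce the power series of G and Q here; in the finite-dimensional matrix setting, however, the first identity is the classical derivative-of-the-exponential formula going back to Schur, \( \partial _t e^{\mathbf {x} + t \mathbf {h}} \big \vert _{t=0} = \big ( \sum _{k \ge 0} \tfrac {1}{(k+1)!} ( \operatorname {\mathrm {ad}} \mathbf {x})^k \mathbf {h} \big )\, e^{\mathbf {x}} \), and the following numerical sketch (our own; nilpotent test matrices, so all series terminate) verifies that version against a central difference:

```python
import numpy as np

def mexp(a, terms=25):
    """Matrix exponential by plain Taylor summation (fine for small matrices)."""
    r, p = np.eye(a.shape[0]), np.eye(a.shape[0])
    for k in range(1, terms):
        p = p @ a / k                 # p = a^k / k!
        r = r + p
    return r

def dexp(x, h, terms=25):
    """(sum_k (ad x)^k h / (k+1)!) e^x -- derivative of exp at x in direction h."""
    s, term, fact = np.zeros_like(h), h, 1.0
    for k in range(terms):
        fact *= k + 1                 # fact = (k+1)!
        s = s + term / fact
        term = x @ term - term @ x    # apply ad x once more
    return s @ mexp(x)

# strictly upper-triangular (hence nilpotent) test matrices:
x = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 2.0], [0.0, 0.0, 0.0]])
h = np.array([[0.0, 0.5, 1.0], [0.0, 0.0, 0.3], [0.0, 0.0, 0.0]])
```

A central difference \((e^{\mathbf {x}+\varepsilon \mathbf {h}} - e^{\mathbf {x}-\varepsilon \mathbf {h}})/(2\varepsilon )\) agrees with `dexp(x, h)` to machine precision here, since for nilpotent matrices all the expressions involved are polynomials.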
2.2 Semimartingales
Let \(\mathscr {C}\) be the space of continuous adapted processes \(X\colon \Omega \times [0,T) \to \mathbb {R}\), with \(T\in (0,\infty ]\), defined on some filtered probability space \((\Omega , (\mathcal {F}_t)_{0 \le t \le T}, \mathbb {P})\). The space of continuous semimartingales \(\mathscr {S}^c\) is given by the processes \(X\in \mathscr {C}\) that can uniquely be decomposed as \(X = X_0 + M + A\),
where \(M \in \mathscr {M}_{\mathrm {loc}}\) is a continuous local martingale, and \(A\in \mathscr {V}\) is an adapted process of locally bounded variation, both started at zero. The (predictable) quadratic variation process of X is denoted by \(\langle X\rangle _t\). Covariation angle brackets \(\left \langle X, Y \right \rangle \), for another real-valued semimartingale Y , are defined by polarization. For \(q \in [1, \infty )\), write \(\mathcal {L}^{q} = L^{q}(\Omega , \mathcal {F}, \mathbb {P})\), then a Banach space \(\mathscr {H}^{q} \subset \mathscr {S}^c\) is given by those \(X\in \mathscr {S}^c\) with \(X_0 = 0\) and
and define the space \(\mathscr {S}^q\subset \mathscr {S}^c\) of semimartingales \(X\in \mathscr {S}^c\) such that \(\Vert {X}\Vert _{\mathscr {S}^q}< \infty \). Note that there exists a constant \(c_q>0\) depending on q such that (see [54, Ch. V, Theorem 2])
We view d-dimensional semimartingales, \(X= \sum _{i=1}^d X^i e_i \in \mathscr {S}^c(\mathbb {R}^d)\), as special cases of tensor series valued semimartingales \(\mathscr {S}^c(\mathcal {T})\) of the form
with each component \({\mathbf {X}}^w\) a real-valued continuous semimartingale. This extends mutatis mutandis to the spaces of \(\mathcal {T}\)-valued adapted continuous processes \(\mathscr {C}(\mathcal {T})\), martingales \(\mathscr {M}(\mathcal {T})\) and processes of finite variation \(\mathscr {V}(\mathcal {T})\). Note also that we typically deal with \(\mathcal {T}_0\)-valued semimartingales \(\mathbf {X}\), which amounts to having only components for words of length \(|w| \ge 1\). Standard notions such as the continuous local martingale \({\mathbf {X}}^{c}\) and the jump process \(\Delta {\mathbf {X}}_t = {\mathbf {X}}_t - {\mathbf {X}}_{t^-}\) are defined componentwise. To weaken integrability requirements later, we often deal with semimartingales taking values in the truncated tensor algebra \(\mathcal {T}^N\) and will omit specifically extending notations when it is straightforward.
Brackets
Now let \(\mathbf {X}\) and \(\mathbf {Y}\) be \(\mathcal {T}\)-valued semimartingales. We define the (non-commutative) outer quadratic covariation bracket of \(\mathbf {X}\) and \(\mathbf {Y}\) by
As usual, we may write \(\langle \mkern -4mu\langle {\mathbf {X}}\rangle \mkern -4mu\rangle \equiv \langle \mkern -4mu\langle {\mathbf {X}, \mathbf {X}}\rangle \mkern -4mu\rangle \) and \(\left \langle \mathbf {X} \right \rangle \equiv \left \langle \mathbf {X}, \mathbf {X} \right \rangle \).
\(\mathscr {H}\)-Spaces
The definition of \(\mathscr {H}^{q}\)-norm naturally extends to tensor valued martingales. More precisely, for \({\mathbf {X}}^{(n)} \in \mathscr {S}^c(({\mathbb {R}^d})^{\otimes n})\) with \(n\in {\mathbb {N}_{\ge 1}}\) and \(q\in [1,\infty )\) we define
Note that \(\vert \mkern -2.5mu\vert \mkern -2.5mu\vert {\cdot }\vert \mkern -2.5mu\vert \mkern -2.5mu\vert _{\mathscr {H}^{q,N}}\) is sub-additive and positive definite on \(\mathscr {H}^{q, N}\) and it is homogeneous under dilation in the sense that
Note that if \(\mathbf {X}\in \mathscr {S}^c(\mathcal {T})\) is such that \(\vert \mkern -2.5mu\vert \mkern -2.5mu\vert {{\mathbf {X}}^{(0,N)}}\vert \mkern -2.5mu\vert \mkern -2.5mu\vert _{\mathscr {H}^{1,N}} < \infty \) for all \(N\in {\mathbb {N}_{\ge 1}}\), then \(\mathbf {X}\in \mathscr {H}^{\infty -}(\mathcal {T})\).
Stochastic Integrals
We now introduce notation for stochastic integration with respect to tensor valued semimartingales. Denote by \(\mathcal {L}(\mathcal {T}; \mathcal {T}) = \{ f: \mathcal {T} \to \mathcal {T}\;|\; f \text{ is linear}\}\) the space of endomorphisms on \(\mathcal {T}\) and let
where \(\mathcal {I}_n\subset \mathcal {T}\) was introduced in Sect. 2.1, consisting of series with tensors of level \(n+1\) and higher. In this case, we can define the stochastic Itô integral of \(\mathbf {F}\) with respect to \(\mathbf {X}\in \mathscr {S}^c(\mathcal {T})\) by
For example, let \(\mathbf {Y}, \mathbf {Z} \in \mathscr {C}(\mathcal {T})\) and define \(\mathbf {F} := \mathbf {Y}\,\mathrm {Id}\,\mathbf {Z}\), i.e. \({\mathbf {F}}_t(\mathbf {x}) = {\mathbf {Y}}_t \, \mathbf {x} \, {\mathbf {Z}}_t\), the concatenation product from the left and right, for all \(\mathbf {x}\in \mathcal {T}\). Then we see that \(\mathbf {F}\) indeed satisfies the conditions (8) and (9) and we have
Another important example is given by \(\mathbf {F} = ( \operatorname {\mathrm {ad}} \mathbf {Y})^{k} := \operatorname {\mathrm {ad}}\mathbf {Y} \circ \cdots \circ \operatorname {\mathrm {ad}}\mathbf {Y}\) (k-times) for any \(\mathbf {Y}\in \mathscr {C}(\mathcal {T}_0)\) and \(k\in {\mathbb {N}}\). Indeed, we immediately see that \(\mathbf {F}\) satisfies condition (9), and since the iteration of adjoint operations can be expanded in terms of left- and right-multiplication, we also see that \(\mathbf {F}\) satisfies (8). More generally, integrals with respect to power series of adjoint operators are well defined. Indeed, introduce for \(\ell \in (\mathbb {N}_{\ge 1})^k\) the notation \(\ell = (l_1, \dots , l_k)\), \(|\ell |:=k\), \(||\ell ||:=l_1 + \dotsb + l_k\) and \(( \operatorname {\mathrm {ad}} {\mathbf {x}}^{(\ell )}) = ( \operatorname {\mathrm {ad}} {\mathbf {x}}^{(l_{1})} \cdots \operatorname {\mathrm {ad}} {\mathbf {x}}^{(l_k)})\) for any \(\mathbf {x} \in \mathcal {T}_0\). Then for any \((a_k)_{k=0}^{\infty }\subset \mathbb {R}\) and \(\mathbf {X}\in \mathscr {S}^c(\mathcal {T}_0)\) the integral
with respect to processes \(\mathbf {X} \in \mathscr {S}^c(\mathcal {T}\otimes \mathcal {T})\) is completely analogous.
2.3 Generalized Signatures
We now give precise meaning to \(\mathrm d\mathbf {S}=\mathbf {S}\,{\circ \mathrm d}\mathbf {X}\), or component-wise, for every word \(w\in \mathcal {W}_d\),
where the driving noise \(\mathbf {X}\) is a \(\mathcal {T}_0\)-valued continuous semimartingale, i.e. \(\mathbf {X} \in \mathscr {S}^c(\mathcal {T}_0)\). The Itô integral meaning of this equation, when started at time s from \(\mathbf {s} \in \mathcal {T}_1\), for times \(t \ge s\), is given by
leaving the component-wise version to the reader. We have
Proposition 2.1
Suppose \(\mathbf {X}\in \mathscr {S}^c(\mathcal {T}_0)\). For every \(s \in [0,T]\) and \(\mathbf {s} \in \mathcal {T}_1\), equation (13) has a unique global solution on \(\mathcal {T}_1\) starting from \({\mathbf {S}}_s=\mathbf {s}\).
Proof
Note that \(\mathbf {S}\) solves (13) iff \({\mathbf {s}}^{-1} \mathbf {S}\) solves the same equation started from \(1 \in \mathcal {T}_1\). We may thus take \(\mathbf {s} = 1\) without loss of generality. The graded structure of our problem, and more precisely that \(\mathbf {X} = (0,X,\mathbb {X},\dots )\) in (13) has no scalar component, shows that the (necessarily) unique solution is given explicitly by iterated integration, as may be seen explicitly when writing out \({\mathbf {S}}^{(0)} \equiv 1\), \({\mathbf {S}}^{(1)}_t = \int _s^t \mathrm d X = X_{s,t} \in \mathbb {R}^d\),
$$\displaystyle \begin{aligned} {\mathbf{S}}^{(2)}_t = \int_{(s,t]} {\mathbf{S}}^{(1)}_{u}\,\mathrm d X_u +\mathbb{X}_{t} -\mathbb{X}_{s} + \frac{1}{2} \left\langle X \right\rangle _{s,t} \in (\mathbb{R}^d)^{\otimes 2}, \end{aligned}$$
and so on. (In particular, we do not need to rely on abstract existence, uniqueness results for SDEs [40] or Lie group stochastic exponentials [33].) □
Definition 2.2
For \(\mathbf {X}\in \mathscr {S}^c(\mathcal {T}_0)\) and \(s\in [0,T]\) we define \( \mathrm {Sig} (\mathbf {X} \vert _{[s,\cdot ]}) \equiv \mathrm {Sig}(\mathbf {X})_{s,\cdot }\) as the unique solution to (13) such that \(\mathrm {Sig}(\mathbf {X})_{s,s}=1\). We call \(\mathrm {Sig}(\mathbf {X})_{s,t}\) the generalized signature (short: signature) of \(\mathbf {X}\) on \([s,t]\).
Remark 2.3
In rough path theory, one defines signatures for geometric rough paths. This is consistent with Definition 2.2 when \(\mathbf {X} \in \mathscr {S}^c(\mathcal {L})\) (recall \(\mathcal {L} = \mathrm {Lie} \big (\big (\mathbb {R}^d\big )\big )\)), including the special case of a continuous semimartingale with values in \(\mathbb {R}^d\).
The following can be seen as a (generalized) Chen relation.
Lemma 2.4
Let \(\mathbf {X}\in \mathscr {S}^c(\mathcal {T}_0)\) and \(0 \le s \le t \le u \le T\). Then the following identity holds with probability one, for all such \(s,t,u\),
Proof
Call \(\Phi _{t \leftarrow s} \mathbf {s} := {\mathbf {S}}_t\) the solution to (13) at time \(t \ge s\), started from \({\mathbf {S}}_s = \mathbf {s}\). By uniqueness of the solution flow, we have \( \Phi _{u \leftarrow t} \circ \Phi _{t \leftarrow s} = \Phi _{u \leftarrow s} . \) It now suffices to remark that, thanks to the multiplicative structure of (13) we have \( \Phi _{t \leftarrow s} \mathbf {s} = \mathbf {s} \mathrm {Sig}(\mathbf {X})_{s,t}\). □
As usual, when \(\mathbf {X} \in \mathscr {S}^c(\mathcal {T}^{N}_0)\) for some \(N\ge 1\) the development (13) is understood within the truncated tensor algebra \(\mathcal {T}^{N}_1\) and therefore \(\mathrm {Sig}(\mathbf {X})_{s,t}\) is an element in \(\mathcal {T}^N_1\).
2.4 Expected Signatures and Signature Cumulants
Throughout this section let \(\mathbf {X} \in \mathscr {S}^c(\mathcal {T}_0)\) be defined on a filtered probability space \((\Omega , \mathcal {F}, (\mathcal {F}_t)_{0 \le t \le T}, \mathbb {P})\) satisfying the usual conditions and with the property that every martingale has a continuous version. Recall that \(\mathbb {E}_t\) denotes the conditional expectation with respect to the sigma algebra \(\mathcal {F}_t\). When \(\mathbb {E}(|\mathrm {Sig}(\mathbf {X})^w_{0,t}|)<\infty \) for all \(0 \le t \le T\) and all words \(w\in \mathcal {W}_d\), then the (conditional) expected signature
Therefore projecting to the tensor components we have
$$\displaystyle \begin{aligned} \boldsymbol{\mu}_t(T)^w = \sum_{w_1w_2 = w}(-1)^{|w_1|}\mathrm{Sig}(\mathbf{X})^{\overline{w_1}}_{0,t}\mathbb{E}_t\left(\mathrm{Sig}(\mathbf{X})^{w_2}_{0,T}\right), \quad 0 \le t \le T, \quad w \in \mathcal{W}_d. \end{aligned} $$
where \(\overline {w_1}\) denotes word reversal. Since \((\mathrm {Sig}(\mathbf {X})^w_{0, t})_{0 \le t \le T}\) and \((\mathbb {E}_t(\mathrm {Sig}(\mathbf {X})^w_{0,T}))_{0 \le t \le T}\) are semimartingales (the latter in fact a martingale), it follows from Itô’s product rule that \(\boldsymbol {\mu }^w(T)\) is also a semimartingale for all words \(w\in \mathcal {W}_d\), hence \(\boldsymbol {\mu }(T)\in \mathscr {S}^c(\mathcal {T}_1)\). Further, recall that \(\boldsymbol {\kappa }(T) = \log (\boldsymbol {\mu }(T))\) and therefore it follows from the definition of the logarithm on \(\mathcal {T}_1\) that each component \(\boldsymbol {\kappa }(T)^w\) with \(w\in \mathcal {W}_d\) is a polynomial of \((\boldsymbol {\mu }(T)^{v})_{v\in \mathcal {W}_d, |v|\le |w|}\). Hence it follows again by Itô’s product rule that \(\boldsymbol {\kappa }(T)\in \mathscr {S}^c(\mathcal {T}_0)\). □
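When \(\mathbf {X}\) is deterministic, the conditional expectation disappears and \(\boldsymbol {\mu }_t(T) = \mathrm {Sig}(\mathbf {X})_{t,T}\), so the displayed word-reversal formula reduces to the statement that word reversal with alternating sign (the antipode) inverts the group-like element \(\mathrm {Sig}(\mathbf {X})_{0,t}\). This can be checked directly for a piecewise-linear path; the following self-contained sketch (our own, level-3 truncation, illustrative increment values) does so:

```python
N = 3   # truncation level; tensor series as dicts word (tuple) -> coefficient

def mul(x, y):
    z = {}
    for w1, c1 in x.items():
        for w2, c2 in y.items():
            if len(w1) + len(w2) <= N:
                z[w1 + w2] = z.get(w1 + w2, 0.0) + c1 * c2
    return z

def texp(x):
    r, p, f = {(): 1.0}, {(): 1.0}, 1.0
    for k in range(1, N + 1):
        p, f = mul(p, x), f * k
        for w, c in p.items():
            r[w] = r.get(w, 0.0) + c / f
    return r

def antipode(x):
    """Word reversal with sign (-1)^{|w|}; inverts group-like elements."""
    return {w[::-1]: (-1) ** len(w) * c for w, c in x.items()}

# piecewise-linear path in R^2 with three increments; t sits after the first one
d1, d2, d3 = {(1,): 1.0}, {(1,): 0.5, (2,): 1.0}, {(2,): -0.7}
S_0t = texp(d1)
S_tT = mul(texp(d2), texp(d3))
S_0T = mul(S_0t, S_tT)            # Chen's relation

# word-reversal formula: mu^w = sum over splittings w = w1 w2 of
# (-1)^{|w1|} Sig^{rev(w1)}_{0,t} Sig^{w2}_{0,T}, here realised as a product
mu = mul(antipode(S_0t), S_0T)
```

In this deterministic setting `mu` coincides, coefficient by coefficient, with the signature of the path over the remaining interval, `S_tT`.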
It is of strong interest to have a more explicit sufficient condition for the existence of the expected signature. The following theorem, the proof of which can be found in [28, Section 7.2], yields such a criterion.
Theorem 2.6
Let \(q\in [1, \infty )\) and \(N\in {\mathbb {N}_{\ge 1}}\); then there exist two constants \(c,C>0\) depending only on d, N and q, such that for all \(\mathbf {X} = {\mathbf {X}}^{(0,N)} \in \mathscr {H}^{q,N}\)
This estimate is already known and follows from the Burkholder-Davis-Gundy inequality for enhanced martingales, which was first proved in [23].
3 Functional Equations for the Expected Signature
In this section we present central functional equations for the expected signature and signature cumulants of semimartingales. For better accessibility we first present the formulas for discrete processes, which will then inform the results for purely continuous processes via a limiting procedure. We will, however, prove the results in the continuous case directly by a stochastic calculus approach.
3.1 Discrete Processes
Let \((\Omega , \mathcal {G},\mathbb {P})\) be a probability space with filtration \((\mathcal {G}_j)_{j = 0, \dots , J}\). Further, let \(({\mathbf {X}}_j)\) be a \(\mathcal {T}_0\)-valued adapted process. We can define a discrete version of the signature by multiplying exponential increments
This is precisely the signature of the path obtained from linearly interpolating the points \({\mathbf {X}}_0, {\mathbf {X}}_1, \dots , {\mathbf {X}}_J\) in \(\mathcal {T}_0\). Assume for simplicity that \(\mathbf {X}\) has moments of all orders, i.e., \(\mathbb {E}[|{\mathbf {X}}^{(n)}_j|^p] < \infty \) for all \(n, p\in \mathbb {N}_{\ge 1}\); otherwise we need to truncate tensor levels. We then correspondingly define the conditional expected signature of \(({\mathbf {X}}_j)\) by
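At a fixed truncation level the product of tensor exponentials is an elementary computation. The sketch below (our own illustration: level-2 truncation, \(d=2\), arbitrarily chosen step values) builds the discrete signature of three increments and checks two consequences of the construction: the level-one part is the total increment, and the symmetrised level-two part is determined by it (the simplest shuffle identity for group-like elements).

```python
import numpy as np

# level-2 truncated tensor exponential of a step dx in R^d: (1, dx, dx (x) dx / 2)
def texp2(dx):
    return (1.0, dx.copy(), np.outer(dx, dx) / 2.0)

# concatenation product in the level-2 truncated tensor algebra
def tmul2(a, b):
    return (a[0] * b[0],
            a[0] * b[1] + a[1] * b[0],
            a[0] * b[2] + np.outer(a[1], b[1]) + a[2] * b[0])

steps = [np.array([1.0, 0.0]), np.array([0.5, 1.0]), np.array([-1.0, 0.5])]
sig = (1.0, np.zeros(2), np.zeros((2, 2)))   # the unit 1 of the truncated algebra
for dx in steps:
    sig = tmul2(sig, texp2(dx))              # product of exponential increments
```

The antisymmetric part of `sig[2]` is the Lévy area of the interpolating path, while the symmetric part carries no extra information: \({\mathbf {S}}^{(2)} + ({\mathbf {S}}^{(2)})^{\top } = {\mathbf {S}}^{(1)} \otimes {\mathbf {S}}^{(1)}\).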
These identities permit the calculation of \(\boldsymbol {\mu }\) by backwards induction. What they do not immediately reveal, however, is that to compute \(\boldsymbol {\mu }^{(n)}\) only strictly lower tensor levels \(\boldsymbol {\mu }^{(1)}, \dots , \boldsymbol {\mu }^{(n-1)}\) are needed. To see that this is really the case, we first bring the identity into the following difference form
Summing over \(\{j, j+1, \dots , J-1\}\) and conditioning on \(\mathcal {G}_j\) we obtain the following
Theorem 3.1
Given that the adapted \(\mathcal {T}_0\)-valued process \(({\mathbf {X}}_j)\) satisfies the moment condition \(\mathbb {E}[|{\mathbf {X}}^{(n)}_j|^p] < \infty \) for all \(n, p\in \mathbb {N}_{\ge 1}\), its conditional expected signature \((\boldsymbol {\mu }_j)\) is uniquely characterized by the equation
Example 3.2 (Markov Chains; See also [10, Section 5])
Consider a d-dimensional Markov chain \((X_j)\) and set \(({\mathbf {X}}^{(1)}_j) = (X_j)\) and \({\mathbf {X}}^{(n)}_j \equiv 0\) for \(n \neq 1\). Denote by \(p^{(i)}_j(x, \mathrm {d}{y})\) the step-i transition probability kernel of X starting in \(x\in {\mathbb {R}^d}\) at time j with \(p^{(0)}_j(x, \mathrm {d}{y}) = \delta _{x}(\mathrm {d}{y})\). We see inductively that almost surely \(\boldsymbol {\mu }_j^{(n)} = f^n_j(X_j)\) where the functions \(f^{n}_j : {\mathbb {R}^d} \to ({\mathbb {R}^d})^{\otimes n}\) satisfy the recursive scheme
Let \((g_j)_{j=1, \dots , J}\) be an IID sequence with values in the Lie algebra \(\mathcal {T}_0^N\) and exponential image \(G_j = e^{g_j} \in \mathcal {T}_1\). Consider the random walk \({\mathbf {X}}_j=\sum _{k=1}^jg_k\) and its signature
assuming that \(M= \mathbb {E}(G_1)\) is (componentwise) well defined.
For instance, the case of a planar lattice random walk is covered by \(N=1,d=2\), and equal weighted point masses at \(\pm e_1 \pm e_2\), that is, \(\mathbf {X}\) is a simple random walk in \(\mathbb {R}^2\). A direct computation shows that
where \(\cosh \colon \mathcal {T}\to \mathcal {T}\) is defined by its power series (see Sect. 2.2). In particular, for any tensor series \(\mathbf {x}\in \mathcal {T}_0\),
Upon taking the limit as \(J\to \infty \) we recover Fawcett’s formula for planar Brownian motion.
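This convergence is easy to observe numerically. The sketch below is our own illustration, with level-4 truncation, \(J = 10{,}000\) steps and the usual Donsker scaling \({\mathbf {X}}_j/\sqrt {J}\) (the truncation level, J and the scaling are our assumptions for the experiment); it computes \(\mathbb {E}\,\mathrm {Sig} = M^J\) for the simple random walk in \(\mathbb {R}^2\) and compares it with Fawcett's value \(\exp \big (\tfrac {1}{2}{\mathbf {I}}_2\big )\) at \(T=1\):

```python
d, N = 2, 4   # dimension and truncation level; tensor series as dicts word -> coeff

def mul(x, y):
    z = {}
    for w1, c1 in x.items():
        for w2, c2 in y.items():
            if len(w1) + len(w2) <= N:
                z[w1 + w2] = z.get(w1 + w2, 0.0) + c1 * c2
    return z

def texp(x):
    r, p, f = {(): 1.0}, {(): 1.0}, 1.0
    for k in range(1, N + 1):
        p, f = mul(p, x), f * k
        for w, c in p.items():
            r[w] = r.get(w, 0.0) + c / f
    return r

J = 10_000
s = 1.0 / J ** 0.5
# the four equally weighted steps +-e1 +- e2, under 1/sqrt(J) scaling:
M = {}
for a in (1, -1):
    for b in (1, -1):
        for w, c in texp({(1,): a * s, (2,): b * s}).items():
            M[w] = M.get(w, 0.0) + c / 4.0    # M = E exp(g_1 / sqrt(J))

ESig, base, e = {(): 1.0}, M, J               # expected signature M^J, by
while e:                                      # binary exponentiation
    if e & 1:
        ESig = mul(ESig, base)
    base = mul(base, base)
    e >>= 1

fawcett = texp({(1, 1): 0.5, (2, 2): 0.5})    # exp((1/2) I_2), i.e. T = 1
```

At level 2 the agreement is exact for every J, while at level 4 the discrepancy is of order \(1/J\), consistent with the diffusive limit.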
For an example with \(N=2\) consider equal weighted point masses at \(\{\pm [e_i,e_j] \;:\; 1\le i < j \le d\}\). A similar argument leads to a Fawcett-type formula
for the level-4 Brownian rough paths that recently arose in [32] as limits of fractional Brownian motion with \(H\to \frac {1}{4}\) (see also Sect. 6.2).
We close the discussion on discrete processes by arguing how to extend these results to the continuous case. We express the functional equation from Theorem 3.1 in the alternative form
to anticipate the functional relation of Theorem 3.4, derived by stochastic calculus.
For the conditional signature cumulants \((\boldsymbol {\kappa }_j) = (\log \boldsymbol {\mu }_j)\) we obtain a first identity by multiplying (15) with \(e^{-\boldsymbol {\kappa }_j}\) from the right:
Even in the commutative (one-dimensional) case the above identity gives non-trivial relations for conditional cumulants (see Sect. 5.3.2). In contrast to the continuous case, transforming the above formula entirely to logarithmic coordinates under the use of the Baker-Campbell-Hausdorff (BCH) formula is in general not possible, since after all the logarithm is non-linear and cannot be exchanged with the conditional expectation. Nevertheless, a martingale transform yields the identity
where, following Eq. (6), a second-order expansion in the terms \(\delta \mathbf {X}\) and \(\delta \boldsymbol {\kappa }_i = \boldsymbol {\kappa }_{i+1} - \boldsymbol {\kappa }_i\) allows us to anticipate that the diffusive limit leads to the functional equation given in Theorem 3.6 below.
3.2 The Continuous Case
Throughout this section we will assume that \((\Omega , \mathcal {F}, (\mathcal {F}_t), \mathbb {P})\) is a filtered probability space satisfying the usual conditions and with the property that every martingale has a continuous version. This is for instance the case if the filtration is the natural and completed filtration of a Brownian motion.
To streamline the derivation, we will start from first principles and introduce the key concepts from Itô-calculus in the non-commutative setting when first needed, not focusing on integrability considerations which are fully resolved in [28]. For \(\mathbf {X} \in \mathscr {H}^{\infty -}(\mathcal {T}_0)\) the conditional expected signature and the signature cumulants
One is now inclined to apply Itô’s rule to the product on the right-hand side as this should reveal non-trivial cancellations implied by the martingality of \(\mathbf {M}\). For ease of notation we denote \({\mathbf {S}}_t := \mathrm {Sig}(\mathbf {X})_{0,t}\) and \(\boldsymbol {\mu }_t := \boldsymbol {\mu }_t(T)\). Then the definitions put forward in Sect. 2.2 directly yield the product rule
where in the second line we have used the Itô-integral form of \(\mathbf {S}\) in (13). The martingality of \(\mathbf {M}\) and invertibility of the tensor \({\mathbf {S}}_t\) for all times \(t \in [0,T]\) imply that the term in the bracket is the differential of a local martingale. By Theorem 2.6 we have \(\mathbf {S}, \boldsymbol {\mu } \in \mathscr {H}^{\infty -}(\mathcal {T}_0)\) and it is therefore not difficult to conclude that this local martingale is also a true martingale. Taking conditional expectations of its integral then yields the following functional equation for \(\boldsymbol {\mu }\):
Theorem 3.4
The conditional expected signature \(\boldsymbol {\mu } = \boldsymbol {\mu }(T)\) of \(\mathbf {X} \in \mathscr {H}^{\infty -}(\mathcal {T}_0)\) is the unique solution (up to indistinguishability) of the following functional equation
Furthermore, if \(\mathbf {X}\in \mathscr {H}^{1,N}\) for some \(N\in {\mathbb {N}_{\ge 1}}\), then the identity (17) still holds true for the truncated expected signature \(\boldsymbol {\mu } := (\mathbb {E}_t(\mathrm {Sig}({\mathbf {X}}^{(0,N)})_{t,T}))_{0\le t \le T}\).
It is crucial to understand that the uniqueness part of the above statement follows by projecting the above equation to a tensor level, say \(n\ge 1\), and then noting that the above right-hand side only depends on tensor levels \(\boldsymbol {\mu }^{(k)}\) for \(k \le n-1\). In other words, Eq. (17) directly leads to a recursive scheme for calculating \(\boldsymbol {\mu }\) which is explicitly spelled out in Corollary 4.1 of Sect. 4.
Moving forward to the signature cumulants \(\boldsymbol {\kappa } := \boldsymbol {\kappa }_\cdot (T)\) one needs to further resolve the differential of \(\boldsymbol {\mu } = e^{\boldsymbol {\kappa }}\) in (16). The key is the following Itô formula for the tensor exponential map.
A version of the above lemma in a matrix setting previously appeared in [38]; the full proof can be found in [28, Lemma 7.8]. Loosely speaking, the form of the right-hand side follows immediately from the order-2 Taylor expansion for the tensor exponential given in Eq. (6).
Applying Theorem 3.5 to \(\boldsymbol {\kappa }\) in (16) and collecting terms, we arrive at
Again the martingality of \(\mathbf {M}\) and invertibility of \({\mathbf {S}}_t, e^{-\boldsymbol {\kappa }_t} \in \mathscr {S}^c(\mathcal {T}_1)\) imply that \(\mathbf {L}\) is a local martingale. The main work done in the proof of [28, Theorem 4.1] is to show that, under the given assumptions on \(\mathbf {X}\), the local martingale \(\mathbf {L}\) is in fact a true martingale. Taking expectations then yields a first functional equation for \(\boldsymbol {\kappa }\):
Similarly to the functional equation (17) for \(\boldsymbol {\mu }\), the above equation leads to a recursive scheme over tensor levels which uniquely determines \(\boldsymbol {\kappa }\). However, there is still a degree of implicitness that can be removed.
To this end note that the operator \(G( \operatorname {\mathrm {ad}}{\mathbf {x}})\) has the following inverse
with Bernoulli numbers \((B_k)_{k\ge 0} = (1, -\frac {1}{2}, \frac {1}{6}\dotsc )\). Integrating \(H( \operatorname {\mathrm {ad}}{\boldsymbol {\kappa }})\) against \(\mathrm {d}\mathbf {L}\) and verifying that the resulting process is in \(\mathscr {M}(\mathcal {T}_0)\) we arrive at a second functional equation for \(\boldsymbol {\kappa }\):
Theorem 3.6
The signature cumulant \(\boldsymbol {\kappa }\) of \(\mathbf {X} \in \mathscr {H}^{\infty -}(\mathcal {T}_0)\) is the unique solution (up to indistinguishability) of the following functional equation
for all \(0 \le t \le T\). Furthermore, if \(\mathbf {X}\in \mathscr {H}^{1,N}\) for some \(N\in {\mathbb {N}_{\ge 1}}\), then the identity (19) still holds true for the truncated signature cumulant \(\boldsymbol {\kappa } := (\log \mathbb {E}_t(\mathrm {Sig}({\mathbf {X}}^{(0,N)})_{t,T}))_{0\le t \le T}\).
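The Bernoulli coefficients of \(H\) can be sanity-checked at the level of scalar power series: with \(G(z)=(e^z-1)/z=\sum_{k\ge 0} z^k/(k+1)!\) and \(H(z)=\sum_{k\ge 0} B_k z^k/k!\), the Cauchy product of the two truncated series must be the constant series \(1\). A minimal sketch in exact arithmetic (plain Python, not part of the paper's framework):

```python
from fractions import Fraction
from math import comb, factorial

def bernoulli(n):
    # B_0..B_n via the standard recursion sum_{j=0}^{k} C(k+1, j) B_j = 0 (k >= 1)
    B = [Fraction(1)]
    for k in range(1, n + 1):
        s = sum(comb(k + 1, j) * B[j] for j in range(k))
        B.append(-s / (k + 1))
    return B

def cauchy(a, b):
    # Cauchy product of two truncated power series of the same order
    N = len(a) - 1
    return [sum(a[i] * b[n - i] for i in range(n + 1)) for n in range(N + 1)]

N = 10
B = bernoulli(N)
G = [Fraction(1, factorial(k + 1)) for k in range(N + 1)]  # (e^z - 1)/z
H = [B[k] / factorial(k) for k in range(N + 1)]            # z/(e^z - 1)
P = cauchy(G, H)                                           # must be 1 + O(z^{N+1})
```

The coefficients \(B_k/k!\) computed here are exactly those entering the operator \(H(\operatorname{ad}\boldsymbol{\kappa})\) above.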
4 Recursive Formulas and Diamond Products
Theorems 3.4 and 3.6 allow for an iterative computation of the expected signature and signature cumulants, by projecting the functional equations to tensor levels. In the following subsections we will explicitly spell out these recursive schemes and then rephrase them in terms of so-called diamond products as they were introduced for scalar semimartingales in [3].
While all recursions can be stated for the general càdlàg case [28, Section 4.2], we only present the continuous case for simplicity. Therefore we shall presume the assumptions on the filtration from the continuous setting in Sect. 3.2 throughout this section.
4.1 Recursive Formula for Signature Moments
This recursive scheme trivially starts with \(\boldsymbol {\mu }^{(0)} \equiv 0\) and the first level of signature moments
Consider the special case with vanishing higher order components, \({\mathbf {X}}^{(i)} \equiv 0\), for \(i \ne 1\), and \(\mathbf {X} = {\mathbf {X}}^{(1)} \equiv M\), a d-dimensional continuous square-integrable martingale. We then have \(\boldsymbol {\mu }^{(1)} \equiv 0\) and it directly follows from the Stratonovich-Itô correction that
which is indeed a (very) special case of the general expression for \(\boldsymbol {\mu }^{(2)}\).
Assuming that \(M \in \mathscr {H}^{N}\) the recursion proceeds for levels \(n\in \{2, 3,\dots , N\}\) and simplifies due to the martingality to
$$\displaystyle \begin{aligned} \boldsymbol{\mu}_t^{(n)} = \mathbb{E}_t\bigg\{\frac 12\int_t^T\mathrm{d}\left\langle M, M \right\rangle _u\, \boldsymbol{\mu}_u^{(n-2)} + \left\langle M, \boldsymbol{\mu}^{(n-1)} \right\rangle _{t,T}\bigg\}. \end{aligned} $$
In case M is a Gaussian martingale of the form \(M_t = \int _0^t \sigma (s). \mathrm {d}{B_s} \in \mathscr {H}^{\infty -}\) for an m-dimensional Brownian motion B and \(\sigma \in L^2([0,T]; \mathbb {R}^{d\times m})\), an induction immediately shows that \(\boldsymbol {\mu }\) is deterministic. Hence the brackets with \(\boldsymbol {\mu }\) vanish and we obtain \(\boldsymbol {\mu }^{(n)} \equiv 0\) for odd n and
$$\displaystyle \begin{aligned} {} \boldsymbol{\mu}_t^{(n)} = \frac 12\int_t^T\sigma.\sigma^T(u) \;\boldsymbol{\mu}_u^{(n-2)}\; \mathrm{d}{u}, \qquad \text{ for even }n. \end{aligned} $$
(22)
When M is a standard Brownian motion, i.e., \(\sigma \equiv {\mathbf {I}}_d \in \mathbb {R}^{d\times d}\) is the identity matrix, we now readily verify Fawcett’s formula [19, 21]
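This verification can also be reproduced numerically: iterating the deterministic recursion (22) with \(\sigma \equiv {\mathbf {I}}_d\) and comparing with the levels \((T/2)^k\,{\mathbf {I}}_d^{\otimes k}/k!\) of the exponential \(\exp(\tfrac{T}{2}{\mathbf {I}}_d)\). A sketch with flattened tensors; dimension, horizon and grid size are arbitrary test choices:

```python
import numpy as np

# Iterate recursion (22) for standard d-dimensional Brownian motion
# (sigma = I_d) and compare with Fawcett's closed form
# mu^(2k)_0(T) = (T/2)^k / k! * I_d^{(x) k}.
d, T, steps = 2, 1.0, 400
ts = np.linspace(0.0, T, steps + 1)
C = np.eye(d).reshape(-1)          # the 2-tensor sigma sigma^T, flattened

def next_level(mu_prev):
    # mu^(n)(t_i) = 1/2 int_{t_i}^T C (x) mu^(n-2)(u) du, by the trapezoid rule
    integrand = np.array([0.5 * np.kron(C, m) for m in mu_prev])
    out = np.zeros_like(integrand)
    for i in range(steps - 1, -1, -1):
        dt = ts[i + 1] - ts[i]
        out[i] = out[i + 1] + 0.5 * dt * (integrand[i] + integrand[i + 1])
    return out

mu0 = np.ones((steps + 1, 1))      # level 0 is identically 1
mu2 = next_level(mu0)              # should equal (T - t)/2 * C
mu4 = next_level(mu2)              # should equal ((T - t)/2)^2 / 2! * C (x) C
```

The trapezoid rule is exact here since the integrands are affine in \(u\), so the match with Fawcett's levels is to machine precision.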
Fawcett’s formula is reminiscent of the case of real-valued Gaussian vectors, which are in fact characterized by having non-vanishing cumulants only at degree 2, and one might wonder whether the same is true for expected signatures, as they play a role analogous to that of characteristic functions for paths. This, however, is not the case: the formula is valid only for Brownian motion. Other Gaussian processes may have arbitrarily complicated cumulants, as can already be seen from Theorem 4.2 above. For an even more elementary example, let \(d>1\) and consider a centered Gaussian vector \(Z=(Z^1,\dotsc ,Z^d)\in \mathbb {R}^d\) with covariance matrix \(\Sigma \), and \(X_t:= tZ\). Clearly, the law of \(X_t\) is Gaussian for every fixed \(t>0\). Moreover,
so that the first few terms of the expected signature are \(\boldsymbol {\mu }^{(1)}_t(T)=\boldsymbol {\mu }^{(3)}_t(T)=0\) (in fact all odd degrees vanish), and
where in the last term we have used Isserlis’s theorem to express mixed moments as products of covariances. Using the power series expansion of \(\log \) we see that the first four signature cumulants are \(\boldsymbol {\kappa }^{(1)}_t(T)=\boldsymbol {\kappa }^{(3)}_t(T)=0\), and
In particular, the fourth cumulant does not vanish. A general diagrammatic expansion, analogous to Isserlis’ theorem, for the expected signature of Gaussian processes in terms of the correlation function has been recently obtained by Cass and Ferrucci [13].
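The computation behind this example is mechanical enough to script: for the linear path \(X_t = tZ\) on \([0,1]\) the signature levels are \(Z^{\otimes n}/n!\), so the expected signature levels are \(\mathbb{E}[Z^{\otimes n}]/n!\), with the fourth moment supplied by Isserlis's theorem; the level-4 cumulant is then read off from \(\log(1+x) = x - x^2/2 + \cdots\). A sketch (the covariance matrix is hypothetical test data):

```python
import numpy as np

# Expected signature of X_t = t Z on [0, 1]: mu^(n) = E[Z^{(x) n}] / n!,
# with E[Z^{(x) 4}] given by Isserlis' theorem in terms of Sigma.
Sigma = np.array([[1.0, 0.5], [0.5, 2.0]])

mu2 = Sigma / 2.0                                    # E[Z (x) Z] / 2!
E4 = (np.einsum('ij,kl->ijkl', Sigma, Sigma)
      + np.einsum('ik,jl->ijkl', Sigma, Sigma)
      + np.einsum('il,jk->ijkl', Sigma, Sigma))      # Isserlis' theorem
mu4 = E4 / 24.0                                      # E[Z^{(x) 4}] / 4!

# level-4 part of log(1 + mu2 + mu4) = mu4 - (mu2 (x) mu2) / 2
kappa4 = mu4 - 0.5 * np.einsum('ij,kl->ijkl', mu2, mu2)
```

For a non-degenerate \(\Sigma\) with \(d>1\) the entry \(\boldsymbol{\kappa}^{(4)}_{1122} = (\Sigma_{12}^2 - \Sigma_{11}\Sigma_{22})/12\) is non-zero, matching the claim above.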
4.2 Recursive Formula for Signature Cumulants
The first level of signature cumulants is identical to that of the signature moments
From the very definition of the logarithm we must have \(\boldsymbol {\kappa }^{(2)} = \boldsymbol {\mu }^{(2)} - \frac 12 \boldsymbol {\mu }^{(1)} \boldsymbol {\mu }^{(1)},\) which is in line with the above expression for \(\boldsymbol {\kappa }^{(2)}\) and the previous formula (20) for \(\boldsymbol {\mu }^{(2)}\), but already requires a little work to verify. Indeed, using the martingality of \({\mathbf {X}}^{(1)}+\boldsymbol {\kappa }^{(1)} = \mathbb {E}_\cdot [{\mathbf {X}}^{(1)}_T]\) we resolve both expressions using the Itô-product rule in the following calculation
The general recursion for \(\boldsymbol {\kappa }\) is combinatorially more involved. To present it as concisely as possible, we recall the following notation from Sect. 2.2: For \(\ell \in (\mathbb {N}_{\ge 1})^k\) we write \(\ell = (l_1, \dots , l_k)\), \(|\ell |:=k\) and \(||\ell ||:=l_1 + \dotsb + l_k\). Furthermore, for \(0 \le i, j \le k\) we define \(\ell _{i:j} = (l_{i+1}, \dots , l_j)\) if \(i < j\) and \(\ell _{i:j} = ()\) otherwise. Moreover \(( \operatorname {\mathrm {ad}} {\mathbf {x}}^{(\ell )}) = ( \operatorname {\mathrm {ad}} {\mathbf {x}}^{(l_{1})} \cdots \operatorname {\mathrm {ad}} {\mathbf {x}}^{(l_k)})\) for any \(\mathbf {x} \in \mathcal {T}_0\) and \(( \operatorname {\mathrm {ad}} {\mathbf {x}}^{()}) = \mathrm {Id}\).
Corollary 4.4
Let \(\mathbf {X}\in \mathscr {H}^{1,N}\) for some \(N\in \mathbb {N}_{\ge 1}\), then we have
In the Gaussian martingale case from Theorem 4.2, i.e., where \(\mathbf {X} = (0, \int _0^\cdot \sigma (s). \mathrm {d}{B_s}, 0, \dots ) \in \mathscr {H}^{\infty -}(\mathcal {T}_0)\), an induction yields that \(\boldsymbol {\kappa }^{(n)}\) is deterministic for all \(n\in {\mathbb {N}}\). Indeed, firstly the \(\mathrm {HMag}^1\)-terms vanish due to the martingality of \(\mathbf {X}\). Secondly, by the induction hypothesis \(\boldsymbol {\kappa }^{(0,n-1)}\) is deterministic, hence of finite variation, and all cross-variation terms vanish. Finally, the remaining \(\mathrm {HMag}^2\)-terms are deterministic due to \(\left \langle \mathbf {X} \right \rangle _t = \sigma .\sigma ^T(t)\). The recursion thus dramatically simplifies to \(\boldsymbol {\kappa }^{(n)} \equiv 0\) for odd n and
This is precisely the (deterministic) Magnus expansion of the logarithm of the solution of Eq. (22). We will revisit this connection in Sect. 6.1 in the more general setting of time dependent Lévy-processes.
Note that \(\boldsymbol {\kappa }_t\) is a Lie-series over symmetric 2-tensors \(\mathrm {Sym}({\mathbb {R}^d} \otimes {\mathbb {R}^d}) \subset \mathcal {T}_0\). In the standard Brownian case \(\sigma \equiv {\mathbf {I}}_d\), all commutators vanish and we obtain Fawcett’s formula in logarithmic form \(\boldsymbol {\kappa }_t = \frac {1}{2}(T-t) {\mathbf {I}}_d\).
4.3 Diamond Products
We extend the notion of the diamond product, introduced in [3] for continuous scalar semimartingales, to our setting. Denote by \(\mathbb {E}_t\) the conditional expectation with respect to the sigma algebra \(\mathcal {F}_t\).
The proof relies on standard approximation results for continuous semimartingales as it is fairly clear from the definition of the diamond product that the identity holds for simple processes Z, since \((X\diamond Y)_T(T)=0\). The Kunita-Watanabe inequality ensures that the expectation on the right-hand side is well defined.
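For intuition, the diamond product \((X \diamond Y)_t(T) = \mathbb{E}_t \langle X, Y\rangle_{t,T}\) of scalar semimartingales is easy to estimate by Monte Carlo. A sketch for \(Y_t = B_t^2\) with a scalar Brownian motion \(B\), where \(\mathrm{d}\langle Y \rangle_u = 4B_u^2\,\mathrm{d}u\) gives the exact value \((Y \diamond Y)_0(T) = 4\int_0^T u\,\mathrm{d}u = 2T^2\) (path counts and step sizes are arbitrary test choices):

```python
import numpy as np

# Monte Carlo estimate of (Y <> Y)_0(T) = E<Y, Y>_{0,T} for Y = B^2,
# via the realized quadratic variation along a time grid; exact value 2 T^2.
rng = np.random.default_rng(0)
n_paths, n_steps, T = 20_000, 500, 1.0
dt = T / n_steps

dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
B = np.cumsum(dB, axis=1)                 # Brownian paths on the grid
Y = B ** 2
qv = np.sum(np.diff(Y, axis=1, prepend=0.0) ** 2, axis=1)  # realized <Y>_{0,T}
diamond = qv.mean()                       # estimate of (Y <> Y)_0(T) ~ 2 T^2
```

The realized quadratic variation converges to \(\langle Y\rangle_{0,T}\) as the grid is refined, so the sample mean approximates the diamond product up to discretization bias and Monte Carlo error.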
For clarity we focus on the case where \(\mathbf {X} = \mathbf {M} \in \mathscr {H}^{\infty -}\) is a martingale, for otherwise one needs to carry along conditional expectations of the finite variation parts. For the conditional expected signature, the recursion from Theorem 4.1 is then conveniently rewritten into
When \(d=1\) (or in the projection onto the symmetric algebra, cf. Sect. 5) the cumulant recursion takes a particularly simple form, since \( \operatorname {\mathrm {ad}}\mathbf {x}\equiv 0\) for all \(\mathbf {x}\in \mathcal {T}_0\). Equation (23) then becomes
We shall revisit this in a multivariate setting and comment on related works in Sect. 5.
4.4 Remark on Tree Representation
As illustrated in the previous section, in the case where \(d=1\) (or when projecting onto the symmetric algebra cf. Sect. 5), the cumulant recursion takes a particularly simple form. The algebraic perspective in the setting of Friz et al. [27] gives a tree series expansion of cumulants using binary trees. This representation follows from the fact that the diamond product of semimartingales is commutative but not associative. As an example (with notations taken from Sect. 5.3), in case of a one-dimensional continuous martingale, the first terms are
This expansion is organized (graded) in terms of the number of leaves in each tree, and each leaf represents the underlying martingale.
In the deterministic case, tree expansions are also known for the Magnus expansion [35] and the BCH formula [12]. These expansions, also in terms of binary trees, are different from the ones above as the trees are required to be planar to account for the non-commutativity of the Lie algebra. As an example, consider a matrix Lie group G with Lie algebra \(\mathfrak {g}\). Let Y solve the matrix ODE \(\dot {Y}_t=A_tY_t\) for some \(\mathfrak {g}\)-valued path A, and set \(\Omega _t(T)=\log (e^{Y_t}e^{-Y_T})\). We have
In this expansion, the nodes represent the underlying vector field and edges represent integration (with respect to time) and application of the Lie bracket, coming from the \( \operatorname {\mathrm {ad}}\) operator. It is an interesting open question to find a unified tree representation that accounts for the unified functional recursion of our Corollary 4.4.
5 Multivariate Moments and Cumulants
We saw that \(\mathcal {T} := T\big (\big (\mathbb {R}^d\big )\big )\) is the natural state space for signatures, expected signatures and their logarithms. When \(d=1\), the signature of a path \(X:[0,T]\to \mathbb {R}\) is nothing more than the sequence
The expected signature is then exactly the sequence of moments of the random variable \(X_{0,T}\), to the extent of being well-defined and up to factorial constants. Similarly, the signature cumulants correspond to the sequence of classical cumulants of \(X_{0,T}\). Since \(T\big (\big (\mathbb {R}^d\big )\big )\) is a commutative algebra if (and only if) \(d=1\), our previous expressions for expected signatures and signature cumulants simplify dramatically, without becoming trivial (as pointed out in several works [3, 27, 30, 41]). We can capture multivariate moments and cumulants by working with the “commutative shadow” [4] of \(\mathcal {T}\) which we now introduce.
5.1 The Symmetric Algebra
The symmetric algebra over \({\mathbb {R}^d}\), denoted by \(S({\mathbb {R}^d})\), is the quotient of \(T({\mathbb {R}^d})\) by the two-sided ideal I generated by \(\{xy-yx:x,y\in {\mathbb {R}^d}\}\). A linear basis of \(S({\mathbb {R}^d})\) is then given by \(\{ \hat e_v \}\) over non-decreasing words, \(v=(i_1,\dotsc ,i_n) \in \widehat {\mathcal {W}}_d\), with \(1 \le i_1 \le \dots \le i_n \le d, n \ge 0\). Every \(\hat {\mathbf {x}} \in S(\mathbb {R}^d)\) can be written as a finite sum,
where \(\hat {w}\in \hat {\mathcal {W}}_d\) denotes the non-decreasing reordering of the letters of the word \(w\in \mathcal {W}_d\), is an algebra epimorphism, which extends to an epimorphism \(\pi _{\mathrm {Sym}}: \mathcal {T}\twoheadrightarrow \mathcal {S}\) where \(\mathcal {S} = S \big (\big ( \mathbb {R}^d\big )\big )\) is the algebra completion, identifiable as formal series in d commuting indeterminates. As a vector space, \(\mathcal {S}\) can be identified with symmetric formal tensor series. Denote by \( \mathcal {S}_0\) and \(\mathcal {S}_1\) the affine space given by those \(\hat {\mathbf {x}}\in \mathcal {S}\) with \( \hat {\mathbf {x}}^\emptyset =0\) and \( \hat {\mathbf {x}}^\emptyset =1\) respectively. The usual power series in \(\mathcal {S}\) define \(\widehat \exp {}\colon \mathcal {S}_0 \to \mathcal {S}_1\) with inverse \(\widehat \log {}\colon \mathcal {S}_1 \to \mathcal {S}_0\) and we have
We shall abuse notation in what follows and write \(e^{(\cdot )}\) (resp. \(\log \)), instead of \(\widehat \exp \) (resp. \( \widehat \log \)).
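The projection \(\pi_{\mathrm{Sym}}\) acts on basis tensors simply by sorting the letters of the indexing word into non-decreasing order. A small sketch representing tensors as dictionaries from words to coefficients (an illustrative encoding, not the paper's):

```python
from collections import Counter

def pi_sym(x):
    # project a tensor {word: coeff} to the symmetric algebra by sorting letters
    out = Counter()
    for word, coeff in x.items():
        out[tuple(sorted(word))] += coeff
    return {w: c for w, c in out.items() if c != 0}

def tensor_mul(x, y):
    # product in T(R^d): concatenation of words, extended bilinearly
    out = Counter()
    for w1, c1 in x.items():
        for w2, c2 in y.items():
            out[w1 + w2] += c1 * c2
    return dict(out)
```

As expected, commutators are killed, e.g. \(\pi_{\mathrm{Sym}}(e_{12} - e_{21}) = 0\), and \(\pi_{\mathrm{Sym}}\) is a morphism onto the commutative product.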
All definitions for tensor valued continuous (and càdlàg) semimartingales have a straightforward extension to \(\mathcal {S}\)-valued process. In particular, given \(\mathbf {X}\) and \(\mathbf {Y}\) in \(\mathscr {S}^c(\mathcal {S})\), the inner quadratic covariation is given by
Write \(\mathcal {S}^N\) for the truncated symmetric algebra, linearly spanned by \(\{ \hat e_{w}: w \in \widehat {\mathcal {W}}_d, |w| \le N\}\) and \(\mathcal {S}^N_0\) for those elements with zero scalar entry. In complete analogy with the non-commutative setting discussed above, we then write \(\widehat {\mathscr {H}}^{q,N} \subset \mathscr {S}^c(\mathcal {S}^N_0)\) for the corresponding space of homogeneously q-integrable semimartingales.
Finally, also the definition of diamond products from Sect. 4.3 extends immediately to \(\mathcal {S}\)-valued semimartingales. In particular, given \(\hat {\mathbf {X}}\) and \(\hat {\mathbf {Y}}\) in \(\mathscr {S}^c(\mathcal {S})\), we have
where the last expression is given in terms of diamond products of scalar semimartingales.
5.2 Moments and Cumulants
We quickly discuss the development of a symmetric algebra valued semimartingale, more precisely \(\hat {\mathbf {X}} \in \mathscr {S}^c(\mathcal {S}_0)\), in the group \(\mathcal {S}_1\). That is, we consider
It is immediate (validity of chain rule) that the unique solution to this equation, at time \(t \ge s\), started at \(\hat {\mathbf {S}}_s = \hat {\mathbf {s}} \in \mathcal {S}_1\) is given by
and we also write \(\hat {\mathbf {S}}_{s,t} = e^{ \hat {\mathbf {X}}_t- \hat {\mathbf {X}}_s}\) for this solution started at time s from \(1\in \mathcal {S}_1\). The relation to signatures is as follows.
Proposition 5.1
Let \(\mathbf {X}\) and \(\mathbf {Y}\) be \(\mathcal {T}\)-valued semimartingales and define \(\hat {\mathbf {X}} := \pi _{\mathrm {Sym}}(\mathbf {X})\) and \(\hat {\mathbf {Y}} := \pi _{\mathrm {Sym}}(\mathbf {Y})\). Then for all \(0 \le s \le t \le T\) it holds almost surely
(i) That the projections \(\hat {\mathbf {X}},\hat {\mathbf Y}\) define \(\mathcal {S}\)-valued semimartingales follows from the componentwise definition and the fact that the canonical projection is linear. In particular, the right-hand side of Eq. (26) is well defined. To show Eq. (26) we apply the canonical projection \(\pi _{\mathrm {Sym}}\) to both sides of Eq. (11) after choosing \(Z_t\equiv \mathbf 1\), and using the explicit action of \(\pi _{\mathrm {Sym}}\) on basis tensors we obtain the identity
Assuming componentwise integrability, we then define the conditional symmetric moments and cumulants of the \(\mathcal {S}\)-valued semimartingale \(\hat {\mathbf {X}}\) by
for \(0\le t \le T\). If \(\hat {\mathbf {X}} = \pi _{\mathrm {Sym}}(\mathbf {X})\) for a \(\mathcal {T}\)-valued semimartingale \(\mathbf {X}\), with expected signature and signature cumulants \(\boldsymbol {\mu }(T)\) and \(\boldsymbol {\kappa }(T)\), it is then clear that the symmetric moments and cumulants of \(\hat {\mathbf {X}}\) are obtained by projection,
consists of the (time-t conditional) multivariate moments of \(X_T-X_t \in \mathbb {R}^d\). Here, the series on the right hand side is understood in the formal sense. It readily follows, as also noted in [9, Example 3.3], that \(\hat {\boldsymbol {\kappa }}_t (T) = \log (\hat {\boldsymbol {\mu }}_t(T))\) consists precisely of the multivariate cumulants of \(X_T-X_t\). Note that the symmetric moments and cumulants of the scaled process \(a X\), \(a \in \mathbb {R}\), are precisely given by \(\delta _a \hat {\boldsymbol {\mu }}\) and \(\delta _a \hat {\boldsymbol {\kappa }}\) where the linear dilation map is defined by \(\delta _a\colon \hat e_w \mapsto a^{|w|} \hat e_w\). The situation is similar for \(a \cdot X=(a_1X^1,\dotsc ,a_dX^d)\), \(a \in \mathbb {R}^d\), but now with \(\delta _a\colon \hat e_w \mapsto a^w \hat e_w\) with \(a^w = a_1^{n_1} \cdots a_d^{n_d}\) where \(n_i\) denotes the multiplicity of the letter \(i \in \{1,\dots , d\}\) in the word w.
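The identification of \(\hat{\boldsymbol{\kappa}} = \log \hat{\boldsymbol{\mu}}\) with classical cumulants is easy to check in the simplest case \(d=1\), where both series live in one commuting indeterminate. For a Poisson(\(\lambda\)) variable every classical cumulant equals \(\lambda\), so the truncated series logarithm of the moment series \((m_n/n!)_n\) must return the coefficients \(\lambda/n!\). A sketch:

```python
from math import comb, factorial

N, lam = 6, 0.7

# Touchard's recursion for raw Poisson moments: m_{n+1} = lam sum_k C(n,k) m_k
m = [1.0]
for n in range(N):
    m.append(lam * sum(comb(n, k) * m[k] for k in range(n + 1)))
mu = [m[n] / factorial(n) for n in range(N + 1)]   # moment series, "signature" normalization

def series_log(a):
    # log of a truncated power series with constant term 1
    deg = len(a) - 1
    x = [0.0] + list(a[1:])
    out = [0.0] * (deg + 1)
    pw = [1.0] + [0.0] * deg                       # running power x^k, truncated
    for k in range(1, deg + 1):
        pw = [sum(pw[i] * x[n - i] for i in range(n + 1)) for n in range(deg + 1)]
        out = [out[n] + (-1) ** (k + 1) * pw[n] / k for n in range(deg + 1)]
    return out

kappa = series_log(mu)   # cumulant series, coefficients lam / n! for n >= 1
```

Truncating both series at the same order is harmless here: coefficients of order \(\le N\) of the logarithm only depend on input coefficients of order \(\le N\).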
We next consider linear combinations, \(\hat {\mathbf {X}} = a X + b \langle X \rangle \), for general pairs \(a,b \in \mathbb {R}\), having already dealt with \(b=0\). The special case \(b = - a^2/2\) (by scaling there is no loss of generality in taking \((a,b) = (1,-1/2)\)) yields an (at least formally) familiar exponential martingale identity.
Example 5.3
Let \( X\) be an \(\mathbb {R}^d\)-valued martingale in \(\mathscr H^{\infty -}\), and define
In this case we have trivial symmetric cumulants, \(\hat {\boldsymbol {\kappa }}_t(T)=0\) for all \(0\le t\le T\). Indeed, Itô’s formula shows that \(t\mapsto \exp (\hat {\mathbf {X}}_t)\) is an \(\mathcal {S}_1\)-valued martingale, so that
in which case \(\hat {\boldsymbol {\mu }} = \hat {\boldsymbol {\mu }} (a,b), \hat {\boldsymbol {\kappa }} = \hat {\boldsymbol {\kappa }}(a,b)\) contain the full information of the joint moments of X and its quadratic variation process. A recursion for these was constructed as a diamond expansion in [27].
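The scalar shadow of Example 5.3 is the classical exponential-martingale identity: for a one-dimensional Brownian motion, \(\exp(B_t - t/2)\) is a martingale, so \(\log \mathbb{E}[\exp(B_T - T/2)] = 0\), a degree-collapsed analogue of the statement \(\hat{\boldsymbol{\kappa}}_t(T) = 0\). A quick Monte Carlo sketch (sample size is an arbitrary choice):

```python
import numpy as np

# E[exp(B_T - T/2)] = 1 for Brownian motion, i.e. the scalar trace of the
# trivial-cumulant identity in Example 5.3 with X_hat = B - <B>/2.
rng = np.random.default_rng(1)
T = 1.0
BT = rng.normal(0.0, np.sqrt(T), size=1_000_000)
est = np.exp(BT - T / 2).mean()   # should be close to 1
kappa_hat = np.log(est)           # scalar "cumulant", should be close to 0
```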
5.3 Diamond Relations for Multivariate Cumulants
We will demonstrate how a symmetrization of the functional equations from Sect. 3 and the recursions from Sect. 4 leads to a generalized view on the cumulant recursions from [3, 27, 41]. As before we first treat the continuous case and the discrete setting separately, referring to [28, Section 5.2] for a unification of both settings into a general càdlàg form.
5.3.1 The Continuous Case
We assume that the usual assumptions from the continuous setting in Sect. 3.2 on the filtration are in place. Then following Theorem 4.7 the diamond product of \(\hat {\mathbf {X}}, \hat {\mathbf {Y}} \in \mathscr {S}^c(\mathcal {S}_0)\) is another continuous \(\mathcal {S}_0\)-valued semimartingale given by
with summation over all \(w_1,w_2 \in \widehat {\mathcal {W}}_d\), provided all brackets are integrable. This trivially adapts to \(\mathcal {S}^N\)-valued semimartingales, \(N\in \mathbb {N}_{\ge 1}\), in which case all words have length at most N and the summation is restricted accordingly to \(|w_1|+|w_2| \le N\).
Theorem 5.4
Let \(\Xi = (0, \Xi ^{(1)},\Xi ^{(2)},\ldots )\) be an \(\mathcal {F}_T\)-measurable random variable with values in \(\mathcal {S}_0\), componentwise in \(\mathcal {L}^{\infty -}\). Then
then the identity (27) holds for the cumulants up to level N, i.e. for \(\mathbb {K}^{(0,N)} := \log (\mathbb {E}_t (e^{\Xi ^{(0,N)}}))\) with values in \(\mathcal {S}^{N}_0\).
Remark 5.5
Identity (27) is reminiscent of the quadratic form of the generalized Riccati equations for affine diffusions. The relation can be presented more explicitly when the involved processes are assumed to have a Markov structure and the functional signature cumulant equation reduces to a PDE system (see [28, Section 6.2.2]). The framework described here, however, requires neither Markov nor affine structure. In [28, Section 6.3] it is exemplified for affine Volterra processes that such computations are also possible in the fully non-commutative setting.
Proof
We first observe that since \(\Xi \in \mathcal L^{\infty -}\), by Doob’s maximal inequality and the BDG inequality, we have that \(\hat {\mathbf {X}}_t:=\mathbb {E}_t\Xi \) is a continuous martingale in \(\mathscr {H}^{\infty -}(\mathcal {S}_0)\). In particular, thanks to Theorem 2.6, the signature moments are well defined. According to Sect. 5.2, the signature is then given by
and Eq. (27) follows upon recalling that \((\mathbb {K}\diamond \mathbb {K})_t(T)=\mathbb {E}_t\langle \mathbb {K}(T)\rangle _{t,T}\). The proof of the truncated version is left to the reader. □
As a corollary, we provide a general view on recent results of [3, 27, 41].
Corollary 5.6
The conditional multivariate cumulants \((\mathbb {K}_t)_{0\le t\le T}\) of a random variable \(\Xi \) with values in \(\mathcal {S}_0(\mathbb {R}^d)\), componentwise in \(\mathcal L^{\infty -}\), satisfy the recursion
The analogous statement holds true in the N-truncated setting, i.e. as a recursion for \(n=1,\dots ,N\) under the condition (28).
Remark 5.7
In the absence of higher order information, i.e., \(\Xi ^{(2)} = \Xi ^{(3)} = \ldots \equiv 0\), this type of cumulant recursion appears in [41]; and under optimal integrability conditions on \(\Xi ^{(1)}\) with finite Nth moments in [27]. (The latter requires a localization argument which is avoided here by directly working in the correct algebraic structure.)
5.3.2 The Discrete Case
Consider a probability space \((\Omega , \mathcal {G},\mathbb {P})\) with discrete filtration \((\mathcal {G}_j)_{j = 0, \dots , J}\) and a \(\mathcal {G}_J\)-measurable \(\mathcal {S}_0\)-valued random variable \(\Xi \). Assuming that \(\Xi \) is componentwise in \(\mathcal {L}^{\infty -}\), we are again interested in calculating \((\mathbb {K}_j) := (\log \mathbb {E}(e^{\Xi }\,\vert \mathcal {G}_j))\). Instead of projecting the identities from the non-commutative setting in Sect. 3.1 to the symmetric algebra, we present a direct derivation of the corresponding identities.
Using the multiplicative property of the exponential map in the symmetric algebra, the martingale property of \(e^{\mathbb {K}}\) can be rewritten to
In particular, projecting to symmetric tensor levels we see that the conditional cumulants of \(\Xi \) satisfy for all \(n = 1, 2, \dots \) and \(j = J, J-1, \dots , 1\):
This is really the result we get from projecting Theorem 3.1 to the symmetric algebra (upon using that the symmetric signature cumulants are given by \(\hat {\boldsymbol {\kappa }}_j = \mathbb {K}_j - \mathbb {E}(\Xi \,\vert \mathcal {G}_j)\)). Nevertheless, due to the ad hoc derivation presented here, it might seem surprising that the resulting expansion leads to any non-trivial relations for conditional cumulants at all.
While on the first level one still trivially has \((\mathbb {K}^{(1)}_j) = (\mathbb {E}(\Xi ^{(1)}\vert \mathcal {G}_j))\), on the second level we have
which one recognizes, in case \(\Xi ^{(2)} = 0\), as the energy identity for the discrete square-integrable martingale \((\mathbb {K}^{(1)}_j) = (\mathbb {E}(\Xi ^{(1)}\,\vert \mathcal {G}_j))\). Going further in the recursion yields increasingly non-obvious relations. Taking \(\Xi ^{(2)} = \Xi ^{(3)} = \ldots \equiv 0\) for notational simplicity gives
It is interesting to note that inductively the identity for \(\mathbb {K}^{(n)}\) can be expressed in terms of variations (of variations) of the martingale \(\mathbb {K}^{(1)}\), which relates to the Bartlett identities for martingales that have appeared in the statistics literature, cf. Mykland [53] and the references therein.
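The orthogonality of martingale increments behind these identities can be checked exactly on a finite filtration. The sketch below enumerates all \(\pm 1\) walks of length \(J=4\), takes the hypothetical functional \(\Xi = S_J^2\), and verifies the energy identity \(\mathbb{E}_j[(\Xi - \mathbb{K}^{(1)}_j)^2] = \mathbb{E}_j[\sum_{k>j}(\Delta \mathbb{K}^{(1)}_k)^2]\) on every atom of \(\mathcal{G}_j\):

```python
from itertools import product

# Exact check of the discrete energy identity: for M_j = E_j[Xi] one has
# E_j[(Xi - M_j)^2] = E_j[ sum_{k>j} (M_k - M_{k-1})^2 ], by conditional
# orthogonality of martingale increments. Enumerate all +-1 walks of length J.
J, j = 4, 2
steps = [-1, 1]

def Xi(path):
    return sum(path) ** 2          # any bounded functional of the walk works

def M(prefix):
    # conditional expectation of Xi given the first len(prefix) steps
    tails = list(product(steps, repeat=J - len(prefix)))
    return sum(Xi(prefix + t) for t in tails) / len(tails)

checks = []
for prefix in product(steps, repeat=j):    # one check per atom of G_j
    tails = list(product(steps, repeat=J - j))
    lhs = sum((Xi(prefix + t) - M(prefix)) ** 2 for t in tails) / len(tails)
    rhs = sum(
        sum((M((prefix + t)[:k]) - M((prefix + t)[:k - 1])) ** 2
            for k in range(j + 1, J + 1))
        for t in tails) / len(tails)
    checks.append((lhs, rhs))
```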
Remark 5.9
The second formula in Theorem 5.8, in terms of summation over partitions \(\ell \), is sub-optimal in the sense that due to commutativity there are many repeated terms. For example, the factor \(\tfrac 12\) in front of the second term in Eq. (31) suggested by that formula is actually just 1, since both compositions \(\ell =(1,2)\) and \(\ell =(2,1)\) give the same product. In general one can take care of this algebraically by using (rescaled) Bell polynomials, which in the context of conditional cumulants was realized in [30].
6 Applications
6.1 Time-Inhomogeneous Lévy Processes
We now wish to derive a formula for the expected signature of time-inhomogeneous Lévy processes. This class of processes models trajectories having jumps, possibly infinitely many. To avoid the technical difficulties entailed by general càdlàg2 stochastic integration, we will only consider time-inhomogeneous Lévy processes that have finitely many jumps. Specifically, for a filtered probability space \((\Omega , \mathcal {F}, \mathbb {P}, (\mathcal {F}_t))\) satisfying the usual conditions we consider a process of the form
$$\displaystyle \begin{aligned} {} X = \int_0^\cdot b(u) \mathrm{d}{u} ~+~ \sum_{k=1}^m\int_0^\cdot\sigma_k(u)\mathrm{d}{B^k_u} ~+~ N, \end{aligned} $$
(32)
where \(b: [0,T] \to \mathbb {R}^d\) is integrable, \(\sigma _k: [0,T] \to \mathbb {R}^d\) is square-integrable, \(B = (B^1, \cdots , B^m)\) is a Brownian motion with respect to \((\mathcal {F}_t)\) and N is a time-inhomogeneous compound Poisson process with respect to \((\mathcal {F}_t)\), i.e., N is a process with piecewise constant right-continuous sample paths in \({\mathbb {R}^d}\) and is fully characterized by the existence of a finite intensity measure K so that
The interpretation of the signature as the “geometric” development of a \(\mathcal {T}_0\)-valued semimartingale \(\mathbf {X}\) in the group \(\mathcal {T}_1\) extends from the continuous [33] to the càdlàg case [18] and is consistent with the Marcus [5, 22, 40, 50, 51] interpretation of
The definition of the signature \(\mathrm {Sig}(\mathbf {X})_{s,\cdot }\) as the solution to (34) results in the desired property that, whenever the process jumps, the signature is multiplied by the exponential displacement, i.e.,
For the time-inhomogeneous Lévy processes X of the form (32) we can resolve the above Marcus definition of the signature \(\mathbf {S} = \mathrm {Sig}(X)_{s,\cdot }\) by simply integrating between the finite number of jump times. This finally leads to the Itô-integral form
Then its conditional expected signature \(\boldsymbol {\mu } = ( \mathbb {E}_t(\mathrm {Sig}(X)_{t,T}))_{0 \le t \le T}\) is the solution of the following backwards differential equation in \(\mathcal {T}_1\):
Moreover, the conditional signature cumulants \(\boldsymbol {\kappa } = \log ( \boldsymbol {\mu })\) are the solution of
$$\displaystyle \begin{aligned} \boldsymbol{\kappa}_t = \int_t^{T}H(\operatorname{\mathrm{ad}}{\boldsymbol{\kappa}_{u}})(\mathfrak{y}(u))\,\mathrm{d} u, \qquad 0 \le t \le T. \end{aligned} $$
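This backward equation is straightforward to integrate numerically once \(H(\operatorname{ad}\boldsymbol{\kappa})\) is truncated via its Bernoulli series. The sketch below replaces the tensor algebra by \(2\times 2\) matrices (a purely illustrative stand-in; the matrices are hypothetical test data), solves the forward equation \(\boldsymbol{\mu}_0'(t) = \boldsymbol{\mu}_0(t)\mathfrak{y}(t)\) by midpoint exponential stepping, and checks that backward Euler integration of the \(H(\operatorname{ad})\) equation reproduces \(\boldsymbol{\kappa}_0 = \log \boldsymbol{\mu}_0(T)\):

```python
import numpy as np
from math import factorial

A = np.array([[0.0, 0.3], [0.0, 0.0]])
Bm = np.array([[0.0, 0.0], [0.2, 0.0]])
def y(t):
    return A + t * Bm              # a non-commuting, time-dependent y(t)

T, n = 1.0, 4000
dt = T / n
BERN = [1.0, -0.5, 1/6, 0.0, -1/30, 0.0, 1/42, 0.0, -1/30, 0.0, 5/66]

def expm(M, terms=14):             # matrix exponential by Taylor series
    out, term = np.eye(2), np.eye(2)
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

def logm(M, terms=60):             # matrix logarithm by the log(I + X) series
    X = M - np.eye(2)
    out, pw = np.zeros((2, 2)), np.eye(2)
    for k in range(1, terms):
        pw = pw @ X
        out = out + (-1) ** (k + 1) * pw / k
    return out

def H_ad(kappa, v):                # H(ad kappa)(v) = sum_k B_k / k! ad_kappa^k v
    out, term = BERN[0] * v, v
    for k in range(1, len(BERN)):
        term = kappa @ term - term @ kappa
        out = out + BERN[k] / factorial(k) * term
    return out

mu = np.eye(2)                     # forward equation mu' = mu y(t), mu(0) = I
for i in range(n):
    mu = mu @ expm(y((i + 0.5) * dt) * dt)
kappa_ref = logm(mu)               # reference value kappa_0 = log mu_0(T)

kappa = np.zeros((2, 2))           # backward Euler, kappa_T = 0
for i in range(n - 1, -1, -1):
    kappa = kappa + dt * H_ad(kappa, y((i + 0.5) * dt))
```

Both schemes are crude but their errors (\(O(\mathrm{d}t)\) for the backward Euler step) are far below the agreement tolerance, so the match illustrates the consistency of the forward and backward formulations.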
Remark 6.2
The case of general time-inhomogeneous Lévy processes, i.e., semimartingales with independent increments that are continuous in probability, is treated in [28, Corollary 6.5]. Allowing for infinitely many jumps by extending K to a Lévy measure, i.e., by requiring K to only be a \(\sigma \)-finite measure satisfying \(\int _0^T\int _{{\mathbb {R}^d}}1\wedge |x|^2 K_t(\mathrm {d}{x})\mathrm {d}{t} < \infty \), a martingale compensation of the small jumps is required and results in a subtracted indicator function \(\mathbf 1_{|x|>1}\) appearing in (36) and (38).
Proof
To avoid using general stochastic integration, we present a direct proof following [22]. This proof is based on the identity
which is proven by a measure-theoretic induction, starting with (33) for \(H\equiv 1\) and all bounded measurable functions \(f:{\mathbb {R}^d}\to \mathbb {R}\), and then extending over simple processes H, i.e. piecewise constant adapted bounded processes, to all càdlàg adapted processes H with \(\Vert H \Vert _{\mathscr {S}^1} < \infty \). We refer to [22, Lemma 43] for a more detailed proof in the time-homogeneous case.
As before we will omit proving the integrability properties of \(\mathrm {Sig}(X)_{0,T}\). As proven in [28, Corollary 6.5] the moment condition (36) suffices to show that \(\Vert \mathrm {Sig}(X)_{0,\cdot }^w\Vert _{\mathscr {S}^q} < \infty \) for all \(w \in \mathcal {W}_d\) and \(q >1\). Hence, the expected signature is well defined and it immediately follows from the independence of increments of X with respect to the underlying filtration that
Taking expectations of each of the terms in the Itô-integral form (35) of \({\mathbf {S}}_t = \mathrm {Sig}(X)_{s,t}\), we obtain, using the integrability of \(\mathbf {S}\) and Fubini’s theorem, that firstly
Summarizing, this shows that \(\boldsymbol {\mu }_s\) satisfies the forward equation
$$\displaystyle \begin{aligned} \boldsymbol{\mu}_s(t) = 1+ \int_s^t \boldsymbol{\mu}_s(u-)\mathfrak{y}(u) \mathrm{d}{u}, \qquad s \le t \le T. \end{aligned} $$
This implies in particular that \(t\mapsto \boldsymbol {\mu }_s(t)\) is absolutely continuous. Furthermore, by the uniqueness of the solution flow we also have the identity \(\boldsymbol {\mu }_0(t)\boldsymbol {\mu }_t(T) = \boldsymbol {\mu }_0(T)\). Hence, \(t\mapsto \boldsymbol {\mu }_t(T) = \boldsymbol {\mu }_0(t)^{-1}\boldsymbol {\mu }_0(T)\) is absolutely continuous as well, and differentiation of the previous identity yields
$$\displaystyle \begin{aligned} \boldsymbol{\mu}_0(t)\,\mathfrak{y}(t)\,\boldsymbol{\mu}_t(T) + \boldsymbol{\mu}_0(t)\,\partial_t\boldsymbol{\mu}_t(T) = 0 \end{aligned} $$
for almost every \(t\in [0,T]\). Multiplication with \(\boldsymbol {\mu }_0(t)^{-1}\) and integration over \([t,T]\) then yields the backward equation.
The second part of the statement follows by noting that from the first part of the proof we have
Hence, \(\boldsymbol {\mu }\) is precisely the signature of the continuous deterministic \(\mathcal {T}_0\)-valued path \(\mathbf {X} := \int _0^\cdot \mathfrak {y}(u)\mathrm {d}{u}\). An application of Theorem 3.6 then directly yields the stated expansion for the log-signature \(\log (\mathrm {Sig}(\mathbf {X})_{t,T}) = \log (\boldsymbol {\mu }_t(T))\). □
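The forward equation from the proof can be illustrated numerically. The sketch below is not part of the chapter: the tuple representation of truncated tensor series, the dimension, the drift and all parameter choices are illustrative. For a constant level-1 drift \(\mathfrak{y}\) in a level-2 truncation, an Euler discretization of \(\boldsymbol{\mu}(t) = 1 + \int_0^t \boldsymbol{\mu}(u)\mathfrak{y}(u)\,\mathrm{d}u\) recovers the truncated exponential \(e^{T\mathfrak{y}}\).

```python
import numpy as np

# Truncated tensor series over R^d up to level 2, stored as
# (scalar, level-1 vector of shape (d,), level-2 matrix of shape (d, d)).
d = 2

def mul(a, b):
    """Tensor multiplication, truncated at level 2."""
    return (a[0] * b[0],
            a[0] * b[1] + b[0] * a[1],
            a[0] * b[2] + b[0] * a[2] + np.outer(a[1], b[1]))

def add(a, b):
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])

def scale(c, a):
    return (c * a[0], c * a[1], c * a[2])

one = (1.0, np.zeros(d), np.zeros((d, d)))
y = (0.0, np.array([1.0, -0.5]), np.zeros((d, d)))  # constant drift, level 1 only

# Euler scheme for the linear forward equation mu(t) = 1 + int_0^t mu(u) y du.
T, n = 1.0, 20000
dt = T / n
mu = one
for _ in range(n):
    mu = add(mu, scale(dt, mul(mu, y)))

# Closed-form solution exp(T*y), truncated at level 2: 1 + T*y + (T*y)^2 / 2.
Ty = scale(T, y)
exact = add(one, add(Ty, scale(0.5, mul(Ty, Ty))))

print(np.abs(mu[1] - exact[1]).max(), np.abs(mu[2] - exact[2]).max())
```

The level-1 component is integrated exactly by the Euler scheme; the level-2 component converges at first order in the step size.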
We conclude this section with an important consequence concerning the convergence radius of the expected signature, which for a general tensor series \(\mathbf {x}\in \mathcal {T}\) is defined as the maximal \(r \in [0,\infty )\) such that
In [15] it was proven that if the expected signature has infinite convergence radius, it characterizes the law of the signature, and hence the law of the underlying process whenever the signature itself is characterizing. This renders the question of infinite convergence radius central, but in general difficult. For Brownian motion it is a consequence of Fawcett’s formula [19]; for Lévy processes, a consequence of the signature Lévy–Khinchin formula in [22]. Apart from that, there is a negative answer in [8, 45] for the case of Brownian motion stopped at the exit of a domain. The following result gives a sufficient condition for time-inhomogeneous \(\mathfrak {g}^N\)-valued Lévy processes.
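For orientation, the standard mechanism producing an infinite convergence radius is factorial decay of the tensor levels; e.g., for a deterministic path of finite variation with length \(L\) one has the classical estimate

```latex
\big\|\mathrm{Sig}(X)^{(n)}_{0,T}\big\| \le \frac{L^n}{n!},
\qquad\text{hence}\qquad
\sum_{n \ge 0} \big\|\mathrm{Sig}(X)^{(n)}_{0,T}\big\|\, r^n \le e^{rL} < \infty
\quad\text{for all } r > 0.
```

A bound of this exponential type for the expected signature is what makes its convergence radius infinite.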
Corollary 6.3
Let \(X\), \(\boldsymbol {\mu }\) and \(\mathfrak {y}\) be as in Theorem 6.1. Then for any \(N \in \mathbb {N}\) and \(\lambda > 0\) it holds for all \(0 \le t \le T\):
Proof
Projecting (37) to tensor level \(n\in \mathbb {N}\) and using the compatibility of the norm on \(({\mathbb {R}^d})^{\otimes n}\), we have for any \(\lambda > 0\) and all \(t\in [0,T]\):
The stated estimate then follows from Gronwall’s lemma, and the second claim by taking limits for \(N\to \infty \). □
6.2 Brownian Rough Paths
This section is framed in rough path terminology and is directed at readers with a special interest in the topic; the unfamiliar reader may use [21] as a starting point. We denote by \(\mathcal {G}^M \subset \mathcal {T}^M_1\) the free step-M nilpotent Lie group and recall that the space of weakly geometric rough paths consists of continuous finite p-variation paths \(\mathcal {W}\Omega ^{c,p}_T = C^p([0,T]; (\mathcal {G}^M, \vert \mkern -2.5mu\vert \mkern -2.5mu\vert {\cdot }\vert \mkern -2.5mu\vert \mkern -2.5mu\vert ))\) with \(M \le p < M+1\), where \(\vert \mkern -2.5mu\vert \mkern -2.5mu\vert {\cdot }\vert \mkern -2.5mu\vert \mkern -2.5mu\vert \) is a homogeneous norm on \(\mathcal {G}^M\). In accordance with [14, 22] we call a random variable \(\mathbf {Y}\) with values in \(\mathcal {W}\Omega ^{c,p}_T\) a Brownian rough path with drift if it has stationary and independent increments \({\mathbf {Y}}_{s,t} := {\mathbf {Y}}_{s}^{-1}{\mathbf {Y}}_{t}\). For the purposes of this section we denote the signature of \(\mathbf {Y}\), i.e., the full Lyons lift, by \({\mathbf {Y}}^{<\infty }\).
Here, we will specifically consider the case where \(M = 2N\) and \(\mathbf {Y} = \mathbf {B}\) is the Stratonovich development of Brownian motion in the free step-N nilpotent Lie algebra \(\mathfrak {g}^N = \log \mathcal {G}^N\). More precisely, let \(\mathbb {B} = \sum _{i\in \mathcal {I}} B^{i} \mathfrak {u}_i\), where \(\{\mathfrak {u}_i\}_{i\in \mathcal {I}}\) is an orthonormal system in \(\mathfrak {g}^N \subset \mathcal {T}_0^N\) and \((B^{i})_{i\in \mathcal {I}}\) is a \(|\mathcal {I}|\)-dimensional Brownian motion with possibly mutually correlated components. Define the correlation matrix of \(\mathbb {B}\) by
The truncated Stratonovich development \(\mathbf {B} := \pi _{(0,2N)}\mathrm {Sig}(\mathbb {B})_{0,\cdot }\) takes values in the group \(\mathcal {G}^{2N}\), and following standard arguments from [21, Section 3.2] we see that the samples of \(\mathbf {B}\) are (weakly) geometric rough paths for any \(p \in (2N, 2N+1)\). Clearly, this is still the case after adding a drift \(\mathbb {B} \leadsto \mathbb {B}(\mathfrak {y}) := (\mathbb {B} + t\mathfrak {y})_{t\ge 0}\) for any constant vector \(\mathfrak {y} \in \mathfrak {g}^{2N}\). The resulting development \(\mathbf {B}(\mathfrak {y})\) is a Brownian rough path with drift, as independence and stationarity of the increments follow directly from the definition. Its signature is given by the full development \(\mathbf {B}(\mathfrak {y})^{<\infty } = \mathrm {Sig}(\mathbb {B}(\mathfrak {y}))_{0,\cdot }\).
Theorem 3.4 implies the following generalization of Fawcett’s formula.
Corollary 6.4
The expected signature of a Brownian rough path \(\mathbf {B}(\mathfrak {y})\) with correlation \(\Sigma \) and drift \(\mathfrak {y}\) is given by
The above suggests defining a “standard” Brownian rough path by requiring additionally that \(\log \mathbb {E}({\mathbf {Y}}_{s,t}) = \frac {1}{2}(t-s)\sum _{i\in \mathcal {I}}(\mathfrak {u}_i)^2\) and that \(\{\mathfrak {u}_i\}\) spans \(\mathfrak {g}^N\). However, this definition is only as appropriate as it is to call the Stratonovich lift the “standard” lift of a Brownian motion.
Remark 6.6
The results from the previous section can be extended to time-inhomogeneous Lévy rough paths in the sense of [14, 22], without any additional effort. Indeed, instead of starting with the differential characteristics \((b, a, K)\) in \({\mathbb {R}^d}\), we could likewise have used characteristics in \(\mathfrak {g}^N\), resulting in a \(\mathfrak {g}^N\)-valued semimartingale \(\mathbb {X}\) with independent increments. The truncated Marcus lift \(\mathbf {S} = \pi _{(0,2N)}\mathrm {Sig}(\mathbb {X})\) then has independent group increments \({\mathbf {S}}_{s,t} = {\mathbf {S}}_{s}^{-1}{\mathbf {S}}_{t}\), and the sample paths are càdlàg (weakly) geometric rough paths in the sense of [22, 25].
Proof
From the definition we have \(\mathbf {B}(\mathfrak {y}) := \pi _{(0,2N)}\mathrm {Sig}(\mathbb {B}(\mathfrak {y}))_{0,\cdot }\), where \(\mathbb {B}\) is a Brownian motion in \(\mathfrak {g}^N\). For any \(w \in \mathcal {W}_d\) and \(q \ge 1\) we have
Hence, \(\mathbb {B}(\mathfrak {y}) \in \mathscr {H}^{\infty -}\), and thus Theorems 3.4 and 4.1 apply to the conditional expected signature \(\boldsymbol {\mu }_t := \mathbb {E}_t(\mathrm {Sig}(\mathbb {B}(\mathfrak {y}))_{t,T})\). Due to the martingality of \(\mathbb {B}\) we have
Further, due to the deterministic quadratic variation \(\left \langle \mathbb {B} \right \rangle _t = t\Sigma \), one sees inductively, by following the recursion in Theorem 4.1, that \(\boldsymbol {\mu }^{(n)}\) is deterministic and hence by Theorem 3.4 satisfies
We readily conclude by noting that the unique solution to the above ordinary differential equation in \(\mathcal {T}_1\) is given by \(\boldsymbol {\mu }_t = e^{(T-t)(\mathfrak {y}+ \frac 12\Sigma )}\). □
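The closed form \(\boldsymbol{\mu}_t = e^{(T-t)(\mathfrak{y}+\frac12\Sigma)}\) can be sanity-checked by Monte Carlo at tensor level 2. The sketch below is illustrative (all parameter choices are ours): it takes a standard 2-dimensional Brownian motion with zero drift, so that \(\Sigma\) is the identity and the expected level-2 Stratonovich signature over \([0,T]\) should equal \((T/2)\,I\); midpoint sums approximate the Stratonovich integrals.

```python
import numpy as np

rng = np.random.default_rng(0)
d, T, n_steps, n_paths = 2, 1.0, 200, 10000
dt = T / n_steps

# Increments and paths of a standard d-dimensional Brownian motion on [0, T].
dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps, d))
B = np.concatenate([np.zeros((n_paths, 1, d)), np.cumsum(dB, axis=1)], axis=1)

# Level-2 Stratonovich signature int_0^T B^i o dB^j via the midpoint rule.
mid = 0.5 * (B[:, :-1, :] + B[:, 1:, :])
level2 = np.einsum('pki,pkj->pij', mid, dB)

emp = level2.mean(axis=0)        # Monte Carlo estimate of E[Sig^(2)]
fawcett = 0.5 * T * np.eye(d)    # level-2 part of exp((T/2) * Sigma), Sigma = I
print(np.round(emp, 3))
```

With this seed and sample size, the empirical mean matches \((T/2)\,I\) up to Monte Carlo error of order \(10^{-2}\).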
We mention two important examples: Firstly, the zero mass limit of a physical Brownian motion in a magnetic field ([26], [28, Section 3.4], [46]), which is a Brownian rough path with \(N=1\) and drift \(\mathfrak {y} = (0, 0, A)\) where \(A\in \mathfrak {so}({\mathbb {R}^d}\otimes {\mathbb {R}^d})\) is an antisymmetric matrix that depends on the physical environment. In line with [46] the above result immediately gives that the expected signature of the small mass limit is given by \(e^{T(A + \frac 12\Sigma )}\).
As a second example we mention the recent finding by Hairer [32] that the \(H\to \frac {1}{4}\) limit of a suitably renormalized fractional Brownian motion converges to a pure area rough path with coefficients given by standard Brownian motion. This limit is naturally included in our framework as a Brownian rough path of \(\mathbb {B} = \sum _{i\in \mathcal {I}}\mathfrak {u}_i B^i\), where the corresponding orthonormal system of \(\mathfrak {g}^2\) is given by \(\{\mathfrak {u}_i : i\in \mathcal {I}\} = \{[e_i, e_j] : 1 \le i < j \le d\}\). Corollary 6.4 yields the corresponding generalization of Fawcett’s formula. We note that it might be of interest to obtain this formula as the rescaled limit for \(H\to \frac {1}{4}\) in [13].
7 Conclusion
We summarize the results of this chapter and comment on their practical relevance. An overview of the different formulas and corresponding recursions with the relation to recent and classical results from the literature is displayed in Figs. 1 and 2.
Fig. 1
An overview of the different formulas and corresponding recursions (in Fig. 2) with the relation to recent and classical results from the literature. Recall that \(\mathscr {S}^{c}\) stands for continuous semimartingales and \(\mathscr {V}^{c}\) for continuous processes of finite variation. Here, “trivial” refers to the linear equation \(\mathrm {d}{\hat {\mathbf {S}}}=\hat {\mathbf {S}}\mathrm {d}{\hat {\mathbf {X}}}\) with \(\hat {\mathbf {S}}_0=1\in \mathcal {S}\) in the commutative case \(\hat {\mathbf {X}}\in \mathscr {V}^c(\mathcal {S})\), which is solved by \(e^{\hat {\mathbf {X}}-\hat {\mathbf {X}}_0}\) with logarithm \(\hat {\mathbf {X}}-\hat {\mathbf {X}}_0\)
Theorems 3.4 and 3.6, more precisely Eqs. (17) and (19), respectively, characterize expected signatures and signature cumulants of semimartingales by dynamic formulas with respect to the underlying filtration. The functional equation (19) for signature cumulants is represented in the top-left corner of Fig. 1.
Projecting (17) and (19) to tensor levels, we obtain recursions for signature moments and signature cumulants, presented in Theorems 4.1 and 4.4, respectively. Specifically, these recursions represent \(\boldsymbol {\mu }^{(n)}\) in terms of \(\boldsymbol {\mu }^{(1)}, \dots , \boldsymbol {\mu }^{(n-1)}\) (and analogously for \(\boldsymbol {\kappa }\)). The recursion for signature cumulants is represented by the top-left corner in Fig. 2.
Even when an explicit form of \(\boldsymbol {\mu }_t\) or \(\boldsymbol {\kappa }_t\) is obtainable, the recursive nature of the expressions for each homogeneous component can be useful in some numerical scenarios. This provides a means of computation which can in principle be more efficient than the naive Monte Carlo approach. This is most evidently the case in discretized Markovian situations (compare with Theorem 3.2), where the recursion from Theorem 3.1 directly turns into an inductive backward scheme, in which conditional expectations can be approximated numerically using functional linear regression procedures.
The most classical consequence of Theorem 3.6 appears when \(\mathbf {X}\) is a deterministic continuous semimartingale, i.e., in particular the components of \(\mathbf {X}\) are continuous paths of finite variation. The signature cumulants are then just the log-signature \(\boldsymbol {\kappa }_t(T) = \log \mathrm {Sig}{(\mathbf {X})}_{t,T}\) and what remains of (19) is a classical differential equation due to [34, 57], here in backward form
The accompanying recursion is then precisely the Magnus expansion [7, 35, 36, 49]. This transition to the deterministic case is represented in Figs. 1 and 2 by going from the top-left to the top-right corner.
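For comparison, the first terms of the classical (forward) Magnus expansion for \(\dot Y(t) = A(t)\,Y(t)\), \(Y(0)=1\), read

```latex
\log Y(t) = \int_0^t A(t_1)\,\mathrm{d}t_1
  + \frac{1}{2}\int_0^t\!\!\int_0^{t_1} \big[A(t_1),\,A(t_2)\big]\,\mathrm{d}t_2\,\mathrm{d}t_1
  + \cdots
```

The backward form used above differs only in orientation and the resulting signs of the iterated brackets.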
Projecting to the symmetric tensor algebra \(\mathcal {S}\) (Sects. 5.1 and 5.2), i.e., enforcing commutativity, signature moments and signature cumulants turn into the sequences of classical multivariate moments and cumulants of the random variable \({\mathbf {X}}_{0,T}\). Projecting the functional equation (19) to \(\mathcal {S}\) and recasting conditional quadratic variation increments as diamond products (Theorem 4.7), we obtain the cumulant equation (27) originally obtained in [27]. This consequence is depicted in Fig. 1 by going from the upper-left to the lower-left corner.
Equation (27) should be compared with Riccati’s ordinary differential equation from affine process theory [16, 17, 39]. Of course, our results, in particular Eqs. (19) and (27), are not restricted to affine semimartingales. In turn, expected signatures and cumulants, and subsequently all our statements about these, require moments, which is not the case for the Riccati evolution of the characteristic function of affine processes. Of recent interest, explicit diamond expansions have been obtained for “rough affine” processes, non-Markovian by nature, with cumulant generating function characterized by Riccati–Volterra equations, see [1, 27, 31]. It is remarkable that analytic tractability remains intact when one passes to path space and considers signature cumulants (see in particular [28, Section 6.3]).
Projecting (19) to tensor levels, we obtained a diamond expansion for multivariate cumulants (29). Such expansions were previously obtained in [3, 27, 30, 41], together with a range of applications, from quantitative finance (including rough volatility models [1, 31]) to statistical physics: in [41] the authors rely on such formulas to compute the cumulant-generating function of log-correlated Gaussian fields (more precisely, approximations thereof) underlying the Sine-Gordon model, a key ingredient in their renormalization procedure.
In mathematical finance, a case of particular interest is the \(\mathcal {S}\)-valued semimartingale \(\hat {\mathbf {X}} = (0,aX,b\langle X \rangle , 0, \dotsc )\), for a d-dimensional continuous martingale X. The resulting diamond expansion from [27] improves and unifies previous results [3, 41], allowing one to calculate joint cumulants of log-price and integrated volatility [20].
We end this summary with a comment on questions of convergence: Basic facts about analytic functions show that classical moment- and cumulant-generating functions, for random variables with finite moments of all orders, have a radius of convergence \(\rho \ge 0\), directly related to the growth of the corresponding sequence. Convergence, in the sense \(\rho > 0\), implies that the moment problem is well-posed. That is, the moments (equivalently: cumulants) determine the law of the underlying random variable. (Note that this might not be the case if \(\rho =0\). See also [27] for a related discussion in the context of diamond expansions.) Similarly, understanding the growth of expected signatures or signature cumulants is an important task: in a celebrated paper [15] it was shown that under infinite convergence radius of the expected signature, in the sense of (40), the “expected signature problem” is well-posed; that is, the expected signature (equivalently: signature cumulants) determines the law of the random signature. In Sect. 6.1 we show, for the case of time-inhomogeneous Lévy processes, how our functional equations lead to a verifiable sufficient condition for the expected signature to have an infinite convergence radius.
Acknowledgements
PKF and NT acknowledge support from DFG CRC/TRR 388 “Rough Analysis, Stochastic Dynamics and Related Fields”, projects A02, A05 (PF) and B01 (NT). PKF is also supported by the DFG Excellence Cluster MATH+ via a MATH+ Distinguished Fellowship.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
This follows from the fact that for linear operators, power series calculus is multiplicative in the sense that the composition \(G(L)H(L)\) equals \(GH(L)\) where \(GH(z)\) is the Cauchy product of G and H. Moreover, \(G(z)\) and \(H(z)\) are multiplicative inverses as power series so that \(GH(z) = 1\).
From the French “continu à droite, limite à gauche” (right-continuous with left limits).
References
1. E. Abi Jaber, M. Larsson, S. Pulido, Affine Volterra processes. Ann. Appl. Probab. 29(5), 3155–3200 (2019)
2. Y. Aït-Sahalia, J. Jacod, High-Frequency Financial Econometrics (Princeton University Press, Princeton, 2014)
3. E. Alòs, J. Gatheral, R. Radoičić, Exponentiation of conditional expectations under stochastic volatility. Quant. Finance 20(1), 13–27 (2020)
4. C. Améndola, P. Friz, B. Sturmfels, Varieties of signature tensors, in Forum of Mathematics, Sigma, vol. 7 (Cambridge University Press, Cambridge, 2019)
5. D. Applebaum, Lévy Processes and Stochastic Calculus. Camb. Stud. Adv. Math., vol. 116, 2nd edn. (Cambridge University Press, Cambridge, 2009)
6. C. Bellingeri, P.K. Friz, S. Paycha, R. Preiß, Smooth rough paths, their geometry and algebraic renormalization. Vietnam J. Math. 50(3), 719–761 (2022)
7. S. Blanes, F. Casas, J. Oteo, J. Ros, The Magnus expansion and some of its applications. Phys. Rep. 470(5–6), 151–238 (2009)
8. H. Boedihardjo, J. Diehl, M. Mezzarobba, H. Ni, The expected signature of Brownian motion stopped on the boundary of a circle has finite radius of convergence. Bull. Lond. Math. Soc. 53(1), 285–299 (2021)
9. P. Bonnier, H. Oberhauser, Signature cumulants, ordered partitions, and independence of stochastic processes. Bernoulli 26(4), 2727–2757 (2020)
10. P. Bonnier, C. Liu, H. Oberhauser, Adapted topologies and higher rank signatures. Ann. Appl. Probab. 33(3), 2136–2175 (2023)
11. E. Breuillard, P. Friz, M. Huesmann, From random walks to rough paths. Proc. Am. Math. Soc. 137(10), 3487–3496 (2009)
12. F. Casas, A. Murua, An efficient algorithm for computing the Baker–Campbell–Hausdorff series and some of its applications. J. Math. Phys. 50(3), 033513, 23 (2009)
13. T. Cass, E. Ferrucci, On the Wiener chaos expansion of the signature of a Gaussian process. Probab. Theory Relat. Fields 189, 909–947 (2024)
14. I. Chevyrev, Random walks and Lévy processes as rough paths. Probab. Theory Relat. Fields 170(3–4), 891–932 (2018)
15. I. Chevyrev, T. Lyons, Characteristic functions of measures on geometric rough paths. Ann. Probab. 44(6), 4049–4082 (2016)
16. C. Cuchiero, D. Filipović, E. Mayerhofer, J. Teichmann, Affine processes on positive semidefinite matrices. Ann. Appl. Probab. 21(2), 397–463 (2011)
17. D. Duffie, D. Filipović, W. Schachermayer, Affine processes and applications in finance. Ann. Appl. Probab. 13(3), 984–1053 (2003)
18. A. Estrade, Exponentielle stochastique et intégrale multiplicative discontinues. Ann. Inst. Henri Poincaré Probab. Stat. 28(1), 107–129 (1992)
19. T. Fawcett, Problems in stochastic analysis: connections between rough paths and non-commutative harmonic analysis. PhD thesis, University of Oxford (2002)
20. P. Friz, J. Gatheral, Diamonds and forward variance models. Preprint, arXiv:2205.03741 [q-fin.MF] (2022)
21. P.K. Friz, M. Hairer, A Course on Rough Paths. Universitext, 2nd edn. (Springer International Publishing, Berlin, 2020)
22. P.K. Friz, A. Shekhar, General rough integration, Lévy rough paths and a Lévy–Kintchine-type formula. Ann. Probab. 45(4), 2707–2765 (2017)
23. P.K. Friz, N.B. Victoir, The Burkholder–Davis–Gundy Inequality for Enhanced Martingales. Lecture Notes in Mathematics, vol. 1934 (Springer, Berlin, 2006)
24. P.K. Friz, N.B. Victoir, Multidimensional Stochastic Processes as Rough Paths: Theory and Applications. Cambridge Studies in Advanced Mathematics (Cambridge University Press, Cambridge, 2010)
25. P.K. Friz, H. Zhang, Differential equations driven by rough paths with jumps. J. Differ. Equations 264(10), 6226–6301 (2018)
26. P. Friz, P. Gassiat, T. Lyons, Physical Brownian motion in a magnetic field as a rough path. Trans. Am. Math. Soc. 367(11), 7939–7955 (2015)
27. P.K. Friz, J. Gatheral, R. Radoičić, Forests, cumulants, martingales. Ann. Probab. 50(4), 1418–1445 (2022)
28. P.K. Friz, P.P. Hager, N. Tapia, Unified signature cumulants and generalized Magnus expansions, in Forum of Mathematics, Sigma, vol. 10 (Cambridge University Press, Cambridge, 2022), p. e42
29. P.K. Friz, T. Lyons, A. Seigal, Rectifiable paths with polynomial log-signature are straight lines. Bull. Lond. Math. Soc. 56, 2922 (2024)
30. M. Fukasawa, K. Matsushita, Realized cumulants for martingales. Electron. Commun. Probab. 26, 10 (2021), Id/No 12
31. J. Gatheral, M. Keller-Ressel, Affine forward variance models. Finance Stoch. 23(3), 501–533 (2019)
32. M. Hairer, Renormalisation in the presence of variance blowup. Preprint, arXiv:2401.10868 [math.PR] (2024)
33. M. Hakim-Dowek, D. Lépingle, L’exponentielle stochastique des groupes de Lie, in Séminaire de Probabilités XX 1984/85 (Springer, 1986), pp. 352–374
34. F. Hausdorff, Die symbolische Exponentialformel in der Gruppentheorie. Ber. Verh. Kgl. Sächs. Ges. Wiss. Leipzig, Math.-Phys. Kl. 58, 19–48 (1906)
35. A. Iserles, S.P. Nørsett, On the solution of linear differential equations in Lie groups. Philos. Trans. R. Soc. A 357(1754), 983–1019 (1999)
36. A. Iserles, H. Munthe-Kaas, S. Nørsett, A. Zanna, Lie-group methods. Acta Numer. 9, 1 (2005)
37. J. Jacod, A.N. Shiryaev, Limit Theorems for Stochastic Processes. Grundlehren der Mathematischen Wissenschaften, vol. 288 (Springer, Berlin, 2003)
38. K. Kamm, S. Pagliarani, A. Pascucci, On the stochastic Magnus expansion and its application to SPDEs. J. Sci. Comput. 89(3), 31 (2021), Id/No 56
39. M. Keller-Ressel, W. Schachermayer, J. Teichmann, Affine processes are regular. Probab. Theory Relat. Fields 151(3–4), 591–611 (2011)
40. T.G. Kurtz, E. Pardoux, P. Protter, Stratonovich stochastic differential equations driven by general semimartingales. Ann. Inst. Henri Poincaré Probab. Stat. 31(2), 351–377 (1995)
41. H. Lacoin, R. Rhodes, V. Vargas, A probabilistic approach of ultraviolet renormalization in the boundary sine-Gordon model. Probab. Theory Relat. Fields 185(1–2), 1–40 (2023)
42. J.-F. Le Gall, Brownian Motion, Martingales, and Stochastic Calculus (Springer, 2016)
43. Y. LeJan, Z. Qian, Stratonovich’s signatures of Brownian motion determine Brownian sample paths. Probab. Theory Relat. Fields 157, 209 (2011)
45. S. Li, H. Ni, Expected signature of stopped Brownian motion on d-dimensional \(C^{2, \alpha }\)-domains has finite radius of convergence everywhere: \(2 \leq d \leq 8\). J. Funct. Anal. 282(12), 45 (2022), Id/No 109447
46. S. Li, H. Ni, Q. Zhu, Small mass limit of expected signature for physical Brownian motion. Preprint, arXiv:2305.00343 [math.PR] (2023)
47. T. Lyons, Rough paths, signatures and the modelling of functions on streams, in Proceedings of the International Congress of Mathematicians—Seoul 2014, vol. IV (Kyung Moon Sa, Seoul, 2014), pp. 163–184
48. T. Lyons, N. Victoir, Cubature on Wiener space. Proc. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 460(2041), 169–198 (2004)
49. W. Magnus, On the exponential solution of differential equations for a linear operator. Commun. Pure Appl. Math. 7(4), 649–673 (1954)
50. S.I. Marcus, Modeling and analysis of stochastic differential equations driven by point processes. IEEE Trans. Inform. Theory 24(2), 164–172 (1978)
51. S.I. Marcus, Modeling and approximation of stochastic differential equations driven by semimartingales. Stochastics 4(3), 223–245 (1981)
52. H.P. McKean, Stochastic Integrals. AMS Chelsea Publishing, vol. 353 (American Mathematical Society, 1969)
53. P.A. Mykland, Bartlett type identities for martingales. Ann. Stat. 22(1), 21–38 (1994)
54. P.E. Protter, Stochastic Integration and Differential Equations. Stochastic Modelling and Applied Probability, 2nd edn. (Springer, Berlin Heidelberg, 2005)
55. C. Reutenauer, Free Lie algebras, in Handbook of Algebra, vol. 3 (Elsevier, Amsterdam, 2003), pp. 887–903
56. D. Revuz, M. Yor, Continuous Martingales and Brownian Motion. Grundlehren der mathematischen Wissenschaften (Springer, Berlin Heidelberg, 2004)
57. F. Schur, Zur Theorie der endlichen Transformationsgruppen. Math. Ann. 38, 263–286 (1891)