1 Introduction

Since the early 1990s and the introduction of the Brownian continuum random tree by Aldous [2], random trees have been the object of intense research in probability theory. Various classes of continuous random trees have been considered to extend the initial Brownian tree setup; two important such classes are Lévy trees and fragmentation trees. The Lévy trees were introduced by Le Gall and Le Jan [20] to describe the genealogical structure of continuous-state branching processes and furnish all the possible scaling limits of Galton–Watson trees [12]. The fragmentation trees [16] encode the genealogy of self-similar fragmentation processes and are the scaling limits of Markov branching trees [15]. The intersection of these two classes consists of the stable trees of parameter

$$\begin{aligned} \alpha \in (1,2]. \end{aligned}$$

When \(\alpha =2\), the \(2\)-stable tree corresponds to Aldous’ Brownian CRT, up to a multiplicative scaling. For general \(\alpha \in (1,2]\), the \(\alpha \)-stable tree can either be seen as the scaling limit of conditioned critical Galton–Watson trees with offspring distribution in the domain of attraction of a stable law of index \(\alpha \) [11], or as a self-similar fragmentation tree of index \(1/\alpha -1\) which is invariant under uniform re-rooting [18]. The increasing function \(\alpha \mapsto 1-1/\alpha \) will appear repeatedly when dealing with stable trees and we shall use the following notation throughout the paper:

$$\begin{aligned} \bar{\alpha } = 1-1/\alpha . \end{aligned}$$

We adopt the convention of Duquesne and Le Gall [12] and denote by \( \mathcal{T }_{\alpha }\) the standard \(\alpha \)-stable tree, which is the one describing the genealogical structure of a continuous-state branching process with branching mechanism \(\lambda \mapsto \lambda ^{\alpha }\). Alternatively, for \(\alpha =2\), it can be defined as the scaling limit of a conditioned Galton–Watson tree with offspring distribution \(\eta \) with mean 1 and variance \(\sigma ^2 \in (0,\infty )\) in the following sense:

$$\begin{aligned} \frac{T_n^{\text{ GW}}}{\sqrt{n}} \xrightarrow [{n \rightarrow \infty }]{(\mathrm{d})} \frac{\sqrt{2}}{\sigma } \mathcal{T }_2, \end{aligned}$$

where \(T_n^{\text{ GW}}\) is a version of the above mentioned Galton–Watson tree conditioned to have \(n\) vertices. Similarly, for \(\alpha \in (1,2)\) and for an offspring distribution \(\eta \) with mean 1 and such that \(\eta (k)\sim Ck^{-1-\alpha }\) as \(k \rightarrow \infty \), we have that:

$$\begin{aligned} \frac{T_n^{\text{ GW}}}{n^{\bar{\alpha }}} \xrightarrow [{n \rightarrow \infty }]{(\mathrm{d})} \left(\frac{\alpha (\alpha -1)}{C\Gamma (2-\alpha )}\right)^{1/\alpha } \mathcal{T }_{\alpha }. \end{aligned}$$

These convergences hold for the Gromov–Hausdorff topology, as recalled in Sect. 2.

The nested family of stable trees. Some connections between different stable and Lévy trees have already been observed. In [1] Abraham et al. present a pruning procedure for a large class of Lévy trees that leads to other Lévy trees. However, when applied to stable trees, their procedure yields pruned subtrees that are no longer stable. In the other direction, Bertoin et al. [9] obtain stable trees by randomly folding the branches of a Brownian tree. These results are a priori unrelated to the ones presented here.

It is well-known [13, 16] that the Hausdorff dimension of the \(\alpha \)-stable tree is almost surely \(1/\bar{\alpha }\), for all \(\alpha \in (1,2]\). Thus, in some sense, the stable trees are decreasing in the parameter \(\alpha \). The goal of this work is to give a precise geometric statement of this heuristic and to show that all the stable trees can be seen as a single family of (random) nested trees. To do so, rather than considering the standard versions of the stable trees \( \mathcal{T }_{\alpha }\), we will consider randomly scaled versions, so that it is possible to build on the same probability space a family of nested stable trees. More precisely, we let \(\Gamma _{a}\), \(a>0\), denote a Gamma random variable with shape parameter \(a\) and scale parameter 1, that is, with density proportional to \(x^{a-1}e^{-x}\mathbf 1 _{\{x >0\}}\). We then set

$$\begin{aligned} J_{\alpha } \overset{(\mathrm{d})}{=} \alpha \cdot \left(\Gamma _{1+\bar{\alpha }}\right)^{\bar{\alpha }}\!. \end{aligned}$$
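For concreteness, \(J_{\alpha }\) is straightforward to simulate from this definition. The following Python sketch (the function name `sample_J` is ours, not from the paper) draws samples of \(J_{\alpha } = \alpha \cdot (\Gamma _{1+\bar{\alpha }})^{\bar{\alpha }}\) with the standard library Gamma sampler:

```python
import math
import random

def sample_J(alpha, n, seed=None):
    """Draw n samples of J_alpha = alpha * (Gamma_{1 + abar})^abar, where
    abar = 1 - 1/alpha and Gamma_a has shape parameter a and scale 1."""
    rng = random.Random(seed)
    abar = 1.0 - 1.0 / alpha
    return [alpha * rng.gammavariate(1.0 + abar, 1.0) ** abar for _ in range(n)]

# Brownian case alpha = 2: J_2 = 2 * (Gamma_{3/2})^(1/2), whose mean is
# 2 * Gamma(2) / Gamma(3/2) = 4 / sqrt(pi) ~ 2.2568.
xs = sample_J(2.0, 200_000, seed=0)
```

In the Brownian case \(\alpha =2\) one can check empirically that the sample mean is close to \(4/\sqrt{\pi }\), as computed in the comment above.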

This variable has been designed so that if we rescale a stable tree \( \mathcal{T }_{\alpha }\) by an independent variable distributed as \( J_{\alpha }\) (that is, we multiply the distances in \( \mathcal{T }_{\alpha }\) by the factor \( J_{\alpha }\)), then the height of a uniform random point in \( J_{\alpha } \cdot \mathcal{T }_{\alpha }\) has a distribution that does not depend on \(\alpha \), namely a Gamma distribution of parameter \(2\). Our main result is then

Theorem 1

There exists a process of rescaled nested stable trees \(({\fancyscript{T}}_{\alpha },1 < \alpha \le 2)\) such that:

  • \({\fancyscript{T}}_{\alpha }\overset{(\mathrm{d})}{=}J_{\alpha } \cdot \mathcal{T }_{\alpha }\), where \(J_{\alpha }\) is independent of \(\mathcal{T }_{\alpha }\), \(\alpha \in (1,2],\)

  • \({\fancyscript{T}}_{\alpha ^{\prime }} \subset {\fancyscript{T}}_{\alpha }\), for all \(1<\alpha \le \alpha ^{\prime } \le 2\).

The existence of this decreasing process of rescaled stable trees is actually a simple corollary of the following proposition, which says that hidden inside any stable tree of parameter \(\alpha \), there exists a rescaled version of a stable tree of parameter \(\alpha ^{\prime }>\alpha \) (see Fig. 1).

Fig. 1

A Brownian tree hidden in a stable tree

Proposition 2

Let \(1 < \alpha < \alpha ^{\prime } \le 2.\) There exists a closed (random) subtree \( \mathfrak{T }_{\alpha ,\alpha ^{\prime }}\) of the \(\alpha \)-stable tree \(\mathcal{T }_{\alpha }\), such that

$$\begin{aligned} \mathfrak{T }_{\alpha ,\alpha ^{\prime }} \overset{(\mathrm{d})}{=} (M_{\alpha ,\alpha ^{\prime }})^{\bar{\alpha }^{\prime }} \cdot \mathcal{T }_{\alpha ^{\prime }}, \end{aligned}$$

where \( \mathcal{T }_{\alpha ^{\prime }}\) is a standard \(\alpha ^{\prime }\)-stable tree and \( M_{\alpha ,\alpha ^{\prime }}\) is an independent variable distributed as \((\alpha ^{\prime }/\alpha )^{1/\bar{\alpha }^{\prime }}\) times a generalized Mittag-Leffler distribution with parameters \(\big ({\bar{\alpha }}/{\bar{\alpha }^{\prime }}, \bar{\alpha } \big )\).

The definition of generalized Mittag-Leffler distributions is recalled in Sect. 3.1. Throughout this work, the boundary case \(\alpha ^{\prime } =2\), where a (rescaled) Brownian tree is extracted from an \(\alpha \)-stable tree, deserves special attention since our constructions and proofs simplify substantially in this case. Although not unique, the subtree \( \mathfrak{T }_{\alpha ,\alpha ^{\prime }}\) of the last proposition can be explicitly constructed by a pruning procedure applied to the tree \( \mathcal{T }_{\alpha }\).

Pruning procedure. Let \(1 < \alpha < \alpha ^{\prime } \le 2.\) Conditionally on the stable tree \( \mathcal{T }_{\alpha }\), let \(X_{i},i \ge 0\) be i.i.d. random leaves sampled according to its uniform mass measure. Then write \( \mathfrak t _n\) for the subtree spanned by \(X_0, X_1, \ldots , X_n\) in \( \mathcal{T }_\alpha \). For \(d,d^{\prime } \in \{0,2,3,4,5,\ldots \}\) with \(d \ge d^{\prime }\) and \(d \ge 2\), we introduce the following probabilities

$$\begin{aligned} p_{\alpha ,\alpha ^{\prime },d,d^{\prime }}&= \left\{ \begin{array}{ccl} 0&\,&\text{ if}\,d^{\prime }=0,\\ 1&\,&\text{ if}\,d^{\prime }=2,\\ \frac{(d^{\prime }-1-\alpha ^{\prime })(\alpha -1)}{(d-1-\alpha )(\alpha ^{\prime }-1)}&\,&\text{ otherwise}, \end{array} \right. \quad \in [0,1], \end{aligned}$$
(1)

from which we construct inductively a sequence of subtrees \(\tau _{n} \subset \mathfrak t _n \subset \mathcal{T }_{\alpha }\) as follows:

  • \( \tau _1= \mathfrak t _1 = [ [ X_0, X_1 ] ]\) is the line segment linking \(X_0\) to \(X_1\) in \( \mathcal{T }_{\alpha }\),

  • at step \(i\ge 2\), write \( \mathfrak t _{i} = \mathfrak t _{i-1} \cup [ [ X_{i}, \Delta _{i} ] ]\), where \(\Delta _{i}\in \mathfrak t _{i-1} \) and \([ [ X_i,\Delta _i ] ]\) is the shortest path in \(\mathfrak t _i\) connecting \(X_{i}\) to \( \mathfrak t _{i-1}\). Then let \(d_{i}\) denote the degree (multiplicity) of \(\Delta _{i}\) in \( \mathfrak t _{i-1}\) and \(d^{\prime }_{i}\) its degree in \( \tau _{i-1}\), with the convention that \( d_i^{\prime }=0\) if \(\Delta _{i} \notin \tau _{i-1}\). Finally set (see Fig.  2)

    $$\begin{aligned} \tau _{i} = \tau _{i-1}\cup [ [ X_{i}, \Delta _{i} ] ]&\text{ with} \text{ probability}&p_{\alpha ,\alpha ^{\prime },d_{i},d^{\prime }_{i}}, \\ \tau _{i}= \tau _{i-1}&\text{ otherwise}. \end{aligned}$$
Fig. 2

Illustration of the pruning construction. Here \(d^{\prime }=3\), \(d=4\) and the trees \(\tau _5\) and \(\tau _6\) are in blue (color figure online)

The sequence \((\tau _n)\) is clearly increasing in \( \mathcal{T }_{\alpha }\) and we denote by \( \text{ Prun}_{\alpha ,\alpha ^{\prime }}( \mathcal{T }_\alpha ; (X_i)_{i \ge 0} )\) the closure of this increasing union.

Notice that when \( \alpha ^{\prime }=2\) we have \(p_{\alpha ,2,d,3}=0\) and thus no branch point of degree larger than or equal to \(4\) is created in \( \tau _{n}\). In other words, \((\tau _{n})\) is a sequence of binary trees. In this case, the pruning procedure boils down to adding those points \(X_{n+1}\) whose attachment to \(\tau _{n}\) preserves the binary structure of the tree, and thus \( \text{ Prun}_{\alpha ,2}( \mathcal{T }_{\alpha } ; (X_{i})_{i\ge 0})\) can be seen as a deterministic function of \( \mathcal{T }_{\alpha }\) and of the leaves \(X_{i}, i \ge 0\).
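The acceptance probabilities of (1) are elementary to evaluate. The following minimal Python helper (the name `prune_prob` is our own choice) implements them and makes the vanishing at \(\alpha ^{\prime }=2\), \(d^{\prime }=3\) transparent:

```python
def prune_prob(alpha, alpha_p, d, d_p):
    """Acceptance probability p_{alpha, alpha', d, d'} of Eq. (1), for
    1 < alpha < alpha' <= 2, d' in {0, 2, 3, 4, ...} and d >= max(d', 2)."""
    if d_p == 0:
        return 0.0
    if d_p == 2:
        return 1.0
    # generic case d' >= 3
    return (d_p - 1 - alpha_p) * (alpha - 1) / ((d - 1 - alpha) * (alpha_p - 1))

# For alpha' = 2 and d' = 3 the numerator d' - 1 - alpha' vanishes, so
# attaching at an existing branch point of tau is always rejected and the
# pruned trees stay binary.
```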

Theorem 3

The statement of Proposition 2 holds with

$$\begin{aligned} \mathfrak{T }_{\alpha ,\alpha ^{\prime }} = \mathrm{Prun}_{\alpha ,\alpha ^{\prime }}\big ( \mathcal{T }_\alpha ; (X_i)_{i \ge 0} \big ). \end{aligned}$$

Moreover, since \( \alpha ^{\prime } \mapsto p_{\alpha ,\alpha ^{\prime },d,d^{\prime }}\) is decreasing, we can couple the realizations of the pruned subtrees such that \( \alpha ^{\prime } \in (\alpha ,2] \mapsto \text{ Prun}_{\alpha ,\alpha ^{\prime }}\big ( \mathcal{T }_\alpha ; (X_i)_{i \ge 0} \big )\) is decreasing inside \( \mathcal{T }_\alpha \). The pruning operation can also be viewed from a fragmentation point of view which sheds new light on the construction of \( \mathfrak{T }_{\alpha ,\alpha ^{\prime }}\).

Modification of the stable fragmentation. Fragmentation theory is a mathematical attempt to model the behavior of particles that undergo a splitting process, see [7] for an account of this field. Any \(\alpha \)-stable tree \( \mathcal{T }_{\alpha }\) can be associated with a self-similar fragmentation process that describes the masses of the subtrees of \( \mathcal{T }_{\alpha }\) above a certain height. In other words, an \(\alpha \)-stable tree can be seen as the genealogical tree (in the sense of [16]) of a pure-jump self-similar fragmentation process with index \(-\bar{\alpha }\), whose dislocation measure \(\nu _{\alpha }\) has been identified by Bertoin [6] in the Brownian case \(\alpha =2\) and by Miermont [22] in the general case (see Sect. 5 for an expression of \(\nu _{\alpha }\)). All these dislocation measures are conservative, which means that mass is preserved at each dislocation. In Sect. 5 we introduce, for \(1<\alpha < \alpha ^{\prime } \le 2\), a modification of the dislocation measure \(\nu _{\alpha }\) obtained (roughly speaking) by randomly keeping some of the fragments created by \(\nu _{\alpha }\). When \(\alpha ^{\prime }=2\), this simply amounts to keeping only two fragments, chosen proportionally to their sizes. In the other cases the procedure is more complex and depends on the probabilities \(p_{\alpha ,\alpha ^{\prime },d,d^{\prime }}\) introduced in (1). Obviously, the resulting dislocation measure \(\nu _{\alpha ,\alpha ^{\prime }}\) is now dissipative (some mass is lost at each dislocation). It is still possible to associate with it a random tree coding its genealogy (see [26]) and we have:

Proposition 4

The tree \( \mathfrak{T }_{\alpha ,\alpha ^{\prime }}\) of Proposition 2 can be seen as the genealogical tree of a pure-jump self-similar fragmentation of index \(-\bar{\alpha }\) and dislocation measure \(\nu _{\alpha ,\alpha ^{\prime }}\).

Thus, in addition to being the genealogy of a canonical conservative fragmentation process, the stable tree of parameter \(\alpha ^{\prime }\) is also, up to a random scaling, the genealogical tree of a dissipative fragmentation process with dislocation measure \(\nu _{\alpha ,\alpha ^{\prime }}\) and self-similarity index \(-\bar{\alpha }\), for all \(\alpha \in (1,\alpha ^{\prime })\). Moreover, as a dissipative fragmentation tree, \( \mathfrak{T }_{\alpha ,\alpha ^{\prime }}\) naturally carries a Malthusian measure whose total mass is distributed, up to a deterministic scaling, as the random variable \(M_{\alpha ,\alpha ^{\prime }}\) appearing in the random scaling of Proposition 2. In Theorem 15 of Sect. 5, we reinforce Proposition 4 by showing that this dissipative fragmentation tree endowed with its Malthusian measure is distributed as an \(\alpha ^{\prime }\)-stable tree with its natural uniform mass measure, up to a random scaling depending on \(M_{\alpha ,\alpha ^{\prime }}\).

Strategy and organization of the paper. The construction of the subtree \( \mathfrak{T }_{\alpha ,\alpha ^{\prime }}\) presented in Proposition 2 relies on a discrete approach. In [21], Marchal introduced a Markov chain \(( \mathbf T _{\alpha }(n), n \ge 1)\) of increasing labeled trees which, once renormalized, converge towards the stable tree of parameter \(\alpha \). Let us briefly present this construction: we start with \( \mathbf T _{\alpha }(1)\), the tree with a single edge; then, inductively at step \(n \ge 2\), we assign a weight \(\alpha -1\) to every edge of the tree \( \mathbf T _{\alpha }(n)\) and a weight \(d-1-\alpha \) to each vertex of degree \(d \ge 3\). An edge or a vertex of \( \mathbf T _{\alpha }(n)\) is chosen at random according to these weights and a new edge is attached either directly to the chosen vertex or to the “middle” of the chosen edge, see Fig. 3 and Sect. 2.3 for more details. In the case \(\alpha =2\) this construction is the famous algorithm for growing binary trees due to Rémy [25]. We prove in Theorem 5 that, once renormalized, the trees \( \mathbf T _{\alpha }(n)\) converge towards a stable tree

$$\begin{aligned} \frac{ \mathbf T _{\alpha }(n)}{n^{\bar{\alpha }}} \xrightarrow [n\rightarrow \infty ]{\text{ a.s.}} \alpha \mathcal{T }_{\alpha }, \end{aligned}$$

in the Gromov–Hausdorff sense. This result thus strengthens the convergence in probability already obtained in [17, Corollary 24].

Fig. 3

Illustration of the recursive rules to construct \( \mathbf T _{\alpha }(n+1)\) from \(\mathbf T _{\alpha }(n)\)

The key observation that triggered this work is that it is possible to identify within Marchal’s construction of the \(\mathbf T _{\alpha }(n)\)’s a “sub-Markov chain” of trees \(\mathbf T _{\alpha ,\alpha ^{\prime }}(n) \subset \mathbf T _{\alpha }(n)\) whose growth mechanism is identical to Marchal’s algorithm but with parameter \(\alpha ^{\prime }>\alpha \), see Sect. 3. The scaling limit of the sequence \(( \mathbf T _{\alpha ,\alpha ^{\prime }}(n), n \ge 1)\) then furnishes the tree \( \mathfrak{T }_{\alpha ,\alpha ^{\prime }}\) of Proposition 2.

The paper is organized as follows. Section 2 contains the background on discrete and continuous trees as well as the presentation of Marchal’s construction and its almost sure convergence towards the stable tree (Theorem 5). We then move in Sect. 3 to the observation that sub-constructions lie inside Marchal’s algorithm and deduce Proposition 2 and Theorem 1. Section 4 is devoted to the pruning construction of \( \mathfrak{T }_{\alpha ,\alpha ^{\prime }}\) (Theorem 3). Finally, the last section explores the fragmentation approach and its consequences.

2 Background on stable trees

In this section, we present the recursive construction of Marchal and prove (Theorem 5) that it converges almost surely, after rescaling, towards a stable tree. Before embarking on that topic, we introduce some background on trees and on the Gromov–Hausdorff and Gromov–Hausdorff–Prokhorov distances.

2.1 Discrete and continuous trees

Discrete trees. A discrete tree \(\tau \) is a finite connected graph without cycles, considered up to graph isomorphism: it is not embedded in any space and its vertices are unlabeled. If \(x\) and \(y\) are two vertices of a tree \(\tau \), we denote by \( [ [ x,y ] ]\) the discrete geodesic in \(\tau \) between \(x\) and \(y\). If \(y_{1}, y_{2},\ldots , y_{k}\) are (distinct) vertices of \(\tau \) we write

$$\begin{aligned} \text{ Span}(\tau ; y_{1}, \ldots , y_{k}) = \bigcup _{1 \le i,j \le k} [ [ y_{i}, y_{j} ] ], \end{aligned}$$

for the discrete tree spanned by these vertices. The degree of a vertex is the number of edges adjacent to it, for instance a leaf is a vertex of degree one. A tree is binary if the degrees of its vertices are in \(\{1,2,3\}\).
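The spanned subtree can be computed directly from this definition. Below is a short illustrative Python sketch (the adjacency-list representation and the names `geodesic` and `span` are our own choices, not from the paper), which returns the vertex set of \(\text{ Span}(\tau ; y_{1},\ldots ,y_{k})\) as the union of pairwise geodesics:

```python
from collections import deque

def geodesic(adj, x, y):
    """Vertices of the unique path [[x, y]] in a tree given by adjacency lists."""
    parent = {x: None}
    queue = deque([x])
    while queue:                      # BFS from x until y is reached
        v = queue.popleft()
        if v == y:
            break
        for w in adj[v]:
            if w not in parent:
                parent[w] = v
                queue.append(w)
    path, v = [], y
    while v is not None:              # walk back from y to x
        path.append(v)
        v = parent[v]
    return path[::-1]

def span(adj, vertices):
    """Vertex set of Span(tau; y_1, ..., y_k): the union of pairwise geodesics."""
    spanned = set(vertices)
    for i, a in enumerate(vertices):
        for b in vertices[i + 1:]:
            spanned.update(geodesic(adj, a, b))
    return spanned
```

Since a tree has a unique simple path between any two vertices, a single BFS per pair suffices.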

A labeled discrete tree \(\varvec{\tau }=(\tau ; x_0, x_1, \ldots , x_n)\) is a pair formed by a discrete tree \(\tau \) together with an exhaustive enumeration of its leaves. If \(\varvec{\tau }\) is a labeled tree, we call the shape of \( \varvec{\tau }\) the tree \(\tau \) obtained by forgetting the labeling. We will systematically use bold letters for labeled trees and standard ones for their associated shapes.

Continuous trees. We briefly recall here some facts on \(\mathbb{R }\)-trees and refer to [14, 19] for an overview of this topic. An \(\mathbb{R }\)-tree is a metric space \((\mathcal{T },d)\) such that for every \( x,y\in \mathcal{T }\),

  • there is an isometry \(\varphi _{x,y}:[0,d(x,y)]\rightarrow \mathcal{T }\) such that \(\varphi _{x,y}(0)=x\) and \(\varphi _{x,y}(d(x,y))=y\)

  • for every continuous, injective function \(c:[0,1]\rightarrow \mathcal{T }\) with \(c(0)=x,\,c(1)=y\), one has \(c([0,1])=\varphi _{x,y}([0,d(x,y)])\).

We identify two \(\mathbb{R }\)-trees when they are isometric, and still use the notation \((\mathcal{T },d)\) to represent an isometry class. Note that a discrete tree may be seen as an \(\mathbb{R }\)-tree by “replacing” its edges by segments. Unless otherwise specified, it is implicit in this paper that these line segments are all of length 1. We keep much of the notation introduced in the context of discrete trees when it extends unambiguously to \(\mathbb{R }\)-trees. In particular if \((\mathcal{T },d)\) is an \(\mathbb{R }\)-tree and if \(a,b \in \mathcal{T }\) we denote by \( [ [ a,b ] ]\) the geodesic line segment between \(a\) and \(b\) in \( \mathcal{T }\). Also \(\text{ Span}(\mathcal{T }; y_{1},\ldots , y_{k}) = \cup _{1 \le i,j \le k} [ [ y_{i}, y_{j} ] ]\) still denotes the subtree spanned by \(y_{1},\ldots , y_{k} \in \mathcal{T }\). The degree (or multiplicity) of a point \(x \in \mathcal{T }\) is the number of connected components of \( \mathcal{T }\backslash \{x\}\). A leaf is a point of degree \(1\), a branch point is a point of degree at least \(3\), and an \(\mathbb{R }\)-tree is said to be binary if the degrees of its points are in \(\{1,2,3\}\).

Gromov–Hausdorff–Prokhorov topology. The reader interested in the Gromov–Hausdorff and Gromov–Hausdorff–Prokhorov topologies should consult [10, 14, 23] for details and proofs. Let \(k \in \{0,1,2,\ldots \}\). A compact metric space \((E,d)\) is \(k\)-pointed if it is given together with \(k\) points \(x_{1},\ldots , x_{k} \in E\) (when \(k=0\) this is just a compact metric space). An isometry between two \(k\)-pointed compact metric spaces is an isometry between the spaces that maps the \(k\) distinguished points to each other. The set of isometry classes of \(k\)-pointed compact metric spaces is endowed with the classical (\(k\)-pointed) Gromov–Hausdorff topology, which makes it Polish. The corresponding distance can be defined as

$$\begin{aligned}&\text{ d}_\mathrm{GH}( (E,d; x_{1},\ldots , x_{k}),(E^{\prime },d^{\prime }; x_{1}^{\prime },\ldots , x_{k}^{\prime })) \\&\quad = \inf \Big \{ \text{ d}_\mathrm{H}(\phi (E),\phi ^{\prime }(E^{\prime })) \vee \max _{1\le i \le k} \delta (\phi (x_{i}),\phi ^{\prime }(x_{i}^{\prime }))\Big \}, \end{aligned}$$

where the infimum is taken over all choices of metric spaces \((F, \delta )\) and isometric embeddings \(\phi : E \rightarrow F\), \(\phi ^{\prime } : E^{\prime } \rightarrow F\), and where \(\text{ d}_\mathrm{H}\) denotes the Hausdorff distance in \(F\). The class of compact \(\mathbb{R }\)-trees forms the accumulation points of the class of rescaled discrete trees for the Gromov–Hausdorff topology and is thus closed. In the remainder of the paper, we use the notation \(a \cdot E\) for the rescaled metric space \((E,ad)\) for any \(a>0\).

The Gromov–Hausdorff topology can be enriched in order to take into account measured spaces. A compact metric space \((E,d)\) endowed with a Borel probability measure \(\mu \) is called a measured compact metric space. We can extend the Gromov–Hausdorff topology to isometry classes (defined in the obvious way) of measured compact metric spaces by putting

$$\begin{aligned} \text{ d}_\mathrm{GHP}((E,d,\mu ),(E^{\prime },d^{\prime },\mu ^{\prime })) = \inf \left\{ \text{ d}_\mathrm{H}(\phi (E),\phi ^{\prime }(E^{\prime }))\vee \text{ d}_{\text{ P}}(\phi _*\mu ,\phi ^{\prime }_*\mu ^{\prime })\right\} \!, \end{aligned}$$

where again \(\phi ,\phi ^{\prime }\) are isometric embeddings of \(E,E^{\prime }\) into a common space \((F,\delta )\), \(\phi _*\mu ,\phi ^{\prime }_*\mu ^{\prime }\) are the push-forwards of \(\mu ,\mu ^{\prime }\) by \(\phi ,\phi ^{\prime }\), and \(\text{ d}_{\text{ P}}\) is the Prokhorov distance between Borel probability measures on \(F\):

$$\begin{aligned} \text{ d}_\mathrm{P}(m,m^{\prime })=\inf \{\varepsilon >0:m(C)\le m^{\prime }(C^\varepsilon )+\varepsilon \,\, \text{ for} \text{ every} \,\,C\subset F \text{ closed}\}, \end{aligned}$$

where \(C^\varepsilon =\{x\in F:\inf _{y\in C}\delta (x,y)<\varepsilon \}\) is the \(\varepsilon \)-neighborhood of \(C\). The function \(\text{ d}_\mathrm{GHP}\) is a distance on the set of isometry classes of measured compact metric spaces, which makes it Polish.
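On a finite metric space, the Prokhorov distance can be computed by brute force directly from this definition. The sketch below is our own illustration, not from the paper; it uses the one-sided condition displayed above (which, for probability measures, is equivalent to its symmetrized version) and bisects over \(\varepsilon \), exploiting the fact that the condition is monotone in \(\varepsilon \):

```python
from itertools import combinations

def prokhorov(dist, m, m_p, iters=60, tol=1e-9):
    """Prokhorov distance between probability vectors m and m_p on the
    finite metric space {0, ..., n-1} with distance matrix dist."""
    n = len(m)
    subsets = [s for r in range(1, n + 1) for s in combinations(range(n), r)]

    def ok(eps):
        # check m(C) <= m'(C^eps) + eps for every (closed) subset C
        for C in subsets:
            c_eps = [x for x in range(n) if any(dist[x][y] < eps for y in C)]
            if sum(m[i] for i in C) > sum(m_p[i] for i in c_eps) + eps + tol:
                return False
        return True

    lo, hi = 0.0, max(max(row) for row in dist) + 1.0
    for _ in range(iters):            # bisection: ok(eps) is monotone in eps
        mid = (lo + hi) / 2.0
        if ok(mid):
            hi = mid
        else:
            lo = mid
    return hi
```

For instance, on a two-point space at distance 1, two Dirac masses at distinct points are at Prokhorov distance 1, while a measure is at distance 0 from itself.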

Let us give another convenient way to express Gromov–Hausdorff distances. A correspondence between two \(k\)-pointed compact metric spaces \((E ,d ; x_{1}, \ldots , x_{k})\) and \((E^{\prime },d^{\prime } ; x_{1}^{\prime }, \ldots , x_{k}^{\prime })\) is a subset \(\mathcal{R }\) of \(E\times E^{\prime }\) such that for every \(y_{1} \in E\) there exists at least one point \(y_{2}\in E^{\prime }\) with \((y_{1},y_{2}) \in \mathcal{R }\) and, conversely, for every \(z_{2}\in E^{\prime }\) there exists at least one point \(z_{1}\in E\) with \((z_{1},z_{2}) \in \mathcal{R }\). We moreover require that \((x_{1},x_{1}^{\prime }), \ldots , (x_{k}, x_{k}^{\prime }) \in \mathcal{R }\). The distortion of the correspondence \(\mathcal{R }\) is defined by

$$\begin{aligned} \text{ dis}(\mathcal{R }) = \sup \big \{|d(x_{1},y_{1})-d^{\prime }(x_{2},y_{2})| : (x_{1},x_{2}),(y_{1},y_{2}) \in \mathcal{R } \big \}. \end{aligned}$$

The Gromov–Hausdorff distance between \((E,d; x_{1}, \ldots , x_{k})\) and \((E^{\prime },d^{\prime }; x_{1}^{\prime },\ldots , x_{k}^{\prime })\) can be expressed as half the infimum of the distortions of correspondences between \(E\) and \(E^{\prime }\). A similar (but more involved) definition of \( \text{ d}_\mathrm{GHP}\) via correspondences can be found in [23].
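For finite metric spaces the distortion is a one-liner; the following sketch (the helper `distortion` and the toy two-point spaces are our own illustration) makes the half-infimum characterization concrete:

```python
def distortion(d, d_p, R):
    """Distortion of a correspondence R, given as a set of index pairs
    (i, j) with i a point of the first space and j of the second, where
    d and d_p are the two distance matrices."""
    return max(abs(d[x1][y1] - d_p[x2][y2]) for (x1, x2) in R for (y1, y2) in R)

# Two 2-point spaces at mutual distances 1 and 3: the "diagonal"
# correspondence has distortion |1 - 3| = 2, so d_GH <= 2 / 2 = 1.
d = [[0.0, 1.0], [1.0, 0.0]]
d_p = [[0.0, 3.0], [3.0, 0.0]]
R = {(0, 0), (1, 1)}
```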

2.2 The stable trees

We now give some additional background on stable trees and refer to [12, 13] for a complete account. Stable trees are random variables \(\mathcal{T }_{\alpha }\), for \(\alpha \in (1,2]\), taking values in the set of isometry classes of compact \(\mathbb{R }\)-trees, and they can be defined in various manners. For example, the tree \(\mathcal{T }_{\alpha }\) can be defined by its finite-dimensional marginals (see [12, Theorem 3.3.3]) or by its contour process. In particular, the \(2\)-stable tree can be identified with \(\sqrt{2}\) times the Brownian tree \(\mathcal{T }_\mathrm{Br}\) coded by a standard normalized excursion \( (\mathbf e _{t} : 0 \le t \le 1)\). Note also that the Brownian tree introduced by Aldous in [3] is equal to twice \( \mathcal{T }_\mathrm{Br}\), so that we have in distribution

$$\begin{aligned} \mathcal{T }_{2} = \sqrt{2}\, \mathcal{T }_\mathrm{Br} = \frac{ \mathcal{T }_\mathrm{Aldous}}{\sqrt{2}}. \end{aligned}$$

We mention that, contrary to the Brownian tree where all branch points are of degree \(3\), in the stable case \(1<\alpha <2\) the branch points are of infinite multiplicity ([13]).

Let us make precise the definition of \(\mathcal{T }_{\alpha }\) via scaling limits of Galton–Watson trees and introduce the mass measure on \( \mathcal{T }_{\alpha }\). For \(\alpha \in (1,2)\), consider an (unordered) Galton–Watson tree with offspring distribution \(\eta \) with mean 1 and such that \(\eta (k)\sim Ck^{-1-\alpha }\) as \(k \rightarrow \infty \). Let \(T^\mathrm{GW}_n\) denote a version of this tree conditioned to have \(n\) vertices, and equip it with the uniform probability measure on its vertices, denoted by \(\mu _n\). Then by Duquesne [11] (recall that \(\bar{\alpha }=1-1/\alpha \)),

$$\begin{aligned} \left(\frac{T^\mathrm{GW}_n}{n^{\bar{\alpha }}},\mu _n \right) \xrightarrow [n\rightarrow \infty ]{(\mathrm{d})} \left(\left(\frac{\alpha (\alpha -1)}{C\Gamma (2-\alpha )}\right)^{1/\alpha } \mathcal{T }_{\alpha }, \mu _{\alpha }\right), \end{aligned}$$

for the Gromov–Hausdorff–Prokhorov topology, where \(\mathcal{T }_{\alpha }\) is a stable tree of index \(\alpha \) equipped with a probability measure \(\mu _{\alpha }\) which can be interpreted as the uniform mass measure on \(\mathcal{T }_{\alpha }\). A similar result holds for \(\alpha =2\). It is well-known [12] that for all \(\alpha \in (1,2]\), the measure \(\mu _{\alpha }\) is actually fully supported by the set of leaves of \(\mathcal{T }_{\alpha }\) and that it can be measurably constructed from \(\mathcal{T }_{\alpha }\). It thus makes sense to speak about an i.i.d. sample \(X_{n}, n \ge 0\) of leaves according to \( \mu _{\alpha }\) conditionally on \(\mathcal{T }_{\alpha }\). We shall repeatedly use the following easy fact: \(\{X_{n}, n \ge 0\}\) is dense in \(\mathcal{T }_{\alpha }\) a.s.

2.3 Marchal’s recursive construction

Let \(\alpha \in (1,2]\). In [21], Marchal proposed a recursive construction of random finite trees that converge, in the scaling limit, towards the \(\alpha \)-stable tree. The construction is in fact a Markov chain \((\mathbf T _{\alpha }(n))_{n\ge 1}\) with values in the set of labeled trees, such that \( \mathbf T _{\alpha }(n)\) has \(n+1\) leaves, and is defined as follows: we start with the tree \( \mathbf T _{\alpha }(1)\), the unique tree with one edge and two labeled vertices denoted by \(A_{0}\) and \(A_{1}\). We then build \( \mathbf T _{\alpha }(n+1)\) recursively from \( \mathbf T _{\alpha }(n)\) by adding a new edge. More precisely, given \( \mathbf T _{\alpha }(n)\), assign a weight \(\alpha -1\) to every edge of \( \mathbf T _{\alpha }(n)\) and a weight \(d-1-\alpha \) to each vertex of degree \(d \ge 3\) (all other vertices have zero weight). The total weight \(W( \mathbf T _\alpha (n))\) of the tree \( \mathbf T _{\alpha }(n)\) is easily checked to be

$$\begin{aligned} W(\mathbf T _\alpha (n)) = n\alpha -1, \end{aligned}$$
(2)

and is in particular independent of the shape of the tree. We then choose an edge or a vertex of \( \mathbf T _{\alpha }(n)\) proportionally to its weight:

  • if we picked an edge, then we split it into two edges with a middle vertex, to which we attach a new edge carrying the \((n+1)\)th leaf, denoted by \(A_{n+1}\);

  • if a vertex has been selected, then we attach a new edge carrying \(A_{n+1}\) directly to it.
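The two growth rules above can be turned into a short simulation. The following Python sketch is our own illustration (trees are stored as adjacency lists, and the names `marchal`, `adj` are our choices); it implements exactly the weights \(\alpha -1\) per edge and \(d-1-\alpha \) per vertex of degree \(d \ge 3\):

```python
import random

def marchal(alpha, n, seed=None):
    """Simulate Marchal's chain up to T_alpha(n), a tree with n + 1 leaves,
    returned as a dict of adjacency lists."""
    rng = random.Random(seed)
    adj = {0: [1], 1: [0]}                   # T_alpha(1): one edge A_0 -- A_1
    nxt = 2                                  # next unused vertex name
    for _ in range(n - 1):
        items, weights = [], []
        for u in adj:                        # weight alpha - 1 per edge
            for v in adj[u]:
                if u < v:
                    items.append(("edge", (u, v))); weights.append(alpha - 1)
        for v in adj:                        # weight d - 1 - alpha per vertex of degree d >= 3
            if len(adj[v]) >= 3:
                items.append(("vertex", v)); weights.append(len(adj[v]) - 1 - alpha)
        kind, obj = rng.choices(items, weights=weights, k=1)[0]
        if kind == "edge":                   # split the edge and graft the new leaf
            u, v = obj
            mid, leaf = nxt, nxt + 1
            nxt += 2
            adj[u].remove(v); adj[v].remove(u)
            adj[u].append(mid); adj[v].append(mid)
            adj[mid] = [u, v, leaf]
            adj[leaf] = [mid]
        else:                                # graft the new leaf on the chosen vertex
            leaf = nxt
            nxt += 1
            adj[obj].append(leaf)
            adj[leaf] = [obj]
    return adj
```

On any sample one can check that the total weight equals \(n\alpha -1\), in agreement with (2), and that the tree has exactly \(n+1\) leaves.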

In [21], Marchal computed exactly the distribution of the tree \( \mathbf T _{\alpha }(n)\). More precisely, he proved by induction on \(n\) that if \( \varvec{t}_{0}\) is a fixed labeled tree with \(n+1\) leaves, then

$$\begin{aligned}&\mathbb{P } ( \mathbf T _{\alpha }(n) = \varvec{t}_{0} ) = \frac{ \prod _{v \in \varvec{t}_{0}} p_{ \text{ deg}(v)}}{ \prod _{i=1}^{n-1} (i\alpha -1)},\\&\text{ with} \quad p_{1}=1, \quad p_{2}=0 \quad \text{ and} \quad \quad p_{k} \!=\! |(\alpha -1) (\alpha -2) \ldots (\alpha -k+2)| \quad \text{ for}\,k \ge 3. \end{aligned}$$

From this formula, it is easy to deduce (see [21]) that \(\mathbf T _{\alpha }(n)\) has the same distribution as the labeled tree obtained from the finite \((n+1)\)-dimensional marginal of the standard \(\alpha \)-stable tree, see Theorem 3.3.3 in [12]. With this identification in hand, it is shown in [17, Corollary 24] that there exists a stable tree \(\mathcal{T }_{\alpha }\), built on the same probability space as the one supporting \(\mathbf{T}_{\alpha }(n)\), \(n \ge 1\), and equipped with its uniform mass measure \(\mu _{\alpha }\), such that

$$\begin{aligned} \left(\frac{\mathbf{T}_{\alpha }(n)}{\alpha n^{\bar{\alpha }}}, \frac{\sum _{i=0}^n \delta _{A_i}}{n+1} \right) \xrightarrow [n\rightarrow \infty ]{(\mathbb{P })} \left( \mathcal{T }_{\alpha }, \mu _{\alpha }\right) \end{aligned}$$

for the Gromov–Hausdorff–Prokhorov topology. The goal of the next section is to improve this convergence into an almost-sure one.

2.4 The convergence theorem

Let \(\mathcal{T }_{\alpha }\) be an \(\alpha \)-stable tree and, conditionally on \( \mathcal{T }_{\alpha }\), let \((X_{i}, i \ge 0)\) be a sample of i.i.d. leaves distributed according to the mass measure \(\mu _{\alpha }\) on \( \mathcal{T }_{\alpha }\). We consider the finite-dimensional trees \(\text{ Span}( \mathcal{T }_\alpha ; X_0,X_1, \ldots , X_n)\) for \(n \ge 1\). These objects are by definition continuous trees but can equivalently be seen as discrete labeled trees

$$\begin{aligned} \varvec{\tau }_{\alpha }(n)= (\tau _\alpha (n) ; x_0, x_1, \ldots , x_n) \end{aligned}$$

carrying positive lengths \((\ell _{e})\) on their edges, see Fig. 4.

Theorem 5

 

  (i)

    We have the joint identity in distribution

    $$\begin{aligned} \left( \mathbf{T}_{\alpha }(n),n\ge 1\right)&\overset{{\mathrm{(d)}}}{=}&\left( \varvec{\tau }_\alpha (n),n\ge 1\right)\!. \end{aligned}$$
  (ii)

    For every \(k \ge 0\), we have in the sense of \(k\)-pointed Gromov–Hausdorff distance

    $$\begin{aligned} \left(\frac{\tau _{\alpha }(n)}{\alpha n^{ \bar{ \alpha }}}; x_{0}, x_{1}, \ldots , x_{k}\right) \xrightarrow [n\rightarrow \infty ]{\text{ a.s.}} (\mathcal{T }_{\alpha } ; X_{0}, X_{1}, \ldots , X_{k}). \end{aligned}$$
    (3)
  (iii)

    If \(\mu _{n} = \frac{1}{n+1} \sum _{i=0}^{n} \delta _{x_{i}}\) denotes the uniform mass measure on the leaves of \( \tau _{\alpha }(n)\) then we have the following almost sure convergence for the Gromov–Hausdorff–Prokhorov topology

    $$\begin{aligned} \left(\frac{ \tau _{\alpha }(n)}{\alpha n^{{\bar{\alpha }}}}, \mu _{n}\right) \xrightarrow [n\rightarrow \infty ]{\text{ a.s.}} ( \mathcal{T }_{\alpha }, \mu _{\alpha }). \end{aligned}$$

Consequently, \(\mathbf{T}_{\alpha }(n)/ \alpha n^{\bar{\alpha }}\) endowed with the uniform mass measure on its leaves indeed converges almost surely, for the Gromov–Hausdorff–Prokhorov topology, towards an \(\alpha \)-stable tree endowed with its uniform mass measure.

Proof

To simplify the notation during the proof, we fix \(\alpha \in (1,2]\) once and for all and drop the index \(\alpha \), writing \( \varvec{\tau }_{n} = (\tau _{n} ; x_{0}, \ldots , x_{n})\) instead of \( \varvec{\tau }_{\alpha }(n) = (\tau _{\alpha }(n) ; x_{0}, \ldots , x_{n})\). To get (i), we use the fact discussed at the end of the previous section, and proved by Marchal in [21], that \( \mathbf T _{\alpha }(n) = \varvec{\tau }_n\) in distribution as labeled trees, for all \(n\ge 1\). Since for \(k \le n\) the labeled trees \( \varvec{\tau }_k\) and \( \mathbf T _{\alpha }(k)\) are deduced from \( \varvec{\tau }_n\), resp. \( \mathbf T _{\alpha }(n)\), by the same deterministic procedure, we deduce that \(( \mathbf T _{\alpha }(k))_{1 \le k \le n}\) has the same law as \( ( \varvec{\tau }_k)_{1 \le k \le n}\). Since this holds for all \(n \ge 1\), we indeed get (i).

Let us turn to the second point and remark first that for every \(k \ge 1\) we have the almost sure convergence in the sense of the \(k\)-pointed Gromov–Hausdorff metric

$$\begin{aligned} \Big ( \text{ Span}( \mathcal{T }_\alpha ; (X_i)_{0\le i \le n}) ; X_{0}, X_{1}, \ldots , X_{k}\Big ) \xrightarrow [n\rightarrow \infty ]{\text{ a.s.}} \Big ( \mathcal{T }_{\alpha } ; X_{0}, X_{1}, \ldots , X_{k}\Big ). \end{aligned}$$
(4)

This is easy to obtain and follows from the fact that the sequence of leaves \((X_i)_{i \ge 0}\) is almost surely dense in the compact tree \( \mathcal{T }_\alpha \). The convergence (3) will be established by comparing the labeled tree \(\varvec{\tau }_n\) to its continuous analogue \(\text{ Span}(\mathcal{T }_\alpha ; (X_i)_{0\le i \le n})\) obtained by dressing it up with edge lengths. More precisely, we recall that the continuous tree \(\text{ Span}( \mathcal{T }_\alpha ; (X_i)_{0 \le i \le n})\) can be seen as the discrete labeled tree \(\varvec{\tau }_n\) where each edge \(e\) has been replaced by a Euclidean segment of length \(\ell ^{(n)}_e\), see Fig. 4 below.

Fig. 4
figure 4

The trees \( \mathcal{T }_\alpha \), \( \text{ Span}(\mathcal{T }_\alpha ; X_0,\ldots , X_5)\) and \(\tau _{\alpha }(5)\) with its edge-lengths

For notational convenience we write

  • \(\text{ d}^{(n)}_\mathrm{gr}\) for the graph metric on \(\tau _n\) (all edges having length \(1\)),

  • \(\text{ d}_\ell ^{(n)}\) for the metric on the vertices of \(\tau _n\) associated to the edge-lengths \(( \ell ^{(n)}_e)\).

In particular the two metrics live on the set of vertices of \(\tau _n\). Also, for \(0 \le i,j \le n\) the \( \text{ d}_{\ell }^{(n)}\) distance between \(x_{i}\) and \(x_{j}\) in \(\tau _{n}\) is equal to the distance between \(X_{i}\) and \(X_{j}\) in \( \mathcal{T }_{\alpha }\). Our goal is to show that

$$\begin{aligned} \sup _{a,b \in {\tau }_n} \left|\frac{\text{ d}^{(n)}_\mathrm{{gr}}(a,b)}{ n^{{\bar{\alpha }}}}- \alpha \text{ d}_\ell ^{(n)}(a,b)\right| \xrightarrow [n\rightarrow \infty ]{\text{ a.s.}} 0. \end{aligned}$$
(5)

Let us first show how to use the last display to complete the proof of the theorem. To that end we use the definition of the Gromov–Hausdorff distance via correspondences. Let \(\delta \) denote the metric on \(\mathcal{T }_{\alpha }\) and define a correspondence \(\mathcal{R }_{n}\) between the trees \((\tau _{n}, n^{- \bar{\alpha }} \text{ d}_\mathrm{gr}^{(n)})\) and \( (\text{ Span}( \mathcal{T }_{\alpha } ; X_{0}, \ldots , X_{n}),\alpha \delta )\) by declaring that \(a \in \tau _{n}\) and \(x \in \text{ Span}( \mathcal{T }_{\alpha }; X_{0}, \ldots , X_{n})\) are in correspondence if \(x\) belongs to an “edge” of the continuous tree that is combinatorially adjacent to the associated vertex \(a\) in \({\tau }_n\). Note that \(x_{i}\) is in correspondence with \(X_{i}\) for all \(0 \le i \le n\). The distortion of this correspondence is bounded by

$$\begin{aligned} \text{ dist}(\mathcal{R }_{n}) \le \sup _{a,b \in {\tau }_n} \left|\frac{ \text{ d}^{(n)}_\mathrm{gr}(a,b)}{n^{{\bar{\alpha }}}} -\alpha \text{ d}_\ell ^{(n)}(a,b)\right| + 2 \alpha \sup \ell _{e}^{(n)}. \end{aligned}$$

By (4) and the fact that branch-points are dense in \( \mathcal{T }_{\alpha }\) we deduce that \( \sup _{e}\ell _{e}^{(n)} \rightarrow 0\) as \(n \rightarrow \infty \). This fact and (5) thus show that the distortion of \( \mathcal{R }_{n}\) vanishes as \(n \rightarrow \infty \). We finally use (4) to get (3).
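The correspondence/distortion machinery used here is elementary to implement for finite metric spaces. The following sketch (function and variable names are ours, not from the paper) computes the distortion of a correspondence between two distance matrices; by the standard bound recalled in [10], the Gromov–Hausdorff distance is at most half the distortion.

```python
from itertools import product

def distortion(R, d1, d2):
    """Distortion of a correspondence R (a list of index pairs) between
    two finite metric spaces given by their distance matrices d1, d2."""
    return max(abs(d1[a][b] - d2[x][y])
               for (a, x), (b, y) in product(R, repeat=2))

# Toy example: a path 0-1-2 with unit edge lengths versus the same path
# with edges rescaled by 1.1, matched through the identity correspondence.
d1 = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
d2 = [[0, 1.1, 2.2], [1.1, 0, 1.1], [2.2, 1.1, 0]]
R = [(0, 0), (1, 1), (2, 2)]
# The Gromov-Hausdorff distance is at most distortion/2 = 0.1 here.
```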

It thus remains to prove (5). Fix \(\varepsilon >0\). For any \(n \ge 0\), we denote by \( \mathcal L _n = \sum \ell ^{(n)}_e\) the total length of the tree \( \text{ Span}( \mathcal{T }_\alpha ; X_0, \ldots , X_n)\) and by \(| {{\tau }} _n|\) the number of edges of the discrete tree \( { \tau }_n\). In order to simplify notation we set

$$\begin{aligned} \chi _{n} = \frac{| {\tau }_n|}{\mathcal{L }_{n}}. \end{aligned}$$

It follows from [12, Theorem 3.3.3] that conditionally on \(\varvec{\tau }_n\) and on \( \mathcal L _n\), the edge lengths \(\ell ^{(n)}_e\) are distributed uniformly on \( \mathbb R _+^{| {\tau }_n|}\) subject to the condition \( \sum \ell ^{(n)}_e = \mathcal L _n\). In particular, conditionally on \(\varvec{\tau }_n\) and on \( \mathcal L _n\), if \(a\) and \(b\) are two vertices of \(\varvec{\tau }_n\), then the distribution of the random variable \(\text{ d}^{(n)}_\ell (a,b)\) is obtained by first dividing the interval \([0, \mathcal L _n]\) uniformly into \(|{\tau }_n|\) pieces (by throwing \(|{\tau }_n|-1\) i.i.d. uniform variables over \([0, \mathcal L _{n}]\)) and then summing the lengths of the first \(\text{ d}^{(n)}_\mathrm{gr}(a,b)\) of them. We can then apply the following lemma to deduce

$$\begin{aligned}&\displaystyle \mathbb{P } \left(\! \left.\left|\text{ d}^{(n)}_\ell (a,b) \!-\! \frac{\text{ d}^{(n)}_{ \text{ gr}}(a,b) }{\chi _{n}}\right| \!\ge \! \varepsilon \text{ d}^{(n)}_{ \ell }(a,b) \ \right| \varvec{\tau }_n, \mathcal L _n \!\right) \le c_{ \varepsilon }^{-1}\exp \left({-c_\varepsilon \text{ d}^{(n)}_{ \text{ gr}}(a,b)}\right)\!,\qquad&\end{aligned}$$
(6)
$$\begin{aligned}&\displaystyle \mathbb{P } \left( \text{ d}_{\ell }^{(n)}(a,b) \ge \varepsilon \mid \varvec{\tau }_{n}, \mathcal L _{n} \right) \le c_{ \varepsilon }^{-1}\exp \left(- c_{ \varepsilon } \frac{\chi _{n}}{\text{ d}^{(n)}_{ \text{ gr}}(a,b)} \right).\qquad&\end{aligned}$$
(7)

Lemma 6

Let \(\varepsilon >0\). There exists a constant \(c_{\varepsilon }>0\) such that for all \(n \ge 1\) and for all \(L >0\), if \(0=U_{0} < U_1 < \cdots < U_{n-1} < U_{n}=L\) is a uniform splitting of the interval \([0,L]\) then for any \(0 \le k \le n\) we have

$$\begin{aligned} \mathbb{P }\left( \left|U_k-\frac{kL}{n}\right| \ge \varepsilon U_k \right)&\le c_{ \varepsilon }^{-1}\exp (-c_\varepsilon k), \\ \mathbb{P } \left( U_{k} \ge \varepsilon \right)&\le c_{ \varepsilon }^{-1}\exp \left( -c_{\varepsilon } \frac{n}{L k} \right)\!. \end{aligned}$$

Proof

By standard properties of the uniform splitting of the unit interval we have \(L^{-1}U_k = T_k/T_n\) in distribution where the \(T_i\)’s are the arrival times of a standard Poisson process. In particular for all \( \varepsilon >0\), there exists \(c>0\) such that

$$\begin{aligned} \mathbb{P }(|T_{i}-i| \ge \varepsilon i) \le \exp (- ci) \end{aligned}$$

for all \(i \ge 0\) which easily implies the first inequality of the lemma. For the second inequality, one has, for all \(k\ge 1\),

$$\begin{aligned} \mathbb{P }( U_{k} \ge \varepsilon )&= \mathbb{P }\left( T_{k} \ge \frac{ \varepsilon T_{n}}{L}\right) \\&\le \mathbb{P }\left(T_{k} \ge \frac{ \varepsilon (1- \varepsilon ) n}{L}\right) + \mathbb{P }\left( T_{n} <(1- \varepsilon )n\right) \\&\le \mathbb{P }\left( \mathcal P ( \varepsilon (1- \varepsilon )nL^{-1}) \le k\right) + \exp ( -cn), \end{aligned}$$

where \( \mathcal P (a)\) denotes a Poisson random variable of parameter \(a >0\). Next, since the function \(a \in (0,\infty ) \mapsto \mathcal{P }(a)\) is stochastically increasing, and since, for \(q \in \mathbb{N }\), \(\mathcal{P }(qk)\) is distributed as the sum of \(q\) independent Poisson random variables of parameter \(k\), one has,

$$\begin{aligned} \mathbb{P }\left( \mathcal P ( \varepsilon (1- \varepsilon ) nL^{-1}) \le k\right)&\le \mathbb{P }\left( \mathcal{P }\left(\left\lfloor \varepsilon (1- \varepsilon ) n L^{-1}k^{-1} \right\rfloor k\right) \le k \right) \\&\le \mathbb{P }\left( \mathcal P (k) \le k\right)^{\lfloor \frac{\varepsilon (1- \varepsilon ) n}{Lk}\rfloor }. \end{aligned}$$

Putting the pieces together and using \(\mathbb{P }( \mathcal P (k) \le k) \rightarrow 1/2\) as \(k \rightarrow \infty \) we get the second inequality. \(\square \)
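The representation \(L^{-1}U_k = T_k/T_n\) makes the concentration \(U_k \approx kL/n\) asserted in Lemma 6 easy to observe numerically. A minimal sketch (the helper `uniform_splitting` is ours, not from the paper):

```python
import random

def uniform_splitting(n, L, rng):
    """Return 0 = U_0 < U_1 < ... < U_n = L, a uniform splitting of
    [0, L] obtained by sorting n - 1 i.i.d. uniform points."""
    cuts = sorted(rng.uniform(0, L) for _ in range(n - 1))
    return [0.0] + cuts + [L]

rng = random.Random(0)
n, L, k = 10_000, 1.0, 6_000
U = uniform_splitting(n, L, rng)
# Lemma 6: U_k concentrates around k*L/n (= 0.6 here), with
# fluctuations of order sqrt(k(n-k)/n^3), about 5e-3 in this setting.
deviation = abs(U[k] - k * L / n)
```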

Lemma 7

We have the almost sure convergence

$$\begin{aligned} \frac{\chi _{n}}{n^{{\bar{\alpha }}}} \xrightarrow [n\rightarrow \infty ]{\text{ a.s.}} \alpha . \end{aligned}$$

Proof

It is possible to prove the lemma by analysing separately the processes \(( | {\tau }_n|)_{n\ge 1}\) and \(( \mathcal L _n)_{n\ge 1}\) and showing that \(| {\tau }_n| \sim \alpha n\) and \(\mathcal L _n \sim n^{1/\alpha }\) a.s. as \(n \rightarrow \infty \). However, we bypass these calculations by using a result of [17, Sect. 5.4]: we have

$$\begin{aligned} \frac{\text{ d}^{(n)}_\mathrm{gr}(x_0,x_1)}{n^{{\bar{\alpha }}}} \xrightarrow [n\rightarrow \infty ]{\text{ a.s.}} \alpha \delta (X_{0},X_{1}). \end{aligned}$$
(8)

Recall that for any \(i , j \ge 0\) and any \(n \ge \text{ max}(i,j)\) we have \( \text{ d}^{(n)}_\ell (x_i,x_j) = \delta (X_{i},X_{j})\). Let \( \varepsilon >0\) and set

$$\begin{aligned} A_{n} = \left\{ \left|\delta (X_{0},X_{1})- \frac{\text{ d}^{(n)}_{ \mathrm{gr}}(x_0,x_1)}{\chi _{n}} \right| \ge \varepsilon \delta (X_{0},X_{1}) \right\} \!. \end{aligned}$$

Our goal is to show that the probability that \(A_{n}\) is realized for infinitely many \(n\)’s is 0, thus proving that \(\text{ d}^{(n)}_{ \text{ gr}}(x_0,x_1)/\chi _{n}\) almost surely converges towards \( \delta (X_{0},X_{1})\). Identifying the limit with (8) then proves the statement of the lemma. To do this, let \(\beta \in (0,\bar{\alpha })\) and consider the events \(B_{n} = \{ \text{ d}^{(n)}_{ \text{ gr}}(x_0,x_1) \ge n^{\beta }\}\). By (8) (and the fact that \(\delta (X_0,X_1)>0\) a.s.) we get that \(\mathbb{P }(\cup _{k \ge 1} \cap _{n \ge k} B_n)=1\). Hence,

$$\begin{aligned} \mathbb{P } ( A_{n} \ i.o. ) = \mathbb{P }( A_{n} \cap B_{n} \ i.o.), \end{aligned}$$

where \(i.o.\) means “is realized for infinitely many \(n\)’s”. Using (6), we get that \( \mathbb{P }(A_{n}\cap B_{n})\) is less than \( {c}_{\varepsilon }^{-1} \exp ( - {c_{ \varepsilon }}n^{ \beta })\) hence by Borel–Cantelli \( \mathbb{P }( A_{n} \cap B_{n} \ i.o.) =0\). This completes the proof of the lemma. \(\square \)

Coming back to the proof of Theorem 5(ii), for any \( n \ge 1\), conditionally on \( \varvec{\tau }_{n}\) and on \( \mathcal L _{n}\), let us evaluate the probability that two vertices \(a,b \in \tau _{n}\) are such that

$$\begin{aligned} \left|\text{ d}^{(n)}_\ell (a,b) - \frac{\text{ d}^{(n)}_{ \mathrm{gr}}(a,b) }{\chi _{n}}\right| \ge \varepsilon ( \mathsf{Diam} \vee 1) + \sqrt{\chi _{n}^{-1}}, \end{aligned}$$
(*)

where \(\mathsf{Diam}\) is the diameter of \( \mathcal{T }_{\alpha }\), that is the maximal distance between any pair of points in \(\mathcal{T }_{\alpha }\). We remark that \(\mathsf{Diam}\) bounds from above all the quantities \(\text{ d}_{\ell }^{(n)}(a,b)\). We then split the cases according to the graph distance between \(a\) and \(b\):

  • if \( \text{ d}_\mathrm{gr}^{(n)}(a,b) \ge \sqrt{\chi _{n}}\) then the conditional probability of (*) is bounded by \( c_{ \varepsilon }^{-1}\exp (- c_{ \varepsilon } \sqrt{\chi _{n}})\) by (6),

  • if \( \text{ d}_\mathrm{gr}^{(n)}(a,b) < \sqrt{\chi _{n}}\) then the conditional probability of (*) is bounded by \( c_{ \varepsilon }^{-1}\exp (- c_{ \varepsilon } \sqrt{\chi _{n}})\) by (7).

We then use a conditional version of the Borel–Cantelli lemma exactly as in the proof of Lemma 7. Specifically, let \(C_{n} = \{ \chi _{n} \ge n^{\bar{\alpha }}\}\) so that Lemma 7 implies \( \mathbb{P }( \cup _{k\ge 1} \cap _{n \ge k} C_{n})=1\). Let also \(D_{n} = \{ (*) \, \text{ holds} \text{ for} \text{ some} \, a,b \in \tau _{n}\}\). We thus have \( \mathbb{P }( D_{n} \ i.o.) = \mathbb{P }( C_{n} \cap D_{n} \ i.o.)\). Since there are deterministically fewer than \(2n\) vertices in \( {\tau }_{n}\), the preceding discussion shows that

$$\begin{aligned} \mathbb{P }( C_{n} \cap D_{n}) \!\le \! \mathbb{P }(D_{n} \mid C_{n}) \!\le \!\mathbb E \left[ c_{ \varepsilon }^{-1}4n^2 \exp (-c_{ \varepsilon } \sqrt{\chi _{n}}) \mid C_{n}\right] \!\le \!c_{ \varepsilon }^{-1}4n^2 \exp (-c_{ \varepsilon } n^{\bar{\alpha } /2}). \end{aligned}$$

Hence by the standard Borel–Cantelli Lemma we have \(\mathbb{P }(C_{n} \cap D_{n} \ i.o.)=0\) and thus \(\mathbb{P }(D_{n} \ i.o.)=0\) as well. Consequently,

$$\begin{aligned} \sup _{a,b \in {\tau }_n} \left|\frac{ \text{ d}^{(n)}_{ \text{ gr}}(a,b)}{\chi _{n}}- \text{ d}_\ell ^{(n)}(a,b)\right| \xrightarrow [n\rightarrow \infty ]{\text{ a.s.}} 0. \end{aligned}$$

Finally we use Lemma 7 again to replace \(\chi _{n}\) by \(\alpha n^{{\bar{\alpha }}}\) in the last display and get (5).

The last point of the theorem is obtained in the same spirit. Since, conditionally on \( \mathcal{T }_{\alpha }\), the \(X_{i}\)’s are i.i.d. according to \(\mu _{\alpha }\), the measure \(\mu _{\alpha }^{(n)}=\frac{1}{n+1} \sum _{i=0}^{n} \delta _{X_{i}}\) is the empirical measure associated with \(\mu _{\alpha }\) and therefore

$$\begin{aligned} \left( \mathcal{T }_{\alpha }, \mu _{\alpha }^{(n)} \right) \xrightarrow [n\rightarrow \infty ]{\text{ a.s.}} ( \mathcal{T }_{\alpha }, \mu _{\alpha }), \end{aligned}$$

in the Gromov–Hausdorff–Prokhorov sense. Furthermore, since \( \text{ Span}( \mathcal{T }_{\alpha } ; X_{0}, \ldots , X_{n})\) is a subtree of \( \mathcal{T }_{\alpha }\) that converges in the Gromov–Hausdorff sense towards \( \mathcal{T }_{\alpha }\), the last display holds with \(( \mathcal{T }_{\alpha }, \mu _{\alpha }^{(n)} )\) replaced by \(( \text{ Span}( \mathcal{T }_{\alpha } ; X_{0}, \ldots , X_{n}), \mu _{\alpha }^{(n)})\). Recall now that \(\text{ Span}( \mathcal{T }_{\alpha } ; X_{0}, \ldots , X_{n})\) can be seen as the tree \( \varvec{\tau }_{n}\) where each edge has been replaced by a Euclidean segment of length \(\ell _{e}^{(n)}\). This identification transports the measure \(\mu _{\alpha }^{(n)}\) onto the discrete atomic measure \(\mu _{n}\). Thus it suffices to prove that

$$\begin{aligned} \mathrm{d}_\mathrm{GHP} \left( \left( \tau _{n}, \frac{ \text{ d}_\mathrm{gr}^{(n)}}{\alpha n^{ \bar{ \alpha }}}, \mu _{n}\right),\left( \tau _{n}, \text{ d}_{\ell }^{(n)}, \mu _{n}\right)\right) \xrightarrow [n\rightarrow \infty ]{} 0. \end{aligned}$$

But this again follows from (5). Indeed, the obvious correspondence \(\mathcal{R } = \{ (x,x) : x \in \tau _n\}\) between \(\Big (\tau _{n}, \frac{ \text{ d}_\mathrm{gr}^{(n)}}{\alpha n^{ \bar{\alpha }}}\Big )\) and \(\Big ( \tau _{n}, \text{ d}_{\ell }^{(n)}\Big )\) has a vanishing distortion as \(n \rightarrow \infty \) by (5). This means (see [10]) that for any \( \varepsilon >0\) and for all \(n\) large enough we can isometrically embed these two trees into a common metric space \((F_n, \delta _n)\) such that two elements in correspondence (that is the same vertex of \(\tau _n\) in the two embeddings of the trees) are at \(\delta _n\)-distance less than \( \varepsilon \) from each other. The image measures \(\nu _n\) and \(\nu _n^{\prime }\) of \(\mu _n\) by these embeddings then obviously satisfy \( \nu _n( C) \le \nu _n^{\prime }(C^\varepsilon )\) and \( \nu ^{\prime }_n( C) \le \nu _n(C^\varepsilon )\) for all Borel \(C \subset F_n\). This suffices to prove the claim and to finish the proof of the theorem. \(\square \)

3 The discrete approach

This section is mainly devoted to the construction of the \(\alpha ^{\prime }\)-stable subtrees \(\mathfrak{T }_{\alpha ,\alpha ^{\prime }}\) of \(\mathcal{T }_{\alpha }\), for \(\alpha ^{\prime } > \alpha \), as stated in Proposition 2, and then to the proof of Theorem 1. In order to establish the key couplings of Marchal’s constructions that will be used for the proofs, we introduce once and for all a sequence \(U_{i}, i \ge 2 \) of i.i.d. random variables uniformly distributed on \((0,1)\) that are independent of all other variables introduced so far and later on.

3.1 Coupling Marchal’s constructions for different parameters

Fix \(\alpha \in (1,2)\). We will couple Marchal’s constructions for different values of \(\alpha ^{\prime } \in (\alpha ,2]\) in such a way that the \(\alpha ^{\prime }\)-constructions are nested “sub-constructions” of the \(\alpha \)-one. To this end, fix \(\alpha ^{\prime } \in (\alpha ,2]\) and recall the definition of the probabilities \(p_{\alpha ,\alpha ^{\prime },d,d^{\prime }} \) in (1), as well as the definition of Marchal’s Markov chain \(( \mathbf T _{\alpha }(n),n \ge 1)\) in Sect. 2.3. The idea is to enrich the construction of this Markov chain with two colors, blue and red, in such a way that the only edge and the two leaves of \( \mathbf T _{\alpha }(1)\) are blue and the colors evolve recursively according to the following rules, see Fig. 5. At step \(i \ge 2\):

  • if an edge of color \( \mathsf{C} \in \{\)red,blue\(\}\) is selected, then we split this edge into two \( \mathsf{C}\) edges divided by a \( \mathsf{C}\) vertex and attach a \( \mathsf{C}\) edge-leaf to it;

  • if a vertex is selected, then the new edge-leaf attached on it is blue if

    $$\begin{aligned} U_{i} \le p_{\alpha ,\alpha ^{\prime },d_{i},d^{\prime }_{i}} \end{aligned}$$

    and red otherwise, where \(d_{i}\) is the degree of the vertex in the full tree \(\mathbf{T}_{\alpha }(i-1)\) and \(d^{\prime }_{i}\) is its degree in the blue subtree of \(\mathbf{T}_{\alpha }(i-1)\), with the convention that \(d^{\prime }_{i}=0\) if this vertex is red.

Fig. 5
figure 5

Illustration of the construction of the blue tree (color figure online)

We denote by \( \mathbf T _{\alpha ,\alpha ^{\prime }}(n)\) the subtree of \( \mathbf T _{\alpha }(n)\) spanned by the blue edges and leaves. Remark that when \(\alpha ^{\prime }=2\), the coloring rules are particularly simple (and deterministic): each edge-leaf attached on a red edge or on a vertex (blue or red) is red, whereas an edge-leaf attached on a blue edge is blue. In this case the blue subtree is therefore binary. Remark also that the function \( \alpha ^{\prime } \in (\alpha ,2] \mapsto p_{\alpha ,\alpha ^{\prime },d,d^{\prime }}\) is decreasing, hence for every \(n \ge 1\) the function \(\alpha ^{\prime } \in (\alpha ,2] \mapsto \mathbf{T}_{\alpha ,\alpha ^{\prime }}(n)\) is decreasing for inclusion inside \( \mathbf{T}_{\alpha }(n)\).
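For concreteness, the uncolored dynamics underlying this construction (Marchal's algorithm as recalled in Sect. 2.3: select an edge with weight \(\alpha -1\) or a vertex of degree \(d \ge 3\) with weight \(d-1-\alpha \), then graft a new leaf-edge there) can be sketched as follows; the data structures and function name are ours, not from the paper. The check at each step is the identity, from (2), that the total weight of \(\mathbf{T}_{\alpha }(n)\) equals \(n\alpha -1\).

```python
import random

def marchal_step(edges, degree, alpha, rng):
    """One step of Marchal's algorithm: choose an edge with weight
    alpha - 1 or a vertex of degree d >= 3 with weight d - 1 - alpha,
    then graft a new leaf-edge there.  `edges` is a list of vertex-id
    pairs and `degree` maps vertex id -> degree."""
    items, weights = [], []
    for e in edges:
        items.append(('edge', e))
        weights.append(alpha - 1)
    for v, d in degree.items():
        if d >= 3:
            items.append(('vertex', v))
            weights.append(d - 1 - alpha)
    kind, obj = rng.choices(items, weights=weights)[0]
    new_leaf = max(degree) + 1
    if kind == 'edge':                    # split the edge, attach the leaf
        mid = new_leaf + 1
        edges.remove(obj)
        edges += [(obj[0], mid), (mid, obj[1]), (mid, new_leaf)]
        degree[mid] = 3
    else:                                 # graft the leaf on the vertex
        edges.append((obj, new_leaf))
        degree[obj] += 1
    degree[new_leaf] = 1

rng = random.Random(1)
alpha = 1.5
edges, degree = [(0, 1)], {0: 1, 1: 1}    # T_alpha(1): one edge, two leaves
for n in range(1, 50):
    # invariant from (2): total weight of T_alpha(n) is n*alpha - 1
    total = (alpha - 1) * len(edges) \
        + sum(d - 1 - alpha for d in degree.values() if d >= 3)
    assert abs(total - (n * alpha - 1)) < 1e-9
    marchal_step(edges, degree, alpha, rng)
```

Each step increases the total weight by exactly \(\alpha \) (an edge split contributes \(2(\alpha -1) + (2-\alpha )\), a vertex graft \((\alpha -1)+1\)), which is why the invariant holds.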

The labeling of \( \mathbf T _{\alpha ,\alpha ^{\prime }}(n)\) is given by sorting the leaves of \( \mathbf T _{\alpha ,\alpha ^{\prime }}(n)\) according to their labels in \( \mathbf T _{\alpha }(n)\). Also, we let

$$\begin{aligned} L_{\alpha ,\alpha ^{\prime }}(n)= \left(\text{ number} \text{ of} \text{ leaves} \text{ of}\ \mathbf T _{\alpha ,\alpha ^{\prime }}(n)\right) - 1 \end{aligned}$$

and \(L_{\alpha ,\alpha ^{\prime }}^{-1}(n) = \inf \{ k \ge 1 : L_{\alpha ,\alpha ^{\prime }}(k) =n\}\) its inverse. Here is the main observation:

Lemma 8

The subchain \( \mathbf{T}_{\alpha ,\alpha ^{\prime }}(\cdot )\) evolves as a time-changed Marchal construction with parameter \(\alpha ^{\prime }\). More precisely, we have the following identity in distribution

$$\begin{aligned} \left( \mathbf T _{\alpha ,\alpha ^{\prime }}\left(L_{\alpha ,\alpha ^{\prime }}^{-1}(n)\right)\right)_{n \ge 1} \overset{\mathrm{(d)}}{=} \left(\mathbf T _{\alpha ^{\prime }}(n)\right)_{n \ge 1}\!. \end{aligned}$$

Furthermore, \((\mathbf T _{\alpha ,\alpha ^{\prime }}(L_{\alpha ,\alpha ^{\prime }}^{-1}(n)))_{n \ge 1}\) is independent of \((L_{\alpha ,\alpha ^{\prime }}(n))_{n\ge 1}\).

Proof

We examine the transition probabilities of the chain \(( \mathbf T _{\alpha }(n), \mathbf T _{\alpha ,\alpha ^{\prime }}(n))_{n \ge 1}\). Recall that in Marchal’s construction, a weight \(\alpha -1\) is assigned to each edge and a weight \(d-1-\alpha \) to each vertex of degree \(d \ge 3\). Without changing the dynamics, we multiply all these weights by \((\alpha ^{\prime }-1)/(\alpha -1)\), so that each edge now has weight \(\alpha ^{\prime }-1\) and each vertex of degree \(d \ge 3\) has weight \((d-1-\alpha )(\alpha ^{\prime }-1)/(\alpha -1)\). By (2), the total weight of the tree \( \mathbf T _{\alpha }(n)\) is now

$$\begin{aligned} \overline{W}( \mathbf T _{\alpha }(n)) = \frac{\alpha ^{\prime }-1}{\alpha -1} (n \alpha -1). \end{aligned}$$

Hence, the new edge-leaf added to \( \mathbf T _{\alpha }(n)\) to get \( \mathbf T _{\alpha }(n+1)\) is blue if it has been

  • grafted on an edge of \( \mathbf T _{\alpha ,\alpha ^{\prime }}(n)\), which occurs with probability \((\alpha ^{\prime }-1)/ \overline{W}( \mathbf T _{\alpha }(n))\) for each edge of \( \mathbf T _{\alpha ,\alpha ^{\prime }}(n)\);

  • or grafted on a vertex of \( \mathbf T _{\alpha ,\alpha ^{\prime }}(n)\) and then colored in blue, which occurs with probability

    $$\begin{aligned} \frac{(d-1-\alpha )(\alpha ^{\prime }-1)(\alpha -1)^{-1} \times p_{\alpha ,\alpha ^{\prime },d,d^{\prime }}}{\overline{W}( \mathbf T _{\alpha }(n)) }=\frac{d^{\prime }-1-\alpha ^{\prime }}{\overline{W}( \mathbf T _{\alpha }(n))} \end{aligned}$$

    where \(d^{\prime }\) is the degree of the vertex in \( \mathbf T _{\alpha ,\alpha ^{\prime }}(n)\) and \(d\) its full degree in \( \mathbf T _{\alpha }(n)\).

Observe that these probabilities are proportional to the weights assigned in Marchal’s construction with parameter \(\alpha ^{\prime }\). Consequently their sum, which is the total probability that the added edge-leaf is blue, is equal to

$$\begin{aligned} \frac{L_{\alpha ,\alpha ^{\prime }}(n)\alpha ^{\prime }-1}{\overline{W}( \mathbf T _{\alpha }(n))}, \end{aligned}$$

where we recall that \(L_{\alpha ,\alpha ^{\prime }}(n)+1\) is the number of leaves of \(\mathbf T _{\alpha ,\alpha ^{\prime }}(n).\) Moreover, conditionally on the fact that the added edge-leaf is blue, it is grafted on an edge or vertex of \( \mathbf T _{\alpha ,\alpha ^{\prime }}(n)\) with the dynamics of Marchal’s algorithm with parameter \(\alpha ^{\prime }\). In other words, the transition probabilities of the chain \(( \mathbf T _{\alpha }(n), \mathbf T _{\alpha ,\alpha ^{\prime }}(n))_{n \ge 1}\) can be described as follows. First, \((L_{\alpha ,\alpha ^{\prime }}(n))_{n\ge 1}\) is a Markov chain with transition probabilities

$$\begin{aligned} \mathbb{P }\left( L_{\alpha ,\alpha ^{\prime }}(n+1)= L_{\alpha ,\alpha ^{\prime }}(n)+1 \ | \ \mathcal{F }_n \right)&= 1-\mathbb{P }\left(L_{\alpha ,\alpha ^{\prime }}(n+1)=L_{\alpha ,\alpha ^{\prime }}(n) \ | \ \mathcal{F }_n \right) \nonumber \\&= \frac{L_{\alpha ,\alpha ^{\prime }}(n)\alpha ^{\prime }-1}{\overline{W}( \mathbf T _{\alpha }(n))}, \end{aligned}$$
(9)

where \(\mathcal{F }_n\) denotes the sigma-field generated by \((\mathbf T _{\alpha }(k), \mathbf T _{\alpha ,\alpha ^{\prime }}(k), k \le n)\). Second, conditionally on \(L_{\alpha ,\alpha ^{\prime }}(n+1)=L_{\alpha ,\alpha ^{\prime }}(n)+1\), the transition probabilities on \( \mathbf T _{\alpha ,\alpha ^{\prime }}(n)\) are those of Marchal’s construction with parameter \(\alpha ^{\prime }\), whereas conditionally on \(L_{\alpha ,\alpha ^{\prime }}(n+1)= L_{\alpha ,\alpha ^{\prime }}(n)\), the transition on the red component \(\mathbf{T}_{\alpha }(n) \backslash \mathbf T _{\alpha ,\alpha ^{\prime }}(n)\) follows the appropriate conditional distribution. The statements of the lemma follow from these considerations. \(\square \)

The chain \(L_{ \alpha , \alpha ^{\prime }}\) can easily be studied using (9). Let us introduce its limit distribution (after re-normalization). A generalized Mittag-Leffler random variable of parameters \((\beta ,\theta )\) with \(\beta \in (0,1)\) and \(\theta > -\beta \) is a random variable denoted by \( \text{ ML}_{\beta ,\theta }\) which is characterized by its positive moments

$$\begin{aligned} \mathbb{E }\left[\text{ ML}^p_{\beta ,\theta }\right]=\frac{\Gamma (\theta +1)\Gamma (\theta /\beta +p+1)}{\Gamma (\theta /\beta +1)\Gamma (\theta +p\beta +1)}, \quad p \ge 0. \end{aligned}$$
(10)

This random variable is actually a biased version of a power of a stable distribution of index \(\beta \), see [24, Sect. 3] for background. For later use, remark that if \( 0 < a <b <1\) and if \(\text{ ML}_{a,a}\), \( \text{ ML}_{b,b}\) and \( \text{ ML}_{a/b,a}\) are independent generalized Mittag-Leffler with parameters \((a,a)\), \((b,b)\) and \((a/b,a)\) respectively, then we have the following identity in distribution

$$\begin{aligned} \text{ ML}_{a,a} = \big (\text{ ML}_{a/b,a}\big )^b \cdot \text{ ML}_{b,b}, \end{aligned}$$
(11)

a relation easily checked using the moment formula (10).
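Concretely, (11) can be verified numerically from (10); the key observation is that \((\text{ML}_{a/b,a})^b\) has \(p\)-th moment equal to the \(bp\)-th moment of \(\text{ML}_{a/b,a}\). A sketch (the helper name is ours):

```python
from math import gamma, isclose

def ml_moment(beta, theta, p):
    """p-th moment of a generalized Mittag-Leffler ML_{beta,theta},
    from the moment formula (10)."""
    return (gamma(theta + 1) * gamma(theta / beta + p + 1)
            / (gamma(theta / beta + 1) * gamma(theta + p * beta + 1)))

a, b = 0.3, 0.7
for p in (0.5, 1.0, 2.0, 3.5):
    lhs = ml_moment(a, a, p)
    # p-th moment of (ML_{a/b,a})^b is the (b*p)-th moment of ML_{a/b,a};
    # the two factors in (11) are independent, so moments multiply
    rhs = ml_moment(a / b, a, b * p) * ml_moment(b, b, p)
    assert isclose(lhs, rhs, rel_tol=1e-9)
```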

Lemma 9

There exists a generalized Mittag-Leffler r.v. \(\text{ ML}_{\bar{\alpha }/\bar{\alpha }^{\prime },\bar{\alpha }}\) such that

$$\begin{aligned} \frac{L_{\alpha ,\alpha ^{\prime }}(n)}{n^{ \bar{\alpha }/\bar{\alpha }^{\prime }}} \xrightarrow [n\rightarrow \infty ]{\text{ a.s.}} \text{ ML}_{\bar{\alpha }/\bar{\alpha }^{\prime },\bar{\alpha }}. \end{aligned}$$

Proof

It is possible to analyse the chain \((L_{\alpha ,\alpha ^{\prime }}(n))_{n\ge 1}\) “by hand” using the transition probabilities (9). However we bypass any calculation by using a connection with Pitman’s Chinese restaurant process, see [24, Sect. 3]. Indeed, it is straightforward to check from (9) that the sequence \((L_{\alpha ,\alpha ^{\prime }}(n+1)-1,n\ge 1)\) is distributed as the sequence of the number of tables in a \((\bar{\alpha }/\bar{\alpha }^{\prime },\bar{\alpha })\)-Chinese restaurant process, see [24, Sect. 3.2.3]. The result then follows from [24, Theorem 3.8]. \(\square \)
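The \(n^{\beta }\) growth of the number of tables in a \((\beta ,\theta )\)-Chinese restaurant process is easy to observe by simulation. The sketch below (function name ours) uses the standard seating rule: after \(m\) customers occupying \(k\) tables, a new table opens with probability \((\theta +\beta k)/(m+\theta )\) and table \(i\) is joined with probability proportional to its occupancy minus \(\beta \). For \(\beta =\theta =1/2\), formula (10) gives \(\mathbb{E }[\text{ ML}_{1/2,1/2}]=\sqrt{\pi }\approx 1.77\).

```python
import random

def crp_num_tables(n, beta, theta, rng):
    """Number of occupied tables after n customers in a (beta, theta)
    Chinese restaurant process."""
    tables = []                       # tables[i] = occupancy of table i
    for m in range(n):                # m customers already seated
        k = len(tables)
        if m == 0 or rng.random() < (theta + beta * k) / (m + theta):
            tables.append(1)          # open a new table
        else:
            # join table i with probability (tables[i] - beta)/(m + theta)
            r = rng.uniform(0, m - beta * k)
            acc = 0.0
            for i, c in enumerate(tables):
                acc += c - beta
                if r <= acc:
                    tables[i] += 1
                    break
            else:
                tables[-1] += 1       # guard against float rounding

    return len(tables)

rng = random.Random(42)
beta = theta = 0.5
n, runs = 2000, 50
avg = sum(crp_num_tables(n, beta, theta, rng) for _ in range(runs)) / runs
ratio = avg / n ** beta   # close to E[ML_{1/2,1/2}] = sqrt(pi) ~ 1.77
```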

3.2 Proofs of Proposition 2 and Theorem 1

Let \(1<\alpha < \alpha ^{\prime } \le 2\) and consider an \(\alpha \)-stable tree \(\mathcal{T }_{\alpha }\) and its uniform mass measure \(\mu _{\alpha }\). By Theorem 5, there exists a version of Marchal’s Markov chain \((\mathbf{T}_{\alpha }(n), n \ge 1)\) such that

$$\begin{aligned} \left(\frac{\mathbf{T}_{\alpha }(n)}{\alpha n^{\bar{\alpha }}}, \frac{\sum _{i=0}^n \delta _{A_i}}{n+1} \right) \xrightarrow [n\rightarrow \infty ]{\text{ a.s.}} \left(\mathcal{T }_{\alpha }, \mu _{\alpha }\right)\!, \end{aligned}$$
(12)

in the Gromov–Hausdorff–Prokhorov sense. We will use the blue subchain \((\mathbf{T}_{\alpha ,\alpha ^{\prime }}(n), n \ge 1)\) constructed from \((\mathbf{T}_{\alpha }(n), n \ge 1)\) in the previous section to build the closed subtree \(\mathfrak{T }_{\alpha ,\alpha ^{\prime }}\) of Proposition 2.

Proof of Proposition 2

Recall Lemma 8 and note from Lemma 9 that \(L_{\alpha ,\alpha ^{\prime }}(n)\rightarrow \infty \) a.s. as \(n \rightarrow \infty \). Then apply Theorem 5 to obtain the almost sure convergence

$$\begin{aligned} \frac{\mathbf{T}_{\alpha ,\alpha ^{\prime }}(n)}{{L_{\alpha ,\alpha ^{\prime }}(n)}^{\bar{\alpha }^{\prime }}} \xrightarrow [n\rightarrow \infty ]{\text{ a.s.}}\alpha ^{\prime } \mathcal{T }_{\alpha ^{\prime }}, \end{aligned}$$

in the Gromov–Hausdorff sense, where \( \mathcal{T }_{\alpha ^{\prime }}\) denotes a stable tree of index \(\alpha ^{\prime }\) (which is not independent of \(\mathcal{T }_{\alpha }\)!). Furthermore, still by Lemma 8, the tree \( \mathcal{T }_{\alpha ^{\prime }}\) is independent of the sequence \((L_{\alpha ,\alpha ^{\prime }}(n),n \ge 1)\), hence also of its limit after rescaling. Hence

$$\begin{aligned} \frac{\mathbf{T}_{\alpha ,\alpha ^{\prime }}(n)}{\alpha n^{{\bar{\alpha }}}}&= \frac{\mathbf{T}_{\alpha ,\alpha ^{\prime }}(n)}{\alpha {L_{\alpha ,\alpha ^{\prime }}(n)}^{\bar{\alpha }^{\prime }}}\times \left(\frac{L_{\alpha ,\alpha ^{\prime }}(n)}{n^{\bar{\alpha }/\bar{\alpha }^{\prime }}}\right)^{\bar{\alpha }^{\prime }}\\&\xrightarrow [n\rightarrow \infty ]{\text{ a.s.}} \frac{\alpha ^{\prime }}{\alpha } (\text{ ML}_{\bar{\alpha }/\bar{\alpha }^{\prime },\bar{\alpha }})^{\bar{\alpha }^{\prime }} \mathcal{T }_{\alpha ^{\prime }}=: \mathfrak{T }_{\alpha ,\alpha ^{\prime }}, \end{aligned}$$

where \(\text{ ML}_{\bar{\alpha }/\bar{\alpha }^{\prime },\bar{\alpha }}\) denotes the Mittag-Leffler random variable of Lemma 9, which is independent of \( \mathcal{T }_{\alpha ^{\prime }}\). The variable \(M_{\alpha , \alpha ^{\prime }}\) of Proposition 2 is thus equal to \((\alpha ^{\prime }/\alpha )^{1/\bar{\alpha } ^{\prime }} \text{ ML}_{\bar{\alpha }/\bar{\alpha }^{\prime },\bar{\alpha }}\) and the tree \(\mathfrak{T }_{\alpha ,\alpha ^{\prime }}\) can be realized as a closed subtree of \( \mathcal{T }_{\alpha }\). \(\square \)

In the constructions of \( \mathbf T _{\alpha }(n)\) and \(\mathbf T _{\alpha ,\alpha ^{\prime }}(n)\), the two leaves \(A_{0}\) and \(A_{1}\) are kept in both trees and provide in the limit, by Theorem 5, two independent uniform points in \( \mathcal{T }_{\alpha }\) and in \( \mathfrak{T }_{\alpha ,\alpha ^{\prime }}\). More precisely, for \(\beta \in (1,2]\) we denote by \(I_{\beta }\) the distance between two uniform points in a \(\beta \)-stable tree. We also write \(H_{n}\) for the distance between \(A_{0}\) and \(A_{1}\) in \( \mathbf T _{\alpha }(n)\). Then we have by Theorem 5

$$\begin{aligned} \frac{H_{n}}{n^{ \bar{\alpha }}} \xrightarrow [n\rightarrow \infty ]{\text{ a.s.}} \alpha I_{\alpha }. \end{aligned}$$

The distance \(H_n\) is also the distance between \(A_0\) and \(A_1\) in \( \mathbf T _{\alpha ,\alpha ^{\prime }}(n)\). Hence, by Lemmas 8 and 9, combined with Theorem 5,

$$\begin{aligned} \frac{H_{n}}{n^{{\bar{\alpha }}}}= \frac{H_{n}}{{L_{\alpha ,\alpha ^{\prime }}(n)}^{\bar{\alpha }^{\prime }}}\times \left(\frac{L_{\alpha ,\alpha ^{\prime }}(n)}{n^{\bar{\alpha }/\bar{\alpha }^{\prime }}}\right)^{\bar{\alpha }^{\prime }} \xrightarrow [n\rightarrow \infty ]{\text{ a.s.}} \alpha ^{\prime }I_{\alpha ^{\prime }} \left(\text{ ML}_{\bar{\alpha }/\bar{\alpha }^{\prime },\bar{\alpha }}\right)^{\bar{\alpha }^{\prime }}\!, \end{aligned}$$

where, in the limit, the Mittag-Leffler random variable and \(I_{\alpha ^{\prime }}\) are independent. Thus, if for \(1<\alpha <\alpha ^{\prime } \le 2\) we set

$$\begin{aligned} Q_{\alpha \rightarrow \alpha ^{\prime }} \overset{\mathrm{(d)}}{=} \frac{\alpha ^{\prime }}{\alpha }\left(\text{ ML}_{\bar{\alpha }/\bar{\alpha }^{\prime },\bar{\alpha }}\right)^{\bar{\alpha }^{\prime }} = (M_{\alpha ,\alpha ^{\prime }})^{\bar{\alpha }^{\prime }}\!, \end{aligned}$$

we can easily check that for \(a<b<c \in (1,2]\) we have \( Q_{a \rightarrow c} = Q_{a \rightarrow b}\cdot Q_{b \rightarrow c}\) in distribution, where the last two variables are independent. Hence \(Q\) can be interpreted as a transition kernel. Furthermore, the random heights in the stable trees form an invariant family for \(Q\) in the sense that

$$\begin{aligned} I_{\alpha } \overset{\mathrm{(d)}}{=} Q_{\alpha \rightarrow \alpha ^{\prime }}\cdot I_{\alpha ^{\prime }} \end{aligned}$$
(13)

with the two variables on the right-hand side independent. This could have been proved directly using moments, together with the fact (see e.g. [22]) that \(\alpha I_{\alpha }\) is distributed as \( \text{ ML}_{\bar{\alpha },\bar{\alpha }}\) for all \(\alpha \in (1,2]\); with this identification, (13) reduces to (11).
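The semigroup property \( Q_{a \rightarrow c} = Q_{a \rightarrow b}\cdot Q_{b \rightarrow c}\) can likewise be checked on moments via (10), since \(\mathbb{E }[Q_{\alpha \rightarrow \alpha ^{\prime }}^p] = (\alpha ^{\prime }/\alpha )^p\, \mathbb{E }[\text{ ML}_{\bar{\alpha }/\bar{\alpha }^{\prime },\bar{\alpha }}^{p \bar{\alpha }^{\prime }}]\). A sketch (helper names ours):

```python
from math import gamma, isclose

def ml_moment(beta, theta, p):
    """p-th moment of ML_{beta,theta}, from formula (10)."""
    return (gamma(theta + 1) * gamma(theta / beta + p + 1)
            / (gamma(theta / beta + 1) * gamma(theta + p * beta + 1)))

def q_moment(al1, al2, p):
    """p-th moment of Q_{al1 -> al2} = (al2/al1) * (ML_{ab1/ab2, ab1})^{ab2},
    where ab = 1 - 1/alpha."""
    ab1, ab2 = 1 - 1 / al1, 1 - 1 / al2
    return (al2 / al1) ** p * ml_moment(ab1 / ab2, ab1, p * ab2)

a, b, c = 1.2, 1.5, 2.0
for p in (0.5, 1.0, 2.0, 3.0):
    # semigroup property: Q_{a->c} = Q_{a->b} * Q_{b->c}, independent factors
    assert isclose(q_moment(a, c, p),
                   q_moment(a, b, p) * q_moment(b, c, p), rel_tol=1e-9)
```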

We now aim at a backward analog of (13). As in Theorem 1, let \(J_{\alpha }\) be distributed as a power of a Gamma random variable, namely as \(\alpha (\Gamma _{1+\bar{\alpha }})^{\bar{\alpha }}\) for \(\alpha \in (1,2]\). In particular, the positive moments of \( J_{\alpha }\) are given by

$$\begin{aligned} \mathbb{E }\big [J_{\alpha }^p\big ] = \alpha ^p \frac{ \Gamma \left((p+1) \bar{\alpha }+1\right)}{ \Gamma \left( \bar{\alpha }+1\right)}, \quad \text{ for}\, p \ge 0. \end{aligned}$$

These variables have been designed to satisfy the “dual” relations of (13), namely, for \(1<\alpha <\alpha ^{\prime } \le 2\), one can check using moments that the following identity in distribution holds

$$\begin{aligned} J_{\alpha }\cdot Q_{\alpha \rightarrow \alpha ^{\prime }} = J_{\alpha ^{\prime }}, \end{aligned}$$
(14)

where the two random variables on the left-hand side are independent.

Proof of Theorem 1

Let \({\fancyscript{T}}_{\alpha }\) be a stable tree \(\mathcal{T }_{\alpha }\) that has been rescaled by an independent copy of \(J_{\alpha }\), that is \( {\fancyscript{T}}_{\alpha } = J_{\alpha } \cdot \mathcal{T }_{\alpha }\). For \(\alpha ^{\prime } \in (\alpha ,2]\), using Proposition 2, we can find inside \(\mathcal{T }_{\alpha }\) a subtree \( \mathfrak{T }_{\alpha ,\alpha ^{\prime }}\) which is distributed as an \(\alpha ^{\prime }\)-stable tree rescaled by an independent factor distributed as \( Q_{\alpha \rightarrow \alpha ^{\prime }}\). Hence, after rescaling by \(J_{\alpha }\) this provides a subtree of \( {\fancyscript{T}}_{\alpha }\) which is distributed as

$$\begin{aligned} J_{\alpha } \cdot Q_{\alpha \rightarrow \alpha ^{\prime }} \mathcal{T }_{\alpha ^{\prime }} \overset{\mathrm{(d)}}{=} J_{\alpha ^{\prime }} \cdot \mathcal{T }_{\alpha ^{\prime }} \overset{\mathrm{(d)}}{=} {\fancyscript{T}}_{\alpha ^{\prime }}, \end{aligned}$$

since \( Q_{\alpha \rightarrow \alpha ^{\prime }}\) and \( J_{\alpha }\) are independent. Iterating, we obtain for all finite increasing sequences \(1<\alpha _1<\cdots <\alpha _n\le 2\) a sequence of nested trees \({\fancyscript{T}}_{\alpha _n} \subset \cdots \subset {\fancyscript{T}}_{\alpha _1}\). Using Kolmogorov’s extension theorem we can thus construct a process \( ( {\fancyscript{T}}_{\alpha })_{1 < \alpha \le 2}\) of nested rescaled stable trees. \(\square \)

Still using moments, note that

$$\begin{aligned} J_{\alpha }I_{\alpha } \overset{\mathrm{(d)}}{=}\Gamma _2, \end{aligned}$$

when \(I_{\alpha }\) and \(J_{\alpha }\) are independent. This implies that the distance between two independent uniform points in \( {\fancyscript{T}}_{\alpha }\) has a Gamma distribution of parameter 2 (that is with density \(xe^{-x}\) on \((0,\infty )\)) for all \(\alpha \in (1,2]\). For the nested sequence of trees \( ( {\fancyscript{T}}_{\alpha })_{1 < \alpha \le 2}\), it is then possible to couple the choice of the two uniform independent points so that their distance is the same in all trees \( {\fancyscript{T}}_{\alpha }\).
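To verify the identity \(J_{\alpha }I_{\alpha } \overset{\mathrm{(d)}}{=}\Gamma _2\), one can combine the moment formula for \(J_{\alpha }\) above with the standard moments of the generalized Mittag–Leffler distribution, \(\mathbb{E }\big [ \text{ ML}_{\bar{\alpha },\bar{\alpha }}^p\big ] = \Gamma (\bar{\alpha }+1)\Gamma (p+2)/\Gamma ((p+1)\bar{\alpha }+1)\) (see e.g. [22]), recalling that \(\alpha I_{\alpha }\) is distributed as \( \text{ ML}_{\bar{\alpha },\bar{\alpha }}\). By independence,

$$\begin{aligned} \mathbb{E }\big [(J_{\alpha }I_{\alpha })^p\big ] = \alpha ^p\, \frac{\Gamma \big ((p+1)\bar{\alpha }+1\big )}{\Gamma \big (\bar{\alpha }+1\big )} \cdot \alpha ^{-p}\, \frac{\Gamma \big (\bar{\alpha }+1\big )\,\Gamma (p+2)}{\Gamma \big ((p+1)\bar{\alpha }+1\big )} = \Gamma (p+2) = \mathbb{E }\big [\Gamma _2^{\,p}\big ], \end{aligned}$$

for all \(p \ge 0\), as claimed.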

3.3 Zero measure

Proposition 10

For all \(\alpha ^{\prime } \in (\alpha ,2]\),

$$\begin{aligned} \mu _{\alpha }\big ( \mathfrak{T }_{\alpha ,\alpha ^{\prime }}\big )=0 \quad \text{ a.s}. \end{aligned}$$

Proof

We will use that \((\mathcal{T }_{\alpha },\mu _{\alpha })\) is obtained from Marchal's Markov chain \(\mathbf{T}_{\alpha }\) via (12), jointly with the fact that \(\mathfrak{T }_{\alpha ,\alpha ^{\prime }}\) is the scaling limit of the subchain \(\mathbf{T}_{\alpha ,\alpha ^{\prime }}\). Since, by Lemma 9, the number of leaves of \( \mathbf T _{\alpha ,\alpha ^{\prime }}(n) \) is proportional to \(n^{\bar{\alpha }/ \bar{\alpha }^{\prime }} \ll n\), it should be intuitively clear that \( \mathfrak{T }_{\alpha ,\alpha ^{\prime }}\) has zero mass inside \( \mathcal{T }_{\alpha }\). We sketch here a more rigorous argument. Let \(\gamma \) denote the expectation of the \(\mu _{\alpha }\)-mass of \( \mathfrak{T }_{\alpha ,\alpha ^{\prime }}\):

$$\begin{aligned} \gamma = \mathbb E \left[\mu _{\alpha } \big ( \mathfrak{T }_{\alpha ,\alpha ^{\prime }} \big )\right]\!, \end{aligned}$$

our goal being to show that \(\gamma =0\). For that purpose we use the self-similarity of Marchal’s construction. Indeed, if we stop this construction after one step, we obtain the tree \(\mathbf{T}_{\alpha }(2)\) made of a “Y” with three leaves \(A_{0}, A_{1}\) and \(A_{2}\) joined by a branch point denoted by \(\mathcal{B }_{2}\). In the future evolution of the process, the three edges \(e_{0}=\{\mathcal{B }_{2},A_{0}\}, e_{1}=\{ \mathcal{B }_{2},A_{1}\}\) and \(e_{2}=\{\mathcal{B }_{2},A_{2}\}\) will give rise in the scaling limit to three copies of rescaled stable trees denoted by \( \tau _{0}, \tau _{1}\) and \(\tau _{2}\). Similarly, the edges that will later be grafted on \(\mathcal{B }_{2}\) will also give rise to a countable collection of continuous trees \( \tau _{3}, \tau _{4}, \ldots \) such that the tree \( \mathcal{T }_{\alpha }\) is made of the glueing of all the \( \tau _{i}\)’s by one vertex. It is clear from Marchal’s construction, and (12), that the measured trees \((\tau _i,\mu _{\alpha ,i})\), \(i\ge 0\), where \(\mu _{\alpha ,i}\) denotes the restriction of the measure \(\mu _{\alpha }\) to \(\tau _i\), satisfy

$$\begin{aligned} \tau _i=\mu _{\alpha }(\tau _i)^{\bar{\alpha }}\mathcal{T }^{(i)}_{\alpha }, \quad \mu _{\alpha ,i}=\mu _{\alpha }(\tau _i)\mu ^{(i)}_{\alpha }, \end{aligned}$$

where the measured trees \((\mathcal{T }^{(i)}_{\alpha },\mu ^{(i)}_{\alpha }), i \ge 0\) are i.i.d. copies of \((\mathcal{T }_{\alpha },\mu _{\alpha })\), that are moreover independent of the masses \(\mu _{\alpha }(\tau _i), i \ge 0\).

Let us now examine the evolution of the “blue” subprocess \(\mathbf T _{\alpha ,\alpha ^{\prime }}\) inside each of these trees. First, it is plain that the construction of \( \mathbf T _{\alpha ,\alpha ^{\prime }}\) restricted to each offspring subtree of the three initial edges \(e_{0},e_{1}\) and \(e_{2}\) follows exactly the rules of a subchain of parameters \(\alpha ,\alpha ^{\prime }\) inside this subtree. This gives in the limit a subtree \(t_{i} \subset \tau _{i}\), for \(i\in \{0,1,2\}\), that satisfies

$$\begin{aligned} \gamma = \frac{\mathbb{E }[ \mu _{\alpha }( t_{i})]}{\mathbb{E }[ \mu _{\alpha }(\tau _{i})]}. \end{aligned}$$

For \(i\ge 3\), the situation is almost the same, provided that the first edge grafted on \(\mathcal{B }_{2}\) that gives rise to \(\tau _{i}\) is blue. If this is not the case, that is, if the ancestral edge creating \(\tau _{i}\) is red, then the blue subprocess does not “enter” this part of the tree and we set \(t_{i} = \varnothing \). If the ancestral edge creating \(\tau _{i}\) is blue, then the blue subprocess enters this part of the tree and provides in the limit a subtree \( t_{i} \subset \tau _{i}\) such that

$$\begin{aligned} \gamma = \frac{\mathbb{E }[ \mu _{\alpha }( t_{i}) \ | \ t_{i} \ne \varnothing ]}{\mathbb{E }[ \mu _{\alpha }(\tau _{i})]}. \end{aligned}$$

The subtree \( \mathfrak{T }_{\alpha ,\alpha ^{\prime }}\) is thus made of the union of the \( t_{i},i\ge 0\) and we have

$$\begin{aligned} \gamma =\mathbb{E }\Bigg [\sum _{i \ge 0} \mu _{\alpha }( t_{i}) \mathbf 1 _{\{{t}_{i} \ne \varnothing \}}\Bigg ]= \gamma \sum _{i \ge 0} \mathbb{P }(t_i \ne \varnothing ) \mathbb{E }\left[\mu _{\alpha }(\tau _i)\right]. \end{aligned}$$
(15)

To conclude, note that \(\mathbb{P }(\exists i : t_i=\varnothing )=1\). Indeed, the probability that the ancestral edges of all the \(\tau _i\), \(i \ge 3\), are blue is

$$\begin{aligned} \prod _{d\ge 3} \frac{(d-1-\alpha ^{\prime })(\alpha -1)}{(d-1-\alpha )(\alpha ^{\prime }-1)}=0, \end{aligned}$$

since \(\alpha ^{\prime }>\alpha \). Together with the fact that \(\sum _{i\ge 0} \mu _{\alpha }(\tau _i)=1\) a.s., this implies that the sum in (15) is strictly less than \(1\). Hence \(\gamma =0\).
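To justify that the displayed product vanishes, note that each factor converges, as \(d \rightarrow \infty \), to a limit strictly smaller than \(1\):

$$\begin{aligned} \frac{(d-1-\alpha ^{\prime })(\alpha -1)}{(d-1-\alpha )(\alpha ^{\prime }-1)} \xrightarrow [d\rightarrow \infty ]{} \frac{\alpha -1}{\alpha ^{\prime }-1} < 1, \end{aligned}$$

so the partial products decay at least geometrically and the infinite product is indeed \(0\).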

4 Pruning of discrete and continuous trees

The goal of this section is to present a “geometric” way of retrieving \( \mathbf T _{\alpha ,\alpha ^{\prime }}(n)\) from the labeled tree \(\mathbf T _\alpha (n)\). Although equivalent to the iterative construction of \( \mathbf T _{\alpha , \alpha ^{\prime }}(n)\), this new procedure, called the “pruning operation”, is well-suited to passing to the continuous limit and yields Theorem 3.

4.1 Pruning

Let \(1<\alpha <\alpha ^{\prime } \le 2\). We recall and extend the pruning procedure of stable trees described in the Introduction to the context of continuous or discrete trees. This procedure depends on the probabilities \(p_{\alpha ,\alpha ^{\prime },d,d^{\prime }}\) and uses the sequence of uniform random variables \((U_{i})_{i \ge 2}\) introduced in Sect.  2.3. Let \( ( {t} ; x_{0}, x_{1}, x_{2}, \ldots , x_{n})\) be a discrete or continuous tree given with an ordered subset of distinct leaves. We randomly remove some edges of this labeled tree so as to obtain a subtree of \(t\) (given with an ordered subset of its leaves), denoted by \( \text{ Prun}_{\alpha ,\alpha ^{\prime }}(t ; x_{0}, \ldots , x_{n})\), which is constructed recursively as follows:

  • the tree \(\text{ Prun}_{\alpha ,\alpha ^{\prime }}(t ; x_{0},x_{1})\) is \([ [ x_{0},x_{1} ] ]\), the tree spanned by \(x_0\) and \(x_1\) in \(t\);

  • for \( i \ge 2\), consider two trees: \( \mathfrak{t }_{i-1}\), the tree spanned by \(x_{0}, \ldots , x_{i-1}\) in \(t\), and \(\tau _{i-1}=\text{ Prun}_{\alpha ,\alpha ^{\prime }}(t ; x_{0}, \ldots , x_{i-1})\). In \(t\), the vertex \(x_{i}\) is attached to \(\mathfrak{t }_{i-1}\) via the point \( \Delta _{i}\), that is

    $$\begin{aligned} \mathfrak t _{i-1} \cap [ [ x_{i}, \Delta _{i} ] ]=\{\Delta _i\}. \end{aligned}$$

    Let then \(d_{i}\) denote the degree of \(\Delta _{i}\) in \(\mathfrak{t }_{i-1}\) and \(d^{\prime }_{i}\) its degree in \(\tau _{i-1}\) with the usual convention that \(d^{\prime }_{i}=0\) if \(\Delta _{i} \notin \tau _{i-1}\). See Fig. 2. We then set:

    $$\begin{aligned} \begin{array}{ll} \text{ Prun}_{\alpha ,\alpha ^{\prime }}(t; x_{0}, \ldots , x_{i}) = \tau _{i-1} \cup [ [ x_{i}, \Delta _{i} ] ]&\quad \text{ if}\,U_{i} \le p_{\alpha ,\alpha ^{\prime },d_{i},d^{\prime }_{i}},\\ \text{ Prun}_{\alpha ,\alpha ^{\prime }}(t; x_{0}, \ldots , x_{i}) = \tau _{i-1}&\text{ otherwise}. \end{array} \end{aligned}$$

Note that \(\text{ Prun}_{\alpha ,\alpha ^{\prime }}(t ; x_{0}, \ldots , x_{n})\) is a function of \((t; x_{0}, \ldots , x_{n})\) and of the random variables \( (U_{i})_{i \ge 2}\). This tree can be given with an ordered sequence of leaves, namely the subsequence of elements of \(\{x_{0}, x_{1}, \ldots , x_{n}\}\) that end up in \( \text{ Prun}_{\alpha ,\alpha ^{\prime }}(t ; x_{0}, \ldots , x_{n})\), and following our principles we denote this labeled tree by \(\mathbf{Prun}_{\alpha ,\alpha ^{\prime }}(t ; x_0, \ldots , x_n)\). To stick with the color coding that we defined in Sect. 3.1, we think of the pruned subtree as a blue subtree of \(t\) whose complement in \(\mathfrak t _n\) is red. In particular, a leaf \(x_i\) is blue if it belongs to the pruned subtree. All these constructions are coupled via the \(U_{i}\)'s in such a way that

$$\begin{aligned} \text{ Prun}_{\alpha ,\alpha ^{\prime }}(t ; x_{0}, \ldots , x_{n-1}) \subset \text{ Prun}_{\alpha ,\alpha ^{\prime }}(t ; x_{0}, \ldots , x_{n}), \quad \forall n \ge 1, \end{aligned}$$

and for every \(\alpha \) the map \( \alpha ^{\prime }\in (\alpha ,2] \mapsto \text{ Prun}_{\alpha ,\alpha ^{\prime }}(t ; x_{0}, \ldots , x_{n}) \) is decreasing for the inclusion in \(t\). Thus, when \( t\) is a continuous tree and \(x_{0}, \ldots , x_{n}, \ldots \) is an infinite sequence of (distinct) leaves in \(t\), we can define

$$\begin{aligned} \text{ Prun}_{\alpha ,\alpha ^{\prime }}( t ; (x_{i})_{i \ge 0})&= \overline{\bigcup _{ n \ge 1} \text{ Prun}_{\alpha ,\alpha ^{\prime }}( t ; x_{0}, \ldots , x_{n})}. \end{aligned}$$

Note also that when \( \alpha ^{\prime }=2\), since \( p_{\alpha ,2,d,3}=0\), the tree \( \text{ Prun}_{\alpha ,\alpha ^{\prime }}(t; x_{0}, \ldots , x_{n})\) is binary for all \(n\ge 1\). In this case, the pruning procedure is deterministic (it does not depend on the \(U_i\)’s): the leaves \(x_{i}\) are just added one by one provided that the tree remains binary. Notice also that the pruning constructions are coupled with the chain \(\mathbf T _{\alpha ,\alpha ^{\prime }}\) via the \(U_i\)’s.

Now, recall that \(\mathbf T _\alpha (n)\) is a discrete tree with \(n+1\) ordered leaves \(A_0,\ldots ,A_n\). The pruning has been defined such that the blue subtree \( \mathbf T _{\alpha ,\alpha ^{\prime }}(n)\) can also be seen as \(\mathbf{Prun}_{\alpha ,\alpha ^{\prime }}(\mathbf T _\alpha (n))\):

Proposition 11

For all \( 1< \alpha < \alpha ^{\prime } \le 2\) and for every \(n \ge 1\), we have

$$\begin{aligned} \mathbf{Prun}_{\alpha ,\alpha ^{\prime }}\big ( \mathbf{T}_\alpha (n)\big ) {=} \mathbf{T}_{\alpha ,\alpha ^{\prime }}(n). \end{aligned}$$

Proof

This is easy to prove by induction on \(n\) and is safely left to the reader.

4.2 Proof of Theorem 3

We start with a continuity property of the map \(\text{ Prun}(\cdot )\) with respect to the leaves. It applies both in the discrete and continuous settings and will be useful to prove Theorem 3.

Proposition 12

Let \((t ; x_{0}, \ldots , x_{n})\) be a discrete or continuous tree given with a subset of \(n+1\) leaves. Then for any \(k \in \{0,1,\ldots , n\}\) we have

$$\begin{aligned}&\fancyscript{D} \left( \mathrm{Prun}_{\alpha ,\alpha ^{\prime }}(t ; x_0, \ldots , x_n), \mathrm{Prun}_{\alpha ,\alpha ^{\prime }}( t ; x_{0},\ldots , x_{k})\right) \\&\le \inf \left\{ \varepsilon >0 : x_{0}, \ldots , x_{k} \text{ is} \text{ an}\, \varepsilon \text{-net} \text{ in}\,(t, \fancyscript{D})\right\} , \end{aligned}$$

where \( \fancyscript{D}\) denotes either the graph distance if \(t\) is discrete or its metric if \(t\) is a continuous tree.

Proof

Let \( 0 \le k \le n\). By the monotonicity of the pruning operation, the pruned tree \(\text{ Prun}_{\alpha ,\alpha ^{\prime }}(t ; x_0, \ldots , x_n)\) can be obtained from \(\text{ Prun}_{\alpha ,\alpha ^{\prime }}(t; x_{0}, \ldots , x_{k})\) by grafting some trees on its vertices and edges. The maximal height of these grafted trees is thus equal to the (Hausdorff) distance between \(\text{ Prun}_{\alpha ,\alpha ^{\prime }}(t; x_{0}, \ldots , x_{k})\) and \(\text{ Prun}_{\alpha ,\alpha ^{\prime }}(t; x_0, \ldots , x_n)\). We just have to remark that none of the points \(x_{0}, \ldots , x_{k}\) belongs to the grafted trees, that is

$$\begin{aligned} \{x_{0}, \ldots , x_{k}\} \cap \big ( \text{ Prun}_{\alpha ,\alpha ^{\prime }}(t; x_0, \ldots , x_n)\backslash \text{ Prun}_{\alpha ,\alpha ^{\prime }}( t; x_{0},\ldots , x_{k}) \big )&= \varnothing . \end{aligned}$$

The statement of the proposition follows.

Proof of Theorem 3

Starting with an \(\alpha \)-stable tree \(\mathcal{T }_{\alpha }\) and a sample \((X_{i}, i \ge 0)\) of i.i.d. random leaves with distribution \(\mu _{\alpha }\) (given \(\mathcal{T }_{\alpha }\)), we consider the chain \(\mathbf T _{\alpha }\) such that (12) holds, as well as its blue subchain \(\mathbf{T}_{\alpha ,\alpha ^{\prime }}\). We have already established (see the proof of Proposition 2) that, once re-normalized by \( \alpha n^{\bar{\alpha }}\), the blue tree \( {T}_{\alpha ,\alpha ^{\prime }}(n)\) converges almost surely towards \(\mathfrak{T }_{\alpha ,\alpha ^{\prime }} \subset \mathcal{T }_{\alpha }\). By Proposition 11, we just have to show that \(\text{ Prun}_{\alpha ,\alpha ^{\prime }}( \mathbf T _{\alpha }(n))\) converges after renormalization towards \( \text{ Prun}_{\alpha ,\alpha ^{\prime }}( \mathcal{T }_{\alpha } ; (X_{i})_{i \ge 0}).\) For this, we will use that for every \(k \ge 0\) (see Theorem 5(ii))

$$\begin{aligned} \left( \frac{T_{\alpha }(n)}{\alpha n^{\bar{ \alpha }}}; A_{0}, A_{1}, \ldots , A_{k}\right) \xrightarrow [n\rightarrow \infty ]{\text{ a.s.}} (\mathcal{T }_{\alpha } ; X_{0}, \ldots , X_{k}), \end{aligned}$$
(16)

in the \(k+1\)-pointed Gromov–Hausdorff topology. Let \( \varepsilon >0\). Almost surely, since the \(X_{i}\)’s form a dense subset of \(\mathcal{T }_{\alpha }\), which is compact, we can find a (random) \(0 \le k < \infty \) such that \(\{X_{0}, \ldots , X_{k}\}\) is an \( \varepsilon \)-net in \( \mathcal{T }_{\alpha }\). In particular, we eventually have that \(\{A_{0}, \ldots , A_{k}\}\) is a \(\lfloor 2 \alpha \varepsilon n^{{\bar{\alpha }}} \rfloor \)-net in \( T_{\alpha }(n)\). Applying Proposition 12 twice we get that for \(n\) large enough,

$$\begin{aligned} \delta \big (\text{ Prun}_{\alpha ,\alpha ^{\prime }}( \mathcal{T }_{\alpha } ; (X_{i})_{i \ge 0}),\text{ Prun}_{\alpha ,\alpha ^{\prime }}( \mathcal{T }_{\alpha } ; X_{0},\ldots , X_{k}) \big )&\le \varepsilon ,\\ \frac{1}{\alpha n^{{\bar{\alpha }}}}\text{ d}_\mathrm{gr}\big (\text{ Prun}_{\alpha ,\alpha ^{\prime }}( \mathbf T _{\alpha }(n)), \text{ Prun}_{\alpha ,\alpha ^{\prime }}( {T}_{\alpha }(n) ; A_0, \ldots , A_k) \big )&\le 2 \varepsilon , \end{aligned}$$

where \(\delta \) is the metric in \(\mathcal{T }_{\alpha }\). Hence the proof will be complete once we have shown that for every fixed \(k_{0} \ge 0\),

$$\begin{aligned} \frac{1}{ \alpha n^{{\bar{\alpha }}}} \text{ Prun}_{\alpha ,\alpha ^{\prime }}({T}_{\alpha }(n) ; A_0, \ldots , A_{k_{0}}) \xrightarrow [n\rightarrow \infty ]{\text{ a.s.}} \text{ Prun}_{\alpha ,\alpha ^{\prime }}( \mathcal{T }_{\alpha } ; X_0, \ldots , X_{k_{0}}), \end{aligned}$$
(17)

in the Gromov–Hausdorff sense as \(n \rightarrow \infty \). To see this, we will of course use (16). Let us however emphasize here that the convergence (16) in the \(k+1\)-pointed Gromov–Hausdorff sense does not imply the convergence of the pruned subtrees in general. Counter-examples can easily be set up when some of the marked points in the limiting tree are not leaves. However, here, the \(X_i\)’s are distinguished leaves of \(\mathcal{T }_{\alpha }\); therefore the discrete labeled trees obtained by removing vertices of degree 2 (and glueing the adjacent edges) in \(\text{ Span}(T_{\alpha }(n);A_0,\ldots ,A_{k_0})\), \(n \ge k_0\), and by forgetting lengths on the edges of \(\text{ Span}(\mathcal{T }_{\alpha };X_0,\ldots ,X_{k_0})\) are all identical. From this and the definition of the pruning procedure (with, as a reminder, the same sequence of uniform random variables \(U_i,i \ge 2\) in all cases), we get that the discrete labeled trees obtained by removing vertices of degree 2 in \(\text{ Prun}_{\alpha ,\alpha ^{\prime }}(T_{\alpha }(n);A_0,\ldots ,A_{k_0})\) for \(n\) large enough and by forgetting lengths on the edges of \(\text{ Prun}_{\alpha ,\alpha ^{\prime }}(\mathcal{T }_{\alpha };X_0, \ldots ,X_{k_0})\) are all identical. Re-incorporating distances, the conclusion follows from the convergence of \(\text{ Span}(T_{\alpha }(n);A_0,\ldots ,A_{k_0})\), after normalization, towards \(\text{ Span}(\mathcal{T }_{\alpha };X_0,\ldots ,X_{k_0})\), which is a consequence of (16), see e.g. Lemma 14 in [15].

With the notation of the proof of Proposition 2 we have thus proved that

$$\begin{aligned} \text{ Prun}_{\alpha ,\alpha ^{\prime }}( \mathcal{T }_\alpha ; (X_i)_{i \ge 0})=\mathfrak{T }_{\alpha ,\alpha ^{\prime }}= M_{\alpha ,\alpha ^{\prime }}^{\bar{\alpha }^{\prime }} \cdot \mathcal{T }_{\alpha ^{\prime }}. \end{aligned}$$

With a slight abuse of notation, we also denote by \(\mu _{\alpha ^{\prime }}\) the uniform mass measure on the tree \(\text{ Prun}_{\alpha ,\alpha ^{\prime }}( \mathcal{T }_\alpha ; (X_i)_{i\ge 0})\). Then let \(\mathsf{B}_i, i \ge 0\), be the indices of the leaves that are kept in the pruning procedure applied to \( ( \mathcal{T }_{\alpha } ; (X_i)_{i \ge 0})\). By the above proof, these indices also correspond to the indices of the leaves \(A_i, i \ge 0\), that end up being blue in the construction of \( \mathbf T _{\alpha ,\alpha ^{\prime }}\) from \( \mathbf T _{\alpha }\). We then claim that conditionally on \(\text{ Prun}_{\alpha ,\alpha ^{\prime }}( \mathcal{T }_\alpha ; (X_i)_{i\ge 0})\),

$$\begin{aligned} \text{ the} \text{ leaves}\,X_{ \mathsf{B}_i},\ i \ge 0, \text{ are} \text{ i.i.d.} \text{ according} \text{ to}\ \mu _{\alpha ^{\prime }}. \end{aligned}$$

In words, the blue leaves of the pruning procedure form an i.i.d. sequence of leaves distributed according to the uniform mass measure on the remaining pruned subtree. Although this statement is purely about the continuous setting, we give a proof relying on a discrete argument. Indeed, it follows from the preceding proof that for every \(k \ge 0\),

$$\begin{aligned} \bigg (\frac{T_{\alpha ,\alpha ^{\prime }}(n)}{\alpha n^{\bar{\alpha }}} ; A_{ \mathsf{B}_0}, \ldots , A_{ \mathsf{B}_k} \bigg ) \xrightarrow [n\rightarrow \infty ]{\text{ a.s.}} \big (\text{ Prun}_{\alpha ,\alpha ^{\prime }}( \mathcal{T }_\alpha ; (X_i)_{i\ge 0}) ; X_{ \mathsf{B}_0}, \ldots , X_{ \mathsf{B}_k} \big ), \end{aligned}$$

in the \(k+1\)-pointed Gromov–Hausdorff sense. However, since \(\mathbf T _{\alpha ,\alpha ^{\prime }}\) has the same distribution as a Marchal chain of parameter \(\alpha ^{\prime }\) time-changed by an independent increasing process, we deduce from Theorem 5(ii) that \( X_{ \mathsf{B}_0}, \ldots , X_{ \mathsf{B}_k}\) are i.i.d. according to \( \mu _{\alpha ^{\prime }}\) conditionally on \( M_{\alpha ,\alpha ^{\prime }}^{\bar{\alpha }^{\prime }}\cdot \mathcal{T }_{\alpha ^{\prime }}\), as desired.

5 A fragmentation point of view

In this section, we exploit the pruning operation of the last section in order to give another point of view on the construction of \(\mathfrak{T }_{\alpha ,\alpha ^{\prime }}\), based on the fragmentation properties of the stable tree. Indeed, it is by now standard that the stable tree can be constructed from a certain self-similar fragmentation process whose conservative dislocation measure is explicitly known. We will show that we can trim this fragmentation process by throwing certain fragments away, in such a way that the genealogy of the resulting dissipative fragmentation process is coded by the tree \(\mathfrak{T }_{\alpha ,\alpha ^{\prime }}\). This finally gives another interpretation of the random scaling appearing in Proposition 2, as the limit of the Malthusian martingale associated with the dissipative fragmentation.

5.1 Fragmentation of the stable tree

General self-similar fragmentation processes. This paragraph is deliberately brief; we refer to Bertoin [57] for a detailed and rigorous introduction to the theory of self-similar fragmentations. Let \(a \in \mathbb{R }\) and let \(\nu \) be a sigma-finite measure on the set of decreasing sequences

$$\begin{aligned} \mathcal{S }=\Bigg \{\mathbf{s}=(s_1,s_2,\ldots ):s_1 \ge s_2 \ge \ldots \ge 0 \text{ and} \sum _{i\ge 1}s_i \le 1 \Bigg \} \end{aligned}$$

(endowed with the topology of pointwise convergence), such that

$$\begin{aligned} \nu (1,0,\ldots )=0 \quad \text{ and} \quad \int _{\mathcal{S }}(1-s_1)\nu (\text{ d} \mathbf s )<\infty . \end{aligned}$$

Such a measure is called a conservative dislocation measure when \(\nu (\sum _i s_i<1)=0\), and a dissipative dislocation measure otherwise.

A pure-jump self-similar fragmentation process \(F\) with index of self-similarity \(a\) and dislocation measure \(\nu \) is an \(\mathcal{S }\)-valued càdlàg Markov process modelling the evolution of a system of splitting masses. The sequence \(F(t)\) represents the masses present at time \(t\), ranked in decreasing order. When \(\nu \) is finite, the dynamics of the process are easy to describe:

  (i) different masses present at a given time evolve independently;

  (ii) each mass \(m\) splits after a random time with an exponential distribution of parameter \(m^a\nu (\mathcal{S })\) to give sub-masses \( mS_1,m S_2,\ldots \) where \((S_1,S_2,\ldots ) \in \mathcal{S }\) is distributed according to \(\nu /\nu (\mathcal{S })\), independently of the splitting time.

When \(\nu \) is infinite, the splitting times are dense in \(\mathbb{R }_+\) and the description of the process is more delicate. Informally, item (i) is still valid, whereas the second item is replaced by the more general fact that a mass \(m\) splits into masses \(m \mathbf s \), \(\mathbf s \in \mathcal{S }\), at rate \(m^a \nu (\text{ d} \mathbf s )\).
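As an illustration of the finite-measure dynamics (i)–(ii), here is a minimal simulation sketch in Python. It assumes a user-supplied total rate playing the role of \(\nu (\mathcal{S })\) and a sampler for the normalized dislocation \(\nu /\nu (\mathcal{S })\) (a conservative uniform binary split in the example); it only illustrates the splitting mechanism, not the general Poissonian construction.

```python
import heapq
import random

def simulate_fragmentation(a, total_rate, sample_split, t_max,
                           rng=None, mass_floor=1e-9):
    """Simulate a self-similar fragmentation with a *finite* dislocation
    measure, following rules (i)-(ii): masses evolve independently, and a
    mass m splits after an exponential time of parameter m**a * total_rate
    into the masses m*S_1, m*S_2, ... with (S_1, S_2, ...) ~ nu/nu(S).

    `total_rate` plays the role of nu(S) and `sample_split` draws the
    normalized dislocation (both supplied by the user).  Fragments below
    `mass_floor` are discarded as dust, so the simulation terminates even
    for a negative index a.  Returns the masses alive at time t_max,
    ranked in decreasing order."""
    rng = rng or random.Random()
    # priority queue of (next splitting time, mass) pairs
    queue = [(rng.expovariate(total_rate * 1.0 ** a), 1.0)]
    while queue and queue[0][0] <= t_max:
        t, m = heapq.heappop(queue)
        for s in sample_split(rng):
            child = m * s
            if child > mass_floor:
                wait = rng.expovariate(total_rate * child ** a)
                heapq.heappush(queue, (t + wait, child))
    return sorted((m for _, m in queue), reverse=True)

def uniform_binary_split(rng):
    """Conservative binary dislocation: split a mass at a uniform point."""
    u = rng.uniform(0.0, 1.0)
    return (u, 1.0 - u)
```

With a conservative split and index \(a=0\) the total mass is preserved at all times, while taking \(a<0\) makes small fragments split faster and faster, the mechanism behind the loss of mass by formation of dust.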

In all cases, this can be made rigorous by using Poisson point processes (see [4, 5] for details). We start with the description of homogeneous fragmentations, that is, fragmentations with an index of self-similarity equal to 0. Let \((S(t),k(t))_{t \ge 0}\) be a Poisson point process with intensity measure \(\nu \otimes \#\), where \(\#\) denotes the counting measure on \(\mathbb{N }\). Then it is possible to build a pure-jump càdlàg \(\mathcal{S }\)-valued process \(H\) such that \(H(0)=(1,0,\ldots )\) and \(H\) jumps only when an atom \((S(t),k(t))\) of the Poisson point process occurs: \(H(t)\) is then obtained from \(H(t-)\) by keeping all elements of the sequence \(H(t-)\) except \(H_{k(t)}(t-)\), which is replaced by the sequence of masses \(H_{k(t)}(t-)S_i(t), i \ge 1\), that is

$$\begin{aligned} H(t)= \downarrow \big \{H_{j}(t-), H_{k(t)}(t-)S_i(t),i,j \ge 1, j \ne k(t)\big \}, \end{aligned}$$

where the symbol \(\downarrow \) means that we have ranked the terms in the decreasing order. This defines a fragmentation process \(H\) with parameters \((0,\nu )\), and all \((0,\nu )\)-fragmentations can be constructed similarly from a Poisson point process.

Next, from a homogeneous fragmentation \(H\) with dislocation measure \(\nu \) and any \(a \in \mathbb{R }\), one can construct a fragmentation \(F\) with parameters \((a,\nu )\) by using a family of suitable time-changes. Conversely, any \((a,\nu )\)-fragmentation can be turned into a homogeneous one with the same dislocation measure by suitable time-changes. We do not specify these time-changes here, as this requires some additional technical material, but refer to [6] for details. Let us just mention that the time-changes only depend on the past evolution of the masses. They modify the splitting times of the masses but change neither the masses themselves nor the genealogical relations between them. It turns out that for all \(t \ge 0\) and all \(i \in \mathbb{N }\), there exists a pair \((s,j) \in [0,\infty ) \times \mathbb{N }\) such that \(F_i(t)=H_j(s)\), and vice-versa.

Fragmentation trees. A rooted continuous tree is a pair consisting of a continuous tree and one of its points, this distinguished point being called the root. To simplify notation we write \(\text{ ht}(.)\) for the distance to the root in a rooted tree.

In [16] it is proved that when \(a<0\) and \(\nu \) is conservative, if \(F\) is an \((a,\nu )\) fragmentation process then one can construct a random compact rooted continuous tree \( \mathcal{T }^\bullet \) endowed with a (random) probability measure \(\mu \) which is supported on its leaves such that \( \mathcal{T }^\bullet \) codes “the genealogy” of the process \(F\) in the sense that almost surely for all \(t \ge 0\) we have

$$\begin{aligned} F(t)=\downarrow \mu \text{-masses} \text{ of} \text{ the} \text{ connected} \text{ components} \text{ of} \{v \in \mathcal{T }^\bullet : \text{ ht}(v)>t\}. \end{aligned}$$
(18)

The boundedness of this genealogical tree is due to a loss of mass by formation of dust in the fragmentation, which is itself due to the strictly negative index of self-similarity. The tree \( \mathcal{T }^\bullet \) is called the fragmentation tree associated with \(F\). Recently, this result has been extended to dissipative dislocation measures [26], the main difference being that the probability measure \(\mu \) is then no longer concentrated on the set of leaves of the genealogical tree and also charges its skeleton. When \(\nu \) is dissipative and \(\nu (0,0,\ldots )=0\), it is even fully supported by the skeleton.

A particular instance of fragmentation trees is given by the stable trees. Specifically, let \( \mathcal{T }_{\alpha }^\bullet \) denote an \(\alpha \)-stable tree endowed with its uniform mass measure \(\mu _{\alpha }\) and rooted at \(X_{0}\) (a uniform random leaf). If \(F_{\alpha }\) denotes the \( \mathcal S \)-valued process associated with \( \mathcal{T }^\bullet _{\alpha }\) by (18) then Bertoin [6] for the Brownian case and Miermont [22] for the general setting showed that \(F_{\alpha }\) is a pure-jump self-similar fragmentation process with self-similarity index \(-\bar{\alpha }\) and computed its dislocation measure. For the fragmentation \(F_2\) this measure is given for all test functions \(f:\mathcal{S } \rightarrow \mathbb{R }\) by

$$\begin{aligned} \int _{\mathcal{S }}f(\mathbf s )\nu _{\text{ Br}}(\text{ d} \mathbf s )=\int _{1/2}^1 f(x,1-x,0,\ldots ) \left(\pi x^{3}(1-x)^{3} \right)^{-1/2}\text{ d}x, \end{aligned}$$

and for the stable fragmentation \(F_{\alpha }\), \(1<\alpha <2\), it is given by

$$\begin{aligned} \int _{\mathcal{S }}f(\mathbf s )\nu _{\alpha }(\text{ d} \mathbf s ) = C_{\alpha } \mathbb{E }\left[\sigma _1 f\left(\frac{\Xi _i}{\sigma _1},i \ge 1\right)\right], \end{aligned}$$

where

$$\begin{aligned} C_{\alpha }=\frac{\alpha (\alpha -1) \Gamma (\bar{\alpha })}{\Gamma (2 - \alpha )} \end{aligned}$$
(19)

and \((\sigma _t, t\ge 0)\) is a stable subordinator of Laplace exponent \(\lambda ^{1/\alpha }\) and \(\left(\Xi _i,i\ge 1 \right)\) the sequence of its jumps before time 1, ranked in the decreasing order. Note that \(\nu _{\alpha }(\mathcal{S })=\infty \) for all \(\alpha \in (1,2]\), which is related to the fact that the set of leaves of \(\mathcal{T }_{\alpha }\) is dense in \(\mathcal{T }_{\alpha }\). Note also that \(\nu _{\alpha }\) is conservative.

We need some more notation. Note that for all \(t\ge 0\), the connected components of \(\{ v \in \mathcal{T }_{\alpha }^{\bullet } : \text{ ht}(v)>t\}\) are open subtrees of \(\mathcal{T }_{\alpha }\). We denote these subtrees by \( \mathcal{T }_{\alpha }^i(t), i \ge 1\) so that

$$\begin{aligned} F_{\alpha }^i(t) = \mu _{\alpha }( \mathcal{T }_{\alpha } ^i(t)), \end{aligned}$$

with \( F_{\alpha }(t)=(F_{\alpha }^i(t), i \ge 1)\). These subtrees are all distributed as \(\mathcal{T }_{\alpha }\backslash \{X_0\}\), up to random scalings. Specifically, conditionally on \(F_{\alpha }(t)\), the trees \( \mathcal{T }_{\alpha }^i(t), i \ge 1\) are independent, such that for all \(i\), \(\mathcal{T }_{\alpha }^i(t)\) endowed with the probability measure \(\mu _{\alpha }(\ \cdot \mid \mathcal{T }_{\alpha }^i(t))\) is distributed as \(F^i_{\alpha }(t) ^{\bar{\alpha }} \cdot (\mathcal{T }^i_{\alpha }\backslash \{X_0^i\})\) endowed with \(\mu ^i_{\alpha }\), where \((\mathcal{T }^i_{\alpha }, \mu ^i_{\alpha })\) is an independent copy of the \(\alpha \)-stable tree and \(X_0^i\) is a random point of \(\mathcal{T }^i_{\alpha }\) with distribution \(\mu _{\alpha }^i\). This is called the self-similar property of the stable tree \(\mathcal{T }_{\alpha }\). A similar property holds for all fragmentation trees.

5.2 The tree \(\mathfrak{T }_{\alpha ,\alpha ^{\prime }}\) is a dissipative fragmentation tree

From now on and in the remainder of this paper, we let \(1<\alpha <\alpha ^{\prime } \le 2\) and consider a stable tree \(\mathcal{T }_{\alpha }\) together with a sample of uniform leaves \(X_i,i \ge 0\). We let

$$\begin{aligned} \mathfrak{T }_{\alpha ,\alpha ^{\prime }} = \text{ Prun}_{\alpha ,\alpha ^{\prime }}( \mathcal{T }_{\alpha }, (X_{i})_{i\ge 0}) \end{aligned}$$

and denote by \( \mathcal{T }_{\alpha }^\bullet \) and \(\mathfrak{T }_{\alpha ,\alpha ^{\prime }}^\bullet \) the trees \(\mathcal{T }_{\alpha }\) and \( \mathfrak{T }_{ \alpha , \alpha ^{\prime }}\) rooted at the point \(X_{0}\). Since \( \mathfrak{T }_{\alpha ,\alpha ^{\prime }}\) is a closed subtree of \( \mathcal{T }_{\alpha }\), for every \(x \in \mathcal{T }_{\alpha }\) there exists a unique point \( \text{ Proj}(x) \in \mathfrak{T }_{\alpha ,\alpha ^{\prime }}\) minimizing the distance to \(x\) in \( \mathfrak{T }_{\alpha ,\alpha ^{\prime }}\). We then consider the measure \( \mu _{\alpha ,\alpha ^{\prime }}\) on \( \mathfrak{T }_{\alpha ,\alpha ^{\prime }}\) obtained as the push-forward of \(\mu _{\alpha }\) under the map \(\text{ Proj}(\cdot )\).

Proposition 13

The probability measure \(\mu _{\alpha ,\alpha ^{\prime }}\) is purely atomic and supported by the skeleton of \(\mathfrak{T }_{\alpha ,\alpha ^{\prime }}\).

Proof

If \( x \notin \mathfrak{T }_{\alpha ,\alpha ^{\prime }}\), then a moment of thought using the definition of the pruning procedure shows that \(\text{ Proj}(x)\) is a branch point of \(\mathfrak{T }_{\alpha ,\alpha ^{\prime }}\). Thus the image of the measure \(\mu _{\alpha }\) restricted to \( \mathcal{T }_{\alpha } \backslash \mathfrak{T }_{\alpha ,\alpha ^{\prime }}\) is supported by the countably many branch points of \( \mathcal{T }_{\alpha }\) that belong to \( \mathfrak{T }_{\alpha ,\alpha ^{\prime }}\). The statement of the proposition follows, since \( \mu _{\alpha }( \mathcal{T }_{\alpha } \backslash \mathfrak{T }_{\alpha ,\alpha ^{\prime }}) =1\) by Proposition 10.

The goal of this section is to show that \((\mathfrak{T }_{\alpha ,\alpha ^{\prime }}^{\bullet },\mu _{\alpha ,\alpha ^{\prime }})\) is the genealogical tree of a dissipative self-similar fragmentation process. To make this precise, we start by constructing its dislocation measure \(\nu _{\alpha ,\alpha ^{\prime }}\) from \(\nu _{\alpha }\). For this, we design a procedure that randomly extracts certain fragments from a dislocation of the unit mass.

Let \( \mathbf s = (s_{1}, s_{2}, \ldots ) \in \mathcal S \). We first consider a size-biased random reordering \( \mathbf s ^* = (s_{1}^*, s_{2}^*,\ldots )\) of this sequence, from which we will extract a subsequence \( ( \widehat{s^*_{i}})_{i \ge 1}\) as follows. When \(\alpha ^{\prime }=2\) the extraction is deterministic: set \(\widehat{s^*_{1}}= s_{1}^* \), \(\widehat{s_{2}^*}= s_{2}^*\) and \(\widehat{s_{n}^*}=0\) for \(n\ge 3\). For \(\alpha ^{\prime }<2\), we use extra randomness together with the probabilities \(p_{\alpha ,\alpha ^{\prime },d,d^{\prime }} \) defined by (1) in the Introduction. We proceed recursively on \(n\) as follows:

  (i)

    \(\widehat{s^*_{1}}= s_{1}^* \) and \(\widehat{s_{2}^*}= s_{2}^*\),

  (ii)

    for \(n \ge 3\), let \(d=n\) and \(d^{\prime }= \#\{ 1 \le i \le n-1 : \widehat{s_{i}^*}=s_{i}^*\}+1\) and put

    $$\begin{aligned} \widehat{s_{n}^*}=s_{n}^* \quad \text{with probability } p_{\alpha ,\alpha ^{\prime },d,d^{\prime }}, \qquad \widehat{s_{n}^*}= 0 \quad \text{otherwise}. \end{aligned}$$

The dislocation measure \(\nu _{\alpha ,\alpha ^{\prime }}\) is then defined for all test functions \(f:\mathcal{S } \rightarrow \mathbb{R }\) by

$$\begin{aligned} \int _{\mathcal{S } }f(\mathbf s )\nu _{\alpha ,\alpha ^{\prime }}(\text{ d} \mathbf s ): = \int _{ \mathcal S } \nu _\alpha ( \text{ d} \mathbf s ) \, \mathbb{E }\left[f\big ( \downarrow ( \widehat{s_{i}^*})\big )\right]\!, \end{aligned}$$
(20)

where we recall that the symbol \( \downarrow \) means reordering in decreasing order. Note that \(\nu _{\alpha ,\alpha ^{\prime }}\) is dissipative since \(\alpha <\alpha ^{\prime }\).
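In computational terms, the extraction of \(( \widehat{s_{i}^*})\) from a single dislocation can be sketched as follows. This is a minimal illustration only: the probabilities \(p_{\alpha ,\alpha ^{\prime },d,d^{\prime }}\) of (1) are not reproduced here, so the sketch takes an arbitrary keep-probability function \(p(d,d^{\prime })\) as an input.

```python
import random

def size_biased_reorder(s, rng):
    """Return a size-biased random reordering of a finite mass sequence s."""
    s, out = list(s), []
    while s:
        r, acc = rng.random() * sum(s), 0.0
        for i, m in enumerate(s):
            acc += m
            if r <= acc:
                out.append(s.pop(i))
                break
        else:  # guard against floating-point round-off
            out.append(s.pop())
    return out

def extract(s, p=None, alpha_prime=1.5, rng=random):
    """Thin a dislocation s as in the text: size-biased reorder, always keep
    the first two picks; for n >= 3 keep the n-th pick with probability
    p(d, d') where d = n and d' = 1 + #(kept among the first n-1 picks).
    `p` stands in for p_{alpha,alpha',d,d'} of Eq. (1), not given here."""
    star = size_biased_reorder(s, rng)
    kept = star[:2] + [0.0] * max(0, len(star) - 2)
    if alpha_prime < 2 and p is not None:      # alpha' = 2: deterministic case
        for n in range(3, len(star) + 1):
            d, dp = n, 1 + sum(1 for m in kept[:n - 1] if m > 0)
            if rng.random() < p(d, dp):
                kept[n - 1] = star[n - 1]
    return sorted(kept, reverse=True)          # the decreasing reordering
```

Applying `extract` to a finite configuration with \(p \equiv 1\) keeps every fragment, while the default \(\alpha ^{\prime }=2\) branch keeps exactly the first two size-biased picks, as in the deterministic extraction above.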

Our goal now is to prove the following result, which clearly leads to Proposition 4. The proof does not rely on the particular expression of the dislocation measure of the stable tree, and the result could be generalized to any fragmentation tree pruned by a similar procedure.

Proposition 14

The process \(F_{\alpha ,\alpha ^{\prime }}\) obtained by considering for each \(t \ge 0\) the decreasing reordering of the

$$\begin{aligned} \mu _{\alpha ,\alpha ^{\prime }}\text{-masses of the connected components of } \big \{v \in \mathfrak{T }_{\alpha ,\alpha ^{\prime }}^{\bullet } : \text{ ht}(v)>t\big \} \end{aligned}$$

is a self-similar fragmentation, with index \(-\bar{\alpha }\) and dislocation measure \(\nu _{\alpha ,\alpha ^{\prime }}\).

To be coherent with the notation introduced for the fragmentation \( F_{\alpha }\) of the stable tree \(\mathcal{T }_{\alpha }^{\bullet }\), we denote by \( \mathcal{T }_{\alpha , \alpha ^{\prime }}^i(t), i \ge 1\) the open subtrees forming the connected components of \(\{ v \in \mathfrak{T }_{\alpha ,\alpha ^{\prime }}^\bullet : \text{ ht}(v) >t\}\), so that the elements of \(F_{\alpha ,\alpha ^{\prime }}(t)\) satisfy for \(i \ge 1\),

$$\begin{aligned} F_{ \alpha , \alpha ^{\prime }}^i(t) = \mu _{\alpha ,\alpha ^{\prime }} ( \mathcal{T }_{\alpha ,\alpha ^{\prime }} ^i(t)). \end{aligned}$$

Note that for each \(i \ge 1\), there exists a unique integer \(j \ge i\) such that \( \mathcal{T }_{\alpha , \alpha ^{\prime }}^i(t) \subset \mathcal{T }_{\alpha }^j(t)\), and consequently

$$\begin{aligned} F_{ \alpha , \alpha ^{\prime }}^i(t) = \mu _{\alpha } ( \mathcal{T }_{\alpha } ^j(t))=F_{\alpha }^j(t). \end{aligned}$$

We then say that the mass \(F_{\alpha }^j(t)\) contributes to \(F_{ \alpha , \alpha ^{\prime }}(t)\). Note that \(F_{\alpha }^j(t)\) contributes to \(F_{ \alpha , \alpha ^{\prime }}(t)\) if and only if \(\mathcal{T }_{\alpha }^j(t) \cap \mathfrak{T }_{\alpha ,\alpha ^{\prime }} \ne \emptyset \).

Proof (Sketch)

A complete proof of the last proposition would be technical, and we deliberately choose to stay rather informal. We first use the pruning construction of \(\mathfrak{T }_{\alpha ,\alpha ^{\prime }}\) to give a practical criterion to decide which of the elements of \(F_{\alpha }(t)\) contribute to \(F_{\alpha ,\alpha ^{\prime }}(t)\). Let \(F_{\alpha }^i(t)\) be an element of \(F_{\alpha }(t)\) and consider the smallest \(k \ge 1\) such that \(X_{k} \in \mathcal{T }_{\alpha }^i(t)\). Then \( F_{\alpha }^i (t)\) contributes to \(F_{\alpha ,\alpha ^{\prime }}(t)\) if and only if \(X_{k} \in \text{ Prun}_{\alpha ,\alpha ^{\prime }}( \mathcal{T }_{\alpha } ; X_{0}, \ldots , X_{k})\), that is, if \(X_k\) is a blue leaf, with the color coding of Sect. 4.1. Consider then a jump of the process \(F_{\alpha }\): pick a time \(t\) such that a fragment \(F_{\alpha }^{i}(t-)\) splits into a countable collection of masses

$$\begin{aligned} F_{\alpha }^{j}(t) \quad \text{for } j \in J=\{ \text{indices of the offspring of } i \text{ at time } t\}. \end{aligned}$$

In geometric terms, there exists a branch point at height \(t\) in \( \mathcal{T }_{\alpha }^\bullet \) that branches into the subtrees \( \mathcal{T }_{\alpha }^j(t), j \in J\). Let us then assume that \(F_{\alpha }^i(t-)\) contributes to \(F_{\alpha ,\alpha ^{\prime }}(t-)\). This means that during the pruning operation the first leaf \(X_{k}\) falling into \( \mathcal{T }_{\alpha }^i(t-)\) is blue. Now, let us condition on that fact and on \( (F_{\alpha }(s), s \le t)\) and set

$$\begin{aligned} s_{j} = \frac{\mu _{\alpha }( \mathcal{T }_{\alpha }^j(t))}{\mu _{\alpha }( \mathcal{T }_{\alpha }^i(t-))} \quad \text{ for}\,j\in J. \end{aligned}$$

Our goal is to show that the relative masses \(s_k\) of the fragments \(F_\alpha ^k(t), k \in J\), that will contribute to \( F_{\alpha ,\alpha ^{\prime }}(t)\) are distributed as \((\widehat{s_j^*})_{j \in J}\). We denote by \(Y_{k}, k \ge 1\), the leaves among \(X_{k}, k \ge 1\), that belong to \( \mathcal{T }_{\alpha } ^i(t-)\). Clearly, under our conditioning, the \((Y_{k})_{k \ge 1}\) are i.i.d. sampled according to \(\mu _{\alpha }(\ \cdot \mid \mathcal{T }_{\alpha }^i(t-))\). Thus, if \( \mathcal{T }_{\alpha }^{a_{1}}(t), \mathcal{T }_{\alpha }^{a_{2}}(t), \ldots \) are the first, second, \(\ldots \) subtrees into which one of the points \(Y_{k}\) falls, then the sequence \(s_{a_{1}}, s_{a_{2}}, \ldots \) is just a size-biased ordering of the sequence \((s_{j})_{j \in J}\),

$$\begin{aligned} (s_{a_{n}})_{n\ge 1} = (s_j^*)_{j\in J}. \end{aligned}$$

Moreover, for every \(n \ge 1\), the first leaf that falls into \( \mathcal{T }_{\alpha } ^{a_{n}}(t)\) is kept in the pruning construction (in other words, is blue) with probability \(p_{\alpha ,\alpha ^{\prime },d,d^{\prime }}\), where \(d = a_{n}\) and \(d^{\prime }=\#\{ k \le n-1 : \mathcal{T }_{\alpha }^{a_{k}}(t) \cap \mathfrak{T }_{\alpha ,\alpha ^{\prime }} \ne \varnothing \} +1\). From this observation and the definition of \((\widehat{s_{j}^*})\) it should be clear that the \(s_{k}\)’s, \(k \in J\), corresponding to offspring subtrees of \(\mathcal{T }^i_{\alpha }(t-)\) intersecting \(\mathfrak{T }_{\alpha ,\alpha ^{\prime }}\) are distributed as \(\widehat{s_{j}^*}, j \in J\), as desired. Obviously this procedure has to be carried out for each jump of the fragmentation process. It is then possible to conclude by using the Poisson point process construction of the fragmentation \(F_{\alpha }\). One important point is to notice that the time-changes used to pass to the homogeneous counterparts of \(F_{\alpha }\) and \(F_{\alpha ,\alpha ^{\prime }}\) are identical in both fragmentations for each element of \(F_{\alpha }\) that contributes to \(F_{\alpha ,\alpha ^{\prime }}\). We leave the subtleties to the careful reader.\(\square \)

5.3 From the dissipative fragmentation to a randomized conservative fragmentation

Since the random tree \(\mathfrak{T }_{\alpha ,\alpha ^{\prime }}\) is distributed as the stable tree \(\mathcal{T }_{\alpha ^{\prime }}\) (up to a random scaling) it supports a uniform mass measure distributed as \(\mu _{\alpha ^{\prime }}\). However, the measure \(\mu _{\alpha ,\alpha ^{\prime }}\) (see the last section) does not charge the set of leaves of the tree \( \mathfrak{T }_{\alpha ,\alpha ^{\prime }}\) but only its skeleton, and so is not, a priori, connected to \(\mu _{\alpha ^{\prime }}\). We will see in this section how to construct from \(\mu _{\alpha ,\alpha ^{\prime }}\) a measure on the leaves of \(\mathfrak{T }_{\alpha ,\alpha ^{\prime }}^{\bullet }\) that will be related to \(\mu _{\alpha ^{\prime }}\).

Keeping the mass. Recall that the measure \(\nu _\alpha \) is conservative, that is, mass is conserved at each splitting. However there is a loss of mass in \(F_{\alpha }\) by formation of dust (due to the negative self-similarity index). That is why, in this section, we rather focus on the homogeneous counterpart \(H_\alpha \) of the fragmentation \(F_{\alpha }\), which is obtained by applying suitable time-changes in \(F_{\alpha }\), see Sect. 5.1. In this case we have for all \(t \ge 0\)

$$\begin{aligned} \sum _{i\ge 1} H_{\alpha }^i(t) =1. \end{aligned}$$

Obviously, this conservation of mass does not hold for homogeneous fragmentations with a dissipative dislocation measure. However there is then a natural way to define a random measure called the Malthusian measure which is stochastically conserved [7, 8, 26]. Although this new random measure is not a probability measure, it has a finite total mass with mean 1.

Let us now introduce this object rigorously. For this we use results established by Stephenson [26] for general dissipative fragmentation trees, following the ideas of Bertoin [7] for dissipative fragmentations with finite dislocation measures. From now on, we focus on the case of \( \nu _{\alpha ,\alpha ^{\prime }}\) (see (20)), that is, we consider \(H_{\alpha ,\alpha ^{\prime }}\) the homogeneous counterpart of the process \(F_{\alpha ,\alpha ^{\prime }}\). The starting point is the existence of a Malthusian index \(p^*_{\alpha ,\alpha ^{\prime }}\) which is characterized by

$$\begin{aligned} \int _{\mathcal{S }} \bigg (1-\sum _{i \ge 1} s_i^{p^*_{\alpha ,\alpha ^{\prime }}}\bigg ) \nu _{\alpha ,\alpha ^{\prime }}(\text{ d} \mathbf s )=0. \end{aligned}$$

See below for a discussion of the existence and computation of this exponent. Once the existence of \(p^*_{\alpha ,\alpha ^{\prime }}\) is granted, the process

$$\begin{aligned} \text{ Mart}_{1,0}(s) = \sum _{i\ge 1} \big (H^i_{ \alpha , \alpha ^{\prime }}(s)\big )^{p^*_{\alpha ,\alpha ^{\prime }}}, \quad s\ge 0 \end{aligned}$$

is a positive martingale, see e.g. [7, 26]. More generally, for each \(t \ge 0\) and \(i \in \mathbb N \), the process

$$\begin{aligned} \text{ Mart}_{i,t}(s) = \sum _{i \rightarrow j} \big ( H^j_{ \alpha , \alpha ^{\prime }}(s)\big )^{p^*_{\alpha ,\alpha ^{\prime }}}\!, \quad s\ge t, \end{aligned}$$
(21)

where the sum is over all fragments whose ancestor at time \(t\) is \(H^i_{ \alpha , \alpha ^{\prime }}(t)\), is a positive càdlàg martingale which therefore converges almost surely towards a limiting value denoted by \( \text{ Mart}_{i,t}\), \(i \in \mathbb N \), \(t \ge 0\). Actually, one can check that

$$\begin{aligned} \mathbb{E } \left[ \text{ Mart}_{1,0}\right]=1 \end{aligned}$$
(22)

(this will be done during the proof of Theorem 15), which then ensures that the random variables \(\text{ Mart}_{i,t},i \in \mathbb{N },t \ge 0\) can be chosen so that the convergences hold for all \(i,t\), almost surely (see [26]). Besides, by the homogeneity property of the fragmentation process \(H_{ \alpha , \alpha ^{\prime }}\), we clearly have for all \(t\ge 0\),

$$\begin{aligned} (\text{ Mart}_{i,t},i\ge 1) = \left((H^i_{ \alpha , \alpha ^{\prime }}(t))^{p^*_{\alpha ,\alpha ^{\prime }}}\text{ Mart}^{(i,t)}_{1,0},i \ge 1\right)\!, \end{aligned}$$

where the r.v. \(\text{ Mart}^{(i,t)}_{1,0}, i\ge 1\), are independent, all distributed as \(\text{ Mart}_{1,0}\), and also independent of \((H^i_{ \alpha , \alpha ^{\prime }}(t),i \ge 1)\). Finally, recall the notation \(\mathcal{T }_{\alpha }^i(t)\) and \(\mathcal{T }_{\alpha ,\alpha ^{\prime }}^i(t)\), \(i \ge 1\), used in the last section for the open subtrees above level \(t\) in \( \mathcal{T }_\alpha ^\bullet \) and \( \mathfrak{T }_{ \alpha , \alpha ^{\prime }}^\bullet \) respectively. We similarly denote by \( \mathcal H _{\alpha }^{i}(t)\) and \( \mathcal H _{\alpha ,\alpha ^{\prime }}^{i}(t)\) the corresponding open subtrees of \( \mathcal{T }_\alpha ^\bullet \) and \( \mathfrak{T }_{ \alpha , \alpha ^{\prime }}^\bullet \) in the homogeneous setting, so that

$$\begin{aligned} H_\alpha ^i(t)= \mu _\alpha ( \mathcal H _\alpha ^i(t)), \qquad H_{\alpha ,\alpha ^{\prime }}^i(t)= \mu _{\alpha ,\alpha ^{\prime }}( \mathcal H _{\alpha ,\alpha ^{\prime }}^i(t)). \end{aligned}$$

The set of subtrees \(\{\mathcal{H }_{\alpha }^i(t), i \in \mathbb{N },t \ge 0\}\) is in bijection with the set of subtrees \(\{\mathcal{T }_{\alpha }^i(t), i \in \mathbb{N },t \ge 0\}\), and the same holds for the subtrees indexed by \(\alpha ,\alpha ^{\prime }\). We now have the material to build the Malthusian measure \(\mu ^*_{\alpha , \alpha ^{\prime }}\). On the event of probability 1 on which all the martingales (21) converge, we set

$$\begin{aligned} \mu _{\alpha ,\alpha ^{\prime }}^*\left(\mathcal{H }_{\alpha ,\alpha ^{\prime }}^i(t)\right)=\text{ Mart}_{i,t} \end{aligned}$$
(23)

for all \(i \in \mathbb{N }\) and \(t \ge 0\). By [26], this indeed defines uniquely a \(\sigma \)-finite measure \(\mu ^*_{\alpha ,\alpha ^{\prime }}\) on the tree \(\mathfrak{T }_{\alpha ,\alpha ^{\prime }}^{\bullet }\), which is fully supported by the set of leaves. Note that the set \(\{\mathcal{H }_{\alpha ,\alpha ^{\prime }}^i(t), i \in \mathbb{N }, t \ge 0\}\) generates the Borel \(\sigma \)-field on \(\mathfrak{T }_{\alpha ,\alpha ^{\prime }}\).

Computing \(p^*_{\alpha ,\alpha ^{\prime }}\). Apart from the Brownian case \(\alpha ^{\prime }=2\), it is not easy to calculate the Malthusian index \(p^*_{\alpha ,\alpha ^{\prime }}\) directly from the expression (20) of \(\nu _{\alpha ,\alpha ^{\prime }}\). However we can use information from the proof of Proposition 2, based on the coupling in Marchal’s construction, to get that

$$\begin{aligned} p^*_{\alpha ,\alpha ^{\prime }}=\frac{\bar{\alpha } }{\bar{\alpha } ^{\prime }}. \end{aligned}$$

This will be done during the proof of Theorem 15. There is another indirect way to get the value of \(p^*_{\alpha ,\alpha ^{\prime }}\), provided we know that this index exists and that the measure \(\nu _{\alpha ,\alpha ^{\prime }}\) satisfies suitable (weak) integrability properties. Indeed, on the one hand, the Hausdorff dimension of the \(\alpha ^{\prime }\)-stable tree, hence also that of \(\mathfrak{T }_{\alpha ,\alpha ^{\prime }}^{\bullet }\), is \(1/\bar{\alpha } ^{\prime }\) a.s. [13, 16]. On the other hand, according to results of [26] on general dissipative fragmentation trees, we know that the Hausdorff dimension of the \((-\bar{\alpha })\)-self-similar fragmentation tree \(\mathfrak{T }_{\alpha ,\alpha ^{\prime }}^{\bullet }\) is almost surely \(p_{\alpha ,\alpha ^{\prime }}^*/\bar{\alpha }\), provided \(\nu _{\alpha ,\alpha ^{\prime }}\) satisfies the above-mentioned integrability properties. Identifying the two expressions gives back \(p^*_{\alpha ,\alpha ^{\prime }}=\bar{\alpha }/\bar{\alpha }^{\prime }\).
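To make the characterizing equation concrete, consider a toy dislocation measure concentrated on a single dissipative configuration \(\mathbf s \) with \(\sum _i s_i <1\) (a finite measure, unlike the actual \(\nu _{\alpha ,\alpha ^{\prime }}\)). The equation \(\int _{\mathcal S }(1-\sum _i s_i^p)\,\nu (\text{ d}\mathbf s )=0\) then reduces to \(\sum _i s_i^{p^*}=1\), and since \(p \mapsto \sum _i s_i^p\) is continuous and strictly decreasing, the root can be found by bisection; a minimal sketch:

```python
def malthusian_exponent(s, tol=1e-12):
    """Solve sum_i s_i^p = 1 by bisection, for a toy dislocation measure
    nu = delta_s with s a single dissipative configuration (sum(s) < 1)."""
    f = lambda p: sum(m ** p for m in s if m > 0) - 1.0
    lo, hi = 1e-9, 1.0   # f(lo) ~ (#fragments - 1) > 0 > f(1) = sum(s) - 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
    return 0.5 * (lo + hi)
```

For instance, for \(\mathbf s =(1/2,1/4)\) the equation \(2^{-p}+4^{-p}=1\) has the closed-form root \(p^*=\log _2 \varphi \), with \(\varphi \) the golden ratio, which the bisection recovers.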

The main result of this section is the fact that \(\mathfrak{T }_{\alpha ,\alpha ^{\prime }}^{\bullet }\) endowed with the measure \(\mu ^*_{\alpha ,\alpha ^{\prime }}\) is distributed as an \(\alpha ^{\prime }\)-stable tree endowed with its uniform mass measure, up to a random scaling. Recall from (10) the definition of a generalized Mittag-Leffler random variable \(\text{ ML}_{\bar{\alpha }/\bar{\alpha }^{\prime },\bar{\alpha }}\) with parameters \(({\bar{\alpha }}/\bar{\alpha } ^{\prime },{\bar{\alpha }})\) and set \(M_{\alpha ,\alpha ^{\prime }}=(\alpha ^{\prime }/\alpha )^{1/\bar{\alpha }^{\prime }}\text{ ML}_{\bar{\alpha }/\bar{\alpha }^{\prime },\bar{\alpha }}\).

Theorem 15

There is the following identity in distribution, with \(M_{\alpha ,\alpha ^{\prime }}\) independent of \((\mathcal{T }_{\alpha ^{\prime }}^{\bullet }, \mu _{\alpha ^{\prime }})\),

$$\begin{aligned} \left(\mathfrak{T }_{\alpha ,\alpha ^{\prime }}^{\bullet }, \mathbb{E }[M_{ \alpha , \alpha ^{\prime }}] \mu ^*_{\alpha ,\alpha ^{\prime }}\right)&\overset{\mathrm{(d)}}{=}&\left( M_{ \alpha , \alpha ^{\prime }}^{ \bar{ \alpha }^{\prime }} \mathcal{T }_{\alpha ^{\prime }}^{\bullet }, M_{ \alpha , \alpha ^{\prime }} \mu _{\alpha ^{\prime }}\right)\!, \end{aligned}$$

where \(\mathbb{E }[M_{\alpha ,\alpha ^{\prime }}]=(\alpha /\alpha ^{\prime })^{1/\bar{\alpha } ^{\prime }} \bar{\alpha } ^{\prime } \Gamma \big (\bar{\alpha } \big )/\Gamma \big (\bar{\alpha } (1+1/\bar{\alpha } ^{\prime })\big )\), and where the measure \(\mu _{\alpha ^{\prime }}\) on the right-hand side of this identity actually represents the push-forward of \(\mu _{\alpha ^{\prime }}\) on \(\mathcal{T }_{\alpha ^{\prime }}\) by the homothety that sends \(\mathcal{T }_{\alpha ^{\prime }}\) onto \(M_{ \alpha , \alpha ^{\prime }}^{ \bar{ \alpha }^{\prime }} \mathcal{T }_{\alpha ^{\prime }}\).

In other words, the dissipative fragmentation tree \(\mathfrak{T }_{\alpha ,\alpha ^{\prime }}^{\bullet }\), endowed with a rescaled version of its Malthusian measure, is distributed as the genealogical tree of an \(\alpha ^{\prime }\)-stable fragmentation of the random mass \(M_{ \alpha , \alpha ^{\prime }}\). Note that the change of measures on the tree, from \(\mu _{\alpha ,\alpha ^{\prime }}\) to \(\mu ^*_{\alpha ,\alpha ^{\prime }}\), changes the index of self-similarity of the underlying fragmentation, which passes from \(-\bar{\alpha }\) to \(-\bar{\alpha } ^{\prime }\).

Proof of Theorem 15

This proof is divided into three steps. We recall that in addition to proving the identity in distribution of the statement, we also have to prove the existence of \(p^*_{\alpha ,\alpha ^{\prime }}\), as well as (22), so that the measure \(\mu ^*_{\alpha ,\alpha ^{\prime }}\) can indeed be defined via (23) on an event with probability one.

Step 1. Recall that \( \mathfrak{T }_{ \alpha , \alpha ^{\prime }} = \text{ Prun}_{ \alpha , \alpha ^{\prime }}( \mathcal{T }_{ \alpha }, (X_i)_{i\ge 0})\) and that a leaf \(X_i\) is said to be blue if it belongs to \( \mathfrak{T }_{ \alpha , \alpha ^{\prime }}\) and red otherwise. Write \( \mathbb{B }\) for the set of indices of the blue leaves and \( \mathbb{B }(n) = \mathbb{B } \cap \{0,1,2,\ldots ,n\}\). Recalling from the proof of Theorem 3 that \( \#\mathbb{B }(n)\) is the number of leaves of a chain \(\mathbf{T}_{\alpha ,\alpha ^{\prime }}\), we apply Lemma 9 to get

$$\begin{aligned} \frac{\# \mathbb{B }(n)}{n^{\bar{\alpha } /\bar{\alpha } ^{\prime }}}&\xrightarrow [n\rightarrow \infty ]{\text{ a.s.}}&\text{ ML}_{\bar{\alpha } /\bar{\alpha }^{\prime },\bar{\alpha }}, \end{aligned}$$
(24)

where \(\text{ ML}_{\bar{\alpha } /\bar{\alpha }^{\prime },\bar{\alpha }}\) has a generalized Mittag-Leffler distribution with parameters \((\bar{\alpha } /\bar{\alpha }^{\prime },\bar{\alpha })\). In particular, its mean is

$$\begin{aligned} \mathbb{E }[\text{ ML}_{\bar{\alpha } /\bar{\alpha }^{\prime },\bar{\alpha }}]=\frac{\bar{\alpha } ^{\prime } \Gamma \left(\bar{\alpha } \right)}{\Gamma \big (\bar{\alpha } \big (1+1/\bar{\alpha } ^{\prime }\big )\big )}, \end{aligned}$$

using (10) and the relation \(\Gamma (a+1)=a\Gamma (a)\) to get simplifications. Moreover, we know from (the proof of) Proposition 2 and the discussion at the end of Sect. 4.2 that there exists an \(\alpha ^{\prime }\)-stable tree \(\mathcal{T }_{\alpha ^{\prime }}\) such that

$$\begin{aligned} \mathfrak{T }_{\alpha , \alpha ^{\prime }} =\frac{\alpha ^{\prime }}{\alpha } \text{ ML}^{\bar{\alpha } ^{\prime }}_{\bar{\alpha } /\bar{\alpha }^{\prime },\bar{\alpha }}\mathcal{T }_{\alpha ^{\prime }} \end{aligned}$$

and that \((X_i)_{i \in \mathbb{B }}\) is an i.i.d. sample of the uniform mass measure on (the leaves of) \(\mathfrak{T }_{\alpha , \alpha ^{\prime }}\), which, with a slight abuse of notation, is also denoted by \(\mu _{\alpha ^{\prime }}\). Consequently,

$$\begin{aligned} \mu _{\alpha ^{\prime }} \text{ is the almost sure empirical limit of } \quad \frac{\sum _{i \in \mathbb{B }(n)} \delta _{X_i}}{\# \mathbb{B }(n)} \quad \text{ as } n \rightarrow \infty , \end{aligned}$$
(25)

which will be useful in the following. Theorem 15 will therefore be proved if we show the existence of \(p^*_{\alpha ,\alpha ^{\prime }}\) and (22) and that almost surely for all \(i \in \mathbb{N }, t \ge 0\),

$$\begin{aligned} \text{ ML}_{\bar{\alpha } /\bar{\alpha }^{\prime },\bar{\alpha }} \mu _{\alpha ^{\prime }} \left(\mathcal{H }_{\alpha ,\alpha ^{\prime }}^i(t) \right)= \mathbb{E }[\text{ ML}_{\bar{\alpha } /\bar{\alpha }^{\prime },\bar{\alpha }}]\mu _{\alpha ,\alpha ^{\prime }}^*\left(\mathcal{H }_{\alpha ,\alpha ^{\prime }}^i(t)\right)\!. \end{aligned}$$
(26)

For this, we will use (24) and (25).

Step 2. As for their self-similar counterparts, \(H_{\alpha ,\alpha ^{\prime }}\) is a sub-fragmentation of \(H_{\alpha }\). More precisely, for \(i \in \mathbb{N }\) and \(t \ge 0\), there exists a unique integer \(j\) such that \( \mathcal{H }_{\alpha ,\alpha ^{\prime }}^i(t) \subset \mathcal{H }_{\alpha }^j(t)\) and moreover,

$$\begin{aligned} H^i_{\alpha ,\alpha ^{\prime }}(t)=\mu _{\alpha ,\alpha ^{\prime }}( \mathcal{H }_{\alpha ,\alpha ^{\prime }}^i(t))=\mu _{\alpha }( \mathcal{H }_{\alpha }^j(t) )=H^j_{\alpha }(t). \end{aligned}$$

Then, similarly to (24), introduce the random variable

$$\begin{aligned} \text{ ML}_i(t)=\lim _{n \rightarrow \infty } \frac{\# \{ k \in \mathbb{B }(n) : X_{k} \in \mathcal H _{\alpha }^{j}(t)\}}{\# \{ k \le n : X_{k} \in \mathcal H _{\alpha }^{j}(t)\} ^{ \bar{ \alpha }/ \bar{\alpha }^{\prime }}} \end{aligned}$$

which, by the self-similarity property of \(\mathcal{T }_{\alpha }\) and of the blue-coloring operation, exists a.s., is distributed as \(\text{ ML}_{\bar{\alpha } /\bar{\alpha }^{\prime },\bar{\alpha }}\) and is independent of \(H_{\alpha }(t)\). Moreover, the random variables \(\text{ ML}_{i}(t), i \ge 1\), are independent. Now note the following easy but crucial fact: almost surely,

$$\begin{aligned} \text{ ML}_{\bar{\alpha } /\bar{\alpha }^{\prime },\bar{\alpha }} \mu _{\alpha ^{\prime }} \left(\mathcal{H }_{\alpha ,\alpha ^{\prime }}^i(t) \right)&= \lim _{n \rightarrow \infty } \frac{\# \mathbb{B }(n)}{n^{\bar{\alpha } /\bar{\alpha }^{\prime }}} \times \frac{\# \{ k \in \mathbb{B }(n) : X_{k} \in \mathcal H _{\alpha }^{j}(t)\} }{\# \mathbb{B }(n)} \nonumber \\&= \lim _{n \rightarrow \infty } \frac{\# \{ k \in \mathbb{B }(n) : X_{k} \in \mathcal H _{\alpha }^{j}(t)\}}{\# \{ k \le n : X_{k} \in \mathcal H _{\alpha }^{j}(t)\} ^{ \bar{ \alpha }/ \bar{\alpha }^{\prime }}} \nonumber \\&\quad \times \left(\frac{\#\{ k \le n : X_{k} \in \mathcal H _{\alpha }^{j}(t)\}}{n}\right)^{\bar{\alpha }/\bar{\alpha }^{\prime }}\nonumber \\&= \text{ ML}_i(t) \left(H^{j}_{\alpha }(t)\right)^{\bar{\alpha } /\bar{\alpha } ^{\prime }} = \text{ ML}_i(t) \left( H^i_{\alpha ,\alpha ^{\prime }}(t)\right)^{\bar{\alpha }/\bar{\alpha }^{\prime }}.\nonumber \\ \end{aligned}$$
(27)

Note also that a.s.

$$\begin{aligned} \sum _{i \ge 1}\mu _{\alpha ^{\prime }}\left(\mathcal{H }_{\alpha ,\alpha ^{\prime }}^i(t) \right)=1, \quad \forall t \ge 0. \end{aligned}$$

Indeed, since \(H_\alpha \) is the homogeneous version of the fragmentation \(F_\alpha \), the union \(\cup _i \mathcal H _\alpha ^i(t)\) contains all the leaves of \( \mathcal{T }_\alpha \) but the root, \(\forall t \ge 0\) a.s. Consequently, \(\cup _i \mathcal H _{\alpha ,\alpha ^{\prime }}^i(t)\) also contains all the leaves (except the root) of \( \mathfrak{T }^\bullet _{ \alpha , \alpha ^{\prime }}\), hence the last display. Together with (27), this leads to

$$\begin{aligned} \text{ ML}_{\bar{\alpha } /\bar{\alpha }^{\prime },\bar{\alpha }}=\sum _{i \ge 1} \text{ ML}_i(t) \left( H_{\alpha ,\alpha ^{\prime }}^i(t)\right)^{\bar{\alpha } /\bar{\alpha }^{\prime }}\!, \quad \forall t\ge 0 \text{ a.s.} \end{aligned}$$
(28)

Taking expectations and recalling that \(0<\mathbb{E }[\text{ ML}_{\bar{\alpha } /\bar{\alpha }^{\prime },\bar{\alpha }}]<\infty \), this implies that

$$\begin{aligned} 1=\mathbb{E } \bigg [\sum _{i \ge 1} \left( H_{\alpha ,\alpha ^{\prime }}^i(t)\right)^{\bar{\alpha } /\bar{\alpha }^{\prime }}\bigg ]= \exp \bigg (- t\int \limits _{\mathcal{S } }\bigg (1-\sum _{i\ge 1} s_i^{\bar{\alpha }/\bar{\alpha }^{\prime }} \bigg ) \nu _{\alpha ,\alpha ^{\prime }} (\text{ d} \mathbf s )\bigg ), \end{aligned}$$

using for the second equality that \(\nu _{\alpha ,\alpha ^{\prime }}\) is the dislocation measure of \(H_{\alpha ,\alpha ^{\prime }}\) and then the well-known fact in the theory of homogeneous fragmentations

$$\begin{aligned} \mathbb{E } \left[\sum _{i \ge 1} \left( H_{\alpha ,\alpha ^{\prime }}^i(t)\right)^{q}\right]= \exp \left(- t\int \limits _{\mathcal{S } }\left(1-\sum _{i\ge 1} s_i^{q}\right) \nu _{\alpha ,\alpha ^{\prime }} (\text{ d} \mathbf s )\right), \quad \forall q \in \mathbb{R }, \end{aligned}$$

see e.g. Bertoin [5]. This shows that \(p^*_{\alpha ,\alpha ^{\prime }}\) indeed exists and is equal to \(\bar{\alpha } /\bar{\alpha }^{\prime }\). In particular, we can now consider the martingales (21).
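This well-known identity is easy to check by simulation for a toy homogeneous fragmentation with a finite dislocation measure \(\nu = \lambda \, \delta _{\mathbf s }\): each fragment waits an exponential time of rate \(\lambda \) and then splits into relative masses \(\mathbf s \). The configuration used below is purely illustrative and is of course not \(\nu _{\alpha ,\alpha ^{\prime }}\):

```python
import math, random

def mc_moment(s, rate, q, t, n_runs=5000, seed=0):
    """Monte Carlo estimate of E[sum_i H_i(t)^q] for the homogeneous
    fragmentation with dislocation measure rate * delta_s."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_runs):
        frags = [(1.0, rng.expovariate(rate))]   # (mass, next split time)
        acc = 0.0
        while frags:
            m, when = frags.pop()
            if when > t:
                acc += m ** q                    # fragment still alive at t
            else:                                # split into relative masses s
                frags += [(m * f, when + rng.expovariate(rate)) for f in s]
        total += acc
    return total / n_runs

def predicted_moment(s, rate, q, t):
    """exp(-t * integral of (1 - sum s_i^q) d nu) for nu = rate * delta_s."""
    return math.exp(-t * rate * (1.0 - sum(f ** q for f in s)))
```

With the dissipative configuration \(\mathbf s =(0.6, 0.3)\), the estimate matches the exponential formula within Monte Carlo error, both for \(q=1\) (decay of the total mass) and for non-integer \(q\).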

Step 3. It remains to show (22) and then use (27) to get (26). Following an idea of [8], we get from (28) and the definition of the martingale \((\text{ Mart}_{1,0}(t),t \ge 0)\) that

$$\begin{aligned}&\mathbb{E } \left[\big (\text{ ML}_{\bar{\alpha } /\bar{\alpha }^{\prime },\bar{\alpha }}-\mathbb{E }[\text{ ML}_{\bar{\alpha } /\bar{\alpha }^{\prime },\bar{\alpha }}]\text{ Mart}_{1,0}(t) \big )^2 \right]\\&\quad = \mathbb{E }\left[ \left(\sum _{i\ge 1} \left( H_{\alpha ,\alpha ^{\prime }}^i(t)\right)^{\bar{\alpha } /\bar{\alpha } ^{\prime }}\left(\text{ ML}_i(t)-\mathbb{E }[\text{ ML}_{\bar{\alpha } /\bar{\alpha }^{\prime },\bar{\alpha }}] \right)\right)^2\right] \\&\quad =\text{ Var}(\text{ ML}_{\bar{\alpha } /\bar{\alpha }^{\prime },\bar{\alpha }})\mathbb{E } \left[ \sum _{i \ge 1} \left( H_{\alpha ,\alpha ^{\prime }}^i(t)\right)^{2\bar{\alpha } /\bar{\alpha } ^{\prime }} \right] \\&\quad =\text{ Var}(\text{ ML}_{\bar{\alpha } /\bar{\alpha }^{\prime },\bar{\alpha }}) \exp \left(- t\int \limits _{\mathcal{S } }\Bigg (1-\sum _{i\ge 1} s_i^{2\bar{\alpha } /\bar{\alpha } ^{\prime }} \Bigg ) \nu _{\alpha ,\alpha ^{\prime }} (\text{ d} \mathbf s )\right), \end{aligned}$$

where the second equality comes from the fact that the random variables \(\text{ ML}_i(t),i \ge 1\), are i.i.d., distributed as \(\text{ ML}_{\bar{\alpha } /\bar{\alpha }^{\prime },\bar{\alpha }}\), and independent of \(H_{\alpha }(t)\). The function

$$\begin{aligned} q \mapsto \int _{\mathcal{S }}\Bigg (1-\sum _{i\ge 1} s_i^q \Bigg ) \nu _{\alpha ,\alpha ^{\prime }} (\text{ d} \mathbf s ) \end{aligned}$$

is strictly increasing on its domain of definition and is equal to 0 for \(q=\bar{\alpha } /\bar{\alpha } ^{\prime }\), as seen above. Hence, noticing that \(\text{ Var}(\text{ ML}_{\bar{\alpha } /\bar{\alpha }^{\prime },\bar{\alpha }})<\infty \),

$$\begin{aligned} \mathbb{E } \left[\big (\text{ ML}_{\bar{\alpha } /\bar{\alpha }^{\prime },\bar{\alpha }}-\mathbb{E }[\text{ ML}_{\bar{\alpha } /\bar{\alpha }^{\prime },\bar{\alpha }}]\text{ Mart}_{1,0}(t) \big )^2 \right] \rightarrow 0\quad \text{ as}\,t \rightarrow \infty , \end{aligned}$$

which implies the almost sure equality

$$\begin{aligned} \text{ ML}_{\bar{\alpha } /\bar{\alpha }^{\prime },\bar{\alpha }}=\mathbb{E }[\text{ ML}_{\bar{\alpha } /\bar{\alpha }^{\prime },\bar{\alpha }}] \text{ Mart}_{1,0}, \end{aligned}$$

since moreover \(\text{ Mart}_{1,0}(t) \rightarrow \text{ Mart}_{1,0}\) a.s. Note that this implies the expected equality (22), hence the existence of \(\mu ^*_{\alpha ,\alpha ^{\prime }}\). Similarly, by the homogeneity property of the fragmentation \(H_{\alpha }\), for each \((i,t)\),

$$\begin{aligned} \text{ ML}_{i}(t)\left( H_{\alpha ,\alpha ^{\prime }}^i(t)\right)^{\bar{\alpha } /\bar{\alpha }^{\prime }}=\mathbb{E }[\text{ ML}_{\bar{\alpha } /\bar{\alpha }^{\prime },\bar{\alpha }}] \text{ Mart}_{i,t}, \quad \text{ a.s.}, \end{aligned}$$

which, together with (27), finally gives the expected

$$\begin{aligned} \text{ ML}_{\bar{\alpha } /\bar{\alpha }^{\prime },\bar{\alpha }} \mu _{\alpha ^{\prime }}\left(\mathcal{H }_{\alpha ,\alpha ^{\prime }}^i(t) \right) \!=\!\mathbb{E }[\text{ ML}_{\bar{\alpha } /\bar{\alpha }^{\prime },\bar{\alpha }}] \text{ Mart}_{i,t}=\mathbb{E }[\text{ ML}_{\bar{\alpha } /\bar{\alpha }^{\prime },\bar{\alpha }}] \mu _{\alpha ,\alpha ^{\prime }}^*\left(\mathcal{H }_{\alpha ,\alpha ^{\prime }}^i(t)\right) \quad \text{ a.s.} \end{aligned}$$

Note that this identity is proved for each fixed \((i,t)\) almost surely, and we actually want it almost surely for all \((i,t) \in \mathbb{N } \times [0,\infty )\). This is indeed true: for each \((i,t)\), consider a sequence \((t_n)\) with \(t_n \in \mathbb{Q } \cap [0, \infty )\) decreasing towards \(t\); then one has

$$\begin{aligned} \mathcal{H }_{\alpha ,\alpha ^{\prime }}^i(t)= \bigcup _{n \ge 1} \bigcup _{j \in J_n(i,t)} \mathcal{H }_{\alpha ,\alpha ^{\prime }}^{j}(t_n) \end{aligned}$$

where \(J_n(i,t)\) is the set of all indices \(j\) such that the mass \(H_{\alpha ,\alpha ^{\prime }}^j(t_n)\) is a descendant of \(H_{\alpha ,\alpha ^{\prime }}^i(t)\). The above union on \(j \in J_n(i,t)\) is disjoint, whereas that on \(n\) is an increasing union. Consequently, a measure on \(\mathfrak{T }_{\alpha ,\alpha ^{\prime }}\) is entirely characterized by the weights it assigns to the subtrees \(\mathcal{H }_{\alpha ,\alpha ^{\prime }}^j(q), j \in \mathbb{N }, q \in \mathbb{Q } \cap [0,\infty )\). This ends the proof (and the paper).