Published in: Quality & Quantity 3/2022

Open Access 08.07.2021

Mediation analysis in recursive systems of distributed-lag linear regressions

Written by: Alessandro Magrini



Abstract

Recursive systems of linear regressions are a consolidated methodology for mediation analysis, allowing causal effects of interest to be determined in closed form from the regression coefficients. In a dynamic perspective, distributed lags can be added to each regression in order to represent causal effects persisting over several periods. However, mediation analysis in the dynamic case is challenging, because causal effects depend on the time lag, and a general procedure to compute their lag distribution from the regression coefficients is currently missing. In this paper, we formalize the rules to perform mediation analysis in recursive systems of distributed-lag linear regressions, here called Distributed-lag Linear Recursive Models (DLRMs). First, mediation analysis is based on the Directed Acyclic Graph (DAG) representation of the DLRM; then, a DAG-free algorithm is proposed to improve computational efficiency. Our DAG-free algorithm is applied to a DLRM representing the impact pathways of agricultural research expenditure towards poverty reduction in rural areas.
Notes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

1 Introduction

The use of recursive systems of linear regressions for mediation analysis has a long history rooted in path analysis (Wright 1934) and enriched by several contributions from the 1940s to the 1980s (Haavelmo 1943; Koopmans et al. 1950; Wold 1954, 1960; Duncan 1966; Wermuth 1980; Baron and Kenny 1986). The term ‘recursive’ means that the regressions imply an acyclic dependence structure, i.e., no variable can influence itself, and that the random errors are uncorrelated, i.e., further variables are not important to describe the relationships among the considered ones. Recursive systems of linear regressions are known by several different names, including simultaneous equation models (Greene 2008, Chapter 13), linear recursive equations (Wermuth 1980) and linear Markovian models (Pearl 2000; Wermuth and Cox 2015). Here, we refer to them as Linear Recursive Models (LRMs). In a LRM, it is possible to determine causal effects of interest in closed form based on the regression coefficients (Fox 1985; Sobel 1990; Chen 2017), making mediation analysis simple to perform. For this reason, LRMs are still very popular in many applied research fields, including econometrics, psychology, sociology and medicine.
In a dynamic perspective, lagged instances of the same explanatory variable (distributed-lags) can be added to each regression in order to represent causal effects persisting over several periods. The resulting model is here called Distributed-lag Linear Recursive Model (DLRM). The use of cross-sectional mediation models like LRMs for mediation analysis has been heavily criticized in recent years, because causal effects persisting over several periods may not be identified (Cole and Maxwell 2003; Maxwell and Cole 2007; Maxwell et al. 2011; Mitchell and Maxwell 2013). Longitudinal mediation models like DLRMs represent an appropriate methodology in the dynamic case, but they still entail several critical issues threatening the consistency of the estimates and thus the identification of causal effects. These issues include: (i) non-stationary time series, (ii) unmeasured confounding, (iii) lag attribution.
Non-stationary time series involve causal effects depending not only on the time lag, but also on the time point, thus they may not be identified. In order to ensure a consistent estimation of causal effects in the dynamic case, the time series should be stationary at least in a weak sense, i.e., they should have expected value and autocorrelation function constant in time, a property often called stability (Cole and Maxwell 2003; Maxwell and Cole 2007; Mitchell and Maxwell 2013).
Unmeasured confounding is the most common source of inconsistency in cross-sectional mediation analysis, which typically arises from omitted variables and/or omitted relationships and may generate correlation between some random errors. Addressing unmeasured confounding in the dynamic case is more difficult, because omitting a variable or a relationship between two variables may also generate autoregressive effects or autocorrelated errors (Cole and Maxwell 2003; Goldsmith et al. 2018).
A further potential source of inconsistency in longitudinal mediation models is the determination of the number of lags to consider for each direct causal effect, a problem known as lag attribution. Specifying an insufficient number of lags for a direct causal effect equates to the omission of relevant causes, which in this case are represented by specific temporal instances of a variable in the model. Such omission may lead to lack of identification not only for the considered direct causal effect, but also for all the indirect and/or total causal effects deriving from it (Cain et al. 2018; Reichardt 2011; Goldsmith et al. 2018).
This paper contributes to the literature on dynamic mediation analysis by providing a general procedure for DLRMs to compute the lag distribution of causal effects of interest based on the regression coefficients. Borrowing from the methodology for the computation of causal effects in LRMs, mediation analysis in DLRMs is first based on the Directed Acyclic Graph (DAG) encoding the dependence structure among the variables. However, since the DAG of a DLRM includes all the temporal instances of each variable, DAG-based computation of causal effects may be extremely expensive for high time lags. For this reason, we also propose a DAG-free algorithm to improve computational efficiency. Our proposal assumes that the DAG and the coefficients are known a priori or have been consistently estimated from data; topics like causal discovery and identification of causal effects are therefore beyond the scope of this paper.
This paper is organized as follows. In Sect. 2, the current methodology for mediation analysis in LRMs is reviewed. Although this section does not contain novelties with respect to the existing literature, it introduces the notation used throughout the paper and provides a starting point for the definition of DLRMs. In Sect. 3, we formalize the class of DLRMs, then we derive DAG-based rules for mediation analysis and present the DAG-free algorithm. In Sect. 4, our algorithm is applied to a DLRM representing the impact pathways of agricultural research expenditure towards poverty reduction in rural areas. Section 5 includes concluding remarks and considerations on future developments.

2 Mediation analysis in linear recursive models

Let \(X_1,\ldots ,X_p\) be a set of \(p\ge 2\) quantitative random variables. A Linear Recursive Model (LRM) on \(X_1,\ldots ,X_p\) is defined as:
$$\begin{aligned} \left\{ \begin{array}{l} X_1=\alpha _1+\varepsilon _1\\ X_2=\alpha _2+\beta _{2,1}X_1+\varepsilon _2\\ \ldots \\ X_j=\alpha _j+\sum \limits _{i=1}^{j-1}\beta _{j,i}X_i+\varepsilon _j\\ \ldots \\ X_p=\alpha _p+\sum \limits _{i=1}^{p-1}\beta _{p,i}X_i+\varepsilon _p\\ \end{array} \right. \end{aligned}$$
(1)
where, for \(j=1,\ldots ,p\), variable \(X_j\) is regressed on \(X_1,\ldots ,X_{j-1}\), so that \(\alpha _j\) is the intercept, \(\beta _{j,i}\) is the regression coefficient associated with \(X_i\) (\(i=1,\ldots ,j-1\)), and \(\varepsilon _j\) is the random error. It is assumed that the random errors \(\varepsilon _1,\ldots ,\varepsilon _p\) have null expected value and are mutually uncorrelated, i.e., \(\text {E}(\varepsilon _i)=0\) \(\forall i\) and \(\text {Cov}(\varepsilon _i,\varepsilon _j)=0\) \(\forall i\ne j\). The system of equations in formula (1) can be written compactly as:
$$\begin{aligned} X_j=\alpha _j+\sum \limits _{i=1}^{p}\beta _{j,i}X_i+\varepsilon _j \quad ~j=1,\ldots ,p\quad ~\beta _{j,i}=0~\forall j\le i \end{aligned}$$
(2)
We see that this system of equations implies an acyclic structural dependence between \(X_1,\ldots ,X_p\), i.e., there is no way to write any variable as a function of itself. As a consequence, the structural dependence implied by a LRM can be represented by a Directed Acyclic Graph (DAG), where each variable is denoted by one node and, for each pair \((i,j)\), an edge is directed from the node denoting \(X_i\) to the node denoting \(X_j\) if and only if \(\beta _{j,i}\ne 0\). The random errors can be omitted from the DAG of a LRM because they are mutually uncorrelated (see Pearl 2000, page 30 and following).
Note that if \(\beta _{j,i}\ne 0\) \(\forall j>i\), then there is a direct dependence between each pair of variables, i.e., a saturated structural dependence holds between \(X_1,\ldots ,X_p\), represented by a DAG where all the pairs of nodes are connected by an edge (Fig. 1, panel a). Instead, if \(\beta _{j,i}=0\) for a given \(j>i\), then there is no direct dependence between variables \(X_i\) and \(X_j\), represented by the absence of an edge between \(X_i\) and \(X_j\) in the DAG (Fig. 1, panels b–d). In the remainder, we define the parents of a variable in a LRM as the variables with non-zero coefficient in the regression for that variable, or, equivalently, as the direct predecessors of that variable in the DAG.
Under the assumption that the regressions and the edges in the DAG have a causal interpretation, it is possible to exploit coefficients \(\beta _{j,i}\) to compute the effect of a unit increase in one variable, say \(X_i\), on the expected value of another variable, say \(X_j\) (Fox 1985; Sobel 1990; Chen 2017). Such effect is termed total effect and mediation analysis consists of its decomposition into the sum of several other effects, one for each path connecting \(X_i\) to \(X_j\): the path composed by a single edge entails the direct effect, while each path composed by more than one edge entails an indirect effect.
In the following subsections, we illustrate the computation of causal effects of interest in a LRM based on the regression coefficients. Firstly, we illustrate the case with three variables (Sect. 2.1), then the general case with any number of variables is addressed (Sect. 2.2).

2.1 Computation with three variables

Consider a LRM involving three variables \(X_1\), \(X_2\) and \(X_3\) with DAG displayed in Fig. 1a:
$$\begin{aligned} \left\{ \begin{array}{l} X_1=\alpha _1+\varepsilon _1\\ X_2=\alpha _2+\beta _{2,1}X_1+\varepsilon _2\\ X_3=\alpha _3+\beta _{3,1}X_1+\beta _{3,2}X_2+\varepsilon _3\\ \end{array} \right. \end{aligned}$$
(3)
From the properties of linear regression, we deduce that coefficient \(\beta _{2,1}\) represents the difference in the expected value of \(X_2\) for a unitary increase in the value of \(X_1\):
$$\begin{aligned} \beta _{2,1}\equiv \varDelta \text {E}(X_2\mid \varDelta X_1=1) \end{aligned}$$
(4)
Instead, coefficient \(\beta _{3,1}\) represents the difference in the expected value of \(X_3\) for a unitary increase in the value of \(X_1\) at constant value of \(X_2\):
$$\begin{aligned} \beta _{3,1}\equiv \varDelta \text {E}(X_3\mid \varDelta X_1=1,\varDelta X_2=0) \end{aligned}$$
(5)
Similarly, coefficient \(\beta _{3,2}\) represents the difference in the expected value of \(X_3\) for a unitary increase in the value of \(X_2\) at constant value of \(X_1\):
$$\begin{aligned} \beta _{3,2}\equiv \varDelta \text {E}(X_3\mid \varDelta X_2=1,\varDelta X_1=0) \end{aligned}$$
(6)
Suppose that we are interested in the causal effect of \(X_1\) on \(X_3\). In this case, we define the total effect as the difference in the expected value of \(X_3\) for a unit increase in the value of \(X_1\) whichever the value of \(X_2\), equating to:
$$\begin{aligned} {\text{TE}}(X_1,X_3)\equiv \varDelta \text {E}(X_3\mid \varDelta X_1=1)=\beta _{3,1}+\beta _{2,1}\beta _{3,2} \end{aligned}$$
(7)
where:
$$\begin{aligned} \text {DE}(X_1,X_3)=\beta _{3,1} \end{aligned}$$
(8)
is the direct effect of \(X_1\) on \(X_3\), entailed by the path \(<X_1,X_3>\), and:
$$\begin{aligned} {\text{IE}}(X_1,X_3;<X_1,X_2,X_3>)=\beta _{2,1}\beta _{3,2} \end{aligned}$$
(9)
is the indirect effect of \(X_1\) on \(X_3\) through (mediated by) \(X_2\), entailed by the path \(<X_1,X_2,X_3>\). The motivation behind formula (7) is that a unitary increase in the value of \(X_1\) implies the activation of the two paths connecting \(X_1\) to \(X_3\): \(<X_1,X_3>\) (direct path) and \(<X_1,X_2,X_3>\) (indirect path through \(X_2\)). The path \(<X_1,X_3>\) is composed of a single edge, which, after a unit increase in the value of \(X_1\), entails a difference in the expected value of \(X_3\) equal to \(\beta _{3,1}\), thus \(\text {DE}(X_1,X_3)=\beta _{3,1}\). Instead, the path \(<X_1,X_2,X_3>\) is composed of two edges: (i) a first edge between \(X_1\) and \(X_2\), which, after a unit increase in the value of \(X_1\), entails a difference in the expected value of \(X_2\) equal to \(\beta _{2,1}\); (ii) a second edge between \(X_2\) and \(X_3\), which, after an increase equal to \(\beta _{2,1}\) in the value of \(X_2\), entails a difference in the expected value of \(X_3\) equal to \(\beta _{2,1}\beta _{3,2}\). Thus, we conclude that the indirect effect of \(X_1\) on \(X_3\) through \(X_2\) is equal to \(\beta_{2,1}\beta_{3,2}\).
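As a minimal numerical sketch of formulas (7)–(9), the following Python snippet uses purely hypothetical intercepts and coefficients (none of these values come from the paper) and checks that the decomposition of the total effect agrees with the change in the expected value of \(X_3\) obtained by substituting the regression of \(X_2\) into the one of \(X_3\).

```python
# Hypothetical parameter values, chosen only for illustration.
alpha_2, alpha_3 = 1.0, 2.0
beta_21 = 0.5   # coefficient of X1 in the regression of X2
beta_31 = 0.2   # coefficient of X1 in the regression of X3
beta_32 = 0.4   # coefficient of X2 in the regression of X3

direct_effect = beta_31                          # formula (8)
indirect_effect = beta_21 * beta_32              # formula (9)
total_effect = direct_effect + indirect_effect   # formula (7)

def expected_x3(x1):
    """E(X3 | X1 = x1), with X2 replaced by its own expected value."""
    expected_x2 = alpha_2 + beta_21 * x1
    return alpha_3 + beta_31 * x1 + beta_32 * expected_x2

# A unit increase in X1 changes E(X3) by the total effect.
difference = expected_x3(3.0) - expected_x3(2.0)
print(direct_effect, indirect_effect, total_effect, difference)
```

The check works for any starting value of \(X_1\) because the model is linear, so the intercepts cancel in the difference.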

2.2 Computation with any number of variables

In the general case \(p\ge 2\), the causal effect of \(X_i\) on \(X_j\) may involve more than one indirect effect, and the path entailing each indirect effect may be composed of more than two edges. Also, \(X_j\) may have other parents besides all the variables involved in any path connecting \(X_i\) and \(X_j\).
Consider the LRM with DAG shown in Fig. 2, and suppose that we are interested in the causal effect of \(X_1\) on \(X_8\). We see that there is the direct effect, entailed by the path \(<X_1,X_8>\), and five indirect effects, entailed by paths \(<X_1,X_7,X_8>\), \(<X_1,X_6,X_8>\), \(<X_1,X_6,X_7,X_8>\), \(<X_1,X_5,X_6,X_8>\) and \(<X_1,X_5,X_6,X_7,X_8>\). The direct effect of \(X_1\) on \(X_8\) is equal to \(\beta _{8,1}\), but the interpretation changes with respect to the previous example with three variables: since \(X_8\) has other parents besides \(X_1\), precisely \(X_4\), \(X_6\) and \(X_7\), the direct effect \(\beta _{8,1}\) represents the difference in the expected value of \(X_8\) for a unitary increase in the value of \(X_1\) at constant values of \(X_4\), \(X_6\) and \(X_7\):
$$\begin{aligned} \text {DE}(X_1,X_8)\equiv \varDelta \text {E}(X_8\mid \varDelta X_1=1,\varDelta X_4=0,\varDelta X_6=0,\varDelta X_7=0)=\beta _{8,1} \end{aligned}$$
(10)
Based on the same arguments for the computation of the indirect effect in the previous example with three variables, we find:
$$\begin{aligned} \begin{aligned}&{\text{IE}}(X_1,X_8;\hspace{0.04cm}<X_1,X_7,X_8>)=\beta _{7,1}\beta _{8,7}\\&{\text{IE}}(X_1,X_8;\hspace{0.04cm}<X_1,X_6,X_8>)=\beta _{6,1}\beta _{8,6}\\&{\text{IE}}(X_1,X_8;\hspace{0.04cm}<X_1,X_6,X_7,X_8>)=\beta _{6,1}\beta _{7,6}\beta _{8,7}\\&{\text{IE}}(X_1,X_8;\hspace{0.04cm}<X_1,X_5,X_6,X_8>)=\beta _{5,1}\beta _{6,5}\beta _{8,6}\\&{\text{IE}}(X_1,X_8;\hspace{0.04cm}<X_1,X_5,X_6,X_7,X_8>)=\beta _{5,1}\beta _{6,5}\beta _{7,6}\beta _{8,7} \end{aligned} \end{aligned}$$
(11)
As a consequence, the total effect of \(X_1\) on \(X_8\) is:
$$\begin{aligned}&{\text{TE}}(X_1,X_8)\equiv \varDelta \text {E}(X_8\mid \varDelta X_1=1,\varDelta X_4=0)\nonumber \\&\quad =\beta _{8,1}+\beta _{7,1}\beta _{8,7}+\beta _{6,1}\beta _{8,6}+\beta _{6,1}\beta _{7,6}\beta _{8,7}+\beta _{5,1}\beta _{6,5}\beta _{8,6}+\beta _{5,1}\beta _{6,5}\beta _{7,6}\beta _{8,7} \end{aligned}$$
(12)
equating to the difference in the expected value of \(X_8\) for a unitary increase in the value of \(X_1\) at constant values of the parents of \(X_8\) which are not involved in any path between \(X_1\) and \(X_8\) (in this case, \(X_4\)).
In general, the direct effect of \(X_i\) on \(X_j\) is entailed by the direct path between \(X_i\) and \(X_j\), i.e., \(<X_i,X_j>\), and is equal to the coefficient \(\beta _{j,i}\), as formalized in Definition 1. In this view, each coefficient in a LRM represents a direct effect, which is associated to a specific edge in the DAG.
Definition 1
(Direct effect in a LRM) Let \(\varvec{M}\) be the random vector including the variables in any path between \(X_i\) and \(X_j\), excepting \(X_i\) and \(X_j\), and \(\varvec{W}\) be the random vector including the parents of \(X_j\) which are not in \(\varvec{M}\). The direct effect of \(X_i\) on \(X_j\) is defined as:
$$\begin{aligned} \text{DE} (X_i,X_j)\equiv \varDelta \text{E} (X_j\mid \varDelta X_i=1,\varDelta {\varvec{M}}=\varvec{0},\varDelta {\varvec{W}}=\varvec{0})=\beta _{j,i} \end{aligned}$$
(13)
\(\square\)
Denote any indirect path between variables \(X_i\) and \(X_j\) as \(<X_i,X_{d_1},\ldots ,X_{d_m},X_j>\), where \(X_{d_1},\ldots ,X_{d_m}\) are the mediating variables. For notational convenience, we set \(X_{d_0}\equiv X_i\) and \(X_{d_{m+1}}\equiv X_j\). Note that, for \(m=0\), the direct path \(<X_i,X_j>\) is obtained. The indirect effect of \(X_i\) on \(X_j\) through a specific set of mediating variables \(X_{d_1},\ldots ,X_{d_m}\) is computed by multiplying the coefficients associated to each edge composing the path \(<X_i,X_{d_1},\ldots ,X_{d_m},X_j>\), as formalized in Definition 2.
Definition 2
(Indirect effect in a LRM) The indirect effect of \(X_i\) on \(X_j\) through variables \(X_{d_1},\ldots ,X_{d_m}\), is:
$$\begin{aligned} \text{IE} (X_i,X_j;<X_i,X_{d_1},\ldots ,X_{d_m},X_j>)=\prod _{k=1}^{m+1}\beta _{d_k,d_{k-1}} \end{aligned}$$
(14)
where \(d_0=i\) and \(d_{m+1}=j\).
\(\square\)
Note that, if applied to the direct path between \(X_i\) and \(X_j\), i.e., \(<X_i,X_j>\), formula (14) returns the direct effect in formula (13). Finally, the total effect of \(X_i\) on \(X_j\) is obtained by summing the direct and all the indirect effects, as formalized in Definition 3.
Definition 3
(Total effect in a LRM) Let \(\varvec{W}\) be the random vector including the parents of \(X_j\) which are not in any path between \(X_i\) and \(X_j\). The total effect of \(X_i\) on \(X_j\) is defined as:
$$\begin{aligned} \begin{gathered} \text{TE} (X_i,X_j)\equiv \varDelta \text{E} (X_j\mid \varDelta X_i=1,\varDelta \varvec{W}=\varvec{0})\\ \quad =\sum _{<X_{d_0},\ldots ,X_{d_{m+1}}>:~d_0=i\wedge d_{m+1}=j}~\prod _{k=1}^{m+1}\beta _{d_k,d_{k-1}}\\ \end{gathered} \end{aligned}$$
(15)
\(\square\)
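Definitions 1–3 can be sketched in Python as follows. The sketch assumes a dictionary `beta` keyed by (child, parent), mirroring the subscripts of \(\beta _{j,i}\); the function names and all coefficient values are hypothetical. The edge set includes only the edges of Fig. 2 that are named in the text, so it may be an incomplete rendering of that DAG.

```python
from math import prod

# Hypothetical coefficients for the edges named in the text (Fig. 2 may
# contain further edges that are not listed here).
beta = {
    (5, 1): 0.3, (6, 1): 0.2, (7, 1): 0.4, (8, 1): 0.1,
    (6, 5): 0.5, (7, 6): 0.6, (8, 6): 0.7, (8, 7): 0.8,
    (8, 4): 0.9,
}

def enumerate_paths(beta, i, j):
    """Return every directed path from X_i to X_j as a list of indices."""
    if i == j:
        return [[j]]
    paths = []
    for (child, parent), coef in beta.items():
        if parent == i and coef != 0:
            for tail in enumerate_paths(beta, child, j):
                paths.append([i] + tail)
    return paths

def path_effect(beta, path):
    """Product of the coefficients along a path, as in formula (14)."""
    return prod(beta[(path[k], path[k - 1])] for k in range(1, len(path)))

def total_effect(beta, i, j):
    """Sum of the effects entailed by all paths, as in formula (15)."""
    return sum(path_effect(beta, p) for p in enumerate_paths(beta, i, j))

paths_1_8 = enumerate_paths(beta, 1, 8)
for p in paths_1_8:
    kind = "direct" if len(p) == 2 else "indirect"
    print(p, kind, path_effect(beta, p))
print("TE(X1, X8) =", total_effect(beta, 1, 8))
```

With this edge set, the enumeration recovers the direct path and the five indirect paths listed for formula (11). The edge from \(X_4\) to \(X_8\) does not contribute, since \(X_4\) lies on no path between \(X_1\) and \(X_8\).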

3 Mediation analysis in distributed-lag linear recursive models

In a dynamic perspective, each of the random variables \(X_1,\ldots ,X_p\) has several instances, one for each time point. We denote the instance of variable \(X_j\) at time t as \(X_{j,t}\). Under the assumption that time is a discrete variable and that the time series of \(X_1,\ldots ,X_p\) are weakly stationary, i.e., their expected value and autocorrelation function are constant in time, we define the dynamic version of a LRM, here called Distributed-lag Linear Recursive Model (DLRM), as:
$$\begin{aligned}&X_{j,t}=\alpha _j+\sum _{i=1}^p\sum _{l=0}^\infty \beta _{j,i}^{(l)}X_{i,t-l}+\varepsilon _{j,t}\nonumber \\&j=1,\ldots,p\quad ~~t=0,\ldots ,\infty \quad ~~\beta _{j,i}^{(l)}=0~\forall j\le i \end{aligned}$$
(16)
Each model in a DLRM is a distributed-lag linear regression, thus, in contrast to a LRM, each coefficient depends on the time lag l. For \(j=1,\ldots ,p\), the set \(\{\beta _{j,i}^{(l)}: l=0,1,\ldots ,\infty \}\) includes the coefficients associated to all the lags of \(X_i\) in the regression of \(X_j\), and is called lag distribution for the direct effect of \(X_i\) on \(X_j\). Each lag distribution is assumed to have infinite cardinality (infinite lag distribution), but it can be truncated at lag \(l^*\) by setting \(\beta _{j,i}^{(l)}=0\) \(\forall l>l^*\) (finite lag distribution). Note that, in a DLRM, the assumption of uncorrelated random errors implies the absence of autocorrelation in the errors of the same variable, i.e., for \(j=1,\ldots ,p\), we have \(\text {Cov}(\varepsilon _{j,s},\varepsilon _{j,t})=0\) \(\forall s\ne t\).
The DAG of a DLRM, here denoted as full DAG, includes all the temporal instances of variables \(X_1,\ldots ,X_p\) (Fig. 3a). However, it is always possible to convert the DAG of a DLRM into a static representation to highlight the structural dependence among the variables, at the cost of disregarding the lag attribution. For example, from Fig. 3b it is clear that \(X_1\) has a direct effect on both \(X_2\) and \(X_3\), but the time lags at which each effect operates are not shown.
The computation of causal effects of interest in a DLRM is analogous to the one in LRMs, with the difference that it is conditional on the time lag. Specifically, one can compute the direct effect, the indirect effects and the total effect of any variable on another one at specific time lags by applying the rules for LRMs detailed in Sect. 2 to the full DAG.
DAG-based computation of causal effects of interest in a DLRM is illustrated in Sect. 3.1. Since it requires searching for paths in the full DAG, DAG-based computation may become extremely expensive when performed at high time lags. Our DAG-free algorithm, detailed in Sect. 3.2, is equivalent to DAG-based computation but more efficient, because searching for paths is not required. Section 3.3 includes a discussion on mediation analysis in partially causal DLRMs, i.e., DLRMs including autoregressive effects.

3.1 DAG-based computation

Consider the DLRM with full DAG in Fig. 3a. The direct effect of \(X_1\) on \(X_3\) at lag 0, say \(\text {DE}_{0}(X_1,X_3)\), is composed of the effects entailed by all the paths in the full DAG between \(X_{1,t}\) and \(X_{3,t}\) passing through no instances of \(X_2\). There is a single path satisfying such condition: \(<X_{1,t},X_{3,t}>\), entailing an effect equal to \(\beta _{3,1}^{(0)}\), thus we conclude that \(\text {DE}_{0}(X_1,X_3)=\beta _{3,1}^{(0)}\).
Instead, the direct effect of \(X_1\) on \(X_3\) at lag 1 is composed of the effects entailed by all the paths in the full DAG between \(X_{1,t}\) and \(X_{3,t+1}\) passing through no instances of \(X_2\). The only path satisfying such condition is \(<X_{1,t},X_{3,t+1}>\), entailing an effect equal to \(\beta _{3,1}^{(1)}\), thus we have \(\text {DE}_{1}(X_1,X_3)=\beta _{3,1}^{(1)}\).
We proceed similarly to compute the direct effect of \(X_1\) on \(X_3\) at lag 2, concluding that \(\text {DE}_{2}(X_1,X_3)=\beta _{3,1}^{(2)}\). Rule 1 generalizes the computation of the direct effect of \(X_i\) on \(X_j\) at a given time lag l in any DLRM.
Rule 1
(Direct effect in a DLRM) The direct effect of \(X_i\) on \(X_j\) at time lag l is equal to the sum of the effects entailed by each path in the full DAG between \(X_{i,t}\) and \(X_{j,t+l}\) containing no instances of variables besides \(X_i\) and \(X_j\).
The indirect effect of \(X_i\) on \(X_j\) through mediating variables \(X_{d_1},\ldots ,X_{d_m}\) at time lag l, denoted as \({\text{IE}}_{l}(X_i,X_j;<X_i,X_{d_1},\ldots ,X_{d_m},X_j>)\), is computed analogously to the direct effect, with the difference that we must consider only the paths in the full DAG between \(X_{i,t}\) and \(X_{j,t+l}\) containing at least one instance of each variable \(X_{d_1},\ldots ,X_{d_m}\), as stated by Rule 2.
Rule 2
(Indirect effect in a DLRM) The indirect effect of \(X_i\) on \(X_j\) through variables \(X_{d_1},\ldots ,X_{d_m}\) at time lag l is equal to the sum of the effects entailed by each path in the full DAG between \(X_{i,t}\) and \(X_{j,t+l}\) containing at least one instance of each variable \(X_{d_1},\ldots ,X_{d_m}\).
For example, Tables 1 and 2 show the addenda of the indirect effect of \(X_1\) on \(X_3\) mediated by \(X_2\) at lags 1 and 2 in the DLRM with DAG in Fig. 3.
Table 1
Addenda of \({\text{IE}}_{1}(X_1,X_3;<X_1,X_2,X_3>)\) in the DLRM with DAG in Fig. 3
| Path | Entailed effect |
| --- | --- |
| \(<X_{1,t},X_{2,t},X_{3,t+1}>\) | \(\beta _{2,1}^{(0)}\beta _{3,2}^{(1)}\) |
| \(<X_{1,t},X_{2,t+1},X_{3,t+1}>\) | \(\beta _{2,1}^{(1)}\beta _{3,2}^{(0)}\) |
Table 2
Addenda of \({\text{IE}}_{2}(X_1,X_3;<X_1,X_2,X_3>)\) in the DLRM with DAG in Fig. 3
| Path | Entailed effect |
| --- | --- |
| \(<X_{1,t},X_{2,t},X_{3,t+2}>\) | \(\beta _{2,1}^{(0)}\beta _{3,2}^{(2)}\) |
| \(<X_{1,t},X_{2,t+1},X_{3,t+2}>\) | \(\beta _{2,1}^{(1)}\beta _{3,2}^{(1)}\) |
| \(<X_{1,t},X_{2,t+2},X_{3,t+2}>\) | \(\beta _{2,1}^{(2)}\beta _{3,2}^{(0)}\) |
Finally, the total effect of \(X_i\) on \(X_j\) at time lag l is the sum of the direct and of all the indirect effects of \(X_i\) on \(X_j\) at time lag l, equating to the sum of the effects entailed by all the possible paths between \(X_{i,t}\) and \(X_{j,t+l}\) in the full DAG, as stated by Rule 3.
Rule 3
(Total effect in a DLRM) The total effect of \(X_i\) on \(X_j\) at time lag l is equal to the sum of the effects entailed by each path in the full DAG between \(X_{i,t}\) and \(X_{j,t+l}\).
For example, the total effect of \(X_1\) on \(X_3\) at time lag 2 in the DLRM with DAG in Fig. 3 is obtained by summing the effect entailed by the direct path, i.e., \(\beta _{3,1}^{(2)}\), to all the effects entailed by the paths listed in Table 2.
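Rules 1–3 can be sketched in Python by enumerating paths in the full DAG, with each node stored as a pair (variable index, time offset from t). The sketch assumes a strictly causal DLRM with the static structure \(X_1\rightarrow X_2\), \(X_1\rightarrow X_3\), \(X_2\rightarrow X_3\) used in the examples above. The lag distributions, truncated at lag 2, and the function names are hypothetical.

```python
from math import prod

# Hypothetical lag distributions (lags 0, 1, 2), keyed by (child, parent)
# to mirror the subscripts of beta_{j,i}^{(l)}.
beta = {
    (2, 1): [0.5, 0.3, 0.1],
    (3, 1): [0.2, 0.1, 0.05],
    (3, 2): [0.4, 0.2, 0.1],
}

def full_dag_paths(beta, i, j, lag):
    """List the paths from X_{i,t} to X_{j,t+lag} in the full DAG."""
    target = (j, lag)

    def extend(node):
        if node == target:
            return [[node]]
        var, s = node
        found = []
        for (child, parent), coefs in beta.items():
            if parent != var:
                continue
            for l, b in enumerate(coefs):
                # Follow only non-zero edges that do not overshoot the lag.
                if b != 0 and s + l <= lag:
                    for tail in extend((child, s + l)):
                        found.append([node] + tail)
        return found

    return extend((i, 0))

def entailed_effect(beta, path):
    """Multiply the coefficients along the edges of a full-DAG path."""
    return prod(
        beta[(v1, v0)][s1 - s0]
        for (v0, s0), (v1, s1) in zip(path[:-1], path[1:])
    )

paths = full_dag_paths(beta, 1, 3, 2)
total_effect_lag2 = sum(entailed_effect(beta, p) for p in paths)  # Rule 3
for p in paths:
    mediators = {v for v, _ in p} - {1, 3}
    print(p, "direct" if not mediators else "indirect", entailed_effect(beta, p))
```

The enumeration returns the direct path plus the three paths of Table 2. The number of paths grows quickly with the lag and with the number of mediators, which is the cost that motivates a DAG-free alternative.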

3.2 DAG-free computation

DAG-based computation requires searching for paths in the full DAG, so it may become extremely expensive when performed at high time lags. We therefore propose an alternative method, detailed in Algorithm 1, to compute any direct or indirect effect at the desired time lag in a DLRM without searching for paths in the DAG.
As an illustration, consider \({\text{IE}}_{l}(X_1,X_3;<X_1,X_2,X_3>)\) in the DLRM with DAG in Fig. 3. According to Algorithm 1, for all l, we have \(X_{d_1}=X_2\) and \(X_{d_{m+1}}=X_{d_2}=X_3\). For \(l=0\), we get \({\mathcal {U}}=\{(0,0)\}\), leading to \({\text{IE}}_{0}(X_1,X_3;<X_1,X_2,X_3>)=\beta _{2,1}^{(0)}\beta _{3,2}^{(0)}\), as previously obtained with DAG-based computation. For \(l=1\), we get \({\mathcal {U}}=\{(0,1),(1,0)\}\), leading to \({\text{IE}}_{1}(X_1,X_3;<X_1,X_2,X_3>)=\beta _{2,1}^{(0)}\beta _{3,2}^{(1)}+\beta _{2,1}^{(1)}\beta _{3,2}^{(0)}\), as obtained with DAG-based computation shown in Table 1. For \(l=2\), we get \({\mathcal {U}}=\{(0,2),(1,1),(2,0)\}\), leading to \({\text{IE}}_{2}(X_1,X_3;<X_1,X_2,X_3>)=\beta _{2,1}^{(0)}\beta _{3,2}^{(2)}+\beta _{2,1}^{(1)}\beta _{3,2}^{(1)}+\beta _{2,1}^{(2)}\beta _{3,2}^{(0)}\), as obtained with DAG-based computation shown in Table 2.
As a further illustration, consider the indirect effect of \(X_1\) on \(X_8\) mediated by \(X_6\) and \(X_7\) at time lag 3, i.e., \({\text{IE}}_{3}(X_1,X_8;<X_1,X_6,X_7,X_8>)\), in the DLRM with static DAG in Fig. 2. According to Algorithm 1, we have \(X_{d_1}=X_6\), \(X_{d_2}=X_7\) and \(X_{d_{m+1}}=X_{d_3}=X_8\), and we obtain the ten addenda shown in Table 3. Note that, in the DLRM with static DAG in Fig. 2, the indirect effect of \(X_1\) on \(X_8\) mediated by \(X_6\) and \(X_7\) differs from the indirect effect of \(X_1\) on \(X_8\) mediated by \(X_6\) only, i.e., \({\text{IE}}_{3}(X_1,X_8;<X_1,X_6,X_8>)\), as well as from the indirect effect of \(X_1\) on \(X_8\) mediated by \(X_5\), \(X_6\) and \(X_7\), i.e., \({\text{IE}}_{3}(X_1,X_8;<X_1,X_5,X_6,X_7,X_8>)\).
Table 3
Addenda of \({\text{IE}}_{3}(X_1,X_8;<X_1,X_6,X_7,X_8>)\) in the DLRM with DAG in Fig. 2 obtained using Algorithm 1
| Combination | Path in the full DAG | Entailed effect |
| --- | --- | --- |
| (0,0,3) | \(<X_{1,t},X_{6,t},X_{7,t},X_{8,t+3}>\) | \(\beta _{6,1}^{(0)}\beta _{7,6}^{(0)}\beta _{8,7}^{(3)}\) |
| (0,1,2) | \(<X_{1,t},X_{6,t},X_{7,t+1},X_{8,t+3}>\) | \(\beta _{6,1}^{(0)}\beta _{7,6}^{(1)}\beta _{8,7}^{(2)}\) |
| (0,2,1) | \(<X_{1,t},X_{6,t},X_{7,t+2},X_{8,t+3}>\) | \(\beta _{6,1}^{(0)}\beta _{7,6}^{(2)}\beta _{8,7}^{(1)}\) |
| (0,3,0) | \(<X_{1,t},X_{6,t},X_{7,t+3},X_{8,t+3}>\) | \(\beta _{6,1}^{(0)}\beta _{7,6}^{(3)}\beta _{8,7}^{(0)}\) |
| (1,0,2) | \(<X_{1,t},X_{6,t+1},X_{7,t+1},X_{8,t+3}>\) | \(\beta _{6,1}^{(1)}\beta _{7,6}^{(0)}\beta _{8,7}^{(2)}\) |
| (1,1,1) | \(<X_{1,t},X_{6,t+1},X_{7,t+2},X_{8,t+3}>\) | \(\beta _{6,1}^{(1)}\beta _{7,6}^{(1)}\beta _{8,7}^{(1)}\) |
| (1,2,0) | \(<X_{1,t},X_{6,t+1},X_{7,t+3},X_{8,t+3}>\) | \(\beta _{6,1}^{(1)}\beta _{7,6}^{(2)}\beta _{8,7}^{(0)}\) |
| (2,0,1) | \(<X_{1,t},X_{6,t+2},X_{7,t+2},X_{8,t+3}>\) | \(\beta _{6,1}^{(2)}\beta _{7,6}^{(0)}\beta _{8,7}^{(1)}\) |
| (2,1,0) | \(<X_{1,t},X_{6,t+2},X_{7,t+3},X_{8,t+3}>\) | \(\beta _{6,1}^{(2)}\beta _{7,6}^{(1)}\beta _{8,7}^{(0)}\) |
| (3,0,0) | \(<X_{1,t},X_{6,t+3},X_{7,t+3},X_{8,t+3}>\) | \(\beta _{6,1}^{(3)}\beta _{7,6}^{(0)}\beta _{8,7}^{(0)}\) |
Finally note that, if Algorithm 1 is applied to compute any direct effect, say \(\text {DE}_{l}(X_i,X_j)\), it correctly returns \(\beta _{j,i}^{(l)}\).
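The listing of Algorithm 1 is not reproduced in this text, so the following Python sketch is not a transcription of it. It is only one possible implementation consistent with the sets \({\mathcal {U}}\) and the combinations in Table 3, under the assumption that \({\mathcal {U}}\) contains all tuples of \(m+1\) non-negative integer lags summing to l. Function names and coefficient values are hypothetical, and lag distributions are assumed to be truncated at the last listed lag.

```python
from math import prod

# Hypothetical lag distributions (lags 0..3), keyed by (child, parent).
beta = {
    (6, 1): [0.5, 0.3, 0.2, 0.1],
    (7, 6): [0.4, 0.3, 0.1, 0.05],
    (8, 7): [0.6, 0.2, 0.1, 0.05],
}

def lag_combinations(n_edges, lag):
    """All tuples of n_edges non-negative integers summing to lag."""
    if n_edges == 1:
        return [(lag,)]
    combos = []
    for first in range(lag + 1):
        for rest in lag_combinations(n_edges - 1, lag - first):
            combos.append((first,) + rest)
    return combos

def coefficient(beta, child, parent, lag):
    """beta_{child,parent}^{(lag)}, taken as zero beyond the truncation."""
    coefs = beta.get((child, parent), [])
    return coefs[lag] if lag < len(coefs) else 0.0

def effect_at_lag(beta, path, lag):
    """Effect at the given lag along the static path [i, d_1, ..., d_m, j]."""
    edges = list(zip(path[1:], path[:-1]))  # (child, parent) pairs
    return sum(
        prod(coefficient(beta, c, p, u) for (c, p), u in zip(edges, combo))
        for combo in lag_combinations(len(edges), lag)
    )

combos = lag_combinations(3, 3)
ie_3 = effect_at_lag(beta, [1, 6, 7, 8], 3)
print(combos)
print("IE_3 through X6 and X7:", ie_3)
```

No path search is involved: only the static path and the lag combinations are needed. For a path with a single edge, the only combination is (l), so the function returns the coefficient at lag l, in line with the remark on direct effects above.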

3.3 Partially causal distributed-lag linear recursive models

So far, we have excluded autoregressive effects in the definition of DLRMs, i.e., the lagged direct effects of one variable on itself, represented by all the coefficients \(\beta _{j,i}^{(l)}\) such that \(i=j\) and \(l>0\). We now discuss their inclusion in a DLRM.
In general, autoregressive effects for a certain variable can arise from two distinct situations: (i) the variable has a stochastic trend, (ii) some causes with lagged direct effects on the considered variable are omitted. In the first situation, autoregressive effects reflect the law of the temporal evolution of the variable, thus they have a causal interpretation. Instead, the interpretation of autoregressive effects in the second situation is more complex, thus a detailed illustration is provided below.
Consider the DAG of a causal DLRM over two variables \(X_1\) and \(X_2\) displayed in Fig. 4, panel a. The rules for arc reversal in DAGs (Shachter 1990) can be exploited to marginalize out variable \(X_1\). In essence, before reversing the edge connecting two nodes, some other edges should be added to make the two nodes share their parents. Firstly, all the edges connecting each instance of \(X_1\) to any instance of \(X_2\) at the same time point are reversed from the maximum to the minimum time point, leading to the creation of all the possible autoregressive effects for \(X_1\) (Fig. 4, panel b). Secondly, all the remaining edges connecting each instance of \(X_1\) to any instance of \(X_2\) are reversed from the minimum to the maximum time point, leading to the creation of all the possible autoregressive effects for \(X_2\) (Fig. 4, panel c). Thirdly, the instance of \(X_1\) at the maximum time point has no descendants, thus it can be deleted from the DAG, and the same holds recursively for all the remaining instances of \(X_1\) (Fig. 4, panel d). In the resulting DAG, \(X_1\) is omitted and \(X_2\) is represented by an autoregressive process. We conclude that autoregressive effects arising from the omission of variables with lagged direct effects on some other variables do not have a causal interpretation, but only a predictive one, coherently with the concept of Granger-Sims causality (Eichler 2013).
In presence of autoregressive effects, the definition of a DLRM in formula (16) becomes:
$$\begin{aligned}&X_{j,t}=\alpha _j+\sum _{i=1}^p\sum _{l=0}^\infty \beta _{j,i}^{(l)}X_{i,t-l}+\varepsilon _{j,t}\nonumber \\&\quad j=1,\ldots ,p \quad ~~~~~~t=0,\ldots ,\infty \nonumber \\&\quad \beta _{j,i}^{(l)}=0~\forall j<i \quad ~\beta _{j,i}^{(0)}=0~\forall i=j \end{aligned}$$
(17)
Since the time series in a DLRM are assumed to be weakly stationary, autoregressive effects can arise only from the omission of variables with lagged direct effects on some other variables, thus they do not have a causal interpretation. For this reason, we refer to the model in formula (17) as partially causal DLRM. Figure 5 shows the DAG of the partially causal version of the DLRM in Fig. 3. We see that the full DAG includes a number of non-causal edges besides the causal ones. As an example, Tables 4 and 5 show the addenda of the direct effect of \(X_1\) on \(X_3\) for \(l=1,2\), while Table 6 shows the addenda of the indirect effect of \(X_1\) on \(X_3\) mediated by \(X_2\) for \(l=1\). Note that causal addenda are the same as in the strictly causal DLRM with DAG in Fig. 3, but the total number of addenda increases faster as higher time lags are considered. Unfortunately, our DAG-free algorithm does not work for partially causal DLRMs, thus one should rely on DAG-based mediation analysis described in Sect. 3.1, which works equally for both strictly causal and partially causal models. However, in the partially causal case, it involves a higher computational burden.
Note that the omission of a variable with lagged direct effects on another variable can also be represented through autocorrelated random errors (see, for example, Goldsmith et al. 2018), but this would lead to a non-recursive model, thus this case is not addressed here.
Table 4
Causal and non-causal addenda of \(\text {DE}_{1}(X_1,X_3)\) in the partially causal DLRM with DAG in Fig. 5

| Path | Causal | Entailed effect |
| --- | --- | --- |
| \(<X_{1,t},X_{3,t+1}>\) | Yes | \(\beta _{3,1}^{(1)}\) |
| \(<X_{1,t},X_{1,t+1},X_{3,t+1}>\) | No | \(\beta _{1,1}^{(1)}\beta _{3,1}^{(0)}\) |
| \(<X_{1,t},X_{3,t},X_{3,t+1}>\) | No | \(\beta _{3,1}^{(0)}\beta _{3,3}^{(1)}\) |
Table 5
Causal and non-causal addenda of \(\text {DE}_{2}(X_1,X_3)\) in the partially causal DLRM with DAG in Fig. 5

| Path | Causal | Entailed effect |
| --- | --- | --- |
| \(<X_{1,t},X_{3,t+2}>\) | Yes | \(\beta _{3,1}^{(2)}\) |
| \(<X_{1,t},X_{1,t+1},X_{3,t+2}>\) | No | \(\beta _{1,1}^{(1)}\beta _{3,1}^{(1)}\) |
| \(<X_{1,t},X_{1,t+2},X_{3,t+2}>\) | No | \(\beta _{1,1}^{(2)}\beta _{3,1}^{(0)}\) |
| \(<X_{1,t},X_{3,t},X_{3,t+2}>\) | No | \(\beta _{3,1}^{(0)}\beta _{3,3}^{(2)}\) |
| \(<X_{1,t},X_{3,t+1},X_{3,t+2}>\) | No | \(\beta _{3,1}^{(1)}\beta _{3,3}^{(1)}\) |
| \(<X_{1,t},X_{1,t+1},X_{1,t+2},X_{3,t+2}>\) | No | \((\beta _{1,1}^{(1)})^2\beta _{3,1}^{(0)}\) |
| \(<X_{1,t},X_{1,t+1},X_{3,t+1},X_{3,t+2}>\) | No | \(\beta _{1,1}^{(1)}\beta _{3,1}^{(0)}\beta _{3,3}^{(1)}\) |
| \(<X_{1,t},X_{3,t},X_{3,t+1},X_{3,t+2}>\) | No | \(\beta _{3,1}^{(0)}(\beta _{3,3}^{(1)})^2\) |
Table 6
Causal and non-causal addenda of \({\text{IE}}_{1}(X_1,X_3;<X_1,X_2,X_3>)\) in the partially causal DLRM with DAG in Fig. 5

| Path | Causal | Entailed effect |
| --- | --- | --- |
| \(<X_{1,t},X_{2,t},X_{3,t+1}>\) | Yes | \(\beta _{2,1}^{(0)}\beta _{3,2}^{(1)}\) |
| \(<X_{1,t},X_{2,t+1},X_{3,t+1}>\) | Yes | \(\beta _{2,1}^{(1)}\beta _{3,2}^{(0)}\) |
| \(<X_{1,t},X_{1,t+1},X_{2,t+1},X_{3,t+1}>\) | No | \(\beta _{1,1}^{(1)}\beta _{2,1}^{(0)}\beta _{3,2}^{(0)}\) |
| \(<X_{1,t},X_{2,t},X_{2,t+1},X_{3,t+1}>\) | No | \(\beta _{2,1}^{(0)}\beta _{2,2}^{(1)}\beta _{3,2}^{(0)}\) |
| \(<X_{1,t},X_{2,t},X_{3,t},X_{3,t+1}>\) | No | \(\beta _{2,1}^{(0)}\beta _{3,2}^{(0)}\beta _{3,3}^{(1)}\) |
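The addenda listed in Tables 4, 5 and 6 can be enumerated mechanically. The following Python sketch is only an illustration and is not the DAG-based procedure of Sect. 3.1: it assumes, hypothetically, that every cross-variable coefficient exists for lags 0 to `max_lag` and every autoregressive coefficient for lags 1 to `max_lag`. The names `unrolled_paths`, `is_causal` and `entailed_effect` are introduced here and do not appear in the paper.

```python
from math import prod


def unrolled_paths(static_path, lag, max_lag):
    """Enumerate paths from (static_path[0], t) to (static_path[-1], t + lag).

    Nodes are (variable, time offset) pairs. A step either stays on the same
    variable (autoregressive edge, lag k >= 1) or moves to the next variable
    of the static path (cross-variable edge, lag k >= 0).
    """
    paths = []
    last = len(static_path) - 1

    def extend(path, pos, time):
        if pos == last and time == lag:
            paths.append(tuple(path))
            return
        var = static_path[pos]
        # Autoregressive step: same variable, strictly later time point.
        for k in range(1, max_lag + 1):
            if time + k <= lag:
                extend(path + [(var, time + k)], pos, time + k)
        # Cross-variable step towards the next variable of the static path.
        if pos < last:
            nxt = static_path[pos + 1]
            for k in range(0, max_lag + 1):
                if time + k <= lag:
                    extend(path + [(nxt, time + k)], pos + 1, time + k)

    extend([(static_path[0], 0)], 0, 0)
    return paths


def is_causal(path):
    """A path is causal when it contains no autoregressive edge."""
    return all(a[0] != b[0] for a, b in zip(path, path[1:]))


def entailed_effect(path, beta):
    """Multiply the coefficients beta[(j, i, k)] along the edges of a path."""
    return prod(beta[(b[0], a[0], b[1] - a[1])] for a, b in zip(path, path[1:]))


de1 = unrolled_paths([1, 3], lag=1, max_lag=2)     # addenda of DE_1(X1, X3)
de2 = unrolled_paths([1, 3], lag=2, max_lag=2)     # addenda of DE_2(X1, X3)
ie1 = unrolled_paths([1, 2, 3], lag=1, max_lag=2)  # addenda of IE_1 via X2
print(len(de1), len(de2), len(ie1))  # 3 8 5, the row counts of Tables 4, 5 and 6
```

Under these assumptions the number of enumerated paths matches the number of rows in each table, and filtering with `is_causal` leaves one, one and two causal addenda, respectively.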

4 Empirical application

We apply our DAG-free algorithm to a hypothesized causal structure representing the impact pathways of agricultural research expenditure towards poverty reduction in rural areas. Our objective is not to develop a new theory on the impacts of agricultural research expenditure, but to illustrate the practical application of our DAG-free algorithm for mediation analysis. As a consequence, the proposed causal structure is a simplified version that maintains enough complexity for an effective illustration of our algorithm. Qualitative and quantitative specifications of the causal structure are detailed in Sects. 4.1 and 4.2, respectively, while mediation analysis using the DAG-free algorithm is performed in Sect. 4.3.

4.1 Qualitative specification

The causal structure proposed here, shown in Fig. 6, is an adaptation of the model in Alene and Coulibaly (2009) to developed countries, based on theoretical arguments in Renkow (2011). It is assumed that an increase in public research expenditure (\(X_1\)) can stimulate research activities towards the development of new technologies, which, once adopted by agricultural producers, can improve agricultural productivity (\(X_2\)). In turn, improved agricultural productivity (\(X_2\)) can stimulate firms in rural areas to increase employment and wages, thus leading to: (i) a decrease of unemployment in rural areas (\(X_3\)), (ii) an increase of the median family income in rural areas (\(X_4\)), (iii) a decrease of the price of agricultural products (\(X_5\)). Improved agricultural productivity (\(X_2\)) can increase the median family income in rural areas (\(X_4\)) both directly, for example due to increased wages, and indirectly through a decrease of unemployment in rural areas (\(X_3\)), for example due to the availability of new job positions. Finally, increased median family income in rural areas (\(X_4\)) and decreased price of agricultural products (\(X_5\)) can lead to a reduction of the at-risk-of-poverty rate in rural areas (\(X_6\)).
All the dependence relationships represented by the hypothesized causal structure in Fig. 6 are likely to persist over several periods, thus the use of a DLRM is well motivated. The DAG in Fig. 6 is the static DAG of the DLRM.

4.2 Quantitative specification

We assume that the time series of variables \(X_1,\ldots ,X_6\) are measured yearly and are weakly stationary when expressed as chained proportional variations or, to a close approximation for small changes, as first-order differences of logarithmic values. As a consequence, the coefficients in the DLRM represent elasticities at specific time lags (years).
In order to represent efficiently the coefficients associated with the lags of the same variable, we exploited the Gamma lag distribution (Schmidt 1974). The direct effect of variable \(X_i\) on variable \(X_j\) as a function of the time lag, i.e., \(\{\beta _{j,i}^{(l)}:l=0,1,\ldots ,\infty \}\), follows the Gamma lag distribution if:
$$\begin{aligned} \beta _{j,i}^{(l)}(\theta ,\delta ,\lambda )=\theta ~w_l(\delta ,\lambda ) \end{aligned}$$
(18)
where:
$$\begin{aligned}&w_l(\delta ,\lambda )=\frac{(l+1)^{\frac{\delta }{1-\delta }}\lambda ^{l}}{\sum _{k=0}^\infty (k+1)^{\frac{\delta }{1-\delta }}\lambda ^{k}}\nonumber \\&\quad \theta \in {\mathbb {R}} \quad~~0\le \delta<1 \quad~~0\le \lambda <1 \end{aligned}$$
(19)
and we write:
$$\begin{aligned} \{\beta _{j,i}^{(l)}:l=0,1,\ldots ,\infty \}\sim \text {Gamma}(\theta ,\delta ,\lambda ) \end{aligned}$$
(20)
Parameters \(\delta\) and \(\lambda\) define the shape of the lag distribution, while parameter \(\theta\) defines its scale. Since \(\sum _{l=0}^\infty \beta _{j,i}^{(l)}=\theta\), we can interpret \(\theta\) as the long-term effect of \(X_i\) on \(X_j\). Note that the normalization constant in the denominator of formula (19) has no closed form, but the series converges and can be easily computed through numerical approximation. Further details on the Gamma lag distribution, including maximum likelihood estimation, can be found in Magrini (2021, forthcoming).
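Formula (19) can be evaluated numerically by truncating the normalization series. The following Python sketch shows one way to do so; the function name `gamma_lag_weights`, the truncation length `n_terms` and the evaluation on the log scale are choices made here for illustration and are not details taken from the paper. The parameter values in the usage lines are those of formula (21).

```python
import math


def gamma_lag_weights(delta, lam, n_lags, n_terms=5000):
    """Return w_0, ..., w_{n_lags} of formula (19), normalized numerically."""
    if not (0 <= delta < 1 and 0 <= lam < 1):
        raise ValueError("delta and lam must lie in [0, 1)")
    if lam == 0:
        # Only lag 0 carries weight, since lam**l vanishes for l >= 1.
        return [1.0] + [0.0] * n_lags
    a = delta / (1 - delta)
    n_terms = max(n_terms, n_lags + 1)
    # log of (k + 1)^a * lam^k, shifted by its maximum to avoid overflow.
    logs = [a * math.log(k + 1) + k * math.log(lam) for k in range(n_terms)]
    top = max(logs)
    scaled = [math.exp(x - top) for x in logs]
    norm = sum(scaled)  # truncated version of the infinite series
    return [s / norm for s in scaled[: n_lags + 1]]


theta = 0.32
w = gamma_lag_weights(delta=0.90, lam=0.70, n_lags=200)
beta = [theta * w_l for w_l in w]             # formula (18)
peak = max(range(len(w)), key=w.__getitem__)  # lag with the largest weight
print(peak, round(sum(beta), 4))              # 24 0.32
```

The coefficients sum to \(\theta\) up to truncation error, which is the long-term effect interpretation given above.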
We specified the effect of agricultural research expenditure (\(X_1\)) on agricultural productivity (\(X_2\)) as a Gamma lag distribution based on the empirical results for the United States of America in Alston et al. (2011):
$$\begin{aligned} \{\beta _{2,1}^{(l)}:l=0,1,\ldots ,\infty \}\sim \text {Gamma}(\theta =0.32,\delta =0.90,\lambda =0.70) \end{aligned}$$
(21)
This lag distribution has its mode (peak) at 24 years, its 99th percentile at 52 years and a long-term elasticity equal to 0.32, meaning that agricultural productivity is expected to have increased by 0.32% about 50 years after a 1% growth in agricultural research expenditure.
Unfortunately, estimates based on empirical evidence are not available for the lag distributions of the other direct effects in the DLRM. Thus, in order to illustrate our algorithm in a context as realistic as possible, we provided a provisional specification based on the Gamma family. Precisely, for each direct effect in the DLRM, we determined the Gamma lag distribution based on a guess of the long-term elasticity, median and 99th percentile, say \({\hat{\theta }}\), \({\hat{q}}_{0.5}\) and \({\hat{q}}_{0.99}\), respectively. To this end, we solved the following system of equations with respect to \(\theta\), \(\delta\) and \(\lambda\) through numerical approximation:
$$\begin{aligned} {\left\{ \begin{array}{ll} \theta = {\hat{\theta }} &{}\\ \sum\limits_{l=0}^{{\hat{q}}_{0.5}}w_l(\delta ,\lambda )=0.5 &{}\\ \sum\limits_{l=0}^{{\hat{q}}_{0.99}}w_l(\delta ,\lambda )=0.99 &{}\\ \end{array}\right. } \end{aligned}$$
(22)
The resulting lag distributions are shown in Table 7. Note that some Gamma lag distributions have a negative long-term elasticity reflecting an inverse relationship, for example the one between agricultural productivity (\(X_2\)) and unemployment in rural areas (\(X_3\)) (Fig. 7).
Table 7
Specification of the lag distribution of each direct effect in the DLRM with DAG in Fig. 6. Each lag distribution was determined by solving the system of equations in formula (22) based on a guess of long-term elasticity \(\theta\), median and 99th percentile. Mode, median and percentiles are expressed in time lags (years)

| Direct effect | \(\theta\) | \(\delta\) | \(\lambda\) | Mode | Median | 95% | 99% | 99.9% |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| \(\{\beta _{2,1}^{(l)}\}\) | 0.32 | 0.90 | 0.70 | 24.2 | 25.6 | 42.5 | 51.2 | 62.0 |
| \(\{\beta _{3,2}^{(l)}\}\) | −0.45 | 0.80 | 0.35 | 2.8 | 2.9 | 7.2 | 9.6 | 12.6 |
| \(\{\beta _{4,2}^{(l)}\}\) | 0.35 | 0.80 | 0.35 | 2.8 | 2.9 | 7.2 | 9.6 | 12.6 |
| \(\{\beta _{4,3}^{(l)}\}\) | −1.00 | 0.90 | 0.05 | 2.0 | 1.7 | 3.8 | 4.8 | 6.0 |
| \(\{\beta _{5,2}^{(l)}\}\) | −0.40 | 0.80 | 0.35 | 2.8 | 2.9 | 7.2 | 9.6 | 12.6 |
| \(\{\beta _{6,4}^{(l)}\}\) | −0.90 | 0.80 | 0.35 | 2.8 | 2.9 | 7.2 | 9.6 | 12.6 |
| \(\{\beta _{6,5}^{(l)}\}\) | 0.60 | 0.90 | 0.30 | 1.6 | 6.5 | 11.6 | 14.1 | 17.4 |
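The paper does not state which numerical method was used to solve the system in formula (22). As a rough illustration only, the Python sketch below searches a coarse grid of \((\delta ,\lambda )\) pairs and uses discrete quantiles, i.e., the smallest lag whose cumulative weight reaches the target probability. Both choices are assumptions made here: Table 7 reports fractional quantiles, so the original computation presumably interpolates between lags. The names `cumulative_weights`, `lag_quantile` and `fit_shape` are hypothetical.

```python
import math
from itertools import product


def cumulative_weights(delta, lam, n_terms=3000):
    """Cumulative sums of the Gamma lag weights w_l(delta, lam), truncated."""
    a = delta / (1 - delta)
    logs = [a * math.log(k + 1) + k * math.log(lam) for k in range(n_terms)]
    top = max(logs)
    scaled = [math.exp(x - top) for x in logs]
    norm = sum(scaled)
    total, cumulative = 0.0, []
    for s in scaled:
        total += s / norm
        cumulative.append(total)
    return cumulative


def lag_quantile(cumulative, p):
    """Smallest lag whose cumulative weight reaches probability p."""
    return next(l for l, c in enumerate(cumulative) if c >= p)


def fit_shape(median, q99, deltas, lams):
    """Grid search for a (delta, lam) pair matching a guessed median and 99th percentile."""
    def mismatch(pair):
        c = cumulative_weights(*pair)
        return abs(lag_quantile(c, 0.5) - median) + abs(lag_quantile(c, 0.99) - q99)
    return min(product(deltas, lams), key=mismatch)


deltas = [d / 10 for d in range(5, 10)]     # 0.5, 0.6, ..., 0.9
lams = [m / 100 for m in range(5, 100, 5)]  # 0.05, 0.10, ..., 0.95
delta, lam = fit_shape(median=3, q99=10, deltas=deltas, lams=lams)
```

Several grid points may match the same pair of discrete quantiles, so the returned pair need not coincide with the values in Table 7; a finer grid or a root-finding routine with interpolated quantiles would be needed for closer agreement.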

4.3 Mediation analysis

From the hypothesized causal structure in Fig. 6, we see that agricultural research expenditure (\(X_1\)) influences the at-risk-of-poverty rate in rural areas (\(X_6\)) through three indirect paths: (i) a first path \(<X_1,X_2,X_5,X_6>\), passing through agricultural productivity (\(X_2\)) and price of agricultural products (\(X_5\)); (ii) a second path \(<X_1,X_2,X_4,X_6>\), passing through agricultural productivity (\(X_2\)) and median family income in rural areas (\(X_4\)); (iii) a third path \(<X_1,X_2,X_3,X_4,X_6>\), passing through agricultural productivity (\(X_2\)), unemployment in rural areas (\(X_3\)) and median family income in rural areas (\(X_4\)).
It is possible to anticipate the duration and sign of any indirect effect in the DLRM on the basis of the duration and sign of the direct effects shown in Table 7. For example, the indirect effect of \(X_1\) on \(X_6\) passing through \(X_2\) and \(X_5\) is composed of the direct effect of: (i) \(X_1\) on \(X_2\), with positive sign and 99th percentile equal to 51.2 years; (ii) \(X_2\) on \(X_5\), with negative sign and 99th percentile equal to 9.6 years; (iii) \(X_5\) on \(X_6\), with positive sign and 99th percentile equal to 14.1 years. Thus, we conclude that the indirect effect of \(X_1\) on \(X_6\) passing through \(X_2\) and \(X_5\) has negative sign and 99th percentile approximately equal to \(51.2+9.6+14.1\approx 75\) years. Based on analogous arguments, we conclude that the indirect effect of \(X_1\) on \(X_6\) mediated by \(X_2\) and \(X_4\) has negative sign and 99th percentile approximately equal to 70 years, while the one mediated by \(X_2\), \(X_3\) and \(X_4\) has negative sign and 99th percentile approximately equal to 75 years.
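The arithmetic behind these anticipations can be reproduced with a few lines of Python. The sketch below simply multiplies the signs and adds the 99th percentiles of the direct effects in Table 7 along each static path, which is the rule of thumb used in the text; the dictionary `direct` and the function `anticipate` are names introduced here for illustration.

```python
from math import prod

# Sign and 99th percentile (years) of each direct effect, taken from Table 7;
# keys are (effect variable, cause variable).
direct = {
    (2, 1): (+1, 51.2),
    (3, 2): (-1, 9.6),
    (4, 2): (+1, 9.6),
    (4, 3): (-1, 4.8),
    (5, 2): (-1, 9.6),
    (6, 4): (-1, 9.6),
    (6, 5): (+1, 14.1),
}


def anticipate(path):
    """Return (sign, approximate 99th percentile) of the effect along a static path."""
    edges = [direct[(j, i)] for i, j in zip(path, path[1:])]
    sign = prod(s for s, _ in edges)
    duration = sum(q for _, q in edges)
    return sign, round(duration, 1)


for path in ([1, 2, 5, 6], [1, 2, 4, 6], [1, 2, 3, 4, 6]):
    print(path, anticipate(path))
```

The three paths give a negative sign with durations of 74.9, 70.4 and 75.2 years, which round to the 75, 70 and 75 years quoted above.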
The lag distributions of these three indirect effects and of the total effect, computed using Algorithm 1, are displayed in Fig. 8 and summarized in Table 8. Consistent with the anticipations above, the magnitudes of the effects become negligible above lag 75, as is apparent from the similarity between the cumulative effects up to 75 and up to 90 lags (fifth and sixth rows in Table 8). The sign of the effects is also negative as anticipated, meaning that agricultural research expenditure (\(X_1\)) is able to reduce the at-risk-of-poverty rate in rural areas (\(X_6\)). In particular, the cumulative total effect up to lags 30, 45 and 60 is equal to −0.0870, −0.2563 and −0.3029, respectively, meaning that a 1% increase in agricultural research expenditure (\(X_1\)) is expected to entail an overall decrease of 0.09%, 0.26% and 0.30% in the at-risk-of-poverty rate in rural areas (\(X_6\)) after 30, 45 and 60 years.
Table 8
Cumulative effects of \(X_1\) on \(X_6\) in the DLRM with DAG in Fig. 6 and coefficients in Table 7. \(\varPi _1=<X_1,X_2,X_5,X_6>\); \(\varPi _2=<X_1,X_2,X_4,X_6>\); \(\varPi _3=<X_1,X_2,X_3,X_4,X_6>\)

| Time lag | \({\text{IE}}(X_1,X_6;\varPi _1)\) | \({\text{IE}}(X_1,X_6;\varPi _2)\) | \({\text{IE}}(X_1,X_6;\varPi _3)\) | \({\text{TE}}(X_1,X_6)\) |
| --- | --- | --- | --- | --- |
| 0 to 15 | −0.0001 | −0.0006 | −0.0002 | −0.0009 |
| 0 to 30 | −0.0167 | −0.0362 | −0.0340 | −0.0870 |
| 0 to 45 | −0.0610 | −0.0883 | −0.1070 | −0.2563 |
| 0 to 60 | −0.0753 | −0.0999 | −0.1277 | −0.3029 |
| 0 to 75 | −0.0767 | −0.1007 | −0.1295 | −0.3069 |
| 0 to 90 | −0.0767 | −0.1007 | −0.1295 | −0.3069 |
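As a plausibility check on Table 8, the following Python sketch computes the lag distribution of each indirect effect as the discrete convolution of the direct-effect lag distributions along the path. This is not the paper's Algorithm 1: it rests on the assumption, consistent with the causal rows of Table 6, that in a strictly causal DLRM the effect along a path at lag \(l\) is the sum of the products of direct effects whose lags add up to \(l\). The function names and the truncation length `n_terms` are introduced here for illustration.

```python
import math


def gamma_lag(theta, delta, lam, n_lags, n_terms=3000):
    """Coefficients theta * w_l(delta, lam) for l = 0, ..., n_lags (formulas 18-19)."""
    a = delta / (1 - delta)
    logs = [a * math.log(k + 1) + k * math.log(lam) for k in range(n_terms)]
    top = max(logs)
    scaled = [math.exp(x - top) for x in logs]
    norm = sum(scaled)
    return [theta * s / norm for s in scaled[: n_lags + 1]]


def convolve(f, g):
    """Discrete convolution of two equal-length sequences, truncated to that length."""
    return [sum(f[k] * g[l - k] for k in range(l + 1)) for l in range(len(f))]


def path_effect(path, direct):
    """Lag distribution of the effect along a static path of variable indices."""
    effect = direct[(path[1], path[0])]
    for i, j in zip(path[1:], path[2:]):
        effect = convolve(effect, direct[(j, i)])
    return effect


N = 90
params = {  # (theta, delta, lambda) from Table 7, keyed by (effect, cause)
    (2, 1): (0.32, 0.90, 0.70), (3, 2): (-0.45, 0.80, 0.35),
    (4, 2): (0.35, 0.80, 0.35), (4, 3): (-1.00, 0.90, 0.05),
    (5, 2): (-0.40, 0.80, 0.35), (6, 4): (-0.90, 0.80, 0.35),
    (6, 5): (0.60, 0.90, 0.30),
}
direct = {edge: gamma_lag(*p, n_lags=N) for edge, p in params.items()}
paths = ([1, 2, 5, 6], [1, 2, 4, 6], [1, 2, 3, 4, 6])
effects = [path_effect(p, direct) for p in paths]
total = [sum(values) for values in zip(*effects)]
cumulative_total = sum(total)  # cumulative total effect over lags 0 to 90
print(round(cumulative_total, 3))
```

Under this assumption, the cumulative effect along each path approaches the product of the long-term elasticities \(\theta\) on its edges, i.e., \(-0.0768\), \(-0.1008\) and \(-0.1296\), whose sum \(-0.3072\) is close to the value \(-0.3069\) reported in the last row of Table 8.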

5 Concluding remarks

We have proposed an efficient DAG-free algorithm to perform mediation analysis in recursive systems of distributed-lag linear regressions, here called Distributed-lag Linear Recursive Models (DLRMs). The generality of our algorithm is illustrated by the empirical application to a simplified, but not trivial, real-world causal structure with paths composed of three or four edges and direct effects characterized by an infinite lag distribution.
The proposed algorithm works for DLRMs with a strictly causal interpretation, i.e., not including autoregressive effects. Nevertheless, the class of DLRMs supports autoregressive terms, although they only have a predictive interpretation, consistent with the concept of Granger-Sims causality, thus implying a partially causal model. Mediation analysis in the partially causal case is still possible by making use of DAG-based computation, but, compared to our DAG-free algorithm, the computational burden increases faster as higher time lags are considered. Future work will include a simulation study to assess the efficiency of DAG-based computation in the partially causal case compared to the strictly causal one, as well as the efficiency of our DAG-free algorithm compared to DAG-based computation.
Weak stationarity is a necessary condition for distributed-lag linear regression, so that coefficients depend only on the time lag and not on the time point. However, the class of DLRMs can be applied to non-stationary processes without loss of generality if the time series are preliminarily differenced to eliminate unit roots.
Linearity characterizing DLRMs is not a limitation from an empirical point of view, as it is possible to transform the variables through non-linear (but monotonic) functions while maintaining linearity with respect to parameters. For example, the logarithmic transformation of strictly positive variables is often used in empirical applications of linear regression to deal with power law relationships while maintaining interpretability of coefficients: on the logarithmic scale, absolute variations become percentage changes.
Recursivity of DLRMs, which amounts to the absence of correlation between the random errors, is a simplification that excludes processes with autocorrelated errors, like MA ones. However, note that, due to the equivalence between MA(1) and AR(\(\infty\)), the class of DLRMs could reasonably approximate processes with autocorrelated errors if autoregressive effects are modelled through an infinite parametric lag distribution, like the Gamma lag exploited in our empirical application.
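The equivalence invoked above can be written out explicitly. As a textbook reminder rather than a result of this paper, consider an MA(1) process with coefficient \(\phi\) (a symbol introduced here for illustration) and assume invertibility, i.e., \(|\phi |<1\). Solving recursively for the error term gives an autoregression of infinite order:
$$\begin{aligned}&X_t=\varepsilon _t+\phi \,\varepsilon _{t-1} \quad \Longrightarrow \quad \varepsilon _t=\sum _{l=0}^\infty (-\phi )^l X_{t-l}\nonumber \\&X_t=-\sum _{l=1}^\infty (-\phi )^l X_{t-l}+\varepsilon _t \end{aligned}$$
The autoregressive coefficients \(-(-\phi )^l\) decay geometrically in absolute value, which is why an infinite parametric lag distribution is a natural candidate for approximating them.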
Our proposal assumes that the DAG of a DLRM is known a priori, for instance because it has been specified based on domain knowledge. However, it is worth noting that, when domain knowledge is unavailable, the DAG can be estimated from data by making use of causal discovery algorithms for time series (see, for example, Deng et al. 2013; Peters et al. 2013; Malinsky and Spirtes 2018).
A further assumption of our proposal is that the coefficients are known a priori or have been consistently estimated from data, thus the problem of identifying causal effects is not addressed in this paper. However, coefficients are typically unknown in real-world applications of mediation analysis, and the presence of unmeasured variables and/or unmeasured relationships makes their estimation challenging. Thus, assessing the identification of causal effects is an important step to be performed before estimating the coefficients and applying our algorithm. Identification methods for non-parametric mediation models with longitudinally repeated measures and time-varying treatments (see, for example, VanderWeele and Tchetgen 2017; Park et al. 2018; Loh et al. 2019) represent a valuable resource for this purpose.

Declarations

Conflict of interest

The authors declare that there is no conflict of interests.

Human and animal rights

This article does not contain any studies with human or animal subjects.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References
Goldsmith, K., Chalder, T., White, P., Sharpe, M., Pickles, A.: Measurement error, time lag, unmeasured confounding: considerations for longitudinal estimation of the effect of a mediator in randomised clinical trials. Stat. Methods Med. Res. 27(6), 1615–1633 (2018). https://doi.org/10.1177/0962280216666111
Greene, W.H.: Econometric Analysis, 6th edn. Pearson, Upper Saddle River (2008)
Koopmans, T.C., Rubin, H., Leipnik, R.B.: Measuring the equation systems of dynamic economics. In: Koopmans, T.C. (ed.) Statistical Inference in Dynamic Economic Models, pp. 53–237. Wiley, Hoboken (1950)
Magrini, A.: A hill climbing algorithm for maximum likelihood estimation of the Gamma distributed-lag model with multiple explanatory variables. Aust. J. Stat. (2021, forthcoming)
Malinsky, D., Spirtes, P.: Causal structure learning from multivariate time series in settings with unmeasured confounding. In: Proceedings of the: ACM SIGKDD Workshop on Causal Discovery, London UK (2018)
Park, S., Steiner, P.M., Kaplan, D.: Identification and sensitivity analysis for average causal mediation effects with time-varying treatments and mediators: Investigating the underlying mechanisms of kindergarten retention policy. Psychometrika 83(2), 298–320 (2018). https://doi.org/10.1007/s11336-018-9606-0
Pearl, J.: Causality: Models, Reasoning, and Inference. Cambridge University Press, Cambridge (2000)
Peters, D.J., Janzing, D., Schölkopf, B.: Causal inference on time series using restricted structural equation models. In: Proceedings of the 26th International Conference on Neural Information Processing Systems (NIPS 2013), Lake Tahoe, US-NE, pp. 154–162 (2013)
Renkow, M.: Assessing the environmental impacts of CGIAR research: toward an analytical framework. In: Measuring the Environmental Impacts of Agricultural Research: Theory and Applications to CGIAR Research, Independent Science and Partnership Council, Rome, IT, pp. 1–33 (2011)
Metadata
Title: Mediation analysis in recursive systems of distributed-lag linear regressions
Author: Alessandro Magrini
Publication date: 08.07.2021
Publisher: Springer Netherlands
Published in: Quality & Quantity, Issue 3/2022
Print ISSN: 0033-5177
Electronic ISSN: 1573-7845
DOI: https://doi.org/10.1007/s11135-021-01194-8
