Topical Review | Open access

Revealing networks from dynamics: an introduction

Marc Timme and Jose Casadiego

Published 11 August 2014 © 2014 IOP Publishing Ltd
Citation: Marc Timme and Jose Casadiego 2014 J. Phys. A: Math. Theor. 47 343001
DOI 10.1088/1751-8113/47/34/343001

Abstract

What can we learn from the collective dynamics of a complex network about its interaction topology? Taking the perspective from nonlinear dynamics, we briefly review recent progress on how to infer structural connectivity (direct interactions) from accessing the dynamics of the units. Potential applications range from interaction networks in physics, to chemical and metabolic reactions, protein and gene regulatory networks as well as neural circuits in biology and electric power grids or wireless sensor networks in engineering. Moreover, we briefly mention some standard ways of inferring effective or functional connectivity.


Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Where are you linked?

1.1. Relating connection topology of a network to its dynamics

Networks are everywhere. And most of them are dynamic. From networks of biochemical reactions that regulate the metabolism in the cells of our bodies to the neuronal circuits in our brains, from social ties forming networks of our friendships and collaborators to the power grids and internet that provide huge amounts of electric energy and information every second. All of these systems form networks of units that interact to yield complex, collective forms of functions—and all are crucial to our everyday life.

The interaction topology of complex networks strongly impacts their collective dynamics and thus the function of the entire system. For many network dynamical systems, for instance in physics and biology, the dynamics of the individual units becomes more and more accessible, whereas their intricate web of interactions remains uncertain or even largely unknown. As an example, many constituents of protein and gene interaction networks are well characterized, but how they interact and which pathways are relevant for proper functioning is not well understood [1, 2]. In neuroscience, the number of units from which one can simultaneously measure neuronal activity is increasing rapidly from a few units to hundreds of them [3]. Still, identifying the synaptic connections of a neuronal circuit by anatomical methods is mostly restricted to individual synapses, and computer-aided reconstruction based on optical methods for more than two cells has become available only very recently [4–6]. In social networks, even in simple settings such as basic games, pairwise interactions are roughly understood, but often both the (temporally varying) interaction network and its collective consequences remain a riddle. Thus, reconstructing the structure of interaction networks from (only) the collection of local dynamical data constitutes a current open challenge, with applications across the natural and social sciences as well as engineering.

Network dynamics: forward versus inverse problem. Yet, the vast majority of research on network dynamics has focused on the 'forward direction' of modeling and asked what types of collective dynamics emerge from a network of given topology. Researchers from the natural sciences and engineering now systematically address the reverse questions, and at a rapidly increasing pace: how to control a network or, more generally, how to design networks for a desired dynamics (and thus function), and how to infer topology from dynamics. In particular in engineering, the design of systems for a specific function has always been core [7], and as systems become more complex, considering recurrently interacting networks becomes indispensable. Conversely, complex systems in physics and biology require a view on networked systems to understand how complex emergent phenomena contribute to (possibly optimal) system dynamics or function [8–12]. Finally, how one could perhaps redesign the collective dynamics of networks, e.g., gene and protein networks [13, 14], also poses challenges of frontier research.

The inverse problem of how to infer interaction topology from network dynamics constitutes the main topic of this review.

1.2. Aims and options of network inference

What do we aim to infer? Before addressing any inference problem, we have to clarify what we actually want to find out about the networked system and which level of detail we are interested in, see table 1. For instance, we may want to know effective or functional connectivity, not caring about individual interactions per se but only about the statistical dependences that the entire set of interactions yields between pairs of units through the collective network dynamics. Inter-unit correlations and various information-theoretic measures have been devised to solve such problems. These often neglect the temporal dynamics as they use temporal averages or statistical distributions of observables. Moreover, effective connectivity may depend on the current collective state and function of a given system, and thus the same physical network may display different effective topologies for different functions or in different states. We may alternatively want to know structural connectivity, i.e. which unit directly interacts with which other units (and how).

Table 1. Levels of interest. Which properties of the connections are we interested in? Each entry lists a distinction and examples thereof.

  • Structural versus effective: direct interaction or statistical dependence? Example: EEG versus connectomics for neural 'connectivity'.
  • Pure existence: presence or absence of links? Example: does one gene directly influence another?
  • Sign: positive, negative, or mixed-sign interaction? Examples: phase-advancing or phase-retarding (for coupled oscillators); activating or inhibiting (for gene and protein interactions).
  • Directedness: directed or undirected (bi-directional) interactions? Example: chemical versus electrical neuronal synapses.
  • Type of interaction: continuous time or discrete time; linear versus nonlinear? Examples: chemical versus electrical synapses; diffusive versus nonlinear coupling.
  • Time scales: instantaneous, temporally localized or extended? Example: slot communication (in mobile phones) versus genetic interactions.
  • Spatial scales: local, global, non-local, bounded? Example: message broadcasting, from wireless networks to cell tissue.

In this review, we focus on these direct physical interactions and address various levels of detail. We also briefly present basic methods to infer different types of effective connectivity. By construction, such a review cannot be complete, also because the field is currently developing at a breathtaking pace. We therefore select specific reconstruction methods from those that are commonly used, appear promising for the future of the field, or have been recently developed and form the basis of current research.

What can we learn about the connections of a network from accessing the units' dynamics? Mathematically, inferring the connectivity constitutes a high-dimensional inverse problem, and various methods have been devised to address this question. Every inference method starts from a different level of pre-knowledge about the system and has its own aims as to what to reconstruct, cf table 1. We may be interested in whether the interactions are directed or undirected, in whether or not a link exists, in the sign of the interaction dynamics, in the type of links (e.g., electric versus chemical synapses, diffusive versus nonlinear coupling), and in the strengths and the temporal and spatial scales of interactions.

Structural connectivity may be very different from effective or functional connectivity (figure 1). On the one hand, high correlations may exist between two units that are not directly connected but only influenced by each other via a third unit they both directly interact with. In general, such indirect interactions may be induced not only by one third node, but equally by the entire collective dynamics of a network. On the other hand, even a strong direct interaction between two units does not necessarily mean that their dynamics is highly correlated; correlations could be submerged, e.g., by external noise or recurrent inputs the two units receive, e.g., from two distinct other parts of the network, or even from outside of it.


Figure 1. Diverse dynamic impact of structural links on effective connectivity. (a) Structural links may not be detectable by certain correlation measures due to strong independent driving signals (e.g. noise). For example, strong inputs along two links (from within or outside the network, marked by arrows) may decorrelate the dynamics of the two nodes (gray disks) although they directly interact. (b) Common input may create an effective link without a structural link being present. For example, common input from a third node (open circle) may create effective connectivity (dashed line) between two nodes (gray disks) that are not directly connected. In both examples, (a) and (b), it may depend on the entire collective state of the network (and the external inputs it may receive) whether or not an effective connection is detected.


In general, effective and structural connectivity are related in a highly non-trivial way. In fact, a number of counter-intuitive phenomena have been observed in various systems. For instance, recent work on coupled oscillator networks highlights that under certain conditions noise may aid the reconstruction of structural connectivity from correlation-based effective connectivity [15]; and one may detect small-world features in the functional connectivity even if it is derived from randomly connected dynamical systems without any specific small-world features [16].

Where do we start from? It is important to clarify which knowledge about the networked system we presuppose. Is the collective dynamics known to be simple, such as converging to a fixed point of concentrations in biochemical networks or to a limit cycle in coupled oscillator networks? Or do we expect more complex, irregular, and perhaps unpredictable, chaotic or random types of dynamics? Can we change the dynamics of units by interfering with the system, or can we just observe? Does the research question require a global analysis in state space, or do we focus on a specific dynamical state where local analysis may suffice? We should answer these questions, among others, before using or developing any inference method, to achieve reconstruction at the level we need with the best quality and minimal effort, both experimentally and computationally.

Here we review recently developed approaches to inferring structural connectivity of a network from accessing its collective dynamics. The presented approaches assume various levels of pre-knowledge about the system and may or may not require the observer to interfere with the system. The article is structured as follows. We first clarify in section 2 what we mean by a network dynamical system, taking the view of continuous-time dynamics described by coupled ordinary differential equations. In sections 3–5, we explain three principally different classes of methods to obtain information about the structure of the network topology from dynamical quantities.

Section 3 illustrates basic theoretical approaches based on measuring and evaluating the response of a network to external perturbations or driving. As the response depends on both the external driving signal and the interaction topology of the network, sufficiently many driving-response experiments yield information about the entire network topology. A second class of methods sets up a model copy of the original system (section 4) and adapts the coupling matrix of the model so as to synchronize its dynamics with the original. If synchronization is achieved, the topology obtained for the model is taken as a proxy for the original. Finally, a set of direct methods (section 5) relies on measuring time series (or features thereof), evaluating temporal derivatives, and exploiting smoothness assumptions to find solutions to an optimization problem given by the restrictions imposed by the data.

We briefly comment on technical issues (section 6) and mention basic core approaches for identifying effective connectivity (section 7). These approaches rely on simple linear correlation, maximum entropy principles and related statistical inference methods. Finally, we provide an outlook (section 8) where we highlight current challenges, point out aspects sometimes overlooked and show potential research paths towards uncovering more of the topology of the networks that surround us.

This review is deliberately short but self-contained. It is intended for researchers mainly in the physical and biological sciences and engineering, and assumes basic knowledge of dynamical systems and probability theory. Let us start with the details.

2. Nonlinear dynamics of networks

2.1. Systems of ordinary differential equations describing networks

Throughout the main part of this review, we consider networks of units assumed to be described by systems of ordinary differential equations. Discrete-time maps coupled on a network are not discussed, but approaches similar to those presented here are typically viable in slightly modified forms. We discuss specific issues for systems of pulse-coupled units, such as spiking neurons, in section 5. These are formally hybrid systems, i.e. mixtures of continuous-time and discrete-time systems.

Assuming that interactions occur between pairs of coupled units, a generic network dynamical system is given by

$\dot{\boldsymbol{x}}_{i}(t)=\boldsymbol{f}_{i}(\boldsymbol{x}_{i}(t))+\sum _{j=1}^{N}J_{ij}\,\boldsymbol{g}_{ij}(\boldsymbol{x}_{i}(t),\boldsymbol{x}_{j}(t))+\boldsymbol{I}_{i}(t)+\boldsymbol{\xi }_{i}(t)$   Equation (1)

where i, j ∈ {1, 2, ..., N}, $\boldsymbol{x}_{i}(t)=[x_{i}^{(1)}(t),x_{i}^{(2)}(t),\ldots ,x_{i}^{(D)}(t)]^{\mathsf {T}}\in \mathbb {R}^{D}$ describes the state of the ith unit at time $t\in \mathbb {R}$, and the functions $\boldsymbol{f}_{i}:\,\mathbb {R}^{D}\rightarrow \mathbb {R}^{D}$ and $\boldsymbol{g}_{ij}:\,\mathbb {R}^{D}\times \mathbb {R}^{D}\rightarrow \mathbb {R}^{D}$ mediate the intrinsic and interaction dynamics of the D-dimensional units, respectively. The term Ii(t) represents a vector of external driving signals (possibly random) and ξi(t) symbolically represents external noise. Finally, the Jij define the interaction topology, in the simplest setting in terms of the adjacency matrix A such that Jij = Aij = 1 if there is a direct physical interaction from unit j to i and Jij = Aij = 0 otherwise. In general, the units' interactions may be of higher order, e.g. requiring terms like hijk(xi, xj, xk) added to the right-hand side of (1). For instance, in gene and protein interaction networks, a protein (a so-called transcription factor, say unit k) directly influences the rate of transcription of a gene (say, unit j) by binding to a DNA segment (say, unit i). We do not treat such terms here explicitly. Their relevance for network dynamical systems is discussed in [17].

For some methods to infer effective connectivity, the functional form of (1) does not directly enter the inference argument, other methods can be extended to include higher order terms explicitly. We comment on higher order terms where appropriate (cf also section 5).

2.2. Rescaling, simplifications, and common interactions

Some a priori technical issues. Considering (1) as our basic level of description, in case the dimension Di of the local dynamical system i depends on the unit i, we would just consider the maximal occurring dimension $D=\max _{i\in \lbrace 1,\ldots ,N\rbrace }D_{i}$ and for each unit ignore the $D-D_{i}$ dummy variables. This is done purely for notational simplification. Note further that in general, the quantity Jij is a D × D matrix of coupling strengths $J_{ij}^{dd^{\prime }}$, but that for many paradigmatic model systems, only one of these elements is non-zero, i.e. $J_{ij}^{dd^{\prime }}=J_{ij}^{dd^{\prime }}\delta _{d,d_{1}}\delta _{d^{\prime },d_{2}}$ such that $J_{ij}=J_{ij}^{d_{1}d_{2}}$ is sufficient to describe the influence of unit j on i.

Given a dynamical system of the form (1), the right-hand side is determined up to N × D additive constants $C_{i}^{d}$ and one overall multiplicative constant C0. Shifting the state variables $x_{i}^{d}$ to co-moving frames and rescaling time enables us to set $C_{i}^{d}=0$ for all i and all d and C0 = 1 without loss of generality.

For simplicity of presentation, we furthermore describe the methods below as if they were for networks of coupled one-dimensional units only. Often, different dimensions may be treated independently during reconstruction.

We thus take the coupled equations

$\dot{x}_{i}(t)=f_{i}(x_{i}(t))+\sum _{j=1}^{N}J_{ij}\,g_{ij}(x_{i}(t),x_{j}(t))+I_{i}(t)+\xi _{i}(t)$   Equation (2)

as our basic network characterization we start from, where now the variables xi and xj and the functions fi and gij are treated as real scalars.

Common interaction functions. Only non-trivial coupling terms, i.e. those that are not identically zero,

$g_{ij}(x_{i},x_{j})\not\equiv 0$   Equation (3)

for the relevant domain of arguments actually contribute to interactions, so we assume all coupling functions gij for which Jij ≠ 0 to be non-trivial in this sense. A broad range of systems exhibits diffusive coupling such that

$g_{ij}(x_{i},x_{j})=x_{j}-x_{i}$   Equation (4)

which is a special case of coupling functions

$g_{ij}(x_{i},x_{j})=\hat{g}_{ij}(x_{j}-x_{i})$   Equation (5)

that depend on state differences (e.g. phase differences for coupled oscillators) only. The simplest non-trivial form of interaction is linear coupling,

$g_{ij}(x_{i},x_{j})=x_{j}$   Equation (6)

which does not depend on the dynamical variable of the unit it influences.
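To make the setting concrete, here is a minimal numerical sketch (not from the review) of the scalar model (2) with linear local dynamics and diffusive coupling (4); the network size, coupling ensemble and integrator tolerances are illustrative choices only.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Toy instantiation of the scalar network model (2): local dynamics f_i(x) = -x,
# diffusive coupling g_ij(x_i, x_j) = x_j - x_i as in (4), no driving or noise.

rng = np.random.default_rng(0)
N = 5
J = (rng.random((N, N)) < 0.4).astype(float)   # random 0/1 coupling strengths J_ij
np.fill_diagonal(J, 0.0)                        # no self-coupling

def rhs(t, x):
    # element (i, j) of the bracket below is g_ij evaluated at the current state
    return -x + (J * (x[None, :] - x[:, None])).sum(axis=1)

sol = solve_ivp(rhs, (0.0, 10.0), rng.standard_normal(N), rtol=1e-8)
print(sol.y[:, -1])   # the states relax towards the globally stable fixed point x = 0
```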

We have now set the stage to dive into specific inference approaches.

3. Driving-response experiments

One idea for inferring network topology is to measure the collective response of a network dynamical system to driving by external signals. For instance, if a system exhibits a stable collective state (e.g. fixed point or periodic orbit, cf figure 2), it will relax back to that state after a transient input (pulse), if the latter is not too strong (so as not to kick the system out of the basin of attraction of the stable state) [18–22]. Here, the input signal (driving) effectively changes the initial condition of the system, leaving the system features (given by the local and interaction functions and their parameters) the same. Similarly, a stable state of the system will generically move in state space (and keep qualitatively the same stability properties) in response to sufficiently weak, temporally constant external perturbations. These kinds of perturbations effectively create a non-identical but similar system with different parameters determined by the driving signal.


Figure 2. Stable state approaches to network inference. (a) A constant external driving signal (parameter change) moves the stable fixed point (gray) in state space to another position (blue). The difference vector v (red) depends on both the driving signal and the network topology such that several measurements of v under different driving conditions yield information about the topology. (b) Same as (a) but for a transient perturbative signal: after the driving signal is switched off, the dynamics relaxes back (dashed blue trajectory) towards the original fixed point (gray). Now, the relaxation trajectory contains information about the topology.


Both the relaxation dynamics and the shift in state space in general depend not only on the external signal (which unit is perturbed, how and how strongly, i.e. known quantities), but also on the (unknown) interaction topology of the network. Each collective response of the system to an external perturbation yields a restriction on the network topology such that sufficiently many driving-response experiments may reveal the entire topology. In this section, we present the main ideas underlying several related driving-response approaches.

3.1. Restrictions from local fixed point analysis

Recent efforts to develop methods for identifying a network's topology have emerged from the need to understand biological, in particular gene regulatory, networks [1, 23–25]. Such networks consist of genes and proteins that interact with each other within a cell [26, 27]. These interactions in particular control (indirectly) the rates at which genes are transcribed into mRNA, and the regulatory features emerge via the interactions and generally do not follow from the single-gene level [28]. Gene regulatory networks and other reaction networks, e.g., in chemistry or population dynamics, are often modeled by nonlinear differential equations

$\dot{z}_{i}(t)=f_{i}(z_{1}(t),z_{2}(t),\ldots ,z_{N}(t);\boldsymbol{p},\tilde{\boldsymbol{I}})$   Equation (7)

describing the rate of change of the expected numbers (or concentrations) zi(t) of entities (e.g. genes, atoms and molecules or individual organisms) at time t in terms of their dependence on the numbers of other entities [27]. Here p is a set of parameters and $\tilde{\boldsymbol{I}}$ represents a set of external perturbations directed to the entities. The zi are typically positive real numbers, but mathematically there is no restriction for them to also be negative. Under the assumption that such a system is close to a steady state $\boldsymbol{z}^{*}=(z_{1}^{*},z_{2}^{*},\ldots ,z_{N}^{*})$, a stable fixed point where $f_{i}(z_{1},z_{2},\ldots ,z_{N};\boldsymbol{p},\tilde{\boldsymbol{I}})=0$ for all i, the dynamics for perturbations $x_{i}(t)=z_{i}(t)-z_{i}^{*}$ from the steady state concentrations of such nonlinear models may be approximated to first order in the xi by

$\dot{x}_{i}(t)=\sum _{j=1}^{N}J_{ij}\,x_{j}(t)+I_{i}(t)$   Equation (8)

where the local Jacobian Jij = (∂fi/∂zj)(z*) is the effective interaction matrix given the steady state and Ii(t) is assumed to be an external perturbation linearly coupling to deviations of variables xi. We remark that when deriving (8) from (7) we implicitly use the relation

$f_{i}(\boldsymbol{z}^{*}+\boldsymbol{x})=f_{i}(\boldsymbol{z}^{*})+\sum _{j=1}^{N}\frac{\partial f_{i}}{\partial z_{j}}(\boldsymbol{z}^{*})\,x_{j}+\mathcal{O}(\Vert \boldsymbol{x}\Vert ^{2})$   Equation (9)

ignoring the second and higher order terms.

3.1.1. Driving the system constantly to move a stable state (changing parameters)

The method presented now effectively moves the parameters and thus in particular the fixed points of a system. With applications to gene and protein interaction networks originally in mind, Gardner and coworkers [1, 29] have explicated that such reconstruction is possible, at least for small networks. We here discuss two approaches to using the general form (8) to reconstruct the interaction network, i.e. the Jij. The first is based on driving the system (8) by sufficiently small, temporally constant driving forces Ii(t) = Ii, m to new fixed points $x_{i,m}^{*}\ne 0$ close to the original one $x_{i}^{*}=0$. As the original fixed point was structurally stable, the new fixed point will generically exist and have qualitatively the same stability properties if the Ii, m are small enough (figure 2(a)). This is because solutions of (7), in particular fixed point solutions, typically vary continuously with changing parameters (here: changing Ii, m from zero) and there is no bifurcation close to the parameters yielding a generic stable fixed point $y_{i}^{*}$.

The values observed at each new steady state together with the (known) driving signals provide information about the interaction topology Jij. Performing several perturbation experiments m ∈ {1, ..., M} yields N × M equations

$0=\sum _{j=1}^{N}J_{ij}\,x_{j,m}^{*}+I_{i,m}$   Equation (10)

one for each experiment m and for each unit i.

After an arbitrary number M of experiments, equation (10) in matrix form becomes

$JX=Y$   Equation (11)

where $J\,{\in}\, \mathbb {R}^{N\times N}$ represents the connectivity among the units, $X\in \mathbb {R}^{N\times M}$ the steady state values with $X_{i,m}=x_{i,m}^{*}$, and $Y{\in \mathbb {R}}^{N\times M}$ the perturbations Yi, m = −Ii, m that we assume to be known. This matrix equation restricts the connectivity J given the measured data X and the input perturbations I. The matrix equations constraining the full network topology J can be split into N equations

$\boldsymbol{J}_{i}X=\boldsymbol{Y}_{i}$   Equation (12)

one for each input connectivity $\boldsymbol{J}_{i}:=(J_{i,1},\ldots ,J_{i,N})\in \mathbb {R}^{1\times N}$ of a unit i. Thus, the same set of data X restricts all the sets of units providing interactions to i ∈ {1, ..., N}, but the data $\boldsymbol{Y}{}_{i}=(Y_{i,1},\ldots ,Y_{i,M})^{\sf {T}}$ are unit dependent. This reduction to N individual equations also allows the computational effort for solving them to be split. The problem becomes trivially parallelizable because for different i, the restrictions (12) are independent in the sense that the reconstruction of the input coupling strengths to each unit i can be performed without taking care of the input coupling strengths of other units k ≠ i.

3.1.2. Observe relaxation to stable state after transient driving (changing initial conditions)

A second approach assumes that the quantities yi (and thus the xi) are perturbed such that at time t0 we have $x_{i}(t_{0})=x_{i}^{(0)}$ and the transient dynamics yi(t) of relaxation back to the original fixed point $y_{i}^{*}$ (and thus $x_{i}^{*}=0$) is observed at a sequence of times tm > t0, m ∈ {1, ..., M}. This yields the same type of equation (11), but now with $Y_{i,m}=\hat{\dot{x}}_{i,m}$ being estimates of the derivatives $\dot{x}{}_{i}(t_{m})$. We remark that these derivatives may be estimated in various ways, each of them requiring a resolution of the measured data on sufficiently small time scales, cf section 5.1. In this second approach, the different times at which the transient dynamics is measured replace the different driving experiments of the first approach. Finally, both approaches can of course be combined, with several experiments evaluated at several time points, again yielding the same form of restrictions (11).

3.2. Solving the restricting equations

How can we finally obtain the coupling elements Jij and thus the interaction network? In principle, solving the matrix equation (11) yields the interaction matrix J as a function of the known data X and Y. One may naively assume that it is directly solvable once the number of experiments equals the number of units in the network, M = N. However, this problem can be numerically ill-conditioned [30] for large N, such that the result is not reliable. In addition, as also stressed in [1], the results may be sensitive to noise in the measured data.

A way to overcome this problem is by performing (many) more experiments than there are nodes, M ≫ N, thus over-determining (11). In general, due to noise and measurement inaccuracies, this renders the system (12) inconsistent, such that there is no vector Ji that satisfies all constraints. It will still be possible to find a robust approximation $\boldsymbol{\hat{J}}_{i}$ that minimizes the error between the predicted dynamics JiX and the actual dynamics Yi for a given node. Specifically, this error function may be modeled as

$E_{i}(\boldsymbol{\hat{J}}_{i})=d(\boldsymbol{\hat{J}}_{i}X,\boldsymbol{Y}_{i})$   Equation (13)

where the distance measure $d(\boldsymbol{v},\boldsymbol{w})=\Vert {\boldsymbol v}-{\boldsymbol w}\Vert _{p}^{p}$ between two vectors $\boldsymbol{v},\boldsymbol{w}\in {\mathbb R}^M$ is commonly defined in terms of the pth power

$\Vert \boldsymbol{v}-\boldsymbol{w}\Vert _{p}^{p}=\sum _{m=1}^{M}|v_{m}-w_{m}|^{p}$   Equation (14)

of an Lp-norm with p ⩾ 1, due to its convexity properties. This guarantees that any local minimum of Ei is also a global minimum [31]. Particularly, the L2-minimization criterion,

$E_{i}(\boldsymbol{\hat{J}}_{i})=\Vert \boldsymbol{\hat{J}}_{i}X-\boldsymbol{Y}_{i}\Vert _{2}^{2}$   Equation (15)

is of great importance because it has an analytical solution for its extremum. Equating to zero the derivatives of the error function with respect to the matrix elements, $\frac{\partial }{\partial J_{ik}}E_{i}(\boldsymbol{\hat{J}}{}_{i})\stackrel{!}{=}0$, yields an analytical solution (see appendix A) to L2 error-minimization given by

$\boldsymbol{\hat{J}}_{i}=\boldsymbol{Y}_{i}X^{\mathsf {T}}(XX^{\mathsf {T}})^{-1}$   Equation (16)

Evaluating such equations for all i ∈ {1, ..., N} yields the complete reconstructed network $\hat{J}$. This mathematical form of the minimum L2-norm solution is implemented in many mathematical packages (e.g. as the mrdivide function in Matlab [32] or the LeastSquares function in Mathematica [33]). We explicate that the obtained off-diagonal terms $\hat{J}_{ij}$ serve as the best estimates (in the procedural sense using the L2-minimization above) for the coupling constants Jij; at the same time, the diagonal elements Jii are not relevant for the network topology because the influence of these terms on the dynamics of unit i is physically indistinguishable from an intrinsic drive to i included in the local dynamics specified by fi(xi) in (2), cf equation (3).
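As an illustration, the following hedged Python sketch applies the L2 solution (16) to noise-free steady-state data of a toy linear system (8); all sizes, the random ensemble and the variable names are our own choices, not those of [1, 29].

```python
import numpy as np

# Sketch: reconstruct J from the restrictions (11), J X = Y, via equation (16).
rng = np.random.default_rng(0)
N, M = 10, 40                                  # units and driving experiments, M >> N

# random sparse interaction matrix; the -identity keeps the fixed points stable
J = -np.eye(N) + 0.3 * rng.standard_normal((N, N)) * (rng.random((N, N)) < 0.2)

I = rng.standard_normal((N, M))                # known constant driving signals I_{i,m}
X = np.linalg.solve(J, -I)                     # new steady states from 0 = J x* + I, cf (10)
Y = -I                                         # so that J X = Y as in (11)

J_hat = Y @ X.T @ np.linalg.inv(X @ X.T)       # equation (16)
J_lsq = np.linalg.lstsq(X.T, Y.T, rcond=None)[0].T   # equivalent, numerically safer
print(np.max(np.abs(J_hat - J)))               # essentially zero for noise-free data
```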

Experimentally, it is in principle possible to over-determine a system of equations by performing repeated measurements on the network until the condition M ≫ N is achieved. Nevertheless, this may often be impractical for large networks due to the large number of experiments that would be required.

So, if the size of the network is an issue or the number of available measurements is insufficient to over-determine the system, we have M < N. Assuming that the network is sparse (i.e. that each unit is connected with a small number of others and thus many connection strengths are Jij = 0) may still allow recovering all network links. This assumption implies that several Jij are effectively set to zero, thereby decreasing the number of unknown coefficients to be solved for. It leaves us with the problem of finding which links are actually present and which are not. We present two related options to do so.

3.2.1. Sparse solution with a bounded connectivity per unit

Suppose only M < N experiments are available but we have a rough idea of how many nodes (at most) are connected to a particular node; in particular, assume that the number of incoming connections for node i is given by at most Ki < M [1]. This assumption shifts the system (12) from having more unknowns than constraints, M < N, to having more constraints than unknowns, M > Ki, therefore implicitly over-determining the system. It means that out of the N nodes present in the network, only Ki of them are chosen to be part of the system of equations for node i. Such an assumption may be made when there is some a priori information about the network's connectivity and dynamics.

Specifically, the system of equations (12) may in principle be rewritten as

$\boldsymbol{B}_{i}Z_{i}=\boldsymbol{Y}_{i}$   Equation (17)

where ${\boldsymbol B}_{i}\in \mathbb {R}^{1\times K_{i}}$ is the reduced connectivity vector for node i that contains the coupling strengths for the selected nodes and $\boldsymbol{Z}_{i}\in \mathbb {R}^{K_{i}\times M}$ is a matrix that contains the states of such nodes. If we knew which Ki of the N − 1 possible connections actually contributed, we could use equation (16) to solve (17) using L2-minimization yielding

$\hat{\boldsymbol{B}}_{i}=\boldsymbol{Y}_{i}Z_{i}^{\mathsf {T}}(Z_{i}Z_{i}^{\mathsf {T}})^{-1}$   Equation (18)

where $\hat{\boldsymbol{B}}_{i}$ is the best approximation to Bi.

Yet, it so far remains unclear which of the ${{N}\choose{K_{i}}}=\frac{N!}{K_{i}!(N-K_{i})!}$ possible combinations of incoming connections is best suited for reproducing the dynamics of i. In principle, $\hat{\boldsymbol{B}}_{i}$ may be calculated for each combination of Ki candidate units (e.g. genes), and the combination that yields the smallest value of the L2 norm $\Vert \hat{\boldsymbol{B}}_{i}\Vert _{2}$ in (18) may be chosen as the best estimate for Ji. The efficiency of such a procedure relies on the number Ki of interactions per node. Hence, choosing a proper Ki, aiming to recover the largest number of real interactions with the smallest number of false positives, is a key factor for achieving a successful topological reconstruction.
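A brute-force version of this combinatorial search may be sketched as follows; the selection by the smallest L2 norm of the estimate follows the criterion stated above, and all names are illustrative rather than taken from [1].

```python
import itertools
import numpy as np

def reconstruct_row(X, Y_i, K_i):
    """Bounded in-degree search (section 3.2.1). X: N x M states, Y_i: length-M
    data for node i, K_i: assumed maximal number of incoming links."""
    N, M = X.shape
    best_B, best_norm, best_support = None, np.inf, None
    for support in itertools.combinations(range(N), K_i):
        Z_i = X[list(support), :]                          # K_i x M reduced states, cf (17)
        B_i, *_ = np.linalg.lstsq(Z_i.T, Y_i, rcond=None)  # L2 solution of B_i Z_i = Y_i, cf (18)
        if np.linalg.norm(B_i) < best_norm:                # smallest ||B_i||_2 wins
            best_B, best_norm, best_support = B_i, np.linalg.norm(B_i), support
    J_i = np.zeros(N)
    J_i[list(best_support)] = best_B                       # zeros elsewhere: absent links
    return J_i
```

The number of candidate supports grows as the binomial coefficient above, so this exhaustive variant is only feasible for small Ki.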

3.2.2. Maximizing the sparseness of the connectivity matrix

If M < N and the Ki are unknown, cannot be estimated, or there are too many of them (making the combinatorial search practically impossible), maximizing the number of zero entries in J (i.e. minimizing the Ki and thus maximizing sparseness) may be a way to solve (12). This approach is particularly useful if the only a priori knowledge about the network's connectivity is some sparsity.

For general matrix equations

$A\boldsymbol{y}=\boldsymbol{b}$   Equation (19)

where $A\in \mathbb {R}^{m\times n}$, $\boldsymbol{y}\in \mathbb {R}^{n\times 1}$ and $\boldsymbol{b}\in \mathbb {R}^{m\times 1}$, singular value decomposition (SVD) of A according to

$A=U\Sigma V^{\mathsf {T}}$   Equation (20)

yields an analytic solution

$\boldsymbol{y}=V\tilde{\Sigma }U^{\mathsf {T}}\boldsymbol{b}+V\boldsymbol{c}$   Equation (21)

where $\tilde{\Sigma }=\Sigma ^{\mathsf {T}}(\Sigma \Sigma ^{\mathsf {T}})^{-1}$, and the vector $\boldsymbol{c}\in \mathbb {R}^{n\times 1}$ with ci = 0 for i ∈ {1, ..., r} and r = Rank(A) parametrizes the space of all solutions.

In our reconstruction problem, we are seeking to maximize the number of zero entries in J based on solving the restricting equations (12) for Ji as we solved (19) for y. Consider the transpose

$X^{\mathsf {T}}\boldsymbol{J}_{i}^{\mathsf {T}}=\boldsymbol{Y}_{i}^{\mathsf {T}}$   Equation (22)

of (12). The analogous SVD-based solution then reads

$\boldsymbol{J}_{i}^{\mathsf {T}}=V\tilde{\Sigma }U^{\mathsf {T}}\boldsymbol{Y}_{i}^{\mathsf {T}}+V\boldsymbol{c}$   Equation (23)

where $U\in \mathbb {R}^{M\times M}$, $V\in \mathbb {R}^{N\times N}$, $\tilde{\Sigma }\in \mathbb {R}^{N\times M}$ and $\boldsymbol{c}\in \mathbb {R}^{N\times 1}$ is a vector of remaining coefficients parametrizing the solution space. Thus, the set of all possible solutions for Ji is given by (23). The goal now is to pick the sparsest solution from this set. Therefore, equation (23) may be posed as the overdetermined (N > N − r) problem

$V\boldsymbol{c}\approx -V\tilde{\Sigma }U^{\mathsf {T}}\boldsymbol{Y}_{i}^{\mathsf {T}}$   Equation (24)

Minimizing the L1 error

$E_{i}^{(1)}(\boldsymbol{c})=\Vert V\tilde{\Sigma }U^{\mathsf {T}}\boldsymbol{Y}_{i}^{\mathsf {T}}+V\boldsymbol{c}\Vert _{1}$   Equation (25)

yields a sparse solution [31]. However, unlike the L2 minimization, the L1 minimization has no analytical solution, so choosing an appropriate iterative algorithm to solve it is essential. The Barrodale–Roberts algorithm [34] provides a particularly fast solver that has been widely used in the field of network reconstruction [10, 12, 29, 35].
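Lacking a Barrodale–Roberts implementation at hand, one may cast the same L1 minimization as a standard linear program; the following sketch does this with scipy and purely illustrative names (the specialized solver [34] addresses the identical problem, just faster).

```python
import numpy as np
from scipy.optimize import linprog

def l1_minimize(A, b):
    """Solve min_x ||A x - b||_1 as a linear program with auxiliary variables t:
    minimize sum(t) subject to -t <= A x - b <= t."""
    m, n = A.shape
    c = np.concatenate([np.zeros(n), np.ones(m)])          # objective: sum of t
    A_ub = np.block([[A, -np.eye(m)], [-A, -np.eye(m)]])   # the two one-sided constraints
    b_ub = np.concatenate([b, -b])
    bounds = [(None, None)] * n + [(0, None)] * m          # x free, t >= 0
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return res.x[:n]

# Hypothetical usage for (24)/(25): with U, Sigma_tilde, V from the SVD (20) of X^T,
# A would be the columns of V multiplying the free entries of c, and
# b = -V @ Sigma_tilde @ U.T @ Y_i per the parametrization (23).
```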

Remarks. The core equations (50) also provide the option to reconstruct network connectivity via maximizing the sparseness of the network, and there is a particular relation to what is known as compressive sensing, cf, e.g., [36].

In general, linearization of dynamical equations, e.g. linearizing in the state variables close to fixed points, often approximates nonlinear dynamics well. This seems to hold for gene regulatory networks [29] as well as for models of Drosophila segmentation networks [37] and may thus be of general use across systems. For gene and protein interaction networks, often single genes are selected for perturbations in an experiment, with the danger of providing non-generic restrictions in (12). Finally, for some systems increasing the number of experiments may reduce the resulting computational costs, such that this trade-off may be considered.

3.3. Driving the system's state to a fixed point

One may also infer network structure by externally driving the system to a fixed point and shifting component values of the fixed point for individual units [38, 39]. As before, the differences between pairs of steady-state responses are analyzed. Let us describe the network as

$\dot{x}_{i}=f_{i}(x_{i})+\sum _{j=1}^{N}A_{ij}\,g_{ij}(x_{i},x_{j})+I_{i}$   Equation (26)

where the Aij ∈ {0, 1} are the entries of the adjacency matrix specifying only if an interaction from j to i is present (Aij = 1) or not (Aij = 0), the gij(xi, xj) are the coupling functions from j ∈ {1, 2, ..., N} to i, and Ii is the driving signal applied to unit i. It was demonstrated by Yu and Parlitz [38] that under driving signals

$I_{i}(t)=\theta \,(\hat{x}_{i}-x_{i}(t))$   Equation (27)

with sufficiently large gain factor $\theta \in \mathbb {R}$ and Lipschitz continuous fi and gij, the network may be driven to a globally stable fixed point $\boldsymbol{x}^{*}:=(x_{1}^{*},x_{2}^{*},\ldots ,x_{N}^{*})^{\mathsf {T}}\in \mathbb {R^{N}}$ that is arbitrarily close to a predetermined point $\hat{\boldsymbol{x}}:=(\hat{x}_{1},\hat{x}_{2},\ldots ,\hat{x}_{N})^{\mathsf {T}}\in \mathbb {R^{N}}$, independent of the initial conditions [38]. At such a fixed point we have

$f_{i}(x_{i}^{*})+\sum _{j=1}^{N}A_{ij}\,g_{ij}(x_{i}^{*},x_{j}^{*})+\theta \,(\hat{x}_{i}-x_{i}^{*})=0$   Equation (28)

for all i. To understand how the network responds to changes in $\hat{\boldsymbol{x}}$, let us define

$\Delta _{i}:=x_{i}^{*}-\hat{x}_{i}$   Equation (29)

hence, equation (28) may be rewritten in terms of Δi as

$f_{i}(\hat{x}_{i}+\Delta _{i})+\sum _{j=1}^{N}A_{ij}\,g_{ij}(\hat{x}_{i}+\Delta _{i},\hat{x}_{j}+\Delta _{j})=\theta \,\Delta _{i}$   Equation (30)

The main idea at this point is to check whether unit k couples to unit i by evaluating how $x_{i}^{*}$ reacts to the shifting of $x_{k}^{*}$ through $\hat{x}_{k}$. Therefore, let us set

Equation (31)

and evaluate (30) at this point, yielding

Equation (32)

Shifting the same component twice to $\hat{x}_{k}^{(1)}$ and $\hat{x}_{k}^{(2)}$, resp., fixes a reference frame and thereby yields equations characterizing the difference between responses of a given unit i to a shifted fixed point component for unit k. It results in

Equation (33)

a condition that may be rewritten as

Equation (34)

We remark that the differences [Δik, 1 − Δik, 2] in (33) are not known but the general form (34) may be used to reveal whether they are zero, λik = 0, or not: given that we are dealing with entries of the adjacency matrix, we may infer two possible outcomes from (34), whether system k is coupled to i or not. Specifically,

Equation (35)

It was also demonstrated by Yu and Parlitz [38] that Δi decreases with θ if fi and gij are Lipschitz continuous. This permits discriminating whether there is a coupling between a pair k → i. In particular, when θ is sufficiently large, the |Sikθ| values may be classified into sets $\mathbb {I}_{0}$ and $\mathbb {I}_{1}$, the non-coupled and coupled sets, respectively. To construct such sets, Yu and Parlitz [38] propose to:

  • For fixed k, organize the |Sikθ| values in ascending order, i.e., the values should be arranged into a new series z where zk, j < zk, j + 1 < ... defines the indexing j of the zk, j.
  • Establish the critical values of each set. In this case, the critical values jc and jc + 1 define the end and the beginning of $\mathbb {I}_{0}$ and $\mathbb {I}_{1}$, respectively. Yu and Parlitz suggest finding jc by requiring the distance of any element of $\mathbb {I}_{1}$ from zk, 1 to be larger than twice the size of $\mathbb {I}_{0}$, i.e. $z_{k,j}-z_{k,1}\ge 2(z_{k,j_{c}}-z_{k,1})$ for all j > jc.

Finally, by performing this process on every unit, the topology of the network may be reconstructed.
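A hedged sketch of this classification step, following our reading of the gap criterion above (degenerate cases, e.g. a unit k that drives no one, would need an additional absolute threshold in practice):

```python
import numpy as np

def classify_targets(S_k):
    """Split the |S_ik(theta)| values for fixed driving unit k into the
    non-coupled set I0 and the coupled set I1; returns True where a link
    from k is inferred."""
    z = np.abs(np.asarray(S_k))
    order = np.argsort(z)                      # ascending: z_{k,1} <= z_{k,2} <= ...
    zs = z[order]
    for jc in range(len(zs) - 2, 0, -1):       # try the largest non-coupled set first
        # gap criterion: z_{k,j} - z_{k,1} >= 2 (z_{k,jc} - z_{k,1}) for all j > jc
        if np.all(zs[jc + 1:] - zs[0] >= 2.0 * (zs[jc] - zs[0])):
            coupled = np.zeros(len(zs), dtype=bool)
            coupled[order[jc + 1:]] = True
            return coupled
    return np.zeros(len(zs), dtype=bool)       # no clear gap: infer no couplings
```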

Remarks. The approach relies on the feasibility of (i) perturbing the systems in a specific manner, and (ii) measuring the steady states, suggesting that it is model independent to a large extent in that it does not in principle require knowledge of the local dynamics or coupling functions (yet it requires these to be Lipschitz continuous). These features may make the approach interesting under conditions where only little pre-knowledge about the system is available. Yet the approach requires substantial control over the system; in particular, the option of externally driving every unit (independently) constitutes a major requirement. The study [38] does not state how indirect actions are treated: for instance, how is an indirect effect from unit k via k' onto i distinguished from a direct interaction from k to i? Possibly, driving k may indirectly affect i only weakly, and this potentially second-order contribution could be treated in a perturbative way.

3.4. Distributed perturbations to collective periodic dynamics

The approaches presented above (sections 3.1–3.3) required the existence of fixed points, either in the original system or in the presence of sufficiently strong external driving. Yet, more complex dynamics prevails in a large range of biological, physical or artificial systems. The next simplest invariant dynamics are periodic orbits, which often arise as limit cycles of coupled oscillatory units, thus asking for a generalization beyond simple fixed point approaches. Even more complex dynamics, e.g. collective chaos, is treated by a direct approach below in section 5.1.

Is it possible to infer network topology from driving-response experiments also for oscillator networks? Below we positively answer this question, at the same time showing that distributed driving signals not precisely targeted to one or a few units are at least equally appropriate for inferring network topology. Several theoretical model studies of coupled oscillators [40–45] have shown that the response of single units in a network to constant or periodic driving signals, as well as the transient dynamics of synchronization, depend on the network topology. Some recent works [43, 44] helped us to understand the specific quantitative influence of structural features on the response and how the network response provides some information about the structure (and the driving signal). For instance, the magnitude of the responses seems to decay exponentially with distance from the driving node [43], and the coarse-scale connectivity among connected components may qualitatively determine to which degree the network dynamics is coordinated globally [44]. Further developing such insights, a follow-up work [10] presents a method of reconstructing network topology from systematic measurements of network responses to temporally constant, distributed driving signals in coupled phase-oscillator networks.

The basic idea is that any network displaying stable invariant dynamics, not just fixed points, yields a specific response to a given perturbation as a consequence of the network's topology and the perturbation itself [40, 42, 44], cf figure 3. If the perturbations are small, the invariant set is typically qualitatively unchanged and only slightly moved in state space. Keeping track of which driving signals resulted in which responses, we can collect evidence about the interactions among units in a network. Sufficiently many repetitions of appropriate driving-response experiments then yield the network's topology.


Figure 3. Topology revealed by driving a stable periodic orbit. As for the fixed point approaches discussed above (section 3.1 and figure 2), the difference vector v (red) depends on both the driving signal and the network topology such that several measurements of v under different driving conditions yield information about the topology.


Figure 4. Revealing network topologies from response dynamics for directed networks of phase-locked Kuramoto oscillators. Variables evolve according to $\dot{x}_{i}=\omega _{i}+k^{-1}\sum _{j=1}^{N}J_{ij}\sin (x_{j}-x_{i})$, with random frequencies ωi ∈ [0.1, 1] and k = 8 directed interactions of strengths $J_{ij}=k^{-1}$, randomly selected for each unit. Panels show reconstructed coupling matrices J for (a) N = 16 and M = 32, and (b) N = 64 and M = 32. The matrices are gray-coded from white ($\hat{J}_{ij}=0$) to black ($\hat{J}_{ij}=\max _{i^{\prime }j^{\prime }}\lbrace \hat{J}_{i^{\prime }j^{\prime }}\rbrace$). Insets: element-wise absolute difference $|J_{ij}^{{\rm derived}}-J_{ij}^{{\rm original}}|$, plotted on the same scale.


Weakly coupled limit cycle oscillators are well characterized by ignoring (in the long time limit) amplitude responses to coupling and by modeling them as phase-oscillators coupled via their phase differences only. A method to infer network topology for coupled phase-oscillators with arbitrary stable, phase-locked dynamics has been presented in [10]. One key observation is that the phase differences (yet not the phases themselves) in such systems converge with time, and that comparing differences of phase differences among different driving conditions yields restrictions on the network topology. The network dynamics is given by

$\dot{\phi }_{i}(t)=\omega _{i}+\sum _{j=1}^{N}J_{ij}\,g_{ij}(\phi _{j}(t)-\phi _{i}(t))+I_{i,m}$   Equation (36)

where ϕi(t) and ωi are the phase and natural frequency of oscillator i, respectively, Jij the connection strength from oscillator j to i and Ii, m is a temporally constant driving signal applied to i during the experiment m. We assume that in the absence of driving, Ii, m ≡ 0, the network is in a phase-locked state where $\dot{\phi }_{j}-\dot{\phi }_{i}=0$ for all i, j. We remark that one, several or all units may be perturbed during each given experiment, such that driving can be arbitrarily distributed and effectively changes the frequencies of the driven oscillators. As for the approaches relying on fixed points (section 3.1), the existence of a stable periodic orbit (and thus in particular a phase-locked state) implies that sufficiently small constant perturbations yield an (only slightly moved and slightly different) stable periodic orbit.

If for a given driving condition m, the dynamics becomes phase-locked, the phase differences

$\Delta _{ij,m}(t):=\phi _{j,m}(t)-\phi _{i,m}(t)$   Equation (37)

become constant in time, $\Delta _{ij,m}(t)\rightarrow \Delta _{ij,m}^{*}:=\lim _{t\rightarrow \infty }\left(\phi _{j,m}(t)-\phi _{i,m}(t)\right)$ because all oscillators move at the same collective frequency

$\Omega _{m}:=\lim _{t\rightarrow \infty }\dot{\phi }_{i,m}(t)$, independent of i.   Equation (38)

Hence, if the network is perturbed by a sufficiently small driving signal, the original phase-locked state (for Ii, m ≡ 0) is slightly moved such that $|\Delta _{ij,m}^{*}-\Delta _{ij,0}^{*}|\ll 1$ and there is a small difference between the perturbed and non-perturbed collective frequencies Ωm and Ω0. Defining the effective frequency difference $D_{i,m}:=\Omega _{m}-\Omega _{0}-I_{i,m}$ of oscillator i, and approximating the arbitrarily nonlinear coupling functions gij by a first order Taylor expansion around $\Delta _{ij,0}^{*}$, we obtain

$D_{i,m}=\sum _{j=1}^{N}\hat{J}_{ij}\,\theta _{j,m}$   Equation (39)

where θj, m := ϕj, m − ϕj, 0 is the phase shift and $\hat{J}$ is the Laplacian matrix of the network given by

$\hat{J}_{ij}=J_{ij}\,g_{ij}^{\prime }(\Delta _{ij,0}^{*})$ for $j\ne i$ and $\hat{J}_{ii}=-\sum _{j\ne i}J_{ij}\,g_{ij}^{\prime }(\Delta _{ij,0}^{*})$   Equation (40)

Now, identifying the matrices $X_{i,m}=\theta _{i,m}$ and $Y_{i,m}=D_{i,m}$, we have reduced the problem of identifying network topology using distributed perturbations in systems of limit cycle oscillators to solving the same linear algebraic equation (11).

As remarked in previous sections, several experiments are necessary in order to perform the reconstruction of $\hat{J}$. Therefore, from repeated measurements for different conditions it is possible to rewrite equation (39) in the form (11) in terms of $Y=D\in \mathbb {R}^{N\times M}$ and $X=\Theta \in \mathbb {R}^{N\times M}$ representing the differences between collective dynamics and phase shifts for each of the N systems during the M experiments.

Now, for sufficiently many experiments, i.e. M ≥ N, the reconstruction may be accomplished in principle but may be numerically ill-conditioned. In addition, for large N the many required experiments may not be practical. If this condition is not fulfilled, methods like setting a maximum connectivity per unit (section 3.2.1) or maximizing the sparseness of the connectivity matrix (section 3.2.2) may be applied in order to make the problem of retrieving $\hat{J}$ an overdetermined problem.
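An end-to-end numerical sketch of this procedure for sin-coupled phase oscillators might look as follows. All parameters are toy choices, whether the driven states actually phase-lock must be verified in practice, and the estimate targets the Laplacian-like matrix (40), not J itself.

```python
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(1)
N, M, k = 8, 24, 3
omega = rng.uniform(0.1, 1.0, N)
J = np.zeros((N, N))
for i in range(N):                              # k random incoming connections per node
    J[i, rng.choice([j for j in range(N) if j != i], k, replace=False)] = 1.0 / k

def rhs(t, phi, I):
    return omega + (J * np.sin(phi[None, :] - phi[:, None])).sum(axis=1) + I

def settle(I):
    """Relax under constant driving I; return referenced phases and collective frequency."""
    sol = solve_ivp(rhs, (0.0, 200.0), np.zeros(N), args=(I,), rtol=1e-9)
    phi = sol.y[:, -1]
    Omega = rhs(0.0, phi, I).mean()             # all dphi/dt agree once phase-locked
    return phi - phi[0], Omega                  # remove the free global phase

phi0, Omega0 = settle(np.zeros(N))
Theta, D = np.zeros((N, M)), np.zeros((N, M))
for m in range(M):
    I = 0.01 * rng.standard_normal(N)           # small distributed constant driving
    phi_m, Omega_m = settle(I)
    Theta[:, m] = phi_m - phi0                  # phase shifts theta_{j,m}
    D[:, m] = Omega_m - Omega0 - I              # effective frequency differences D_{i,m}

J_hat = np.linalg.lstsq(Theta.T, D.T, rcond=None)[0].T   # solve (39): J_hat Theta = D
J_hat[:, 0] = -J_hat[:, 1:].sum(axis=1)         # referencing to unit 0 hides this column;
                                                # restore it from the zero row sums of (40)
```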

As shown in [10], we may compare how accurate our prediction is by defining $J_{\max }:=\max _{i^{\prime }j^{\prime }}\big\lbrace \big|J_{i^{\prime }j^{\prime }}^{{\rm derived}}\big|,\big|J_{i^{\prime }j^{\prime }}^{{\rm original}}\big|\big\rbrace$, and using a relative difference defined as

$\Delta J_{ij}:=\frac{\big |J_{ij}^{{\rm derived}}-J_{ij}^{{\rm original}}\big |}{J_{\max }}$   Equation (41)

where ΔJij ∈ [0, 1] for all i, j. In addition, the quality of reconstruction Qα may be posed as the fraction

$Q_{\alpha }=\frac{1}{N^{2}}\sum _{i,j=1}^{N}H\big ((1-\alpha )-\Delta J_{ij}\big )$   Equation (42)

of connection strengths which are deemed correct. Here α ⩽ 1 is a constant employed to set the required accuracy of the predictions and H is the Heaviside function, H(x) = 1 for x ⩾ 0 and H(x) = 0 for x < 0. For instance, α = 0.95 means that the derived matrix has a normalized relative error (41) of at most 5%. Moreover, we may estimate the minimum number of experiments

$M_{\min }(q,\alpha ):=\min \lbrace M:Q_{\alpha }\ge q\rbrace$   Equation (43)

required for a reconstruction with a quality level q and with a prediction accuracy α. Figure 5 illustrates these measures for random networks of phase-locked Kuramoto oscillators for several random topologies and parameters.
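For completeness, the quality measures (41) and (42) are straightforward to evaluate; a minimal sketch:

```python
import numpy as np

def quality(J_derived, J_original, alpha=0.95):
    """Fraction of entries whose normalized difference (41) is within 1 - alpha."""
    J_max = max(np.abs(J_derived).max(), np.abs(J_original).max())
    dJ = np.abs(J_derived - J_original) / J_max     # equation (41), values in [0, 1]
    return np.mean(dJ <= 1.0 - alpha)               # equation (42)
```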


Figure 5. Quality of reconstruction and number of required experiments for reconstructing directed networks. Phase-locked Kuramoto oscillators with dynamics defined as in figure 4 with ki = 10 random incoming connections per node. (a) Quality of reconstruction at α = 0.95 for N = 24 (×), N = 36 (▵), N = 66 (○) and N = 96 (□). (b) Minimum number of experiments required for a reconstruction of quality level q = 0.98 and accuracy α = 0.95.


The driving response method may in principle be applied to a broad variety of problems involving stable dynamics. A model analogous to equation (39) could be inferred as long as the system may be linearized around a stable state, allowing one to retrieve the topology from the network responses as above. Yet, there may be practical problems. For instance, even for perturbations induced by constant driving signals, the invariant solution resulting from perturbations to more complex periodic orbits or other stable invariant sets may be describable only by time-dependent quantities (and not, e.g., by temporally constant phase differences), limiting the approach suggested above to specific classes of systems.

3.5. Features and restrictions

One common advantage of the approaches presented above is that their required computational effort scales well (weaker than linearly) with the system size N, such that at least moderately large systems appear accessible (cf figure 5). At the same time, the approaches are relatively simple to realize because they do not require knowledge of higher mathematics or computational approaches beyond a basic standard.

A possible route of generalization is to combine some of the above approaches. For instance, one may first drive a system to a stable fixed point as in section 3.1 and then apply small perturbations around that new point as in section 3.3.

Yet, all these approaches require the researchers to be able to access (measure and drive) the dynamics of all units in the system. Moreover, the local dynamics as well as the (approximate) form of the interactions typically need to be at least partially known. The collective dynamics suitable for the driving-response approaches described above also needs to be simple, in fact to exhibit a stable fixed point or periodic orbit, or to admit the system being driven to such a state. Finally, the presented inference of the existence of physical interactions and their functional form [46] seems well understood for networks of phase-oscillators, where perturbations of the oscillation amplitude decay on a faster time scale than the relaxation of the phases. It thus remains an open problem how to use a driving-response approach to properly infer the structural network connectivity of coupled oscillators in systems where the amplitude degrees of freedom play a role or are even dominant. More generally, systems exhibiting more complex dynamics, such as asynchronous chaotic activity, bifurcations, multistability or other prevalent features of high-dimensional, nonlinear systems, currently still prevent network reconstruction by the methods presented above.

These requirements severely restrict the range of applicability in practice to simple, well-accessible systems only. In particular for biological systems such as neural circuits or gene interaction networks, the dynamics are typically more complex, the systems are large, and it is still hard to implement controlled large-scale driving experiments on the single-unit level. Direct methods (section 5) that do not rely on driving the system seem to offer viable directions towards reconstructing networks with such more complex dynamics as well. How to exploit recorded time series from only a fraction of the units constitutes a currently open research question.

4. Copy-synchronization: adapting a model copy

Another way of reconstructing network topology of a given network is by adapting the topology of a second, model system, a network copy, such that it synchronizes with the original system. The idea is to update the model topology continuously until the copy system exhibits a dynamics identical to the original system; the rationale is that the final topology of the copy is likely to be the original topology [47].

Specifically, consider an (original) system of the form

$\dot{x}_{i}=f_{i}(x_{i})+\sum _{j=1}^{N}J_{ij}\,g_{ij}(x_{j})$   Equation (44)

where fi and gij are known and assumed to be Lipschitz continuous. This original system can in principle have arbitrary dynamics. Now, let us propose a model copy

$\dot{y}_{i}=f_{i}(y_{i})+\sum _{j=1}^{N}K_{ij}(t)\,g_{ij}(y_{j})+I_{i}(t)$   Equation (45)

where yi represents the state of the copy system, Ii(t) are control feedback signals and Kij(t) is the current coupling strength in the test system. In order to synchronize the copy with the original system, both the feedback signals Ii(t) and the coupling strengths Kij(t) evolve in time and depend on the states of the actual and the copy system. Defining the synchronization error

$e_{i}(t):=y_{i}(t)-x_{i}(t)$   Equation (46)

we adapt the coupling strengths in the model copy according to

$\dot{K}_{ij}(t)=-\gamma _{ij}\,e_{i}(t)\,g_{ij}(y_{j}(t))$   Equation (47)

and

$I_{i}(t)=-\alpha \,e_{i}(t)$   Equation (48)

where α > 0 and γij > 0. We remark that here the local dynamics fi(.) as well as the coupling functions gij(.) need to be known in order to set up the test copy. It was proven in [47] that under feedback signals (48) with sufficiently large α, the synchronization error ei decreases in time, $\dot{e}_{i}\le 0$ for all i, such that the two systems converge to a synchronized state. The rationale is that after synchronization, the copy topology equals that of the original network, Kij → Jij, cf figure 6. With minor modifications to the control signals Ii, this method also admits the reconstruction of networks and sub-networks in the presence of disturbances and modeling errors [47].
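A hedged Euler-integration sketch of the scheme (45)-(48) for the Kuramoto example of figure 6 follows; gains, step size and integration time are toy choices rather than values from [47], convergence of K towards J can be slow, and unique identifiability requires sufficiently rich (non-locked) dynamics, so the error e(t) should be monitored.

```python
import numpy as np

rng = np.random.default_rng(2)
N, k = 10, 3
alpha, gamma, dt, steps = 20.0, 2.0, 1e-2, 200_000
omega = rng.uniform(-1.0, 1.0, N)
J = np.zeros((N, N))
for i in range(N):
    targets = rng.choice([j for j in range(N) if j != i], k, replace=False)
    J[i, targets] = rng.uniform(0.5, 1.0, k)

x = rng.uniform(0.0, 2.0 * np.pi, N)    # original network; only its states are observed
y = np.zeros(N)                          # model copy
K = np.zeros((N, N))                     # adapting coupling estimates

def kuramoto(state, coupling):
    s = np.sin(state[None, :] - state[:, None])     # s[i, j] = sin(state_j - state_i)
    return omega + (coupling * s).sum(axis=1), s

for _ in range(steps):
    e = y - x                                        # synchronization error (46)
    dx, _ = kuramoto(x, J)                           # original dynamics
    dy, s_y = kuramoto(y, K)
    x += dt * dx
    y += dt * (dy - alpha * e)                       # copy with feedback, cf (45), (48)
    K += dt * (-gamma * e[:, None] * s_y)            # adaptation, cf (47), at the copy states
```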


Figure 6. Revealing network topology via copy-synchronization. Dynamics of reconstructed network for directionally coupled Kuramoto oscillators $\dot{x}_{i}=\omega _{i}+k^{-1}\sum _{j=1}^{N}J_{ij}\sin (x_{j}-x_{i})$ with random frequencies ωi ∈ [ − 1, 1], random coupling strengths Jij ∈ [0.5, 1], N = 10 and k = 3 random incoming connections per node. (a) Dynamics of the synchronization error between the original and the copy network measured as $e(t)=\sqrt{N^{-1}\sum _{i}e_{i}^{2}(t)}$. (b), (c), (d) and (e) show the inferred topologies at t = 0, t = 2, t = 500 and $t=10^{4}$. The matrices are gray-coded from white (Jij = 0) to black (Jij = 1). The insets depict the element-wise absolute differences $|J_{ij}^{{\rm derived}}-J_{ij}^{{\rm original}}|$, plotted on the same scale.


We remark that, to the best of our knowledge, there is no guarantee that the resulting topology of the copy system actually reflects that of the original network. In particular, symmetries might lead to ambiguities.

Further, the method based on copy-synchronization is model dependent, such that knowing the intrinsic and coupling functions of the units is vital, as in several parts of section 3. Its applicability has been explicated for sample networks of up to N = 16 nodes, but it remains unclear how to handle large networks, as on the order of N2 links Kij need to be co-evolved in time and a bound on the convergence times is currently missing. At the same time, the copy approach does not require perturbations to the original system, so experimental access to it need not include driving access to its units. Interestingly, Yu et al [47] highlight that the method may be useful to track changes in a network's connectivity in real time.

5. Direct approaches

In the previous two sections, we have introduced methods to infer network connectivity by either interfering with the system (driving-response approaches, section 3) or by setting up and synchronizing a second, model system (section 4). Both classes of methods work if certain requirements are met (in particular, the option to actively drive the system or the option to synchronize the copy with the original system, respectively). It remains a challenge to infer network topology from dynamics without such requirements.

5.1. Reconstruction by purely observing dynamics

Methods based on copy-synchronization (section 4) assume that the local dynamics fi as well as the interaction functions gij in (1) are known and that the gij do not depend on the state of unit i, gij(xj). Inferring the network connectivity, i.e. the Jij, then relies on the construction of a second, model network, with dynamics governed by equation (1) and network parameters $J^{\prime }_{ij}$ that are tuned to those of the real network by an error minimization procedure. As noted recently [12], one may solve the same reconstruction problem with significantly reduced effort and reduced requirements by evaluating the states and their time derivatives directly from the time series recorded from the original system. In particular, such a simple direct method [12] removes the need to set up and synchronize a second, copied system.

The idea is as follows. If the local dynamics and the coupling functions are known, their evaluations at different times are also known from recorded time series, and the only remaining unknown parameters in equation (1) are the coupling strengths, which are to be determined. Specifically, recording a time series xi(tm) at sufficiently closely spaced times $t_{m}\in \mathbb {R}$ and estimating its temporal derivatives yields the dynamics of the ith unit given by

$\dot{x}_{i}(t_{m})=f_{i}(x_{i}(t_{m}))+\sum _{j=1}^{N}J_{ij}\,g_{ij}(x_{i}(t_{m}),x_{j}(t_{m}))$   Equation (49)

If there are M such times, m ∈ {1, ..., M}, we have M equations of the form

$\dot{x}_{i,m}=f_{i,m}+\sum _{j=1}^{N}J_{ij}\,g_{ij,m}$   Equation (50)

with the abbreviations $\dot{x}_{i,m}:=\dot{x}_{i}(t_{m})$, fi, m := fi(xi(tm)) and $g_{ij,m}:=g_{ij}(x_{i}(t_{m}),x_{j}(t_{m}))$. Repeated evaluations of the equations of motion (49) of the system at different times tm thus comprise a simple and implicit restriction on the network topology Jij as follows: writing $Y_{i,m}=\dot{x}_{i,m}-f_{i,m}$ and the matrix $X_{i}=(g_{ij,m})_{j,m}\in \mathbb {R}^{N\times M}$, these equations constitute the matrix equation

Equation (51): $\boldsymbol{Y}_{i}=\boldsymbol{J}_{i}X_{i}$

where $\boldsymbol{Y}_{i}\in \mathbb {R}^{1\times M}$ and $\boldsymbol{J}_{i}\in \mathbb {R}^{1\times N}$ is the ith row of the interaction matrix J, comprising the vector (Jij)j ∈ {1, ..., N} of all input coupling strengths to unit i.

The main restricting equations (51) again have the same form as the generic restrictions (11) and thus may be solved analogously. In [12], Euclidean L2-norm minimization was used to infer the topology. Numerical tests show that reconstruction works well for transient as well as attractor dynamics, for simple as well as complex, chaotic units and collective states, and even in the presence of noise that substantially alters the dynamics and thus the recorded time series. Figure 7 illustrates successful reconstruction in the presence of various levels of noise.


Figure 7. Reconstructing a random network from complex chaotic dynamics. Example of N = 32 Lorenz oscillators with dynamics given by $\dot{x}_{i}=\sigma _{i}(y_{i}-x_{i})+\sum _{j=1}^{N}J_{ij}(x_{j}-x_{i})+\xi _{i}^{(x)},\,\dot{y}_{i}=x_{i}(\rho _{i}-z_{i})-y_{i}+\xi _{i}^{(y)},\,\dot{z}_{i}=x_{i}y_{i}-\beta _{i}z_{i}+\xi _{i}^{(z)}$, with unknown parameters Jij, σi, ρi, βi for all i, j, in the presence of external Gaussian white noise $\boldsymbol{\xi }_{i}$ of amplitude λ. (a) Dynamics of a unit in the absence (blue) and presence (black) of noise with λ = 5. Original (b) and reconstructed (c) connectivity, inferred parameters (f), and their errors compared to the original topology and parameters (e) and (g). (d) Receiver operating characteristics (ROC) of reconstruction under noise-free (blue) and noisy measurements (black and red) for λ ∈ {0.1, 1, 10}. Adapted from [12].


It is clear that certain types of dynamics do not allow topology inference. For instance, if all local dynamics are identical, fi ≡ f, all coupling functions are identical, gij ≡ g, and collectively the units are identically synchronous, i.e. all in the same states at the same times, there is no information about the connection topology one can possibly obtain from (only) recording time series, because for all strongly connected topologies, the collective dynamics would be identical.

Further theoretical considerations show that, following the same lines of analysis as above, all parameters occurring linearly in the local dynamics or coupling functions can also be reconstructed by the same error minimization based on equation (11). For instance, for the (fictional) coupling function $g_{ij}(x,y)=a\sin (xy)-\exp (b+yx)$, the parameters a and b need not be known but can be inferred as well (up to one constant prefactor for each pair (i, j) of nodes), because both a and exp (b) occur linearly in the equations of motion (49). In many physical systems, parameters actually do occur linearly. These include, for instance, the dynamics of coupled classical electric LCR circuits and the strengths of diffusive (4) or direct linear coupling (6). Moreover, widely used model systems for networks of neurons, such as leaky integrate-and-fire (LIF) and quadratic integrate-and-fire neurons [50], or networks of coupled chaotic systems such as Rössler or Lorenz systems [51, 52], exhibit all or at least most parameters occurring linearly or affinely. For all such systems, the local dynamics fi and coupling functions gij need not be fully known in advance.

Such a direct inference method [12] for network topology from complex collective dynamics thus serves as a simple starting strategy for connectivity reconstruction in a broad range of networked systems.
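
To make this concrete, the following minimal sketch (in Python, with illustrative parameter values only) applies the direct method to a small network of Kuramoto-type phase oscillators similar to those of figure 6, assuming the coupling function sin( · ) and the frequencies ωi are known:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, dt = 10, 400, 0.01

# Ground-truth network: 3 random incoming connections per node with
# strengths in [0.5, 1] (cf figure 6); unknown to the reconstruction step.
J_true = np.zeros((N, N))
for i in range(N):
    nbrs = rng.choice([j for j in range(N) if j != i], size=3, replace=False)
    J_true[i, nbrs] = rng.uniform(0.5, 1.0, size=3)
omega = rng.uniform(-1.0, 1.0, N)

def velocity(x):
    # dx_i/dt = omega_i + sum_j J_ij sin(x_j - x_i)
    return omega + np.einsum('ij,ij->i', J_true, np.sin(x[None, :] - x[:, None]))

# Record a trajectory (simple Euler steps, for illustration only).
X = np.empty((M + 1, N))
X[0] = rng.uniform(0.0, 2.0 * np.pi, N)
for m in range(M):
    X[m + 1] = X[m] + dt * velocity(X[m])

# Direct method: estimate derivatives by finite differences, then solve
# the restricting equations (51), Y_i = J_i X_i, by L2-norm minimization.
Xdot = (X[1:] - X[:-1]) / dt            # derivative estimates at times t_m
J_hat = np.zeros((N, N))
for i in range(N):
    G = np.sin(X[:-1] - X[:-1, [i]])    # G[m, j] = g_ij evaluated at t_m
    Y = Xdot[:, i] - omega[i]           # Y_{i,m} = xdot_{i,m} - f_{i,m}
    J_hat[i], *_ = np.linalg.lstsq(G, Y, rcond=None)

print(np.max(np.abs(J_hat - J_true)))   # near zero here; derivative estimation
                                        # errors dominate for real recordings
```

In practice, the quality of the derivative estimates limits the reconstruction accuracy, so sufficiently dense sampling (or higher-order derivative estimates) is needed.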

5.2. Pulse-coupling: reconstruction from spike patterns

Networks of spiking neurons or other pulse-coupled units are formally hybrid systems [53], because the continuous-time flow is interrupted at certain time points, where events (e.g. spike sending or reception) modify the dynamics [54, 55] via discrete time maps.

To start addressing the reconstruction problem for such hybrid systems, researchers have considered one of the simplest model networks of pulse-coupled units based on LIF models. Such network models are most broadly used in mathematical analysis and computational modeling. Each unit in such a network has one state variable, its membrane potential, which roughly emulates the leaky capacitive properties observed for membranes of biological neurons and is complemented by a threshold at which a pulse (action potential or spike [56]) is artificially created and the membrane potential is reset. Although these models lack certain biological details, such as a natural spike-generating mechanism, they are simple enough to be studied analytically and they have been useful for furthering our understanding of how information is processed among neurons [57].

Explicitly, the membrane potential Vi(t) of a LIF unit i changes as in an RC circuit (a resistor and a capacitor connected in parallel) according to

Equation (52): $\dot{V}_{i}(t)=\gamma _{i}\left(R_{i}I_{i}-V_{i}(t)\right)+\sum _{j=1}^{N}\sum _{m}J_{ij}\,\delta (t-t_{j,m}-\tau _{ij})$

where Ri, γi and Ii are the membrane resistance, inverse membrane time constant and the external current applied to neuron i, respectively. Once the potential of a unit j crosses a threshold, Vj(t) ⩾ VT, j, the potential is reset to Vj(t+) = VR, j and the unit emits a pulse that is transmitted to the neuron's post-synaptic neighbors [58]. The time of this pulse sending event is remembered as the unit's mth spike time tj, m := t. The collection of such pulses then defines the actions onto post-synaptic units i such that the quantity

Equation (53): $\sum _{j=1}^{N}\sum _{m}J_{ij}\,\delta (t-t_{j,m}-\tau _{ij})$

in (52) represents the pulses that unit i receives from the rest of the network. Here, Jij and τij are the synaptic coupling strength and the synaptic transmission delay from unit j to i, respectively. Furthermore, δ( · ) is the Dirac delta distribution, modeling a potential response kernel that is much faster than all other time scales involved.

The main question we address now is whether and how the network connectivity, as specified by the matrix of coupling strengths Jij, can be reconstructed if the pattern of spike times $\left(t_{j,m}\right)_{j\in \lbrace 1,\ldots ,N\rbrace ,m\in \mathbb {Z}}$ is given. We remark that we do not assume access to the natural state variables, the membrane potentials Vi(t), which may not be observable, but only to the spike times that are generated by the dynamics of these potentials. This difference constitutes the main novelty of the approaches presented in this subsection compared to those for continuous time state variables presented before.

First observe that (52) has an explicit solution [35] given by

Equation (54): $V_{i}(t)=R_{i}I_{i}+\left(V_{i}(t_{0})-R_{i}I_{i}\right)e^{-\gamma _{i}(t-t_{0})}+\sum _{j=1}^{N}\sum _{\lbrace m:\,t_{0}\le t_{j,m}+\tau _{ij}<t\rbrace }J_{ij}\,e^{-\gamma _{i}(t-t_{j,m}-\tau _{ij})}$

if the time interval [t0, t) lies in between two subsequent spikes of neuron i. The first sum is taken over all neurons while the second is taken over those spikes generated by the other neurons j ≠ i that affect i in the time interval [t0, t).

Based on this solution, we now present two distinct approaches to infer network connectivity, one direct and exact inference method assuming oscillatory units and one coarse approximate method based on stochastic optimization.

5.2.1. Direct topology inference assuming oscillatory units

Van Bussel et al [35] assume that all the parameters of (54), besides the synaptic couplings Jij, are known. The rationale is that delays and membrane time constants as well as a unit's equilibrium potential RiIi may be estimated beforehand. How could the coupling strengths and thus the structural network topology be inferred? If the currents Ii are sufficiently large such that RiIi > VT, i, the units are oscillatory, so they create spikes even without recurrent inputs from the remaining network. After crossing the threshold VT, i, unit i sends a spike to its post-synaptic neighbors and is reset to a resting value VR, i, so that the unit's state at the boundaries of each inter-spike interval [ti, k − 1, ti, k) is determined and one may take t0 = ti, k − 1 and t = ti, k. However, these threshold crossings may be induced in two manners at t = ti, k − 1: (a) by incoming excitatory spikes from other neurons such that Vi(t−) < VT, i but $V_{i}(t^{-})+\sum _{\left\lbrace j:\exists m:t_{j,m}+\tau _{ij}=t\right\rbrace }J_{ij}>V_{T,i}$, or (b) by the intrinsic oscillation of the unit such that Vi(t−) = VT, i without simultaneously incoming spike(s). In both cases, the membrane potential is at its reset value immediately after sending a spike. So identifying t0 = ti, k − 1 in (54), we have

Equation (55): $V_{i}(t)=R_{i}I_{i}+\left(V_{R,i}-R_{i}I_{i}\right)e^{-\gamma _{i}(t-t_{i,k-1})}+\sum _{j=1}^{N}\sum _{\lbrace m:\,t_{i,k-1}\le t_{j,m}+\tau _{ij}<t\rbrace }J_{ij}\,e^{-\gamma _{i}(t-t_{j,m}-\tau _{ij})}$

If the next spike is oscillation-induced (b), the membrane potential approaches its threshold value VT, i continuously such that in addition

Equation (56): $\lim _{t\nearrow t_{i,k}}V_{i}(t)=V_{T,i}$

where we now identified t = ti, k in (54). Thus, each oscillation-induced spike at some t = ti, k implies a linear restriction of the form (54) on the coupling strengths Jij, obtained by setting [t0, t) = [ti, k − 1, ti, k).

For M different inter-spike intervals obeying (55) and (56), this provides a linear system of equations restricting the coupling matrix. However, as van Bussel et al [35] remark, consecutive intervals may display similar patterns, such that it is often advisable to select M > N inter-spike intervals sufficiently separated in time, to minimize correlations between intervals and thus numerical inaccuracies. For each i, defining the subset of inter-spike intervals

Equation (57)

and

Equation (58)

equation (54) may be rewritten as

Equation (59): $\boldsymbol{Y}_{i}=X_{i}\boldsymbol{J}_{i}$

where Ji := (Jij)j ∈ {1, ..., N} and $X_{i}\in \mathbb {R}^{M\times N}$ and $\boldsymbol{Y}_{i}\in \mathbb {R}^{M\times 1}$ are given by

Equation (60)

and

Equation (61)

Repeating the same process for all units i yields the topology of the whole network. As shown in section 3.2, the over-determined problem, M > N, and the under-determined one, M < N, may be solved by minimizing the L2-norm or maximizing the sparseness of the network, respectively. A major limitation is that the presented approach requires a large amount of prior knowledge about the system, such as the time delays between units and their time constants, among others. Given this knowledge, the approach is capable of inferring large networks of neurons, as the computation time as well as the number M of required inter-spike intervals to be evaluated scale linearly with system size [35, 59].
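
As an illustration of how such a restricting system might be assembled in practice, consider the following sketch; it assumes the explicit LIF solution of the form given in (54) above, and all variable names (spike lists, parameter arrays, the selected intervals) are hypothetical:

```python
import numpy as np

def restriction_system(i, spikes, tau, gamma, RI, V_T, V_R, intervals):
    """Assemble X_i and Y_i of (59) for unit i from observed spike times.

    spikes[j] : sorted array of spike times of unit j
    intervals : oscillation-induced inter-spike intervals (t0, t1) of unit i,
                selected to be well separated in time
    """
    N = len(spikes)
    X = np.zeros((len(intervals), N))
    Y = np.empty(len(intervals))
    for k, (t0, t1) in enumerate(intervals):
        for j in range(N):
            # pulses of unit j arriving at unit i within [t0, t1)
            arrivals = spikes[j] + tau[i, j]
            arr = arrivals[(arrivals >= t0) & (arrivals < t1)]
            X[k, j] = np.sum(np.exp(-gamma[i] * (t1 - arr)))
        # threshold condition (56), with the pulse-free evolution from the
        # reset value, cf. (55), moved to the right-hand side
        Y[k] = V_T[i] - RI[i] - (V_R[i] - RI[i]) * np.exp(-gamma[i] * (t1 - t0))
    return X, Y

# The input couplings of unit i then follow from (59), e.g. for M > N by
# L2-norm minimization:  J_i, *_ = np.linalg.lstsq(X, Y, rcond=None)
```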

5.2.2. Stochastic optimization of all system parameters

In a complementary study, Makarov et al [60] showed how stochastic optimization of all system parameters

Equation (62)

given the spike trains for all neurons in the network may yield a network topology roughly consistent with the actual one.

The idea is to evaluate the predicted inter-spike interval durations $\bar{T}_{i,m}(\boldsymbol{p})$ from a test set of parameters and to optimize those parameters to reproduce a given (measured) spike train as closely as possible. Thus, minimizing the difference between the predicted and measured spike trains is vital for this method. The authors in particular applied their idea also to recordings of biological neurons.

As equation (54) yields transcendental relations for the $\bar{T}_{i,m}(\boldsymbol{p})$, their estimates need to be determined numerically. Thus, finding

Equation (63): $\boldsymbol{p}_{i}^{*}={\rm arg}\,\min _{\boldsymbol{p}_{i}}\sum _{m}\big(\bar{T}_{i,m}(\boldsymbol{p}_{i})-T_{i,m}\big)^{2}$, where the Ti, m are the measured inter-spike intervals,

yields the best (in Euclidean distance norm) solution for the set of parameters.

Briefly, to find the minimum $\boldsymbol{p}_{i}^{*}$ in (63) one must explore the parameter space. This means that the solution is found by iteratively choosing random values for pi and comparing the values of (63) between consecutive iterations. By relating the changes in (63) to the changes in pi, one may establish directions in which the minimum may be found by gradient descent. However, it is also advisable to change directions in the parameter space randomly, mainly because there may be better solutions for $\boldsymbol{p}_{i}^{*}$ in regions of the parameter space where they are not expected to exist [61]. Makarov et al [60] thus applied stochastic optimization for searching the minimum.
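
A generic random-search sketch of this type of optimization (not the specific algorithm of [60]) could look as follows; the cost function is assumed to return the summed squared differences between predicted and measured inter-spike intervals as in (63):

```python
import numpy as np

def stochastic_search(cost, p0, n_iter=20000, step=0.05, seed=0):
    """Random-search minimization of a spike-train mismatch cost, cf. (63).

    cost(p) is assumed to return sum_m (Tbar_{i,m}(p) - T_{i,m})**2, with the
    predicted inter-spike intervals Tbar_{i,m}(p) evaluated numerically."""
    rng = np.random.default_rng(seed)
    p, c = p0.copy(), cost(p0)
    best_p, best_c = p.copy(), c
    for _ in range(n_iter):
        trial = p + step * rng.standard_normal(p.shape)
        c_trial = cost(trial)
        # accept improvements, but occasionally also a random (worse) move,
        # since better solutions may hide where they are not expected [61]
        if c_trial < c or rng.random() < 0.01:
            p, c = trial, c_trial
        if c_trial < best_c:
            best_p, best_c = trial.copy(), c_trial
    return best_p, best_c
```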

As a strong requirement, this method needs the number of recorded spikes to satisfy M ≫ 2N; as noted in [60], robust regression models are more suitable to handle this kind of problem, and special care with the inter-spike intervals must be taken as, e.g., spike bursts may lead to false intervals.

5.3. Features and limitations

The stochastic optimization approach provides a generic way of finding best-fitting parameters and thus here, potentially, a well-fitting network; yet, it is computationally demanding and simultaneously requires many recordings compared to the size of the network. In contrast, the direct approach assumes a large amount of prior knowledge, in particular regarding the units' parameters. In addition, because the data sets are pre-processed (i.e. appropriate time intervals are chosen), the minimization problem no longer involves transcendental relations. This leads to a considerable increase in computational performance and is the key factor that makes the method suitable for large networks. We remark that, in our opinion, both methods are currently not suitable to reliably infer network topology from real recordings of spike data, for several reasons: for instance, other network-external sources of spike generation, stochastic fluctuations due to intrinsic noise and the highly specific conditions required for reconstruction still limit the methods' applicability. Finally, both approaches assume linearity of interactions, yet it is known that interactions can be nonlinear, e.g. due to dendritic spikes in response to sufficiently strong, simultaneous inputs to single dendritic branches of a neuron [62–64].

6. Technical issues

Let us briefly comment on a few technical issues related to the structural inference approaches discussed above. We mention how some of them may be directly transferred to settings where discrete time dynamics describes the system (section 6.1), discuss issues when measuring data from real, e.g. biological, networks (section 6.2), show how to pose L1-norm minimization as a linear program and remark on L1 versus L2 minimization (sections 6.3 and 6.4) and finally discuss what happens for hyper-networks, where more than two-point interactions influence the collective state (section 6.5).

6.1. Discrete time maps

If the dynamical system is described as a network of coupled discrete time maps instead of by coupled differential equations (2), approaches similar to the ones presented above are often viable in slightly modified form. For instance, the core equation becomes (for one-dimensional local units)

Equation (64): $x_{i}(t+1)=f_{i}(x_{i}(t))+\sum _{j=1}^{N}J_{ij}\,g_{ij}(x_{i}(t),x_{j}(t))$

where now the time $t\in \mathbb {Z}$ is an integer. Here, the direct method of subsection 5.1 is immediately applicable, even with the additional advantage that no temporal derivatives need to be estimated such that an observed time series (xi(t))i ∈ {1, ..., N}, t ∈ {1, ..., M} enters the inference equations without approximations. Of course, that time series may still contain substantial measurement errors that propagate into any solution of the inference problem.
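
As an illustration, the following minimal sketch reconstructs a network of coupled logistic maps with direct linear coupling, gij(xi, xj) = xj; the map, the parameter values and the coupling density are chosen for illustration only:

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, r = 10, 300, 3.6

# Weak, sparse, non-negative couplings keep the states inside [0, 1].
J_true = rng.uniform(0.0, 0.008, (N, N)) * (rng.random((N, N)) < 0.3)
np.fill_diagonal(J_true, 0.0)

# Coupled logistic maps: x_i(t+1) = r x_i(t)(1 - x_i(t)) + sum_j J_ij x_j(t)
X = np.empty((M + 1, N))
X[0] = rng.uniform(0.2, 0.8, N)
for t in range(M):
    X[t + 1] = r * X[t] * (1.0 - X[t]) + X[t] @ J_true.T

# No derivative estimation needed: the series enters (51) directly.
J_hat = np.zeros((N, N))
for i in range(N):
    Y = X[1:, i] - r * X[:-1, i] * (1.0 - X[:-1, i])   # Y_{i,t} = x_i(t+1) - f_i
    J_hat[i], *_ = np.linalg.lstsq(X[:-1], Y, rcond=None)

print(np.max(np.abs(J_hat - J_true)))   # essentially exact for noise-free data
```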

6.2. Sampling dynamics of real networks

When sampling real-life phenomena, measurement constraints may arise due to the nature of the given phenomena, making their study a major challenge. For instance, in EEG analysis, a group of sensors is employed to measure brain activity. These recordings help in finding functional connections among regions of the brain. Basically, each sensor measures the activity of a population of neurons below its active area for a certain amount of time. Then, by analyzing the correlations among the time series of the sensors, the interaction network is constructed. Nevertheless, as remarked by Bialonski in [65], special considerations regarding the spatial and temporal sampling must be taken into account. A simple constraint is that sensors placed too close to each other may record overlapping activity from neighboring populations, making it more difficult to discern direct from indirect interactions. Moreover, if the size and frequency of the sampling fail to capture the intrinsic time scales, false functional connections in the network may be constructed. Clearly this example is particularly relevant for inferring effective connectivities, as discussed in section 7. As an example regarding the time domain, gene and protein interaction networks often still have a very limited number of sampling points available [66, 67], such that a network inference problem may become drastically under-determined. Additional information, such as the known existence of certain interactions in such a network (but not the existence of others), sometimes is available, but how precisely to use it is currently unknown and requires further method development. So briefly, when studying real networks, special technical considerations should be taken into account to actually use what is measured (the data) for what one wants to know about the system, cf section 1.2.

6.3. L1 norm minimization as a linear program

Given the optimization problem

Equation (65): $\min _{\boldsymbol{y}}\left\Vert A\boldsymbol{y}-\boldsymbol{b}\right\Vert _{1}$

where $A\in \mathbb {R}^{M\times N}$, $\boldsymbol{y}\in \mathbb {R}^{N}$ and $\boldsymbol{b}\in \mathbb {R}^{M}$, (65) may be posed as [31]

Equation (66): $\min _{\boldsymbol{y},\boldsymbol{s}}\ \boldsymbol{1}^{\mathsf {T}}\boldsymbol{s}$ subject to $A\boldsymbol{y}-\boldsymbol{b}\preceq \boldsymbol{s}$ and $A\boldsymbol{y}-\boldsymbol{b}\succeq -\boldsymbol{s}$

where $\boldsymbol{1}\in \mathbb {R}^{M}$ is a vector of ones and ⪯ and ⪰ denote entry-wise comparison and ${\boldsymbol s}$ is an auxiliary variable. To solve (66), one has to solve the linear program

Equation (67): $\min _{\boldsymbol{z}}\ \boldsymbol{c}^{\mathsf {T}}\boldsymbol{z}$ subject to $\tilde{A}\boldsymbol{z}\preceq \tilde{\boldsymbol{b}}$

where

Equation (68): $\boldsymbol{z}=\begin{pmatrix}\boldsymbol{y}\\ \boldsymbol{s}\end{pmatrix},\quad \boldsymbol{c}=\begin{pmatrix}\boldsymbol{0}\\ \boldsymbol{1}\end{pmatrix},\quad \tilde{A}=\begin{pmatrix}A & -I\\ -A & -I\end{pmatrix},\quad \tilde{\boldsymbol{b}}=\begin{pmatrix}\boldsymbol{b}\\ -\boldsymbol{b}\end{pmatrix}$

and $I\in \mathbb {R}^{M\times M}$ and $\boldsymbol{0}\in \mathbb {R}^{N}$ are the identity matrix and a vector of zeroes, respectively. The advantage of posing problem (65) as (67) is that the latter can be easily solved in a standard way using any solver for linear programs (e.g. the linprog function in MATLAB [32]).
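
For illustration, the same reformulation may be implemented with a freely available solver, e.g. scipy.optimize.linprog in Python; a minimal sketch, with the blocks stacked as in (68):

```python
import numpy as np
from scipy.optimize import linprog

def l1_residual_min(A, b):
    """Solve min_y ||A y - b||_1 via the linear program (66)-(68)."""
    M, N = A.shape
    c = np.concatenate([np.zeros(N), np.ones(M)])        # objective: 1^T s
    # constraints  A y - b <= s  and  -(A y - b) <= s, stacked as in (68)
    A_ub = np.block([[A, -np.eye(M)], [-A, -np.eye(M)]])
    b_ub = np.concatenate([b, -b])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * (N + M))       # y and s unbounded
    return res.x[:N]
```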

6.4. L2  versus L1 norm minimization

The need to choose a minimization scheme may turn the reconstruction problem into a great challenge, because different schemes may yield different solutions, thus forcing us to discern which scheme is best suited for a specific reconstruction problem. As illustrated in [1, 10, 29, 35, 59, 60, 68], these differences between minimizers may be exploited in certain situations. For instance, the L2 minimizer finds the closest solution in the L2-norm and moreover has an analytic solution. Yet, given the nature of the minimizers (see [68]), an L1 minimizer is suited for finding particular sparse solutions and is more robust to outliers than L2, so it might be seen as more useful for applications. However, minimizing an L1 norm is computationally more costly than minimizing an L2 norm and it may have more than one solution [31]. We note that ||p||2 ⩽ ||p||1 for any vector $\boldsymbol{p}\in \mathbb {R}^{N}$.
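
The following toy example contrasts the two minimizers on an under-determined system; it uses the equality-constrained variant min ||y||1 subject to Ay = b (the sparsity-seeking formulation relevant for network reconstruction, cf appendix B) rather than the residual formulation (65), and all sizes are illustrative:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(2)
M, N = 20, 50                        # fewer restrictions than unknowns
A = rng.standard_normal((M, N))
y_true = np.zeros(N)
y_true[rng.choice(N, size=3, replace=False)] = rng.standard_normal(3)
b = A @ y_true                       # data generated by a sparse 'network'

y_l2 = np.linalg.pinv(A) @ b         # minimal L2-norm solution: typically dense

# minimal L1-norm solution: min 1^T s  s.t.  -s <= y <= s,  A y = b
c = np.concatenate([np.zeros(N), np.ones(N)])
A_ub = np.block([[np.eye(N), -np.eye(N)], [-np.eye(N), -np.eye(N)]])
res = linprog(c, A_ub=A_ub, b_ub=np.zeros(2 * N),
              A_eq=np.hstack([A, np.zeros((M, N))]), b_eq=b,
              bounds=[(None, None)] * (2 * N))
y_l1 = res.x[:N]

# count effectively non-zero entries: L2 tends to spread weight over all
# entries, while L1 typically recovers the sparse y_true here
print(np.sum(np.abs(y_l2) > 1e-6), np.sum(np.abs(y_l1) > 1e-6))
```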

6.5. Hypernetworks: beyond two-point interactions

We have explicitly excluded from our general mathematical description (2) networks with more than two-unit interactions, as well as dynamical networks with temporally changing connections. Those may occur, for instance, in networks of computers or other communication networks in engineering, where inputs from several units that give input to the same other unit are nonlinearly processed (for instance, multiplied) to change the latter unit's state. Similarly, non-additive dendritic interactions in neurons [20, 63], where two simultaneously received synaptic inputs in close spatial proximity initiate so-called dendritic spikes and thereby a nonlinearity [62], naturally imply three-point interactions.

We remark that direct three-unit and higher order interactions (i.e. three-term and higher order products such as $x_{i}^{d}x_{j}^{d^{\prime }}x_{k}^{d^{\prime \prime }}$ where i, j, k are mutually different) are omitted in (1) because they refer to dynamical systems on hypernetworks, thus going beyond the scope of this review. Such third and higher order terms are not covered by (1), firstly because the notation stays much simpler without them, but secondly and more importantly because there seem to be few major theoretical results, if any, on reconstructing such systems that may be or become of practical use. Yet, in social, communication and information networks, such questions may soon become of interest as 'big data' are pouring in.

A brief introduction to the dynamics of complex hypernetworks is given in [17].

7. Effective connectivity: correlation-based methods

7.1. Linear correlation and covariance

One common way of quantifying effective connectivity is to measure scalar time series x(t) = (x1(t), ..., xN(t)), t ∈ {t1, ..., tM}, from the N units of a system (spike rates of neurons, expression levels of genes, ...), compute their (temporal) averages μi = 〈xi(t)〉t and variances $\sigma _{i}^{2}=\langle (x_{i}(t)-\mu _{i})^{2}\rangle _{t}$ and from those compute the covariance matrix

Equation (69): $C_{ij}=\big\langle (x_{i}(t)-\mu _{i})(x_{j}(t)-\mu _{j})\big\rangle _{t}$

Here the averages are time averages

Equation (70): $\langle h(t)\rangle _{t}:=\frac{1}{M}\sum _{m=1}^{M}h(t_{m})$

If multiple measurements are available, averages may be taken over ensembles of experiments rather than (or in addition to) temporal averaging. The obtained covariance matrix is then often thresholded, either by just choosing a heuristic value or according to some measure of statistical significance (against an appropriate null hypothesis), and the resulting non-zero values are interpreted either as the 'strength' of effective connectivity (weighted connectivity matrix) or just as the existence (adjacency matrix) of a functional link between units. Sometimes, the value of the threshold is systematically varied and the changes in the resulting connectivity are evaluated. Correlation, covariance and other linear algebra measures are commonly used, often complemented by nonlinear operations such as thresholding, to analyze, for instance, functional magnetic resonance imaging (fMRI) or other spatio-temporal data [69]. We note that correlation measures may be (mathematically) viewed in a number of different ways [70].
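
A minimal sketch of this covariance-and-threshold pipeline, with the threshold chosen heuristically rather than by a statistical significance test:

```python
import numpy as np

def thresholded_covariance(x, threshold):
    """x: N x M array of N scalar time series. Returns a weighted effective
    connectivity matrix and the corresponding adjacency matrix, obtained by
    thresholding the covariance matrix (69) at a heuristic value."""
    C = np.cov(x)                                  # C_ij as in (69)
    W = np.where(np.abs(C) >= threshold, C, 0.0)   # 'strengths' of effective links
    np.fill_diagonal(W, 0.0)                       # discard self-connections
    A = (W != 0).astype(int)                       # existence of functional links
    return W, A
```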

7.2. Entropy maximization

Entropy measures the uncertainty associated with a given probability distribution and constitutes a key quantity in information theory [71]. Given the probabilities of a set of events, the entropy measures how uncertain, on average, the occurrence of an event is; or in other words, how much information, on average, one obtains by measuring the occurrence of an event knowing the probability distribution of events. Conversely, given a collection of (observed) data points, we can choose probabilities to maximize the entropy. Such a distribution, known as a maximum entropy probability distribution, is the least biased distribution possible; any other would require further assumptions about the nature of the problem [72]. For a network dynamical system, i.e. a system of coupled dynamical units, we can ask what the effective interactions are such that the probability distribution that best describes the data (averages and correlations) has maximum entropy.

Specifically, the principle of maximum information from Bayesian statistics postulates that the 'most likely' probability ρ(x) of measuring x, given the same type of time series data x(t) = (x1(t), ..., xN(t)), is the one maximizing the information obtained from measuring one more state y. More precisely, the goal is to find the probability ρ(x) that maximizes the Shannon entropy

Equation (71): $S[\rho ]=-\int _{\mathbb {R}^{N}}\rho (\boldsymbol{x})\,\ln \rho (\boldsymbol{x})\,d^{N}x$

under the constraints that the first and second moments are consistent with those estimated from the data,

Equation (72): $\int _{\mathbb {R}^{N}}x_{i}\,\rho (\boldsymbol{x})\,d^{N}x=\mu _{i}$

and

Equation (73): $\int _{\mathbb {R}^{N}}x_{i}x_{j}\,\rho (\boldsymbol{x})\,d^{N}x=\langle x_{i}(t)\,x_{j}(t)\rangle _{t}$

By restricting ourselves to these conditions (and no others), we here focus on pairwise interactions and neglect three-point and higher order couplings that arise in hypernetworks (cf section 6.5 and [17]). This yields (appendix C) the probability distribution of the form

$\rho (\boldsymbol{x})=Z^{-1}\exp \big(\sum _{i=1}^{N}h_{i}x_{i}+\frac{1}{2}\sum _{i=1}^{N}\sum _{j=1}^{N}\hat{J}_{ij}x_{i}x_{j}\big)$

where $Z=\int _{\mathbb {R^{N}}}\exp \big(\sum _{i=1}^{N}h{}_{i}x_{i}+\frac{1}{2}\sum _{i=1}^{N}\sum _{j=1}^{N}\hat{J}_{ij}x_{i}x_{j}\big)\,d^{N}x$ is a normalization constant and hi and $\hat{J}_{ij}$ are parameters to be chosen such that (C.3) and (C.4) hold. The quantities $\hat{J}_{ij}$ are interpreted as effective couplings between units i and j.

If the matrix of covariances between the N time series is

Equation (74): $C=(C_{ij})_{i,j\in \lbrace 1,\ldots ,N\rbrace }$ with $C_{ij}=\langle x_{i}(t)\,x_{j}(t)\rangle _{t}-\mu _{i}\mu _{j}$

the effective coupling matrix is given by its inverse (appendix C)

Equation (75)

This means that the best available probability distribution (i.e. that yielding the maximum information) is given by second order effective coupling strengths determined by (but by no means identical to) the linear correlation matrix.
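
Computationally, this amounts to inverting the empirical covariance matrix; a minimal sketch (the overall sign of the couplings depends on the sign convention chosen for the exponent of the distribution):

```python
import numpy as np

def max_entropy_couplings(x):
    """x: N x M array of N scalar time series. Returns the effective couplings
    of the pairwise maximum-entropy (Gaussian) model, obtained from the inverse
    of the covariance matrix, cf. (74), (75) and appendix C; the overall sign
    depends on the convention chosen for the exponent of the distribution."""
    C = np.cov(x)               # covariance matrix of the data
    return np.linalg.inv(C)     # effective coupling strengths (up to sign)
```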

This concept has been applied to a range of different systems, in particular in biology, including gene networks [73], protein networks [74] and neural circuits [75]. We comment on an approach originally suggested by Bialek and coworkers [75, 76] for revealing to what extent two-point interactions characterize the coupled dynamics of neural circuits in the retina. In fact, a systematic study of neural activity in the retina of larval tiger salamander and guinea pig revealed that 90% of the multi-information (which measures all correlative dependences in a system [76]) is covered by pairwise correlations only [75]. The authors took this finding as a sign that pairwise interactions well characterize the full network dynamics and that higher order interactions may be neglected. Specifically, they temporally discretized neural responses into '1' or '0' states depending on whether a neuron was or was not active during each considered time interval of typically 20 ms. Thus, the state of the entire (observed) network at each time interval is given by an N-dimensional word composed of the binary components of the N neurons. As sometimes done for effective connectivity, researchers often tend to go beyond what Bialek and coworkers concluded and interpret the effective coupling strengths (75) as actual physical interactions of experimentally observed units. However, it is typically not clear how correlative dependences yield information about direct physical interactions, so such interpretations are not justified without substantial further knowledge about the system.

7.3. Further approaches to finding effective connectivity

We would like to emphasize that there is a multitude of additional approaches for finding effective or functional connectivities. For instance, a range of methods is based on information-theoretic measures such as mutual information [77], transfer entropy [78] and Granger causality [79, 80], and extensions thereof. In addition, there are various methods relying on Bayesian statistics, on explicit or implicit modeling, or on extended regression such as used in generalized linear models [81]. In particular in the biological sciences, such statistical methods have been used early on, even at times when not many or not very reliable data were available; see for instance [82] for an early study regarding effective connectivity in neural circuits.

There are a number of open challenges regarding both the precision of such methods and the interpretation of the respective results. For instance, fMRI experiments of brain areas may rely on differences in blood oxygen level (so-called BOLD signals) as observables but often aim to reveal actual dependences in neural spiking activity. The reasoning here is that the cell metabolism, and thus the blood oxygen consumption, in a local region (typically one cubic millimeter) of the brain is larger the more action potentials per time are generated by the ($10^{4}$–$10^{6}$) neurons in that region; such approaches are not undisputed [83]. Moreover, the terms effective connectivity, functional connectivity and structural or anatomical connectivity are sometimes not well defined and used with different meanings across studies. There are even overlapping definitions of non-structural forms of connectivity, e.g. for functional connectivity, effective connectivity etc. Here we did not delve into historic waters and used the term 'effective connectivity' for all forms of connectivity that are not structural. Sometimes, effective connectivity is even taken as an indication of structural connectivity of physical interactions. For instance, distinguishing direct interactions from indirect influences may be an important issue (cf figure 1) that is not yet fully clarified [84].

Finally, we mention that for neural circuits, the authors of [85] have devised a statistical method to find the couplings Jij that optimize the likelihood that a class of integrate-and-fire models generates the spike trains observed in experiments. This statistical method assumes the same model class as the approach for inferring structural connectivity presented above (section 5.2) and its goal indeed is finding the (most likely) structural connectivity.

As a conclusive warning, we remark that different methods exist for obtaining effective connectivity from the same data set; the results each provide information about specific features of the system: exactly those defined by the method. Some might almost coincide with others, some might be congruent with, and some may well contradict, a given structural connectivity (cf [86]).

8. Conclusion and open questions

This review focuses on how the structural connectivity of networks may be inferred from dynamical features of the networks' nodes. It is on purpose on an introductory level and (given that the area of network reconstruction is rapidly growing simultaneously in different fields, from gene and neural networks to engineering systems) by construction only highlights some selected approaches, most of which are based on the perspective of considering the collective nonlinear network dynamics. We present basic approaches to finding effective or functional connectivities of a network from time series as a brief complement, to give a taste of essential differences in perspective. One main distinction between approaches for identifying qualitatively different types of connectivity is that structural inference, aiming to reconstruct real physical interactions, takes into account the time evolution of a system. In contrast, finding effective connectivity is often based on a stationarity assumption and uses distributions of states, neglecting all or part of their temporal evolution. Relating observable, possibly effective, and physical connectivity is at the heart of current interdisciplinary research [15, 87]. As also mentioned in the Introduction and briefly discussed in section 7, functional versus structural inference is often not well distinguished and, in particular in early publications, the qualitative difference in approach was sometimes not even mentioned.

The methods and approaches presented in this review aim to tell whether or not, and how strongly, units in a network directly interact with each other. This is in distinction to the entire field of systems identification [7], where the aim is to identify the rules underlying a dynamical system from time series and to predict its (future) dynamics based on this identification. Systems identification for multi-dimensional systems, and thus in particular for large networks, seems intractably hard, because even if the 'real' system is almost (but not entirely) perfectly reconstructed, predicting its future dynamics can be impossible due to chaos (sensitive dependence on initial conditions), exponentially many different collective states and uncontrolled external influences, with all these factors being typical for multi-dimensional complex systems. At the same time, as in part illustrated in this review (figure 5), learning the existence or strengths of interactions only (and not the precise form of the dynamics) may well be successful for much larger systems. We thus speculate that novel methods complementing those of systems identification may yield further insights into the interaction networks of various complex systems.

A number of key issues are not discussed in this review but still are sometimes pressing for progress, in particular with respect to applications in real-world settings. We list a few.

  • (1)  
    How much information can we actually access (cf [88])? Can we observe all the units of a network? Can we observe all dynamical variables (dimensions) of each unit? In which sense is it meaningful to seek the connectivity of nodes that are not observed?
  • (2)  
    What do we know a priori about the system? Are the functional forms of network interactions known? What can we say about stability or instability of the dynamics? Perhaps the dynamics even exhibits a complex mixture of stable and unstable dynamics [9, 89, 90].
  • (3)  
    What is a reasonable (or possible) number of experiments or measurements that can be done, and what is a clever trade-off against the required computational efforts, which may depend on the quality and quantity of those measurements [23]?
  • (4)  
    Which of the structural features are actually the most relevant and which can possibly be inferred? In biochemical and gene regulatory networks, electric circuits and power grids, for instance, many interaction links (or their absence) may have been identified by other methods not related to collective network dynamics. Under these circumstances, can we improve existing approaches to take such information into account?
  • (5)  
    How can we address genuinely stochastic dynamics, e.g. in excitable systems, possibly even induced by small-number fluctuations, a pathway of thought that links to non-equilibrium physics, cf [91]?
  • (6)  
    What if network connectivities change, for instance by synaptic plasticity in neural circuits [92–94] or other forms of adaptation, e.g. in social networks [95]?
  • (7)  
    ...think of your own, there are many more questions with great opportunity for scientific progress in a range of fields ...

Answers to all these questions will specifically restrict or enhance the space of optional networks consistent with the recorded data. From an abstract point of view, they modify the form and dimensionality of the set of all possible networks and point to a direction that still addresses network dynamics as an inverse problem but goes beyond network reconstruction. Given all restrictions, perhaps we can design or control a network to robustly exhibit a specific functionality. Network design and control form two additional branches of the theory of network dynamical systems, with currently highly active research from the biological sciences to engineering [11, 96–99]. Such approaches might soon be extremely influential and thought-provoking when, e.g., the neural and biochemical networks in our bodies as well as the infrastructure networks surrounding us can not only be reconstructed, but even controlled and specifically engineered. We recommend a viewpoint by Freeman Dyson for a very practical taste [100].

Acknowledgments

We thank Srinivas Gorur Shandilya for help during project initiation and Raoul-Martin Memmesheimer, Mor Nitzan, Dirk Witthaut, David Gross, Matthias Wendland, Sarah Hallerberg and Hinrich Arnoldt for valuable discussions and constructive comments on parts of the manuscript. Partially supported by the International Max Planck Research School (IMPRS) Physics of Biological and Complex Systems (JC), the Ministry for Education and Science (BMBF), Germany, through the Bernstein Center for Computational Neuroscience, grant no. 01GQ1005B (MT), as well as by a grant by the Max Planck Society to MT.

Appendix A: Multiple linear regression and L2-norm minimization

Multiple linear regression is a widely used statistical tool for predicting the values of a dependent variable from a set of independent variables. Its main purpose is to infer a linear relationship between them. Thus, the relationship is assumed to have the form

Equation (A.1): $\boldsymbol{y}=\boldsymbol{\beta }x+\boldsymbol{\varepsilon }$

where $\boldsymbol{y}\in \mathbb {R}^{1\times M}$ are the values of the dependent variable, $\boldsymbol{\beta }\in \mathbb {R}^{1\times K}$ is the vector of unknown linear coefficients, $x\in \mathbb {R}^{K\times M}$ is the set of values of the K independent variables and $\boldsymbol{\varepsilon }=(\varepsilon _{1},\ldots ,\varepsilon _{M})\in \mathbb {R}^{1\times M}$ are random errors εi, i ∈ {1, ..., M}. In addition, the errors εi are assumed to be independent random variables distributed according to a Gaussian distribution with mean μ = 0 and constant variance σ2 [101].

The question is how to estimate β by some $\hat{\boldsymbol{\beta }}$ such that the difference between the predicted and real values, $\hat{\boldsymbol{\beta }}x$ and y, is minimized. Using L2 minimization, the problem formally becomes

Equation (A.2): $\hat{\boldsymbol{\beta }}={\rm arg}\,\min _{\boldsymbol{\beta }}\left\Vert \boldsymbol{y}-\boldsymbol{\beta }x\right\Vert _{2}^{2}$

also known as the linear least squares method [101].

The local extremum of the L2 norm appearing on the right-hand side of (A.2) implies

Equation (A.3): $\frac{\partial }{\partial \boldsymbol{\beta }}\left\Vert \boldsymbol{y}-\boldsymbol{\beta }x\right\Vert _{2}^{2}\Big |_{\boldsymbol{\beta }=\hat{\boldsymbol{\beta }}}=-2\big(\boldsymbol{y}-\hat{\boldsymbol{\beta }}x\big)x^{\mathsf {T}}=\boldsymbol{0}$

such that

$\hat{\boldsymbol{\beta }}\,\big(xx^{\mathsf {T}}\big)=\boldsymbol{y}\,x^{\mathsf {T}},$

which in turn implies the best estimate

Equation (A.4): $\hat{\boldsymbol{\beta }}=\boldsymbol{y}\,x^{\mathsf {T}}\big(xx^{\mathsf {T}}\big)^{-1}$

of β according to the linear least squares method [101].
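
A short synthetic check of (A.4), using the row-vector conventions of this appendix; all values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
K, M = 4, 200
x = rng.standard_normal((K, M))                   # K independent variables
beta = rng.standard_normal((1, K))                # unknown coefficients
y = beta @ x + 0.1 * rng.standard_normal((1, M))  # noisy dependent variable

beta_hat = y @ x.T @ np.linalg.inv(x @ x.T)       # least-squares estimate (A.4)
print(np.max(np.abs(beta_hat - beta)))            # small for large M
```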

Appendix B: Singular value decomposition and L1-norm minimization

Singular value decomposition (SVD) is regarded as an important tool for statisticians due to its broad variety of applications (reduced rank regression, polar decomposition, image compression, among others [102]). Formally, from the fundamental theorem of linear algebra, any rectangular matrix $A\in \mathbb {R}^{m\times n}$ may be decomposed into the product of three matrices as

Equation (B.1): $A=U\Sigma V^{\mathsf {T}}$

where $U\in \mathbb {R}^{m\times m}$ and $V\in \mathbb {R}^{n\times n}$ are unitary matrices with their columns referred to as left and right singular vectors, respectively, and $\Sigma \in \mathbb {R}^{m\times n}$ is a rectangular diagonal matrix containing the singular values [103]. This decomposition is known as the SVD of matrix A.

The SVD properties are useful in finding the set of all possible solutions to an under-determined system of equations, because for every under-determined system

Equation (B.2): $A\boldsymbol{y}=\boldsymbol{b}$

where $\boldsymbol{b}\in \mathbb {R}^{m\times 1}$, we can use SVD (B.1) to rewrite A such that solving (B.2) for y yields the particular solution

Equation (B.3): $\boldsymbol{y}_{0}=V\tilde{\Sigma }U^{\mathsf {T}}\boldsymbol{b}$

where $\tilde{\Sigma }\in \mathbb {R}^{n\times m}$ is the pseudo-inverse of Σ and is defined as

Equation (B.4): $\tilde{\Sigma }_{ii}=1/\Sigma _{ii}$ for $i\in \lbrace 1,\ldots ,r\rbrace$, where r = Rank(A), and $\tilde{\Sigma }_{ij}=0$ otherwise

Equation (B.3) defines a particular solution from the set of all possible solutions. The general solution to equation (B.2) is given by our particular solution plus a linear combination of vectors in the null-space of A, i.e.,

Equation (B.5): $\boldsymbol{y}=\boldsymbol{y}_{0}+V\boldsymbol{c}$

where $\boldsymbol{c}\in \mathbb {R}^{n\times 1}$ and ci = 0 for i ∈ {1, ..., r} and r = Rank(A). The latter ensures that the linear combinations are made only with vectors that span the null-space.

In this case, the problem consists in maximizing the number of zero entries in y by choosing the cj  ∀  j > r appropriately. This may be achieved by solving the over-determined system (B.5) with n equations and n − r unknowns. Specifically, it has been illustrated in [10, 29, 35] that by solving

Equation (B.6): $\min _{\boldsymbol{c}}\left\Vert \boldsymbol{y}_{0}+V\boldsymbol{c}\right\Vert _{1}$

employing the Barrodale and Roberts algorithm [34], a solution having a low number (possibly the least number) of non-zero values, consistent with the restricting equations, is often recovered. However, there seems to be no general proof of this observation [31]. In the network reconstruction context, it has been shown that optimizing systems like (B.6) (where y and b are replaced by the unit's connectivity $\boldsymbol{J}_{i}^{\mathsf {T}}$ and the network constraints $\boldsymbol{Y}_{i}^{\mathsf {T}}$) yields the actual network topology when the network is sparse (even if there are fewer linear constraints than units in the network).
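
A minimal sketch of computing the particular solution (B.3) from the SVD; numerically it is equivalent to applying the Moore–Penrose pseudo-inverse of A:

```python
import numpy as np

def svd_particular_solution(A, b):
    """Minimal-L2-norm particular solution (B.3) of the under-determined
    system A y = b, built from the SVD (B.1)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=True)
    m, n = A.shape
    r = int(np.sum(s > 1e-12 * s.max()))       # numerical rank of A
    Sigma_pinv = np.zeros((n, m))              # pseudo-inverse of Sigma, (B.4)
    Sigma_pinv[:r, :r] = np.diag(1.0 / s[:r])
    return Vt.T @ Sigma_pinv @ U.T @ b         # same as np.linalg.pinv(A) @ b
```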

Appendix C: Estimating maximum entropy parameters

Maximizing the entropy

Equation (C.1): $S[\rho ]=-\int _{\mathbb {R}^{N}}\rho (\boldsymbol{x})\,\ln \rho (\boldsymbol{x})\,d^{N}x$

we derive first the functional form of ρ(x) and second its parameters under the constraints that the non-negative function ρ ⩾ 0 is normalized

Equation (C.2): $\int _{\mathbb {R}^{N}}\rho (\boldsymbol{x})\,d^{N}x=1$

and the first moment

Equation (C.3): $\int _{\mathbb {R}^{N}}x_{i}\,\rho (\boldsymbol{x})\,d^{N}x=\mu _{i}$

and second moment

Equation (C.4): $\int _{\mathbb {R}^{N}}x_{i}x_{j}\,\rho (\boldsymbol{x})\,d^{N}x=\langle x_{i}(t)\,x_{j}(t)\rangle _{t}$

are consistent with those estimated from data of N scalar time series xi(t). The covariance matrix is defined via the data as

Equation (C.5): $C_{ij}=\langle x_{i}(t)\,x_{j}(t)\rangle _{t}-\mu _{i}\mu _{j}$

We maximize the entropy under these 1 + N + N(N − 1)/2 constraints using a Lagrange function

Equation (C.6): $L[\rho ]=-\int \rho \ln \rho \,d^{N}x-a\int \rho \,d^{N}x+\sum _{i}h_{i}\int x_{i}\,\rho \,d^{N}x+\frac{1}{2}\sum _{i,j}\gamma _{ij}\int x_{i}x_{j}\,\rho \,d^{N}x$

where we drop constants as they do not influence the location of a maximum. Here a, hi and $\frac{1}{2}\gamma _{ij}$ are the Lagrange multipliers to be determined. Computing the first derivatives and equating them to zero

Equation (C.7): $\frac{\delta L}{\delta \rho (\boldsymbol{x})}=-\ln \rho (\boldsymbol{x})-1-a+\sum _{i}h_{i}x_{i}+\frac{1}{2}\sum _{i,j}\gamma _{ij}x_{i}x_{j}=0$

yields the (unique local) maximum entropy distribution of the form

Equation (C.8): $\rho (\boldsymbol{x})=\exp (-1-a)\,\exp \big(\frac{1}{2}\boldsymbol{z}^{\mathsf {T}}\hat{J}\boldsymbol{z}+b\big)$

where we use the abbreviations $\hat{J}:=(\gamma _{ij})_{ij}$, $\boldsymbol{z}:=\boldsymbol{x}+\hat{J}^{-1}\boldsymbol{h}$ is an affine function of x and

Equation (C.9): $b:=-\frac{1}{2}\boldsymbol{h}^{\mathsf {T}}\hat{J}^{-1}\boldsymbol{h}$

is independent of x.

In summary, this exact computation yields the probabilities of the form

Equation (C.10): $\rho (\boldsymbol{x})=Z^{-1}\exp \big(\sum _{i=1}^{N}h_{i}x_{i}+\frac{1}{2}\sum _{i=1}^{N}\sum _{j=1}^{N}\hat{J}_{ij}x_{i}x_{j}\big)$

where Z−1 = exp ( − 1 − a).

To estimate the parameters a, hi and $\hat{J}_{ij}$ we make the approximation that the data xi form a continuous set such that we can approximate sums by integrals. We then first observe that

Equation (C.11)

due to normalization (C.2). This implies

Equation (C.12)

and thus yields the parameter a as a function of h and $\hat{J}$. Similarly, fixing the averages (C.3) yields (approximately)

Equation (C.13)

for all i and thus h as a function of $\hat{J}$.

Finally, the equations fixing the second moments

Equation (C.14)

can be evaluated using

Equation (C.15)

subjected to

Equation (C.16)

With the transformation $\boldsymbol{x}=\boldsymbol{z}-\hat{J}^{-1}\boldsymbol{h}$, equation (C.14) becomes

Equation (C.17)

where the last equality follows from (C.9), (C.12) and (C.13). Thus, using the definition of the correlation matrix (C.5) and assuming that temporal and statistical averages are the same, we have that the matrix

Equation (C.18)

of effective coupling strengths between the variables is given by the inverse of the correlation matrix of the data.

Footnotes

  • From now on we write $\dot{z}$ for the rate of change $\frac{{\rm d}}{{\rm d}t}z$ of a variable z.

  • Part of the material presented in this subsection is taken from [12] and partially modified; this is not to be confused with the Guttenplag method.

  • We remark that equidistant times tm of sampling are not necessarily the best to evaluate $\dot{x}_{i}$, in particular if the time derivatives are approximated linearly.

  • A network is strongly connected if for every pair of nodes (i, j), node j can be reached from node i via a (directed) path within the network [48, 49].
