State–time spectrum of signal transduction logic models

Aidan MacNamara; Camille Terfve; David Henriques; Beatriz Peñalver Bernabé; Julio Saez-Rodriguez

doi:10.1088/1478-3975/9/4/045003

1. Introduction

The question of how signal transduction networks are able to weigh and integrate a multitude of extra- and intracellular signals into context-specific phenotypic outcomes is complex and difficult to answer. Typically, a signal transduction network links diverse inputs (stimuli) and outputs (gene regulation, motility, etc) through a dense system of proteins assembled in pathways that are connected by crosstalk and embedded in feedback loops (Rangamani and Iyengar 2008, Jørgensen and Linding 2010, Terfve and Saez-Rodriguez 2012). This complexity enhances the robustness and versatility of the network, but makes it difficult to understand in terms of mechanism. This is evident in diseases such as cancer, where the complex consequences of mutation and deregulation make identifying potential drug targets difficult, even when the causative mutation is well known (Kreeger and Lauffenburger 2010, Patlak 2010). Because of this complexity, counter-intuitive therapeutic targets often produce the most successful results, and the field of network pharmacology is based around this premise (Aislyn and Boran 2010).

Ideally, in order to understand such a complex and dynamic system, the quantities and states of large populations of proteins and their splice variants should be measured in vivo across time and across populations of cells (both tissues and individuals) under a range of different conditions (Liberali et al 2008). In the absence of this quality of data, it is necessary to use more qualitative and less time-resolved information to deduce mechanism. The focus of measurement in such signal transduction networks is the protein or, more specifically, the protein together with post-translational modifications (PTMs), as it is PTMs such as phosphorylation that convey information through a network. Hence, measurement comes from the field of proteomics. However, assumptions are necessary when considering the variety of PTMs that may occur. There are more than 500 different types of PTMs, and measuring the status of each site for all proteins is technically impossible (Khoury et al 2011). Indeed, this problem is encountered with phosphorylation alone. Given that the epidermal growth factor receptor (EGFR) has 31 phosphorylation sites, 2³¹ states of EGFR (each site can be phosphorylated or not) would need to be measured to provide full knowledge of the activation of this receptor and how it changes over time. Therefore, the study of signal transduction networks tends to concentrate on a subset of phosphorylation sites, where the site interaction partner(s) is known and measurement is technologically feasible (e.g. high-quality antibodies are available). These phosphorylation events are often used as markers of activation and deactivation. The consequence of such an approach is an experimental bias toward such phosphosites, a problem that is only now being addressed through less-biased high-throughput techniques such as mass spectrometry.
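The arithmetic behind the EGFR estimate is easy to check. The following is a minimal Python sketch; the function name `phospho_state_count` is illustrative only and is not part of any package discussed here.

```python
def phospho_state_count(n_sites: int) -> int:
    """Number of distinct on/off phosphorylation patterns for n_sites sites."""
    # Each site is either phosphorylated or not, so the count doubles per site.
    return 2 ** n_sites


# 31 sites, as quoted for EGFR in the text
egfr_states = phospho_state_count(31)
print(egfr_states)
```

With 31 sites the count already exceeds two billion, which is why only a small subset of sites is measured in practice.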

Phosphoproteomics can be divided into antibody- and mass-spectrometry-based methods. A comprehensive summary of these methods can be found in Terfve and Saez-Rodriguez (2012). Broadly speaking, the quality of data can be measured in terms of coverage, time resolution and specificity. Antibody-based methods are generally specific (depending on the quality of the antibody) and can be used to measure time courses of target proteins across many conditions. However, the number of targets that can be measured is limited. In contrast, mass spectrometry techniques allow for systematic identification and quantification of phosphorylated proteins. However, this comes with the caveat of requiring expensive equipment and advanced know-how: reliable quantification can be difficult, protocols (especially for mass spectrometry) can be laborious, and mapping measurements to proteins is not trivial (Ilsley et al 2009).

Whatever method is chosen from the above, the result is a quantitative 'part list' consisting of phosphoproteomic measurements from the signaling network of interest, taken under a certain number of conditions (different stimuli, inhibitors, time points etc) and describing the states of these parts.

1.1. From parts to interactions

In order to deduce a mechanism of action that explains these types of data, the interactions between the parts must be understood. Interactions can be represented as node-edge graphs. The nodes can be biological entities such as proteins, as in this case, or genes or metabolites in the case of transcriptional or metabolic studies. Edges can be described as biological activities, such as catalysis, association and modification of the participating nodes. They may be directed (i.e. protein X affects protein Y and not vice versa) or undirected. Furthermore, they can be signed (inhibitory/activating) or unsigned. When these graphs describe protein interactions, they can be characterized in two categories: protein interaction networks (PINs) and protein signaling networks (PSNs) (Pieroni et al 2008).

PINs can be constructed from a number of sources: high-throughput experiments such as two-hybrid and affinity purification/mass spectrometry or systematic literature searches (bibliome mining). These methods yield limited functional insight beyond a possible interaction between two proteins. They are represented as a graph with a set of nodes connected by a set of edges without sign or directionality. There are a number of public repositories of PINs, such as IntAct, HPRD and STRING.

PSNs are more detailed representations of protein interactions where, if described as a graph, their edges can have directionality and, when possible, sign. PSNs are generally obtained through expert curation of the literature or text mining. There are multiple public repositories of manually curated networks, including KEGG, WikiPathways, Nature Pathway Interaction Database and Reactome. Each has their strengths and weaknesses in terms of graphical formats, annotation, accuracy and curation. When creating a PSN, as many sources as possible should be consulted before deciding on a final network (Bader et al 2006, Bauer-Mehren et al 2009). Pathway Commons is also a useful portal that integrates these and other PSN and PIN repositories.
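The distinction between the two graph types can be made concrete with a small data structure. This is a hedged sketch in Python: the `Edge` class, its field names and the placeholder protein names are assumptions made for illustration and do not correspond to the schema of any repository named above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    directed: bool = True
    sign: Optional[int] = None  # +1 activating, -1 inhibitory, None unsigned


def describe(edge: Edge) -> str:
    """Render an edge as text, showing direction and sign when available."""
    arrow = "->" if edge.directed else "--"
    effect = {1: "activates", -1: "inhibits", None: "interacts with"}[edge.sign]
    return f"{edge.source} {arrow} {edge.target} ({effect})"


# PIN-style edge: only a possible interaction is known (no direction, no sign)
pin_edge = Edge("ProteinX", "ProteinY", directed=False)
# PSN-style edge: direction and sign are both specified
psn_edge = Edge("ProteinX", "ProteinY", directed=True, sign=1)

print(describe(pin_edge))
print(describe(psn_edge))
```

The same node pair therefore carries strictly more information in a PSN than in a PIN, which is what makes a PSN usable as the starting point for simulation.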

While PSNs provide an insight into the transfer of signaling information, they do so at an abstract level without specifying the mechanism of signal transduction. This additional detail is provided by biochemical networks that describe such interactions in a quantitative manner (i.e. phosphorylation, binding, dimerization etc). There are many examples in the context of metabolism, and recently an increasing number for signal transduction, such as the reconstruction of the signaling network downstream of the EGFR (Oda et al 2005) and the retinoblastoma/E2 transcription factor (RB/E2F) pathway (Calzone et al 2008).

Independent of the resolution of the network, there is such a high degree of interconnectedness, redundancy and cell/context specificity, even in well-studied networks such as the mitogen-activated protein kinase (MAPK) signaling pathway, that it is difficult to obtain a high degree of accuracy from prior knowledge alone (Kirouac et al 2012). Therefore, prior knowledge networks should be constructed with as much cell-specific knowledge as possible, and high-throughput databases can be advantageous to add annotation and information (e.g. expression data for a particular network in a specific cell type). The growing use of model standards such as Systems Biology Markup Language (SBML) (Hucka et al 2003) can also aid in pooling resources across datasets.

1.2. From interactions to mechanism

Through manual curation of the literature or network databases, there are a number of ways of arriving at a network or map that represents the biological interactions of the system of interest. So what are the requirements to step from this biological representation to a mathematical analysis and understanding of these networks? Graph theory can be used to analyze the topology of a network to understand the principles behind its design (Barabási and Oltvai 2004). These networks can also be used as a scaffold for overlaying expression data to better understand the activation of sub-networks (Bossi and Lehner 2009). These analyses can provide a useful insight, but are not amenable to explaining how a signaling network responds to a defined set of perturbations (Saez-Rodriguez et al 2009). This aim can only be achieved via mechanistic and predictive computational modeling. By computational modeling we mean, in the context of this paper, the construction of an in silico representation of a system (in this case, a cell signaling network) that can be simulated through a set of programmable commands that mimic the functioning of the system over time (Terfve and Saez-Rodriguez 2012). Simulating over time (or dynamic modeling) consists of using functions to describe how each species' (or node's) state in a network changes as a function of its inputs.

There are many approaches to this type of modeling, and the choice of method is highly dependent on the quality and type of data available for the network, together with the accuracy of one's prior knowledge about its topology and interactions. We will briefly give an overview of some popular methods, but more detail about each approach can be found in a number of excellent reviews (e.g. Aldridge et al 2006, de Jong 2002).

Physicochemical modeling (modeling that includes biochemical and physical features of the system) is an approach that uses equations derived from physical and chemical theory to describe biological processes such as covalent binding, association and diffusion (Aldridge et al 2006). This is a popular and insightful type of model for signaling networks, and many examples can be found in repositories such as biomodels.net. These equations are built through a deep understanding of the underlying biochemistry and hence refer to distinct processes (such as catalysis and complex formation). The family of physicochemical model formalisms includes, among others, ordinary and partial differential equations (ODEs and PDEs), their stochastic variants and rule-based approaches. ODEs are the most common approach and can represent a signaling network through a set of coupled equations that describe the change in concentration of the elements (biomolecular species) of the network. ODEs are based on the assumption of mass action kinetics, a law that defines the rate of a reaction as being proportional to the concentrations of the reacting species (Chen et al 2010). This assumption can break down if there are spatial gradients for species or if concentrations of species are low enough that random fluctuations become a factor in the behavior of the system. In such cases, PDEs and stochastic formalisms are better suited to capture the biological behavior.
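To make the mass action assumption concrete, the sketch below integrates a single hypothetical reaction A + B -> C, whose rate is k[A][B]. The reaction, the rate constant, the initial concentrations and the use of a simple forward-Euler scheme are all assumptions chosen for illustration; they are not taken from any model cited here.

```python
def simulate_mass_action(a0, b0, k, dt, n_steps):
    """Forward-Euler integration of A + B -> C under mass action kinetics."""
    a, b, c = a0, b0, 0.0
    for _ in range(n_steps):
        rate = k * a * b  # rate is proportional to both reactant concentrations
        a -= rate * dt
        b -= rate * dt
        c += rate * dt
    return a, b, c


# Hypothetical parameter values, arbitrary units
a_end, b_end, c_end = simulate_mass_action(a0=1.0, b0=0.5, k=2.0, dt=0.001, n_steps=5000)
print(a_end, b_end, c_end)
```

Even this one-reaction system needs a rate constant and two initial conditions, which hints at how quickly the parameter burden grows for large networks.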

Another drawback to physicochemical modeling is the difficulty in managing and manipulating large networks, both in terms of the combinatorial complexity that such networks present (for example, the number of phosphorylation states of EGFR) and determination of the parameters of each equation such as rate constants and initial conditions. Rule-based modeling allows easier manipulation and management of larger systems. Models are specified by a set of rules corresponding to the molecular interactions among protein domains, and these are then automatically converted into a model that describes all possible reactions and molecular configurations (see Hlavacek et al (2006) for an introduction to rule-based modeling).

In summary, physicochemically detailed modeling generally works well with small, detailed biochemical networks. In the absence of such criteria, a coarser-grained approach is necessary and logic modeling can be viewed in this light.

1.3. Logic modeling

Unlike physicochemical modeling, logic modeling requires only a PSN as a starting point to simulate signaling processes. Although sparse in detail, such graphs are very insightful for understanding how the structure (or topology) determines the flow of information from input through to output (for example, ligand–receptor binding through to transcription factor activation (Ma'ayan et al 2005)). However, before questions can be addressed by simulation, the graph must be made computable by defining how each node state changes as its inputs change, so that input–output relationships can be quantified for the whole system.

Logic modeling uses transfer functions to describe the relationship between nodes in a graph (see section 8.3). Transfer functions are the mathematical representation of the relationship between inputs and outputs in a system. In physicochemical modeling, these are based on mass action kinetics and describe how the input species are transformed into output species by the chemical reaction. In logic modeling, transfer functions consist of logic operators (AND, OR, NOT) that describe how an output node is activated by its inputs. To illustrate this, we can use a simple case from the PI-3-Kinase-Akt signaling network that controls growth and division in mammalian cells. As part of this network, the kinase Akt is activated by the kinase PDK1 and the kinase complex mTOR. This would be represented in a PSN by two directed positive edges from PDK1 and mTOR to Akt. However, from this representation it would not be known whether both kinases or only one of them is necessary for Akt activation. In logic modeling, this relationship can be represented using an AND operator that specifies the necessity of both kinases for Akt activation. Such an example also illustrates the strength and weakness of logic modeling: the reduction of complexity that enables modeling of large systems with incomplete information and fewer parameters, against less mechanistic detail and biochemical accuracy.
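The Akt example can be written out as a truth table. The Python sketch below contrasts the AND gate described in the text with the OR alternative that the two PSN edges alone could not rule out; the function names are illustrative.

```python
from itertools import product


def akt_and(pdk1: bool, mtor: bool) -> bool:
    """Akt is active only when both PDK1 and mTOR are active."""
    return pdk1 and mtor


def akt_or(pdk1: bool, mtor: bool) -> bool:
    """Alternative reading of the same two edges: either kinase suffices."""
    return pdk1 or mtor


# Enumerate all input combinations to show where the two gates disagree
for pdk1, mtor in product([False, True], repeat=2):
    print(pdk1, mtor, akt_and(pdk1, mtor), akt_or(pdk1, mtor))
```

The two gates differ exactly when a single kinase is active, so an experiment that inhibits one kinase while stimulating the other is what discriminates between them.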

The use of logic-based modeling of biological systems goes back more than 40 years, with the first model describing a gene regulatory network (Kauffman 1969). Since then, logic modeling has proved particularly useful in describing the effect of environmental inputs on cell phenotypes through networks of signal transduction. There are multiple studies using this type of modeling as a basis (Helikar et al 2008, Calzone et al 2010, González et al 2008, Mendoza and Xenarios 2006, Schlatter et al 2009, Sahin et al 2009) (see also reviews Morris et al 2010, Watterson et al 2008, Thakar and Albert 2010). The structure of a signal transduction network lends itself to logic modeling with clearly defined input nodes (ligand–receptor combinations), measurable elements corresponding to activation (phosphorylated proteins downstream of the receptor) and relatively little knowledge of the biochemistry involved.

Having summarized how logic modeling formulates input/output relationships (see above and section 8.3), the next step is to consider the complexity of the logic modeling in terms of how state and time are treated. The state refers here to the value or quantity attached to each node (typically a protein) and reflects the activation of that node at any point during a simulation. Its value is proportional to activation and can vary from 0 to some arbitrary or defined upper limit, depending on the type of logic modeling being undertaken. States can be defined as on/off (Boolean logic), multi-level or continuous, and we will discuss each of these in turn. When training logic models with experiments, these values are often compared to biochemical data. For instance, the phosphorylation of a protein is considered to be a proxy of its activation (e.g. phosphorylation at the phosphosite threonine-202 of extracellular signal-regulated kinase (ERK)).

Similarly, there are several approaches to handle time in a logic context, ranging from the simplest approximation of discrete or steady-state measurement to more biologically realistic continuous updating. These techniques will also be introduced in turn with the examples below.

1.4. Software

As a means to introduce the different methods and how they can be used to model different aspects of signal transduction, we will use the tool CellNOpt (www.cellnopt.org). CellNOpt is a software package that trains the topology of a PSN to experimental data by the criterion of minimizing the error between the data and the logic model created from the PSN. In CellNOpt, the starting network based on prior knowledge is called the prior knowledge network (PKN) (a name we will use in the rest of this paper). This PKN is preprocessed before training by compression and expansion (see materials and methods, section 8.1). The compression step of CellNOpt is a method of reducing the complexity of a logic model by removing nodes that have no effect on the outcome of simulation. The expansion step subsequently includes all possible hyperedges (materials and methods, section 8.1) in the model. The model is trained by minimizing a bipartite objective function that calculates the mismatch between the logic model and experimental data (mean squared error (MSE)) while penalizing model size. This minimization can be solved using different strategies, from simple enumeration of options for small cases to stochastic optimization algorithms such as genetic algorithms (Saez-Rodriguez et al 2009), or integer linear programming (Mitsos et al 2009).
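The idea of trading data mismatch against model size can be sketched in a few lines of Python. This is not the CellNOpt implementation: the additive form of the score, the `size_penalty` weight, the toy simulator and the data are all assumptions made for illustration, and enumeration is used because the toy search space has only four candidates.

```python
from itertools import product


def mse(simulated, measured):
    """Mean squared error between simulated and measured readouts."""
    return sum((s - m) ** 2 for s, m in zip(simulated, measured)) / len(measured)


def objective(bits, simulate, measured, size_penalty=0.01):
    """Data mismatch plus a penalty on the number of hyperedges kept."""
    return mse(simulate(bits), measured) + size_penalty * sum(bits)


def enumerate_best(n_edges, simulate, measured, size_penalty=0.01):
    """Exhaustively score every subset of candidate hyperedges."""
    candidates = product([0, 1], repeat=n_edges)
    return min(candidates, key=lambda b: objective(b, simulate, measured, size_penalty))


# Toy problem: candidate edges A -> Y (bit 0) and B -> Y (bit 1), combined by OR.
experiments = [(1, 0), (0, 1), (1, 1), (0, 0)]  # (A, B) stimulation per experiment
measured_y = [1, 0, 1, 0]                        # hypothetical data: only A drives Y


def simulate_y(bits):
    return [int((bits[0] and a) or (bits[1] and b)) for a, b in experiments]


best = enumerate_best(2, simulate_y, measured_y)
print(best)
```

In this toy case the search keeps only the A -> Y edge, since adding B -> Y both worsens the fit and increases the size penalty. For realistic networks the number of candidates grows as 2 to the power of the number of hyperedges, which is why stochastic or integer programming strategies are needed.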

The R version (CellNOptR) is available on Bioconductor and has a number of added features that allow the user to run different variations of logic modeling within the same framework of model calibration. These variations range from steady-state and discrete time Boolean modeling to fuzzy logic and logic ODEs, all of which will be discussed in turn below. We will also refer to other software packages that have contrasting or complementary approaches to CellNOptR.

For the remainder of the tutorial, different logic formalisms will be introduced and explained with the aid of CellNOptR, and the assumptions, strengths and weaknesses of each formalism with regard to training to data will be illustrated with a 'toy model' of signal transduction.

1.5. The example model

To illustrate the variety of logic modeling approaches, we will use an imaginary but biologically plausible PKN (figure 1). This network includes a subset of intracellular signaling networks known to be activated downstream of EGF and TNFα stimulation, and was derived from a larger network presented in Saez-Rodriguez et al (2009). In brief, the PKN includes three MAPK cascades (ERK, p38 and JNK1), the PI3K/Akt/GSK-3 pro-survival pathway and the IKK/IκB/NFκB pathway. It consists of 30 nodes and 33 edges.

From this PKN, we derived a model (the data-generating model) that was used to simulate experimental data (section 8.1 and table S1/figure S1 (available from stacks.iop.org/PhysBio/9/045003/mmedia)). The in silico data replicate biologically plausible behavior that has been seen in such networks, such as the transient behavior of ERK activation (Sasagawa et al 2005) and the oscillatory dynamics of NFκB translocation from the cytoplasm to the nucleus (Hoffmann et al 2002) (figure 2). These in silico data consist of ten 'experiments', which vary according to different combinations of stimulation and inhibition (inhibition is achieved by blocking the activity of two specific kinases (proteins), PI3K and Raf-1, with small-molecule inhibitors), and 16 observations at 2 min intervals from t = 0. Inhibition is used in such experiments to further understand the combinations of upstream events that contribute to the activation of a particular protein. The readouts chosen are well-established downstream events of EGF/TNFα stimulation. The experiment represents an ideal situation with multiple time-point sampling. However, as we will discuss later, with fewer measurements one can capture most (but not necessarily all) of the dynamics of the system. The values are between 0 and 1, and Gaussian noise was also added to the output to imitate inherent biological noise and the measurement error (see materials and methods, section 8.2). The PKN has the following important properties.

**Figure 2.** The *in silico* data generated to test each logic modeling formalism. The data were generated by a logic model (the data-generating model). Each row of the figure represents an experiment with a certain combination of stimuli and inhibitors (shown in the final two columns: black is ON, white is OFF). The simulated data are shown as a continuous black line. Gaussian noise was added (section 1.5 and materials and methods, section 8.2) and the data were 'sampled' at 16 equally spaced time points between 0 and 30 min to simulate a fine-grained time course experimental design.

(i) It does not specify which input or combination of inputs activates a particular node (for example, both Map3K1 and Map3K7 activate MKK4, but the PKN does not specify whether this is an AND or OR relationship; see figure 1).
(ii) It includes additional interactions (TNFR → PI3K, PI3K → Rac, Rac → Map3K1) not present in the data-generating model.
(iii) It is missing interactions (TRAF2 → ASK-1, ASK-1 → Map3K7) that are present in the data-generating model and are necessary to fully explain the in silico data.
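Properties (ii) and (iii) amount to set differences between two edge lists. The Python sketch below uses only the edges named in the list above, so both sets are small fragments rather than the full 33-edge network; treating the two MKK4 edges as shared by both models is an assumption made for illustration.

```python
# Fragment of the PKN: the three extra edges plus two edges assumed to be shared
pkn_edges = {
    ("TNFR", "PI3K"),
    ("PI3K", "Rac"),
    ("Rac", "Map3K1"),
    ("Map3K1", "MKK4"),
    ("Map3K7", "MKK4"),
}

# Fragment of the data-generating model: the two missing edges plus the shared ones
true_edges = {
    ("TRAF2", "ASK-1"),
    ("ASK-1", "Map3K7"),
    ("Map3K1", "MKK4"),
    ("Map3K7", "MKK4"),
}

extra_in_pkn = pkn_edges - true_edges      # property (ii): training can remove these
missing_from_pkn = true_edges - pkn_edges  # property (iii): training cannot add these

print(sorted(extra_in_pkn))
print(sorted(missing_from_pkn))
```

The asymmetry matters: training selects among edges already in the PKN, so superfluous edges can be pruned, whereas absent edges lie outside the search space.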

The purpose of these gaps and errors in our 'prior knowledge' is to demonstrate the ability of CellNOptR to train context-specific models from unspecific prior knowledge and also to demonstrate the limitations of such an approach when information is incomplete. We will also demonstrate how CellNOptR performs when trying to find the true network topology and model parameters by using the different logic model formalisms to simulate the 'experimental data', and hence demonstrate the strengths, weaknesses and underlying assumptions of each of the logic model formalisms in turn. The network has been designed such that the features uncovered by the logic formalisms are not confounded by the missing interactions.

2. Boolean steady state

In arguably the simplest case of data, an experimental design looking at a particular signal transduction network will consist of a set of measurements representing the phosphorylation state of a subset of proteins in the signaling network. These measurements will be taken before the addition of a stimulus or stimuli and at a single time point after stimulation (t = 0 and t = t₁). Additionally, the effects of multiple conditions (inhibitions, perturbations) may also be examined with this design. This is a common approach when studying signal transduction, which has classically been used via low-throughput methods, and has more recently been scaled-up owing to new technological developments (Terfve and Saez-Rodriguez 2012).

Choosing a single time point after stimulation leads to a simple design and minimizes the cost per experiment. However, it then becomes critical to choose an appropriate time t₁ (see figures S5–7 (available from stacks.iop.org/PhysBio/9/045003/mmedia)). Ideally, one should perform a set of detailed time course experiments that encapsulates the variation in activation in the system, but this is usually not viable in terms of cost and time constraints. It may be only possible to perform a detailed time course experiment for a single phosphoprotein. From this, a time point can be chosen that is characteristic of the activation of the phosphoproteins of interest. Typically, in signal transduction, a fast wave of activation occurs over a short timescale after stimulation. This is followed by slower, later mechanisms that often downregulate the signals over a longer timescale (e.g. degradation, internalization etc).

Returning to our example, the measurement of phosphorylated ERK could be viewed as a sensible output with which a time course can be obtained. (Its activation would be representative of the dynamics of the MAPK cascade and it is technically a good choice because of the quality of ERK phosphosite-specific antibodies.) From this time course, we would see that two different timescales seem to exist: an early activation phase followed by a late phase. Thus, characteristic time points can be chosen (figure 2), and a reasonable early time point would be in the range 4–12 min. For argument's sake, we will choose t₁ = 10 min.

We can see from the data the difficulty of defining a characteristic time point, and how choosing a single time point may affect the ability to capture all dynamic features. For example, it is impossible to understand the oscillatory nature of NFκB translocation with a single time point, and ERK activation dynamics can only be partly representative of other phosphoprotein dynamics (even those closely related in function, such as Raf1). For the oscillations of NFκB, one would need to sample at least every 2.5 min (since the period is 5 min), while to obtain an approximate sense of the transient activation of ERK, two well-chosen time points can be enough. In spite of this, steady-state measurement can give a qualitative overview of the system that allows for robust, albeit coarse-grained conclusions, with relatively few data points (and thus cost).
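The sampling requirement quoted for NFκB follows from the rule that an oscillation must be sampled at least twice per cycle to be visible. The Python sketch below illustrates this with an idealized cosine; the cosine shape and the function names are assumptions for illustration and do not reproduce the in silico NFκB signal.

```python
import math


def max_sampling_interval(cycle_min: float) -> float:
    """Largest sampling interval that still gives two samples per cycle."""
    return cycle_min / 2.0


def sample_oscillation(cycle_min, interval_min, n_samples):
    """Sample an idealized cosine oscillation at a fixed interval."""
    return [
        math.cos(2.0 * math.pi * i * interval_min / cycle_min)
        for i in range(n_samples)
    ]


print(max_sampling_interval(5.0))
# Sampling once per cycle: every sample lands on a peak, so the signal looks flat
print([round(v) for v in sample_oscillation(5.0, 5.0, 4)])
# Sampling twice per cycle: peaks and troughs alternate, so the oscillation is visible
print([round(v) for v in sample_oscillation(5.0, 2.5, 4)])
```

A single time point is the extreme case of undersampling, which is why the steady-state design cannot reveal the oscillation at all.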

2.1. Steady-state optimization and simulation

One way to measure a model's ability to fit experimental data with a single time point, such as that described above, is to assume that the system has reached a pseudo-steady state at that point: the fast reactions have already occurred, while the slow reactions have not yet significantly affected the network's behavior (Klamt et al 2006). This approximation implies that the flux through the system (in our case, the phosphorylation cascade in signal transduction) has stabilized and the quantities of phosphorylated proteins are no longer varying to a significant degree. With this assumption, a model of this system can be simulated until it has also reached a steady state.

With the in silico data (figure 2) as our starting point, the PKN (figure 1) was trained using the steady-state model formalism at t₁ = 10 min. Details about the node states and transfer functions of this formalism (Boolean steady state) are summarized in section 8.3. Figure 3 shows the steady-state simulation overlaid on the experimental data.
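Simulating a Boolean model "until it has reached a steady state" can be sketched as repeated updating until no node changes. The Python code below is a minimal illustration and not the CellNOptR simulator: the three-node cascade, the direct EGF to SOS-1 edge and the function names are simplifying assumptions.

```python
def synchronous_step(state, rules):
    """Update every regulated node at once from the previous state."""
    new_state = dict(state)
    for node, rule in rules.items():
        new_state[node] = rule(state)
    return new_state


def simulate_to_steady_state(initial, rules, max_steps=50):
    """Iterate until the state stops changing; return None if it never does."""
    state = dict(initial)
    for _ in range(max_steps):
        next_state = synchronous_step(state, rules)
        if next_state == state:
            return state
        state = next_state
    return None


# Hypothetical linear cascade; EGF is an input and has no rule of its own
cascade_rules = {
    "SOS-1": lambda s: s["EGF"],
    "Raf-1": lambda s: s["SOS-1"],
    "ERK": lambda s: s["Raf-1"],
}
start = {"EGF": True, "SOS-1": False, "Raf-1": False, "ERK": False}

steady = simulate_to_steady_state(start, cascade_rules)
print(steady)
```

The resulting node values at steady state are what would be compared with the measurements at t₁ when scoring a candidate model.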

**Figure 3.** The fit of the trained model using the Boolean steady-state formalism. The simulated data are shown as two blue circles (t₀ and t₁) connected by a blue dotted line. The colors represent the goodness of fit between the model and the data at t₁ = 10. Heat-map coloration is used to signify the range from high error (red, normalized mean squared error (MSE) = 1) to no error (white, MSE = 0). t is measured in minutes and the y-axis is the normalized activity of the measured proteins. The training in CellNOptR took 180 s.

2.2. Interpretation of steady-state result

The Boolean steady-state formalism used by CellNOptR for optimization recovers most of the underlying 'true' network and hence gives a good steady-state approximation of the in silico data (see figures 3 and 8). However, there are some exceptions that highlight the limitations of steady-state measurements. Using this formalism, CellNOptR cannot identify the NFκB oscillations caused by feedback: hyperedges that cause negative feedback are penalized in CellNOptR as a steady state cannot be reached when they are present. Another limitation is that the state of each element in the model is limited to 0/1 (either switched on or off). Hence, intermediate levels of activation cannot be simulated (such as p38 activation under TNFα and EGF stimulation). Finally, the effect of the missing pathway from TNFα to AP1 is observed when the experimental measurement cannot be explained with TNFα stimulation in the absence of EGF stimulation.
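Why a negative feedback loop prevents a steady state can be seen in a two-node toy example. The Python sketch below iterates a synchronous Boolean model and reports whether it settles or cycles; the nodes S, A and B and the function `find_attractor` are hypothetical and are not taken from the example network or from CellNOptR.

```python
def find_attractor(initial, rules, max_steps=100):
    """Iterate synchronously; return ("steady", state) or ("cycle", length)."""
    seen = {}
    state = dict(initial)
    for step in range(max_steps):
        key = tuple(sorted(state.items()))
        if key in seen:
            length = step - seen[key]
            return ("steady", state) if length == 1 else ("cycle", length)
        seen[key] = step
        new_state = dict(state)
        for node, rule in rules.items():
            new_state[node] = rule(state)
        state = new_state
    return ("unknown", None)


start = {"S": True, "A": False, "B": False}

# Without feedback: S activates A, A activates B
no_feedback = {"A": lambda s: s["S"], "B": lambda s: s["A"]}
# With negative feedback: B additionally inhibits A
with_feedback = {"A": lambda s: s["S"] and not s["B"], "B": lambda s: s["A"]}

print(find_attractor(start, no_feedback))
print(find_attractor(start, with_feedback))
```

With the inhibitory edge in place the model cycles through four states indefinitely, so there is no steady state to compare with a single-time-point measurement.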

Thus, the strength of steady-state Boolean logic is strongly dependent on the assumptions underlying the data. If one has enough knowledge of the data and biochemistry such that the assumption of steady state is a fair one to make, training a network to data using steady-state Boolean logic modeling can uncover cell-specific behavior, for example, differences between cancer and normal cells (Saez-Rodriguez et al 2011a). Another advantage is the scalability of such an approach: because the method is parameter-free, large networks can be trained under a large number of conditions.

3. Two time points (or additional steady state)

As mentioned in section 2, it is quite common in signaling networks to observe a transient behavior where a species is quickly activated and subsequently deactivated. Such a dynamic obviously cannot be captured with a steady-state approach where only one time point is considered. Therefore, in the above section, this issue was avoided by only modeling 'fast events', i.e. the activation phase of the signal propagation. However, when information about more than one time point is available and such a fast activation followed by slow deactivation (or indeed any combination of slower and faster processes) is observed, then it is possible to also capture these processes while keeping the simplifying assumption of steady states. In essence, it is assumed that multiple pseudo-steady states reflect the mechanisms that are acting at different timescales and they can be optimized independently. We will illustrate this with the CellNOptR implementation for two timescales, but the approach is extendable to more than two time points.

Defining suitable time points that adequately represent the process timescales that we want to model is a similar problem to what was discussed above for a unique steady state, with the added complexity of having to choose more than one point that is consistent for all modeled species. This can be guided by prior knowledge, e.g., if it is known that a receptor is activated on a fast timescale (e.g. 30 min for full activation) by phosphorylation and then deactivated by slow internalization and degradation (e.g. 2 h for full silencing of the signal). However, in general, it is better to develop a detailed time course as stated above. In our case, again using ERK, we would say that a second measurement at 20–30 min would be adequate; 30 min was used for the sake of argument (see figure 4).

**Figure 4.** The fit of the model at two time points t₁ and t₂ using the two steady-state approach. Again the colors are representative of the fit, this time at t₁ = 10 and t₂ = 30. t is measured in minutes and the y-axis is the normalized activity of the measured proteins. The training in CellNOptR took 240 s.

3.1. Multiple steady-state optimization and simulation

In CellNOptR, a model of a system with two steady states at different timescales is simulated by assuming that a subset of the hyperedges (interactions) only become active at a later time point, that is, they operate on a different timescale (Klamt et al 2006). That being the case, the two time points can therefore be optimized separately. In practice, this means that the optimization is done in two steps.

(i) The scaffold model (the model after compression of non-essential nodes and expansion of all possible hyperedges, see figure 1 and materials and methods, section 8.1) derived from the PKN is used to train the model against the data at t₁, thereby identifying hyperedges that best reproduce the data at this time point.
(ii) Hyperedges that were not selected as active at t₁ are used as the search space for training the model at t₂. For simulation (and therefore testing the model fit), candidate models are tested by using the steady state of the t₁ model as an initial state, then computing the steady state from there, including candidate t₂ hyperedges. There is also the additional constraint that whenever hyperedges at t₁ and t₂ influence a node in contradicting ways, the t₂ hyperedge overrules the t₁ hyperedge and the state of the target nodes is locked to the state defined by the t₂ hyperedge.

Besides the additional constraint of the overriding hyperedges described above, the node states and transfer functions are calculated in the same way as the Boolean steady-state formalism (section 8.3).
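The two-step simulation can be illustrated with a minimal Python sketch. This is not CellNOptR code: the function names (`steady_state`, `simulate_t2`), the reduced EGF → SOS-1 → Raf-1 → ERK chain and the simplification of one hyperedge per node and time point are all assumptions made for illustration.

```python
# Hypothetical sketch of the two-step simulation; names are illustrative only.

def steady_state(state, rules, locked=(), max_steps=50):
    """Apply all rules synchronously until the state stops changing."""
    for _ in range(max_steps):
        new = dict(state)
        for node, rule in rules.items():
            if node not in locked:
                new[node] = rule(state)
        if new == state:
            return state
        state = new
    raise RuntimeError("no steady state reached (possible oscillation)")


def simulate_t2(state_t1, t1_rules, t2_rules):
    """Start from the t1 steady state and add the candidate t2 hyperedges."""
    state, rules, locked = dict(state_t1), dict(t1_rules), set()
    for node, rule in t2_rules.items():
        value = rule(state_t1)
        if node not in t1_rules:
            rules[node] = rule                 # no competing t1 hyperedge
        elif value != t1_rules[node](state_t1):
            state[node] = value                # t2 overrules t1 ...
            locked.add(node)                   # ... and the node is locked
    return steady_state(state, rules, locked)


# Hyperedges assumed to be selected at t1: EGF -> SOS1 -> Raf1 -> ERK
t1_rules = {
    "SOS1": lambda s: s["EGF"],
    "Raf1": lambda s: s["SOS1"],
    "ERK": lambda s: s["Raf1"],
}
# Candidate t2 hyperedge: slow negative feedback, ERK -| SOS1
t2_rules = {"SOS1": lambda s: 1 - s["ERK"]}

start = {"EGF": 1, "SOS1": 0, "Raf1": 0, "ERK": 0}
state_t1 = steady_state(start, t1_rules)
state_t2 = simulate_t2(state_t1, t1_rules, t2_rules)
print(state_t1)
print(state_t2)
```

In this sketch the t₂ feedback contradicts the t₁ activation of SOS-1, so SOS-1 is locked OFF and the rest of the chain relaxes to a new steady state.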

3.2. Interpretation

In our example, we can see that the two steady-state optimization finds the feedback from ERK back to SOS-1 (figure 8). Hence, from figure 4, the transient activation of Raf-1, ERK and AP1 is captured in the trained model. Using a single characteristic time point, a model that includes the negative feedback from ERK to SOS-1 at t₁ would not be selected, as the branch never reaches a stable steady state because of oscillation. However, if we say that the branch is active at t₁ but that the negative feedback is only active at t₂, and that when active, this negative feedback permanently turns SOS-1 off, then the model does reach a steady state at t₁ (where SOS-1, Raf-1 and ERK are all ON) and a different steady state at t₂ (where SOS-1, Raf-1 and ERK are all OFF as a result of the activated negative feedback).

4. Synchronous multiple time-point simulation and multiple timescales

As discussed in section 3, by measuring at two characteristic time points, the trained logic model is capable of finding the slow negative feedback from ERK to SOS-1 and therefore moves a step closer to understanding the 'true' network. However, the oscillations of NFκB still cannot be explained with the pseudo-steady-state formalism, as it is necessary to use the full time course data (and not just two time points) to observe this effect. This can be modeled by a discrete time Boolean model that is available as an add-on R package to CellNOptR: CNORdt (discrete time).

4.1. Synchronous and asynchronous updating

CNORdt introduces some variation in how time is handled in the model. Instead of simulating and fitting data at steady states, it is capable of fitting time course data by using an additional model parameter together with a synchronous updating scheme.

Synchronous updating is where all nodes are updated simultaneously during model simulation: hence, each node at time t is a function of its input nodes at t − 1 (see section 8.3). This is the updating scheme used in CellNOptR. An alternative method is asynchronous updating, where nodes are updated in a random or non-synchronous order, depending on the asynchronous method used. This leads to different simulation properties depending on the updating method chosen. Synchronous updates are deterministic and simulations run under the same conditions (inputs and perturbations) will reach the same steady state (or attractor) each time. In contrast, asynchronous updating introduces stochasticity into the system such that different steady states can be reached from the same starting conditions. The random updating of node values is one possible application of asynchronicity. This enables sampling over all timescales (any reaction can be deemed to be slowest or fastest), thus avoiding the constraint inherent in synchronous simulations of an equal timescale over all reactions. However, this added complexity can make results difficult to interpret (Garg et al 2008). Mixed synchronous/asynchronous updating is an intermediate approach that can stratify reaction groups according to their known reaction rates, thus taking advantage of a priori knowledge and reducing the complexity of a fully asynchronous approach (Faure et al 2006, Albert et al 2008, Assmann and Albert 2009, Garg et al 2008).
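The difference between the two updating schemes can be seen in a small Python sketch. The two-node mutual-inhibition network and all function names are hypothetical and are not taken from CellNOptR or the toy model; random single-node updating is only one of several possible asynchronous schemes.

```python
import random

# Hypothetical two-node network in which A and B inhibit each other.
RULES = {
    "A": lambda s: int(not s["B"]),
    "B": lambda s: int(not s["A"]),
}


def sync_step(state):
    """Synchronous update: every node reads the state at t - 1."""
    return {node: RULES[node](state) for node in state}


def async_step(state, rng):
    """Asynchronous update: one randomly chosen node is updated."""
    node = rng.choice(sorted(state))
    new = dict(state)
    new[node] = RULES[node](state)
    return new


def run(step, state, n_steps, *args):
    trajectory = [state]
    for _ in range(n_steps):
        state = step(state, *args)
        trajectory.append(state)
    return trajectory


start = {"A": 0, "B": 0}

# Deterministic: the same cyclic attractor (0,0) -> (1,1) -> (0,0) every run.
sync_traj = run(sync_step, start, 4)

# Stochastic: the end state depends on which node happens to update first.
async_ends = set()
for seed in range(20):
    end = run(async_step, start, 10, random.Random(seed))[-1]
    async_ends.add((end["A"], end["B"]))

print(sync_traj)
print(async_ends)
```

From the same starting state, the synchronous run always follows the same cycle, whereas each asynchronous run settles in either (A, B) = (1, 0) or (0, 1).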

CNORdt introduces a scaling parameter that defines the timescale of the Boolean synchronous simulation. Where each 'tick' (t) (or simulation step) is the synchronous updating of all nodes in the model according to their inputs at t − 1, the scaling parameter defines the 'tick' frequency relative to the timescale of the real data. Although this is a crude approach (i.e. it implies a single rate across all reactions), it allows us to fit a synchronous Boolean simulation to data. Hence, all data points can be fitted to the model and hyperedges that cause feedback in the model can be included, which allows the model to reveal more complex dynamics such as oscillations. CNORdt still describes the node states as either on or off (1/0) and the transfer functions are calculated as in section 8.3. The scaling parameter is applied to the simulation of the system and hence does not affect the transfer functions themselves.
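As a rough illustration of the role of the scaling parameter, the sketch below fits the number of minutes per simulation 'tick' (the inverse of a tick frequency) for a three-node negative feedback loop. It is not CNORdt code: the network, the data values, the candidate grid and the rule of reading the simulation at the tick index `t // minutes_per_tick` are assumptions chosen to keep the example small.

```python
def nfkb_trace(n_ticks):
    """NFkB state per tick for the loop NFkB -> ex -> IkB -| NFkB."""
    state = {"NFkB": 1, "ex": 0, "IkB": 0}
    trace = [state["NFkB"]]
    for _ in range(n_ticks):
        # Synchronous update: all nodes read the previous state.
        state = {
            "NFkB": 1 - state["IkB"],
            "ex": state["NFkB"],
            "IkB": state["ex"],
        }
        trace.append(state["NFkB"])
    return trace


def mse_for(minutes_per_tick, times, data):
    """MSE between data and the simulation read at the matching ticks."""
    trace = nfkb_trace(max(times) // minutes_per_tick)
    sim = [trace[t // minutes_per_tick] for t in times]
    return sum((s - d) ** 2 for s, d in zip(sim, data)) / len(data)


# Hypothetical time course (minutes) of normalized NFkB activity.
times = [0, 5, 10, 15, 20, 25, 30]
data = [1, 1, 1, 0, 0, 0, 1]

errors = {m: mse_for(m, times, data) for m in (1, 2, 5, 10)}
best = min(errors, key=errors.get)
print(errors)
print(best)
```

Because the feedback loop is kept in the model, the simulation oscillates, and only the scaling affects how well the oscillation lines up with the measured time points; the update rules themselves are unchanged.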

Figures 5 and 8 show how the NFκB oscillations can be predicted by fitting a dynamic logic model to the full time course and maintaining the two steady-state assumptions from section 3, i.e. simulating 'fast' reactions from t = 0 to t = 10 and 'slow' reactions from t = 10 to t = 30.

**Figure 5.** The fit of the model at multiple time points using fast (t = 0 to t = 10) and slow (from t = 10 to t = 30) timescales. t is measured in minutes and the y-axis is the normalized activity of the measured proteins. The training in CNORdt took 300 s.

5. Constrained fuzzy logic

One of the main limitations of Boolean logic models is that the assumption of a single level of activation (species can only be on/off) is biochemically unrealistic. Fuzzy logic is another logic modeling formalism that allows for intermediate levels of activation. It was originally developed in the field of control theory for predicting the outputs of complex processes where inputs could only partially be characterized (Morris et al 2011a). Its strength lies in the flexibility it affords when defining relationships between input and output nodes. This flexibility can also be a weakness if a large number of parameters are required to define these functional relationships. Constrained fuzzy logic (cFL) deals with this potential complexity by limiting the repertoire of relationships between nodes. The cFL formalism used in CellNOpt (CNORfuzzy) is fully described in Morris et al (2011a). Briefly, the relationships (or transfer functions) between nodes in cFL are limited to Hill functions. Hence, each transfer function has two free parameters: the Hill coefficient n, which controls the steepness of the function, and the sensitivity parameter k, which determines the midpoint of the function (i.e. the value of the input that produces half the maximal output). By varying these two parameters, linear, sigmoidal and step-like dynamics can be produced that are good approximations to protein–protein interactions and enzymatic reactions. In CNORfuzzy, further constraints are imposed by initially limiting the possible parameter combinations to a subset of discrete values. Details of the transfer functions used can be found in materials and methods, section 8.4.

5.1. Model training and simulation

Model training and simulation in CNORfuzzy are carried out in a similar manner to the Boolean steady-state formalism. After compression and expansion of the logic hypergraph, a genetic algorithm determines transfer functions and a network topology that minimize the MSE between the model and the data at steady state. This is followed by a number of refinement steps that fine-tune the Hill function parameters and reduce the complexity of the network topology. The in silico data and model fit at t₁ = 10 are shown in figure 6.

5.2. Interpretation

CNORfuzzy is capable of fitting intermediate values (figure 6). For most cases, the cFL model generates similar fits to the steady-state Boolean model. However, the fit to data is more accurate since the values are continuous and not limited to 0 or 1. More importantly, the cFL model obtains a better fit for p38, as it uncovers a link in the structure that Boolean models are unable to capture. In the 'true' network, TNFα and EGF are both required to activate p38 (albeit the activation is low relative to the other signals). In the previous Boolean formalisms, this low activation of p38 cannot be modeled as the simulation can only take the values 0/1. However, CNORfuzzy is capable of adding the hyperedge 'Map3K1 AND Map3K7 → MKK4' (figure 8) to explain this activation and hence move a step closer to finding the underlying true network.

The CNORfuzzy model fit also illustrates some caveats associated with fuzzy logic. We can see that CNORfuzzy also retains the Map3K7 → p38 hyperedge (figure 8), thus activating p38 with TNFα stimulation alone (i.e. in the absence of EGF stimulation). This occurs as CNORfuzzy attempts to fit the noisy signal of inactive p38, thus adding a hyperedge that is not present. CNORfuzzy also adds hyperedges from TNFα to AP1 that convey a weak activating signal to compensate for the missing hyperedges (TRAF2 → ASK-1, ASK-1 → Map3K7) from the PKN (figure 1). These examples illustrate the sensitivity of the cFL approach to the data quality, and this can make interpretation of the results more subtle and difficult (Morris et al 2011a).

6. Logic ODEs

The Boolean logic formalisms described above can qualitatively fit the network topology and logic gates that best describe the underlying data. cFL can add quantitative information by its ability to fit intermediate values between 0 and 1 at steady state. In terms of time, however, all these formalisms rely on discrete simulations. To obtain a fully continuous model both in state and time, CNORode adds to these methods by transforming a discrete logic model to a continuous model. It does this by defining a set of ODEs for each model species. There are several formalisms to convert discrete logic to continuous models (e.g. SQUAD (Di Cara et al 2007)) or hybrid models (e.g. piecewise linear models, (de Jong 2002)). CellNOpt includes the method developed by Wittmann et al (2009) that was implemented in Matlab as Odefy (Krumsiek et al 2010).

6.1. Converting from Boolean to continuous

The approach used to convert Boolean to continuous models is fully explained in Wittmann et al (2009). Briefly, the goal is to simulate the full dynamics of each species in the logic model while retaining consistency with the Boolean representation. What this means is that, where the output of a logic gate is 0 or 1, the ODEs replacing a Boolean state should also return to 0 or 1. This is achieved in a similar manner to cFL (but with an additional parameter τ) by applying a normalized Hill function on the interval [0, 1]. Applying these functions to each hyperedge defines a new continuous ODE model to replace the underlying Boolean model. This is more fully explained in section 8.5.

6.2. Parameter estimation

CNORode currently provides links to two stochastic, non-local optimization algorithms: a genetic algorithm (genalg package, http://cran.r-project.org/web/packages/genalg/) and an implementation in R of scatter search (Egea and Martí 2010). These are used to fit the Hill function parameters k and n and the ODE parameter τ to each logic gate in a model that has been already topologically optimized by one or more of the other formalisms.

6.3. Compressing an ODE model

Compression of the model before training may lead to the loss of elements important to capture dynamic features, and must thus be done with caution. Returning to our example (figure 2), the in silico data were generated through a set of normalized Hill functions. Hence, with the exception of AP1 (where the missing hyperedge prohibits any exact simulation of this signal), CNORode should be capable of simulating exactly the other signals in the system after parameter optimization of the associated logic ODEs. However, this may not be possible when the model is compressed. To give an example, in our toy model (figure 1), the pathway consisting of SOS-1, Ras, Raf-1, MEK 1 and ERK is compressed to SOS-1 → Raf-1 → ERK. The in silico data were generated with ODEs describing the uncompressed interactions. We can see from figure 7 that the compressed model can accurately simulate the in silico data for this pathway (Raf-1 and ERK signals). In this case, the normalized Hill functions have enough dynamic plasticity to summarize four interactions (SOS-1 → Ras → Raf-1 → MEK 1 → ERK) as two (SOS-1 → Raf-1 → ERK). However, this is not the case where we have feedback from ERK through a phosphatase (ph) back to SOS-1 and from NFκB through expression (ex) back to IκB. In these cases, it is necessary to not compress 'ph' and 'ex' to allow CNORode to model the correct dynamics (transience and oscillations, respectively). The non-compression is required as 'ph' and 'ex' are integral to the dynamics observed in the in silico data. Figures 7 and 8 therefore show that, with the exception of AP1, CNORode can accurately model the in silico data of the toy model once compression of those key nodes is suppressed.

**Figure 7.** The fit of the trained model using CNORode. t is measured in minutes and the y-axis is the normalized activity of the measured proteins. The parameter training in CNORode took 2000 s.

**Figure 8.** The contribution of each logic modeling formalism to the understanding of the model used to simulate the *in silico* training data. The time taken for training the model using each formalism is also shown.

7. Summary and future developments

In this contribution, we have reviewed different logic-based approaches to model signal transduction networks. Recent developments in proteomics techniques, from antibody-based methods (xMAP, protein arrays, high-throughput microscopy, etc) to mass spectrometry methods (Terfve and Saez-Rodriguez 2012), allow us to generate a large amount of phosphoproteomic data. Given the size of the underlying networks, we believe that logic-based models, which do not need extensive biochemical detail and thus lead to tractable models even when dealing with multiple pathways, are a useful approach to analyzing signal transduction on a large scale. Therefore, we have focused our work on how to train logic models to experimental data, and implemented various methodologies toward this end in our tool CellNOptR.

Our recent developments, presented here, expand our previous work by including strategies to deal with the inherent dynamic nature of signaling processes (and hence with time series data). We have discussed how modeling dynamic aspects requires more detailed formalisms (and thus in general more data and computational time), and how the general methodology has to be re-evaluated at multiple levels, in particular, the compression of the network prior to the optimization: hence, we are currently working to develop a general compression routine for dynamic models. Another area of active development is the implementation of efficient optimization strategies to identify both structure and (if existing) continuous parameters (Banga 2008). Although we have covered here a broad palette of logic-based formalisms, we plan to explore other approaches. Some are combinations of what we have discussed (e.g. a cFL formalism simulated over multiple timescales), others are formalisms related to those used here (e.g. SQUAD), and others could add new features, such as a probabilistic framework (Shmulevich et al 2003), stochasticity (Albert et al 2008) or formal methodologies (Fisher and Henzinger 2007).

For the sake of simplicity, we have used a toy model that is itself based on a logic formalism to exemplify the potential dynamic behavior and thereby different modeling variants. We are currently working on more realistic benchmarks based on biochemical models, and studying in more detail the role of experimental noise and experimental design in recovering the underlying model structure.

As illustrated in our example with the link TRAF2 → ASK-1 → MKK7, databases are comprehensive but not complete, and it is therefore likely that important links are missing from the system of interest (Kirouac et al 2012). To overcome this limitation, we are working on strategies to integrate as many network resources as possible. These include methods that propose novel links that expand the prior knowledge network (Saez-Rodriguez et al 2009, Eduati et al 2010), and the use of information from PINs (Vinayagam et al 2011).

The focus of CellNOptR is the calibration of logic models to data, but a large set of other tools exist that analyze logic models from different angles (Morris et al 2010). For example, the Q2LM toolbox (Morris et al 2011b) uses cFL to understand the effect of perturbations in the context of the whole system under investigation (e.g. under what set of stimuli is a therapeutic perturbation most effective?). CellNetAnalyzer (Klamt et al 2007) has a battery of methods from graph theory, as well as specific techniques for logic models. These include minimal intervention sets (the minimum number of perturbations for a desired phenotype) to propose possible therapeutic targets. These tools use the same model format as CellNOptR so it is easy to pass models for analysis. More generally, we are part of the CoLoMoTo initiative, which aims to facilitate interoperability among these tools; the main goal here is the development of SBML-qual as a language to exchange logic models (sbml.org/Community/Wiki/SBML_Level_3_Proposals/Qualitative_Models), as well as the implementation of the SBGN format for network representation (Novère et al 2009).

In general, efficient integration of data and prior knowledge to model signal transduction requires the use of appropriate standards for data, prior knowledge about the networks and the models themselves (Saez-Rodriguez et al 2011a). We consider that logic models will be an area of development in the future with increasing application to signal transduction research.

8. Materials and methods

8.1. CellNOptR

As mentioned in section 1.4, CellNOptR includes some additional steps in pre-processing logic models before simulation and training to data. The details of these steps can be found in Saez-Rodriguez et al (2009). Briefly, the model is compressed by removing non-identifiable elements. These include nodes on terminal branches that are not part of the experimental design (non-observables: figure 1: p90RSK and CREB), nodes that are not affected by the inputs or perturbations (non-controllables) and additional nodes that can be removed without affecting logic outcome during simulation (figure 1: Ras, MEK 1 etc).

After this compression step, a superstructure of all possible hyperedges is created (figure 1, inset). This superstructure contains 'the space' of hyperedges that is optimized (through the removal of redundant hyperedges) by training to the experimental data. The training uses a genetic algorithm to search for logic models that minimize a bipartite function. This function includes the MSE between the simulation of the optimized logic model and the data and a penalty term for model size. Depending on the formalism used (see the main text), the simulation and data may be at steady state (CellNOptR, CNORfuzzy) or all data points can be used (CNORdt). The resulting logic model is then a subset of the superstructure and contains only the hyperedges that best explain the experimental data (with the additional attribute of parsimony given the size penalty in the optimization function).
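The shape of this objective can be sketched in Python for a superstructure small enough to enumerate. The sketch makes several assumptions that are not taken from CellNOptR: the three candidate hyperedges, the data values, the linear form and weight of the size penalty (`size_penalty`), and the use of exhaustive enumeration in place of the genetic algorithm.

```python
from itertools import product

# Hypothetical superstructure: three candidate hyperedges targeting node C.
EDGES = ("A -> C", "B -> C", "A AND B -> C")


def simulate_c(bits, a, b):
    """State of C: OR over the outputs of the selected hyperedges."""
    outputs = (a, b, a & b)
    return int(any(out for out, keep in zip(outputs, bits) if keep))


def score(bits, data, size_penalty=0.01):
    """MSE between simulation and data plus a penalty on model size."""
    mse = sum(
        (simulate_c(bits, a, b) - c) ** 2 for (a, b), c in data.items()
    ) / len(data)
    return mse + size_penalty * sum(bits)


# Hypothetical noisy measurements of C for each (A, B) stimulation.
data = {(0, 0): 0.05, (1, 0): 0.10, (0, 1): 0.00, (1, 1): 0.95}

# Exhaustive search stands in for the genetic algorithm in this toy case.
best = min(product((0, 1), repeat=len(EDGES)), key=lambda b: score(b, data))
selected = [edge for edge, keep in zip(EDGES, best) if keep]
print(selected)
```

Each bit string is one candidate sub-model of the superstructure; the penalty term breaks ties in favor of the smaller model when several fit the data equally well.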

8.2. Network and data generation

The toy model was constructed manually and is based on the model from Saez-Rodriguez et al (2011a). The in silico data were generated from the toy model using CNORode. The parameters were manually adjusted to model as closely as possible the known dynamics of ERK and NFκB activation. After simulation, noise was added to each data point according to N(μ, σ²) where μ = 0 and σ² = 0.05. The data were then rescaled to the interval [0, 1]. Two methods of cross validation were also performed to demonstrate the robustness of CellNOptR (steady-state Boolean) to sparseness in the data (figure S8 (available from stacks.iop.org/PhysBio/9/045003/mmedia)).

Model and data files, together with the corresponding R scripts, can currently be found at http://www.ebi.ac.uk/~aidanmac/public/logicModelingTutorial (password: tutorial).

8.3. Boolean logic

A Boolean model can be represented as follows:

(1)
N species X₁, X₂, ..., X_N, each represented by a variable x_i, taking values 0 or 1.
(2)
For each species $X_i$, there is a subset of species $R_i := \{X_{i1}, X_{i2}, \ldots, X_{iN_i}\} \subset \{X_1, X_2, \ldots, X_N\}$ that influence $x_i$.
(3)
And for each species $X_i$, an update function $B_i : \{0,1\}^{N_i} \to \{0,1\}$.

From this set of rules, the state of each species at time t + 1 is a function of the state of its influencing species at time t (Kauffman 1969).

So how does the function B_i (also called a transfer function) for each species X_i deal with inputs from other nodes? B_i can be represented in a sum-of-product (SOP) formulation (Mendelson 1970), which allows for multiple possible inputs (AND, NOT, OR gates) to be processed into a single output. To illustrate this, consider the following simple example (figure 9).

**Figure 9.** An overview of the graphical representation of logic models. (A) The SOP expression for the activation of C summarized as an XOR gate. (B) SOP expressions describing the activation of C and D. (C) An example of a hypergraph representation where the nodes are connected by hyperedges.

We know that the element D is activated by a combination of A and B (i.e. both A and B are needed for activation). Hence, both the graphical and written representation of this activation is relatively straightforward:

$\begin{equation*} B_1 \left( {a,b} \right) = a \wedge b. \end{equation*}$

However, in the case of the activation of C, this occurs when A is active without B or when B is active without A. In this case, one needs some additional rules of representation.

The SOP representation allows the above activation to be written using only AND, NOT and OR operators:

$\begin{equation*} B_1 \left( {a,b} \right) = \left( {a \wedge \neg b} \right) \vee ( {\neg a \wedge b} ). \end{equation*}$

This is done by calculating the product within brackets and summing between brackets. Essentially, SOP representations are rules of precedence for complex multi-node inputs. In terms of graphically representing the activation of C, its activation cannot be easily represented using standard SBGN AND, NOT or OR operators (figure 9). Hence, this SOP expression can be summarized as an XOR gate.
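A short Python sketch shows how such SOP expressions can be evaluated. The encoding of each product term as a dictionary of required input values and the function name `sop` are assumptions made for this illustration.

```python
from itertools import product


def sop(terms, inputs):
    """Evaluate a sum of products: AND within a term, OR between terms.

    Each term maps an input name to the value it must take,
    so a required value of 0 encodes a NOT.
    """
    return int(any(
        all(inputs[name] == wanted for name, wanted in term.items())
        for term in terms
    ))


C_TERMS = [{"a": 1, "b": 0}, {"a": 0, "b": 1}]  # (a AND NOT b) OR (NOT a AND b)
D_TERMS = [{"a": 1, "b": 1}]                    # a AND b

table = {}
for a, b in product((0, 1), repeat=2):
    inputs = {"a": a, "b": b}
    table[(a, b)] = (sop(C_TERMS, inputs), sop(D_TERMS, inputs))

for (a, b), (c, d) in table.items():
    print(f"a={a} b={b} -> C={c} D={d}")
```

The resulting truth table for C matches that of an XOR gate, while D is active only when both inputs are active.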

A logic network where relations are encoded by SOP expressions can be represented as a hypergraph (Klamt et al 2006). A hypergraph is defined as a set of nodes connected by hyperedges, where a hyperedge is a generalization of an edge that can be connected to more than two nodes. This in turn can facilitate a more precise representation of biological knowledge (for example, where two proteins are necessary for the activation of a target).

8.4. Fuzzy logic

cFL defines the transfer function between nodes as a Hill function. Depending on the type of interaction (or logic gate, figure 10) this function can take different forms (Morris et al 2011a).

(a)
If node C depends only on A, a normalized Hill function is used to calculate C where k and n are the sensitivity coefficient and Hill coefficient, respectively:
$\begin{equation*} c = (k^n + 1)\frac{{a^n }}{{k^n + a^n }}. \end{equation*}$
(b)
An inhibitory relationship is represented as the above expression subtracted from 1:
$\begin{equation*} c = 1 - (k^n + 1)\frac{{a^n }}{{k^n + a^n }}. \end{equation*}$
(c)
For an AND gate, the minimum value is used, with each input having its own parameters:
$\begin{equation*} c = {\rm min}\left( {\left( {k_1 ^{n_1 } + 1} \right)\frac{{a^{n_1 } }}{{k_1 ^{n_1 } + a^{n_1 } }},\;\left( {k_2 ^{n_2 } + 1} \right)\frac{{b^{n_2 } }}{{k_2 ^{n_2 } + b^{n_2 } }}} \right). \end{equation*}$
(d)
And for an OR gate the maximum value is used:
$\begin{equation*} c = {\rm max}\left( {\left( {k_1 ^{n_1 } + 1} \right)\frac{{a^{n_1 } }}{{k_1 ^{n_1 } + a^{n_1 } }},\;\left( {k_2 ^{n_2 } + 1} \right)\frac{{b^{n_2 } }}{{k_2 ^{n_2 } + b^{n_2 } }}} \right). \end{equation*}$
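The four transfer functions above can be written as a minimal Python sketch. This is not the CNORfuzzy implementation; the function names and the parameter values in the usage lines are assumptions, and each gate input is given its own (k, n) pair.

```python
def hill(x, k, n):
    """Normalized Hill function: hill(0) = 0 and hill(1) = 1."""
    return (k**n + 1) * x**n / (k**n + x**n)


def activation(a, k, n):
    """(a) C depends only on A."""
    return hill(a, k, n)


def inhibition(a, k, n):
    """(b) Inhibitory relationship: the activation subtracted from 1."""
    return 1 - hill(a, k, n)


def and_gate(a, b, params_a, params_b):
    """(c) AND gate: minimum of the two transfer function outputs."""
    return min(hill(a, *params_a), hill(b, *params_b))


def or_gate(a, b, params_a, params_b):
    """(d) OR gate: maximum of the two transfer function outputs."""
    return max(hill(a, *params_a), hill(b, *params_b))


# Hypothetical parameters (k, n) and intermediate input levels.
params = (0.5, 3)
print(activation(0.8, *params))
print(inhibition(0.8, *params))
print(and_gate(0.8, 0.2, params, params))
print(or_gate(0.8, 0.2, params, params))
```

With Boolean inputs the gates reduce to their Boolean counterparts, for example `and_gate(1, 0, ...)` gives 0 and `or_gate(1, 0, ...)` gives 1, while intermediate inputs give intermediate outputs.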

8.5. Logic ODEs

As in the case of cFL, CNORode uses phenomenological transfer functions (i.e. non-mechanistic normalized Hill functions) to describe the dynamics of a node's state as a function of its inputs. Using the examples in figure 10 again, these functions can be described as follows:

(a)
$\dot{\bar c} = \frac{1}{\tau }(\bar B(\bar a) - \bar c),$ where $\dot{\bar c}$ is the development of c over time and $\bar B(\bar a)$ is the normalized Hill function of the continuous variable $\bar a$. This takes the form $\bar B(\bar a) = \frac{\bar a^n }{k^n + \bar a^n } \Big/ \frac{1^n }{k^n + 1^n }$ (k and n are again the sensitivity and Hill coefficients, respectively). τ sets the timescale on which $\bar c$ relaxes toward $\bar B(\bar a)$ (biologically, this could encompass degradation or other limiting factors), and the $-\bar c$ term acts as a degradation term proportional to $\bar c$.
(b)
An inhibitory relationship is obtained by subtracting the Hill term from 1: $\dot{\bar c} = \frac{1}{\tau }(1 - \bar B(\bar a) - \bar c)$.
(c)
The AND gates take the form $\dot{\bar c} = \frac{1}{\tau }(\bar B(\bar a) \cdot \bar B(\bar b) - \bar c)$.
(d)
The OR gate notation is as follows: $\dot{\bar c} = \frac{1}{\tau }(\bar B(\bar a) \cdot \bar B(\bar b) + \bar B(\bar a) \cdot [1 - \bar B(\bar b)] + \bar B(\bar b) \cdot [1 - \bar B(\bar a)] - \bar c).$

In the case of an AND gate, the product of $\bar B\left( {\bar a} \right)$ and $\bar B(\bar b)$ is taken, which maintains consistency in the output with the equivalent Boolean model (i.e. if a = 1 and b = 0, in both ODE and logic formalisms c = 0; similarly, with an OR gate if a = 1 and b = 0, in both ODE and logic formalisms c = 1). As in the case of cFL, normalized Hill functions can approximate commonly observed biochemical dynamics such as linear, sigmoidal and step-like behavior.
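The consistency with the Boolean model can be checked numerically with a small Python sketch. This is not CNORode or Odefy code: the forward-Euler integrator, the function names and the parameter values (k = 0.5, n = 3, τ = 1) are assumptions chosen for illustration, and the inputs a and b are held constant.

```python
def hill(x, k, n):
    """Normalized Hill function B(x), with B(0) = 0 and B(1) = 1."""
    return (x**n / (k**n + x**n)) / (1.0 / (k**n + 1.0))


def and_target(a, b, k=0.5, n=3):
    """Continuous AND: product of the two Hill terms."""
    return hill(a, k, n) * hill(b, k, n)


def or_target(a, b, k=0.5, n=3):
    """Continuous OR, written term by term as in the text."""
    ba, bb = hill(a, k, n), hill(b, k, n)
    return ba * bb + ba * (1 - bb) + bb * (1 - ba)


def integrate(target, a, b, tau=1.0, c0=0.0, dt=0.01, t_end=50.0):
    """Forward-Euler integration of dc/dt = (target(a, b) - c) / tau."""
    c = c0
    for _ in range(int(round(t_end / dt))):
        c += dt * (target(a, b) - c) / tau
    return c


# Boolean inputs a = 1, b = 0: AND relaxes to 0, OR relaxes to 1.
c_and = integrate(and_target, 1, 0, c0=1.0)
c_or = integrate(or_target, 1, 0, c0=0.0)
print(c_and, c_or)
```

In this sketch τ changes only how quickly c approaches its target, not the value it settles at.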

Acknowledgments

The authors thank J Banga, J Egea, Inna Pertsovskaya and Melody Morris for valuable help and discussion. Funding was provided by the EU-7FP-BioPreDyn and EMBL-EIPOD programs.
