3.1 Building a quantitative foundation for reactive systems theory
Every behavior of a reactive system is an infinite word whose letters represent observable events. The foundations of reactive models distinguish between the linear-time and the branching-time view [6].
The linear-time view In the linear-time view, the set of possible behaviors of a system is collected in a language, which is a set of infinite words. Formally, we consider a language over an alphabet Σ to be a boolean-valued function \(L:\varSigma^{\omega}\rightarrow\mathbb{B}\), rather than a set (think of w∈L iff L(w)=1), to make the connection to quantitative generalizations self-evident. Such languages, which are infinite objects, can be defined using finite-state machines with infinite runs, so-called ω-automata, whose transitions are labeled by letters from Σ. For an ω-automaton A, let L(A) be the language accepted by A. There is a rich and robust theory of finite-state acceptors of languages, namely, the theory of the ω-regular languages [5]. In particular, the ω-regular languages are closed under boolean operations, and the interesting questions about ω-automata—specifically language emptiness, language universality, and language inclusion—can all be decided algorithmically. In the linear-time view, the language inclusion question is the basis for checking if a system satisfies a requirement, and for checking if one system description refines another: given two ω-automata A and B, the system A satisfies the requirement B (respectively, refines the system B) iff L(A)⊆L(B).
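To make the automaton view concrete, the following minimal sketch represents an ω-automaton as a transition table and decides membership of ultimately periodic ("lasso") words of the form u·v^ω, for which the unique run eventually cycles. The sketch assumes a deterministic automaton with Büchi acceptance (a restriction: deterministic Büchi automata do not capture all ω-regular languages), and all names are illustrative rather than taken from the cited literature.

```python
# Minimal sketch: membership of a lasso word u·v^ω in the language of a
# deterministic Büchi automaton. All names are illustrative.

def accepts_lasso(delta, initial, accepting, u, v):
    """Accept u·v^ω iff the unique run visits an accepting state infinitely
    often, i.e., somewhere inside the cycle that the run eventually enters."""
    state = initial
    for letter in u:                      # consume the finite prefix u
        state = delta[(state, letter)]
    seen = {}                             # state at a v-boundary -> traversal index
    hits = []                             # accepting state touched in i-th traversal?
    while state not in seen:
        seen[state] = len(hits)
        hit = state in accepting
        for letter in v:                  # one traversal of the cycle word v
            state = delta[(state, letter)]
            hit = hit or state in accepting
        hits.append(hit)
    return any(hits[seen[state]:])        # acceptance inside the looping part

# Σ = {'a','b'}; the requirement "infinitely many b's" (state 'r' = just saw b):
delta = {('q', 'a'): 'q', ('q', 'b'): 'r', ('r', 'a'): 'q', ('r', 'b'): 'r'}
print(accepts_lasso(delta, 'q', {'r'}, u="a", v="ab"))   # True
print(accepts_lasso(delta, 'q', {'r'}, u="b", v="a"))    # False: b occurs only once
```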
There are two obvious and popular, but orthogonal, quantitative generalizations of languages. In both cases, the general language inclusion problem is open, i.e., we do not even know under which circumstances it can be decided. As the language inclusion problem lies at the heart of all linear-time verification, this is an obviously unsatisfactory situation. Therefore a natural direction to start building a quantitative theory for reactive modeling is to obtain a better understanding of the quantitative language inclusion problem, in all of its formulations.
Probabilistic languages
The first quantitative view is probabilistic. A probabilistic word, which represents a behavior of a probabilistic system, is a probability space on the set \(\varSigma^{\omega}\) of infinite words. We write \(\mathcal{D}(\varSigma^{\omega})\) for the set of probabilistic words. A probabilistic language is a set of probabilistic words, i.e., a function \(L:\mathcal{D}(\varSigma^{\omega})\rightarrow\mathbb{B}\). Probabilistic words can be defined by Markov chains, and probabilistic languages by Markov decision processes (MDPs), whose transitions are labeled by letters from Σ. MDPs generalize ω-automata by distinguishing between nondeterministic states, where an outgoing transition is chosen by a scheduler, and probabilistic states, where an outgoing transition is chosen according to a given probability distribution. Unlike in ω-automata, the scheduler may in general be probabilistic. Given an MDP A and a scheduler, the outcome is a probabilistic word, and by collecting the outcomes of all schedulers in a set, we obtain a probabilistic language L(A). The language inclusion question for MDPs—given two finite-state MDPs A and B, is L(A)⊆L(B)—is open, even if schedulers are required to be nonprobabilistic and if B has no nondeterministic states. A solution is known only for the special case where both A and B have no nondeterministic states; this special case is the equivalence problem for Markov chains [43].
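As a purely structural illustration (not an algorithm from the cited work), the following sketch partitions the states of an MDP into nondeterministic and probabilistic ones; fixing a scheduler and sampling the probabilistic choices yields finite prefixes of one behavior drawn from the induced probabilistic word. All names and the toy MDP are hypothetical.

```python
import random

# Illustrative sketch of an MDP with letter-labeled transitions: nondeterministic
# states defer to a scheduler, probabilistic states draw from a distribution.
mdp = {
    's0': ('nondet', [('a', 's1'), ('b', 's2')]),
    's1': ('prob',   [(0.5, 'a', 's0'), (0.5, 'b', 's2')]),
    's2': ('prob',   [(1.0, 'a', 's0')]),
}

def sample_prefix(mdp, state, scheduler, n):
    """Sample the first n letters of one behavior; the scheduler (here
    memoryless) resolves the nondeterministic choices."""
    word = []
    for _ in range(n):
        kind, edges = mdp[state]
        if kind == 'nondet':
            letter, state = scheduler(state, edges)
        else:
            r, acc = random.random(), 0.0
            for p, l, target in edges:
                acc += p
                if r <= acc:
                    letter, state = l, target
                    break
        word.append(letter)
    return ''.join(word)

first_edge = lambda state, edges: edges[0]        # one particular scheduler
print(sample_prefix(mdp, 's0', first_edge, 10))   # e.g. 'aabaaabaab'
```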
Weighted languages
In the second quantitative view, a language is a function from words to real values. The value \(L(w)\in\mathbb{R}\) of a word w may measure the cost or resource (e.g., power) consumption of the behavior represented by w. Formally, a weighted language is a function \(L:\varSigma^{\omega}\rightarrow\mathbb{R}\). Weighted languages can be defined by weighted automata [16], which are finite-state machines whose transitions are labeled by both letters from Σ and real-valued weights. When assigning values to words, given a weighted automaton, we must make two decisions: (i) how to aggregate the infinite sequence of weights along a run of the automaton into a single value, and (ii) if the automaton is nondeterministic, how to aggregate the values of all possible runs over the same word. Canonical choices for (i) are discounted-sum, limit-average (mean payoff), and energy (sum) values; a canonical choice for (ii) is to take the supremum of the values of all runs over the same word. We will motivate these choices below. Here it suffices to say that the language inclusion question L(A)⊆L(B) for weighted automata A and B is undecidable in the limit-average and energy cases [45, 46], and open in the discounted-sum case. Solutions are known only for the special case where B is deterministic [47].
Consider an infinite sequence of real-valued weights \(v_i\), for i≥0, along a run of a weighted automaton. To aggregate such an infinite sequence into a single value, one can take the supremum \(\sup_{i\geq 0}v_i\) (the largest weight that occurs along the run), or \(\limsup_{i\geq 0}v_i\) (the largest weight that occurs infinitely often), or \(\liminf_{i\geq 0}v_i\) (the smallest weight that occurs infinitely often). Note that if all transition weights of an automaton are 0 or 1, then sup corresponds to the finite (reachability) acceptance condition, limsup corresponds to Büchi acceptance, and liminf to coBüchi acceptance. However, in a truly quantitative setting, more general, real-valued aggregation functions seem more interesting and useful, and the following two have been studied extensively in game theory.
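For runs of a finite-state weighted automaton, the weight sequence is ultimately periodic (a lasso), and the three aggregators then have simple closed forms, because exactly the weights on the cycle occur infinitely often. A minimal sketch, with illustrative names:

```python
# sup / limsup / liminf of an ultimately periodic weight sequence prefix·cycle^ω:
# only the weights on the cycle occur infinitely often.

def sup_value(prefix, cycle):
    return max(max(prefix, default=float('-inf')), max(cycle))

def limsup_value(prefix, cycle):
    return max(cycle)          # largest weight occurring infinitely often

def liminf_value(prefix, cycle):
    return min(cycle)          # smallest weight occurring infinitely often

prefix, cycle = [3.0, 0.0], [1.0, 2.0]          # the run 3,0,(1,2)^ω
print(sup_value(prefix, cycle))                  # 3.0: the prefix matters for sup
print(limsup_value(prefix, cycle))               # 2.0
print(liminf_value(prefix, cycle))               # 1.0
```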
Discounted-sum values
One mechanism for obtaining a finite aggregate value from an infinite sequence of weights is discounting, which gives geometrically less weight to weights that occur later in the sequence. Given a real-valued discount factor \(\lambda\in(0,1)\), the discounted-sum value is \(\sum_{i\geq 0}\lambda^{i}\cdot v_{i}\). Discounted-sum values depend strongly on the initial part of an infinite run, and hardly at all on the infinite tail. In a way, they are quantitative generalizations of safety properties. They are useful, for example, to define the time to failure of a system.
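Because the discount factor makes the tail geometrically negligible, the discounted sum can be approximated to any desired precision from a finite prefix. A minimal sketch, assuming the weights are bounded in absolute value by a known constant M (all names illustrative):

```python
from itertools import chain, cycle

def discounted_sum(weights, lam, M, eps):
    """Approximate the discounted sum of λ^i · v_i to within eps for any weight
    stream with |v_i| <= M and 0 < lam < 1: after n terms, the remaining tail
    is bounded by lam^n * M / (1 - lam), so stop once that bound drops below eps."""
    total, factor = 0.0, 1.0
    for v in weights:
        total += factor * v
        factor *= lam
        if factor * M / (1.0 - lam) < eps:   # remaining tail is negligible
            break
    return total

run = chain([3.0, 0.0], cycle([1.0, 2.0]))             # weight sequence 3,0,(1,2)^ω
print(discounted_sum(run, lam=0.9, M=3.0, eps=1e-6))   # ≈ 14.9368
```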
Limit-average values
Another standard way of obtaining a finite aggregate value from an infinite sequence of weights is averaging, which gives equal weight to all weights that occur infinitely often in the sequence (and no weight to values that occur only finitely often). The limit-average (or mean-payoff) value is the limit of the average weights of all prefixes: \(\liminf_{n\rightarrow\infty}\frac{1}{n}\cdot\sum_{0\leq i<n}v_{i}\) (under some technical conditions liminf coincides with limsup in this definition). Limit-average values depend only on the infinite tail of a run; they are quantitative analogues of liveness properties. They are useful, for example, to define the mean time between failures of a system, or the average power consumption of a system.
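For an ultimately periodic run, the finite prefix averages out and the limit-average value is simply the mean weight of the cycle. A small sanity check, with illustrative names:

```python
# Limit-average of prefix·cycle^ω: the prefix contributes nothing in the limit.

def limit_average(prefix, cycle):
    return sum(cycle) / len(cycle)

print(limit_average([3.0, 0.0], [1.0, 2.0]))       # 1.5, independent of the prefix

# Empirical check: averages of long finite prefixes converge to the cycle mean.
seq = [3.0, 0.0] + [1.0, 2.0] * 50_000
print(sum(seq) / len(seq))                         # ≈ 1.5
```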
There are isolated results [46–48] about the expressiveness, decidability, and closure properties of quantitative languages, in the probabilistic, discounted weight, and average weight cases, but we lack a complete picture and, more importantly, a compelling overall theory, i.e., a quantitative pendant to the theory of ω-regular languages. We cannot even be sure that the discounted-sum and limit-average aggregation functions are in any way as canonical as Streett and Rabin acceptance are in the qualitative case. A topological characterization of weighted languages, akin to the topological characterization of safety and liveness as closed and dense sets in the Cantor topology, and to the Borel characterization of the ω-regular languages, may be helpful in this regard.
The branching-time view Given the wide open situation of the quantitative linear-time view, it is natural to look also at the branching-time view, which is algorithmically simpler in many cases (for example, while language inclusion checking is PSPACE-hard for finite-state machines, the existence of a simulation relation between two finite-state machines can be checked in polynomial time). Topic 2 will therefore explore the pragmatics of a quantitative branching-time approach. However, we also wish to have a compelling quantitative theory of branching time. Such a theory is best based on tree automata [51]. This is because in the branching-time view, the possible behaviors of a system are collected in an infinite computation tree which, unlike the set (language) of the linear-time view, captures internal decision points of the system. In a tree, the values of different infinite paths can be aggregated in at least two interesting, fundamentally different ways.
Worst-case analysis
Similarly to the linear-time case, we can assign to a computation tree the supremum of the values of all infinite paths in the tree.
Average-case analysis
We can interpret a computation tree probabilistically, by assigning probabilities to all branching decisions of the system. Since a branching decision often depends deterministically on the (unknown) external input that the system receives at that point, this approach amounts to assuming a probability distribution on input values or, more generally, on environment behavior. Given such a probabilistic environment assumption, we can assign to a computation tree the expected value over all infinite paths in the tree.
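A minimal sketch contrasting the two aggregations (illustrative names; the computation trees discussed here are infinite, but the sketch truncates them to finite depth so that both values are exact):

```python
# Worst-case vs. average-case aggregation of path values over a finite
# computation tree. A leaf is a path value; an internal node is a list of
# (branch probability, subtree) pairs, the probabilities encoding an
# assumption on how the environment resolves the branching.

def worst_case(tree):
    if isinstance(tree, float):
        return tree                                   # leaf: value of this path
    return max(worst_case(child) for _, child in tree)

def average_case(tree):
    if isinstance(tree, float):
        return tree
    return sum(p * average_case(child) for p, child in tree)

tree = [(0.5, [(0.5, 1.0), (0.5, 5.0)]),              # inputs assumed uniform
        (0.5, 2.0)]
print(worst_case(tree))      # 5.0: the supremum over all paths
print(average_case(tree))    # 2.5: the expected path value
```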
There has been little work on probabilistic and weighted tree automata in the context of reactive modeling and verification. For tree automata that accept worst-case and average-case computation trees whose infinite paths have sup, limsup, discounted-sum, and limit-average values, their decidability and closure properties, as well as connections to quantitative temporal logics and model checking, remain to be investigated. The aim is not only to obtain a complete picture but, more importantly from a practical perspective, to find at least one appealing and tractable quantitative setting.
3.2 Defining, computing, and optimizing distance measures between systems
A main practical goal of the project is to augment refinement preorders between reactive processes with directed distances that measure differences in the behavior of the processes. It is tempting to start with processes that contain quantitative information, such as transition probabilities and/or transition weights, which naturally suggest quantitative measurements. Instead, we propose to first define and study several quantitative distance measures between purely qualitative reactive processes.
The canonical structural refinement relation between reactive models is Robin Milner's simulation relation [7]. We generalize qualitative simulation between state machines to quantitative distances [52]. Simulation can be viewed as a game between system A, the implementation, and system B, the specification, which is played on the product of the two state spaces. In a state pair (p,q) consisting of implementation state p and specification state q, first the implementation chooses a successor state p′ of p, and then the specification must choose a successor q′ of q such that the transition of the specification matches (i.e., carries the same observation/letter as) the transition of the implementation. If the game continues for infinitely many rounds, the specification wins and is said to simulate the implementation; in this case, every behavior of A is a behavior of B, that is, L(A)⊆L(B). On the other hand, if the specification cannot match an implementation move, then the implementation wins. (In this case, it may or may not be the case that L(A)⊈L(B); in other words, simulation is a sufficient but not necessary condition for language inclusion.) We propose the following original quantitative generalizations of simulation.
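The qualitative game has a classical algorithmic counterpart: the largest simulation relation is a greatest fixpoint, computed by starting from all state pairs and deleting every pair (p,q) at which the specification state q cannot match some move of the implementation state p. A minimal sketch of this naive iteration (illustrative names; more efficient algorithms exist):

```python
def largest_simulation(trans_a, trans_b):
    """trans_x: state -> set of (letter, successor) pairs. Returns the largest
    simulation relation between the states of A and the states of B."""
    rel = {(p, q) for p in trans_a for q in trans_b}
    changed = True
    while changed:
        changed = False
        for (p, q) in list(rel):
            for (letter, p2) in trans_a[p]:
                # q must offer an equally-labeled move staying in the relation
                if not any(l == letter and (p2, q2) in rel
                           for (l, q2) in trans_b[q]):
                    rel.discard((p, q))
                    changed = True
                    break
    return rel

# A emits one a and then loops on b; B allows a and b at all times.
trans_a = {'p0': {('a', 'p1')}, 'p1': {('b', 'p1')}}
trans_b = {'q0': {('a', 'q0'), ('b', 'q0')}}
print(('p0', 'q0') in largest_simulation(trans_a, trans_b))   # True: A ⪯ B
print(('q0', 'p0') in largest_simulation(trans_b, trans_a))   # False: B ⋠ A
```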
Simulation failure game
Suppose that at a state pair where the implementation would win the simulation game, i.e., where the specification has no matching move, we allow the specification to “cheat” by choosing a transition that is not permitted by the description of B. Moreover, whenever the specification cheats, the implementation receives a payoff of 1 (all other moves carry a payoff of 0). In this new, quantitative game, the implementation tries to maximize the average payoff (i.e., to make the specification cheat as often as possible), and the specification tries to minimize the average payoff (i.e., to cheat as little as possible). The value of the game is the maximal average payoff that the implementation can achieve, no matter how the specification plays. If A is simulated by B, written A⪯B, then the value of the game is 0, because the specification never needs to cheat. However, if A⋠B, then the value is a real number that measures a behavioral difference between A and B. The simulation failure distance d(A,B) between two state machines A and B is the value of the game.
Simulation tolerance game
While simulation failure distance measures the distance between an implementation and a specification when there is no simulation relation, simulation tolerance distance measures the distance between A and B if A is simulated by B. If A⪯B, then A may have fewer behaviors and be more constrained than B; the following quantitative game measures how much more constrained. We invert the simulation game so that in each round, the specification now moves first, and the implementation tries to match the move of the specification. Whenever no match is possible, the specification gets payoff 1, otherwise 0. In this game the specification attempts to maximize the average payoff, i.e., it makes the implementation “cheat” as often as possible. The value of the game, and the resulting simulation tolerance distance, measures how much additional behavioral freedom a specification has when compared with an implementation.
Simulation robustness game
Another way to define a distance between an implementation A and a specification B in the case that A is simulated by B, is to measure how much the implementation may deviate from its description A without violating the property that A is simulated by B. This is a measure of robustness of the implementation A with respect to the specification B. Roughly speaking, in the simulation robustness game, the implementation is permitted to “cheat” and receives payoff 1 each time it cheats and the specification can match the move, payoff −∞ if the specification cannot match the move of the implementation (whether or not it is a cheating move), and payoff 0 otherwise. The implementation tries to maximize the average payoff. The value of the resulting game is called simulation robustness distance between A and B.
We defined all three quantitative versions of the simulation game as games with mean-payoff objectives [32]. Other variations are possible, for example, discounted versions where cheating is worse the earlier it comes in the game, or lexicographic combinations of qualitative and quantitative objectives [53]. Furthermore, the precise definitions of these games depend on which kind of “cheating” is permitted for the players, e.g., may they consume an arbitrary letter and jump to an arbitrary state, or only change the letter on an existing transition, or consume any letter but not change state, etc. Finally, if the implementation and/or specification contain fairness assumptions, then corresponding variants of the fair simulation game [54] need to be considered.
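To indicate how such game values can be computed, the following sketch performs Zwick–Paterson-style value iteration for a mean-payoff game: the optimal total payoff over k rounds, divided by k, converges to the game value. The toy game and all names are illustrative, and a fixed iteration count only approximates the value (exact algorithms round the approximation to a nearby rational):

```python
def mean_payoff_value(edges, owner, start, rounds=10_000):
    """Value iteration for a mean-payoff game. edges: state -> list of
    (weight, successor); owner[v] says who moves at v. nu[v] holds the optimal
    total payoff over the rounds played so far."""
    nu = {v: 0.0 for v in edges}
    for _ in range(rounds):
        nu = {v: (max if owner[v] == 'max' else min)(
                  w + nu[u] for (w, u) in edges[v])
              for v in edges}
    return nu[start] / rounds

# Toy game: the maximizer (implementation) forces a payoff-1 "cheat" edge every
# other round, so the mean payoff (the simulation failure distance) is 1/2.
edges = {'m': [(1.0, 's'), (0.0, 's')], 's': [(0.0, 'm')]}
owner = {'m': 'max', 's': 'min'}
print(mean_payoff_value(edges, owner, 'm'))   # ≈ 0.5
```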
There are even more possibilities for defining distances when the system descriptions contain time stamps (or clock constraints) [55], probabilities [26], and/or costs. In contrast to most semantic work on process metrics that can be found in the literature, our primary concerns are computational (i.e., algorithmic): we wish to find, within the large space of possibilities, meaningful distances that (i) can be computed, or at least effectively approximated and/or compared, and (ii) can be used to synthesize optimal systems from requirements. In other words, we want to solve the following two basic problems:
Quantitative verification
Given two systems A and A′, and a requirement B, is d(A,B)≤d(A′,B)?
Quantitative synthesis
Given a requirement B, construct a system A such that d(A,B)≤d(A′,B) for all systems A′.
Consider, for example, the temporal requirement that every request a must be followed by a response b, written in temporal logic as □(a⇒◊b). Classical reactive synthesis uses graph games with qualitative objectives to automatically construct a finite-state machine that satisfies such a requirement [11]. But there are many correct implementations, and we have no control over which implementation is produced by the synthesis algorithm. In quantitative synthesis, by contrast, we specify a distance between any possible implementation and the requirement, and the algorithm is obliged to produce an implementation with minimal distance from the requirement. For the request-response example, we may be interested in the following quantitative criteria for synthesis, among many others:
- Minimize the maximal time between requests and responses, or minimize the maximal number of responses between requests. (Note that it is not required that there is any response between two consecutive requests, but on the other hand, an implementation may have many “unnecessary” responses between two consecutive requests.) These objective functions can be expressed using weighted sup automata (cf. Topic 1).
- Minimize the average time between requests and responses, or minimize the average number of responses between requests. (Note that “average” does not refer to any probabilities, but to the mean of the possibly infinite number of response times along a run.) These objective functions can be formalized using weighted limit-average automata (cf. Topic 1).
- Minimize the expected maximal (or average) time between requests and responses, or the expected maximal (or average) number of responses between requests. Such objective functions can be stated only in a branching-time framework, relative to a probabilistic assumption about how the environment produces requests (again, cf. Topic 1).
Each criterion leads to a different preference order between systems, and to a different optimal implementation. Such optimal implementations can be synthesized automatically by solving graph games with suitable quantitative objectives. We have already solved specific instances of the quantitative synthesis problem [56, 57], but many settings remain to be investigated.
A particularly intriguing aspect of a quantitative framework is that it permits the synthesis of implementations from inconsistent or incompatible requirements [58]. Even if there is no solution to the synthesis problem which satisfies all requirements in the boolean sense, we may still be interested in a solution that comes as close as possible to satisfying the requirements.
3.3 Composing and refining distance measures between systems
It has been a long, and still incomplete, road from the principle of (qualitative) reactive modeling to the practice of reactive verification. Some critical milestones along this way include symbolic state-space exploration [8], assume-guarantee decomposition of model-checking tasks [9], and counterexample-guided abstraction refinement [59]. There is every reason to expect that in order for a quantitative framework to be of practical use, it must admit the symbolic computation of distances, compositional reasoning about distances, and the abstraction/approximation of distances. In addition, it is of course desirable to build a quantitative framework in such a way that the boolean case becomes a special case of the quantitative case. In the past, data structures for symbolic reasoning about times and probabilities have received much attention [18, 60], so we concentrate here on compositional and abstract reasoning with quantities.
Quantitative compositional reasoning
In the boolean setting, the basic compositional inference rule states that \(A_1\preceq B_1\) and \(A_2\preceq B_2\) together imply \(A_1\parallel A_2\preceq B_1\parallel B_2\). This rule is of practical importance because it ensures that the refinement between two complex systems \(A_1\parallel A_2\) and \(B_1\parallel B_2\) can be proved component by component. A quantitative analogue of the rule is
$$d(A_1\parallel A_2,B_1\parallel B_2) \leq f(d(A_1,B_1),d(A_2,B_2))$$
for a suitable function f, such as addition [61]. More interesting is assume-guarantee compositionality, which in the boolean setting states that \(A_1\parallel B_2\preceq B_1\) and \(B_1\parallel A_2\preceq B_2\) together imply \(A_1\parallel A_2\preceq B_1\parallel B_2\). This inference rule is stronger than the basic compositional rule because to establish the premises, it suffices to prove that \(A_1\) refines \(B_1\) under the assumption that the inputs of \(A_1\) are constrained by the environment \(B_2\), and that \(A_2\) refines \(B_2\) under the environment assumption \(B_1\). The assume-guarantee rule holds for certain probabilistic systems [62], but it is open what quantitative assume-guarantee reasoning might look like in general.
Quantitative abstract and approximate reasoning
Abstraction is perhaps the most vexing problem and greatest challenge in quantitative settings. Abstraction is a powerful mechanism for relating models of different precision [10]. However, in the boolean setting, the theory of abstraction is different from theories of approximation. A theory of approximation measures the precision of a model in terms of an error bound on how much a model may deviate from the system, and such error bounds are usually quantitative. For example, while discounted-sum automata cannot be determinized precisely, they can be determinized approximately [63]. A theory of abstraction, by contrast, measures the precision of a model in terms of which properties of the system are preserved in the model. Consider, for example, the property that event a is never followed by event b. A model may preserve such a property despite possibly introducing a large error. On the other hand, even a model that introduces only a small error may violate the property when the system does not. The preservation of properties ensures that it suffices to check a property on a simple, abstract model in order to prove the property for a complex system. In the quantitative setting, we have no satisfactory, general theory of abstraction, and it is likely that in order to obtain such a theory, it needs to be combined with a theory of approximation. Also, one might envision that a generalized triangle inequality
$$d(A,B) \leq g(d(A,A'),d(A',B))$$
(for some function g) can form the basis for quantitative reasoning with models of different precision: in order to bound the distance between a model A and a requirement B, it should suffice to compute the distance between A and a more abstract model A′, and the distance between A′ and B.
A significant benefit of abstract models is that they can often be refined automatically exactly to the precision that is needed to prove a desired property [59]. Such automatic abstraction refinement lies at the heart of modern software model checking [64] and has been pursued also for probabilistic software [65]. In a quantitative setting, abstraction refinement can be used to over- or underapproximate the underlying, concrete values of individual runs, with monotonically increasing precision [66]. This supports anytime verification, where additional refinement steps give better and better estimates of, say, the resource consumption along a run. The ultimate goal is to develop a comprehensive system design methodology based on quantitative reactive models which makes use of composition, approximation, abstraction refinement, and the interplay of these operations.
3.4 Measuring system robustness and synthesizing robust systems
A universal challenge in systems design is the construction of systems whose behavior is robust in the presence of perturbations. While robustness is understood for most physical engineering artifacts, and increasingly also for biological systems [67], it is rarely heard of in connection with software. This is because computer programs have traditionally been idealized as discrete mathematical objects. We often lose sight of the fact that the mathematical representation of a software system is just a model, and that the actual system executes on a physical, imperfect platform and interacts with a physical, unpredictable environment. This physicality has its extreme manifestation in embedded control systems (say, of aircraft), yet many traditional models for timed, hybrid, and probabilistic systems do not even take into account that in reality, event times and sensor input values can be measured only with finite precision, and most probabilities can only be estimated with large tolerances (if at all). This is why the robustness of reactive systems has been recognized as an important concern for some time now [68]. However, the few, mostly theoretical approaches towards designing robust software systems [69] remain unconvincing and have had little practical impact.
One of the advantages of a quantitative setting is that we can formalize robustness as mathematical continuity [4]: a system is continuous if continuous changes in the environment (or execution platform) cannot cause discontinuous changes in the system behavior. For a continuous reactive system, for every positive real ε there must be a positive real δ such that δ changes in the input values and input times cause at most ε changes in the output values and output times. For example, a system that reads a sensor value, adds a constant, and outputs the result after some constant time delay, behaves continuously. On the other hand, a system that checks if the sensor value is greater than a threshold, and depending on the result of the check, invokes two different control algorithms with two different execution times to compute the output, does not behave continuously. A preferred, continuous way to combine the two control algorithms would be to execute both procedures if the input value is near the threshold, interpolate the result, and release the output after a predetermined delay. There has been recent work on checking the continuity of systems [70, 71], but we know of no systematic design or synthesis methodology for continuous systems.
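As an illustration of the preferred, continuous combination, the following sketch blends two control algorithms near the threshold; the controllers, the threshold, and the width of the blending band are invented for the example.

```python
# A discontinuous controller jumps at the threshold; the continuous version
# runs both control algorithms near the threshold and interpolates, so small
# input changes can only cause small output changes. All constants invented.

def control_low(x):  return 2.0 * x        # placeholder algorithm below threshold
def control_high(x): return x + 10.0       # placeholder algorithm above threshold

THRESHOLD, WIDTH = 5.0, 0.5                # blend inside [4.5, 5.5]

def discontinuous(x):
    return control_high(x) if x > THRESHOLD else control_low(x)

def continuous(x):
    if x <= THRESHOLD - WIDTH:
        return control_low(x)
    if x >= THRESHOLD + WIDTH:
        return control_high(x)
    t = (x - (THRESHOLD - WIDTH)) / (2 * WIDTH)    # 0 -> 1 across the band
    return (1 - t) * control_low(x) + t * control_high(x)

print(discontinuous(4.999), discontinuous(5.001))  # 9.998 vs 15.001: a jump
print(continuous(4.999), continuous(5.001))        # ≈ 12.49 vs ≈ 12.51: no jump
```

The sketch addresses continuity of output values only; making output times continuous as well requires releasing the result after the predetermined delay mentioned above.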
Another kind of robustness is resilience against faulty environment assumptions. Assumptions about the environment can be wrong because the environment may change, there may be malicious intent, or the specifier may simply have incomplete or erroneous information about the environment of a system. Traditionally, if the environment violates an assumption, then by definition, a system satisfies any requirement, no matter what it does, because falsehood on the left-hand side of an implication makes the implication true. Yet we wish robust systems to behave “reasonably,” and degrade gracefully, even if the environment misbehaves. For safety properties, we can quantitatively measure this kind of robustness as follows. Whenever the environment violates its safety assumption, by extending a safe finite behavior with an unsafe event/letter, we count an environment error. Similarly, whenever the system violates its safety requirement, we count a system error. The smaller the limit of the ratio of system errors to environment errors, the more robust the system. In such a quantitative framework, we can synthesize robust systems by solving so-called ratio games [72], an approach that remains to be extended to general liveness properties.
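At the level of finite traces, the measure can be illustrated by a simple monitor that counts both kinds of errors (the event format and the checkers are hypothetical; the definition above takes a limit over infinite behaviors, which a monitor can only approximate by the ratio seen so far):

```python
def robustness_ratio(trace, env_ok, sys_ok):
    """Count environment-assumption and system-requirement violations along a
    finite trace and return system errors per environment error (smaller is
    more robust)."""
    env_errors = sum(1 for step in trace if not env_ok(step))
    sys_errors = sum(1 for step in trace if not sys_ok(step))
    if env_errors == 0:
        return float('inf') if sys_errors else 0.0   # errs without provocation
    return sys_errors / env_errors

# Toy trace of (input, output) events: the environment errs twice, the system once.
trace = [('ok', 'ok'), ('bad', 'ok'), ('ok', 'ok'), ('bad', 'bad')]
print(robustness_ratio(trace,
                       env_ok=lambda s: s[0] == 'ok',
                       sys_ok=lambda s: s[1] == 'ok'))   # 0.5
```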
3.5 Quantitative measures in multicore and cloud computing
Concurrent programming is the area of software development that is most prone to programming errors, yet least amenable to testing, because the behavior of concurrent programs is nondeterministic and therefore irreproducible. Concurrent software is thus a prime candidate for formal verification; it is also a prime application area for reactive modeling because the components of a concurrent program (threads, actors, tasks) interact with each other. In addition, there is tremendous urgency because concurrent software is becoming ubiquitous both in the small, on multicore processors, and in the large, in data centers. New programming paradigms, such as software transactions [73] for small-scale concurrency and map-reduce [74] for large-scale concurrency, are among the most discussed topics in computing today. It is therefore only natural that we target concurrent software, both in the small and in the large, as a testing ground for our quantitative agenda. We will do so by developing and evaluating quantitative performance and cost measures for thread concurrency on multiprocessors, and for job and task concurrency in cloud computing.
Performance metrics for shared-memory concurrency
There are at least two compelling needs for such metrics. First, a program, such as a new implementation of a concurrent data structure or transaction manager, is developed on, say, an 8-core machine but will run, in the future, on 16, 32, and more cores. We need performance metrics that can predict how concurrent programs scale to an ever increasing number of cores. Second, synthesis methods are used to add synchronization constructs, such as locks and/or fences, to buggy concurrent programs in order to prevent incorrect interleavings of threads [75, 76] and to structure legacy software into atomic transactions [77]. We need performance metrics that can predict how these synchronizations should be placed in order to minimize damage to performance. The measures that are currently used are simple—such as the number and size of atomic sections, the number of fences, the number of variables or objects that are locked at any given time—but it is unclear how predictive these measures are. It is also unclear to which degree architectural aspects, such as caching, need to be taken into account; they play an important role, for example, in execution-time analysis [78].
Cost metrics for data center and cloud computing
Cloud computing aims to give users virtually unlimited pay-per-use computing resources without the burden of managing the underlying infrastructure [79]. We believe that, in order to realize the full potential of cloud computing, the user must be presented with a pricing model that offers flexibility at the requirements level, such as a choice between different completion times and different levels of fault tolerance, and the cloud provider must be presented with a programming model that offers flexibility at the execution level, such as a choice between different processor assignment and scheduling policies. In such a flexible framework, with each job, the user purchases a virtual computer with the desired speed, reliability, and cost characteristics, and the cloud provider can optimize the utilization of resources across a stream of jobs from different users. To implement such a framework, we need quantitative measures that quickly and reliably estimate trade-offs between deadlines, fault protection, and resource consumption for networks of interacting tasks.
In both the multicore and cloud scenarios, theoretical performance and cost measures need to be evaluated experimentally, by building both simulators and prototype systems. For the measures that, according to the evaluation, are most accurate in predicting performance and/or resource needs, one may develop game-based algorithms for synthesizing systems that are optimal with respect to the considered measures [80]. Such algorithms can then be used for optimal lock synthesis, to add locks and other synchronization constructs to thread-concurrent code, and for optimal schedule synthesis, to assign the tasks of a job to the processors of a data center.
3.6 Quantitative models in systems biology
An ultimate measure of success for a scientific paradigm is whether it proves useful outside of the field in which it originated. Reactive modeling has the potential for such a success: as the theory behind describing interacting discrete processes using syntax that can be executed, composed, and refined, reactive modeling is useful wherever a dynamical system is, at some level of granularity, best viewed as a system of discrete, causally related events. This, for example, is the case with metabolic pathways in the cell, and with other biological networks [1].
In this project we will not primarily focus on designing new reactive modeling languages or modeling styles for describing certain biological and/or biochemical phenomena (although this would be a welcome side-effect), but we plan to explore and increase the verification and analysis benefits that become available when reactive models are used in biology. In particular, we will pursue the following two directions in close collaboration with experimental systems biology groups, which will supply the biological data.
Quantitative measures of models and data
Experiments generate large amounts of data, and in cell biology, a reactive model represents a hypothesis about the mechanism behind the collected data. The hypothesis can be validated by measuring the fit between the model and the data. In other words, for a biological model, the data that results from experiments plays the same role that, for a software model, is played by the requirements. We plan to use the quantitative techniques developed in this project to measure the fit between reactive models and biological data, and to automatically synthesize best-fitting reactive models from data.
Quantitative techniques for state-space exploration
A second way of validating a reactive model of a biological network is to use the model for predicting the outcome of new experiments, and then compare the predicted outcomes against the observed outcomes. For such predictions, one may execute, or simulate, the model [81]; one simulation run corresponds roughly to one “in-silico” experiment. This, however, is a very inefficient way of analyzing a model [82], akin to obtaining information about a software system solely by testing the program. A main benefit of reactive models in software is that they can be analyzed by state-space exploration techniques (so-called model checking), and the same benefit is offered by reactive models in biology [83]. However, the state spaces of biological systems are usually unbounded, and their transitions probabilistic. This calls for the adaptation of proven state-space exploration techniques, and the development of new techniques and heuristics that are specifically targeted towards biological and biochemical systems. We have started to design such techniques—including on-the-fly state-space generation, abstraction, and abstraction refinement—for continuous-time Markov models of (bio)chemical reactions [84, 85]. Techniques from hybrid systems also show great promise, such as switching between discrete and continuous variable representations depending on the population counts of different molecules/species.
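To illustrate what one such “in-silico” experiment looks like for a continuous-time Markov model, here is a minimal sketch of Gillespie's stochastic simulation algorithm on an invented reaction network; the species, rate constant, and network are hypothetical, and the state-space techniques discussed above aim precisely at avoiding this one-run-at-a-time style of analysis.

```python
import random

def gillespie(counts, reactions, t_end):
    """One stochastic simulation run of a chemical reaction network.
    counts: species -> molecule count; reactions: list of
    (propensity_function, change_dict) pairs; simulate until time t_end."""
    t = 0.0
    while True:
        rates = [rate(counts) for rate, _ in reactions]
        total = sum(rates)
        if total == 0.0:
            break                            # dead state: no reaction can fire
        t += random.expovariate(total)       # exponentially distributed waiting time
        if t >= t_end:
            break
        r, acc = random.uniform(0.0, total), 0.0
        for rate_val, (_, change) in zip(rates, reactions):
            acc += rate_val
            if r <= acc:                     # fire this reaction
                for species, delta in change.items():
                    counts[species] += delta
                break
    return counts

# Invented network: dimerization A + A -> B with mass-action propensity.
k = 0.001
reactions = [(lambda c: k * c['A'] * (c['A'] - 1), {'A': -2, 'B': +1})]
print(gillespie({'A': 1000, 'B': 0}, reactions, t_end=5.0))
```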
Our aim is to demonstrate that the benefits of using quantitative reactive models in biology and other sciences are not limited to the observation that these models can naturally and unambiguously express mechanistic hypotheses: the models also come with a set of computational analysis techniques and tools that are far more powerful than simulation.