Floating-point computations are pervasive in low-level control software and embedded applications. Such programs are frequently used in contexts where safety is critical, such as automotive and avionic applications. It is important to develop tools for accurate and scalable reasoning about programs manipulating floating-point variables.
Floating-point numbers have a dual nature that complicates complete logical reasoning. On the one hand, they are approximate representations of real numbers, which suggests reasoning about floating-point arithmetic using real arithmetic. On the other hand, floating-point numbers have a discrete, binary implementation, which suggests reasoning about them using bit-vector encodings and SAT solvers. Both approaches suffer from an explosion of cases that arises when considering the possible results of evaluating floating-point expressions.
1.1 Discussion of floating-point solver architectures
We now discuss in detail a few different possibilities for designing an SMT solver for floating-point arithmetic. Interpreting a floating-point expression as a real arithmetic expression leads to incorrect conclusions because there are several cases where floating-point operations differ from real arithmetic operations. Encoding all these cases as constraints in a real arithmetic formula leads to large formulae containing many case distinctions, which are difficult for real arithmetic solvers to handle.
A second approach to floating-point reasoning, called bit-blasting or propositional encoding, currently yields better performance than a real arithmetic encoding [11]. Floating-point operations are represented as circuits that are then translated into a Boolean formula that is solved by a propositional SAT solver. This approach enables precise modelling of floating-point semantics and allows high-performance SAT solvers to be reused to implement a floating-point solver. The disadvantage of this approach is that the SAT solver only has an operational view of arithmetic operations and must reason about individual bits in adder and multiplier circuits, without the ability to simplify formulae using high-level, numeric reasoning.
A third approach is to use the popular DPLL(T) architecture [5]. The DPLL(T) architecture uses a SAT solver to reason about the Boolean structure of a formula and a specialised solver for conjunctions of theory literals. Thus, two efficient solvers for fragments of a theory are combined to obtain a solver for the theory. The first problem with using DPLL(T) is that we would still require a solver for conjunctions of literals in floating-point logic, and no off-the-shelf solution is available. A second issue is that, in some cases, separating Boolean reasoning from theory reasoning is detrimental to performance [9, 51]. Details of the theory are not visible to the propositional solver and cannot guide the search for a model, while information from previous runs is not available to the theory solver for conflict analysis.
The issues with DPLL(T) mentioned above are known and have fuelled research in natural-domain SMT procedures. The term ‘natural-domain SMT’ was first used by Cotton [19], but we use it for SMT procedures that perform all reasoning directly in a theory [19, 33, 47, 48, 51]. A fourth possibility is to develop a natural-domain floating-point solver that performs decisions, backtracking and learning using variables and atoms of the theory. The challenge in pursuing this approach is identifying which elements of the theory can be used for these operations, and developing efficient algorithms for propagation and learning.
In this paper, we pursue the fourth approach and develop a natural-domain SMT solver for reasoning about floating-point arithmetic formulae. We address the efficiency concerns highlighted above by developing a natural-domain SMT solver that supports imprecise reasoning. For insight into the operation of our solver, consider the formula
$$0.0 \le x \land x \le 10.0 \land y = x^5 \land y > 10^5 $$
where the variables x and y have double-precision floating-point values. Interval propagation [54] tracks the range of each variable and can derive the fact x∈[0.0,10.0] from the first two constraints, which implies the fact y∈[0.0,100000.0] from the third constraint. This range is not compatible with the final conjunct $y > 10^5$, so the formula is unsatisfiable. The computation requires a fraction of a second with our interval solver. In contrast, translating the formula above into a bit-vector formula and invoking the SMT solver Z3 requires 16 minutes on a modern processor to prove that the formula is unsatisfiable.
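The interval reasoning in this example can be sketched as follows. This is an illustrative Python fragment, not the solver described in this paper; the helper `pow_interval` is a hypothetical name for the standard interval rule for powers of a non-negative interval.

```python
# Sketch of forward interval propagation for
#   0.0 <= x  /\  x <= 10.0  /\  y = x^5  /\  y > 10^5

def pow_interval(lo, hi, n):
    """Interval of x**n for x in [lo, hi], assuming 0 <= lo <= hi and n >= 1."""
    return (lo ** n, hi ** n)

# From the first two conjuncts: x in [0.0, 10.0].
x_lo, x_hi = 0.0, 10.0

# From y = x^5: y in [0.0, 100000.0].
y_lo, y_hi = pow_interval(x_lo, x_hi, 5)

# The final conjunct y > 10^5 requires y > 100000.0, which is disjoint
# from [0.0, 100000.0], so the formula is unsatisfiable.
unsat = not (y_hi > 1e5)
print(y_lo, y_hi, unsat)  # 0.0 100000.0 True
```

Each propagation step is a cheap interval computation, which is why the interval solver refutes the formula in a fraction of a second.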
The efficiency of interval reasoning comes at the cost of completeness. Consider the floating-point formula below.
$$z = y \land x = y \cdot z \land x < 0 $$
After bit-vector encoding, the solver Z3 can decide satisfiability of this formula in a fraction of a second. The interval abstraction, however, cannot represent relationships between the values of variables: interval propagation will not deduce that y and z are either both positive or both negative. The interval solver cannot conclude that x must be non-negative and so cannot show that the formula is unsatisfiable.
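The loss of precision can be made concrete with a small sketch. The interval product below is the standard over-approximation, and the starting interval for y is an arbitrary assumption for illustration; the point is that the two occurrences of y in y·z are treated independently, so the correlation introduced by z = y is lost.

```python
# Why interval propagation cannot refute  z = y /\ x = y*z /\ x < 0.

def mul_interval(a, b):
    """Standard interval product for finite bounds."""
    prods = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(prods), max(prods))

y = (-1.0, 1.0)          # some over-approximation of y's range
z = y                    # z = y gives z the same interval
x = mul_interval(y, z)   # (-1.0, 1.0): negative values are not excluded

# x < 0 remains consistent with the interval computed for x, so
# interval propagation alone cannot conclude unsatisfiability.
print(x)                 # (-1.0, 1.0)
```

Although every concrete value of x = y·y is non-negative, the interval for x still contains negative values, so the constraint x < 0 is not refuted.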
To recover completeness, we lift the Conflict-Driven Clause Learning (CDCL) algorithm used in SAT solvers to reason about intervals. Our solver uses intervals to make decisions, propagates intervals for deduction, and uses a conflict analysis over intervals to refine the results of interval propagation. Our algorithm is a strict, mathematical generalisation of propositional CDCL in that replacing floating-point intervals with partial assignments yields the original CDCL algorithm. Our approach is parametric, allowing abstractions such as equality graphs, difference graphs, or linear inequalities to be used in place of floating-point intervals to obtain natural-domain CDCL algorithms for equality logic, difference logic, and linear arithmetic, respectively.
Clause learning is not the only approach one may take to obtain a complete solver based on interval propagation. One may eliminate imprecision by splitting intervals into ranges that can be analysed without loss of precision. There are at least two ways to perform such splitting.
Splitting can be integrated in a DPLL(T) solver [4]. New propositions are required to represent intervals over ranges that do not occur explicitly in the original formula. Implementing good learning heuristics for this approach is difficult because the propositional learning algorithm is unaware of the intervals associated with these propositions.
Splitting can also be implemented in a natural-domain fashion. For the second example above, the solver can consider the cases y<0 and y≥0, and can in each case conclude that x is non-negative. Such splitting yields a complete, natural-domain SMT solver that only manipulates intervals, but requires considering a potentially exponential number of cases. Moreover, the conclusions drawn from proving one case are not reused when reasoning about another case, so the solver may repeat the same reasoning.
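The case split for the second example can be sketched as follows. This is an illustrative fragment under stated assumptions: finite bounds stand in for unbounded intervals, and `mul_interval` is the same standard interval product as before.

```python
# Natural-domain case splitting on the sign of y for
#   z = y /\ x = y*z /\ x < 0

def mul_interval(a, b):
    """Standard interval product for finite bounds."""
    prods = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(prods), max(prods))

BIG = 1e100  # finite stand-in for an unbounded magnitude

for y in [(-BIG, -0.0), (0.0, BIG)]:  # case y < 0 and case y >= 0
    z = y                             # z = y
    x = mul_interval(y, z)            # in both cases x is non-negative
    assert x[0] >= 0.0                # so x < 0 is infeasible in this case
print("unsat in every case")
```

Within each case the sign of y is fixed, so the interval product for y·z no longer straddles zero and the conflict with x < 0 becomes visible; the cost is that each case is analysed from scratch.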