In this paper we estimate the nonlinearity of Boolean functions by probabilistic methods, for use when computing the full Walsh transform is computationally very expensive or simply infeasible (which is the case for almost all functions in a larger number of variables, say more than 30). Firstly, we significantly improve upon the bounds of Zhang and Zheng (1999) on the probabilities of failure of affinity tests based on nonhomomorphicity; in particular, we prove a new lower bound that we had previously conjectured. This new lower bound generalizes the one of Bellare et al. (IEEE Trans. Inf. Theory 42(6), 1781–1795, 1996) to nonhomomorphicity tests of arbitrary order. Secondly, we prove bounds on the probability of failure of a proposed affinity test that uses the BLR linearity test. All these bounds are expressed in terms of the function’s nonlinearity, and we exploit this to provide probabilistic methods for estimating the nonlinearity based upon these affinity tests. We analyze our estimates and conclude that they have reasonably good accuracy, particularly so when the nonlinearity is low.
Notes
This article belongs to the Topical Collection: Sequences and Their Applications III Guest Editors: Chunlei Li, Tor Helleseth and Zhengchun Zhou
This is a substantially revised and extended version of the article [11] that appeared in the proceedings of Sequences and Their Applications – SETA 2020. In particular, Section 4 and Proposition 13 are new.
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
1 Introduction and motivation
Boolean functions are defined on a vector space over the binary finite field \(\mathbb {F}_{2}\), with output in \(\mathbb {F}_{2}\). For many cryptographic applications it is important that functions are not affine, and not even close (with respect to the Hamming distance, defined in (3)) to being affine. The nonlinearity of a function f, denoted \({\mathrm {d}_{\mathcal {A}}}(f)\) and defined as the minimum Hamming distance from f to any affine function, is therefore an important cryptographic property. This indicator can be computed by using the Walsh transform (also called the Walsh-Hadamard or discrete Fourier transform). The Walsh transform of a function f in n variables can be computed from its truth table by an algorithm similar to the fast Fourier transform, in time \(O(n2^{n})\). Computing the Walsh transform is not feasible in practice when the number of variables is large and the function is given as a “black box” (or by an algorithm or formula which is not amenable to simple manipulation for the purpose of computing the Walsh transform). For example, it is not feasible for functions in 80 variables; functions which model an output bit of a stream or block cipher as a function of the key have a number of variables equal to the length of the key, i.e. at least 80.
The motivation of this paper is to probabilistically estimate the nonlinearity of f to a reasonable degree of accuracy. The main idea is as follows. Consider a probabilistic test (we will see some examples shortly) which has a success/fail outcome based on the values of f at some fixed number k of points in \({\mathbb {F}_{2}^{n}}\) (f can therefore be given as a “black box” function). Denote by T(f) the probability of failing the test (with the probability taken over all possible choices of k inputs in \({\mathbb {F}_{2}^{n}}\)). We assume T(f) is positively correlated, to some extent, with the nonlinearity \(\mathrm {d}_{\mathcal {A}}(f)\), and can be bounded by some functions in \(\mathrm {d}_{\mathcal {A}}(f)\), say \( \text {Lower}(\mathrm {d}_{\mathcal {A}}(f)) \le T(f) \le \text {Upper}(\mathrm {d}_{\mathcal {A}}(f)). \) If we can obtain T(f) with reasonable accuracy by practical statistical testing (e.g. binomial proportion confidence interval), we can then estimate the nonlinearity as:
To obtain an accurate estimate, it is important that T(f) depend strongly on \(\mathrm {d}_{\mathcal {A}}(f)\) and that the bounds are very good. We will examine several probabilistic tests, improve some of the existing bounds, and analyze the accuracy of the resulting estimation.
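The statistical side of this idea can be sketched as follows. The helper below is ours: `run_trial` stands for any callable that performs one random trial of the chosen test and returns 1 on failure, and the interval is the normal-approximation binomial proportion confidence interval mentioned above.

```python
import random

def estimate_T(run_trial, trials=10000, seed=1, z=1.96):
    """Estimate T(f), the probability of failing a probabilistic test,
    together with a normal-approximation 95% confidence half-width."""
    rng = random.Random(seed)
    failures = sum(run_trial(rng) for _ in range(trials))
    p = failures / trials
    # Binomial proportion confidence interval (normal approximation).
    half_width = z * (p * (1 - p) / trials) ** 0.5
    return p, half_width
```

The estimate of T(f) is then pushed through the bounds Lower and Upper to bracket the nonlinearity.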
The linearity test most commonly used is based on the textbook definition of a linear function, namely f(u + v) = f(u) + f(v) (often called the BLR test from [3]): what it means is that we pick \(u,v{\in \mathbb {F}_{2}^{n}}\) uniformly at random, compute u + v, query the black box to extract f(u),f(v),f(u + v), and check if the aforementioned condition holds. If f passes this test for many pairs (u,v), then f is probably linear. If f fails the test for at least one pair, then f is certainly not linear. We denote by P2(f) the probability of f failing the test (with probability taken over all pairs \((u, v)\in {\mathbb {F}_{2}^{n}}{\times \mathbb {F}_{2}^{n}}\)) and by \(\mathrm {d}_{{\mathscr{L}}}(f)\) the normalized Hamming distance of f to the closest linear function. Several authors have determined upper and lower bounds for P2(f) as a function of \(\mathrm {d}_{{\mathscr{L}}}(f)\) (see [1, 9] and the references therein).
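As an illustration, a sampling estimate of P2(f) against a black-box f can be sketched as follows. We encode a vector in \({\mathbb {F}_{2}^{n}}\) as an n-bit integer, so vector addition is bitwise XOR; the function names are ours.

```python
import random

def blr_failure_rate(f, n, trials=20000, seed=0):
    """Estimate P2(f): the fraction of random pairs (u, v) for which
    f(u + v) = f(u) + f(v) fails over F_2 (vector addition is XOR)."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        u = rng.getrandbits(n)
        v = rng.getrandbits(n)
        failures += f(u ^ v) ^ f(u) ^ f(v)   # 1 exactly when the trial fails
    return failures / trials
```

A linear function never fails a trial, so the estimate is exactly 0; conversely, a single observed failure certifies that f is not linear.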
For cryptographic applications we are not so much interested in whether the function is linear, but rather whether it is affine. For example, such tests play a crucial role in the cube and AIDA attacks (see [6, 12]), which are refined high-order differential attacks, targeted at primitives in stream and block ciphers based on low-degree components. The probabilistic test used in [6] for deciding whether a function f is affine is to check whether f(u + w) + f(u) + f(w) + f(0) = 0 holds (for u,w chosen uniformly at random), which can be viewed as using the BLR test to check whether f(u) − f(0) is linear. The functions of interest f are functions in many variables (typically at least 80 variables), obtained as higher-order derivatives of a function g which describes, for example, the first output bit of the stream cipher as a function of the key and initialisation vector. Although explicit algorithms are available for computing g and f (in the case of the Trivium cipher, the algorithm for computing g starts with some relatively simple functions of algebraic degree two, which are iteratively composed 1152 times for the full cipher, or about 700 times for reduced versions of the cipher), it is not feasible in practice to compute their algebraic normal form, or truth table, or nonlinearity, or Walsh transform. Instead, g is treated as a “black box” function, and f can be evaluated at any given input using several calls to g.
Another test used in the literature for deciding whether a Boolean function is affine is to check whether the equation f(u + v + w) + f(u) + f(v) + f(w) = 0 holds, for some \(u,v,w\in {\mathbb {F}_{2}^{n}}\) chosen uniformly at random. Like in the case of the linearity test, if f passes the test for many triples u,v,w, then f is probably affine. We denote by P3(f) the probability of f failing this affinity test (with the probability taken over all triples \((u, v, w)\in \mathbb {F}_{2}^{3n}\)). As in the case of the linearity tests, a natural question is whether P3(f) is related to \(\mathrm {d}_{\mathcal {A}}(f)\), the distance to the closest affine function (note that this is the nonlinearity of f ). A lower bound for P3(f) in terms of \(\mathrm {d}_{\mathcal {A}}(f)\) was given in Bellare et al. [1].
A generalization of the tests above was proposed by Zhang and Zheng in [14], where the authors defined the notion of (k + 1)-st order nonhomomorphicity of a function f as the probability Pk(f) of failing the test
with the probability taken over all tuples \((u_{1}, \ldots , u_{k})\in ({\mathbb {F}_{2}^{n}})^{k}\) (see Definition 1). It was shown that for k odd, f is affine if and only if Pk(f) = 0; for k even, f is linear if and only if Pk(f) = 0; also, still for k even, f is affine if and only if Pk(f) ∈{0,1}. Furthermore, some bounds on Pk(f) with respect to \(\mathrm {d}_{\mathcal {A}}(f)\), for k odd, were given in [14].
In this paper, we firstly improve both the upper and lower bounds presented by Zhang and Zheng in [14] for Pk(f) with k odd (see Sections 3 and 4). Our lower bound holds for arbitrary k and generalizes the lower bound proven in [1] for k = 2,3. The proofs use the techniques employed in [1] as well as additional combinatorial manipulation. We also prove the lower bound we conjectured in [11].
Secondly, we consider the following probabilistic test for affine functions. We can use any probabilistic linearity test, and test whether f is linear or f + 1 is linear. If either of these holds, then f is affine. For the nonhomomorphicity test with k even, this is equivalent to testing whether Pk(f) ∈{0,1}. The fact that f is affine if and only if Pk(f) ∈{0,1} was proven in [14]. However, when f is not affine no results were given regarding how the probability of failing this test depends on the nonlinearity of f. In Section 5 we show that upper and lower bounds can be obtained for the value of \(\min \limits (P_{k}(f), P_{k}(f+1))\) in the case k = 2 (i.e. the BLR test). Namely, using the bounds on failing the BLR linearity test from [1], which depend on the distance to the closest linear function, we show that similar bounds hold for \(\min \limits (P_{2}(f), P_{2}(f+1))\), but this time the bounds depend on the distance to the closest affine function. We also show that the refinements of the bounds from [1] given in [9] can be applied to our bounds too.
The nonlinearity of a function f can be estimated by first using any of the above tests and a practical statistical method to estimate the probability of failing that test (as demonstrated in [14]). Then, using (1) or (2), we obtain an estimate for the nonlinearity of f. In Section 6 we analyze the accuracy of the estimation. There are functions f,g such that f has higher probability of failing the test than g, even though f has lower nonlinearity than g. This was shown in [2] for the BLR test and in [14] for the tests based on the (k + 1)-st order nonhomomorphicity with k odd. However, the estimates get more accurate as k increases. For example, for k = 7, for any given value of P7(f) we can estimate the nonlinearity as being within an interval of length 0.011 or less if P7(f) ≤ 0.49 and length 0.053 or less if P7(f) ∈ [0.49,0.5].
Other nonlinearity tests were proposed for reducing the number of evaluations needed for the black box function, such as [7, 13] (the latter being also useful for estimating the algebraic degree of f ). The (k + 1)-st order nonhomomorphicity for k = 3 was used for attacks on actual ciphers in [10]. In future work we intend to explore further the connection between the probability of failing these tests and the nonlinearity, as well as to estimate the nonlinearity of functions of cryptographic interest.
2 Preliminaries
We recall definitions and known results needed for the rest of the paper.
Throughout, n will denote a positive integer. Boolean functions in n variables are functions \(f :{\mathbb {F}_{2}^{n}} \rightarrow \mathbb {F}_{2}\), where \(\mathbb {F}_{2}\) is the binary field and \({\mathbb {F}_{2}^{n}}\) is the n-dimensional vector space over \(\mathbb {F}_{2}\). It is well known that any such function can be uniquely represented by its ANF (Algebraic Normal Form), i.e. as a polynomial in \(\mathbb {F}_{2}[x_{1}, \ldots , x_{n}]\) of degree at most 1 in each variable. The total degree of the ANF representation is called the algebraic degree of f. Functions of algebraic degree at most one are called affine; affine functions with zero constant term are called linear. We will denote by \(\mathcal {A}\) the set of affine functions and by \({\mathscr{L}}\) the set of linear functions in n variables over \(\mathbb {F}_{2}\), when the dimension is understood from the context.
In this paper, like in [1], it will be convenient to use the normalized version of the Hamming distance and weight. More precisely, we define the (normalized) Hamming distance and Hamming weight for vectors a = (a1,…,at) and b = (b1,…,bt) in \({\mathbb {F}_{2}^{t}}\), as well as the distance of a vector \(a\in {\mathbb {F}_{2}^{t}}\) to a set of vectors \(S\subseteq {\mathbb {F}_{2}^{t}}\) as:
In the literature, the Hamming weight and distance are more often used without normalization (i.e. in the definitions above, one does not divide by the length of the vector) but we will explain shortly why normalization is useful for our purpose. The truth table of a function f is the vector \(TT(f)=(f(v_{0}), \ldots , f(v_{2^{n}-1}))\), where vi are all the elements of \({\mathbb {F}_{2}^{n}}\) in some fixed order, e.g., lexicographical order. The (normalized) Hamming weight, denoted by w(f), of a Boolean function f is w(TT(f)) and the distance, denoted by d(f,g), between two Boolean functions f,g is d(TT(f),TT(g)).
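In code, with inputs encoded as integers whose bits give the coordinates (lexicographic order), the normalized notions above can be sketched as follows (helper names are ours):

```python
def truth_table(f, n):
    """Truth table TT(f) = (f(v_0), ..., f(v_{2^n - 1})), inputs in
    lexicographic order: integer u encodes the vector of its n bits."""
    return [f(u) for u in range(2 ** n)]

def norm_weight(tt):
    """Normalized Hamming weight: fraction of ones in the truth table."""
    return sum(tt) / len(tt)

def norm_distance(tt_f, tt_g):
    """Normalized Hamming distance between two truth tables."""
    return sum(a != b for a, b in zip(tt_f, tt_g)) / len(tt_f)
```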
Of particular importance will be the distance of a function f to the set of affine or of linear functions. The minimum distance to any affine function, \(\mathrm {d}_{\mathcal {A}}(f)\), is called the (normalized) nonlinearity of f and is a very important cryptographic indicator. It is easy to see that \(\mathrm {d}_{\mathcal {A}}(f) = \min \limits (\mathrm {d}_{{\mathscr{L}}}(f), \mathrm {d}_{{\mathscr{L}}}(f+1))\). Our motivation for using the normalized version of nonlinearity (based on the normalized version of Hamming distance) is that it allows a meaningful comparison of the nonlinearity of two functions which might not have the same number of variables.
The Fourier-Hadamard transform of a function \(f:{\mathbb {F}_{2}^{n}} \rightarrow \mathbb {R}\) (the 0/1 values of a Boolean function are viewed as real numbers for this purpose) is the function \(W(f): {\mathbb {F}_{2}^{n}} \rightarrow \mathbb {R}\) defined as
where the dot product can be defined as \(u \cdot v = {\sum }_{i=1}^{n}u_{i}v_{i}\). Note that we use a normalized version of the transform here. If f is replaced by its sign function, \(\hat f\), defined by \(\hat f(u)=(-1)^{f(u)}\), then \(W(\hat f)\) is customarily referred to as the Walsh (or Walsh-Hadamard) transform of f, and the values \(W(\hat f)(v)\) for \(v\in {\mathbb {F}_{2}^{n}}\) are called the Walsh coefficients. We will refer to the sequence of output values of the Walsh transform (when the input is ordered lexicographically) as the Walsh spectrum.
Later we will use Parseval’s identity (see, for example, [5]):
It is well-known [5] and easy to see that the Walsh transform of a Boolean function f expresses its distance to the set of linear functions, and consequently the distance of f to the set of affine functions. Denoting ℓa(u) = a ⋅ u, the nonlinearity of f is related to the Walsh transform as follows:
Note that \(0\le \mathrm {d}_{{\mathscr{L}}}(f) \le \frac {1}{2}\). It is known [5] that \(0\le \mathrm {d}_{\mathcal {A}}(f) \le \frac {1}{2}\left (1 - \frac {1}{2^{\frac {n}{2}}}\right )\). We call a function \(f:{\mathbb {F}_{2}^{n}}\to \mathbb {F}_{2}\) (n ≥ 2) bent if its nonlinearity is exactly \(\frac {1}{2}\left (1 - \frac {1}{2^{\frac {n}{2}}}\right )\) (they exist only for even integers n). It is known [5] that f is bent if and only if the absolute values of all of its Walsh coefficients satisfy \(|W(\hat f)(v)|=2^{-\frac {n}{2}}\).
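For a small number of variables, the spectrum and these indicators are directly computable. A minimal sketch (function names are ours; the in-place butterfly is the \(O(n2^{n})\) algorithm mentioned in the introduction, and we normalize as in this paper):

```python
def walsh_spectrum(tt):
    """Normalized Walsh spectrum of f given by its truth table
    (a length-2^n list of 0/1), via the fast Walsh-Hadamard butterfly."""
    w = [(-1) ** b for b in tt]          # sign function (-1)^f(u)
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return [c / len(w) for c in w]       # normalize by 2^n

def nonlinearity(tt):
    """Normalized nonlinearity d_A(f) = (1 - max_u |W(f^)(u)|) / 2."""
    return (1 - max(abs(c) for c in walsh_spectrum(tt))) / 2
```

For the bent function f = x1x2 (n = 2), every Walsh coefficient has absolute value 1/2 and the nonlinearity is 1/4, the maximum for two variables.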
Let \(f :{\mathbb {F}_{2}^{n}} \rightarrow \mathbb {F}_{2}\) be a Boolean function in n variables and let k ≥ 2 be an integer. The (k + 1)-st order nonhomomorphicity of f, denoted Pk(f), is defined as the probability that the equation f(u1 + ⋯ + uk) + f(u1) + ⋯ + f(uk) = 0 does not hold, with the probability taken over all tuples \((u_{1}, \ldots , u_{k})\in \mathbb {F}_{2}^{kn}\), i.e.
In other words, Pk(f) is the normalised Hamming weight of the function \(F:\mathbb {F}_{2}^{kn} \rightarrow \mathbb {F}_{2}\), F(u1,…,uk) = f(u1 + ⋯ + uk) + f(u1) + ⋯ + f(uk). Equivalently, considering U1,…,Uk independent uniformly distributed random variables in \({\mathbb {F}_{2}^{n}}\), we can define Pk(f) = P[f(U1 + ⋯ + Uk) + f(U1) + ⋯ + f(Uk)≠ 0].
Note that the BLR test corresponds to the particular case of k = 2.
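A sampling estimate of Pk(f) for a black-box f can be sketched as follows (vectors are encoded as n-bit integers with XOR as addition; the names are of our choosing):

```python
import random

def estimate_Pk(f, n, k, trials=20000, seed=0):
    """Estimate the (k+1)-st order nonhomomorphicity P_k(f): the fraction
    of random tuples (u_1, ..., u_k) for which
    f(u_1 + ... + u_k) + f(u_1) + ... + f(u_k) = 0 fails over F_2."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        us = [rng.getrandbits(n) for _ in range(k)]
        s = 0
        total = 0
        for u in us:
            s ^= u          # u_1 + ... + u_k in F_2^n
            total ^= f(u)   # f(u_1) + ... + f(u_k) in F_2
        failures += f(s) ^ total
    return failures / trials
```

For an affine but non-linear f, every trial passes when k is odd, and every trial fails when k is even, matching the characterization quoted above.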
3 Improved bounds on the probability of failure of existing affinity tests
We consider the test of whether a function is affine by checking whether f(u1 + ⋯ + uk) + f(u1) + ⋯ + f(uk) = 0, for some fixed odd integer k. We examine the relationship between the probability Pk(f) of failing this test and \(\mathrm {d}_{\mathcal {A}}(f)\), the nonlinearity of f. It is well known, and easy to prove, that f is affine if and only if Pk(f) = 0.
A lower bound for P3(f) was proven in [1, Lemma 5.1] (with \(x = \mathrm {d}_{\mathcal {A}}(f)\)):
If we allow the bound to also depend on n, we have the improved bound \(P_{k}(f) \le \mathrm{Upper}_{n,k}(\mathrm{d}_{\mathcal{A}}(f))\), where
In [8, Theorem 3.1], [14, Theorem 2] (and, for k ≤ 3, in [1]) the following expression for Pk is obtained (we reformulate it for the normalized versions of the Walsh transform and nonhomomorphicity):
where \(\hat f(x) = (-1)^{f(x)}\) and \(W(\hat f)\) is the Walsh transform of f.
In order to obtain the lower bound in the statement we need an upper bound on the sum \({\sum }_{u\in {\mathbb {F}_{2}^{n}}} (W(\hat f)(u))^{k+1}\), which we obtain by a technique similar to the one of [1]:
which gives the upper bound (9). For the bound (10), let \(u_{0} \in {\mathbb {F}_{2}^{n}}\) be such that \(|W(\hat f)(u_{0})| = \max \limits _{u\in {\mathbb {F}_{2}^{n}}} |W(\hat f)(u)|\). We have
Recall that the weighted power means inequality states that for any integers m ≥ 1,j ≥ 2 and any positive real numbers a1,…,am we have \( {\sum }_{i = 1}^{m} \frac {1}{m} {a_{i}^{j}} \ge \left (\frac {1}{m} {\sum }_{i = 1}^{m} a_{i} \right )^{j}\) (see for example [4, Chapter III]). Using this inequality and Parseval’s identity, we obtain
Note that for k ≥ 3 odd, the bounds in the theorem above are better than the bounds (7). The lower bound in (7) is negative when \(\mathrm {d}_{\mathcal {A}}(f)< \frac {1}{2}\left (1-\frac {1}{2^{\frac {n}{k+1}}}\right )\), so it does not provide any useful information in that range. When it is positive, it is still always smaller than the lower bound in (8), only reaching equality when \(\mathrm {d}_{\mathcal {A}}(f)\) attains its maximum value, namely \(\frac {1}{2}\left (1-\frac {1}{2^{\frac {n}{2}}}\right )\). The upper bound in (7) does not depend on \(\mathrm {d}_{\mathcal {A}}(f)\), whereas the one in (10) increases continuously from 0 to \(\frac {1}{2}\left (1- \frac {1}{2^{(k-1)n/2}} \right )\) as \(\mathrm {d}_{\mathcal {A}}(f)\) increases from 0 to \(\frac {1}{2}\left (1-\frac {1}{2^{\frac {n}{2}}}\right )\).
We examine the tightness of the bounds in Theorem 2. The upper bound (9) cannot be reached (except for the trivial case \(\mathrm {d}_{\mathcal {A}}(f)=0\) and Pk(f) = 0) because Uppern,k(x) < Upperk(x) for 0 < x < 0.5. Note however that Upperk(x) is the limit of Uppern,k(x), as n approaches infinity. We found experimentally functions f for which Pk(f) is very close to the upper bound \({{\text {Upper}}}_{k}(\mathrm {d}_{\mathcal {A}}(f))\), while \(\mathrm {d}_{\mathcal {A}}(f)\) covers many values throughout the interval (0,0.5), see the last graph in the ??. We suspect therefore that this upper bound cannot be improved much (as a bound which is independent of n).
The examples below present functions for which the upper bound (10) as well as the lower bound in Theorem 2 are attained.
Example 3
For n even and k odd, consider a bent function in n variables, for example f(x1,…,xn) = x1x2 + x3x4 + ⋯ + xn− 1xn. The nonlinearity achieves the maximum possible value for a function in n variables, namely \(\mathrm {d}_{\mathcal {A}}(f)=\frac {1}{2}\left (1-\frac {1}{2^{\frac {n}{2}}}\right )\). All the Walsh coefficients of a bent function are equal to \(\pm \frac {1}{2^{\frac {n}{2}}}\), with \(2^{n-1}+2^{\frac {n}{2}-1}\) having one sign and \(2^{n-1}-2^{\frac {n}{2}-1}\) the opposite sign. Using (11) we can compute
One can verify that in this case \(P_{k}(f)=\mathrm{Upper}_{n,k}(\mathrm{d}_{\mathcal{A}}(f))\), so the upper bound (10) is attained.
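This computation can be checked numerically. The sketch below assumes the closed form referred to as (11) reads \(P_{k}(f) = \frac{1}{2}\big (1 - {\sum }_{u} (W(\hat f)(u))^{k+1}\big )\), and evaluates it on the bent function x1x2 + x3x4 in n = 4 variables (encoding and helper names are ours):

```python
def exact_Pk(tt, k):
    """Exact P_k(f) from the truth table, assuming the closed form
    P_k(f) = (1 - sum_u W(f^)(u)^(k+1)) / 2 over the normalized spectrum."""
    w = [(-1) ** b for b in tt]          # sign function (-1)^f(u)
    h = 1
    while h < len(w):                     # fast Walsh-Hadamard butterfly
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return (1 - sum((c / len(w)) ** (k + 1) for c in w)) / 2

# Bent function f(x1, ..., x4) = x1*x2 + x3*x4; bit i of u encodes x_{i+1}.
bent = [((u & 1) and ((u >> 1) & 1)) ^ (((u >> 2) & 1) and ((u >> 3) & 1))
        for u in range(16)]
```

With k = 3 this yields 15/32 = 0.46875, i.e. \(\frac{1}{2}\left (1-\frac{1}{2^{(k-1)n/2}}\right )\) for n = 4, in agreement with the example.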
Example 5
Consider an arbitrary function in m variables, \(f^{\prime }(x_{1}, \ldots , x_{m})\). We can view it as a function in a larger number n of variables for any n ≥ m by defining \(f(x_{1}, \ldots , x_{n}) =f^{\prime }(x_{1}, \ldots , x_{m})\). We show that f and \(f^{\prime }\) have the same nonlinearity and \(P_{k}(f) = P_{k}(f^{\prime })\). To this end, we examine the Walsh transform. Denoting \(x^{\prime }=(x_{1},\ldots ,x_{m})\) and \(x^{\prime \prime }=(x_{m+1},\ldots ,x_{n})\), as well as, \(y^{\prime }=(y_{1},\ldots ,y_{m})\) and \(y^{\prime \prime }=(y_{m+1},\ldots ,y_{n})\), we have
by using [5, Lemma 2.9]. Therefore we conclude that \(\mathrm {d}_{\mathcal {A}}(f)=\mathrm {d}_{\mathcal {A}}(f^{\prime })\) using (5); also \(P_{k}(f) = P_{k}(f^{\prime })\) using (11).
We consider now the function f(x1,…,xn) = x1x2 + x3x4 + ⋯ + xm− 1xm with m even and m ≤ n and let k be odd. Using the argument above and the computation in Example 3 we know that \(\mathrm {d}_{\mathcal {A}}(f)=\frac {1}{2}\left (1-\frac {1}{2^{\frac {m}{2}}}\right )\) and \(P_{k}(f) =\frac {1}{2}\left (1-\left (1-2\mathrm {d}_{\mathcal {A}}(f)\right )^{k-1} \right )\), so this function reaches the lower bound in Theorem 2 as well.
Summarising the examples above, for each fixed number of variables n and each odd k, the upper bound Uppern,k in Theorem 2 is reached at nonlinearity \(\frac {1}{2^{n}}\) (which is the lowest possible non-zero nonlinearity) and if n is even, also at \(\frac {1}{2}\left (1-\frac {1}{2^{\frac {n}{2}}}\right )\) (which is the highest possible nonlinearity). The lower bound in Theorem 2 is reached for nonlinearities \(\frac {1}{2^{n}}\) as well as all nonlinearities of the form \(\frac {1}{2}\left (1-\frac {1}{2^{\frac {m}{2}}}\right )\) for m even, m ≤ n, i.e. \(\frac {1}{4}, \frac {3}{8}, \frac {7}{16}, \ldots \), for m = 2,4,6,…, respectively. In between these values, the lower bound might not be tight. Indeed for \(\mathrm {d}_{\mathcal {A}}(f)<\frac {1}{4}\), (6) provides a better lower bound for k = 3.
We conjectured in [11] that the lower bound (6) can be generalized to arbitrary odd k ≥ 3, as follows:
In the next section we will prove this conjecture.
4 Reformulated conjecture and its proof
Theorem 7
Let f be a Boolean function in n variables, k ≥ 2 an integer and ℓ a linear function in n variables if k is even or an affine function if k is odd. Then
with the probability taken over all \(u_{1}, \ldots , u_{k} \in {\mathbb {F}_{2}^{n}}\).
Proof
The first part of the proof follows the lines of [1, Lemma 2.3]. Denote x = d(f,ℓ). Let \(u_{1}, \ldots , u_{k} \in {\mathbb {F}_{2}^{n}}\) and denote \(u_{k+1} = {\sum }_{i=1}^{k} u_{i}\). The function f fails the test f(u1 + ⋯ + uk) + f(u1) + ⋯ + f(uk) = 0 exactly for those values u1,…,uk for which an odd number of the values f(u1) − ℓ(u1),…,f(uk+ 1) − ℓ(uk+ 1) are equal to 1 (note that ℓ always passes the test). Denote by Aj the probability that the first j of these k + 1 values are equal to 1 and the rest are equal to 0, i.e.
where, for ease of notation, we denoted g = f − ℓ, and the probability is taken over all the \(2^{kn}\) elements of the set \(V=\{(u_{1}, \ldots , u_{k+1})\in ({\mathbb {F}_{2}^{n}})^{k+1}: {\sum }_{i=1}^{k+1}u_{i}=0\}\). For any subset \(I\subseteq \{1, \ldots , k+1\}\) of cardinality j, Aj also equals the probability (again over all (u1,…,uk+ 1) ∈ V) that g(ui) = 1 for all i ∈ I and g(ui) = 0 for all i ∈{1,…,k + 1}∖ I. Therefore, taking into account that there are \(\binom{k+1}{j}\) subsets of cardinality j, we obtain
In the second part of the proof we will obtain a closed form for the formula above. Firstly, let us process the inner sum in (15); we replace the index of summation by u = i − j and then use the well-known identity \(a^{n} - b^{n} = (a - b)(a^{n-1} + a^{n-2}b + \cdots + b^{n-1})\):
We are now ready to prove Conjecture 6; we will, in fact, prove a more general result that also includes the case of k even.
Corollary 8
For odd k we have \(P_{k}(f)\ge \mathrm{Lower}_{k}(\mathrm{d}_{\mathcal{A}}(f))\) and for even k we have \(P_{k}(f)\ge \mathrm{Lower}_{k}(\mathrm{d}_{{\mathscr{L}}}(f))\), where Lowerk(x) equals
for any linear function ℓ. Using the fact that sl(f,ℓ) ≥ 0 and choosing ℓ to be a linear function whose distance to f is minimal, we obtain \(P_{k}(f)\ge G_{k}(\mathrm {d}_{{\mathscr{L}}}(f))\) as required.
with the probability taken over all tuples \((u_{1}, \ldots ,u_{k})\in ({\mathbb {F}_{2}^{n}})^{k}\). Combining this inequality with Theorem 7 for k odd gives
for any affine function ℓ. Choosing ℓ to be an affine function whose distance to f is minimal, we obtain \(P_{k}(f)\ge G_{k}(\mathrm {d}_{\mathcal {A}}(f))\) as required.
One may naturally ask whether another affine/linear function (for k odd/even, respectively), say ℓ1, which is further from f than the closest one ℓ0, i.e. d(f,ℓ1) > d(f,ℓ0), could yield a better lower bound, i.e. Gk(d(f,ℓ1)) > Gk(d(f,ℓ0)). This is not the case, and we give a sketch of the proof for k odd. Firstly, the reader can verify that Gk(x) ≥ 0 on [0,0.5], Gk(x) ≤ 0 on [0.5,1], and that on the interval [0.25,0.5] the function Gk is monotonically decreasing. Therefore, when d(f,ℓ0) ≥ 0.25, keeping in mind that 0 ≤d(f,ℓ0) < 0.5 and d(f,ℓ0) < d(f,ℓ1) ≤ 1, we indeed have Gk(d(f,ℓ1)) ≤ Gk(d(f,ℓ0)). Secondly, when d(f,ℓ0) < 0.25, the triangle inequality gives
which is greater than or equal to zero on [0,0.25].
Finally, the more explicit expression (19) for k odd is obtained by verifying that Gk(x) > Hk(x) when \(x\in \left [0,\frac {1}{4}\right )\), Gk(x) < Hk(x) when \(x\in \left [\frac {1}{4},\frac {1}{2}\right ]\), and \(G_{k}(\frac {1}{4})= H_{k}(\frac {1}{4})\). A similar situation happens for k even, but the intersection of the two functions does not occur at \(\frac {1}{4}\), but at a point whose value depends on k, and is in the interval \(\left [\frac {1}{4},\frac {1}{3}\right ]\). □
The new lower bound in Corollary 8 is attained for k odd by some functions with nonlinearity in the range \(0<\mathrm {d}_{\mathcal {A}}(f)<\frac {1}{4}\) and for k even by some functions with \(0<\mathrm {d}_{{\mathscr{L}}}(f)<\frac {1}{4}\):
Example 9
Let f(x1,x2,…,xn) = x1x2⋯xm with 3 ≤ m ≤ n. Using the computations in Example 4 and the same arguments as in Example 5 we see that \(\mathrm {d}_{\mathcal {A}}(f)=\mathrm {d}_{{\mathscr{L}}}(f)=\frac {1}{2^{m}}\) and the Walsh spectrum consists of one element equal to \(1-\frac {1}{2^{m-1}}\), 2m− 1 elements equal to \(\frac {1}{2^{m-1}}\), and 2m− 1 − 1 elements equal to \(-\frac {1}{2^{m-1}}\), the remaining elements being zero.
This shows that for each fixed n our lower bound is attained at nonlinearity (for k odd) or \(\mathrm {d}_{{\mathscr{L}}}(f)\) (for k even) equal to \(\frac {1}{8}, \frac {1}{16}, \frac {1}{32}, \ldots , \frac {1}{2^{n}}\), but in between these values, the bound might not be tight.
While not the purpose of this paper, we get an interesting consequence of the previous theorem, namely an upper bound for the moments of the Walsh coefficients.
Corollary 10
For any integer k ≥ 2, the (k + 1)-st moments of the Walsh transform satisfy
$$ {\sum}_{u\in {\mathbb{F}_{2}^{n}}} (W(\hat f)(u))^{k+1} \leq \begin{cases} \min(y^{k+1} + (1-y)^{k}(1+y), y^{k-1}) & \text{if } k \text{ is odd}\\ \min(y^{k+1} + (1-y)^{k+1}, y^{k-1}) & \text{if } k \text{ is even}, \end{cases} $$
where \(y = \max \limits _{u\in {\mathbb {F}_{2}^{n}}} |W(\hat f)(u)|\) for k odd and \(y = \max \limits _{u\in {\mathbb {F}_{2}^{n}}} (W(\hat f)(u))\) for k even.
Proof
The claim follows by using (5), (11) and the previous corollary. □
5 Affinity tests using linearity tests and bounds on the probability of failure
In this section we focus on the test f(u1 + ⋯ + uk) + f(u1) + ⋯ + f(uk) = 0 for k even, and on the probability Pk(f) of failing this test. It is shown in [14] that f is linear if and only if Pk(f) = 0; moreover, f is affine if and only if Pk(f) ∈{0,1}. The test can therefore be used as a probabilistic affinity test as follows: run the test several times on f, and if f always passes the test (suggesting a probability of failure Pk(f) = 0), or f always fails the test (suggesting that Pk(f) = 1), then declare f to be affine. Note that f + 1 passes the linearity test above for some given tuples if and only if f fails the test for those same tuples; therefore we have Pk(f + 1) = 1 − Pk(f). Another way of looking at this affinity test is that we are testing both f and f + 1 for linearity, and if one of them passes all the tests and is declared linear then f can be declared affine. Any other linearity test could be used this way as an affinity test.
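Concretely, the BLR-based variant of this affinity test can be sketched as follows, sampling P2(f) and then using the identity P2(f + 1) = 1 − P2(f) (names are ours; vectors are n-bit integers, XOR is addition):

```python
import random

def estimate_Pbar2(f, n, trials=20000, seed=0):
    """Estimate min(P2(f), P2(f + 1)) = min(P2(f), 1 - P2(f)) by
    running the BLR test on f only: f + 1 fails exactly when f passes."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        u, v = rng.getrandbits(n), rng.getrandbits(n)
        failures += f(u ^ v) ^ f(u) ^ f(v)
    p2 = failures / trials
    return min(p2, 1 - p2)
```

An estimate near 0 suggests that f is close to affine, i.e. that f or f + 1 is close to linear.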
When f is not affine however (and therefore neither f nor f + 1 are linear), there are to our knowledge no results regarding the relationship of the probability Pk(f) of failing the test (for k even) and the nonlinearity \(\mathrm {d}_{\mathcal {A}}(f)\) of f; the existing lower and upper bounds on Pk(f) depend on \(\mathrm {d}_{{\mathscr{L}}}(f)\), the distance of f to the set of linear functions. Since this affinity test is equivalent to testing both f and f + 1 for linearity, it seems natural to consider both Pk(f) and Pk(f + 1) when examining a connection to \(\mathrm {d}_{\mathcal {A}}(f)\). We define
and study its relation to \(\mathrm {d}_{\mathcal {A}}(f)\). Further motivation for this choice is given in Remark 12. For k = 2, we will prove lower and upper bounds for \(\overline {P}_{2}(f)\) in terms of the nonlinearity of f.
In Bellare et al. [1] lower and upper bounds were given for P2(f) in terms of \(\mathrm {d}_{{\mathscr{L}}}(f)\). Namely, it was proven that
where \(\mathrm{Lower}_{2}, \mathrm{Upper}_{2} : [0, \frac {1}{2}] \rightarrow \mathbb {R}\). The function Lower2(x) is defined as
(observe that \(\frac {3}{16}\) and \(\frac {5}{16}\) are the two solutions of the equation \(3x-6x^{2}= \frac {45}{128}\); since \(\frac {45}{128}>\frac {5}{16}\), the value \(\mathrm{Lower}_{2}(x)=\frac {45}{128}\) is greater than 3x − 6x2 on the interval \(\frac {5}{16} \le x \le \frac {45}{128}\)). The function Upper2(x) is defined as Upper2(0) = 0 and, for x > 0,
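As an illustration, Lower2, together with the two explicit branches of Upper2 that are worked out in the proof of Theorem 11, can be sketched as follows. The piecewise shape of Lower2 follows [1], and Upper2 is restricted here to [1/8, 1/2); treat this as a sketch rather than the full definitions.

```python
def lower2(x):
    """Lower_2(x) (Bellare et al.): 3x - 6x^2 up to 5/16, the constant
    45/128 on [5/16, 45/128], and x itself above 45/128."""
    if x <= 5 / 16:
        return 3 * x - 6 * x * x
    if x <= 45 / 128:
        return 45 / 128
    return x

def upper2(x):
    """Upper_2(x) on [1/8, 1/2): the two branches from the proof of
    Theorem 11 (floor(log2 x) = -3 and -2 respectively)."""
    assert 1 / 8 <= x < 1 / 2
    return 6 * x * x + 1 / 4 if x < 1 / 4 else 6 * x * x - 3 * x + 1
```

On this range Upper2 is monotonically increasing and crosses 1/2 exactly at x = 1/(2√6), the threshold used in the proof below.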
We now prove bounds for \(\overline {P}_{2}(f)\) in terms of \(\mathrm {d}_{\mathcal {A}}(f)\), the distance to the closest affine function, which is the natural parameter to consider when testing if a function is affine. Note that in the following theorem, although the bounds look similar to the bounds in (21) above, there is a subtle and important difference: the bounds are now a function of \(\mathrm {d}_{\mathcal {A}}(f)\), the distance to the closest affine function, whereas in (21) the bounds are expressed in terms of \(\mathrm {d}_{{\mathscr{L}}}(f)\), the distance to the closest linear function.
Theorem 11
Let \(\overline {P}_{2}(f)=\min \limits (P_{2}(f), 1- P_{2}(f))\), where P2(f) is the probability of failure of the BLR test. We have
where \(\mathrm {d}_{\mathcal {A}}(f)\) is the nonlinearity of f and Lower2(x), Upper2(x) are as defined above in (22), respectively, (23).
Proof
We know that \(\mathrm {d}_{\mathcal {A}}(f) = \min \limits (\mathrm {d}_{{\mathscr{L}}}(f), \mathrm {d}_{{\mathscr{L}}}(f+1))\). We can assume, without loss of generality, that \(\mathrm {d}_{{\mathscr{L}}}(f) \le \mathrm {d}_{{\mathscr{L}}}(f+1)\) (otherwise, we can just replace f by f + 1, and \(\overline {P}_{2}(f)\) is unchanged) and therefore \(\mathrm {d}_{\mathcal {A}}(f) = \mathrm {d}_{{\mathscr{L}}}(f)\).
First, let us examine the function Upper2(x) more closely. If \(\frac {1}{4}\le x <\frac {1}{2}\) then \(\lfloor \log _{2} x\rfloor = -2\), so a simple computation shows that \(\mathrm{Upper}_{2}(x) = 6x^{2} - 3x + 1\) in this case. If \(\frac {1}{8}\le x <\frac {1}{4}\) then \(\lfloor \log _{2} x\rfloor = -3\), so \(\mathrm{Upper}_{2}(x) = 6x^{2}+\frac {1}{4}\) in this case. One can check that the function Upper2(x) is monotonically increasing on the domain \(\left [0, \frac {1}{2}\right ]\). (It is continuous, and the derivative exists at all points except those of the form \(x=\frac {1}{2^{m}}\) for some integer m ≥ 1; the derivative is greater than zero at all points where it exists.) The equation \(\mathrm{Upper}_{2}(x)=\frac {1}{2}\) has only one solution in the interval \(\left [0,\frac {1}{2}\right ]\), namely \(x= \frac {1}{2\sqrt {6}} \in \left [\frac {1}{8},\frac {1}{4}\right )\). Therefore \(\mathrm{Upper}_{2}(x)\le \frac {1}{2}\) if and only if \(x\le \frac {1}{2\sqrt {6}}\) (see the first graph in the ??).
Therefore, using the fact that Upper2 is monotonic and the assumption \(\mathrm {d}_{{\mathscr{L}}}(f) \le \mathrm {d}_{{\mathscr{L}}}(f+1)\), we obtain
$$ \overline{P}_{2}(f) \le P_{2}(f) \le \mathrm{Upper}_{2}\left(\mathrm{d}_{\mathscr{L}}(f)\right) = \mathrm{Upper}_{2}\left(\min\left(\mathrm{d}_{\mathscr{L}}(f), \mathrm{d}_{\mathscr{L}}(f+1)\right)\right) = \mathrm{Upper}_{2}\left(\mathrm{d}_{\mathcal{A}}(f)\right). $$
The bound \(\overline {P}_{2}(f)\le \frac {1}{2}\) is immediate from \(\overline {P}_{2}(f)=\min \limits (P_{2}(f), 1- P_{2}(f))\).
Now let us deal with the lower bound. If P2(f) ≤ 1 − P2(f) (in other words, \(P_{2}(f)\le \frac {1}{2}\)), then \(\overline {P}_{2}(f) = P_{2}(f)\ge \mathrm {L}\mathrm {o}\mathrm {w}\mathrm {e}\mathrm {r}_{2}(\mathrm {d}_{{\mathscr{L}}}(f)) = \mathrm {L}\mathrm {o}\mathrm {w}\mathrm {e}\mathrm {r}_{2}(\mathrm {d}_{\mathcal {A}}(f))\) and we are done. Let us assume P2(f) > 1 − P2(f), i.e. \(P_{2}(f)> \frac {1}{2}\). From the behaviour of Upper2(x) discussed above, we see that this can only happen when \(\mathrm {d}_{{\mathscr{L}}}(f)\ge \frac {1}{2\sqrt {6}}\). We have to prove that in this case \( 1- P_{2}(f) \ge \mathrm {L}\mathrm {o}\mathrm {w}\mathrm {e}\mathrm {r}_{2}(\mathrm {d}_{{\mathscr{L}}}(f)). \)
Let us first consider the case \(\frac {1}{2\sqrt {6}} \le \mathrm {d}_{{\mathscr{L}}}(f)\le \frac {1}{4}\). We have
$$ 1-P_{2}(f) \ge 1-\mathrm{Upper}_{2}\left(\mathrm{d}_{\mathscr{L}}(f)\right) = \frac{3}{4} - 6\,\mathrm{d}_{\mathscr{L}}(f)^{2} \ge 3\,\mathrm{d}_{\mathscr{L}}(f) - 6\,\mathrm{d}_{\mathscr{L}}(f)^{2} = \mathrm{Lower}_{2}\left(\mathrm{d}_{\mathscr{L}}(f)\right), $$
where the last inequality uses the fact that \(\mathrm {d}_{{\mathscr{L}}}(f)\le \frac {1}{4}\).
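In this case the inequality reduces to \(\frac{3}{4}-6x^{2}\ge 3x-6x^{2}\), i.e. \(x\le \frac{1}{4}\). The following numerical sketch checks it over the relevant range, assuming the branch \(6x^{2}+\frac{1}{4}\) of Upper2 and the quadratic \(3x-6x^{2}\) appearing in the lower bound:

```python
import math

lower2 = lambda x: 3 * x - 6 * x * x    # quadratic branch of Lower2
upper2 = lambda x: 6 * x * x + 0.25     # branch of Upper2 on [1/8, 1/4)

# check 1 - Upper2(x) >= Lower2(x) on [1/(2*sqrt(6)), 1/4]
lo, hi = 1 / (2 * math.sqrt(6)), 0.25
for i in range(1001):
    x = lo + i * (hi - lo) / 1000
    assert 1 - upper2(x) >= lower2(x) - 1e-12
```

At the right endpoint \(x=\frac{1}{4}\) the two sides coincide (both equal \(\frac{3}{8}\)).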
Next assume that \(\frac {1}{4}<\mathrm {d}_{\mathcal {A}}(f)\). Consider first the subcase \(\frac {1}{4} \le \mathrm {d}_{\mathcal {A}}(f) < \frac {5}{16}\). We have:
with the last inequality based on the fact that \(\frac {5}{16}< \mathrm {d}_{{\mathscr{L}}}(f)\le \mathrm {d}_{{\mathscr{L}}}(f+1)\) and Lower2(x) is monotonically increasing when the argument is above \(\frac {5}{16}\). □
Remark 12
One might wonder whether the situation where \(\mathrm {d}_{{\mathscr{L}}}(f)\le \mathrm {d}_{{\mathscr{L}}}(f+1)\) and P2(f) > P2(f + 1), which is the non-straightforward case in the proof of Theorem 11, ever happens in practice. Experimentally, we did find such functions, but they seemed to be relatively rare. For example, for n = 6 and n = 7, we generated several random functions for each possible nonlinearity and we observed that behaviour in a proportion of less than 0.06 of them. Therefore it is a reasonable heuristic, but only a heuristic, to assume that whichever of the functions f and f + 1 achieves \(\min \limits (P_{2}(f), P_{2}(f+1))\) also achieves \(\min \limits (\mathrm {d}_{{\mathscr{L}}}(f), \mathrm {d}_{{\mathscr{L}}}(f+1))\). It also justifies our choice to examine \(\min \limits (P_{2}(f), P_{2}(f+1))\) for its correlation to \(\mathrm {d}_{\mathcal {A}}(f)=\min \limits (\mathrm {d}_{{\mathscr{L}}}(f), \mathrm {d}_{{\mathscr{L}}}(f+1))\).
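For small n the probabilities P2(f) and P2(f + 1) can be computed exactly by exhaustive enumeration. A sketch, assuming the standard form of the BLR check, namely that P2(f) is the probability over uniform x, y that f(x) + f(y) + f(x + y) = 1 (the example truth table is hypothetical). Since complementing f flips all three terms of the sum, the test outcome flips, so P2(f + 1) = 1 − P2(f):

```python
def p2(f, n):
    """Exact BLR failure probability: Pr over (x, y) that
    f(x) + f(y) + f(x + y) = 1, where x + y is bitwise XOR.
    f is a truth table of length 2**n."""
    fails = sum(f[x] ^ f[y] ^ f[x ^ y]
                for x in range(2 ** n) for y in range(2 ** n))
    return fails / 4 ** n

n = 4
# hypothetical example: truth table of f = x1*x2 + x3 (bit numbering arbitrary)
f = [(((x >> 3) & (x >> 2)) ^ (x >> 1)) & 1 for x in range(2 ** n)]
fc = [b ^ 1 for b in f]  # the function f + 1

# complementing f flips the parity of the three-term sum, so:
assert abs(p2(fc, n) - (1 - p2(f, n))) < 1e-12
```

This identity means \(\min(P_{2}(f), P_{2}(f+1)) = \min(P_{2}(f), 1-P_{2}(f)) = \overline{P}_{2}(f)\), consistent with the quantities compared in the remark.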
As a byproduct of the proof of Theorem 11, we can also obtain bounds on how large the difference P2(f) − P2(f + 1) can be when \(\mathrm {d}_{{\mathscr{L}}}(f)\le \mathrm {d}_{{\mathscr{L}}}(f+1)\). Namely, denoting \(x=\mathrm {d}_{{\mathscr{L}}}(f)\), we have the following cases. When \(x\in \left [0, \frac {1}{2\sqrt {6}}\right ]\) we cannot have P2(f) > P2(f + 1). When \(x\in \left (\frac {1}{2\sqrt {6}},\frac {1}{2}\right ]\) both P2(f) ≤ P2(f + 1) and P2(f) > P2(f + 1) are possible. If the latter happens, when \(x\in \left (\frac {1}{2\sqrt {6}},\frac {1}{4}\right ]\) we obtain from (25) that \(P_{2}(f)- P_{2}(f+1)= 2P_{2}(f)-1\le 12x^{2}-\frac {1}{2}\le \frac {1}{4}\); when \(x\in \left (\frac {1}{4},\frac {1}{2}\right ]\) we obtain from (26) and (27) that \(P_{2}(f)- P_{2}(f+1)= 2P_{2}(f)-1\le 1-2\mathrm {L}\mathrm {o}\mathrm {w}\mathrm {e}\mathrm {r}_{2}(x)\le \frac {38}{128} \approx 0.296\).
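The two numerical bounds quoted above can be checked with exact arithmetic; a small sketch, using the fact from the proof that the minimum of the lower bound on \(\left(\frac{1}{4},\frac{1}{2}\right]\) is \(\frac{45}{128}\):

```python
from fractions import Fraction

# At x = 1/4 the first bound attains its maximum value 1/4:
x = Fraction(1, 4)
assert 12 * x ** 2 - Fraction(1, 2) == Fraction(1, 4)

# Second bound: 1 - 2*Lower2(x) <= 1 - 2*(45/128) = 38/128 on (1/4, 1/2]:
bound = 1 - 2 * Fraction(45, 128)
assert bound == Fraction(38, 128)
assert abs(float(bound) - 0.296) < 1e-3
```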
By contrast, for arbitrary functions f,g such that \(\mathrm {d}_{{\mathscr{L}}}(f)< \mathrm {d}_{{\mathscr{L}}}(g)\) but P2(f) > P2(g), the difference P2(f) − P2(g) can be larger, approaching 0.5. For example, for any integer t ≥ 2 consider the functions f(x) = 1 + x1x2⋯xt and g(x) = x1x2 + x3x4 + ⋯ + x2t− 1x2t. Using the calculations in Examples 3, 4, 5 and 9, we have that \(\mathrm {d}_{{\mathscr{L}}}(f)=\frac {1}{2}-\frac {1}{2^{t}}< \frac {1}{2}-\frac {1}{2^{t+1}}= \mathrm {d}_{{\mathscr{L}}}(g)\), and P2(f) > P2(g), with a difference that approaches 0.5 as t grows.
An improvement of the lower bound for the BLR linearity test (21) is given in [9]. Namely, it is shown that \(P_{2}(f)\ge H(\mathrm {d}_{{\mathscr{L}}}(f))\), where
$$ H(x) = \min\left(g_{1}(x), g_{2}(x)\right), $$
where for any constant \(0< c\le \frac {1}{2}\), g1,g2 are defined as
$$ \begin{array}{@{}rcl@{}} g_{1}(x) &=& x + cx(1-2x)^{4},\\ g_{2}(x) &=& x + 2^{12}\left( 1-\frac{5}{4}c +\frac{1}{8}c^{2}\right)x^{3}(1-2x)^{12}. \end{array} $$
Note that this is indeed an improved lower bound as Lower2(x) ≤ H(x) and the inequality is strict on the interval \(\left (\frac {45}{128}, \frac {1}{2}\right )\). The analogue of Theorem 11 holds for this improved bound as well.
Proposition 13
With the notations in Theorem 11, we have \(\overline {P}_{2}(f) \ge H\left (\mathrm {d}_{\mathcal {A}}(f)\right )\), where H(x) is as defined above in (28).
Proof
Examining the proof of Theorem 11, we see that the only property of the lower bound used on the interval \(\left [\frac {5}{16}, \frac {1}{2}\right ]\) is that it is monotonically increasing. It suffices therefore to show that \(\min \limits (g_{1}, g_{2})\) is monotonically increasing. We compute \(g^{\prime }_{1}\) and \(g^{\prime }_{2}\), the derivatives of g1 and g2, and show that they are positive on the specified domain. Namely, \(0 <c\le \frac {1}{2}\) implies \(0<1-\frac {5}{4}c +\frac {1}{8}c^{2}\le 1\); further, \(\frac {5}{16}\le x \le \frac {1}{2}\) implies \(1-2x\le \frac {3}{8}\), 10x − 1 ≤ 4 and \(x(1-2x)\le \frac {15}{2^{7}}\). Therefore,
$$ \begin{array}{@{}rcl@{}} g^{\prime}_{1}(x) &=& 1 - c(1-2x)^{3}(10x-1) \ge 1-\frac{1}{2}\left(\frac{3}{8}\right)^{3}\cdot 4 > 0,\\ g^{\prime}_{2}(x) &=& 1 - 3\cdot 2^{12}\left(1-\frac{5}{4}c+\frac{1}{8}c^{2}\right)\left(x(1-2x)\right)^{2}(1-2x)^{9}(10x-1)\\ &\ge& 1 - 3\cdot 2^{12}\left(\frac{15}{2^{7}}\right)^{2}\left(\frac{3}{8}\right)^{9}\cdot 4 > 0. \end{array} $$
□
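The monotonicity of g1 and g2 on \(\left[\frac{5}{16},\frac{1}{2}\right]\) can also be checked numerically; the following sketch samples a few values of the constant c in \(\left(0,\frac{1}{2}\right]\):

```python
def g1(x, c):
    return x + c * x * (1 - 2 * x) ** 4

def g2(x, c):
    return x + 2 ** 12 * (1 - 1.25 * c + 0.125 * c ** 2) * x ** 3 * (1 - 2 * x) ** 12

# check that both functions are strictly increasing on [5/16, 1/2]
for c in (0.05, 0.25, 0.5):
    xs = [5 / 16 + i * (0.5 - 5 / 16) / 2000 for i in range(2001)]
    for g in (g1, g2):
        vals = [g(x, c) for x in xs]
        assert all(a < b for a, b in zip(vals, vals[1:])), "not increasing"
```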
The above affinity tests can be used to estimate the nonlinearity of a Boolean function. The probability of failing a test can be estimated by running the test several times and using statistical methods such as the binomial proportion confidence interval (see [14]). The bounds will then allow us to give an interval for the value of the nonlinearity as per (1) and (2). For simplicity, we will assume that we have obtained an exact value for Pk(f) (in practice we will actually obtain a confidence interval). We will examine each test in turn. The graphs in the ?? will aid the discussion.
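The sampling step can be sketched as follows. This is only an illustration of the procedure, using the standard BLR check f(x) + f(y) = f(x + y) and the normal-approximation binomial confidence interval (one simple choice among the methods surveyed in [14]); the example function is hypothetical.

```python
import math
import random

def blr_failure_estimate(f, n, trials=100_000, z=1.96, seed=1):
    """Estimate P2(f) by sampling the BLR check f(x) + f(y) = f(x + y)
    (x + y is bitwise XOR) and return a normal-approximation binomial
    proportion confidence interval. f is a truth table of length 2**n."""
    rng = random.Random(seed)
    fails = 0
    for _ in range(trials):
        x, y = rng.randrange(2 ** n), rng.randrange(2 ** n)
        fails += f[x] ^ f[y] ^ f[x ^ y]
    p = fails / trials
    half = z * math.sqrt(p * (1 - p) / trials)
    return max(0.0, p - half), min(1.0, p + half)

# sanity check: a linear function f(x) = <a, x> never fails the test
n = 6
a = 0b101  # hypothetical mask defining the linear function
linear = [bin(x & a).count("1") % 2 for x in range(2 ** n)]
lo, hi = blr_failure_estimate(linear, n, trials=10_000)
assert (lo, hi) == (0.0, 0.0)
```

In a real application f would be queried as a black box rather than stored as a truth table.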
We first look at the affinity test based on the BLR test, as described in Section 5. The first graph in the ?? displays the lower and upper bounds described in Theorem 11. Thus, for values of \(0\le \overline {P}_{2}(f)< y^{(2)}_{1} = \frac {45}{128} = 0.3515625\) we can estimate the nonlinearity with good precision as being in the interval \( \mathrm {d}_{\mathcal {A}}(f)\in \left [{\mathrm {U}\mathrm {p}\mathrm {p}\mathrm {e}\mathrm {r}_{2}}^{-1}(\overline {P}_{2}(f)), {\mathrm {L}\mathrm {o}\mathrm {w}\mathrm {e}\mathrm {r}_{2}}^{-1}(\overline {P}_{2}(f))\right ]. \) The length of this interval increases with \(\overline {P}_{2}(f)\), up to a length of approximately 0.058. For \( \overline {P}_{2}(f)=\frac {45}{128}\) we get
$$ \mathrm{d}_{\mathcal{A}}(f)\in \left[{\mathrm{Upper}_{2}}^{-1}\left(\tfrac{45}{128}\right), {\mathrm{Lower}_{2}}^{-1}\left(\tfrac{45}{128}\right)\right] \approx [0.1301, 0.1875]. $$
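The interval endpoints can be computed explicitly from the branches given earlier; a numerical sketch, using \(\mathrm{Upper}_{2}(x)=6x^{2}+\frac{1}{4}\) on \(\left[\frac{1}{8},\frac{1}{4}\right)\) and the quadratic \(3x-6x^{2}\) for the lower bound:

```python
import math

y = 45 / 128
# Upper2^{-1}(y): solve 6x^2 + 1/4 = y on [1/8, 1/4)
x_lo = math.sqrt((y - 0.25) / 6)
# Lower2^{-1}(y): smaller root of 3x - 6x^2 = y, here exactly 3/16
x_hi = (3 - math.sqrt(9 - 24 * y)) / 12
assert abs(x_hi - 3 / 16) < 1e-12
assert abs((x_hi - x_lo) - 0.058) < 1e-3   # interval length ~ 0.058

# at the other extreme P2bar = 1/2, the interval [1/(2*sqrt(6)), 1/2)
# has length ~ 0.295
assert abs((0.5 - 1 / (2 * math.sqrt(6))) - 0.295) < 1e-3
```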
For \(\frac {45}{128}< \overline {P}_{2}(f) <\frac {3}{8}\), \({\mathrm {L}\mathrm {o}\mathrm {w}\mathrm {e}\mathrm {r}_{2}}^{-1}(\overline {P}_{2}(f)) = \{\alpha ^{(2)}(\overline {P}_{2}(f)), {\upbeta }^{(2)}(\overline {P}_{2}(f)),\overline {P}_{2}(f)\} \), where \(0<\alpha ^{(2)}(y)\le {\upbeta }^{(2)}(y)< \frac {5}{16}\) are the two roots of the equation 3x − 6x2 = y in this domain. We obtain two disjoint intervals where \(\mathrm {d}_{\mathcal {A}}(f)\) might be:
$$ \mathrm{d}_{\mathcal{A}}(f)\in \left[{\mathrm{Upper}_{2}}^{-1}(\overline{P}_{2}(f)), \alpha^{(2)}(\overline{P}_{2}(f))\right] \cup \left[{\upbeta}^{(2)}(\overline{P}_{2}(f)), \overline{P}_{2}(f)\right]. $$
Finally, for \(\overline {P}_{2}(f) \ge \frac {3}{8} \), the interval for \(\mathrm {d}_{\mathcal {A}}(f)\) is \(\left [{\mathrm {U}\mathrm {p}\mathrm {p}\mathrm {e}\mathrm {r}_{2}}^{-1}(\overline {P}_{2}(f)), \overline {P}_{2}(f)\right ]\). The estimate for \(\mathrm {d}_{\mathcal {A}}(f)\) becomes less and less precise (the interval length increases) as \(\overline {P}_{2}(f)\) increases. When \(\overline {P}_{2}(f)\) reaches \(\frac {1}{2}\), we obtain \(\mathrm {d}_{\mathcal {A}}(f) \in \left [\frac {1}{2\sqrt {6}}, \frac {1}{2}\right )\), an interval of length approximately 0.295.
Next we look at the nonhomomorphicity test f(u1 + ⋯ + uk) + f(u1) + ⋯ + f(uk) = 0 with odd k ≥ 3. We use the upper bound Upperk described in Theorem 2 and the lower bound Lowerk described in Corollary 8, illustrated for k = 3,5 in the second and third graph in the ??.
As x increases in the interval [0,0.5], Upperk(x) increases, whereas Lowerk(x) first increases from 0 to a local maximum \(y^{(k)}_{2}\), then decreases to the value \(y^{(k)}_{1} = \frac {1}{2}\left (1-\frac {1}{2^{k-1}} \right )\) (reached for \(x=\frac {1}{4}\)) and increases again to 0.5. Consequently, we have three cases. When \(0\le P_{k}(f) < y^{(k)}_{1}\) we have that
$$ \mathrm{d}_{\mathcal{A}}(f)\in \left[{\mathrm{Upper}_{k}}^{-1}(P_{k}(f)), \alpha^{(k)}(P_{k}(f))\right], $$
where for each 0 ≤ y ≤ 0.5 we denote by \(0<\alpha ^{(k)}(y)\le {\upbeta }^{(k)}(y)< \frac {1}{2}\) the two roots of the equation \(\frac {1}{2}\left (1- (1 - 2 x)^{k+1} - 2^{k+1}x^{k}(1-x)\right ) =y\). The length of this interval increases as Pk(f) increases from 0 to \(y^{(k)}_{1}\) (for illustration, it increases to a value of 0.028,0.016 and 0.011 for k = 3,5 and 7, respectively).
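The value \(y^{(k)}_{1}\) can be recovered directly by evaluating the expression in the root equation at \(x=\frac{1}{4}\); a short exact-arithmetic check of the thresholds quoted below for k = 3, 5, 7:

```python
from fractions import Fraction

def lower_k_expr(x, k):
    """The quantity (1/2)(1 - (1-2x)^(k+1) - 2^(k+1) x^k (1-x)) from the root equation."""
    return Fraction(1, 2) * (1 - (1 - 2 * x) ** (k + 1)
                             - 2 ** (k + 1) * x ** k * (1 - x))

for k in (3, 5, 7):
    # value at x = 1/4 equals y1 = (1/2)(1 - 1/2^(k-1))
    y1 = Fraction(1, 2) * (1 - Fraction(1, 2 ** (k - 1)))
    assert lower_k_expr(Fraction(1, 4), k) == y1

# thresholds as quoted in Table 1
assert float(Fraction(1, 2) * (1 - Fraction(1, 4))) == 0.375
assert float(Fraction(1, 2) * (1 - Fraction(1, 16))) == 0.46875
assert abs(float(Fraction(1, 2) * (1 - Fraction(1, 64))) - 0.492188) < 1e-6
```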
When \(y=P_{k}(f)\in [y^{(k)}_{1}, y^{(k)}_{2}]\), we have that
The length of the interval increases to a maximum and then decreases to 0 as Pk(f) reaches 0.5. For example, if k = 3,5,7, the length of the interval peaks at a value of 0.125, 0.0741 and 0.05273, respectively (achieved when Pk(f) = 0.469, 0.496 and 0.4995, respectively). The maximum length of the interval is achieved when Pk(f) is quite close to 0.5; the larger the value of k, the smaller the maximum length of the interval, that is, the more precisely we can estimate the nonlinearity.
We summarize these results in Table 1, which contains, for different values of k, the maximum length of the interval obtained when estimating the nonlinearity. The length is displayed firstly, for low values of Pk(f), namely the values in the interval \([0, y^{(k)}_{1}]\) discussed above. Secondly, the last column displays the maximum length of the interval for the remaining (higher) values of Pk(f).
Table 1
Precision of estimating the nonlinearity

k | Length of interval for \(\mathrm {d}_{\mathcal {A}}\) when Pk is low | Length of interval for \(\mathrm {d}_{\mathcal {A}}\) when Pk is high
2 | ≤ 0.058 (\(\overline {P}_{2}\le 0.3515625\)) | ≤ 0.295 (\(0.3515625 \le \overline {P}_{2}\le 0.5\))
3 | ≤ 0.028 (P3 ≤ 0.375) | ≤ 0.125 (0.375 ≤ P3 ≤ 0.5)
5 | ≤ 0.016 (P5 ≤ 0.46875) | ≤ 0.0741 (0.46875 ≤ P5 ≤ 0.5)
7 | ≤ 0.011 (P7 ≤ 0.492188) | ≤ 0.05273 (0.492188 ≤ P7 ≤ 0.5)
We also present in Table 2 the estimate of the nonlinearity \(\mathrm {d}_{\mathcal {A}}\) that would be obtained by this method for a few examples of functions, and compare it with the true value of the nonlinearity. The examples in this table are the ones in Example 5, f(x1,…,xn) = x1x2 + ⋯ + xm− 1xm with m = 2,4,6,8 and n ≥ m and the functions in Example 9 of the type f(x1,…,xn) = x1x2⋯xm with m = 3,4,5,6 and n ≥ m. We observe that for all these functions, the true value of the nonlinearity is at the top end of the estimated interval.
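For functions in a small number of variables, the true nonlinearity used as the reference in such a comparison can be computed exactly with the fast Walsh–Hadamard transform mentioned in the introduction. A sketch (bit-numbering conventions in the truth tables are our own); the returned value is the unnormalized nonlinearity \(2^{n-1}-\frac{1}{2}\max_{a}|W_{f}(a)|\):

```python
def walsh_nonlinearity(f, n):
    """Nonlinearity via the fast Walsh-Hadamard transform:
    2^(n-1) - max|W|/2, where f is a truth table of length 2**n."""
    w = [(-1) ** b for b in f]          # sign representation of f
    h = 1
    while h < 2 ** n:                   # in-place butterfly, O(n * 2^n)
        for i in range(0, 2 ** n, 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return 2 ** (n - 1) - max(abs(v) for v in w) // 2

# f = x1 x2 + x3 x4 in n = 4 variables: nonlinearity 6, i.e. 6/16 = 1/2 - 1/8
bent = [((((x >> 3) & (x >> 2)) ^ ((x >> 1) & x)) & 1) for x in range(16)]
assert walsh_nonlinearity(bent, 4) == 6

# f = x1 x2 x3 in n = 3 variables: nonlinearity 1, i.e. 1/8
prod = [((x >> 2) & (x >> 1) & x) & 1 for x in range(8)]
assert walsh_nonlinearity(prod, 3) == 1
```

Dividing the result by \(2^{n}\) gives the normalized quantity \(\mathrm{d}_{\mathcal{A}}(f)\) used throughout the text.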
We also examined experimentally random functions in up to 9 variables (see the fourth figure in the ??), plotting the probability of failure P3(f) as a function of the nonlinearity \(\mathrm {d}_{\mathcal {A}}(f)\). To obtain data for each possible value of the nonlinearity we started by randomly generating several functions for each possible weight lower than 0.5. We then computed their nonlinearity (for weights lower than 0.25 it is equal to the weight of the function, as the function is closer to the all-zero function than to any other affine function; for higher weights, the nonlinearity can be different from the weight, but many functions will have a nonlinearity close to their weight). We noticed that for functions in 7 or more variables most of the functions in our data have probability P3 of failing the test close to the upper bound for P3. This translates to the true value of the nonlinearity being at the low end of the estimated interval. We observed a similar situation for k = 5,7.
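The sampling experiment above can be sketched as follows, assuming the nonhomomorphicity test in the form stated earlier; the random function is generated as in the experiment, for n = 9:

```python
import random

def pk_estimate(f, n, k=3, trials=50_000, seed=7):
    """Monte Carlo estimate of the failure probability of the test
    f(u1 + ... + uk) + f(u1) + ... + f(uk) = 0, where sums of vectors
    in F_2^n are bitwise XORs and f is a truth table of length 2**n."""
    rng = random.Random(seed)
    fails = 0
    for _ in range(trials):
        us = [rng.randrange(2 ** n) for _ in range(k)]
        s = 0
        for u in us:
            s ^= u                      # u1 + ... + uk over F_2^n
        b = f[s]
        for u in us:
            b ^= f[u]                   # add f(u1), ..., f(uk)
        fails += b
    return fails / trials

n = 9
rng = random.Random(0)
f = [rng.randrange(2) for _ in range(2 ** n)]   # a random function
est = pk_estimate(f, n, k=3)
assert 0.0 <= est <= 1.0

# for an affine function and odd k, the k+1 complementations cancel,
# so the test never fails
aff = [1 ^ (bin(x & 0b1011).count("1") & 1) for x in range(2 ** n)]
assert pk_estimate(aff, n, k=3) == 0.0
```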
To conclude this section, we note that each test we considered is quite accurate in estimating nonlinearity when the probability of failing the test is small (and consequently the nonlinearity of the function is small), but the accuracy decreases as the probability of failing the test increases. If we were to apply different tests to the same function, we note that the estimated interval for the nonlinearity is least accurate when using the affinity test based on the BLR test. The tests based on (k + 1)-st order nonhomomorphicity with k odd have better accuracy, and this accuracy improves as k increases.
Acknowledgements
The authors are grateful to the editor for the prompt handling of our paper and to the reviewers for extensive and helpful comments and suggestions which have greatly improved the manuscript.
Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.