nach oben

Cryptography and Communications

Erschienen in:

Open Access 15.11.2021

Rational complexity of binary sequences, F$\mathbb {Q}$SRs, and pseudo-ultrametric continued fractions in $\mathbb {R}$

verfasst von: Michael Vielhaber, Mónica del Pilar Canales Chacón, Sergio Jara Ceballos

Erschienen in: Cryptography and Communications | Ausgabe 2/2022

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Patentsuche

Aus

Abstract

We introduce rational complexity, a new complexity measure for binary sequences. The sequence s ∈ B^ω is considered as binary expansion of a real fraction $s \equiv {\sum }_{k\in \mathbb {N}}s_{k}2^{-k}\in [0,1] \subset \mathbb {R}$. We compute its continued fraction expansion (CFE) by the Binary CFE Algorithm, a bitwise approximation of s by binary search in the encoding space of partial denominators, obtaining rational approximations r of s with r → s. We introduce Feedback in $\mathbb {Q}$ Shift Registers (F$\mathbb {Q}$SRs) as the analogue of Linear Feedback Shift Registers (LFSRs) for the linear complexity L, and Feedback with Carry Shift Registers (FCSRs) for the 2-adic complexity A. We show that there is a substantial subset of prefixes with “typical” linear and 2-adic complexities, around n/2, but low rational complexity. Thus the three complexities sort out different sequences as non-random.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

$\mathbb {N}$

$= \{1,2,3,\dots \}, \mathbb {N}_{0} = \{0,1,2,3,\dots \}$

$\mathbb {D}$

$:= \{a / 2^{k}\colon a\in \mathbb {Z} \text { odd}, k\in \mathbb {N} \text {\ or\ } a\in \mathbb {Z}, k=0\}$, dyadic fractions

$\mathbb {R}_{1},\mathbb {Q}_{1},\mathbb {D}_{1}\colon $

For $X\subset \mathbb {R}$, X₁ := X ∩ [0,1]

= {0,1} is the binary alphabet, with complement $\overline a = 1-a, a\in B$, e.g. $\overline {10011} = 01100$

$B^{*}=\{\varepsilon ,0,1,00,01,10,11,000,\dots \}$

and B^ω, are the finite, resp. infinite, words over B

B ⁺

= B^∗∖{ε}. For $v= v_{1}v_{2}{\dots } v_{|v|}\in B^{*}$: $(v)_{2} = {\sum }_{k=0}^{|v|-1} v_{|v|-k} 2^{k}\in \mathbb {N}_{0}$ in binary, (ε)₂ = 0

1 Introduction and motivation

Stream ciphers employ pseudorandom sequences as a replacement for one-time-pads. Such a cipher is deterministic, but should be indistinguishable from true randomness. A means to assess the quality of such a pseudorandom symbol stream s ∈ B^ω is the linear complexity profile (LCP), $(L(s,n), n\in \mathbb {N}_{0})$, among other tests. The expected value is L(s, n) ≈ n/2, and while deviations of $\pm \log _{2}(n)$ are typical, values with $|L(s,n) -n/2|> 2\cdot \log _{2}(n)$ indicate atypical “nonrandom” behaviour, see [15].

Klapper and Goresky introduced 2-adic complexity with profile $(A(s,n), n\in \mathbb {N}_{0})$, $A = 1 + \lfloor \log _{2} ({\Phi })\rfloor \in \mathbb {N}$ (see [8, p. 128] for Φ). They demand [6, p. 1]: “In general, one must consider this ‘2-adic span’ as a measure of security along with ordinary linear span.”

Linear complexity uses Linear Feedback Shift Registers (LFSRs) and approximates $G(s) = {\sum }_{k\in \mathbb {N}}s_{k}x^{-k}\in \mathbb {F}_{2}[[x^{-1}]]$ by rational functions from $\mathbb {F}_{2}(x)$, while 2-adic complexity employs Feedback with Carry Shift Registers (FCSRs) and approximates ${\sum }_{k\in \mathbb {N}}s_{k}2^{k-1}\in \mathbb {Z}_{2}$ by rational numbers $p/q\in \mathbb {Q}$ with odd denominator.

The new measure of rational complexity, the corresponding register type Feedback in $\mathbb {Q}$ Shift Register (F$\mathbb {Q}$SR), and thus this paper grew out of 3 observations:

Binary sequences can be interpreted as real numbers from [0,1] and approximated by rationals from $\mathbb {Q}_{1}$. No sequence measure seems to treat this point-of-view yet.
Approximations of $s\in \mathbb {R}_{1}$ by $P/Q\in \mathbb {Q}_{1}$ are notoriously difficult numerically due to the Archimedean, non-ultrametric nature of $\mathbb {R}$. Only recently, we developed a pseudo-ultrametric continued fraction expansion (CFE) algorithm which produces about one bit of CFE encoding for each additional bit of the sequence prefix, thus mimicking the Berlekamp-Massey algorithm (BMA) for L in this respect, see [22].
Comparing linear, 2-adic and rational complexities L(s, n),A(s, n) and R(s, n) of finite prefixes of a certain length n, it turns out that the measures assign low complexities to very different parts of the prefix set:

Motivating Example

Consider the three binary prefixes of length n = 24 given in Table 1. The columns L, A, R give the linear, 2-adic, and rational complexities, respectively.

Table 1

Prefixes with distinct complexities L, A, R

	Sequence prefix	Complexities
		L	A	R
(1)	1011.0010.0011.1101.0110.0100	4	12	14
(2)	0110.1001.1101.1110.0101.1000	12	5	14
(3)	0100.1111.0111.0010.1100.0010	12	14	5

As can be seen, in each case two complexities lie around n/2 = 12, which is near the expected average behaviour (see [15] for L, [16] for A: we only know (n − 1)/2 ≤ E(A) < 0.772 ⋅ n, and Section 5 for R), and thus the sequence is unremarkable from that point of view. Each sequence, however, has a markedly low value in one of the complexities, (1) for L, (2) for A, and (3) for R. Hence, from a cryptographic point of view, all three sequences should be discarded as non-pseudorandom. We need though all three tests, L, A, and R to encounter all three of them as atypical.

Like Klapper stated for the 2-adic complexity, we therefore should also consider the rational complexity R, defined in Section 5, as a means to assess randomness.

The general idea is to represent a given binary sequence as a real number

$$s = \sum\limits_{k\in\mathbb{N}} s_k2^{-k}\in\mathbb{R}_1\subset\mathbb{R}$$

and to obtain its CFE, and thereby good rational approximations. By reuse of notation, we will write s both for the binary sequence and the resulting real number.

Going in the direction from s to its CFE is a very tedious process over $\mathbb {R}$. It is not suited to bitwise, incremental computation with growing prefix length, since it needs the “full” precision (whatever that will be) right from the start. Instead, we define a sequence of CFEs (of rational numbers) that converge to the CFE of s, by doing a binary search in the space of all CFEs. Due to the results from [22], this is possible and efficient.

We obtain an encoding of the CFEs which is both monotonic

$$ s < s^{\prime} \Longleftrightarrow \text{encoding(CFE(}s)) < \text{encoding(CFE(}s^{\prime}))$$

(in $\mathbb {R}$ and in lexicographical order, respectively) and at the same time pseudo-ultrametric: If two sequences (or binary representations of numbers) s and $s^{\prime }$ differ in bit k, but not before, their encodings differ—in the limit for large k, for numbers $s,s^{\prime }$ satisfying the so-called Gauß-Kuz’min measure (Section 3)—for the first time around bit ⌊1.024 ⋅ k⌋, not much earlier and not much later: The information content provided by $s_{1},{\dots } , s_{k}$ is roughly preserveded and translated into ≈ 1.024 ⋅ k bits of information about the CFE of s.

The actual trial values $r\in \mathbb {Q}_{1}$ with r → s are taken from the so-called V₁₀ tree [22]. Let v be the address of r within this infinite binary tree (v = ε at the root, each move down appends a bit to v, 0 to the left, 1 to the right). Then the CFE encoding of r will be v10^ω ∈ B^ω.

Therefore apparently, since we move further and further down the V₁₀ tree, any address is a prefix of the CFE encoding of s: the prefix grows when moving down, previously defined bits are not altered afterwards.

This paper is divided in 3 parts.

First, we treat continued fractions (Section 2), encodings (Section 3), the Binary CFE algorithm (Section 4), define Rational Complexity (Section 5), and give an example.

The second part treats properties of rational complexity such as connections with 2-adic complexity for periodic sequences as well as impulse response sequences (Section 7). A comparison of the three complexity measures R, L, and A on 30-bit prefixes (a generalisation of the “Motivating Example”) is given in Section 8. We show that encodings and prefix lengths do not differ too much (the “pseudo-ultrametric” property) in Section 9. In Section 10, we introduce the Q Tree, providing an ultrametric ordering of $\mathbb {Q}_{1}$.

The final part covers implementation aspects: In Section 11, we show, how to efficiently update the convergents P/Q, when moving down the V₁₀ tree, by a Finite State Machine with Counter. Section 12 covers Feedback in $\mathbb {Q}$ Shift Registers (F$\mathbb {Q}$SR) which allow to efficiently compute the infinite rational sequence r—to be compared to the input s—from the two numbers P and Q. We finish with Open Problems.

2 Continued fraction expansions in $\mathbb {R}$ and $\mathbb {F}_{2}[[x^{-1}]]$

Case $\mathbb {R}$

The usual way to compute the CFE of a number $s := {\sum }_{k\in \mathbb {N}} s_{k}2^{-k}\in \mathbb {R}_{1}$ is

$$s^{(0)} := s,b_0 := 0, \forall i\in\mathbb{N}\colon s^{({i})} := \frac{1}{\{s^{({i-1})}\}} = \frac{1}{s^{({i-1})} - b_{i-1}}, b_i := \lfloor s^{(i)}\rfloor,$$

giving $\displaystyle s = 0 +\frac {1|}{| b_{1}}+\frac { 1|}{| b_{2}}+\cdots $, where the resulting b_i are the partial denominators (PDs).

We write $s = [b_{1},b_{2},\dots ]$ for the CFE.

The convergents P_i/Q_i to s are obtained via Perron’s schema, Table 2 (see [14], we use s = π − 3 as example). The initial values are P_− 2 = Q_− 1 = 0, Q_− 2 = P_− 1 = 1, and then P_i := b_i ⋅ P_i− 1 + P_i− 2, Q_i := b_i ⋅ Q_i− 1 + Q_i− 2 with $\left | s - P_{i}/Q_{i}\right | < Q_{i}^{-2}$ for $s\not \in \mathbb {Q}$, [14, Satz 2.10].

Table 2

Schema for PDs b_i and convergents P_i/Q_i, for s = π − 3

https://static-content.springer.com/image/art%3A10.1007%2Fs12095-021-00539-2/MediaObjects/12095_2021_539_Figa_HTML.png

Case $\mathbb {F}_{2}[[x^{-1}]]$

In the case of formal power series from $\mathbb {F}_{2}[[x^{-1}]]$, one computes the linear complexity via the Berlekamp-Massey Algorithm (BMA) [1, 11]. The BMA takes the sequence s of coefficients and computes the CFE

$$\mathbb{F}_2[[x^{-1}]] \ni G(s) = \sum\limits_{k=1}^{\infty} s_{kx}^{-k}= \frac{\hfill 1\hfill|}{|\hfill p_1(x)\hfill}+\frac{\hfill 1\hfill|}{|\hfill p_2(x)\hfill}+ \frac{\hfill 1\hfill|}{|\hfill p_{3}(x)\hfill}+\cdots,$$

with polynomials $p_{i}(x) \in \mathbb {F}_{2}[x]$ as partial denominators. With the adaptations by Dornstetter [3] and Vielhaber [19], the BMA computes an isometry K on $\mathbb {F}_{2}^{\omega }$ such that the modified discrepancy sequence d := K(s) (see [19])

$$\mathbf{K}(s) = d = (d_k) = \text{C}_{\mathbb{F}_{2}[\mathrm{x}]}(p_1)|\text{C}_{\mathbb{F}_{2}[\mathrm{x}]}(p_2)|\text{C}_{\mathbb{F}_{2}[\mathrm{x}]}(p_3)|\dots{} \in \mathbb{F}_{2}^{\omega}$$

is at the same time an encoding of the PDs, by

$$\text{C}_{\mathbb{F}_{2}[\mathrm{x}]}\colon \mathbb{F}_2[x]\backslash \mathbb{F}_2 \to \mathbb{F}_2^{*},\quad p(x) = \sum\limits_{k=0}^g a_kx^k\mapsto \text{C}_{\mathbb{F}_{2}[\mathrm{x}]}(p(x)) = 0^{g-1}1a_{g-1}{\dots} a_1a_0.$$

The degrees $g_{i} = \deg (p_{i})$ of the PDs determine the linear complexity profile of s (see [19]). For fast implementations see [2, 13].

Our approach is inspired by this ultrametric, bitwise model and will avoid the costly inversions s⁽ⁱ⁾ := 1/{s^(i− 1)} (which would have to be executed each time with the maximum prefix length corresponding to the eventually desired precision).

3 Binary CFE I: Gauß-Kuz’min measure, codes, monotonicity

3.1 Gauß-Kuz’min measure

To define an encoding for PDs, we start with the relevant facts obtained by Gauß, Kuz’min, Khinchin, and Lévy about continued fractions for $r\in \mathbb {R}\backslash \mathbb {Q}$ and their PDs b_i. The probability for a PD value of b is $\mu _{GK}(b) := \lim _{n\to \infty }{|\{i\ |1\leq i\leq n,\ b_{i}=b\}|}/{n}$.

Theorem 1

Gauß-Kuz’min-Khinchin-Lévy [5, 9, 10]

For almost all values $r\in \mathbb {R}$ (exceptions are rational, quadratic-algebraic and some more numbers, including $e^{2/k},k\in \mathbb {N}$), we have:

(i)

The probability for a partial denominator $b\in \mathbb {N}$, its Gauß-Kuz’min measure, is

$$\mu_{GK}(b) = -\log_2\left( 1-\frac{1}{(b+1)^2}\right) = \log_2\left( 1+\frac{1}{b(b+2)}\right).$$

(ii)

The geometric average of the partial denominators is Khinchin’s constant

$$K := \lim_{n\to\infty}\sqrt[n]{b_1\cdot b_2{\cdots} b_n} \doteq 2.68545.$$

(iii)

The average gain in precision, per partial denominator in bits, is

$$\displaystyle \frac{\pi^{2}}{6\ln(2)^{2}} \doteq 3.42371, \textit{with}~2^{3.42371/2}\doteq 3.27582~\textit{being L\'{e}vy's constant.} $$

3.2 Encodings C_I, C_II, and C

The encoding $C\colon \mathbb {R}_{1}\to B^{\omega }\backslash B^{*}1^{\omega }$ for CFEs, i.e. for the sequence of PDs from $\mathbb {N}$, shall satisfy the following 3 desiderata:

(GK) C is reasonably close to the Gauß-Kuz’min measure.

We want to be almost isometric that is the encoding length should advance in parallel with the sequence prefix, bit-by-bit (preservation of information content). The measure μ_GK(b) for the expected fraction of the PD value $b\in \mathbb {N}$ thus requires that an encoding for b has length around $-\log _{2}(\mu _{GK}(b))$.

(MON) C is monotonically increasing: $r < r^{\prime } \Longleftrightarrow C(r) < C(r^{\prime })$ in lexicographical order.

This allows to avoid the costly CFE computation for the real s altogether. Instead, we do a binary search in the encoding space $C(\mathbb {R}_{1})\subset B^{\omega }$ and compare the resulting (rational!) number r with s, where (MON) tells us, which of the two subintervals of the binary search remains valid, and (GK) ensures efficiency for this process.

(INC) C is incremental: Given the last two convergents P/Q and $P^{\prime }/Q^{\prime }$, from r = P/Q = C^− 1(v10^ω) we obtain

$$ C^{-1}(va10^{\omega}) = \frac{P\pm (P^{\prime}\cdot 2^{\delta_a})}{Q\pm (Q^{\prime}\cdot 2^{\delta_a})} \text{\ or\ } C^{-1}(va10^{\omega}) = \frac{2P+P^{\prime}}{2Q+^{\prime}Q}$$

for both values a ∈ B with some $\delta _{a}\in \mathbb {N}_{0}$. This yields an efficient implementation, where the multiplication P_i := b_i ⋅ P_i− 1 + P_i− 2 is replaced by iterated shift-and-add. This requirement (INC) is only relevant for fast implementations, see Section 11.

Definition 2

Encodings C_I, C_II, and C (Table 3)

Table 3

Codes C_I and C_II for partial denominators

b	C_I	C_II	l_I,II	l_GK	μ_GK
1	1	0	1	1.269	0.4150
2	011	100	3	2.557	0.1699
3	010	101	3	3.425	0.0931
4	00111	11000	5	4.086	0.0588
5	00110	11001	5	4.621	0.0406
6	00101	11010	5	5.071	0.0297
7	00100	11011	5	5.460	0.0227
8	0001111	1110000	7	5.802	0.0179
9	0001110	1110001	7	6.108	0.0144
10	0001101	1110010	7	6.384	0.0119
11	0001100	1110011	7	6.636	0.0100
12	0001011	1110100	7	6.868	0.0085
13	0001010	1110101	7	7.082	0.0073
14	0001001	1110110	7	7.282	0.0064
15	0001000	1110111	7	7.468	0.0056
16	000011111	111100000	9	7.644	0.0050
⋮
31	000010000	111101111	9	9.471	0.0014
32	00000111111	11111000000	11	9.559	0.0013
⋮
63	00000100000	11111011111	11	11.471	0.00035
64	0000001111111	1111110000000	13	11.516	0.00034
⋮
ℵ₀	0^ω	1^ω	–	–	–

We encode a PD $b\in \mathbb {N}$, $b = 2^{n} +{\sum }_{k=0}^{n-1} a_{k}2^{k}$ with $n := \lfloor \log _{2}(b)\rfloor $ as follows:

(i)

For PDs b_i with odd index i, let

$$\text{C}_{\mathrm{I}}\colon \mathbb{N}\to B^{*},\ b \mapsto \text{C}_{\text{I}}(b) = 0^n1\overline a_{n-1}{\dots} \overline a_1\overline a_0.$$

In particular, 1↦1;2↦011;3↦010. We also use the “pseudo-PD” ℵ₀ (Aleph-Zero), the first, countable infinity, with 1/ℵ₀ := 0, and set C_I: ℵ₀↦0^ω.

(ii)

For PDs b_i with even index i, we complement the codewords to obtain code C_II:

$$\text{C}_{\text{II}}\colon \mathbb{N}\to B^{*},\ b \mapsto \text{C}_{\text{II}}(b) = \overline{\text{C}_{\text{I}}(b)} = 1^n0a_{n-1}{\dots} a_1a_0.$$

In particular, 1↦0;2↦100;3↦101;4↦11000, and C_II(ℵ₀) := 1^ω.

Observe that both $\text {C}_{\mathbb {F}_{2}[\mathrm {x}]}$ and C_I,C_II are composed of a first half, encoding the size (degree / $\lfloor \log _{2}(b)\rfloor $), and a second half, describing the details from “coarse” (a_d− 1a^d− 1 / a_n− 12^n− 1) to “fine” (a₀x⁰ / a₀2⁰). C_I and C_II are prefixfree and complete prefixcodes.

(iii)

We concatenate encodings (rational finite CFE and irrational infinite CFE) as

$${}C\colon\mathbb{Q}_1\to B^{*}0^{\omega}, r =[b_1,b_2,\dots,b_{2l}] \mapsto C(r):= \text{C}_{\mathrm{I}}(b_1)|\text{C}_{\text{II}}(b_{2})|\dots|\text{C}_{\text{II}}(b_{2l})|\text{C}_{\mathrm{I}}(\aleph_0).$$

and

$$C\colon\mathbb{R}_1\backslash \mathbb{Q}_1\to B^{\omega} \backslash(B^{*}0^{\omega} \cup B^{*}1^{\omega}), r =[b_1,b_2,b_3,\dots] \mapsto C(r):=\text{C}_{\text{I}}(b_1)|\text{C}_{\text{II}}(b_{2})|\text{C}_{\text{I}}(b_{3})|\dots{}. $$

About C for $\mathbb {Q}_{1}$: A finite CFE can be represented in two ways, as $[\dots , b_{n},1] = [\dots , b_{n}+1]$, leading to the same value. We can resolve this ambiguity in 4 ways:

(i)

Let the last PD always be greater than 1, or

(ii)

always equal to 1, or

(iii)

always have an even, or (iv) always an odd number of PDs.

We use convention (iii), thus C always terminates in some C_II(b_2l), followed by the “pseudo-PD” ℵ₀ with C_I(ℵ₀) = 0^ω. We therefore have $B^{*}0^{\omega } = C(\mathbb {Q}_{1})$ as encodings of finite CFEs.

(GK) Columns l_I,II, with $l_{\text {I,II}}(b) = 1 + 2\lfloor \log _{2}(b)\rfloor $, and l_GK, with $l_{GK}(b) = -\log _{2}(\mu _{GK}(b))$, of Table 3 give the coding length for our code and an “ideal code” according to the Gauß-Kuz’min measure (which would lead to non-integral word lengths), respectively. As can be seen, the two columns have reasonably similar values for each b. The codes are from [22], but C_II already appears in Vallée [17] and she mentions Elias (reference [12] in [17]) as inventor. The complemented code C_I and the interleaving as C are though apparently new.

Comparing with L, the analogous measure for a polynomial PD $p(x) = x^{m}+{\sum }_{k=0}^{m-1}a_{k}x^{k}$ $\in \mathbb {F}_{2}[x]$ is μ(p) = 2^− 2m, and the encoding K (see [19]) is an isometry, with $|\text {C}_{\mathbb {F}_{2}[\mathrm {x}]}(p)| = 2m = -\log _{2}(\mu (p))$. In a sense, this paper tries to achieve the same in the setting of $\mathbb {R}$.

3.3 Monotonicity

We next show the monotonicity of $C\colon \mathbb {R}\to B^{\omega }$, where < is the usual order relation in $\mathbb {R}$, and refers to lexicographical order in B^ω. From continued fraction theory, we know the impact on r of changing a PD b_i:

Proposition 3

Let two real numbers $ r^{\prime } = [b_{1},\dots ,b_{i-1},b^{\prime }_{i},\dots ]$, $r^{\prime \prime }= [b_{1},\dots ,b_{i-1},b^{\prime \prime }_{i},\dots ] $ with $b_{i}^{\prime } < b_{i}^{\prime \prime }$ be given. Then $r^{\prime } < r^{\prime \prime }$ for even index i, and $r^{\prime } > r^{\prime \prime }$ for odd i, respectively.

Proof

See Satz 2.8 in Perron [14]. □

Theorem 4

(MON) Monotonicity of C(r)

Let $r^{\prime },r^{\prime \prime }\in [0,1)\subset \mathbb {R}$ with $r^{\prime } < r^{\prime \prime }$, and $C(r^{\prime }),C(r^{\prime \prime })$ their encodings.

Then $C(r^{\prime }) < C(r^{\prime \prime })$ in lexicographical order.

Proof

Follows with Proposition 3. As mentioned (see [14, Satz 2.8]), for odd index 2i + 1 (b_j, j < 2i + 1, fixed), $b^{\prime }_{2i+1} > b^{\prime \prime }_{2i+1}$ implies $r^{\prime } = [b_{1},\dots , b_{2i},b^{\prime }_{2i+1}] < r^{\prime \prime } = [b_{1},\dots , b_{2i}, b^{\prime \prime }_{2i+1}]$. Also, we then have $\text {C}_{\mathrm {I}}(b^{\prime }_{2i+1}) < \text {C}_{\mathrm {I}}(b^{\prime \prime }_{2i+1})$ and thus $C(r^{\prime }) < C(r^{\prime \prime })$ lexicographically.

For even index 2i + 2 (b_j, j < 2i + 2, fixed), we have $b^{\prime }_{2i+2} < b^{\prime \prime }_{2i+2}\Leftrightarrow r^{\prime } < r^{\prime \prime }\Leftrightarrow \text {C}_{\text {II}}(b^{\prime }_{2i+2}) < \text {C}_{\text {II}}(b^{\prime \prime }_{2i+2})\Leftrightarrow C(r^{\prime }) < C(r^{\prime \prime })$. □

4 Binary CFE II: Algorithm

The Binary Continued Fraction Expansion Algorithm (Table 4, left) builds upon the monotonicity of $C\colon \mathbb {R}_{1}\to B^{\omega }$ from sequences to encodings. We use the computationally easier inverse to obtain a trial value $r \in \mathbb {Q}_{1}$ from an encoding with finite support in B^∗10^ω, r := C^− 1(v10^ω),v ∈ B^∗. We then compare s to r with s < r ⇔ C(s) < C(r), Theorem 4.

Table 4

Binary CFE and approximations

https://static-content.springer.com/image/art%3A10.1007%2Fs12095-021-00539-2/MediaObjects/12095_2021_539_Figb_HTML.png

The function C^− 1 restricted to B^∗10^ω is represented in the V₁₀ Tree defined in [22], an infinite binary tree with labels, the binary analogue to the Stern-Brocot Tree. Table 4 (right) (see also Table 9) gives the first 5 levels of the V₁₀ Tree, and therefore the first 5 trial values (with |v|≤ 4) for any sequence s, with C^− 1(v10^ω) = [PDs$]=P/Q=r\in \mathbb {Q}_{1}$.

The | separates the encodings C_I,C_II and the “.” stands after the current prefix v.

The V₁₀ Tree arranges the $r\in \mathbb {Q}_{1}$ according to their CFE support length. The r with CFE v10^ω appears as label at address v on level |v| + 1, level 1 being the root node with address ε. The node at v has child nodes with addresses v0 (left) and v1 (right child).

Moving down the V₁₀ Tree with growing v, we successively shrink the set vB^ω of permissible encodings, whose center is the trial encoding v10^ω. Analogously to the BMA for L, we compute a discrepancy sequence D(s) = (d_k), where d_k = 0, if the approximation of s by r up to bit k − 1 also satisfies s_k = r_k; d_k = 1 otherwise, for s_k≠r_k. By definition, D is again an isometry on B^ω.

The bits within the sequences, input s, trial value r, discrepancies d, are indexed by k and the index is advanced in the FOREVER loop as long as s_k = r_k with d_k = D(s)_k := 0.

Whenever a discrepancy r_k≠s_k, d_k = D(s)_k := 1 occurs, we enter the WHILE loop to adjust the encoding, the CFE, and thereby the trial value r:

Either we have $(s_{1}{\dots } s_{k}) < (r_{1}{\dots } r_{k})$ lexicographically, and r is too large, has to decrease, Case ⊖: We set v := v0, leading to v0B^ω as new cylinder set of valid encodings, with the previous trial encoding v10^ω as upper bound. Thus v010^ω is the new trial encoding—the new center value.

Or, to the contrary, we had $(s_{1}{\dots } s_{k}) > (r_{1}{\dots } r_{k})$, Case ⊕: Then we adjust to v := v1, with v110^ω the center of v1B^ω. This interval halving or binary search within $B^{\omega } = C(\mathbb {R}_{1})$ allows to avoid the costly inversions s⁽ⁱ⁾ := 1/{s^(i− 1)} (see Section 2, CFE in $\mathbb {R}$), which in principle would have to be carried out with the full precision of |s|.

Given the new v (one new bit appended to the PD encoding), we generate the new convergent r = P/Q incrementally. We have to take three actions:

$v\mapsto v10^{\omega }\stackrel {C^{-1}}{\longmapsto } [b_{1},b_{2},\dots , b_{2l},\aleph _{0}]$, obtain the new sequence of PDs.

$[b_{1},b_{2},\dots , b_{2l}]\mapsto P/Q$, evaluate the encoded PDs by Perron’s schema.

For now, cut v10^ω into the pieces according to C_I,C_II in I, and then use Table 2 for II. Section 11 will show, how the new convergents P/Q are obtained almost effortlessly from the two previous ones, combining I and II, by property (INC).

III

P/Q↦r ∈ B^ω, expand in binary. Again, for now think “paper-and-pencil long division in binary / in base 2”. Section 12 will introduce F$\mathbb {Q}$SRs to do this efficiently.

If now s_i = r_i,1 ≤ i ≤ k, we leave the WHILE loop and return into the FOREVER loop, otherwise we refine the PD encoding by again extending v as long as necessary.

For sequences satisfying Theorem 1 (Gauß-Kuz’min), we enter the WHILE loop on average every second cycle of the FOREVER loop (for s_k≠r_k i.e. d_k = 1), and we show in Section 9 that $\lim _{k\to \infty } |v|/k \doteq 1.024$ in this case. Therefore, we stay within the WHILE loop for an average of 2.048 cycles for such sequences.

5 Rational complexity of a binary sequence

Before defining rational complexities, a comment is in order: The reals $\mathbb {R}_{1}$, and B^ω as their dyadic representations, are not topologically equivalent. Rather B^ω is isomorphic to the Cantor set (see Vepstas [18]). As a consequence, a dyadic fraction $r \in \mathbb {D}_{1}\subset \mathbb {Q}_{1}$ corresponds to the two sequences w10^ω and w01^ω for some w ∈ B^∗. We set the complexity for w01^ω to at most |w| + 2, including the first 1 of the “stuck-at” part in the length.

For 2^−k = 0^k− 110^ω = 0^k1^ω, this leads to R = k and k + 1, respectively. Apparently, for the latter, a k-bit register would be initialized with 0^k and could not generate 0^k1^ω.

We now define the rational complexity of a sequence in analogy to L and A.

Definition 5

Rational Complexity R(s, n)

(i)

Let R(s,0) := 0,∀s ∈ B^∗∪ B^ω. For allzero sequences, let $R(0^{\omega },n) := 0, n\in \mathbb {N}$ and R(0^k, n) := 0,1 ≤ n ≤ k.

(ii)

For sequences s ∈ B^∗∖({0}^∗∪ B^∗11) i.e. neither allzero, nor ending in 11, and for s ∈ B^ω∖({0^ω}∪ B^∗1^ω), the rational complexity of s is

$$R(s,n) := \lceil \log_2(Q_i)\rceil,$$

where P_i/Q_i = r is the first convergent of the Binary CFE matching r_k = s_k,1 ≤ k ≤ n. In view of Section 12, R(s, n) is the length of an F$\mathbb {Q}$SR producing the n-bit prefix of s.

(iii)

For sequences s = v01^k, k ≥ 2 or s = v01^ω with some v ∈ B^∗, let $R(s,n) = \min \limits (|v|+2, \lceil \log _{2}(Q_{i})\rceil )$, |v| + 2 ≤ n ≤|s|, with Q_i as in (ii). Also, $R(1^{\omega },n) := 1,\forall n\in \mathbb {N}$.

(iv)

Let the rational complexity profile for a sequence s ∈ B^∗ be (R(s, n),0 ≤ n ≤|s|). For an s ∈ B^ω, the profile is $(R(s,n), n\in \mathbb {N}_{0})$.

(v)

For ultimately periodic sequences s = x(y)^ω with x ∈ B^∗ and y ∈ B⁺, let $R(s) := \lim _{n\to \infty }R(s,n)$. Then R(s) is the length of the shortest F$\mathbb {Q}$SR producing s.

(vi)

The number of sequences of length n with rational complexity $c\in \mathbb {N}_{0}$ is denoted by N_R(n, c), and we set $N_{R}(c) := \lim _{n\to \infty } N_{R}(n,c)$.

Let $E(R(n)) := 2^{-n}\cdot {\sum }_{s\in B^{n}} R(s,n) = 2^{-n}\cdot {\sum }_{k=0}^{n} k\cdot N_{R}(n,k)$ be the average rational complexity for words of length n, and Var(R(n)) the corresponding variance. The distribution of rational complexities for words of length n ≤ 12, as well as E = E(R(n)) and Var = Var(R(n)) for n ≤ 24 are given in Table 5.

Table 5

Rational complexity: N_R(n, c), E(R(n)), Var(R(n))

https://static-content.springer.com/image/art%3A10.1007%2Fs12095-021-00539-2/MediaObjects/12095_2021_539_Figc_HTML.png

Theorem 6

Rational Complexity: Counts

The number N_R(c) is N_R(0) = 1 (for s = 0^ω), N_R(1) = 2 (for s = 10^ω,1^ω), and for c ≥ 2

$$ N_R(c) = 2^{c-2} + \sum\limits_{Q = 2^{c-1}+1}^{2^c} \varphi(Q),$$

where φ is Euler’s totient function.

Proof

The sum is the number of fractions P/Q with 2^c− 1 < Q ≤ 2^c,1 ≤ P < Q and P, Q coprime. The additional 2^c− 2 sequences correspond to the special cases v01^ω with v ∈ B^c− 2, the set of words over B of length c − 2, for dyadic fractions (Def. 5 (iii)). □

Conjecture 7

Rational Complexity: Average and Variance

The following limits exist and yield four small constants ${\nu _{E}^{0}},{\nu _{E}^{1}},{\nu _{V}^{0}},{\nu _{V}^{1}}$:

$$\nu_{E}^a := \lim_{\substack{n\to\infty\\n = a (\text{mod\ } 2)}}\mathrm{E}(R(n)) - (n/2 + 1)\text{\ \ and\ \ } \nu_{V}^a := \lim_{\substack{n\to\infty\\n = a (\text{mod\ } 2)}}\operatorname{Var}(R(n)) -5/4,\ \ a\in\{0,1\}.$$

The approximation of each PD may be alternating from above and below. Thus the rational complexity is not necessarily monotonic (the sequence of encodings v10^ω though has monotonically increasing support length |v| + 1). However, the error in monotonicity is at most one:

Proposition 8

For all sequences s ∈ B^ω and all $k_{1} < n < k_{2}\in \mathbb {N}$, we have

$$R(s,k_1)-1 \leq R(s,n) \leq R(s,k_2)+1.$$

Proof

A reduction of the absolute value of Q, and thereby possibly of R(s, n), may only occcur when the current PD b is decreased.

The PDs first increase through the values $2^{k}, k=1,2,3,\dots $ with either C_I(2^k − 1) = 0^k− 110^k− 1, C_II(1) = 0 or C_II(2^k) = 1^k− 10^k as last encodings before C_I(ℵ₀) = 0^ω, see Table 3 (also Section 11, Table 11). They then settle within the interval [2ⁿ,2^n+ 1] for some n — only this may decrease R.

The claim follows with $Q+Q^{\prime }\leq Q+2^{n}Q^{\prime } \leq Q+bQ^{\prime } \leq Q+2^{n+1}Q^{\prime } < 2(Q+2^{n} Q^{\prime })$. □

Remark 9

Being monotonic with growing prefix length is of course a desirable property of a complexity measure. We therefore define a monotonic version of R in the next definition. Note, however that this approach is no longer the Binary CFE with incremental bitwise consideration of both the sequence and the encoding.

Definition 10

Monotonic Rational Complexity $\overline R$

Let $\overline R(s,k) := R(s,k)$ for all sequences s and positions k ≤|s|, with two exceptions (following the algorithm from Table 4, left):

(i)

If R(s, k) is the result of the WHILE-loop with Case ⊖ in an encoding C_I or a case (C_II,⊕), then R(s, k) was computed from an upper bound for b. In this case, advance the encoding (append to v) until a situation (C_I,⊕) or (C_II,⊖), a lower bound for b, occurs within the current encoding, or until reaching the encoding of the next PD, irrespective of ⊕,⊖. If hereby the rational complexity decreases to R(s, k) − 1, take this value for $\overline R(s,k) := R(s,k)-1$, otherwise, keep $\overline R(s,k) = R(s,k)$.

(ii)

If, in a Case (√), we previously had $\overline R(s,k-1) = R(s,k-1) - 1$, keep that value, i.e. $\overline R(s,k) := \overline R(s,k-1) = R(s,k-1)-1 = R(s,k)-1$.

6 Example π

Example 11

The well-known CFE of $\pi \doteq \texttt {0x3.243F6A8885A308D}$ is $[3;7,15,1,292,1,\dots ]$. The Binary CFE Algorithm proceeds on π − 3 as in Table 6. We omit the final code 0^ω and PD ℵ₀. $\{\texttt {S}\in \texttt {A},\dots , \overline {\texttt {D}}\}$ is the state of the FSM+C (Section 11). $B^{\omega }\ni r\equiv p/q \in [0,1)\subset \mathbb {R}$ is given as prefix in hexadecimal. k gives the first place with s_k≠r_k. ⊖ indicates s < r, while ⊕ indicates s > r. R = R(k − 1); values (R) appear inside the WHILE loop.

Table 6

Approximating π

Read	`S`	v.10^ω	[(b_i)]	p/q	R	r	k	⊖ / ⊕
	`A`	.1\|0\|	[1, 1]	1/2	1	`8000`	1	⊖
s₁	B₁	0.10\|0\|	[3,1]	1/4	2	`4000`	2	⊖
s₂	B₂	00.100\|0\|	[7,1]	1/8	3	`2000`	6	⊕
$s_{3}{\dots } s_{6} $	C₂	001.10\|0\|	[5, 1]	1/6	(3)	`2AAA`	5	⊖
	`D`	0010.1\|0\|	[6,1]	1/7	3	`249249`	9	⊖
$s_{7}{\dots } s_{9}$	$\overline {\texttt {A}}$	00100.\|100\|	[7,2]	2/15	(4)	`2222`	6	⊕
	$\overline {\texttt {B}_{1}}$	00100\|1.1000\|	[7, 4]	4/ 29	(5)	`234`	6	⊕
	$\overline {\texttt {B}_{2}}$	00100\|11.10000\|	[7, 8]	8/ 57	(6)	`23E`	6	⊕
	$\overline {\texttt {B}_{3}}$	00100\|111.100000\|	[7,16]	16/113	7	`243F6F`	22	⊖
$s_{10}{\dots } s_{22}$	$\overline {\texttt {C}_{3}}$	00100\|1110.100\|	[7,12]	12/85	(7)	`242424`	12	⊕
	$\overline {\texttt {C}_{2}}$	00100\|11101.10\|	[7,14]	14/99	(7)	`2433B`	13	⊕
	$\overline {\texttt {D}}$	00100\|111011.1\|	[7,15]	15/106	(7)	`2439F`	14	⊕
	`A`	00100\|1110111.\|1\|0\|	[7,15,1,1]	31/219	(8)	`243CC`	15	⊕
	$\overline {\texttt {A}},\overline {\texttt {B}_{1..6}}$	⋮	[7, 15, 1, 2^j+ 1]			⋮		⊕
	$\overline {\texttt {B}_{7}}$	00..\|..11\|1\|1⁷.10⁹\|	[7, 15, 1, 256]		15	`243F69E52`	23	⊕
s₂₃	$\overline {\texttt {B}_{8}}$	00..\|..11\|1\|1⁸.10¹⁰\|	[7,15,1,512]		(16)	`243F6C728`	22	⊖
	$\overline {\texttt {C}_{8}}$	00..\|..11\|1\|1⁸0.10000000\|	[7,15,1,384]		16	`243F6B987`	24	⊖
s₂₄	$\overline {\texttt {C}_{7}}$	00..\|..11\|1\|1⁸00.1000000\|	[7,15,1,320]		16	`243F6AEA3`	26	⊖
s₂₅, s₂₆	$\overline {\texttt {C}_{6}}$	00..\|..11\|1\|1⁸000.100000\|	[7, 15, 1, 288]		(15)	`243F6A762`	25	⊕
	$\overline {\texttt {C}_{5}}$	00..\|..11\|1\|1⁸0001.10000\|	[7, 15, 1, 304]		16	`243F6AB33`	27	⊖
s₂₇	$\overline {\texttt {C}_{4}}$	00..\|..11\|1\|1⁸00010.1000\|	[7, 15, 1, 296]		16	`243F6A958`	28	⊖
s₂₈	$\overline {\texttt {C}_{3}}$	00..\|..11\|1\|1⁸000100.100\|	[7, 15, 1, 292]		16	`243F6A860`	29	⊕
s₂₉	$\overline {\texttt {C}_{2}}$	00..\|..11\|1\|1⁸0001001.10\|	[7, 15, 1, 294]		16	`243F6A8DD`	30	⊖
s₃₀	$\overline {\texttt {D}}$	00..\|..11\|1\|1⁸00010010.1\|	[7, 15, 1, 293]		16	`243F6A89F2`	32	⊖
s₃₁, s₃₂	`A`	00..\|..\|1\|..100\|.1\|0\|	[7,15,1,292,1,1]		(17)	`243F6A87FF`	29	⊕
	$\overline {\texttt {A}}$	00..\|..\|1\|..100\|1\|.100\|	[7,15,1,292,1,2]		17	`243F6A88A5`	35	⊖
$s_{33}{\dots } s_{35}$	`A`	00..\|..\|1\|..100\|1\|0\|.1\|0\|	[7,15,1,292,1,1,1,1]		18	`243F6A8863`	36	⊖

Given that the sequence π − 3 =.243F6A88$\dots $ fails to meet the suggested approximations in places $s_{1},s_{2},s_{6},s_{9},s_{22-24}, s_{26-30}, s_{32}, s_{35}\dots $, we obtain the discrepancy sequence D(0010.0110.0011.1111.0110.1010.1000.1000) = 1100.0100.1000.0000.0000.0111.0111.1101 = (d_k ) = d. The long 0-run between d₉ and d₂₂ is of course the result of the large PD 292 and the excellent approximation 16/113 (cf. 355/113 for π). Also, the encoding C(s) = v (compare with the ⊖ / ⊕ sequence), is $C(s) = 00100.1110111.1.11111111000100100.1\dots $.

7 Connections between rational and 2-adic complexities, and impulse response sequences

In the context of linear complexity, a periodic sequence $s = (s_{1},s_{2},\dots ,s_{T})^{\omega }$ of period T has linear complexity $T - \deg (\gcd (x^{T}-1, S))$, where $S := \sum s_{i}x^{i-1}\in \mathbb {F}_{2}[x]$ assembles the digits of the period. For rational complexity, we similarly have:

Proposition 12 (Rational complexity of periodic sequences)

Let $s=(s_{1},\dots ,s_{T})^{\omega }$ be a periodic sequence. Let $S := {\sum }_{i=1}^{T} s_{i}2^{T-i}\in \mathbb {N}$, and let g := gcd(2^T − 1,S). Then s has rational complexity $R(s) = T - \lfloor \log _{2}(g)\rfloor $.

Proof

s can be generated as P/Q with P := S and Q := 2^T − 1. Set $\tilde P := S/g$ and $\tilde Q :=(2^{T}-1)/g$. Then still $ s = \tilde P / \tilde Q = (S/g) / ((2^{T}-1)/g) = S / (2^{T}-1)$, which shows the upper bound $R(s) \leq \lceil \log _{2}((2^{T}-1)/g)\rceil $. As $\mathbb {Z}$ is a unique factorization domain, there is no simpler way to represent $\tilde P$ and $\tilde Q$. Since 2|̸g, $\log _{2}(g)$ is not integral, and thus $R(s) = \lceil \log _{2}((2^{T}-1)/g)\rceil = T - \lfloor \log _{2}(g)\rfloor $. □

In the following, $\hat s\in \mathbb {Q}\subset \mathbb {Z}_{2}$ refers to the 2-adic value of the ultimately periodic sequence s ∈ B^ω. Note that $\hat s\neq s \in \mathbb {Q}\subset \mathbb {R}$ (reuse of symbol s in $\mathbb {R}_{1}$ and B^ω) in general.

Theorem 13

Connection between Rational and 2-adic Complexity

(i)

Let s = y^ω be a purely periodic sequence. Then R(s) equals the 2-adic complexity of the sequence with reversed period, $R(y^{\omega }) = A((\stackrel {\leftarrow }{y})^{\omega })$.

The symmetric 2-adic complexity [4], $A_{\text {\scriptsize {\sc sym}}}(y^{\omega }) := \min \limits (A(\stackrel {}{y}^{\omega }), A((\stackrel {\leftarrow }{y})^{\omega }))$ is equal to $\min \limits (A(\stackrel { }{y}^{\omega }), R(\stackrel { }{y}^{\omega })) =\min \limits (R(\stackrel { }{y}^{\omega }), R((\stackrel {\leftarrow }{y})^{\omega }))$.

(ii)

Let s = xy^ω, x, y ∈ B⁺, be an ultimately periodic sequence. Then R(s) and the 2-adic complexity $A((\stackrel {\leftarrow }{x})(\stackrel {\leftarrow }{y})^{\omega })$ are both obtained from the numbers (x)₂,(y)₂,2^|x|,2^|y|− 1.

Proof

(i)

For purely periodic sequences s = y^ω, let $ \stackrel {\leftarrow }{s}:= (\stackrel {\leftarrow }{y})^{\omega }$. Then we have

$$ s = \frac{(y)_2}{2^{|y|}-1}\in\mathbb{R},\quad \hat s = \frac{-(\stackrel{\leftarrow}{y})_2}{2^{|y|}-1}\in\mathbb{Z}_2,\quad \stackrel{\leftarrow}{s} = \frac{(\stackrel{\leftarrow}{y})_2}{2^{|y|}-1}\in\mathbb{R},\quad \widehat{\stackrel{\leftarrow}{s}} = \frac{-(y)_2}{2^{|y|}-1}\in\mathbb{Z}_2 .$$

In this case, the reduced fractions $s=p/q, \stackrel {\leftarrow }{s}=m/n\in \mathbb {R}, \hat s = -m/n,\widehat {\stackrel {\leftarrow }{s}} = -p/q\in \mathbb {Z}_{2}$ satisfy 1 ≤ p < q and 1 ≤ m < n, q, n odd, see also [7, Cor. 2.1]. Hence, the rational complexity—of purely periodic sequences—can indeed be computed via the FCSR apparatus from 2-adic complexity:

$R(y^{\omega }) = A((\stackrel {\leftarrow }{y})^{\omega }) = \lceil \log _{2}(q)\rceil = 1+\lfloor \log _{2}(q)\rfloor $, if T > 1. The complexities also coincide for the special cases R(0^ω) = A(0^ω) = 0 and R(1^ω) = A(1^ω) = 1 (from $-1\in \mathbb {Z}_{2}$).

The claim about the symmetric 2-adic complexity follows immediately.

(ii)

For ultimately periodic sequences s = xy^ω with $s^{\prime } := \stackrel {\leftarrow }{x}(\stackrel {\leftarrow }{y})^{\omega }$, we similarly have

$$ s = 2^{-|x|}\cdot\left( (x)_2 + \frac{(y)_2}{2^{|y|}-1}\right)\in\mathbb{R},\ \hat s = (\stackrel{\leftarrow}{x})_2+ 2^{|x|}\cdot\frac{-(\stackrel{\leftarrow}{y})_2}{2^{|y|}-1},\ \widehat{s}^{\prime} = (x)_2+ 2^{|x|}\cdot\frac{-(y)_2}{2^{|y|}-1}\in\mathbb{Z}_2, $$

where the first term of the sum corresponds to the preperiod, the second to the period, in all formulae. s and $\widehat {s}^{\prime }$ are built up from the same 4 parts, thus so are R(s) and $A(s^{\prime })$, but (of course) s and $\widehat {s^{\prime }}$ are not necessarily equal, neither are R(s) and $A(s^{\prime })$.

□

Example 14

Let s = 1011(1100)^ω and t = 1101(0011)^ω, with $(1011)_{2}=11, (\overleftarrow {1011})_{2} = (1101)_{2} = 13, (1100)_{2}=12, (\overleftarrow {1100})_{2} = (0011)_{2} = 3$

Then $s = \frac {11}{16} + \frac {1}{16}\cdot \frac {12}{15} = \frac {177}{240} = \frac {59}{80}\in \mathbb {R}$ and $\hat s = 13 + 16 \cdot \frac {-3}{15} = \frac {195-48}{15} = \frac {147}{15} = \frac {49}{5}\in \mathbb {Z}_{2}$.

We have $R(s) = \lceil \log _{2}(80)\rceil = 7$, and $A(s) = 1+\lfloor \log _{2}(49)\rfloor = 6$.

Also, $t = \frac {13}{16} + \frac {1}{16}\cdot \frac {3}{15} = \frac {195+3}{240} = \frac {33}{40}\in \mathbb {R}$ and $\hat t = 11 + 16 \cdot \frac {-12}{15} = \frac {165-12}{15} = \frac {153}{15} = \frac {51}{5}\in \mathbb {Z}_{2}$.

We have $R(t) = \lceil \log _{2}(40)\rceil = 6$, and $A(t) = 1+\lfloor \log _{2}(51)\rfloor = 6$.

R(s)≠A(t) are both obtained using the numbers 11,12,2⁴,2⁴ − 1, while R(t) = A(s) use the numbers 13,3,2⁴,2⁴ − 1. The equality in the latter case is pure coincidence, R(t) stemming from the denominator, A(s) using the numerator.

Conjecture 15

Simulating R by A in O(n⁴)

There is an O(n⁴) algorithm to compute the rational complexity without using the Binary CFE, applying the apparatus from 2-adic complexity—and vice versa.

Proof

(Idea:) We can assume the separation into x and y after any position 0 ≤ k ≤ n within $s_{1}{\dots } s_{n}$, and then apply Theorem 13(ii) to each case. This incurs another factor n + 1 in complexity. We thus obtain n + 1 rational approximations to s that are all valid on the given n-bit prefix. Among these approximations, the one with smallest Q defines the rational complexity.

Let the current approximation to s = v... be $s \doteq (va{\dots } z)^{\omega }$.

We efficiently obtain z from v by first generating an FCSR for v, and then running the FCSR backwards. Thus, we need O(|v|) steps instead of the full period, which might be considerably larger.

Incrementally computing the best rational approximation to a sequence $s = v\overline {a}\dots $ from the approximation to the shorter prefix $s = v\dots $ via 2-adic complexity / FCSRs is not straightforward:

If the next bit after v in s happens to be a, the discrepancy is zero and the approximation is maintained. If however, s continues with $\overline {a} = 1-a$, then $s = v\overline {a}\dots $ will have a best approximation $s \doteq v_{1}(v_{2}\overline {a}{\dots } \tilde z)^{\omega }$, where firstly v possibly will be split into a preperiod v₁≠ε and the start v₂ of the new period, and secondly the end of the new period, $\tilde z$, has no direct relation with z. Also, the period length may grow or shrink considerably.

If the best approximation to v was already ultimately periodic with preperiod, things are similar. The length of the preperiod may change: $s=uv\dots \doteq uv^{\omega }$ may lead to $s=uvuv{\dots } \doteq (uv)^{\omega }$ for one extension, and to $s=uv10^{|uv|+1}\dots \doteq uv10^{\omega }$ in another case. In other words, the new preperiod may be empty or include the whole previous prefix, or anything in between.

Since the new periods terminate in different words, $\tilde z\neq z$, for each successive approximation— unless the previous value remains the same—we have to restart the FCSR from time step 1, a further factor n. Hence, we get a new approximation for a given separation position in O(n³). All in all, we can simulate rational approximation by 2-adic approximation in O(n⁴).

The four factors n correspond to (i) the operand lengths, (ii) the assumed length of the preperiod, i.e. the position of the separation, (iii) the restart time for the FCSRs with the reversed end of the new period, $\stackrel {\leftarrow }{\tilde z}$, and (iv) the growing prefix size of s.

As far as we can see, speed-up via FFT techniques is not feasible due to (iii), since every two steps on average (for i.i.d. sequences) we have to restart the FCSRs with the new value $\stackrel {\leftarrow }{\tilde z}$ instead of the $\stackrel {\leftarrow }{z}$ (Open Problem 7). □

Theorem 16

Periods of impulse response sequences

(i)

Given an odd $q\in \mathbb {N}$, let its impulse response sequence s ∈ B^ω be the binary expansion of 1/q. Then the period of s divides φ(q).

(ii)

The period of a/q is at most that of the impulse response sequence 1/q, for any a.

(iii)

For q = 2ⁿ − 1, the period of the associated sequence 1/q divides n. Hence, unlike the LFSR case, there are no sequences with period 2ⁿ − 1 and rational complexity n.

(iv)

Let p be a prime and 2 a generator of the multiplicative group $\mathbb {F}_{p}^{*}$. Then a/p has period p − 1 for all a≠ 0.

(v)

For $n\in \mathbb {N}$, N := 2ⁿ, we can expect a prime p with $\langle 2 \rangle = \mathbb {F}_{p}^{*}$ within the range [N − 2n, N − 1] (where N − 1 is excluded by (iii)).

Proof

Let ord_q(g) be the order of the (multiplicative) cyclic subgroup $\langle g\rangle \subset (\mathbb {Z}/q\mathbb {Z})^{*}$. For prime q = p we also write $\langle g\rangle \subset \mathbb {F}_{p}^{*}$. Our case is g = 2 for binary shifts.

(iii)

The residues mod q of 1/q are $\{2^{k} \ (\text {mod}~q)\ | \ k=1,2,\dots ,\operatorname {ord}_{q}(2)\} = \langle 2\rangle \subset (\mathbb {Z}/q\mathbb {Z})^{*}$, the group of units mod q. Since $|(\mathbb {Z}/q\mathbb {Z})^{*}|= \varphi (q)$, the claim now follows from Lagrange’s (subgroup index) theorem.

(ii)

The residues a ⋅ 2^k (mod q) for a/q differ from those for 1/q by a factor a. In particular they repeat after the same period, possibly having a shorter least period.

(iii)

$2^{n}=1 \ (\text {mod}~ q)\Longleftrightarrow \operatorname {ord}_{q}(2)\ |\ n$.

(iv)

Since 2 generates $\mathbb {F}_{p}^{*}$, i.e. 2^k passes through all nonzero residues mod p, the first p − 1 shifts of 1/p pass through all a/p with a≠ 0. Thus they all have the period p − 1.

(v)

By the Prime Number Theorem, the probability of being prime for some p near N is $1/\log _{e}(N)\doteq 1/ (0.69\cdot n)$. Assuming Artin’s conjecture (on primitive roots) [12] for prime p, 2 generates $\mathbb {F}_{p}^{*}$ with probability $C_{Artin} ={\prod }_{p} \left (1-\frac {1}{p(p-1)}\right ) \doteq 0.373955$. The joint probability C_Artin ⋅ 1/(0.69 ⋅ n) > 1/2n yields the result.

□

8 Comparison of R with A and L

Rational complexity R, linear complexity L, and 2-adic complexity A all assign a finite complexity to ultimately periodic sequences. However, the three complexities assign low values to quite different sequences / prefixes [20].

In Table 7, we give the number of 30-bit prefixes which have an unremarkable value in one of the complexities (L ≥ 14, A ≥ 14, and R ≥ 15, respectively), being higher than or at most one smaller than the expected value. We then classify these prefixes according to the remaining two complexities into 3 classes: expected value or higher (third column/row), off by at most $5\approx \log _{2}(30)$ (middle), and the definitely atypical case, off from the expected value by more than a logarithmic difference (first column/row).

Table 7

Number of sequences with one high complexity value, for L, A, and R at n = 30

	L ≥ 14				A ≥ 14
A :	R: 0..10	11..15	16..30	L:	R: 0..10	11..15	16..30
0..9	252	64425	123802	0..9	16	51795	107990
10..14	65256	65564743	149388020	10..14	49901	53639684	122678310
15..30	228863	255061144	592060508	15..30	240626	266201775	617172734
	R ≥ 15
A:	L : 0..9	10..14	15..30
0..9	14	32421	150405
10..14	31732	35724773	177284599
15..30	125736	139456530	700529908

In particular, we have in boldface the numbers of sequences, which are only recognized as low-complexity by rational complexity, but neither by linear, nor 2-adic complexity.

Over the set of prefixes of length 30, the three complexities L, A, and R are essentially uncorrelated, both in pairs and all three combined. Very low (0 or 1) and very high (30 = n) complexities correspond to the same sequences for L, A, and R, namely (0³⁰;1³⁰,10²⁹;0²⁹1) For most sequence prefixes, however, the three complexities behave statistically unrelated. We examine prefixes of length n = 30 in the following lemma.

Lemma 17

Low-Correlation-Lemma

Let #(L = l, A = a) = |{s ∈ B³⁰: L(s,30) = l ∧ A(s,30) = a}| etc.

Define the “fraction of non-overlapping of distributions” as

$$ \begin{array}{@{}rcl@{}} &&\rho_{LAR} = \sum\limits_{l=0}^{30}\sum\limits_{a=0}^{30}\sum\limits_{r=0}^{30} \left|\frac{\#(L=l, A=a,R=r)}{2^{30}} - \frac{\#(L=l)}{2^{30}}\cdot\frac{\#(A=a)}{2^{30}}\cdot\frac{\#(R=r)}{2^{30}}\right|,\\ &&\rho_{LA} = \sum\limits_{l=0}^{30}\sum\limits_{a=0}^{30} \left|\frac{\#(L=l, A=a)}{2^{30}} - \frac{\#(L=l)}{2^{30}}\cdot\frac{\#(A=a)}{2^{30}}\right|, \ \rho_{LR},\rho_{AR} \text{\ analogously.} \end{array} $$

Then $ \rho _{LAR}\doteq 0.00938, \rho _{LA} \doteq 0.00494, \rho _{AR} \doteq 0.00720, \rho _{LR} \doteq 0.00528. $

Proof

By numerical evaluation. More than 99% of all sequence prefixes behave according to the distribution expected for completely uncorrelated complexity measures, for all 4 correlations. □

9 Deviation from isometric behaviour

For the linear complexity, the encoding of PDs and the induced isometry, the discrepancy sequence, coincide. Both are given by K(s). In our case, the encoding C(s) and the discrepancy operator D(s), like K an isometry on B^ω, are different. This section investigates, by how much.

Lemma 18

Information Content of Prefixes from s, d = D(s), and C(s)

Let $s\in \mathbb {R}_{1}$ satisfy the Gauß-Kuz’min measure μ_GK. For a given prefix length $n\in \mathbb {N}$, let $r= P_{2i}/Q_{2i} = [b_{1},\dots ,b_{2i}]$ be the first approximation that matches the first n bits of s, i.e. r_k = s_k,1 ≤ k ≤ n.

Then asymptotically, encoding length and precision satisfy $\lim _{n\to \infty } \frac {{\sum }_{j=1}^{2i}l_{\text {I,II}}(b_{j})}{n} \doteq 1.024$.

Proof

From ${\sum }_{b\in \mathbb {N}}l_{\text {I,II}}(b)\cdot \mu _{GK}(b) \doteq 3.50698$, Theorem 1(iii), and $\frac {3.50698}{3.42371}$ $\doteq 1.024$. □

Lemma 18 shows that for numbers satisfying the Gauß-Kuz’min measure, the code length C(s) grows faster than the considered prefix length of the sequences s and d = D(s) by a factor of (only) 1.024. One could say that we miss our goal to mimic the isometric behaviour of K by just 2.4% (on average).

For numbers not satisfying the Gauß-Kuz’min measure, e.g. rationals or quadratic-algebraic numbers, the code may grow faster or slower. We compare the coding length l for ${\Phi }_{b} := [b,b,b,b,\dots ] = (\sqrt {b^{2}+4}-b)/2$ with its gain in precision. In this case, the convergents’ denominators (and numerators) asymptotically increase by a factor $\phi _{b} := (b+\sqrt {b^{2}+4})/2$ (= b + Φ_b), from $ \begin {array}{c|c|c} &&b_{i}=b\\ 1&\phi _{b}&b\cdot \phi _{b}+1 \end {array}$ (Perron’s schema) with $ b\cdot \phi _{b}+1\stackrel {!}{=}{\phi _{b}^{2}}\Longleftrightarrow {\phi _{b}^{2}}-b\phi _{b}-1=0$.

As |r − P_i/Q_i| decreases like ${\Theta }(Q_{i}^{-2}) = {\Theta }(\phi _{b}^{-2i})$ (since $r\not \in \mathbb {Q}$), see [14, Satz 2.10], each additional PD b asymptotically yields $L := \log _{2}({\phi _{b}^{2}})$ more bits with r_k = s_k.

Since b is encoded by $l = l_{\text {I,II}}(b) = 1+2\lfloor \log _{2}(b)\rfloor $ bits, we obtain the length extension for the encoding by

$$\gamma_b := \frac{l}{L} = \frac{1+2\lfloor \log_2(b)\rfloor}{2\cdot \log_2(\phi_b)}.$$

The values γ_b decrease monotonically with b, from b = 2^t to b = 2^t+ 1 − 1 with fixed l = 2t + 1, then jump to a local maximum at 2^t+ 1 etc. From Table 8, we can infer that γ₁ = 0.72 is the minimum with a saving of 28% in coding size, 1000 bits of precision require only 720 bits of C(s), while γ₄ = 1.20 is the maximum value for a length extension of 20%, the 1000 bits $s_{1},{\dots } s_{1000}$ require about 1200 bits of C(s). Hence, ϕ₁ has the shortest, ϕ₄ the longest code, compared with attained precision. Comparing with Lemma 18, the value 1.024 has to be replaced by the range [0.72, 1.20] to cover all $s\in \mathbb {R}_{1}\backslash \mathbb {Q}_{1}$.

Table 8

Length extension γ_b for fixed partial denominator and CFE for ϕ₁, ϕ₄

https://static-content.springer.com/image/art%3A10.1007%2Fs12095-021-00539-2/MediaObjects/12095_2021_539_Figd_HTML.png

For $s\in \mathbb {Q}_{1}$, we have l/L → 0 asymptotically, a finite CFE yields infinite precision.

10 The Q Tree: an ultrametric ordering of $\mathbb {Q}_{1} = \mathbb {Q}\cap [0,1)$

In this section, we generate the Q Tree (Table 9), an isometric version of the V₁₀ Tree. Its upper k levels, plus the allzero word, include all prefixes from B^k exactly once. We show, how to use the Q Tree to accelerate the Binary CFE.

Table 9

Address v ∈ B^∗, Q Tree (Q[v]), and V₁₀ Tree (C^− 1(v10^ω))

https://static-content.springer.com/image/art%3A10.1007%2Fs12095-021-00539-2/MediaObjects/12095_2021_539_Fige_HTML.png

First define an infinite binary tree with labels, the Expanded Q Tree: We read out the V₁₀ Tree [22] in breadth-first order, prepending 0/1, and put each value P/Q as label on every node whose address is a prefix of s = P/Q and that has not yet been labeled.

Example: The label s = 0/1 from 0^ω is put on nodes ε (the root), 0, 00, 000 etc. Then s = 1/2 = 10^ω labels 1, 10, 100, etc., while ε is already taken. s = 1/4 = 010^ω skips ε, 0 and labels from 01 onward. s = 2/3 = (10)^ω skips ε, 1, 10 and labels 101, 1010 and so on.

Theorem 19

Nodes and labels of the Expanded Q Tree

(i)

The labels P/Q = s of the 2^k nodes on level k yield each prefix v ∈ B^k exactly once.

(ii)

Exactly one of the two child nodes has the same label as the parent node.

Proof

(i)

By construction: Any prefix v ∈ B^∗ is met just at the node with address v.

(ii)

Once a node receives a label s, the same s moves further down the addresses and labels a— yet unlabeled—child node of each visited node.

□

To generate the Q Tree from the Expanded Q Tree, for each node replace its label by the one of its child nodes, which is different. This in particular removes all labels 0/1.

Theorem 20

Equidistribution of Prefixes in the Q Tree

(i)

The binary expansions of the labels of the 2^k − 1 nodes on levels $1{\dots } k-1$ of the Q Tree, plus the special value 0^k, include every prefix from B^k exactly once.

(ii)

In the Q Tree, each word v ∈ B^k∖{0^k} appears as prefix of the binary expansion of the label P/Q, at some point on the path from the root to the node with address $v_{1}{\dots } v_{|v|-1}$.

Proof

(i)

By construction and Theorem 19. All entries on level k from the Expanded Q Tree are preserved somewhere further up—except 0^ω.

(ii)

The prefix was present at address v in the Expanded Q Tree and has moved up by construction. However, all such places, where it can move are just those on the path from the root to the parent of the node with address v, which has address $v_{1}{\dots } v_{|v|-1}$.

□

Corollary 21

Writing out the Q Tree, preceeded by the value 0, yields an ultrametric ordering of $\mathbb {Q}_{1}$ (where the first 2^k entries of the sequence include all prefixes of length k exactly once, for all $k\in \mathbb {N}):$ $0/1,1/2,1/4,4/5,1/8,2/5\dots $.

The procedure of Table 4 (left part) is simplified considerably, if the Q Tree is known, see Table 10. Note that now the WHILE loop runs through r, s, and d. The original WHILE loop has been replaced by the single access to Q[v], whenever the current approximation does no longer hold, and at the exact address v for the new approximation P/Q. However, the Q Tree is as yet only a by-product after setting up the V₁₀ Tree, see Open Problem 6.

Table 10

Binary CFE using Q Tree

`1.` r := 0^ω; i := 1
`2. FOREVER`
`3. WHILE` (r_i = s_i) `BEGIN` d_i := 0`;i++ END`
`4.` $v = (r_{1}{}{\dots } r_{i-1});\ d_{i} := 1$`; i++`
`5.` r = `F`$\mathbb {Q}$`SR(Q[`v`])`
`// consult Q Tree at v, label is new P/Q for F`$\mathbb {Q}$`SR`
`6. ENDFOR`

11 Binary CFE III: states, incremental change of b_i, P_i, Q_i

Recalling Section 4, we start with the full cylinder set B^ω or $\mathbb {R}_{1}$ as possible encodings. Using the known correct prefix v (initially we know nothing, v = ε), we successively cut the cylinder set into half by probing with the trial value v10^ω↦C^− 1r, which sits at the center of the current cylinder set (we could as well have used v01^ω, yielding the same r, with convention (iv) for the ambiguity). This binary search leads to a new approximation:

$$v \mapsto v10^{\omega} = \text{C}_{\mathrm{I}}(b_1)|\text{C}_{\text{II}}(b_2)|\dots|\text{C}_{\text{II}}(b_{2l})|\text{C}_{\mathrm{I}}(\aleph_0) \mapsto [b_1,b_2,\dots, b_{2l}] = \frac{P}{Q}= r=C^{-1}(v10^{\omega}).$$

In order to obtain an efficient implementation of this part, we maintain the state of a Finite State Machine plus Counter (FSM+C) and the current as well as an auxilliary convergent, P/Q and $P^{\prime }/Q^{\prime }$, with initial values 1/2 and 0/1, respectively. The state set is $\{\texttt { A...D},\overline {\texttt {A}}...\overline {\texttt {D}}\}$, the counter appears as index j for states $\texttt {B}, \texttt {C}, \overline {\texttt {B}}, \overline {\texttt {C}}$. We start in state A.

The states A...D handle PDs with odd index 2i + 1, while states $\overline {\texttt {A}}...\overline {\texttt {D}}$ for PDs with even index operate exactly 0-1-inverse to A...D. Since we always maintain an even number of PDs, for an odd current index the PD value b_2l+ 1 is distributed as $[\dots , b_{2l+1}-1,1]$. This is the only difference between the two groups of states, apart from the bit complement. The update of the state and of the two convergents are shown in Tables 11 and 12 (for $Q,Q^{\prime }$ analogously). “<<” is the left shift. The first 2i PDs are fixed and omitted.

Table 11

Encodings and states

	IN: v.10^ω	OUT: ⊖ : v0.10^ω, for s < r, v_k = 0	Next-
		OUT: ⊕ : v1.10^ω, for s > r, v_k = 1	state
A	.1\|0\|≡ [1, 1]	0.10\|0\|≡ [3, 1]	B₁
		1\|.100\|≡ [1, 2]	$\overline {\texttt {A}}$
B_j	0^j.10^j\|0\|≡ [2^j+ 1 − 1, 1]	0^j0.10^j+ 1\|0\|≡ [2^j+ 2 − 1, 1]	B_j+ 1
		0^j1.10^j− 1\|0\|≡ [2^j + 2^j− 1 − 1, 1]	C_j, m := j
C_j	0^m1 ∗^m−j.10^j− 1\|0\|≡ [N, 1]	0^m1 ∗^m−j0.10^j− 2\|0\|≡ [N + 2^j− 2, 1]	C_j− 1
_j≥ 2		0^m1 ∗^m−j1.10^j− 2\|0\|≡ [N − 2^j− 2, 1]	C_j− 1
D	0^m1 ∗^m− 1.1\|0\|≡ [N, 1]	0^m1 ∗^m− 10\|.100\|≡ [N + 1, 2]	$\overline {\texttt {A}}$
= C₁		0^m1 ∗^m− 11\|.100\|≡ [N, 2]	$\overline {\texttt {A}}$
$\overline {\texttt {A}}$	$ \dots \|.100\| \equiv [\dots ,2]$	$\dots \|0\|.1\|0\| \equiv [\dots ,1,1,1] $	A
		$\dots \|1.1000\|\equiv [\dots ,4] $	$\overline {\texttt {B}_{1}}$
$\overline {\texttt {B}_{j}}$	$\dots \|1^{j}.10^{j+2}\|\equiv [\dots ,2^{j+1}]$	$\dots \|1^{j}0.10^{j-1}\equiv [\dots ,2^{j}+2^{j-1}]$	$\overline {\texttt {C}_{j}},m := j$
		$\dots \|1^{j}1.10^{j+3}\equiv [\dots ,2^{j+2}]$	$\overline {\texttt {B}_{j+1}}$
$\overline {\texttt {C}_{j}}$	$\dots \|1^{m}0*^{m-j}.10^{j-1}\|\equiv [\dots ,N]$	$\dots \|1^{m}0*^{m-j}0.10^{j-2}\|\equiv [\dots ,N-2^{j-2}]$	$\overline {\texttt {C}_{j-1}}$
_j≥ 2		$\dots \|1^{m}0*^{m-j}1.10^{j-2}\|\equiv [\dots ,N+2^{j-2}]$	$\overline {\texttt {C}_{j-1}}$
$\overline {\texttt {D}}$	$\dots \|1^{m}0*^{m-1}.1\|\equiv [\dots ,N]$	$\dots \|1^{m}0*^{m-1}0\|.1\|0\|\equiv [\dots ,N-1,1,1]$	A
$=\overline {\texttt {C}_{1}}$		$\dots \|1^{m}0*^{m-1}1\|.1\|0\|\equiv [\dots ,N,1,1]$	A

Table 12

Update of convergents

State	PD		⊖,v_k = 0		⊕,v_k = 1
	index	$P^{\prime +} := $	$P^+ := \dots $	${P^{\prime }+} := $	$P^+ := \dots $
`A`	2i + 1	$P^{\prime }$	$P + (P^{\prime }<<1)$	$P - P^{\prime }$	$(P{{<<}}1) - P^{\prime }$
B_j	2i + 1	$P^{\prime }$	$P + (P^{\prime }<<(j+1))$	$P^{\prime }$	$P - (P^{\prime }<<(j-1))$
C_j	2i + 1	$P^{\prime }$	$P + (P^{\prime }<<(j-2))$	$P^{\prime }$	$P - (P^{\prime }<<(j-2))$
D	2i + 1	P	$(P<<1)+P^{\prime }$	$P - P^{\prime }$	$(P<<1)-P^{\prime }$
$\overline {\texttt {A}}$	2i + 2	$P-P^{\prime }$	$(P<<1)-P^{\prime }$	$P^{\prime }$	$P+(P^{\prime }<<1)$
$\overline {\texttt {B}_{j}}$	2i + 2	$P^{\prime }$	$P - (P^{\prime }<<(j-1))$	$P^{\prime }$	$P + (P^{\prime }<<(j+1))$
$\overline {\texttt {C}_{j}}$	2i + 2	$P^{\prime }$	$P - (P^{\prime }<<(j-2))$	$P^{\prime }$	$P + (P^{\prime }<<(j-2))$
$\overline {\texttt {D}}$	2i + 2	$P - P^{\prime }$	$(P<<1) - P^{\prime }$	P	$(P<<1) + P^{\prime }$

Observe that no multiplication (as in P_i := b_i ⋅ P_i− 1 + P_i− 2, Q_i := b_i ⋅ Q_i− 1 + Q_i− 2) is necessary. It is replaced by an incremental shift-and-add. We thus obtain property (INC)—everything is done bitwise, incrementally, and only additions and shifts are used (the shifts by arbitrarily large j can be avoided by shifting $P^{\prime }, Q^{\prime }$ in each state $\texttt {B}_j, \texttt {C}_j, \overline {\texttt {B}_{j}}, \overline {\texttt {C}_{j}}$).

12 Feedback in $\mathbb {Q}$ shift registers F$\mathbb {Q}$SRs: Reduce-By-Feedback

We have to obtain the rational sequence r = P/Q to be compared with s. Paper-and-pencil long division in base 2 of P by Q (here P < Q is assumed) proceeds as in Table 13 (left).

Table 13

Algorithms: Division in $\mathbb {Q}$, Reduce-By-Feedback

https://static-content.springer.com/image/art%3A10.1007%2Fs12095-021-00539-2/MediaObjects/12095_2021_539_Figf_HTML.png

An F$\mathbb {Q}$SR is thus just a standard long accumulator X for values from $\mathbb {N}_0$ with a shift-and-add unit. It requires a full length compare in each cycle. The next part shows, how to replace that time-consuming compare by a 2-bit compare and a different way of keeping X (almost) within the range [0,Q).

Remark 22

In analogy with the LFSR case, we can consider FCSRs as “Fibonacci style” and F$\mathbb {Q}$SRs as “Galois style” implementation of feedback with carry.

If implementing in software, skip directly to Theorem 24.

An alternative implementation of division in $\mathbb {Q}$ and thus F$\mathbb {Q}$SRs, by Reduce-By-Feedback (see [21]), is shown in Table 13, right hand side.

Let $n := \lceil \log _2(Q)\rceil , N := 2^n$, hence N/2 < Q ≤ N. We use an (n + 2)-bit register for X, initially filled with P. Setting K ≡ N(mod Q) that is K := N − Q, we may replace X := X − Q by X := X − N + K.

Let m denote the two leftmost bits of the F$\mathbb {Q}$SR, of content 0,1,2, or 3. We replace the decision (X ≥ Q?), a long compare, by the two-bit-check (X ≥ N?) that is (m ≥ 1?).

If necessary, for m≠ 0, we subtract m ⋅ N (by setting m := 0) and compensate by adding the constant m ⋅ K. In this way, the contents of X is always correct mod Q. It may, however, exceed Q − 1, lying in the range Q ≤ X < N.

We now show that m ≤ 2 is always guaranteed.

Proposition 23

Overflow-freeness of the range m ∈{0,1,2}

Let 0 ≤ m ≤ 2. Then after one clock cycle we still have 0 ≤ m⁺ = ⌊(2X + mK)/N⌋≤ 2.

Proof

We have K = N − Q and N/2 < Q ≤ N. Therefore, 0 ≤ K < N/2 and 0 ≤ 2X + 0 ⋅ K ≤ 2X + m ⋅ K ≤ 2X + 2 ⋅ K < 2 ⋅ N + 2 ⋅ N/2 = 3N, which assures that the range {0,1,2} for m is also satisfied in the next step. □

We want the sequence r = (r_k) ∈{0,1}^ω, but instead we obtain (m_l) ∈{0,1,2}^ω. While we have ${\sum }_{k\in \mathbb {N}}r_k2^{-k} = {\sum }_{l\in \mathbb {N}}m_l2^{-l}$ by design, we must convert (m_l) into (r_k):

$$ \begin{array}{@{}rcl@{}} &&ctr := 0; k := 0\\ &&\texttt{FOR}~l = 1,2,\dots\\ &&{\kern5 mm} \texttt{IF}~m_l=0~\texttt{THEN Output}~(r_k=0, r_{k+1}={\dots} r_{k+ctr}=1); k := k+ctr+1; ctr := 0~\texttt{ENDIF}\\ &&{\kern5 mm} \texttt{IF}~m_l=1~\texttt{THEN}~ctr := ctr+1~\texttt{ENDIF}\\ &&{\kern5 mm} \texttt{IF}~m_l=2~\texttt{THEN Output} (r_k=1, r_{k+1}={\dots} r_{k+ctr}=0); k := k+ctr+1; ctr := 0~\texttt{ENDIF}\\ &&\texttt{ENDFOR} \end{array} $$

Essentially, a run of ones with leading zero is saved for later and output when m = 0. With m = 2, a carry ripples through the 1-run, converting it to 10^ctr. We start anew with a leading zero, not yet output. The value r₀ will always be zero and should be thrown away (since P < Q, we do never start with $m= 1^k2\dots $).

More on the implementation of F$\mathbb {Q}$SRs in hardware can be found in [21], including the use of Brickell’s delayed-carry-adder technique: Paper-and-pencil long division is a subset of RSA via modular exponentiation $\supseteq $ multiplication mod Q $\supseteq $ shift-and-add mod Q $\supseteq $ shift (no add) mod Q = long division.

Theorem 24

Fast Reinitialisation of F$\mathbb {Q}$SRs

We denote values at timestep $k\in \mathbb {N}$ by the superscript ^(k).

Consider 3 F$\mathbb {Q}$SRs with initial contents $P^{(1)},P^{\prime (1)},$ and $P^{\prime \prime (1)} := P^{(1)}+bP^{\prime (1)}$ for some $b\in \mathbb {N}$, with feedback values $Q, Q^{\prime }$, and $Q^{\prime \prime } := Q+bQ^{\prime }$, and where $N, N^{\prime },N^{\prime \prime }$ are the respective powers of 2 corresponding to the sizes of $Q, Q^{\prime }, Q^{\prime \prime }$.

Let m^(k) := 2 ⋅ P^(k) mod Q ∈{0,1}, and similarly $m^{\prime (k)}, m^{\prime \prime (k)}$ be the respective feedback bits and $P^{(k+1)} := 2P^{(k)}-m^{(k)}Q,\ \ P^{\prime (k+1)} := 2P^{\prime (k)}-m^{\prime (k)}Q^{\prime },\ \ P^{\prime \prime (k+1)} := 2P^{\prime \prime (k)}-m^{\prime \prime (k)}Q^{\prime \prime }$ be the register contents in the next timestep. Then:

(i) If $m^{(k)} = m^{\prime (k)}$ and the weighted sum property $ P^{\prime \prime (k)} = P^{(k)} + b P^{\prime (k)}$ holds, then also $m^{\prime \prime (k)} = m^{(k)} (= m^{\prime (k)})$ and the property remains valid for k + 1, $ P^{\prime \prime (k+1)} = P^{(k+1)} + b P^{\prime (k+1)}$.

(ii) Let r := P⁽¹⁾/Q⁽¹⁾ and $r^{\prime } := P^{\prime (1)}/Q^{\prime (1)}$ coincide for the first n bits, $r_k=r^{\prime }_k, 1\leq k\leq n$. Then $r^{\prime \prime } := P^{\prime \prime (1)}/Q^{\prime \prime (1)}$ also satisfies $r^{\prime \prime }_k = r_k, 1\leq k\leq n$, and the contents of the 3 F$\mathbb {Q}$SRs generating $r,r^{\prime }$, and $r^{\prime \prime }$ satisfy the weighted sum property $P^{\prime \prime (k)} = P^{(k)} + b\cdot P^{\prime (k)}$ for all timesteps 1 ≤ k ≤ n.

Proof

(i)

First assume $m=m^{\prime }=0$. Then $P<Q, P^{\prime } < Q^{\prime }$ and therefore $P^{\prime \prime } = P+bP^{\prime } < Q+bQ^{\prime }=Q^{\prime \prime }$, hence $m^{\prime \prime } = 0$. For $m=m^{\prime }=1$, analogously $P\geq Q, P^{\prime } \geq Q^{\prime }$ and therefore $P^{\prime \prime } = P+bP^{\prime } \geq Q+bQ^{\prime }=Q^{\prime \prime }$, hence $m^{\prime \prime } = 1$.

Also, $ P^{\prime \prime (k+1)} = 2P^{\prime \prime (k)} -m^{\prime \prime (k)}Q^{\prime \prime } = 2(P^{(k)}+bP^{\prime (k)}) - (m^{(k)}Q+m^{\prime (k)}bQ^{\prime }) = (2P^{(k)}-m^{(k)}Q) + b(P^{\prime (k)}-m^{\prime (k)}Q^{\prime }) = P^{(k+1)} + b\cdot P^{\prime }(k+1)$.

(ii)

By induction on k with (i), using $r_k = m^{(k)},r^{\prime }_k = m^{\prime (k)}$, and $r^{\prime \prime }_k = m^{\prime \prime (k)}$.

□

Remark 25

By Theorem 24, we may maintain two registers for the current and the previous convergent, P/Q and $P^{\prime }/Q^{\prime }$. Whenever a new F$\mathbb {Q}$SR with P⁺, Q⁺ is required, we set its current contents from $P, P^{\prime }$ by the weighted sum property, and may then drop $P^{\prime },Q^{\prime }$. We only have to repeat the (few) clock cycles starting from the last mismatch. Overall, we advance every contents P/Q, $P^{\prime }/Q^{\prime }$, or P⁺/Q⁺ from one mismatch to the one after the next, a total of 2n clock cycles or O(n).

Theorem 26

Bit Complexity

The bit complexity of computing the current best approximations P/Q, the discrepancy sequence D(s), the coding sequence C(s), and the rational complexity profile corresponding to the prefix $s_1,\dots , s_n$ of length n is O(n²).

Proof

Calculating P/Q requires a factor n for the growing wordlength and an amortized number of 2n cycles according to the previous theorem and remark, a total of O(n²) bit operations. D(s) = (d_k) and C(s) are obtained “on the fly” in the algorithm from Table 4 (left), in O(1) operations per cycle or a total of O(n). Observe that the comparison $(s_1{\dots } s_k) \lesseqqgtr (r_1{\dots } r_k)$ can effectively be done on a suffix, whose start index is maintained and updated separately for ⊖ and ⊕: We always have “r[⊖] < s < r[⊕]”, the r values from Cases ⊖ are strictly increasing towards s from below, the values for Cases ⊕ strictly decrease from above, and thus once a bit position matches the corresponding s_k, this fact will be maintained. Hence, for any index k, the corresponding s_k is matched exactly twice, once in a Case ⊖, once for a Case ⊕, a total of O(n). The rational complexity profile is obtained by comparing the length of the current Q against R(s, k − 1). Since R(s, n) ≤ n, the comparison is O(n) per step, O(n²) in total. □

Remark 27

This coincides with the complexity of computing L or A without using FFT techniques, O(n²). However, there are FFT implementations both for linear (Blackburn [2]) and 2-adic complexity (Klapper [8, p. 137]) that need only $O(n\log (n)^2\log \log (n))$ bit operations. See also Niederreiter and Vielhaber [13] for an algorithm that computes all discrepancy sequences of all shifted sequences $(\textbf {K}(s_k,\dots , s_n)_i, 1\leq i\leq n-k+1), 1\leq k\leq n$ in overall time O(n²), i.e. in a constant number of bit operations per output bit.

13 Open problems

When the approximation is very good, e.g. after n consecutive discrepancy bits d_k = 0, can we extend v directly by, say, 01^⌊0.72n⌋ or 10^⌊0.72n⌋, using the results in Section 9?

How, if at all, can the binary procedure be adapted to sequences over $\mathbb {F}_p$, p > 2?

Can we use FFT techniques to achieve an overall bit complexity of $O(n\log (n)^2\log \log (n))$ for the rational complexity, like in the case of L and A?

Give closed formulae for the rational complexity values N_R(n, c), E(R(n)) and Var(R(n)), for even and odd n. Describe the numbers $\nu _{E}^a\approx -0.06$ and $\nu _{V}^a$ from Conjecture 7.

How does the rational complexity behave for sequences with perfect autocorrelation, in particular compared with linear complexity and 2-adic complexity?

Can we construct the Q Tree without previously constructing the V₁₀ Tree via the FSM+C of Section 11? (this would also solve Problem 1)

Prove Conjecture 15. Can we speed up the simulation of rational complexity by 2-adic complexity (or vice versa) below the suggested O(n⁴) complexity?

R is a CFE on the Archimedean $\mathbb {R}$, while A is an isometry on $\mathbb {Z}_2$ (ultrametric, no CFE).

Can we reconcile R with A beyond the case of ultimately periodic sequences?

14 Conclusion

We have introduced Rational Complexity R as a novel complexity measure for binary sequences, a third way to assess the pseudorandomness of binary sequences by comparison with ultimately periodic sequences, besides linear and 2-adic complexity, L and A.

We presented the Binary CFE Algorithm, a pseudo-ultrametric, bit-by-bit version of the standard continued fraction expansion in $\mathbb {R}$, making use of Feedback in $\mathbb {Q}$ Shift Registers. The algorithm avoids the costly inversion s⁽ⁱ⁾ := {s^(i− 1)}^− 1 in $\mathbb {R}$.

We also defined two prefixfree complete codes, C_I and C_II, mapping the partial denominators from $\mathbb {N}$ to {0,1}^∗ and following the Gauß-Kuz’min measure. C_I and C_II yield a monotonic function $C\colon \mathbb {R}_1 \to B^{\omega }$, mapping reals to CFE encodings. We have seen that R, L and A differ pairwise on considerable parts of the prefix space B³⁰ and thus all three measures should be used to examine a given sequence for pseudo-randomness.

Acknowledgements

We sincerely thank the anonymous reviewers, both at SETA 2020 and Cryptography and Communications for their valuable and detailed suggestions.

Open AccessThis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Vorheriger Artikel Minimal linear codes from weakly regular bent functions

Nächster Artikel Improving bounds on probabilistic affine tests to estimate the nonlinearity of Boolean functions

Berlekamp, E.: Non-binary BCH decoding. TR North Carolina State University Department of Statistics (1966)

Blackburn, S.: Fast rational interpolation, Reed-Solomon decoding, and the linear complexity profiles of sequences. IEEE Trans. IT 43(2), 537–548 (1997)MathSciNetCrossRef

Dornstetter, J.L.: On the equivalence between Berlekamp’s and Euclid’s algorithm. IEEE Trans. IT 33(3), 428–431 (1987)MathSciNetCrossRef

Hu, H., Feng, D.: On the 2-adic complexity and the k-error 2-adic complexity of periodic binary sequences. In: SETA 2004 LNCS, vol. 3486, pp 185–197 (2005)

Khintchine, A.: Metrische kettenbruchprobleme. Comput. Math. 1, 361–382 (1935)MathSciNetMATH

Klapper, A., Goresky, M.: 2-Adic shift registers. In: Crypto ’95 LNCS, vol. 963, pp 262–273 (1995)

Klapper, A., Goresky, M.: Cryptanalysis based on 2-Adic rational approximation. In: Crypto ’95, LNCS, vol. 963, pp 262–273 (1995)

Klapper, A., Goresky, M.: Feedback shift registers, 2-adic span, and combiners with memory. J. Crypt. 10, 111–147 (1997)MathSciNetCrossRef

Kuzmin, R. O.: Sur une problème de Gauss. Atti Congr. Int. Bologna 6, 83–89 (1928)

10.

Lévy, P.: Sur les lois de probabilité dont dépendent les quotients complets et incomplets d’une fraction continue. Bull. S.M.F. 57, 178–194 (1929)MATH

11.

Massey, J.: Shift-register synthesis and BCH decoding. IEEE Trans. IT 15(1), 122–127 (1969)MathSciNetCrossRef

12.

Moree, P.: Artin’s primitive root conjecture—a survey. arXiv Math arXiv:0412262 (2012)

13.

Niederreiter, H., Vielhaber, M.: Simultaneous shifted continued fraction expansions in quadratic time. AAECC 9(2), 125–138 (1998)MathSciNetCrossRef

14.

Perron, O.: Die Lehre von den Kettenbrüchen, Bd. I. Teubner, Stuttgart 1954/1977

15.

Rueppel, R.: Analysis and Design of Stream Ciphers. Springer, Berlin (1986)CrossRef

16.

Tian, T., Qi, W.F.: Expected values for the rational complexity of finite binary sequences. Des. Codes Crypt. 55(1), 65–79 (2010)MathSciNetCrossRef

17.

Vallée, B.: Digits and continuants in euclidean algorithms Ergodic versus tauberian theorems. J. Théor. Nombres Bordeaux 12(2), 531–570 (2000)MathSciNetCrossRef

18.

Vepstas, L.: The Minkowsi Question Mark, PSL(2,z) and the Modular Group http://ns.linas.org/math/chap-minkowski.pdf (2020)

19.

Vielhaber, M.: Continued fraction expansion as isometry - the law of the iterated logarithm for linear, jump, and 2-adic complexity. IEEE Trans. IT 53(11), 4383–4391 (2007). Preprint: https://arxiv.org/pdf/cs/0511089.pdfMathSciNetCrossRef

20.

Vielhaber, M.: A unified view on sequence complexity measures as isometries. In: SETA 2004 LNCS, vol. 3486, pp 143–153 (2005)

21.

Vielhaber, M.: Reduce-by-feedback: timing resistant and DPA-aware modular multiplication, plus: how to break RSA by DPA. In: CHES 2012. Springer (2012)

22.

Vielhaber, M.: V Tree—continued fraction expansion, Stern-Brocot tree. Minkowski’s ?(x) Function In Binary: Exponentially Faster. arXiv:2008.08020 (2020)

Titel: Rational complexity of binary sequences, FSRs, and pseudo-ultrametric continued fractions in
verfasst von: Michael Vielhaber
Mónica del Pilar Canales Chacón
Sergio Jara Ceballos
Publikationsdatum: 15.11.2021
Verlag: Springer US
Erschienen in: Cryptography and Communications / Ausgabe 2/2022
Print ISSN: 1936-2447
Elektronische ISSN: 1936-2455
DOI: https://doi.org/10.1007/s12095-021-00539-2

	IN: v.10^ω	OUT: ⊖ : v0.10^ω, for s < r, v_k = 0	Next-
		OUT: ⊕ : v1.10^ω, for s > r, v_k = 1	state
A	.1\|0\|≡ [1, 1]	0.10\|0\|≡ [3, 1]	B₁
		1\|.100\|≡ [1, 2]	\(\overline {\texttt {A}}\)
B_j	0^j.10^j\|0\|≡ [2^j+ 1 − 1, 1]	0^j0.10^j+ 1\|0\|≡ [2^j+ 2 − 1, 1]	B_j+ 1
		0^j1.10^j− 1\|0\|≡ [2^j + 2^j− 1 − 1, 1]	C_j, m := j
C_j	0^m1 ∗^m−j.10^j− 1\|0\|≡ [N, 1]	0^m1 ∗^m−j0.10^j− 2\|0\|≡ [N + 2^j− 2, 1]	C_j− 1
_j≥ 2		0^m1 ∗^m−j1.10^j− 2\|0\|≡ [N − 2^j− 2, 1]	C_j− 1
D	0^m1 ∗^m− 1.1\|0\|≡ [N, 1]	0^m1 ∗^m− 10\|.100\|≡ [N + 1, 2]	\(\overline {\texttt {A}}\)
= C₁		0^m1 ∗^m− 11\|.100\|≡ [N, 2]	\(\overline {\texttt {A}}\)
\(\overline {\texttt {A}}\)	\( \dots \|.100\| \equiv [\dots ,2]\)	\(\dots \|0\|.1\|0\| \equiv [\dots ,1,1,1] \)	A
		\(\dots \|1.1000\|\equiv [\dots ,4] \)	\(\overline {\texttt {B}_{1}}\)
\(\overline {\texttt {B}_{j}}\)	\(\dots \|1^{j}.10^{j+2}\|\equiv [\dots ,2^{j+1}]\)	\(\dots \|1^{j}0.10^{j-1}\equiv [\dots ,2^{j}+2^{j-1}]\)	\(\overline {\texttt {C}_{j}},m := j\)
		\(\dots \|1^{j}1.10^{j+3}\equiv [\dots ,2^{j+2}]\)	\(\overline {\texttt {B}_{j+1}}\)
\(\overline {\texttt {C}_{j}}\)	\(\dots \|1^{m}0*^{m-j}.10^{j-1}\|\equiv [\dots ,N]\)	\(\dots \|1^{m}0*^{m-j}0.10^{j-2}\|\equiv [\dots ,N-2^{j-2}]\)	\(\overline {\texttt {C}_{j-1}}\)
_j≥ 2		\(\dots \|1^{m}0*^{m-j}1.10^{j-2}\|\equiv [\dots ,N+2^{j-2}]\)	\(\overline {\texttt {C}_{j-1}}\)
\(\overline {\texttt {D}}\)	\(\dots \|1^{m}0*^{m-1}.1\|\equiv [\dots ,N]\)	\(\dots \|1^{m}0*^{m-1}0\|.1\|0\|\equiv [\dots ,N-1,1,1]\)	A
\(=\overline {\texttt {C}_{1}}\)		\(\dots \|1^{m}0*^{m-1}1\|.1\|0\|\equiv [\dots ,N,1,1]\)	A

State	PD		⊖,v_k = 0		⊕,v_k = 1
	index	\(P^{\prime +} := \)	\(P^+ := \dots \)	\({P^{\prime }+} := \)	\(P^+ := \dots \)
`A`	2i + 1	\(P^{\prime }\)	\(P + (P^{\prime }<<1)\)	\(P - P^{\prime }\)	\((P{{<<}}1) - P^{\prime }\)
B_j	2i + 1	\(P^{\prime }\)	\(P + (P^{\prime }<<(j+1))\)	\(P^{\prime }\)	\(P - (P^{\prime }<<(j-1))\)
C_j	2i + 1	\(P^{\prime }\)	\(P + (P^{\prime }<<(j-2))\)	\(P^{\prime }\)	\(P - (P^{\prime }<<(j-2))\)
D	2i + 1	P	\((P<<1)+P^{\prime }\)	\(P - P^{\prime }\)	\((P<<1)-P^{\prime }\)
\(\overline {\texttt {A}}\)	2i + 2	\(P-P^{\prime }\)	\((P<<1)-P^{\prime }\)	\(P^{\prime }\)	\(P+(P^{\prime }<<1)\)
\(\overline {\texttt {B}_{j}}\)	2i + 2	\(P^{\prime }\)	\(P - (P^{\prime }<<(j-1))\)	\(P^{\prime }\)	\(P + (P^{\prime }<<(j+1))\)
\(\overline {\texttt {C}_{j}}\)	2i + 2	\(P^{\prime }\)	\(P - (P^{\prime }<<(j-2))\)	\(P^{\prime }\)	\(P + (P^{\prime }<<(j-2))\)
\(\overline {\texttt {D}}\)	2i + 2	\(P - P^{\prime }\)	\((P<<1) - P^{\prime }\)	P	\((P<<1) + P^{\prime }\)

Springer Professional

Rational complexity of binary sequences, F\(\mathbb {Q}\)SRs, and pseudo-ultrametric continued fractions in \(\mathbb {R}\)

Abstract

Publisher’s note

1 Introduction and motivation

2 Continued fraction expansions in \(\mathbb {R}\) and \(\mathbb {F}_{2}[[x^{-1}]]\)

3 Binary CFE I: Gauß-Kuz’min measure, codes, monotonicity

3.1 Gauß-Kuz’min measure

3.2 Encodings C_I, C_II, and C

3.3 Monotonicity

4 Binary CFE II: Algorithm

5 Rational complexity of a binary sequence

6 Example π

7 Connections between rational and 2-adic complexities, and impulse response sequences

8 Comparison of R with A and L

9 Deviation from isometric behaviour

10 The Q Tree: an ultrametric ordering of \(\mathbb {Q}_{1} = \mathbb {Q}\cap [0,1)\)

11 Binary CFE III: states, incremental change of b_i, P_i, Q_i

12 Feedback in \(\mathbb {Q}\) shift registers F\(\mathbb {Q}\)SRs: Reduce-By-Feedback

13 Open problems

14 Conclusion

Acknowledgements

Publisher’s note

Springer Professional

Abstract

Publisher’s note

1 Introduction and motivation

2 Continued fraction expansions in \(\mathbb {R}\) and \(\mathbb {F}_{2}[[x^{-1}]]\)

3 Binary CFE I: Gauß-Kuz’min measure, codes, monotonicity

3.1 Gauß-Kuz’min measure

3.2 Encodings CI, CII, and C

3.3 Monotonicity

4 Binary CFE II: Algorithm

5 Rational complexity of a binary sequence

6 Example π

7 Connections between rational and 2-adic complexities, and impulse response sequences

8 Comparison of R with A and L

9 Deviation from isometric behaviour

10 The Q Tree: an ultrametric ordering of \(\mathbb {Q}_{1} = \mathbb {Q}\cap [0,1)\)

11 Binary CFE III: states, incremental change of bi, Pi, Qi

12 Feedback in \(\mathbb {Q}\) shift registers F\(\mathbb {Q}\)SRs: Reduce-By-Feedback

13 Open problems

14 Conclusion

Acknowledgements

Publisher’s note

Weitere Artikel der Ausgabe 2/2022

Improving bounds on probabilistic affine tests to estimate the nonlinearity of Boolean functions

Differential and boomerang spectrums of some power permutations

On (− 1)-differential uniformity of ternary APN power functions

A class of affine-invariant codes and their support 2-designs

On the behavior of some APN permutations under swapping points

Hermitian LCD codes over and their applications to maximal entanglement EAQECCs

3.2 Encodings C_I, C_II, and C

11 Binary CFE III: states, incremental change of b_i, P_i, Q_i