We give an efficient classical probabilistic polynomial-time algorithm, essentially due to Miller [
15], for completely factoring
N given the order
r of a single element
g selected uniformly at random from
\({\mathbb {Z}}_N^*\). We furthermore analyze the runtime and success probability of the algorithm: In particular, we give a lower bound on its success probability.
3.1 Notes on our original intuition for this work
Given the order r of g, we can in general correctly guess the orders of a large fraction of the other elements of \({\mathbb {Z}}_N^*\) with high probability.
To see why this is, note that g is likely to have an order such that \(\lambda (N) / r\) is a moderately sized product of small prime factors. Hence, by multiplying small prime factors onto r, or dividing them off, we can guess \(\lambda (N)\), and by extension the orders of other elements in the group. This observation served as our original intuition for pursuing this line of work. In this paper, we do, however, take matters a few steps further:
In particular, instead of guessing the orders of individual elements in \({\mathbb {Z}}_N^*\), we instead guess some multiple of \(\lambda '(N)\). Furthermore, we show that even if we only manage to guess some multiple of a divisor of \(\lambda '(N)\), we are still often successful in recovering the complete factorization of N.
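This intuition is easy to illustrate numerically. The following is a toy sketch (the modulus \(N = 43 \cdot 47\) and the parameter choices are illustrative, not values from our analysis) that enumerates \({\mathbb {Z}}_N^*\) and checks, for each element of order r, whether multiplying r by the product of small prime powers \(q^{\eta(q, m')} \le m'\), as in step 2 of the algorithm in Sect. 3.2, yields a multiple of \(\lambda(N)\):

```python
from math import gcd, lcm

# Toy illustration (parameters illustrative, not from the analysis): for
# N = 43 * 47 we enumerate Z_N^* and count for how many elements g the
# order r, multiplied by the product S of prime powers q^eta(q, m') <= m',
# yields a multiple of lambda(N).
p1, p2 = 43, 47
N = p1 * p2
lam = lcm(p1 - 1, p2 - 1)                # lambda(N) = 966 = 2 * 3 * 7 * 23
divisors = [d for d in range(1, lam + 1) if lam % d == 0]

def order(g):
    # the order of g mod N divides lambda(N)
    return next(d for d in divisors if pow(g, d, N) == 1)

m = N.bit_length()                       # m = 11, so m' = m for c = 1
S = 1
for q in range(2, m + 1):
    if all(q % s for s in range(2, q)):  # q is prime, q <= m'
        e = 1
        while q ** (e + 1) <= m:
            e += 1
        S *= q ** e                      # S = 2^3 * 3^2 * 5 * 7 * 11

good = total = 0
for g in range(2, N):
    if gcd(g, N) == 1:
        total += 1
        if (order(g) * S) % lam == 0:    # r * S is a multiple of lambda(N)
            good += 1

print(good / total)                      # roughly 22/23 of all elements
```

Here only the factor 23 of \(\lambda(N)\) can be missing from \(r \cdot S\), since the remaining factors 2, 3 and 7 are already below \(m'\); this is the sense in which \(\lambda(N) / r\) is typically a product of small primes.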
3.2 The algorithm
In what follows, we describe a classical algorithm, essentially due to Miller [
15] (see the algorithm in Lemma 5 ERH), for completely factoring
N given a multiple of
\(\lambda '(N)\). We have slightly modified it, however, by adding a step in which we attempt to guess such a multiple, denoted
\(r'\) below, given the order
r of
g.
Furthermore, we select k group elements \(x_j\) uniformly at random from \({\mathbb {Z}}_N^*\), for \(k \ge 1\) some small parameter that may be freely selected, whereas Miller iterates over all elements up to some bound.
With these modifications, we shall prove that the resulting probabilistic algorithm runs in polynomial time, with the possible exception of the call to an order-finding algorithm in the first step, and analyze its success probability.
To be specific, the resulting algorithm first executes the below procedure once to find non-trivial factors of
N:
1. Select g uniformly at random from \({\mathbb {Z}}_N^*\). Compute the order r of g via an order-finding algorithm.

2. Let \({\mathcal {P}}(B)\) be the set of primes \(\le B\). Let \(\eta (q, B)\) be the largest integer such that \(q^{\eta (q, B)} \le B\). Let \(m' = cm\) for some constant \(c \ge 1\) that may be freely selected. Recall from the introduction that m is the bit length of N. Compute \(r' = r \prod _{q \, \in \, {\mathcal {P}}(m')} q^{\eta (q, m')}\).

3. Write \(r' = 2^t o\) where o is odd.

4. For \(j = 1, \, 2, \, \ldots , \, k\), for some \(k \ge 1\) that may be freely selected, do:

   4.1 Select \(x_j\) uniformly at random from \({\mathbb {Z}}_N^*\).

   4.2 For \(i = 0, \, 1, \, \ldots , \, t\) do:

      4.2.1 Compute \(d_{i, j} = \gcd (x_j^{2^i o} - 1, N)\). If \(1 < d_{i, j} < N\), report \(d_{i, j}\) as a non-trivial factor of N.
We then obtain the complete factorization from the
\(d_{i, j}\) reported as follows:
A set is initialized to contain only N before the above algorithm is executed. Each non-trivial factor reported is added to the set, and the set is kept reduced so that it contains only non-trivial, pairwise coprime factors. Furthermore, each factor in the set is checked for being a perfect power \(q^e\), in which case q is reported as a non-trivial factor. The algorithm succeeds if the set contains all distinct prime factors of N when the algorithm stops.
Recall from the introduction that there are efficient methods for reducing \(q^e\) to q, and methods for testing primality in probabilistic polynomial time.
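As a concrete illustration, the following is a minimal, non-optimized Python sketch of the procedure above together with the factor-collection step. The brute-force order finding, the trial-division primality test, the floating-point perfect-power reduction and the toy modulus are stand-ins chosen for demonstration on tiny N only; in our setting, step 1 is the call to an efficient order-finding algorithm:

```python
import random
from math import gcd

def multiplicative_order(g, N):
    # Stand-in for the order-finding call in step 1; brute force is
    # viable only for tiny demonstration moduli.
    r, x = 1, g % N
    while x != 1:
        x = x * g % N
        r += 1
    return r

def grow_order(r, m_prime):
    # Step 2: r' = r * prod over primes q <= m' of q^eta(q, m').
    for q in range(2, m_prime + 1):
        if all(q % s for s in range(2, q)):  # q is prime
            qe = q
            while qe * q <= m_prime:
                qe *= q
            r *= qe
    return r

def random_unit(N):
    while True:
        x = random.randrange(2, N)
        if gcd(x, N) == 1:
            return x

def report_factors(N, c=1, k=8):
    # One pass of steps 1-4: yield the non-trivial factors d_{i,j} found.
    g = random_unit(N)                           # step 1
    r = multiplicative_order(g, N)
    r_prime = grow_order(r, c * N.bit_length())  # step 2
    t, o = 0, r_prime
    while o % 2 == 0:                            # step 3: r' = 2^t * o
        t, o = t + 1, o // 2
    for _ in range(k):                           # step 4
        u = pow(random_unit(N), o, N)            # x_j^o mod N
        for _ in range(t + 1):                   # step 4.2: i = 0, ..., t
            d = gcd(u - 1, N)                    # step 4.2.1
            if 1 < d < N:
                yield d
            u = u * u % N

def split(f, d):
    # Split f into two coprime non-trivial parts using the divisor d,
    # or return None if d does not separate any prime content of f.
    a = gcd(f, d)
    if a in (1, f):
        return None
    b = f // a
    while (h := gcd(a, b)) > 1:                  # pull shared primes into a
        a, b = a * h, b // h
    return (a, b) if b > 1 else None

def perfect_power_root(d):
    # Reduce d = q^e to q; floating-point roots suffice for tiny values.
    for e in range(d.bit_length(), 1, -1):
        q = round(d ** (1.0 / e))
        if q > 1 and q ** e == d:
            return perfect_power_root(q)
    return d

def is_prime(n):
    # Trial division; a probabilistic primality test would be used instead.
    return n > 1 and all(n % s for s in range(2, int(n ** 0.5) + 1))

def complete_factorization(N, attempts=50):
    # Keep a set of pairwise coprime non-trivial factors, initially {N}.
    factors = {N}
    for _ in range(attempts):
        for d in report_factors(N):
            for f in list(factors):
                parts = split(f, d)
                if parts is not None:
                    factors.remove(f)
                    factors.update(parts)
        factors = {perfect_power_root(f) for f in factors}
        if all(is_prime(f) for f in factors):
            return sorted(factors)               # all distinct prime factors
    raise RuntimeError("failed to completely factor N")

random.seed(1)
print(complete_factorization(3 * 5 * 7 * 11))    # [3, 5, 7, 11]
```

Note that for such a tiny modulus the guess \(r'\) is always a multiple of \(\lambda(N)\), since all prime factors of \(\lambda(N)\) are below \(m'\); the sketch is therefore expected to succeed within very few attempts.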
3.2.1 Notes on efficient implementation
Note that the algorithm as described in Sect.
3.2 is not optimized: Rather, it is presented for ease of comprehension and analysis.
In an actual implementation it would, for example, be beneficial to perform arithmetic modulo \(N'\) throughout step 4 of the algorithm, for \(N'\) a composite divisor of N that is free of the prime factors that have already been found.
The algorithm would of course also stop incrementing j as soon as the factorization is complete, rather than after k iterations, and stop incrementing i as soon as \(x_j^{2^i o} \equiv 1 \,\, (\text {mod } N')\) rather than continue up to t. It would select \(x_j\) and g from \({\mathbb {Z}}_N^* \backslash \{ 1 \}\) rather than from \({\mathbb {Z}}_N^*\). In step 4.2.1, it would compute \(d_{i,j} = \gcd (u_{i,j} - 1, N')\), where \(u_{0,j} = x_j^o \text { mod } N'\), and \(u_{i,j} = u_{i-1,j}^2 \text { mod } N'\) for \(i \in [1, t]\), to avoid raising \(x_j\) to o repeatedly.
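The last of these modifications can be sketched as follows (the values of \(N'\), o, t and \(x_j\) below are toy choices for illustration, not values from the paper):

```python
from math import gcd

# Sketch of the u_{i,j} recurrence from the text: compute all d_{i,j} with
# one exponentiation to o followed by t squarings, instead of raising x_j
# to 2^i * o separately for each i. Toy values: N' = 91 = 7 * 13.
Np, o, t, x = 91, 3, 2, 4

u = pow(x, o, Np)              # u_{0,j} = x_j^o mod N'
ds = []
for i in range(t + 1):
    ds.append(gcd(u - 1, Np))  # d_{i,j} = gcd(u_{i,j} - 1, N')
    u = u * u % Np             # u_{i+1,j} = u_{i,j}^2 mod N'

# agrees with the direct computation d_{i,j} = gcd(x_j^(2^i * o) - 1, N')
assert ds == [gcd(pow(x, (1 << i) * o, Np) - 1, Np) for i in range(t + 1)]
print(ds)                      # [7, 91, 91]: non-trivial factor 7 at i = 0
```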
Ideas for potential optimizations: To further speed up the exponentiations, instead of raising each \(x_j\) to a pre-computed guess \(r'\) for a multiple of \(\lambda '(N)\), a smaller exponent that is a multiple of the order of \(x_j \text { mod } N'\) may conceivably be speculatively computed and used in place of \(r'\). To obtain more non-trivial factors from each \(x_j\), combinations of small divisors of the exponent may conceivably be exhausted, not only the powers of two that divide the exponent.
Missing factors: If an \(x_j\) is such that \(w_j = x_j^{N' r'} \not \equiv 1 \,\, (\text {mod } N')\), a factor q equal to the order of \(w_j \text { mod } N'\) is missing in the guess \(r'\) for a multiple of \(\lambda '(N')\). Should this lead the algorithm to fail to completely factor \(N'\), it may be worthwhile to attempt to compute the missing factor:
The options available include searching for q by exponentiating to all primes up to some bound, which is essentially analogous to increasing c, or using some form of cycle-finding algorithm that does not require a multiple of q to be known in advance.
In short, there are a number of optimizations that may be applied, but doing so above would obscure the workings of the algorithm, and the analysis that we are about to present. It is furthermore not necessary, since the algorithm as described is already very efficient.
3.2.2 Notes on order finding for a multiple of N
Note that the order-finding call in step 1 may be performed for a multiple of
N if desired. This can only cause
r to grow to some multiple of its value, which in turn can only serve to increase the success probability, in the same way that growing
r to
\(r'\) in step 2 serves to increase the success probability, see Sect.
3.3. In turn, this explains why we can replace
N with
\(N'\) as described in the previous section, and why a restriction to odd
N does not imply a loss of generality.
3.2.3 Notes on analogies with Miller’s, Rabin’s and Long’s works
Miller’s original version of the algorithm in Sect.
3.2 is deterministic, and proven to work only assuming the validity of the extended Riemann hypothesis (ERH), as is Miller’s primality test in the same thesis [
15].
Rabin [
18] later converted Miller’s primality test into a probabilistic polynomial-time algorithm that is highly efficient in practice. At about the same time, Long [
13], acknowledging ideas from Flajolet, converted Miller’s factoring algorithm into a probabilistic polynomial-time algorithm. This was done in a technical report that seemingly went unpublished.
More specifically, Long lower-bounds the probability of randomly selecting an element \(g \in \mathbb Z_N^*\) whose order r is a multiple of
\(\lambda '(N)\). He then gives a randomized version of Miller’s algorithm for splitting
N into two non-trivial factors given this multiple of
\(\lambda '(N)\).
The above is closely related to our work. We take matters a step further, however, by converting Miller’s factoring algorithm into an efficient probabilistic polynomial-time algorithm for recovering the complete factorization of N given the order r of a single element g selected uniformly at random from \(\mathbb Z_N^*\). We lower-bound the probability of the algorithm succeeding in recovering all factors of N given r. We show this probability to be very high by carefully considering cases where r is only a multiple of a divisor of \(\lambda '(N)\).
3.2.4 Notes on analogies with Pollard’s works
Miller’s algorithm may be regarded as a generalization of Pollard’s
\(p-1\) algorithm [
17]: Miller essentially runs Pollard’s algorithm for all prime factors
\(p_i\) in parallel by using that a multiple of
\(\lambda '(N) = \mathrm {lcm}(p_1 - 1, \ldots , p_n - 1)\) is known. Pollard, assuming no prior knowledge, uses a product of small prime powers up to some smoothness bound
B in place of
\(\lambda '(N)\). This factors out
\(p_i\) from
N if
\(p_i-1\) is
B-smooth, giving Pollard’s algorithm its name.
Since we only know r, a multiple of some divisor of \(\lambda '(N)\), we grow r to \(r'\) by multiplying on a product of small prime powers. This is in analogy with Pollard’s approach.
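For comparison, the following is a minimal textbook sketch of Pollard’s \(p-1\) algorithm (our illustrative implementation; the modulus and smoothness bound are toy choices, not taken from the paper). With \(N = 13 \cdot 103\) and \(B = 5\), the factor 13 is found since \(13 - 1 = 2^2 \cdot 3\) is B-smooth, whereas \(103 - 1 = 2 \cdot 3 \cdot 17\) is not:

```python
from math import gcd

def pollard_p_minus_1(N, B):
    # Raise a = 2 to every prime power q^eta(q, B) <= B; if p - 1 is
    # B-smooth for some prime p | N, then a becomes 1 mod p, and
    # gcd(a - 1, N) reveals p (unless the same happens for all p | N).
    a = 2
    for q in range(2, B + 1):
        if all(q % s for s in range(2, q)):  # q is prime
            qe = q
            while qe * q <= B:
                qe *= q
            a = pow(a, qe, N)                # a = 2^(prod q^eta) mod N
    d = gcd(a - 1, N)
    return d if 1 < d < N else None

print(pollard_p_minus_1(13 * 103, 5))  # 13, since 13 - 1 = 12 is 5-smooth
```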
3.3 Analysis of the algorithm
A key difference between our modified algorithm in Sect.
3.2 and the original algorithm in Miller’s thesis [
15] is that we compute the order of a random
g and then add a guessing step: We guess an
\(r'\) in step 2 that we hope will be a multiple of
\(p_i - 1\) for all
\(i \in [1, n]\), and if not all, then at least for all but one index on this interval, in which case the algorithm will still be successful in completely factoring
N.
This is shown in the below analysis. Specifically, we lower-bound the success probability and demonstrate the polynomial runtime of the algorithm. Throughout the analysis, and the remainder of this work, we use the notation implicitly introduced in the pseudocode for the algorithm in Sect.
3.2.
Furthermore, we use that for
g selected uniformly at random from
$$\begin{aligned} \mathbb Z^*_{N} \simeq \mathbb Z^*_{p_1^{e_1}} \times \ldots \times \mathbb Z^*_{p_n^{e_n}}, \end{aligned}$$
we have that
\(g \text { mod } p_i\) for
\(i \in [1, n]\) are selected independently and uniformly at random from the cyclic groups
\(\mathbb Z_{p_i}^*\). The same holds with
\(x_j\) in place of
g.
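This independence can be verified exhaustively for a toy squarefree modulus (an illustrative choice, not from the paper):

```python
from math import gcd

# Toy check of the isomorphism above: for N = 3 * 5, reducing the elements
# of Z_N^* mod 3 and mod 5 hits every pair in Z_3^* x Z_5^* exactly once,
# so a uniform g mod N yields independent uniform components g mod p_i.
N, p1, p2 = 15, 3, 5
units = [g for g in range(1, N) if gcd(g, N) == 1]
pairs = [(g % p1, g % p2) for g in units]
all_pairs = [(a, b) for a in range(1, p1) for b in range(1, p2)]
assert sorted(pairs) == sorted(all_pairs)  # bijection onto the product group
print(len(units))                          # phi(15) = 8
```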
By definition
q is said to divide
u iff
\(u \equiv 0 \,\, (\text {mod } q)\). Note that this implies that all
\(q \ne 0\) divide
\(u = 0\). This situation arises in the above proof of Lemma
2.
3.3.2 Main theorem
By the above main theorem, the algorithm will be successful in completely factoring N, if the constant c is selected so that \(1 / (2 c^{2} \log ^{2} cm)\) is sufficiently small, and if \(2^{-k}\) for k the number of iterations is sufficiently small in relation to \(\left( {\begin{array}{c}n\\ 2\end{array}}\right) \) for n the number of distinct prime factors in N. The latter requirement is easy to meet: Pick \(k \ge 2 \log n - 1 + \tau \) for some positive \(\tau \). Then \(2^{-k} \cdot \left( {\begin{array}{c}n\\ 2\end{array}}\right) \le 2^{-\tau }\).
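The latter bound is easy to verify numerically; the following is a small sketch of the arithmetic in the text (the parameter ranges are illustrative):

```python
from math import comb, ceil, log2

# Check that k >= 2 log2(n) - 1 + tau gives 2^(-k) * C(n, 2) <= 2^(-tau):
# 2^(-k) <= 2 * 2^(-tau) / n^2 and C(n, 2) < n^2 / 2, so the product
# is below 2^(-tau).
for n in range(2, 200):
    for tau in range(1, 8):
        k = ceil(2 * log2(n) - 1 + tau)
        assert comb(n, 2) * 2.0 ** (-k) <= 2.0 ** (-tau)
print("bound holds")
```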
The time complexity of the algorithm is dominated by k exponentiations of an integer modulo N to an exponent of length \(O(m)\) bits, and by the need to test the factors identified for primality. This is indeed very efficient.
Note furthermore that our analysis of the success probability of the algorithm is a worst-case analysis. In practice, the actual success probability of the algorithm is higher. Also, nothing in our arguments strictly requires c to be a constant: We can make c a function of m to further increase the success probability at the expense of working with \(O(c(m)\,m)\) bit exponents.