Dixon's factorization method
In number theory, Dixon's factorization method (also Dixon's random squares method[1] or Dixon's algorithm) is a general-purpose integer factorization algorithm; it is the prototypical factor base method. Unlike for other factor base methods, its run-time bound comes with a rigorous proof that does not rely on conjectures about the smoothness properties of the values taken by a polynomial.
The algorithm was designed by John D. Dixon, a mathematician at Carleton University, and was published in 1981.[2]
Basic idea
[edit ]Dixon's method is based on finding a congruence of squares modulo the integer N which is intended to factor. Fermat's factorization method finds such a congruence by selecting random or pseudo-random x values and hoping that the integer x2 mod N is a perfect square (in the integers):
- {\displaystyle x^{2}\equiv y^{2}\quad ({\hbox{mod }}N),\qquad x\not \equiv \pm y\quad ({\hbox{mod }}N).}
For example, if N = 84923, (by starting at 292, the first number greater than √N and counting up) the 5052 mod 84923 is 256, the square of 16. So (505 − 16)(505 + 16) = 0 mod 84923. Computing the greatest common divisor of 505 − 16 and N using Euclid's algorithm gives 163, which is a factor of N.
In practice, selecting random x values will take an impractically long time to find a congruence of squares, since there are only √N squares less than N.
Dixon's method replaces the condition "is the square of an integer" with the much weaker one "has only small prime factors"; for example, there are 292 squares smaller than 84923; 662 numbers smaller than 84923 whose prime factors are only 2,3,5 or 7; and 4767 whose prime factors are all less than 30. (Such numbers are called B-smooth with respect to some bound B.)
If there are many numbers {\displaystyle a_{1}\ldots a_{n}} whose squares can be factorized as {\displaystyle a_{i}^{2}\mod N=\prod _{j=1}^{m}b_{j}^{e_{ij}}} for a fixed set {\displaystyle b_{1}\ldots b_{m}} of small primes, linear algebra modulo 2 on the matrix {\displaystyle e_{ij}} will give a subset of the {\displaystyle a_{i}} whose squares combine to a product of small primes to an even power — that is, a subset of the {\displaystyle a_{i}} whose squares multiply to the square of a (hopefully different) number mod N.
Method
[edit ]Suppose the composite number N is being factored. Bound B is chosen, and the factor base is identified (which is called P), the set of all primes less than or equal to B. Next, positive integers z are sought such that z2 mod N is B-smooth. Therefore we can write, for suitable exponents ai,
- {\displaystyle z^{2}{\text{ mod }}N=\prod _{p_{i}\in P}p_{i}^{a_{i}}}
When enough of these relations have been generated (it is generally sufficient that the number of relations be a few more than the size of P), the methods of linear algebra, such as Gaussian elimination, can be used to multiply together these various relations in such a way that the exponents of the primes on the right-hand side are all even:
- {\displaystyle {z_{1}^{2}z_{2}^{2}\cdots z_{k}^{2}\equiv \prod _{p_{i}\in P}p_{i}^{a_{i,1}+a_{i,2}+\cdots +a_{i,k}}\ {\pmod {N}}\quad ({\text{where }}a_{i,1}+a_{i,2}+\cdots +a_{i,k}\equiv 0{\pmod {2}})}}
This yields a congruence of squares of the form a2 ≡ b2 (mod N), which can be turned into a factorization of N, N = gcd(a + b, N) ×ばつ (N/gcd(a + b, N)). This factorization might turn out to be trivial (i.e. N = N ×ばつ 1), which can only happen if a ≡ ±b (mod N), in which case another try must be made with a different combination of relations; but if a nontrivial pair of factors of N is reached, the algorithm terminates.
Pseudocode
[edit ]This section is taken directly from Dixon (1981).
Dixon's algorithm
Initialization. Let L be a list of integers in the range [1, n], and let P = {p1, ..., ph} be the list of the h primes ≤ v. Let B and Z be initially empty lists (Z will be indexed by B).
Step 1. If L is empty, exit (algorithm unsuccessful). Otherwise, take the first term z from L, remove it from L, and proceed to Step 2.
Step 2. Compute w as the least positive remainder of z2 mod n. Factor w as:
{\displaystyle w=w'\prod _{i}p_{i}^{a_{i}}}
where w′ has no factor in P. If w′ = 1, proceed to Step 3; otherwise, return to Step 1.
Step 3. Let a ← (a1, ..., ah). Add a to B and z to Z. If B has at most h elements, return to Step 1; otherwise, proceed to Step 4.
Step 4. Find the first vector c in B that is linearly dependent (mod 2) on earlier vectors in B. Remove c from B and {\displaystyle z_{c}} from Z. Compute coefficients {\displaystyle f_{b}} such that:
{\displaystyle \mathbf {c} \equiv \sum _{b\in B}f_{b}\mathbf {b} {\pmod {2}}}
Define:
{\displaystyle \mathbf {d} =(d_{1},\dots ,d_{n})\gets {\frac {1}{2}}\left(\mathbf {c} +\sum f_{b}\mathbf {b} \right)}
Proceed to Step 5.
Step 5. Compute:
{\displaystyle x\gets z_{c}\prod _{b}z_{b}^{f_{b}},\quad y\gets \prod _{i}p_{i}^{d_{i}}}
so that:
{\displaystyle x^{2}\equiv \prod _{i}p_{i}^{2d_{i}}=y^{2}\mod n.}
If {\displaystyle x\equiv y} or {\displaystyle x\equiv -y{\pmod {n}}}, return to Step 1. Otherwise, return:
{\displaystyle \gcd(n,x+y)}
which provides a nontrivial factor of n, and terminate successfully.
Step-by-step example : factorizing (n = 84923) using Dixon's algorithm
[edit ]This example is lifted directly from the LeetArxiv substack.[3] Credit is given to the original author.
- Initialization:
- Define a list of numbers L, ranging from 1 to 84923:
- {\displaystyle L=\{1,\dots ,84923\}}
- Define a value v, which is the smoothness factor:
- {\displaystyle v=7}
- Define a list P containing all the prime numbers less than or equal to v:
- {\displaystyle P={2,3,5,7}}
- Define B and Z, two empty lists. B is a list of powers, while Z is a list of accepted integers:
- {\displaystyle B=[]}
- {\displaystyle Z=[]}
- Step 1: Iterating {\displaystyle z} values
- Write a for loop that indexes the list {\displaystyle L}. Each element in {\displaystyle L} is indexed as {\displaystyle z}. The for loop exits at the end of the list.
intn=84923; for(inti=1;i<=n;i++) { intz=i; }
- Step 2: Computing {\displaystyle z^{2}\mod n} and v-smooth Prime Factorization
- To proceed, compute {\displaystyle z^{2}\mod 84923} for each z, then express the result as a prime factorization.
- {\displaystyle 1^{2}\mod 84923\equiv 1\mod 84923=2^{0}\cdot 3^{0}\cdot 5^{0}\cdot 7^{0}\mod 84923}
- {\displaystyle \vdots }
- {\displaystyle 513^{2}\mod 84923=8400\mod 84923=2^{4}\cdot 3^{1}\cdot 5^{2}\cdot 7^{1}\mod 84923}
- {\displaystyle \vdots }
- {\displaystyle 537^{2}\mod 84923=33600\mod 84923=2^{6}\cdot 3^{1}\cdot 5^{2}\cdot 7^{1}\mod 84923}
- {\displaystyle 538^{2}\mod 84923=34675\mod 84923=5^{2}\cdot 19^{1}\cdot 73^{1}\mod 84923}
- This step continues for all values of z in the range.
- Step 3: If {\displaystyle z^{2}\mod 84923} is 7-smooth, then append its powers to list {\displaystyle B} and append {\displaystyle z} to list {\displaystyle Z}.
- {\displaystyle Z=\{1,513,537\}}
- {\displaystyle B=\{[0,0,0,0],[4,1,2,1],[6,1,2,1]\}}
- Step 4: This step is split into two parts.
- Part 1: Find {\displaystyle B} modulo 2.
- {\displaystyle B={\begin{pmatrix}0&0&0&0\4円&1&2&1\6円&1&2&1\end{pmatrix}}\mod 2\equiv B={\begin{pmatrix}0&0&0&0\0円&1&0&1\0円&1&0&1\end{pmatrix}}}
- Part 2: Check if any row combinations of {\displaystyle B} sum to even numbers
- For example, summing Row {\displaystyle 2} and Row {\displaystyle 3} gives us a vector of even numbers.
- {\displaystyle R_{2}=\{0,1,0,1\}} and {\displaystyle R_{3}=\{0,1,0,1\}}
- then
- {\displaystyle R_{2}+R_{3}=\{0,1,0,1\}+\{0,1,0,1\}}
- {\displaystyle R_{2}+R_{3}=\{0,2,0,2\}}.
- Step 5 : This step is split into four parts.
- Part 1. (Finding x): Multiply the corresponding {\displaystyle z} values for the rows found in Step 4. Then find the square root. This gives us {\displaystyle x}.
- For Row 2, we had {\displaystyle 2^{4}*3^{1}*5^{2}*7^{1}}.
- For Row 3, we had {\displaystyle 2^{6}*3^{1}*5^{2}*7^{1}}.
- Thus, we find {\displaystyle x} :
- Part 1. (Finding x): Multiply the corresponding {\displaystyle z} values for the rows found in Step 4. Then find the square root. This gives us {\displaystyle x}.
- {\displaystyle {\begin{array}{ll}(513\cdot 537)^{2}\mod 84923=y^{2}\\\\{\text{where}}\quad x^{2}\mod 84923=(513\cdot 537)^{2}\mod 84923\\\\{\text{thus}}\quad x=(513\cdot 537)\mod 84923\\\\{\text{so}}\quad x=275481\mod 84923\\\\{\text{Finally}}\quad x=20712\mod 84923\\\end{array}}}
- Part 2. (Finding y): Multiply the corresponding smooth factorizations for the rows found in Step 4. Then find the square root. This gives us {\displaystyle y}.
- {\displaystyle {\begin{array}{ll}y^{2}=(2^{4}\cdot 3^{1}\cdot 5^{2}\cdot 7^{1})\times (2^{6}\cdot 3^{1}\cdot 5^{2}\cdot 7^{1})\\\\{\text{By the multiplication law of exponents,}}\\y^{2}=2^{(4+6)}\cdot 3^{(1+1)}\cdot 5^{(2+2)}\cdot 7^{(1+1)}\\\\{\text{Thus,}}\\y^{2}=2^{10}\cdot 3^{2}\cdot 5^{4}\cdot 7^{2}\\\\{\text{Taking square roots on both sides gives}}\\y=2^{5}\cdot 3^{1}\cdot 5^{2}\cdot 7^{1}\\\\{\text{Therefore,}}\\y=32\times 3\times 25\times 7\\\\{\text{Finally,}}\\y=16800\end{array}}}
- Part 3. (Find x + y and x - y) where x = 20712 and y = 16800.
- {\displaystyle x+y=20712+たす16800=わ37512}
- {\displaystyle x-y=20712-16800=3912}
- Part 4. Compute GCD(x+y, n) and GCD(x-y, n), where n = 84923, x+y = 292281 and x-y = 258681
- {\displaystyle {\begin{array}{ll}\gcd(37512,84923)=521\\\gcd(3912,84923)=163\end{array}}}
Quick check shows {\displaystyle 84923=521\times 163}.
Optimizations
[edit ]The quadratic sieve is an optimization of Dixon's method. It selects values of x close to the square root of N such that x2 modulo N is small, thereby largely increasing the chance of obtaining a smooth number.
Other ways to optimize Dixon's method include using a better algorithm to solve the matrix equation, taking advantage of the sparsity of the matrix: a number z cannot have more than {\displaystyle \log _{2}z} factors, so each row of the matrix is almost all zeros. In practice, the block Lanczos algorithm is often used. Also, the size of the factor base must be chosen carefully: if it is too small, it will be difficult to find numbers that factorize completely over it, and if it is too large, more relations will have to be collected.
A more sophisticated analysis, using the approximation that a number has all its prime factors less than {\displaystyle N^{1/a}} with probability about {\displaystyle a^{-a}} (an approximation to the Dickman–de Bruijn function), indicates that choosing too small a factor base is much worse than too large, and that the ideal factor base size is some power of {\displaystyle \exp \left({\sqrt {\log N\log \log N}}\right)}.
The optimal complexity of Dixon's method is
- {\displaystyle O\left(\exp \left(2{\sqrt {2}}{\sqrt {\log n\log \log n}}\right)\right)}
in big-O notation, or
- {\displaystyle L_{n}[1/2,2{\sqrt {2}}]}
in L-notation.
References
[edit ]- ^ Kleinjung, Thorsten; et al. (2010). "Factorization of a 768-Bit RSA Modulus". Advances in Cryptology – CRYPTO 2010. Lecture Notes in Computer Science. Vol. 6223. pp. 333–350. doi:10.1007/978-3-642-14623-7_18. ISBN 978-3-642-14622-0. S2CID 11556080.
- ^ Dixon, J. D. (1981). "Asymptotically fast factorization of integers" (PDF). Math. Comp. 36 (153): 255–260. doi:10.1090/S0025-5718-1981-0595059-1 . JSTOR 2007743.
- ^ Kibicho, Murage (2025). Hand-Written Paper Implementation: Asymptotically Fast Factorization of Integers.