What's new

The maximal length of the Erdős–Herzog–Piranian lemniscate in high degree

15 December, 2025 in math.CV, paper | Tags: Erdos, lemniscate, polynomials | by Terence Tao | 20 comments

I’ve just uploaded to the arXiv my preprint The maximal length of the Erdős–Herzog–Piranian lemniscate in high degree. This paper resolves (in the asymptotic regime of sufficiently high degree) an old question about the polynomial lemniscates

\displaystyle \partial E_1(p) := \{ z: |p(z)| = 1\}

attached to monic polynomials {p} of a given degree {n}, and specifically the question of bounding the arclength {\ell(\partial E_1(p))} of such lemniscates. For instance, when {p(z)=z^n}, the lemniscate is the unit circle and the arclength is {2\pi}; this in fact turns out to be the minimum possible length amongst all (connected) lemniscates, a result of Pommerenke. However, the question of the largest lemniscate length is open. The leading candidate for the extremizer is the polynomial

\displaystyle p_0(z)= z^n-1

whose lemniscate is quite convoluted, with an arclength that can be computed asymptotically as

\displaystyle \ell(\partial E_1(p_0)) = B(1/2,n/2) = 2n + 4 \log 2 + O(1/n)

where {B} is the beta function.

(The images here were generated using AlphaEvolve and Gemini.) A reasonably well-known conjecture of Erdős, Herzog, and Piranian (Erdős problem 114) asserts that this is indeed the maximizer, thus {\ell(\partial E_1(p)) \leq \ell(\partial E_1(p_0))} for all monic polynomials of degree {n}.

There have been several partial results towards this conjecture. For instance, Eremenko and Hayman verified the conjecture when {n=2}. Asympotically, bounds of the form {\ell(\partial E_1(p)) \leq (C+o(1)) n} had been known for various {C} such as {\pi}, {2\pi}, or {4\pi}; a significant advance was made by Fryntov and Nazarov, who obtained the asymptotically sharp upper bound

\displaystyle \ell(\partial E_1(p)) \leq 2n + O(n^{7/8})

and also obtained the sharp conjecture {\ell(\partial E_1(p)) \leq \ell(\partial E_1(p_0))} for {p} sufficiently close to {p_0}. In that paper, the authors comment that the {O(n^{7/8})} error could be improved, but that {O(n^{1/2})} seemed to be the limit of their method.

I recently explored this problem with the optimization tool AlphaEvolve, where I found that when I assigned this tool the task of optimizing {\ell(\partial E_1(p))} for a given degree {n}, that the tool rapidly converged to choosing {p} to be equal to {p_0} (up to the rotation and translation symmetries of the problem). This suggested to me that the conjecture was true for all {n}, though of course this was far from a rigorous proof. AlphaEvolve also provided some useful visualization code for these lemniscates which I have incorporated into the paper (and this blog post), and which helped build my intuition for this problem; I view this sort of “vibe-coded visualization” as another practical use-case of present-day AI tools.

In this paper, we iteratively improve upon the Fryntov-Nazarov method to obtain the following bounds, in increasing order of strength:

(i) {\ell(\partial E_1(p)) \leq 2n + O(n^{1/2})}.
(ii) {\ell(\partial E_1(p)) \leq 2n + O(1)}.
(iii) {\ell(\partial E_1(p)) \leq 2n + 4 \log 2 + o(1)}.
(iv) {\ell(\partial E_1(p)) \leq \ell(\partial E_1(p_0))} for sufficiently large {n}.

In particular, the Erdős–Herzog–Piranian conjecture is now verified for sufficiently large {n}.

The proof of these bounds is somewhat circuitious and technical, with the analysis from each part of this result used as a starting point for the next one. For this blog post, I would like to focus on the main ideas of the arguments.

A key difficulty is that there are relatively few tools for upper bounding the arclength of a curve; indeed, the coastline paradox already shows that curves can have infinite length even when bounded. Thus, one needs to utilize some smooth or algebraic structure on the curve to hope for good upper bounds. One possible approach is via the Crofton formula, using Bezout’s theorem to control the intersection of the curve with various lines. This is already good enough to get bounds of the form {\ell(\partial E_1(p)) \leq O(n)} (for instance by combining it with other known tools to control the diameter of the lemniscate), but it seems challenging to use this approach to get bounds close to the optimal {\sim 2n}.

Instead, we follow Fryntov–Nazarov and utilize Stokes’ theorem to convert the arclength into an area integral. A typical identity used in that paper is

\displaystyle \ell(\partial E_1(p)) = 2 \int_{E_1(p)} |p'|\ dA - \int_{E_1(p)} \frac{|p\varphi|}{\varphi} \psi\ dA

where {dA} is area measure, {\varphi = p'/p} is the log-derivative of {p}, {\psi = p''/p'} is the log-derivative of {p'}, and {E_1(p)} is the region {\{ |p| \leq 1 \}}. This and the triangle inequality already lets one prove bounds of the form {\ell(\partial E_1(p)) \leq 2\pi n + O(n^{1/2})} by controlling {\int_{E_1(p)} |\psi|\ dA} using the triangle inequality, the Hardy-Littlewood rearrangement inequality, and the Polya capacity inequality.

But this argument does not fully capture the oscillating nature of the phase {\frac{|\varphi|}{\varphi}} on one hand, and the oscillating nature of {E_1(p)} on the other. Fryntov–Nazarov exploited these oscillations with some additional decompositions and integration by parts arguments. By optimizing these arguments, I was able to establish an inequality of the form

\displaystyle \ell(\partial E_1(p)) \leq \frac{1}{\pi} \int_{E_2(p)} |\psi|\ dA + O(X_1+X_2+X_3+X_4+X_5) \ \ \ \ \ (1)

where {E_2(p) = \{ |p| \leq 2\}} is an enlargement of {E_1(p)} (that is significantly less oscillatory, as displayed in the figure below), and {X_1,\dots,X_5} are certain error terms that can be controlled by a number of standard tools (e.g., the Grönwall area theorem).

One can heuristically justify (1) as follows. Suppose we work in a region where the functions {\psi}, {p} are roughly constant: {\psi(z) \approx \psi_0}, {p(z) \approx c_0}. For simplicity let us normalize {\psi_0} to be real, and {c_0} to be negative real. In order to have a non-trivial lemniscate in this region, {c_0} should be close to {-1}. Because the unit circle {\partial D(0,1)} is tangent to the line {\{ w: \Re w = -1\}} at {-1}, the lemniscate condition {|p(z)|=1} is then heuristically approximated by the condition that {\Re (p(z) + 1) = 0}. On the other hand, the hypothesis {\psi(z) \approx \psi_0} suggests that {p'(z) \approx A e^{\psi_0 z}} for some amplitude {z}, which heuristically integrates to {p(z) \approx c_0 + \frac{A}{\psi_0} e^{\psi_0 z}}. Writing {\frac{A}{\psi_0}} in polar coordinates as {Re^{i\theta}} and {z} in Cartesian coordinates as {x+iy}, the condition {\Re(p(z)+1)=0} can then be rearranged after some algebra as

\displaystyle \cos (\psi_0 y + \theta) \approx -\frac{1+c_0}{Re^{\psi_0 x}}.

If the right-hand side is much larger than {1} in magnitude, we thus expect the lemniscate to be empty in this region; but if instead the right-hand side is much less than {1} in magnitude, we expect the lemniscate to behave like a periodic sequence of horizontal lines of spacing {\frac{\pi}{\psi_0}}. This makes the main terms on both sides of (1) approximately agree (per unit area).

A graphic illustration of (1) (provided by Gemini) is shown below, where the dark spots correspond to small values of {\psi} that act to “repel” (and shorten) the lemniscate. (The bright spots correspond to the critical points of {p}, which in this case consist of six critical points at the origin and one at both of {+1/2} and {-1/2}.)

By choosing parameters appropriately, one can show that {X_1,\dots,X_5 = O(\sqrt{n})} and {\frac{1}{\pi} \int_{E_2(p)} |\psi|\ dA \leq 2n + O(1)}, yielding the first bound {\ell(\partial E_1(p)) \leq 2n + O(n^{1/2})}. However, by a more careful inspection of the arguments, and in particular measuring the defect in the triangle inequality

\displaystyle |\psi(z)| = |\sum_\zeta \frac{1}{z-\zeta}| \leq \sum_\zeta \frac{1}{|z-\zeta|},

where {\zeta} ranges over critical points. From some elementary geometry, one can show that the more the critical points {\zeta} are dispersed away from each other (in an {\ell^1} sense), the more one can gain over the triangle inequality here; conversely, the {L^1} dispersion {\sum_\zeta |\zeta|} of the critical points (after normalizing so that these critical points have mean zero) can be used to improve the control on the error terms {X_1,\dots,X_5}. Optimizing this strategy leads to the second bound

\displaystyle \ell(\partial E_1(p)) \leq 2n + O(1).

At this point, the only remaining cases that need to be handled are the ones with bounded dispersion: {\sum_\zeta |\zeta| \leq 1}. In this case, one can do some elementary manipulations of the factorization

\displaystyle p'(z) = n \prod_\zeta (z-\zeta)

to obtain some quite precise control on the asymptotics of {p(z)} and {p'(z)}; for instance, we will be able to obtain an approximation of the form

\displaystyle p(z) \approx -1 + \frac{z p'(z)}{n}

with high accuracy, as long as {z} is not too close to the origin or to critical points. This, combined with direct arclength computations, can eventually lead to the third estimate

\displaystyle \ell(\partial E_1(p)) \leq 2n + 4 \log 2 + o(1).

The last remaining cases to handle are those of small dispersion, {\sum_\zeta |\zeta| = o(1)}. An extremely careful version of the previous analysis can now give an estimate of the shape

\displaystyle \ell(\partial E_1(p)) \leq \ell(\partial E_1(p_0)) - c \|p\| + o(\|p\|)

for an absolute constant {c>0}, where {\|p\|} is a measure of how close {p} is to {p_0} (it is equal to the dispersion {\sum_\zeta |\zeta|} plus an additional term {n |1 + p(0)|^{1/n}} to deal with the constant term {p(0)}). This establishes the final bound (for {n} large enough), and even shows that the only extremizer is {p_0} (up to translation and rotation symmetry).

The Equational Theories Project: Advancing Collaborative Mathematical Research at Scale

9 December, 2025 in math.RA, paper | Tags: Aaron Hill, Alex Meiburg, Amir Livne Bar-on, Anand Rao Tadipatri, Andrés Goens, Bernhard Reinke, Bruno Le Floch, Cody Roux, Daniel Weber, David Renshaw, Douglas McNeil, Emmanuel Osalotioman Osazuwa, Equational Theory Project, Fan Zheng, Fernando Vaquerizo-Villar, Floris van Doorn, Giovanni Paolini, Harald Husum, Hernán Ibarra Mejia, Jérémy Scanvic, Joachim Breitner, Jose Brox, Lorenzo Luccioli, Marco Petracci, Marcus Rossel, Mario Carneiro, Martin Dvorak, Matthew Bolan, Nicholas Carlini, Pace P. Nielsen, Pietro Monticone, Shreyas Srinivas, universal algebra, Vlad Tsyrklevich, Zoltan Kocsis | by Terence Tao | 18 comments

Matthew Bolan, Joachim Breitner, Jose Brox, Nicholas Carlini, Mario Carneiro, Floris van Doorn, Martin Dvorak, Andrés Goens, Aaron Hill, Harald Husum, Hernán Ibarra Mejia, Zoltan Kocsis, Bruno Le Floch, Amir Livne Bar-on, Lorenzo Luccioli, Douglas McNeil, Alex Meiburg, Pietro Monticone, Pace P. Nielsen, Emmanuel Osalotioman Osazuwa, Giovanni Paolini, Marco Petracci, Bernhard Reinke, David Renshaw, Marcus Rossel, Cody Roux, Jérémy Scanvic, Shreyas Srinivas, Anand Rao Tadipatri, Vlad Tsyrklevich, Fernando Vaquerizo-Villar, Daniel Weber, Fan Zheng, and I have just uploaded to the arXiv our preprint The Equational Theories Project: Advancing Collaborative Mathematical Research at Scale. This is the final report for the Equational Theories Project, which was proposed in this blog post and also showcased in this subsequent blog post. The aim of this project was to see whether one could collaboratively achieve a large-scale systematic exploration of a mathematical space, which in this case was the implication graph between 4694 equational laws of magmas. A magma is a set {G} equipped with a binary operation {\diamond: G \times G \rightarrow G} (or, equivalently, a {G \times G} multiplication table). An equational law is an equation involving this operation and a number of indeterminate variables. Some examples of equational laws, together with the number that we assigned to that law, include

{E1} (Trivial law): {x = x}
{E2} (Singleton law): {x = y}
{E43} (Commutative law): {x \diamond y = y \diamond x}
{E168} (Central groupoid law): {x = (y \diamond x) \diamond (x \diamond z)}
{E4512} (Associative law): {x \diamond (y \diamond z) = (x \diamond y) \diamond z}

Up to relabeling and symmetry, there turn out to be 4694 equational laws that involve at most four invocations of the magma operation {\diamond}; one can explore them in our “Equation Explorer” tool.

The aim of the project was to work out which of these laws imply which others. For instance, all laws imply the trivial law {E1}, and conversely the singleton law {E2} implies all the others. On the other hand, the commutative law {E43} does not imply the associative law {E4512} (because there exist magmas that are commutative but not associative), nor is the converse true. All in all, there are {22,028,942} implications of this type to settle; most of these are relatively easy and could be resolved in a matter of minutes by an expert in college-level algebra, but prior to this project, it was impractical to actually do so in a manner that could be feasibly verified. Also, this problem is known to become undecidable for sufficiently long equational laws. Nevertheless, we were able to resolve all the implications informally after two months, and have them completely formalized in Lean after a further five months.

After a rather hectic setup process (documented in this personal log), progress came in various waves. Initially, huge swathes of implications could be resolved first by very simple-minded techniques, such as brute-force searching all small finite magmas to refute implications; then, automated theorem provers such as Vampire or Mace4 / Prover9 were deployed to handle a large fraction of the remainder. A few equations had existing literature that allowed for many implications involving them to be determined. This left a core of just under a thousand implications that did not fall to any of the “cheap” methods, and which occupied the bulk of the efforts of the project. As it turns out, all of the remaining implications were negative; the difficulty was to construct explicit magmas that obeyed one law but not another. To do this, we discovered a number of general constructions of magmas that were effective at this task. For instance:

Linear models, in which the carrier {G} was a (commutative or noncommutative) ring and the magma operation took the form {x \diamond y = ax + by} for some coefficients {a,b}, turned out to resolve many cases.
We discovered a new invariant of an equational law, which we call the “twisting semigroup” of that law, which also allowed us to construct further examples of magmas that obeyed one law {E} but not another {E'}, by starting with a base magma {M} that obeyed both laws, taking a Cartesian power {M^n} of that magma, and then “twisting” the magma operation by certain permutations of {\{1,\dots,n\}} designed to preserve {E} but not {E'}.
We developed a theory of “abelian magma extensions”, similar to the notion of an abelian extension of a group, which allowed us to flexibly build new magmas out of old ones in a manner controlled by a certain “magma cohomology group {H^1_E(G,M)}” which were tractable to compute, and again gave ways to construct magmas that obeyed one law {E} but not another {E'}.
Greedy methods, in which one fills out an infinite multiplication table in a greedy manner (somewhat akin to naively solving a Sudoku puzzle), subject to some rules designed to avoid collisions and maintain a law {E}, as well as some seed entries designed to enforce a counterexample to a separate law {E'}. Despite the apparent complexity of this method, it can be automated in a manner that allowed for many outstanding implications to be refuted.
Smarter ways to utilize automated theorem provers, such as strategically adding in additional axioms to the magma to help narrow the search space, were developed over the course of the project.

Even after applying all these general techniques, though, there were about a dozen particularly difficult implications that resisted even these more powerful methods. Several ad hoc constructions were needed in order to understand the behavior of magmas obeying such equations as E854, E906, E1323, E1516, and E1729, with the latter taking months of effort to finally solve and then formalize.

A variety of GUI interfaces were also developed to facilitate the collaboration (most notably the Equation Explorer tool mentioned above), and several side projects were also created within the project, such as the exploration of the implication graph when the magma was also restricted to be finite. In this case, we resolved all of the {22,028,942} implications except for one (and its dual):

Problem 1 Does the law {x = y \diamond (x \diamond ((y \diamond x) \diamond y))} (E677) imply the law {x = ((x \diamond x) \diamond x) \diamond x} (E255) for finite magmas?

See this blueprint page for some partial results on this problem, which we were unable to resolve even after months of effort.

Interestingly, modern AI tools did not play a major role in this project (but it was largely completed in 2024, before the most recent advanced models became available); while they could resolve many implications, the older “good old-fashioned AI” of automated theorem provers were far cheaper to run and already handled the overwhelming majority of the implications that the advanced AI tools could. But I could imagine that such tools would play a more prominent role in future similar projects.

The story of Erdős problem #1026

8 December, 2025 in expository, math.CA, math.CO | Tags: AI, AlphaEvolve, Erdos | by Terence Tao | 17 comments

Problem 1026 on the Erdős problem web site recently got solved through an interesting combination of existing literature, online collaboration, and AI tools. The purpose of this blog post is to try to tell the story of this collaboration, and also to supply a complete proof.

The original problem of Erdős, posed in 1975, is rather ambiguous. Erdős starts by recalling his famous theorem with Szekeres that says that given a sequence of {k^2+1} distinct real numbers, one can find a subsequence of length {k+1} which is either increasing or decreasing; and that one cannot improve the {k^2+1} to {k^2}, by considering for instance a sequence of {k} blocks of length {k}, with the numbers in each block decreasing, but the blocks themselves increasing. He also noted a result of Hanani that every sequence of length {k(k+3)/2} can be decomposed into the union of {k} monotone sequences. He then wrote “As far as I know the following question is not yet settled. Let {x_1,\dots,x_n} be a sequence of distinct numbers, determine

\displaystyle S(x_1,\dots,x_n) = \max \sum_r x_{i_r}

where the maximum is to be taken over all monotonic sequences {x_{i_1},\dots,x_{i_m}}“.

This problem was added to the Erdős problem site on September 12, 2025, with a note that the problem was rather ambiguous. For any fixed {n}, this is an explicit piecewise linear function of the variables {x_1,\dots,x_n} that could be computed by a simple brute force algorithm, but Erdős was presumably seeking optimal bounds for this quantity under some natural constraint on the {x_i}. The day the problem was posted, Desmond Weisenberg proposed studying the quantity {c(n)}, defined as the largest constant such that

\displaystyle S(x_1,\dots,x_n) \geq c(n) \sum_{i=1}^n x_i

for all choices of (distinct) real numbers {x_1,\dots,x_n}. Desmond noted that for this formulation one could assume without loss of generality that the {x_i} were positive, since deleting negative or vanishing {x_i} does not increase the left-hand side and does not decrease the right-hand side. By a limiting argument one could also allow collisions between the {x_i}, so long as one interpreted monotonicity in the weak sense.

Though not stated on the web site, one can formulate this problem in game theoretic terms. Suppose that Alice has a stack of {N} coins for some large {N}. She divides the coins into {n} piles of consisting of {x_1,\dots,x_n} coins each, so that {\sum_{i=1}^n x_i = N}. She then passes the piles to Bob, who is allowed to select a monotone subsequence of the piles (in the weak sense) and keep all the coins in those piles. What is the largest fraction {c(n)} of the coins that Bob can guarantee to keep, regardless of how Alice divides up the coins? (One can work with either a discrete version of this problem where the {x_i} are integers, or a continuous one where the coins can be split fractionally, but in the limit {N \rightarrow \infty} the problems can easily be seen to be equivalent.)

AI-generated images continue to be problematic for a number of reasons, but here is one such image that somewhat manages at least to convey the idea of the game:

For small {n}, one can work out {c(n)} by hand. For {n=1}, clearly {c(1)=1}: Alice has to put all the coins into one pile, which Bob simply takes. Similarly {c(2)=1}: regardless of how Alice divides the coins into two piles, the piles will either be increasing or decreasing, so in either case Bob can take both. The first interesting case is {n=3}. Bob can again always take the two largest piles, guaranteeing himself {2/3} of the coins. On the other hand, if Alice almost divides the coins evenly, for instance into piles {((1/3 + \varepsilon)N, (1/3-2\varepsilon) N, (1/3+\varepsilon)N)} for some small {\varepsilon>0}, then Bob cannot take all three piles as they are non-monotone, and so can only take two of them, allowing Alice to limit the payout fraction to be arbitrarily close to {2/3}. So we conclude that {c(3)=2/3}.

An hour after Desmond’s comment, Stijn Cambie noted (though not in the language I used above) that a similar construction to the one above, in which Alice divides the coins into {k^2} pairs that are almost even, in such a way that the longest monotone sequence is of length {k}, gives the upper bound {c(k^2) \leq 1/k}. It is also easy to see that {c(n)} is a non-increasing function of {n}, so this gives a general bound {c(n) \leq (1+o(1))/\sqrt{n}}. Less than an hour after that, Wouter van Doorn noted that the Hanani result mentioned above gives the lower bound {c(n) \geq (\frac{1}{\sqrt{2}}-o(1))/\sqrt{n}}, and posed the problem of determining the asymptotic limit of {\sqrt{n} c(n)} as {n \rightarrow \infty}, given that this was now known to range between {1/\sqrt{2}-o(1)} and {1+o(1)}. This version was accepted by Thomas Bloom, the moderator of the Erdős problem site, as a valid interpretation of the original problem.

The next day, Stijn computed the first few values of {c(n)} exactly:

\displaystyle 1, 1, 2/3, 1/2, 1/2, 3/7, 2/5, 3/8, 1/3.

While the general pattern was not yet clear, this was enough data for Stijn to conjecture that {c(k^2)=1/k}, which would also imply that {\sqrt{n} c(n) \rightarrow 1} as {n \rightarrow \infty}. (EDIT: as later located by an AI deep research tool, this conjecture was also made in Section 12 of this 1980 article of Steele.) Stijn also described the extremizing sequences for this range of {n}, but did not continue the calculation further (a naive computation would take runtime exponential in {n}, due to the large number of possible subsequences to consider).

The problem then lay dormant for almost two months, until December 7, 2025, in which Boris Alexeev, as part of a systematic sweep of the Erdős problems using the AI tool Aristotle, was able to get this tool to autonomously solve this conjecture {c(k^2)=1/k} in the proof assistant language Lean. The proof converted the problem to a rectangle-packing problem.

This was one further addition to a recent sequence of examples where an Erdős problem had been automatically solved in one fashion or another by an AI tool. Like the previous cases, the proof turned out to not be particularly novel. Within an hour, Koishi Chan gave an alternate proof deriving the required bound {c(k^2) \geq 1/k} from the original Erdős-Szekeres theorem by a standard “blow-up” argument which we can give here in the Alice-Bob formulation. Take a large {M}, and replace each pile of {x_i} coins with {(1+o(1)) M^2 x_i^2} new piles, each of size {(1+o(1)) x_i}, chosen so that the longest monotone subsequence in this collection is {(1+o(1)) M x_i}. Among all the new piles, the longest monotone subsequence has length {(1+o(1)) M S(x_1,\dots,x_n)}. Applying Erdős-Szekeres, one concludes the bound

\displaystyle M S(x_1,\dots,x_n) \geq (1-o(1)) (\sum_{i=1}^{k^2} M^2 x_i^2)^{1/2}

and on canceling the {M}‘s, sending {M \rightarrow \infty}, and applying Cauchy-Schwarz, one obtains {c(k^2) \geq 1/k} (in fact the argument gives {c(n) \geq 1/\sqrt{n}} for all {n}).

Once this proof was found, it was natural to try to see if it had already appeared in the literature. AI deep research tools have successfully located such prior literature in the past, but in this case they did not succeed, and a more “old-fashioned” Google Scholar job turned up some relevant references: a 2016 paper by Tidor, Wang and Yang contained this precise result, citing an earlier paper of Wagner as inspiration for applying “blowup” to the Erdős-Szekeres theorem.

But the story does not end there! Upon reading the above story the next day, I realized that the problem of estimating {c(n)} was a suitable task for AlphaEvolve, which I have used recently as mentioned in this previous post. Specifically, one could task to obtain upper bounds on {c(n)} by directing it to produce real numbers (or integers) {x_1,\dots,x_n} summing up to a fixed sum (I chose {10^6}) with a small a value of {S(x_1,\dots,x_n)} as possible. After an hour of run time, AlphaEvolve produced the following upper bounds on {c(n)} for {1 \leq n \leq 16}, with some intriguingly structured potential extremizing solutions:

The numerical scores (divided by {10^6}) were pretty obviously trying to approximate simple rational numbers. There were a variety of ways (including modern AI) to extract the actual rational numbers they were close to, but I searched for a dedicated tool and found this useful little web page of John Cook that did the job:

\displaystyle 1, 1, 2/3, 1/2, 1/2, 3/7, 2/5, 3/8, 1/3, 1/4.

\displaystyle 1/3, 4/13, 3/10, 4/14, 3/11, 4/15, 1/4.

I could not immediately see the pattern here, but after some trial and error in which I tried to align numerators and denominators, I eventually organized this sequence into a more suggestive form:

\displaystyle 1,

\displaystyle 1/1, \mathbf{2/3}, 1/2,

\displaystyle 2/4, \mathbf{3/7}, 2/5, \mathbf{3/8}, 2/6,

\displaystyle 3/9, \mathbf{4/13}, 3/10, \mathbf{4/14}, 3/11, \mathbf{4/15}, 3/12.

This gave a somewhat complicated but predictable conjecture for the values of the sequence {c(n)}. On posting this, Boris found a clean formulation of the conjecture, namely that

\displaystyle c(k^2 + 2a + 1) = \frac{k}{k^2+a} \ \ \ \ \ (1)

whenever {k \geq 1} and {-k \leq a \leq k}. After a bit of effort, he also produced an explicit upper bound construction:

Proposition 1 If {k \geq 1} and {-k \leq a \leq k}, then {c(k^2+2a+1) \leq \frac{k}{k^2+a}}.

Proof: Consider a sequence {x_1,\dots,x_{k^2+2a+1}} of numbers clustered around the “red number” {|a|} and “blue number” {|a+1|}, consisting of {|a|} blocks of {k-|a|} “blue” numbers, followed by {|a+1|} blocks of {|a+1|} “red” numbers, and then {k-|a|} further blocks of {k} “blue” numbers. When {a \geq 0}, one should take all blocks to be slightly decreasing within each block, but the blue blocks should be are increasing between each other, and the red blocks should also be increasing between each other. When {a < 0}, all of these orderings should be reversed. The total number of elements is indeed

\displaystyle |a| \times (k-|a|) + |a+1| \times |a+1| + (k-|a|) \times k

\displaystyle = k^2 + 2a + 1

and the total sum is close to

\displaystyle |a| \times (k-|a|) \times |a+1| + |a+1| \times |a+1| \times |a|

+ (k-|a|) \times k \times |a+1| = (k^2 + a) |a+1|.

With this setup, one can check that any monotone sequence consists either of at most {|a+1|} red elements and at most {k-|a|} blue elements, or no red elements and at most {k} blue elements, in either case giving a monotone sum that is bounded by either

\displaystyle |a+1| \times |a| + (k-|a|) \times |a+1| = k |a+1|

\displaystyle 0 + k \times |a+1| = k |a+1|,

giving the claim. \Box

Here is a figure illustrating the above construction in the {a \geq 0} case (obtained after starting with a ChatGPT-provided file and then manually fixing a number of placement issues):

Here is a plot of 1/c(n) (produced by ChatGPT Pro), showing that it is basically a piecewise linear approximation to the square root function:

Shortly afterwards, Lawrence Wu clarified the connection between this problem and a square packing problem, which was also due to Erdős (Problem 106). Let {f(n)} be the least number such that, whenever one packs {n} squares of sidelength {d_1,\dots,d_n} into a square of sidelength {D}, with all sides parallel to the coordinate axes, one has

\displaystyle \sum_{i=1}^n d_i \leq f(n) D.

Proposition 2 For any {n}, one has
\displaystyle c(n) \geq \frac{1}{f(n)}.

Proof: Given {x_1,\dots,x_n} and {1 \leq i \leq n}, let {S_i} be the maximal sum over all increasing subsequences ending in {x_i}, and {T_i} be the maximal sum over all decreasing subsequences ending in {x_i}. For {i < j}, we have either {S_j \geq S_i + x_j} (if {x_j \geq x_i}) or {T_j \geq T_i + x_j} (if {x_j \leq x_i}). In particular, the squares {(S_i-x_i, T_i-x_i)} and {(S_j-x_j, T_j-x_j)} are disjoint. These squares pack into the square {[0, S(x_1,\dots,x_n)]^2}, so by definition of {f}, we have

\displaystyle \sum_{i=1}^n x_i \leq f(n) S(x_1,\dots,x_n),

and the claim follows. \Box

This idea of using packing to prove Erdős-Szekeres type results goes back to a 1959 paper of Seidenberg, although it was a discrete rectangle-packing argument that was not phrased in such an elegantly geometric form. It is possible that Aristotle was “aware” of the Seidenberg argument via its training data, as it had incorporated a version of this argument in its proof.

Here is an illustration of the above argument using the AlphaEvolve-provided example

\displaystyle[99998, 99997, 116305, 117032, 116304,

\displaystyle 58370, 83179, 117030, 92705, 99080]

for n=10 to convert it to a square packing (image produced by ChatGPT Pro):

At this point, Lawrence performed another AI deep research search, this time successfully locating a paper from just last year by Baek, Koizumi, and Ueoro, where they show that

Theorem 3 For any {k \geq 1}, one has
\displaystyle f(k^2+1) \leq k

which, when combined with a previous argument of Praton, implies

Theorem 4 For any {k \geq 1} and {c \in {\bf Z}} with {k^2+2c+1 \geq 1}, one has
\displaystyle f(k^2+2c+1) \leq k + \frac{c}{k}.

This proves the conjecture!

There just remained the issue of putting everything together. I did feed all of the above information into a large language model, which was able to produce a coherent proof of (1) assuming the results of Baek-Koizumi-Ueoro and Praton. Of course, LLM outputs are prone to hallucination, so it would be preferable to formalize that argument in Lean, but this looks quite doable with current tools, and I expect this to be accomplished shortly. But I was also able to reproduce the arguments of Baek-Koizumi-Ueoro and Praton, which I include below for completeness.

Proof: (Proof of Theorem 3, adapted from Baek-Koizumi-Ueoro) We can normalize {D=k}. It then suffices to show that if we pack the length {k} torus {({\bf Z}/k{\bf Z})^2} by {k^2+1} axis-parallel squares of sidelength {d_1,\dots,d_{k^2+1}}, then

\displaystyle \sum_{i=1}^{k^2+1} d_i \leq k^2.

Pick {x_0, y_0 \in {\bf R}/k{\bf Z}}. Then we have a {k \times k} grid

\displaystyle (x_0 + {\bf Z}) \times (y_0 + {\bf Z}) \pmod {k{\bf Z}^2}

inside the torus. The {i^{th}} square, when restricted to this grid, becomes a discrete rectangle {A_{i,x_0} \times B_{i,y_0}} for some finite sets {A_{i,x_0}, B_{i,y_0}} with

\displaystyle |\# A_{i,x_0} -\# B_{i,y_0}| \leq 1. \ \ \ \ \ (2)

By the packing condition, we have

\displaystyle \sum_{i=1}^{k^2+1} \# A_{i,x_0} \# B_{i,y_0} \leq k^2.

From (2) we have

\displaystyle (\# A_{i,x_0} - 1) (\# B_{i,y_0} - 1) \geq 0

hence

\displaystyle \# A_{i,x_0} \# B_{i,y_0} \geq \# A_{i,x_0} + \# B_{i,y_0} - 1.

Inserting this bound and rearranging, we conclude that

\displaystyle \sum_{i=1}^{k^2+1} \# A_{i,x_0} + \sum_{i=1}^{k^2+1} \# B_{i,y_0} \leq 2k^2 + 1.

Taking the supremum over {x_0,y_0} we conclude that

\displaystyle \sup_{x_0} \sum_{i=1}^{k^2+1} \# A_{i,x_0} + \sup_{y_0} \sum_{i=1}^{k^2+1} \# B_{i,y_0} \leq 2k^2 + 1

so by the pigeonhole principle one of the summands is at most {k^2}. Let’s say it is the former, thus

\displaystyle \sup_{x_0} \sum_{i=1}^{k^2+1} \# A_{i,x_0} \leq k^2.

In particular, the average value of {\sum_{i=1}^{k^2+1} \# A_{i,x_0}} is at most {k^2}. But this can be computed to be {\sum_{i=1}^{k^2+1} d_i}, giving the claim. Similarly if it is the other sum. \Box

UPDATE: Actually, the above argument also proves Theorem 4 with only minor modifications. Nevertheless, we give the original derivation of Theorem 4 using the embedding argument of Praton below for sake of completeness.

Proof: (Proof of Theorem 4, adapted from Praton) We write {c = \epsilon |c|} with {\epsilon = \pm 1}. We can rescale so that the square one is packing into is {[0,k]^2}. Thus, we pack {k^2+2\varepsilon |c|+1} squares of sidelength {d_1,\dots,d_{k^2+2\varepsilon |c|+1}} into {[0,k]^2}, and our task is to show that

\displaystyle \sum_{i=1}^{k^2+2\varepsilon|c|+1} d_i \leq k^2 + \varepsilon |c|.

We pick a large natural number {N} (in particular, larger than {k}), and consider the three nested squares

\displaystyle [0,k]^2 \subset [0,N]^2 \subset [0,N + |c| \frac{N}{N-\varepsilon}]^2.

We can pack {[0,N]^2 \backslash [0,k]^2} by {N^2-k^2} unit squares. We can similarly pack

\displaystyle [0,N + |c| \frac{N}{N-\varepsilon}]^2 \backslash [0,N]^2

\displaystyle =[0, \frac{N}{N-\varepsilon} (N+|c|-\varepsilon)]^2 \backslash [0, \frac{N}{N-\varepsilon} (N-\varepsilon)]^2

into {(N+|c|-\varepsilon)^2 - (N-\varepsilon)^2} squares of sidelength {\frac{N}{N-\varepsilon}}. All in all, this produces

\displaystyle k^2+2\varepsilon |c|+1 + N^2-k^2 + (N+|c|-\varepsilon)^2 - (N-\varepsilon)^2

\displaystyle = (N+|c|)^2 + 1

squares, of total length

\displaystyle (\sum_{i=1}^{k^2+2\varepsilon |c|+1} d_i) +(N^2-k^2) + ((N+|c|-\varepsilon)^2 - (N-\varepsilon)^2) \frac{N}{N-\varepsilon}.

Applying Theorem 3, we conclude that

\displaystyle (\sum_{i=1}^{k^2+2\varepsilon |c|+1} d_i) +(N^2-k^2)

\displaystyle + ((N+|c|-\varepsilon)^2 - (N-\varepsilon)^2) \frac{N}{N-\varepsilon} \leq (N+|c|) (N + |c| \frac{N}{N-\varepsilon}).

The right-hand side is

\displaystyle N^2 + 2|c| N + |c|^2 + \varepsilon |c| + O(1/N)

and the left-hand side similarly evaluates to

\displaystyle (\sum_{i=1}^{k^2+2c+1} d_i) + N^2 -k^2 + 2|c| N + |c|^2 + O(1/N)

and so we simplify to

\displaystyle \sum_{i=1}^{k^2+2\varepsilon |c|+1} d_i \leq k^2 + \varepsilon |c| + O(1/N).

Sending {N \rightarrow \infty}, we obtain the claim. \Box One striking feature of this story for me is how important it was to have a diverse set of people, literature, and tools to attack this problem. To be able to state and prove the precise formula for {c(n)} required multiple observations, including some version of the following:

The sequence can be numerically computed as a sequence of rational numbers.
When appropriately normalized and arranged, visible patterns in this sequence appear that allow one to conjecture the form of the sequence.
This problem is a weighted version of the Erdős-Szekeres theorem.
Among the many proofs of the Erdős-Szekeres theorem is the proof of Seidenberg in 1959, which can be interpreted as a discrete rectangle packing argument.
This problem can be reinterpreted as a continuous square packing problem, and in fact is closely related to (a generalized axis-parallel form of) Erdős problem 106, which concerns such packings.
The axis-parallel form of Erdős problem 106 was recently solved by Baek-Koizumi-Ueoro.
The paper of Praton shows that Erdős Problem 106 implies the generalized version needed for this problem. This implication specializes to the axis-parallel case.

It was only through the combined efforts of all the contributors and their tools that all these key inputs were able to be assembled within 48 hours. It seems plausible that a more traditional effort involving just one or two mathematicians and simpler programming and literature search tools may eventually have been able to put all these pieces together, but I believe this process would have taken much longer (on the order of weeks or even months).

Another key ingredient was the balanced AI policy on the Erdős problem website, which encourages disclosed AI usage while strongly discouraging undisclosed use. To quote from that policy: “Comments prepared with the assistance of AI are permitted, provided (a) this is disclosed, (b) the contents (including mathematics, code, numerical data, and the existence of relevant sources) have been carefully checked and verified by the user themselves without the assistance of AI, and (c) the comment is not unreasonably long.”

Quantitative correlations and some problems on prime factors of consecutive integers

1 December, 2025 in math.NT, paper | Tags: correlation, Joni Teravainen, prime divisors | by Terence Tao | 4 comments

Joni Teravainen and I have uploaded to the arXiv our paper “Quantitative correlations and some problems on prime factors of consecutive integers“. This paper applies modern analytic number theory tools – most notably, the Maynard sieve and the recent correlation estimates for bounded multiplicative functions of Pilatte – to resolve (either partially or fully) some old problems of Erdős, Strauss, Pomerance, Sárközy, and Hildebrand, mostly regarding the prime counting function

\displaystyle \omega(n) := \sum_{p|n} 1

and its relatives. The famous Hardy–Ramanujan and Erdős–Kac laws tell us that asymptotically for {n \sim x}, {\omega(n)} should behave like a gaussian random variable with mean and variance both close to {\log\log x}; but the question of the joint distribution of consecutive values such as {\omega(n), \omega(n+1)} is still only partially understood. Aside from some lower order correlations at small primes (arising from such observations as the fact that precisely one of {n,n+1} will be divisible by {2}), the expectation is that such consecutive values behave like independent random variables. As an indication of the state of the art, it was recently shown by Charamaras and Richter that any bounded observables {f(\omega(n))}, {g(\omega(n+1))} will be asymptotically decorrelated in the limit {n \rightarrow \infty} if one performs a logarithmic statistical averaging. Roughly speaking, this confirms the independence heuristic at the scale {\sqrt{\log\log x}} of the standard deviation, but does not resolve finer-grained information, such as precisely estimating the probability of the event {\omega(n)=\omega(n+1)}.

Our first result, answering a question of Erdős, shows that there are infinitely many {n} for which one has the bound

\displaystyle \omega(n+k) \ll k

for all {k \geq 1}. For {k \gg \log\log n}, such a bound is already to be expected (though not completely universal) from the Hardy–Ramanujan law; the main difficulty is thus with the short shifts {k = o(\log\log n)}. If one only had to demonstrate this type of bound for a bounded number of {k}, then this type of result is well within standard sieve theory methods, which can make any bounded number of shifts {n+k} “almost prime” in the sense that {\omega(n+k)} becomes bounded. Thus the problem is that the “sieve dimension” {\sim \log\log n} grows (slowly) with {n}. When writing about this problem in 1980, Erdős and Graham write “we just know too little about sieves to be able to handle such a question (“we” here means not just us but the collective wisdom (?) of our poor struggling human race)”.

However, with the advent of the Maynard sieve (also sometimes referred to as the Maynard–Tao sieve), it turns out to be possible to sieve for the conditions {\omega(n+k) \ll k} for all {k = o(\log\log n)} simultaneously (roughly speaking, by sieving out any {n} for which {n+k} is divisible by a prime {p \ll x^{1/\exp(Ck)}} for a large {C}), and then performing a moment calculation analogous to the standard proof (due to Turán) of the Hardy–Ramanujan law, but weighted by the Maynard sieve. (In order to get good enough convergence, one needs to control fourth moments as well as second moments, but these are standard, if somewhat tedious, calculations).

Our second result, which answers a separate question of Erdős, establishes that the quantity

\displaystyle \sum_n \frac{\omega(n)}{2^n} = 0.5169428\dots

is irrational; this had recently been established by Pratt under the assumption of the prime tuples conjecture, but we are able to establish this result unconditoinally. The binary expansion of this number is of course closely related to the distribution of {\omega}, but in view of the Hardy–Ramanujan law, the {n^{th}} digit of this number is influenced by about {\log\log\log n} nearby values of {\omega}, which is too many correlations for current technology to handle. However, it is possible to do some “Gowers norm” type calculations to decouple things to the point where pairwise correlation information is sufficient. To see this, suppose for contradiction that this number was a rational {a/q}, thus

\displaystyle q \sum_n \frac{\omega(n)}{2^n} = 0 \hbox{ mod } 1.

Multiplying by {2^n}, we obtain some relations between shifts {\omega(n+h)}:

\displaystyle q \sum_{h=1}^\infty \frac{\omega(n+h)}{2^h} = 0 \hbox{ mod } 1.

Using the additive nature of {\omega}, one then also gets similar relations on arithmetic progressions, for many {n} and {p}:

\displaystyle q \sum_{h=1}^\infty \frac{\omega(n+ph)}{2^h} = 0 \hbox{ mod } 1.

Taking alternating sums of this sort of identity for various {n} and {p} (in analogy to how averages involving arithmetic progressions can be related to Gowers norm-type expressions over cubes), one can eventually arrive eliminate the contribution of small {H}, and arrive at an identity of the form

\displaystyle q \sum_{h=1}^\infty \frac{\sum_{\epsilon \in \{0,1\}^K} (-1)^{|\epsilon|} \omega(n + r_{\epsilon,h+K})}{2^{h+K}} = 0 \hbox{ mod } 1 \ \ \ \ \ (1)

for many {n}, where {K} is a parameter (we eventually take {K \sim \log\log\log\log n}) and {r_{\epsilon,h+K}} are various shifts that we will not write out explicitly here. This looks like quite a messy expression; however, one can adapt proofs of the Erdős–Kac law and show that, as long as one ignores the contribution of really large prime factors (of order {\gg n^{1/10}}, say) to the {\omega(n + r_{\epsilon,h+K})}, that this sort of sum behaves like a gaussian, and in particular once one can show a suitable local limit theorem, one can contradict (1). The contribution of the large prime factors does cause a problem though, as a naive application of the triangle inequality bounds this contribution by {O(1)}, which is an error that overwhelms the information provided by (1). To resolve this we have to adapt the pairwise correlation estimates of Pilatte mentioned earlier to demonstrate that the these contributions are in fact {o(1)}. Here it is important that the error estimates of Pilatte are quite strong (of order {O(\log^{-c} n)}); previous correlation estimates of this type (such as those used in this earlier paper with Joni) turn out to be too weak for this argument to close.

Our final result concerns the asymptotic behavior of the density

\displaystyle \frac{1}{x} \{n \leq x: \omega(n+1) = \omega(n)\}

(we also address similar questions for {\Omega(n+1)=\Omega(n)} and {\tau(n+1)=\tau(n)}). Heuristic arguments led Erdős, Pomerance, and Sárközy to conjecture that this quantity was asymptotically {\frac{1}{2\sqrt{\pi \log\log x}}}. They were able to establish an upper bound of {O(1/\log\log x)}, while Hildebrand obtained a lower bound of {\gg 1/(\log\log x)^3}, due to Hildebrand. Here, we obtain the asymptotic for almost all {x} (the limitation here is the standard one, which is that the current technology on pairwise correlation estimates either requires logarithmic averaging, or is restricted to almost all scales rather than all scales). Roughly speaking, the idea is to use the circle method to rewrite the above density in terms of expressions

\displaystyle \frac{1}{x} \sum_{n \leq x} e(\alpha \omega(n+1)) e(-\alpha \omega(n))

for various frequencies {\alpha}, use the estimates of Pilatte to handle the minor arc {\alpha}, and convert the major arc contribution back into physical space (in which {\omega(n+1)} and {\omega(n)} are now permitted to differ by a large amount) and use more traditional sieve theoretic methods to estimate the result.

Growth rates of sequences governed by the squarefree properties of its translates

1 December, 2025 in math.NT, paper | Tags: Erdos, Wouter van Doorn | by Terence Tao | 3 comments

Wouter van Doorn and I have uploaded to the arXiv our paper “Growth rates of sequences governed by the squarefree properties of its translates“. In this paper we answer a number of questions of Erdős} ( Problem 1102 and Problem 1103 on the Erdős problem web site) regarding how quickly a sequence {A = \{a_1 < a_2 < \dots\}} of increasing natural numbers can grow if one constrains its translates {n+A} to interact with the set {\mathcal{SF} = \{1,2,3,5,6,7,10,\dots\}} of squarefree numbers in various ways. For instance, Erdős defined a sequence {A} to have “Property {P}” if each of its translates {n+A} only intersected {\mathcal{SF}} in finitely many points. Erdős believed this to be quite a restrictive condition on {A}, writing “Probably a sequence having property P must increase fairly fast, but I have no results in this direction.”. Perhaps surprisingly, we show that while these sequences must be of density zero, they can in fact grow arbitrary slowly in the sense that one can have {a_j \leq j f(j)} for all sufficiently large {j} and any specified function {f(j)} that tends to infinity as {j \rightarrow \infty}. For instance, one can find a sequence that grows like {O(j \log\log\log j)}. The density zero claim can be proven by a version of the Maier matrix method, and also follows from known moment estimates on the gaps between squarefree numbers; the latter claim is proven by a greedy construction in which one slowly imposes more and more congruence conditions on the sequence to ensure that various translates of the sequence stop being squarefree after a certain point.

Erdős also defined a somewhat complementary property {Q}, which asserts that for infinitely many {n}, all the elements {n+a} of {A} for {a \leq n} are square-free. Since the squarefree numbers themselves have density {6/\pi^2}, it is easy to see that a sequence with property {Q} must have (upper) density at most {6/\pi^2} (because it must be “admissible” in the sense of avoiding one residue class modulo {p^2} for each {p}). Erdős observed that any sufficiently rapidly growing (admissible) sequence would obey property {Q} but beyond that, Erdős writes “I have no precise information about the rate of increase a sequence having property Q must have.”. Our results in this direction may also be surprising: we show that there exist sequences with property {Q} with density exactly {6/\pi^2} (or equivalently, {a_j \sim \frac{\pi^2}{6} j}). This requires a recursive sieve construction, in which one starts with an initial scale {n} and finds a much larger number {n'} such that {n'+a} is squarefree for most of the squarefree numbers {a \leq n'} (and all of the squarefree numbers {a \leq n}). We quantify Erdős’s remark by showing that an (admissible) sequence will necessarily obey property {Q} once it grows significantly faster than {\exp( C j \log j)}, but need not obey this property if it only grows like {\exp(O(j^{1/2} \log^{1/2} j))}. This is achieved through further application of sieve methods.

A third property studied by Erdős is the property of having squarefree sums, so that {a_i + a_j} is squarefree for all {i,j}. Erdős writes, “In fact one can find a sequence which grows exponentially. Must such a sequence really increase so fast? I do not expect that there is such a sequence of polynomial growth.” Here our results are relatively weak: we can construct such a sequence that grows like {\exp(O(j \log j))}, but do not know if this is optimal; the best lower bound we can produce on the growth, coming from the large sieve, is {\gg j^{4/3}}. (Somewhat annoyingly, the precise form of the large sieve inequality we needed was not in the literature, so we have an appendix supplying it.) We suspect that further progress on this problem requires advances in inverse sieve theory.

A weaker property than squarefree sums (but stronger than property {Q}), referred to by Erdős as property {\overline{P}}, asserts that there are infinitely many {n} such that all elements of {n+A} (not just the small ones) are square-free. Here, the situation is close to, but not quite the same, as that for property {Q}; we show that sequences with property {\overline{P}} must have upper density strictly less than {6/\pi^2}, but can have density arbitrarily close to this value.

Finally, we looked at a further question of Erdős on the size of an admissible set {A}. Because the squarefree numbers are admissible, the maximum number {A(x)} of elements of an admissible set {A} up to {x} (OEIS A083544) is at least the number {|{\mathcal SF} \cap [x]|} of squarefree elements up to {x} (A013928). It was observed by Ruzsa that the former sequence is greater than the latter for infinitely many {x}. Erdős asked, “Probably this holds for all large x. It would be of some interest to estimate A(x) as accurately as possible.”

We are able to show

\displaystyle \frac{\sqrt{x}}{\log x} \ll A(x) - \frac{6}{\pi^2} x \ll x^{4/5},

with the upper bound coming from the large sieve and the lower bound from a probabilistic construction. In contrast, a classical result of Walfisz shows that

\displaystyle |{\mathcal SF} \cap [x]| - \frac{6}{\pi^2} x \ll x^{1/2} \exp(-c \log^{3/5} x / (\log\log x)^{1/5}).

Together, this implies that Erdős’s conjecture holds {A(x) > |{\mathcal SF} \cap [x]|} for all sufficiently large {x}. Numerically, it appears that in fact this conjecture holds for all {n>17}:

However, we do not currently have enough numerical data for the sequence {A(x)} to completely confirm the conjecture in all cases. This could potentially be a crowdsourced project (similar to the Erdős-Guy-Selfridge project reported on in this previous blog post).

Call for industry sponsors for IPAM’s RIPS program

24 November, 2025 in advertising | Tags: ipam | by Terence Tao | 6 comments

Over the last 25 years, the Institute for pure and applied mathematics (IPAM) at UCLA (where I am now director of special projects) has run the popular Research in Industrial Projects for Students (RIPS) program every summer, in which industry sponsors with research projects are matched with talented undergraduates and a postdoctoral mentor to work at IPAM on the sponsored project for nine weeks. IPAM is now putting out a call for industry sponsors who can contribute a suitable research project for the Summer of 2026, as well as funding to cover the costs of the research; details are available here.

(Student applications for this program will open at a later date, once the list of projects is finalized.)

Climbing the cosmic distance ladder: another sample chapter

21 November, 2025 in advertising, book, update | Tags: cosmic distance ladder, Tanya Klowden | by Terence Tao | 14 comments

Five years ago, I announced a popular science book project with Tanya Klowden on the cosmic distance ladder, in which we released a sample draft chapter of the book, covering the “fourth rung” of the ladder, which for us meant the distances to the planets. In the intervening time, a number of unexpected events have slowed down this project significantly; but I am happy to announce that we have completed a second draft chapter, this time on the “seventh rung” of measuring distances across the Milky Way, which required the maturation of the technologies of photography and spectroscopy, as well as the dawn of the era of “big data” in the early twentieth century, as exemplified for instance by the “Harvard computers“.

We welcome feedback of course, and are continuing to work to complete the book despite the various delays. In the mean time, you can check out our instagram account for the project, or the pair of videos that Grant Sanderson (3blue1brown) produced with us on this topic, which I previously blogged about here.

Thanks to Clio Cresswell, Riley Tao, and Noah Klowden for comments and corrections.

Sum-difference exponents for boundedly many slopes, and rational complexity

19 November, 2025 in math.CO, paper | Tags: additive combinatorics, AlphaEvolve, entropy, Kakeya conjecture | by Terence Tao | 11 comments

I have uploaded to the arXiv my paper “Sum-difference exponents for boundedly many slopes, and rational complexity“. This is the second spinoff of my previous project with Bogdan Georgiev, Javier Gómez–Serrano, and Adam Zsolt Wagner that I recently posted about. One of the many problems we experimented using the AlphaEvolve tool with was that of computing sum-difference constants. While AlphaEvolve did modest improve one of the known lower bounds on sum-difference constants, it also revealed an asymptotic behavior to these constants that had not been previously observed, which I then gave a rigorous demonstration of in this paper.

In the original formulation of the sum-difference problem, one is given a finite subset {E} of {{\bf R}^2} with some control on projections, such as

\displaystyle |\{ a: (a,b) \in E \}| \leq N

\displaystyle |\{ b: (a,b) \in E \}| \leq N

\displaystyle |\{ a+b: (a,b) \in E \}| \leq N

and one then asks to obtain upper bounds on the quantity

\displaystyle |\{ a-b: (a,b) \in E \}|. \ \ \ \ \ (1)

This is related to Kakeya sets because if one joins a line segment between {(a,0)} and {(b,1)} for every {(a,b) \in E}, one gets a family of line segments whose set of directions has cardinality (1), but whose slices at heights {0,1,1/2} have cardinality at most {N}.

Because {a-b} is clearly determined by {a} and {b}, one can trivially get an upper bound of {N^2} on (1). In 1999, Bourgain utilized what was then the very recent “Balog–Szemerédi–Gowers lemma” to improve this bound to {N^{2-\frac{1}{13}}}, which gave a new lower bound of {\frac{13d+12}{25}} on the (Minkowski) dimension of Kakeya sets in {{\bf R}^d}, which improved upon the previous bounds of Tom Wolff in high dimensions. (A side note: Bourgain challenged Tom to also obtain a result of this form, but when they compared notes, Tom obtained the slightly weaker bound of {N^{2-\frac{1}{14}}}, which gave Jean great satisfaction.) Currently, the best upper bound known for this quantity is {N^{2-\frac{1}{6}}}.

One can get better bounds by adding more projections. For instance, if one also assumes

\displaystyle |\{ a+2b: (a,b) \in E \}| \leq N

then one can improve the upper bound for (1) to {N^{2-\frac{1}{4}}}. The arithmetic Kakeya conjecture asserts that, by adding enough projections, one can get the exponent arbitrarily close to {1}. If one could achieve this, this would imply the Kakeya conjecture in all dimensions. Unfortunately, even with arbitrarily many projections, the best exponent we can reach asymptotically is {1.67513\dots}.

It was observed by Ruzsa that all of these questions can be equivalently formulated in terms of Shannon entropy. For instance, the upper bound {N^{2-\frac{1}{6}}} of (1) turns out to be equivalent to the entropy inequality

\displaystyle {\bf H}(X-Y) \leq (2-\frac{1}{6}) \max( {\bf H}(X), {\bf H}(Y) , {\bf H}(X+Y) )

holding for all discrete random variables {X, Y} (not necessarily independent) taking values in {{\bf R}^2}. In the language of this paper, we write this as

\displaystyle SD(\{0,1,\infty\}; -1) \leq 2-\frac{1}{6}.

Similarly we have

\displaystyle SD(\{0,1,2,\infty\}; -1) \leq 2-\frac{1}{4}.

As part of the AlphaEvolve experiments, we directed this tool to obtain lower bounds for {SD(\{0,1,\infty\}; \frac{a}{b})} for various rational numbers {\frac{a}{b}}, defined as the best constant in the inequality

\displaystyle {\bf H}(X+\frac{a}{b} Y) \leq SD(\{0,1,\infty\}; \frac{a}{b}) \max( {\bf H}(X), {\bf H}(Y) , {\bf H}(X+Y) ).

We did not figure out a way for AlphaEvolve to efficiently establish upper bounds on these quantities, so the bounds provided by AlphaEvolve were of unknown accuracy. Nevertheless, they were sufficient to give a strong indication that these constants decayed logarithmically to {2} as {a+b \rightarrow \infty}:

The first main result of this paper is to confirm that this is indeed the case, in that

\displaystyle 2 - \frac{c_2}{\log(2+|a|+|b|)} \leq SD(\{0,1,\infty\}; \frac{a}{b}) \leq 2 - \frac{c_1}{\log(2+|a|+|b|)}

whenever {a/b} is in lowest terms and not equal to {0, 1, \infty}, where {c_1,c_2>0} are absolute constants. The lower bound was obtained by observing the shape of the examples produced by AlphaEvolve, which resembled a discrete Gaussian on a certain lattice determined by {a,b}. The upper bound was established by an application of the “entropic Plünnecke–Ruzsa calculus”, relying particularly on the entropic Ruzsa triangle inequality, the entropic Balog–Szemerédi–Gowers lemma, as well as an entropy form of an inequality of Bukh.

The arguments also apply to settings where there are more projections under control than just the {0,1,\infty} projections. If one also controls projections {X + r_i Y} for various rationals {r_1,\dots,r_k} and {R} denotes the set of slopes of the projections under control, then it turns out that the associated sum-difference constant {SD(\{0,1,\infty,r_1,\dots,r_k\}; s)} still decays to {2}, but now the key parameter is not the height {|a|+|b|} of {s}, but rather what I call the rational complexity of {s} with respect to {R}, defined as the smallest integer {D} for which one can write {s} as a ratio {P(r_1,\dots,r_k)/Q(r_1,\dots,r_k)} where {P,Q} are integer-coefficient polynomials of degree at most {D} and coefficients at most {2^D}. Specifically, {SD(\{0,1,\infty,r_1,\dots,r_k\}; s)} decays to {2} at a polynomial rate in {D}, although I was not able to pin down the exponent of this decay exactly. The concept of rational complexity may seem somewhat artificial, but it roughly speaking measures how difficult it is to use the entropic Plünnecke–Ruzsa calculus to pass from control of {X, Y, X+Y}, and {X+r_i Y} to control of {X+sY}.

While this work does not make noticeable advances towards the arithmetic Kakeya conjecture (we only consider regimes where the sum-difference constant is close to {2}, rather than close to {1}), it does highlight the fact that these constants are extremely arithmetic in nature, in that the influence of projections {X+r_iY} on {X+sY} is highly dependent on how efficiently one can represent {s} as a rational combination of the {r_i}.

New Nikodym set constructions over finite fields

12 November, 2025 in math.CO, paper | Tags: AlphaEvolve, finite fields, Nikodym set, probabilistic method | by Terence Tao | 14 comments

I have uploaded to the arXiv my paper “New Nikodym set constructions over finite fields“. This is a spinoff of my previous project with Bogdan Georgiev, Javier Gómez–Serrano, and Adam Zsolt Wagner that I recently posted about. In that project we experimented with using AlphaEvolve (and other tools, such as DeepThink and AlphaProof) to explore various mathematical problems which were connected somehow to an optimization problem. For one of these — the finite field Nikodym set problem — these experiments led (by a somewhat convoluted process) to an improved asymptotic construction of such sets, the details of which are written up (by myself rather than by AI tools) in this paper.

Let {{\mathbb F}_q} be a finite field of some order {q} (which must be a prime or a power of a prime), and let {d} be a fixed dimension. A Nikodym set in {{\mathbb F}_q^d} is a subset {N} of {{\mathbb F}_q^d} with the property that for every point {x \in {\mathbb F}_q^d}, there exists a line {\ell} passing through {x} such that all points of {\ell} other than {x} lie in {N}. Such sets are close cousins of Kakeya sets (which contain a line in every direction); indeed, roughly speaking, applying a random projective transformation to a Nikodym set will yield (most of) a Kakeya set. As a consequence, any lower bound on Kakeya sets implies a similar bound on Nikodym sets; in particular, one has a lower bound

\displaystyle |N| \geq \frac{q^d}{2^{d-1}} + O(q^{d-1})

on the size of a Nikodym set {N}, coming from a similar bound on Kakeya sets due to Bukh and Chao using the polynomial method.

For Kakeya sets, Bukh and Chao showed this bound to be sharp up to the lower order error {O(q^{d-1})}; but for Nikodym sets it is conjectured that in fact such sets should asymptotically have full density, in the sense that

\displaystyle |N| \geq q^d - o(q^d).

This is known in two dimensions thanks to work by Szönyi et al. on blocking sets, and was also established in bounded torsion cases (and in particular for even {q}) by Guo, Kopparty, and Sudan by combining the polynomial method with the theory of linear codes. But in other cases this conjecture remains open in three and higher dimensions.

In our experiments we focused on the opposite problem of constructing Nikodym sets of size as small as possible. In the plane {d=2}, constructions of size

\displaystyle |N| = q^2 - q^{3/2} + O(q \log q) \ \ \ \ \ (1)

when {q} is a perfect square were constructed by Blokhuis et al, again using the theory of blocking sets; by taking Cartesian products of such sets, one can also make similar constructions in higher dimensions, again assuming {q} is a perfect square. Apart from this, though, there are few such constructions in the literature.

We set AlphaEvolve to try to optimize the three dimensional problem with a variable field size {q} (which we took to be prime for simplicity), with the intent to get this tool to come up with a construction that worked asymptotically for large {q}, rather than just for any fixed value of {q}. After some rounds of evolution, it arrived at a construction which empirically had size about {q^3 - 8q^2}. Inspecting the code, it turned out that AlphaEvolve had constructed a Nikodym set {N} by (mostly) removing eight low-degree algebraic surfaces (all of the form {\{ (x,y,x^i y)\}} for various {i}). We used the tool DeepThink to confirm the Nikodym property and to verify the construction, and then asked it to generalize the method. By removing many more than eight surfaces, and using some heuristic arguments based on the Chebotarev density theorem, DeepThink claimed a construction of size

\displaystyle |N| = q^3 - 2 q^2 \log q + o(q^2 \log q) \ \ \ \ \ (2)

formed by removing several higher degree surfaces, but it acknowledged that the arguments were non-rigorous.

The arguments can be sketched here as follows. Let {V} be a random surface of degree {D}, and let {x} be a point in {{\mathbb F}_q^3} which does not lie in {V}. A random line through {x} then meets {V} in a number of points, which is basically the set of zeroes in {{\mathbb F}_q} of a random polynomial of degree {D}. The (function field analogue of the) Chebotarev density theorem predicts that the probability that this polynomial has no roots in {{\mathbb F}_q} is about {\delta_D}, where

\displaystyle \delta_D = 1 - \frac{1}{1!} + \frac{1}{2!} - \dots + \frac{(-1)^D}{D!}

is the proportion of permutations on {D} elements that are derangements (no fixed points). So, if one removes {k} random surfaces of degrees {D_1,\dots,D_k}, the probability that a random line avoids all of these surfaces is about {\delta_{D_1} \dots \delta_{D_k}}. If this product is significantly greater than {1/q^2}, then the law of large numbers (and concentration of measure) then predicts (with high probability) that out of the {\sim q^2} lines through {x}, at least one will avoid the removed surfaces, thus giving (most of) a Nikodym set. The Lang-Weil estimate predicts that each surface has cardinality about {q^2}, so this should give a Nikodym set of size about {q^3 - kq^2}.

DeepThink took the degrees {D_1,\dots,D_k} to be large, so that the derangement probabilities {\delta_{D_i}} were close to {1/e}. This led it to predict that {k} could be taken to be as large as {2 \log q}, leading to the claimed bound (2). However, on inspecting this argument we realized that these moderately high degree surfaces were effectively acting as random sets, so one could dramatically simplify DeepThink’s argument by simply taking {N} to be a completely random set of the desired cardinality (2), in which case the verification of the Nikodym set property (with positive probability) could be established by a standard Chernoff bound-type argument (actually, I ended up using Bennett’s inequality rather than Chernoff’s inequality, but this is a minor technical detail).

On the other hand, the derangement probabilities {\delta_D} oscillate around {1/e}, and in fact are as large as {1/2} when {D=2}. This suggested that one could do better than the purely random construction if one only removed quadratic surfaces instead of higher degree surfaces, and heuristically predicted the improvement

\displaystyle |N| = q^3 - \frac{2}{\log 2} q^2 \log q + o(q^2 \log q). \ \ \ \ \ (3)

However, our experiments with both AlphaEvolve and DeepThink to try to make this idea work either empirically, heuristically, or rigorously were all unsuccessful! Eventually Deepthink discovered the problem: random quadratic polynomials often had two or zero roots (depending on whether the discriminant was a non-zero quadratic residue, or a nonresidue), but would only very rarely have just one root (the discriminant would have to vanish). As a consequence, if {x} happened to lie on one of the removed quadratic surfaces {V}, it was extremely likely that most lines through {x} would intersect {V} in a further point; only the small minority of lines that were tangent to {V} and {x} would avoid this. None of the AI tools we tried were able to overcome this obstacle.

However, I realized that one could repair the construction by adding back a small random portion of the removed quadratic surfaces, to allow for a non-zero number of lines through {x} to stay inside the putative Nikodym set even when {x} was in one of the surfaces {V}, and the line was not tangent to {V}. Pursuing this idea, and performing various standard probabilistic calculations and projective changes of variable, the problem essentially reduced to the following: given {k} random quadratic polynomials in the plane {{\mathbb F}_q^2}, is it true that these polynomials simultaneously take quadratic residue values for {\gg 2^{-k}} of the points in that plane? Heuristically this should be true even for {2^{-k}} close to {1/q^2}. However, it proved difficult to accurately control this simultaneous quadratic residue event; standard algebraic geometry tools such as the Weil conjectures seemed to require some vanishing of étale cohomology groups in order to obtain adequate error terms, and this was not something I was eager to try to work out. However, by exploiting projective symmetry (and the {2}-transitive nature of the projective linear group), I could get satisfactory control of such intersections as long as {2^{-k}} was a little bit larger than {1/q} rather than {1/q^2}. This gave an intermediate construction of size

\displaystyle |N| = q^3 - (\frac{1}{\log 2} + 1) q^2 \log q + o(q^2 \log q),

which still beat the purely random construction, but fell short of heuristic predictions. This argument (generalized to higher dimensions) is what is contained in the paper. I pose the question of locating a construction with the improved bound (3) (perhaps by some modification of the strategy of removing quadratic varieties) as an open question.

We also looked at the two-dimensional case to see how well AlphaEvolve could recover known results, in the case that {q} was a perfect square. It was able to come up with a construction that was slightly worse than the best known construction, in which one removed a large number of parabolas from the plane; after manually optimizing the construction we were able to recover the known bound (1). This final construction is somewhat similar to existing constructions (it has a strong resemblance to a standard construction formed by taking the complement of a Hermitian unital), but is still technically a new construction, so we have also added it to this paper.

Mathematical exploration and discovery at scale

5 November, 2025 in math.CA, math.CO, math.MG, paper | Tags: Adam Zsolt Wagner, AlphaEvolve, Artificial Intelligence, Bogdan Georgiev, Javier Gomez-Serrano, optimization | by Terence Tao | 63 comments

Bogdan Georgiev, Javier Gómez-Serrano, Adam Zsolt Wagner, and I have uploaded to the arXiv our paper “Mathematical exploration and discovery at scale“. This is a longer report on the experiments we did in collaboration with Google Deepmind with their AlphaEvolve tool, which is in the process of being made available for broader use. Some of our experiments were already reported on in a previous white paper, but the current paper provides more details, as well as a link to a repository with various relevant data such as the prompts used and the evolution of the tool outputs.

AlphaEvolve is a variant of more traditional optimization tools that are designed to extremize some given score function over a high-dimensional space of possible inputs. A traditional optimization algorithm might evolve one or more trial inputs over time by various methods, such as stochastic gradient descent, that are intended to locate increasingly good solutions while trying to avoid getting stuck at local extrema. By contrast, AlphaEvolve does not evolve the score function inputs directly, but uses an LLM to evolve computer code (often written in a standard language such as Python) which will in turn be run to generate the inputs that one tests the score function on. This reflects the belief that in many cases, the extremizing inputs will not simply be an arbitrary-looking string of numbers, but will often have some structure that can be efficiently described, or at least approximated, by a relatively short piece of code. The tool then works with a population of relatively successful such pieces of code, with the code from one generation of the population being modified and combined by the LLM based on their performance to produce the next generation. The stochastic nature of the LLM can actually work in one’s favor in such an evolutionary environment: many “hallucinations” will simply end up being pruned out of the pool of solutions being evolved due to poor performance, but a small number of such mutations can add enough diversity to the pool that one can break out of local extrema and discover new classes of viable solutions. The LLM can also accept user-supplied “hints” as part of the context of the prompt; in some cases, even just uploading PDFs of relevant literature has led to improved performance by the tool. Since the initial release of AlphaEvolve, similar tools have been developed by others, including OpenEvolve, ShinkaEvolve and DeepEvolve.

We tested this tool on a large number (67) of different mathematics problems (both solved and unsolved) in analysis, combinatorics, and geometry that we gathered from the literature, and reported our outcomes (both positive and negative) in this paper. In many cases, AlphaEvolve achieves similar results to what an expert user of a traditional optimization software tool might accomplish, for instance in finding more efficient schemes for packing geometric shapes, or locating better candidate functions for some calculus of variations problem, than what was previously known in the literature. But one advantage this tool seems to offer over such custom tools is that of scale, particularly when when studying variants of a problem that we had already tested this tool on, as many of the prompts and verification tools used for one problem could be adapted to also attack similar problems; several examples of this will be discussed below. The following graphic illustrates the performance of AlphaEvolve on this body of problems:

Another advantage of AlphaEvolve was ~~(削除) robustness (削除ここまで)~~ adaptability: it was relatively easy to set up AlphaEvolve to work on a broad array of problems, without extensive need to call on domain knowledge of the specific task in order to tune hyperparameters. In some cases, we found that making such hyperparameters part of the data that AlphaEvolve was prompted to output was better than trying to work out their value in advance, although a small amount of such initial theoretical analysis was helpful. For instance, in calculus of variation problems, one is often faced with the need to specify various discretization parameters in order to estimate a continuous integral, which cannot be computed exactly, by a discretized sum (such as a Riemann sum), which can be evaluated by computer to some desired precision. We found that simply asking AlphaEvolve to specify its own discretization parameters worked quite well (provided we designed the score function to be conservative with regards to the possible impact of the discretization error); see for instance this experiment in locating the best constant in functional inequalities such as the Hausdorff-Young inequality.

A third advantage of AlphaEvolve over traditional optimization methods was the interpretability of many of the solutions provided. For instance, in one of our experiments we sought to find an extremum to a functional inequality such as the Gagliardo–Nirenberg inequality (a variant of the Sobolev inequality). This is a relatively well-behaved optimization problem, and many standard methods can be deployed to obtain near-optimizers that are presented in some numerical format, such as a vector of values on some discretized mesh of the domain. However, when we applied AlphaEvolve to this problem, the tool was able to discover the exact solution (in this case, a Talenti function), and create code that sampled from that function on a discretized mesh to provide the required input for the scoring function we provided (which only accepted discretized inputs, due to the need to compute the score numerically). This code could be inspected by humans to gain more insight as to the nature of the optimizer. (Though in some cases, AlphaEvolve’s code would contain some brute force search, or a call to some existing optimization subroutine in one of the libraries it was given access to, instead of any more elegant description of its output.)

For problems that were sufficiently well-known to be in the training data of the LLM, the LLM component of AlphaEvolve often came up almost immediately with optimal (or near-optimal) solutions. For instance, for variational problems where the gaussian was known to be the extremizer, AlphaEvolve would frequently guess a gaussian candidate during one of the early evolutions, and we would have to obfuscate the problem significantly to try to conceal the connection to the literature in order for AlphaEvolve to experiment with other candidates. AlphaEvolve would also propose similar guesses for other problems for which the extremizer was not known. For instance, we tested this tool on the sum-difference exponents of relevance to the arithmetic Kakeya conjecture, which can be formulated as a variational entropy inequality concerning certain two-dimensional discrete random variables. AlphaEvolve initially proposed some candidates for such variables based on discrete gaussians, which actually worked rather well even if they were not the exact extremizer, and already generated some slight improvements to previous lower bounds on such exponents in the literature. Inspired by this, I was later able to rigorously obtain some theoretical results on the asymptotic behavior on such exponents in the regime where the number of slopes was fixed, but the “rational complexity” of the slopes went to infinity; this will be reported on in a separate paper.

Perhaps unsurprisingly, AlphaEvolve was extremely good at locating “exploits” in the verification code we provided, for instance using degenerate solutions or overly forgiving scoring of approximate solutions to come up with proposed inputs that technically achieved a high score under our provided code, but were not in the spirit of the actual problem. For instance, when we asked it (link under construction) to find configurations to extremal geometry problems such as locating polygons with each vertex having four equidistant other vertices, we initially coded the verifier to accept distances that were equal only up to some high numerical precision, at which point AlphaEvolve promptly placed many of the points in virtually the same location so that the distances they determined were indistinguishable. Because of this, a non-trivial amount of human effort needs to go into designing a non-exploitable verifier, for instance by working with exact arithmetic (or interval arithmetic) instead of floating point arithmetic, and taking conservative worst-case bounds in the presence of uncertanties in measurement to determine the score. For instance, in testing AlphaEvolve against the “moving sofa” problem and its variants, we designed a conservative scoring function that only counted those portions of the sofa that we could definitively prove to stay inside the corridor at all times (not merely the discrete set of times provided by AlphaEvolve to describe the sofa trajectory) to prevent it from exploiting “clipping” type artefacts. Once we did so, it performed quite well, for instance rediscovering the optimal “Gerver sofa” for the original sofa problem, and also discovering new sofa designs for other problem variants, such as a 3D sofa problem.

For well-known open conjectures (e.g., Sidorenko’s conjecture, Sendov’s conjecture, Crouzeix’s conjecture, the ovals problem, etc.), AlphaEvolve generally was able to locate the previously known candidates for optimizers (that are conjectured to be optimal), but did not locate any stronger counterexamples: thus, we did not disprove any major open conjecture. Of course, one obvious possible explanation for this is that these conjectures are in fact true; outside of a few situations where there is a matching “dual” optimization problem, AlphaEvolve can only provide one-sided bounds on such problems and so cannot definitively determine if the conjectural optimizers are in fact the true optimizers. Another potential explanation is that AlphaEvolve essentially tried all the “obvious” constructions that previous researchers working on these problems had also privately experimented with, but did not report due to the negative findings. However, I think there is at least value in using these tools to systematically record negative results (roughly speaking, that a search for “obvious” counterexamples to a conjecture did not disprove the claim), which currently only exist as “folklore” results at best. This seems analogous to the role LLM Deep Research tools could play by systematically recording the results (both positive and negative) of automated literature searches, as a supplement to human literature review which usually reports positive results only. Furthermore, when we shifted attention to less well studied variants of famous conjectures, we were able to find some modest new observations. For instance, while AlphaEvolve only found the standard conjectural extremizer {z^n-1} to Sendov’s conjecture, as well as for variants such as Borcea’s conjecture, Schmeisser’s conjecture, or Smale’s conjecture it did reveal some potential two-parameter extensions to a conjecture of de Bruin and Sharma that had not previously been stated in the literature. (For this problem, we were not directly optimizing some variational scalar quantity, but rather a two-dimensional range of possible values, which we could adapt the AlphaEvolve framework to treat). In the future, I can imagine such tools being a useful “sanity check” when proposing any new conjecture, in that it will become common practice to run one of these tools against such a conjecture to make sure there are no “obvious” counterexamples (while keeping in mind that this is still far from conclusive evidence in favor of such a conjecture).

AlphaEvolve did not perform equally well across different areas of mathematics. When testing the tool on analytic number theory problems, such as that of designing sieve weights for elementary approximations to the prime number theorem, it struggled to take advantage of the number theoretic structure in the problem, even when given suitable expert hints (although such hints have proven useful for other problems). This could potentially be a prompting issue on our end, or perhaps the landscape of number-theoretic optimization problems is less amenable to this sort of LLM-based evolutionary approach. On the other hand, AlphaEvolve does seem to do well when the constructions have some algebraic structure, such as with the finite field Kakeya and Nikodym set problems, which we will turn to shortly.

For many of our experiments we worked with fixed-dimensional problems, such as trying to optimally pack {n} shapes in a larger shape for a fixed value of {n}. However, we found in some cases that if we asked AlphaEvolve to give code that took parameters such as {n} as input, and tested the output of that code for a suitably sampled set of values of {n} of various sizes, then it could sometimes generalize the constructions it found for small values of this parameter to larger ones; for instance, in the infamous sixth problem of this year’s IMO, it could use this technique to discover the optimal arrangement of tiles, which none of the frontier models could do at the time (although AlphaEvolve has no capability to demonstrate that this arrangement was, in fact, optimal). Another productive use case of this technique was for finding finite field Kakeya and Nikodym sets of small size in low-dimensional vector spaces over finite fields of various sizes. For Kakeya sets in {{\mathbf F}_q^d}, it located the known optimal construction based on quadratic residues in two dimensions, and very slightly beat (by an error term of size {O(q)}) the best construction in three dimensions; this was an algebraic construction (still involving quadratic residues) discovered empirically that we could then prove to be correct by first using Gemini’s “Deep Think” tool to locate an informal proof, which we could then convert into a formalized Lean proof by using Google Deepmind’s “AlphaProof” tool. At one point we thought it had found a construction in four dimensions which achieved a more noticeable improvement (of order {O(q^3)}) of what we thought was the best known construction, but we subsequently discovered that essentially the same construction had appeared already in a paper of Bukh and Chao, although it still led to a more precise calculation of the error term (to accuracy {O(q^{3/2})} rather than {O(q^2)}, where the error term now involves the Lang-Weil inequality and is unlikely to have a closed form). Perhaps AlphaEvolve had somehow absorbed the Bukh-Chao construction within its training data to accomplish this. However, when we tested the tool on Nikodym sets (which are expected to have asymptotic density {1}, although this remains unproven), it did find some genuinely new constructions of such sets in three dimensions, based on removing quadratic varieties from the entire space. After using “Deep Think” again to analyze these constructions, we found that they were inferior to a purely random construction (which in retrospect was an obvious thing to try); however, they did inspire a hybrid construction in which one removed random quadratic varieties and performed some additional cleanup, which ends up outperforming both the purely algebraic and purely random constructions. This result (with completely human-generated proofs) will appear in a subsequent paper.

What's new

The maximal length of the Erdős–Herzog–Piranian lemniscate in high degree

The Equational Theories Project: Advancing Collaborative Mathematical Research at Scale

The story of Erdős problem #1026

Quantitative correlations and some problems on prime factors of consecutive integers

Growth rates of sequences governed by the squarefree properties of its translates

Call for industry sponsors for IPAM’s RIPS program

Climbing the cosmic distance ladder: another sample chapter

Sum-difference exponents for boundedly many slopes, and rational complexity

New Nikodym set constructions over finite fields

Mathematical exploration and discovery at scale

Recent Comments

Top Posts

Archives

Categories

RSS The Polymath Blog

For commenters