An artificial-intelligence (AI) tool can rigorously prove whether geometric facts — statements about two-dimensional shapes such as triangles or polygons — are true, just as well as competitors in the International Mathematical Olympiad, a competition for school students. The system, called AlphaGeometry, was described in *Nature* on 17 January^{1}.

When tested on a set of 30 geometry problems from the International Mathematical Olympiad (IMO), AlphaGeometry could solve 25. This is approaching the performance of the competitions’ gold medallists — at least in geometry. Kevin Buzzard, a mathematician at Imperial College London, points out that geometry is only one of the disciplines that IMO contestants must excel at, and that it might be a lot harder for computers to get equally good at solving IMO problems in other areas, such as number theory. Still, Buzzard says, “the fact that they’ve managed to solve 25 IMO problems, and furthermore can generate human-readable proofs, is very impressive”.

## Language barrier

Table of Contents

Researchers have experimented with solving various mathematical problems using large language models — AI programs that learn by training on large collections of text, and that power the most popular chatbots. The results have been encouraging but often nonsensical.

For example, a maths-oriented chatbot called Minerva, developed by Google for research purposes, was trained using advanced maths papers and solutions to basic school-level maths problems written by people. Although the numerical solutions that Minerva arrives at are usually correct, the text that it produces to explain its reasoning is often faulty — and needs to be checked by humans. “If a system is trained in natural language, it’s going to be outputting natural language, which we cannot trust,” says Trieu Trinh, a computer scientist at Google DeepMind in Mountain View, California, and a co-author of the *Nature* paper.

DeepMind’s AI helps untangle the mathematics of knots

Instead of using natural language, Trinh and his collaborators developed a language for writing geometry proofs that has a rigid syntax similar to that of a computer programming language. Its answers can therefore be checked easily by a computer, while still making sense to humans.

The team focused on problems in Euclidean geometry in which the goal is to write a mathematical proof of a given statement. They embedded into their custom language several dozen basic rules of geometry, such as ‘if one straight line intersects a second straight line, it will also intersect a line that is parallel to the second line’.

They then wrote a program to automatically generate 100 million ‘proofs’. Essentially, these consisted of random sequences of simple but logically unassailable steps — such as “given two points A and B, construct the square ABCD”.

AlphaGeometry was trained on these machine-generated proofs. This meant that the AI was able to solve problems by guessing one step after the other, in the same way as chatbots produce text. But it also meant that its output was machine-readable and easy to check for accuracy. For every problem, AlphaGeometry generated many attempts at a solution. Because the AI could automatically weed out the incorrect ones, it was able to reliably produce correct results, including to geometry problems from the IMO.

## Combined approach

AlphaGeometry combines the brute-force statistical guesses of language models with symbolic reasoning, which might be the reason for its strength, says Stephan Schulz, a computer scientist at Baden-Württemberg Cooperative State University in Stuttgart, Germany. This overall approach is promising, he adds.

DeepMind AI invents faster algorithms to solve tough maths puzzles

Buzzard says that training AlphaGeometry on synthetic data also removes one of the big potential pitfalls in applying typical large language models to maths: cheating. When a neural network is trained on billions of texts scraped from the Internet, he says, “it is phenomenally hard to guarantee that the model has not seen the question before”.

Training AlphaGeometry with automatically generated proofs also had another purpose, says study co-author He He, a computer scientist at New York University. “We want to build such a system without human data, so that it may surpass human ability some day,” she says.

Still, mathematicians’ jobs will probably be safe for a while longer. “I can imagine in a few years’ time that these or other techniques in machine learning may be solving mathematics problems at undergraduate level which only the smartest undergraduates can solve,” says Buzzard. “But right now, I have seen no evidence of machines autonomously engaging with modern research-level mathematics.”