Inside AlphaGeometry: Google DeepMind’s New Model that Solves Geometry Problems Like a Math Olympiad Gold-Medalist
Last Updated on January 25, 2024 by Editorial Team
Author(s): Jesus Rodriguez
Originally published on Towards AI.
I recently started an AI-focused educational newsletter that already has over 160,000 subscribers. TheSequence is a no-BS (meaning no hype, no news, etc.) ML-oriented newsletter that takes 5 minutes to read. The goal is to keep you up to date with machine learning projects, research papers, and concepts. Please give it a try by subscribing below:
TheSequence
thesequence.substack.com
A few months ago, the AIMO Prize was announced: a $10 million award for the first AI model that can win a gold medal in the International Mathematical Olympiad (IMO). The IMO is an elite high-school competition in which each participating country sends a team of up to six students, who tackle six problems over two days with a four-and-a-half-hour limit each day. Some of the most renowned mathematicians of the past few decades were IMO medalists. Geometry, one of the most important and hardest parts of the IMO, combines visual and mathematical reasoning, and we might intuitively expect it to be the hardest type of problem for AI models to solve.
Well, not anymore.
Last week, Google DeepMind published a paper unveiling AlphaGeometry, a model capable of solving geometry problems at the level of an IMO gold medalist. The key idea behind AlphaGeometry is the combination of two complementary components: a neural language model and a symbolic deduction engine. Together, they tackle complex geometry theorems, bringing a unique approach to problem-solving.
Solving geometry problems at an IMO level is as complicated a task as you can find in AI these days. These problems often involve intricate diagrams that require additional geometric elements, such as points, lines, or circles, to reach a solution. The neural language model of AlphaGeometry excels in this area: it scans these diagrams and intuitively predicts which new constructs might prove most beneficial. Its strength lies in rapidly generating ideas from an endless array of possibilities.
One of the main challenges in solving IMO-type geometry problems is the scarcity of available training data. In similar situations, synthetic data generation methods have proven incredibly helpful. But can you really generate high-quality synthetic data for geometry?
Synthetic Data Generation
AlphaGeometry employs a novel approach to generating synthetic data for theorem proving. This process starts with the creation of a diverse set of random theorem premises. These premises are the initial inputs for the system’s symbolic deduction engine, which plays a pivotal role in generating new conclusions from these inputs.
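To make the sampling step concrete, here is a minimal sketch of what drawing random theorem premises might look like. The construction templates, their arities, and the naming scheme are invented for illustration; they do not reflect AlphaGeometry's actual domain-specific language.

```python
import random

# Hypothetical premise "grammar": each entry is a construction template
# (names and arities are illustrative, not AlphaGeometry's actual DSL).
TEMPLATES = [
    ("midpoint", 2),   # new point defined as the midpoint of two points
    ("foot", 3),       # foot of the perpendicular from a point to a line
    ("circumcenter", 3),  # center of the circle through three points
]

def sample_premises(n_points=4, n_constructions=3, seed=0):
    """Sample a random sequence of construction premises."""
    rng = random.Random(seed)
    points = [chr(ord("A") + i) for i in range(n_points)]
    premises = []
    for _ in range(n_constructions):
        name, arity = rng.choice(TEMPLATES)
        args = rng.sample(points, arity)       # existing points to build from
        new_pt = chr(ord("A") + len(points))   # fresh point name
        points.append(new_pt)
        premises.append((new_pt, name, tuple(args)))
    return premises

print(sample_premises())
```

Each sampled premise introduces a fresh point defined in terms of existing ones, so the premise set always describes a constructible figure that the deduction engine can then reason over.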
The symbolic deduction engine operates by applying forward inference rules to these premises. As it processes the data, it constructs a directed acyclic graph. This graph is a structured representation where each node represents a conclusion reached by the engine. These nodes are interconnected, with each one linking back to the premises that led to its conclusion. This structure is essential for tracing the logical steps taken to reach each conclusion.
For any given conclusion within this graph, AlphaGeometry can perform a traceback operation. This process involves tracing back through the graph from a specific conclusion to identify the subset of premises that directly contributed to it. The result of this traceback is a subgraph that clearly outlines the dependencies and reasoning path leading to the conclusion.
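The deduce-then-traceback mechanic can be sketched with a toy forward-chaining engine over symbolic facts. The fact strings and the two rules below are illustrative stand-ins for the paper's geometric inference rules, not the real rule set.

```python
# Toy forward-chaining engine with parent links for traceback.
def forward_chain(premises, rules):
    """premises: set of facts; rules: list of (frozenset_of_antecedents, conclusion)."""
    parents = {p: set() for p in premises}  # fact -> facts it was derived from
    derived = set(premises)
    changed = True
    while changed:
        changed = False
        for antecedents, conclusion in rules:
            if antecedents <= derived and conclusion not in derived:
                derived.add(conclusion)
                parents[conclusion] = set(antecedents)
                changed = True
    return parents  # a DAG: edges run from each conclusion back to its premises

def traceback(parents, conclusion):
    """Return the subset of original premises supporting `conclusion`."""
    needed, stack = set(), [conclusion]
    while stack:
        fact = stack.pop()
        if parents[fact]:            # derived fact: recurse into its parents
            stack.extend(parents[fact])
        else:                        # original premise: keep it
            needed.add(fact)
    return needed

rules = [
    (frozenset({"AB=AC"}), "angB=angC"),             # isosceles base angles
    (frozenset({"angB=angC", "BM=MC"}), "AM⊥BC"),    # toy composite rule
]
parents = forward_chain({"AB=AC", "BM=MC", "AD∥BC"}, rules)
# The needed premises are AB=AC and BM=MC; AD∥BC is irrelevant to this goal.
print(traceback(parents, "AM⊥BC"))
```

Note how the unused premise `AD∥BC` drops out of the traceback: this is exactly the pruning that produces a minimal problem statement for each synthetic conclusion.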
An interesting aspect of this approach is how it deals with seemingly irrelevant elements in the construction of a theorem. For instance, in a synthetic problem, certain points or lines might appear to be unrelated to the main construction at first glance. However, AlphaGeometry learns to recognize these elements as auxiliary constructions. These are elements that, while not directly contributing to the theorem’s main premise, play a role in the broader problem-solving context. This learning aspect of the system ensures a comprehensive understanding of geometric constructions, including both central and peripheral elements.
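Under this framing, labeling auxiliary constructions reduces to a bookkeeping step: premises from the traceback whose objects never appear in the goal statement get relabeled as auxiliary. A hypothetical sketch, where the fact strings and the `objects_of` mapping are invented for illustration:

```python
def classify_premises(needed_premises, goal_objects, objects_of):
    """Premises whose objects all appear in the goal stay in the problem
    statement; the rest are relabeled as auxiliary constructions."""
    statement, auxiliary = [], []
    for p in needed_premises:
        (statement if objects_of(p) <= goal_objects else auxiliary).append(p)
    return statement, auxiliary

# Toy usage: the goal mentions A, B, C, M; the premise about D becomes auxiliary.
objs = {"AB=AC": {"A", "B", "C"}, "BM=MC": {"B", "M", "C"}, "D on circle": {"D"}}
print(classify_premises(list(objs), {"A", "B", "C", "M"}, objs.get))
# (['AB=AC', 'BM=MC'], ['D on circle'])
```

Training on pairs of (pruned statement, auxiliary constructions) is what lets the language model later suggest the "extra" points and lines a human would have to invent.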
The Architecture
As mentioned before, AlphaGeometry relies on an AI system that combines a neural language model with a symbolic deduction engine. This unique pairing enables it to tackle the challenge of solving complex geometric theorems. The system operates in a dual-mode: the neural language model quickly generates intuitive ideas, while the symbolic deduction engine engages in thoughtful, logical reasoning.
The neural language model in AlphaGeometry is adept at recognizing patterns and relationships within data. This ability allows it to rapidly suggest potential geometric constructs that could be relevant to a problem. However, this model doesn’t specialize in detailed, rigorous reasoning or in providing explanations for its suggestions.
On the other side, the symbolic deduction engine is grounded in formal logic. It follows established rules to systematically work towards conclusions. This part of AlphaGeometry is thorough and transparent in its processes, but it can be somewhat slower, especially when faced with large and intricate problems.
Together, these two components of AlphaGeometry synergize to approach geometry problems effectively. In Olympiad geometry problems, which often involve complex diagrams, the neural language model proposes new geometric elements that might be necessary for solving the problem, such as additional points, lines, or circles. The symbolic deduction engine then takes these suggestions and applies logical reasoning to further explore the diagram, gradually moving closer to a solution. This collaborative process allows AlphaGeometry to navigate the intricacies of geometric theorems with a balance of speed and precision.
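The alternating neural-symbolic loop described above can be sketched as follows. Here `propose_construction` is a stub standing in for the neural language model, and the single rule is a toy example, not the paper's actual deduction machinery.

```python
def deductive_closure(facts, rules):
    """Exhaustively apply forward rules of the form (antecedents, conclusion)."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for antecedents, conclusion in rules:
            if antecedents <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived

def solve(premises, goal, rules, propose_construction, max_constructions=10):
    facts = set(premises)
    for _ in range(max_constructions + 1):
        facts = deductive_closure(facts, rules)    # symbolic step: deduce all we can
        if goal in facts:
            return True                            # proof found
        suggestion = propose_construction(facts)   # neural step (stubbed here)
        if suggestion is None:
            return False                           # the "model" is out of ideas
        facts.add(suggestion)                      # add the auxiliary construction
    return False

# Toy run: the goal is only reachable after the "model" adds a midpoint.
rules = [(frozenset({"AB=AC", "M midpoint BC"}), "AM⊥BC")]
proposals = iter(["M midpoint BC"])
ok = solve({"AB=AC"}, "AM⊥BC", rules, lambda facts: next(proposals, None))
print(ok)  # True
```

The design choice this illustrates: the expensive, exhaustive symbolic search runs only between construction suggestions, so the language model is consulted exactly when pure deduction stalls.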
The Results
To understand the difficulty of solving IMO geometry problems, let’s look at the following chart that showcases the number of problems solved by IMO gold medalists:
In a recent benchmark named IMO-AG-30, AlphaGeometry demonstrated exceptional problem-solving abilities in geometric theorem proving. Ten different solvers, eight of them search-based, tackled the benchmark's 30 olympiad geometry problems, and AlphaGeometry emerged as the top performer.
AlphaGeometry’s success was notable, solving 25 of the 30 problems in the set. This result was significantly higher than the previous best-known method, which solved ten problems, and even surpassed another strong competitor that combined algebraic reasoning with human-designed heuristics to solve 18 problems. AlphaGeometry was also efficient, using fewer computational resources than its competitors.
The system’s impressive performance is attributed to the combination of its base symbolic deduction engine with an advanced algebraic deduction component and a language model. This integration enhanced its problem-solving capacity significantly. While the algebraic deduction alone brought the total solved problems to 14, the addition of the language model’s auxiliary constructions pushed this number up to 25.
Furthermore, AlphaGeometry’s robustness was evident in tests using limited resources. Even with only 20% of the training data or less than 2% of the search budget, it managed to maintain state-of-the-art results, solving 21 problems in both scenarios. In a more extensive test set comprising 231 geometry problems, AlphaGeometry continued to outshine other methods, solving almost all the problems (98.7%), a considerable leap from the achievements of other methods.
Notably, AlphaGeometry’s prowess lay not just in solving a high volume of problems but also in tackling particularly challenging ones. It successfully solved geometry problems from the 2000 and 2015 IMOs that are considered difficult even for strong human contestants. In one particularly impressive instance, the system’s analysis of a problem from the 2004 IMO led to the discovery of a more general version of the theorem, showcasing its deep understanding and innovative problem-solving approach.
AlphaGeometry’s solutions, along with an analysis of its successes and limitations, are detailed in supplementary materials. The system tends to employ a more basic toolkit for proving theorems compared to human methods, which has implications for the scope of its synthetic data generation, its performance during tests, and the readability of its proofs. Despite these limitations, AlphaGeometry’s achievements in geometric theorem proving mark a significant advancement in the field of AI.
Join over 80,000 data leaders on the AI newsletter and keep up to date with the latest developments in AI, from research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming a sponsor.
Published via Towards AI