DeepMind AI Achieves Silver Medal Standard in Math Olympiad

On 25 July 2024, Google DeepMind introduced two new AI systems, AlphaProof and AlphaGeometry 2. Together they solved four of the six problems from this year’s International Mathematical Olympiad (IMO), the first time an AI has reached the standard of a silver medalist.

We’re presenting the first AI to solve International Mathematical Olympiad problems at a silver medalist level.🥈

It combines AlphaProof, a new breakthrough model for formal reasoning, and AlphaGeometry 2, an improved version of our previous system. 🧵 https://t.co/U0OFXBia8n pic.twitter.com/h2mcLLRJjk
— Google DeepMind (@GoogleDeepMind) July 25, 2024

The IMO is a prestigious competition for young mathematicians and has long been viewed as a benchmark for advanced mathematical reasoning in AI. This year’s competition presented a rigorous challenge, with problems spanning algebra, combinatorics, geometry, and number theory. AlphaProof solved two algebra problems and one number theory problem, and AlphaGeometry 2 solved the geometry problem; the two combinatorics problems remained unsolved.

AlphaProof: A Formal Approach to Mathematical Reasoning

AlphaProof is a new reinforcement-learning-based system designed for formal mathematical reasoning. It integrates the AlphaZero algorithm, previously used for mastering games like chess and Go, with pre-trained language models. This hybrid approach enables AlphaProof to solve and prove mathematical problems using the formal language of Lean.
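
For readers unfamiliar with Lean, a formal statement and its proof look roughly like the toy example below. It is not AlphaProof output and is far simpler than any IMO problem; the theorem name `even_add_even` is made up for illustration, and the sketch assumes a recent Lean 4 toolchain in which the `omega` tactic is available.

```lean
-- Toy statement: the sum of two even natural numbers is even.
-- Evenness is written here as "remainder 0 when divided by 2".
theorem even_add_even (a b : Nat) (ha : a % 2 = 0) (hb : b % 2 = 0) :
    (a + b) % 2 = 0 := by
  -- `omega` is a decision procedure for linear arithmetic over Nat and Int,
  -- including `%` by a numeric literal; it closes the goal from `ha` and `hb`.
  omega
```

Lean accepts the theorem only if every step of the proof checks, which is what makes it possible to verify a candidate proof mechanically rather than by human review.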

Sir Timothy Gowers, an IMO gold medalist and Fields Medal winner, praised AlphaProof’s capabilities: “The fact that the program can come up with a non-obvious construction like this is very impressive, and well beyond what I thought was state of the art.”

It did this by solving four of the six problems completely, which got it 28 points out of a possible total of 42. I'm not quite sure, but I think that put it ahead of all but around 60 competitors.

However, that statement needs a bit of qualifying.
— Timothy Gowers @wtgowers (@wtgowers) July 25, 2024

Among the problems AlphaProof solved was the hardest in this year’s IMO, which only five contestants solved.
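
The 28-point figure in Gowers’s post follows directly from IMO marking, in which each of the six problems is worth 7 points. The short sketch below reproduces that arithmetic. The function name `imo_score` is hypothetical, and the gold-medal threshold of 29 points is taken from DeepMind’s announcement rather than from this article, so treat it as an assumption.

```python
# Scoring sketch: each of the six IMO problems is marked out of 7 points.
POINTS_PER_PROBLEM = 7
NUM_PROBLEMS = 6


def imo_score(fully_solved: int) -> int:
    """Score for a contestant who fully solves `fully_solved` problems
    and earns no partial credit on the rest."""
    if not 0 <= fully_solved <= NUM_PROBLEMS:
        raise ValueError("fully_solved must be between 0 and 6")
    return fully_solved * POINTS_PER_PROBLEM


MAX_SCORE = imo_score(NUM_PROBLEMS)  # 6 * 7 = 42
ai_score = imo_score(4)              # 4 * 7 = 28

# Assumption (not stated in this article): DeepMind's announcement put the
# 2024 gold-medal threshold at 29 points.
GOLD_THRESHOLD_2024 = 29

print(f"{ai_score}/{MAX_SCORE} points, "
      f"{GOLD_THRESHOLD_2024 - ai_score} short of the gold threshold")
```

Under that assumption, four complete solutions leave the combined system one point below gold, which is consistent with the silver-medal description.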

AlphaGeometry 2: Enhanced Geometry Problem-Solving

AlphaGeometry 2 is an improved version of DeepMind’s geometry-solving AI. It is a neuro-symbolic hybrid system whose language model is based on Gemini, and it shows significant improvements over its predecessor. It solved Problem 4 within 19 seconds of receiving its formalisation.

AlphaGeometry 2’s improvements include a faster symbolic engine and a novel knowledge-sharing mechanism that helps it handle more complex problems. It solved 83% of historical IMO geometry problems from the past 25 years, compared with 53% for the earlier version.

The Path Forward for AI in Mathematics

AlphaProof and AlphaGeometry 2’s achievement of a silver medal standard marks a significant step forward in developing AI systems capable of advanced mathematical reasoning. DeepMind’s ongoing research aims to enhance these models further and explore new AI approaches to mathematical problem-solving.

The results from this year’s IMO highlight the growing potential for AI to assist mathematicians in exploring new hypotheses and solving complex problems. As AI systems like Gemini advance, they promise to play an increasingly integral role in mathematical research and discovery.

Conclusion

Google DeepMind’s latest AI models have demonstrated impressive capabilities by achieving silver medal-level performance in the International Mathematical Olympiad. With AlphaProof and AlphaGeometry 2 setting new benchmarks for AI in mathematical reasoning, the future holds exciting possibilities for collaboration between human mathematicians and AI systems.