
Linking to myself: math-y articles from Epoch

Aug 10, 2025

I haven’t had a chance to write much here since starting as a researcher at Epoch AI in May. But I’ve written several pieces for Epoch that are quite math-y! I’d have been very happy to write any of these as Lemmata posts. Even better, I think input from my colleagues improved the quality of my writing a great deal.

I’m sure many of you already follow Epoch’s work, but for those who don’t, I thought you might be interested in some links.

From me:

What will the IMO tell us about AI math capabilities? My preview of the IMO, framing what I think we might learn from it and registering some predictions.
Evaluating Grok 4’s Math Capabilities. A deep-dive analysis trying to understand the new model’s capabilities more qualitatively. Incidentally, this was a nice chance to synthesize a lot of what I’d written on this blog.
We didn’t learn much from the IMO. My review article about AI performance on the IMO. The main claim is that an unlucky draw of problems made the event relatively uninformative.

And, for good measure, from my colleagues:

Is AI already superhuman on FrontierMath? A review of a competition Epoch hosted for humans to try their hands at some FrontierMath problems.
Beyond benchmark scores: Analyzing o3-mini’s mathematical reasoning. FrontierMath problem authors review o3-mini’s full chains-of-thought on its attempts to solve some FrontierMath problems.
FrontierMath Tier 4. A benchmark of extremely challenging research-level math problems.

I’ll try to post these link roundups regularly. I also hope to get back to posting original analysis, e.g. deeper looks at AI performance on selected problems. Stay tuned!
