The best performance by an AI system on FrontierMath Tier 4 as of December 31st, 2025. See https://epoch.ai/frontiermath, under the Tier 4 section, for the results accepted for the purpose of this market. Performance is measured as Pass@1 accuracy.
At market creation (which was also the day the benchmark was officially announced), the best model is o4-mini (high), with a score of 6.25%.
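To make the metric concrete, here is a minimal sketch of how a Pass@1 accuracy figure can be computed, assuming it means the fraction of problems answered correctly when the model gets exactly one attempt per problem. The function name `pass_at_1` and the tally of 3 correct out of 48 are hypothetical, chosen only because 3/48 reproduces 6.25%; the market description does not state the problem count or how Epoch AI scores runs.

```python
def pass_at_1(results: list[bool]) -> float:
    """Fraction of problems solved, given one attempt per problem.

    `results` holds one boolean per problem: True if the single
    attempt was correct, False otherwise.
    """
    if not results:
        raise ValueError("results must contain at least one problem")
    return sum(results) / len(results)


# Hypothetical tally (an assumption, not from the market description):
# 3 problems solved out of 48.
hypothetical_results = [True] * 3 + [False] * 45

score = pass_at_1(hypothetical_results)
print(f"Pass@1 accuracy: {score:.2%}")  # prints "Pass@1 accuracy: 6.25%"
```

Under this reading, each additional solved problem moves the score by a fixed step of one over the number of problems, so only a discrete set of percentages is reachable.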
See also best performance on FrontierMath Tier 1-3:
🏅 Top traders
| # | Trader | Total profit |
|---|---|---|
| 1 | | Ṁ866 |
| 2 | | Ṁ402 |
| 3 | | Ṁ171 |
| 4 | | Ṁ98 |
| 5 | | Ṁ84 |
@MaxLennartson Yeah, that was Tier 1-3. Before the Tier 4 release, 'FrontierMath score' referred exclusively to the merged score across Tiers 1-3.
@Bayesian Yeah. But I didn't do much research on the questions; I just know they are distinct from trainable datasets and require a lot of reasoning steps. I think we need a new method that helps AIs better generalize what they have learned, and their skills, across different domains.
