Will Claude 4 achieve over 95% on the MMLU-Pro benchmark by end of 2025?
13 · Ṁ100 · Ṁ1.3k · resolved Jan 2
Resolved NO
This market predicts whether Anthropic's next-generation Claude 4 model will achieve a score exceeding 95% on the MMLU-Pro benchmark before December 31, 2025. MMLU-Pro is an enhanced version of the Massive Multitask Language Understanding (MMLU) benchmark, which tests AI models on multiple-choice questions across various subjects. As of April 2025, Claude 3.7 Sonnet has achieved around 83% on MMLU-Pro, while the current record holder (OpenAI's o1) scores just over 90% on standard MMLU; that figure is for the original MMLU rather than MMLU-Pro, so the two numbers are not directly comparable. A score above 95% would represent a significant breakthrough in AI capabilities, potentially surpassing average human expert performance on these tests.
This question is managed and resolved by Manifold.
🏅 Top traders
| # | Total profit |
|---|---|
| 1 | Ṁ44 |
| 2 | Ṁ26 |
| 3 | Ṁ26 |
| 4 | Ṁ16 |
| 5 | Ṁ11 |
Related questions
Will I still be using Claude Code by the end of March 2026
87% chance
Will Claude Sonnet 5 exceed 85% on SWE-bench verified?
16% chance
New Claude tier above Max 20x before 2027?
38% chance
Will a text model achieve 100% performance on the MMLU in five years?
28% chance
MMLU 99% #5: Will SOTA for MMLU (average) pass 99% by the start of 2028?
44% chance
Will AI achieve 85% or higher on the Humanity's Last Exam benchmark before 2030?
84% chance
MMLU 99% #4: Will SOTA for MMLU (average) pass 99% by the start of 2027?
8% chance
Will AI achieve 95% or higher score on the FrontierMath benchmark before 2030?
81% chance
Will AI achieve 95% or higher on the Humanity's Last Exam benchmark before 2030?
33% chance
Will AI achieve 95% or higher on the Humanity's Last Exam benchmark before 2027?
7% chance