Will a multi-agent AI system publicly outperform a solo frontier model on a live benchmark before July 2026?
78% chance
Resolves YES if, before July 1, 2026, a documented result shows a multi-agent system (two or more collaborating agents) beating the best single frontier model (e.g., GPT-4o, Claude, Gemini) on any recognized benchmark (MMLU, HumanEval, SWE-bench, GPQA). The result must be published as a paper, blog post, or leaderboard entry; a demo alone does not count.
This question is managed and resolved by Manifold.
Related questions
[ACX 2026] Will an AI model reach a 3 hour time horizon with 80% reliability during 2026?
81% chance
Will a publicly known AI model achieve an 80% time horizon of 3 weeks by April 2027?
25% chance
Will an AI achieve >85% performance on the FrontierMath benchmark before 2027?
32% chance
Will an AI achieve >80% performance on the FrontierMath benchmark before 2027?
39% chance
Will there be a significant advancement in frontier AI model architecture by end of year 2026?
27% chance
Will an AI achieve >85% performance on the FrontierMath benchmark before 2028?
75% chance
Will AI models solve at least 2 FrontierMath Open Problems before 2027?
87% chance
Will any AI model score above 95% on ARC-AGI-2 by end of 2026?
72% chance
Will I automate Vanguard rebalancing with an AI agent by 2026?
34% chance
Will a new lab create a top-performing AI frontier model before 2028?
87% chance