Prediction #4 from:
My Takeaways From AI 2027 - by Scott Alexander
A big problem with this market is that the premise it's based on (that there is an "at least one year gap" between frontier proprietary and open models) is demonstrably untrue. It has not held throughout most of 2024 or at any point in 2025, and I'd argue it was already untrue as early as mid-2023.
Llama 2 70B (Jul '23) was roughly on par with GPT-3.5 (Nov '22); the benchmarks used at the time weren't great, but it was firmly in the same league. Llama 3.1 405B (Jul '24) was at least fully on par with early GPT-4 Turbo (Nov '23), and certainly ahead of the vanilla GPT-4 releases. That's a gap of ~8 months, sustained over a year. DeepSeek R1 (Jan '25; feels like forever ago, but it hasn't even been a year!) was miles ahead of any model released prior to September 2024. That's a gap of ~4.5 months, the smallest it's ever been. Even if we squint really hard and argue that the likes of GLM-4.7, DS v3.2 Finale, MiniMax M2.1, Ernie 5.0, and Kimi K2 are still roughly on the level of o3-high (Apr '25) or early Gemini 2.5 Pro (Mar '25) and not in any way ahead of them (also untrue, but let's entertain the thought), we're talking about an overall capability gap of at most 7–8 months. Since mid-2023, we've never seen the gap grow beyond 8 months.
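For anyone who wants to check the arithmetic, here is a rough sketch of the gap calculation for the three pairings above. It uses only the months and years quoted in this comment; setting the day to the 1st is an assumption of the sketch, so the R1 gap comes out as 4 whole months rather than the ~4.5 you get from exact release days. The function names are just illustrative.

```python
from datetime import date

# (open model, open release, proprietary reference, reference release)
# Months and years come from the comment above. The day is set to 1 because
# the comment gives no days (an assumption of this sketch).
PAIRS = [
    ("Llama 2 70B", date(2023, 7, 1), "GPT-3.5", date(2022, 11, 1)),
    ("Llama 3.1 405B", date(2024, 7, 1), "GPT-4 Turbo", date(2023, 11, 1)),
    ("DeepSeek R1", date(2025, 1, 1), "frontier as of Sep 2024", date(2024, 9, 1)),
]


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months from `earlier` to `later`, ignoring days."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def gap_table(pairs) -> dict:
    """Map each open model to its lag, in months, behind its reference model."""
    return {
        open_name: months_between(ref_date, open_date)
        for open_name, open_date, _ref_name, ref_date in pairs
    }


gaps = gap_table(PAIRS)
for name, months in gaps.items():
    print(f"{name}: {months} months behind")
print("largest gap:", max(gaps.values()), "months")
```

At month granularity, none of these pairings comes anywhere near the 12 months the premise requires.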
All of this tracks across both objective capability indexes like AAII and subjective community-preference aggregators like LM Arena. So even if we treat R1 as a one-off spurt, the real gap has never actually exceeded a year, and nothing indicates it is growing. If anything, we see more and more Chinese players catching up and fitting within that 8-month gap despite being severely constrained by available compute; without that constraint, we'd see it shrink to something like 1–3 months in no time.
So I really don't know what Scott Alexander's numbers are based on (surely not anything demonstrable, otherwise we'd clearly see it), but it's gonna be hard to resolve this on that basis alone unless we just go with whatever Scott himself thinks is the case. As it stands, it's simply unfalsifiable. (This is not a criticism of the arguments in the post itself, just of the particular premise that it, and hence this market, relies on.)