If perplexity on Common Crawl is not available for a model, I will use other benchmarks as a surrogate. This will inherently be a judgement call. If a model has not been announced by EOY 2025 and no benchmarks have been posted publicly, it will not be counted for the purposes of this market.
"Based on transformers" for the purpose of this question will be anything with multi-headed self-attention that feeds into an MLP.
Update 2025-04-10 (PST) (AI summary of creator comment): Clarification on what constitutes 'based on transformers':
- DeepSeek-style MLA with MoE is considered to be based on transformers.
- All current models, except SSMs and LSTMs, are assumed to fall under "based on transformers".
- The status of RWKV remains open for discussion.
@ConnorMcCormick oh yeah that's definitely confusing people. Well, better for us who do understand it :)
@jacksonpolack The API only refreshes the data every 15 seconds, so if you're quick on the draw, it's totally doable.
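For anyone scripting against that refresh cycle, a minimal polling sketch (the URL and response field below are hypothetical placeholders, not the actual API):

```python
import time

import requests

POLL_URL = "https://example.com/api/market/some-market-id"  # placeholder URL

last_seen = None
while True:
    # Poll faster than the ~15 s refresh so a change is caught early;
    # once detected, the served value stays stale for up to 15 seconds,
    # which is the window for being "quick on the draw".
    payload = requests.get(POLL_URL, timeout=5).json()
    prob = payload.get("probability")  # hypothetical field name
    if prob != last_seen:
        print(f"data refreshed: probability={prob}")
        last_seen = prob
    time.sleep(1)
```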