Will a top-10 LMArena model have weight-based online learning by 2028?
57% chance

Resolution criteria:

Resolves YES if, before January 1, 2028, any model ranked in the top 10 on LMArena (lmarena.ai, Style Control OFF) has a publicly available version with demonstrated weight-based online learning capability, meaning the model updates internal parameters during inference or between user sessions to incorporate new information.

The model must:

  1. Be ranked top-10 on LMArena at any point while a publicly available version offers this capability

  2. Have the online learning feature available to end users (via API or direct access), not just as a research demo

  3. Encode new information through learned parameters, not context injection

  4. Update weights automatically during or after user sessions, not through manually triggered training jobs or at a slower cadence (e.g., weekly batch fine-tuning)

  5. Accumulate learning over time (training a fresh adapter from scratch each session, discarding prior updates, does not count; a sketch contrasting the two cases follows this list)
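To make criteria 4 and 5 concrete, here is a minimal Python sketch; `Adapter` and `train_step` are hypothetical stand-ins for adapter weights and a fine-tuning step, not any provider's API:

```python
# Hypothetical stand-ins for LoRA-style adapter weights and a training step.

class Adapter:
    """Delta weights on top of a frozen base model (one float for brevity)."""
    def __init__(self, delta=0.0):
        self.delta = delta

def train_step(adapter, session_signal):
    # Placeholder gradient step: nudge the adapter toward the session's signal.
    adapter.delta += 0.1 * session_signal

# Does NOT count (criterion 5): a fresh adapter per session, prior updates discarded.
for signal in [1.0, 2.0, 3.0]:
    adapter = Adapter()              # reset to the base model every session
    train_step(adapter, signal)

# Counts (if triggered automatically, per criterion 4): one adapter
# that accumulates updates across sessions.
adapter = Adapter()
for signal in [1.0, 2.0, 3.0]:
    train_step(adapter, signal)      # builds on previous sessions' updates
```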

Background:

Current LLMs are static after training. To handle new or user-specific information, providers use workarounds:

Context-based approaches (don't count):

  • RAG (retrieval-augmented generation): fetching relevant documents into the prompt

  • Long context windows: fitting more information into a single conversation

  • Conversation memory systems: storing and retrieving past exchanges as text

These provide information as input rather than changing the model itself. They're limited by context length and don't alter how the model computes its outputs.
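As a toy illustration of why these don't count, a RAG pipeline only edits the model's input; `retrieve` and `generate` below are invented stand-ins, not a real library's API:

```python
# Toy RAG sketch: retrieved text is concatenated into the prompt;
# the model's weights are never modified.

def retrieve(query, corpus):
    # Naive keyword overlap stands in for a real retriever.
    q = set(query.lower().split())
    return [doc for doc in corpus if q & set(doc.lower().split())]

def answer(generate, query, corpus):
    context = "\n".join(retrieve(query, corpus))
    prompt = f"Context:\n{context}\n\nQuestion: {query}"
    return generate(prompt)          # information enters as input only

corpus = ["The launch was moved to March.", "Unrelated note."]
print(answer(lambda p: p, "When was the launch moved?", corpus))
```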

Weight-based online learning (counts for YES):

  • Test-Time Training: layers with internal parameters updated via gradient descent during inference (paper)

  • Neural memory modules: learnable parameters that adapt to memorize information at test time, e.g., Titans (paper)

  • Adapter-based continual learning: LoRA-style parameter updates between sessions to incorporate new information

These approaches modify model parameters to encode new information, enabling learning from experience rather than just retrieval.
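For intuition, a test-time-training step can be sketched in a few lines of PyTorch; this is a toy reconstruction objective on a single linear layer, not the architecture of the cited papers:

```python
import torch

# Toy "fast weights" module whose parameters update during inference.
memory = torch.nn.Linear(4, 4)
opt = torch.optim.SGD(memory.parameters(), lr=0.01)

def forward_with_ttt(x):
    # Self-supervised objective at test time: reconstruct the input.
    loss = torch.nn.functional.mse_loss(memory(x), x)
    opt.zero_grad()
    loss.backward()
    opt.step()                       # weights change during inference...
    return memory(x)                 # ...and persist for later inputs

for token in torch.randn(3, 4):
    forward_with_ttt(token.unsqueeze(0))
```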


Note: If LMArena ceases to operate, I will use an obvious successor benchmark operating along similar lines, or resolve N/A if none exists.

Other:

/SamuelKnoche/will-a-top10-lmarena-model-have-wei-89ZNugZdRy
/SamuelKnoche/will-a-top10-lmarena-model-have-wei-pyhyEA5ISq

  • Update 2026-01-28 (PST) (AI summary of creator comment): The creator is considering adding additional conditions to clarify what counts as weight-based online learning:

    • Updates must be cumulative (training a new LoRA each time from the same base model would not count; the system must build on previous updates)

    • Must occur without explicit user data curation or manual triggering of training jobs

    Example that would not count: Google's current Gemini 2.5 LoRA fine-tuning feature, even if extended to allow daily training on recent conversations, as it requires manual triggering and doesn't build cumulatively.

  • Update 2026-01-28 (PST) (AI summary of creator comment): Clarification on resolution criteria:

    • A model does not need to be the specific version on LMArena that has the online learning capability. The requirements "Be ranked top-10 on LMArena" and "has a publicly available version with demonstrated weight-based online learning capability" are independent conditions.

    • Weight updates must happen automatically during or after sessions, not at any slower cadence (e.g., manual daily training would not count).

  • Update 2026-01-29 (PST) (AI summary of creator comment): If LMArena ceases to operate before the resolution criteria are met:

    • The creator will use an obvious successor that operates along similar lines, if one exists

    • Otherwise, the market will resolve to N/A

Comments:

If LMArena ceases to operate before this event happens, how does this resolve?

@BRUHMEMER If there is an obvious successor that roughly operates along the same lines, I'll use that, otherwise N/A. I doubt it'll come to that though.

@traders @prismatic @robert @EyadFareh @BlackCrusade (tagging 2027/2029 traders also).
Realized there's some ambiguity in the resolution criteria. I've done some research and found that Google lets people train a LoRA on their Gemini 2.5 models: https://docs.cloud.google.com/vertex-ai/generative-ai/docs/models/gemini-use-supervised-tuning

I don't think this should count for a YES resolution, but I'm not sure where exactly to draw the line. If they add a feature to Gemini that lets you train a new LoRA daily on your recent conversations, does that count?

Maybe I could add the conditions that the updates need to be cumulative (training a new LoRA each time from the same base model, rather than building on previous updates, would not count), and that it has to be without explicit user data curation or manual triggering of training jobs.

Thoughts?

Side question, but you can't interact with the LoRA-based Gemini on LMArena; doesn't that matter?

But yeah, if it's a LoRA that isn't used between sessions in the background to learn new info from the sessions themselves, in my mind it wouldn't meet the third "counts for YES" point, but it's of course up to you.

@Bayesian Ah, yes, additional ambiguities...

I can see how "Be ranked top-10 on LMArena at any point while offering this capability" can be interpreted that way.

I think I meant it just as "Be ranked top-10 on LMArena" and "has a publicly available version with demonstrated weight-based online learning capability" independently.

> But yeah, if it's a LoRA that isn't used between sessions in the background to learn new info from the sessions themselves, in my mind it wouldn't meet the third "counts for YES" point, but it's of course up to you.

Right, the weight updates should happen automatically during or after sessions, not at any slower cadence. That makes sense as a criterion.

@SamuelKnoche Subjective, of course, but to me "online learning" implies something automatically happening in the background, not merely fine-tuning services. Probably worth clarifying in the resolution criteria.

Okay, I've rephrased criterion 1 to make it clearer that the online-learning version of the model does not have to be on LMArena, and added:

4. Update weights automatically during or after user sessions, not through manually triggered training jobs or at a slower cadence (e.g., weekly batch fine-tuning)
5. Accumulate learning over time (training a fresh adapter from scratch each session, discarding prior updates, does not count)

Opened a Ṁ10,000 NO limit order at 50%.

@Bayesian I am something of a minnow. 😅

um takers? 😅
