Will Transformer-Based LLMs Make Up ≥75% of Parameters in the Top General AI by 2030?

As of December 31, 2029, will a large language model (LLM)—defined as a transformer-based, next-token prediction model—comprise at least 75% of the activation parameter count of the most capable, publicly-known general-purpose AI system?

Definitions:

  • LLM: A model whose main pre-training objective is next-token prediction and whose architecture is primarily transformer-based (including dense, sparse/mixture-of-experts (MoE), or similar variants).

  • Activation parameters: The total number of trainable weights that are loaded in memory during a maximum-capability inference pass. For MoE models, count the union of all experts that could be active in any inference pass (not just the average active subset).

  • ≥ 75% rule: If one or more LLMs, combined, comprise at least 75% of all activation parameters (across all neural modules, including vision, planning, and others), the criterion is met (a worked counting sketch follows this list).

  • Most capable general-purpose AI: The system that, as of December 31, 2029, demonstrates the highest publicly documented cross-domain performance (as measured by recognized AGI or multitask benchmarks) or is widely acknowledged as the top-tier system by expert consensus.

  • Backbone: The neural component(s) that provide broad reasoning and general knowledge. Symbolic planners or retrieval databases without trainable weights are not counted.

  • Publicly-known: The system must be openly released or credibly leaked with reproducible technical details, such as a model card, parameter count, architecture description, or benchmark results.
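To make the counting concrete, here is a minimal Python sketch of the ≥ 75% rule. The module names and parameter counts are hypothetical, chosen only to illustrate the arithmetic; they are not taken from any real system.

```python
# Minimal sketch of the >= 75% rule. Module names and parameter counts below
# are hypothetical illustrations, not figures from the market text.

def llm_parameter_share(modules: dict[str, tuple[int, bool]]) -> float:
    """modules maps each module name to (activation_parameter_count, is_llm).

    Activation parameters follow the definition above: all trainable weights
    loaded for a maximum-capability inference pass, with MoE experts counted
    as the union of every expert that could be active.
    """
    total = sum(count for count, _ in modules.values())
    llm_total = sum(count for count, is_llm in modules.values() if is_llm)
    return llm_total / total


# Hypothetical system: a transformer LLM backbone plus smaller vision and
# planning modules (non-parametric retrieval databases are simply omitted).
system = {
    "llm_backbone":   (1_500_000_000_000, True),   # 1.5T params, transformer LLM
    "vision_encoder": (20_000_000_000, False),     # 20B params
    "planner_head":   (5_000_000_000, False),      # 5B params
}

share = llm_parameter_share(system)
print(f"LLM share: {share:.1%}")        # ~98.4%
print("Criterion met:", share >= 0.75)  # True
```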

Edge-case clarifications:

  • Mixture-of-Experts (MoE) LLMs: All possible experts count toward the parameter total, even if only a subset is active per token (see the counting sketch after this list).

  • Retrieval-Augmented Generation (RAG) or external databases: Non-parametric resources (e.g., vector DBs) are ignored for parameter counting; only neural weights matter.

  • Controller LLM plus a non-LLM core (e.g., a physics simulator): If the non-LLM neural weights exceed 25% of the total activation parameter count, the criterion is not met.

  • Systems distilled from an LLM into a non-transformer architecture (e.g., Mamba, RWKV): These do not count, even if they were originally derived from an LLM.

  • Neuro-symbolic or hybrid systems: Only count neural parameters. If LLMs make up less than 75%, the answer is “No.”

  • Multiple LLM agents: Combine all LLM weights for the total.

  • Quantized or adapted LLMs: Count the number of original trainable weights, regardless of quantization precision or adapter modifications.

  • Leaked systems without parameter evidence: If parameter count cannot be established, the answer is “No” (burden of proof on “Yes”).
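The MoE and multi-agent rules above combine as in the following sketch. The agent names and parameter counts are hypothetical, assumed purely for illustration: the union of all experts is counted for the MoE agent, and the weights of every LLM agent are summed before computing the share.

```python
# Minimal sketch of the MoE and multi-agent counting rules, with hypothetical
# numbers. The union of all experts is counted (not the average active subset),
# and weights from every LLM agent are summed before computing the share.

def moe_activation_params(shared_params: int, params_per_expert: int,
                          num_experts: int) -> int:
    """Count the union of all experts, not just those active per token."""
    return shared_params + params_per_expert * num_experts


# Hypothetical two-agent system sharing a non-LLM neural world model.
agent_a = moe_activation_params(shared_params=200_000_000_000,
                                params_per_expert=50_000_000_000,
                                num_experts=16)          # 1.0T total
agent_b = 400_000_000_000                                # dense 400B LLM
world_model = 300_000_000_000                            # non-LLM neural core

llm_total = agent_a + agent_b                            # 1.4T
total = llm_total + world_model                          # 1.7T
print(f"LLM share: {llm_total / total:.1%}")             # ~82.4%
print("Criterion met:", llm_total / total >= 0.75)       # True
```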
