Resolution criteria
Each month option resolves YES if, during that calendar month (UTC), at least one of Google, OpenAI, or Anthropic makes a new large language model (LLM) publicly available beyond a private waitlist in any of: their consumer apps (Gemini/ChatGPT/Claude) or developer platforms/APIs (OpenAI API, Google AI Studio/Vertex AI, Anthropic API/AWS Bedrock/Vertex AI). “Publicly available” includes GA or an open preview accessible without individual approval. If availability and announcement fall in different months, use the first month of broad availability.
“New LLM” = a newly named model or sub‑variant with a distinct model identifier added to the company’s official model list or catalog (e.g., Gemini 2.5 Flash‑Lite, Claude Sonnet 4, OpenAI o3/o4‑mini/GPT‑4.5). Routine snapshot updates, pricing/context changes, UI features, or pure renames do not count. Anonymous or cloaked models do not count until the company publicly claims them: if a cloaked model such as Horizon Alpha is released in one month and the company publicly claims it in the next, this market counts that model as released in the second month.
Specialized coding models such as OpenAI's Codex models, image generation models, and other special-purpose models do not count.
In case of ambiguity, the market will err on the side of resolving NO. If it is unclear whether a case is ambiguous, I will use my judgement.
Evidence must be from the company’s official pages (any one suffices):
OpenAI blog or Help Center release notes. (openai.com)
Google DeepMind/Google AI blog or Gemini models docs. (blog.google)
Anthropic news/docs model lists. (anthropic.com)
A month resolves NO if none of the three companies releases a qualifying new LLM in that month. If multiple qualifying releases occur in a month, the option still resolves YES only once. If a model is briefly released and then pulled, it counts if availability lasted ≥24 hours and was documented by the above sources.
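The month-assignment rules above can be sketched in code. This is only an illustration of how I read the criteria, not an official resolver: the `Release` record, its field names, and the helper functions are all hypothetical, and judgement calls (what counts as "new", "special purpose", or "officially documented") are reduced to hand-set flags. It assumes all datetimes are timezone-aware.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional, Tuple

COMPANIES = {"Google", "OpenAI", "Anthropic"}


@dataclass
class Release:
    # Hypothetical record; every field name here is an assumption.
    company: str
    model_id: str
    available_from: datetime                     # start of broad availability
    available_until: Optional[datetime] = None   # None = still available
    cloaked: bool = False                        # released anonymously
    claimed_at: Optional[datetime] = None        # when the company claimed it
    is_new_named_model: bool = True              # not a snapshot/rename
    is_special_purpose: bool = False             # coding-only, image, etc.
    officially_documented: bool = True           # on an official page


def counting_month(r: Release) -> Optional[Tuple[int, int]]:
    """Return the (year, month) in UTC that a release counts for, or None."""
    if r.company not in COMPANIES:
        return None
    if not r.is_new_named_model or r.is_special_purpose:
        return None
    if not r.officially_documented:
        return None
    # A model that was pulled must have been available for at least 24 hours.
    if r.available_until is not None:
        if r.available_until - r.available_from < timedelta(hours=24):
            return None
    effective = r.available_from
    if r.cloaked:
        # A cloaked model counts only once claimed, in the month of the claim.
        if r.claimed_at is None:
            return None
        effective = max(effective, r.claimed_at)
    effective = effective.astimezone(timezone.utc)  # calendar month is in UTC
    return (effective.year, effective.month)


def resolve_month(releases: List[Release], year: int, month: int) -> str:
    """YES if at least one release counts for the given UTC month."""
    hit = any(counting_month(r) == (year, month) for r in releases)
    return "YES" if hit else "NO"
```

For example, a release that becomes broadly available at 20:00 on July 31 in a UTC-8 timezone falls on August 1 in UTC, so under this sketch it counts toward August rather than July.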
Considerations
Companies often “announce” before GA; this market uses the first month of broad availability. Experimental/preview models count only if openly accessible (no private tester gate).
Releases via partner platforms (AWS Bedrock, Vertex AI) count if they debut there first.
Minor updates to existing models (e.g., improved 2.5 Flash without a new model name) do not qualify; new named variants like “2.5 Flash‑Lite” or “Opus 4.1” do.