Does OpenAI's Q* 'breakthrough' represent a significant advance in AI capabilities?
85
Ṁ16k
Jan 1
84% chance

https://archive.is/tCE1t

Resolves to the opinion of the AI safety community, once information that could resolve this comes out and consensus is reached. Resolution is based on my judgement, or on the judgement of a moderator resolution council if a single person disputes it.

2mo

This benchmark shows it only marginally improves the score. I mean, sure, it is better, but it also thinks way longer. Comparing to traditional benchmarks is also misleading, because it uses multi-step thinking, which could be trivially added to e.g. Claude as well using Auto GPT or similar. It would be interesting to see a comparison then.

https://aider.chat/2024/09/12/o1.html

bought Ṁ500 YES 3mo

@jacksonpolack for the purpose of this question, you count o1 as being Q*, right? OpenAI doesn't need to explicitly mention the old name?

4mo

I think that maybe the "AI safety community" isn't the best authority on what constitutes a breakthrough in AI capabilities.

1y

Could you clarify what you think should or shouldn't count as a breakthrough?

predicted NO 1y

Hm.

In spirit, the idea is whether it's something worth being interested in or nervous about in terms of AI capabilities. So the idea is, if I'm thinking about AI safety, or the general rate of AI advancement, should I pay any attention to what Q* is? This is obviously pretty fuzzy, but I don't think there's a less soft way to make a market on the topic, considering I don't know too much about what the thing is or what it accomplished.

1y

Some potentially useful clarifications :)
- does it need to have ultimately been related to the firing?
- some anchors (e.g. would Transformers, GPT-3, GPT-4, RLHF, AlphaGo, AlphaZero, OpenAI Five etc be counted as 'significant capabilities advances')
- If Q* is a model, does it matter if the underlying approach needs to be subsequently scaled up?

1y

1) No, edited title
2) Transformers, gpt-3 and 4, alphago, alphazero, and rlhf would count. No position on openai five.
3) No
