
Did OpenAI use MUP for zero shot hyper-parameter transfer in GPT-4?
Resolved N/A on Jan 6.
Maximal Update Parameterization (muP) is a technique published in 2022 by Yang et al. at Microsoft: https://arxiv.org/abs/2203.03466
This question is managed and resolved by Manifold.
@firstuserhere Interesting that it is in the bibliography. However, the reference in the first image is from a different section of the report with its own bibliography (there, [16] actually refers to "DALL·E 2 Preview - Risks and Limitations.").
So the muP paper is in the bibliography, but not referenced anywhere.
@Stefan Yep, and even then it's not actually used in GPT-4; the report only mentions the red team having used the paper?