This is a quote from the blog I was thinking of:
In a widely-read 2020 paper, OpenAI reported that the accuracy of its language models scaled “as a power-law with model size, dataset size, and the amount of compute used for training, with some trends spanning more than seven orders of magnitude.”
Thanks. This still seems to be mostly talking about fixed, up-front training costs, though.
I can't figure out why it's so expensive to run the models once they're trained, unless they're just massive and irreducible... which they probably are, but I haven't found a written account of that.
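My rough back-of-envelope so far, assuming the usual approximation of ~2 FLOPs per parameter per generated token and that every weight has to be streamed from memory on each step (the function and the fp16 byte count below are just my own illustrative numbers, not anything from the paper):

    def per_token_cost(n_params, bytes_per_param=2):
        # ~2 FLOPs per parameter per generated token (multiply + add in the matmuls)
        flops = 2 * n_params
        # every weight read from memory once per token (fp16 = 2 bytes per param)
        bytes_moved = n_params * bytes_per_param
        return flops, bytes_moved

    # GPT-3-scale example: 175e9 parameters
    flops, bytes_moved = per_token_cost(175e9)
    print(f"{flops / 1e9:.0f} GFLOPs, {bytes_moved / 1e9:.0f} GB of weight reads per token")
    # -> 350 GFLOPs and 350 GB per generated token, before any batching

If that's roughly right, serving cost scales linearly with parameter count, and at GPT-3 scale you're moving hundreds of gigabytes of weights for every generated token unless you can batch many requests together, which would at least partly explain why inference stays expensive long after training is paid for. But I'd still like to see an authoritative writeup.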