Training is definitely costlier than inference, but inference is not costless either, especially if you are fielding millions of requests
Moreover, to maintain a competitive edge I would assume that the models are constantly being fine-tuned, not to mention the fixed costs of maintaining highly specialized and in-demand engineers on staff... I can easily see how costs add up
> Training is definitely costlier than inference, but inference is not costless either, especially if you are fielding millions of requests
Oh for sure. My knowledge is dated but once upon a time it was thought you could ship trained models to clients and run them there without specialized hardware.
> Moreover, to maintain a competitive edge I would assume that the models are constantly being fine-tuned, not to mention the fixed costs of maintaining highly specialized and in-demand engineers on staff... I can easily see how costs add up
If this is all there is to it, then some customers are performing 4x more inference requests than others, which tracks.
Maybe what I'm not accounting for is the size of these models. If they are enormous, with many, many weights, then scaling inference could be super-linear.
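For a sense of scale: a common back-of-the-envelope rule is that a transformer spends roughly 2 * n_params FLOPs per generated token, so model size makes each request more expensive, but the total still grows linearly in request count. A toy sketch (all numbers made up for illustration):

```python
# Rough rule of thumb: transformer inference costs about 2 * n_params
# FLOPs per generated token (ignoring attention-cache details).
# These numbers are illustrative, not any vendor's real figures.

def inference_flops(n_params, tokens_per_request, n_requests):
    """Total inference FLOPs: linear in model size AND in request count."""
    flops_per_token = 2 * n_params
    return flops_per_token * tokens_per_request * n_requests

small = inference_flops(n_params=1e9,   tokens_per_request=500, n_requests=1_000_000)
large = inference_flops(n_params=100e9, tokens_per_request=500, n_requests=1_000_000)
print(f"{large / small:.0f}x")  # 100x: per-request cost scales with model size
```

So a 100x bigger model costs ~100x more per request, which may be the "massive and irreducible" effect, but requests themselves still scale linearly.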
reply
Based on what I know of these model architectures, compute costs should scale linearly with the number of requests (or more precisely, the number of batches, since TPUs will process requests in parallel).
There could be other issues around concurrency, latency, congestion, etc., or other physical limitations in the hardware. But just on the model itself, I don't see why it should be super-linear in the number of requests. If I'm wrong, I'd be happy to know it, though.
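To make the linearity concrete, here's a toy serving-cost model; the batch size and per-batch time are made-up numbers, not anyone's real serving figures:

```python
import math

# Toy cost model for the "linear in batches" claim: total accelerator
# time is (number of batches) x (time per batch), and batch count grows
# linearly in requests once the batch size is fixed.

def serving_cost(n_requests, batch_size=32, seconds_per_batch=0.05):
    """Accelerator-seconds needed to serve n_requests, assuming full batching."""
    n_batches = math.ceil(n_requests / batch_size)
    return n_batches * seconds_per_batch

# Doubling traffic roughly doubles cost: linear, not super-linear.
print(serving_cost(1_000_000))
print(serving_cost(2_000_000))
```

The ceil introduces a tiny step at partial batches, but at millions of requests that rounding is negligible.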
reply
This is a quote from the blog I was thinking of:
In a widely-read 2020 paper, OpenAI reported that the accuracy of its language models scaled “as a power-law with model size, dataset size, and the amount of compute used for training, with some trends spanning more than seven orders of magnitude.”
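For reference, the power law being quoted has the shape L(C) = (C_c / C)^alpha, i.e. loss falls as a power of training compute. The constants below are illustrative stand-ins, not the paper's fitted values:

```python
# Shape of the scaling law quoted above: L(C) = (C_c / C) ** alpha.
# C_c and alpha here are placeholders, not the paper's fitted constants.

def scaling_loss(compute, c_c=1.0, alpha=0.05):
    """Test loss as a power law in training compute."""
    return (c_c / compute) ** alpha

# Each 10x of compute cuts loss by the same *fraction*, which is why the
# trend can hold across "more than seven orders of magnitude".
for c in [1e3, 1e6, 1e9]:
    print(f"{c:.0e}: {scaling_loss(c):.3f}")
```

Note this describes how loss improves with training compute, not what it costs to serve the trained model.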
reply
Thanks. This still seems to be mostly talking about fixed training costs, though.
I can't figure out why it's so expensive to run the models once they're created unless they're massive and irreducible ... which they probably are, but I haven't found a written account of that.
reply