
Are the costs of AI agents also rising exponentially? (2025)

86 points - last Wednesday at 1:47 PM

Source
  • quicklywilliam

    today at 12:16 AM

    Interesting read. I don't know if I quite buy the evidence, but it's definitely enough to warrant further investigation. It also matches up with my personal experience, which is that tools like Claude Code are burning through more and more tokens as we push them to do bigger and bigger work. But we all know the frontier model companies are burning through money in an unsustainable race to get you and your company hooked on their tools.

    So: I buy that the cost of frontier performance is going up exponentially, but that doesn't mean there is a fundamental link. We also know that benchmark performance of much smaller/cheaper models has been increasing (as far as I know METR only looks at frontier models), so that makes me wonder if the exponential cost/time horizon relationship is only for the frontier models.

    • agentifysh

      today at 12:42 AM

      Until there is some drastic new hardware, we are going to see a situation similar to proof of work, where a small group hoards the hardware and can collude on prices.

      The difference is that current prices carry a lot of subsidies from OPM (other people's money).

      Once the narrative changes to something more realistic, I can see prices increasing across the board. Forget $200/month for codex pro; expect $1000/month or something similar.

      So it's a race between a new supply of hardware, with new paradigm shifts, hitting the market and the tide going out in the financial markets.

        • colechristensen

          today at 1:02 AM

          Doubtful, local models are the competitive future that will keep prices down.

          128GB is all you need.

          A few more generations of hardware and open models will find people pretty happy doing whatever they need to on their laptop locally, with big SOTA models left for special purposes. There will be a pretty big bubble burst when there aren't enough customers for the $1000/month per seat needed to sustain the enormous datacenter models.

          Apple will win this battle, and Nvidia will be second once their goals shift to workstations instead of servers.

      • matt3210

        today at 1:02 AM

        I took a month-long break and my side project took 2x as many tokens.

        • dang

          yesterday at 9:42 PM

          Related ongoing thread:

          Measuring Claude 4.7's tokenizer costs - https://news.ycombinator.com/item?id=47807006 (309 comments)

          • greenmilk

            yesterday at 11:26 PM

            Are any inference providers currently making a profit (on inference, I know Google makes money)?

              • wsun19

                today at 12:05 AM

                Pretty much every major American inference provider claims to make a profit on API-based inference. Consumer plans might be subsidized overall, but it's hard to say since they're a black box and some consumers don't fully use their plans.

                • wavemode

                  today at 12:44 AM

                  Selling inference is not fundamentally different from selling compute - you amortize the lifetime cost of owning and operating the GPUs and then turn that into a per-token price. The risk of loss would be if there is low demand (and thus your facilities run underutilized), but I doubt inference providers are suffering from this.

                  Where the long-term payoff still seems speculative is for companies doing training rather than just inference.
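To make the amortization argument concrete, here is a rough sketch of the arithmetic in Python. Every number below (hardware cost, operating cost, throughput, utilization, lifespan) is a made-up placeholder and not data from any provider, and the function name is hypothetical. It only shows how lifetime cost turns into a break-even per-token price, and how that price moves with utilization and assumed hardware lifespan.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def price_per_million_tokens(capex_usd, lifespan_years, opex_usd_per_year,
                             tokens_per_second, utilization):
    """Break-even price (USD) per million tokens for one GPU or server.

    All inputs are assumptions supplied by the caller:
      capex_usd          purchase cost of the hardware
      lifespan_years     assumed useful life used for straight-line amortization
      opex_usd_per_year  power, cooling, hosting, staff, etc.
      tokens_per_second  peak serving throughput
      utilization        fraction of time spent serving paid traffic (0..1]
    """
    # Straight-line amortization of the purchase cost plus yearly running cost.
    annual_cost = capex_usd / lifespan_years + opex_usd_per_year
    # Tokens actually sold in a year, given that the hardware is partly idle.
    annual_tokens = tokens_per_second * utilization * SECONDS_PER_YEAR
    return annual_cost / annual_tokens * 1_000_000


if __name__ == "__main__":
    # Hypothetical figures, chosen only to illustrate the sensitivity.
    for lifespan in (2, 3, 6):
        for utilization in (0.25, 0.5, 0.9):
            price = price_per_million_tokens(
                capex_usd=30_000,
                lifespan_years=lifespan,
                opex_usd_per_year=5_000,
                tokens_per_second=1_000,
                utilization=utilization,
            )
            print(f"lifespan={lifespan}y utilization={utilization:.0%} "
                  f"break-even=${price:.2f}/M tokens")
```

Under this simple model, halving utilization doubles the break-even price, and a shorter assumed lifespan raises it, which is why underutilized facilities are the main risk of loss.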

                    • Gigachad

                      today at 12:56 AM

                      There’s a lot of debate over what the useful lifespan of the hardware is, though. A number that seems very vibes-based determines whether these datacenters are a good investment or a disastrous one.

                  • yesterday at 11:37 PM

                    • jagged-chisel

                      yesterday at 11:44 PM

                      Google definitely makes money in other areas. Do they make money on inference?

                  • srslyTrying2hlp

                    yesterday at 11:28 PM

                    [dead]

                    • totalmarkdown

                      yesterday at 11:25 PM

                      [flagged]