By mid-2025, it's pretty clear - LLMs are here to stay, whether we like it or not. Yes, there are still scenarios where their applicability is at least questionable, but Gen AI has anchored itself in software development for good. How so? It shines with formal languages that are simpler than natural ones, such as programming languages. It also deals surprisingly well with stacked levels of abstraction, sidestepping human constraints like cognitive load limits. Last but not least, it excels at quickly producing code - and there's still a common (and very incorrect) perception that the speed of writing code is the key constraint in delivering value via software ...
Okay, we have shiny new "AI" toys now, but will they make such a difference? And how? No, this is not yet another hype post praising how model XYZ excels at changing the color of a button or replicating the layout of a well-known e-commerce site. Models can do many things, some better and some worse - but this is just the beginning of the journey, and their capabilities will improve over time. However, today ... I'd like us to focus on the economic aspects of building software (in the context of LLM usage).
Classic economics of building software
Until now, when you wanted to build something, the approach was generally straightforward (assuming you knew pretty much what you wanted to build):
- Map product vision onto solution architecture (what we have to develop)
- Inventory the future artifacts and the work needed to create them (aka break down the work)
- Estimate (more or less accurately) and allocate the work to the teams/folks available
- Execute! (in smaller or bigger iterations)
If you wanted to adjust the "when" (when the results will be available), your choices were the same as 20, 30, or 40 years ago:
- Adjust one or more of the classic four variables: scope (e.g., remove features), time (e.g., add more time), people (e.g., assign more engineers), or quality (e.g., ditch writing unit tests) ...
- ... but you could not significantly affect the velocity of an individual engineer; if there is work to be done, someone has to do it, and that "someone" can't legally be fed steroids/drugs to perform XX% faster ...
LLM tokenomics
Okay, but now some things have changed ...
- Some work now can be (the boundaries of what can and what can't are and will be drifting) assigned to LLMs or agents powered by LLMs (even if it means that someone has to review it later).
- Hence, human work is potentially substitutable with LLM "work" (the cost of human effort with the cost of LLM inference) - it's only a matter of time before the first formulas for that start flying around ("For a typical MVC application in ASP.NET with server-side rendering, every hour of an average software engineer is roughly NNN tokens").
- Of course, the tricky thing is that for different kinds of engineering work (writing front-end/back-end/infra/... code, adding missing tests, creating docs, reviewing code, etc.), the "conversion factor" will differ - but that inconvenience seems manageable.
- To make it even more interesting, the quality of LLM "work" improves once you enrich the context with additional information (infra specification, observability metrics, a wider scope of the solution, internal knowledge base, history of past incidents, etc.) - but this optional investment increases the inference costs (making the formula even more complicated). Some compare it to injecting nitrous oxide (NOS) into an engine - yes, it's an oversimplification, but a very vivid one.
Let's examine the implications of the four points above. Long story short, executives can now boost development velocity with an entirely new, highly variable resource: tokens. The classic quadrangle has become a pentagon.
The return on (LLM tokens) investment will vary a lot, depending on the activity type, but ALSO on how much you've already invested (the law of diminishing returns applies here). In certain circumstances, applying LLMs may actually bring zero (or nearly zero) value. Knowing where and when to apply LLM tokens for the most significant outcome will soon become a highly sought-after skill.
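The substitution logic above can be sketched as a back-of-the-envelope calculation. Everything below - the hourly rate, the token price, the task-to-token conversion, the review overhead - is a hypothetical placeholder, not a measured value:

```python
# Hypothetical "protein" vs token cost comparison for a single task.
# All numbers are illustrative assumptions, not measured conversion factors.

def human_cost(hours: float, hourly_rate: float) -> float:
    """Cost of doing the task entirely with an engineer."""
    return hours * hourly_rate

def llm_cost(tokens: int, price_per_mtok: float,
             review_hours: float, hourly_rate: float) -> float:
    """Cost of doing the task with an LLM, including the human review pass."""
    inference = tokens / 1_000_000 * price_per_mtok
    return inference + review_hours * hourly_rate

# Assumed task: ~6 engineer-hours, or ~2M tokens plus a 1-hour review.
protein = human_cost(hours=6, hourly_rate=80)
token_route = llm_cost(tokens=2_000_000, price_per_mtok=15,
                       review_hours=1, hourly_rate=80)

print(f"human: ${protein:.2f}, llm + review: ${token_route:.2f}")
```

Even a toy formula like this makes the diminishing-returns point visible: the review-hours term doesn't shrink as you pour in more tokens, so past some point extra inference spend stops moving the total.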
Wait, that's not all. Even if the LLM cost to deliver effect XYZ is higher than the alternative "protein" cost (that would provide the same value), it may STILL make more economic sense to use the LLM. Why so? LLMs are faster than humans - this may give you a first-mover advantage.
The deliberate economic consideration of how to apply LLM tokens in software product development so that they most effectively replace or enhance human work is what I call LLM tokenomics.
It will be a completely new "discipline", similar to what FinOps was to the cloud.
Plainly speaking
But what does that mean in practice? Very soon we'll all be hearing more & more questions (from our CEOs, CPOs, CTOs, ...) like these:
- "Should we boost the development efforts on XYZ with LLMs?"
- "In which phases/activity types?"
- "How much should we spend optimally?"
- "What will be the effect on the delivery plan?"
So far, in many cases - e.g., fixed-price projects or product development teams - the cost side was pretty static: you had a given "capacity", and it had its change inertia (e.g., hiring and onboarding take time). But now, execs will have a dual budget: a fixed "protein" budget (people) and a dynamic, flexibly allocatable token pool, applicable where it's needed (and "productive") most. Those budgets will have different dynamics; they will be planned and spent in different ways. It's also likely that, in the long term, these budgets will be (to some degree) interchangeable, at least for certain types of work.
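One way to picture a "flexibly allocatable token pool" is a greedy allocation by marginal return, with a crude diminishing-returns model baked in. The activity names and return factors below are entirely hypothetical:

```python
# Sketch: spend a token pool (in millions of tokens) across activity
# types, always funding the activity with the highest remaining marginal
# return. Halving the return after each grant is a deliberately crude
# stand-in for diminishing returns; real curves would be estimated.

def allocate_tokens(pool_mtok: int, returns: dict[str, float]) -> dict[str, int]:
    marginal = dict(returns)           # current marginal return per activity
    alloc = {name: 0 for name in returns}
    for _ in range(pool_mtok):         # grant 1M tokens at a time
        best = max(marginal, key=marginal.get)
        alloc[best] += 1
        marginal[best] /= 2            # diminishing returns on that activity
    return alloc

# Hypothetical activities and per-Mtok return estimates.
plan = allocate_tokens(10, {"tests": 8.0, "docs": 2.0, "boilerplate": 6.0})
print(plan)
```

The point isn't the toy math - it's that a token budget, unlike headcount, can be re-pointed at a different activity tomorrow, which is exactly the planning question the execs' questions above are circling.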
Oh well, interesting times ahead ...