There are very few topics online these days that receive as much attention as the productivity gains expected from LLM usage. Some CEOs publicly claim that their companies "write NN% of their code with Gen AI" or "have reduced headcount by MM% thanks to AI-powered automations". Of course, such statements cause massive FOMO in all other CEOs, who don't want to miss the opportunity of being early adopters of such a game-changing technology.
Not so surprisingly, the companies that brag about those wondrous effects are typically the ones that build and sell AI/agentic tools ... On the other hand, numerous surveys and reports claim that the impact of applying AI to everyday work is uneven, limited to specific roles/conditions, and in many cases questionable (because of the controversial choice of metrics used to prove the point).
So, what's the truth? Are LLMs making us more productive? If so, can anyone be boosted with Gen AI? One could easily write a whole book about that (even if we scope it down to applying Gen AI to Software Engineering only). Still, this humble blog post is limited to one particular observation of mine:
While there are engineers who get ultra-productive with LLM-powered tools, the gain for the overwhelming majority is far more limited. Barely noticeable.
Why? I believe I have an answer, and I'd like to share it with you.
The "depth"
To present my full thesis, I need to guide you through my chain of thought, so bear with me.
Anyone who puts some effort into observing how developers work with AI coding tools can easily notice differences in the "depth" of their usage:
- Some rely upon the default configuration of their Gen AI IDE plugin and treat it as a "smarter" auto-completion.
- The next step (in AI coding initiation ...) is more conscious context management (referencing specific files, adding instructions to the context).
- Folks who want to go further create custom commands, give LLMs access to bash tools, or connect MCP servers (for their task trackers, knowledge bases, or even the infra control plane).
- The level "pro" is setting up their own autonomous agents (to automate repeatable/well-defined/tedious jobs) or going all-out when it comes to LLM output validation (with hooks or by cross-validation).
I firmly believe that productivity gains are directly related to depth (as long as "depth" is applied deliberately and reasonably). Why so? Depth is what makes AI "scale" better - you can run multiple agents in parallel, replace manual inspections with properly crafted automated evals, and avoid many "round-trips" (between developer and agent) with proper context management.
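The "scale" argument is easiest to see in code. The sketch below fans a few independent, well-scoped tasks out to agents concurrently instead of baby-sitting them one by one; `run_agent` is a hypothetical stand-in for whatever agent CLI or API you actually use, and the task list is made up for illustration.

```python
"""Sketch: run several well-isolated agent tasks in parallel.

`run_agent` is a hypothetical placeholder - in practice it would call
the agent CLI/API of your choice and return its result.
"""
from concurrent.futures import ThreadPoolExecutor


def run_agent(task: str) -> str:
    """Placeholder: dispatch one well-defined task to a coding agent."""
    return f"[done] {task}"


tasks = [
    "replace deprecated logging calls in module A",
    "add missing type hints in module B",
    "regenerate the API client from the OpenAPI spec",
]

# Each task is independent, so nothing stops us from running them concurrently
# and only reviewing the results that pass the automated gate.
with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
    for outcome in pool.map(run_agent, tasks):
        print(outcome)
```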
But where does this "depth" variance (some go "deeper" while others do not) come from? Is it a matter of intellectual laziness? Lack of training? Fear of becoming irrelevant? Old habits kicking in? Well, the answer is nuanced and multi-part ...
1st Reason: The Loop
First of all, the deeper into LLMs you get, the more you need to change your everyday "loop" (standard working routine) - and in the case of software engineers, some are in this job primarily because of how the "traditional" routine looks! They love how things "snap" into place, they appreciate the "aesthetics" of the code they craft manually, and they have developed an internal reward function that gets triggered when they produce code according to their beliefs (clean code, "DDD-compliant", implements a well-known pattern, etc.). Or they simply like "solving riddles" on their own (what's the fun when an LLM does that for you?).
Besides, going deeper in many cases feels more like delegating to and supervising agents. And reviewing tons of LLM-generated changes is boring & tedious - wasn't Gen AI supposed to deal with exactly that kind of stuff?
Such folks will resist going "deeper" with LLMs simply because it deprives their work of the fun factor. This group is very used to getting things "their way" because they remember their privileged position in the ZIRP era. But reality has changed, and this group will sooner or later perish, ruthlessly exterminated by outcome-driven CEOs/CTOs.
2nd Reason: The Control
The second reason is equally important. Reliable, accountable engineers do not like letting control slip out of their hands. We, software builders, tend to be brilliant control freaks (who can inspect and verify A LOT in our heads) in a very deterministic discipline (that's why we call it software ENGINEERING). Handing part of that control over to non-deterministic LLMs violates many of the fundamental beliefs and ground rules this industry has had since its inception.
It's far easier to take full ownership of something you've personally designed, written, polished, and refactored to a state that fits your head perfectly. Enter the LLMs, like a herd of cats free-roaming everywhere with inhuman speed and no remorse when turning the artifacts of your craft upside down.
This problem will disappear in time. Just like older generations have learned to write tests, configure static and dynamic checks, or implement sophisticated observability for distributed systems, we'll quite soon be fully used to: writing immutable (& well-isolated) code (that can easily be replaced with yet another prompt), building NL-phrased evals, using next-gen version-control tools (designed for LLMs, like DeltaDB), or having swarms of autonomous agents monkey-test in search of tricky edge cases ...
3rd Reason: 🙊🙉🙈
OK, and now we're getting to the trickiest part - the third reason why so many developers are not getting substantially more productive with Gen AI.
This is the end of part I. Sorry for the cliffhanger, but if you're interested in what the final reason may be and why I find it the most important one, you'll need to check No Kill Switch again in a week or two. Feel free to guess what I have in mind and place your bet in the comments below.