It seems that technical debt is one of my favorite topics ;P Looking back just a few months, I've assembled at least three posts on the subject. One of them is specifically about the cost of running software (https://no-kill-switch.ghost.io/the-cost-of-running-software/), but ... it doesn't answer the most important question:

How the heck do you actually estimate the maintenance cost of the software?

Hmm ...

so-called "expert" method? (aka just slap 30% of your total capacity)
a result of some code-based formula - e.g. based on the line/function/module count?
a direct derivative of codebase age? technologies used?
proportional to new development effort (within the same module)?

I've got some news for you. Both good and ... not so good.

First of all: there's no single formula that could give you an accurate estimate of the maintenance effort needed - neither up-front nor even for the present moment. It's just not possible, because there are so many factors that have to be taken into consideration:

tech debt level
complexity (accidental & inherent)
team qualifications
operational maturity
etc.

Step #0

But don't despair (just yet) - there are many ways. Among them, two of my favorite ones: the trend method & the transition method. I'll get into the details in a moment, but first I'll need to introduce a very simple, yet absolutely essential pre-requisite for both those methods.

You need to start measuring stuff (if you don't do that already).

What "stuff"? Various measures of solution quality (proper for your context):

time spent on maintenance (e.g. avg weekly)
time spent on toil (manual operations that should have been automated or simply - stuff that should not be done at all in proper, sane conditions)
ratio of the total waiting time to the cycle time
number of tickets raised per time unit
depth of a "buglog" (backlog with open bugs/issues)
etc.
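To make this a bit more tangible, here's a minimal sketch of how a few of the measures above could be derived from a plain work log. Everything in it is an assumption made for illustration: the `WorkItem` record, its field names, the three `kind` labels, and the simplification that cycle time equals hands-on time plus waiting time. Your tracker will most likely expose something different.

```python
from dataclasses import dataclass


@dataclass
class WorkItem:
    """One finished piece of work (hypothetical record, not a real tracker API)."""
    kind: str             # assumed labels: "feature", "maintenance" or "toil"
    hours_worked: float   # hands-on time
    hours_waiting: float  # time spent blocked or queued


def weekly_metrics(items: list[WorkItem], weeks: int) -> dict[str, float]:
    """Aggregate some of the measures listed above over a period of `weeks` weeks."""
    maintenance = sum(i.hours_worked for i in items if i.kind == "maintenance")
    toil = sum(i.hours_worked for i in items if i.kind == "toil")
    waiting = sum(i.hours_waiting for i in items)
    # Simplification: cycle time = hands-on time + waiting time.
    cycle = sum(i.hours_worked + i.hours_waiting for i in items)
    return {
        "maintenance_hours_per_week": maintenance / weeks,
        "toil_hours_per_week": toil / weeks,
        "waiting_to_cycle_ratio": waiting / cycle if cycle else 0.0,
        "items_per_week": len(items) / weeks,
    }


if __name__ == "__main__":
    log = [
        WorkItem("maintenance", hours_worked=6.0, hours_waiting=2.0),
        WorkItem("toil", hours_worked=2.0, hours_waiting=0.0),
        WorkItem("feature", hours_worked=8.0, hours_waiting=2.0),
    ]
    for name, value in weekly_metrics(log, weeks=2).items():
        print(f"{name}: {value:.2f}")
```

The point is not the exact formulas, but that once the raw records exist, the measures fall out of them with no extra manual reporting.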

Tracking all that may appear tricky, but in the majority of cases it can be automated up to the level where the measure is a side effect of the actual work. OK, now let's get back to our heuristics.

Mind the deltas

The trend method is deceptively straightforward - you start with the acceptable level of your metrics (or just a wild guess - what you start with is less important than what you do later ...) & continuously track the trends. If ...

... metrics consistently rise towards warning thresholds, you don't wait but increase the constant ratio of maintenance time vs total effort (& keep observing)
... increasing the maintenance effort doesn't help the trend (anymore), you freeze the development & fight the fire by going for more thorough, precisely targeted refactoring effort
... metrics are rock-solid stable (or even get better), you consider reducing the regular maintenance effort
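The three rules above can be sketched as a tiny decision function. Treat it as an illustration under stated assumptions, not a ready-made tool: the metric is assumed to be "higher is worse" (e.g. buglog depth), the trend is approximated with a least-squares slope over the last few periods, and the `tolerance` value and all names (`recommend`, `Action`, `maintenance_share_pct`) are made up for this example.

```python
from enum import Enum


class Action(Enum):
    INCREASE_MAINTENANCE = "increase the maintenance ratio & keep observing"
    FREEZE_AND_REFACTOR = "freeze development, go for targeted refactoring"
    CONSIDER_REDUCING = "consider reducing the regular maintenance effort"


def slope(values: list[float]) -> float:
    """Least-squares slope of the values over their index (units per period)."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


def recommend(metric: list[float],
              maintenance_share_pct: list[float],
              tolerance: float = 0.05) -> Action:
    """Map recent history to one of the three rules (higher metric = worse)."""
    if slope(metric) > tolerance:
        # Metric keeps getting worse. If the maintenance share has already
        # been growing, more of the same is not helping anymore.
        if slope(maintenance_share_pct) > 0:
            return Action.FREEZE_AND_REFACTOR
        return Action.INCREASE_MAINTENANCE
    # Stable or improving metric.
    return Action.CONSIDER_REDUCING


if __name__ == "__main__":
    buglog_depth = [10, 12, 14, 16]
    share = [20, 20, 20, 20]  # % of total effort spent on maintenance
    print(recommend(buglog_depth, share).value)
```

In practice the `tolerance` has to be tuned per metric, because a slope of 0.05 means something very different for "tickets per week" than for "hours of toil".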

The assumption here is the direct correlation of your maintenance effort to technical debt costs. Of course this correlation exists only if your efforts are truly well aimed (address actual, meaningful issues).

Convert debt into debt-prevention

The transition method is somewhat different & it's more suited for when you're already struggling at the waterline (yikes!). What you start with is the measurement of the time you currently spend on all "unwanted" activities (costs of technical debt) - effects of poor Development Agility or tedious, manual activities (toil). You write this time off as "waste" & commit to turning this "waste" 1:1 into activities aimed directly at fighting it. In other words - whatever you save, you spend on burning down even more tech debt (which is not an issue, because this time was already written off).

How to do that?

whatever happens, keep measuring! (to know the effects of your actions)
start with quick wins & low-hanging fruit - to "buy some time" in the initial phase
keep iterating in short time-frames
if there are no real quick-wins in the beginning, negotiate some "time debt" - additional, initial investment to get a visible, meaningful effect really fast (it's more about psychological effect than anything else)
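The 1:1 conversion can be shown with a toy simulation. It is only a sketch built on invented assumptions: the written-off budget stays constant, every hour spent on fighting debt removes `payoff` hours of waste from each following iteration, and `seed_hours` stands for the initial quick wins or the negotiated "time debt". None of these numbers come from real data.

```python
def simulate_transition(initial_waste: float,
                        payoff: float,
                        iterations: int,
                        seed_hours: float = 0.0) -> list[tuple[float, float]]:
    """Return (waste_hours, debt_fighting_hours) for each iteration.

    initial_waste: measured hours per iteration lost to tech debt and toil;
                   this whole amount is written off and becomes the budget.
    payoff:        hypothetical hours of waste removed per hour of debt-fighting.
    seed_hours:    extra one-off investment in the first iteration ("time debt").
    """
    budget = initial_waste
    waste = initial_waste
    rows = []
    for i in range(iterations):
        # Whatever is no longer wasted is spent on fighting the debt (1:1).
        fighting = budget - waste
        if i == 0:
            fighting += seed_hours
        rows.append((waste, fighting))
        waste = max(0.0, waste - payoff * fighting)
    return rows


if __name__ == "__main__":
    for n, (waste, fighting) in enumerate(
            simulate_transition(10.0, payoff=0.2, iterations=6, seed_hours=5.0)):
        print(f"iteration {n}: waste={waste:.2f}h, debt-fighting={fighting:.2f}h")
```

Note what happens with `seed_hours=0.0`: nothing is ever saved, so nothing can be reinvested, and the waste never moves. That is exactly why the initial quick wins (or the negotiated "time debt") matter so much in this model.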

This method has a great motivating factor: "whatever you save, you can use it to reinforce your future efforts". It's a great morale boost for result-driven individuals - it helps them in prioritization & keeping a tight track of tangible outcomes.

Both these approaches need a certain level of discipline, but they have the advantage of being very transparent and open (which makes them easier to "sell" to non-technical people). Anyone can learn the effects of your actions pretty much on-the-go (based on actual data).

The reality is not always so simple:

your team may be hopping between several products/modules of varying debt level
some investment may bring deferred value
there are some categories of technical debt that do not translate directly to "time tax" (e.g. bumping the unsupported version of a component)

I want to be perfectly honest here - what I've presented to you above are just heuristics, not a "silver bullet" - nevertheless, it's better to bet on them than pretend the problem does not exist ...

Next week let's talk about some of the worst (yet very popular) ideas of how to treat technical debt ...