If you search through developer-targeted resources on the web, you’ll find plenty of information about patterns and anti-patterns of software development: TDD, dependency injection, BDD, pair programming, composition vs inheritance, DRY, KISS and many, many more. But I feel there’s one important topic that’s just as crucial as the ones above but doesn’t get as much _love_ as it should: coupling. That’s why I’ve decided to write this blog post.
So why is there so little information on the web about coupling if it’s that important? Because it only gets really painful when several conditions are met, and they do not apply to every software development project:
- you’re dealing with a large IT landscape of several separate applications that communicate with each other
- those applications are more or less actively developed (or at least maintained)
- the development is performed in-house (outsourcing the development forces some practices that help with fighting the coupling)
Let’s focus on these conditions (typical for large software factories that produce software for internal use), just to keep things simpler.
Before I get into details, a proper theoretical introduction is a must. Coupling may occur in 3 stages of application development:
- compile-time
- deployment-time
- run-time
What does it mean?
- Compile-time coupling means that a change in component X requires component Y to be recompiled. The more complex the network of dependencies is, the more artifacts have to be recompiled (and re-deployed and re-tested and re-released). This type of coupling has an improved version called …
- Deployment-time coupling means that once you modify component X, you have to redeploy it to the instance of component Y that depends on X (but you don’t have to recompile Y). This scenario typically happens when you’re using dependency injection and proper interface (contract) extraction.
- Run-time coupling means that a change in X doesn’t require any changes to the instance of Y, but Y requires X to operate properly at run-time (for instance, because it uses an endpoint X publishes); if X is not present, Y doesn’t work properly (it is dysfunctional).
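To make the "interface (contract) extraction" idea a bit more concrete, here is a minimal Python sketch. All names in it (`RateProvider`, `FixedRateProvider`, `InvoiceCalculator`) are made up for illustration. Python has no separate compile step, so the sketch only shows the shape of the design: the consumer knows the contract, and the implementation is injected from outside.

```python
from typing import Protocol


class RateProvider(Protocol):
    """The extracted contract: the only thing component Y knows about X."""

    def rate(self, currency: str) -> float: ...


class FixedRateProvider:
    """One possible implementation, living in component X (hypothetical)."""

    def __init__(self, rates: dict[str, float]) -> None:
        self._rates = rates

    def rate(self, currency: str) -> float:
        return self._rates[currency]


class InvoiceCalculator:
    """Component Y: depends on the contract, the implementation is injected."""

    def __init__(self, provider: RateProvider) -> None:
        self._provider = provider

    def total_in_eur(self, amount: float, currency: str) -> float:
        return amount * self._provider.rate(currency)


# The wiring happens in one place, outside of Y.
calculator = InvoiceCalculator(FixedRateProvider({"USD": 0.5}))
print(calculator.total_in_eur(10.0, "USD"))  # prints 5.0
```

As long as the contract stays the same, `FixedRateProvider` can be swapped for another implementation without touching the code of `InvoiceCalculator`.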
Ok, so what’s so bad about points 1 & 2 (compile-time and deployment-time coupling)? Isn’t it just how software development works, so we have to live with it? When something has to be re-compiled / re-deployed, nobody seems to be harmed, right?
I’m afraid it’s not THAT simple.
Imagine that you have a utility library that has become really popular - several products (applications) use it already, they depend on it (are coupled with it). When you make a change, you also have to:
- either touch (update, re-compile, re-deploy, etc.) the coupled applications (and that may mean un-planned work no one has time to do)
- or leave these apps on the previous version, which is sometimes not possible and in other cases means that you have to keep maintaining several versions of the library -> splitting a code line may easily double (or worse) the amount of work - for instance, the time spent on fixing bugs
The problems escalate if your apps don’t have sufficient suites of automated tests - every unplanned change (due to a new version of an artifact you depend on) is a pain, because you have to do regression testing.
Direct coupling is a pain, but remember that component A may depend on component B, which in turn depends on component C - such "cascade" (indirect) dependencies may in the end mean that you need to test the complete IT landscape each time one simple component changes. And even if testing is covered, think about the artifact repository and Continuous Integration:
- every change causes cascading compilations of several projects
- it’s easy to lose your grip on a proper CI setup (to avoid compiling unnecessary stuff)
- it’s hard to determine what actually has to be released (in terms of artifacts) once you’re about to release a feature set - you may end up releasing far more than you initially assumed
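The cascade effect described above is easy to underestimate, so here is a small sketch of how you could compute it. The component names and the `DEPENDS_ON` map are purely hypothetical; in a real landscape this information would have to come from your build files or artifact repository.

```python
from collections import deque

# Hypothetical landscape: each key depends directly on the listed components.
DEPENDS_ON = {
    "utils": [],
    "billing-lib": ["utils"],
    "billing-app": ["billing-lib"],
    "reporting-app": ["billing-lib", "utils"],
    "intranet": [],
}


def affected_by(changed: str, depends_on: dict[str, list[str]]) -> set[str]:
    """Return every component that directly or indirectly depends on `changed`."""
    # Invert the graph: component -> components that depend on it directly.
    dependents: dict[str, set[str]] = {name: set() for name in depends_on}
    for name, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(name)

    # Breadth-first walk over the inverted graph, starting at the change.
    affected: set[str] = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected


print(sorted(affected_by("utils", DEPENDS_ON)))
# prints ['billing-app', 'billing-lib', 'reporting-app']
```

In this toy landscape a change to `utils` touches three out of the four other components, even though only two of them use it directly.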
Don’t forget about point 3 (run-time coupling).
Run-time coupling is one of the worst enemies of the so-called DevOps Nirvana - "Continuous Delivery". It’s also the most complex type of coupling, with several sub-types:
- Data (State) coupling - you depend on another component’s data / state; in other words: you make an ASSUMPTION about another component’s data (content) or its format.
- Functional (Behavioral) coupling - you depend on the behavior of another component in a way that is not guarded by its API / contract. Typically there’s a shared “convention”, for instance: you can call this function only if you called that other one earlier; the identifier is a combination of company code and contract identifier; if this argument is equal to zero, you can skip the last one, etc.
- Temporal (Context) coupling - there is state not bound to data (usually related to time) that has a very serious impact on the interaction between components - for instance: this component is accessible only between 8:00 and 17:00; this can be checked only after midnight, when the EoD (end-of-day) processing is over.
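Take the "identifier is a combination of company code and contract identifier" convention from the list above. A hedged sketch of what it could look like once the convention is written down in code instead of living in people’s heads: the `ContractRef` type and the `<company code>-<contract id>` format are my assumptions, used only for illustration.

```python
from dataclasses import dataclass

# Assumed format, for illustration only: "<company code>-<contract id>".
SEPARATOR = "-"


@dataclass(frozen=True)
class ContractRef:
    """Makes the implicit 'mixed identifier' convention explicit."""

    company_code: str
    contract_id: str

    @classmethod
    def parse(cls, raw: str) -> "ContractRef":
        # Split on the first separator only; both parts must be non-empty.
        company_code, sep, contract_id = raw.partition(SEPARATOR)
        if not sep or not company_code or not contract_id:
            raise ValueError(f"not a valid contract reference: {raw!r}")
        return cls(company_code, contract_id)

    def __str__(self) -> str:
        return f"{self.company_code}{SEPARATOR}{self.contract_id}"


ref = ContractRef.parse("ACME-000123")
print(ref.company_code, ref.contract_id)  # prints ACME 000123
```

The coupling doesn’t disappear, but at least a broken assumption now fails loudly in one place instead of silently in every component that splits the string on its own.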
But what’s the fuss about? Why does run-time coupling hurt so much?
- It’s much harder to test your components independently (even automatically), and that means it’s also much harder to release them independently
- in many cases a broken state / lack of expected behavior / incorrect temporal context in another component will cause your component to stop working properly - it usually happens when those two communicate synchronously and / or there’s no proper error handling
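The "synchronous call with no proper error handling" case can be sketched in a few lines. Everything here is hypothetical: `fetch_discount` stands in for a call to an endpoint published by X, and it always fails to simulate X being down. Whether falling back to "no discount" is acceptable is a business decision, so treat the guarded variant as one option under that assumption, not as the right answer.

```python
class ServiceUnavailable(Exception):
    """Raised by the (hypothetical) client when component X cannot be reached."""


def fetch_discount(customer_id: str) -> float:
    """Stand-in for a synchronous call to X; here X is always down."""
    raise ServiceUnavailable("X is down")


def price_unguarded(base_price: float, customer_id: str, fetch=fetch_discount) -> float:
    # No error handling: any failure in X becomes a failure in Y.
    return base_price * (1 - fetch(customer_id))


def price_guarded(base_price: float, customer_id: str, fetch=fetch_discount) -> float:
    # Y degrades instead of breaking: assume no discount if X is unavailable.
    try:
        discount = fetch(customer_id)
    except ServiceUnavailable:
        discount = 0.0
    return base_price * (1 - discount)


print(price_guarded(100.0, "c-1"))  # prints 100.0
```

Calling `price_unguarded(100.0, "c-1")` with the same stand-in raises `ServiceUnavailable`, which is exactly the situation described in the last bullet.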
[ End of Part I ]
In Part II I’ll write about particular examples of coupling doing serious harm to real-life projects and about what can be done to reduce coupling.