A few weeks ago, we (myself & my two friends - Tomek & Wojtek - as part of a community called CTO Morning Coffee) hosted an online X Spaces session on Domain-Driven Design (DDD). To be clear, this wasn't supposed to be an intro session; quite the contrary. We intended to discuss, from the perspective of senior technical leaders, where DDD stands now, nearly twenty years after its inception. Did it meet the expectations it created? What problems can truly be solved by applying DDD? Is DDD the most effective way to tackle them?
It was a hell of a discussion - mainly because we focused on actual results (backed up with data & real case studies), not good intentions & wishful thinking. It was recorded & we plan to release it as a podcast episode, but that will take some time (there's a queue of episodes to be archived this way ...), and (what may be even more painful) the session was held in Polish - which will inevitably limit the audience it can reach.
To address that last issue, I've decided to spend some time & summarize my viewpoint on DDD (which I partially covered during the session) in written form. To be frank, that viewpoint is critical. But objective & honest feedback (even if, to some degree, negative) is a necessary precondition for improvement, as long as everyone's willing to listen with an open mind.
This is supposed to be a short article, so instead of dissecting DDD into atoms (and evaluating them individually), I'll focus on DDD's main goal & purpose: "tackling the complexity in the heart of software" - did DDD succeed & deliver on that promise?
(disclaimer: in my considerations, I've entirely omitted the most obvious cases of accidental complexity - e.g., adding complexity with unnecessary layers of redundant technology; my focus is on tackling the complexity of the domain & the model)
To answer that question honestly (does DDD help to reduce complexity?), I've developed a personal mental model of complexity based on four factors: F1-F4. In my opinion, complexity can be kept under control when ...
- F1. Problem space is properly (well enough) mapped onto solution space (aka "modeling" has been done correctly).
- F2. We have a way to "slice the elephant" (organize the model "horizontally" in relatively independent parts).
- F3. We are able to navigate across the "elephant" (including vertical traversal - à la Google Maps), adjusting the level of detail dynamically.
- F4. We've established a way to maintain the model long-term.
What do those four factors mean practically, and why are they so important?
F1. The modeling
The products we build should never aim to reflect reality literally. (Physical) reality is unbounded and infinitely complex - most of its details are just noise & distraction. We need a simplified, filtered version of it - those parts (/aspects) of reality that are relevant in the current context. DDD calls it a "model".
But there's no single way to model reality - not even a single way to do it perfectly. There are many "good enough" ones, and the whole challenge is about finding one of them - that's what I mean by "proper/correct modeling".
DDD has done a good job of evangelizing the importance of modeling:
- Emphasizing the difference between domain (problem space) and model (solution space)
- Providing a very useful taxonomy of (sub)domains
- Introducing a named term - Ubiquitous Language (UL) - as a way to properly express the model (with far-reaching implications: if the model is reflected in UL, the model maps 1:1 to the solution (code), and change requests are also phrased in UL, then implementing changes brings the least possible friction & nearly zero waste - see the sketch after this list).
- Albeit not part of "core" DDD, EventStorming is a technique that can greatly help with "mining" input for modeling.
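To make the UL point tangible, here's a minimal sketch (an invented subscription-billing domain - all names are hypothetical, not taken from the Blue Book) of how a requirement phrased in the Ubiquitous Language can map almost 1:1 onto code:

```typescript
// Minimal sketch: an invented "subscription billing" domain where the
// Ubiquitous Language is carried directly into the code.

type SubscriptionStatus = "Trial" | "Active" | "PastDue" | "Cancelled";

interface Subscription {
  id: string;
  status: SubscriptionStatus;
  currentPeriodEnd: Date;
}

// UL rule (as a domain expert might phrase it):
// "A PastDue subscription gets cancelled once the grace period expires."
const GRACE_PERIOD_DAYS = 14;

function cancelIfGracePeriodExpired(sub: Subscription, today: Date): Subscription {
  if (sub.status !== "PastDue") return sub;

  const graceEnd = new Date(sub.currentPeriodEnd);
  graceEnd.setDate(graceEnd.getDate() + GRACE_PERIOD_DAYS);

  return today > graceEnd ? { ...sub, status: "Cancelled" } : sub;
}
```

The point is not the code itself, but that the expert's sentence and the function share the same vocabulary - a change request expressed in UL lands in the implementation with minimal translation.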
So, it looks like DDD provides a very decent foundation here. However, I can't shake the impression that the mission (to add some form & structure to modeling) was abandoned before it truly started. Why so?
- How do we tell a good model from a bad (/flawed) one? What's the quality function of the model?
- What would be a good (/recommended) way to express the model? (so it can be shared & collaborated upon)
- I like DDD's emphasis on retrieving knowledge via direct collaboration with domain experts. However, this "raw" knowledge is initially unstructured & disorganized: what are the recommended heuristics to frame it?
As a result, while modeling, we're pretty much left with instinct, intuition & our unique experience (patterns & mental models we've seen in action before) - DDD doesn't help us much here (it doesn't "package" this body of knowledge for re-use). The model is not a raw dump of experts' brains expressed in UL - just like in every other case, the classical DIKW (data-information-knowledge-wisdom) pyramid applies here: the model is the final stage of that processing pipeline.
F2. Slicing the elephant
Some domains are very complex due to their "size" - the sheer number of concepts involved & the multitude of their interdependencies. At some point, the cognitive load becomes a burden: even the most brilliant individuals fail to keep the whole model in their heads. It begins with not noticing risks, patterns, opportunities, or redundancies. That causes frustration, subsequent mistakes, and ad-hoc defensive tactics that buy us some time but incur severe (model) debt in exchange, accelerating the growth of complexity even further.
That's why the solution space ("elephant") has to be split into relatively independent parts that could be owned, maintained, and developed in (more or less) separation. This split is one of the most challenging topics in software design & ... Domain-Driven Design doesn't help us much here, sadly.
It's not that DDD leaves us completely empty-handed here. We get "bounded contexts", "services" (stateless business logic), and "aggregates" (clusters of entities with their own business logic) - but this mental model is far from sufficient. It feels like it was the author's (Eric Evans') first intuitive (and half-baked) idea, but as it was already in print (in the "Blue Book"), he never dared to iterate on & improve it.
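For readers who haven't seen these building blocks in code, here's a minimal sketch (an invented "ordering" context - all names are hypothetical, not a prescription) of how they typically show up, and where the gap lies:

```typescript
// Minimal sketch of DDD's tactical building blocks (invented "ordering" context).

// Entity: has identity and its own business logic.
class OrderLine {
  constructor(
    public readonly sku: string,
    public quantity: number,
    public readonly unitPrice: number,
  ) {}

  total(): number {
    return this.quantity * this.unitPrice;
  }
}

// Aggregate: a cluster of entities behind one consistency boundary (the root).
class Order {
  private lines: OrderLine[] = [];

  addLine(line: OrderLine): void {
    // the aggregate root enforces its invariants
    if (line.quantity <= 0) throw new Error("quantity must be positive");
    this.lines.push(line);
  }

  grandTotal(): number {
    return this.lines.reduce((sum, l) => sum + l.total(), 0);
  }
}

// Domain service: stateless business logic that doesn't belong to a single entity.
class PricingService {
  totalWithTax(order: Order, taxRate: number): number {
    return order.grandTotal() * (1 + taxRate);
  }
}

// The bounded context ("Ordering") is only implied - usually a package, module,
// or service boundary. Between that boundary and these classes sits a wide
// organizational gap that DDD itself leaves mostly unaddressed.
```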
What are my issues with these concepts:
- "Bounded context" is a revolutionary idea with an extremely foggy definition: it's part of the model, the UL is consistent within its boundary, and it intuitively aggregates focal concepts present in the consciousness of a cooperating group of people (with shared concerns) - and that's pretty much it. It is very subjective and relies on opinions, not facts/measures.
- While "bounded context" is very high level, "services", "aggregates" & "entities" are very low level. Unfortunately, the practical challenge is to organize areas/modules/applications so they group concepts & features that should belong together - DDD doesn't (even attempt to) give clear guidance on how to do that. There's no corresponding concept for such grouping on mid-level, where it's needed most.
- To be frank, the concepts of coupling & cohesion are both mentioned in the "Blue Book", but more as loose guidance than as an attempt to create a prescriptive foundation for structuring models. It could have been a nudge in the right direction, but no action followed.
It looks like yet another wasted opportunity. DDD could have been much better if it had been stripped of low-level (& relatively useless) concepts like domain service, aggregate, or entity, while being enhanced with carefully chosen elements of promise theory (contracts!), a scientific approach to the coupling vs. cohesion dichotomy, or semantic analysis of the domain (conceptual "proximity" of terms, the definition of a "capability", etc.).
Without that, we keep struggling with everyday challenges like these:
- Should "discounts" be a part of "item pricing"? Depend on it? Or the other way around? And the "basket discounts"? And the "volume discounts"? And ...
- What would be the best way to organize courier delivery? Around "courier availability", "routes", "scheduled deliveries", or "availability in the vicinity" (these are four different perspectives on the very same data)?
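To illustrate the first dilemma, here's a minimal sketch (all names are hypothetical) of two equally plausible arrangements - DDD offers no criterion for choosing between them:

```typescript
// Option A: discounts live inside "item pricing".
const pricingWithEmbeddedDiscounts = {
  itemPrice(listPrice: number, discountPct: number): number {
    return listPrice * (1 - discountPct);
  },
};

// Option B: "discounts" is its own concept; pricing depends on it.
type Discount = (price: number) => number;

const discounts = {
  percentageOff(pct: number): Discount {
    return (price) => price * (1 - pct);
  },
};

const pricingWithInjectedDiscounts = {
  itemPrice(listPrice: number, discount: Discount): number {
    return discount(listPrice);
  },
};

// Both "work" and both return 80 for a 20% discount on 100:
pricingWithEmbeddedDiscounts.itemPrice(100, 0.2);
pricingWithInjectedDiscounts.itemPrice(100, discounts.percentageOff(0.2));
```

Which grouping keeps cohesion high & coupling low (especially once basket & volume discounts arrive) is exactly the mid-level guidance that's missing.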
The 2nd part of the series (out of 3) is available here.