Micromonoliths: scaling via sharding - part I

In this post you'll find: micro-monolith neo-evangelism :), a great (video) example of premature system distribution hitting the fan ..., debunking of the microservice independence myth and an introduction to X-centric systems.

I hate blog posts, books or conf talks that focus on criticism only - it's far easier to b*tch about something than to propose an actual solution to non-trivial problems. That's why, after my recent post (on why microservices may not be the optimal answer to all of your growth problems), I'd like to re-visit an architectural pattern almost as old as Computer Science itself - and, in fact, a very under-appreciated one ... Sharding.

It's not a single answer to all scalability problems, but in my case, I treat it as one of 5-7 key techniques (along with e.g. CQRS, capability-based aggregates, open standard APIs, ...) I usually use to build future-proof solutions.

What is sharding?

Sharding is the idea of scaling a solution out horizontally (into separate processes) not by splitting it by functions (responsibilities), but into micro-instances that contain the same functionality, yet each processes 1/X of a homogeneous traffic volume (using 1/X of the data). Sharding doesn't cover only the split of stateless processing (basic service redundancy) but, most of all, the split of data used by each shard exclusively.

Key principles of sharding:

  1. avoid having shards communicate with each other (except for shard balancing)
  2. make it possible to adjust the range of data covered by a shard
  3. the partitioning algorithm (how to distribute data across shards) is scenario-specific
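To make the principles above concrete, here's a minimal sketch of deterministic, hash-based shard routing (the function name, key format and shard count are my own assumptions for illustration, not from the post):

```python
# Minimal sketch of deterministic shard routing: each request carries a
# partition key; a stable hash maps it to exactly one shard.
import hashlib

NUM_SHARDS = 4  # assumed fixed fleet size for this sketch

def shard_for(partition_key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a partition key to a shard index with a stable hash.

    Using a cryptographic hash (not Python's built-in hash()) keeps the
    mapping consistent across processes and restarts.
    """
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# Every request for the same key lands on the same shard, so shards never
# need to ask each other where a key lives (principle 1 stays intact).
print(shard_for("user-42") == shard_for("user-42"))  # True
```

Because any instance of the router computes the same answer without coordination, routing stays stateless and cheap.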

At first glance sharding may not look very appealing:

  • where's the decoupling?!
  • where's the independent functionality deployment?!
  • what about aggregations (e.g. reports) over the full data volume that can't be sharded?!

These are all valid concerns - no solution is perfect, and by moving from the concept of microservices to sharding you'd trade certain problems & challenges for different ones. Still, it's a good deal in this case (IMHO) - let's get into the details to prove this thesis ...

Myth of independence

Real-life microservices are independent only at the lowest level (atomic business capability) - they are just building blocks for normal, real-life business interactions (which frequently involve several capabilities for each operation that is 'atomic' from the end-user perspective). If your system has a complex domain and involves a lot of reads (which are synchronous by their nature), here's what can happen:

TL;DW (Too Long; Didn't Watch) - an excessive (& out-of-control) number of cross-process calls (needed to serve a single request!) adds more & more latency, quickly exhausting connection pools & making the whole system unresponsive.

When sharding, communication between capabilities (services that are not separate processes, but modules of a well-structured monolith) is intra-process. This reduces the latency to a minimum.

Do I need independent scaling for capabilities?

With microservices it's simple - if process X serves N times the volume of process Y, you can slap N times more instances on X than on Y - voila. And it makes sense (except for the fact that each inter-process call has to go through a load balancer ...). What can we do in the case of sharding?

When sharding, the idea is to distribute the domain (&, as a consequence, the data & processing) across the nodes in a way that the total load (regardless of request type) is (roughly) equally balanced across all of them. If you can do that (design the split algorithm, maintain & monitor the balance, add more nodes along the way - if needed, etc.), you're perfectly covered - there's no point in worrying about differences in traffic between particular functionalities (capabilities/"services").
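One common way to keep the balance while still being able to add more nodes along the way is consistent hashing with virtual nodes. A sketch under my own assumptions (the class, shard names and vnode count are invented for illustration):

```python
# Sketch of a consistent-hash ring with virtual nodes - growing the fleet
# moves only a fraction of keys instead of reshuffling everything.
import bisect
import hashlib

def _h(value: str) -> int:
    # stable 64-bit hash, consistent across processes and restarts
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")

class ConsistentRing:
    """Hash ring with virtual nodes; adding a shard moves only ~1/N of keys."""

    def __init__(self, shards, vnodes=64):
        self._ring = []  # sorted list of (point, shard) pairs
        for shard in shards:
            self.add(shard, vnodes)

    def add(self, shard, vnodes=64):
        for i in range(vnodes):
            bisect.insort(self._ring, (_h(f"{shard}#{i}"), shard))

    def shard_for(self, key: str) -> str:
        # first ring point clockwise from the key's hash (wrapping around)
        idx = bisect.bisect(self._ring, (_h(key), "")) % len(self._ring)
        return self._ring[idx][1]

ring = ConsistentRing(["shard-a", "shard-b", "shard-c"])
before = {f"key-{i}": ring.shard_for(f"key-{i}") for i in range(1000)}
ring.add("shard-d")  # grow the fleet ...
after = {k: ring.shard_for(k) for k in before}
moved = sum(before[k] != after[k] for k in before)
# ... and only the keys now owned by shard-d change their home shard
```

The design choice here: every key that moves, moves onto the new shard - existing shards only shed load, they never exchange keys among themselves.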

And in fact, you get some "bonus" gains: like A/B testing ability by design (you can update one or a few nodes while the majority still runs the previous version).

Is splitting into shards a big deal?

As you can easily deduce from the previous paragraphs, the effectiveness of sharding relies on the successful implementation of:

  1. domain (/data/traffic) split (AKA partitioning)
  2. routing requests to the correct shard(s)

Both of these are heavily scenario-dependent - there's no single optimal algorithm, no single perfect approach; in fact, in many cases you may be forced to apply more than one partitioning algorithm in a single system (e.g. one for core OLTP, one for reporting).

To design your partitioning algorithm you need to truly understand your domain:

  • what will the traffic look like (over time) / how will key entities scale up with business growth?
  • what are the most common usage scenarios & what transaction boundaries will these have?
  • will the platform be read- or write-heavy? will the read & write models be similar?
  • for your key entities (aggregate roots for your key transactions) - what would be the best potential split criteria to distribute the traffic equally - geographical, alphabetical, round-robin across sequential identifiers, recency?
  • if there's a need for any aggregations, can these be pre-calculated (at least partially)? who will use these aggregations & how often?
  • which data has to be 100% online (in real-time), and which data can be stale - & for how long?
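To illustrate the pre-calculated aggregations question: a hypothetical sketch where each shard maintains its own partial aggregate as writes come in, and a global report is just a cheap merge (entity names and numbers are invented):

```python
# Per-shard pre-aggregation with a cheap merge step, so reports over the
# full data volume never need cross-shard transactions.
from collections import Counter

# each shard keeps its own partial aggregate up to date locally
shard_totals = [
    Counter({"orders": 120, "revenue": 5400}),
    Counter({"orders": 95, "revenue": 4100}),
    Counter({"orders": 130, "revenue": 6050}),
]

def merged_report(partials):
    """Merge shard-local aggregates into one global (possibly stale) view."""
    total = Counter()
    for partial in partials:
        total += partial
    return dict(total)

print(merged_report(shard_totals))
# {'orders': 345, 'revenue': 15550}
```

The trade-off is exactly the one from the last bullet: the merged view may lag slightly behind real-time, which is acceptable for most reporting.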

I won't cover all the potential considerations here, BUT based on my past experience, 95% of domains (& models, & applications) can be classified as X-centric, where X is some particular key business entity.

I've seen systems that were: contract-centric, company-centric, user-centric, product-centric, location-centric, ...

For X-centric systems, all the major transactional data (obviously excluding business configuration, audit logs, etc.) is somehow (directly or indirectly) linked to X. As a consequence, each user (or narrow user group) operates across many tables (/object hierarchies), BUT all of them sit in a single subgraph with one single entity X (or a few - but the list is strictly bounded) as its root. And yes - this makes X the perfect candidate to base partitioning upon.
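The subgraph idea can be sketched in a few lines - a contract-centric example (entity names and the split rule are my own invention): every row in the transactional subgraph carries the root entity's id, so one rule shards all of it, and a whole contract's subgraph is co-located on one shard.

```python
# X-centric partitioning sketch: X = contract, and every entity in the
# subgraph carries contract_id, so a single rule routes the whole subgraph.
NUM_SHARDS = 4

def shard_of(root_id: int, num_shards: int = NUM_SHARDS) -> int:
    # round-robin across sequential identifiers (one of the split criteria
    # listed above); swap in geography, recency, etc. as the domain dictates
    return root_id % num_shards

# the contract and its dependent entities all reference the same root id
contract = {"contract_id": 1042, "customer": "ACME"}
invoice = {"invoice_id": 7, "contract_id": 1042, "amount": 99.0}
payment = {"payment_id": 3, "contract_id": 1042, "amount": 99.0}

shards = {shard_of(row["contract_id"]) for row in (contract, invoice, payment)}
print(len(shards))  # 1 - the whole subgraph lives on a single shard
```

Co-locating the subgraph is what keeps key transactions shard-local: no distributed transaction is needed as long as the operation stays inside one X.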

Let's illustrate that with some examples ...

The next post in the series can be found here.

Sebastian Gebski

About Sebastian Gebski

Geek, agilista, blogger, codefella, serial reader. In the daylight - I lead software delivery. #lean #dotnet #webdev #elixir. I speak here for myself only.