TL;DR In our everyday software development work we tend to undervalue type systems, their expressiveness & the role they play in proper domain modeling. By poorly reflecting the true nature of data, we make the codebase more error-prone & less readable - all of that because we assume that int, double, string & class are the 4 words sufficient to describe the static aspects of reality. In fact, it doesn't take much (effort & good will) to fix this.
I may be over-generalising here (on purpose ;>), but it's not that hard to find out the experience level of a software developer based on the code they've written ...
Development of a developer... #CleanCode pic.twitter.com/PJn4WMIQxL
— Mike Kaufmann (@mike_kaufmann) September 21, 2017
OK, but more seriously - more experienced developers establish some patterns & behaviours (bad or good, depending on the environment they've developed in), they come up with some half-baked solutions (their "trademarks") they smoothly & quickly re-apply. It's not necessarily good (remember the problem of the hammer & everything looking like a nail ...), but at least it's a topic for some discussion.
BUT there's one single, big aspect of code that looks very similar regardless of how experienced the developer is. Types. Or more precisely - the way developers use types & their expressive power.
Types, you're doing it wrong
In the fewest possible words, it usually looks like this (I've used C# as the example language):
- everything that looks like a basic type, IS of this basic type (number? Int! some text? String!)
- if something doesn't fit any specific basic type, but has a finite set of "states", make it an enumeration
- if something fits >1 basic type, prefer the most specific one, optionally broadening the value range "for future safety"
- everything else that doesn't fit the points above is a class (there's another consideration about when to inherit / when to compose, but this is in general a separate topic)
- if in doubt (can't name it / it breaks "the perfect" composition / it will be used in just this 1 place) use a tuple, serialise or apply some other nasty work-around ...
No-brainer, right-o? And it works! As it looks deceptively easy, we don't give it much thought - thereby abandoning some of the basic benefits types provide ...
What are the consequences?
- 'Age' is int, so possibly ... -45
- 'BirthDate' is System.DateTime, which is actually a struct, so it can't be null (thanks Bartosz!), but its default value ain't any better at all ... (what does it even mean? unknown? restricted? not important?)
- Complex systems that define several "ratings", "scores", "ranks", "valuations" (e.g. in Financial Services) get extremely hard to read (or rather "decipher") - everything is an int or (even worse ...) a string, so you have to guess how these values correspond to each other / convert to each other (or maybe they have nothing in common at all)
- Validation of these 'basic' types (age > 0 ...) is smeared all over the application, with some gaps & inconsistencies
- If you're not cautious enough, you can add age to your current account balance ... & it will be a perfectly valid operation from the language's perspective ...
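The first & the last point are easy to reproduce. A minimal sketch, assuming a hypothetical `Deposit` helper (made up for this illustration, not taken from any real codebase):

```csharp
using System;

public static class PrimitiveObsessionDemo
{
    // Hypothetical helper: both parameters are plain numbers,
    // so the compiler has no way to tell a balance from anything else.
    public static decimal Deposit(decimal balance, decimal amount)
    {
        return balance + amount;
    }

    public static void Main()
    {
        int age = -45;            // compiles: nothing says an age cannot be negative
        decimal balance = 1000m;

        // Compiles as well: int converts implicitly to decimal,
        // so an age is silently accepted as an amount of money.
        decimal nonsense = Deposit(balance, age);

        Console.WriteLine(nonsense); // prints 955
    }
}
```

Nothing here is a bug from the compiler's point of view, which is exactly the problem.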
Types are modeling tools
I don't have any hard numbers to back this thesis up but ... based on what I've seen I'm willing to risk the statement that the majority of bugs in code are caused by such tiny, easy-to-miss mistakes (think about "corner cases" & what they are about), not by broken "real", algorithm-based business logic (because this is the part that usually has your full focus).
Great, now it's time to hit you with the key statement of this blog post:
Type is an element of both Domain Model AND Ubiquitous Language!
Use all the (syntactical & conceptual) tools you have to reflect it properly then. Just to make sure we're on the same page ... I don't want you to ask your Domain Expert (DE) to clarify whether something should be System.Single or System.Double - this is exactly NOT the point. The DE should be able to describe the type using other terms from the Domain & common knowledge:
- by providing a list of values, ...
- ... boundaries,
- ... or other constraints
Your job is just to codify these rules into the Model (& further - into code).
If there's a need to distinguish Age in your system, make it an explicit type (for any property whose nature conforms to the definition of Age)!
PIN code? Type!
Financial rating? Type!
DUNS number? Type!
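A minimal sketch of what such an explicit type could look like. The `Age` class, its `FromYears` factory & the 0..150 range are all assumptions made up for this illustration (in a real system the boundaries come from your Domain Expert):

```csharp
using System;

// Hypothetical domain type; the 0..150 range is an assumed rule.
public sealed class Age : IEquatable<Age>
{
    public const int Min = 0;
    public const int Max = 150;

    public int Years { get; }

    // Private constructor: an Age can only be created through FromYears.
    private Age(int years)
    {
        Years = years;
    }

    // "Smart constructor": validation lives in exactly one place.
    public static Age FromYears(int years)
    {
        if (years < Min || years > Max)
        {
            throw new ArgumentOutOfRangeException(
                nameof(years), years, "Age must be between 0 and 150.");
        }
        return new Age(years);
    }

    // Value semantics: two ages with the same number of years are equal.
    public bool Equals(Age other) => other != null && Years == other.Years;
    public override bool Equals(object obj) => Equals(obj as Age);
    public override int GetHashCode() => Years;
    public override string ToString() => Years + " years";
}

public static class AgeDemo
{
    public static void Main()
    {
        Age age = Age.FromYears(42);
        Console.WriteLine(age);            // prints "42 years"

        // decimal balance = 1000m + age;  // would not compile: no '+' for decimal and Age

        try
        {
            Age.FromYears(-45);
        }
        catch (ArgumentOutOfRangeException)
        {
            Console.WriteLine("rejected"); // prints "rejected"
        }
    }
}
```

I've picked a class over a struct on purpose here: a struct would bring back a silent default value (`default(Age)`), which is the very thing we're trying to get rid of.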
Be very expressive about the data you operate on. Fortunately, we have plenty of mechanisms that we can use for this purpose (again, samples for C# - which is fortunately at least a type-safe language):
- In fact, most likely the biggest problem (one we've got used to) is null-ability - unfortunately, the most obvious answer (algebraic data types) is not present directly in C#, but you can mimic it easily with libraries like language-ext (or just switch to F# ;>).
- There are (in)famous Code Contracts - they seem to be capable of doing the job but ... their easiest use case is NOT directly related to type (as it's in imperative code), their future is uncertain (lack of support in VS2017), using them as ContractClass is ... a pain ;P
- Don't be afraid of creating new types (classes/records/structs/whatever) to encapsulate both the data & the concept behind it. The tax (the price to pay, e.g. in performance) is negligible, but the benefit is very clear: a place to put validation in (including 'smart constructors' - just don't go overboard with putting business logic there ...) & pattern matching (yes, it's not perfect, but it's already usable & will get even better in C# 7.3).
- If, for whatever reason, you don't want to / can't do more, at least create aliases for your basic types (with a 'using' alias directive) - it will NOT prevent you from mis-assignment, but it will at least make the code more readable (in terms of purpose & logic). Explicit meaning FTW!
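To show what the alias option buys you (& what it doesn't), here's a small sketch. The `AddInterest` method & its 1% rule are invented purely for the example:

```csharp
using System;
// Aliases only rename the type for the reader;
// the compiler still sees System.Int32 and System.Decimal.
using Age = System.Int32;
using Balance = System.Decimal;

public static class AliasDemo
{
    // Hypothetical method: the signature now says what the numbers mean.
    public static Balance AddInterest(Balance balance, Age customerAge)
    {
        // Assumed toy rule, for illustration only: 1% bonus from the age of 65.
        return customerAge >= 65 ? balance * 1.01m : balance;
    }

    public static void Main()
    {
        Age age = 70;
        Balance balance = 1000m;

        Console.WriteLine(AddInterest(balance, age));

        // Still compiles: Age is just int, so it converts to decimal like any other int.
        Balance oops = balance + age;
        Console.WriteLine(oops);
    }
}
```

Keep in mind that such an alias is visible only in the file that declares it, so every file that wants the nicer names has to repeat the directive.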
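As for the null-ability point from the list above, the idea of an option type can be sketched by hand. To be clear: the `Maybe<T>` below is a home-made stand-in written for this post, NOT the language-ext API, & the `Customer` class is hypothetical as well:

```csharp
using System;

// Hand-rolled stand-in for an option type (not the language-ext API).
public struct Maybe<T>
{
    private readonly T _value;

    public bool HasValue { get; }

    private Maybe(T value)
    {
        _value = value;
        HasValue = true;
    }

    public static Maybe<T> Some(T value) => new Maybe<T>(value);

    // default(Maybe<T>) has HasValue == false, so the struct's default IS "no value".
    public static Maybe<T> None => default(Maybe<T>);

    // The only way to get the value out is to handle both cases.
    public TResult Match<TResult>(Func<T, TResult> some, Func<TResult> none)
        => HasValue ? some(_value) : none();
}

// Hypothetical entity: the missing birth date is visible in the type itself.
public sealed class Customer
{
    public string Name { get; }
    public Maybe<DateTime> BirthDate { get; }

    public Customer(string name, Maybe<DateTime> birthDate)
    {
        Name = name;
        BirthDate = birthDate;
    }
}

public static class MaybeDemo
{
    public static string Describe(Customer c) =>
        c.BirthDate.Match(
            some: d => c.Name + " was born in " + d.Year,
            none: () => c.Name + ": birth date unknown");

    public static void Main()
    {
        var ann = new Customer("Ann", Maybe<DateTime>.Some(new DateTime(1985, 3, 14)));
        var bob = new Customer("Bob", Maybe<DateTime>.None);

        Console.WriteLine(Describe(ann)); // prints "Ann was born in 1985"
        Console.WriteLine(Describe(bob)); // prints "Bob: birth date unknown"
    }
}
```

For value types like DateTime the built-in `DateTime?` gets you part of the way, but it doesn't force the caller to handle the "no value" branch the way `Match` does.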