Check-mate: the infinity of (check)lists

Another day, another interesting discussion, another topic for a blog post: operations procedures (aka handbooks) in an agile environment - should we bother writing anything down? If so, what for?

The Context

Imagine a DevOps team that works on operations for several actively developed, interconnected applications:

work happens in several streams that may be parallel
there are many test environments, continuously (or close to continuously) integrated, and the list of environments is not set in stone -> new ones are being created, existing ones are being decommissioned
applications are developed by several teams in more than one location

The DevOps team is co-located, they have known each other for some time because they've worked together, and at the moment they have the situation pretty much under control. HOWEVER, they don't have the whole development process automated - several activities are still manual.

The knowledge is spread in quite a reasonable way - each person has a backup in case of absence.

The Question

Should they bother with writing their operations procedures down? What for?

The knowledge about what they do (and how they do it) is common within the team - who would need the docs? They manage without docs on a daily basis.
Written information tends to get out of date if no-one bothers to maintain it - who will guarantee that they update it when they won't even be looking at it (as there's no reason to)?
Isn't being Agile about putting communication over documentation? Isn't it enough to make sure that people really talk to each other about what's relevant instead of writing stuff down?

Dear ladies & sirs, I dare to disagree

I may be one of the first to stone the traditional docs to death, but I'm afraid it's not really as simple as stated above:

Ops/DevOps follow various procedures - some are executed on a daily basis, some are not. The manual ones are, and always will be, error-prone (especially when executed rarely): because people make mistakes, because they forget things, etc. That's why I believe that a minimal set of documentation is needed, for instance in the form of a checklist:

it will make sure people don't forget anything
it will help keep the knowledge fully synchronized between team members - they use the same (ubiquitous) knowledge, etc.

A checklist, as a written but still concise form, is great for boiling a procedure down to its essential steps, so:

the problem gets decomposed ("divide & conquer") - it will be easier for a team member to say how long a particular list item takes, which one causes the most problems, etc.
the parts get trackable (and measurable... and can be commented / described with remarks / annotations)
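
To make "trackable and measurable" a bit more concrete, here's a minimal sketch in Python - an illustration only, not a tool recommendation. All the names (Step, Checklist, complete, pending, slowest) and the sample procedure are made up for this example; a wiki page or a spreadsheet could carry exactly the same information.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Step:
    """One checklist item (hypothetical structure, for illustration)."""
    title: str
    minutes: Optional[float] = None  # time the last run took, if measured
    remarks: List[str] = field(default_factory=list)
    done: bool = False


@dataclass
class Checklist:
    name: str
    steps: List[Step]

    def complete(self, title: str, minutes: float, remark: str = "") -> None:
        """Tick off a step, recording its duration and an optional remark."""
        for step in self.steps:
            if step.title == title:
                step.done = True
                step.minutes = minutes
                if remark:
                    step.remarks.append(remark)
                return
        raise KeyError(f"no such step: {title}")

    def pending(self) -> List[str]:
        """Titles of steps that have not been ticked off yet."""
        return [s.title for s in self.steps if not s.done]

    def slowest(self) -> Optional[Step]:
        """The measured step that took the longest, or None if nothing was timed."""
        timed = [s for s in self.steps if s.minutes is not None]
        return max(timed, key=lambda s: s.minutes) if timed else None


# Sample procedure and timings are invented for the example.
env_setup = Checklist(
    "Create a test environment",
    [Step("Provision VM"), Step("Deploy applications"), Step("Run smoke tests")],
)
env_setup.complete("Provision VM", 20)
env_setup.complete("Deploy applications", 95, remark="manual config edits")

print("Still to do:", env_setup.pending())
print("Slowest step so far:", env_setup.slowest().title)
```

Even something this small answers two of the questions above: what is still left to do, and which step eats the most time (and so is probably the first candidate for a closer look).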

Writing stuff down helps not only the readers, but the writer as well:

it's easier to validate what you've written
you can come to some interesting conclusions about how to re-organize things (kaizen FTW)
it's far easier to notice things that can get automated

Checklists are superb for maintaining transparency at a set (reasonable) level - without a checklist, someone has to sit side-by-side with you to validate whether you're doing it right or wrong -> with one, the checklist itself is enough to assure them that, from a high-level perspective, the work looks the way it should.

What happens if you don't?

From my past experience:

"We don't know what takes up most of our time. How could we know? Each day it's a different thing and we don't know how to track it."
"It just doesn't work. I don't know what I did wrong. Hell, I don't even know how to check what's really wrong - we're doing it twice a year at best."
"I think I did all the necessary steps. I did it so many times already, how could I make a mistake in a simple procedure that takes only about 3 hours of manual work?"
"Billy is on vacation and Tommy called in sick and won't be back before the end of the week. No-one else has done it before and we have no clue what to do!"
"I've never really thought about automating that - you know, it's so complex and the activities are so inter-dependent that automating it would most likely be far too complex." (Big-Pile-Of-Shit syndrome)
"Yes, I realize that the environment directory structure looks a bit different when Jake does the procedures - he introduced his own improvements, but we had no time to discuss those so far."

Etc. etc. etc.

Conclusion

Write your stuff down. It doesn't hurt and it's your insurance policy: face-to-face communication is awesome and will always be the baseline, but the most crucial information, like:

the ground rules
the most crucial procedures
the expected outcomes (that the work will be evaluated against)

should always be written down. For transparency. For repeatability, consistency & predictability (all the virtues of good DevOps). And because you want to make sure that everyone's on the same page.

You pick the medium, you pick the form. Don't do fancy Word docs or (even worse) PowerPoint presentations - write only what's relevant, make sure it's structured and easy to search through.
