Why the blame game doesn't work - here's what does
Blameless postmortems. What are they?
Simply put: they’re how we learn and improve.
They’re about pointing fingers not at people but at processes.
In other words, instead of looking for someONE to blame, we look for someTHING that was potentially missed/skipped/broken in the process.
The impacts of mistakes are significant, no doubt. But instead of pointing fingers, the best thing we can do once we're on the other side is to ask why. Then we can put processes and safeguards in place to avoid a similar situation in the future.
Here is a snippet from an excellent article shared by the Etsy team in 2012 about Blameless Postmortems:
Anyone who’s worked with technology at any scale is familiar with failure. Failure cares not about the architecture designs you slave over, the code you write and review, or the alerts and metrics you meticulously pore through.
So: failure happens. This is a foregone conclusion when working with complex systems. But what about those failures that have resulted due to the actions (or lack of action, in some cases) of individuals? What do you do with those careless humans who caused everyone to have a bad day?
Maybe they should be fired. Or maybe they need to be prevented from touching the dangerous bits again. Or maybe they need more training.
We don’t take this traditional view at Etsy. We instead want to view mistakes, errors, slips, lapses, etc. with a perspective of learning. Having blameless Post-Mortems on outages and accidents are part of that.
Note: this doesn’t mean that we don’t take accountability for our mistakes. But if people fear punishment or retribution, they won’t feel comfortable providing a detailed account of what happened.
That kind of fear is the opposite of our #nosurprises culture.
We don’t shoot the messenger. Bad news is good news; otherwise, bad news is never shared.