The surprising effectiveness of launch checklists
This week, our team launched a new user experience flow to a percentage of new signups. No big deal, except that it touched some critical billing flows, and we’re pretty risk averse about changing those. So I followed the lead of one of the previous engineers on our team and created a launch checklist to sync up on what needed to be done for the launch.
It’s often easy and natural to talk in the abstract about the things that need to be done before flicking the switch and launching. And the thought behind “people over process” means well. But it’s super hard to decide when to take that plunge, especially for less confident devs on the team. One thing I recommend trying the next time you’re about to launch a thing is to create a doc (either on a wiki or in some other team space) where everyone is synced on a few key things that need to be done before, during and after the launch.
To be clear, this is not a list of things you should have ticked off during the development of a feature or product. It’s more useful as a single place to store launch details, both to prevent a mad rush if things hit the fan and for peace of mind in working towards a common goal.
The guide
Summary
What is being changed? Why is it being changed? (business/technical context is helpful) How can it be turned off in the event of critical failure?
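To make the "how can it be turned off" question concrete, here is a minimal sketch of a percentage rollout with a kill switch. It is an illustration only, not how our launch was built: the function names, the salt string and the `kill_switch` parameter are all hypothetical, and a real setup would usually read these values from a feature-flag service or config store.

```python
import hashlib


def rollout_bucket(user_id: str, salt: str = "new-signup-flow") -> int:
    """Map a user to a stable bucket in the range 0-99 (salt is a made-up name)."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(user_id: str, percent: int, kill_switch: bool = False) -> bool:
    """Return True if this user should see the new flow.

    The kill switch wins over everything else, so flipping it turns the
    feature off for all users regardless of the rollout percentage.
    """
    if kill_switch:
        return False
    return rollout_bucket(user_id) < percent
```

Because the bucket comes from a hash of the user ID, a given user gets the same answer on every request, and raising the percentage only adds users rather than reshuffling them.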
Checklist
This section includes things like checking that last-minute bug fixes actually do what they say they do, entering and verifying production systems configuration (you’d be surprised by how often production issues are caused by configuration errors), and a final verification of scope. It’s super useful if you want to parallelise items in the final stretch before you launch, and to complete anything that doesn’t make sense as a full JIRA issue or task.
It also reduces bus-factor risk and the risk of forgetting something - if someone “critical” is missing before you launch, you don’t need to ask them everything under the sun, and even if they are there, it stops them forgetting something before you flick that switch.
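A wiki page is all you need for this, but if it helps to see the shape of the checklist, here is a rough sketch as a data structure. The class names, tasks and owners are invented for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    task: str
    owner: str          # who is doing it, so items can be worked in parallel
    done: bool = False


@dataclass
class LaunchChecklist:
    items: List[Item] = field(default_factory=list)

    def remaining(self) -> List[Item]:
        """Items that still block the launch."""
        return [item for item in self.items if not item.done]

    def ready_to_launch(self) -> bool:
        return not self.remaining()


# Hypothetical example data
checklist = LaunchChecklist([
    Item("Verify production config values", owner="alex", done=True),
    Item("Confirm last-minute bug fix on staging", owner="sam"),
])

for item in checklist.remaining():
    print(f"TODO ({item.owner}): {item.task}")
```

Giving every item an owner is the part that matters: anyone on the team can see what is left and who to ask, even if the person who wrote the list is away.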
Announcements + monitoring
This section forces you to think about stakeholders who need to know (generally your support staff and SRE). Ideally they’d have their own doc that your team has prepared to spot red flags from customers, and how to quickly resolve known issues. But if you haven’t, NOW is the time to do it. List the places you’re going to notify when you launch, either with Slack or some other method of comms.
It also makes you think about how your own team can find smoke from fires burning in your feature, which you may not have thought too much about until now. Most of the time, teams have done some work to highlight issues, but the logging code can be obscure and hard to find, unless you’ve written it yourself. Surface links to dashboards and useful logging platform queries that can be used by any team members.
Key timelines / runtime
This is a bit more flexible. This is the place to record the times when certain things were manually changed - e.g. when you turned on a feature for a subset of customers, changed the subset, or deployed a bugfix post-launch.
Often useful to make sure we don’t double up on work, and to trace follow-on events and bugs to potential causes of those events.
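As a small illustration of tracing a bug back to a potential cause, the sketch below looks up the most recent manual change before an incident. The `Change` type, the helper name and every timestamp are made up for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Change:
    at: datetime
    what: str


def last_change_before(timeline: List[Change], incident_at: datetime) -> Optional[Change]:
    """Return the latest recorded change at or before the incident, if any."""
    earlier = [change for change in timeline if change.at <= incident_at]
    return max(earlier, key=lambda change: change.at) if earlier else None


# Hypothetical timeline entries
timeline = [
    Change(datetime(2020, 1, 6, 9, 0), "Enabled flow for 5% of new signups"),
    Change(datetime(2020, 1, 6, 14, 30), "Raised rollout to 20%"),
    Change(datetime(2020, 1, 7, 10, 15), "Deployed post-launch bugfix"),
]

suspect = last_change_before(timeline, datetime(2020, 1, 6, 16, 0))
```

The latest change is only the first suspect, not proof of cause, but it gives whoever is investigating a concrete place to start.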
Outcomes
Even more free-form. Useful to include learnings, things that were missed, things that were validated.
References, etc.
I got the idea from another engineer on the team, combined with reading the Checklist Manifesto, which convinced me of the value of shared team checklists.