Interrupts in software teams: using unplanned work to your advantage

🔥 Firefighting is eating my team’s productivity

Interrupts are often seen as a problem that eats away at your team’s productivity and gets in the way of shipping important things for your customers. They’re often a consequence of the tech debt we consciously accept to ship features sooner. However, when a team doesn’t have a good strategy for dealing with the consequences of those decisions, the pain is felt much more acutely and much sooner.

Teams will often take a 'tick tock' approach to dealing with this, halting feature work and burning down tech debt for an entire planning cycle. The decision to spend a cycle on tech debt is usually spurred by interrupts draining too much of people’s time, or by a nasty incident that forces you to take stock and pay off some of the accrued debt, at the cost of not shipping features for a while. These teams might also depend more heavily on customer support to triage and solve issues, which extends feedback loops and pushes critical fixes into long product roadmaps, where they can go unresolved for months.

📟 Interrupts can be an advantage

At incident.io we’ve tried to reframe interrupts to our advantage. We believe that by setting up our engineering team to explicitly cater for this work, we can deliver much lower-latency tactical changes that delight customers and make the entire team more productive. We can often ship bug fixes or simple features within an hour or two of a customer mentioning them.

By having a dedicated interruptible member of the team, we can tackle things that otherwise might fall into a backlog and never be prioritised. Many customer & engineering issues can be addressed with smaller tactical fixes, and when we encounter things larger than that, we’ve already had someone spending time to triage and scope those issues.

We call that role “Product Responder”, and as with much of how we run the company, we wrote a proposal on what the role is. I think it does a great job of showing our thinking around this role, so we’ve open sourced the proposal. Feel free to give it a read through.

This is also a great forcing function for knowledge distribution. By putting a newer member of the team on Product Responder (with a shadow for support), we naturally push them to explore and solve problems in domains that they otherwise might miss. Importantly, those domains are implicitly determined by what the company and our customers need, exposing them to the stuff that matters most.

To summarise, a Product Responder is:

a member of the engineering team that is nominated each week on rotation
not given any planned team work, but not precluded from taking part, so long as them dropping out wouldn’t block the rest of the team
the first interruptible person, handling alerts, errors, and requests from other teams & customers
not always expected to fix everything; triage & prioritisation are a big part of the role
explicitly encouraged to iterate on processes & make changes or proposals to keep Product Responder working well
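
As a rough illustration of the weekly rotation, here is a minimal sketch of picking who is on Product Responder for a given date. This is not how incident.io implements it; the roster names, the `product_responder` function, and the choice of start date are all assumptions made for the example.

```python
from datetime import date

# Hypothetical roster: placeholder names, not taken from the original post.
ENGINEERS = ["alice", "bob", "carol", "dave"]

# Assumed start of the rotation. 1 January 2024 was a Monday, so every
# rotation week here runs Monday to Sunday.
ROTATION_START = date(2024, 1, 1)


def product_responder(day: date, engineers: list[str] = ENGINEERS) -> str:
    """Return the engineer on rotation for the week containing `day`."""
    # Count whole weeks since the start date rather than using the calendar
    # week number, so the rotation does not reset at year boundaries.
    weeks_elapsed = (day - ROTATION_START).days // 7
    return engineers[weeks_elapsed % len(engineers)]
```

In practice a paging or scheduling tool would usually own this, including swaps and the shadow for newer team members; the sketch only shows the "one person per week, in turn" shape of the role.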

I don’t think what we’re doing here is necessarily novel or a magic bullet. Lots of engineering organisations have something in this shape, but I think there’s value in codifying what we’re doing, and in being explicit about what the role is and why it’s worth investing in. We’re very lucky to be the shape of business where product teams working directly with customers is scalable & effective, so doubling down on making that work as well as possible feels like a very good use of time.