Like many SaaS businesses, we have an on-call rota to enable us to provide 24x7 cover if there are problems with incident.io. We have a 'pager' which will alert the relevant person if something unexpected happens in our app, so that they can investigate and fix it if needed.
Note: This was adapted from an internal document we wrote about how we think about on-call at incident.io.
We're building a product that people depend on 24x7, all year round. It's important that it always works, which means we need to support it around the clock. During office hours this is a shared responsibility across the whole team, but to limit the impact out of hours, we have a dedicated person 'holding the pager'.
Being on-call doesn't come without its benefits. By taking on the operational responsibility for the work we do, we tighten the feedback loop between shipping and running. This helps us to make pragmatic engineering decisions, and provides a healthy tension between shipping new code and supporting and improving what we have.
Additionally, our product is designed, partly, to support folks who are on-call. There's no better way for us to empathise with our customers and find the opportunities and rough edges than to do the job ourselves.
As an incentive, and to compensate for the inconvenience of having to remain close to your laptop, we'll pay a fixed amount per week to anyone who's on-call.
We'll calculate pay automatically from our on-call schedules, and take overrides into account too. We'll calculate pay down to the minute, so if you cover someone for an hour while they go to the shops, you'll be paid for that time.
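To illustrate the minute-level calculation described above, here is a minimal sketch in Python. It is not our actual implementation: the `Entry` type, the function names, and the `WEEKLY_RATE` value are all hypothetical, chosen only to keep the arithmetic easy to follow. It assumes scheduled shifts don't overlap each other, overrides don't overlap each other, and an override always takes precedence over the scheduled shift it covers.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime

# Hypothetical figure for illustration only; not a real rate.
WEEKLY_RATE = 1008.00
MINUTES_PER_WEEK = 7 * 24 * 60  # 10,080


@dataclass(frozen=True)
class Entry:
    """A span of on-call time for one person (end is exclusive)."""
    person: str
    start: datetime
    end: datetime


def overlap_minutes(a: Entry, b: Entry) -> int:
    """Whole minutes during which both entries are active."""
    start = max(a.start, b.start)
    end = min(a.end, b.end)
    return max(0, int((end - start).total_seconds() // 60))


def minutes_on_call(shifts: list[Entry], overrides: list[Entry]) -> dict[str, int]:
    """Credit each person with the minutes they actually held the pager."""
    minutes: dict[str, int] = defaultdict(int)
    for shift in shifts:
        # Start by crediting the whole shift to the scheduled person.
        minutes[shift.person] += overlap_minutes(shift, shift)
        for override in overrides:
            # Move any overridden minutes to the person covering.
            covered = overlap_minutes(shift, override)
            minutes[shift.person] -= covered
            minutes[override.person] += covered
    return dict(minutes)


def pay_for(minutes: dict[str, int]) -> dict[str, float]:
    """Pro-rate the fixed weekly amount by minutes held."""
    return {
        person: round(held * WEEKLY_RATE / MINUTES_PER_WEEK, 2)
        for person, held in minutes.items()
    }


# Alice is scheduled for a full week; Bob covers one hour on Wednesday.
shifts = [Entry("alice", datetime(2024, 1, 1, 9), datetime(2024, 1, 8, 9))]
overrides = [Entry("bob", datetime(2024, 1, 3, 12), datetime(2024, 1, 3, 13))]

minutes = minutes_on_call(shifts, overrides)
payments = pay_for(minutes)
```

In this example Bob is credited with 60 minutes and Alice with the remaining 10,020, so the one-hour cover is paid to Bob rather than to Alice.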
By compensating on-call we also aim to make overrides feel fairer, and avoid the need for complex swaps of time. If someone offers to cover a day of your shift, they'll be paid for it, so there's no need to feel indebted.
On-call payment is not expected to cover any time you spend working outside of hours. If you're paged and end up working in your evening, you should take time off in lieu. We trust you to manage this time yourself.
Being on-call unavoidably has an impact on your home life, but we want to provide the best possible experience. Here are a few ways we'll collectively help each other:
I'm one of the co-founders, and the Chief Product Officer here at incident.io.