Behind-the-scenes building On-call

Creating an on-call product is hard: it has to be rock-solid, capable of handling massive alert storms, and designed to minimize the impact of on-call on the lives of those responding. In this series, we share behind-the-scenes details of how we built our on-call product. From collaborating closely with our design partners to running rigorous load testing and reliability drills, we’ll cover the journey of developing a product that reimagines the on-call experience.

Engineering

Behind the scenes: Launching On-call

We like to ship it then shout about it, all the time. Building On-call was different.

Henry Course

8 min read
Engineering

Building On-call: Our observability strategy

Our customers count on us to sound the alarm when their systems go sideways—so keeping our on-call service up and running isn’t just important; it’s non-negotiable. To nail the reliability our customers need, we lean on some serious observability (or as the cool kids say, o11y) to keep things running smoothly.

Martha Lambert

21 min read
Engineering

Building On-call: The complexity of phone networks

Making a phone call is easy...right? It's time to re-examine the things you thought were true about phone calls and SMS.

Leo Sjöberg

7 min read
Engineering

Building On-call: Building a multi-platform on-call mobile app

What does it take to build a greenfield mobile app in 2024? When we launched On-call earlier this year, we had to find out.

Rory Bain

17 min read
Engineering

Building On-call: Time, timezones, and scheduling

Time is tricky, but building our On-call scheduler meant getting cozy with all of its quirks, and lots of testing. No "time" like the present to dive in!

Henry Course

17 min read
Engineering

Building On-call: Continually testing with smoke tests

Launching On-call meant we had to make our system rock-solid from the get-go. Our solution? Smoke tests to let us continually test product health and make sure we're comfortable making changes at pace.

Rory Malcolm

11 min read
Engineering

How we page ourselves if incident.io goes down

Learn how we tackle the ultimate paradox: ensuring our alerting system pages us, even when it’s the one failing. It's a common question, so let's dive into the detail of our "dead man's switch", how we stress-test our systems, and why we care so much about a setup that lets us dogfood our own product.

Lawrence Jones

8 min read
