The latest news from incident.io HQ

We’re building the best way for your whole organization to respond, review and learn from incidents. This is where we talk about how and why.

Engineering

How we page ourselves if incident.io goes down

Learn how we tackle the ultimate paradox: ensuring our alerting system pages us, even when it’s the one failing. It's a common question, so let's dig into the details of our "dead man's switch", how we stress-test our systems, and why we care so much about a setup that lets us dogfood our own product.

Lawrence Jones

8 min read
Engineering

Organizing ownership: How we assign errors in our monolith

At incident.io, we streamline our monolith by assigning clear ownership to chunks of code and enforcing it with CI checks. Tagged errors are automatically routed to the right team, reducing on-call stress and keeping our system efficient as we scale. Here's how we do it.

Martha Lambert

7 min read
Engineering

Lessons from 4 years of weekly changelogs

Writing a meaningful update for customers every week has been held sacred at incident.io since we started the company. We've written over 200 of them in the past 4 years, and we recently celebrated going 2 years straight without missing a single week 🚀. Learn how we do it!

Pete Hamilton

13 min read
Engineering

Observability as a superpower

At incident.io, tracing is our secret weapon for catching bugs before customers do. This blog unpacks how traces and spans are built, showcasing their role in debugging and performance tuning. From span creation to integrating traces with logs and error reports, it's a practical guide for adding tracing to your observability toolkit—whether you're in development or production.

Sam Starling

9 min read
Engineering

Choosing the right Postgres indexes

Indexes can dramatically boost your database performance, but knowing when to use them isn’t always obvious. This blog covers what indexes are, when to use them, how to choose the right type, and tips for spotting missing ones. Whether you're optimizing queries, enforcing uniqueness, or improving sorting, you'll learn how to fine-tune your indexing strategy without overcomplicating it.

Milly Leadley

10 min read
Engineering

Building On-call: Our observability strategy

Our customers count on us to sound the alarm when their systems go sideways—so keeping our on-call service up and running isn’t just important; it’s non-negotiable. To nail the reliability our customers need, we lean on some serious observability (or as the cool kids say, o11y) to keep things running smoothly.

Martha Lambert

21 min read
Engineering

Building On-call: Continually testing with smoke tests

Launching On-call meant we had to make our system rock-solid from the get-go. Our solution? Smoke tests to let us continually test product health and make sure we're comfortable making changes at pace.

Rory Malcolm

11 min read
Engineering

Scoping week

A few months back, we launched On-call with a solid set of features—but that was just the start. To keep the wheels turning, we recently held a "scoping week" where we paired up, tackled ambiguities, and nailed down our project roadmap. Here's how we did it.

Leo Sjöberg

8 min read
Engineering

Building On-call: Time, timezones, and scheduling

Time is tricky, but building our On-call scheduler meant getting cozy with all of its quirks, and doing lots of testing. No "time" like the present to dive in!

Henry Course

17 min read

Stay in the loop: subscribe to our RSS feed.