The latest news from incident.io HQ

We’re building the best way for your whole organization to respond, review and learn from incidents. This is where we talk about how and why.

Article

The Incident Maturity Model

Incidents are inevitable—how you handle them matters. The Incident Maturity Model shows how to level up from basic response to company-wide resilience, with actionable steps backed by real data. Where does your team stand?

Stephen Whitworth

13 min read

Article

The flight plan that brought UK airspace to its knees

On August 28, 2023, a software bug in the UK air traffic control system caused six hours of chaos, reducing air traffic capacity and forcing manual operations. It's a great story of failure, resilience and communications in complex systems.

Chris Evans

25 min read

Article

AWS re:Invent: The handy guide for the massive conference

AWS re:Invent is packed with 3,000+ sessions for developers, covering everything from scaling apps to generative AI. In this guide, we break down the sessions you shouldn't skip. If you're headed to Vegas, don't forget to stop by and say hi!

incident.io

11 min read

Engineering

How we page ourselves if incident.io goes down

Learn how we tackle the ultimate paradox: ensuring our alerting system pages us, even when it’s the one failing. It's a common question, so let's dig into our "dead man's switch", how we stress-test our systems, and why it matters so much to us that our setup lets us dogfood our own product.

Lawrence Jones

8 min read

Article

We’re opening a San Francisco office

We’re expanding our global presence by opening our first office in San Francisco. 🔥

Stephen Whitworth

4 min read

Talent

Behind the Flame: Rory M.

Meet Rory M., Product Engineer 🔥

Megan Batterbury

10 min read

Engineering

Organizing ownership: How we assign errors in our monolith

At incident.io, we streamline our monolith by assigning clear ownership to chunks of code and enforcing it with CI checks. Tagged errors are automatically routed to the right team, reducing on-call stress and keeping our system efficient as we scale. Here's how we do it.

Martha Lambert

7 min read

Data

How we handle sensitive data in BigQuery

We take handling sensitive customer data seriously. This post explains how we manage PII and confidential data in BigQuery through default masking, automated tagging, and strict access controls.

Lambert Le Manh

8 min read

Data

How we model our data warehouse

Curious about the inner workings of our data warehouse? We’ve shared a lot about our data stack, but this time we’re diving into the design principles behind our warehouse. This post breaks down how we structure our data, from the staging layer to the marts layer, and how we use it all in our BI tool. It’s a quick look at how we keep things flexible, efficient, and built to scale.

Jack Colsey

13 min read

Stay in the loop: subscribe to our RSS feed.