Automated incident response: Why it matters and where it’s headed
For years, incident response has been a mostly manual process: someone gets paged, scrambles to investigate, loops in the right people, and after some firefighting, hopefully resolves the issue before too many customers notice. But as modern systems become more complex and interconnected, the old ways don’t scale. That’s where Automated Incident Response (AIR) comes in.
Tom Wentworth
Overhauling PagerDuty’s data model: a better way to route alerts
PagerDuty has long been the go-to solution for reliable on-call management, but its aging data model and lack of innovation have become a challenge. In this post we explore how incident.io On-call offers a better, more flexible approach to alert routing and provide practical advice on how to migrate smoothly from PagerDuty.
Chris Evans
The Incident Maturity Model
Incidents are inevitable—how you handle them matters. The Incident Maturity Model shows how to level up from basic response to company-wide resilience, with actionable steps backed by real data. Where does your team stand?
Stephen Whitworth
The flight plan that brought UK airspace to its knees
On August 28, 2023, a software bug in the UK air traffic control system caused six hours of chaos, reducing air traffic capacity and forcing manual operations. It's a great story of failure, resilience and communications in complex systems.
Chris Evans
AWS re:Invent: The handy guide for the massive conference
AWS re:Invent is packed with 3,000+ sessions for developers, covering everything from scaling apps to generative AI. In this guide, we break down the sessions you shouldn't skip. If you're headed to Vegas, don't forget to stop by and say hi!
incident.io
We’re opening a San Francisco office
We’re expanding our global presence by opening our first office in San Francisco. 🔥
Stephen Whitworth
Mastering regulatory compliance with incident.io
Learn how incident.io streamlines regulatory compliance by automating incident management, enhancing collaboration, and simplifying audits for frameworks like GDPR, SOC 2, and DORA.
Chris Evans
What is a SEV1 incident? Understanding critical impact and how to respond
Ever hear the phrase “expect the unexpected”? That’s a SEV1 incident in a nutshell. Understanding what a SEV1 incident is—and how it differs from other severities—is the first step in building organization-wide resilience and, ultimately, bouncing back stronger.
Kate Bernacchi-Sass
Why I like discussing action items in incident reviews
Action items are a natural part of understanding and improving the systems we work with—an extension of the overall learning process. So, what better place to discuss them than in the incident review itself?
Chris Evans