We're often asked: "How does incident.io help reduce alert noise?" And it’s a fair question. It’s typically much easier to add new alerts than to remove existing ones, which means most organizations slow-march into a world where noisy, un-actionable alerts completely overshadow the high-signal ones that indicate a real problem.
Most commonly, people try to solve this with silencing rules or better thresholds. But those decisions tend to be based either on team vibes ("yeah, this usually isn't a problem") or on data that's extremely expensive to assemble and often missing key information, like how much time is actually being spent.
If you want to actually fix alert noise, you need visibility into which alerts are causing problems, context to understand them, and tools to act on what you find. That’s exactly where incident.io helps.
It sounds basic, but the first step to reducing noise is understanding what’s actually making the noise.
Our alerting systems integrate tightly with Catalog—the source of truth for your services, teams, owners, environments, and more. When an alert fires, we can enrich it with that metadata: who owns the service, which team is responsible, what environment it's in, and so on.
This turns a noisy, contextless alert like "CPU > 90%" into something much more actionable: "High CPU on cart-service in production, owned by the checkout team." And in case you're wondering, having AI make the title more human-readable is something you can opt in to or out of.
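To make the idea concrete, here's a minimal sketch of catalog-driven enrichment. The types, the in-memory catalog, and the `enrichAlert` function are purely illustrative assumptions, not incident.io's actual API.

```typescript
// NOTE: a minimal sketch, assuming a simple in-memory catalog. The types, field
// names, and enrichAlert function are illustrative, not incident.io's actual API.

interface RawAlert {
  title: string;       // e.g. "CPU > 90%" straight from the monitoring tool
  serviceId: string;   // identifier the monitoring tool knows the service by
  environment: string; // e.g. "production"
}

interface CatalogEntry {
  serviceName: string;
  owningTeam: string;
}

interface EnrichedAlert extends RawAlert {
  serviceName: string;
  owningTeam: string;
  humanTitle: string;
}

// Hypothetical catalog lookup table keyed by service identifier.
const catalog: Record<string, CatalogEntry> = {
  svc_cart: { serviceName: "cart-service", owningTeam: "checkout" },
};

function enrichAlert(alert: RawAlert): EnrichedAlert {
  const entry = catalog[alert.serviceId];
  if (!entry) {
    // No catalog match: pass the alert through with placeholder ownership.
    return { ...alert, serviceName: "unknown", owningTeam: "unassigned", humanTitle: alert.title };
  }
  return {
    ...alert,
    serviceName: entry.serviceName,
    owningTeam: entry.owningTeam,
    // Compose a readable title from catalog metadata. The optional AI rewrite
    // (e.g. "CPU > 90%" becoming "High CPU") isn't shown here.
    humanTitle: `${alert.title} on ${entry.serviceName} in ${alert.environment}, owned by the ${entry.owningTeam} team`,
  };
}

console.log(
  enrichAlert({ title: "CPU > 90%", serviceId: "svc_cart", environment: "production" }).humanTitle
);
// -> "CPU > 90% on cart-service in production, owned by the checkout team"
```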
With this in place, we have alerts that are enriched with metadata, connected to real entities in your organization, and given human-friendly names that make them easier to decipher.
Not every alert needs to be an incident. Not every incident needs to page somebody.
That’s why we give you grouping and routing controls that help you bundle up related alerts and send them to the right place.
Our grouping logic allows for combinations of time-based (i.e. arriving within a defined window) and attribute-based (i.e. all alerts for the same team) grouping, and we also let you ungroup alerts in real time during an incident.
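Here's a rough sketch of how combined attribute and time-window grouping can work. The `Alert` shape, the ten-minute window, and grouping on `team` are assumptions for illustration, not incident.io's grouping engine.

```typescript
// NOTE: illustrative only. Groups alerts that share an attribute (team) and
// arrive within a time window of the group's first alert.

interface Alert {
  id: string;
  team: string;   // the attribute we group on
  firedAt: Date;
}

interface AlertGroup {
  team: string;
  openedAt: Date;  // window starts when the first alert arrives
  alerts: Alert[];
}

const WINDOW_MS = 10 * 60 * 1000; // assumed 10-minute grouping window

function groupAlerts(alerts: Alert[]): AlertGroup[] {
  const groups: AlertGroup[] = [];
  const ordered = [...alerts].sort((a, b) => a.firedAt.getTime() - b.firedAt.getTime());

  for (const alert of ordered) {
    // An alert joins a group only if both conditions hold: same team (attribute)
    // and it arrived inside the group's open time window.
    const match = groups.find(
      (g) => g.team === alert.team && alert.firedAt.getTime() - g.openedAt.getTime() <= WINDOW_MS
    );
    if (match) {
      match.alerts.push(alert);   // folded into the existing incident
    } else {
      groups.push({ team: alert.team, openedAt: alert.firedAt, alerts: [alert] }); // new incident
    }
  }
  return groups;
}
```

In a model like this, each resulting group maps onto a single incident (and at most one page) rather than one incident per alert.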
Looking at an example of our dynamic ungrouping, let’s imagine we’re grouping by time-window only, and two alerts arrive within the window.
The first one creates an incident, and the second is grouped into that incident by default.
The responder is notified of this in Slack and can ungroup the second alert at that point, sending it off to be dealt with elsewhere.
Effective grouping doesn’t solve an underlying problem with noisy alerts, but it can treat the symptoms and reduce the burden on teams whilst you work on the proper fixes.
Ultimately with grouping set up, a spike in alerts from one flaky service doesn’t result in five separate incidents or three unnecessary pages.
Because alerts are directly tied to incidents in incident.io, we can track what actually happens after the alert fires.
That turns every alert into a feedback loop. Over time, we can tell you which alerts are frequently ignored, which ones cause the most disruption, and which ones are quietly eating up your team’s time.
This is the goldmine. Without this kind of visibility, all you can really do is guess which alerts are a problem.
To make all of this useful, we wrap it up in our Alerts Insights dashboard, where you can slice and dice alert data any way you like.
Here's what you'll find: for each alert, how often it fires, how often responders accept or decline it, and how much of your team's time it consumes.
If you see an alert that’s declined 90% of the time, it’s not just noise, it’s someone actively telling you, “this isn’t useful.”
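As a sketch of the kind of per-alert rollup this data enables: the `AlertOutcome` shape and field names below are assumptions, not the real schema, and the 90% threshold simply mirrors the example above.

```typescript
// NOTE: illustrative rollup of alert outcomes into decline rates and time spent.

interface AlertOutcome {
  alertName: string;
  declined: boolean;     // the responder declined the incident this alert created
  minutesSpent: number;  // responder time attributed to it
}

interface AlertSummary {
  alertName: string;
  fired: number;
  declineRate: number;
  totalMinutesSpent: number;
}

function summarise(outcomes: AlertOutcome[]): AlertSummary[] {
  const byAlert = new Map<string, AlertOutcome[]>();
  for (const outcome of outcomes) {
    const rows = byAlert.get(outcome.alertName) ?? [];
    rows.push(outcome);
    byAlert.set(outcome.alertName, rows);
  }

  return [...byAlert.entries()].map(([alertName, rows]) => ({
    alertName,
    fired: rows.length,
    declineRate: rows.filter((r) => r.declined).length / rows.length,
    totalMinutesSpent: rows.reduce((sum, r) => sum + r.minutesSpent, 0),
  }));
}

// Alerts declined 90% of the time or more are the obvious candidates to fix or delete.
const worthRemoving = (outcomes: AlertOutcome[]) =>
  summarise(outcomes).filter((s) => s.declineRate >= 0.9);
```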
The plan for Alert Intelligence is to go beyond dashboards and actually do the analysis for you. Instead of asking you to sift through graphs and tables, we’d surface insights like:
“You’ve spent 90% more time dealing with this alert in the last 4 weeks.”
Or:
“This alert has been declined 22 times in a row by 3 different people.”
To do that, we’ll tap into all the data we already collect, like smart time tracking during incidents, to paint a picture of what’s really going on with each alert, and what’s worth fixing.
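To show how insights like the two quoted above could fall out of that data, here's a hedged sketch. The `AlertEvent` shape, the streak logic, and the four-week comparison window are assumptions for illustration, not how Alert Intelligence is actually built.

```typescript
// NOTE: illustrative only. Derives "declined N times in a row by M people" and
// "X% more time spent in the last 4 weeks" from per-alert history.

interface AlertEvent {
  firedAt: Date;
  declined: boolean;
  declinedBy?: string;   // who declined it, if anyone
  minutesSpent: number;  // time tracked against the resulting incident
}

// "This alert has been declined N times in a row by M different people."
function declineStreak(events: AlertEvent[]): { declines: number; people: number } {
  const newestFirst = [...events].sort((a, b) => b.firedAt.getTime() - a.firedAt.getTime());
  const decliners = new Set<string>();
  let declines = 0;
  for (const event of newestFirst) {
    if (!event.declined) break;  // the streak ends at the most recent accepted alert
    declines++;
    if (event.declinedBy) decliners.add(event.declinedBy);
  }
  return { declines, people: decliners.size };
}

// "You've spent X% more time dealing with this alert in the last 4 weeks."
function timeSpentChangePercent(events: AlertEvent[], now = new Date()): number {
  const fourWeeksMs = 28 * 24 * 60 * 60 * 1000;
  const age = (e: AlertEvent) => now.getTime() - e.firedAt.getTime();
  const minutes = (xs: AlertEvent[]) => xs.reduce((total, e) => total + e.minutesSpent, 0);

  const recent = minutes(events.filter((e) => age(e) <= fourWeeksMs));
  const previous = minutes(events.filter((e) => age(e) > fourWeeksMs && age(e) <= 2 * fourWeeksMs));
  return previous === 0 ? 0 : ((recent - previous) / previous) * 100;
}
```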
Reducing alert noise isn't about tweaking alerting rules and calling it a day. It's about continuously connecting the alerts that fire to the reality of what happens when people respond to them. And that's exactly where incident.io shines.
If you want a walkthrough of how this looks in practice, we’d be happy to show you. Just give us a shout.