TL;DR: AI coding assistants fundamentally changed how fast software ships, and more deployments mean more deployment-triggered incidents, more alert noise, and higher on-call load. PagerDuty's business signals tell the story clearly: dollar-based net retention fell to 98% as of January 31, 2026, down from 106% as of January 31, 2025, meaning existing customers actively cut their spend. Meanwhile, we shipped over 200 fixes and features in Q1 2025 alone, with an AI SRE that delivers up to 80% reduction in MTTR. If your pager was built for human-speed development, it is now a bottleneck.
More code ships faster than ever. AI coding assistants like GitHub Copilot and Cursor accelerate pull request output dramatically, and every additional deployment creates a new opportunity for something to break in production. If your on-call tooling was designed for a world where teams ship once or twice a week, it is not equipped for the world you are operating in now.
PagerDuty built its platform for that older world. Its architecture is web-first. Its alerting model is strong, but its coordination layer, the part that matters most when your team scrambles at 2 AM, forces engineers to leave Slack, open a browser, and manually stitch together their response. The business data confirms what SRE teams already feel: PagerDuty's customers spend less each year, and R&D investment is not keeping pace. This article uses PagerDuty's own financial metrics as evidence and explains why AI-native, Slack-first platforms are where high-velocity teams are moving in 2026.
AI coding tools compressed the development cycle. Tasks that used to take considerably longer now complete much faster. The result is more pull requests, more deployments, and a fundamentally larger incident surface area.
DORA's research shows that deployment frequency and change failure rate are both measurable outcomes of development velocity. Elite performers deploy multiple times per day, and that acceleration is now within reach for mid-market engineering teams running AI-assisted workflows. It changes the math on everything your on-call stack needs to handle.
When a team goes from 50 deployments per month to 200, on-call tooling gets stress-tested in a completely different way. Alert routing logic that assumed infrequent changes now produces noise. Runbooks that assumed slower releases become stale within weeks. And the humans responsible for on-call start seeing alert volumes their tools were never designed to route intelligently.
The incident.io vs. PagerDuty comparison highlights exactly this gap: PagerDuty's architecture prioritizes alerting configuration, while modern teams need correlation, context, and coordination speed. Those are different engineering problems, and the platforms solving them look very different.
More deployments mean more change failure events. DORA's research is explicit: deployment frequency and time to restore service are correlated, not opposed. Faster teams that ship more frequently also tend to resolve incidents faster, but only if their incident tooling can match the tempo.
Legacy tools create a fixed coordination overhead per incident regardless of volume. If your team spends 15 minutes assembling responders, finding the right runbook, and updating the status page every single time, doubling incident frequency doubles your coordination tax. That overhead comes directly out of MTTR and compounds as deployment volume grows.
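To make the scaling argument concrete, here is a rough sketch of the arithmetic. It assumes a constant change failure rate (the 10% value is a hypothetical placeholder, not a DORA or incident.io figure) and the 15 minutes of coordination per incident mentioned above. The function names are illustrative only.

```python
def expected_incidents(deployments_per_month: int, change_failure_rate: float) -> float:
    """Expected deployment-triggered incidents, assuming a constant failure rate."""
    return deployments_per_month * change_failure_rate


def coordination_hours(incidents: float, minutes_per_incident: float = 15.0) -> float:
    """Hours per month spent on coordination before troubleshooting starts."""
    return incidents * minutes_per_incident / 60.0


# Hypothetical assumption: 10% of deployments trigger an incident.
ASSUMED_CHANGE_FAILURE_RATE = 0.10

for deployments in (50, 200):
    incidents = expected_incidents(deployments, ASSUMED_CHANGE_FAILURE_RATE)
    hours = coordination_hours(incidents)
    print(f"{deployments} deploys/month -> {incidents:.0f} incidents, "
          f"{hours:.2f} coordination hours")
```

Under these assumptions the overhead grows linearly with deployment volume, which is the point of the paragraph above: a fixed per-incident cost cannot be amortized by shipping more.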
Higher deployment frequency produces noisier alert environments. More code changes mean more monitoring thresholds crossed, more transient errors that look like incidents, and more genuine incidents that need fast triage. Legacy tools without intelligent alert grouping and deduplication amplify the noise instead of filtering it.
Our alert deduplication groups related alerts automatically, improving the signal-to-noise ratio before a human is ever paged. PagerDuty includes basic alert deduplication via dedup key matching on standard plans, but advanced alert suppression requires the AIOps add-on at additional cost, layering complexity onto an already cluttered pricing model.
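As a minimal illustration of key-based deduplication in general, the sketch below groups alerts that share a dedup key so that each group produces at most one page. It is not either vendor's implementation; the `Alert` fields and sample data are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    dedup_key: str  # hypothetical field identifying "the same underlying problem"
    source: str
    message: str


def group_alerts(alerts: list[Alert]) -> dict[str, list[Alert]]:
    """Group alerts that share a dedup key so each group pages at most once."""
    groups: dict[str, list[Alert]] = defaultdict(list)
    for alert in alerts:
        groups[alert.dedup_key].append(alert)
    return dict(groups)


# Hypothetical sample alerts for illustration.
alerts = [
    Alert("api-latency", "datadog", "p99 latency above threshold"),
    Alert("api-latency", "datadog", "p99 latency still above threshold"),
    Alert("db-connections", "prometheus", "connection pool exhausted"),
]

grouped = group_alerts(alerts)
for key, members in grouped.items():
    print(f"{key}: {len(members)} alert(s), 1 page")
```

Exact key matching only catches alerts that were labeled identically at the source. Grouping related alerts with different keys needs correlation logic beyond this sketch.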
High-velocity teams need on-call tools that move as fast as their code. Specifically, they need three things: coordination that starts automatically the moment an alert fires, timeline capture that happens without a dedicated note-taker, and deployment correlation that connects a recent PR to an active incident in seconds.
Our Slack-native design delivers exactly this. Web-first tools with bolted-on chat integrations cannot replicate it architecturally because the fundamental data model is different.
The coordination tax is real and quantifiable. When an alert fires in PagerDuty, a typical workflow looks like this: acknowledge the alert in the PagerDuty web UI, manually create a Slack channel, invite responders by hand, copy the alert link, and paste context into the channel. That process consumes several minutes before any troubleshooting starts.
With a Slack-native tool, the Datadog alert fires and we automatically create #inc-2847-api-latency-spike, page the on-call engineer, pull in service owners based on the affected service, and start recording the timeline, all within seconds of alert ingestion. Engineers type /inc severity high and response begins. No browser tab. No manual setup.
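A channel name like the one above can be derived from an incident ID and title. The sketch below shows one possible approach. The function name is hypothetical, and the 80-character cap and allowed character set are assumptions about Slack's channel-name rules that you should check against Slack's documentation.

```python
import re


def incident_channel_name(incident_id: int, title: str, max_length: int = 80) -> str:
    """Build a channel name such as inc-2847-api-latency-spike."""
    # Lowercase the title and collapse anything non-alphanumeric into hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    name = f"inc-{incident_id}-{slug}"
    # Truncate to the assumed limit without leaving a trailing hyphen.
    return name[:max_length].rstrip("-")


print(incident_channel_name(2847, "API latency spike"))
```

A predictable naming scheme matters because responders and later searches rely on the channel name to find the incident.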
The on-call tool selection framework walks through how to evaluate this specific capability, with questions to ask any vendor during a trial.
Manual note-taking during a live incident is a bad use of an engineer's attention. You want every responder focused on the problem, not on maintaining documentation. But post-mortems require accurate timelines, and if nobody captures them during the incident, you spend 90 minutes on Slack scroll-back archaeology three days later when details are fuzzy.
We capture everything automatically. Every status update, role assignment, Slack message, and call transcript populates the timeline as the incident runs. Our AI transcription tool, Scribe, records and summarizes incident calls via Google Meet or Zoom, extracting key decisions and flagging potential root causes without a dedicated note-taker.
"1-click post-mortem reports - this is a killer feature, time saving, that helps a lot to have relevant conversations around incidents (instead of spending time curating a timeline)." - Adrián M. on G2
Root cause identification takes longer when you manually correlate a recent deployment to an active incident. In a high-velocity environment where multiple engineers merge PRs every hour, the signal-to-noise problem in root cause analysis is acute.
Our AI SRE searches GitHub PRs, Slack messages, historical incidents, logs, and traces to build root cause hypotheses automatically. If the root cause is a code change, the AI SRE identifies the likely PR, surfaces it in Slack, and can generate a fix PR directly without the responder leaving the incident channel. The incident.io AI capabilities overview demonstrates how these capabilities work together end-to-end, from alert to fix generation.
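The simplest form of deployment correlation is a time-window heuristic: list deployments to the affected service shortly before the incident started, most recent first. The sketch below illustrates that idea only. It is not how the AI SRE works internally, and the `Deployment` type, PR numbers, and two-hour lookback are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Deployment:
    pr_number: int
    service: str
    deployed_at: datetime


def likely_culprits(
    deployments: list[Deployment],
    affected_service: str,
    incident_start: datetime,
    lookback: timedelta = timedelta(hours=2),
) -> list[Deployment]:
    """Deployments to the affected service inside the lookback window,
    most recent first. A ranking heuristic, not proof of causation."""
    window_start = incident_start - lookback
    candidates = [
        d for d in deployments
        if d.service == affected_service
        and window_start <= d.deployed_at <= incident_start
    ]
    return sorted(candidates, key=lambda d: d.deployed_at, reverse=True)


# Hypothetical data for illustration.
incident_start = datetime(2026, 3, 1, 2, 0)
deployments = [
    Deployment(101, "api", datetime(2026, 3, 1, 1, 45)),
    Deployment(102, "billing", datetime(2026, 3, 1, 1, 50)),
    Deployment(103, "api", datetime(2026, 2, 28, 20, 0)),
    Deployment(104, "api", datetime(2026, 3, 1, 0, 30)),
]

for d in likely_culprits(deployments, "api", incident_start):
    print(f"PR #{d.pr_number} deployed at {d.deployed_at:%H:%M}")
```

When several engineers merge every hour, a window like this still returns multiple candidates, which is why additional signals such as logs, traces, and past incidents are needed to narrow the list.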
It is one thing to say a tool feels stagnant. It is another to point at audited financial metrics that confirm the innovation slowdown. PagerDuty's own earnings reports tell this story clearly.
PagerDuty's net retention rate fell to 98% as of January 31, 2026, per their Q4 FY2026 earnings release. The prior year it was 106%. The year before that, it was higher still.
Net retention below 100% means existing customers spend less each year, on average. Some are churning. Others are downgrading. The cohort of customers PagerDuty had 12 months ago now pays less in aggregate than it did before, even accounting for upsells.
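For readers less familiar with the metric, a common way to compute dollar-based net retention for a fixed cohort is sketched below. The cohort numbers are hypothetical and chosen only to land on 98%. PagerDuty's exact methodology is defined in its own filings and may differ in detail.

```python
def net_retention_rate(
    starting_arr: float,
    expansion: float,
    contraction: float,
    churn: float,
) -> float:
    """Dollar-based net retention for a fixed customer cohort, as a percentage.

    starting_arr: ARR from the cohort 12 months ago
    expansion, contraction, churn: ARR changes from that same cohort since then
    """
    ending_arr = starting_arr + expansion - contraction - churn
    return ending_arr / starting_arr * 100


# Hypothetical cohort: expansion does not offset contraction plus churn.
nrr = net_retention_rate(starting_arr=100.0, expansion=7.0, contraction=4.0, churn=5.0)
print(f"NRR: {nrr:.0f}%")
```

New customers are excluded from the calculation, so the figure isolates how the existing base behaves.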
For enterprise SaaS vendors at PagerDuty's scale, NRR of 110% or above indicates healthy expansion. Anything below 100% raises a direct question about whether the product delivers enough value to justify its cost year over year.
PagerDuty generated $492.5 million in revenue in fiscal 2026, per the same earnings release. Their R&D expense line, available through the SEC filings at investor.pagerduty.com, shows where product investment is prioritized.
PagerDuty's fiscal 2026 communications emphasized significant investment in research and development to lead the agentic era. But NRR fell 8 percentage points in a single year. That points to one of two conclusions: either the investment is not turning into product that customers value, or customers are unconvinced by the direction.
PagerDuty's architecture is web-first. Coordination happens in a browser-based UI that sends notifications to Slack, not in Slack itself. That distinction creates friction at every step of the incident lifecycle: acknowledging alerts, creating Slack channels, assigning roles, and updating status pages all require context-switching out of the primary tool your team uses.
The PagerDuty vs. incident.io tool comparison covers this architectural gap in detail, including setup timelines and workflow comparisons from teams that have used both platforms.
Financial metrics matter to your CFO. What matters to you as an SRE is what those numbers signal about the product experience your team will have in 12 to 24 months.
When NRR drops below 100%, revenue from the existing customer base is shrinking: customers churn or contract faster than others expand. That can reflect failed upsell and cross-sell efforts, churn rate increases, or reduced customer lifetime value. For enterprise SaaS at PagerDuty's scale, the business must now acquire new logos to both fund growth and offset the revenue lost from the existing base.
All three show up in PagerDuty's current position. AIOps, the feature set that most directly helps SRE teams dealing with higher alert volumes from increased deployments, is a paid add-on at additional cost. And the peer set of modern incident management platforms is growing rapidly, with transparent pricing and genuinely Slack-native architectures.
"issues are right there in Slack, giving really good visibility into what sort of issues are being submitted and ensuring that people are responding... It's super easy and useful to look and see where things are." - Alex N. on G2
The on-demand webinar on migrating to incident.io covers common reasons teams initiate that evaluation and what they typically find during the trial period.
Teams that reduce PagerDuty spend typically do so for a combination of reasons.
PagerDuty's alerting rules engine is sophisticated. If you need deep, complex alert routing logic, PagerDuty's capabilities here exceed most alternatives. But alert routing is not where most teams lose time during incidents in 2026. Coordination is.
Our guide to the best Slack-native platforms for 2025 breaks down exactly this distinction: tools designed for alerting versus tools designed for the full incident lifecycle, and why that architectural decision matters at 3 AM.
The coordination overhead problem breaks into distinct phases: assembling the right people, locating runbooks and service owners, and then manually updating your status page at resolution. That overhead doesn't involve any actual troubleshooting, but it appears on every single incident regardless of severity.
We eliminate most of it. Alert fires, channel creates, on-call engineer is paged, service catalog context (owners, dependencies, recent deploys, runbooks) surfaces automatically, and timeline recording begins. /inc resolve closes the incident, updates the status page, drafts the post-mortem, and creates follow-up tasks in Jira or Linear. Our built-in migration tooling imports schedules and notification rules directly from PagerDuty, so you are not starting from scratch.
"With incident.io, managing incidents is no longer a chore due the automation that covers the whole incident lifecycle; from when an alert is triggered, to when you finish the post mortem." - Scott K. on G2
We shipped over 200 improvements in Q1 2025, including the AI SRE investigation capability, Scribe call transcription, the @incident AI assistant, native on-call scheduling, and AI-powered post-mortem automation. That cadence reflects a fundamentally different product investment philosophy. Every week, something ships that SRE teams can use immediately. Our public changelog shows exactly what changed and when, with no marketing spin.
Compare that to PagerDuty's H2 2025 product launch, which announced SRE Agent capabilities with general availability projected for October 2025, and their Spring 2026 release that put fully autonomous responder capabilities in "early access" for H2 2026. Some of those announced capabilities are now shipping. Others remain on the roadmap.
PagerDuty's on-call experience has evolved over recent years with updates like Flexible Shifts and Shift Agent for managing conflicts in chat, but the core workflow for new on-call engineers remains browser-heavy and configuration-heavy. Getting fully operational takes meaningful ramp time.
Our on-call onboarding delivers a fast ramp because the entire workflow runs through slash commands in Slack. New engineers can participate in their first incident without memorizing a lengthy runbook. /inc escalate. /inc assign @sarah. /inc severity high. It feels like using a tool they already know.
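Part of why slash commands are easy to learn is that the grammar is small: a prefix, an action, and an optional argument. The sketch below parses that shape. It is a hypothetical illustration, not incident.io's parser, and the set of actions is limited to the examples named in this article.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IncidentCommand:
    action: str
    argument: Optional[str] = None


# Hypothetical subset of actions, taken from the examples in this article.
KNOWN_ACTIONS = {"escalate", "assign", "severity", "resolve"}


def parse_inc_command(text: str) -> IncidentCommand:
    """Parse text such as '/inc severity high' into an action and argument."""
    parts = text.strip().split(maxsplit=2)
    if len(parts) < 2 or parts[0] != "/inc":
        raise ValueError(f"not an /inc command: {text!r}")
    action = parts[1].lower()
    if action not in KNOWN_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    argument = parts[2] if len(parts) == 3 else None
    return IncidentCommand(action, argument)


for line in ("/inc escalate", "/inc assign @sarah", "/inc severity high"):
    print(parse_inc_command(line))
```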
"You don't have to leave slack to run the incident. The commands are straightforward and quite intuitive, but you can also use buttons to navigate it." - Fina M. on G2
The question is not whether your current on-call tool works. It is whether it scales to the incident volume and velocity your team will face as AI coding assistants continue to compress development cycles.
MTTR is the sum of detection time, coordination time, diagnosis time, and resolution time. Legacy tools that create coordination overhead affect every incident, every time, regardless of the technical complexity of the underlying problem.
Here is the math for a 25-person on-call team running 15 incidents per month:
Your actual figure will vary depending on the loaded hourly cost, incident frequency, and coordination time per incident.
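One way to run this calculation with your own inputs is sketched below. The incident count comes from the scenario above and the 15-minute coordination time from earlier in this article. The responders-per-incident and loaded hourly cost values are hypothetical placeholders, so treat the output as an illustration and not as a benchmark.

```python
def monthly_coordination_cost(
    incidents_per_month: int,
    coordination_minutes: float,
    responders_per_incident: int,
    loaded_hourly_cost: float,
) -> float:
    """Cost of time spent coordinating (not troubleshooting) each month."""
    person_hours = (
        incidents_per_month * responders_per_incident * coordination_minutes / 60
    )
    return person_hours * loaded_hourly_cost


monthly = monthly_coordination_cost(
    incidents_per_month=15,     # from the scenario above
    coordination_minutes=15,    # per-incident figure used earlier in this article
    responders_per_incident=4,  # hypothetical placeholder
    loaded_hourly_cost=100.0,   # hypothetical placeholder, in dollars
)
print(f"Monthly: ${monthly:,.0f}  Annual: ${monthly * 12:,.0f}")
```

Substituting your own incident frequency, coordination time, and loaded cost shows how sensitive the total is to each input.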
PagerDuty's Spring 2026 release positions their SRE Agent as the answer to agent-speed engineering. According to their H2 2025 roadmap documents, the SRE Agent was projected to reach general availability in October 2025, with agent-to-agent Model Context Protocol (MCP) capabilities projected for H1 2026 and fully autonomous responder capabilities in early access in H2 2026. PagerDuty's Spring 2026 announcement did not confirm general availability for triage and diagnosis workflows specifically. That progression shows real investment.
The architectural question is different. When PagerDuty's AI identifies a likely cause, the responder still acknowledges in PagerDuty, correlates context in Slack, and manually bridges the two systems. The AI surface area is strong. The coordination layer that connects AI output to human action remains fragmented across tools.
Our AI SRE is a shipped product. It identifies root causes automatically, connects telemetry, code changes, and past incidents, surfaces the likely PR behind an incident in Slack, and generates fix PRs without a human leaving the incident channel. The Chief AI Officer Show covered how our AI agents can draft code fixes rapidly after an alert.
Bolt-on AI gives you AI features. AI built into a Slack-native platform from the ground up gives you AI-driven response. The practical difference shows up at 3 AM when you want your platform to handle the first 80% of the incident while you focus on the other 20%.
| Capability | incident.io | PagerDuty | Impact on MTTR |
|---|---|---|---|
| Slack-native architecture | Built from day one | Web-first with Slack integration | Meaningful coordination overhead per incident |
| Automated channel creation | Auto-triggered on alert | Professional plan, workflow pre-config needed | Removes manual setup time per incident |
| AI root cause identification | Shipped, delivers up to 80% reduction in MTTR | SRE Agent projected GA October 2025 per H2 2025 roadmap; triage workflow GA unconfirmed in Spring 2026 release; autonomous responder early access H2 2026 | Faster diagnosis by surfacing the culprit PR |
| Automated post-mortem draft | Auto-drafted from captured incident data | Requires third-party tooling | Significant reduction in post-mortem writing time |
| Alert deduplication | Built into all plans | Basic dedup included; advanced suppression requires AIOps | Reduces alert fatigue at no added cost |
| On-call onboarding | Fast ramp via Slack commands | Browser-heavy onboarding process | New engineers contributing in days, not weeks |
| Transparent pricing | Public, on-call add-on shown upfront | Public pricing available; add-ons unclear at scale | Predictable annual planning |
| Support availability | Shared Slack support channel offered | 12x5 email support (Professional); P1/P2 SLA tiers (Business and above) | Faster issue resolution |
PagerDuty's AIOps for major incident teams and service owning teams represent genuine investment in signal correlation at scale. The underlying technology is capable.
The problem is architectural. Even with AI identifying a likely cause, the responder still needs to acknowledge in PagerDuty, coordinate in Slack, and bridge the two manually. PagerDuty's incident workflow automation can automate parts of this on Professional plan and above, but requires pre-configuration and still does not eliminate the context switch between the alerting system and the collaboration layer.
For teams that need autonomous response today, not projected for H2 2026 early access, the gap between what is shipped and what is roadmapped matters.
Slack-native coordination can significantly reduce MTTR through measurable components:
Favor's almost 40% MTTR reduction, a result specific to their environment and incident volume, came from eliminating coordination overhead at a team running real production incidents. Their case study results show the before and after clearly. This customer outcome demonstrates how reducing coordination time translates to measurable MTTR improvements.
Before your next renewal or vendor evaluation, run your current on-call stack through this checklist. An honest score tells you where the gaps are.
Architecture and workflow:
AI and automation:
Pricing and support:
Vendor trajectory:
If your current stack fails three or more of these checks, the coordination tax you pay is compounding every incident. The on-call tool selection framework walks through each criterion with evaluation guidance.
Our migration tooling from PagerDuty imports schedules and notification rules directly, and the Rescue Program includes hands-on migration support. Teams typically become operational within three to five days of import.
Schedule a demo to see the AI SRE assistant in action and learn how the coordination difference can work for your team.
MTTR (Mean Time To Resolution): The average time from when an incident is detected to when it is fully resolved. MTTR includes detection, coordination, diagnosis, and fix time. Favor's almost 40% MTTR reduction after adopting incident.io, a customer-specific result, meant resolving incidents faster and reclaiming hours of engineering time monthly.
Context switch tax: The coordination overhead created when engineers navigate multiple tools during an active incident. Checking PagerDuty for the alert, Datadog for metrics, Slack for communication, Jira for tickets, and Confluence for runbooks creates several minutes of lost time per incident before any troubleshooting begins.
AI SRE: Our AI system that delivers up to 80% reduction in MTTR. Unlike AI that surfaces correlated logs, the AI SRE identifies the likely code change or service dependency behind an incident and can generate a fix pull request directly from the Slack incident channel.
Dollar-based net retention rate (NRR): A SaaS metric measuring whether existing customers spend more or less compared to 12 months prior. NRR above 100% means customers expand. NRR below 100%, as in PagerDuty's 98% as of January 2026, means customers contract or churn faster than others expand.
Slack-native architecture: A design pattern where the incident management platform's primary interface is Slack itself, not a web application that sends notifications to Slack. Slash commands like /inc escalate and /inc resolve manage the full incident lifecycle without leaving the chat tool your team already uses. That is how we built incident.io from day one.


