How Altis has scaled and planned for the future with incident.io

With incident.io, Altis has implemented a scalable, repeatable model for incident management that can grow easily with the organization.

Key Benefits

  • An incident response process that scales
  • Improved process efficiency
  • A clearer overview of incidents
We previously had 230-250 PagerDuty incidents per month; now we have 2-5 incident.io incidents, which has massively improved understandability and enhanced clarity.
Ryan
Director of Product

Altis is an enterprise-level hosting service whose customers include Snopes, Red Bull Media House, and Yell. Altis is part of Human Made, a fully remote organization of about 80 people around the world.

The challenge

As an enterprise host, Altis must provide incident management support to customers on its platform. However, before adopting incident.io, Altis had an unclear process for managing customer-reported incidents. Incidents were run through a pre-existing, central Slack channel. Because this channel often carried several incidents at once, communications were easily lost among other messages.

Overall, there was a lack of focus, no clear lead for any given incident and a generally haphazard approach. Collaboration and communication in particular suffered as a result. Without a clear structure, the team found that there was a lot of confusion, with efforts sometimes being duplicated or important actions being missed.

We all knew that wasn't a great system, but we didn't see an easy way to replace that.

What were they looking for in an incident management tool?
  • An adaptable tool which would integrate with their pre-existing tools, such as PagerDuty
  • A way to collaborate more effectively across the team
  • A way to learn from and implement best practice in incident management

The solution

An incident response process that scales

incident.io has helped Altis to establish a scalable, repeatable model for incident management that can grow easily to support an increasing number of customers. In the past, there was a lack of confidence in how best to manage some aspects of the incident process. incident.io has provided a best practice framework and enabled the team at Altis to codify this, so that there is consistency every time.

I think in some cases we were like, "I don't know what we should do here," and incident.io has helped us shape that.

Improved process efficiency

By bringing structure, providing a dedicated space for each incident and automating processes, incident.io has helped the engineers at Altis to deal with issues much more efficiently. For example, instead of engineers having to ask for updates and offer assistance, incident.io streamlines this flow by clearly assigning roles and actions and providing timely nudges to keep the incident running smoothly.

The automated reminders and things like that are really handy... because it's automated, it doesn't feel stressful.

A clearer overview of incidents

The web dashboard has enabled Altis to get a much clearer picture of the incidents that are occurring over time. Previously, the team would have to review PagerDuty triggers to work out how many incidents had taken place, but often this meant sifting through a large number of alerts that were not “real” incidents. It’s now really easy to get a snapshot of real incidents in any given month, and make the improvements needed to downgrade or eliminate them in the future. Access to this data, alongside timelines and postmortems, helps Altis to provide evidence as part of customer and compliance audits.

Now it's much, much easier for us to go back and say, "Well, how many actual incidents did we have this month versus alerts? What are we doing to improve upon this to either eliminate those entirely or to downgrade them to just alerts?"

About the interviewee

Ryan McCue is the Director of Product at Human Made (makers of Altis DXP). He is the creator of the WordPress REST API, and a WordPress security team member.

Industry: Web hosting
Customer since: November 2021
Company size: 80
