With incident.io, SumUp has a single place to turn to when things go wrong

Since adopting incident.io, SumUp has seen meaningful changes in the way it manages incidents, most notably a massive improvement in communication.

Our conversation with Adrián Moreno Peña

Key Benefits

  • Clearer communication during incidents—both internal and external
  • An all-in-one incident response platform that helps consolidate workflows
  • AI features that eliminate manual tasks
  • Improved post-incident processes that allow for more actionable learnings
One of the improvements that incident.io has brought to our incident response processes is the reduction of that cognitive overload. It’s one tool. You just send an /inc statuspage message, and you post an update. It's all in the same tool. It's in the same context.
Adrián Moreno Peña
VP of Engineering

SumUp has a presence in over 25 countries and is responsible for points of sale (POS) at businesses both big and small, so a high-severity incident impacts not only SumUp but also its customers’ ability to collect payment.

For a global company like SumUp, getting incident response and communications right carries significant weight.

But before adopting incident.io, SumUp found itself stuck in a loop of cognitive overload, both in incident comms and in its response processes as a whole. Manual processes hampered its ability to move quickly, and context switching was proving to be a big hurdle for responders as well.

These challenges directly impacted responders' ability to react to incidents without friction.

In incident.io, they found a better way to streamline incident response and a Status Page product that gave them the tools they needed to communicate more seamlessly than before.

Working with a cognitive overload

In the early days, VP of Engineering Adrián Moreno Peña and his team used a highly manual incident response bot built in-house.

“We had defined an incident management process that was fairly manual. We had developed an in-house bot…but it required manual overhead,” says Adrian.

For example, if a responder needed to update SumUp’s status page, they needed to take a peek at a sticky note on their computer that told them exactly how to do it and what to say, depending on the severity of the incident. In the long term, this proved to be unscalable for a company with stakes as high as SumUp’s.

Context switching mid-incident

Because SumUp is directly responsible for facilitating payments between consumers and businesses, every minute of downtime carried extra weight.

So when responders needed to context switch between SumUp’s response bot and their status page, it took them off task. And while a few seconds of distraction felt insignificant, this quickly became a burden for the team.

You might have a graph showing you where exactly the systems have dipped. But if you context-switch and open yet another tab in your browser, that might distract you, and you can lose the thread that you were pulling from.

For Adrian and his team, finding a solution that helped eliminate the need to switch from one tool to another when responding to incidents became more critical as they scaled.

“Keeping everything really close to where the action is happening just optimizes a lot—both in resolving the incident but also keeping merchants happy. The shorter the incident is, the better. That's what customers want.”

Simplicity wins

Closely related to context switching was another of SumUp’s biggest issues: using disparate systems for status pages and incident response caused unnecessary friction for responders.

“We were using a separate status page product, but it was not incorporated into the incident management process. That added a touch of friction because responders couldn’t post updates directly through our response bot,” says Adrian.

Switching from the Slack bot to the status page took only a moment each time, but over many incidents those moments added up.

We wanted to reduce the friction of managing an incident and make it easier to declare, manage, review, and learn from them.

For Adrian, the easier you make a process, the more people lean on it. And when it comes to incidents, the more you declare, the more data you have. The result of this? More opportunities to learn from incidents.

“It's all about behavioral economics. We want people to declare incidents because that means that we are going to respond in a timely manner and learn from them more efficiently. We’ll continue to make small mistakes here and there because it's part of operating a complex product in a very distributed fashion, but we’ll make better mistakes next time.”

With incident.io, SumUp found the solution they were looking for

With incident.io in the fold, Adrian and his team at SumUp have moved past a world of disparate incident response tools. Incident comms and response are now handled in a single, intuitive platform.

The benefits of this have been felt immediately.

Better comms for customers and stakeholders

It’s critical to communicate clearly with customers when incidents occur. Equally important is sharing communications with internal stakeholders who may be impacted by these same incidents.

Enter internal Status Pages.

“We use both internal and external Status Pages, and it's useful to have that distinction. Internally, we're going to have lots of communications that we might not want to expose publicly. Or, sometimes, there’s information that doesn’t mean anything to someone who is outside of SumUp. So it’s really valuable to have that separation of visibility,” says Adrian.

Now, SumUp can seamlessly switch between comms for customers and those for team members, such as customer support.

"It's beneficial for teams like operations, our key account managers, and compliance teams because they can always know the status of our systems. So if they observe something weird, they can go to the internal Status Page and check, ‘Have they found out about it? OK, I know that they're working on it.’"

AI—an unsung hero

Outside of Status Pages and Response, one of the added benefits of adopting incident.io has been the introduction of AI incident summaries.

We are a large company—3,000 people globally. AI summaries are saving us hours of work a month. For us, it's not about elaborating the message. What costs the most is the back-and-forth between teams.

For Adrian, eliminating yet another mental hurdle has proven invaluable.

“If there’s been an incident, our key account managers will get questions from large enterprise customers. ‘What happened with this? Can you tell us what the improvement plan is?’ But if you provide that information ahead of time—and we usually have it—it's now about compiling it in a digestible way, which is much simpler.”

By leaning on AI to provide these summaries, teams at SumUp can work more efficiently and build better relationships with customers in the long term.

“With these summaries, there's no need to go back and forth with engineering. ‘Can you give me an update? Can you give me an incident report? What can I share with these customers?’ Everyone knows what they have to do.”

A single place to turn when things go wrong

Friction. Cognitive overload. Disparate systems. Manual overhead.

All of these, and more, were pain points that SumUp felt acutely before adopting incident.io. Now, they are problems of the past.

“One of the improvements that incident.io has brought to our incident response processes is the reduction of that cognitive overload. It’s one tool. You just send an /inc statuspage message, and you post an update. It's all in the same tool. It's in the same context. You don't need to open another web browser. You don't need to copy-paste a message from another location.”

While it may seem inconsequential, every single second SumUp saves by keeping everything in a single tool adds up to significantly faster response times.

Optimizing every second and every decision you need to make during an incident is invaluable. It saves you time in the most stressful moments when you don't want to think about anything that isn’t returning the systems to normal.

Ultimately, incident.io is helping Adrian and his team do what they do best: respond to incidents, minus the cognitive overload.

“Codifying the incident management process into incident.io automates quite a lot of actions. Otherwise, it would require thinking, decision-making, and actioning certain things during an incident, which you don't really want to do.”

About the interviewee

Adrián Moreno Peña is a VP of Engineering at SumUp and oversees the Incident Management Process. Previously, he worked at companies such as VanMoof and Emakina.NL.

Industry: FinTech
Customer since: 2023
Company size: 3000+
Office model: Hybrid
