With incident.io, Accurx have been able to standardise and improve overall visibility of their incidents, enabling them to meet auditing standards for their business.
Accurx is a healthcare startup that provides software for GP practices to help them communicate effectively with their patients. Accurx is currently used in the majority of GP practices in the UK.
Working with healthcare providers brings two main challenges:
As a result, Accurx has stringent contractual obligations to escalate and resolve incidents within agreed timeframes.
Given the sensitive nature of Accurx’s work, and the clinical risks involved, having clear processes in place to manage incidents was critical to prove their ability to respond to these effectively and responsibly.
To meet NHS requirements for engineering reliability and service management, Accurx needed to:
Previously, incidents were managed through Slack, but not all incidents were treated equally. For a smaller bug, existing Slack channels and threads were used to manage the response. For a major outage, a new, temporary Slack channel would typically be created.
The result was a lack of overarching visibility on how many incidents were happening, and information on specific incidents was easily lost. As Accurx scaled, this process became noisier and more chaotic, and they needed to streamline communication between their different teams.
As we got bigger, it was a bit harder and we needed to actually think about communication between different teams. Our comms and PR team need to know what's going on in case we need to tell people, and our commercial team need to contact our high-level customers or relationships.
incident.io saves us 1-2 hours per incident when considering the needs for us to write up the incident, root cause and actions thoroughly, communicate it to wider stakeholders (which would previously have taken us at least half an hour writing summaries and looping in the right people), and overall reporting on a monthly basis.
Before, things were often hidden in Slack threads: visibility was poor and information was easily lost. Through incident.io, incidents are raised a few times a week at Accurx, as a way of tracking minor bugs as well as major outages. With improved data and analytics, Accurx is now able to identify and respond to recurring problems much faster.
The main thing is just being able to get a clear picture of what's going on, and that's both at the level of an individual incident, but also all of the incidents going on…It's overall meant that we've kept a really strong eye on the quality of our product because there's so much structure and visibility.
incident.io also makes it quick and easy to provide NHS Digital with incident reports, including through the automatically generated postmortem documentation. These have brought consistency and structure to incident reporting, and removed the need for time-consuming manual work.
Through its automations, incident.io encourages more consistent, transparent updates, as well as a stronger evaluation process after an incident is closed. Automations have been customised to fit NHS terminology, and their incident categorisation is used to create automatic prompts depending on the incident type.
It'll say, "You've set this to severity two, you need to call X within this time."
There are features in incident.io that act as a coach for team behaviour. With prompts to set an owner or set actions, the schematic of a good incident report is laid out for everyone involved, customised to Accurx’s specific needs.
By centralising their end-to-end incident management process in Slack, Accurx have been able to improve collaboration within a hybrid team, with some people in the office and others working remotely.
incident.io means that we have the tools on Slack to do all the stuff there, so rather than having to write up on a whiteboard… it's our agreed place for where we're tracking everything there rather than people frantically figuring things out.
Accurx is using incident.io to track what the engineering team are calling “near-misses”: things that actually didn't happen, but that perhaps got deployed to the beta environment.
This has reinforced accountability, allowing teams to make improvements to the way they work and creating a log of small tweaks and tidies that can help prevent future problems.
We still want to get all of the learnings and force people to think about what the impact would have been and why it happened. The main thing really is to treat them seriously, and try and get all the same learning.
Situation: A bug that miscalculated the averages of blood pressure readings that patients were submitting.
Response: A Slack channel automatically created by incident.io allowed them to quickly sync up with other people in the business, pulling in the right engineers at the right time to diagnose and resolve the issue.
Communication: A clinical lead was looped in to assess the severity of the problem for the patients and suggest actions. The communications team could then be added to the channel to discuss contacting the impacted patients and explain how they may be affected.
Follow-up: Once the process was complete, the postmortem template and automated prompts from incident.io allowed the team to create an evaluation.
End result: During their next audit, Accurx was able to show the exact minute somebody spotted and raised this issue, who was assigned to deal with it, and what discussions were had afterwards.
We got a really positive report back to the NHS from the auditors, and we definitely credit a lot of that down to incident.io.