Improved CI/CD

May 13, 2025

One of our company values is ‘Raise the pace.’ We are constantly looking for ways to speed up and get value out to customers faster.

As our company grew, the time it took to deploy a change rose to 13 minutes.

When we first tried to improve this, we ran into rough edges in our CI/CD setup that prevented substantial time savings:

  • Parallelizing our checks came at an unacceptable cost-to-speed ratio
  • We were unable to effectively use the Go build cache
  • We were unable to effectively use our Yarn modules cache

13 minutes was too slow for us, and with our hiring plans, we knew now was the time to invest in improving things.

Engineer speed improvements

First, we invested time in making local development quicker.

While building a feature, we regularly run a suite of checks to ensure it works and meets our standards. An engineer will run these checks at least once per change they are making, so time here can add up quickly.

The big change we made was running our CI/CD checks on servers we own. This gave us more control over the resources we use to run them, which allowed us to parallelize much better, make use of caching, and remove the start-up costs associated with checks.
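
To illustrate why parallelizing helps, here is a simplified sketch rather than our actual setup. The check names, durations, and the helpers `run_checks_in_parallel` and `make_check` are all made up for this example. When independent checks run at the same time, the total wait is roughly the length of the slowest check instead of the sum of all of them.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_checks_in_parallel(checks, max_workers=4):
    """Run independent checks concurrently and return {name: passed}.

    `checks` maps a check name to a zero-argument callable returning a bool.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every check up front so they all start without waiting on each other.
        futures = {name: pool.submit(fn) for name, fn in checks.items()}
        # Collect results; .result() blocks until that check has finished.
        return {name: future.result() for name, future in futures.items()}


def make_check(seconds, passed=True):
    """Build a stand-in check that sleeps to simulate work (hypothetical)."""
    def check():
        time.sleep(seconds)
        return passed
    return check


if __name__ == "__main__":
    # Hypothetical checks: run one after another these would take about 0.6s.
    checks = {
        "lint": make_check(0.2),
        "unit-tests": make_check(0.2),
        "type-check": make_check(0.2),
    }
    start = time.perf_counter()
    results = run_checks_in_parallel(checks)
    elapsed = time.perf_counter() - start
    print(results)
    print(f"finished in {elapsed:.2f}s")
```

The trade-off is that more parallelism needs more machine resources at once, which is why having control over the servers matters.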

We managed to halve the time taken here, bringing it from 8 minutes to 4 minutes.

Deployment improvements

Improving things for ourselves is only half the battle: we also needed to make it quicker to get these changes out to you!

Making a change available involves multiple stages:

  • Run the full suite of checks again
  • Deploy the change to our pre-production environment
  • Deploy the change to our production environment
  • Run some post-deploy checks and steps
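
The stages above can be pictured as a simple ordered pipeline. This is only an illustrative sketch, not our real deploy tooling: the `run_pipeline` helper and the stage names are hypothetical, and it assumes each stage must succeed before the next one starts.

```python
def run_pipeline(stages):
    """Run stages in order, stopping at the first failure.

    `stages` is a list of (name, step) pairs, where `step` is a
    zero-argument callable returning True on success.
    Returns (completed_stage_names, failed_stage_name_or_None).
    """
    completed = []
    for name, step in stages:
        if not step():
            # A failed stage halts the pipeline; later stages never run.
            return completed, name
        completed.append(name)
    return completed, None


if __name__ == "__main__":
    # Hypothetical stand-ins for the four stages listed above.
    stages = [
        ("checks", lambda: True),
        ("deploy-pre-production", lambda: True),
        ("deploy-production", lambda: True),
        ("post-deploy", lambda: True),
    ]
    completed, failed = run_pipeline(stages)
    print("completed:", completed)
    print("failed:", failed)
```

Because the stages run one after another, time saved in any single stage, such as the checks, shortens the whole path to production.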

By building on the savings we made for engineers, and some deploy-specific changes, we are now able to get changes out to you in 7 minutes, down from 13 minutes.

We will be writing a more in-depth blog post on this work in the coming weeks, so stay tuned if you are interested in learning more!

What else we’ve shipped

New

  • You can now run a Backstage sync from catalog-importer in 'dry-run' mode to understand what changes will be made when you run the sync
  • Escalation paths in Catalog now have an 'all users' attribute, containing everyone on that escalation path, both directly and via schedules
  • We now support zelt.app's calendar feed for displaying holidays on schedules
  • Adding a branch in an escalation path now duplicates the existing path to the "else" branch, and converts your level nodes to low urgency
  • You can now filter by created_at in our Alerts API

Improvements

  • Adjusted the spacing on the on-call schedules page to prevent columns from overlapping
  • When changing your expression for alert priority in an alert source, the preview now reflects that change
  • The escalation timeline now shows acknowledgements that are more than a minute apart as separate items
  • We now warn about unsaved changes when you hit Esc while editing a summary
  • Improved documentation for usage of channel configs on an alert route in Terraform
  • You can no longer click Add without selecting a country when attaching public holidays to a schedule

Bug fixes

  • Fixed a bug where you couldn't add another escalation rule to some alert routes
  • You can now navigate through to alert attributes from the variable picker when setting custom fields on an alert route without the popover closing
  • Suggested summaries no longer cause unpredictable cursor behaviour when edited
