New kids on the blog

Weekly Update

On top of the usual collection of product improvements, this week we've also shipped our new blog. The old one felt a little uninspiring, and since we've got lots we want to share, we thought a little overhaul was a good investment of time. To mark the occasion we shared the first in our interview series, this time featuring Colm Doyle from Slack.

We've had a few things competing for our time over the last few weeks, which has slowed the pace of change a little. Next week's looking pretty clear though, with plenty of time to devote to product and engineering. Stay tuned!

What we shipped

  • 🆕 We launched our lovely new blog and the first in our new incident interview series, this time with Colm Doyle from Slack. Kudos to Pete here!
  • 🆕 We've made some changes to make incident summaries feel more connected to the incident. After you /incident update, we now prompt you to check that the overall summary still makes sense. A well-written summary can go a long way towards keeping folks on the same page.
  • 👷🏽‍♀️ We've been working with an external company to carry out a penetration test against our app and infrastructure. We're confident in our security setup, but it's great having someone try to find any holes. So far, things are looking great 🔒
  • 👷🏽‍♀️ We've added some debugging tools to our codebase to help us find and fix things like memory leaks in future. As they say, failing to prepare is preparing to fail.
  • 👷🏽‍♀️ We used to perform a round trip to Slack on one of our identity APIs, which meant the success of that API depended on the call to Slack succeeding. Given we were fetching data that's unlikely to change often, we've moved it into a cache.
  • 💅 We used to nudge you to set the 'incident lead' even if you'd changed what the role was called. Now, if you call it a commander, we'll call it a commander 🤝
  • 💅 We've shortened the timestamp we use in postmortems. Nobody needs millisecond precision here.
  • 💅 We used to nudge you about the incident being quiet, even if you were in a monitoring state. When you're monitoring, it's not uncommon for the channel to go a little quiet, so this wasn't particularly helpful. Longer term, we'll make all of our nudges customisable, but for now we've restricted this nudge to the investigating and fixing stages of the incident, when we expect activity in a channel.
  • 💅 We no longer show inactive or removed users in dropdowns, since assigning an action to someone who no longer works for you probably isn't a wise move.
  • 🐛 We fixed a bug that occasionally meant images wouldn't show up in the incident timeline. Long story short, when you loaded the incident homepage we'd generate a URL for the image which would be valid for 10 minutes, and then cache the linked timeline item for 15 minutes. In those last 5 minutes, the image URL had expired, but we still served the timeline with it. All fixed now though, so pin away!
  • 🐛 Postmortem exports weren't working on Safari, but now they are!
  • 🐛 We fixed a few minor issues with our Jira integration, and have some bigger changes on the way soon.
  • 🐛 If you reopened an incident, there was a chance the incident homepage would stop working. Not anymore though! Thanks to RD Station for their help getting to the bottom of this 🙏
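
The image bug above comes down to two timers that disagree: a URL that lives for 10 minutes inside a cache entry that lives for 15. Here's a minimal Python sketch of that mismatch. The names (CachedItem, serves_expired_url, safe_cache_ttl) are made up for illustration and this isn't the production code. The fix shown, capping the cache lifetime below the URL lifetime, is an assumption about one way to solve it, not a description of what was actually shipped.

```python
from dataclasses import dataclass

URL_TTL_SECONDS = 10 * 60    # image URL validity, as described above
CACHE_TTL_SECONDS = 15 * 60  # old timeline cache lifetime, as described above


@dataclass
class CachedItem:
    """Hypothetical cached timeline item that embeds an expiring image URL."""
    image_url_expires_at: float  # when the image URL stops working
    cache_expires_at: float      # when the cached item is dropped


def cache_item(now: float, url_ttl: float, cache_ttl: float) -> CachedItem:
    # Both timers start at the moment the item is built and cached.
    return CachedItem(now + url_ttl, now + cache_ttl)


def serves_expired_url(item: CachedItem, now: float) -> bool:
    """True when the item is still served from cache but its URL is dead."""
    still_cached = now < item.cache_expires_at
    url_expired = now >= item.image_url_expires_at
    return still_cached and url_expired


def safe_cache_ttl(url_ttl: float, margin: float = 60.0) -> float:
    """One possible fix (an assumption): cache for less time than the URL lives."""
    return max(0.0, url_ttl - margin)


if __name__ == "__main__":
    buggy = cache_item(0, URL_TTL_SECONDS, CACHE_TTL_SECONDS)
    fixed = cache_item(0, URL_TTL_SECONDS, safe_cache_ttl(URL_TTL_SECONDS))
    for minute in (9, 12, 16):
        t = minute * 60
        print(minute, serves_expired_url(buggy, t), serves_expired_url(fixed, t))
```

With the original timings, only the check at minute 12 reports a broken image, which matches the "last 5 minutes" window. With the capped cache lifetime, the cached item is dropped before its URL can expire.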
