We’ve been shipping loads of great work in our existing Response and Status Page products, but we’ve also been hard at work behind the scenes building On-call, which until now we haven't been able to talk about.
It's great to finally be able to show you what we're building going forward. In the meantime, we thought you might enjoy a few highlights, followed by a long list of smaller improvements we made in the run-up to launch.
With the release of On-call, we now not only provide a product for effective incident coordination, but can also ensure your team gets proactively notified via phone, email, SMS, push notification and Slack when there is an issue 🔔.
Our On-call solution is built from four main components:
Alerts: Connect tools like Datadog and Sentry to notify us if something goes wrong
Schedules: Configure who should be on-call and when
Escalation Paths: Define who should be contacted when something goes wrong
Notifications: Notify your team via SMS, phone call, email, Slack and our mobile app ✨
Alerts
ℹ️ We launched Alerts to all our customers back in January of this year
Alerts allows organizations to automatically create incidents when something fires from monitoring tools such as Datadog or Sentry. Auto-creating incidents removes unnecessary friction from the incident response process (like having to manually create incidents from alerts).
As part of the On-call release, we have added a new section to your Alert routes which allows you to specify how you want to escalate these alerts.
For example, you can notify the relevant team via their escalation path, and set a delay on escalating grouped alerts (thereby giving you a chance to mark alerts as related in the incident channel before your phone starts ringing).
Schedules
Schedules determine when individuals from a team or group will be on-call. Within your schedule creation, you will have the ability to configure:
When your handover will take place (e.g. every Monday at 9am)
When the schedule will be active (e.g. all day, or only during business hours)
Which individuals will be on-call, and if you want more than one person on-call simultaneously
Additional rotas, which can support models such as:
Follow-the-sun (i.e. spread support across timezones)
Shadowing (i.e. onboarding new on-call individuals)
Additionally, we’ve built out support for overrides, making it easier for your teams to handle on-call coverage if someone needs to pop out to the gym or head to a doctor’s appointment.
👀 Make sure to check out /inc cover me in Slack, which allows you to request coverage from other team members on the same schedule.
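To make the schedule settings above a little more concrete, here is a minimal sketch of how a simple round-robin rota with a fixed handover and optional overrides could be resolved. It is purely illustrative: the function name `on_call_at`, the tuple shape used for overrides, and the example users are all assumptions made for this sketch, not the way On-call is actually implemented or configured.

```python
from datetime import datetime, timedelta


def on_call_at(users, first_handover, shift_length, when, overrides=()):
    """Return who is on-call at `when` for a simple round-robin rota.

    `overrides` is a hypothetical list of (start, end, user) tuples that
    take precedence over the regular rotation while they are active.
    """
    # An active override always wins over the regular rotation.
    for start, end, cover in overrides:
        if start <= when < end:
            return cover
    if when < first_handover:
        raise ValueError("schedule has not started yet")
    # Number of completed handovers since the rota began.
    shifts_elapsed = (when - first_handover) // shift_length
    return users[shifts_elapsed % len(users)]


# Hypothetical rota: handover every Monday at 9am, three responders.
users = ["alice", "bob", "carol"]
first_handover = datetime(2024, 1, 1, 9, 0)  # 1 January 2024 was a Monday
weekly = timedelta(weeks=1)

# Carol covers for two hours in the middle of Bob's shift.
gym_cover = [(datetime(2024, 1, 10, 11, 0), datetime(2024, 1, 10, 13, 0), "carol")]

print(on_call_at(users, first_handover, weekly, datetime(2024, 1, 10, 15, 0)))
print(on_call_at(users, first_handover, weekly, datetime(2024, 1, 10, 12, 0), gym_cover))
```

The first call falls in the second week of the rota, so it resolves to the second user in the list; the second call lands inside the override window, so the cover is returned instead.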
Escalations
Escalation paths tell us who you would like to reach out to and in what order. You can configure:
Who to notify (either individuals or schedules) and in what order
How long to try notifying before moving to the next stage
How many times to try notifying if escalations go unacknowledged
Note: if you do not have a schedule set up, you can create one inline from the escalation path creation flow, or from the On-call > Schedules section of the dashboard.
👀 Don’t forget to check out /inc escalate to manually escalate during an incident. You can configure the form and utilize the catalog to make it easier for individuals to find the right person to escalate to.
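As a rough illustration of the three escalation path settings described above (who to notify, how long to wait, and how many times to try), the sketch below models a path as an ordered list of levels and works out which level would be notified after a given number of unacknowledged minutes. The `Level` class, the `level_at` function, the target names, and the interpretation of "how many times" as the number of full passes through the path are all assumptions for this example rather than On-call's real data model.

```python
from dataclasses import dataclass


@dataclass
class Level:
    targets: list          # hypothetical names of users or schedules to notify
    timeout_minutes: int   # how long to wait for an acknowledgement (must be > 0)


def level_at(levels, passes, minutes_unacknowledged):
    """Return the Level being notified after the given number of minutes,
    or None once every pass through the path has been exhausted."""
    cycle = sum(level.timeout_minutes for level in levels)
    if minutes_unacknowledged >= cycle * passes:
        return None
    # Position within the current pass through the path.
    offset = minutes_unacknowledged % cycle
    for level in levels:
        if offset < level.timeout_minutes:
            return level
        offset -= level.timeout_minutes
    return None


# Hypothetical path: primary schedule, then secondary, then a manager.
path = [
    Level(["primary-schedule"], 5),
    Level(["secondary-schedule"], 10),
    Level(["engineering-manager"], 15),
]

for minute in (0, 7, 20, 32, 60):
    level = level_at(path, 2, minute)
    print(minute, level.targets if level else "path exhausted")
```

With two passes over a 30-minute path, minute 32 lands back on the first level, and by minute 60 nobody is left to notify.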
Notifications
On-call supports notifications via Slack, SMS, phone call, email and push notification via our mobile app.
If you are on-call, you’ll be able to:
Choose where you’d like to receive notifications, both when something goes wrong and for upcoming shifts
Create rules for receiving notifications (e.g. notify me immediately via push notification)
Test your notifications to make sure things are working before starting your shift
👀 We will tell you if you set your notification rules in a way that violates an escalation policy you are on, so you can adjust accordingly.
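One way to picture notification rules is as a list of (method, delay) pairs that gets validated and then ordered by delay. The sketch below assumes exactly that shape; `notification_plan` and `MAX_DELAY_MINUTES` are hypothetical names, and the 10-minute cap and the support for more than one rule per method are taken from the "New" list further down this post.

```python
MAX_DELAY_MINUTES = 10  # the post mentions delays of up to 10 minutes


def notification_plan(rules):
    """Validate (method, delay_minutes) rules and return them ordered by
    delay, so immediate notifications come first."""
    for method, delay in rules:
        if not 0 <= delay <= MAX_DELAY_MINUTES:
            raise ValueError(
                f"{method}: delay must be between 0 and {MAX_DELAY_MINUTES} minutes"
            )
    # sorted() is stable, so rules with the same delay keep their given order.
    return sorted(rules, key=lambda rule: rule[1])


# Hypothetical preferences: push and SMS straight away, SMS again after
# five minutes, then a phone call after ten.
my_rules = [("sms", 5), ("push", 0), ("sms", 0), ("phone", 10)]

for method, delay in notification_plan(my_rules):
    print(f"after {delay} min: notify via {method}")
```

Keeping the rules as plain data makes it easy to check them against other constraints, such as the escalation timeouts they need to fit within.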
Mobile app
The mobile app for On-call will be the easiest way of engaging with On-call, from setting up your on-call preferences to quickly acknowledging issues.
In the mobile app you can:
Set up critical alerts to ensure push notifications can bypass silent and “do not disturb” modes
Add contact card syncing to allow for SMS and phone calls to also bypass silent and “do not disturb” modes
Configure your mobile app to light or dark mode
👀 One of the perks of having The Chainsmokers as investors is that when we were trying to figure out what our alert noises could be, they were only too happy to help. Thanks, team! 🎵
If you are interested in learning more about On-call, please reach out to hello@incident.io. And, don’t forget to check out our On-call help docs to help you get started!
🚀 What else we’ve shipped
We’ve been building On-call behind closed doors for a little while, which meant we haven’t been able to publicly shout about some of the bits and bobs we’ve shipped and fixed along the way.
There are far too many things to list out, so we’ve cherry-picked some highlights from just the past month as we approached launch day!
New
You can sync your shifts or the whole schedule to your calendar! (e.g. Google Calendar)
We now support paging using incident.io On-call via Workflows
The "User acknowledges an escalation" workflow now triggers for incidents escalated with incident.io On-call (already possible with existing providers)
You can now add more than one notification rule per notification method (e.g. notify me immediately via SMS, then notify me via SMS again after 5 minutes)
You can now be notified with a delay of up to 10 minutes
When you have multiple consecutive alerts pending on an incident, we'll now roll them up into a single summary message instead of overwhelming the incident channel
You can create test incidents from alert routes, to help you test your configuration more easily
Within Preferences, you can now remove mobile app installations as a notification channel (e.g. when changing or upgrading phones)
When using an expression for your alert attributes, you can now specify a fallback value
In incidents, you can now escalate to a user, in addition to escalation paths, from the dashboard
You can receive upcoming shift email notifications
Added the ability to use startsWith in JavaScript expressions for alerts
Inspect your alerts from the alert detail page
Empty alert attributes will now show in the alert inspect drawer
Non-paging notifications will be sent as non-critical on the app
New help center doc for bypassing focus/silent modes
We will backfill your alerts if you configure a new attribute
You can duplicate any existing escalation path
Added support for sending SMS from our shortcode numbers
Improvements
Introduced visual consistency to escalations list view
Added schedule name to upcoming schedule notification emails
Improved user experience around adding phone number notification methods
Don’t auto-acknowledge alert escalations when you mark an alert as unrelated, in case you forget about it
Links to schedules and escalation paths in upcoming shift messages are now clickable
Improved styling and information on email on-call notifications
Alert routes/sources no longer show as clickable to users without permissions
Improve layout of sections in alerts details page
Escalation paths are now listed alphabetically in the dashboard
Show a warning when we truncate overrides
Added more sensible alert notification defaults
Allow ability to delete expressions from alert routes and sources
On-call received a new icon in the sidebar
Improved shift notification text in user preferences
On-call QR codes point to the public app store applications
Improved language for grouping alerts on the alert route
Schedule overrides start will default to now if the shift has already started
Fix styling of the On-call widget in the dashboard sidebar
Improved responsiveness of the escalations page
You can now disable all shift change notifications
Improved the accuracy of the override preview
You can now see the name of the mobile devices you've installed the app to, and how long ago they were used
Added the ability to manage expressions within alert source and alert route forms
Clarified messaging when we don't escalate because nobody is on-call
Removed some confusing colour palettes from schedules
When editing a schedule, we'll now show a confirmation dialog before navigating away if you have unsaved changes
When declining an incident with pending alerts through the dashboard, you can now select all alerts instead of having to click through one by one to determine next steps
We've improved the formatting of alert messages in Slack to be easier to scan
The phone and SMS notification rules now display the phone number of that rule
Notification rules and shift change notification rules are now both ordered by time, with delayed notifications shown further down in the list
Bug fixes
Repeating escalations on escalation policies now work
Fixed a bug where users were unable to manually escalate from the dashboard
Incidents will always get created from an alert, even if we can’t set custom fields
Protect custom fields from being deleted if they are used in an alert route
Blocked access via the API to native manual escalations from private incidents
Fixed a bug where incidents related to alerts were not showing when active or while in the post-incident phase
Correctly show users when they don’t have permissions to create an alert source
Fixed shift length duration preview being incorrect when switching timeline size in override modal
Fixed bug where errored attributes were not showing in the alert inspect drawer
Fixed a bug in schedule/alert route deletion protection
Fixed a bug where the override creation modal would preselect the incorrect layer
Fixed a bug where upcoming shift emails would display the incorrect timezone
Fixed a bug that was causing alert filters to behave unpredictably
Fixed a bug that was causing schedules in non-UTC timezones to render entries incorrectly
Fixed a bug around alert transition states
Fixed a bug when displaying on-call users from multiple rotas
Fixed a bug where we were displaying “missing variable” in templated text expressions
Ensure we always display the latest action in the escalations list view
Smoother experience when editing attribute expressions
Fixed bugs with overrides on schedules with multiple rotations
Override previews now show up at the correct times on the timeline
We now correctly show the warning that an override is outside working hours when it actually is outside working hours
The notification for users going on-call for the first time is now only sent once
The "next changeover" time is now shown correctly for schedules where the schedule currently has no one on-call
Content no longer overflows between the panels when editing an alert source
/inc cover me now works in the same places that /inc cover does
Include overrides when calculating the next handover
Fixed auto-incrementing of acknowledgement options via SMS
Fixed on-call notification emails that don’t have associated incidents
Fixed title character escaping in Slack escalation messages
Always ingest alerts, even if we can’t evaluate an attribute
Fixed a bug where the wrong escalation path level was deleted while editing
Correctly show pending escalations in the timeline view again
Fixed a bug where not picking up a phone page would acknowledge it
We will correctly display if you’re on-call indefinitely
We correctly escape Slack special characters in messages for alerts
Fixed a bug where the templated text editor wasn't always correctly displaying expression variables
Fixed a bug where expressions weren't correctly being created in alert routing
Fixed a bug where we were displaying empty schedule entries