The incident-io/core application uses a mixture of environment variables, config files and secrets stored in Google Secret Manager to configure the app. This is a reference guide to all the parts that make up this flow.
Application config comes in several forms:
1. Non-sensitive config, such as APP_ENV
2. Sensitive config (secrets), such as OAuth client secrets or salts
3. Runtime configuration, such as DATABASE_URL, which depends on the infrastructure the app runs on
Both (1) and (2) make sense to track alongside the code that uses them, and – quite crucially – would be expected to be set to the same value for each instance of an application environment. That means if we deployed the staging environment into two infrastructures, say Heroku and Google Cloud Run as part of a migration, this config would not change between those two instances.
Runtime configuration (3) differs in that it’s specific to the infrastructure: you might get a different DATABASE_URL between instances of the same environment, or even between different roles (perhaps we should give cron a DATABASE_URL that points at a replica rather than the primary).
The incident-io/core application includes configuration files for each environment it is deployed into (at the time of writing, staging and production), which you can find at config/environments/<env>.yml, each parseable into a Config structure defined in the code.
💡 Config files mostly track non-sensitive (1) and sensitive (2) config, and only include runtime (3) if it makes sense to assert on it being present when booting.
Config is a simple structure mapping FIELD_NAME to value, with some optional struct tags that opt values into additional validation.
A small sample is:
type Config struct {
	APP_ENV           string `config:"required"` // type 1 (non-sensitive)
	APP_ROOT          string // type 3 (runtime)
	ACCESS_TOKEN_SALT Secret // type 2 (sensitive)
	// ...
}
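As an illustration of what those struct tags might drive, here is a minimal validation sketch using reflection. The mechanism shown here is an assumption for explanatory purposes, not the actual code in incident-io/core:

package config

import (
	"fmt"
	"reflect"
)

// validateRequired walks the Config struct and returns an error for any field
// tagged `config:"required"` that has been left at its zero value.
// Illustrative sketch only; the real validation may differ.
func validateRequired(cfg Config) error {
	v := reflect.ValueOf(cfg)
	t := v.Type()
	for i := 0; i < t.NumField(); i++ {
		field := t.Field(i)
		if field.Tag.Get("config") == "required" && v.Field(i).IsZero() {
			return fmt.Errorf("config field %s is required but not set", field.Name)
		}
	}
	return nil
}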
Taking the same slice of the production configuration file:
# config/environments/production.yml
---
APP_ENV: production
ATLASSIAN_OAUTH_CLIENT_ID: OEr2GKQfh8rHiqKVtFPWRiK2dAJ20FTd
ATLASSIAN_OAUTH_CLIENT_SECRET: secret-manager:projects/65058793566/secrets/app-atlassian-oauth-client-secret/versions/1
When the application wants to use any of these config values, it does so via a package singleton (e.g. config.CONFIG.APP_ENV) that is loaded on application boot via a call to config.Init.
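In practice that looks something like the sketch below. The config.CONFIG.APP_ENV access pattern comes from the description above; the import path and the signature of config.Init are assumptions made for this illustration:

package main

import (
	"fmt"

	"github.com/incident-io/core/server/pkg/config" // import path assumed for illustration
)

func main() {
	// Load config on boot: parse the environment's config file and resolve any
	// secret references (signature assumed for this sketch).
	if err := config.Init(); err != nil {
		panic(err)
	}

	// Values are then read from the package singleton anywhere in the app.
	fmt.Println("running in environment:", config.CONFIG.APP_ENV)
}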
Key things to know about the loading of config:
- The configuration file is parsed into a config.Config structure.
- If you have an APP_ENV environment variable set, it will take precedence over whatever was in the configuration file.
- Values of the form secret-manager:<reference> mean the value is stored in Google Secret Manager, usually within the same Google project that the app runs under, and we’ll set the config value to whatever we access at that reference in Secret Manager.

After this, the config is loaded and ready to be used. Other runtime values (3) might be loaded directly via os.Getenv if needed.
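For intuition, here is a minimal sketch of what resolving a secret-manager:<reference> value involves, using Google’s Go Secret Manager client. This is a simplified illustration, not the actual loader from incident-io/core:

package config

import (
	"context"
	"strings"

	secretmanager "cloud.google.com/go/secretmanager/apiv1"
	"cloud.google.com/go/secretmanager/apiv1/secretmanagerpb"
)

// resolveValue returns the config value as-is, unless it is a
// secret-manager:<reference> string, in which case it fetches that secret
// version from Google Secret Manager and returns its payload.
func resolveValue(ctx context.Context, value string) (string, error) {
	ref, ok := strings.CutPrefix(value, "secret-manager:")
	if !ok {
		return value, nil // plain value, nothing to resolve
	}

	client, err := secretmanager.NewClient(ctx)
	if err != nil {
		return "", err
	}
	defer client.Close()

	// ref looks like projects/<project>/secrets/<name>/versions/<version>
	resp, err := client.AccessSecretVersion(ctx, &secretmanagerpb.AccessSecretVersionRequest{
		Name: ref,
	})
	if err != nil {
		return "", err
	}

	return string(resp.GetPayload().GetData()), nil
}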
Google Secret Manager can be used to securely store secret material. It stores secret material under a secret name, and each secret can have multiple secret versions.
Developers are able to access Secret Manager via the Google Cloud Console or through APIs.
As a quickstart:
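For example, you can read a secret version directly from your terminal with the gcloud CLI (project placeholder, secret name taken from the config sample above):

gcloud secrets versions access latest --secret=app-atlassian-oauth-client-secret --project=<project>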
Most interactions with Google Secret Manager will happen through incident-io/core/server/cmd/config, which is a CLI that provides wrappers for common operations.
usage: config [<flags>] <command> [<args> ...]
Manage configuration files for the app
Flags:
--config-file=CONFIG-FILE The configuration file to load
Commands:
show [<flags>]
Loads and shows a configuration file
create-secret --project=PROJECT --field-name=FIELD-NAME
Create a new version of a secret in Google Secret Manager. Will create secret if it does not already exist.
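For example, creating a new version of a production secret might look like this; the flags are taken from the usage above, while the project value is a placeholder:

config --config-file=config/environments/production.yml create-secret --project=<project> --field-name=ATLASSIAN_OAUTH_CLIENT_SECRET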
The secret material our app uses is extremely valuable, and an obvious target for attackers. We’ve taken steps to protect these secrets that go beyond our normal procedures, which come with two important and competing realisations:
Until now, very little could change outside of our control that might lead to significant downtime, but even small things – such as Heroku changing their IP ranges – may now lead to extended outages, especially if we don’t understand the setup.
We can mitigate these risks by having everyone read this article, which explains what we’ve configured, why, and how it might bite us.
We now separate secret material from our app runtime, which puts another step between someone accessing the environment variables and getting access to the secrets.
Our solution is to place secrets inside Secret Manager, which grants read access to those secrets to a Google Service Account associated with the app. By necessity, we continue to provision the Google Service Account credentials via an environment variable to the Heroku compute, but this setup provides several benefits:
With just Google IAM and secrets stored in Secret Manager, compromising our application environment is less likely to expose the secret values, as access is limited to the small window between a breach and our rotation of the Google Service Account credentials.
That small window is still concerning though, as a motivated attacker could pull our secrets in that time, and we’d be unable to cut off access if they did (they’d have a copy of the secrets, which is game over).
To restrict this, we’ve created a security perimeter in our Google Cloud Platform account that represents locations we expect to receive requests from, and limit access to Secret Manager to appropriate combinations of source and IAM credentials.
Before we go into details:
To protect us against leaks of our Google Service Account credentials, we have:
Copying the ingress policy below, you’ll see we restrict specific patterns of access and pair them with IAM credentials:
The result is that:
Finally, whoever stole the credentials is likely to try accessing the secrets before ever realising we have these security rules in place. We can set up alerts for any security policy violations so we hear about it as soon as it happens, helping us reduce the window of opportunity even more.