How we page ourselves if incident.io goes down
Learn how we tackle the ultimate paradox: ensuring our alerting system pages us, even when it’s the one failing. It's a common question - let's dive into detail on our "dead man's switch", how we stress-test our systems, and why we care so much about our setup allowing us to dogfood our own product.
Lawrence Jones
Tracking developer build times to decide if the M3 MacBook is worth upgrading
When our CTO said "I'll upgrade your MacBook if you can prove it's worthwhile", we embarked on a journey including (re)building a Go hot-reloader, instrumenting developer builds, analyzing compiler performance, and feeding an AI model the data until we had an answer.
Lawrence Jones
Engineering nits: Building a Storybook for Slack Block Kit
We care a lot about the pace of shipping at incident.io, and we also build lots of UIs inside Slack. Slack previews let us collaborate on designing these experiences much more quickly.
Lawrence Jones
Better security for your app's secrets
What comes after your default, out-of-the-box application secret solution? How do you add security to Heroku's environment variables, or go beyond putting secrets directly into Kubernetes? We've used GCP Secret Manager to improve our app secret handling, and this post shows how you can do the same.
Lawrence Jones
Keep the monolith, but split the workloads
Everybody loves a monolith, but you can hit issues as you scale. Learn how splitting workloads can improve your monolithic architecture's performance and scalability, and understand the trade-offs between monolithic systems and microservices.
Lawrence Jones
Intermittent downtime from repeated crashes
This is a technical write-up of an incident on Friday 18th November 2022 where we experienced 13 minutes of downtime from intermittent crashes.
Lawrence Jones