Learn how we tackle the ultimate paradox: ensuring our alerting system pages us, even when it’s the one failing. It's a common question - let's dive into detail on our "dead man's switch", how we stress-test our systems, and why we care so much about our setup allowing us to dogfood our own product.
When our CTO said "I'll upgrade your MacBook if you can prove it's worthwhile", we embarked on a journey including (re)building a Go hot-reloader, instrumenting developer builds, analyzing compiler performance, and feeding an AI model the data until we had an answer.
We care a lot about the pace of shipping at incident.io, and we also build lots of UIs inside Slack. Slack previews let us collaborate on designing these experiences much more quickly.
What comes after your default, out-of-the-box application secret solution? How do you add security to Heroku's environment variables, or go beyond putting secrets directly into Kubernetes? We've used GCP Secret Manager to improve our app secret handling, and this post shows how you can do the same.
Everybody loves a monolith, but you can hit issues as you scale. Learn how splitting workloads can improve your monolithic architecture's performance and scalability, and understand the trade-offs between monolithic systems and microservices.
This is a technical write-up of an incident on Friday 18th November 2022 where we experienced 13 minutes of downtime from intermittent crashes.
Ready for modern incident management? Book a call with one of our experts today.