We went live with our first set of AI-enabled features a few months ago.
Needless to say, we learned a lot along the way, as this was the first time we had experimented with generative AI.
Here, I'll share some of what we've learned as we've grappled with using LLMs to power new products at incident.io. This will be most applicable at the application layer: companies that are AI-enabled rather than AI-first.
With so much excitement to build with AI, where do you even start?
Honestly, it can be a little overwhelming, with hundreds of possibilities floating around—so let me share how we tackled this conundrum and what we learned.
Our first learning: run several short experiments to discover what's technically possible. The goal is to iterate quickly, so sometimes a Jupyter Notebook is enough to get a feel for an idea before building something you can ship to users.
We experimented with RAG, embeddings, multi-shot prompts, code generation, function calling, and others to build conviction in the team about where AI-powered features would be most compelling. We treated our first release similarly, as we wanted to understand what new AI-enabled features would have the biggest impact on our users.
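To make that concrete, here's the flavour of a quick notebook experiment, in this case checking whether off-the-shelf embeddings capture incident similarity. This is an illustrative sketch using the OpenAI Python SDK, with made-up incident summaries, not code from our product:

```python
# Quick notebook check: do off-the-shelf embeddings capture incident similarity?
import numpy as np
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

incidents = [
    "Database connection pool exhausted, API returning 500s",
    "Postgres connections maxed out, widespread request failures",
    "Marketing site footer renders the wrong colour in Safari",
]

# Embed all three summaries in a single API call
resp = client.embeddings.create(model="text-embedding-3-small", input=incidents)
vectors = np.array([d.embedding for d in resp.data])

# These embeddings are unit-length, so the dot product is cosine similarity
print(np.round(vectors @ vectors.T, 2))
# We'd expect the two database incidents to score far closer to each other
# than either does to the CSS bug.
```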
We shipped four features that covered a lot of ground. Building that breadth of knowledge early meant we could move fast later.
We quickly realized that investing in tools and developer experience up front was as important as investing in the team. This allowed us to experiment faster and test more ideas.
One prominent example that highlights the importance of this: we built a CLI tool that ran our prompts against OpenAI using a curated set of incidents, with examples where the model did well and examples where it struggled. This was essentially a fixture test: during development, we ran it repeatedly to make outputs more predictable.
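As a rough illustration, a stripped-down harness of that kind might look something like this. The names and file layout are hypothetical, and it assumes the OpenAI Python SDK; it's a sketch of the idea, not our actual tool:

```python
# fixture_check.py: run the summarization prompt against known incidents
# and check each summary mentions what a good one must mention.
import json

from openai import OpenAI

client = OpenAI()

def summarize(incident_text: str) -> str:
    """Run the prompt under test against a single fixture incident."""
    resp = client.chat.completions.create(
        model="gpt-4",
        temperature=0,  # keep runs as repeatable as possible
        messages=[
            {"role": "system", "content": "Summarize this incident in two sentences."},
            {"role": "user", "content": incident_text},
        ],
    )
    return resp.choices[0].message.content

def run_fixtures(path: str) -> None:
    # Each fixture pairs an incident transcript with phrases a good summary
    # must contain, built from cases where the model previously did well or badly.
    with open(path) as f:
        fixtures = json.load(f)
    for fx in fixtures:
        summary = summarize(fx["incident"])
        missing = [p for p in fx["must_mention"] if p.lower() not in summary.lower()]
        status = "PASS" if not missing else f"FAIL, missing {missing}"
        print(f"{fx['name']}: {status}")

if __name__ == "__main__":
    run_fixtures("fixtures/incidents.json")
```

Running this after every prompt tweak tells you immediately whether you've fixed the hard cases without regressing the easy ones.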
This tooling meant our summarization feature took only ~1.5 weeks to build, and we could iterate much faster on everything that followed.
Things have changed quite a bit since we built this feature, and there are now many more good tools for use cases like this, so you should definitely consider buying vs. building (e.g., context.ai).
The general idea here is that AI isn’t all that different from good old-fashioned software. You should still have principles about building products your customers love, which will help you narrow down from a long list of ideas to valuable products.
Here are a few that proved invaluable to us:
Large language models (LLMs) are powerful tools for your product, but you should also use them during development.
While building Related Incidents, we used clustering and GPT-4 to explain cluster similarities. This is just one example, but LLMs are fantastic partners for prompt engineering, interpretability, and more!
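For a sense of how simple this can be, here's a hedged sketch of that workflow: embed incident titles, cluster them, then ask GPT-4 to explain what each cluster shares. The titles and cluster count are made up for illustration, and it assumes the OpenAI SDK plus scikit-learn:

```python
# Use GPT-4 as an interpretability partner: cluster incident titles, then
# ask the model to explain what each cluster has in common.
from openai import OpenAI
from sklearn.cluster import KMeans

client = OpenAI()

def embed(texts: list[str]) -> list[list[float]]:
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return [d.embedding for d in resp.data]

def explain(members: list[str]) -> str:
    prompt = (
        "These incident titles were clustered together:\n"
        + "\n".join(f"- {t}" for t in members)
        + "\nIn one sentence, what do they have in common?"
    )
    resp = client.chat.completions.create(
        model="gpt-4", messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content

titles = [
    "Payments API latency spike during peak traffic",
    "Checkout requests timing out under load",
    "S3 bucket accidentally made public",
    "IAM role granted overly broad permissions",
]
labels = KMeans(n_clusters=2, n_init=10).fit_predict(embed(titles))
for k in range(2):
    cluster = [t for t, label in zip(titles, labels) if label == k]
    print(f"Cluster {k}: {explain(cluster)}")
```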
Foundation models and developer tooling are changing so fast that you probably shouldn't focus too much on any single detail, but rather on the big picture and moving quickly.
Here are some of the observations we made with this in mind:
Finally, launch in phases. This is often helpful in product development, particularly with AI-enabled features, where many unknowns exist. Here's an anecdote from our experience that highlights why this is so critical:
We had a support request about "mistranslation in Portuguese," which surprised us because we hadn't built multi-language support for most areas of our product yet, so everything should've been in English.
It turns out that our AI features were working in all languages, depending on their input data. It's pretty incredible that we got this for free, but catching it before GA let us do valuable QA and gave us some control over that experience.
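As one example of the kind of check a phased rollout gives you room to add, you could flag outputs whose language doesn't match the input's. This sketch uses the langdetect library purely for illustration; it isn't necessarily how we handled it:

```python
# QA guard: flag AI output whose language doesn't match the input's.
from langdetect import detect  # pip install langdetect

def language_matches(source: str, output: str) -> bool:
    """True when the model replied in the same language as its input."""
    return detect(source) == detect(output)

# e.g. a Portuguese incident update paired with an English summary
update = "O banco de dados ficou indisponível durante dez minutos."
summary = "The database was unavailable for ten minutes."
print(language_matches(update, summary))  # False: flag this pair for review
```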
The reality is that these systems are so new, you’re likely to bump into things you don’t expect.
My final thought: don't wait.
Today, our AI features are amongst our most loved, with >95% of customers using them every month. The potential to make something people want with AI is huge, and I bet you can improve your product in ways you don't expect.
That said, it's still so early. Foundation models change weekly, and we're discovering new user experiences, like Devin from Cognition AI. But if you start now, you could be a leader in your industry.