
If you’re heading to AWS re:Invent this year, you already know the struggle: hundreds of sessions, all sounding useful, all happening at the same time, and only one of you to go around. Whether you’re deep in SRE land, running incident response, or just trying to build more resilient systems, finding the right sessions can feel like a full-time job.
That’s why we pulled together a curated list of the must-see talks for anyone who cares about reliability, on-call life, cloud resilience, or keeping production from melting down at 2 a.m. These AWS re:Invent sessions are packed with real-world architecture lessons, resilience best practices, and incident-ready tactics you can bring back to your team.
If you want to make the most of your week in Vegas without scrolling the catalog until your eyes blur, start here.
Come see our team at Booth #362. We’ll be giving away swag like our famous socks, flame plushies, tote bags, and more. Plus, we’ll be raffling off a pair of AirPods Max. 🔥
On Tuesday, December 2, join us after the conference for a special happy hour at the F1 Arcade. We're taking over the world's largest F1 Arcade for a night of racing simulators, premium food and drink, and mingling with others in the field.
You can register for the happy hour, book a meeting with a rep on-site, and see more details about incident.io x AWS here.
Breakout session
Why you should attend:
If you want to understand how AWS designs for extreme resilience at a global scale, this session is for you. It digs into the patterns, failure assumptions, and architectural guardrails AWS uses when building its most critical services. For incident response managers, this is a rare behind-the-curtain look at how to design for failure, not just react to it.
What you’ll learn:
You’ll walk away with practical strategies for adopting AWS-style resilience patterns in your own systems. That includes how to build for availability, reduce fragility, and think about failure domains the way AWS does. It’s directly applicable to improving your incident preparedness and upstream architecture.
Speakers: Amazon Web Services, Inc.
Time commitment: 1 hour
Breakout session
Why you should attend:
Serverless introduces a whole new set of operational challenges: cold starts, distributed execution, and limited visibility. This session explores how AWS tests Lambda’s resilience using chaos engineering and fault-injection techniques that uncover real-world failure scenarios before customers ever hit them.
What you’ll learn:
Expect to learn how to apply resilience testing to your own serverless or event-driven architectures, including how to validate assumptions, uncover hidden failure modes, and reduce surprises during on-call. You’ll leave with concrete ideas to bring back to your incident response or SRE team.
Speakers: Amazon Web Services, Inc.
Time commitment: 1 hour
Breakout session
Why you should attend:
Every SRE knows that clear failure domains are one of the strongest tools you have in reducing downtime. This session breaks down how fault isolation boundaries - across AZs, Regions, and workloads - can dramatically limit blast radius and speed recovery when something does go wrong.
What you’ll learn:
You’ll gain an understanding of how to apply AWS’s fault isolation patterns in your own architecture and how tools like AWS Application Recovery Controller (ARC) can support faster recovery. For incident managers, this is a blueprint for fewer unknowns and smoother response during high-severity events.
Speakers: Amazon Web Services, Inc.
Time commitment: 1 hour
Topic track
Why you should attend:
If your responsibilities span ops, governance, on-call, observability, and reliability, the Cloud Operations track is your home base. This track is built around improving productivity, reducing complexity, managing multi-cloud/hybrid systems, and building for operational resilience.
What you’ll learn:
Sessions in this track cover everything from day-2 operations and monitoring strategies to large-scale governance and cross-environment incident management. It’s ideal for SREs, incident responders, and platform teams looking to uplevel operational maturity.
Speakers: Various AWS speakers
Time commitment: Varies by session
Breakout session
Why you should attend:
The AWS Well-Architected Framework has shaped cloud best practices for a decade, and this session walks through its evolution, lessons learned, and the architecture patterns that matter most today. It’s a great pick if you want to align your incident tooling and operational practices with proven architectural foundations.
What you’ll learn:
You’ll learn how to build systems that are secure, reliable, and operationally sound by grounding decisions in established AWS principles. For SRE and IR teams, it ties architecture directly to the workflows and playbooks you rely on during incidents.
Speakers: Amazon Web Services, Inc.
Time commitment: 1 hour
As always, AWS re:Invent is packed with more content than any one human can realistically absorb, but the sessions above give you a solid roadmap - especially if you live and breathe reliability, incident response, or the never-ending quest for “just a little more resilience.” Whether you're itching to dive into multi-Region failover patterns, sharpen your on-call muscle, or steal some hard-earned lessons from teams operating at massive scale, these talks are absolutely worth your time.
At the end of the day, the whole point of re:Invent is to come home with ideas you can actually use. So bookmark the sessions that speak to you, block out some mental space to reflect after each one, and get ready to bring back insights your team will thank you for during the next 2 a.m. surprise.
We hope to see you there!


