The AWS Developers Podcast

By: Amazon Web Services

Episodes

Why Your Agent Evaluations Will Fail You (and How to Fix Them Before Production)

Jun 3 2026

Anthropic deprecated Sonnet 3.5. Some of Xelix's pipelines migrated smoothly. Others broke, and customers noticed within hours. What separated the two? Evaluation. Paul Solomon and James Price Farr have spent 5+ years building AI systems that process millions of invoices for enterprise customers. In this episode, they share the evaluation-first framework that now saves them every time a model changes, an orchestration layer fails, or an agent picks the wrong tool.

Key takeaways:

• Evaluation-first, not evaluation-after: Retrofitting evaluation onto an agent already in production is painful. Build your eval pipeline before you build the agent.
• Monitor tool calls, not just outputs: If the agent isn't selecting the right tools, nothing downstream will be correct. Tool-call monitoring is your leading indicator.
• 3 tiers of automation: Not everything needs an agent. Rules-based → single LLM call → agentic system. Pick the simplest tier that solves the problem.
• Extended thinking tames token explosion: After migrating to newer, more verbose models, enabling extended thinking (with a budget) moved reasoning out of expensive output tokens and brought costs back under control.
• Human-in-the-loop by default: Start with human review on every output, then earn trust toward touchless automation as customers gain confidence.
• Pragmatism wins: Use whatever technology works best for the problem. Not every feature needs an LLM.

Recorded live at AWS Summit London.
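The tool-call monitoring takeaway can be illustrated with a minimal sketch. It is not taken from the episode: the tool names, the `EvalCase` type, and the keyword-routing `stub_agent` are all hypothetical stand-ins, and a real evaluation would call the actual agent instead of the stub. The idea is simply to score tool selection against a labelled set before looking at final outputs.

```python
from dataclasses import dataclass


@dataclass
class EvalCase:
    """One labelled example: a task and the tool the agent should pick first."""
    task: str
    expected_tool: str


def tool_selection_accuracy(cases, select_tool):
    """Return the fraction of cases where select_tool(task) matches the label."""
    if not cases:
        raise ValueError("need at least one eval case")
    hits = sum(1 for c in cases if select_tool(c.task) == c.expected_tool)
    return hits / len(cases)


def stub_agent(task: str) -> str:
    """Hypothetical stand-in for a real agent's tool choice (keyword routing)."""
    if "invoice" in task.lower():
        return "extract_invoice_fields"
    return "search_vendor"


# Hypothetical labelled cases; tool names are illustrative only.
CASES = [
    EvalCase("Parse this invoice PDF", "extract_invoice_fields"),
    EvalCase("Find the vendor record for Acme", "search_vendor"),
    EvalCase("Match invoice 123 to a purchase order", "match_purchase_order"),
]

accuracy = tool_selection_accuracy(CASES, stub_agent)
print(f"tool selection accuracy: {accuracy:.2f}")
```

Running the same labelled set after every model or prompt change gives an early signal: a drop in this score points to a tool-selection regression before it shows up in downstream results.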

44 mins

5 Quality Gates That Let You Ship 250% Faster with AI Coding Agents

May 27 2026

How do you give 120+ engineers AI coding agents and NOT break production? Ryan Cormack, Principal Engineer at Motorway and AWS Community Builder (recognized as a Renaissance Developer by Werner Vogels), shares the exact system his team uses to ship 250% more deployments while keeping quality high.

In this episode, we break down the 5 quality gates that let Motorway's engineering teams move faster without sacrificing reliability: spec-driven planning to catch design issues before a single line of code is written, AI-assisted code review to verify code matches the plan, deterministic tests (unit + integration) as an automated safety net at the boundary, cyclomatic complexity checks to keep code maintainable, and human review as the final gate that stays human.

Ryan explains how cross-functional DevOps teams, organized like Amazon's two-pizza teams with full end-to-end ownership, enable faster AI adoption. He walks through running parallel agents to explore multiple solutions simultaneously, building custom tools on top of ACP (Agent Client Protocol), and sharing agent configurations across 120+ engineers via a Git + S3 pipeline.

The conversation also covers the Renaissance Developer mindset that Werner Vogels introduced at re:Invent 2024: curiosity, ownership, systems thinking, communication, and experimentation. Ryan shares how Motorway embraces this philosophy by encouraging engineers to build their own tools, experiment with new technologies in parallel, and focus engineering time on design and planning rather than writing code.

Whether you are scaling AI coding assistants across a large engineering org, building quality gates for agentic development, or rethinking how your team ceremonies and processes should evolve in the age of AI, this episode offers a practitioner's blueprint from someone delivering measurable results: 250% more deployments, 4x engineering throughput, and no uptick in production incidents.

1 hr and 2 mins

Dark Factories: Why Your AI Coding Setup Is Already Outdated

May 20 2026

You're using Copilot. Maybe you've tried Cursor or Claude Code. But what if that's already the tail end of the AI wave? In this episode, Romain sits down with Christian Weichel, CTO and co-founder of Ona (formerly Gitpod), to explore 'dark factories': autonomous AI agents that pick up work, write code, open PRs, and ship fixes while you sleep. No laptop required.

Chris shares how his team of ~20 engineers went from 450 open pull requests to a streamlined, auto-approving system, all while staying SOC 2 compliant. He walks through the 3 stages of AI in the SDLC (better autocomplete → software conductor → background agents), the governance model that makes background agents safe for regulated enterprises, and why terminal-based coding agents' days are numbered.

The conversation covers the risk ladder approach to auto-approving PRs, how isolated cloud development environments provide the security and autonomy agents need to operate safely, multi-agent code review with meta-reflection, and why accelerating implementation without accelerating review creates a bottleneck that breaks teams. Christian also shares his perspective on architecture governance, cognitive load management when running parallel agents, and why the future of IDEs will look different but won't disappear.

Whether you are adopting AI coding assistants, building governance frameworks for agentic development, or exploring how background agents can automate your SDLC end-to-end, this episode offers a practitioner's view from someone who's been shipping with autonomous agents in production.

49 mins
