The AWS Developers Podcast

By: Amazon Web Services

Stay updated on the latest AWS news and insights for developers, wherever you are, whenever you want.
All rights reserved
Episodes
  • Why Your Agent Evaluations Will Fail You (and How to Fix Them Before Production)
    Jun 3 2026
    Anthropic deprecated Sonnet 3.5. Some of Xelix's pipelines migrated smoothly. Others broke, and customers noticed within hours. What separated the two? Evaluation. Paul Solomon and James Price Farr have spent 5+ years building AI systems that process millions of invoices for enterprise customers. In this episode, they share the evaluation-first framework that now saves them every time a model changes, an orchestration layer fails, or an agent picks the wrong tool.
    Key takeaways:
    • Evaluation-first, not evaluation-after: Retrofitting evaluation on an agent already in production is painful. Build your eval pipeline before you build the agent.
    • Monitor tool calls, not just outputs: If the agent isn't selecting the right tools, nothing downstream will be correct. Tool-call monitoring is your leading indicator.
    • 3 tiers of automation: Not everything needs an agent. Rules-based → single LLM call → agentic system. Pick the simplest tier that solves the problem.
    • Extended thinking tames token explosion: After migrating to newer, more verbose models, enabling extended thinking (with a budget) moved reasoning out of expensive output tokens and brought costs back under control.
    • Human-in-the-loop by default: Start with human review on every output, then earn trust toward touchless automation as customers gain confidence.
    • Pragmatism wins: Use whatever technology works best for the problem. Not every feature needs an LLM.
    Recorded live at AWS Summit London.
    44 mins
  • 5 Quality Gates That Let You Ship 250% Faster with AI Coding Agents
    May 27 2026
    How do you give 120+ engineers AI coding agents and NOT break production? Ryan Cormack, Principal Engineer at Motorway and AWS Community Builder (recognized as a Renaissance Developer by Werner Vogels), shares the exact system his team uses to ship 250% more deployments while keeping quality high. In this episode, we break down the 5 quality gates that let Motorway's engineering teams move faster without sacrificing reliability: spec-driven planning to catch design issues before a single line of code is written, AI-assisted code review to verify code matches the plan, deterministic tests (unit + integration) as an automated safety net at the boundary, cyclomatic complexity checks to keep code maintainable, and human review as the final gate that stays human. Ryan explains how cross-functional DevOps teams, organized like Amazon's two-pizza teams with full end-to-end ownership, enable faster AI adoption. He walks through running parallel agents to explore multiple solutions simultaneously, building custom tools on top of ACP (Agent Client Protocol), and sharing agent configurations across 120+ engineers via a Git + S3 pipeline. The conversation also covers the Renaissance Developer mindset that Werner Vogels introduced at re:Invent 2025: curiosity, ownership, systems thinking, communication, and experimentation. Ryan shares how Motorway embraces this philosophy by encouraging engineers to build their own tools, experiment with new technologies in parallel, and focus engineering time on design and planning rather than writing code. Whether you are scaling AI coding assistants across a large engineering org, building quality gates for agentic development, or rethinking how your team ceremonies and processes should evolve in the age of AI, this episode offers a practitioner's blueprint from someone delivering measurable results: 250% more deployments, 4x engineering throughput, and no uptick in production incidents.
    1 hr and 2 mins
  • Dark Factories: Why Your AI Coding Setup Is Already Outdated
    May 20 2026
    You're using Copilot. Maybe you've tried Cursor or Claude Code. But what if that's already the tail end of the AI wave? In this episode, Romain sits down with Christian Weichel, CTO and co-founder of Ona (formerly Gitpod), to explore 'dark factories' — autonomous AI agents that pick up work, write code, open PRs, and ship fixes while you sleep. No laptop required. Chris shares how his team of ~20 engineers went from 450 open pull requests to a streamlined, auto-approving system — all while staying SOC 2 compliant. He walks through the 3 stages of AI in the SDLC (better autocomplete → software conductor → background agents), the governance model that makes background agents safe for regulated enterprises, and why terminal-based coding agents' days are numbered. The conversation covers the risk ladder approach to auto-approving PRs, how isolated cloud development environments provide the security and autonomy agents need to operate safely, multi-agent code review with meta-reflection, and why accelerating implementation without accelerating review creates a bottleneck that breaks teams. Christian also shares his perspective on architecture governance, cognitive load management when running parallel agents, and why the future of IDEs will look different but won't disappear. Whether you are adopting AI coding assistants, building governance frameworks for agentic development, or exploring how background agents can automate your SDLC end-to-end, this episode offers a practitioner's view from someone who's been shipping with autonomous agents in production.
    49 mins