Built with Agentic Engineering -- The New Way Software Gets Made
Forget vibe coding. This is a real workflow: plan with agents, verify with QA, review as a human, ship to production. Here is what it actually looks like in practice.
The full CI/CD pipeline for agentic code: spec to deploy in one workflow. Real examples from luonghongthuan.com and CubLearn. How to build the system that runs while you sleep.
The finale. Optimize costs, match models to agents, calculate ROI, and make your AI team production-ready.
Complete hands-on guide to building a production-ready agentic AI system. From project setup to deployment — every layer implemented with working code, tests, and Docker compose.
A complete technical guide to building a profitable agentic AI system using only open-source tools — with retrieval, orchestration, tool use, and observability. Includes architecture diagrams and real cost analysis.
Microsoft released the Agent Governance Toolkit on April 3, 2026 — seven packages covering policy enforcement, cryptographic identity, runtime isolation, and compliance automation. Here's a practical breakdown from someone building production agents.
90% of developers now use AI at work. But the real shift in March 2026 is agents moving from suggestion-mode to autonomous execution. Here's what that actually looks like in production systems and what breaks when you go too far too fast.
Mistral Large 3's MoE architecture delivers 92% of GPT-5.2's performance at 15% of the cost. As a technical lead who has run open-source LLMs in production, here's where it works and where it fails.
Gartner says 40% of enterprise apps will embed AI agents this year. But 40% of agentic projects will be scrapped by 2027. Here's what separates the teams that ship production agents from those that get stuck in pilots forever.
Deep-dive into 26+ real production issues with Pipecat voice agents — latency, audio quality, memory leaks, VAD problems, and pipeline freezes — plus battle-tested optimization strategies for building scalable voice AI systems.
The complete deployment guide: Docker multi-stage builds, Kubernetes orchestration, CI/CD with GitHub Actions, zero-downtime deploys, go-live checklist, production monitoring with Prometheus/Grafana, and the operational runbook that keeps voice AI running at scale.
Multi-language voice AI for research: language detection, provider routing (Gemini Live for 30+ languages, OpenAI Realtime for English), locale-aware VAD tuning, i18n prompt packs, and cross-language analysis pipelines.
Launch day is just the beginning. Privacy-compliant analytics, crash reporting that respects child data, and a feedback loop that actually improves the product.
Scaling from 10 sessions/week to 200 concurrent sessions. The enrichment bottleneck (30,000 API calls), session recovery for dropped WebRTC connections, provider failover, and the operational metrics that keep it all visible.
Real-time per-minute cost tracking, provider comparison (OpenAI Realtime ~$0.053/min vs Gemini Live ~$0.029/min), budget enforcement with soft/hard limits, and the self-hosting math that saves 90% on transport.
The 3-stage automatic pipeline that turns raw interview recordings into enriched, queryable research data in 3-7 minutes. Transcription, enrichment, analysis — with the transcript batching trick that cut DB load by 80%.
Putting AI into production is nothing like building a demo. Two tech leads discuss costs, hallucinations, latency, guardrails, and what actually breaks when real users hit your AI features.
Research interviews follow structured protocols with distinct phases. How to build an LLM-driven state machine with next_phase() function calling and dynamic instruction swapping via set_chat_ctx().
Your demo works. Now make it production-ready for 1,000 concurrent interviews. Full guide to deployment options, WebSocket scaling, Azure pricing breakdown, cost estimations, and monitoring for Azure Voice Live.
The production pain points nobody warns you about: zombie agents, metadata latency, pre-warming for 1-2s time-to-first-voice, VAD tuning for research respondents, and provider quirks.
Why research interviews need server-side voice agents, the three-tier architecture, room metadata as configuration transport, and the 100-500ms propagation latency nobody tells you about.
Take Kids Learn production-ready: multi-region Active-Passive with Aurora Global Database, Route 53 health-checked failover, auto-scaling strategies, load testing with Artillery, chaos engineering, runbooks, and a comprehensive launch checklist.