Project Setup: BaseAgent, State, and Shared Infrastructure
Build the foundation: BaseAgent class, TeamState, agent config YAML, SQLite memory store, and tool registry.
A complete hands-on guide to building a production-ready agentic AI system. From project setup to deployment, every layer is implemented with working code, tests, and Docker Compose.
Everything about Qwen's voice ecosystem — Qwen3-TTS, Qwen3-ASR, Qwen3-Omni, Qwen-Agent — and step-by-step implementation of Voice Interview Agent, Language Tutor, and Voice Tutor systems with full production code.
Anthropic acquired Bun in December 2025. OpenAI acquired Astral (uv, Ruff) in March 2026. The AI model war has a new front: owning the developer toolchain. Here's what this arms race means for how you build software.
The complete deployment guide: Docker multi-stage builds, Kubernetes orchestration, CI/CD with GitHub Actions, zero-downtime deploys, go-live checklist, production monitoring with Prometheus/Grafana, and the operational runbook that keeps voice AI running at scale.
Multi-language voice AI for research: language detection, provider routing (Gemini Live for 30+ languages, OpenAI Realtime for English), locale-aware VAD tuning, i18n prompt packs, and cross-language analysis pipelines.
Scaling from 10 sessions/week to 200 concurrent sessions. The enrichment bottleneck (30,000 API calls), session recovery for dropped WebRTC connections, provider failover, and the operational metrics that keep it all visible.
Real-time per-minute cost tracking, provider comparison (OpenAI Realtime ~$0.053/min vs Gemini Live ~$0.029/min), budget enforcement with soft/hard limits, and the self-hosting math that saves 90% on transport.
The 3-stage automatic pipeline that turns raw interview recordings into enriched, queryable research data in 3-7 minutes. Transcription, enrichment, analysis — with the transcript batching trick that cut DB load by 80%.
Research interviews follow structured protocols with distinct phases. How to build an LLM-driven state machine with next_phase() function calling and dynamic instruction swapping via set_chat_ctx().
The production pain points nobody warns you about: zombie agents, metadata latency, pre-warming for 1-2s time-to-first-voice, VAD tuning for research respondents, and provider quirks.
Why research interviews need server-side voice agents, the three-tier architecture, room metadata as configuration transport, and the 100-500ms propagation latency nobody tells you about.
Three personas, one infrastructure. How to build an AI interviewer that asks questions, a coach that gives feedback, and an evaluator that scores fairly — all with system prompts, function calling, and state machines.
Deepgram Nova-3 or Whisper for STT? Gemini Flash or GPT-4o for conversation? ElevenLabs or Cartesia for TTS? Real benchmarks, real costs, and the combinations that actually work in production.
LiveKit gives you WebRTC infrastructure. Pipecat gives you pipeline flexibility. Direct integration gives you simplicity. Here's how to choose for your voice interview platform.
A practical build guide for a speech-to-speech interview agent using LiveKit MultimodalAgent, OpenAI Realtime, and Gemini Live. Dynamic system prompts, 3 personas, function calling, and provider switching — no cascaded pipeline needed.
Lessons from building a real-time AI interviewer with LiveKit, OpenAI Realtime, Gemini Live, and Bedrock Nova. VAD tuning, provider failover, latency budgets, and the things nobody warns you about.
I experimented with CrewAI and LangGraph to build multi-agent workflows. Some were genuinely useful, some were expensive toys. Here's what I learned.
We built a RAG knowledge base. The first version gave wrong answers half the time. Six months of iteration later, it actually works. Here's every lesson.
We tested ChromaDB, Pinecone, Weaviate, Qdrant, and pgvector for our RAG system. Benchmarks, costs, developer experience — and the one I'd pick today.