Chapter 18: Deployment and Production

Getting code to run locally is just the first step. The real challenge is making Agents serve users reliably in production.


Chapter Overview

This chapter covers the complete path from development to production for Agents: deployment architecture design, API service wrapping, Docker containerization, streaming responses and concurrency handling, and ultimately building a complete production-grade Agent service.

Chapter Goals

  • ✅ Understand the layered deployment architecture for production-grade Agents
  • ✅ Wrap an Agent as an API service using FastAPI
  • ✅ Orchestrate multi-service deployment with Docker Compose
  • ✅ Implement streaming responses and high-concurrency handling
  • ✅ Complete an end-to-end deployment of a production-grade Agent

Chapter Structure

Section | Content
18.1 Agent Application Deployment Architecture | Layered architecture, state management
18.2 API Service Wrapping | FastAPI encapsulation, SSE streaming
18.3 Containerization and Cloud Deployment | Dockerfile, Docker Compose
18.4 Streaming Responses and Concurrency | Async, semaphores, queues
18.5 Practice: Production-Grade Agent Service | Complete deployment workflow

⏱️ Estimated Study Time

Approximately 120–150 minutes (including deployment practice)

💡 Prerequisites

  • Completed Chapters 16–17 on evaluation and security
  • Familiarity with HTTP APIs and REST concepts
  • Basic understanding of Docker (expertise not required)

🔗 Learning Path

Prerequisites: Chapter 16: Evaluation and Optimization, Chapter 17: Security and Reliability

Recommended Next Steps:


Next: 18.1 Agent Application Deployment Architecture