Chapter 18: Deployment and Production
Getting code to run locally is just the first step. The real challenge is making Agents serve users reliably in production.
Chapter Overview
This chapter covers the complete path from development to production for Agents: deployment architecture design, API service wrapping, Docker containerization, streaming responses and concurrency handling, and ultimately building a complete production-grade Agent service.
Chapter Goals
- ✅ Understand the layered deployment architecture for production-grade Agents
- ✅ Wrap an Agent as an API service using FastAPI
- ✅ Orchestrate multi-service deployment with Docker Compose
- ✅ Implement streaming responses and high-concurrency handling
- ✅ Complete an end-to-end deployment of a production-grade Agent
Chapter Structure
| Section | Content |
|---|---|
| 18.1 Agent Application Deployment Architecture | Layered architecture, state management |
| 18.2 API Service Wrapping | FastAPI encapsulation, SSE streaming |
| 18.3 Containerization and Cloud Deployment | Dockerfile, Docker Compose |
| 18.4 Streaming Responses and Concurrency | Async, semaphores, queues |
| 18.5 Practice: Production-Grade Agent Service | Complete deployment workflow |
⏱️ Estimated Study Time
Approximately 120–150 minutes (including deployment practice)
💡 Prerequisites
- Completed Chapters 16–17 on evaluation and security
- Familiarity with HTTP APIs and REST concepts
- Basic understanding of Docker (expertise not required)
🔗 Learning Path
Prerequisites: Chapter 16: Evaluation and Optimization, Chapter 17: Security and Reliability
Recommended Next Steps:
- 👉 Chapter 19: AI Coding Assistant — Comprehensive project practice
- 👉 Chapter 20: Data Analysis Agent — Comprehensive project practice