Chapter 19: Security and Reliability

Because agents have tools and autonomy, security becomes especially important. This chapter is the key to moving an agent from "usable" to "trustworthy."


Chapter Overview

Agent security and reliability involve five aspects: prompt injection defense, hallucination control, permission management, data protection, and behavioral alignment. This chapter explains the principles behind each aspect and the practical defenses for it.

Chapter Goals

  • ✅ Understand the principles and defense strategies for prompt injection
  • ✅ Master techniques to reduce hallucinations and improve factuality
  • ✅ Design a least-privilege system and code execution sandbox
  • ✅ Implement sensitive data detection and desensitization
  • ✅ Build behavioral boundaries and rejection policies

Chapter Structure

Section | Content
19.1 Prompt Injection Attacks and Defense | Attack techniques, multi-layer defense
19.2 Hallucination and Factuality Assurance | Citation verification, RAG validation
19.3 Permission Control and Sandbox Isolation | Least privilege, code sandbox
19.4 Sensitive Data Protection | PII detection, data desensitization
19.5 Controllability and Alignment of Agent Behavior | Behavioral boundaries, rejection policies
19.6 Paper Readings: Frontier Research in Security and Reliability | Academic frontiers

⏱️ Estimated Study Time

Approximately 90–120 minutes

💡 Prerequisites

  • Completed Chapter 18 (Evaluation and Optimization)
  • Familiarity with common web security concepts (such as injection attacks) is helpful

🔗 Learning Path

Prerequisites: Chapter 18: Evaluation and Optimization

Recommended next steps:


Next section: 19.1 Prompt Injection Attacks and Defense