Chapter 3: LLM Fundamentals

A craftsman must first sharpen his tools. Before we start building Agents, we need to deeply understand their "brain" — the Large Language Model (LLM).


Chapter Overview

This chapter explains how large language models work at an intuitive level, then systematically covers how to communicate with models effectively through Prompt Engineering, introduces common prompting strategies, and walks you through your first API call step by step. Finally, we dive into tokens and key sampling parameters like Temperature to help you truly "master" language models.
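As a preview of the first API call covered in section 3.4, the request boils down to a list of role-tagged messages plus a few parameters. The sketch below assumes the official `openai` Python SDK; the model name is a placeholder, and since actually sending the request needs an API key, only the payload is assembled here.

```python
# Sketch of the request behind a first chat-completion call (section 3.4).
# "gpt-4o-mini" is a placeholder model name. With the openai SDK this
# payload would be sent via client.chat.completions.create(**request);
# here we only build and inspect its shape.
request = {
    "model": "gpt-4o-mini",
    "messages": [
        {"role": "system", "content": "You are a concise teaching assistant."},
        {"role": "user", "content": "In one sentence, what is a token?"},
    ],
    "temperature": 0.7,  # sampling randomness, covered in section 3.5
}

roles = [m["role"] for m in request["messages"]]
assert roles == ["system", "user"]  # system prompt first, then the user turn
```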

Chapter Goals

After completing this chapter, you will be able to:

  • ✅ Intuitively understand how LLMs work (no math required)
  • ✅ Master the core principles and techniques of Prompt Engineering
  • ✅ Flexibly apply prompting strategies like Zero-shot, Few-shot, and CoT
  • ✅ Proficiently call the OpenAI API and common open-source model interfaces
  • ✅ Understand how parameters like Token and Temperature affect output
  • ✅ Understand the architectural components of mainstream models (MHA/GQA/MLA, RoPE, SwiGLU, MoE) and 2026 breakthroughs (Hybrid Attention, Attention Residuals, MuonClip, Engram Memory)
  • ✅ Master the latest advances in foundation models and model selection strategies for Agent development
  • ✅ Understand the core principles of SFT and RL training data preparation: data volume selection, quality evaluation, and reward function design

Chapter Structure

| Section | Content | Difficulty |
|---------|---------|------------|
| 3.1 How Does an LLM Work? | Intuitive understanding of Transformers, pre-training, and emergent abilities | ⭐⭐ |
| 3.2 Prompt Engineering | System messages, role-playing, structured output | ⭐⭐ |
| 3.3 Prompting Strategies | Zero-shot, Few-shot, CoT, ToT | ⭐⭐⭐ |
| 3.4 Model API Basics | OpenAI SDK, open-source models, streaming | ⭐⭐ |
| 3.5 Tokens & Model Parameters | Token counting, Temperature, Top-p, etc. | ⭐⭐ |
| 3.6 Foundation Model Landscape | Industry landscape, model ecosystem (Kimi K2/K2.5, DeepSeek V4, Qwen3.5), Agent selection guide | ⭐⭐⭐ |
| 3.7 Foundation Model Architecture | MHA→GQA→MLA, RoPE, SwiGLU, MoE, and 2026 breakthroughs: Hybrid Attention, Attention Residuals, MuonClip, Engram Memory | ⭐⭐⭐⭐ |
| 3.8 SFT & RL Training Data Preparation | Data volume selection, quality evaluation, SFT data creation, RL reward function design, difficulty calibration and curriculum learning | ⭐⭐⭐ |
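To make the Temperature parameter from section 3.5 concrete before diving in: a model scores every candidate next token with a logit, and temperature rescales those logits before they are turned into probabilities. The logit values below are made up for illustration; the mechanism is standard softmax with temperature scaling.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw next-token logits into a probability distribution.

    Lower temperature sharpens the distribution (more deterministic
    output); higher temperature flattens it (more diverse output).
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens

cold = softmax_with_temperature(logits, 0.2)  # near-greedy sampling
hot = softmax_with_temperature(logits, 2.0)   # much flatter distribution

# The top token dominates at low temperature and loses probability
# mass at high temperature.
assert cold[0] > hot[0]
```

This is the same intuition behind the "unstable output" symptom below: at any temperature above zero, the second- and third-best tokens keep a nonzero chance of being sampled.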

Core Concepts at a Glance

LLM Core Concepts Overview

Why Do Agent Developers Need to Understand LLMs?

Many Agent frameworks (LangChain, LangGraph, etc.) wrap model calls very cleanly, allowing beginners to get started quickly. But when your Agent encounters the following issues, understanding the underlying LLM mechanisms becomes critical:

  • Unstable output — the same question gets different answers
  • Model "hallucination" — confidently giving wrong answers
  • Token limit exceeded — long conversations get truncated
  • High costs — needing to optimize Prompts to reduce consumption
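The token-limit and cost issues above usually surface as silently truncated conversations. A common mitigation is to trim the oldest turns while keeping the system prompt. The sketch below uses a crude chars-per-token heuristic (real counts come from the model's tokenizer, as section 3.5 covers); the function name and ratio are illustrative assumptions.

```python
def trim_history(messages, max_tokens, chars_per_token=4):
    """Drop the oldest non-system messages until the conversation fits
    a rough token budget.

    The chars-per-token ratio is a crude heuristic, not a real
    tokenizer count; it is enough to show why long conversations
    get truncated and why the system prompt should be protected.
    """
    def estimate(msgs):
        return sum(len(m["content"]) // chars_per_token + 1 for m in msgs)

    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    while rest and estimate(system + rest) > max_tokens:
        rest.pop(0)  # the oldest turn is dropped first
    return system + rest

history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "First question, " * 20},
    {"role": "assistant", "content": "First answer, " * 20},
    {"role": "user", "content": "Follow-up question?"},
]
trimmed = trim_history(history, max_tokens=40)
assert trimmed[0]["role"] == "system"  # system prompt is always preserved
```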

Understanding LLMs is like understanding how an engine works — even if you don't build engines, knowing the principles makes you a better driver.

🔗 Learning Path

Prerequisites: Chapter 1: What is an Agent?, Chapter 2: Development Environment Setup

Recommended next steps:

  • Next section: 3.1 How Does an LLM Work? (Intuitive Understanding)