diff --git a/README.md b/README.md
index 169e6d9..1a31f0f 100644
--- a/README.md
+++ b/README.md
@@ -34,7 +34,7 @@ Download it and try it out for free!  **https://piebald.ai/**
 > [!important]
 > **NEW (January 23, 2026): We've added all of Claude Code's ~40 system reminders to this list&mdash;see [System Reminders](#system-reminders).**
 
-This repository contains an up-to-date list of all Claude Code's various system prompts and their associated token counts as of **[Claude Code v2.1.169](https://www.npmjs.com/package/@anthropic-ai/claude-code/v/2.1.169) (June 8th, 2026).**  It also contains a [**CHANGELOG.md**](./CHANGELOG.md) for the system prompts across 203 versions since v2.0.14.  From the team behind [<img src="https://github.com/Piebald-AI/piebald/raw/main/assets/logo.svg" width="15"> **Piebald.**](https://piebald.ai/)
+This repository contains an up-to-date list of all Claude Code's various system prompts and their associated token counts as of **[Claude Code v2.1.170](https://www.npmjs.com/package/@anthropic-ai/claude-code/v/2.1.170) (June 9th, 2026).**  It also contains a [**CHANGELOG.md**](./CHANGELOG.md) for the system prompts across 204 versions since v2.0.14.  From the team behind [<img src="https://github.com/Piebald-AI/piebald/raw/main/assets/logo.svg" width="15"> **Piebald.**](https://piebald.ai/)
 
 **This repository is updated within minutes of each Claude Code release.  See the [changelog](./CHANGELOG.md), and follow [@PiebaldAI](https://x.com/PiebaldAI) on X for a summary of the system prompt changes in each release.**
 
@@ -140,17 +140,17 @@ The content of various template files embedded in Claude Code.
 - [Data: Anthropic CLI](./system-prompts/data-anthropic-cli.md) (**4615** tks) - Reference documentation for the ant CLI covering installation, authentication, command structure, input and output shaping, managed agents workflows, and scripting patterns.
 - [Data: Assistant voice and values template](./system-prompts/data-assistant-voice-and-values-template.md) (**454** tks) - Template content for an assistant.md file describing Claude's voice, values, and communication style.
 - [Data: Claude API reference — C#](./system-prompts/data-claude-api-reference-c.md) (**4710** tks) - C# SDK reference including installation, client initialization, basic requests, streaming, and tool use.
-- [Data: Claude API reference — Go](./system-prompts/data-claude-api-reference-go.md) (**4521** tks) - Go SDK reference.
+- [Data: Claude API reference — Go](./system-prompts/data-claude-api-reference-go.md) (**4572** tks) - Go SDK reference.
 - [Data: Claude API reference — Java](./system-prompts/data-claude-api-reference-java.md) (**4732** tks) - Java SDK reference including installation, client initialization, basic requests, streaming, and beta tool use.
 - [Data: Claude API reference — PHP](./system-prompts/data-claude-api-reference-php.md) (**3691** tks) - PHP SDK reference.
-- [Data: Claude API reference — Python](./system-prompts/data-claude-api-reference-python.md) (**4909** tks) - Python SDK reference including installation, client initialization, basic requests, thinking, and multi-turn conversation.
+- [Data: Claude API reference — Python](./system-prompts/data-claude-api-reference-python.md) (**4934** tks) - Python SDK reference including installation, client initialization, basic requests, thinking, and multi-turn conversation.
 - [Data: Claude API reference — Ruby](./system-prompts/data-claude-api-reference-ruby.md) (**1094** tks) - Ruby SDK reference including installation, client initialization, basic requests, streaming, and beta tool runner.
-- [Data: Claude API reference — TypeScript](./system-prompts/data-claude-api-reference-typescript.md) (**3477** tks) - TypeScript SDK reference including installation, client initialization, basic requests, thinking, and multi-turn conversation.
-- [Data: Claude API reference — cURL](./system-prompts/data-claude-api-reference-curl.md) (**2220** tks) - Raw API reference for Claude API for use with cURL or else Raw HTTP.
+- [Data: Claude API reference — TypeScript](./system-prompts/data-claude-api-reference-typescript.md) (**3502** tks) - TypeScript SDK reference including installation, client initialization, basic requests, thinking, and multi-turn conversation.
+- [Data: Claude API reference — cURL](./system-prompts/data-claude-api-reference-curl.md) (**2239** tks) - Raw API reference for Claude API for use with cURL or else Raw HTTP.
 - [Data: Claude Code live documentation sources](./system-prompts/data-claude-code-live-documentation-sources.md) (**1380** tks) - WebFetch URLs for fetching current Claude Code documentation from official sources.
 - [Data: Claude Code recent changes reference](./system-prompts/data-claude-code-recent-changes-reference.md) (**528** tks) - Reference mapping of recently removed or renamed Claude Code commands, flags, and terms to their current replacements.
 - [Data: Claude Platform on AWS reference](./system-prompts/data-claude-platform-on-aws-reference.md) (**1158** tks) - Reference documentation for using the Claude Developer Platform through AWS infrastructure, including AnthropicAWS clients, required region and workspace configuration, SigV4 authentication, and short-term API keys.
-- [Data: Claude model catalog](./system-prompts/data-claude-model-catalog.md) (**2507** tks) - Catalog of current and legacy Claude models with exact model IDs, aliases, context windows, and pricing.
+- [Data: Claude model catalog](./system-prompts/data-claude-model-catalog.md) (**2678** tks) - Catalog of current and legacy Claude models with exact model IDs, aliases, context windows, and pricing.
 - [Data: Cowork plugin MCP discovery and connection](./system-prompts/data-cowork-plugin-mcp-discovery-and-connection.md) (**1338** tks) - Reference guidance for finding MCP connectors during plugin customization, using search and suggestion tools, mapping categories to keywords, and writing .mcp.json entries.
 - [Data: Cowork plugin component schemas](./system-prompts/data-cowork-plugin-component-schemas.md) (**3109** tks) - Reference documentation for Cowork plugin component formats, including skills, agents, hooks, MCP servers, legacy commands, CONNECTORS.md, and README.md.
 - [Data: Cowork plugin examples](./system-prompts/data-cowork-plugin-examples.md) (**2323** tks) - Reference examples of minimal, medium, and complex Cowork plugin structures with plugin metadata, skills, agents, hooks, MCP config, README, and connectors.
@@ -161,7 +161,7 @@ The content of various template files embedded in Claude Code.
 - [Data: Files API reference — TypeScript](./system-prompts/data-files-api-reference-typescript.md) (**797** tks) - TypeScript Files API reference including file upload, listing, deletion, and usage in messages.
 - [Data: GitHub Actions workflow for @claude mentions](./system-prompts/data-github-actions-workflow-for-claude-mentions.md) (**525** tks) - GitHub Actions workflow template for triggering Claude Code via @claude mentions.
 - [Data: GitHub App installation PR description](./system-prompts/data-github-app-installation-pr-description.md) (**409** tks) - Template for PR description when installing Claude Code GitHub App integration.
-- [Data: HTTP error codes reference](./system-prompts/data-http-error-codes-reference.md) (**2508** tks) - Reference for HTTP error codes returned by the Claude API with common causes and handling strategies.
+- [Data: HTTP error codes reference](./system-prompts/data-http-error-codes-reference.md) (**2631** tks) - Reference for HTTP error codes returned by the Claude API with common causes and handling strategies.
 - [Data: Knowledge MCP search strategies](./system-prompts/data-knowledge-mcp-search-strategies.md) (**447** tks) - Reference query patterns for using knowledge MCPs to discover organization-specific tool names, project identifiers, team names, and workflow details during plugin customization.
 - [Data: Live documentation sources](./system-prompts/data-live-documentation-sources.md) (**4180** tks) - WebFetch URLs for fetching current Claude API and Agent SDK documentation from official sources.
 - [Data: Managed Agents client patterns](./system-prompts/data-managed-agents-client-patterns.md) (**2685** tks) - Reference guide of common client-side patterns for driving Managed Agent sessions, including stream reconnection, idle-break gating, tool confirmations, interrupts, and custom tools.
@@ -180,13 +180,11 @@ The content of various template files embedded in Claude Code.
 - [Data: Managed Agents tools and skills](./system-prompts/data-managed-agents-tools-and-skills.md) (**4101** tks) - Reference documentation covering the Managed Agents SDK's tool types (agent toolset, MCP, custom), permission policies, vault credential management, and skills API for building specialized agents.
 - [Data: Managed Agents webhooks](./system-prompts/data-managed-agents-webhooks.md) (**1439** tks) - Reference documentation for Managed Agents webhooks, including endpoint registration, signature verification, payload envelopes, supported event types, delivery behavior, and pitfalls.
 - [Data: Message Batches API reference — Python](./system-prompts/data-message-batches-api-reference-python.md) (**1635** tks) - Python Batches API reference including batch creation, status polling, and result retrieval at 50% cost.
-- [Data: Prompt Caching — Design & Optimization](./system-prompts/data-prompt-caching-design-optimization.md) (**3914** tks) - Document on how to design prompt-building code for effective caching, including placement patterns and anti-patterns.
-- [Data: Streaming reference — Python](./system-prompts/data-streaming-reference-python.md) (**1668** tks) - Python streaming reference including sync/async streaming and handling different content types.
-- [Data: Streaming reference — TypeScript](./system-prompts/data-streaming-reference-typescript.md) (**1620** tks) - TypeScript streaming reference including basic streaming and handling different content types.
-- [Data: Superseded message UUID protocol note](./system-prompts/data-superseded-message-uuid-protocol-note.md) (**147** tks) - Internal protocol note explaining how supersedes UUIDs mark previously delivered messages as canonical replacements during refusal fallback handling.
-- [Data: Supported dialog kinds protocol note](./system-prompts/data-supported-dialog-kinds-protocol-note.md) (**153** tks) - Internal protocol note describing supported request_user_dialog kinds, fail-closed behavior, and the staged-release gate.
+- [Data: Prompt Caching — Design & Optimization](./system-prompts/data-prompt-caching-design-optimization.md) (**3927** tks) - Document on how to design prompt-building code for effective caching, including placement patterns and anti-patterns.
+- [Data: Streaming reference — Python](./system-prompts/data-streaming-reference-python.md) (**1675** tks) - Python streaming reference including sync/async streaming and handling different content types.
+- [Data: Streaming reference — TypeScript](./system-prompts/data-streaming-reference-typescript.md) (**1627** tks) - TypeScript streaming reference including basic streaming and handling different content types.
 - [Data: Token counting reference](./system-prompts/data-token-counting-reference.md) (**486** tks) - Reference documentation for counting Claude model tokens with the Messages count_tokens endpoint and Anthropic SDK or CLI examples, including warnings against OpenAI tokenizers.
-- [Data: Tool use concepts](./system-prompts/data-tool-use-concepts.md) (**4431** tks) - Conceptual foundations of tool use with the Claude API including tool definitions, tool choice, and best practices.
+- [Data: Tool use concepts](./system-prompts/data-tool-use-concepts.md) (**4446** tks) - Conceptual foundations of tool use with the Claude API including tool definitions, tool choice, and best practices.
 - [Data: Tool use reference — Python](./system-prompts/data-tool-use-reference-python.md) (**5106** tks) - Python tool use reference including tool runner, manual agentic loop, code execution, and structured outputs.
 - [Data: Tool use reference — TypeScript](./system-prompts/data-tool-use-reference-typescript.md) (**5033** tks) - TypeScript tool use reference including tool runner, manual agentic loop, code execution, and structured outputs.
 - [Data: User profile memory template](./system-prompts/data-user-profile-memory-template.md) (**232** tks) - Template content for the user profile memory file, covering personal details, work context, schedule, and communication preferences.
@@ -430,7 +428,7 @@ Built-in skill prompts for specialized tasks.
 - [Skill: /stuck slash command](./system-prompts/skill-stuck-slash-command.md) (**964** tks) - Diagnose frozen or slow Claude Code sessions.
 - [Skill: Agent Design Patterns](./system-prompts/skill-agent-design-patterns.md) (**2029** tks) - Reference guide covering decision heuristics for building agents on the Claude API, including tool surface design, context management, caching strategies, and composing tool calls.
 - [Skill: Build with Claude API (reference guide)](./system-prompts/skill-build-with-claude-api-reference-guide.md) (**703** tks) - Template for presenting language-specific reference documentation with quick task navigation.
-- [Skill: Building LLM-powered applications with Claude](./system-prompts/skill-building-llm-powered-applications-with-claude.md) (**9367** tks) - Guides Claude in building LLM-powered applications using the Anthropic SDK, covering language detection, API surface selection (Claude API vs Managed Agents), model defaults, thinking/effort configuration, and language-specific documentation reading.
+- [Skill: Building LLM-powered applications with Claude](./system-prompts/skill-building-llm-powered-applications-with-claude.md) (**9626** tks) - Guides Claude in building LLM-powered applications using the Anthropic SDK, covering language detection, API surface selection (Claude API vs Managed Agents), model defaults, thinking/effort configuration, and language-specific documentation reading.
 - [Skill: Claude Code configuration guide](./system-prompts/skill-claude-code-configuration-guide.md) (**975** tks) - Skill instructions for answering Claude Code configuration questions by checking the running build, bundled references, and current documentation.
 - [Skill: Computer Use MCP](./system-prompts/skill-computer-use-mcp.md) (**1206** tks) - Instructions for using computer-use MCP tools including tool selection tiers, app access tiers, link safety, and financial action restrictions.
 - [Skill: Cowork plugin authoring](./system-prompts/skill-cowork-plugin-authoring.md) (**4791** tks) - Skill instructions for creating or customizing Cowork plugins, including mode selection, research, implementation, packaging, connector replacement, and plugin delivery.
diff --git a/system-prompts/data-claude-api-reference-curl.md b/system-prompts/data-claude-api-reference-curl.md
index 45b6b10..7460c52 100644
--- a/system-prompts/data-claude-api-reference-curl.md
+++ b/system-prompts/data-claude-api-reference-curl.md
@@ -1,7 +1,7 @@
 <!--
 name: 'Data: Claude API reference — cURL'
 description: Raw API reference for Claude API for use with cURL or else Raw HTTP
-ccVersion: 2.1.154
+ccVersion: 2.1.170
 -->
 # Claude API — cURL / Raw HTTP
 
@@ -187,11 +187,11 @@ For 1-hour TTL: `"cache_control": {"type": "ephemeral", "ttl": "1h"}`. Top-level
 
 ## Extended Thinking
 
-> **Opus 4.8, Opus 4.7, Opus 4.6, and Sonnet 4.6:** Use adaptive thinking. `budget_tokens` is removed on Opus 4.8 and 4.7 (400 if sent); deprecated on Opus 4.6 and Sonnet 4.6.
+> **Fable 5, Opus 4.8, Opus 4.7, Opus 4.6, and Sonnet 4.6:** Use adaptive thinking. `budget_tokens` is removed on Fable 5, Opus 4.8, and 4.7 (400 if sent); deprecated on Opus 4.6 and Sonnet 4.6.
 > **Older models:** Use `"type": "enabled"` with `"budget_tokens": N` (must be < `max_tokens`, min 1024).
 
 ```bash
-# Opus 4.8 / 4.7 / 4.6: adaptive thinking (recommended)
+# Fable 5 / Opus 4.8 / 4.7 / 4.6: adaptive thinking (recommended)
 curl https://api.anthropic.com/v1/messages \
   -H "Content-Type: application/json" \
   -H "x-api-key: $ANTHROPIC_API_KEY" \
diff --git a/system-prompts/data-claude-api-reference-go.md b/system-prompts/data-claude-api-reference-go.md
index 760d7c2..df5a566 100644
--- a/system-prompts/data-claude-api-reference-go.md
+++ b/system-prompts/data-claude-api-reference-go.md
@@ -1,7 +1,7 @@
 <!--
 name: 'Data: Claude API reference — Go'
 description: Go SDK reference
-ccVersion: 2.1.154
+ccVersion: 2.1.170
 -->
 # Claude API — Go
 
@@ -34,7 +34,7 @@ client := anthropic.NewClient(
 
 ## Model Constants
 
-The Go SDK provides typed model constants: `anthropic.ModelClaudeOpus4_8`, `anthropic.ModelClaudeOpus4_7`, `anthropic.ModelClaudeSonnet4_6`, `anthropic.ModelClaudeHaiku4_5_20251001`. Use `ModelClaudeOpus4_8` unless the user specifies otherwise.
+The Go SDK provides typed model constants: `anthropic.ModelClaudeFable5`, `anthropic.ModelClaudeOpus4_8`, `anthropic.ModelClaudeOpus4_7`, `anthropic.ModelClaudeSonnet4_6`, `anthropic.ModelClaudeHaiku4_5_20251001`. Use `ModelClaudeOpus4_8` unless the user specifies otherwise; if they ask for Fable or the most powerful model, use `anthropic.ModelClaudeFable5` (see `shared/models.md` for the full resolution table).
 
 ---
 
diff --git a/system-prompts/data-claude-api-reference-python.md b/system-prompts/data-claude-api-reference-python.md
index 7c93cec..e1ddece 100644
--- a/system-prompts/data-claude-api-reference-python.md
+++ b/system-prompts/data-claude-api-reference-python.md
@@ -1,7 +1,7 @@
 <!--
 name: 'Data: Claude API reference — Python'
 description: Python SDK reference including installation, client initialization, basic requests, thinking, and multi-turn conversation
-ccVersion: 2.1.154
+ccVersion: 2.1.170
 -->
 # Claude API — Python
 
@@ -255,11 +255,11 @@ If `cache_read_input_tokens` is zero across repeated identical-prefix requests,
 
 ## Extended Thinking
 
-> **Opus 4.8, Opus 4.7, Opus 4.6, and Sonnet 4.6:** Use adaptive thinking. `budget_tokens` is removed on Opus 4.8 and 4.7 (400 if sent); deprecated on Opus 4.6 and Sonnet 4.6.
+> **Fable 5, Opus 4.8, Opus 4.7, Opus 4.6, and Sonnet 4.6:** Use adaptive thinking. `budget_tokens` is removed on Fable 5, Opus 4.8, and 4.7 (400 if sent); deprecated on Opus 4.6 and Sonnet 4.6.
 > **Older models:** Use `thinking: {type: "enabled", budget_tokens: N}` (must be < `max_tokens`, min 1024).
 
 ```python
-# Opus 4.8 / 4.7 / 4.6: adaptive thinking (recommended)
+# Fable 5 / Opus 4.8 / 4.7 / 4.6: adaptive thinking (recommended)
 response = client.messages.create(
     model="{{OPUS_ID}}",
     max_tokens=16000,
@@ -386,7 +386,7 @@ response2 = conversation.send("What's my name?")  # Claude remembers "Alice"
 
 ### Compaction (long conversations)
 
-> **Beta, Opus 4.8, Opus 4.7, Opus 4.6, and Sonnet 4.6.** When conversations approach the 200K context window, compaction automatically summarizes earlier context server-side. The API returns a `compaction` block; you must pass it back on subsequent requests — append `response.content`, not just the text.
+> **Beta, Fable 5, Opus 4.8, Opus 4.7, Opus 4.6, and Sonnet 4.6.** When conversations approach the 200K context window, compaction automatically summarizes earlier context server-side. The API returns a `compaction` block; you must pass it back on subsequent requests — append `response.content`, not just the text.
 
 ```python
 import anthropic
diff --git a/system-prompts/data-claude-api-reference-typescript.md b/system-prompts/data-claude-api-reference-typescript.md
index b90d24b..5a8d51a 100644
--- a/system-prompts/data-claude-api-reference-typescript.md
+++ b/system-prompts/data-claude-api-reference-typescript.md
@@ -1,7 +1,7 @@
 <!--
 name: 'Data: Claude API reference — TypeScript'
 description: TypeScript SDK reference including installation, client initialization, basic requests, thinking, and multi-turn conversation
-ccVersion: 2.1.154
+ccVersion: 2.1.170
 -->
 # Claude API — TypeScript
 
@@ -201,11 +201,11 @@ If `cache_read_input_tokens` is zero across repeated identical-prefix requests,
 
 ## Extended Thinking
 
-> **Opus 4.8, Opus 4.7, Opus 4.6, and Sonnet 4.6:** Use adaptive thinking. `budget_tokens` is removed on Opus 4.8 and 4.7 (400 if sent); deprecated on Opus 4.6 and Sonnet 4.6.
+> **Fable 5, Opus 4.8, Opus 4.7, Opus 4.6, and Sonnet 4.6:** Use adaptive thinking. `budget_tokens` is removed on Fable 5, Opus 4.8, and 4.7 (400 if sent); deprecated on Opus 4.6 and Sonnet 4.6.
 > **Older models:** Use `thinking: {type: "enabled", budget_tokens: N}` (must be < `max_tokens`, min 1024).
 
 ```typescript
-// Opus 4.8 / 4.7 / 4.6: adaptive thinking (recommended)
+// Fable 5 / Opus 4.8 / 4.7 / 4.6: adaptive thinking (recommended)
 const response = await client.messages.create({
   model: "{{OPUS_ID}}",
   max_tokens: 16000,
@@ -281,7 +281,7 @@ const response = await client.messages.create({
 
 ### Compaction (long conversations)
 
-> **Beta, Opus 4.8, Opus 4.7, Opus 4.6, and Sonnet 4.6.** When conversations approach the 200K context window, compaction automatically summarizes earlier context server-side. The API returns a `compaction` block; you must pass it back on subsequent requests — append `response.content`, not just the text.
+> **Beta, Fable 5, Opus 4.8, Opus 4.7, Opus 4.6, and Sonnet 4.6.** When conversations approach the 200K context window, compaction automatically summarizes earlier context server-side. The API returns a `compaction` block; you must pass it back on subsequent requests — append `response.content`, not just the text.
 
 ```typescript
 import Anthropic from "@anthropic-ai/sdk";
diff --git a/system-prompts/data-claude-model-catalog.md b/system-prompts/data-claude-model-catalog.md
index 2f4a6ef..628b84c 100644
--- a/system-prompts/data-claude-model-catalog.md
+++ b/system-prompts/data-claude-model-catalog.md
@@ -1,7 +1,7 @@
 <!--
 name: 'Data: Claude model catalog'
 description: Catalog of current and legacy Claude models with exact model IDs, aliases, context windows, and pricing
-ccVersion: 2.1.154
+ccVersion: 2.1.170
 -->
 # Claude Model Catalog
 
@@ -62,6 +62,7 @@ curl https://api.anthropic.com/v1/models/claude-opus-4-8 \
 
 | Friendly Name     | Alias (use this)    | Full ID                       | Context        | Max Output | Status |
 |-------------------|---------------------|-------------------------------|----------------|------------|--------|
+| Claude Fable 5    | `{{FABLE_ID}}`      | —                             | 1M             | 128K       | Active |
 | Claude Opus 4.8   | `claude-opus-4-8`   | —                             | 1M             | 128K       | Active |
 | Claude Opus 4.7   | `claude-opus-4-7`   | —                             | 1M             | 128K       | Active |
 | Claude Opus 4.6   | `claude-opus-4-6`   | —                             | 1M             | 128K       | Active |
@@ -69,7 +70,8 @@ curl https://api.anthropic.com/v1/models/claude-opus-4-8 \
 | Claude Haiku 4.5  | `claude-haiku-4-5`  | `claude-haiku-4-5-20251001`   | 200K           | 64K        | Active |
 
 ### Model Descriptions
-- **Claude Opus 4.8** — The most capable Claude model to date — highly autonomous, state-of-the-art on long-horizon agentic work, knowledge work, and memory; clearer, warmer writing. Same API surface as Opus 4.7 (adaptive thinking only; sampling parameters and `budget_tokens` removed). 1M context window at standard API pricing (no long-context premium). See `shared/model-migration.md` → Migrating to Opus 4.8 — a 4.7 → 4.8 move is a model-ID swap plus prompt re-tuning, no new breaking changes.
+- **Claude Fable 5** — Our most powerful, most intelligent model. New tier above Opus. Same API surface as Opus 4.7/4.8 (see `shared/model-migration.md`) with one new breaking change: an explicit `thinking: {type: "disabled"}` returns a 400 (omit the `thinking` param instead); $10/$50 per MTok.
+- **Claude Opus 4.8** — The most capable Opus-tier model — highly autonomous, state-of-the-art on long-horizon agentic work, knowledge work, and memory; clearer, warmer writing. Same API surface as Opus 4.7 (adaptive thinking only; sampling parameters and `budget_tokens` removed). 1M context window at standard API pricing (no long-context premium). See `shared/model-migration.md` → Migrating to Opus 4.8 — a 4.7 → 4.8 move is a model-ID swap plus prompt re-tuning, no new breaking changes.
 - **Claude Opus 4.7** — Previous-generation Opus. Highly autonomous; strong on long-horizon agentic work, knowledge work, vision, and memory. Adaptive thinking only; sampling parameters and `budget_tokens` removed. 1M context window. See `shared/model-migration.md` → Migrating to Opus 4.7.
 - **Claude Opus 4.6** — Older Opus. Supports adaptive thinking (recommended), 128K max output tokens (requires streaming for large outputs). 1M context window.
 - **Claude Sonnet 4.6** — Our best combination of speed and intelligence. Supports adaptive thinking (recommended). 1M context window. 64K max output tokens.
@@ -110,7 +112,9 @@ When a user asks for a model by name, use this table to find the correct model I
 
 | User says...                              | Use this model ID              |
 |-------------------------------------------|--------------------------------|
-| "opus", "most powerful"                   | `claude-opus-4-8`              |
+| "fable"                                   | `{{FABLE_ID}}`                 |
+| "most powerful"                           | `{{FABLE_ID}}`                 |
+| "opus"                                    | `claude-opus-4-8`              |
 | "opus 4.8"                                | `claude-opus-4-8`              |
 | "opus 4.7"                                | `claude-opus-4-7`              |
 | "opus 4.6"                                | `claude-opus-4-6`              |
diff --git a/system-prompts/data-http-error-codes-reference.md b/system-prompts/data-http-error-codes-reference.md
index eca9e5c..c3e8fa3 100644
--- a/system-prompts/data-http-error-codes-reference.md
+++ b/system-prompts/data-http-error-codes-reference.md
@@ -1,7 +1,7 @@
 <!--
 name: 'Data: HTTP error codes reference'
 description: Reference for HTTP error codes returned by the Claude API with common causes and handling strategies
-ccVersion: 2.1.154
+ccVersion: 2.1.170
 -->
 # HTTP Error Codes Reference
 
@@ -112,10 +112,11 @@ Some 400 errors are specifically related to parameter validation:
 - `budget_tokens` >= `max_tokens` in extended thinking
 - Invalid tool definition schema
 
-**Model-specific 400s on Opus 4.8 / 4.7:**
+**Model-specific 400s on Fable 5 / Opus 4.8 / 4.7:**
 
 - `temperature`, `top_p`, `top_k` are removed — sending any of them returns 400. Delete the parameter; see `shared/model-migration.md` → Per-SDK Syntax Reference.
 - `thinking: {type: "enabled", budget_tokens: N}` is removed — sending it returns 400. Use `thinking: {type: "adaptive"}` instead.
+- **Fable 5 only:** an explicit `thinking: {type: "disabled"}` returns 400 (it is accepted on Opus 4.8/4.7). Omit the `thinking` param entirely instead.
 
 **Common mistake with extended thinking on older models (Opus 4.6 and earlier):**
 
@@ -173,8 +174,9 @@ thinking: budget_tokens=10000, max_tokens=16000
 
 | Mistake                         | Error            | Fix                                                     |
 | ------------------------------- | ---------------- | ------------------------------------------------------- |
-| `temperature`/`top_p`/`top_k` on Opus 4.8 / 4.7 | 400 | Remove the parameter (see `shared/model-migration.md`)  |
-| `budget_tokens` on Opus 4.8 / 4.7 | 400            | Use `thinking: {type: "adaptive"}`                      |
+| `temperature`/`top_p`/`top_k` on Fable 5 / Opus 4.8 / 4.7 | 400 | Remove the parameter (see `shared/model-migration.md`)  |
+| `budget_tokens` on Fable 5 / Opus 4.8 / 4.7 | 400  | Use `thinking: {type: "adaptive"}`                      |
+| `thinking: {type: "disabled"}` on Fable 5 | 400    | Omit the `thinking` param entirely (accepted on Opus 4.8/4.7) |
 | `budget_tokens` >= `max_tokens` (older models) | 400 | Ensure `budget_tokens` < `max_tokens`                  |
 | Typo in model ID                | 404              | Use valid model ID like `{{OPUS_ID}}`               |
 | First message is `assistant`    | 400              | First message must be `user`                            |
diff --git a/system-prompts/data-prompt-caching-design-optimization.md b/system-prompts/data-prompt-caching-design-optimization.md
index e4c66c1..1e7bc63 100644
--- a/system-prompts/data-prompt-caching-design-optimization.md
+++ b/system-prompts/data-prompt-caching-design-optimization.md
@@ -1,7 +1,7 @@
 <!--
 name: 'Data: Prompt Caching — Design & Optimization'
 description: Document on how to design prompt-building code for effective caching, including placement patterns and anti-patterns.
-ccVersion: 2.1.154
+ccVersion: 2.1.170
 -->
 # Prompt Caching — Design & Optimization
 
@@ -135,10 +135,10 @@ Fix by moving the dynamic piece after the last breakpoint, making it determinist
 | Model | Minimum |
 |---|---:|
 | Opus 4.8, Opus 4.7, Opus 4.6, Opus 4.5, Haiku 4.5 | 4096 tokens |
-| Sonnet 4.6, Haiku 3.5, Haiku 3 | 2048 tokens |
+| Fable 5, Sonnet 4.6, Haiku 3.5, Haiku 3 | 2048 tokens |
 | Sonnet 4.5, Sonnet 4.1, Sonnet 4, Sonnet 3.7 | 1024 tokens |
 
-A 3K-token prompt caches on Sonnet 4.5 but silently won't on Opus 4.8.
+A 3K-token prompt caches on Sonnet 4.5 and Fable 5 but silently won't on Opus 4.8.
 
 **Economics:** Cache reads cost ~0.1× base input price. Cache writes cost **1.25× for 5-minute TTL, 2× for 1-hour TTL**. Break-even depends on TTL: with 5-minute TTL, two requests break even (1.25× + 0.1× = 1.35× vs 2× uncached); with 1-hour TTL, you need at least three requests (2× + 0.2× = 2.2× vs 3× uncached). The 1-hour TTL keeps entries alive across gaps in bursty traffic, but the doubled write cost means it needs more reads to pay off.
 
diff --git a/system-prompts/data-streaming-reference-python.md b/system-prompts/data-streaming-reference-python.md
index b1ba372..b2f9299 100644
--- a/system-prompts/data-streaming-reference-python.md
+++ b/system-prompts/data-streaming-reference-python.md
@@ -1,7 +1,7 @@
 <!--
 name: 'Data: Streaming reference — Python'
 description: Python streaming reference including sync/async streaming and handling different content types
-ccVersion: 2.1.154
+ccVersion: 2.1.170
 -->
 # Streaming — Python
 
@@ -51,7 +51,7 @@ No final-message accumulation is done for you in this form.
 
 Claude may return text, thinking blocks, or tool use. Handle each appropriately:
 
-> **Opus 4.8 / Opus 4.7 / Opus 4.6:** Use `thinking: {type: "adaptive"}`. On older models, use `thinking: {type: "enabled", budget_tokens: N}` instead.
+> **Fable 5 / Opus 4.8 / Opus 4.7 / Opus 4.6:** Use `thinking: {type: "adaptive"}`. On older models, use `thinking: {type: "enabled", budget_tokens: N}` instead.
 
 ```python
 with client.messages.stream(
diff --git a/system-prompts/data-streaming-reference-typescript.md b/system-prompts/data-streaming-reference-typescript.md
index 141f18e..6ceb14c 100644
--- a/system-prompts/data-streaming-reference-typescript.md
+++ b/system-prompts/data-streaming-reference-typescript.md
@@ -1,7 +1,7 @@
 <!--
 name: 'Data: Streaming reference — TypeScript'
 description: TypeScript streaming reference including basic streaming and handling different content types
-ccVersion: 2.1.154
+ccVersion: 2.1.170
 -->
 # Streaming — TypeScript
 
@@ -28,7 +28,7 @@ for await (const event of stream) {
 
 ## Handling Different Content Types
 
-> **Opus 4.8 / Opus 4.7 / Opus 4.6:** Use `thinking: {type: "adaptive"}`. On older models, use `thinking: {type: "enabled", budget_tokens: N}` instead.
+> **Fable 5 / Opus 4.8 / Opus 4.7 / Opus 4.6:** Use `thinking: {type: "adaptive"}`. On older models, use `thinking: {type: "enabled", budget_tokens: N}` instead.
 
 ```typescript
 const stream = client.messages.stream({
diff --git a/system-prompts/data-superseded-message-uuid-protocol-note.md b/system-prompts/data-superseded-message-uuid-protocol-note.md
deleted file mode 100644
index be27c9c..0000000
--- a/system-prompts/data-superseded-message-uuid-protocol-note.md
+++ /dev/null
@@ -1,6 +0,0 @@
-<!--
-name: 'Data: Superseded message UUID protocol note'
-description: Internal protocol note explaining how supersedes UUIDs mark previously delivered messages as canonical replacements during refusal fallback handling
-ccVersion: 2.1.169
--->
-@internal Wire uuids of previously-delivered messages that this message replaces (refusal-fallback supersede: server-lane seam merge, or the client-lane retry’s first deliverable content frame). On the client lane the list matches the banner’s retracted_message_uuids exactly and can include tombstoned tool_result frames from the refused leg, not only assistant frames. Evict the named messages on arrival and treat this frame as their canonical replacement. Idempotent with the end-of-turn model_refusal_fallback notice, whose retracted_message_uuids remains the complete audit record for the turn.
diff --git a/system-prompts/data-supported-dialog-kinds-protocol-note.md b/system-prompts/data-supported-dialog-kinds-protocol-note.md
deleted file mode 100644
index a130fcf..0000000
--- a/system-prompts/data-supported-dialog-kinds-protocol-note.md
+++ /dev/null
@@ -1,6 +0,0 @@
-<!--
-name: 'Data: Supported dialog kinds protocol note'
-description: Internal protocol note describing supported request_user_dialog kinds, fail-closed behavior, and the staged-release gate
-ccVersion: 2.1.169
--->
-@internal Dialog kinds (request_user_dialog `dialog_kind` values) this consumer's onUserDialog can actually render. The CLI treats ABSENCE as 'cannot display' and fails closed: without the kind declared here, a dialog-gated flow degrades to its no-dialog behavior (for 'refusal_fallback_prompt', the classic refusal error) instead of parking a dialog the consumer may mishandle. First-attached-client-wins on multi-client sessions; later initializes do not change it. (The @internal tag is the staged-release gate — see the Options.supportedDialogKinds doc; delete it there and here to promote.)
diff --git a/system-prompts/data-tool-use-concepts.md b/system-prompts/data-tool-use-concepts.md
index 5080d35..e119d99 100644
--- a/system-prompts/data-tool-use-concepts.md
+++ b/system-prompts/data-tool-use-concepts.md
@@ -1,7 +1,7 @@
 <!--
 name: 'Data: Tool use concepts'
 description: Conceptual foundations of tool use with the Claude API including tool definitions, tool choice, and best practices
-ccVersion: 2.1.157
+ccVersion: 2.1.170
 -->
 # Tool Use Concepts
 
@@ -176,7 +176,7 @@ Web search and web fetch let Claude search the web and retrieve page content. Th
 ]
 ```
 
-### Dynamic Filtering (Opus 4.8 / Opus 4.7 / Opus 4.6 / Sonnet 4.6)
+### Dynamic Filtering (Fable 5 / Opus 4.8 / Opus 4.7 / Opus 4.6 / Sonnet 4.6)
 
 The `web_search_20260209` and `web_fetch_20260209` versions support **dynamic filtering** — Claude writes and executes code to filter search results before they reach the context window, improving accuracy and token efficiency. Dynamic filtering is built into these tool versions and activates automatically; you do not need to separately declare the `code_execution` tool or pass any beta header.
 
@@ -305,7 +305,7 @@ Two features are available:
 - **JSON outputs** (`output_config.format`): Control Claude's response format
 - **Strict tool use** (`strict: true`): Guarantee valid tool parameter schemas
 
-**Supported models:** {{OPUS_NAME}}, {{SONNET_NAME}}, and {{HAIKU_NAME}}. Legacy models (Claude Opus 4.5, Claude Opus 4.1) also support structured outputs.
+**Supported models:** {{FABLE_NAME}}, {{OPUS_NAME}}, {{SONNET_NAME}}, and {{HAIKU_NAME}}. Legacy models (Claude Opus 4.5, Claude Opus 4.1) also support structured outputs.
 
 > **Recommended:** Use `client.messages.parse()` which automatically validates responses against your schema. When using `messages.create()` directly, use `output_config: {format: {...}}`. The `output_format` convenience parameter is also accepted by some SDK methods (e.g., `.parse()`), but `output_config.format` is the canonical API-level parameter.
 
diff --git a/system-prompts/skill-building-llm-powered-applications-with-claude.md b/system-prompts/skill-building-llm-powered-applications-with-claude.md
index bc21887..61c3d76 100644
--- a/system-prompts/skill-building-llm-powered-applications-with-claude.md
+++ b/system-prompts/skill-building-llm-powered-applications-with-claude.md
@@ -1,7 +1,7 @@
 <!--
 name: 'Skill: Building LLM-powered applications with Claude'
 description: Guides Claude in building LLM-powered applications using the Anthropic SDK, covering language detection, API surface selection (Claude API vs Managed Agents), model defaults, thinking/effort configuration, and language-specific documentation reading
-ccVersion: 2.1.163
+ccVersion: 2.1.170
 -->
 # Building LLM-Powered Applications with Claude
 
@@ -167,6 +167,7 @@ Everything goes through `POST /v1/messages`. Tools and output constraints are fe
 
 | Model             | Model ID            | Context        | Input $/1M | Output $/1M |
 | ----------------- | ------------------- | -------------- | ---------- | ----------- |
+| Claude Fable 5    | `{{FABLE_ID}}`      | 1M             | $10.00     | $50.00      |
 | Claude Opus 4.8   | `claude-opus-4-8`   | 1M             | $5.00      | $25.00      |
 | Claude Opus 4.7   | `claude-opus-4-7`   | 1M             | $5.00      | $25.00      |
 | Claude Opus 4.6   | `claude-opus-4-6`   | 1M             | $5.00      | $25.00      |
@@ -185,13 +186,13 @@ A note: if any of the model strings above look unfamiliar to you, that's to be e
 
 ## Thinking & Effort (Quick Reference)
 
-**Opus 4.8 / 4.7 — Adaptive thinking only:** Use `thinking: {type: "adaptive"}`. `thinking: {type: "enabled", budget_tokens: N}` returns a 400 — adaptive is the only on-mode. `{type: "disabled"}` and omitting `thinking` both work. Sampling parameters (`temperature`, `top_p`, `top_k`) are also removed and will 400. Opus 4.8 keeps the same request surface as 4.7 (no new breaking changes) — see `shared/model-migration.md` → Migrating to Opus 4.8 for the behavioral re-tuning, and → Migrating to Opus 4.7 for the full breaking-change list when coming from 4.6 or earlier. Note: with `thinking` disabled, Opus 4.8 may write longer reasoning into the visible response — leave adaptive thinking on, or add a final-answer-only instruction (see the migration guide).
-**Opus 4.6 — Adaptive thinking (recommended):** Use `thinking: {type: "adaptive"}`. Claude dynamically decides when and how much to think. No `budget_tokens` needed — `budget_tokens` is deprecated on Opus 4.6 and Sonnet 4.6 and should not be used for new code. Adaptive thinking also automatically enables interleaved thinking (no beta header needed). **When the user asks for "extended thinking", a "thinking budget", or `budget_tokens`: always use Opus 4.8, 4.7, or 4.6 with `thinking: {type: "adaptive"}`. The concept of a fixed token budget for thinking is deprecated — adaptive thinking replaces it. Do NOT use `budget_tokens` for new 4.6/4.7/4.8 code and do NOT switch to an older model.** *Gradual-migration carve-out:* `budget_tokens` is still functional on Opus 4.6 and Sonnet 4.6 as a transitional escape hatch — if you're migrating existing code and need a hard token ceiling before you've tuned `effort`, see `shared/model-migration.md` → Transitional escape hatch. Note: this carve-out does **not** apply to Opus 4.7 or 4.8 — `budget_tokens` is fully removed there.
-**Effort parameter (GA, no beta header):** Controls thinking depth and overall token spend via `output_config: {effort: "low"|"medium"|"high"|"max"}` (inside `output_config`, not top-level). Default is `high` (equivalent to omitting it). `max` is Opus-tier only (Opus 4.6 and later — not Sonnet or Haiku). Opus 4.7 added `"xhigh"` (between `high` and `max`) — the best setting for most coding and agentic use cases on Opus 4.7/4.8, and the default in Claude Code; use a minimum of `high` for most intelligence-sensitive work. Works on Opus 4.5, Opus 4.6, Opus 4.7, Opus 4.8, and Sonnet 4.6. Will error on Sonnet 4.5 / Haiku 4.5. On Opus 4.7 and 4.8, effort matters more than on any prior Opus — re-tune it when migrating, and run long-horizon/agentic tasks at `high`/`xhigh` with the full task spec given up front. Combine with adaptive thinking for the best cost-quality tradeoffs. Lower effort means fewer and more-consolidated tool calls, less preamble, and terser confirmations — `high` is often the sweet spot balancing quality and token efficiency; use `max` when correctness matters more than cost; use `low` for subagents or simple tasks.
+**Fable 5 / Opus 4.8 / 4.7 — Adaptive thinking only:** Use `thinking: {type: "adaptive"}`. `thinking: {type: "enabled", budget_tokens: N}` returns a 400 — adaptive is the only on-mode. On Opus 4.8 and 4.7, `{type: "disabled"}` and omitting `thinking` both work; on Fable 5, an explicit `{type: "disabled"}` returns a 400 — omit the `thinking` param entirely instead. Sampling parameters (`temperature`, `top_p`, `top_k`) are also removed and will 400. Opus 4.8 keeps the same request surface as 4.7 (no new breaking changes) — see `shared/model-migration.md` → Migrating to Opus 4.8 for the behavioral re-tuning, and → Migrating to Opus 4.7 for the full breaking-change list when coming from 4.6 or earlier. Note: with `thinking` disabled, Opus 4.8 may write longer reasoning into the visible response — leave adaptive thinking on, or add a final-answer-only instruction (see the migration guide).
+**Opus 4.6 — Adaptive thinking (recommended):** Use `thinking: {type: "adaptive"}`. Claude dynamically decides when and how much to think. No `budget_tokens` needed — `budget_tokens` is deprecated on Opus 4.6 and Sonnet 4.6 and should not be used for new code. Adaptive thinking also automatically enables interleaved thinking (no beta header needed). **When the user asks for "extended thinking", a "thinking budget", or `budget_tokens`: always use Fable 5, Opus 4.8, 4.7, or 4.6 with `thinking: {type: "adaptive"}`. The concept of a fixed token budget for thinking is deprecated — adaptive thinking replaces it. Do NOT use `budget_tokens` for new 4.6/4.7/4.8 code and do NOT switch to an older model.** *Gradual-migration carve-out:* `budget_tokens` is still functional on Opus 4.6 and Sonnet 4.6 as a transitional escape hatch — if you're migrating existing code and need a hard token ceiling before you've tuned `effort`, see `shared/model-migration.md` → Transitional escape hatch. Note: this carve-out does **not** apply to Fable 5, Opus 4.7 or 4.8 — `budget_tokens` is fully removed there.
+**Effort parameter (GA, no beta header):** Controls thinking depth and overall token spend via `output_config: {effort: "low"|"medium"|"high"|"max"}` (inside `output_config`, not top-level). Default is `high` (equivalent to omitting it). `max` is supported on Fable 5, Opus 4.6 and later, and Sonnet 4.6 (not Haiku or earlier Sonnets). Opus 4.7 added `"xhigh"` (between `high` and `max`) — the best setting for most coding and agentic use cases on Fable 5 / Opus 4.7/4.8, and the default in Claude Code; use a minimum of `high` for most intelligence-sensitive work. Works on Fable 5, Opus 4.5, Opus 4.6, Opus 4.7, Opus 4.8, and Sonnet 4.6. Will error on Sonnet 4.5 / Haiku 4.5. On Fable 5, Opus 4.7 and 4.8, effort matters more than on any prior Opus — re-tune it when migrating, and run long-horizon/agentic tasks at `high`/`xhigh` with the full task spec given up front. Combine with adaptive thinking for the best cost-quality tradeoffs. Lower effort means fewer and more-consolidated tool calls, less preamble, and terser confirmations — `high` is often the sweet spot balancing quality and token efficiency; use `max` when correctness matters more than cost; use `low` for subagents or simple tasks.
 
-**Opus 4.8 / 4.7 — thinking content omitted by default:** `thinking` blocks still stream but their text is empty unless you opt in with `thinking: {type: "adaptive", display: "summarized"}` (default is `"omitted"`). Silent change — no error. If you stream reasoning to users, the default looks like a long pause before output; set `"summarized"` to restore visible progress.
+**Fable 5 / Opus 4.8 / 4.7 — thinking content omitted by default:** `thinking` blocks still stream but their text is empty unless you opt in with `thinking: {type: "adaptive", display: "summarized"}` (default is `"omitted"`). Silent change — no error. If you stream reasoning to users, the default looks like a long pause before output; set `"summarized"` to restore visible progress.
 
-**Task Budgets (beta, Opus 4.7 / 4.8):** `output_config: {task_budget: {type: "tokens", total: N}}` tells the model how many tokens it has for a full agentic loop — it sees a running countdown and self-moderates (minimum 20,000; beta header `task-budgets-2026-03-13`). Distinct from `max_tokens`, which is an enforced per-response ceiling the model is not aware of. See `shared/model-migration.md` → Task Budgets.
+**Task Budgets (beta, Fable 5 / Opus 4.7 / 4.8):** `output_config: {task_budget: {type: "tokens", total: N}}` tells the model how many tokens it has for a full agentic loop — it sees a running countdown and self-moderates (minimum 20,000; beta header `task-budgets-2026-03-13`). Distinct from `max_tokens`, which is an enforced per-response ceiling the model is not aware of. See `shared/model-migration.md` → Task Budgets.
 
 **Sonnet 4.6:** Supports adaptive thinking (`thinking: {type: "adaptive"}`). `budget_tokens` is deprecated on Sonnet 4.6 — use adaptive thinking instead.
 
@@ -201,7 +202,7 @@ A note: if any of the model strings above look unfamiliar to you, that's to be e
 
 ## Compaction (Quick Reference)
 
-**Beta, Opus 4.8, Opus 4.7, Opus 4.6, and Sonnet 4.6.** For long-running conversations that may exceed the 1M context window, enable server-side compaction. The API automatically summarizes earlier context when it approaches the trigger threshold (default: 150K tokens). Requires beta header `compact-2026-01-12`.
+**Beta, Fable 5, Opus 4.8, Opus 4.7, Opus 4.6, and Sonnet 4.6.** For long-running conversations that may exceed the 1M context window, enable server-side compaction. The API automatically summarizes earlier context when it approaches the trigger threshold (default: 150K tokens). Requires beta header `compact-2026-01-12`.
 
 **Critical:** Append `response.content` (not just the text) back to your messages on every turn. Compaction blocks in the response must be preserved — the API uses them to replace the compacted history on the next request. Extracting only the text string and appending that will silently lose the compaction state.
 
@@ -261,7 +262,7 @@ After detecting the language, read the relevant files based on what the user nee
 
 **Long-running conversations (may exceed context window):**
 → Read `{lang}/claude-api/README.md` — see Compaction section
-**Migrating to a newer model (Opus 4.8 / Opus 4.7 / Opus 4.6 / Sonnet 4.6) or replacing a retired model:**
+**Migrating to a newer model (Fable 5 / Opus 4.8 / Opus 4.7 / Opus 4.6 / Sonnet 4.6) or replacing a retired model:**
 → Read `shared/model-migration.md`
 **Prompt caching / optimize caching / "why is my cache hit rate low":**
 → Read `shared/prompt-caching.md` + `{lang}/claude-api/README.md` (Prompt Caching section)
@@ -318,13 +319,13 @@ Live documentation URLs are in `shared/live-sources.md`.
 ## Common Pitfalls
 
 - Don't truncate inputs when passing files or content to the API. If the content is too long to fit in the context window, notify the user and discuss options (chunking, summarization, etc.) rather than silently truncating.
-- **Opus 4.8 / 4.7 thinking:** Adaptive only. `thinking: {type: "enabled", budget_tokens: N}` returns 400 — `budget_tokens` is fully removed (along with `temperature`, `top_p`, `top_k`). Use `thinking: {type: "adaptive"}`. Opus 4.8 inherits this surface from 4.7 with no new breaking changes.
-- **Opus 4.6 / Sonnet 4.6 thinking:** Use `thinking: {type: "adaptive"}` — do NOT use `budget_tokens` for new 4.6 code (deprecated on both Opus 4.6 and Sonnet 4.6; for gradual migration of existing code, see the transitional escape hatch in `shared/model-migration.md` — note this carve-out does not apply to Opus 4.7 or 4.8). For older models, `budget_tokens` must be less than `max_tokens` (minimum 1024). This will throw an error if you get it wrong.
-- **4.6/4.7/4.8 family prefill removed:** Assistant message prefills (last-assistant-turn prefills) return a 400 error on Opus 4.6, Opus 4.7, Opus 4.8, and Sonnet 4.6. Use structured outputs (`output_config.format`) or system prompt instructions to control response format instead.
+- **Fable 5 / Opus 4.8 / 4.7 thinking:** Adaptive only. `thinking: {type: "enabled", budget_tokens: N}` returns 400 — `budget_tokens` is fully removed (along with `temperature`, `top_p`, `top_k`). Use `thinking: {type: "adaptive"}`. Opus 4.8 inherits this surface from 4.7 with no new breaking changes; Fable 5 adds one — an explicit `thinking: {type: "disabled"}` returns a 400 (accepted on 4.7/4.8); omit the param instead.
+- **Opus 4.6 / Sonnet 4.6 thinking:** Use `thinking: {type: "adaptive"}` — do NOT use `budget_tokens` for new 4.6 code (deprecated on both Opus 4.6 and Sonnet 4.6; for gradual migration of existing code, see the transitional escape hatch in `shared/model-migration.md` — note this carve-out does not apply to Fable 5, Opus 4.7 or 4.8). For older models, `budget_tokens` must be less than `max_tokens` (minimum 1024). This will throw an error if you get it wrong.
+- **Prefill removed (Fable 5 and the 4.6/4.7/4.8 family):** Assistant message prefills (last-assistant-turn prefills) return a 400 error on Fable 5, Opus 4.6, Opus 4.7, Opus 4.8, and Sonnet 4.6. Use structured outputs (`output_config.format`) or system prompt instructions to control response format instead.
 - **Confirm migration scope before editing:** When a user asks to migrate code to a newer Claude model without naming a specific file, directory, or file list, **ask which scope to apply first** — the entire working directory, a specific subdirectory, or a specific set of files. Do not start editing until the user confirms. Imperative phrasings like "migrate my codebase", "move my project to X", "upgrade to Sonnet 4.6", or bare "migrate to Opus 4.8" are **still ambiguous** — they tell you what to do but not where, so ask. Proceed without asking only when the prompt names an exact file, a specific directory, or an explicit file list ("migrate `app.py`", "migrate everything under `services/`", "update `a.py` and `b.py`"). See `shared/model-migration.md` Step 0.
 - **`max_tokens` defaults:** Don't lowball `max_tokens` — hitting the cap truncates output mid-thought and requires a retry. For non-streaming requests, default to `~16000` (keeps responses under SDK HTTP timeouts). For streaming requests, default to `~64000` (timeouts aren't a concern, so give the model room). Only go lower when you have a hard reason: classification (`~256`), cost caps, deliberately short outputs, or **`max_tokens: 0`** for cache pre-warming (see `shared/prompt-caching.md` → Pre-warming).
-- **128K output tokens:** Opus 4.6, Opus 4.7, and Opus 4.8 support up to 128K `max_tokens`, but the SDKs require streaming for values that large to avoid HTTP timeouts. Use `.stream()` with `.get_final_message()` / `.finalMessage()`.
-- **Tool call JSON parsing (4.6/4.7/4.8 family):** Opus 4.6, Opus 4.7, Opus 4.8, and Sonnet 4.6 may produce different JSON string escaping in tool call `input` fields (e.g., Unicode or forward-slash escaping). Always parse tool inputs with `json.loads()` / `JSON.parse()` — never do raw string matching on the serialized input.
+- **128K output tokens:** Fable 5, Opus 4.6, Opus 4.7, and Opus 4.8 support up to 128K `max_tokens`, but the SDKs require streaming for values that large to avoid HTTP timeouts. Use `.stream()` with `.get_final_message()` / `.finalMessage()`.
+- **Tool call JSON parsing (Fable 5 and the 4.6/4.7/4.8 family):** Fable 5, Opus 4.6, Opus 4.7, Opus 4.8, and Sonnet 4.6 may produce different JSON string escaping in tool call `input` fields (e.g., Unicode or forward-slash escaping). Always parse tool inputs with `json.loads()` / `JSON.parse()` — never do raw string matching on the serialized input.
 - **Structured outputs (all models):** Use `output_config: {format: {...}}` instead of the deprecated `output_format` parameter on `messages.create()`. This is a general API change, not 4.6-specific.
 - **Don't reimplement SDK functionality:** The SDK provides high-level helpers — use them instead of building from scratch. Specifically: use `stream.finalMessage()` instead of wrapping `.on()` events in `new Promise()`; use typed exception classes (`Anthropic.RateLimitError`, etc.) instead of string-matching error messages; use SDK types (`Anthropic.MessageParam`, `Anthropic.Tool`, `Anthropic.Message`, etc.) instead of redefining equivalent interfaces.
 - **Don't define custom types for SDK data structures:** The SDK exports types for all API objects. Use `Anthropic.MessageParam` for messages, `Anthropic.Tool` for tool definitions, `Anthropic.ToolUseBlock` / `Anthropic.ToolResultBlockParam` for tool results, `Anthropic.Message` for responses. Defining your own `interface ChatMessage { role: string; content: unknown }` duplicates what the SDK already provides and loses type safety.