docs: add runtime-fallback and fallback_models documentation
parent 7aafa13b21
commit 0ef17aa6c9
@ -163,19 +163,20 @@ Override built-in agent settings:
}
```

Each agent supports: `model`, `temperature`, `top_p`, `prompt`, `prompt_append`, `tools`, `disable`, `description`, `mode`, `color`, `permission`, `category`, `variant`, `maxTokens`, `thinking`, `reasoningEffort`, `textVerbosity`, `providerOptions`.
Each agent supports: `model`, `fallback_models`, `temperature`, `top_p`, `prompt`, `prompt_append`, `tools`, `disable`, `description`, `mode`, `color`, `permission`, `category`, `variant`, `maxTokens`, `thinking`, `reasoningEffort`, `textVerbosity`, `providerOptions`.

### Additional Agent Options

| Option | Type | Description |
| ------------------- | ------- | ----------------------------------------------------------------------------------------------- |
| `category` | string | Category name to inherit model and other settings from category defaults |
| `variant` | string | Model variant (e.g., `max`, `high`, `medium`, `low`, `xhigh`) |
| `maxTokens` | number | Maximum tokens for response. Passed directly to OpenCode SDK. |
| `thinking` | object | Extended thinking configuration for Anthropic models. See [Thinking Options](#thinking-options) below. |
| `reasoningEffort` | string | OpenAI reasoning effort level. Values: `low`, `medium`, `high`, `xhigh`. |
| `textVerbosity` | string | Text verbosity level. Values: `low`, `medium`, `high`. |
| `providerOptions` | object | Provider-specific options passed directly to OpenCode SDK. |
| Option | Type | Description |
| ------------------- | -------------- | ----------------------------------------------------------------------------------------------- |
| `fallback_models` | string/array | Fallback models for runtime switching on API errors. Single string or array of model strings. |
| `category` | string | Category name to inherit model and other settings from category defaults |
| `variant` | string | Model variant (e.g., `max`, `high`, `medium`, `low`, `xhigh`) |
| `maxTokens` | number | Maximum tokens for response. Passed directly to OpenCode SDK. |
| `thinking` | object | Extended thinking configuration for Anthropic models. See [Thinking Options](#thinking-options) below. |
| `reasoningEffort` | string | OpenAI reasoning effort level. Values: `low`, `medium`, `high`, `xhigh`. |
| `textVerbosity` | string | Text verbosity level. Values: `low`, `medium`, `high`. |
| `providerOptions` | object | Provider-specific options passed directly to OpenCode SDK. |

#### Thinking Options (Anthropic)

@ -714,6 +715,63 @@ Configure concurrency limits for background agent tasks. This controls how many
- Allow more concurrent tasks for fast/cheap models (e.g., Gemini Flash)
- Respect provider rate limits by setting provider-level caps

## Runtime Fallback

Automatically switch to backup models when the primary model encounters transient API errors (rate limits, overload, etc.). This keeps conversations running without manual intervention.

```json
{
  "runtime_fallback": {
    "enabled": true,
    "retry_on_errors": [429, 503, 529],
    "max_fallback_attempts": 3,
    "cooldown_seconds": 60,
    "notify_on_fallback": true
  }
}
```

| Option | Default | Description |
| ----------------------- | ----------------- | --------------------------------------------------------------------------- |
| `enabled` | `true` | Enable runtime fallback |
| `retry_on_errors` | `[429, 503, 529]` | HTTP status codes that trigger fallback (rate limit, service unavailable, overloaded) |
| `max_fallback_attempts` | `3` | Maximum fallback attempts per session (1-10) |
| `cooldown_seconds` | `60` | Cooldown in seconds before retrying a failed model |
| `notify_on_fallback` | `true` | Show toast notification when switching to a fallback model |
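
The defaults in the table can be read as a merge over user-supplied overrides. The sketch below illustrates that reading. The type name, constant name, and `resolveRuntimeFallback` function are hypothetical, and throwing on an out-of-range `max_fallback_attempts` is an assumption; the plugin may validate differently.

```typescript
// Hypothetical type mirroring the documented `runtime_fallback` options;
// the plugin's real schema is not shown in this document.
interface RuntimeFallbackConfig {
  enabled: boolean;
  retry_on_errors: number[];
  max_fallback_attempts: number;
  cooldown_seconds: number;
  notify_on_fallback: boolean;
}

// Defaults copied from the options table.
const RUNTIME_FALLBACK_DEFAULTS: RuntimeFallbackConfig = {
  enabled: true,
  retry_on_errors: [429, 503, 529],
  max_fallback_attempts: 3,
  cooldown_seconds: 60,
  notify_on_fallback: true,
};

// Merge user overrides over the defaults and check the documented 1-10 range.
// Throwing on an invalid value is an assumption made for this sketch.
function resolveRuntimeFallback(
  overrides: Partial<RuntimeFallbackConfig> = {},
): RuntimeFallbackConfig {
  const merged: RuntimeFallbackConfig = { ...RUNTIME_FALLBACK_DEFAULTS, ...overrides };
  const attempts = merged.max_fallback_attempts;
  if (attempts < 1 || attempts > 10 || Math.floor(attempts) !== attempts) {
    throw new RangeError("max_fallback_attempts must be an integer from 1 to 10");
  }
  return merged;
}
```

Under these assumptions, `resolveRuntimeFallback({ cooldown_seconds: 30 })` keeps every other option at its documented default.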

### How It Works

1. When an API error matching `retry_on_errors` occurs, the `runtime-fallback` hook intercepts it
2. The next request automatically uses the next available model from `fallback_models`
3. Failed models enter a cooldown period before being retried
4. If `notify_on_fallback` is enabled, a toast notification informs you of the model switch
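
The steps above can be sketched as a small selection function. This is an illustration only, not the hook's actual implementation: every name here (`FallbackState`, `FallbackOptions`, `nextModelAfterError`) is hypothetical, and the rule "pick the first fallback that is not cooling down" is an assumed reading of "next available model".

```typescript
// All names below are hypothetical; the real hook's internals are not shown here.
interface FallbackState {
  attempts: number; // fallback switches made so far in this session
  failedAt: Record<string, number | undefined>; // model -> time (ms) of its last failure
}

interface FallbackOptions {
  retryOnErrors: number[];
  maxFallbackAttempts: number;
  cooldownSeconds: number;
}

// Returns the model to use for the next request, or undefined to keep the current one.
function nextModelAfterError(
  status: number,
  currentModel: string,
  fallbackModels: string[],
  state: FallbackState,
  options: FallbackOptions,
  now: number = Date.now(),
): string | undefined {
  // Step 1: only status codes listed in retry_on_errors trigger a switch.
  if (!options.retryOnErrors.some((code) => code === status)) return undefined;

  // Step 3: record the failure so this model is skipped during its cooldown.
  state.failedAt[currentModel] = now;
  if (state.attempts >= options.maxFallbackAttempts) return undefined;

  // Step 2: take the first fallback that is not cooling down (assumed rule).
  for (const model of fallbackModels) {
    const failed = state.failedAt[model];
    if (failed === undefined || now - failed >= options.cooldownSeconds * 1000) {
      state.attempts += 1;
      return model;
    }
  }
  return undefined;
}
```

Passing `now` explicitly keeps the sketch deterministic for testing; step 4 (the toast) is left out because it only reports the result of this decision.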

### Configuring Fallback Models

Define `fallback_models` at the agent or category level:

```json
{
  "agents": {
    "sisyphus": {
      "model": "anthropic/claude-opus-4-5",
      "fallback_models": ["openai/gpt-5.2", "google/gemini-3-pro"]
    }
  },
  "categories": {
    "ultrabrain": {
      "model": "openai/gpt-5.2-codex",
      "fallback_models": ["anthropic/claude-opus-4-5", "google/gemini-3-pro"]
    }
  }
}
```

For the `sisyphus` agent above, when the primary model fails:
1. First fallback: `openai/gpt-5.2`
2. Second fallback: `google/gemini-3-pro`
3. After `max_fallback_attempts` attempts, the agent returns to the primary model
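
Because `fallback_models` accepts either a single string or an array, a consumer has to normalize it before walking the list in order. The sketch below shows one way to do that; the helper names `normalizeFallbackModels` and `modelOrder` are hypothetical and not part of the plugin's documented API.

```typescript
// `fallback_models` may be a single string, an array of model strings, or absent.
type FallbackModels = string | string[] | undefined;

// Normalize either form to an array (hypothetical helper name).
function normalizeFallbackModels(value: FallbackModels): string[] {
  if (value === undefined) return [];
  return Array.isArray(value) ? value : [value];
}

// Ordered list of models to try: the primary first, then each fallback in turn.
function modelOrder(primary: string, fallbacks: FallbackModels): string[] {
  return [primary].concat(normalizeFallbackModels(fallbacks));
}

// The `sisyphus` example: primary, then first and second fallback.
const sisyphusOrder = modelOrder("anthropic/claude-opus-4-5", [
  "openai/gpt-5.2",
  "google/gemini-3-pro",
]);
```

Here `sisyphusOrder` holds the primary model followed by the two fallbacks, in the same order as the numbered list above.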

## Categories

Categories enable domain-specific task delegation via the `task` tool. Each category applies runtime presets (model, temperature, prompt additions) when calling the `Sisyphus-Junior` agent.
@ -830,14 +888,15 @@ Add your own categories or override built-in ones:
}
```

Each category supports: `model`, `temperature`, `top_p`, `maxTokens`, `thinking`, `reasoningEffort`, `textVerbosity`, `tools`, `prompt_append`, `variant`, `description`, `is_unstable_agent`.
Each category supports: `model`, `fallback_models`, `temperature`, `top_p`, `maxTokens`, `thinking`, `reasoningEffort`, `textVerbosity`, `tools`, `prompt_append`, `variant`, `description`, `is_unstable_agent`.

### Additional Category Options

| Option | Type | Default | Description |
| ------------------ | ------- | ------- | --------------------------------------------------------------------------------------------------- |
| `description` | string | - | Human-readable description of the category's purpose. Shown in task prompt. |
| `is_unstable_agent`| boolean | `false` | Mark agent as unstable - forces background mode for monitoring. Auto-enabled for gemini models. |
| Option | Type | Default | Description |
| ------------------- | ------------ | ------- | --------------------------------------------------------------------------------------------------- |
| `fallback_models` | string/array | - | Fallback models for runtime switching on API errors. Single string or array of model strings. |
| `description` | string | - | Human-readable description of the category's purpose. Shown in delegate_task prompt. |
| `is_unstable_agent` | boolean | `false` | Mark agent as unstable - forces background mode for monitoring. Auto-enabled for gemini models. |

## Model Resolution System

@ -973,7 +1032,7 @@ Disable specific built-in hooks via `disabled_hooks` in `~/.config/opencode/oh-m
}
```

Available hooks: `todo-continuation-enforcer`, `context-window-monitor`, `session-recovery`, `session-notification`, `comment-checker`, `grep-output-truncator`, `tool-output-truncator`, `directory-agents-injector`, `directory-readme-injector`, `empty-task-response-detector`, `think-mode`, `anthropic-context-window-limit-recovery`, `rules-injector`, `background-notification`, `auto-update-checker`, `startup-toast`, `keyword-detector`, `agent-usage-reminder`, `non-interactive-env`, `interactive-bash-session`, `compaction-context-injector`, `thinking-block-validator`, `claude-code-hooks`, `ralph-loop`, `preemptive-compaction`, `auto-slash-command`, `sisyphus-junior-notepad`, `no-sisyphus-gpt`, `start-work`
Available hooks: `todo-continuation-enforcer`, `context-window-monitor`, `session-recovery`, `session-notification`, `comment-checker`, `grep-output-truncator`, `tool-output-truncator`, `directory-agents-injector`, `directory-readme-injector`, `empty-task-response-detector`, `think-mode`, `anthropic-context-window-limit-recovery`, `rules-injector`, `background-notification`, `auto-update-checker`, `startup-toast`, `keyword-detector`, `agent-usage-reminder`, `non-interactive-env`, `interactive-bash-session`, `compaction-context-injector`, `thinking-block-validator`, `claude-code-hooks`, `ralph-loop`, `preemptive-compaction`, `auto-slash-command`, `sisyphus-junior-notepad`, `no-sisyphus-gpt`, `start-work`, `runtime-fallback`

**Note on `directory-agents-injector`**: This hook is **automatically disabled** when running on OpenCode 1.1.37+ because OpenCode now has native support for dynamically resolving AGENTS.md files from subdirectories (PR #10678). This prevents duplicate AGENTS.md injection. For older OpenCode versions, the hook remains active to provide the same functionality.

@ -352,6 +352,7 @@ Hooks intercept and modify behavior at key points in the agent lifecycle.
| **session-recovery** | Stop | Recovers from session errors - missing tool results, thinking block issues, empty messages. |
| **anthropic-context-window-limit-recovery** | Stop | Handles Claude context window limits gracefully. |
| **background-compaction** | Stop | Auto-compacts sessions hitting token limits. |
| **runtime-fallback** | Stop | Automatically switches to fallback models on API errors (429, 503, 529). Configurable via `runtime_fallback` and `fallback_models`. |

#### Truncation & Context Management

@ -9,6 +9,45 @@
## HOOK TIERS

### Tier 1: Session Hooks (22) — `create-session-hooks.ts`
## STRUCTURE
```
hooks/
├── atlas/                                   # Main orchestration (757 lines)
├── anthropic-context-window-limit-recovery/ # Auto-summarize
├── todo-continuation-enforcer.ts            # Force TODO completion
├── ralph-loop/                              # Self-referential dev loop
├── claude-code-hooks/                       # settings.json compat layer - see AGENTS.md
├── comment-checker/                         # Prevents AI slop
├── auto-slash-command/                      # Detects /command patterns
├── rules-injector/                          # Conditional rules
├── directory-agents-injector/               # Auto-injects AGENTS.md
├── directory-readme-injector/               # Auto-injects README.md
├── edit-error-recovery/                     # Recovers from failures
├── thinking-block-validator/                # Ensures valid <thinking>
├── context-window-monitor.ts                # Reminds of headroom
├── session-recovery/                        # Auto-recovers from crashes
├── think-mode/                              # Dynamic thinking budget
├── keyword-detector/                        # ultrawork/search/analyze modes
├── background-notification/                 # OS notification
├── prometheus-md-only/                      # Planner read-only mode
├── agent-usage-reminder/                    # Specialized agent hints
├── auto-update-checker/                     # Plugin update check
├── tool-output-truncator.ts                 # Prevents context bloat
├── compaction-context-injector/             # Injects context on compaction
├── delegate-task-retry/                     # Retries failed delegations
├── interactive-bash-session/                # Tmux session management
├── non-interactive-env/                     # Non-TTY environment handling
├── start-work/                              # Sisyphus work session starter
├── task-resume-info/                        # Resume info for cancelled tasks
├── question-label-truncator/                # Auto-truncates question labels
├── category-skill-reminder/                 # Reminds of category skills
├── empty-task-response-detector.ts          # Detects empty responses
├── sisyphus-junior-notepad/                 # Sisyphus Junior notepad
├── stop-continuation-guard/                 # Guards stop continuation
├── subagent-question-blocker/               # Blocks subagent questions
├── runtime-fallback/                        # Auto-switch models on API errors
└── index.ts                                 # Hook aggregation + registration
```

| Hook | Event | Purpose |
|------|-------|---------|