mirror of
https://github.com/ultraworkers/claw-code.git
synced 2026-04-25 13:44:06 +08:00
The Python code examples were accessing nested error.kind like envelope['error']['kind'],
but v1.0 emits flat envelopes with error as a STRING and kind at top-level.
Updated:
- Table header: now shows actual v1.0 shape {error: "...", kind: "...", type: "error"}
- match statement: switched from envelope.get('error',{}).get('kind') to envelope.get('kind')
- All ClawError raises: changed from envelope['error']['message'] to envelope.get('error','')
because error field is a STRING in v1.0, not a nested object
- Added inline comments on every error case noting v1.0 vs v2.0 difference
- Appendix: split into v1.0 (actual/current) and v2.0 (target after FIX_LOCUS_164)
The code examples now work correctly against the actual binary.
This was active misdocumentation (P0 severity) — the Python examples would crash
if a consumer tried to use them.
513 lines
19 KiB
Markdown
# Error Handling for Claw Code Claws

**Purpose:** Build a unified error handler for orchestration code using claw-code as a library or subprocess.

After cycles #178–#179 (parser-front-door hole closure), claw-code's error interface is deterministic, machine-readable, and clawable: **one error handler for all 14 clawable commands.**

---

## Quick Reference: Exit Codes and Envelopes

Every clawable command returns JSON on stdout when `--output-format json` is requested.

**IMPORTANT:** The exit code contract below applies **only when `--output-format json` is explicitly set**. Text mode follows argparse conventions and may return different exit codes (e.g., `2` for argparse parse errors). Claws consuming claw-code as a subprocess MUST always pass `--output-format json` to get the documented contract.

| Exit Code | Meaning | Response Format | Example |
|---|---|---|---|
| **0** | Success | `{success fields}` | `{"session_id": "...", "loaded": true}` |
| **1** | Error / Not Found | `{error: "...", hint: "...", kind: "...", type: "error"}` (flat, v1.0) | `{"error": "session not found", "kind": "session_not_found", "type": "error"}` |
| **2** | Timeout | `{final_stop_reason: "timeout", final_cancel_observed: ...}` | `{"final_stop_reason": "timeout", ...}` |

### Text mode vs JSON mode exit codes

| Scenario | Text mode exit | JSON mode exit | Why |
|---|---|---|---|
| Unknown subcommand | 2 (argparse default) | 1 (parse error envelope) | argparse defaults to 2; JSON mode normalizes to contract |
| Missing required arg | 2 (argparse default) | 1 (parse error envelope) | Same reason |
| Session not found | 1 | 1 | Application-level error, same in both |
| Command executed OK | 0 | 0 | Success path, identical |
| Turn-loop timeout | 2 | 2 | Identical (#161 implementation) |

**Practical rule for claws:** always pass `--output-format json`. This eliminates text-mode surprises and gives you the documented exit-code contract for every error path.
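
One way to enforce the practical rule is to normalize every command list before dispatch. The helper below is a minimal sketch, not part of claw-code: `ensure_json_output` is a hypothetical name, and handling of the `--output-format=json` spelling assumes the flag is parsed by argparse, which accepts the `=` form for long options.

```python
def ensure_json_output(command: list[str]) -> list[str]:
    """Return a copy of `command` that requests JSON output exactly once.

    Hypothetical helper: it only rewrites the argument list and never runs anything.
    """
    fixed = list(command)
    for i, arg in enumerate(fixed):
        if arg == "--output-format":
            # Flag given as a separate token: force the following value to "json".
            if i + 1 < len(fixed):
                fixed[i + 1] = "json"
            else:
                fixed.append("json")
            return fixed
        if arg.startswith("--output-format="):
            # Flag given in the "=" form (an argparse convention, assumed here).
            fixed[i] = "--output-format=json"
            return fixed
    # Flag absent: append it, matching the placement used in this document's examples.
    return [*fixed, "--output-format", "json"]


print(ensure_json_output(["claw", "load-session", "abc"]))
print(ensure_json_output(["claw", "load-session", "abc", "--output-format", "text"]))
```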

---

## One-Handler Pattern

Build a single error-recovery function that works for all 14 clawable commands:

```python
import subprocess
import json
import sys
from typing import Any


def run_claw_command(command: list[str], timeout_seconds: float = 30.0) -> dict[str, Any]:
    """
    Run a clawable claw-code command and handle errors uniformly.

    Args:
        command: Full command list, e.g. ["claw", "load-session", "id", "--output-format", "json"]
        timeout_seconds: Wall-clock timeout

    Returns:
        Parsed JSON result from stdout

    Raises:
        ClawError: Classified by the top-level kind field (parse, session_not_found, runtime, timeout, etc.)
    """
    try:
        result = subprocess.run(
            command,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        raise ClawError(
            kind='subprocess_timeout',
            message=f'Command exceeded {timeout_seconds}s wall-clock timeout',
            retryable=True,  # Caller's decision; subprocess timeout != engine timeout
        )

    # Parse JSON (valid for all success/error/timeout paths in claw-code)
    try:
        envelope = json.loads(result.stdout)
    except json.JSONDecodeError as err:
        raise ClawError(
            kind='parse_failure',
            message=f'Command output is not JSON: {err}',
            hint='Check that --output-format json is being passed',
            retryable=False,
        ) from err

    # Classify by exit code and top-level kind field (v1.0 flat envelope shape)
    # NOTE: v1.0 envelopes have error as a STRING, not a nested object.
    # The v2.0 schema (SCHEMAS.md) specifies nested error.{kind, message, ...},
    # but the current binary emits flat {error: "...", kind: "...", type: "error"}.
    # See FIX_LOCUS_164.md for the migration timeline.
    match (result.returncode, envelope.get('kind')):
        case (0, _):
            # Success
            return envelope

        case (1, 'parse'):
            # #179: argparse error, typically a typo or missing required argument
            raise ClawError(
                kind='parse',
                message=envelope.get('error', ''),  # error field is a string in v1.0
                hint=envelope.get('hint'),
                retryable=False,  # Typos don't fix themselves
            )

        case (1, 'session_not_found'):
            # Common: load-session on nonexistent ID
            raise ClawError(
                kind='session_not_found',
                message=envelope.get('error', ''),  # error field is a string in v1.0
                session_id=envelope.get('session_id'),
                retryable=False,  # Session won't appear on retry
            )

        case (1, 'filesystem'):
            # Directory missing, permission denied, disk full
            raise ClawError(
                kind='filesystem',
                message=envelope.get('error', ''),  # error field is a string in v1.0
                retryable=True,  # Might be transient (disk space, NFS flake)
            )

        case (1, 'runtime'):
            # Generic engine error (unexpected exception, malformed input, etc.)
            raise ClawError(
                kind='runtime',
                message=envelope.get('error', ''),  # error field is a string in v1.0
                retryable=envelope.get('retryable', False),  # v1.0 may or may not have this
            )

        case (1, _):
            # Catch-all for any new kind values
            raise ClawError(
                kind=envelope.get('kind', 'unknown'),
                message=envelope.get('error', ''),  # error field is a string in v1.0
                retryable=envelope.get('retryable', False),  # v1.0 may or may not have this
            )

        case (2, _):
            # Engine timeout; final_cancel_observed says whether the engine
            # had a fair chance to observe the cancel request
            cancel_observed = envelope.get('final_cancel_observed', False)
            raise ClawError(
                kind='timeout',
                message=f'Turn exceeded timeout (cancel_observed={cancel_observed})',
                cancel_observed=cancel_observed,
                retryable=True,  # Caller can retry with a fresh session
                safe_to_reuse_session=(cancel_observed is True),
            )

        case (exit_code, _):
            # Unexpected exit code
            raise ClawError(
                kind='unexpected_exit_code',
                message=f'Unexpected exit code {exit_code}',
                retryable=False,
            )


class ClawError(Exception):
    """Unified error type for claw-code commands."""

    def __init__(
        self,
        kind: str,
        message: str,
        hint: str | None = None,
        retryable: bool = False,
        cancel_observed: bool = False,
        safe_to_reuse_session: bool = False,
        session_id: str | None = None,
    ):
        self.kind = kind
        self.message = message
        self.hint = hint
        self.retryable = retryable
        self.cancel_observed = cancel_observed
        self.safe_to_reuse_session = safe_to_reuse_session
        self.session_id = session_id
        super().__init__(self.message)

    def __str__(self) -> str:
        parts = [f"{self.kind}: {self.message}"]
        if self.hint:
            parts.append(f"Hint: {self.hint}")
        if self.retryable:
            parts.append("(retryable)")
        if self.cancel_observed:
            parts.append(f"(safe_to_reuse_session={self.safe_to_reuse_session})")
        return "\n".join(parts)
```

---

## Practical Recovery Patterns

### Pattern 1: Retry on transient errors

```python
from time import sleep


def run_with_retry(
    command: list[str],
    max_attempts: int = 3,
    backoff_seconds: float = 0.5,
) -> dict:
    """Retry on transient errors (filesystem, timeout)."""
    for attempt in range(1, max_attempts + 1):
        try:
            return run_claw_command(command)
        except ClawError as err:
            if not err.retryable:
                raise  # Non-transient; fail fast

            if attempt == max_attempts:
                raise  # Last attempt; propagate

            # For kind == 'timeout', check err.safe_to_reuse_session before
            # retrying against the same session (see Pattern 2).
            print(f"Attempt {attempt} failed ({err.kind}); retrying in {backoff_seconds}s...", file=sys.stderr)
            sleep(backoff_seconds)
            backoff_seconds *= 1.5  # exponential backoff

    raise RuntimeError("Unreachable")
```

### Pattern 2: Reuse session after timeout (if safe)

```python
def run_with_timeout_recovery(
    command: list[str],
    timeout_seconds: float = 30.0,
    fallback_timeout: float = 60.0,
) -> dict:
    """
    On timeout, check cancel_observed. If True, the session is safe for retry.
    If False, the session is potentially wedged; use a fresh one.
    """
    try:
        return run_claw_command(command, timeout_seconds=timeout_seconds)
    except ClawError as err:
        if err.kind != 'timeout':
            raise

        if err.safe_to_reuse_session:
            # Engine saw the cancel signal; safe to reuse this session with a larger timeout
            print(f"Timeout observed (cancel_observed=true); retrying with {fallback_timeout}s...", file=sys.stderr)
            return run_claw_command(command, timeout_seconds=fallback_timeout)
        else:
            # Engine didn't see the cancel signal; session may be wedged
            print("Timeout not observed (cancel_observed=false); session is potentially wedged", file=sys.stderr)
            raise  # Caller should allocate a fresh session
```

### Pattern 3: Detect parse errors (typos in command-line construction)

```python
def validate_command_before_dispatch(command: list[str]) -> None:
    """
    Dry-run with --help to detect obvious syntax errors before dispatching work.

    This is cheap (no API call) and catches typos like:
    - Unknown subcommand: `claw typo-command`
    - Unknown flag: `claw bootstrap --invalid-flag`
    - Missing required argument: `claw load-session` (no session_id)
    """
    help_cmd = command + ['--help']
    try:
        result = subprocess.run(help_cmd, capture_output=True, timeout=2.0)
        if result.returncode != 0:
            print(f"Warning: {' '.join(help_cmd)} returned {result.returncode}", file=sys.stderr)
            print("(This doesn't prove the command is invalid, just that --help failed)", file=sys.stderr)
    except subprocess.TimeoutExpired:
        pass  # --help shouldn't hang, but don't block on it
```

### Pattern 4: Log and forward errors to observability

```python
import logging

logger = logging.getLogger(__name__)


def run_claw_with_logging(command: list[str]) -> dict:
    """Run command and log errors for observability."""
    try:
        result = run_claw_command(command)
        logger.info("Claw command succeeded: %s", ' '.join(command))
        return result
    except ClawError as err:
        logger.error(
            "Claw command failed",
            extra={
                'command': ' '.join(command),
                'error_kind': err.kind,
                'error_message': err.message,
                'retryable': err.retryable,
                'cancel_observed': err.cancel_observed,
            },
        )
        raise
```

---

## Error Kinds (Enumeration)

After cycles #178–#179, the complete set of `kind` values is listed below. In v1.0 envelopes `kind` is a top-level field; the v2.0 target schema nests it as `error.kind`.

| Kind | Exit Code | Meaning | Retryable | Notes |
|---|---|---|---|---|
| **parse** | 1 | Argparse error (unknown command, missing arg, invalid flag) | No | Real error message included (#179); valid choices list for discoverability |
| **session_not_found** | 1 | load-session target doesn't exist | No | session_id and directory included in envelope |
| **filesystem** | 1 | Directory missing, permission denied, disk full | Yes | Transient issues (disk space, NFS flake) can be retried |
| **runtime** | 1 | Engine error (unexpected exception, malformed input) | Depends | Top-level `retryable` field specifies when present (v1.0 may omit it; `error.retryable` in v2.0) |
| **timeout** | 2 | Engine timeout with cooperative cancellation | Yes\* | `final_cancel_observed` field signals session safety (#164); the handler above assigns this kind from exit code 2 |

\*Retry safety depends on `final_cancel_observed` (exposed as `cancel_observed` on `ClawError`):
- `cancel_observed=true` → session is safe to reuse
- `cancel_observed=false` → session may be wedged; allocate fresh one

---

## What We Did to Make This Work

### Cycle #178: Parse-Error Envelope

**Problem:** `claw nonexistent --output-format json` returned argparse help text on stderr instead of an envelope.

**Solution:** Catch argparse `SystemExit` in JSON mode and emit a structured error envelope.

**Benefit:** Claws no longer need to parse human help text to understand parse errors.

### Cycle #179: Stderr Hygiene + Real Error Message

**Problem:** Even after #178, argparse usage was leaking to stderr AND the envelope message was generic ("invalid command or argument").

**Solution:** Monkey-patch `parser.error()` in JSON mode to raise an internal exception, preserving argparse's real message verbatim. Suppress stderr entirely in JSON mode.

**Benefit:** Claws see one stream (stdout), one envelope, and real error context (e.g., "invalid choice: typo (choose from ...)") for discoverability.
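
The general technique behind #178 and #179 can be illustrated with a toy parser. This is a sketch under stated assumptions, not claw-code's actual implementation: it overrides `ArgumentParser.error` in a subclass (the source describes a monkey-patch instead), and `JsonModeParser`, `_ParseFailure`, `build_parser`, and `parse_to_envelope` are hypothetical names. The two subcommands are borrowed from this document only to make the example concrete.

```python
import argparse
from typing import Any


class _ParseFailure(Exception):
    """Internal signal that carries argparse's own error message."""


class JsonModeParser(argparse.ArgumentParser):
    def error(self, message: str):
        # The stock implementation prints usage to stderr and exits with 2.
        # Raising keeps stderr silent and preserves the message verbatim.
        raise _ParseFailure(message)


def build_parser() -> argparse.ArgumentParser:
    parser = JsonModeParser(prog="claw-sketch")
    parser.add_argument("--output-format", choices=["text", "json"], default="text")
    # Subparsers default to the parent's class, so they raise the same way.
    sub = parser.add_subparsers(dest="command", required=True)
    load = sub.add_parser("load-session")
    load.add_argument("session_id")
    sub.add_parser("list-sessions")
    return parser


def parse_to_envelope(argv: list[str]) -> tuple[int, dict[str, Any]]:
    """Return (exit_code, envelope), using the flat v1.0 shape on failure."""
    try:
        namespace = build_parser().parse_args(argv)
    except _ParseFailure as failure:
        return 1, {"error": str(failure), "kind": "parse", "type": "error"}
    return 0, {"command": namespace.command}


exit_code, envelope = parse_to_envelope(["typo", "--output-format", "json"])
print(exit_code, envelope["kind"], envelope["error"])
```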

### Contract: #164 Stage B (`cancel_observed` field)

**Problem:** Timeout results didn't signal whether the engine actually observed the cancellation request.

**Solution:** Add `cancel_observed: bool` field to timeout TurnResult; signal true iff the engine had a fair chance to observe the cancel event.

**Benefit:** Claws can decide "retry with fresh session" vs "reuse this session with larger timeout" based on a single boolean.
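
As a small illustration of that single-boolean decision, the sketch below reads the flag from a timeout envelope. It assumes the JSON field is named `final_cancel_observed`, as in the Quick Reference table; `next_step_after_timeout` and its two return labels are hypothetical.

```python
from typing import Any, Literal


def next_step_after_timeout(
    envelope: dict[str, Any],
) -> Literal["reuse_session", "fresh_session"]:
    """Map a timeout envelope to the recovery choice described above."""
    # Only an explicit boolean True counts as observed; a missing or
    # non-boolean value is treated as "possibly wedged".
    if envelope.get("final_cancel_observed") is True:
        return "reuse_session"
    return "fresh_session"


print(next_step_after_timeout({"final_stop_reason": "timeout", "final_cancel_observed": True}))
print(next_step_after_timeout({"final_stop_reason": "timeout"}))
```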

---

## Common Mistakes to Avoid

❌ **Don't parse exit code alone**

```python
# BAD: Exit code 1 could mean parse error, not-found, filesystem, or runtime
if result.returncode == 1:
    # What should I do? Unclear.
    pass
```

✅ **Do dispatch on the top-level `kind`**

```python
# GOOD: kind tells you exactly how to recover
# (top-level in v1.0; envelope['error'] is a string, so envelope['error']['kind'] would crash)
match envelope.get('kind'):
    case 'parse': ...
    case 'session_not_found': ...
    case 'filesystem': ...
```

---

❌ **Don't capture both stdout and stderr and assume they're separate concerns**

```python
# BAD (pre-#179): Capture stdout + stderr, then parse stdout as JSON
# But stderr might contain argparse noise that you have to string-match
result = subprocess.run(..., capture_output=True, text=True)
if "invalid choice" in result.stderr:
    pass  # ... custom error handling
```

✅ **Do rely on stdout alone in JSON mode**

```python
# GOOD (post-#179): In JSON mode, stderr is guaranteed silent
# Envelope on stdout is your single source of truth
result = subprocess.run(..., capture_output=True, text=True)
envelope = json.loads(result.stdout)  # Always valid in JSON mode
```

---

❌ **Don't retry on parse errors**

```python
# BAD: Typos don't fix themselves
error_kind = envelope.get('kind')  # top-level in v1.0
if error_kind == 'parse':
    retry()  # Will fail again
```

✅ **Do check retryable before retrying**

```python
# GOOD: Let the envelope tell you
# (retryable is top-level in v1.0 and may be absent; error is a string)
if envelope.get('retryable', False):
    retry()
else:
    raise ClawError(kind=envelope.get('kind', 'unknown'), message=envelope.get('error', ''))
```

---

❌ **Don't reuse a session after timeout without checking cancel_observed**

```python
# BAD: Reuse session = potential wedge
result = run_claw_command(...)  # times out
# ... later, reuse same session
result = run_claw_command(...)  # might be stuck in the previous turn
```

✅ **Do allocate a fresh session if cancel_observed=false**

```python
# GOOD: Allocate fresh session if wedge is suspected
# (create_session and run_claw_command_in_session are placeholder helpers)
try:
    result = run_claw_command(...)
except ClawError as err:
    if err.kind != 'timeout':
        raise  # cancel_observed is only meaningful for timeouts
    if err.cancel_observed:
        # Safe to reuse
        result = run_claw_command(...)
    else:
        # Allocate fresh session
        fresh_session = create_session()
        result = run_claw_command_in_session(fresh_session, ...)
```

---

## Testing Your Error Handler

```python
def test_error_handler_parse_error():
    """Verify parse errors are caught and classified."""
    try:
        run_claw_command(['claw', 'nonexistent', '--output-format', 'json'])
        assert False, "Should have raised ClawError"
    except ClawError as err:
        assert err.kind == 'parse'
        assert 'invalid choice' in err.message.lower()
        assert err.retryable is False


def test_error_handler_timeout_safe():
    """Verify timeout with cancel_observed=true marks session as safe."""
    # Requires a live claw-code server; mock this test
    try:
        run_claw_command(
            # --output-format json is required for the documented exit-code contract
            ['claw', 'turn-loop', 'x', '--timeout-seconds', '0.0001', '--output-format', 'json'],
            timeout_seconds=2.0,
        )
        assert False, "Should have raised ClawError"
    except ClawError as err:
        assert err.kind == 'timeout'
        assert err.safe_to_reuse_session is True  # cancel_observed=true


def test_error_handler_not_found():
    """Verify session_not_found is clearly classified."""
    try:
        run_claw_command(['claw', 'load-session', 'nonexistent', '--output-format', 'json'])
        assert False, "Should have raised ClawError"
    except ClawError as err:
        assert err.kind == 'session_not_found'
        assert err.retryable is False
```
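
For the timeout test, which is marked as needing a mock, one possible approach is to patch `subprocess.run` with `unittest.mock` so that it returns a canned `CompletedProcess`. The sketch below shows only the patching mechanics and is self-contained; `fake_completed` is a hypothetical helper, and the envelope fields are the ones listed in the Quick Reference table. In a real test, the handler under test would be called inside the `with` block in place of the direct `subprocess.run` call.

```python
import json
import subprocess
from unittest import mock


def fake_completed(returncode: int, envelope: dict) -> subprocess.CompletedProcess:
    """Build the object subprocess.run would return for a given envelope."""
    return subprocess.CompletedProcess(
        args=["claw"],
        returncode=returncode,
        stdout=json.dumps(envelope),
        stderr="",
    )


timeout_envelope = {"final_stop_reason": "timeout", "final_cancel_observed": True}

with mock.patch("subprocess.run", return_value=fake_completed(2, timeout_envelope)) as fake_run:
    # Any code that looks up subprocess.run inside this block gets the canned result,
    # so no claw binary or live server is needed.
    result = subprocess.run(
        ["claw", "turn-loop", "x", "--output-format", "json"],
        capture_output=True,
        text=True,
        timeout=2.0,
    )

observed = json.loads(result.stdout)["final_cancel_observed"]
print(result.returncode, observed, fake_run.call_count)
```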

---

## Appendix A: v1.0 Error Envelope (Current Binary)

The actual shape emitted by the current binary (v1.0, flat):

```json
{
  "error": "session 'nonexistent' not found in .claw/sessions",
  "hint": "use 'list-sessions' to see available sessions",
  "kind": "session_not_found",
  "type": "error"
}
```

**Key differences from v2.0 schema (below):**
- `error` field is a **string**, not a structured object
- `kind` is at **top-level**, not nested under `error`
- Missing: `timestamp`, `command`, `exit_code`, `output_format`, `schema_version`
- Extra: `type: "error"` field (not in schema)

## Appendix B: SCHEMAS.md Target Shape (v2.0)

For reference, the target JSON error envelope shape (SCHEMAS.md, v2.0):

```json
{
  "timestamp": "2026-04-22T11:40:00Z",
  "command": "load-session",
  "exit_code": 1,
  "output_format": "json",
  "schema_version": "2.0",
  "error": {
    "kind": "session_not_found",
    "operation": "session_store.load_session",
    "target": "nonexistent",
    "retryable": false,
    "message": "session 'nonexistent' not found in .port_sessions",
    "hint": "use 'list-sessions' to see available sessions"
  }
}
```

**This is the target schema after [`FIX_LOCUS_164`](./FIX_LOCUS_164.md) is implemented.** The migration plan includes a dual-mode `--envelope-version=2.0` flag in Phase 1, default version bump in Phase 2, and deprecation in Phase 3. For now, code against v1.0 (Appendix A).

---

## Summary

After cycles #178–#179, **one error handler works for all 14 clawable commands.** No more string-matching, no more stderr parsing, no more exit-code ambiguity. Just parse the JSON, check the top-level `kind` (nested as `error.kind` only in the v2.0 target schema), and decide: retry, escalate, or reuse session (if safe).

The handler itself is ~80 lines of Python; the patterns are reusable across any language that can speak JSON.