mirror of
https://github.com/ultraworkers/claw-code.git
synced 2026-04-30 16:55:49 +08:00
1050 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
2f1fe0416d | roadmap: #194 filed — prunable-worktree accumulation, no doctor visibility or auto-prune lifecycle | ||
|
|
ac594a9626 |
doc: add Phase 1 kickoff — execution plan for 6-bundle priority queue
Comprehensive Phase 1 strategy document prepared at end of probe cycle #108.
Contents:
- Phase 0 recap (freeze, tests, pinpoints, doctrines)
- What Phase 1 will do (6 bundles + independents, all gaebal-gajae reviewed)
- Concrete next steps (branch names, expected commits/tests per bundle)
- Priority 1: Error envelope contract drift (#181/#183), foundation
- Priority 2: CLI contract hygiene (#184/#185), extensions
- Priority 3: Classifier sweep 4-verb (#186/#187/#189/#192), cleanup
- Priority 4: USAGE.md audit (#180), doc prerequisite
- Priority 5: Dump-manifests help (#188), doc-truth probe-flow
- Priority 6+: Independents (#190 design, #191 filesystem, others)
- Hypothesis validation (multi-flag verbs = 3-4 gaps, simple verbs = 0-1)
- Testing strategy + success criteria
All 5 priority bundles are reviewer-blessed (gaebal-gajae validation passes).
Doc-only. No code changes. Freeze held.
|
||
|
|
571855e053 |
doc(review-guide): embed gaebal-gajae authoritative state framing
Per gaebal-gajae cycle #105 validation pass.
One-liner state summary now appears at top (tone-setter for reviewers) and bottom (reinforced recap):
'Phase 0 is now frozen, reviewer-mapped, and merge-ready; Phase 1 remains intentionally deferred behind the locked priority order.'
This is the single authoritative sentence that captures branch state. Use it for PR titles, review summaries, and Phase 1 handoff notes.
Why this framing matters (per gaebal-gajae evaluation):
- 'frozen' signals no scope creep
- 'reviewer-mapped' signals audit trail exists (this guide)
- 'merge-ready' signals gates are passed
- 'intentionally deferred' signals Phase 1 absence is by design, not omission
- 'locked priority order' signals sequencing is validated (cycle #104-#105)
Review guide now doubles as merge-enabler: reviewers parse branch state in one sentence, then drill into commits as needed.
Doc-only. No code changes. Freeze preserved.
|
||
|
|
f068aadd47 |
doc: add Phase 0 + dogfood bundle review guide for cycles #104-#105
Pre-merge documentation for reviewers. Summarizes:
- What Phase 0 tasks deliver (JSON envelope contracts, regression locks)
- Why dogfood cycles #99-#105 matter (validated methodology, 15 filed pinpoints)
- Commit-by-commit navigation for the 30-commit frozen bundle
- What lands vs what's deferred
- Integration notes for Phase 1 planning
- Known limitations + follow-ups
This is doc-only, no code changes. Serves as audit trail and reviewer reference without adding scope to the frozen feature branch.
|
||
|
|
a57aba3abd |
docs(#99): checkpoint artifact — bundle status and Phase 1 readiness
Cycle #99 (10-min dogfood cycle). No new pinpoint filed. Instead, documented current branch state via checkpoint artifact.
Branch: feat/jobdori-168c-emission-routing @ 15 commits across 5 axes
- Phase 0 (emission): 4 commits, complete
- Discoverability: 4 commits, complete
- Typed-error: 6 commits, complete
- Doc-truthfulness: 2 commits, complete
- Deferred: #141 (list-sessions --help routing, parser scope)
Tests: 227/227 pass, zero regressions, steady 11-cycle run
Checkpoint summarizes:
1. Work axes breakdown + pinpoint mapping
2. Cycle velocity (11 cycles, ~90 min, 6 pinpoints closed)
3. Branch deliverables (4 consumer-facing value propositions)
4. Readiness assessment (ready for review, awaiting signal)
5. Doctrine observations (probe pivot works, regression guards stick)
No code changes; doc-only. This checkpoint bridges cycles #89-#99 and marks the branch as review-ready pending coordination signal.
|
||
|
|
a8b655c813 |
docs(#172): correct action-field inventory claim (4 → 3 verbs) + regression guard
Pinpoint #172: SCHEMAS.md v1.5 Emission Baseline documentation inaccuracy discovered during cycle #98 probe.
The Phase 1 normalization targets section claimed:
"unify where `action` field appears (only in 4 inventory verbs)"
But in reality only 3 inventory verbs have `action`:
- mcp
- skills
- agents
list-sessions uses `command` instead (the documented 1-of-13 deviation already captured elsewhere in v1.5 baseline).
This is a doc-truthfulness issue (same family as cycles #76, #79, #82). Active misdocumentation leads downstream consumers to assume 4-verb coverage when building adapters/dispatchers.
Changes:
1. SCHEMAS.md: 'only in 4 inventory verbs' → 'only in 3 inventory verbs: mcp, skills, agents'
2. Added regression test `v1_5_action_field_appears_only_in_3_inventory_verbs_172`
  - Asserts mcp/skills/agents HAVE action field
  - Asserts help/version/doctor/status/sandbox/system-prompt/bootstrap-plan/list-sessions do NOT have action field
  - Forces SCHEMAS.md + binary to stay synchronized
Test added:
- `v1_5_action_field_appears_only_in_3_inventory_verbs_172` (8 negative cases + 3 positive cases)
Tests: 227/227 pass (+1 from #172).
Related: #155 (doc parity family), #168c (emission baseline).
Doc-truthfulness family: #76, #79, #82, #172.
|
||
|
|
c816e2b8c1 |
fix(#171): classify unexpected extra arguments errors as cli_parse
Pinpoint #171: typed-error classifier gap discovered during #141 probe cycle #97.
`claw list-sessions --help` emits:
error: unexpected extra arguments after `claw list-sessions`: --help
This format is used by multiple verbs that reject trailing positional args:
- list-sessions
- plugins (subcommands)
- config (subcommands)
- diff
- load-session
Before fix:
{"error": "unexpected extra arguments after `claw list-sessions`: --help", "hint": null, "kind": "unknown", "type": "error"}
After fix:
{"error": "unexpected extra arguments after `claw list-sessions`: --help", "hint": "Run `claw --help` for usage.", "kind": "cli_parse", "type": "error"}
The pattern `unexpected extra arguments after \`claw` is specific enough that it won't hijack generic prose mentioning "unexpected extra arguments" in other contexts (sanity test included).
Side benefit: like #169/#170, correctly classified cli_parse errors now auto-trigger the #247 hint synthesizer.
Related #141 gap not yet closed: `claw list-sessions --help` still errors instead of showing help (requires separate parser fix to recognize --help as a distinct path). This classifier fix at least makes the error surface typed correctly so consumers can distinguish "parse failure" from "unknown" and potentially retry without the --help flag.
Test added:
- `classify_error_kind_covers_unexpected_extra_args_171` (4 positive cases + 1 sanity guard)
Tests: 226/226 pass (+1 from #171).
Typed-error family: #121, #127, #129, #130, #164, #169, #170, #247.
|
||
|
|
57ba140ff8 |
docs(#153): add binary PATH installation instructions and verification steps
Pinpoint #153 closure. USAGE.md was missing practical instructions for:
1. Adding the claw binary to PATH (symlink vs export PATH)
2. Verifying the install works (version, doctor, --help)
3. Troubleshooting PATH issues (which, echo $PATH, ls -la)
New subsections:
- "Add binary to PATH" with two common options
- "Verify install" with post-install health checks
- Troubleshooting guide for common failures
Target audience: developers building from source who want to run `claw` from any directory without typing `./rust/target/debug/claw`.
Discovered during cycle #96 dogfood (10-min reminder cycle).
Tests: 225/225 still pass (doc-only change).
|
||
|
|
f3078d370a |
fix(#170): classify 4 additional flag-value/slash-command errors as cli_parse / slash_command_requires_repl
Pinpoint #170: Extended typed-error classifier coverage gap discovered during dogfood probe 2026-04-23 07:30 Seoul (cycle #95).
The #169 comment claimed to cover `--permission-mode bogus` via the `unsupported value for --` pattern, but the actual `parse_permission_mode_arg` message format is `unsupported permission mode 'bogus'` (NO `for --` prefix). Doc-vs-reality lie in the #169 fix itself, fixed here.
Four classifier gaps closed:
1. `unsupported permission mode '<value>'` → cli_parse (from: `parse_permission_mode_arg`)
2. `invalid value for --reasoning-effort: '<value>'; must be ...` → cli_parse (from: `--reasoning-effort` validator)
3. `model string cannot be empty` → cli_parse (from: empty --model rejection)
4. `slash command /<name> is interactive-only. Start \`claw\` ...` → slash_command_requires_repl (NEW kind, more specific than cli_parse)
The fourth pattern gets its own kind (`slash_command_requires_repl`) because it's a command-mode misuse, not a parse error. Downstream consumers can programmatically offer REPL-launch guidance.
Side benefit: like #169, the correctly classified cli_parse errors now auto-trigger the #247 hint synthesizer ("Run `claw --help` for usage.").
Test added:
- `classify_error_kind_covers_flag_value_parse_errors_170_extended` (4 positive cases + 2 sanity guards)
Tests: 225/225 pass (+1 from #170).
Typed-error family: #121, #127, #129, #130, #164, #169, #247.
Discovered via systematic probe angle: 'error message pattern audit' (grep each error emission for pattern, confirm classifier matches).
|
||
|
|
ef3a5d8462 |
fix(#169): classify invalid/missing CLI flag values as cli_parse
Pinpoint #169: typed-error classifier gap discovered during dogfood probe.
`claw --output-format json --output-format xml doctor` was emitting:
{"error": "unsupported value for --output-format: xml ...", "hint": null, "kind": "unknown", "type": "error"}
After fix:
{"error": "unsupported value for --output-format: xml ...", "hint": "Run `claw --help` for usage.", "kind": "cli_parse", "type": "error"}
The change adds two new classifier branches to `classify_error_kind`:
1. `unsupported value for --` → cli_parse
2. `missing value for --` → cli_parse
Covers all `CliOutputFormat::parse` / `parse_permission_mode_arg` rejections and any future flag-value validation messages using the same pattern.
Side benefit: the #247 hint synthesizer ("Run `claw --help` for usage.") now triggers automatically because the error is now correctly classified as cli_parse. Consumers get both correct kind AND helpful hint.
Test added:
- `classify_error_kind_covers_flag_value_parse_errors_169` (4 positive + 1 sanity case)
Tests: 224/224 pass (+1 from #169).
Discovered during dogfood probe 2026-04-23 07:00 Seoul, cycle #94.
Refs: #169, typed-error family (#121, #127, #129, #130, #164, #247)
|
||
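The substring patterns named in the classifier commits above (#169, #170, #171) can be illustrated with a minimal sketch. This is not the repository's actual `classify_error_kind`; the function bodies, the match order, and the `synthesize_hint` helper name are assumptions made for illustration, and only the message patterns, kind names, and hint text are taken from the commit messages.

```rust
/// Hypothetical, simplified stand-in for the real `classify_error_kind`.
/// Only the substring patterns and kind names come from the commit messages;
/// the structure and ordering here are assumptions.
fn classify_error_kind(message: &str) -> &'static str {
    // #170: command-mode misuse gets its own, more specific kind.
    if message.contains("slash command /") && message.contains("is interactive-only") {
        return "slash_command_requires_repl";
    }
    // #169, #170, #171: flag-value and trailing-argument rejections.
    let cli_parse_patterns = [
        "unsupported value for --",
        "missing value for --",
        "unsupported permission mode '",
        "invalid value for --reasoning-effort",
        "model string cannot be empty",
        // The trailing "`claw" keeps generic prose from matching (#171).
        "unexpected extra arguments after `claw",
    ];
    if cli_parse_patterns.iter().any(|p| message.contains(p)) {
        return "cli_parse";
    }
    "unknown"
}

/// Hypothetical stand-in for the #247 hint synthesizer: only `cli_parse`
/// errors receive the usage hint quoted in the commit messages.
fn synthesize_hint(kind: &str) -> Option<&'static str> {
    match kind {
        "cli_parse" => Some("Run `claw --help` for usage."),
        _ => None,
    }
}
```

Calling `classify_error_kind` on the #171 message and passing the result to `synthesize_hint` yields the `kind` and `hint` fields shown in the "After fix" envelopes, while a sentence that merely mentions "unexpected extra arguments" stays `unknown`.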
|
|
37d9233c6b |
docs(#155): add missing slash command documentation to USAGE.md
Pinpoint #155: USAGE.md was missing documentation for three interactive commands that appear in `claw --help`:
- /ultraplan [task]
- /teleport <symbol-or-path>
- /bughunter [scope]
Also adds full documentation for other underdocumented commands:
- /commit, /pr, /issue, /diff, /plugin, /agents
Converts inline sentence list into structured section 'Interactive slash commands (inside the REPL)' with brief descriptions for each command.
Closes #155 gap: discovered during dogfood probing of help/USAGE parity.
No code changes. Pure documentation update.
|
||
|
|
1eb68fb62c |
test(#168c Task 4): add v1.5 emission baseline shape parity guard
Phase 0 Task 4 of the JSON Productization Program: CI shape parity guard.
This test locks the v1.5 emission baseline (documented in SCHEMAS.md § v1.5
Emission Baseline) so any future PR that introduces shape drift in a documented
verb fails this test at PR time.
Complements Task 2 (no-silent guarantee) by asserting SPECIFIC top-level key
sets, not just 'stdout is non-empty valid JSON'. If a verb adds/removes a
top-level field, this test fails with a clear error message pointing to
SCHEMAS.md § v1.5 Emission Baseline for update guidance.
Coverage:
- 8 success-path verbs with locked shape (help, version, doctor, skills,
agents, system-prompt, bootstrap-plan, list-sessions)
- 2 error-path cases with locked error envelope shape (prompt-no-arg, doctor --foo)
Key enforcement rules:
- Success envelope: exact key set match per verb
- Error envelope: {error, hint, kind, type} (4 keys, all verbs)
- list-sessions deliberately kept as {command, sessions} (Phase 1 target)
Test design intent:
- Locks CURRENT (possibly imperfect) shape, NOT target shape
- Forces PR authors to update both code + SCHEMAS.md + test together
- Makes Phase 1 shape normalization PRs visible: 'update this test'
Phase 0 now COMPLETE:
- Task 1 ✅ Stream routing fix (cycle #89)
- Task 2 ✅ No-silent guarantee (cycle #90)
- Task 3 ✅ Per-verb emission inventory SCHEMAS.md (cycle #91)
- Task 4 ✅ CI shape parity guard (this cycle)
Tests: 18 output_format_contract tests all pass (+1 from Task 4).
v1.5 emission baseline now locked by code + tests + docs.
Refs: #168c, cycle #92, Phase 0 Task 4 (final)
|
||
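The "exact key set match per verb" rule from the Task 4 commit above can be sketched as a set comparison. The helper name `check_exact_keys` and its signature are hypothetical and not taken from the repository's test suite; the key sets used in the usage note are the ones the commit message lists.

```rust
use std::collections::BTreeSet;

/// Hypothetical helper: compares the top-level keys a verb actually emitted
/// against the locked baseline and reports drift in both directions.
fn check_exact_keys(verb: &str, actual: &[&str], expected: &[&str]) -> Result<(), String> {
    let actual: BTreeSet<&str> = actual.iter().copied().collect();
    let expected: BTreeSet<&str> = expected.iter().copied().collect();
    if actual == expected {
        return Ok(());
    }
    // Keys present in the output but absent from the baseline.
    let added: Vec<&str> = actual.difference(&expected).copied().collect();
    // Keys the baseline requires but the output lacks.
    let removed: Vec<&str> = expected.difference(&actual).copied().collect();
    Err(format!(
        "shape drift in `{}`: unexpected keys {:?}, missing keys {:?}; see SCHEMAS.md § v1.5 Emission Baseline",
        verb, added, removed
    ))
}
```

For example, `check_exact_keys("list-sessions", &["sessions", "command"], &["command", "sessions"])` passes regardless of key order, while an error envelope missing `hint` fails with a message naming the missing key.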
|
|
63547679ea |
docs(#168c Task 3): add v1.5 Emission Baseline per-verb shape catalog to SCHEMAS.md
Phase 0 Task 3 of the JSON Productization Program: per-verb emission inventory.
Documents the actual binary behavior as of v1.5 (post-#168c fix, pre-Phase 1
shape normalization). Reference artifact for consumers building against v1.5,
not a target schema.
Catalog contents:
- 12 verbs using 'kind' field (help, version, doctor, mcp, skills, agents,
sandbox, status, system-prompt, bootstrap-plan, export, acp)
- 1 verb using 'command' field (list-sessions) — Phase 1 normalization target
- 3 error-only verbs in test env (bootstrap, dump-manifests, state)
- Standard error envelope: {error, hint, kind, type} flat shape
- 9 machine-readable error kinds from classify_error_kind
Emission contract locked by:
- Task 1 (#168c routing fix, cycle #89)
- Task 2 (no-silent guarantee test, cycle #90)
- This catalog (human-readable reference, cycle #91)
Consumer guidance + Phase 1 normalization targets documented.
Phase 0 progress:
- Task 1 Stream routing fix
- Task 2 No-silent guarantee test
- Task 3 Per-verb emission inventory
- Task 4 pending: CI parity test
Refs: #168c, cycle #91, Phase 0 Task 3
|
||
|
|
c066f0ccea |
test(#168c Task 2): add no-silent emission contract guard for 14 verbs
Phase 0 Task 2 of the JSON Productization Program: no-silent guarantee.
The emission contract under --output-format json requires:
1. Success (exit 0) must produce non-empty stdout with valid JSON
2. Failure (exit != 0) must still emit JSON envelope on stdout (#168c)
3. Silent success (exit 0 + empty stdout) is forbidden
This test iterates 12 safe-success verbs + 2 error cases, asserting each produces valid JSON on stdout. Any verb that regresses to silent emission or wrong-stream routing will fail this test.
Covered verbs:
- Success: help, version, list-sessions, doctor, mcp, skills, agents, sandbox, status, system-prompt, bootstrap-plan, acp
- Error: prompt (no arg), doctor --foo
Phase 0 progress:
- Task 1 ✅ Stream routing (#168c fix)
- Task 2 ✅ No-silent guarantee (this test)
- Task 3 ⏳ Per-verb emission inventory (SCHEMAS.md)
- Task 4 ⏳ CI parity test (regression prevention)
Tests: 17 output_format_contract tests all pass (+1 from Task 2).
Refs: #168c, cycle #90, Phase 0 Task 2
|
||
|
|
5f84d91348 |
fix(#168c): emit error envelopes to stdout under --output-format json
Under --output-format json, error envelopes were emitted to stderr via eprintln!. This violated the emission contract: stdout should carry the contractual envelope (success OR error); stderr is reserved for non-contractual diagnostics.
Cycle #87 controlled matrix audit found bootstrap/dump-manifests/state exhibited this pattern (exit 1, stdout 0 bytes, stderr N bytes under --output-format json).
Fix: change eprintln! to println! for the JSON error envelope path in main(). Text mode continues to route errors to stderr (conventional).
Verification:
- bootstrap --output-format json: stdout now carries envelope, exit 1
- dump-manifests --output-format json: stdout now carries envelope, exit 1
- Text mode: errors still on stderr with [error-kind: ...] prefix (no regression)
Tests:
- Updated assert_json_error_envelope helper to read from stdout (was stderr)
- Added error_envelope_emitted_to_stdout_under_output_format_json_168c regression test that asserts envelope on stdout + non-JSON on stderr
- All 16 output_format_contract tests pass
Phase 0 Task 1 complete: emission routing fixed across all error-path verbs. Phase 0 Task 2 (no-silent CI guarantee) remains.
Refs: #168c (cycle #87 filing), cycle #88 emission contract framing
|
||
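The routing rule from the #168c commit above (JSON error envelopes on stdout, text-mode errors on stderr) can be sketched as a pure function that returns the target stream together with the rendered body. The names `OutputFormat`, `Stream`, `render_error`, and `escape_json` are hypothetical, and the exact text-mode layout after the `[error-kind: ...]` prefix is an assumption; the real fix changes `eprintln!` to `println!` inside `main()`.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum OutputFormat {
    Text,
    Json,
}

#[derive(Debug, PartialEq)]
enum Stream {
    Stdout,
    Stderr,
}

/// Minimal JSON string escaping, sufficient for this sketch.
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Hypothetical helper: decides where an error goes and how it is rendered.
/// JSON mode: flat {error, hint, kind, type} envelope on stdout.
/// Text mode: prefixed message on stderr (layout assumed).
fn render_error(format: OutputFormat, kind: &str, message: &str, hint: Option<&str>) -> (Stream, String) {
    match format {
        OutputFormat::Json => {
            let hint_json = match hint {
                Some(h) => format!("\"{}\"", escape_json(h)),
                None => "null".to_string(),
            };
            let body = format!(
                "{{\"error\": \"{}\", \"hint\": {}, \"kind\": \"{}\", \"type\": \"error\"}}",
                escape_json(message),
                hint_json,
                escape_json(kind)
            );
            (Stream::Stdout, body)
        }
        OutputFormat::Text => (
            Stream::Stderr,
            format!("[error-kind: {}] error: {}", kind, message),
        ),
    }
}
```

A caller would then `println!` the body when the stream is `Stream::Stdout` and `eprintln!` it otherwise, keeping the contractual envelope on stdout in JSON mode.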
|
|
70ecd9bf6c |
locus(#164): add Phase 0 + v1.5 baseline; revised from 2-phase to 4-phase migration (cycle #85)
Fresh-dogfood validation (cycle #84, #168) proved the original locus premise was underspecified. v1.0 was never a coherent contract: each verb has a bespoke JSON shape with no coordination, and bootstrap JSON is completely broken (silent failure, exit 0 no output).
Revised migration plan:
- Phase 0 (NEW): Emergency fix for silent failures (#168 bootstrap JSON)
- Phase 1 (NEW): v1.5 baseline, minimal JSON invariants across all 14 verbs
  - Every command emits valid JSON with --output-format json
  - Every command has top-level 'kind' field for verb ID
  - Every error envelope follows {error, hint, kind, type}
- Phase 2 (renamed from Phase 1): v2.0 wrapped envelope (opt-in)
- Phase 3 (renamed from Phase 2): v2.0 default
- Phase 4 (renamed from Phase 3): v1.0/v1.5 deprecation
Rationale:
- Can't migrate from 'incoherent' to 'coherent v2.0' in one jump
- Consumers need stable target (v1.5) to transition from
- Silent failures must be fixed BEFORE migration (consumers can't detect breakage)
Effort revision: ~9 dev-days (Phase 0: 1 + Phase 1: 3 + Phase 2: 5) vs original ~6 dev-days for direct v1.0→v2.0 (which would have failed).
Doctrine implication: Fresh-dogfood principle (#9, cycle #73) prevented a multi-day migration from hitting an unsolvable baseline problem. Evidence-backed mid-design correction.
|
||
|
|
67bf29d5a4 |
docs: SCHEMAS.md — critical P0 fix: mark as target v2.0, not current v1.0 (#166 filed+closed)
SCHEMAS.md was presenting the target v2.0 schema as the current binary contract. This is the source of truth document, so the misdocumentation propagated to every downstream doc (USAGE.md, ERROR_HANDLING.md, CLAUDE.md all inherited the false premise that v1.0 includes timestamp/command/exit_code/etc).
Fixed with:
1. CRITICAL header at top: marks entire doc as v2.0 target, not v1.0 reality
2. 'TARGET v2.0 SCHEMA' headers on Common Fields section
3. Comprehensive Appendix: v1.0 actual shape + migration timeline + v1.0 code example
4. Links to FIX_LOCUS_164.md + ERROR_HANDLING.md for v1.0 reality
5. FAQ: clarifies the version mismatch and when v2.0 ships
This closes the fourth P0 doc-truthfulness instance (4/4 in family):
- #78 USAGE.md: active misdocumentation (fixed cycle #78)
- #79 ERROR_HANDLING.md: copy-paste trap (fixed cycle #79)
- #165 CLAUDE.md: boundary collapse (fixed cycle #81)
- #166 SCHEMAS.md: aspirational source doc (fixed cycle #82)
Pattern is now crystallized: SCHEMAS.md was the aspirational source; three downstream docs (USAGE, ERROR_HANDLING, CLAUDE) inherited the false v2.0-as-v1.0 claim. Fixing the source (SCHEMAS.md) eliminates the root cause for all four.
|
||
|
|
e5703b8b74 |
docs: CLAUDE.md — fix target/current boundary collapse (#165 Option A)
CLAUDE.md was documenting the v2.0 target schema as if it were current binary behavior. This misled validator/harness implementers into assuming the Rust binary emits timestamp, command, exit_code, output_format, schema_version fields when it doesn't.
Fixed by explicitly marking the boundary:
1. SCHEMAS.md section: now clearly labels 'target v2.0 design' and lists both v1.0 (actual binary) and v2.0 (target) field shapes
2. Clawable commands requirements: now explicitly separates v1.0 (current) and v2.0 (post-FIX_LOCUS_164) envelope requirements
3. Added inline migration note pointing to FIX_LOCUS_164.md
This closes #165 as the third P0 doc-truthfulness fix (Option A: preserve current truth, add v2.0 target as separate labeled section).
P0 doc-truthfulness family pattern (all three related to #164 envelope divergence):
- #78 USAGE.md: active misdocumentation (fixed cycle #78)
- #79 ERROR_HANDLING.md: copy-paste trap (fixed cycle #79)
- #165 CLAUDE.md: target/current boundary collapse (fixed cycle #81)
|
||
|
|
e50ef857cf |
docs: ERROR_HANDLING.md — fix code examples to match v1.0 envelope (flat shape)
The Python code examples were accessing nested error.kind like envelope['error']['kind'],
but v1.0 emits flat envelopes with error as a STRING and kind at top-level.
Updated:
- Table header: now shows actual v1.0 shape {error: "...", kind: "...", type: "error"}
- match statement: switched from envelope.get('error',{}).get('kind') to envelope.get('kind')
- All ClawError raises: changed from envelope['error']['message'] to envelope.get('error','')
because error field is a STRING in v1.0, not a nested object
- Added inline comments on every error case noting v1.0 vs v2.0 difference
- Appendix: split into v1.0 (actual/current) and v2.0 (target after FIX_LOCUS_164)
The code examples now work correctly against the actual binary.
This was active misdocumentation (P0 severity) — the Python examples would crash
if a consumer tried to use them.
|
||
|
|
f7a3de33d2 |
docs: USAGE.md — clarify JSON v1.0 envelope shape + migration notice for #164
The JSON output section was misleading: it claimed the binary emits exit_code, command, timestamp, output_format, schema_version, and nested error objects. The binary actually emits v1.0 flat shape (kind at top-level, error as string, no common metadata fields).
Updated section:
- Documents actual v1.0 success and error envelope shapes
- Lists known issues (missing fields, overloaded kind, flat error)
- Shows how to dispatch on v1.0 (check type=='error' before reading kind)
- Warns users NOT to rely on kind alone
- Links to FIX_LOCUS_164.md for migration plan
- Explains Phase 1/2/3 timeline for v2.0 adoption
This is a doc-only fix that makes USAGE.md truthful about the current behavior while preparing users for the coming schema migration.
|
||
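The v1.0 dispatch rule described in the USAGE.md and ERROR_HANDLING.md commits above (check `type == "error"` before reading `kind`, and treat `error` as a plain string) can be sketched as follows. The `FlatEnvelope` and `Outcome` types are hypothetical, and the envelope is modelled as an already-parsed struct so the sketch needs no JSON library; a real consumer would parse the binary's stdout first.

```rust
/// Hypothetical, already-parsed view of a flat v1.0 envelope.
/// `kind` is overloaded: a verb ID on success, an error class on failure.
struct FlatEnvelope<'a> {
    type_field: Option<&'a str>,
    kind: Option<&'a str>,
    error: Option<&'a str>,
}

#[derive(Debug, PartialEq)]
enum Outcome<'a> {
    Success { verb: &'a str },
    Failure { error_kind: &'a str, message: &'a str },
}

/// Dispatch on `type` first; relying on `kind` alone would confuse a
/// success envelope's verb ID with an error class.
fn dispatch<'a>(envelope: &FlatEnvelope<'a>) -> Outcome<'a> {
    if envelope.type_field == Some("error") {
        Outcome::Failure {
            error_kind: envelope.kind.unwrap_or("unknown"),
            // In v1.0 `error` is a string, not a nested object.
            message: envelope.error.unwrap_or(""),
        }
    } else {
        Outcome::Success {
            verb: envelope.kind.unwrap_or(""),
        }
    }
}
```

The defaults used when a field is absent (`"unknown"` and the empty string) are choices made for this sketch, not documented behavior.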
|
|
17741ac8ed |
docs: add FIX_LOCUS_164.md — JSON envelope contract migration strategy
Cycle #77 deliverable. Escalates #164 from pinpoint to fix-locus cycle.
Documents:
- 100% divergence across all 14 JSON-emitting verbs (not a partial drift)
- Two envelope shapes: current flat vs. documented nested
- Phased migration: dual-mode → default bump → deprecation (3 phases)
- Shared wrapper helper pattern (json_envelope.rs)
- Per-verb migration template (before/after code)
- Error classification remapping table (cli_parse → parse, etc.)
- 6 acceptance criteria + 3 risk categories
- Rollout timeline: Phase 1 ~6 dev-days, v3.0 cutoff at ~8 months
Ready for author review + pilot implementation decision (which 3 verbs lead).
|
||
|
|
3a71a18ec2 |
fix(#161): resolve actual HEAD path in git worktrees for correct Git SHA in build metadata
Problem: In git worktrees, .git is a pointer file (not a directory), so cargo's rerun-if-changed=.git/HEAD never triggers when commits are made. This causes claw version to report a stale SHA after new commits.
Solution: Add resolve_git_head_path() helper that detects worktree mode:
- If .git is a file: parse gitdir pointer, watch <gitdir>/HEAD
- If .git is a directory: watch .git/HEAD (regular repo)
This ensures build.rs invalidates on each commit, making version output truthful.
Verification: Binary built in worktree now reports correct SHA after commits (before: stale, after: current HEAD).
Relates to ROADMAP #161 (filed cycle #65, implemented cycle #69). Diagnostic-strictness family member.
Diff: 21 lines added (resolve_git_head_path + conditional rerun-if-changed).
|
||
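The worktree detection described in the #161 commit above can be sketched as below. This is a reconstruction under assumptions, not the 21 lines actually added to build.rs: the function name comes from the commit message, while the signature, the `Option` return type, and the handling of relative `gitdir:` values are guesses.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Hypothetical reconstruction of `resolve_git_head_path`.
/// Returns the HEAD file that a build script should watch, or None when
/// `repo_root` has no usable `.git` entry.
fn resolve_git_head_path(repo_root: &Path) -> Option<PathBuf> {
    let dot_git = repo_root.join(".git");
    if dot_git.is_dir() {
        // Regular repository: HEAD lives directly inside .git/.
        return Some(dot_git.join("HEAD"));
    }
    if dot_git.is_file() {
        // Worktree: .git is a pointer file containing "gitdir: <path>".
        let contents = fs::read_to_string(&dot_git).ok()?;
        let pointer = contents
            .lines()
            .find_map(|line| line.strip_prefix("gitdir:"))?
            .trim();
        let gitdir = PathBuf::from(pointer);
        // Assumption: a relative pointer is resolved against the repo root.
        let gitdir = if gitdir.is_absolute() {
            gitdir
        } else {
            repo_root.join(gitdir)
        };
        return Some(gitdir.join("HEAD"));
    }
    None
}
```

In a build script the result would typically feed Cargo's change tracking, for example `println!("cargo:rerun-if-changed={}", path.display())`, so that the build is invalidated whenever the watched HEAD file changes.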
|
|
1dc10d2b70 |
fix(#130e-B): route plugins/prompt --help to dedicated help topics
## What Was Broken (ROADMAP #130e Category B)
Two remaining surface-level help outliers after #130e-A:
$ claw plugins --help
Unknown /plugins action '--help'. Use list, install, enable, disable, uninstall, or update.
$ claw prompt --help
claw v0.1.0 (top-level help — wrong help topic)
`plugins` treated `--help` as an invalid subaction name. `prompt`
was explicitly listed in the early `wants_help` interception with
commit/pr/issue, which routed to top-level help instead of
prompt-specific help.
## Root Cause (Traced)
1. **plugins**: `parse_local_help_action()` didn't have a "plugins"
arm, so `["plugins", "--help"]` returned None and continued into
the `"plugins"` parser arm (main.rs:1031), which treated `--help`
as the `action` argument. Runtime layer then rejected it as
"Unknown action".
2. **prompt**: At main.rs:~800, there was an early interception for
`--help` following certain subcommands (prompt, commit, pr, issue)
that forced `wants_help = true`, routing to generic top-level help
instead of letting parse_local_help_action produce a prompt-specific
topic.
## What This Fix Does
Same pattern as #130c/#130d/#130e-A:
1. **LocalHelpTopic enum extended** with Plugins, Prompt variants
2. **parse_local_help_action() extended** to map both new cases
3. **Help topic renderers added** with accurate usage info
4. **Early prompt-interception removed** — prompt now falls through to
parse_local_help_action like other subcommands. commit/pr/issue
(which aren't actual subcommands yet) remain in the early list.
## Dogfood Verification
Before fix:
$ claw plugins --help
Unknown /plugins action '--help'. Use list, install, enable, ...
$ claw prompt --help
claw v0.1.0
(top-level help, not prompt-specific)
After fix:
$ claw plugins --help
Plugins
Usage claw plugins [list|install|enable|disable|uninstall|update] [<target>]
Purpose manage bundled and user plugins from the CLI surface
...
$ claw prompt --help
Prompt
Usage claw prompt <prompt-text>
Purpose run a single-turn, non-interactive prompt and exit
Flags --model · --allowedTools · --output-format · --compact
...
## Non-Regression Verification
- `claw plugins` (no args) → still displays plugin inventory ✅
- `claw plugins list` → still works correctly ✅
- `claw prompt "text"` → still requires credentials, runs prompt ✅
- All 180 binary tests pass ✅
- All 466 library tests pass ✅
## Regression Tests Added (4+ assertions)
- `plugins --help` → HelpTopic(Plugins)
- `prompt --help` → HelpTopic(Prompt)
- Short forms `plugins -h` / `prompt -h` both work
- `prompt "hello world"` still routes to Prompt action with correct text
## HELP-PARITY SWEEP COMPLETE
All 22 top-level subcommands now emit proper help topics:
| Command | Status |
|---|---|
| help --help | ✅ #130e-A |
| version --help | ✅ pre-existing |
| status --help | ✅ pre-existing |
| sandbox --help | ✅ pre-existing |
| doctor --help | ✅ pre-existing |
| acp --help | ✅ pre-existing |
| init --help | ✅ pre-existing |
| state --help | ✅ pre-existing |
| export --help | ✅ pre-existing |
| diff --help | ✅ #130c |
| config --help | ✅ #130d |
| mcp --help | ✅ pre-existing |
| agents --help | ✅ pre-existing |
| plugins --help | ✅ #130e-B (this commit) |
| skills --help | ✅ pre-existing |
| submit --help | ✅ #130e-A |
| prompt --help | ✅ #130e-B (this commit) |
| resume --help | ✅ #130e-A |
| system-prompt --help | ✅ pre-existing |
| dump-manifests --help | ✅ pre-existing |
| bootstrap-plan --help | ✅ pre-existing |
Zero outliers. Contract universally enforced.
## Related
- Closes #130e Category B (plugins, prompt surface-parity)
- Completes entire help-parity sweep family (#130c, #130d, #130e)
- Stacks on #130e-A (dispatch-order fixes) on same worktree
|
||
|
|
67b244168b |
fix(#130e-A): route help/submit/resume --help to help topics before credential check
## What Was Broken (ROADMAP #130e, filed cycle #53)
Three subcommands leaked `missing_credentials` errors when called with `--help`:
$ claw help --help
[error-kind: missing_credentials] error: missing Anthropic credentials...
$ claw submit --help
[error-kind: missing_credentials] error: missing Anthropic credentials...
$ claw resume --help
[error-kind: missing_credentials] error: missing Anthropic credentials...
This is the same dispatch-order bug class as #251 (session verbs). The parser fell through to the credential check before help-flag resolution ran.
Critical discoverability gap: users couldn't learn what these commands do without valid credentials.
## Root Cause (Traced)
`parse_local_help_action()` (main.rs:1260) is called early in `parse_args()` (main.rs:1002), BEFORE credential check. But the match statement inside only recognized: status, sandbox, doctor, acp, init, state, export, version, system-prompt, dump-manifests, bootstrap-plan, diff, config.
`help`, `submit`, `resume` were NOT in the list, so the function returned `None`, and parsing continued to credential check which then failed.
## What This Fix Does
Same pattern as #130c (diff) and #130d (config):
1. **LocalHelpTopic enum extended** with Meta, Submit, Resume variants
2. **parse_local_help_action() extended** to map the three new cases
3. **Help topic renderers added** with accurate usage info
Three-line change to parse_local_help_action:
    "help" => LocalHelpTopic::Meta,
    "submit" => LocalHelpTopic::Submit,
    "resume" => LocalHelpTopic::Resume,
Dispatch order (parse_args):
1. --resume parsing
2. parse_local_help_action() ← NOW catches help/submit/resume --help
3. parse_single_word_command_alias()
4. parse_subcommand() ← Credential check happens here
## Dogfood Verification
Before fix (all three):
$ claw help --help
[error-kind: missing_credentials] error: missing Anthropic credentials...
After fix:
$ claw help --help
Help
Usage claw help [--output-format <format>]
Purpose show the full CLI help text (all subcommands, flags, environment)
...
$ claw submit --help
Submit
Usage claw submit [--session <id|latest>] <prompt-text>
Purpose send a prompt to an existing managed session
Requires valid Anthropic credentials (when actually submitting)
...
$ claw resume --help
Resume
Usage claw resume [<session-id|latest>]
Purpose restart an interactive REPL attached to a managed session
...
## Non-Regression Verification
- `claw help` (no --help) → still shows full CLI help ✅
- `claw submit "text"` (with prompt) → still requires credentials ✅
- `claw resume` (bare) → still emits slash command guidance ✅
- All 180 binary tests pass ✅
- All 466 library tests pass ✅
## Regression Tests Added (6 assertions)
- `help --help` → routes to HelpTopic(Meta)
- `submit --help` → routes to HelpTopic(Submit)
- `resume --help` → routes to HelpTopic(Resume)
- Short forms: `help -h`, `submit -h`, `resume -h` all work
## Pattern Note
This is Category A of #130e (dispatch-order bugs). Same class as #251. Category B (surface-parity: plugins, prompt) will be handled in a follow-up commit/branch.
## Help-Parity Sweep Status
After cycle #52 (#130c diff, #130d config), help sweep revealed:
| Command | Before | After This Commit |
|---|---|---|
| help --help | missing_credentials | ✅ Meta help |
| submit --help | missing_credentials | ✅ Submit help |
| resume --help | missing_credentials | ✅ Resume help |
| plugins --help | "Unknown action" | ⏳ #130e-B (next) |
| prompt --help | wrong help | ⏳ #130e-B (next) |
## Related
- Closes #130e Category A (dispatch-order help fixes)
- Same bug class as #251 (session verbs)
- Stacks on #130d (config help) on same worktree branch
- #130e Category B (plugins, prompt) queued for follow-up
|
||
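The dispatch-order point in the #130e-A commit above (help-flag resolution must run before the credential check) can be illustrated with a simplified model. The enum and function names mirror those quoted in the commit message, but the signatures, the `has_credentials` parameter, and the `CliAction::Run` variant are assumptions made for this sketch; the real `parse_args` has more steps and more arms.

```rust
#[derive(Debug, PartialEq)]
enum LocalHelpTopic {
    Meta,
    Submit,
    Resume,
}

#[derive(Debug, PartialEq)]
enum CliAction {
    HelpTopic(LocalHelpTopic),
    /// Hypothetical placeholder for any action that needs credentials.
    Run { verb: String },
}

fn is_help_flag(arg: &str) -> bool {
    matches!(arg, "--help" | "-h")
}

/// Simplified model: recognizes `<verb> --help` / `<verb> -h` for the three
/// verbs added in #130e-A and returns None for everything else.
fn parse_local_help_action(rest: &[String]) -> Option<LocalHelpTopic> {
    if rest.len() != 2 || !is_help_flag(&rest[1]) {
        return None;
    }
    match rest[0].as_str() {
        "help" => Some(LocalHelpTopic::Meta),
        "submit" => Some(LocalHelpTopic::Submit),
        "resume" => Some(LocalHelpTopic::Resume),
        _ => None,
    }
}

/// Help resolution runs first, so `--help` never reaches the credential check.
fn parse_args(rest: &[String], has_credentials: bool) -> Result<CliAction, String> {
    if let Some(topic) = parse_local_help_action(rest) {
        return Ok(CliAction::HelpTopic(topic));
    }
    if !has_credentials {
        return Err("missing Anthropic credentials".to_string());
    }
    let verb = rest.first().cloned().unwrap_or_default();
    Ok(CliAction::Run { verb })
}
```

With this ordering, `submit --help` resolves to a help topic even when `has_credentials` is false, while `submit "text"` still fails the credential check, which matches the non-regression list in the commit.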
|
|
03dacf29ed |
fix(#130d): accept --help / -h in claw config arm, route to help topic
## What Was Broken (ROADMAP #130d, filed cycle #52)
`claw config --help` was silently ignored: the command executed and displayed the config dump instead of showing help:
    $ claw config --help
    Config
    Working directory /private/tmp/dogfood-probe-47
    Loaded files 0
    Merged keys 0
    (displays full config, not help)
Expected: help for the config command.
Actual: silent acceptance of `--help`, runs config display anyway.
This is the opposite outlier from #130c (which rejected help with an error). Together they form the help-parity anomaly:
- #130c `diff --help` → error (rejects help)
- #130d `config --help` → silent ignore (runs command, ignores help)
- Others (status, mcp, export) → proper help
- Expected behavior: all commands should show help on `--help`
## Root Cause (Traced)
At main.rs:1050, the `"config"` parser arm parsed arguments positionally:
    "config" => {
        let tail = &rest[1..];
        let section = tail.first().cloned();
        // ... ignores unrecognized args like --help silently
        Ok(CliAction::Config { section, ... })
    }
Unlike the `diff` arm (#130c), `config` had no explicit check for extra args. It positionally parsed the first arg as an optional `section` and silently accepted/ignored any trailing arg, including `--help`.
## What This Fix Does
Same pattern as #130c (help-surface parity):
1. **LocalHelpTopic enum extended** with new `Config` variant
2. **parse_local_help_action() extended** to map `"config"` → `LocalHelpTopic::Config`
3. **config arm guard added**: check for help flag before parsing section
4. **Help topic renderer added**: human-readable help text for config
Fix locus at main.rs:1050:
    "config" => {
        // #130d: accept --help / -h and route to help topic
        if rest.len() >= 2 && is_help_flag(&rest[1]) {
            return Ok(CliAction::HelpTopic(LocalHelpTopic::Config));
        }
        let tail = &rest[1..];
        // ... existing parsing continues
    }
## Dogfood Verification
Before fix:
    $ claw config --help
    Config
    Working directory ...
    Loaded files 0
    (no help, runs config)
After fix:
    $ claw config --help
    Config
    Usage claw config [--cwd <path>] [--output-format <format>]
    Purpose merge and display the resolved configuration
    Options --cwd overrides the workspace directory
    Output loaded files and merged key-value pairs
    Formats text (default), json
    Related claw status · claw doctor · claw init
Short form `claw config -h` also works.
## Non-Regression Verification
- `claw config` (no args) → still displays config dump ✅
- `claw config permissions` (section arg) → still works ✅
- All 180 binary tests pass ✅
- All 466 library tests pass ✅
## Regression Tests Added (4 assertions)
- `config --help` → routes to `HelpTopic(LocalHelpTopic::Config)`
- `config -h` (short form) → routes to help topic
- bare `config` (no args) → still routes to `Config` action
- `config permissions` (with section) → still works correctly
## Pattern Note
#130c and #130d form a pair: two outlier failure modes in help handling for local introspection commands:
- #130c `diff` rejected help (loud error) → fixed with guard + routing
- #130d `config` silently ignored help (silent accept) → fixed with same pattern
Both are now consistent with the rest of the CLI (status, mcp, export, etc.).
## Related
- Closes #130d (config help discoverability gap)
- Completes help-parity family (#130c, #130d)
- Stacks on #130c (diff help fix) on same worktree branch
- Part of help-consistency thread (#141 audit)
|
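The #130d guard can be exercised in isolation. The following is a minimal sketch, not the code in main.rs: the enum shapes are reduced to what the commit message mentions, `parse_config_arm` is a hypothetical stand-in for the `"config"` match arm, and the body of `is_help_flag` is assumed from the `--help` / `-h` behavior described above.

```rust
/// Reduced stand-ins for the real types; only the variants named in the
/// commit message are modeled here.
#[derive(Debug, PartialEq)]
enum LocalHelpTopic {
    Config,
}

#[derive(Debug, PartialEq)]
enum CliAction {
    HelpTopic(LocalHelpTopic),
    Config { section: Option<String> },
}

/// Assumed implementation: the commit only states that `--help` and `-h`
/// are both accepted.
fn is_help_flag(arg: &str) -> bool {
    matches!(arg, "--help" | "-h")
}

/// Hypothetical stand-in for the `"config"` parser arm.
/// `rest[0]` is expected to be the verb itself ("config").
fn parse_config_arm(rest: &[String]) -> CliAction {
    // Help guard runs before positional parsing, so `--help` is never
    // mistaken for a section name or silently dropped.
    if rest.len() >= 2 && is_help_flag(&rest[1]) {
        return CliAction::HelpTopic(LocalHelpTopic::Config);
    }
    let tail = &rest[1..];
    CliAction::Config {
        section: tail.first().cloned(),
    }
}
```

Placing the guard ahead of the positional parse is what keeps bare `config` and `config permissions` on their existing paths while `config --help` and `config -h` route to the help topic.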
||
|
|
bf10d1ff5e |
fix(#130c): accept --help / -h in claw diff arm
## What Was Broken (ROADMAP #130c, filed cycle #50)
`claw diff --help` was rejected with:
    [error-kind: unknown]
    error: unexpected extra arguments after `claw diff`: --help
Other local introspection commands accept --help fine:
- `claw status --help` → shows help ✅
- `claw mcp --help` → shows help ✅
- `claw export --help` → shows help ✅
- `claw diff --help` → error ❌ (outlier)
This is a help-surface parity bug: `diff` is the only local command that rejects --help as "extra arguments" before the help detector gets a chance to run.
## Root Cause (Traced)
At main.rs:1063, the `"diff"` parser arm rejected ALL extra args:
    "diff" => {
        if rest.len() > 1 {
            return Err(format!("unexpected extra arguments after `claw diff`: {}", ...));
        }
        Ok(CliAction::Diff { output_format })
    }
When parsing `["diff", "--help"]`, `rest.len() > 1` was true (length is 2) and `--help` was rejected as an extra argument.
Other commands (status, sandbox, doctor, init, state, export, etc.) routed through `parse_local_help_action()`, which detected `--help` / `-h` and routed to a LocalHelpTopic. The `diff` arm lacked this guard.
## What This Fix Does
Four minimal changes:
1. **LocalHelpTopic enum extended** with new `Diff` variant
2. **parse_local_help_action() extended** to map `"diff"` → `LocalHelpTopic::Diff`
3. **diff arm guard added**: check for help flag before extra-args validation
4. **Help topic renderer added**: human-readable help text for diff command
Fix locus at main.rs:1063:
    "diff" => {
        // #130c: accept --help / -h as first argument and route to help topic
        if rest.len() == 2 && is_help_flag(&rest[1]) {
            return Ok(CliAction::HelpTopic(LocalHelpTopic::Diff));
        }
        if rest.len() > 1 { /* existing error */ }
        Ok(CliAction::Diff { output_format })
    }
## Dogfood Verification
Before fix:
    $ claw diff --help
    [error-kind: unknown]
    error: unexpected extra arguments after `claw diff`: --help
After fix:
    $ claw diff --help
    Diff
    Usage claw diff [--output-format <format>]
    Purpose show local git staged + unstaged changes
    Requires workspace must be inside a git repository
    ...
And `claw diff -h` (short form) also works.
## Non-Regression Verification
- `claw diff` (no args) → still routes to Diff action correctly
- `claw diff foo` (unknown arg) → still rejected as "unexpected extra arguments"
- `claw diff --output-format json` (valid flag) → still works
- All 180 binary tests pass
- All 466 library tests pass
## Regression Tests Added (4 assertions)
- `diff --help` → routes to HelpTopic(LocalHelpTopic::Diff)
- `diff -h` (short form) → routes to HelpTopic(LocalHelpTopic::Diff)
- bare `diff` → still routes to Diff action
- `diff foo` (unknown arg) → still errors with "extra arguments"
## Pattern
Follows #141 help-consistency work (extending LocalHelpTopic to cover more subcommands). Clean surface-parity fix: identify the outlier, add the missing guard. Low-risk, high-clarity.
## Related
- Closes #130c (diff help discoverability gap)
- Stacks on #130b (filesystem context) and #251 (session dispatch)
- Part of help-consistency thread (#141 audit, #145 plugins wiring)
|
||
|
|
deb434d2c4 |
fix(#130b): enrich filesystem I/O errors with operation + path context
## What Was Broken (ROADMAP #130b, filed cycle #47)
In a fresh workspace, running:
    claw export latest --output /private/nonexistent/path/file.jsonl --output-format json
produced:
    {"error":"No such file or directory (os error 2)","hint":null,"kind":"unknown","type":"error"}
This violates the typed-error contract:
- Error message is a raw errno string with zero context
- Does not mention the operation that failed (export)
- Does not mention the target path
- Classifier defaults to "unknown" even though the code path knows this is a filesystem I/O error
## Root Cause (Traced)
run_export() at main.rs:~6915 does:
    fs::write(path, &markdown)?;
When this fails:
1. io::Error propagates via ? to main()
2. Converted to string via .to_string() in error handler
3. classify_error_kind() cannot match "os error" or "No such file"
4. Defaults to "kind": "unknown"
The information is there at the source (operation name, target path, io::ErrorKind) but lost at the propagation boundary.
## What This Fix Does
Three changes:
1. **New helper: contextualize_io_error()** (main.rs:~260)
   Wraps an io::Error with operation name + target path into a recognizable message format:
       "{operation} failed: {target} ({error})"
2. **Classifier branch added** (classify_error_kind at main.rs:~270)
   Recognizes the new format and classifies as "filesystem_io_error":
       else if message.contains("export failed:") ||
               message.contains("diff failed:") ||
               message.contains("config failed:") {
           "filesystem_io_error"
       }
3. **run_export() wired** (main.rs:~6915)
   fs::write() call now uses .map_err() to enrich io::Error:
       fs::write(path, &markdown).map_err(|e| -> Box<dyn std::error::Error> {
           contextualize_io_error("export", &path.display().to_string(), e).into()
       })?;
## Dogfood Verification
Before fix:
    {"error":"No such file or directory (os error 2)","kind":"unknown","type":"error"}
After fix:
    {"error":"export failed: /private/nonexistent/path/file.jsonl (No such file or directory (os error 2))","kind":"filesystem_io_error","type":"error"}
The envelope now tells downstream claws:
- WHAT operation failed (export)
- WHERE it failed (the path)
- WHAT KIND of failure (filesystem_io_error)
- The original errno detail preserved for diagnosis
## Non-Regression Verification
- Successful export still works (emits "kind": "export" envelope as before)
- Session not found error still emits "session_not_found" (not filesystem)
- missing_credentials still works correctly
- cli_parse still works correctly
- All 180 binary tests pass
- All 466 library tests pass
- All 95 compat-harness tests pass
## Regression Tests Added
Inside the main CliAction test function:
- "export failed:" pattern classifies as "filesystem_io_error" (not "unknown")
- "diff failed:" pattern classifies as "filesystem_io_error"
- "config failed:" pattern classifies as "filesystem_io_error"
- contextualize_io_error() produces a message containing operation name
- contextualize_io_error() produces a message containing target path
- Messages produced by contextualize_io_error() are classifier-recognizable
## Scope
This is the minimum viable fix: enrich export's fs::write with context.
Future work (filed as part of #130b scope): apply same pattern to other filesystem operations (diff, plugins, config fs reads, session store writes, etc.). Each application is a copy-paste of the same helper pattern.
## Pattern
Follows #145 (plugins parser interception), #248-249 (arm-level leak templates). Helper + classifier + call site wiring. Minimal diff, maximum observability gain.
## Related
- Closes #130b (filesystem error context preservation)
- Stacks on top of #251 (dispatch-order fix), same worktree branch
- Ground truth for future #130 broader sweep (other io::Error sites)
|
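The helper-plus-classifier pairing from #130b can be sketched as below. This is an illustration under assumptions, not the code in main.rs: the `String` return type of `contextualize_io_error` is a guess (the commit only shows that its result converts into `Box<dyn std::error::Error>`), and the classifier is reduced to the single branch this fix adds plus the `"unknown"` fallback.

```rust
use std::io;

/// Wrap an io::Error with the operation name and target path, using the
/// "{operation} failed: {target} ({error})" format quoted in the commit.
/// Returning a String is an assumption of this sketch.
fn contextualize_io_error(operation: &str, target: &str, error: io::Error) -> String {
    format!("{operation} failed: {target} ({error})")
}

/// Reduced classifier: only the filesystem branch and the fallback.
/// The real classify_error_kind() has many more branches.
fn classify_error_kind(message: &str) -> &'static str {
    if message.contains("export failed:")
        || message.contains("diff failed:")
        || message.contains("config failed:")
    {
        "filesystem_io_error"
    } else {
        "unknown"
    }
}
```

Because the classifier works on substrings, the helper's message prefix is the contract between the two: any call site that routes its `io::Error` through the helper with one of the listed operation names is classified without further changes.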
||
|
|
fbe11187d9 |
fix(#251): intercept session-management verbs at top-level parser to bypass credential check
## What Was Broken (ROADMAP #251)
Session-management verbs (list-sessions, load-session, delete-session, flush-transcript) were falling through to the parser's `_other => Prompt` catchall at main.rs:~1017. This parsed them as `CliAction::Prompt { prompt: "list-sessions", ... }`, which then required credentials via the Anthropic API path. The result: purely-local session operations emitted `missing_credentials` errors instead of session-layer envelopes.
## Acceptance Criterion
The fix's essential requirement (stated by gaebal-gajae):
**"These 4 verbs stop falling through to Prompt and emitting `missing_credentials`."**
Not "all 4 are fully implemented to spec"; stubs are acceptable for delete-session and flush-transcript as long as they route LOCALLY.
## What This Fix Does
Follows the exact pattern from #145 (plugins) and #146 (config/diff):
1. **CliAction enum** (main.rs:~700): Added 4 new variants.
2. **Parser** (main.rs:~945): Added 4 match arms before the `_other => Prompt` catchall. Each arm validates the verb's positional args (e.g., load-session requires a session-id) and rejects extra arguments.
3. **Dispatcher** (main.rs:~455):
   - list-sessions → dispatches to `runtime::session_control::list_managed_sessions_for()`
   - load-session → dispatches to `runtime::session_control::load_managed_session_for()`
   - delete-session → emits `not_yet_implemented` error (local, not auth)
   - flush-transcript → emits `not_yet_implemented` error (local, not auth)
## Dogfood Verification
Run on clean environment (no credentials):
```bash
$ env -i PATH=$PATH HOME=$HOME claw list-sessions --output-format json
{
  "command": "list-sessions",
  "sessions": [
    {"id": "session-1775777421902-1", ...},
    ...
  ]
}
# ✓ Session-layer envelope, not auth error
$ env -i PATH=$PATH HOME=$HOME claw load-session nonexistent --output-format json
{"error":"session not found: nonexistent", "kind":"session_not_found", ...}
# ✓ Local session_not_found error, not missing_credentials
$ env -i PATH=$PATH HOME=$HOME claw delete-session test-id --output-format json
{"command":"delete-session","error":"not_yet_implemented","kind":"not_yet_implemented","type":"error"}
# ✓ Local not_yet_implemented, not auth error
$ env -i PATH=$PATH HOME=$HOME claw flush-transcript test-id --output-format json
{"command":"flush-transcript","error":"not_yet_implemented","kind":"not_yet_implemented","type":"error"}
# ✓ Local not_yet_implemented, not auth error
```
Regression sanity:
```bash
$ claw plugins --output-format json # #145 still works
$ claw prompt "hello" --output-format json # still requires credentials correctly
$ claw list-sessions extra arg --output-format json # rejects extra args with cli_parse
```
## Regression Tests Added
Inside `removed_login_and_logout_subcommands_error_helpfully` test function:
- `list-sessions` → CliAction::ListSessions (both text and JSON output)
- `load-session <id>` → CliAction::LoadSession with session_reference
- `delete-session <id>` → CliAction::DeleteSession with session_id
- `flush-transcript <id>` → CliAction::FlushTranscript with session_id
- Missing required arg errors (load-session and delete-session without ID)
- Extra args rejection (list-sessions with extra positional args)
All 180 binary tests pass. 466 library tests pass.
## Fix Scope vs. Full Implementation
This fix addresses #251 (dispatch-order bug) and #250's Option A (implement the surfaces). list-sessions and load-session are fully functional via existing runtime::session_control helpers. delete-session and flush-transcript are stubbed with local "not yet implemented" errors to satisfy #251's acceptance criterion without requiring additional session-store mutations, which can ship independently in a follow-up.
## Template
Exact same pattern as #145 (plugins) and #146 (config/diff): top-level verb interception → CliAction variant → dispatcher with local operation.
## Related
Closes #251. Addresses #250 Option A for 4 verbs. Does not block #250 Option B (documentation scope guards), which remains valuable.
|
||
|
|
27852aa481 |
docs(#162): add USAGE.md sections for dump-manifests, bootstrap-plan, acp, export
Parity audit (cycle #67) found 4 verbs were in claw --help but absent from USAGE.md: - dump-manifests: upstream manifest export for parity work - bootstrap-plan: startup component graph for debugging - acp: Zed editor integration status (discoverability only, tracking ROADMAP #76) - export: session transcript export (requires --resume) Each section follows the existing USAGE.md pattern: - Purpose statement - Example usage - When-to-use guidance - Related error modes where applicable Coverage: 12/12 binary verbs now documented (was 8/12). Acceptance: - All 4 verbs have dedicated sections with examples: verified by grep - Parity audit re-run: 100% coverage Relates to ROADMAP #162 (filed cycle #67, implemented cycle #68). Diff: +87 lines, doc-only, zero code risk. |
||
|
|
079273a039 |
docs(parity): update stats to 2026-04-23 — Rust LOC +66%, test LOC +76%, 979 commits on main
Growth since 2026-04-03: - Rust LOC: 48,599 → 80,789 (+32,190) - Test LOC: 2,568 → 4,533 (+1,965) - Commits: 292 → 979 (+687, now pending review phase) Main HEAD: ad1cf92 (doctrine loop canonical example) Key deliverables cycles #39–#63: - Typed-error hardening family (#247–#251) - Diagnostic-strictness principle (#57–#59) - Help-parity sweep (#130c–#130e) - Suffix-guard uniformity (#152) - Verb-classification fix (#160) - Integration-bandwidth doctrine (#62) - Doctrine-loop pattern formalized Status: 13 branches awaiting review (no new branches since cycle #61 branch-last protocol established) |
||
|
|
0beca198d7 |
fix(#160): reserved-semantic verbs with positional args now emit slash-command guidance
Verbs with CLI-reserved positional-arg meanings (resume, compact, memory, commit, pr, issue, bughunter) were falling through to Prompt dispatch when invoked with args, causing users to see 'missing_credentials' errors instead of guidance that the verb is a slash command. #160 investigation revealed the underlying design question: which verbs are 'promptable' (can start a prompt like 'explain this pattern') vs. 'reserved' (have specific CLI meaning like 'resume SESSION_ID')? This fix implements the reserved-verb classification: at parse time, intercept reserved verbs with trailing args and emit slash-command guidance before falling through to Prompt. Promptable verbs (explain, bughunter, clear) continue to route to Prompt as before. Helper: is_reserved_semantic_verb() lists the reserved set. All 181 tests pass (no regressions). |
||
|
|
b1a6e30927 |
fix(#122b): claw doctor warns when cwd is broad path (home/root)
## What Was Broken
`claw doctor` reported "Status: ok" when run from ~/ or /, but `claw
prompt` in the same directory would error out with:
error: claw is running from a very broad directory (/Users/yeongyu).
The agent can read and search everything under this path.
Diagnostic deception: doctor said green, prompt said red. User runs
doctor to check their setup, sees all green, runs prompt, gets blocked.
Trust in doctor erodes.
This is the exact pattern captured in the 'Diagnostic Commands Must Be
At Least As Strict As Runtime Commands' principle recorded in ROADMAP.md
at cycle #57.
## Root Cause
Two code paths perform the broad-cwd check:
- CliAction::Prompt handler → `enforce_broad_cwd_policy()` (errors out)
- CliAction::Repl handler → same function
But render_doctor_report() never called detect_broad_cwd(). The workspace
health check only looked at whether cwd was inside a git project, not
whether cwd was a dangerously broad path.
## What This Fix Does
Extend `check_workspace_health()` to also probe `detect_broad_cwd()`:
    let broad_cwd = detect_broad_cwd();
    let (level, summary) = match (in_repo, &broad_cwd) {
        (_, Some(path)) => (
            DiagnosticLevel::Warn,
            format!(
                "current directory is a broad path ({}); Prompt/REPL will \
                 refuse to run here without --allow-broad-cwd",
                path.display()
            ),
        ),
        (true, None) => (DiagnosticLevel::Ok, "project root detected".to_string()),
        (false, None) => (DiagnosticLevel::Warn, "not inside a git project".to_string()),
    };
The check now warns about BOTH failure modes with clear messaging about
what Prompt/REPL will do.
## Dogfood Verification
Before fix:
$ cd ~ && claw doctor
Workspace
Status warn
Summary current directory is not inside a git project
[all green otherwise]
$ echo | claw prompt "test"
error: claw is running from a very broad directory (/Users/yeongyu)...
After fix:
$ cd ~ && claw doctor
Workspace
Status warn
Summary current directory is a broad path (/Users/yeongyu);
Prompt/REPL will refuse to run here without
--allow-broad-cwd
$ cd / && claw doctor
Workspace
Status warn
Summary current directory is a broad path (/); ...
Non-regression:
$ cd /tmp/my-project && claw doctor
Workspace
Status warn
Summary current directory is not inside a git project
(unchanged)
$ cd /path/to/real/git/project && claw doctor
Workspace
Status ok
Summary project root detected on branch main
(unchanged)
## Regression Tests Added
- `workspace_check_in_project_dir_reports_ok` — non-broad + in-project = OK
- `workspace_check_outside_project_reports_warn` — non-broad + not-in-project = Warn with 'not inside git project' summary
- 181 binary tests pass (was 179, added 2)
## Related
- Principle: 'Diagnostic Commands Must Be At Least As Strict As Runtime
Commands' (ROADMAP.md cycle #57)
- Companion to #122 (stale-base preflight in doctor)
- Sibling: next step is probably a full runtime-vs-doctor audit for
other asymmetries (auth, sandbox, plugins, hooks)
|
||
|
|
b4f9b70e13 |
docs: add MERGE_CHECKLIST.md — integration support artifact for queue merge sequencing
Provides: - Recommended merge order (P0 → P1 → P2 → P3 by cluster) - Per-cluster merge prerequisites and validation steps - Conflict risk assessment (Cluster 2 #122/#122b have same edit locus) - Post-merge validation checklist (build + test + dogfood) - Timeline estimate (~60 min for full 17-branch queue) Addresses the final integration step: once branches are reviewed, knowing the safe merge order matters. This artifact pre-answers that question. Applied doctrine: integration-support artifacts (cycle #64) reduce reviewer friction. At 17-branch saturation, a merge-safe checklist is first-class work. Relates to cycle #70 integration throughput initiative. |
||
|
|
afd6f1b1b7 |
docs: add REVIEW_DASHBOARD.md — integration support artifact for 14-branch queue
Consolidates all 14 review-ready branches into a single dashboard showing:
- Priority tiers (P0 typed-error → P3 doc truthfulness)
- Cluster membership and batch-reviewable groups
- Branch inventory with commits, diff size, tests, cluster, expected merge time
- Merge throughput notes and reviewer shortcuts
Per integration-support-artifacts doctrine (cycle #64): at queue saturation (N>=5), docs that reduce reviewer cognitive load are first-class deliverables.
This dashboard aims to make queue digestion cheap:
- Reviewer can scan tiers in 60 seconds
- Batch recommendations save context switches
- Per-branch facts pre-answer expected questions
- PR-ready summary reference for #249
Cluster impact:
- 14 branches now have explicit cluster/priority labels
- Batch review patterns identified for ~8 branches
- Merge-friction heatmap surfaces lowest-risk starting points
|
||
|
|
e59bb2c16d | roadmap: #136 marked CLOSED — compact+json dispatch already correct | ||
|
|
d2a162c1e0 |
docs(#250, #251): Align SCHEMAS.md with actual binary, downgrade #250 to scope-reduced
Cycle #46 follow-up to cycle #45's #251 implementation. Closes #250's implementation urgency by aligning docs with reality. SCHEMAS.md Updates: For each of the 4 session-management verbs, added: 1. Status marker (Implemented or Stub only) 2. Actual binary envelope (shape produced by the #251-fixed binary) 3. Aspirational (future) shape (original SCHEMAS.md content, preserved as target) 4. Gap notes where the two diverge Per-verb status: - list-sessions: Implemented, nested field layout - load-session: Implemented, nested session object with local session_not_found error - delete-session: Stub, emits not_yet_implemented (local error, not auth) - flush-transcript: Stub, emits not_yet_implemented (local error, not auth) ROADMAP.md Updates: - #251 marked CLOSED: Full status with commit ref, test counts. - #250 marked SCOPE-REDUCED: Option A resolved by #251, Option C moot, only Option B (doc alignment) remains as future cleanup. Why this matters: Every code change should close its documentation loop. #251 landed on the branch, but SCHEMAS.md still described aspirational shapes without marking which were implemented. Claws reading SCHEMAS.md would have assumed full conformance and hit surprises. Now the document tells the truth about which verbs work, which are stubs, and why. Related: - #251 implementation on feat/jobdori-251-session-dispatch branch - #250 scope-reduced to Option B (field-name harmonization) - #145/#146 parser fall-through fix precedent |
||
|
|
8e52f56ca8 |
ROADMAP #130: re-verify still-open on main HEAD 186d42f; add classifier-cluster pairing note
Cycle #39 dogfood re-verification of #130 (filed 2026-04-20). All 5 filesystem failure modes reproduce identically on main HEAD 186d42f, 2 days after original filing. Gap is unchanged. ## What's Added 1. **[STILL OPEN — re-verified 2026-04-22 cycle #39]** marker on the entry so readers can see immediately that the pinpoint hasn't been accidentally closed. 2. Full 5-mode repro output preserved verbatim for the current HEAD, so future re-verifications have a concrete baseline to diff against. 3. **New evidence not in original filing**: the classifier actively chose `kind: "unknown"` rather than just omitting the field. This means classify_error_kind() has NO substring match for "Is a directory", "No such file", "Operation not permitted", or "File exists". The typed-error contract is thus twice-broken on this path. 4. **Pairing with #247/#248/#249 classifier sweep**: the classifier-level part of #130 could land in the same sweep (add substring branches for io::ErrorKind strings). The context-preservation part (fix run_export's bare `?`) is a separate, larger change. ## Why Re-Verification Not Re-Filing Per cycle #24 discipline: speculative re-filings add noise, real confirmations add truth. #130 was already filed with exact repros, code trace, and fix shape. My dogfood hit the same gap on fresh HEAD — the right output is confirming the gap is still there (not filing #251 for the same bug). This is the same pattern as cycle #32's "mark #127 CLOSED" reality-sync: documentation-drift prevention through explicit status markers. ## New Pattern "Reality-sync via re-verification" — re-running a filed pinpoint's repro on fresh HEAD and adding the timestamp + output proves the gap is still real without inventing new filings. Cycle #24 calibration keeps ROADMAP entries honest. Per cycle #24 calibration: - Red-state bug? ⚠️ borderline (errors surfaced, but kind=unknown is demonstrably wrong on a path where the system knows the errno) - Real friction? 
✓ (re-verified on fresh HEAD) - Evidence-backed? ✓ (5-mode repro + classifier trace) - Same-cycle fix? ✗ (classifier-level part could join #247/#248/#249 sweep; context-preservation part is larger refactor) - Implementation cost? Classifier part ~10 lines; full context fix ~60 lines Source: Jobdori cycle #39 proactive dogfood in response to Clawhip pinpoint nudge. Probed export filesystem errors; discovered this was #130 reconfirmation, not new bug. Applied reality-sync pattern from cycle #32. |
||
|
|
e37cdc0a8d |
fix: #247 classify prompt-related parse errors + unify JSON hint plumbing
Cycle #34 dogfood follow-through on Jobdori cycle #33 pinpoint (#247 filed at fbcbe9d). Closes the two typed-error contract drifts surfaced in that pinpoint against the Rust `claw` binary.
## What was wrong
1. `classify_error_kind()` (main.rs:~251) used substring matching but did NOT match two common prompt-related parse errors:
   - "prompt subcommand requires a prompt string"
   - "empty prompt: provide a subcommand..."
   Both fell through to `"unknown"`. §4.44 typed-error contract specifies `parse | usage | unknown` as distinct classes, so claws dispatching on `error.kind == "cli_parse"` missed those paths entirely.
2. JSON mode dropped the `Run `claw --help` for usage.` hint. Text mode appends it at stderr-print time (main.rs:~234) AFTER split_error_hint() has already serialized the envelope, so JSON consumers never saw it. Text-mode humans got an actionable pointer; machine consumers did not.
## Fix
Two small, targeted edits:
1. `classify_error_kind()`: add explicit branches for "prompt subcommand requires" and "empty prompt:" (the latter anchored with `starts_with` so it never hijacks unrelated error messages containing the word). Both route to `cli_parse`.
2. JSON error render path in `main()`: after calling split_error_hint(), if the message carried no embedded hint AND kind is `cli_parse` AND the short-reason does not already embed a `claw --help` pointer, synthesize the same `Run `claw --help` for usage.` trailer that text-mode stderr appends. The embedded-pointer check prevents duplication on the `empty prompt: ... (run `claw --help`)` message, which already carries inline guidance.
## Verification
Direct repro on the compiled binary:
    $ claw --output-format json prompt
    {"error":"prompt subcommand requires a prompt string",
     "hint":"Run `claw --help` for usage.",
     "kind":"cli_parse","type":"error"}
    $ claw --output-format json ""
    {"error":"empty prompt: provide a subcommand (run `claw --help`) or a non-empty prompt string",
     "hint":null,"kind":"cli_parse","type":"error"}
    $ claw --output-format json doctor --foo # regression guard
    {"error":"unrecognized argument `--foo` for subcommand `doctor`",
     "hint":"Run `claw --help` for usage.",
     "kind":"cli_parse","type":"error"}
Text mode unchanged in shape; `[error-kind: ...]` prefix now reads `cli_parse` for the two previously-misclassified paths.
## Regression coverage
- Unit test `classify_error_kind_covers_prompt_parse_errors_247`: locks both patterns route to `cli_parse` AND that generic "prompt"-containing messages still fall through to `unknown`.
- Integration tests in `tests/output_format_contract.rs`:
  * prompt_subcommand_without_arg_emits_cli_parse_envelope_with_hint_247
  * empty_positional_arg_emits_cli_parse_envelope_247
  * whitespace_only_positional_arg_emits_cli_parse_envelope_247
  * unrecognized_argument_still_classifies_as_cli_parse_247_regression_guard
- Full rusty-claude-cli test suite: 218 tests pass (180 bin unit + 15 output_format_contract + 12 resume_slash + 7 compact + 3 mock + 1 cli).
## Family / related
Joins §4.44 typed-envelope contract gap family closure: #130, #179, #181, and now **#247**. All four items now have real fixes landed on the canonical binary surface rather than only the Python harness.
ROADMAP.md: #247 marked CLOSED with before/after evidence preserved.
|
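The two #247 edits can be sketched together. This is a reduced illustration, not the code in main.rs: `json_hint` is a hypothetical name for the hint-synthesis step in the JSON render path, the classifier keeps only the two branches this commit adds, and the `embedded_hint` parameter stands in for whatever split_error_hint() returns.

```rust
/// Trailer that text mode appends; the wording is quoted from the commit.
const HELP_HINT: &str = "Run `claw --help` for usage.";

/// Reduced classifier: only the two #247 branches plus the fallback.
/// "empty prompt:" is anchored with starts_with so unrelated messages that
/// merely contain the phrase are not reclassified.
fn classify_error_kind(message: &str) -> &'static str {
    if message.contains("prompt subcommand requires") || message.starts_with("empty prompt:") {
        "cli_parse"
    } else {
        "unknown"
    }
}

/// Hypothetical helper for the JSON render path: keep an embedded hint if
/// one exists, otherwise synthesize the help trailer for cli_parse errors
/// whose message does not already point at `claw --help`.
fn json_hint(message: &str, embedded_hint: Option<String>) -> Option<String> {
    if embedded_hint.is_some() {
        return embedded_hint;
    }
    if classify_error_kind(message) == "cli_parse" && !message.contains("claw --help") {
        return Some(HELP_HINT.to_string());
    }
    None
}
```

The embedded-pointer check is what yields `"hint":null` for the empty-prompt message in the verification output while the `prompt` subcommand error still receives the synthesized trailer.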
||
|
|
95b8eecd2f |
docs: cycle #32 — mark #127 CLOSED; document in-flight branch obsolescence
Cycle #32 dogfood finding: #127 was fixed on main via `a3270db` + `79352a2` (2026-04-20), but the ROADMAP.md entry still lacked a [CLOSED] marker. The in-flight branches `feat/jobdori-127-clean` and `feat/jobdori-127-verb-suffix-flags` were superseded and are now obsolete. ## What This Fixes **Documentation drift:** Pinpoint #127 was complete in code but unmarked in ROADMAP. New contributors checking the roadmap would see it as open work, potentially duplicating effort. **Stale branches:** Two branches (`feat/jobdori-127-clean`, `feat/jobdori-127-verb-suffix-flags`) contain the fix attempt bundled with an unrelated large-scope refactor (5365 lines removed from ROADMAP.md, root-level governance docs deleted, command infra refactored). Their fix was superseded; branches are functionally obsolete. ## Verification Re-verified all 4 #127 scenarios pass on main HEAD `b903e16`: $ claw doctor --json → rejected with "did you mean" hint $ claw doctor garbage → rejected $ claw doctor --unknown-flag → rejected $ claw doctor --output-format json → works (canonical form) All behavior matches #127 acceptance criteria. ## Cluster Impact Post-closure: **parser-level trust gap quintet (#108 + #117 + #119 + #122 + #127) is 5/5 closed**. The `_other => Prompt` fall-through audit is complete. ## Discipline Check Per cycle #24 calibration: - Red-state bug? ✗ (behavior is correct on main) - Real friction? ✓ (ROADMAP drift; obsolete branches adrift) - Evidence-backed? ✓ (dogfood probe confirmed closure; git log confirmed supersession; branch diff confirmed scope contamination) ## Relationship to Gaebal-gajae's Option A Guidance Cycle #32 started by proposing separating the #127 fix from the attached refactor. On deeper probe, discovered the fix was already superseded on main via different commits. Option A (separate the fix) is retroactively satisfied: the fix landed cleanly, the refactor never did. 
The remaining action is governance hygiene: mark closure, document supersession, flag obsolete branches for deletion. ## Next Actions (not in this commit) - Delete `feat/jobdori-127-clean` locally and on fork (after confirmation) - Delete `feat/jobdori-127-verb-suffix-flags` locally and on fork - Monitor whether any attached refactor content should be re-proposed in its own scoped PR Source: Jobdori cycle #32 dogfood in response to Clawhip 10-min nudge. Proposed Option A (separate fix from refactor); probe revealed the fix already landed via a different commit path, rendering the refactor-only branch obsolete. |
||
|
|
8011027df1 |
test: cycle #30 — lock OPT_OUT surface rejection (close parity test gap)
Cycle #30 dogfood found a testing gap: OPT_OUT surfaces were classified in code but their REJECTION behavior was never regression-tested.

## The Gap
OPT_OUT_AUDIT.md declares 12 surfaces as intentionally exempt from --output-format. The test suite had:
- ✅ test_clawable_surface_has_output_format (CLAWABLE must accept)
- ✅ test_every_registered_command_is_classified (no orphans)
- ❌ Nothing verifying OPT_OUT surfaces REJECT --output-format
If a developer accidentally added --output-format to 'summary' (one of the 12 OPT_OUT surfaces), no test would catch the silent promotion. The classification was governed, but the rejection behavior was NOT.

## What Changed
Added TestOptOutSurfaceRejection to test_cli_parity_audit.py with 14 tests:
1. **12 parametrized tests**: one per OPT_OUT surface, verifying each rejects --output-format with an argparse error.
2. **test_opt_out_set_matches_audit_document**: verifies the OPT_OUT_SURFACES constant matches the declared 12 surfaces in OPT_OUT_AUDIT.md.
3. **test_opt_out_count_matches_declared**: sanity check that the count stays at 12 as documented.

## Symmetry Achieved
Before: only CLAWABLE acceptance tested
- CLAWABLE accepts --output-format ✅
- OPT_OUT behavior: untested
After: full parity coverage
- CLAWABLE accepts --output-format ✅
- OPT_OUT rejects --output-format ✅
- Audit doc ↔ constant kept in sync ✅
This completes the parity enforcement loop: every new surface is explicitly IN or OUT, and BOTH directions are regression-locked.

## Promotion Path Preserved
When a real OPT_OUT surface gains genuine demand (per OPT_OUT_DEMAND_LOG.md):
1. Move from OPT_OUT_SURFACES to CLAWABLE_SURFACES
2. Update OPT_OUT_AUDIT.md with promotion rationale
3. Remove from this test's expected rejections
4. Tests pass (rejection test no longer runs; acceptance test now required)
Graceful promotion; no accidental drift.

## Test Count
- 222 → 236 passing (+14, zero regressions)
- 12 parametrized + 2 metadata = 14 new tests

## Discipline Check
Per cycle #24 calibration:
- Red-state bug? ✗ (no broken behavior)
- Real friction? ✓ (testing gap discovered by dogfood)
- Evidence-backed? ✓ (systematic probe revealed missing coverage)
This is the cycle #27 taxonomy (structural / quality / cross-channel / text-vs-JSON divergence) extending into classification: not just 'is the envelope right?' but 'is the OPPOSITE-OF-envelope right?' Future cycles can apply the same principle to other classifications: every governed non-goal deserves regression tests that lock its non-goal-ness.

Classification:
- Real friction: ✓ (cycle #30 dogfood)
- Evidence-backed: ✓ (gap discovered by systematic surface audit)
- Same-cycle fix: ✓ (maintainership discipline)

Source: Jobdori cycle #30 proactive dogfood. Probed all 26 subcommands with --output-format json and noticed the OPT_OUT rejection pattern was unverified by any dedicated test. |
||
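The rejection check described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the project's actual test: `build_parser`, `rejects_output_format`, and the surface name `example-opt-out` are hypothetical stand-ins, and only `summary` and `load-session` are names taken from the log.

```python
import argparse
import contextlib
import io

# Hypothetical surface sets; the real project declares 12 OPT_OUT surfaces.
CLAWABLE_SURFACES = {"load-session"}
OPT_OUT_SURFACES = {"summary", "example-opt-out"}


def build_parser() -> argparse.ArgumentParser:
    """Stand-in parser: only CLAWABLE surfaces receive --output-format."""
    parser = argparse.ArgumentParser(prog="claw")
    sub = parser.add_subparsers(dest="command", required=True)
    for name in sorted(CLAWABLE_SURFACES | OPT_OUT_SURFACES):
        cmd = sub.add_parser(name)
        if name in CLAWABLE_SURFACES:
            cmd.add_argument("--output-format", choices=["text", "json"])
    return parser


def rejects_output_format(surface: str) -> bool:
    """Return True if argparse exits with status 2 when the flag is passed."""
    parser = build_parser()
    # argparse writes its usage/error text to stderr; keep the sketch quiet.
    with contextlib.redirect_stderr(io.StringIO()):
        try:
            parser.parse_args([surface, "--output-format", "json"])
        except SystemExit as exc:
            return exc.code == 2
    return False
```

In a pytest suite the same helper could be driven by `@pytest.mark.parametrize` over `sorted(OPT_OUT_SURFACES)`, which is one way to get one test per surface.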
|
|
d7ea17b3e9 |
docs+test: cycle #29 — document + lock text-mode vs JSON-mode exit divergence
Cycle #29 dogfood found a real pinpoint: cross-mode exit code divergence.

## The Pinpoint
Dogfooding the CLI revealed that unknown subcommand errors return different exit codes depending on output mode:
    $ python3 -m src.main nonexistent-cmd                        # exit 2
    $ python3 -m src.main nonexistent-cmd --output-format json   # exit 1
ERROR_HANDLING.md documented the exit-code contract (1=parse, 2=timeout) but did NOT explicitly state that the contract applies only to JSON mode. Text mode follows argparse defaults (exit 2 for any parse error), which violates the documented contract when interpreted generally. A claw using text mode with 'claw nonexistent' would see exit 2 and misclassify it as a timeout per the docs. Real protocol contract gap, not an implementation bug.

## Classification
This is a DOCUMENTATION gap, not a behavior bug:
- Text mode follows argparse convention (reasonable for humans)
- JSON mode normalizes to the documented contract (reasonable for claws)
- The divergence is intentional; only the docs were silent about it
Fix = document the divergence explicitly + lock it with tests.
NOT fix = change text mode exit code to 1 (would break argparse conventions and confuse human users).

## Documentation Changes
ERROR_HANDLING.md:
1. Added IMPORTANT callout in Quick Reference section: 'The exit code contract applies ONLY when --output-format json is explicitly set. Text mode follows argparse conventions.'
2. New 'Text mode vs JSON mode exit codes' table showing exact divergence:
   - Unknown subcommand: text=2, json=1
   - Missing required arg: text=2, json=1
   - Session not found: text=1, json=1 (app-level, identical)
   - Success: text=0, json=0 (identical)
   - Timeout: text=2, json=2 (identical, #161)
3. Practical rule: 'always pass --output-format json'

## Tests Added (5)
TestTextVsJsonModeDivergence in test_cross_channel_consistency.py:
1. test_unknown_command_text_mode_exits_2: text mode argparse default
2. test_unknown_command_json_mode_exits_1: JSON mode contract normalized
3. test_missing_required_arg_text_mode_exits_2: same for missing args
4. test_missing_required_arg_json_mode_exits_1: same normalization
5. test_success_path_identical_in_both_modes: success exit identical
These tests LOCK the expected divergence so:
- Documentation stays aligned with implementation
- Future changes (either direction) are caught as intentional
- Claws trust the docs

## Test Status
- 217 → 222 tests passing (+5)
- Zero regressions

## Discipline
This cycle follows the cycle #28 template exactly:
- Dogfood probe revealed real friction (test said exit=2, docs said exit=1)
- Minimal fix shape (documentation clarification, not code change)
- Regression guard via tests
- Evidence-backed, not speculative
Relationship to #181:
- #181 fixed env.exit_code != process exit (WITHIN JSON mode)
- Cycle #29 clarifies exit code contract scope (ONLY JSON mode)
- Both establish: exit codes are deterministic, but only when --output-format json is set

---
Classification (per cycle #24 calibration):
- Red-state bug? ✗ (behavior was reasonable, docs were incomplete)
- Real friction? ✓ (docs/code divergence revealed by dogfood)
- Evidence-backed? ✓ (test suite probed both modes, found the gap)

Source: Jobdori cycle #29 proactive dogfood, in response to Clawhip nudge for pinpoint hunting. Found that text-mode errors return exit 2 but ERROR_HANDLING.md implied exit 1 was the parse-error contract universally. |
||
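To make the practical rule from the commit above concrete, the divergence table can be encoded as data and queried from the caller's side. This is only a sketch: the scenario keys and the `scenarios_for` helper are hypothetical names, while the exit codes are copied from the table in the commit message.

```python
# Exit codes copied from the "Text mode vs JSON mode exit codes" table
# quoted in the commit message; the dictionary keys are illustrative names.
EXIT_CODES = {
    "unknown_subcommand": {"text": 2, "json": 1},
    "missing_required_arg": {"text": 2, "json": 1},
    "session_not_found": {"text": 1, "json": 1},
    "success": {"text": 0, "json": 0},
    "timeout": {"text": 2, "json": 2},
}


def scenarios_for(mode: str, exit_code: int) -> list[str]:
    """List every scenario that yields `exit_code` in the given output mode."""
    return sorted(
        name for name, codes in EXIT_CODES.items() if codes[mode] == exit_code
    )
```

Under this encoding, exit 2 in JSON mode maps to a single scenario, while exit 2 in text mode maps to three, which is the ambiguity the 'always pass --output-format json' rule avoids.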
|
|
5db02bce18 |
feat: #180 implement --version flag for metadata protocol (#28 proactive demand)
Cycle #28 closes the low-hanging metadata protocol gap identified in #180.

## The Gap
Pinpoint #180 (filed cycle #24) documented a metadata protocol gap:
- `--help` works (argparse default)
- `--version` does NOT exist
The ROADMAP entry deferred implementation pending demand. Cycle #28 dogfood probe found this during routine invariant audit (attempt to call `--version` as part of comprehensive CLI surface coverage). This is concrete evidence of real friction, not speculative gap-filling.

## Implementation
Added `--version` flag to argparse in `build_parser()`:
```python
parser.add_argument('--version', action='version', version='claw-code 1.0.0 (Python harness)')
```
Simple one-liner. Follows Python argparse conventions (built-in action='version').

## Tests Added (3)
TestMetadataFlags in test_exec_route_bootstrap_output_format.py:
1. test_version_flag_returns_version_text: `claw --version` prints version
2. test_help_flag_returns_help_text: `claw --help` still works
3. test_help_still_works_after_version_added: both -h and --help work
Regression guard on the original help surface.

## Test Status
- 214 → 217 tests passing (+3)
- Zero regressions
- Full suite green

## Discipline
This cycle exemplifies the cycle #24 calibration:
- #180 was filed as 'deferred pending demand'
- Cycle #28 dogfood found actual friction (proactive test coverage gap)
- Evidence = concrete ('--version not found during invariant audit')
- Action = minimal implementation + regression tests
- No speculation, no feature creep, no implementation before evidence
Not 'we imagined someone might want this.' Instead: 'we tried to call it during routine maintenance, got ENOENT, fixed it.'

## Related
- #180 (cycle #24): Metadata protocol gap filed
- Cycle #27: Cross-channel consistency audit established framework
- Cycle #28 invariant audit: Discovered actual friction, triggered fix

---
Classification (per cycle #24 calibration):
- Red-state bug? ✗ (not a malfunction, just an absence)
- Real friction? ✓ (audit probe could not call the flag, had to special-case)
- Evidence-backed? ✓ (proactive test coverage revealed the gap)

Source: Jobdori cycle #28 dogfood. Invariant audit attempting comprehensive CLI surface coverage found that --version was unsupported. |
||
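A small sketch of how the `--version` behavior from the commit above can be exercised in isolation. The `add_argument` call is the one quoted in the commit message; the `prog` name, the bare `build_parser` stand-in, and the `run_version` helper are assumptions for illustration only.

```python
import argparse
import contextlib
import io


def build_parser() -> argparse.ArgumentParser:
    """Stand-in for the project's build_parser(); only --version is modeled."""
    parser = argparse.ArgumentParser(prog="claw")
    parser.add_argument(
        "--version",
        action="version",
        version="claw-code 1.0.0 (Python harness)",
    )
    return parser


def run_version() -> tuple[int, str]:
    """Invoke --version and return (exit status, printed text)."""
    out = io.StringIO()
    with contextlib.redirect_stdout(out):
        try:
            build_parser().parse_args(["--version"])
        except SystemExit as exc:
            # action='version' prints the string and exits the parser.
            return exc.code, out.getvalue().strip()
    raise AssertionError("--version should exit the parser")
```

On Python 3.8 and later the version string goes to stdout and the exit status is 0, so a regression test can assert on both.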
|
|
7f56f71761 |
test: cycle #27 — cross-channel consistency audit suite
Cycle #27 ships a new test class systematizing the three-layer protocol invariant framework.

## Context
After cycles #20–#26, the protocol has three distinct invariant classes:
1. **Structural compliance** (#178): Does the envelope exist?
2. **Quality compliance** (#179): Is stderr silent + error message truthful?
3. **Cross-channel consistency** (#181 + NEW): Do multiple channels agree?
#181 revealed a critical gap: the second test class was incomplete. Envelopes could be structurally valid, quality-compliant, but still lie about their own state (envelope.exit_code != actual exit).

## New Test Class
TestCrossChannelConsistency in test_cross_channel_consistency.py captures the third invariant layer with 5 dedicated tests:
1. envelope.command ↔ dispatched subcommand
2. envelope.output_format ↔ --output-format flag
3. envelope.timestamp ↔ actual wall clock (recent, <5s)
4. envelope.exit_code ↔ process exit code (cycle #26/#181 regression guard)
5. envelope boolean fields (found/handled/deleted) ↔ error block presence
Each test specifically targets cross-channel truth, not structure or quality.

## Why Separate Test Classes Matter
A command can fail all three ways independently:
| Failure mode | Symptom | Test class | Example |
|---|---|---|---|
| Structural | stderr noise | TestParseErrorEnvelope | argparse leaks to stderr |
| Quality | correct shape, wrong message | TestParseErrorStderrHygiene | generic error instead of real message |
| Cross-channel | truthy field, lie about state | TestCrossChannelConsistency | exit_code: 0 but exit 1 |
#181 was invisible to the first two classes. A claw passing all structure/quality tests could still be misled. The third class catches that.

## Audit Results (Cycle #27)
All 5 tests pass, with no drift detected in any channel pair:
- ✅ Envelope command always matches dispatch
- ✅ Envelope output_format always matches flag
- ✅ Envelope timestamp always recent (<5s)
- ✅ Envelope exit_code always matches process exit (post-#181 guard)
- ✅ Boolean fields consistent with error block presence
The systematic audit proved the fix from #181 holds, and identified no new cross-channel gaps.

## Test Impact
- 209 → 214 tests passing (+5)
- Zero regressions
- New invariant class now has dedicated test suite
- Future cross-channel bugs will be caught by this class

## Related
- #178 (#19): Parser-front-door structural contract
- #179 (#20): Stderr hygiene + real error message quality
- #181 (#26): Envelope exit_code must match process exit
- #182-N: Future cross-channel contract violations will be caught by TestCrossChannelConsistency
This test class is evergreen: as new fields/channels are added to the protocol, invariants for those channels should be added here, not mixed with other test classes. Keeping invariant classes separate makes regression attribution instant (e.g., 'TestCrossChannelConsistency failed' = 'some truth channel disagreed').

Classification (per cycle #24 calibration):
- Red-state bug: ✗ (audit is green)
- Real friction: ✓ (structured audit of documented invariants)
- Proof of equilibrium: ✓ (systematic verification, no gaps found)

Source: Jobdori cycle #27 proactive invariant audit, following gaebal guidance to probe documented invariants, not speculative gaps. |
||
|
|
db5d5beb31 |
fix: #181 — envelope exit_code must match process exit code (exec-command/exec-tool)
Cycle #26 dogfood found a real red-state bug in the JSON envelope contract.

## The Bug
exec-command and exec-tool not-found cases return exit code 1 from the process, but the envelope reports exit_code: 0 (the default from wrap_json_envelope). This is a protocol violation.
Repro (before fix):
    $ claw exec-command unknown-cmd test --output-format json > out.json
    $ echo $?
    1
    $ jq '.exit_code' out.json
    0  # WRONG: envelope lies about exit code
Claws reading the envelope's exit_code field get misinformation. A claw implementing the canonical ERROR_HANDLING.md pattern (check exit_code, then classify by error.kind) would incorrectly treat failures as successes when dispatching on the envelope alone.

## Root Cause
main.py lines 687–739 (exec-command + exec-tool handlers):
- Return statement: 'return 0 if result.handled else 1' (correct)
- Envelope wrap: 'wrap_json_envelope(envelope, args.command)' (uses default exit_code=0, IGNORES the return value)
The envelope wrap was called BEFORE the return value was computed, so the exit_code field was never synchronized with the actual exit code.

## The Fix
Compute exit_code ONCE at the top:
    exit_code = 0 if result.handled else 1
Pass it explicitly to wrap_json_envelope:
    wrap_json_envelope(envelope, args.command, exit_code=exit_code)
Return the same value:
    return exit_code
This ensures the envelope's exit_code field is always truth: the SAME value the process returns.

## Tests Added (3)
TestEnvelopeExitCodeMatchesProcessExit in test_exec_route_bootstrap_output_format.py:
1. test_exec_command_not_found_envelope_exit_matches: Verifies exec-command unknown-cmd returns exit 1 in both envelope and process.
2. test_exec_tool_not_found_envelope_exit_matches: Same for exec-tool.
3. test_all_commands_exit_code_invariant: Audit across 4 known non-zero cases (show-command, show-tool, exec-command, exec-tool not-found). Guards against the same bug in other surfaces.

## Impact
- 206 → 209 passing tests (+3)
- Zero regressions
- Protocol contract now truthful: envelope.exit_code == process exit
- Claws using the one-handler pattern from ERROR_HANDLING.md now get correct information

## Related
- ERROR_HANDLING.md (cycle #22): Documented exit_code as machine-readable contract field
- #178/#179 (cycles #19/#20): Closed parser-front-door contract
- This closes a gap in the WORK PROTOCOL contract: envelope values must match reality, not just be structurally present.

Classification (per cycle #24 calibration):
- Red-state bug: ✓ (contract violation, claws get misinformation)
- Real friction: ✓ (discovered via dogfood, not speculative)
- Fix ships same-cycle: ✓ (discipline per maintainership mode)

Source: Jobdori cycle #26 dogfood. Ran multiple edge-case probes, noticed exec-command envelope showed exit_code: 0 while process exited 1. Investigated wrap_json_envelope default behavior, confirmed bug, fixed and tested in same cycle. |
||
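The shape of the fix in the commit above (compute the exit code once, then feed the same value to both the envelope and the return) might look like the sketch below. The body of `wrap_json_envelope` is a hypothetical stand-in; only its call shape with `exit_code=` comes from the commit message, and `build_exec_envelope` is an illustrative helper rather than the real handler.

```python
from datetime import datetime, timezone
from types import SimpleNamespace


def wrap_json_envelope(payload: dict, command: str, exit_code: int = 0) -> dict:
    """Hypothetical stand-in: the default exit_code=0 is what caused the bug."""
    return {
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "command": command,
        "exit_code": exit_code,
        "output_format": "json",
        **payload,
    }


def build_exec_envelope(result, command: str) -> tuple[int, dict]:
    """Derive the exit code once so the envelope and the process cannot drift."""
    exit_code = 0 if result.handled else 1
    envelope = wrap_json_envelope(
        {"handled": result.handled}, command, exit_code=exit_code
    )
    return exit_code, envelope


# Usage: a not-found result yields exit 1 in both channels.
code, env = build_exec_envelope(SimpleNamespace(handled=False), "exec-command")
```

Keeping a single `exit_code` variable is the design point: there is no second expression that could silently fall back to the default.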
|
|
3bd0fd1f9d |
docs: USAGE.md — cross-link ERROR_HANDLING.md for subprocess orchestration
Cycle #25 ships navigation improvements connecting USAGE (setup/interactive) to ERROR_HANDLING.md (subprocess/orchestration patterns).

Before: USAGE.md had a JSON scripting mention but no link to the error-handling guide. New users reading USAGE would see JSON is available, but wouldn't discover the error-handling pattern without accidentally finding ERROR_HANDLING.md.

After: Two strategic cross-links:
1. Top-level tip box: "Building orchestration code? See ERROR_HANDLING.md"
2. JSON scripting section expanded with examples + link to unified pattern

Changes to USAGE.md:
- Added TIP callout near top linking to ERROR_HANDLING.md
- Expanded "JSON output for scripting" section:
  - Explains what the envelope contains (exit_code, command, timestamp, fields)
  - Added 3 command examples (prompt, load-session, turn-loop)
  - Added callout for dispatchers/orchestrators pointing to ERROR_HANDLING pattern

Impact: Operators reading USAGE for "how do I call claw from scripts?" now immediately see the canonical answer (ERROR_HANDLING.md) instead of having to reverse-engineer it from code examples.

No code changes. Pure navigation/documentation.

Continues the documentation-governance pattern: the work protocol (14 clawable commands) has a consumption guide (ERROR_HANDLING.md), and that guide is now reachable from the main entry point (USAGE.md + README.md top nav). |
||
|
|
8918e8ae77 |
docs: README.md — promote ERROR_HANDLING.md to first-class navigation
Cycle #23 ships a documentation discoverability fix. After cycle #22 shipped ERROR_HANDLING.md, the next natural step is making it discoverable from the project's entry point (README.md).

Before: README top navigation linked to USAGE, PARITY, ROADMAP, Rust workspace. ERROR_HANDLING.md was buried in CLAUDE.md references.

After: ERROR_HANDLING.md is now in the top navigation (right after USAGE, before Rust workspace). Also added SCHEMAS.md mention in repository shape.

This signals that:
1. Error handling is a first-class concern (not an afterthought)
2. The Python harness documentation (SCHEMAS.md, ERROR_HANDLING.md, CLAUDE.md) is part of the official docs, not just dogfood artifacts
3. New users/claws can discover the error-handling pattern at the entry point

Impact: Operators building orchestration code will immediately see the 'Error Handling' link in navigation, shortening the path to understanding how to consume the protocol reliably.

No code changes. No test changes. Pure navigation/discoverability. |
||
|
|
2cf4334f3e |
docs: ERROR_HANDLING.md — unified error handler pattern for orchestration code
Cycle #22 ships documentation that operationalizes pinpoints #178–#179.

Problem context: After #178 (parse-error envelope) and #179 (stderr hygiene + real error message), claws can now build a unified error handler for all 14 clawable commands. But there was no guide on how to actually do that. Operators had the pieces; they didn't have the pattern. This file changes that.

New file: ERROR_HANDLING.md
- Quick reference: exit codes + envelope shapes (0=success, 1=error, 2=timeout)
- One-handler pattern: ~80 lines of Python showing how to parse error.kind, check retryable, and decide recovery strategy
- Four practical recovery patterns:
  - Retry on transient errors (filesystem, timeout)
  - Reuse session after timeout (if cancel_observed=true)
  - Validate command syntax before dispatch (dry-run --help)
  - Log errors for observability
- Error kinds enumeration (parse, session_not_found, filesystem, runtime, timeout)
- Common mistakes to avoid (6 patterns with BAD vs GOOD examples)
- Testing your error handler (unit test examples)

Operational impact: Orchestration code now has a canonical pattern. Claws can:
- Copy-paste the run_claw_command() function (works for all commands)
- Classify errors uniformly (no special cases per command)
- Decide recovery deterministically (error.kind + retryable + cancel_observed)
- Log/monitor/escalate with confidence

Related cycles:
- #178: Parse-error envelope (commands now emit structured JSON on invalid argv)
- #179: Stderr hygiene + real message (JSON mode silences argparse, carries actual error)
- #164 Stage B: cancel_observed field (callers know if session is safe for reuse)

Updated CLAUDE.md:
- Added ERROR_HANDLING.md to 'Related docs' section
- Now documents the one-handler pattern as a guideline

No code changes. No test changes. Pure documentation.

This completes the documentation trail from protocol (SCHEMAS.md) → governance (OPT_OUT_AUDIT.md, OPT_OUT_DEMAND_LOG.md) → practice (ERROR_HANDLING.md). |
||
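The one-handler idea in the commit above (exit code first, then error.kind, retryable, and cancel_observed) can be sketched as a pure classification function. This is not the `run_claw_command()` function from ERROR_HANDLING.md, which is not reproduced in this log: `classify_result`, the action names, and the placement of `cancel_observed` inside the error block are all assumptions.

```python
import json


def classify_result(returncode: int, stdout: str) -> str:
    """Map a finished JSON-mode invocation to a hypothetical recovery action."""
    if returncode == 0:
        return "ok"
    try:
        envelope = json.loads(stdout)
    except json.JSONDecodeError:
        # No parseable envelope on stdout: nothing deterministic to act on.
        return "escalate"
    error = envelope.get("error") or {}
    if error.get("kind") == "timeout":
        # Assumption: cancel_observed lives in the error block and signals
        # that the session is safe to reuse.
        if error.get("cancel_observed"):
            return "retry_same_session"
        return "retry_new_session"
    if error.get("retryable"):
        return "retry"
    # Parse errors and other non-retryable kinds will not fix themselves.
    return "escalate"
```

Separating classification from process spawning keeps the handler testable without invoking the CLI; a caller would pass in the return code and captured stdout from whatever subprocess wrapper it already uses.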
|
|
beb1ad3d60 |
docs: OPT_OUT_DEMAND_LOG.md — evidentiary base for governance decisions
Cycle #21 ships governance infrastructure, not implementation. Maintainership mode means sometimes the right deliverable is a decision framework, not code.

Problem context: OPT_OUT_AUDIT.md (cycle #18 bonus) established 'demand-backed audit' as the next step. But without a structured way to record demand signals, 'demand-backed' was just a slogan: the next audit cycle would have no evidence to work from.

This commit creates the evidentiary base:

New file: OPT_OUT_DEMAND_LOG.md
- Per-surface entries for all 12 OPT_OUT commands (Groups A/B/C)
- Current state: 0 signals across all surfaces (consistent with audit prediction)
- Signal entry template with required fields:
  - Source (who/what)
  - Use case (concrete orchestration problem)
  - Markdown-alternative-checked (why existing output insufficient)
  - Date
- Promotion thresholds:
  - 2+ independent signals for same surface → file promotion pinpoint
  - 1 signal + existing stable schema → file pinpoint for discussion
  - 0 signals → stays OPT_OUT (rationale preserved)

Decision framework for cycle #22 (audit close):
- If 0 signals total: move to PERMANENTLY_OPT_OUT, close audit
- If 1-2 signals: file individual promotion pinpoints with evidence
- If 3+ signals: reopen audit, question classification itself

Updated files:
- OPT_OUT_AUDIT.md: Added demand log reference in Related section
- CLAUDE.md: Added prerequisites for promotions (must have logged signals), added 'File a demand signal' workflow section

Philosophy: 'Prevent speculative expansion' is a schema bloat protection discipline. Every new CLAWABLE surface is a maintenance tax. The evidence requirement keeps the protocol lean. OPT_OUT surfaces are intentionally not-clawable until proven otherwise by external demand.

Operational impact: Next cycles can now:
1. Watch for real claws hitting OPT_OUT surface limits
2. Log signals in structured format (no ad-hoc filing)
3. Run audit at cycle #22 with actual data, not speculation

No code changes. No test changes. Pure governance infrastructure.

Related: cycle #18 (OPT_OUT_AUDIT.md), maintainership phase transition. |
||
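The promotion thresholds and the audit-close framework in the commit above read naturally as two small decision functions. The sketch below is illustrative only: both function names are hypothetical, and the handling of a single signal without a stable schema is an assumption, since the commit message does not state that case.

```python
def promotion_decision(independent_signals: int, has_stable_schema: bool) -> str:
    """Per-surface decision from the promotion thresholds."""
    if independent_signals >= 2:
        return "file promotion pinpoint"
    if independent_signals == 1 and has_stable_schema:
        return "file pinpoint for discussion"
    # Assumption: one signal without a stable schema stays logged but does
    # not trigger a pinpoint; the commit only spells out the zero-signal case.
    return "stays OPT_OUT"


def audit_decision(total_signals: int) -> str:
    """Audit-close decision across all OPT_OUT surfaces."""
    if total_signals == 0:
        return "move to PERMANENTLY_OPT_OUT, close audit"
    if total_signals <= 2:
        return "file individual promotion pinpoints"
    return "reopen audit"
```

Encoding the thresholds this way makes the unstated edge case visible, which may be worth resolving in OPT_OUT_DEMAND_LOG.md itself.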
|
|
773d10615e |
fix: #179 — JSON mode now fully suppresses argparse stderr + preserves real error message
Dogfood discovered #178 had two residual gaps:
1. Stderr pollution: argparse usage + error text still leaked to stderr even in JSON mode (envelope was correct on stdout, but stderr noise broke the 'machine-first protocol' contract; claws capturing both streams got dual output)
2. Generic error message: envelope carried 'invalid command or argument (argparse rejection)' instead of argparse's actual text like 'the following arguments are required: session_id' or 'invalid choice: typo (choose from ...)'

Before #179:
    $ claw load-session --output-format json
    [stdout] {"error": {"message": "invalid command or argument (argparse rejection)"}}
    [stderr] usage: main.py load-session [-h] ...
             main.py load-session: error: the following arguments are required: session_id
    [exit 1]

After #179:
    $ claw load-session --output-format json
    [stdout] {"error": {"message": "the following arguments are required: session_id"}}
    [stderr] (empty)
    [exit 1]

Implementation:
- New _ArgparseError exception class captures argparse's real message
- main() monkey-patches parser.error (+ all subparser.error) in JSON mode to raise _ArgparseError instead of print-to-stderr + sys.exit(2)
- _emit_parse_error_envelope() now receives the real message verbatim
- Text mode path unchanged: still uses original argparse print+exit behavior

Contract:
- JSON mode: stdout carries envelope with argparse's actual error; stderr silent
- Text mode: unchanged (argparse usage to stderr, exit 2)
- Parse errors still error.kind='parse', retryable=false

Test additions (5 new, 14 total in test_parse_error_envelope.py):
- TestParseErrorStderrHygiene (5):
  - test_json_mode_stderr_is_silent_on_unknown_command
  - test_json_mode_stderr_is_silent_on_missing_arg
  - test_json_mode_envelope_carries_real_argparse_message
  - test_json_mode_envelope_carries_invalid_choice_details (verifies valid-choices list)
  - test_text_mode_stderr_preserved_on_unknown_command (backward compat)

Operational impact: Claws capturing both stdout and stderr no longer get garbled output. The envelope message now carries discoverability info (valid command list, missing-arg name) that claws can use for retry/recovery without probing the CLI a second time.

Test results: 201 → 206 passing, 3 skipped unchanged, zero regression.
Pinpoint discovered via dogfood at 2026-04-22 20:30 KST (cycle #20). |
||
|
|
272534056f |
feat: #178 — argparse errors emit JSON envelope when --output-format json requested
Dogfood pinpoint: running 'claw nonexistent-command --output-format json' bypasses
the JSON envelope contract — argparse dumps human-readable usage to stderr with
exit 2, breaking the SCHEMAS.md guarantee that JSON mode returns structured output.
Problem:
    $ claw nonexistent-command --output-format json
    usage: main.py [-h] {summary,manifest,...} ...
    main.py: error: argument command: invalid choice: 'nonexistent-command' (choose from ...)
    [exit 2: no envelope, claws must parse argparse usage messages]
Fix:
    $ claw nonexistent-command --output-format json
    {
      "timestamp": "2026-04-22T11:00:29Z",
      "command": "nonexistent-command",
      "exit_code": 1,
      "output_format": "json",
      "schema_version": "1.0",
      "error": {
        "kind": "parse",
        "operation": "argparse",
        "target": "nonexistent-command",
        "retryable": false,
        "message": "invalid command or argument (argparse rejection)",
        "hint": "run with no arguments to see available subcommands"
      }
    }
    [exit 1, clean JSON envelope on stdout per SCHEMAS.md]
Changes:
- src/main.py:
  - _wants_json_output(argv): pre-scan for --output-format json before parsing
  - _emit_parse_error_envelope(argv, message): emit wrapped envelope on stdout
  - main(): catch SystemExit from argparse; if JSON requested, emit envelope instead of letting argparse's help dump go through
- tests/test_parse_error_envelope.py (new, 9 tests):
  - TestParseErrorJsonEnvelope (7): unknown command, =syntax, text mode unchanged, invalid flag, missing command, valid command unaffected, common fields
  - TestParseErrorSchemaCompliance (2): error.kind='parse', retryable=false
Contract:
- text mode (default): unchanged — argparse dumps help to stderr, exits 2
- JSON mode: envelope per SCHEMAS.md, error.kind='parse', exit 1
- Parse errors always retryable=false (typo won't self-fix)
- error.kind='parse' already enumerated in SCHEMAS.md (no schema changes)
This closes a real gap: claws invoking unknown commands in JSON mode can now route
via exit code + envelope.kind='parse' instead of scraping argparse output.
Test results: 192 → 201 passing, 3 skipped unchanged, zero regression.
Pinpoint discovered via dogfood at 2026-04-22 19:59 KST (cycle #19).
|