mirror of
https://github.com/ultraworkers/claw-code.git
synced 2026-04-24 21:28:11 +08:00
250 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
bc259ec6f9 |
fix: #149 — eliminate parallel-test flake in runtime::config tests
## Problem `runtime::config::tests::validates_unknown_top_level_keys_with_line_and_field_name` intermittently fails during `cargo test --workspace` (witnessed during #147 and #148 workspace runs) but passes deterministically in isolation. Example failure from workspace run: test result: FAILED. 464 passed; 1 failed ## Root cause `runtime/src/config.rs::tests::temp_dir()` used nanosecond timestamp alone for namespace isolation: std::env::temp_dir().join(format!("runtime-config-{nanos}")) Under parallel test execution on fast machines with coarse clock resolution, two tests start within the same nanosecond bucket and collide on the same path. One test's `fs::remove_dir_all(root)` then races another's in-flight `fs::create_dir_all()`. Other crates already solved this pattern: - plugins::tests::temp_dir(label) — label-parameterized - runtime::git_context::tests::temp_dir(label) — label-parameterized runtime/src/config.rs was missed. ## Fix Added process id + monotonically-incrementing atomic counter to the namespace, making every callsite provably unique regardless of clock resolution or scheduling: static COUNTER: AtomicU64 = AtomicU64::new(0); let pid = std::process::id(); let seq = COUNTER.fetch_add(1, Ordering::Relaxed); std::env::temp_dir().join(format!("runtime-config-{pid}-{nanos}-{seq}")) Chose counter+pid over the label-parameterized pattern to avoid touching all 20 callsites in the same commit (mechanical noise with no added safety — counter alone is sufficient). ## Verification Before: one failure per workspace run (config test flake). After: 5 consecutive `cargo test --workspace` runs — zero config test failures. Only pre-existing `resume_latest` flake remains (orthogonal, unrelated to this change). for i in 1 2 3 4 5; do cargo test --workspace; done # All 5 runs: config tests green. Only resume_latest flake appears. cargo test -p runtime # 465 passed; 0 failed ## ROADMAP.md Added Pinpoint #149 documenting the gap, root cause, and fix. Closes ROADMAP #149. |
||
|
|
f84c7c4ed5 |
feat: #148 + #128 closure — model provenance in claw status JSON/text
## Scope Two deltas in one commit: ### #128 closure (docs) Re-verified on main HEAD `4cb8fa0`: malformed `--model` strings already rejected at parse time (`validate_model_syntax` in parse_args). All historical repro cases now produce specific errors: claw --model '' → error: model string cannot be empty claw --model 'bad model' → error: invalid model syntax: 'bad model' contains spaces claw --model 'sonet' → error: invalid model syntax: 'sonet'. Expected provider/model or known alias claw --model '@invalid' → error: invalid model syntax: '@invalid'. Expected provider/model ... claw --model 'totally-not-real-xyz' → error: invalid model syntax: ... claw --model sonnet → ok, resolves to claude-sonnet-4-6 claw --model anthropic/claude-opus-4-6 → ok, passes through Marked #128 CLOSED in ROADMAP with repro block. Residual provenance gap split off as #148. ### #148 implementation **Problem.** After #128 closure, `claw status --output-format json` still surfaces only the resolved model string. No way for a claw to distinguish whether `claude-sonnet-4-6` came from `--model sonnet` (alias resolution) vs `--model claude-sonnet-4-6` (pass-through) vs `ANTHROPIC_MODEL` env vs `.claw.json` config vs compiled-in default. Debug forensics had to re-read argv instead of reading a structured field. Clawhip orchestrators sending `--model` couldn't confirm the flag was honored vs falling back to default. **Fix.** Added two fields to status JSON envelope: - `model_source`: "flag" | "env" | "config" | "default" - `model_raw`: user's input before alias resolution (null on default) Text mode appends a `Model source` line under `Model`, showing the source and raw input (e.g. `Model source flag (raw: sonnet)`). **Resolution order** (mirrors resolve_repl_model but with source attribution): 1. If `--model` / `--model=` flag supplied → source: flag, raw: flag value 2. Else if ANTHROPIC_MODEL set → source: env, raw: env value 3. Else if `.claw.json` model key set → source: config, raw: config value 4. Else → source: default, raw: null ## Changes ### rust/crates/rusty-claude-cli/src/main.rs - Added `ModelSource` enum (Flag/Env/Config/Default) with `as_str()`. - Added `ModelProvenance` struct (resolved, raw, source) with three constructors: `default_fallback()`, `from_flag(raw)`, and `from_env_or_config_or_default(cli_model)`. - Added `model_flag_raw: Option<String>` field to `CliAction::Status`. - Parse loop captures raw input in `--model` and `--model=` arms. - Extended `parse_single_word_command_alias` to thread `model_flag_raw: Option<&str>` through. - Extended `print_status_snapshot` signature to accept `model_flag_raw: Option<&str>`. Resolves provenance at dispatch time (flag provenance from arg; else probe env/config/default). - Extended `status_json_value` signature with `provenance: Option<&ModelProvenance>`. On Some, adds `model_source` and `model_raw` fields; on None (legacy resume paths), omits them for backward compat. - Extended `format_status_report` signature with optional provenance. On Some, renders `Model source` line after `Model`. - Updated all existing callers (REPL /status, resume /status, tests) to pass None (legacy paths don't carry flag provenance). - Added 2 regression assertions in parse_args test covering both `--model sonnet` and `--model=...` forms. ### ROADMAP.md - Marked #128 CLOSED with re-verification block. - Filed #148 documenting the provenance gap split, fix shape, and acceptance criteria. ## Live verification $ claw --model sonnet --output-format json status | jq '{model,model_source,model_raw}' {"model": "claude-sonnet-4-6", "model_source": "flag", "model_raw": "sonnet"} $ claw --output-format json status | jq '{model,model_source,model_raw}' {"model": "claude-opus-4-6", "model_source": "default", "model_raw": null} $ ANTHROPIC_MODEL=haiku claw --output-format json status | jq '{model,model_source,model_raw}' {"model": "claude-haiku-4-5-20251213", "model_source": "env", "model_raw": "haiku"} $ echo '{"model":"claude-opus-4-7"}' > .claw.json && claw --output-format json status | jq '{model,model_source,model_raw}' {"model": "claude-opus-4-7", "model_source": "config", "model_raw": "claude-opus-4-7"} $ claw --model sonnet status Status Model claude-sonnet-4-6 Model source flag (raw: sonnet) Permission mode danger-full-access ... ## Tests - rusty-claude-cli bin: 177 tests pass (2 new assertions for #148) - Full workspace green except pre-existing resume_latest flake (unrelated) Closes ROADMAP #128, #148. |
||
|
|
4cb8fa059a |
feat: #147 — reject empty / whitespace-only prompts at CLI fallthrough
## Problem
The `"prompt"` subcommand arm enforced `if prompt.trim().is_empty()`
and returned a specific error. The fallthrough `other` arm in the same
match block — which routes any unrecognized first positional arg to
`CliAction::Prompt` — had no such guard. Result:
$ claw ""
error: missing Anthropic credentials; export ANTHROPIC_AUTH_TOKEN ...
$ claw " "
error: missing Anthropic credentials; ...
$ claw "" ""
error: missing Anthropic credentials; ...
$ claw --output-format json ""
{"error":"missing Anthropic credentials; ...","type":"error"}
An empty prompt should never reach the credentials check. Worse: with
valid credentials, the literal empty string gets sent to Claude as a
user prompt, either burning tokens for nothing or triggering a model-
side refusal. Same prompt-misdelivery family as #145.
## Root cause
In `parse_subcommand()`, the final `other =>` arm in the top-level
match only guards against typos (#108 guard via `looks_like_subcommand_typo`)
and then unconditionally builds `CliAction::Prompt { prompt: rest.join(" ") }`.
An empty/whitespace-only join passes through.
## Changes
### rust/crates/rusty-claude-cli/src/main.rs
Added the same `if joined.trim().is_empty()` guard already used in the
`"prompt"` arm to the fallthrough path. Error message distinguishes it
from the `prompt` subcommand path:
empty prompt: provide a subcommand (run `claw --help`) or a
non-empty prompt string
Runs AFTER the typo guard (so `claw sttaus` still suggests `status`)
and BEFORE CliAction::Prompt construction (so no network call ever
happens for empty inputs).
### Regression tests
Added 4 assertions in the existing parse_args test:
- parse_args([""]) → Err("empty prompt: ...")
- parse_args([" "]) → Err("empty prompt: ...")
- parse_args(["", ""]) → Err("empty prompt: ...")
- parse_args(["sttaus"]) → Err("unknown subcommand: ...") [verifies #108 typo guard still takes precedence]
### ROADMAP.md
Added Pinpoint #147 documenting the gap, verification, root cause,
fix shape, and acceptance. Joins the prompt-misdelivery cluster
alongside #145.
## Live verification
$ claw ""
error: empty prompt: provide a subcommand (run `claw --help`) or a non-empty prompt string
$ claw " "
error: empty prompt: provide a subcommand (run `claw --help`) or a non-empty prompt string
$ claw --output-format json ""
{"error":"empty prompt: provide a subcommand ...","type":"error"}
$ claw prompt "" # unchanged: subcommand-specific error preserved
error: prompt subcommand requires a prompt string
$ claw hello # unchanged: typo guard still fires
error: unknown subcommand: hello.
Did you mean help
$ claw "real prompt here" # unchanged: real prompts still reach API
error: api returned 401 Unauthorized (with dummy key, as expected)
All empty/whitespace-only paths exit 1. No network call. No misleading
credentials error.
## Tests
- rusty-claude-cli bin: 177 tests pass (4 new assertions)
- Full workspace green except pre-existing resume_latest flake (unrelated)
Closes ROADMAP #147.
|
||
|
|
f877acacbf |
feat: #146 — wire claw config and claw diff as standalone subcommands
## Problem `claw config` and `claw diff` are pure-local read-only introspection commands (config merges .claw.json + .claw/settings.json from disk; diff shells out to `git diff --cached` + `git diff`). Neither needs a session context, yet both rejected direct CLI invocation: $ claw config error: `claw config` is a slash command. Use `claw --resume SESSION.jsonl /config` ... $ claw diff error: `claw diff` is a slash command. ... This forced clawing operators to spin up a full session just to inspect static disk state, and broke natural pipelines like `claw config --output-format json | jq`. ## Root cause Sibling of #145: `SlashCommand::Config { section }` and `SlashCommand::Diff` had working renderers (`render_config_report`, `render_config_json`, `render_diff_report`, `render_diff_json_for`) exposed for resume sessions, but the top-level CLI parser in `parse_subcommand()` had no arms for them. Zero-arg `config`/`diff` hit `parse_single_word_command_alias`'s fallback to `bare_slash_command_guidance`, producing the misleading guidance. ## Changes ### rust/crates/rusty-claude-cli/src/main.rs - Added `CliAction::Config { section, output_format }` and `CliAction::Diff { output_format }` variants. - Added `"config"` / `"diff"` arms to the top-level parser in `parse_subcommand()`. `config` accepts an optional section name (env|hooks|model|plugins) matching SlashCommand::Config semantics. `diff` takes no positional args. Both reject extra trailing args with a clear error. - Added `"config" | "diff" => None` to `parse_single_word_command_alias` so bare invocations fall through to the new parser arms instead of the slash-guidance error. - Added dispatch in run() that calls existing renderers: text mode uses `render_config_report` / `render_diff_report`; JSON mode uses `render_config_json` / `render_diff_json_for` with `serde_json::to_string_pretty`. - Added 5 regression assertions in parse_args test covering: parse_args(["config"]), parse_args(["config", "env"]), parse_args(["config", "--output-format", "json"]), parse_args(["diff"]), parse_args(["diff", "--output-format", "json"]). ### ROADMAP.md Added Pinpoint #146 documenting the gap, verification, root cause, fix shape, and acceptance. Explicitly notes which other slash commands (`hooks`, `usage`, `context`, etc.) are NOT candidates because they are session-state-modifying. ## Live verification $ claw config # no config files Config Working directory /private/tmp/cd-146-verify Loaded files 0 Merged keys 0 Discovered files user missing ... project missing ... local missing ... Exit 0. $ claw config --output-format json { "cwd": "...", "files": [...], ... } $ claw diff # no git Diff Result no git repository Detail ... Exit 0. $ claw diff --output-format json # inside claw-code { "kind": "diff", "result": "changes", "staged": "", "unstaged": "diff --git ..." } Exit 0. ## Tests - rusty-claude-cli bin: 177 tests pass (5 new assertions in parse_args) - Full workspace green except pre-existing resume_latest flake (unrelated) ## Not changed `hooks`, `usage`, `context`, `tasks`, `theme`, `voice`, `rename`, `copy`, `color`, `effort`, `branch`, `rewind`, `ide`, `tag`, `output-style`, `add-dir` — all session-mutating or interactive-only; correctly remain slash-only. Closes ROADMAP #146. |
||
|
|
7d63699f9f |
feat: #145 — wire claw plugins subcommand to CLI parser (prompt misdelivery fix)
## Problem `claw plugins` (and `claw plugins list`, `claw plugins --help`, `claw plugins info <name>`, etc.) fell through the top-level subcommand match and got routed into the prompt-execution path. Result: a purely local introspection command triggered an Anthropic API call and surfaced `missing Anthropic credentials` to the user. With valid credentials, it would actually send the literal string "plugins" as a user prompt to Claude, burning tokens for a local query. $ claw plugins error: missing Anthropic credentials; export ANTHROPIC_AUTH_TOKEN or ANTHROPIC_API_KEY before calling the Anthropic API $ ANTHROPIC_API_KEY=dummy claw plugins ⠋ 🦀 Thinking... ✘ ❌ Request failed error: api returned 401 Unauthorized Meanwhile siblings (`agents`, `mcp`, `skills`) all worked correctly: $ claw agents No agents found. $ claw mcp MCP Working directory ... Configured servers 0 ## Root cause `CliAction::Plugins` exists, has a working dispatcher (`LiveCli::print_plugins`), and is produced inside the REPL via `SlashCommand::Plugins`. But the top-level CLI parser in `parse_subcommand()` had arms for `agents`, `mcp`, `skills`, `status`, `doctor`, `init`, `export`, `prompt`, etc., and **no arm for `plugins`**. The dispatch never ran from the CLI entry point. ## Changes ### rust/crates/rusty-claude-cli/src/main.rs Added a `"plugins"` arm to the top-level match in `parse_subcommand()` that produces `CliAction::Plugins { action, target, output_format }`, following the same positional convention as `mcp` (`action` = first positional, `target` = second). Rejects >2 positional args with a clear error. Added four regression assertions in the existing `parse_args` test: - `plugins` alone → `CliAction::Plugins { action: None, target: None }` - `plugins list` → action: Some("list"), target: None - `plugins enable <name>` → action: Some("enable"), target: Some(...) - `plugins --output-format json` → action: None, output_format: Json ### ROADMAP.md Added Pinpoint #145 documenting the gap, verification, root cause, fix shape, and acceptance. ## Live verification $ claw plugins # no credentials set Plugins example-bundled v0.1.0 disabled sample-hooks v0.1.0 disabled $ claw plugins --output-format json # no credentials set { "action": "list", "kind": "plugin", "message": "Plugins\n example-bundled ...\n sample-hooks ...", "reload_runtime": false, "target": null } Exit 0 in all modes. No network call. No "missing credentials" error. ## Tests - rusty-claude-cli bin: 177 tests pass (new plugin assertions included) - Full workspace green except pre-existing resume_latest flake (unrelated) Closes ROADMAP #145. |
||
|
|
faeaa1d30c |
feat: #144 phase 1 + ROADMAP filing — claw mcp degrades gracefully on malformed config
Filing + Phase 1 fix in one commit (sibling of #143). ## Context With #143 Phase 1 landed (`claw status` degrades), `claw mcp` was the remaining diagnostic surface that hard-failed on a malformed `.claw.json`. Same input, same parse error, same partial-success violation. Fresh dogfood at 18:59 KST caught it on main HEAD `e2a43fc`. ## Changes ### ROADMAP.md Added Pinpoint #144 documenting the gap and acceptance criteria. Joins the partial-success / Principle #5 cluster with #143. ### rust/crates/commands/src/lib.rs `render_mcp_report_for()` + `render_mcp_report_json_for()` now catch the ConfigError at loader.load() instead of propagating: - **Text mode** prepends a "Config load error" block (same shape as #143's status output) before the MCP listing. The listing still renders with empty servers so the output structure is preserved. - **JSON mode** adds top-level `status: "ok" | "degraded"` + `config_load_error: string | null` fields alongside existing fields (`kind`, `action`, `working_directory`, `configured_servers`, `servers[]`). On clean runs, `status: "ok"` and `config_load_error: null`. On parse failure, `status: "degraded"`, `config_load_error: "..."`, `servers: []`, exit 0. - Both list and show actions get the same treatment. ### Regression test `commands::tests::mcp_degrades_gracefully_on_malformed_mcp_config_144`: - Injects the same malformed .claw.json as #143 (one valid + one broken mcpServers entry). - Asserts mcp list returns Ok (not Err). - Asserts top-level status: "degraded" and config_load_error names the malformed field path. - Asserts show action also degrades. - Asserts clean path returns status: "ok" with config_load_error null. ## Live verification $ claw mcp --output-format json { "action": "list", "kind": "mcp", "status": "degraded", "config_load_error": ".../.claw.json: mcpServers.missing-command: missing string field command", "working_directory": "/Users/yeongyu/clawd", "configured_servers": 0, "servers": [] } Exit 0. ## Contract alignment after this commit All three diagnostic surfaces match now: - `doctor` — degraded envelope with typed check entries ✅ - `status` — degraded envelope with config_load_error ✅ (#143) - `mcp` — degraded envelope with config_load_error ✅ (this commit) Phase 2 (typed-error object joining taxonomy §4.44) tracked separately across all three surfaces. Full workspace test green except pre-existing resume_latest flake (unrelated). Closes ROADMAP #144 phase 1. |
||
|
|
fcd5b49428 | ROADMAP #143: claw status hard-fails on malformed MCP config while doctor degrades gracefully | ||
|
|
2665ada94e | ROADMAP #142: claw init --output-format json emits unstructured message string instead of created/skipped fields | ||
|
|
21b377d9c0 | ROADMAP #141: claw <subcommand> --help has 5 different behaviors — inconsistent help surface | ||
|
|
0cf8241978 | ROADMAP #140: deprecated permissionMode migration silently downgrades DangerFullAccess to WorkspaceWrite — 1 test failure on main HEAD 36b3a09 | ||
|
|
36b3a09818 | ROADMAP #139: claw state error references undocumented 'worker' concept (unactionable for claws) | ||
|
|
883cef1a26 | docs: #138 add concrete evidence — feat/134-135 branch pushed but no PR (closure-state gap) | ||
|
|
768c1abc78 | ROADMAP #138: dogfood cycle report-gate opacity — nudge surface needs explicit closure state | ||
|
|
724a78604d | ROADMAP #137: model-alias shorthand regression in test suite — bare alias parsing broken on feat/134-135-session-identity; 3 tests fail with invalid model syntax error after #134/#135 validation tightening | ||
|
|
91ba54d39f | ROADMAP #136: --compact flag silently overrides --output-format json — compact turn always emits plain text even when JSON requested; unreachable Json arm in run_with_output() match; joins output-format completeness cluster #90/#91/#92/#127/#130 and CLI/REPL parity §7.1 | ||
|
|
8b52e77f23 | ROADMAP #135: claw status --json missing active_session bool and session.id cross-reference — status query side of #134 round-trip; joins session identity completeness §4.7 and status surface completeness cluster #80/#83/#114/#122; natural bundle #134+#135 closes session-identity round-trip | ||
|
|
2c42f8bcc8 | docs: remove duplicate ROADMAP #134 entry | ||
|
|
f266505546 | ROADMAP #134: no run/correlation ID at session boundary — session.id missing from startup event and status JSON; observer must infer session identity from timing | ||
|
|
5c579e4a09 |
§4.44.5.1: file ship event wiring pinpoint (schema landed, wiring missing)
Dogfood cycle 2026-04-20 identified that §4.44.5 ship/provenance event schema is implemented (ShipProvenance struct, ship.* constructors, tests pass) but actual git push/merge/commit-range operations do not yet emit these events. Events remain dead code—constructors exist but are never called during real workflows. This pinpoint tracks the missing wiring: locating actual git operation call sites in main.rs/tools/lib.rs/worker_boot.rs and intercepting to emit ship.prepared/commits_selected/merged/pushed_main with real metadata (source_branch, commit_range, merge_method, actor, pr_number). Acceptance: at least one real git push emits all 4 events with actual payload values, claw state JSON surfaces ship provenance. Ref: dogfood gaebal-gajae @ 1495672954573291571 (15:30 KST) |
||
|
|
8a8ca8a355 |
ROADMAP #4.44.5: Ship/provenance events — implement §4.44.5
Adds structured ship provenance surface to eliminate delivery-path opacity: New lane events: - ship.prepared — intent to ship established - ship.commits_selected — commit range locked - ship.merged — merge completed with provenance - ship.pushed_main — delivery to main confirmed ShipProvenance struct carries: - source_branch, base_commit - commit_count, commit_range - merge_method (direct_push/fast_forward/merge_commit/squash_merge/rebase_merge) - actor, pr_number Constructor methods added to LaneEvent for all four ship events. Tests: - Wire value serialization for ship events - Round-trip deserialization - Canonical event name coverage Runtime: 465 tests pass ROADMAP updated with IMPLEMENTED status This closes the gap where 56 commits pushed to main had no structured provenance trail — now emits first-class events for clawhip consumption. |
||
|
|
b0b579ebe9 |
ROADMAP #133: Blocked-state subphase contract — implement §6.5
Adds BlockedSubphase enum with 7 variants for structured blocked-state reporting: - blocked.trust_prompt — trust gate blockers - blocked.prompt_delivery — prompt misdelivery - blocked.plugin_init — plugin startup failures - blocked.mcp_handshake — MCP connection issues - blocked.branch_freshness — stale branch blockers - blocked.test_hang — test timeout/hang - blocked.report_pending — report generation stuck LaneEventBlocker now carries optional subphase field that gets serialized into LaneEvent data. Enables clawhip to route recovery without pane scraping. Updates: - lane_events.rs: BlockedSubphase enum, LaneEventBlocker.subphase field - lane_events.rs: blocked()/failed() constructors with subphase serialization - lib.rs: Export BlockedSubphase - tools/src/lib.rs: classify_lane_blocker() with subphase: None - Test imports and fixtures updated Backward-compatible: subphase is Option<>, existing events continue to work. |
||
|
|
c956f78e8a |
ROADMAP #4.44.5: Ship/provenance opacity — filed from dogfood
Added structured delivery-path contract to surface branch → merge → main-push provenance as first-class events. Filed from the 56-commit 2026-04-20 push that exposed the gap. Also fixes: ApiError test compilation — add suggested_action: None to 4 sites - Line ~8414: opaque_provider_wrapper_surfaces_failure_class_session_and_trace - Line ~8436: retry_exhaustion_uses_retry_failure_class_for_generic_provider_wrapper - Line ~8499: provider_context_window_errors_are_reframed_with_same_guidance - Line ~8533: retry_wrapped_context_window_errors_keep_recovery_guidance |
||
|
|
dd73962d0b | ROADMAP #122: doctor invocation does not check stale-base condition — run_stale_base_preflight() only invoked in Prompt + REPL paths, missing in doctor action handler; inconsistency: doctor says 'ok' but prompt warns 'stale base'; joins boot preflight / doctor contract family (#80-#83/#114) and silent-state inventory (#102/#127/#129/#245) | ||
|
|
027efb2f9f | ROADMAP §4.44: Typed-error envelope contract (Silent-state inventory roll-up) — locks in structured error.kind/operation/target/errno/hint/retryable contract that closes the family of pinpoints currently scattered across #102 + #121 + #127 + #129 + #130 + #245; backward-compat additive; regression locked via golden-fixture; gates 'Run claw --help for usage' trailer on error.kind == usage; drafted jointly with gaebal-gajae during 2026-04-20 dogfood cycle | ||
|
|
866f030713 | ROADMAP #130: claw export --output filesystem errors surface raw OS errno strings with zero context — 5 distinct failure modes all produce different errno strings but the same zero-context shape; no path echoed, no operation named, no io::ErrorKind classification, no actionable hint; JSON envelope flattens to {error, type} losing all structure; Run claw --help for usage trailer misleads on non-usage errors; joins JSON-envelope asymmetry family #90/#91/#92/#110/#115/#116 and truth-audit #80-#127/#129 | ||
|
|
d2a83415dc | ROADMAP #129: MCP server startup blocks credential validation in Prompt path — cred check ordered AFTER MCP child handshake await; misbehaved/slow MCP wedges every claw <prompt> invocation indefinitely; npx restart loop wastes resources; runtime-side companion to #102's config-time MCP gap; PARITY.md Lane 7 acceptance gap | ||
|
|
8122029eba | ROADMAP #128: claw --model <malformed> (spaces, empty string, invalid syntax) silently accepted at parse time, falls through to cred-error misdirection; joins parser-level trust gap family #108/#117/#119/#122/#127; joins token-burn family #99/#127 | ||
|
|
d284ef774e | ROADMAP #127: claw <subcommand> --json silently falls through to LLM Prompt dispatch — diagnostic verbs (doctor, status, sandbox, skills, version, help) reject --json with cred-error misdirection; valid verb + unrecognized suffix arg = Prompt fall-through; 18th silent-flag, 5th parser-level trust gap, joins #108 + #117 + #119 + #122 | ||
|
|
7370546c1c |
ROADMAP #126: /config [env|hooks|model|plugins] ignores section argument — all 4 subcommands return bit-identical file-list envelope; 4-way dispatch collapse
Dogfooded 2026-04-18 on main HEAD b56841c from /tmp/cdFF2.
/config model, /config hooks, /config plugins, /config env all
return: {kind:'config', cwd, files:[...], loaded_files,
merged_keys} — BIT-IDENTICAL.
diff /config model vs /config hooks → empty.
Section argument parsed at slash-command level but not branched
on in the handler.
Help: '/config [env|hooks|model|plugins] Inspect Claude config
files or merged sections [resume]'
→ 'merged sections' never shown. Same file-list for all.
Third dispatch-collapse finding:
#111: /providers → Doctor (2-way, wildly wrong)
#118: /stats + /tokens + /cache → Stats (3-way, distinct)
#126: /config env + hooks + model + plugins → file-list (4-way)
Fix shape (~60 lines):
- Section-specific handlers:
/config model → resolved model, source, aliases
/config hooks → pre_tool_use, post_tool_use arrays
/config plugins → enabled_plugins list
/config env → current file-list (already correct)
- Bare /config → current file-list envelope
- Regression per section
Joins Silent-flag/documented-but-unenforced.
Joins Truth-audit — help promises section inspection.
Joins Dispatch-collapse family: #111 + #118 + #126.
Natural bundle: #111 + #118 + #126 — dispatch-collapse trio.
Complete parser-dispatch-collapse audit across slash commands.
Filed in response to Clawhip pinpoint nudge 1495023618529300580
in #clawcode-building-in-public.
|
||
|
|
b56841c5f4 |
ROADMAP #125: git_state 'clean' emitted for non-git directories; GitWorkspaceSummary default all-zeros → is_clean() → 'clean' even when in_git_repo: false; contradictory doctor fields
Dogfooded 2026-04-18 on main HEAD debbcbe from /tmp/cdBB2.
Non-git directory:
$ mkdir /tmp/cdBB2 && cd /tmp/cdBB2 # NO git init
$ claw --output-format json status | jq .workspace.git_state
'clean' # should be null — not in a git repo
$ claw --output-format json doctor | jq '.checks[]
| select(.name=="workspace") | {in_git_repo, git_state}'
{"in_git_repo": false, "git_state": "clean"}
# CONTRADICTORY: not in git BUT git is 'clean'
Trace:
main.rs:2550-2554 parse_git_workspace_summary:
let Some(status) = status else {
return summary; // all-zero default when no git
};
All-zero GitWorkspaceSummary → is_clean() (changed_files==0)
→ true → headline() = 'clean'
main.rs:4950 status JSON: git_summary.headline() for git_state
main.rs:1856 doctor workspace: same headline() for git_state
Fix shape (~25 lines):
- Return Option<GitWorkspaceSummary> when status is None
- headline() returns Option<String>: None when no git
- Status JSON: git_state: null when not in git
- Doctor: omit git_state when in_git_repo: false, or set null
- Optional: claw init skip .gitignore in non-git dirs
- Regression: non-git → null, git clean → 'clean',
detached HEAD → 'clean' + 'detached HEAD'
Joins Truth-audit — 'clean' is a lie for non-git dirs.
Adjacent to #89 (claw blind to mid-rebase) — same field,
different missing state.
Joins #100 (status/doctor JSON gaps) — another field whose
value doesn't reflect reality.
Natural bundle: #89 + #100 + #125 — git-state-completeness
triple: rebase/merge invisible (#89) + stale-base unplumbed
(#100) + non-git 'clean' lie (#125). Complete git_state
field failure coverage.
Filed in response to Clawhip pinpoint nudge 1495016073085583442
in #clawcode-building-in-public.
|
||
|
|
debbcbe7fb |
ROADMAP #124: --model accepts any string with zero validation; typos silently pass through; empty string accepted; status JSON has no model provenance
Dogfooded 2026-04-18 on main HEAD bb76ec9 from /tmp/cdAA2. --model flag has zero validation: claw --model sonet status → model:'sonet' (typo passthrough) claw --model '' status → model:'' (empty accepted) claw --model garbage status → model:'garbage' (any string) Valid aliases do resolve: sonnet → claude-sonnet-4-6 opus → claude-opus-4-6 Config aliases also resolve via resolve_model_alias_with_config But unresolved strings pass through silently. Typo 'sonet' becomes literal model ID sent to API → fails late with 'model not found' after full context assembly. Compare: --reasoning-effort: validates low|medium|high. Has guard. --permission-mode: validates against known set. Has guard. --model: no guard. Any string. --base-commit: no guard (#122). Same pattern. status JSON: {model: 'sonet'} — shows resolved name only. No model_source (flag/config/default). No model_raw (pre-resolution input). No model_valid (known to any provider). Claw can't distinguish typo from exact model from alias. Trace: main.rs:470-480 --model parsing: model = value.clone(); index += 2; No validation. Raw string stored. main.rs:1032-1046 resolve_model_alias_with_config: resolves known aliases. Unknown strings pass through. main.rs:~4951 status JSON builder: reports resolved model. No source/raw/valid fields. Fix shape (~65 lines): - Reject empty string at parse time - Warn on unresolved aliases with fuzzy-match suggestion - Add model_source, model_raw to status JSON - Add model-validity check to doctor - Regression per failure mode Joins #105 (4-surface model disagreement) — model pair: #105 status ignores config model, doctor mislabels #124 --model flag unvalidated, no provenance in JSON Joins #122 (--base-commit zero validation) — unvalidated-flag pair: same parser pattern, no guards. Joins Silent-flag/documented-but-unenforced as 17th. Joins Truth-audit — status model field has no provenance. Joins Parallel-entry-point asymmetry as 10th. Filed in response to Clawhip pinpoint nudge 1495000973914144819 in #clawcode-building-in-public. |
||
|
|
bb76ec9730 |
ROADMAP #123: --allowedTools tool-name normalization asymmetric; snake_case canonicals accept variants, PascalCase canonicals reject snake_case; whitespace+comma split undocumented; allowed_tools not surfaced in JSON
Dogfooded 2026-04-18 on main HEAD 2bf2a11 from /tmp/cdZZ.
Asymmetric normalization:
normalize_tool_name(value) = trim + lowercase + replace -→_
Canonical 'read_file' (snake_case):
accepts: read_file, READ_FILE, Read-File, read-file,
Read (alias), read (alias)
rejects: ReadFile, readfile, READFILE
→ Because normalize('ReadFile')='readfile', and name_map
has key 'read_file' not 'readfile'.
Canonical 'WebFetch' (PascalCase):
accepts: WebFetch, webfetch, WEBFETCH
rejects: web_fetch, web-fetch, Web-Fetch
→ Because normalize('WebFetch')='webfetch' (no underscore).
User input 'web_fetch' normalizes to 'web_fetch' (keeps
underscore). Keys don't match.
The normalize function ADDS underscores (hyphen→underscore) but
DOESN'T REMOVE them. So PascalCase canonicals have underscore-
free normalized keys; user input with explicit underscores keeps
them, creating key mismatch.
Result: 'bash,Bash,BASH,Read,read_file,Read-File,WebFetch' all
accepted, but 'web_fetch,web-fetch' rejected.
Additional silent-flag issues:
- Splits on commas OR whitespace (undocumented — help says
TOOL[,TOOL...])
- 'bash,Bash,BASH' silently accepts all 3 case variants, no
dedup warning
- Allowed tools NOT in status/doctor JSON — claw passing
--allowedTools has no way to verify what runtime accepted
Trace:
tools/src/lib.rs:192-244 normalize_allowed_tools:
canonical_names from mvp_tool_specs + plugin_tools + runtime
name_map: (normalize_tool_name(canonical), canonical)
for token in value.split(|c| c==',' || c.is_whitespace()):
lookup normalize_tool_name(token) in name_map
tools/src/lib.rs:370-372 normalize_tool_name:
fn normalize_tool_name(value: &str) -> String {
value.trim().replace('-', '_').to_ascii_lowercase()
}
Replaces - with _. Lowercases. Does NOT remove _.
Asymmetry source: normalize('WebFetch')='webfetch',
normalize('web_fetch')='web_fetch'. Different keys.
--allowedTools NOT plumbed into Status JSON output
(no 'allowed_tools' field).
Fix shape (~50 lines):
- Symmetric normalization: strip underscores from both canonical
and input, OR don't normalize hyphens in input either.
Pick one convention.
- claw tools list / --allowedTools help subcommand that prints
canonical names + accepted variants.
- Surface allowed_tools in status/doctor JSON when flag set.
- Document comma+whitespace split semantics in --help.
- Warn on duplicate tokens (bash,Bash,BASH = 3 tokens, 1 unique).
- Regression per normalization pair + status surface + duplicate.
Joins Silent-flag/documented-but-unenforced (#96-#101, #104,
#108, #111, #115, #116, #117, #118, #119, #121, #122) as 16th.
Joins Permission-audit/tool-allow-list (#94, #97, #101, #106,
#115, #120) as 7th.
Joins Truth-audit — status/doctor JSON hides what allowed-tools
set actually is.
Joins Parallel-entry-point asymmetry (#91, #101, #104, #105,
#108, #114, #117, #122) as 9th — --allowedTools vs
.claw.json permissions.allow likely disagree on normalization.
Natural bundles:
#97 + #123 — --allowedTools trust-gap pair:
empty silently blocks (#97) +
asymmetric normalization + invisible runtime state (#123)
Permission-audit 7-way (grown):
#94 + #97 + #101 + #106 + #115 + #120 + #123
Flagship permission-audit sweep 8-way (grown):
#50 + #87 + #91 + #94 + #97 + #101 + #115 + #123
Filed in response to Clawhip pinpoint nudge 1494993419536306176
in #clawcode-building-in-public.
|
||
|
|
2bf2a11943 |
ROADMAP #122: --base-commit greedy-consumes next arg with zero validation; subcommand/flag swallow; stale-base signal missing from status/doctor JSON surfaces
Dogfooded 2026-04-18 on main HEAD d1608ae from /tmp/cdYY.
Three related findings:
1. --base-commit has zero validation:
$ claw --base-commit doctor
warning: worktree HEAD (...) does not match expected
base commit (doctor). Session may run against a stale
codebase.
error: missing Anthropic credentials; ...
# 'doctor' used as base-commit value literally.
# Subcommand absorbed. Prompt fallthrough. Billable.
2. Greedy swallow of next flag:
$ claw --base-commit --model sonnet status
warning: ...does not match expected base commit (--model)
# '--model' taken as value. status never dispatched.
3. Garbage values silently accepted:
$ claw --base-commit garbage status
Status ...
# No validation. No warning (status path doesn't run check).
4. Stale-base signal missing from JSON surfaces:
$ claw --output-format json --base-commit $BASE status
{"kind":"status", ...}
# no stale_base, no base_commit, no base_commit_mismatch.
Stale-base check runs ONLY on Prompt path, as stderr prose.
Trace:
main.rs:487-494 --base-commit parsing:
'base-commit' => {
let value = args.get(index + 1).ok_or_else(...)?;
base_commit = Some(value.clone());
index += 2;
}
No format check. No reject-on-flag-prefix. No reject-on-
known-subcommand.
Compare main.rs:498-510 --reasoning-effort:
validates 'low' | 'medium' | 'high'. Has guard.
stale_base.rs check_base_commit runs on Prompt/turn path
only. No Status/Doctor handler includes base_commit field.
grep 'stale_base|base_commit_matches|base_commit:'
rust/crates/rusty-claude-cli/src/main.rs | grep status|doctor
→ zero matches.
Fix shape (~40 lines):
- Reject values starting with '-' (flag-like)
- Reject known-subcommand names as values
- Optionally run 'git cat-file -e {value}' to verify real commit
- Plumb base_commit + base_commit_matches + stale_base_warning
into Status and Doctor JSON surfaces
- Emit warning as structured JSON event too (not just stderr)
- Regression per failure mode
Joins Silent-flag/documented-but-unenforced (#96-#101, #104,
#108, #111, #115, #116, #117, #118, #119, #121) as 15th.
Joins Parser-level trust gaps: #108 + #117 + #119 + #122 —
billable-token silent-burn via parser too-eager consumption.
Joins Parallel-entry-point asymmetry (#91, #101, #104, #105,
#108, #114, #117) as 8th — stale-base implemented for Prompt
but absent from Status/Doctor.
Joins Truth-audit — 'expected base commit (doctor)' lies by
including user's mistake as truth.
Cross-cluster with Unplumbed-subsystem (#78, #96, #100, #102,
#103, #107, #109, #111, #113, #121) — stale-base signal in
runtime but not JSON.
Natural bundles:
Parser-level trust gap quintet (grown):
#108 + #117 + #119 + #122 — billable-token silent-burn
via parser too-eager consumption.
#100 + #122 — stale-base diagnostic-integrity pair:
#100 stale-base subsystem unplumbed (general)
#122 --base-commit accepts anything, greedy, Status/Doctor
JSON unplumbed (specific)
Filed in response to Clawhip pinpoint nudge 1494978319920136232
in #clawcode-building-in-public.
|
||
|
|
d1608aede4 |
ROADMAP #121: hooks schema incompatible with Claude Code; error message misleading; doctor JSON emits 2 objects on failure breaking single-doc parsing; doctor has duplicate message+report fields
Dogfooded 2026-04-18 on main HEAD b81e642 from /tmp/cdWW.
Four related findings in one:
1. hooks schema incompatible with Claude Code (primary):
claw-code: {'hooks':{'PreToolUse':['cmd1','cmd2']}}
Claude Code: {'hooks':{'PreToolUse':[
{'matcher':'Bash','hooks':[{'type':'command','command':'...'}]}
]}}
Flat string array vs matcher-keyed object array. Incompatible.
User copying .claude.json hooks to .claw.json hits parse-fail.
2. Error message misleading:
'field hooks.PreToolUse must be an array of strings, got an array'
Both input and expected are arrays. Correct diagnosis:
'got an array of objects where array of strings expected'
3. Missing Claude Code hook event types:
claw-code supports: PreToolUse, PostToolUse, PostToolUseFailure
Claude Code supports: above + UserPromptSubmit, Notification,
Stop, SubagentStop, PreCompact, SessionStart
5+ event types missing.
matcher regex not supported.
type: 'command' vs type: 'http' extensibility not supported.
4. doctor NDJSON output on failures:
With failures present, --output-format json emits TWO
concatenated JSON objects on stdout:
Object 1: {kind:'doctor', has_failures:true, ...}
Object 2: {type:'error', error:'doctor found failing checks'}
python json.load() fails: 'Extra data: line 133 column 1'
Flag name 'json' violated — NDJSON is not JSON.
5. doctor message + report byte-duplicated:
.message and .report top-level fields have identical prose
content. Parser ambiguity + byte waste.
Trace:
config.rs:750-771 parse_optional_hooks_config_object:
optional_string_array(hooks, 'PreToolUse', context)
Expects ['cmd1', 'cmd2']. Claude Code gives
[{matcher,hooks:[{type,command}]}]. Schema-incompatible.
config.rs:775-779 validate_optional_hooks_config:
calls same parser. Error bubbles up.
Message comes from optional_string_array path —
technically correct but misleading.
Fix shape (~200 lines + migration docs):
- Dual-schema hooks parser: accept native + Claude Code forms
- Add missing event types to RuntimeHookConfig
- Implement matcher regex
- Fix error message to distinguish array-element types
- Fix doctor: single JSON object regardless of failure state
- De-duplicate message + report (keep report, drop message)
- Regression per schema form + event type + matcher
Joins Claude Code migration parity (#103, #109, #116, #117,
#119, #120) as 7th — most severe parity break since hooks is
load-bearing automation infrastructure.
Joins Truth-audit on misleading error message.
Joins Silent-flag on --output-format json emitting NDJSON.
Cross-cluster with Unplumbed-subsystem (#78, #96, #100, #102,
#103, #107, #109, #111, #113) — hooks subsystem exists but
schema incompatible with reference implementation.
Natural bundles:
Claude Code migration parity septet (grown flagship):
#103 + #109 + #116 + #117 + #119 + #120 + #121
Complete coverage of every migration failure mode.
#107 + #121 — hooks-subsystem pair:
#107 hooks invisible to JSON diagnostics
#121 hooks schema incompatible with migration source
Filed in response to Clawhip pinpoint nudge 1494963222157983774
in #clawcode-building-in-public.
|
||
|
|
b81e6422b4 |
ROADMAP #120: .claw.json custom JSON5-partial parser accepts trailing commas but silently drops comments/unquoted/BOM; combined with alias table 'default'→ReadOnly + no-config→DangerFullAccess creates security-critical user-intent inversion
Dogfooded 2026-04-18 on main HEAD 7859222 from /tmp/cdVV. Extends #86 (silent-drop general case) with two new angles: 1. JSON5-partial acceptance matrix: ACCEPTED (loaded correctly): - trailing comma (one) SILENTLY DROPPED (loaded_config_files=0, zero stderr, exit 0): - line comments (//) - block comments (/* */) - unquoted keys - UTF-8 BOM - single quotes - hex numbers - leading commas - multiple trailing commas 8 cases tested, 1 accepted, 7 silently dropped. The 1 accepted gives false signal of JSON5 tolerance. 2. Alias table creates user-intent inversion: config.rs:856-858: 'default' | 'plan' | 'read-only' => ReadOnly 'acceptEdits' | 'auto' | 'workspace-write' => WorkspaceWrite 'dontAsk' | 'danger-full-access' => DangerFullAccess CRITICAL: 'default' in the config file = ReadOnly no config at all = DangerFullAccess (per #87) These are OPPOSITE modes. Security-inversion chain: user writes: {'// comment', 'defaultMode': 'default'} user intent: read-only parser: rejects comment read_optional_json_object: silently returns Ok(None) config loader: no config present permission_mode: falls back to no-config default = DangerFullAccess ACTUAL RESULT: opposite of intent. ZERO warning. Trace: config.rs:674-692 read_optional_json_object: is_legacy_config = (file_name == '.claw.json') match JsonValue::parse(&contents) { Ok(parsed) => parsed, Err(_error) if is_legacy_config => return Ok(None), Err(error) => return Err(ConfigError::Parse(...)), } is_legacy silent-drop. (#86 covers general case) json.rs JsonValue::parse — custom parser: accepts trailing comma rejects everything else JSON5-ish Fix shape (~80 lines, overlaps with #86): - Pick policy: strict JSON or explicit JSON5. Enforce consistently. - Apply #86 fix here: replace silent-drop with warn-and-continue, structured warning in stderr + JSON surface. - Rename 'default' alias OR map to 'ask' (matches English meaning). - Structure status output: add config_parse_errors:[] field so claws detect silent drops via JSON without stderr-parsing. - Regression matrix per JSON5 feature + security-invariant test. Joins Permission-audit/tool-allow-list (#94, #97, #101, #106, #115) as 6th — this is the CONFIG-PARSE anchor of the permission- posture problem. Complete matrix: #87 absence → DangerFullAccess #101 env-var fail-OPEN → DangerFullAccess #115 init-generated dangerous default → DangerFullAccess #120 config parse-drops → DangerFullAccess Joins Truth-audit on loaded_config_files=0 + permission_mode= danger-full-access inconsistency without config_parse_errors[]. Joins Reporting-surface/config-hygiene (#90, #91, #92, #110, #115, #116) on silent-drop-no-stderr-exit-0 axis. Joins Claude Code migration parity (#103, #109, #116, #117, #119) as 6th — claw-code is strict-where-Claude-was-lax (#116) AND lax-where-Claude-was-strict (#120). Maximum migration confusion. Natural bundles: #86 + #120 — config-parse reliability pair: silent-drop general case (#86) + JSON5-partial-acceptance + alias-inversion (#120) Permission-drift-at-every-boundary 4-way: #87 + #101 + #115 + #120 — absence + env-var + init + config-drop. Complete coverage of every path to DangerFullAccess. Security-critical permission drift audit mega-bundle: #86 + #87 + #101 + #115 + #116 + #120 — five-way sweep of every path to wrong permissions. Filed in response to Clawhip pinpoint nudge 1494955670791913508 in #clawcode-building-in-public. |
||
|
|
78592221ec |
ROADMAP #119: claw <slash-only verb> + any arg silently falls through to Prompt; bare_slash_command_guidance gated by rest.len() != 1; 9 known verbs affected
Dogfooded 2026-04-18 on main HEAD 3848ea6 from /tmp/cdUU.
The 'this is a slash command' helpful-error only fires when
invoked EXACTLY bare. Adding ANY argument silently falls through
to Prompt dispatch and burns billable tokens.
$ claw --output-format json hooks
{"error":"`claw hooks` is a slash command. Use `claw
--resume SESSION.jsonl /hooks`..."}
# clean error
$ claw --output-format json hooks --help
{"error":"missing Anthropic credentials; ..."}
# Prompt fallthrough. The CLI tried to send 'hooks --help'
# to the LLM as a user prompt.
9 known slash-only verbs affected:
hooks, plan, theme, tasks, subagent, agent, providers,
tokens, cache
All exhibit identical pattern:
bare verb → clean error
verb + any arg (--help, list, on, off, --json, etc) →
Prompt fallthrough, billable LLM call
User pattern: 'claw status --help' prints usage. So users
naturally try 'claw hooks --help' expecting same. Gets
charged for prompt 'hooks --help' to LLM instead.
Trace:
main.rs:745-763 entry point:
if rest.len() != 1 { return None; } <-- THE BUG
match rest[0].as_str() {
'help' => ...,
'version' => ...,
other => bare_slash_command_guidance(other).map(Err),
}
main.rs:765-793 bare_slash_command_guidance:
looks up command in slash_command_specs()
returns helpful error string
WORKS CORRECTLY — just never called when args present
Claude Code convention: 'claude hooks --help' prints usage,
'claude hooks list' lists hooks. claw-code silently charges.
Compare sibling bugs:
#108 typo'd verb + args → Prompt (typo path)
#117 -p 'text' --arg → Prompt with swallowed flags (greedy -p)
#119 known slash-verb + any arg → Prompt (too-narrow guidance)
All three are silent-billable-token-burn. Same underlying cause:
too-narrow parser detection + greedy Prompt dispatch.
Fix shape (~35 lines):
- Remove rest.len() != 1 gate. Widen to:
if rest.is_empty() { return None; }
let first = rest[0].as_str();
if rest.len() == 1 {
// existing bare-verb handling
}
if let Some(guidance) = bare_slash_command_guidance(first) {
return Some(Err(format!(
'{} The extra argument `{}` was not recognized.',
guidance, rest[1..].join(' ')
)));
}
None
- Subcommand --help support: catch --help for all recognized
slash verbs, print SlashCommandSpec.description
- Regression tests: 'claw <verb> --help' prints help,
'claw <verb> any arg' prints guidance, no Prompt fallthrough
Joins Silent-flag/documented-but-unenforced (#96-#101, #104,
#108, #111, #115, #116, #117, #118) as 14th.
Joins Claude Code migration parity (#103, #109, #116, #117)
as 5th — muscle memory from claude <verb> --help burns tokens.
Joins Truth-audit — 'missing credentials' is a lie; real cause
is CLI invocation was interpreted as chat prompt.
Cross-cluster with Parallel-entry-point asymmetry — slash-verb
with args is another entry point differing from bare form.
Natural bundles:
#108 + #117 + #119 — billable-token silent-burn triangle:
typo fallthrough (#108) +
flag swallow (#117) +
known-slash-verb fallthrough (#119)
#108 + #111 + #118 + #119 — parser-level trust gap quartet:
typo fallthrough + 2-way collapse + 3-way collapse +
known-verb fallthrough
Filed in response to Clawhip pinpoint nudge 1494948121099243550
in #clawcode-building-in-public.
|
||
|
|
3848ea64e3 |
ROADMAP #118: /stats, /tokens, /cache all collapse to SlashCommand::Stats; 3-way dispatch collapse with 3 distinct help descriptions
Dogfooded 2026-04-18 on main HEAD b9331ae from /tmp/cdTT.
Three slash commands collapse to one handler:
$ claw --help | grep -E '^\s*/(stats|tokens|cache)\s'
/stats Show workspace and session statistics [resume]
/tokens Show token count for the current conversation [resume]
/cache Show prompt cache statistics [resume]
Three distinct promises. One implementation:
$ claw --resume s --output-format json /stats
{"kind":"stats","input_tokens":0,"output_tokens":0,
"cache_creation_input_tokens":0,"cache_read_input_tokens":0,
"total_tokens":0}
$ claw --resume s --output-format json /tokens
{"kind":"stats", ...identical...}
$ claw --resume s --output-format json /cache
{"kind":"stats", ...identical...}
diff /stats /tokens → empty
diff /stats /cache → empty
kind field is always 'stats', never 'tokens' or 'cache'.
Trace:
commands/src/lib.rs:1405-1408:
'stats' | 'tokens' | 'cache' => {
validate_no_args(command, &args)?;
SlashCommand::Stats
}
commands/src/lib.rs:317 SlashCommandSpec name='stats' registered
commands/src/lib.rs:702 SlashCommandSpec name='tokens' registered
SlashCommandSpec name='cache' also registered
Each has distinct summary/description in help.
No SlashCommand::Tokens or SlashCommand::Cache variant exists.
main.rs:2872-2879 SlashCommand::Stats handler hard-codes
'kind': 'stats' regardless of which alias invoked.
More severe than #111:
#111: /providers → Doctor (2-way collapse, wildly wrong category)
#118: /stats + /tokens + /cache → Stats (3-way collapse with
THREE distinct advertised purposes)
The collapse hides information that IS available. /stats output
has cache_creation_input_tokens + cache_read_input_tokens as
top-level fields, so cache data is PRESENT. But /cache should
probably return {kind:'cache', cache_hits, cache_misses,
hit_rate}, a cache-specific schema. Similarly /tokens should
return {kind:'tokens', conversation_total, turns,
average_per_turn}. Implementation returns the union for all.
Fix shape (~90 lines):
- Add SlashCommand::Tokens and SlashCommand::Cache variants
- Parser arms:
'tokens' => SlashCommand::Tokens
'cache' => SlashCommand::Cache
'stats' => SlashCommand::Stats
- Handlers with distinct output schemas:
/tokens: {kind:'tokens', conversation_total, input_tokens,
output_tokens, turns, average_per_turn}
/cache: {kind:'cache', cache_creation_input_tokens,
cache_read_input_tokens, cache_hits, cache_misses,
hit_rate_pct}
/stats: {kind:'stats', subsystem:'all', ...}
- Regression per alias: kind matches, schema matches purpose
- Sweep parser for other collapse arms
- If aliasing intentional, annotate --help with (alias for X)
Joins Silent-flag/documented-but-unenforced (#96-#101, #104,
#108, #111, #115, #116, #117) as 13th — more severe than #111.
Joins Truth-audit on help-vs-implementation mismatch axis.
Cross-cluster with Parallel-entry-point asymmetry on multiple-
surfaces-identical-implementation axis.
Natural bundles:
#111 + #118 — dispatch-collapse pair:
/providers → Doctor (2-way, wildly wrong)
/stats+/tokens+/cache → Stats (3-way, distinct purposes)
Complete parser-dispatch audit shape.
#108 + #111 + #118 — parser-level trust gaps:
typo fallthrough (#108) +
2-way collapse (#111) +
3-way collapse (#118)
Filed in response to Clawhip pinpoint nudge 1494940571385593958
in #clawcode-building-in-public.
|
||
|
|
b9331ae61b |
ROADMAP #117: -p flag is super-greedy, swallows all subsequent args into prompt; --help/--version/--model after -p silently consumed; flag-like prompts bypass emptiness check
Dogfooded 2026-04-18 on main HEAD f2d6538 from /tmp/cdSS.
-p (Claude Code compat shortcut) at main.rs:524-538:
"-p" => {
let prompt = args[index + 1..].join(" ");
if prompt.trim().is_empty() {
return Err(...);
}
return Ok(CliAction::Prompt {...});
}
args[index+1..].join(" ") = ABSORBS EVERY subsequent arg.
return Ok(...) = short-circuits parser, discards wants_help etc.
Failure modes:
1. Silent flag swallow:
claw -p "test" --model sonnet --output-format json
→ prompt = "test --model sonnet --output-format json"
→ model: default (not sonnet), format: text (not json)
→ LLM receives literal string '--model sonnet' as user input
→ billable tokens burned on corrupted prompt
2. --help/--version defeated:
claw -p "test" --help → sends 'test --help' to LLM
claw -p "test" --version → sends 'test --version' to LLM
claw --help -p "test" → wants_help=true set, then discarded
by -p's early return. Help never prints.
3. Emptiness check too weak:
claw -p --model sonnet
→ prompt = "--model sonnet" (non-empty)
→ passes is_empty() check
→ sends '--model sonnet' to LLM as the user prompt
→ no error raised
4. Flag-order invisible:
claw --model sonnet -p "test" → WORKS (model parsed first)
claw -p "test" --model sonnet → BROKEN (--model swallowed)
Same flags, different order, different behavior.
--help has zero warning about flag-order semantics.
Compare Claude Code:
claude -p "prompt" --model sonnet → works (model takes effect)
claw -p "prompt" --model sonnet → silently broken
Fix shape (~40 lines):
- "-p" takes exactly args[index+1] as prompt, continues parsing:
let prompt = args.get(index+1).cloned().unwrap_or_default();
if prompt.trim().is_empty() || prompt.starts_with('-') {
return Err("-p requires a prompt string");
}
pending_prompt = Some(prompt);
index += 2;
- Reject prompts that start with '-' unless preceded by '--':
'claw -p -- --literal-prompt' = literal '--literal-prompt'
- Consult wants_help before returning from -p branch.
- Regression tests:
-p "prompt" --model sonnet → model takes effect
-p "prompt" --help → help prints
-p --foo → error
--help -p "test" → help prints
-p -- --literal → literal prompt sent
Joins Silent-flag/documented-but-unenforced (#96-#101, #104,
#108, #111, #115, #116) as 12th — -p is undocumented in --help
yet actively broken.
Joins Parallel-entry-point asymmetry (#91, #101, #104, #105,
#108, #114) as 7th — three entry points (prompt TEXT, bare
positional, -p TEXT) with subtly different arg-parsing.
Joins Claude Code migration parity (#103, #109, #116) as 4th —
users typing 'claude -p "..." --model ...' muscle memory get
silent prompt corruption.
Joins Truth-audit — parser lies about what it parsed.
Natural bundles:
#108 + #117 — billable-token silent-burn pair:
typo fallthrough burns tokens (#108) +
flag-swallow burns tokens (#117)
#105 + #108 + #117 — model-resolution triangle:
status ignores .claw.json model (#105) +
typo statuss burns tokens (#108) +
-p --model sonnet silently ignored (#117)
Filed in response to Clawhip pinpoint nudge 1494933025857736836
in #clawcode-building-in-public.
|
||
|
|
f2d653896d |
ROADMAP #116: unknown keys in .claw.json hard-fail startup with exit 1; Claude Code migration parity broken (apiKeyHelper rejected); forward-compat impossible; only first error surfaces
Dogfooded 2026-04-18 on main HEAD ad02761 from /tmp/cdRR.
Three related gaps in one finding:
1. Unknown keys are strict ERRORS, not warnings:
{"permissions":{"defaultMode":"default"},"futureField":"x"}
$ claw --output-format json status
# stdout: empty
# stderr: {"type":"error","error":"unknown key futureField"}
# exit: 1
2. Claude Code migration parity broken:
$ cp .claude.json .claw.json
# .claude.json has apiKeyHelper (real Claude Code field)
$ claw --output-format json status
# stderr: unknown key apiKeyHelper → exit 1
No 'this is a Claude Code field we don't support, ignored' message.
3. Only errors[0] is reported — iterative discovery required:
3 unknown fields → 3 edit-run-fix cycles to fix them all.
Error-routing split with --output-format json:
success → stdout
errors → stderr (structured JSON)
Empty stdout on config errors. A claw piping stdout silently
gets nothing. Must capture both streams.
No escape hatch. No --ignore-unknown-config, no --strict flag,
no strictValidation config option.
Trace:
config.rs:282-291 ConfigLoader gate:
let validation = validate_config_file(...);
if !validation.is_ok() {
let first_error = &validation.errors[0];
return Err(ConfigError::Parse(first_error.to_string()));
}
all_warnings.extend(validation.warnings);
config_validate.rs:19-47 DiagnosticKind::UnknownKey:
level: DiagnosticLevel::Error (not Warning)
config_validate.rs schema allow-list is hard-coded. No
forward-compat extension (no x-* reserved namespace, no
additionalProperties: true, no opt-in lax mode).
grep 'apiKeyHelper' rust/crates/runtime/ → 0 matches.
Claude-Code-native fields not tolerated as no-ops.
grep 'ignore.*unknown|--no-validate|strict.*validation'
rust/crates/ → 0 matches. No escape hatch.
Fix shape (~100 lines):
- Downgrade UnknownKey Error → Warning default. ~5 lines.
- Add strict mode flag: .claw.json strictValidation: true OR
--strict-config CLI flag. Default off. ~15 lines.
- Collect all diagnostics, don't halt on first. ~20 lines.
- TOLERATED_CLAUDE_CODE_FIELDS allow-list: apiKeyHelper, env
etc. emit migration-hint warning 'not yet supported; ignored'
instead of hard-fail. ~30 lines.
- Emit structured error envelope on stdout too, not just stderr.
--output-format json stdout includes config_diagnostics[]. ~15.
- Wire suggestion: Option<String> for UnknownKey via fuzzy
match ('permisions' → 'permissions'). ~15 lines.
- Regression tests per outcome.
Joins Claude Code migration parity (#103, #109) as 3rd member —
most severe migration break. #103 silently drops .md files,
#109 stderr-prose warnings, #116 outright hard-fails.
Joins Reporting-surface/config-hygiene (#90, #91, #92, #110,
#115) on error-routing-vs-stdout axis.
Joins Silent-flag/documented-but-unenforced (#96-#101, #104,
#108, #111, #115) — only first error reported, rest silent.
Cross-cluster with Truth-audit — validation.is_ok() hides all
but first structured problem.
Natural bundles:
#103 + #109 + #116 — Claude Code migration parity triangle:
loss of compat (.md dropped) +
loss of structure (stderr prose warnings) +
loss of forward-compat (unknowns hard-fail)
#109 + #116 — config validation reporting surface:
only first warning surfaces structurally (#109)
only first error surfaces structurally AND halts (#116)
Filed in response to Clawhip pinpoint nudge 1494925472239321160
in #clawcode-building-in-public.
|
||
|
|
ad02761918 |
ROADMAP #115: claw init hardcodes 'defaultMode: dontAsk' alias for danger-full-access; init output zero security signal; JSON wraps prose
Dogfooded 2026-04-18 on main HEAD ca09b6b from /tmp/cdPP.
Three compounding issues in one finding:
1. claw init generates .claw.json with dangerous default:
$ claw init && cat .claw.json
{"permissions":{"defaultMode":"dontAsk"}}
$ claw status | grep permission_mode
permission_mode: danger-full-access
2. The 'dontAsk' alias obscures the actual security posture:
config.rs:858 "dontAsk" | "danger-full-access" =>
Ok(ResolvedPermissionMode::DangerFullAccess)
User reads 'dontAsk' as 'skip confirmations I'd otherwise see'
— NOT 'grant every tool unconditional access'. But the two
parse identically. Alias name dilutes severity.
3. claw init --output-format json wraps prose in message field:
{
"kind": "init",
"message": "Init\n Project /private/tmp/cdPP\n
.claw/ created\n..."
}
Claws orchestrating setup must string-parse \n-prose to
know what got created. No files_created[], no
resolved_permission_mode, no security_posture.
Zero mention of 'danger', 'permission', or 'access' anywhere
in init output. The init report says 'Review and tailor the
generated guidance' — implying there's something benign to tailor.
Trace:
rusty-claude-cli/src/init.rs:4-9 STARTER_CLAW_JSON constant:
hardcoded {"permissions":{"defaultMode":"dontAsk"}}
runtime/src/config.rs:858 alias resolution:
"dontAsk" | "danger-full-access" => DangerFullAccess
rusty-claude-cli/src/init.rs:370 JSON-output also emits
'defaultMode': 'dontAsk' literal.
grep 'dontAsk' rust/crates/ → 4 matches. None explain that
dontAsk == danger-full-access anywhere user-facing.
Fix shape (~60 lines):
- STARTER_CLAW_JSON default → 'default' (explicit safe). Users
wanting danger-full-access opt in. ~5 lines.
- init output warns when effective mode is DangerFullAccess:
'security: danger-full-access (unconditional tool approval).'
~15 lines.
- Structure the init JSON:
{kind, files:[{path,action}], resolved_permission_mode,
permission_mode_source, security_warnings:[]}
~30 lines.
- Deprecate 'dontAsk' alias OR log warning at parse: 'alias for
danger-full-access; grants unconditional tool access'. ~8 lines.
- Regression tests per outcome.
Builds on #87 and amplifies it:
#87: absence-of-config default = danger-full-access
#101: fail-OPEN on bad RUSTY_CLAUDE_PERMISSION_MODE env var
#115: init actively generates the dangerous default
Three sequential compounding permission-posture failures.
Joins Permission-audit/tool-allow-list (#94, #97, #101, #106)
as 5th member — init-time anchor of the permission problem.
Joins Silent-flag/documented-but-unenforced on silent-setting
axis. Cross-cluster with Reporting-surface/config-hygiene
(prose-wrapped JSON) and Truth-audit (misleading 'Next step'
phrasing).
Natural bundle: #87 + #101 + #115 — 'permission drift at every
boundary': absence default + env-var bypass + init-generated.
Flagship permission-audit sweep grows 7-way:
#50 + #87 + #91 + #94 + #97 + #101 + #115
Filed in response to Clawhip pinpoint nudge 1494917922076889139
in #clawcode-building-in-public.
|
||
|
|
ca09b6b374 |
ROADMAP #114: /session list and --resume disagree after /clear; reported session_id unresumable; .bak files invisible; 0-byte files fabricate phantoms
Dogfooded 2026-04-18 on main HEAD 43eac4d from /tmp/cdNN and /tmp/cdOO.
Three related findings on session reference resolution asymmetry:
1. /clear divergence (primary):
- /clear --confirm rewrites session_id inside the file header
but reuses the old filename.
- /session list reads meta header, reports new id.
- --resume looks up by filename stem, not meta header.
- Net: /session list reports ids that --resume can't resolve.
Concrete:
claw --resume ses /clear --confirm
→ new_session_id: session-1776481564268-1
→ file still named ses.jsonl, meta session_id now the new id
claw --resume ses /session list
→ active: session-1776481564268-1
claw --resume session-1776481564268-1
→ ERROR session not found
2. .bak files filtered out of /session list silently:
ls .claw/sessions/<bucket>/
ses.jsonl ses.jsonl.before-clear-<ts>.bak
/session list → only ses.jsonl visible, .bak zero discoverability
is_managed_session_file only matches .jsonl and .json.
3. 0-byte session files fabricate phantom sessions:
touch .claw/sessions/<bucket>/emptyses.jsonl
claw --resume emptyses /session list
→ active: session-<ms>-0
→ sessions: [session-<ms>-1]
Two different fabricated ids, neither persisted to disk.
--resume either fabricated id → 'session not found'.
Trace:
session_control.rs:86-116 resolve_reference:
handle.id = session_id_from_path(&path) (filename stem)
.unwrap_or_else(|| ref.to_string())
Meta header NEVER consulted for ref → id mapping.
session_control.rs:118-137 resolve_managed_path:
for ext in [jsonl, json]:
path = sessions_root / '{ref}.{ext}'
if path.exists(): return
Lookup key is filename. Zero fallback to meta scan.
session_control.rs:228-285 collect_sessions_from_dir:
on load success: summary.id = session.session_id (meta)
on load failure: summary.id = path.file_stem() (filename)
/session list thus reports meta ids for good files.
/clear handler rewrites session_id in-place, writes to same
session_path. File keeps old name, gets new id inside.
is_managed_session_file filters .jsonl/.json only. .bak invisible.
Fix shape (~90 lines):
- /clear preserves filename's identity (Option A: keep session_id,
wipe content). /session fork handles new-id semantics (#113).
- resolve_reference falls back to meta-header scan when filename
lookup fails. Covers legacy divergent files.
- /session list surfaces backups via --include-backups flag OR
separate backups: [] array with structured metadata.
- 0-byte session files produce SessionError::EmptySessionFile
instead of silent fabrication. Structured error, not phantom.
- regression tests per failure mode.
Joins Session-handling: #93 + #112 + #113 + #114 — reference
resolution + concurrent-modification + programmatic management +
reference/enumeration asymmetry. Complete session-handling cluster.
Joins Truth-audit — /session list output factually wrong about
what is resumable.
Cross-cluster with Parallel-entry-point asymmetry (#91, #101,
#104, #105, #108) — entry points reading same underlying data
produce mutually inconsistent identifiers.
Natural bundle: #93 + #112 + #113 + #114 (session-handling
quartet — complete coverage).
Alternative bundle: #104 + #114 — /clear filename semantics +
/export filename semantics both hide identity in filename.
Filed in response to Clawhip pinpoint nudge 1494895272936079493
in #clawcode-building-in-public.
|
||
|
|
43eac4d94b |
ROADMAP #113: /session switch/fork/delete unsupported from --resume; no claw session CLI subcommand; REPL-only programmatic gap
Dogfooded 2026-04-18 on main HEAD 8b25daf from /tmp/cdJJ. Test matrix: /session list → works (structured JSON) /session switch s → 'unsupported resumed slash command' /session fork foo → 'unsupported resumed slash command' /session delete s → 'unsupported resumed slash command' /session delete s --force → 'unsupported resumed slash command' claw session delete s → Prompt fallthrough (#108), 'missing credentials' from LLM error path Help documents ALL session verbs as one unified capability: /session [list|switch <session-id>|fork [branch-name]|delete <session-id> [--force]] Summary: 'List, switch, fork, or delete managed local sessions' Implementation: main.rs:10618 parser builds SlashCommand::Session{action, target} for every subverb. All parse successfully. main.rs:2908-2925 dedicated /session list handler. Only one. main.rs:2936-2940+ catch-all: SlashCommand::Session {..} | SlashCommand::Plugins {..} | ... => Err(format_unsupported_resumed_slash_command(...)) main.rs:3963 SlashCommand::Session IS handled in LiveCli REPL path — switch/fork/delete implemented for interactive mode. runtime/session_control.rs:131+ SessionStore::resolve_reference, delete_managed_session, fork_managed_session all exist. grep 'claw session\b' main.rs → zero matches. No CLI subcommand. Gap: backing code exists, parser understands verbs, REPL handler wired — ONLY the --resume dispatch path lacks switch/fork/delete plumbing, and there's no claw session CLI subcommand as programmatic alternative. A claw orchestrating session lifecycle at scale has three options: a) start interactive REPL (impossible without TTY) b) manual .claw/sessions/ rm/cp (bypasses bookkeeping, breaks with #112's proposed locking) c) stick to /session list + /clear, accept missing verbs Fix shape (~130 lines): - /session switch <id> in run_resume_command (~25 lines) - /session fork [branch] in run_resume_command (~30 lines) - /session delete <id> [--force] in run_resume_command (~30), --force required without TTY - claw session <verb> CLI subcommand (~40) - --help: annotate which session verbs are resume-safe vs REPL-only - regression tests per verb x (CLI / slash-via-resume) Joins Unplumbed-subsystem (#78, #96, #100, #102, #103, #107, #109, #111) as 9th declared-but-not-delivered surface. Joins Session- handling (#93, #112) as 3rd member. Cross-cluster with Silent- flag on help-vs-impl mismatch. Natural bundles: #93 + #112 + #113 — session-handling triangle (semantic / concurrency / management API) #78 + #111 + #113 — declared-but-not-delivered triangle with three flavors: #78 fails-noisy (CLI variant → Prompt fallthrough) #111 fails-quiet (slash → wrong handler) #113 no-handler-at-all (slash → unsupported-resumed) Filed in response to Clawhip pinpoint nudge 1494887723818029156 in #clawcode-building-in-public. |
||
|
|
8b25daf915 |
ROADMAP #112: concurrent /compact and /clear race with raw 'No such file or directory (os error 2)' on session file
Dogfooded 2026-04-18 on main HEAD a049bd2 from /tmp/cdII.
5 concurrent /compact on same session → 4 succeed, 1 races with
raw ENOENT. Same pattern with concurrent /clear --confirm.
Trace:
session.rs:204-212 save_to_path:
rotate_session_file_if_needed(path)?
write_atomic(path, &snapshot)?
cleanup_rotated_logs(path)?
Three steps. No lock around sequence.
session.rs:1085-1094 rotate_session_file_if_needed:
metadata(path) → rename(path, rot_path)
Classic TOCTOU. Race window between check and rename.
session.rs:1063-1071 write_atomic:
writes .tmp-{ts}-{counter}, renames to path
Atomic per rename, not per multi-step sequence.
cleanup_rotated_logs deletes .rot-{ts} files older than 3 most
recent. Can race against another process reading that rot file.
No flock, no advisory lock file, no fcntl.
grep 'flock|FileLock|advisory' session.rs → zero matches.
SessionError::Io Display forwards os::Error Display:
'No such file or directory (os error 2)'
No domain translation to 'session file vanished during save'
or 'concurrent modification detected, retry safe'.
Fix shape (~90 lines + test):
- advisory lock: .claw/sessions/<bucket>/<session>.jsonl.lock
exclusive flock for duration of save_to_path (fs2 crate)
- domain error variants:
SessionError::ConcurrentModification {path, operation}
SessionError::SessionFileVanished {path}
- error-to-JSON mapping:
{error_kind: 'concurrent_modification', retry_safe: true}
- retry-policy hints on idempotent ops (/compact, /clear)
- regression test: spawn 10 concurrent /compact, assert all
success OR structured ConcurrentModification (no raw os_error)
Affected operations:
- /compact (session save_to_path after compaction)
- /clear --confirm (save_to_path after new session)
- /export (may hit rotation boundary)
- Turn-persist (append_persisted_message can race rotation)
Not inherently a bug if sessions are single-writer, but
workspace-bucket scoping at session_control.rs:31-32 assumes
one claw per workspace. Parallel ulw lanes, CI matrix runners,
orchestration loops all violate that assumption.
Joins truth-audit (error lies by omission about what happened).
New micro-cluster 'session handling' with #93. Adjacent to
#104 on session-file-handling axis.
Natural bundle: #93 + #112 (session semantic correctness +
concurrency error clarity).
Filed in response to Clawhip pinpoint nudge 1494880177099116586
in #clawcode-building-in-public.
|
||
|
|
a049bd29b1 |
ROADMAP #111: /providers documented as 'List available model providers' but dispatches to Doctor
Dogfooded 2026-04-18 on main HEAD b2366d1 from /tmp/cdHH.
Specification mismatch at the command-dispatch layer:
commands/src/lib.rs:716-720 SlashCommandSpec registry:
name: 'providers', summary: 'List available model providers'
commands/src/lib.rs:1386 parser:
'doctor' | 'providers' => SlashCommand::Doctor
So /providers dispatches to SlashCommand::Doctor. A claw calling
/providers expecting {kind: 'providers', providers: [...]} gets
{kind: 'doctor', checks: [auth, config, install_source, workspace,
sandbox, system]} instead. Same top-level kind field name,
completely different payload.
Help text lies twice:
--help slash listing: '/providers List available model providers'
--help Resume-safe summary: includes /providers
Unlike STUB_COMMANDS (#96) which fail noisily, /providers fails
QUIETLY — returns wrong subsystem output.
Runtime has provider data:
ProviderKind::{Anthropic, Xai, OpenAi, ...} at main.rs:1143-1147
resolve_repl_model with provider-prefix routing
pricing_for_model with per-provider costs
provider_fallbacks config field
Scaffolding is present; /providers just doesn't use it.
By contrast /tokens → Stats and /cache → Stats are semantically
reasonable (Stats has the requested data). /providers → Doctor
is genuinely bizarre.
Fix shape:
A. Implement: SlashCommand::Providers variant + render helper
using ProviderKind + provider_fallbacks + env-var check (~60)
B. Remove: delete 'providers' from registry + parser (~3 lines)
then /providers becomes 'unknown, did you mean /doctor?'
Either way: fix --help to match.
Parallel to #78 (claw plugins CLI variant never constructed,
falls through to prompt). Both are 'declared in spec, not
implemented as declared.' #78 fails noisy, #111 fails quiet.
Joins silent-flag cluster (#96-#101, #104, #108) — 8th
doc-vs-impl mismatch. Joins unplumbed-subsystem (#78, #96,
#100, #102, #103, #107, #109) as 8th declared-but-not-
delivered surface. Joins truth-audit.
Natural bundles:
#78 + #96 + #111 — declared-but-not-as-declared triangle
#96 + #108 + #111 — full --help/dispatch hygiene quartet
(help-filter-leaks + subcommand typo fallthrough + slash
mis-dispatch)
Filed in response to Clawhip pinpoint nudge 1494872623782301817
in #clawcode-building-in-public.
|
||
|
|
b2366d113a |
ROADMAP #110: ConfigLoader only checks cwd paths; .claw.json at project_root invisible from subdirectories
Dogfooded 2026-04-18 on main HEAD 16244ce from /tmp/cdGG/nested/deep/dir.
ConfigLoader::discover at config.rs:242-270 hardcodes every
project/local path as self.cwd.join(...):
- self.cwd.join('.claw.json')
- self.cwd.join('.claw').join('settings.json')
- self.cwd.join('.claw').join('settings.local.json')
No ancestor walk. No consultation of project_root.
Concrete:
cd /tmp/cdGG && git init && echo '{permissions:{defaultMode:read-only}}' > .claw.json
cd /tmp/cdGG/nested/deep/dir
claw status → permission_mode: 'danger-full-access' (fallback)
claw doctor → 'Config files loaded 0/0, defaults are active'
But project_root: /tmp/cdGG is correctly detected via git walk.
Same config file, same repo, invisible from subdirectory.
Meanwhile CLAUDE.md discovery walks ancestors unbounded (per #85
over-discovery). Same subsystem category, opposite policy, no doc.
Security-adjacent per #87: permission-mode fallback is
danger-full-access. cd'ing to a subdirectory silently upgrades
from read-only (configured) → danger-full-access (fallback) —
workspace-location-dependent permission drift.
Fix shape (~90 lines):
- add project_root_for(&cwd) helper (reuse git-root walker from
render_doctor_report)
- config search: user → project_root/.claw.json →
project_root/.claw/settings.json → cwd/.claw.json (overlay) →
cwd/.claw/settings.* (overlays)
- optionally walk intermediate ancestors
- surface 'where did my config come from' in doctor (pairs with
#106 + #109 provenance)
- warn when cwd has no config but project_root does
- documentation parity with CLAUDE.md
- regression tests per cwd depth + overlay precedence
Joins truth-audit (doctor says 'ok, defaults active' when config
exists). Joins discovery-overreach as opposite-direction sibling:
#85: skills ancestor walk UNBOUNDED (over-discovery)
#88: CLAUDE.md ancestor walk enables injection
#110: config NO ancestor walk (under-discovery)
Natural bundle: #85 + #110 (ancestor policy unification), or
#85 + #88 + #110 (full three-way ancestor-walk audit).
Filed in response to Clawhip pinpoint nudge 1494865079567519834
in #clawcode-building-in-public.
|
||
|
|
16244cec34 |
ROADMAP #109: config validation warnings stderr-only; structured ConfigDiagnostic flattened to prose, JSON-invisible
Dogfooded 2026-04-18 on main HEAD 21b2773 from /tmp/cdDD.
Validator produces structured diagnostics but loader discards
them after stderr eprintln:
config_validate.rs:19-66 ConfigDiagnostic {path, field, line,
kind: UnknownKey|WrongType|Deprecated}
config_validate.rs:313-322 DEPRECATED_FIELDS: permissionMode,
enabledPlugins
config_validate.rs:451 emits DiagnosticKind::Deprecated
config.rs:285-300 ConfigLoader::load:
if !validation.is_ok() {
return Err(validation.errors[0].to_string()) // ERRORS propagate
}
all_warnings.extend(validation.warnings);
for warning in &all_warnings {
eprintln!('warning: {warning}'); // WARNINGS stderr only
}
RuntimeConfig has no warnings field. No accessor. No route from
validator structured data to doctor/status JSON envelope.
Concrete:
.claw.json with enabledPlugins:{foo:true}
→ config check: {status: 'ok', summary: 'runtime config
loaded successfully'}
→ stderr: 'warning: field enabledPlugins is deprecated'
→ claw with 2>/dev/null loses the warning entirely
Errors DO propagate correctly:
.claw.json with 'permisions' (typo)
→ config check: {status: 'fail', summary: 'unknown key
permisions... Did you mean permissions?'}
Warning→stderr, Error→JSON asymmetry: a claw reading JSON can
see errors structurally but can't see warnings at all. Silent
migration drift: legacy claude-code 'permissionMode' key still
works, warning lost, operator never sees 'use permissions.
defaultMode' guidance unless they notice stderr.
Fix shape (~85 lines, all additive):
- add warnings: Vec<ConfigDiagnostic> field to RuntimeConfig
- populate from all_warnings, keep eprintln for human ops
- add ConfigDiagnostic::to_json_value emitting
{path, field, line, kind, message, replacement?}
- check_config_health: status='warn' + warnings[] JSON when
non-empty
- surface in status JSON (config_warnings[] or top-level
warnings[])
- surface in /config slash-command output
- regression tests per deprecated field + aggregation + no-warn
Joins truth-audit (#80-#87, #89, #100, #102, #103, #105, #107)
— doctor says 'ok' while validator flagged deprecations. Joins
unplumbed-subsystem (#78, #96, #100, #102, #103, #107) — 7th
surface. Joins Claude Code migration parity (#103) —
permissionMode legacy path is stderr-only.
Natural bundles:
#100 + #102 + #103 + #107 + #109 — 5-way doctor-surface
coverage plus structured warnings (doctor stops lying PR)
#107 + #109 — stderr-only-prose-warning sweep (hook events +
config warnings = same plumbing pattern)
Filed in response to Clawhip pinpoint nudge 1494857528335532174
in #clawcode-building-in-public.
|
||
|
|
21b2773233 |
ROADMAP #108: subcommand typos silently fall through to LLM prompt dispatch, burning billed tokens
Dogfooded 2026-04-18 on main HEAD 91c79ba from /tmp/cdCC.
Unrecognized first-positional tokens fall through the
_other => Ok(CliAction::Prompt { ... }) arm at main.rs:707.
Per --help this is 'Shorthand non-interactive prompt mode' —
documented behavior — but it eats known-subcommand typos too:
claw doctorr → Prompt("doctorr") → LLM API call
claw skilsl → Prompt("skilsl") → LLM API call
claw statuss → Prompt("statuss") → LLM API call
claw deply → Prompt("deply") → LLM API call
With credentials set, each burns real tokens. Without creds,
returns 'missing Anthropic credentials' — indistinguishable
from a legitimate prompt failure. No 'did you mean' suggestion.
Infrastructure exists:
slash command typos:
claw --resume s /skilsl
→ 'Unknown slash command: /skilsl. Did you mean /skill, /skills'
flag typos:
claw --fake-flag
→ structured error 'unknown option: --fake-flag'
subcommand typos:
→ silently become LLM prompts
The did-you-mean helper exists for slash commands. Flag
validation exists. Only subcommand dispatch has the silent-
fallthrough.
Fix shape (~60 lines):
- suggest_similar_subcommand(token) using levenshtein ≤ 2
against the ~16-item known-subcommand list
- gate the Prompt fallthrough on a shape heuristic:
single-token + near-match → return structured error with
did-you-mean. Otherwise fall through unchanged.
- preserve shorthand-prompt mode for multi-word inputs,
quoted inputs, and non-near-match tokens
- regression tests per typo shape + legit prompt + quoted
workaround
Cross-claw orchestration hazard: claws constructing subcommand
names from config or other claws' output have a latent 'typo →
live LLM call' vector. Over CI matrix with 1% typo rate, that's
billed-token waste + structural signal loss (error handler
can't distinguish typo from legit prompt failure).
Joins silent-flag cluster (#96-#101, #104) on subcommand axis —
6th instance of 'malformed input silently produces unintended
behavior.' Joins parallel-entry-point asymmetry (#91, #101,
#104, #105) — slash vs subcommand disagree on typo handling.
Natural bundles: #96 + #98 + #108 (--help/dispatch surface
hygiene triangle), #91 + #101 + #104 + #105 + #108 (parallel-
entry-point 5-way).
Filed in response to Clawhip pinpoint nudge 1494849975530815590
in #clawcode-building-in-public.
|
||
|
|
91c79baf20 |
ROADMAP #107: hooks subsystem fully invisible to JSON diagnostic surfaces; doctor no hook check, /hooks is stub, progress events stderr-only
Dogfooded 2026-04-18 on main HEAD a436f9e from /tmp/cdBB. Complete hook invisibility across JSON diagnostic surfaces: 1. doctor: no check_hooks_health function exists. check_config_health emits 'Config files loaded N/M, MCP servers N, Discovered file X' — NO hook count, no hook event breakdown, no hook health. .claw.json with 3 hooks (including /does/not/exist and curl-pipe-sh remote-exec payload) → doctor: ok, has_failures: false. 2. /hooks list: in STUB_COMMANDS (main.rs:7272) → returns 'not yet implemented in this build'. Parallel /mcp list / /agents list / /skills list work fine. /hooks has no sibling. 3. /config hooks: reports loaded_files and merged_keys but NOT hook bodies, NOT hook source files, NOT per-event breakdown. 4. Hook progress events route to eprintln! as prose: CliHookProgressReporter (main.rs:6660-6695) emits '[hook PreToolUse] tool_name: command' to stderr unconditionally. NEVER into --output-format json. A claw piping stderr to /dev/null (common in pipelines) loses all hook visibility. 5. parse_optional_hooks_config_object (config.rs:766) accepts any non-empty string. No fs::metadata() check, no which() check, no shell-syntax sanity check. 6. shell_command (hooks.rs:739-754) runs 'sh -lc <command>' with full shell expansion — env vars, globs, pipes, , remote curl pipes. Compounds with #106: downstream .claw/settings.local.json can silently replace the entire upstream hook array via the deep_merge_objects replace-semantic. A team-level audit hook in ~/.claw/settings.json is erasable and replaceable by an attacker-controlled hook with zero visibility anywhere machine-readable. Fix shape (~220 lines, all additive): - check_hooks_health doctor check (like #102's check_mcp_health) - status JSON exposes {pre_tool_use, post_tool_use, post_tool_use_failure} with source-file provenance - implement /hooks list (remove from STUB_COMMANDS) - route HookProgressEvent into JSON turn-summary as hook_events[] - validate hook commands at config-load, classify execution_kind - regression tests Joins truth-audit (#80-#87, #89, #100, #102, #103, #105) — doctor lies when hooks are broken or hostile. Joins unplumbed-subsystem (#78, #96, #100, #102, #103) — HookProgressEvent exists, JSON-invisible. Joins subsystem-doctor-coverage (#100, #102, #103) as fourth opaque subsystem. Cross-cluster with permission-audit (#94, #97, #101, #106) because hooks ARE a permission mechanism. Natural bundle: #102 + #103 + #107 (subsystem-doctor-coverage 3-way becomes 4-way). Plus #106 + #107 (policy-erasure + policy- visibility = complete hook-security story). Filed in response to Clawhip pinpoint nudge 1494834879127486544 in #clawcode-building-in-public. |
||
|
|
a436f9e2d6 |
ROADMAP #106: config merge deep_merge_objects REPLACES arrays; permission deny rules can be silently erased by downstream config layer
Dogfooded 2026-04-18 on main HEAD 71e7729 from /tmp/cdAA.
deep_merge_objects at config.rs:1216-1230 recurses into nested
objects but REPLACES arrays. So:
~/.claw/settings.json: {"permissions":{"deny":["Bash(rm *)"]}}
.claw.json: {"permissions":{"deny":["Bash(sudo *)"]}}
Merged: {"permissions":{"deny":["Bash(sudo *)"]}}
User's Bash(rm *) deny rule SILENTLY LOST. No warning. doctor: ok.
Worst case:
~/.claw/settings.json: {deny: [...strict list...]}
.claw/settings.local.json: {deny: []}
Merged: {deny: []}
Every deny rule from every upstream layer silently removed by a
workspace-local file. Any team/org security policy distributed
via user-home config is trivially erasable.
Arrays affected:
permissions.allow/deny/ask
hooks.PreToolUse/PostToolUse/PostToolUseFailure
plugins.externalDirectories
MCP servers are merged BY-KEY (merge_mcp_servers at :709) so
distinct server names across layers coexist. Author chose
merge-by-key for MCP but not for policy arrays. Design is
internally inconsistent.
extend_unique + push_unique helpers EXIST at :1232-1244 that do
union-merge with dedup. They are not called on the config-merge
axis for any policy array.
Fix shape (~100 lines):
- union-merge permissions.allow/deny/ask via extend_unique
- union-merge hooks.* arrays
- union-merge plugins.externalDirectories
- explicit replace-semantic opt-in via 'deny!' sentinel or
'permissions.replace: [...]' form (opt-in, not default)
- doctor surfaces policy provenance per rule (also helps #94)
- emit warning when replace-sentinel is used
- regression tests for union + explicit replace + multi-layer
Joins permission-audit sweep as 4-way composition-axis finding
(#94, #97, #101, #106). Joins truth-audit (doctor says 'ok'
while silently deleted every deny rule).
Natural bundle: #94 + #106 (rule validation + rule composition).
Plus #91 + #94 + #97 + #101 + #106 as 5-way policy-surface-audit.
Filed in response to Clawhip pinpoint nudge 1494827325085454407
in #clawcode-building-in-public.
|
||
|
|
71e77290b9 |
ROADMAP #105: claw status ignores .claw.json model, doctor mislabels alias as Resolved, 4 surfaces disagree
Dogfooded 2026-04-18 on main HEAD 6580903 from /tmp/cdZ.
.claw.json with {"model":"haiku"} produces:
claw status → model: 'claude-opus-4-6' (DEFAULT_MODEL, config ignored)
claw doctor → 'Resolved model haiku' (raw alias, label lies)
turn dispatch → claude-haiku-4-5-20251213 (actually-resolved canonical)
ANTHROPIC_MODEL=sonnet → status still says claude-opus-4-6
FOUR separate understandings of 'active model':
1. config file (alias as written)
2. doctor (alias mislabeled as 'Resolved')
3. status (hardcoded DEFAULT_MODEL ignoring config entirely)
4. turn dispatch (canonical, alias-resolved, what turns actually use)
Trace:
main.rs:59 DEFAULT_MODEL const = claude-opus-4-6
main.rs:400 parse_args starts model = DEFAULT_MODEL
main.rs:753 Status dispatch: model.to_string() — never calls
resolve_repl_model, never reads config or env
main.rs:1125 resolve_repl_model: source of truth for actual
model, consults ANTHROPIC_MODEL env + config + alias table.
Called from Prompt and Repl dispatch. NOT from Status.
main.rs:1701 check_config_health: 'Resolved model {model}'
where model is raw configured string, not resolved.
Label says Resolved, value is pre-resolution alias.
Orchestration hazard: a claw picks tool strategy based on
status.model assuming it reflects what turns will use. Status
lies: always reports DEFAULT_MODEL unless --model flag was
passed. Config and env var completely ignored by status.
Fix shape (~30 lines):
- call resolve_repl_model from print_status_snapshot
- add effective_model field to status JSON (or rename/enrich)
- fix doctor 'Resolved model' label (either rename to 'Configured'
or actually alias-resolve before emitting)
- honor ANTHROPIC_MODEL env in status
- regression tests per model source with cross-surface equality
Joins truth-audit (#80-#84, #86, #87, #89, #100, #102, #103).
Joins two-paths-diverge (#91, #101, #104) — now 4-way with #105.
Joins doctor-surface-coverage triangle (#100 + #102 + #105).
Filed in response to Clawhip pinpoint nudge 1494819785676947543
in #clawcode-building-in-public.
|