Constraint: ROADMAP.md-only restore of lost #337 from PR #2852 / Jobdori dogfood evidence
Rejected: Renumbering adjacent items | preserving existing #338 and surrounding roadmap entries keeps history stable
Confidence: high
Scope-risk: narrow
Directive: Keep #337 before #338 and do not collapse the dirty-file detail requirement into the broader help/status backlog
Tested: git diff --check; scripts/fmt.sh --check
Not-tested: Product behavior changes; documentation-only change
Constraint: Respond to 14:30 dogfood nudge with one direct claw-code pinpoint.\nEvidence: rebuilt actual debug binary at git_sha 24ccb59b; compared top-level help --output-format json with resume-safe /help --output-format json.\nFinding: same help surface uses message in top-level JSON and text in slash/resume JSON.\nTested: cargo run --manifest-path rust/Cargo.toml --bin claw -- version --output-format json; ./rust/target/debug/claw help --output-format json; ./rust/target/debug/claw --resume latest /help --output-format json; git diff --check; scripts/fmt.sh --check.\nNot-tested: full Rust suite; roadmap-only documentation change.
Constraint: Scope requested ROADMAP.md only with exactly one new #328 pinpoint from direct claw dogfood.\nRejected: Implementing the agents-help fix now | user requested roadmap-only evidence item.\nConfidence: high\nScope-risk: narrow\nDirective: Keep agent help source roots derived from the same loader registry as agents list; do not hand-maintain a divergent root list.\nTested: cargo run --manifest-path rust/Cargo.toml --bin claw -- version --output-format json; ./rust/target/debug/claw version --output-format json; ./rust/target/debug/claw agents help --output-format json; ./rust/target/debug/claw agents --output-format json; git diff --check; scripts/fmt.sh --check\nNot-tested: Full Rust test suite; roadmap-only documentation change.
Constraint: Scope limited to ROADMAP.md and one new pinpoint #327 from actual rebuilt claw dogfood.
Rejected: Code fix in this branch | user requested roadmap-only filing.
Confidence: high
Scope-risk: narrow
Directive: Keep mcp help source lists derived from actual config discovery, not hard-coded partial docs.
Tested: ./rust/target/debug/claw version --output-format json; ./rust/target/debug/claw mcp --help; ./rust/target/debug/claw mcp help --output-format json; temp .claw.json mcp list proof; git diff --check; scripts/fmt.sh --check
Not-tested: Full Rust test suite, documentation-only change.
Document the dogfood gap where help JSON stays parseable but hides command metadata inside a prose message, so future implementation can expose machine-readable command, slash-command, and resume-safety fields.\n\nConstraint: user requested ROADMAP.md-only pinpoint for issue #325 from origin/main d607ff36.\nRejected: implementing the schema now | requested fix shape is roadmap documentation only.\nConfidence: high\nScope-risk: narrow\nDirective: keep message for humans while adding schema/versioned structured help metadata when implementing.\nTested: git diff --check; scripts/fmt.sh --check\nNot-tested: runtime CLI behavior unchanged by docs-only change
Constraint: Documentation-only follow-up from current main e7074f47 after PR #2838; edit scope limited to ROADMAP.md.\nRejected: Implementing provenance detection now | user requested roadmap entry only.\nConfidence: high\nScope-risk: narrow\nDirective: Future implementation should compare embedded build git_sha/build date to workspace HEAD/dirty state without leaking secrets.\nTested: git diff --check; scripts/fmt.sh --check\nNot-tested: Runtime provenance behavior; this commit only records the roadmap requirement.
Keep claw --help's resume-safe slash command summary aligned with the interactive command list by filtering STUB_COMMANDS and adding regression coverage.
Operator status previously treated any tmux pane in a workspace as equivalent to active work. The new classifier uses tmux pane command/path metadata as a soft signal, treats plain shells as idle, and adds dirty-worktree abandoned markers to status and session-list output for clawhip consumers.
Constraint: Keep issue #320 prototype minimal and additive without new dependencies
Rejected: Screen-scraping pane output | fragile and broader than needed for lifecycle classification
Confidence: high
Scope-risk: narrow
Tested: cargo test -p rusty-claude-cli
Tested: cargo check -p rusty-claude-cli
Not-tested: cargo clippy -p rusty-claude-cli --all-targets -- -D warnings is blocked by pre-existing commands crate clippy::unnecessary_wraps warnings
The formatting wrapper should remain safe when invoked through different current directories or shell contexts, so resolve the script directory before entering the Rust workspace and forwarding cargo fmt arguments.
Constraint: Wrapper must be runnable from repo root while forwarding flags like --check
Rejected: Leave relative dirname cd | less robust if invocation context changes
Confidence: high
Scope-risk: narrow
Tested: scripts/fmt.sh --check
Tested: git diff --check
The Rust crate layout expects formatting to run from the rust directory, so add a root-level wrapper that preserves the working command while forwarding user flags like --check. Documentation now points contributors at the wrapper instead of the misleading virtual-workspace manifest invocation.
Constraint: Root-level cargo fmt --manifest-path rust/Cargo.toml is misleading for this virtual workspace
Rejected: Document cd rust && cargo fmt directly | a root wrapper gives one stable repo-root command
Confidence: high
Scope-risk: narrow
Tested: scripts/fmt.sh --check
Tested: git diff --check
Run rustfmt from the Rust workspace so CI format checks pass without changing behavior.
Constraint: Scope is formatting-only across tracked Rust files
Confidence: high
Scope-risk: narrow
Tested: cd rust && cargo fmt --check
Tested: git diff --check
Reject empty --allowedTools inputs instead of treating them as an empty restriction, and surface status JSON metadata that distinguishes default unrestricted tools from flag-provided allow lists.
Confidence: high
Scope-risk: narrow
Tested: cargo test -p rusty-claude-cli rejects_empty_allowed_tools_flag -- --nocapture
Tested: cargo test -p tools allowed_tools_rejects_empty_token_lists -- --nocapture
Tested: cargo check -p rusty-claude-cli -p tools
Tested: cargo test -p rusty-claude-cli -p tools
Not-tested: full workspace cargo fmt --check is blocked by pre-existing unrelated formatting drift
Worker boot could previously stall on an interactive MCP/tool permission prompt while readiness and startup-timeout surfaces only had generic idle/no-evidence shapes. This adds a first-class blocked lifecycle state, structured event payload, startup evidence fields, and regression coverage so callers can report the exact server/tool gate instead of pane-scraping.
Constraint: ROADMAP #200 requires tool/server identity, prompt age, and session-only versus always-allow capability in status/evidence surfaces
Rejected: Treat MCP/tool prompts as trust gates | conflates distinct prompts and loses tool identity
Rejected: Leave allow-scope as pane text only | clawhip still could not classify the blocker without scraping
Confidence: high
Scope-risk: moderate
Directive: Keep tool_permission_required distinct from trust_required; downstream claws rely on server/tool payload plus allow-scope metadata
Tested: cargo test -p runtime tool_permission
Tested: cargo fmt -p runtime -- --check && cargo clippy -p runtime --all-targets -- -D warnings && cargo test -p runtime
Tested: cargo test --workspace
Not-tested: live interactive MCP permission prompt in tmux
The pull brought the branch current with origin/main while replaying local follow-up work. Conflict resolution kept the roadmap/progress additions and integrated the runtime event/trust changes with upstream's newer surfaces.
The trust allowlist now treats worktree_pattern as an additional required predicate, including the missing-worktree case, so auto-trust cannot fall back to cwd-only matching when a worktree constraint was declared. The runtime formatting cleanup keeps clippy/fmt green after the merge.
Constraint: Local branch was 109 commits behind origin/main with dirty tracked follow-up work.
Rejected: Drop the autostash after conflict resolution | keeping it preserves a reversible safety backup for unrelated recovery.
Confidence: high
Scope-risk: moderate
Directive: Do not relax worktree_pattern matching without preserving the missing-worktree regression.
Tested: git diff --cached --check; cargo fmt -p runtime -- --check; cargo clippy -p runtime --all-targets -- -D warnings; cargo test -p runtime; cargo test --workspace; architect verification approved
Not-tested: Live tmux/worker auto-trust behavior outside unit/integration tests
## Gap
#77 Phase 1 added machine-readable error kind discriminants and #156 extended
them to text-mode output. However, the hint field is still prose derived from
splitting existing error text — not a stable registry-backed remediation
contract.
Downstream claws inspecting the hint field still need to parse human wording
to decide whether to retry, escalate, or terminate.
## Fix Shape
1. Remediation registry: remediation_for(kind, operation) -> Remediation struct
with action (retry/escalate/terminate/configure), target, and stable message
2. Stable hint outputs per error class (no more prose splitting)
3. Golden fixture tests replacing split_error_hint() string hacks
## Source
gaebal-gajae dogfood sweep 2026-04-22 05:30 KST
## Problem
#77 Phase 1 added machine-readable error `kind` discriminants to JSON error
payloads. Text-mode (stderr) errors still emit prose-only output with no
structured classification.
Observability tools (log aggregators, CI error parsers) parsing stderr can't
distinguish error classes without regex-scraping the prose.
## Fix
Added `[error-kind: <class>]` prefix line to all text-mode error output.
The prefix appears before the error prose, making it immediately parseable by
line-based log tools without any substring matching.
**Examples:**
## Impact
- Stderr observers (log aggregators, CI systems) can now parse error class
from the first line without regex or substring scraping
- Same classifier function used for JSON (#77 P1) and text modes
- Text-mode output remains human-readable (error prose unchanged)
- Prefix format follows syslog/structured-logging conventions
## Tests
All 179 rusty-claude-cli tests pass. Verified on 3 different error classes.
Closes ROADMAP #156.
## Problem
All JSON error payloads had the same three-field envelope:
```json
{"type": "error", "error": "<prose with hint baked in>"}
```
Five distinct error classes were indistinguishable at the schema level:
- missing_credentials (no API key)
- missing_worker_state (no state file)
- session_not_found / session_load_failed
- cli_parse (unrecognized args)
- invalid_model_syntax
Downstream claws had to regex-scrape the prose to route failures.
## Fix
1. **Added `classify_error_kind()`** — prefix/keyword classifier that returns a
snake_case discriminant token for 12 known error classes:
`missing_credentials`, `missing_manifests`, `missing_worker_state`,
`session_not_found`, `session_load_failed`, `no_managed_sessions`,
`cli_parse`, `invalid_model_syntax`, `unsupported_command`,
`unsupported_resumed_command`, `confirmation_required`, `api_http_error`,
plus `unknown` fallback.
2. **Added `split_error_hint()`** — splits multi-line error messages into
(short_reason, optional_hint) so the runbook prose stops being stuffed
into the `error` field.
3. **Extended JSON envelope** at 4 emit sites:
- Main error sink (line ~213)
- Session load failure in resume_session
- Stub command (unsupported_command)
- Unknown resumed command (unsupported_resumed_command)
## New JSON shape
```json
{
"type": "error",
"error": "short reason (first line)",
"kind": "missing_credentials",
"hint": "Hint: export ANTHROPIC_API_KEY..."
}
```
`kind` is always present. `hint` is null when no runbook follows.
`error` now carries only the short reason, not the full multi-line prose.
## Tests
Added 2 new regression tests:
- `classify_error_kind_returns_correct_discriminants` — all 9 known classes + fallback
- `split_error_hint_separates_reason_from_runbook` — with and without hints
All 179 rusty-claude-cli tests pass. Full workspace green.
Closes ROADMAP #77 Phase 1.
## Problem
Two session error messages advertised `.claw/sessions/` as the managed-session
location, but the actual on-disk layout is `.claw/sessions/<workspace_fingerprint>/`
where the fingerprint is a 16-char FNV-1a hash of the CWD path.
Users see error messages like:
```
no managed sessions found in .claw/sessions/
```
But the real directory is:
```
.claw/sessions/8497f4bcf995fc19/
```
The error copy was a direct lie — it made workspace-fingerprint partitioning
invisible and left users confused about whether sessions were lost or just in
a different partition.
## Fix
Updated two error formatters to accept the resolved `sessions_root` path
and extract the actual workspace-fingerprint directory:
1. **format_missing_session_reference**: now shows the actual fingerprint dir
and explains that it's a workspace-specific partition
2. **format_no_managed_sessions**: now shows the actual fingerprint dir and
includes a note that sessions from other CWDs are intentionally invisible
Updated all three call sites to pass `&self.sessions_root` to the formatters.
## Examples
**Before:**
```
no managed sessions found in .claw/sessions/
```
**After:**
```
no managed sessions found in .claw/sessions/8497f4bcf995fc19/
Start `claw` to create a session, then rerun with `--resume latest`.
Note: claw partitions sessions per workspace fingerprint; sessions from other CWDs are invisible.
```
```
session not found: nonexistent-id
Hint: managed sessions live in .claw/sessions/8497f4bcf995fc19/ (workspace-specific partition).
Try `latest` for the most recent session or `/session list` in the REPL.
```
## Impact
- Users can now tell from the error message that they're looking in the right
directory (the one their current CWD maps to)
- The workspace-fingerprint partitioning stops being invisible
- Operators understand why sessions from adjacent CWDs don't appear
- Error copy matches the actual on-disk structure
## Tests
All 466 runtime tests pass. Verified on two real workspaces with actual
workspace-fingerprint directories.
Closes ROADMAP #80.
## Problem
Three interactive slash commands are documented in `claw --help` but have no
corresponding section in USAGE.md:
- `/ultraplan [task]` — Run a deep planning prompt with multi-step reasoning
- `/teleport <symbol-or-path>` — Jump to a file or symbol by searching the workspace
- `/bughunter [scope]` — Inspect the codebase for likely bugs
New users see these commands in the help output but don't know:
- What each command does
- How to use it
- When to use it vs. other commands
- What kind of results to expect
## Fix
Added new section "Advanced slash commands (Interactive REPL only)" to USAGE.md
with documentation for all three commands:
1. **`/ultraplan`** — multi-step reasoning for complex tasks
- Example: `/ultraplan refactor the auth module to use async/await`
- Output: structured plan with numbered steps and reasoning
2. **`/teleport`** — navigate to a file or symbol
- Example: `/teleport UserService`, `/teleport src/auth.rs`
- Output: file content with the requested symbol highlighted
3. **`/bughunter`** — scan for likely bugs
- Example: `/bughunter src/handlers`, `/bughunter` (all)
- Output: list of suspicious patterns with explanations
## Impact
Users can now discover these commands and understand when to use them without
having to guess or search external sources. Bridges the gap between `--help`
output and full documentation.
Also filed ROADMAP #155 documenting the gap.
Closes ROADMAP #155.
## Problem
When a user types `claw --model gpt-4` or `--model qwen-plus`, they get:
```
error: invalid model syntax: 'gpt-4'. Expected provider/model (e.g., anthropic/claude-opus-4-6) or known alias
```
USAGE.md documents that "The error message now includes a hint that names the detected env var" — but this hint does not actually exist. The user has to re-read USAGE.md or guess the correct prefix.
## Fix
Enhance `validate_model_syntax` to detect when a model name looks like it belongs to a different provider:
1. **OpenAI models** (starts with `gpt-` or `gpt_`):
```
Did you mean `openai/gpt-4`? (Requires OPENAI_API_KEY env var)
```
2. **Qwen/DashScope models** (starts with `qwen`):
```
Did you mean `qwen/qwen-plus`? (Requires DASHSCOPE_API_KEY env var)
```
3. **Grok/xAI models** (starts with `grok`):
```
Did you mean `xai/grok-3`? (Requires XAI_API_KEY env var)
```
Unrelated invalid models (e.g., `asdfgh`) do not get a spurious hint.
## Verification
- `claw --model gpt-4` → hints `openai/gpt-4` + `OPENAI_API_KEY`
- `claw --model qwen-plus` → hints `qwen/qwen-plus` + `DASHSCOPE_API_KEY`
- `claw --model grok-3` → hints `xai/grok-3` + `XAI_API_KEY`
- `claw --model asdfgh` → generic error (no hint)
## Tests
Added 3 new assertions in `parses_multiple_diagnostic_subcommands`:
- GPT model error hints openai/ prefix and OPENAI_API_KEY
- Qwen model error hints qwen/ prefix and DASHSCOPE_API_KEY
- Unrelated models don't get a spurious hint
All 177 rusty-claude-cli tests pass.
Closes ROADMAP #154.
## Problem
Users frequently ask after building:
- "Where is the claw binary?"
- "Did the build actually work?"
- "Why can't I run \`claw\` from anywhere?"
This happens because \`cargo build\` puts the binary in \`rust/target/debug/claw\`
(or \`rust/target/release/claw\`), and new users don't know:
1. Where to find it
2. How to test it
3. How to add it to PATH (optional but common follow-up)
## Fix
Added new section "Post-build: locate the binary and verify" to README covering:
1. **Binary location table:** debug vs. release, macOS/Linux vs. Windows paths
2. **Verification commands:** Test the binary with \`--help\` and \`doctor\`
3. **Three ways to add to PATH:**
- Symlink (macOS/Linux): \`ln -s ... /usr/local/bin/claw\`
- cargo install: \`cargo install --path . --force\`
- Shell profile update: add rust/target/debug to \$PATH
4. **Troubleshooting:** Common errors ("command not found", "permission denied",
debug vs. release build speed)
## Impact
New users can now:
- Find the binary immediately after build
- Run it and verify with \`claw doctor\`
- Know their options for system-wide access
Also filed ROADMAP #153 documenting the gap.
Closes ROADMAP #153.
## Problem
Users commonly type `claw doctor --json`, `claw status --json`, or
`claw system-prompt --json` expecting JSON output. These fail with
`unrecognized argument \`--json\` for subcommand` with no hint that
`--output-format json` is the correct flag.
## Discovery
Filed as #152 during 21:17 dogfood nudge. The #127 worktree contained
a more comprehensive patch but conflicted with #141 (unified --help).
On re-investigation of main, Bugs 1 and 3 from #127 are already closed
(positional arg rejection works, no double "error:" prefix). Only
Bug 2 (the `--json` hint) remained.
## Fix
Two call sites add the hint:
1. `parse_single_word_command_alias`'s diagnostic-verb suffix path:
when rest[1] == "--json", append "Did you mean \`--output-format json\`?"
2. `parse_system_prompt_options` unknown-option path: same hint when
the option is exactly `--json`.
## Verification
Before:
$ claw doctor --json
error: unrecognized argument `--json` for subcommand `doctor`
Run `claw --help` for usage.
After:
$ claw doctor --json
error: unrecognized argument `--json` for subcommand `doctor`
Did you mean `--output-format json`?
Run `claw --help` for usage.
Covers: `doctor --json`, `status --json`, `sandbox --json`,
`system-prompt --json`, and any other diagnostic verb that routes
through `parse_single_word_command_alias`.
Other unrecognized args (`claw doctor garbage`) correctly don't
trigger the hint.
## Tests
- 2 new assertions in `parses_multiple_diagnostic_subcommands`:
- `claw doctor --json` produces hint
- `claw doctor garbage` does NOT produce hint
- 177 rusty-claude-cli tests pass
- Workspace tests green
Closes ROADMAP #152.
Filed from nudge directive at 21:17 KST. Implementation exists on worktree
`jobdori-127-verb-suffix` but needs rebase due to merge with #141.
Ready for Phase 1 implementation once conflicts resolved.
## Problem
`workspace_fingerprint(path)` hashes the raw path string without
canonicalization. Two equivalent paths (e.g. `/tmp/foo` vs
`/private/tmp/foo` on macOS) produce different fingerprints and
therefore different session stores. #150 fixed the test-side symptom;
this fixes the underlying product contract.
## Discovery path
#150 fix (canonicalize in test) was a workaround. Q's ack on #150
surfaced the deeper gap: the function itself is still fragile for
any caller passing a non-canonical path:
1. Embedded callers with a raw `--data-dir` path
2. Programmatic `SessionStore::from_cwd(user_path)` calls
3. NixOS store paths, Docker bind mounts, case-insensitive normalization
The REPL's default flow happens to work because `env::current_dir()`
returns canonical paths on macOS. But any caller passing a raw path
risks silent session-store divergence.
## Fix
Canonicalize inside `SessionStore::from_cwd()` and `from_data_dir()`
before computing the fingerprint. Kept `workspace_fingerprint()` itself
as a pure function for determinism — canonicalization is the entry
point's responsibility.
```rust
let canonical_cwd = fs::canonicalize(cwd).unwrap_or_else(|_| cwd.to_path_buf());
let sessions_root = canonical_cwd.join(".claw").join("sessions").join(workspace_fingerprint(&canonical_cwd));
```
Falls back to the raw path if canonicalize fails (directory doesn't
exist yet).
## Test-side updates
Three legacy-session tests expected the non-canonical base path to
match the store's workspace_root. Updated them to canonicalize
`base` after creation — same defensive pattern as #150, now
explicit across all three tests.
## Regression test
Added `session_store_from_cwd_canonicalizes_equivalent_paths` that
creates two stores from equivalent paths (raw vs canonical) and
asserts they resolve to the same sessions_dir.
## Verification
- `cargo test -p runtime session_store_` — 9/9 pass
- `cargo test --workspace` — all green, no FAILED markers
- No behavior change for existing users (REPL default flow already
used canonical paths)
## Backward compatibility
Users on macOS who always went through `env::current_dir()`:
no hash change, sessions resume identically.
Users who ever called with a non-canonical path: hash would change,
but those sessions were already broken (couldn't be resumed from a
canonical-path cwd). Net improvement.
Closes ROADMAP #151.
## #150 Fix: resume_latest test flake
**Problem:** `resume_latest_restores_the_most_recent_managed_session` intermittently
fails when run in the workspace suite or multiple times in sequence, but passes in
isolation.
**Root cause:** `workspace_fingerprint(path)` hashes the path string without
canonicalization. On macOS, `/tmp` is a symlink to `/private/tmp`. The test
creates a temp dir via `std::env::temp_dir().join(...)` which returns
`/var/folders/...` (non-canonical). When the subprocess spawns,
`env::current_dir()` returns the canonical path `/private/var/folders/...`.
The two fingerprints differ, so the subprocess looks in
`.claw/sessions/<hash1>` while files are in `.claw/sessions/<hash2>`.
Session discovery fails.
**Fix:** Call `fs::canonicalize(&project_dir)` after creating the directory
to ensure test and subprocess use identical path representations.
**Verification:** 5 consecutive runs of the full test suite — all pass.
Previously: 5/5 failed when run in sequence.
## #246 Filing: Reminder cron outcome ambiguity (control-loop blocker)
The `clawcode-dogfood-cycle-reminder` cron times out repeatedly with no
structured feedback on whether the nudge was delivered, skipped, or died in-flight.
**Phase 1 outcome schema** — add explicit field to cron result:
- `delivered` — nudge posted to Discord
- `timed_out_before_send` — died before posting
- `timed_out_after_send` — posted but cleanup timed out
- `skipped_due_to_active_cycle` — previous cycle active
- `aborted_gateway_draining` — daemon shutdown
Assigned to gaebal-gajae (cron/orchestration domain). Unblocks trustworthy
dogfood cycle observability.
Closes ROADMAP #150. Filed ROADMAP #246.
## Problem
`runtime::config::tests::validates_unknown_top_level_keys_with_line_and_field_name`
intermittently fails during `cargo test --workspace` (witnessed during
#147 and #148 workspace runs) but passes deterministically in isolation.
Example failure from workspace run:
test result: FAILED. 464 passed; 1 failed
## Root cause
`runtime/src/config.rs::tests::temp_dir()` used nanosecond timestamp
alone for namespace isolation:
std::env::temp_dir().join(format!("runtime-config-{nanos}"))
Under parallel test execution on fast machines with coarse clock
resolution, two tests start within the same nanosecond bucket and
collide on the same path. One test's `fs::remove_dir_all(root)` then
races another's in-flight `fs::create_dir_all()`.
Other crates already solved this pattern:
- plugins::tests::temp_dir(label) — label-parameterized
- runtime::git_context::tests::temp_dir(label) — label-parameterized
runtime/src/config.rs was missed.
## Fix
Added process id + monotonically-incrementing atomic counter to the
namespace, making every callsite provably unique regardless of clock
resolution or scheduling:
static COUNTER: AtomicU64 = AtomicU64::new(0);
let pid = std::process::id();
let seq = COUNTER.fetch_add(1, Ordering::Relaxed);
std::env::temp_dir().join(format!("runtime-config-{pid}-{nanos}-{seq}"))
Chose counter+pid over the label-parameterized pattern to avoid
touching all 20 callsites in the same commit (mechanical noise with
no added safety — counter alone is sufficient).
## Verification
Before: one failure per workspace run (config test flake).
After: 5 consecutive `cargo test --workspace` runs — zero config
test failures. Only pre-existing `resume_latest` flake remains
(orthogonal, unrelated to this change).
for i in 1 2 3 4 5; do cargo test --workspace; done
# All 5 runs: config tests green. Only resume_latest flake appears.
cargo test -p runtime
# 465 passed; 0 failed
## ROADMAP.md
Added Pinpoint #149 documenting the gap, root cause, and fix.
Closes ROADMAP #149.
## Scope
Two deltas in one commit:
### #128 closure (docs)
Re-verified on main HEAD `4cb8fa0`: malformed `--model` strings already
rejected at parse time (`validate_model_syntax` in parse_args). All
historical repro cases now produce specific errors:
claw --model '' → error: model string cannot be empty
claw --model 'bad model' → error: invalid model syntax: 'bad model' contains spaces
claw --model 'sonet' → error: invalid model syntax: 'sonet'. Expected provider/model or known alias
claw --model '@invalid' → error: invalid model syntax: '@invalid'. Expected provider/model ...
claw --model 'totally-not-real-xyz' → error: invalid model syntax: ...
claw --model sonnet → ok, resolves to claude-sonnet-4-6
claw --model anthropic/claude-opus-4-6 → ok, passes through
Marked #128 CLOSED in ROADMAP with repro block. Residual provenance gap
split off as #148.
### #148 implementation
**Problem.** After #128 closure, `claw status --output-format json`
still surfaces only the resolved model string. No way for a claw to
distinguish whether `claude-sonnet-4-6` came from `--model sonnet`
(alias resolution) vs `--model claude-sonnet-4-6` (pass-through) vs
`ANTHROPIC_MODEL` env vs `.claw.json` config vs compiled-in default.
Debug forensics had to re-read argv instead of reading a structured
field. Clawhip orchestrators sending `--model` couldn't confirm the
flag was honored vs falling back to default.
**Fix.** Added two fields to status JSON envelope:
- `model_source`: "flag" | "env" | "config" | "default"
- `model_raw`: user's input before alias resolution (null on default)
Text mode appends a `Model source` line under `Model`, showing the
source and raw input (e.g. `Model source flag (raw: sonnet)`).
**Resolution order** (mirrors resolve_repl_model but with source
attribution):
1. If `--model` / `--model=` flag supplied → source: flag, raw: flag value
2. Else if ANTHROPIC_MODEL set → source: env, raw: env value
3. Else if `.claw.json` model key set → source: config, raw: config value
4. Else → source: default, raw: null
## Changes
### rust/crates/rusty-claude-cli/src/main.rs
- Added `ModelSource` enum (Flag/Env/Config/Default) with `as_str()`.
- Added `ModelProvenance` struct (resolved, raw, source) with
three constructors: `default_fallback()`, `from_flag(raw)`, and
`from_env_or_config_or_default(cli_model)`.
- Added `model_flag_raw: Option<String>` field to `CliAction::Status`.
- Parse loop captures raw input in `--model` and `--model=` arms.
- Extended `parse_single_word_command_alias` to thread
`model_flag_raw: Option<&str>` through.
- Extended `print_status_snapshot` signature to accept
`model_flag_raw: Option<&str>`. Resolves provenance at dispatch time
(flag provenance from arg; else probe env/config/default).
- Extended `status_json_value` signature with
`provenance: Option<&ModelProvenance>`. On Some, adds `model_source`
and `model_raw` fields; on None (legacy resume paths), omits them
for backward compat.
- Extended `format_status_report` signature with optional provenance.
On Some, renders `Model source` line after `Model`.
- Updated all existing callers (REPL /status, resume /status, tests)
to pass None (legacy paths don't carry flag provenance).
- Added 2 regression assertions in parse_args test covering both
`--model sonnet` and `--model=...` forms.
### ROADMAP.md
- Marked #128 CLOSED with re-verification block.
- Filed #148 documenting the provenance gap split, fix shape, and
acceptance criteria.
## Live verification
$ claw --model sonnet --output-format json status | jq '{model,model_source,model_raw}'
{"model": "claude-sonnet-4-6", "model_source": "flag", "model_raw": "sonnet"}
$ claw --output-format json status | jq '{model,model_source,model_raw}'
{"model": "claude-opus-4-6", "model_source": "default", "model_raw": null}
$ ANTHROPIC_MODEL=haiku claw --output-format json status | jq '{model,model_source,model_raw}'
{"model": "claude-haiku-4-5-20251213", "model_source": "env", "model_raw": "haiku"}
$ echo '{"model":"claude-opus-4-7"}' > .claw.json && claw --output-format json status | jq '{model,model_source,model_raw}'
{"model": "claude-opus-4-7", "model_source": "config", "model_raw": "claude-opus-4-7"}
$ claw --model sonnet status
Status
Model claude-sonnet-4-6
Model source flag (raw: sonnet)
Permission mode danger-full-access
...
## Tests
- rusty-claude-cli bin: 177 tests pass (2 new assertions for #148)
- Full workspace green except pre-existing resume_latest flake (unrelated)
Closes ROADMAP #128, #148.
## Problem
The `"prompt"` subcommand arm enforced `if prompt.trim().is_empty()`
and returned a specific error. The fallthrough `other` arm in the same
match block — which routes any unrecognized first positional arg to
`CliAction::Prompt` — had no such guard. Result:
$ claw ""
error: missing Anthropic credentials; export ANTHROPIC_AUTH_TOKEN ...
$ claw " "
error: missing Anthropic credentials; ...
$ claw "" ""
error: missing Anthropic credentials; ...
$ claw --output-format json ""
{"error":"missing Anthropic credentials; ...","type":"error"}
An empty prompt should never reach the credentials check. Worse: with
valid credentials, the literal empty string gets sent to Claude as a
user prompt, either burning tokens for nothing or triggering a model-
side refusal. Same prompt-misdelivery family as #145.
## Root cause
In `parse_subcommand()`, the final `other =>` arm in the top-level
match only guards against typos (#108 guard via `looks_like_subcommand_typo`)
and then unconditionally builds `CliAction::Prompt { prompt: rest.join(" ") }`.
An empty/whitespace-only join passes through.
## Changes
### rust/crates/rusty-claude-cli/src/main.rs
Added the same `if joined.trim().is_empty()` guard already used in the
`"prompt"` arm to the fallthrough path. Error message distinguishes it
from the `prompt` subcommand path:
empty prompt: provide a subcommand (run `claw --help`) or a
non-empty prompt string
Runs AFTER the typo guard (so `claw sttaus` still suggests `status`)
and BEFORE CliAction::Prompt construction (so no network call ever
happens for empty inputs).
### Regression tests
Added 4 assertions in the existing parse_args test:
- parse_args([""]) → Err("empty prompt: ...")
- parse_args([" "]) → Err("empty prompt: ...")
- parse_args(["", ""]) → Err("empty prompt: ...")
- parse_args(["sttaus"]) → Err("unknown subcommand: ...") [verifies #108 typo guard still takes precedence]
### ROADMAP.md
Added Pinpoint #147 documenting the gap, verification, root cause,
fix shape, and acceptance. Joins the prompt-misdelivery cluster
alongside #145.
## Live verification
$ claw ""
error: empty prompt: provide a subcommand (run `claw --help`) or a non-empty prompt string
$ claw " "
error: empty prompt: provide a subcommand (run `claw --help`) or a non-empty prompt string
$ claw --output-format json ""
{"error":"empty prompt: provide a subcommand ...","type":"error"}
$ claw prompt "" # unchanged: subcommand-specific error preserved
error: prompt subcommand requires a prompt string
$ claw hello # unchanged: typo guard still fires
error: unknown subcommand: hello.
Did you mean help
$ claw "real prompt here" # unchanged: real prompts still reach API
error: api returned 401 Unauthorized (with dummy key, as expected)
All empty/whitespace-only paths exit 1. No network call. No misleading
credentials error.
## Tests
- rusty-claude-cli bin: 177 tests pass (4 new assertions)
- Full workspace green except pre-existing resume_latest flake (unrelated)
Closes ROADMAP #147.
## Problem
`claw config` and `claw diff` are pure-local read-only introspection
commands (config merges .claw.json + .claw/settings.json from disk; diff
shells out to `git diff --cached` + `git diff`). Neither needs a session
context, yet both rejected direct CLI invocation:
$ claw config
error: `claw config` is a slash command. Use `claw --resume SESSION.jsonl /config` ...
$ claw diff
error: `claw diff` is a slash command. ...
This forced clawing operators to spin up a full session just to inspect
static disk state, and broke natural pipelines like
`claw config --output-format json | jq`.
## Root cause
Sibling of #145: `SlashCommand::Config { section }` and
`SlashCommand::Diff` had working renderers (`render_config_report`,
`render_config_json`, `render_diff_report`, `render_diff_json_for`)
exposed for resume sessions, but the top-level CLI parser in
`parse_subcommand()` had no arms for them. Zero-arg `config`/`diff`
hit `parse_single_word_command_alias`'s fallback to
`bare_slash_command_guidance`, producing the misleading guidance.
## Changes
### rust/crates/rusty-claude-cli/src/main.rs
- Added `CliAction::Config { section, output_format }` and
`CliAction::Diff { output_format }` variants.
- Added `"config"` / `"diff"` arms to the top-level parser in
`parse_subcommand()`. `config` accepts an optional section name
(env|hooks|model|plugins) matching SlashCommand::Config semantics.
`diff` takes no positional args. Both reject extra trailing args
with a clear error.
- Added `"config" | "diff" => None` to
`parse_single_word_command_alias` so bare invocations fall through
to the new parser arms instead of the slash-guidance error.
- Added dispatch in run() that calls existing renderers: text mode uses
`render_config_report` / `render_diff_report`; JSON mode uses
`render_config_json` / `render_diff_json_for` with
`serde_json::to_string_pretty`.
- Added 5 regression assertions in parse_args test covering:
parse_args(["config"]), parse_args(["config", "env"]),
parse_args(["config", "--output-format", "json"]),
parse_args(["diff"]), parse_args(["diff", "--output-format", "json"]).
### ROADMAP.md
Added Pinpoint #146 documenting the gap, verification, root cause,
fix shape, and acceptance. Explicitly notes which other slash commands
(`hooks`, `usage`, `context`, etc.) are NOT candidates because they
are session-state-modifying.
## Live verification
$ claw config # no config files
Config
Working directory /private/tmp/cd-146-verify
Loaded files 0
Merged keys 0
Discovered files
user missing ...
project missing ...
local missing ...
Exit 0.
$ claw config --output-format json
{
"cwd": "...",
"files": [...],
...
}
$ claw diff # no git
Diff
Result no git repository
Detail ...
Exit 0.
$ claw diff --output-format json # inside claw-code
{
"kind": "diff",
"result": "changes",
"staged": "",
"unstaged": "diff --git ..."
}
Exit 0.
## Tests
- rusty-claude-cli bin: 177 tests pass (5 new assertions in parse_args)
- Full workspace green except pre-existing resume_latest flake (unrelated)
## Not changed
`hooks`, `usage`, `context`, `tasks`, `theme`, `voice`, `rename`,
`copy`, `color`, `effort`, `branch`, `rewind`, `ide`, `tag`,
`output-style`, `add-dir` — all session-mutating or interactive-only;
correctly remain slash-only.
Closes ROADMAP #146.