1072 Commits

Author SHA1 Message Date
YeonGyu-Kim
4813a2b351 fix: #162 — budget-overflow no longer corrupts session state in submit_message
Previously, QueryEnginePort.submit_message() checked the token budget AFTER
appending the prompt to mutable_messages, transcript_store, and permission_denials,
and AFTER calling compact_messages_if_needed(). On overflow it set
stop_reason='max_budget_reached' but the overflow turn was already committed.
Any caller that persisted the session afterwards wrote the rejected prompt to
disk — the session was silently poisoned even though the TurnResult said the
turn never completed.

Fix:
- Restructure submit_message so the budget check early-returns BEFORE any
  mutation of mutable_messages, transcript_store, permission_denials, or
  total_usage.
- The returned TurnResult.usage reflects pre-call state (overflow never
  advanced the usage counter).
- Normal (in-budget) path unchanged: mutation happens exactly once, at the
  end, only on 'completed' results.

This closes the atomicity gap: submit_message is now either 'turn committed'
(stop_reason='completed') or 'turn rejected, state untouched'
(stop_reason in {'max_budget_reached', 'max_turns_reached'}). Callers can
safely retry with a fresh budget or a smaller prompt without worrying about
phantom committed turns from prior rejections.

Tests (tests/test_submit_message_budget.py, 10 tests):
- TestBudgetOverflowDoesNotMutate (5): mutable_messages / transcript /
  permission_denials / total_usage / TurnResult.usage all pre-mutation after overflow
- TestOverflowPersistence (2): first-turn overflow persists empty session;
  successful-turn-then-overflow persists only the successful turn
- TestEngineUsableAfterOverflow (2): subsequent in-budget call still works
  with no residue; repeated overflows don't accumulate hidden state
- TestNormalPathStillCommits (1): regression guard — non-overflow path still
  commits mutable_messages/transcript/usage as expected

Full suite: 59/59 passing, zero regression.

Blocker: none. Closes ROADMAP #162.
2026-04-22 17:29:55 +09:00
YeonGyu-Kim
3f4d46d7b4 fix: #161 — wall-clock timeout for run_turn_loop; stalled turns now abort with stop_reason='timeout'
Previously, run_turn_loop was bounded only by max_turns (turn count). If
engine.submit_message stalled — slow provider, hung network, infinite
stream — the loop blocked indefinitely with no cancellation path. Claws
calling run_turn_loop in CI or orchestration had no reliable way to
enforce a deadline; the loop would hang until OS kill or human intervention.

Fix:
- Add timeout_seconds parameter to run_turn_loop (default None = legacy unbounded).
- When set, each submit_message call runs inside a ThreadPoolExecutor and is
  bounded by the remaining wall-clock budget (total across all turns, not per-turn).
- On timeout, synthesize a TurnResult with stop_reason='timeout' carrying the
  turn's prompt and routed matches so transcripts preserve orchestration context.
- Exhausted/negative budget short-circuits before calling submit_message.
- Legacy path (timeout_seconds=None) bypasses the executor entirely — zero
  overhead for callers that don't opt in.

CLI:
- Added --timeout-seconds flag to 'turn-loop' command.
- Exit code 2 when the loop terminated on timeout (vs 0 for completed),
  so shell scripts can distinguish 'done' from 'budget exhausted'.

Tests (tests/test_run_turn_loop_timeout.py, 6 tests):
- Legacy unbounded path unchanged (timeout_seconds=None never emits 'timeout')
- Hung submit_message aborted within budget (0.3s budget, 5s mock hang → exit <1.5s)
- Budget is cumulative across turns (0.6s budget, 0.4s per turn, not per-turn)
- timeout_seconds=0 short-circuits first turn without calling submit_message
- Negative timeout treated as exhausted (guard against caller bugs)
- Timeout TurnResult carries correct prompt, matches, UsageSummary shape

Full suite: 49/49 passing, zero regression.

Blocker: none. Closes ROADMAP #161.
2026-04-22 17:23:43 +09:00
YeonGyu-Kim
6a76cc7c08 feat(#160): wire claw list-sessions and delete-session CLI commands
Closes the last #160 gap: claws can now manage session lifecycle entirely
through the CLI without filesystem hacks.

New commands:
- claw list-sessions [--directory DIR] [--output-format text|json]
  Enumerates stored session IDs. JSON mode emits {sessions, count}.
  Missing/empty directories return empty list (exit 0), not an error.

- claw delete-session SESSION_ID [--directory DIR] [--output-format text|json]
  Idempotent: not-found is exit 0 with status='not_found' (no raise).
  Partial-failure: exit 1 with typed JSON error envelope:
    {session_id, deleted: false, error: {kind, message, retryable}}
  The 'session_delete_failed' kind is retryable=true so orchestrators
  know to retry vs escalate.

Public API surface extended in src/__init__.py:
- list_sessions, session_exists, delete_session
- SessionNotFoundError, SessionDeleteError

Tests added (tests/test_porting_workspace.py):
- test_list_sessions_cli_runs: text + json modes against tempdir
- test_delete_session_cli_idempotent: first call deleted=true,
  second call deleted=false (exit 0, status=not_found)
- test_delete_session_cli_partial_failure_exit_1: permission error
  surfaces as exit 1 + typed JSON error with retryable=true

All 43 tests pass. The session storage abstraction chapter is closed:
- storage layer decoupled from claw code (#160 initial impl)
- delete contract hardened + caller-audited (#160 hardening pass)
- CLI wired with idempotency preserved at exit-code boundary (this commit)
2026-04-22 17:16:53 +09:00
YeonGyu-Kim
527c0f971c fix(#160): harden delete_session contract — idempotency, race-safety, typed partial-failure
Addresses review feedback on initial #160 implementation:

1. delete_session() contract now explicit:
   - Idempotent: delete(x); delete(x) is safe, second call returns False
   - Race-safe: TOCTOU between exists()/unlink() eliminated via unlink-then-catch
   - Partial-failure typed: permission/IO errors wrapped in SessionDeleteError (OSError subclass)
     so callers can distinguish 'not found' (return False) from 'could not delete' (raise)

2. New SessionDeleteError class for partial-failure surfacing.
   Distinct from SessionNotFoundError (KeyError subclass for missing loads).

3. Caller audit confirmed: no code outside session_store globs .port_sessions
   or imports DEFAULT_SESSION_DIR. Storage layout is fully encapsulated.

4. Added tests/test_session_store.py — 18 tests covering:
   - list_sessions: empty/missing/sorted/non-json filter
   - session_exists: true/false/missing-dir
   - load_session: SessionNotFoundError typing (KeyError subclass, not FileNotFoundError)
   - delete_session idempotency: first/second/never-existed calls
   - delete_session partial-failure: SessionDeleteError wraps OSError
   - delete_session race-safety: concurrent deletion returns False, not raise
   - Full save->list->exists->load->delete roundtrip

All 18 tests pass. Merge-ready: contract documented, caller-audited, race-safe.
2026-04-22 17:11:26 +09:00
YeonGyu-Kim
504d238af1 fix: #160 — add list_sessions, session_exists, delete_session to session_store
- list_sessions(directory=None) -> list[str]: enumerate stored session IDs
- session_exists(session_id, directory=None) -> bool: check existence without FileNotFoundError
- delete_session(session_id, directory=None) -> bool: unlink a session file
- load_session now raises typed SessionNotFoundError (subclass of KeyError) instead of FileNotFoundError
- Claws can now manage session lifecycle without reaching past the module to glob filesystem

Closes ROADMAP #160. Acceptance: claw can call list_sessions(), session_exists(id), delete_session(id) without importing Path or knowing .port_sessions/<id>.json layout.
2026-04-22 17:08:01 +09:00
YeonGyu-Kim
41a6091355 file: #163 — run_turn_loop injects [turn N] suffix into follow-up prompts; multi-turn sessions semantically broken 2026-04-22 10:07:35 +09:00
YeonGyu-Kim
bc94870a54 file: #162 — submit_message appends budget-exceeded turn before returning max_budget_reached; session state corrupted on overflow 2026-04-22 09:38:00 +09:00
YeonGyu-Kim
ee3aa29a5e file: #161 — run_turn_loop has no wall-clock timeout, stalled turn blocks indefinitely 2026-04-22 08:57:38 +09:00
YeonGyu-Kim
a389f8dff1 file: #160 — session_store missing list_sessions, delete_session, session_exists — claw cannot enumerate or clean up sessions without filesystem hacks 2026-04-22 08:47:52 +09:00
YeonGyu-Kim
7a014170ba file: #159 — run_turn_loop hardcodes empty denied_tools, permission denials absent from multi-turn sessions 2026-04-22 06:48:03 +09:00
YeonGyu-Kim
986f8e89fd file: #158 — compact_messages_if_needed drops turns silently, no structured compaction event 2026-04-22 06:37:54 +09:00
YeonGyu-Kim
ef1cfa1777 file: #157 — structured remediation registry for error hints (Phase 3 of #77)
## Gap

#77 Phase 1 added machine-readable error kind discriminants and #156 extended
them to text-mode output. However, the hint field is still prose derived from
splitting existing error text — not a stable registry-backed remediation
contract.

Downstream claws inspecting the hint field still need to parse human wording
to decide whether to retry, escalate, or terminate.

## Fix Shape

1. Remediation registry: remediation_for(kind, operation) -> Remediation struct
   with action (retry/escalate/terminate/configure), target, and stable message
2. Stable hint outputs per error class (no more prose splitting)
3. Golden fixture tests replacing split_error_hint() string hacks

## Source

gaebal-gajae dogfood sweep 2026-04-22 05:30 KST
2026-04-22 05:31:00 +09:00
YeonGyu-Kim
f1e4ad7574 feat: #156 — error classification for text-mode output (Phase 2 of #77)
## Problem

#77 Phase 1 added machine-readable error `kind` discriminants to JSON error
payloads. Text-mode (stderr) errors still emit prose-only output with no
structured classification.

Observability tools (log aggregators, CI error parsers) parsing stderr can't
distinguish error classes without regex-scraping the prose.

## Fix

Added `[error-kind: <class>]` prefix line to all text-mode error output.
The prefix appears before the error prose, making it immediately parseable by
line-based log tools without any substring matching.

**Examples:**

## Impact

- Stderr observers (log aggregators, CI systems) can now parse error class
  from the first line without regex or substring scraping
- Same classifier function used for JSON (#77 P1) and text modes
- Text-mode output remains human-readable (error prose unchanged)
- Prefix format follows syslog/structured-logging conventions

## Tests

All 179 rusty-claude-cli tests pass. Verified on 3 different error classes.

Closes ROADMAP #156.
2026-04-22 00:21:32 +09:00
YeonGyu-Kim
14c5ef1808 file: #156 — error classification for text-mode output (Phase 2 of #77)
ROADMAP entry for natural Phase 2 follow-up to #77 Phase 1 (JSON error kind
classification). Text-mode errors currently prose-only with no structured
class; observability tools parsing stderr need the kind token.

Two implementation options:
- Prefix line before error prose: [error-kind: missing_credentials]
- Suffix comment: # error_class=missing_credentials

Scope: ~20 lines. Non-breaking (adds classification, doesn't change error text).

Source: Cycle 11 dogfood probe at 23:18 KST — product surface clean after
today's batch, identified natural next step for error-classification symmetry.
2026-04-21 23:19:58 +09:00
YeonGyu-Kim
9362900b1b feat: #77 Phase 1 — machine-readable error classification in JSON error payloads
## Problem

All JSON error payloads had the same three-field envelope:
```json
{"type": "error", "error": "<prose with hint baked in>"}
```

Five distinct error classes were indistinguishable at the schema level:
- missing_credentials (no API key)
- missing_worker_state (no state file)
- session_not_found / session_load_failed
- cli_parse (unrecognized args)
- invalid_model_syntax

Downstream claws had to regex-scrape the prose to route failures.

## Fix

1. **Added `classify_error_kind()`** — prefix/keyword classifier that returns a
   snake_case discriminant token for 12 known error classes:
   `missing_credentials`, `missing_manifests`, `missing_worker_state`,
   `session_not_found`, `session_load_failed`, `no_managed_sessions`,
   `cli_parse`, `invalid_model_syntax`, `unsupported_command`,
   `unsupported_resumed_command`, `confirmation_required`, `api_http_error`,
   plus `unknown` fallback.

2. **Added `split_error_hint()`** — splits multi-line error messages into
   (short_reason, optional_hint) so the runbook prose stops being stuffed
   into the `error` field.

3. **Extended JSON envelope** at 4 emit sites:
   - Main error sink (line ~213)
   - Session load failure in resume_session
   - Stub command (unsupported_command)
   - Unknown resumed command (unsupported_resumed_command)

## New JSON shape

```json
{
  "type": "error",
  "error": "short reason (first line)",
  "kind": "missing_credentials",
  "hint": "Hint: export ANTHROPIC_API_KEY..."
}
```

`kind` is always present. `hint` is null when no runbook follows.
`error` now carries only the short reason, not the full multi-line prose.

## Tests

Added 2 new regression tests:
- `classify_error_kind_returns_correct_discriminants` — all 9 known classes + fallback
- `split_error_hint_separates_reason_from_runbook` — with and without hints

All 179 rusty-claude-cli tests pass. Full workspace green.

Closes ROADMAP #77 Phase 1.
2026-04-21 22:38:13 +09:00
YeonGyu-Kim
ff45e971aa fix: #80 — session-lookup error messages now show actual workspace-fingerprint directory
## Problem

Two session error messages advertised `.claw/sessions/` as the managed-session
location, but the actual on-disk layout is `.claw/sessions/<workspace_fingerprint>/`
where the fingerprint is a 16-char FNV-1a hash of the CWD path.

Users see error messages like:
```
no managed sessions found in .claw/sessions/
```

But the real directory is:
```
.claw/sessions/8497f4bcf995fc19/
```

The error copy was a direct lie — it made workspace-fingerprint partitioning
invisible and left users confused about whether sessions were lost or just in
a different partition.

## Fix

Updated two error formatters to accept the resolved `sessions_root` path
and extract the actual workspace-fingerprint directory:

1. **format_missing_session_reference**: now shows the actual fingerprint dir
   and explains that it's a workspace-specific partition

2. **format_no_managed_sessions**: now shows the actual fingerprint dir and
   includes a note that sessions from other CWDs are intentionally invisible

Updated all three call sites to pass `&self.sessions_root` to the formatters.

## Examples

**Before:**
```
no managed sessions found in .claw/sessions/
```

**After:**
```
no managed sessions found in .claw/sessions/8497f4bcf995fc19/
Start `claw` to create a session, then rerun with `--resume latest`.
Note: claw partitions sessions per workspace fingerprint; sessions from other CWDs are invisible.
```

```
session not found: nonexistent-id
Hint: managed sessions live in .claw/sessions/8497f4bcf995fc19/ (workspace-specific partition).
Try `latest` for the most recent session or `/session list` in the REPL.
```

## Impact

- Users can now tell from the error message that they're looking in the right
  directory (the one their current CWD maps to)
- The workspace-fingerprint partitioning stops being invisible
- Operators understand why sessions from adjacent CWDs don't appear
- Error copy matches the actual on-disk structure

## Tests

All 466 runtime tests pass. Verified on two real workspaces with actual
workspace-fingerprint directories.

Closes ROADMAP #80.
2026-04-21 22:18:12 +09:00
YeonGyu-Kim
4b53b97e36 docs: #155 — add USAGE.md documentation for /ultraplan, /teleport, /bughunter commands
## Problem

Three interactive slash commands are documented in `claw --help` but have no
corresponding section in USAGE.md:

- `/ultraplan [task]` — Run a deep planning prompt with multi-step reasoning
- `/teleport <symbol-or-path>` — Jump to a file or symbol by searching the workspace
- `/bughunter [scope]` — Inspect the codebase for likely bugs

New users see these commands in the help output but don't know:
- What each command does
- How to use it
- When to use it vs. other commands
- What kind of results to expect

## Fix

Added new section "Advanced slash commands (Interactive REPL only)" to USAGE.md
with documentation for all three commands:

1. **`/ultraplan`** — multi-step reasoning for complex tasks
   - Example: `/ultraplan refactor the auth module to use async/await`
   - Output: structured plan with numbered steps and reasoning

2. **`/teleport`** — navigate to a file or symbol
   - Example: `/teleport UserService`, `/teleport src/auth.rs`
   - Output: file content with the requested symbol highlighted

3. **`/bughunter`** — scan for likely bugs
   - Example: `/bughunter src/handlers`, `/bughunter` (all)
   - Output: list of suspicious patterns with explanations

## Impact

Users can now discover these commands and understand when to use them without
having to guess or search external sources. Bridges the gap between `--help`
output and full documentation.

Also filed ROADMAP #155 documenting the gap.

Closes ROADMAP #155.
2026-04-21 21:49:04 +09:00
YeonGyu-Kim
3cfe6e2b14 feat: #154 — hint provider prefix and env var when model name looks like different provider
## Problem

When a user types `claw --model gpt-4` or `--model qwen-plus`, they get:
```
error: invalid model syntax: 'gpt-4'. Expected provider/model (e.g., anthropic/claude-opus-4-6) or known alias
```

USAGE.md documents that "The error message now includes a hint that names the detected env var" — but this hint does not actually exist. The user has to re-read USAGE.md or guess the correct prefix.

## Fix

Enhance `validate_model_syntax` to detect when a model name looks like it belongs to a different provider:

1. **OpenAI models** (starts with `gpt-` or `gpt_`):
   ```
   Did you mean `openai/gpt-4`? (Requires OPENAI_API_KEY env var)
   ```

2. **Qwen/DashScope models** (starts with `qwen`):
   ```
   Did you mean `qwen/qwen-plus`? (Requires DASHSCOPE_API_KEY env var)
   ```

3. **Grok/xAI models** (starts with `grok`):
   ```
   Did you mean `xai/grok-3`? (Requires XAI_API_KEY env var)
   ```

Unrelated invalid models (e.g., `asdfgh`) do not get a spurious hint.

## Verification

- `claw --model gpt-4` → hints `openai/gpt-4` + `OPENAI_API_KEY`
- `claw --model qwen-plus` → hints `qwen/qwen-plus` + `DASHSCOPE_API_KEY`
- `claw --model grok-3` → hints `xai/grok-3` + `XAI_API_KEY`
- `claw --model asdfgh` → generic error (no hint)

## Tests

Added 3 new assertions in `parses_multiple_diagnostic_subcommands`:
- GPT model error hints openai/ prefix and OPENAI_API_KEY
- Qwen model error hints qwen/ prefix and DASHSCOPE_API_KEY
- Unrelated models don't get a spurious hint

All 177 rusty-claude-cli tests pass.

Closes ROADMAP #154.
2026-04-21 21:40:48 +09:00
YeonGyu-Kim
71f5f83adb feat: #153 — add post-build binary location and verification guide to README
## Problem

Users frequently ask after building:
- "Where is the claw binary?"
- "Did the build actually work?"
- "Why can't I run \`claw\` from anywhere?"

This happens because \`cargo build\` puts the binary in \`rust/target/debug/claw\`
(or \`rust/target/release/claw\`), and new users don't know:
1. Where to find it
2. How to test it
3. How to add it to PATH (optional but common follow-up)

## Fix

Added new section "Post-build: locate the binary and verify" to README covering:

1. **Binary location table:** debug vs. release, macOS/Linux vs. Windows paths
2. **Verification commands:** Test the binary with \`--help\` and \`doctor\`
3. **Three ways to add to PATH:**
   - Symlink (macOS/Linux): \`ln -s ... /usr/local/bin/claw\`
   - cargo install: \`cargo install --path . --force\`
   - Shell profile update: add rust/target/debug to \$PATH
4. **Troubleshooting:** Common errors ("command not found", "permission denied",
   debug vs. release build speed)

## Impact

New users can now:
- Find the binary immediately after build
- Run it and verify with \`claw doctor\`
- Know their options for system-wide access

Also filed ROADMAP #153 documenting the gap.

Closes ROADMAP #153.
2026-04-21 21:29:59 +09:00
YeonGyu-Kim
79352a2d20 feat: #152 — hint --output-format json when user types --json on diagnostic verbs
## Problem

Users commonly type `claw doctor --json`, `claw status --json`, or
`claw system-prompt --json` expecting JSON output. These fail with
`unrecognized argument \`--json\` for subcommand` with no hint that
`--output-format json` is the correct flag.

## Discovery

Filed as #152 during 21:17 dogfood nudge. The #127 worktree contained
a more comprehensive patch but conflicted with #141 (unified --help).
On re-investigation of main, Bugs 1 and 3 from #127 are already closed
(positional arg rejection works, no double "error:" prefix). Only
Bug 2 (the `--json` hint) remained.

## Fix

Two call sites add the hint:

1. `parse_single_word_command_alias`'s diagnostic-verb suffix path:
   when rest[1] == "--json", append "Did you mean \`--output-format json\`?"

2. `parse_system_prompt_options` unknown-option path: same hint when
   the option is exactly `--json`.

## Verification

Before:
  $ claw doctor --json
  error: unrecognized argument `--json` for subcommand `doctor`
  Run `claw --help` for usage.

After:
  $ claw doctor --json
  error: unrecognized argument `--json` for subcommand `doctor`
  Did you mean `--output-format json`?
  Run `claw --help` for usage.

Covers: `doctor --json`, `status --json`, `sandbox --json`,
`system-prompt --json`, and any other diagnostic verb that routes
through `parse_single_word_command_alias`.

Other unrecognized args (`claw doctor garbage`) correctly don't
trigger the hint.

## Tests

- 2 new assertions in `parses_multiple_diagnostic_subcommands`:
  - `claw doctor --json` produces hint
  - `claw doctor garbage` does NOT produce hint
- 177 rusty-claude-cli tests pass
- Workspace tests green

Closes ROADMAP #152.
2026-04-21 21:23:17 +09:00
YeonGyu-Kim
dddbd78dbd file: #152 — diagnostic verb suffixes allow arbitrary positional args, double error prefix
Filed from nudge directive at 21:17 KST. Implementation exists on worktree
`jobdori-127-verb-suffix` but needs rebase due to merge with #141.
Ready for Phase 1 implementation once conflicts resolved.
2026-04-21 21:19:51 +09:00
YeonGyu-Kim
7bc66e86e8 feat: #151 — canonicalize workspace path in SessionStore::from_cwd/data_dir
## Problem

`workspace_fingerprint(path)` hashes the raw path string without
canonicalization. Two equivalent paths (e.g. `/tmp/foo` vs
`/private/tmp/foo` on macOS) produce different fingerprints and
therefore different session stores. #150 fixed the test-side symptom;
this fixes the underlying product contract.

## Discovery path

#150 fix (canonicalize in test) was a workaround. Q's ack on #150
surfaced the deeper gap: the function itself is still fragile for
any caller passing a non-canonical path:

1. Embedded callers with a raw `--data-dir` path
2. Programmatic `SessionStore::from_cwd(user_path)` calls
3. NixOS store paths, Docker bind mounts, case-insensitive normalization

The REPL's default flow happens to work because `env::current_dir()`
returns canonical paths on macOS. But any caller passing a raw path
risks silent session-store divergence.

## Fix

Canonicalize inside `SessionStore::from_cwd()` and `from_data_dir()`
before computing the fingerprint. Kept `workspace_fingerprint()` itself
as a pure function for determinism — canonicalization is the entry
point's responsibility.

```rust
let canonical_cwd = fs::canonicalize(cwd).unwrap_or_else(|_| cwd.to_path_buf());
let sessions_root = canonical_cwd.join(".claw").join("sessions").join(workspace_fingerprint(&canonical_cwd));
```

Falls back to the raw path if canonicalize fails (directory doesn't
exist yet).

## Test-side updates

Three legacy-session tests expected the non-canonical base path to
match the store's workspace_root. Updated them to canonicalize
`base` after creation — same defensive pattern as #150, now
explicit across all three tests.

## Regression test

Added `session_store_from_cwd_canonicalizes_equivalent_paths` that
creates two stores from equivalent paths (raw vs canonical) and
asserts they resolve to the same sessions_dir.

## Verification

- `cargo test -p runtime session_store_` — 9/9 pass
- `cargo test --workspace` — all green, no FAILED markers
- No behavior change for existing users (REPL default flow already
  used canonical paths)

## Backward compatibility

Users on macOS who always went through `env::current_dir()`:
no hash change, sessions resume identically.

Users who ever called with a non-canonical path: hash would change,
but those sessions were already broken (couldn't be resumed from a
canonical-path cwd). Net improvement.

Closes ROADMAP #151.
2026-04-21 21:06:09 +09:00
YeonGyu-Kim
eaa077bf91 fix: #150 — eliminate symlink canonicalization flake in resume_latest test + file #246 (reminder outcome ambiguity)
## #150 Fix: resume_latest test flake

**Problem:** `resume_latest_restores_the_most_recent_managed_session` intermittently
fails when run in the workspace suite or multiple times in sequence, but passes in
isolation.

**Root cause:** `workspace_fingerprint(path)` hashes the path string without
canonicalization. On macOS, `/tmp` is a symlink to `/private/tmp`. The test
creates a temp dir via `std::env::temp_dir().join(...)` which returns
`/var/folders/...` (non-canonical). When the subprocess spawns,
`env::current_dir()` returns the canonical path `/private/var/folders/...`.
The two fingerprints differ, so the subprocess looks in
`.claw/sessions/<hash1>` while files are in `.claw/sessions/<hash2>`.
Session discovery fails.

**Fix:** Call `fs::canonicalize(&project_dir)` after creating the directory
to ensure test and subprocess use identical path representations.

**Verification:** 5 consecutive runs of the full test suite — all pass.
Previously: 5/5 failed when run in sequence.

## #246 Filing: Reminder cron outcome ambiguity (control-loop blocker)

The `clawcode-dogfood-cycle-reminder` cron times out repeatedly with no
structured feedback on whether the nudge was delivered, skipped, or died in-flight.

**Phase 1 outcome schema** — add explicit field to cron result:
- `delivered` — nudge posted to Discord
- `timed_out_before_send` — died before posting
- `timed_out_after_send` — posted but cleanup timed out
- `skipped_due_to_active_cycle` — previous cycle active
- `aborted_gateway_draining` — daemon shutdown

Assigned to gaebal-gajae (cron/orchestration domain). Unblocks trustworthy
dogfood cycle observability.

Closes ROADMAP #150. Filed ROADMAP #246.
2026-04-21 21:01:09 +09:00
YeonGyu-Kim
bc259ec6f9 fix: #149 — eliminate parallel-test flake in runtime::config tests
## Problem

`runtime::config::tests::validates_unknown_top_level_keys_with_line_and_field_name`
intermittently fails during `cargo test --workspace` (witnessed during
#147 and #148 workspace runs) but passes deterministically in isolation.

Example failure from workspace run:
  test result: FAILED. 464 passed; 1 failed

## Root cause

`runtime/src/config.rs::tests::temp_dir()` used nanosecond timestamp
alone for namespace isolation:

  std::env::temp_dir().join(format!("runtime-config-{nanos}"))

Under parallel test execution on fast machines with coarse clock
resolution, two tests start within the same nanosecond bucket and
collide on the same path. One test's `fs::remove_dir_all(root)` then
races another's in-flight `fs::create_dir_all()`.

Other crates already solved this pattern:
- plugins::tests::temp_dir(label) — label-parameterized
- runtime::git_context::tests::temp_dir(label) — label-parameterized

runtime/src/config.rs was missed.

## Fix

Added process id + monotonically-incrementing atomic counter to the
namespace, making every callsite provably unique regardless of clock
resolution or scheduling:

  static COUNTER: AtomicU64 = AtomicU64::new(0);
  let pid = std::process::id();
  let seq = COUNTER.fetch_add(1, Ordering::Relaxed);
  std::env::temp_dir().join(format!("runtime-config-{pid}-{nanos}-{seq}"))

Chose counter+pid over the label-parameterized pattern to avoid
touching all 20 callsites in the same commit (mechanical noise with
no added safety — counter alone is sufficient).

## Verification

Before: one failure per workspace run (config test flake).
After: 5 consecutive `cargo test --workspace` runs — zero config
test failures. Only pre-existing `resume_latest` flake remains
(orthogonal, unrelated to this change).

  for i in 1 2 3 4 5; do cargo test --workspace; done
  # All 5 runs: config tests green. Only resume_latest flake appears.

  cargo test -p runtime
  # 465 passed; 0 failed

## ROADMAP.md

Added Pinpoint #149 documenting the gap, root cause, and fix.

Closes ROADMAP #149.
2026-04-21 20:54:12 +09:00
YeonGyu-Kim
f84c7c4ed5 feat: #148 + #128 closure — model provenance in claw status JSON/text
## Scope

Two deltas in one commit:

### #128 closure (docs)

Re-verified on main HEAD `4cb8fa0`: malformed `--model` strings already
rejected at parse time (`validate_model_syntax` in parse_args). All
historical repro cases now produce specific errors:

  claw --model ''                       → error: model string cannot be empty
  claw --model 'bad model'              → error: invalid model syntax: 'bad model' contains spaces
  claw --model 'sonet'                  → error: invalid model syntax: 'sonet'. Expected provider/model or known alias
  claw --model '@invalid'               → error: invalid model syntax: '@invalid'. Expected provider/model ...
  claw --model 'totally-not-real-xyz'   → error: invalid model syntax: ...
  claw --model sonnet                   → ok, resolves to claude-sonnet-4-6
  claw --model anthropic/claude-opus-4-6 → ok, passes through

Marked #128 CLOSED in ROADMAP with repro block. Residual provenance gap
split off as #148.

### #148 implementation

**Problem.** After #128 closure, `claw status --output-format json`
still surfaces only the resolved model string. No way for a claw to
distinguish whether `claude-sonnet-4-6` came from `--model sonnet`
(alias resolution) vs `--model claude-sonnet-4-6` (pass-through) vs
`ANTHROPIC_MODEL` env vs `.claw.json` config vs compiled-in default.

Debug forensics had to re-read argv instead of reading a structured
field. Clawhip orchestrators sending `--model` couldn't confirm the
flag was honored vs falling back to default.

**Fix.** Added two fields to status JSON envelope:
- `model_source`: "flag" | "env" | "config" | "default"
- `model_raw`: user's input before alias resolution (null on default)

Text mode appends a `Model source` line under `Model`, showing the
source and raw input (e.g. `Model source     flag (raw: sonnet)`).

**Resolution order** (mirrors resolve_repl_model but with source
attribution):
1. If `--model` / `--model=` flag supplied → source: flag, raw: flag value
2. Else if ANTHROPIC_MODEL set → source: env, raw: env value
3. Else if `.claw.json` model key set → source: config, raw: config value
4. Else → source: default, raw: null

## Changes

### rust/crates/rusty-claude-cli/src/main.rs

- Added `ModelSource` enum (Flag/Env/Config/Default) with `as_str()`.
- Added `ModelProvenance` struct (resolved, raw, source) with
  three constructors: `default_fallback()`, `from_flag(raw)`, and
  `from_env_or_config_or_default(cli_model)`.
- Added `model_flag_raw: Option<String>` field to `CliAction::Status`.
- Parse loop captures raw input in `--model` and `--model=` arms.
- Extended `parse_single_word_command_alias` to thread
  `model_flag_raw: Option<&str>` through.
- Extended `print_status_snapshot` signature to accept
  `model_flag_raw: Option<&str>`. Resolves provenance at dispatch time
  (flag provenance from arg; else probe env/config/default).
- Extended `status_json_value` signature with
  `provenance: Option<&ModelProvenance>`. On Some, adds `model_source`
  and `model_raw` fields; on None (legacy resume paths), omits them
  for backward compat.
- Extended `format_status_report` signature with optional provenance.
  On Some, renders `Model source` line after `Model`.
- Updated all existing callers (REPL /status, resume /status, tests)
  to pass None (legacy paths don't carry flag provenance).
- Added 2 regression assertions in parse_args test covering both
  `--model sonnet` and `--model=...` forms.

### ROADMAP.md

- Marked #128 CLOSED with re-verification block.
- Filed #148 documenting the provenance gap split, fix shape, and
  acceptance criteria.

## Live verification

$ claw --model sonnet --output-format json status | jq '{model,model_source,model_raw}'
{"model": "claude-sonnet-4-6", "model_source": "flag", "model_raw": "sonnet"}

$ claw --output-format json status | jq '{model,model_source,model_raw}'
{"model": "claude-opus-4-6", "model_source": "default", "model_raw": null}

$ ANTHROPIC_MODEL=haiku claw --output-format json status | jq '{model,model_source,model_raw}'
{"model": "claude-haiku-4-5-20251213", "model_source": "env", "model_raw": "haiku"}

$ echo '{"model":"claude-opus-4-7"}' > .claw.json && claw --output-format json status | jq '{model,model_source,model_raw}'
{"model": "claude-opus-4-7", "model_source": "config", "model_raw": "claude-opus-4-7"}

$ claw --model sonnet status
Status
  Model            claude-sonnet-4-6
  Model source     flag (raw: sonnet)
  Permission mode  danger-full-access
  ...

## Tests

- rusty-claude-cli bin: 177 tests pass (2 new assertions for #148)
- Full workspace green except pre-existing resume_latest flake (unrelated)

Closes ROADMAP #128, #148.
2026-04-21 20:48:46 +09:00
YeonGyu-Kim
4cb8fa059a feat: #147 — reject empty / whitespace-only prompts at CLI fallthrough
## Problem

The `"prompt"` subcommand arm enforced `if prompt.trim().is_empty()`
and returned a specific error. The fallthrough `other` arm in the same
match block — which routes any unrecognized first positional arg to
`CliAction::Prompt` — had no such guard. Result:

$ claw ""
error: missing Anthropic credentials; export ANTHROPIC_AUTH_TOKEN ...

$ claw "   "
error: missing Anthropic credentials; ...

$ claw "" ""
error: missing Anthropic credentials; ...

$ claw --output-format json ""
{"error":"missing Anthropic credentials; ...","type":"error"}

An empty prompt should never reach the credentials check. Worse: with
valid credentials, the literal empty string gets sent to Claude as a
user prompt, either burning tokens for nothing or triggering a model-
side refusal. Same prompt-misdelivery family as #145.

## Root cause

In `parse_subcommand()`, the final `other =>` arm in the top-level
match only guards against typos (#108 guard via `looks_like_subcommand_typo`)
and then unconditionally builds `CliAction::Prompt { prompt: rest.join(" ") }`.
An empty/whitespace-only join passes through.

## Changes

### rust/crates/rusty-claude-cli/src/main.rs

Added the same `if joined.trim().is_empty()` guard already used in the
`"prompt"` arm to the fallthrough path. Error message distinguishes it
from the `prompt` subcommand path:

  empty prompt: provide a subcommand (run `claw --help`) or a
  non-empty prompt string

Runs AFTER the typo guard (so `claw sttaus` still suggests `status`)
and BEFORE CliAction::Prompt construction (so no network call ever
happens for empty inputs).

### Regression tests

Added 4 assertions in the existing parse_args test:
- parse_args([""]) → Err("empty prompt: ...")
- parse_args(["   "]) → Err("empty prompt: ...")
- parse_args(["", ""]) → Err("empty prompt: ...")
- parse_args(["sttaus"]) → Err("unknown subcommand: ...") [verifies #108 typo guard still takes precedence]

### ROADMAP.md

Added Pinpoint #147 documenting the gap, verification, root cause,
fix shape, and acceptance. Joins the prompt-misdelivery cluster
alongside #145.

## Live verification

$ claw ""
error: empty prompt: provide a subcommand (run `claw --help`) or a non-empty prompt string

$ claw "   "
error: empty prompt: provide a subcommand (run `claw --help`) or a non-empty prompt string

$ claw --output-format json ""
{"error":"empty prompt: provide a subcommand ...","type":"error"}

$ claw prompt ""   # unchanged: subcommand-specific error preserved
error: prompt subcommand requires a prompt string

$ claw hello        # unchanged: typo guard still fires
error: unknown subcommand: hello.
  Did you mean     help

$ claw "real prompt here"   # unchanged: real prompts still reach API
error: api returned 401 Unauthorized (with dummy key, as expected)

All empty/whitespace-only paths exit 1. No network call. No misleading
credentials error.

## Tests

- rusty-claude-cli bin: 177 tests pass (4 new assertions)
- Full workspace green except pre-existing resume_latest flake (unrelated)

Closes ROADMAP #147.
2026-04-21 20:35:17 +09:00
YeonGyu-Kim
f877acacbf feat: #146 — wire claw config and claw diff as standalone subcommands
## Problem

`claw config` and `claw diff` are pure-local read-only introspection
commands (config merges .claw.json + .claw/settings.json from disk; diff
shells out to `git diff --cached` + `git diff`). Neither needs a session
context, yet both rejected direct CLI invocation:

$ claw config
error: `claw config` is a slash command. Use `claw --resume SESSION.jsonl /config` ...

$ claw diff
error: `claw diff` is a slash command. ...

This forced clawing operators to spin up a full session just to inspect
static disk state, and broke natural pipelines like
`claw config --output-format json | jq`.

## Root cause

Sibling of #145: `SlashCommand::Config { section }` and
`SlashCommand::Diff` had working renderers (`render_config_report`,
`render_config_json`, `render_diff_report`, `render_diff_json_for`)
exposed for resume sessions, but the top-level CLI parser in
`parse_subcommand()` had no arms for them. Zero-arg `config`/`diff`
hit `parse_single_word_command_alias`'s fallback to
`bare_slash_command_guidance`, producing the misleading guidance.

## Changes

### rust/crates/rusty-claude-cli/src/main.rs

- Added `CliAction::Config { section, output_format }` and
  `CliAction::Diff { output_format }` variants.
- Added `"config"` / `"diff"` arms to the top-level parser in
  `parse_subcommand()`. `config` accepts an optional section name
  (env|hooks|model|plugins) matching SlashCommand::Config semantics.
  `diff` takes no positional args. Both reject extra trailing args
  with a clear error.
- Added `"config" | "diff" => None` to
  `parse_single_word_command_alias` so bare invocations fall through
  to the new parser arms instead of the slash-guidance error.
- Added dispatch in run() that calls existing renderers: text mode uses
  `render_config_report` / `render_diff_report`; JSON mode uses
  `render_config_json` / `render_diff_json_for` with
  `serde_json::to_string_pretty`.
- Added 5 regression assertions in parse_args test covering:
  parse_args(["config"]), parse_args(["config", "env"]),
  parse_args(["config", "--output-format", "json"]),
  parse_args(["diff"]), parse_args(["diff", "--output-format", "json"]).

### ROADMAP.md

Added Pinpoint #146 documenting the gap, verification, root cause,
fix shape, and acceptance. Explicitly notes which other slash commands
(`hooks`, `usage`, `context`, etc.) are NOT candidates because they
are session-state-modifying.

## Live verification

$ claw config   # no config files
Config
  Working directory /private/tmp/cd-146-verify
  Loaded files      0
  Merged keys       0
Discovered files
  user    missing ...
  project missing ...
  local   missing ...
Exit 0.

$ claw config --output-format json
{
  "cwd": "...",
  "files": [...],
  ...
}

$ claw diff   # no git
Diff
  Result           no git repository
  Detail           ...
Exit 0.

$ claw diff --output-format json   # inside claw-code
{
  "kind": "diff",
  "result": "changes",
  "staged": "",
  "unstaged": "diff --git ..."
}
Exit 0.

## Tests

- rusty-claude-cli bin: 177 tests pass (5 new assertions in parse_args)
- Full workspace green except pre-existing resume_latest flake (unrelated)

## Not changed

`hooks`, `usage`, `context`, `tasks`, `theme`, `voice`, `rename`,
`copy`, `color`, `effort`, `branch`, `rewind`, `ide`, `tag`,
`output-style`, `add-dir` — all session-mutating or interactive-only;
correctly remain slash-only.

Closes ROADMAP #146.
2026-04-21 20:07:28 +09:00
YeonGyu-Kim
7d63699f9f feat: #145 — wire claw plugins subcommand to CLI parser (prompt misdelivery fix)
## Problem

`claw plugins` (and `claw plugins list`, `claw plugins --help`,
`claw plugins info <name>`, etc.) fell through the top-level subcommand
match and got routed into the prompt-execution path. Result: a purely
local introspection command triggered an Anthropic API call and surfaced
`missing Anthropic credentials` to the user. With valid credentials, it
would actually send the literal string "plugins" as a user prompt to
Claude, burning tokens for a local query.

$ claw plugins
error: missing Anthropic credentials; export ANTHROPIC_AUTH_TOKEN or ANTHROPIC_API_KEY before calling the Anthropic API

$ ANTHROPIC_API_KEY=dummy claw plugins
⠋ 🦀 Thinking...
✘  Request failed
error: api returned 401 Unauthorized

Meanwhile siblings (`agents`, `mcp`, `skills`) all worked correctly:

$ claw agents
No agents found.
$ claw mcp
MCP
  Working directory ...
  Configured servers 0

## Root cause

`CliAction::Plugins` exists, has a working dispatcher
(`LiveCli::print_plugins`), and is produced inside the REPL via
`SlashCommand::Plugins`. But the top-level CLI parser in
`parse_subcommand()` had arms for `agents`, `mcp`, `skills`, `status`,
`doctor`, `init`, `export`, `prompt`, etc., and **no arm for
`plugins`**. The dispatch never ran from the CLI entry point.

## Changes

### rust/crates/rusty-claude-cli/src/main.rs

Added a `"plugins"` arm to the top-level match in `parse_subcommand()`
that produces `CliAction::Plugins { action, target, output_format }`,
following the same positional convention as `mcp` (`action` = first
positional, `target` = second). Rejects >2 positional args with a clear
error.

Added four regression assertions in the existing `parse_args` test:
- `plugins` alone → `CliAction::Plugins { action: None, target: None }`
- `plugins list` → action: Some("list"), target: None
- `plugins enable <name>` → action: Some("enable"), target: Some(...)
- `plugins --output-format json` → action: None, output_format: Json

### ROADMAP.md

Added Pinpoint #145 documenting the gap, verification, root cause,
fix shape, and acceptance.

## Live verification

$ claw plugins   # no credentials set
Plugins
  example-bundled      v0.1.0      disabled
  sample-hooks         v0.1.0      disabled

$ claw plugins --output-format json   # no credentials set
{
  "action": "list",
  "kind": "plugin",
  "message": "Plugins\n  example-bundled ...\n  sample-hooks ...",
  "reload_runtime": false,
  "target": null
}

Exit 0 in all modes. No network call. No "missing credentials" error.

## Tests

- rusty-claude-cli bin: 177 tests pass (new plugin assertions included)
- Full workspace green except pre-existing resume_latest flake (unrelated)

Closes ROADMAP #145.
2026-04-21 19:36:49 +09:00
YeonGyu-Kim
faeaa1d30c feat: #144 phase 1 + ROADMAP filing — claw mcp degrades gracefully on malformed config
Filing + Phase 1 fix in one commit (sibling of #143).

## Context

With #143 Phase 1 landed (`claw status` degrades), `claw mcp` was the
remaining diagnostic surface that hard-failed on a malformed `.claw.json`.
Same input, same parse error, same partial-success violation. Fresh
dogfood at 18:59 KST caught it on main HEAD `e2a43fc`.

## Changes

### ROADMAP.md
Added Pinpoint #144 documenting the gap and acceptance criteria. Joins
the partial-success / Principle #5 cluster with #143.

### rust/crates/commands/src/lib.rs
`render_mcp_report_for()` + `render_mcp_report_json_for()` now catch the
ConfigError at loader.load() instead of propagating:

- **Text mode** prepends a "Config load error" block (same shape as
  #143's status output) before the MCP listing. The listing still renders
  with empty servers so the output structure is preserved.
- **JSON mode** adds top-level `status: "ok" | "degraded"` +
  `config_load_error: string | null` fields alongside existing fields
  (`kind`, `action`, `working_directory`, `configured_servers`,
  `servers[]`). On clean runs, `status: "ok"` and
  `config_load_error: null`. On parse failure, `status: "degraded"`,
  `config_load_error: "..."`, `servers: []`, exit 0.
- Both list and show actions get the same treatment.

### Regression test
`commands::tests::mcp_degrades_gracefully_on_malformed_mcp_config_144`:
- Injects the same malformed .claw.json as #143 (one valid + one broken
  mcpServers entry).
- Asserts mcp list returns Ok (not Err).
- Asserts top-level status: "degraded" and config_load_error names the
  malformed field path.
- Asserts show action also degrades.
- Asserts clean path returns status: "ok" with config_load_error null.

## Live verification

$ claw mcp --output-format json
{
  "action": "list",
  "kind": "mcp",
  "status": "degraded",
  "config_load_error": ".../.claw.json: mcpServers.missing-command: missing string field command",
  "working_directory": "/Users/yeongyu/clawd",
  "configured_servers": 0,
  "servers": []
}
Exit 0.

## Contract alignment after this commit

All three diagnostic surfaces match now:
- `doctor` — degraded envelope with typed check entries 
- `status` — degraded envelope with config_load_error  (#143)
- `mcp` — degraded envelope with config_load_error  (this commit)

Phase 2 (typed-error object joining taxonomy §4.44) tracked separately
across all three surfaces.

Full workspace test green except pre-existing resume_latest flake (unrelated).

Closes ROADMAP #144 phase 1.
2026-04-21 19:07:17 +09:00
YeonGyu-Kim
e2a43fcd49 feat: #143 phase 1 — claw status degrades gracefully on malformed config
Previously `claw status` hard-failed on any config parse error, emitting
a bare error string and exiting 1. This took down the entire health
surface for a single malformed MCP entry, even though workspace, git,
model, permission, and sandbox state could all be reported independently.

`claw doctor` already degraded gracefully on the exact same input.
This commit matches `claw status` to that contract.

Changes:
- Add `StatusContext::config_load_error: Option<String>` to capture parse
  errors without aborting.
- Rewrite `status_context()` to match on `ConfigLoader::load()`: on Err,
  fall back to default `SandboxConfig` for sandbox resolution and record
  the parse error, then continue populating workspace/git/memory fields.
- JSON output gains top-level `status: "ok" | "degraded"` marker and a
  `config_load_error` string (null on clean runs). All other existing
  fields preserved for backward compat.
- Text output prepends a "Config load error" block with Details + Hint
  when config failed to parse, then a "Status (degraded)" header on the
  main block. Clean runs show the usual "Status" header.
- Doctor path updated to pass the config load error through StatusContext.

Regression test `status_degrades_gracefully_on_malformed_mcp_config_143`:
- Injects a .claw.json with one valid + one malformed mcpServers entry
- Asserts status_context() returns Ok (not Err)
- Asserts config_load_error names the malformed field path
- Asserts workspace/sandbox fields still populated in JSON
- Asserts top-level status is 'degraded'
- Asserts clean config path still returns status: 'ok'

Verified live on /Users/yeongyu/clawd (contains deliberately broken MCP entries):
  $ claw status --output-format json
  { "status": "degraded",
    "config_load_error": ".../mcpServers.missing-command: missing string field command",
    "model": "claude-opus-4-6",
    "workspace": {...},
    "sandbox": {...},
    ... }

Phase 2 (typed error object joining #4.44 taxonomy) tracked separately.

Full workspace test green except pre-existing resume_latest flake (unrelated).

Closes ROADMAP #143 phase 1.
2026-04-21 18:37:42 +09:00
YeonGyu-Kim
fcd5b49428 ROADMAP #143: claw status hard-fails on malformed MCP config while doctor degrades gracefully 2026-04-21 18:32:09 +09:00
YeonGyu-Kim
e73b6a2364 docs: USAGE.md sections for claw init (#142) and claw state (#139)
Add two missing sections documenting the recently-fixed commands:

- **Initialize a repository**: Shows both text and JSON output modes for
  `claw init`. Explains that structured JSON fields (created[], updated[],
  skipped[], artifacts[]) allow claws to detect per-artifact state without
  substring-matching prose. Documents idempotency.

- **Inspect worker state**: Documents `claw state` and the prerequisite
  that a worker must have executed at least once. Includes the helpful error
  message and remediation hints (claw or claw prompt <text>) so users
  discovering the command for the first time see actionable guidance.

These sections complement the product fixes in #142 (init JSON structure)
and #139 (state error actionability) by documenting the contract from a
user perspective.

Related: ROADMAP #142 (structured init output), #139 (worker-state discoverability).
2026-04-21 18:28:21 +09:00
YeonGyu-Kim
541c5bb95d feat: #139 actionable worker-state guidance in claw state error + help
Previously `claw state` errored with "no worker state file found ... — run a
worker first" but there is no `claw worker` subcommand, so claws had no
discoverable path from the error to a fix.

Changes:
- Rewrite the missing-state error to name the two concrete commands that
  produce .claw/worker-state.json:
    * `claw` (interactive REPL, writes state on first turn)
    * `claw prompt <text>` (one non-interactive turn)
  Also tell the user what to rerun: `claw state [--output-format json]`.
- Expand the State --help topic with "Produces state", "Observes state",
  and "Exit codes" lines so the worker-state contract is discoverable
  before the user hits the error.
- Add regression test state_error_surfaces_actionable_worker_commands_139
  asserting the error contains `claw prompt`, REPL mention, and the
  rerun path, plus that the help topic documents the producer contract.

Verified live:
  $ claw state
  error: no worker state file found at .claw/worker-state.json
    Hint: worker state is written by the interactive REPL or a non-interactive prompt.
    Run:   claw               # start the REPL (writes state on first turn)
    Or:    claw prompt <text> # run one non-interactive turn
    Then rerun: claw state [--output-format json]

JSON mode preserves the full hint inside the error envelope so CI/claws
can match on `claw prompt` without losing the canonical prefix.

Full workspace test green except pre-existing resume_latest flake (unrelated).

Closes ROADMAP #139.
2026-04-21 18:04:04 +09:00
YeonGyu-Kim
611eed1537 feat: #142 structured fields in claw init --output-format json
Previously `claw init --output-format json` emitted a valid JSON envelope but
packed the entire human-formatted output into a single `message` string. Claw
scripts had to substring-match human language to tell `created` from `skipped`.

Changes:
- Add InitStatus::json_tag() returning machine-stable "created"|"updated"|"skipped"
  (unlike label() which includes the human " (already exists)" suffix).
- Add InitReport::NEXT_STEP constant so claws can read the next-step hint
  without grepping the message string.
- Add InitReport::artifacts_with_status() to partition artifacts by state.
- Add InitReport::artifact_json_entries() for the structured artifacts[] array.
- Rewrite run_init + init_json_value to emit first-class fields alongside the
  legacy message string (kept for text consumers): project_path, created[],
  updated[], skipped[], artifacts[], next_step, message.
- Update the slash-command Init dispatch to use the same structured JSON.
- Add regression test artifacts_with_status_partitions_fresh_and_idempotent_runs
  asserting both fresh + idempotent runs produce the right partitioning and
  that the machine-stable tag is bare 'skipped' not label()'s phrasing.

Verified output:
- Fresh dir: created[] has 4 entries, skipped[] empty
- Idempotent call: created[] empty, skipped[] has 4 entries
- project_path, next_step as first-class keys
- message preserved verbatim for backward compat

Full workspace test green except pre-existing resume_latest flake (unrelated).

Closes ROADMAP #142.
2026-04-21 17:42:00 +09:00
YeonGyu-Kim
7763ca3260 feat: #141 unify claw <subcommand> --help contract across all 14 subcommands
Previously, `claw <subcommand> --help` had 5 different behaviors:
- 7 subcommands returned subcommand-specific help (correct)
- init/export/state/version silently fell back to global `claw --help`
- system-prompt/dump-manifests errored with `unknown <cmd> option: --help`
- bootstrap-plan printed its phase list instead of help text

Changes:
- Extend LocalHelpTopic enum with Init, State, Export, Version, SystemPrompt,
  DumpManifests, BootstrapPlan variants.
- Extend parse_local_help_action() to resolve those 7 subcommands to their
  local help topic instead of falling through to the main dispatch.
- Remove init/state/export/version from the explicit wants_help=true matcher
  so they reach parse_local_help_action() before being routed to global help.
- Add render_help_topic() entries for the 7 new topics with consistent
  Usage/Purpose/Output/Formats/Related structure.
- Add regression test subcommand_help_flag_has_one_contract_across_all_subcommands_141
  asserting every documented subcommand + both --help and -h variants resolve
  to a HelpTopic with non-empty text that contains a Usage line.

Verification:
- All 14 subcommands now return subcommand-specific help (live dogfood).
- Full workspace test green except pre-existing resume_latest flake.

Closes ROADMAP #141.
2026-04-21 17:36:48 +09:00
YeonGyu-Kim
2665ada94e ROADMAP #142: claw init --output-format json emits unstructured message string instead of created/skipped fields 2026-04-21 17:31:11 +09:00
YeonGyu-Kim
21b377d9c0 ROADMAP #141: claw <subcommand> --help has 5 different behaviors — inconsistent help surface 2026-04-21 17:01:46 +09:00
YeonGyu-Kim
27ffd75f03 fix: #140 isolate test cwd + env in punctuation_bearing_single_token test
Previously this test inherited the cargo test runner's CWD, which could contain
a stale .claw/settings.json with "permissionMode": "acceptEdits" written by
another test. The deprecated-field resolver then silently downgraded the
default permission mode to WorkspaceWrite, breaking the test's assertion.

Fix: wrap the assertion in with_current_dir() + env_lock() so the test runs in
an isolated temp directory with no stale config.

Full workspace test now passes except for pre-existing resume_latest flake
(unrelated to #140, environment-dependent, tracked separately).

Closes ROADMAP #140.
2026-04-21 16:34:58 +09:00
YeonGyu-Kim
0cf8241978 ROADMAP #140: deprecated permissionMode migration silently downgrades DangerFullAccess to WorkspaceWrite — 1 test failure on main HEAD 36b3a09 2026-04-21 16:23:00 +09:00
YeonGyu-Kim
36b3a09818 ROADMAP #139: claw state error references undocumented 'worker' concept (unactionable for claws) 2026-04-21 16:01:54 +09:00
YeonGyu-Kim
f3f6643fb9 feat: #108 add did-you-mean guard for subcommand typos (prevents silent LLM dispatch)
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-21 15:37:58 +09:00
YeonGyu-Kim
883cef1a26 docs: #138 add concrete evidence — feat/134-135 branch pushed but no PR (closure-state gap) 2026-04-21 15:02:33 +09:00
YeonGyu-Kim
768c1abc78 ROADMAP #138: dogfood cycle report-gate opacity — nudge surface needs explicit closure state 2026-04-21 14:49:36 +09:00
YeonGyu-Kim
a8beca1463 fix: #136 support --output-format json with --compact flag
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-21 14:47:15 +09:00
YeonGyu-Kim
21adae9570 fix: #137 update test fixtures to use canonical 'opus' alias for main branch consistency
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-21 14:32:49 +09:00
YeonGyu-Kim
724a78604d ROADMAP #137: model-alias shorthand regression in test suite — bare alias parsing broken on feat/134-135-session-identity; 3 tests fail with invalid model syntax error after #134/#135 validation tightening 2026-04-21 13:27:10 +09:00
YeonGyu-Kim
91ba54d39f ROADMAP #136: --compact flag silently overrides --output-format json — compact turn always emits plain text even when JSON requested; unreachable Json arm in run_with_output() match; joins output-format completeness cluster #90/#91/#92/#127/#130 and CLI/REPL parity §7.1 2026-04-21 12:27:06 +09:00
YeonGyu-Kim
8b52e77f23 ROADMAP #135: claw status --json missing active_session bool and session.id cross-reference — status query side of #134 round-trip; joins session identity completeness §4.7 and status surface completeness cluster #80/#83/#114/#122; natural bundle #134+#135 closes session-identity round-trip 2026-04-21 06:55:09 +09:00
YeonGyu-Kim
2c42f8bcc8 docs: remove duplicate ROADMAP #134 entry 2026-04-21 04:50:43 +09:00
YeonGyu-Kim
f266505546 ROADMAP #134: no run/correlation ID at session boundary — session.id missing from startup event and status JSON; observer must infer session identity from timing 2026-04-21 01:55:42 +09:00