claw-code

mirror of https://github.com/ultraworkers/claw-code.git synced 2026-06-16 14:46:50 +08:00

Author	SHA1	Message	Date
YeonGyu-Kim	4ae59f27e6	fix: #168 — exec-command / exec-tool / route / bootstrap now accept --output-format; CLI family JSON parity COMPLETE Extends the #167 inspect-surface parity fix to the four remaining CLI outliers: the commands claws actually invoke to DO work, not just inspect state. After this commit, the entire claw-code CLI family speaks a unified JSON envelope contract. Concrete additions: - exec-command: --output-format {text,json} - exec-tool: --output-format {text,json} - route: --output-format {text,json} - bootstrap: --output-format {text,json} JSON envelope shapes: exec-command (handled): {name, prompt, source_hint, handled: true, message} exec-command (not-found): {name, prompt, handled: false, error: {kind:'command_not_found', message, retryable: false}} exec-tool (handled): {name, payload, source_hint, handled: true, message} exec-tool (not-found): {name, payload, handled: false, error: {kind:'tool_not_found', message, retryable: false}} route: {prompt, limit, match_count, matches: [{kind, name, score, source_hint}]} bootstrap: {prompt, limit, setup: {python_version, implementation, platform_name, test_command}, routed_matches: [{kind, name, score, source_hint}], command_execution_messages: [str], tool_execution_messages: [str], turn: {prompt, output, stop_reason}, persisted_session_path} Exit codes (unchanged from pre-#168): 0 = success 1 = exec not-found (exec-command, exec-tool only) Backward compatibility: - Default (no --output-format) is 'text' - exec-command/exec-tool text output byte-identical - route text output: unchanged tab-separated kind/name/score/source_hint - bootstrap text output: unchanged Markdown runtime session report Tests (13 new, test_exec_route_bootstrap_output_format.py): - TestExecCommandOutputFormat (3): handled + not-found JSON; text compat - TestExecToolOutputFormat (3): handled + not-found JSON; text compat - TestRouteOutputFormat (3): JSON envelope; zero-matches case; text compat - TestBootstrapOutputFormat (2): 
JSON envelope; text-mode Markdown compat - TestFamilyWideJsonParity (2): parametrised over ALL 6 family commands (show-command, show-tool, exec-command, exec-tool, route, bootstrap) — every one accepts --output-format json and emits parseable JSON; every one defaults to text mode without a leading {. One future regression on any family member breaks this test. Full suite: 124 → 137 passing, zero regression. Closes ROADMAP #168. This completes the CLI-wide JSON parity sweep: - Session-lifecycle family: #160 (list/delete), #165 (load), #166 (flush) - Inspect family: #167 (show-command, show-tool) - Work-verb family: #168 (exec-command, exec-tool, route, bootstrap) ENTIRE CLI SURFACE is now machine-readable via --output-format json with typed errors, deterministic exit codes, and consistent envelope shape. Claws no longer need to regex-parse any CLI output. Related clusters: - Clawability principle: 'machine-readable in state and failure modes' (ROADMAP top-level). 9 pinpoints in this cluster; all now landed. - Typed-error envelope consistency: command_not_found / tool_not_found / session_not_found / session_load_failed all share {kind, message, retryable} shape. - Work-verb semantics: exec-* surfaces expose 'handled' boolean (not 'found') because 'not handled' is the operational signal — claws dispatch on whether the work was performed, not whether the entry exists in the inventory.	2026-04-30 01:06:57 +09:00
YeonGyu-Kim	de97541ebd	fix: #167 — show-command and show-tool now accept --output-format flag; CLI parity with session-lifecycle family Closes the inspect-capability parity gap: show-command and show-tool were the only discovery/inspection CLI commands lacking --output-format support, making them outliers in the ecosystem that already had unified JSON contracts across list-sessions, load-session, delete-session, and flush-transcript (#160/#165/#166). Concrete additions: - show-command: --output-format {text,json} - show-tool: --output-format {text,json} JSON envelope shape (found case): {name, found: true, source_hint, responsibility} JSON envelope shape (not-found case): {name, found: false, error: {kind:'command_not_found'|'tool_not_found', message, retryable: false}} Exit codes: 0 = success 1 = not found Backward compatibility: - Default (no --output-format) is 'text' (unchanged) - Text output byte-identical to pre-#167 (three newline-separated lines) Tests (10 new, test_show_command_tool_output_format.py): - TestShowCommandOutputFormat (5): found + not-found in JSON; text mode backward compat; text is default - TestShowToolOutputFormat (3): found + not-found in JSON; text mode backward compat - TestShowCommandToolFormatParity (2): both accept same flag choices; consistent JSON envelope shape Full suite: 114 → 124 passing, zero regression. Closes ROADMAP #167. Why this matters: Before: Claws calling show-command/show-tool had to parse human-readable prose output via regex, with no structured error signal. After: Same envelope contract as load-session and friends: JSON-first, typed errors, machine-parseable. Related clusters: - Session-lifecycle CLI parity family (#160, #165, #166, #167) - Machine-readable error contracts (same vein as #162 atomicity + #164 cancellation state-safety: structured boundaries for orchestration)	2026-04-30 01:06:57 +09:00
YeonGyu-Kim	3ec635207e	fix: #164 Stage A — cooperative cancellation via cancel_event in submit_message Closes the #161 follow-up gap identified in review: wall-clock timeout bounded caller-facing wait but did not cancel the underlying provider thread, which could silently mutate mutable_messages / transcript_store / permission_denials / total_usage after the caller had already observed stop_reason='timeout'. A ghost turn committed post-deadline would poison any session that got persisted afterwards. Stage A scope (this commit): runtime + engine layer cooperative cancel. Engine layer (src/query_engine.py): - submit_message now accepts cancel_event: threading.Event | None = None - Two safe checkpoints: 1. Entry (before max_turns / budget projection) — earliest possible return 2. Post-budget (after output synthesis, before mutation) — catches cancel that arrives while output was being computed - Both checkpoints return stop_reason='cancelled' with state UNCHANGED (mutable_messages, transcript_store, permission_denials, total_usage all preserved exactly as on entry) - cancel_event=None preserves legacy behaviour with zero overhead (no checkpoint checks at all) Runtime layer (src/runtime.py): - run_turn_loop creates one cancel_event per invocation when a deadline is in play (and None otherwise, preserving legacy fast path) - Passes the same event to every submit_message call across turns, so a late cancel on turn N-1 affects turn N - On timeout (either pre-call or mid-call), runtime explicitly calls cancel_event.set() before future.cancel() + synthesizing the timeout TurnResult. This upgrades #161's best-effort future.cancel() (which only cancels not-yet-started futures) to cooperative mid-flight cancel. 
Stop reason taxonomy after Stage A: 'completed' — turn committed, state mutated exactly once 'max_budget_reached' — overflow, state unchanged (#162) 'max_turns_reached' — capacity exceeded, state unchanged 'cancelled' — cancel_event observed, state unchanged (#164 Stage A) 'timeout' — synthesised by runtime, not engine (#161) The 'cancelled' vs 'timeout' split matters: - 'timeout' is the runtime's best-effort signal to the caller: deadline hit - 'cancelled' is the engine's confirmation: cancel was observed + honoured If the provider call wedges entirely (never reaches a checkpoint), the caller still sees 'timeout' and the thread is leaked — but any NEXT submit_message call on the same engine observes the event at entry and returns 'cancelled' immediately, preventing ghost-turn accumulation. This is the honest cooperative limit in Python threading land; true preemption requires async-native provider IO (future work, not Stage A). Tests (29 new tests, tests/test_submit_message_cancellation.py + tests/ test_run_turn_loop_cancellation.py): Engine-layer (12 tests): - TestCancellationBeforeCall (5): pre-set event returns 'cancelled' immediately; mutable_messages, transcript_store, usage, permission_denials all preserved - TestCancellationAfterBudgetCheck (1): cancel set mid-call (after projection, before commit) still honoured; output synthesised but state untouched - TestCancellationAfterCommit (2): post-commit cancel not observable (honest limit) BUT next call on same engine observes it + returns 'cancelled' - TestLegacyCallersUnchanged (3): cancel_event=None preserves #162 atomicity + max_turns contract with zero behaviour change - TestCancellationVsOtherStopReasons (2): cancel precedes max_turns check; cancel does not retroactively override a completed turn Runtime-layer (5 tests): - TestTimeoutPropagatesCancelEvent (3): submit_message receives a real Event object when deadline is set; None in legacy mode; timeout actually calls event.set() so in-flight threads 
observe at their next checkpoint - TestCancelEventSharedAcrossTurns (1): same event object passed to every turn (object identity check) — late cancel on turn N-1 must affect turn N Regression: 3 existing timeout test mocks updated to accept cancel_event kwarg (mocks that previously had signature (prompt, commands, tools, denials) now have (prompt, commands, tools, denials, cancel_event=None) since runtime passes cancel_event positionally on the timeout path). Full suite: 97 → 114 passing, zero regression. Closes ROADMAP #164 Stage A. What's explicitly NOT in Stage A: - Preemptive cancellation of wedged provider IO (requires asyncio-native provider path; larger refactor) - Timeout on the legacy unbounded run_turn_loop path (by design: legacy callers opt out of cancellation entirely) - CLI exposure of 'cancelled' as a distinct exit code (currently 'cancelled' maps to the same stop_reason != 'completed' break condition as others; CLI surface for cancel is a separate pinpoint if warranted)	2026-04-30 01:06:57 +09:00
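The two checkpoints described for #164 Stage A can be sketched as below. This is an illustrative toy, not the code in src/query_engine.py: the `CancellableEngine` class and its `provider` callable are hypothetical stand-ins, and only the `cancel_event` parameter, the checkpoint placement, and the 'cancelled' / 'completed' stop reasons are taken from the commit message.

```python
import threading
from typing import Callable, List, Optional


class CancellableEngine:
    """Hypothetical engine; `provider` stands in for the real provider call."""

    def __init__(self, provider: Callable[[str], str]) -> None:
        self.provider = provider
        self.mutable_messages: List[str] = []

    def submit_message(
        self, prompt: str, cancel_event: Optional[threading.Event] = None
    ) -> str:
        # Checkpoint 1 (entry): earliest possible return, nothing computed yet.
        if cancel_event is not None and cancel_event.is_set():
            return "cancelled"

        output = self.provider(prompt)

        # Checkpoint 2 (before mutation): catches a cancel that arrived
        # while the output was being computed. State is still untouched.
        if cancel_event is not None and cancel_event.is_set():
            return "cancelled"

        # Commit path: the only place where state is mutated.
        self.mutable_messages.append(prompt)
        self.mutable_messages.append(output)
        return "completed"
```

Passing `cancel_event=None` skips both checks, which mirrors the legacy path the commit describes. A provider call that never returns is not helped by this pattern; the sketch only shows where a cooperative check can sit.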
YeonGyu-Kim	6542dded66	chore: gitignore .port_sessions/ to prevent dogfood-run pollution Every 'claw flush-transcript' call without --directory writes to .port_sessions/<uuid>.json in CWD. Without a gitignore entry, every dogfood run leaves dozens of untracked files in the repo, masking real changes in 'git status' output. Now that #160/#166 ship structured session lifecycle commands and deterministic --session-id, this directory is purely transient by default — belongs in .gitignore.	2026-04-30 01:06:57 +09:00
YeonGyu-Kim	6b9879cd1b	fix: #166 — flush-transcript now accepts --directory / --output-format / --session-id; session-creation command parity with #160/#165 lifecycle triplet	2026-04-30 01:06:57 +09:00
YeonGyu-Kim	9c2901eb21	fix: #159 — run_turn_loop no longer hardcodes empty denied_tools; permission denials now parity-match bootstrap_session #159: multi-turn sessions had a silent security asymmetry: denied_tools were always empty in run_turn_loop, even though bootstrap_session inferred them from the routed matches. Result: any tool gated as 'destructive' (bash-family commands, rm, etc) would silently appear unblocked across all turns in multi-turn mode, giving a false 'clean' permission picture to any claw consuming TurnResult.permission_denials. Fix: compute denied_tools once at loop start via _infer_permission_denials, then pass the same denials to every submit_message call (both timeout and legacy unbounded paths). This mirrors the existing bootstrap_session pattern. Acceptance: run_turn_loop('run bash ls').permission_denials now matches what bootstrap_session returns — both infer the same denials from the routed matches. Multi-turn security posture is symmetric. Tests (tests/test_run_turn_loop_permissions.py, 2 tests): - test_turn_loop_surfaces_permission_denials_like_bootstrap: Symmetry check confirming both paths infer identical denials for destructive tools - test_turn_loop_with_continuation_preserves_denials: Denials inferred at loop start are passed consistently to all turns; captured via mock and verified non-empty Full suite: 82/82 passing, zero regression. Closes ROADMAP #159.	2026-04-30 01:06:57 +09:00
YeonGyu-Kim	b2a0c5da03	fix: #165 — load-session CLI now parity-matches list/delete (--directory, --output-format, typed JSON errors) The #160 session-lifecycle CLI triplet was asymmetric: list-sessions and delete-session accepted --directory + --output-format and emitted typed JSON error envelopes, but load-session had neither flag and dumped a raw Python traceback (including the SessionNotFoundError class name) on a missing session. Three concrete impacts this fix closes: 1. Alternate session-store locations (e.g. /tmp/claw-run-XXX/.port_sessions) were unreachable via load-session; claws had to chdir or monkeypatch DEFAULT_SESSION_DIR to work around it. 2. Not-found emitted a multi-line Python stack, not a parseable envelope. Claws deciding retry/escalate/give-up had only exit code 1 to work with. 3. The traceback leaked 'src.session_store.SessionNotFoundError' verbatim, coupling version-pinned claws to our internal exception class name. Now all three triplet commands accept the same flag pair and emit the same JSON error shape: Success (json mode): {"session_id": "alpha", "loaded": true, "messages_count": 3, "input_tokens": 42, "output_tokens": 99} Not-found: {"session_id": "missing", "loaded": false, "error": {"kind": "session_not_found", "message": "session 'missing' not found in /path", "directory": "/path", "retryable": false}} Corrupted file: {"session_id": "broken", "loaded": false, "error": {"kind": "session_load_failed", "message": "...", "directory": "/path", "retryable": true}} Exit code contract: - 0 on successful load - 1 on not-found (preserves existing $?) - 1 on OSError/JSONDecodeError (distinct 'kind' in JSON) Backward compat: legacy 'claw load-session ID' text output unchanged byte-for-byte. Only new behaviour is the flags and structured error path. 
Tests (tests/test_load_session_cli.py, 13 tests): - TestDirectoryFlagParity (2): --directory works + fallback to CWD/.port_sessions - TestOutputFormatFlagParity (2): json schema + text-mode backward compat - TestNotFoundTypedError (2): JSON envelope on not-found; no traceback in either mode; no internal class name leak - TestLoadFailedDistinctFromNotFound (1): corrupted file = session_load_failed with retryable=true, distinct from session_not_found - TestTripletParityConsistency (6): parametrised over [list, delete, load] * [--directory, --output-format] — explicit parity guard for future regressions Full suite: 80/80 passing, zero regression. Discovered via Jobdori dogfood sweep 2026-04-22 17:44 KST — ran 'claw load-session nonexistent' expecting a clean error, got a Python traceback. Filed #165 + fixed in same commit. Closes ROADMAP #165.	2026-04-30 01:06:57 +09:00
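As a rough illustration of why the typed envelope in #165 helps callers, the snippet below maps a load-session JSON payload to a retry decision. The envelope fields (`loaded`, `error.kind`, `error.retryable`) are copied from the commit message; the `decide` helper and its return labels are hypothetical and not part of claw-code.

```python
import json


def decide(stdout: str) -> str:
    """Map a load-session JSON envelope to 'ok', 'retry', or 'give_up'."""
    envelope = json.loads(stdout)
    if envelope.get("loaded"):
        return "ok"
    # On failure the envelope carries a typed error object; a missing
    # error is treated as non-retryable in this sketch.
    error = envelope.get("error", {})
    return "retry" if error.get("retryable") else "give_up"


not_found = (
    '{"session_id": "missing", "loaded": false, "error": '
    '{"kind": "session_not_found", "message": "...", '
    '"directory": "/path", "retryable": false}}'
)
corrupted = (
    '{"session_id": "broken", "loaded": false, "error": '
    '{"kind": "session_load_failed", "message": "...", '
    '"directory": "/path", "retryable": true}}'
)
print(decide(not_found), decide(corrupted))
```

Both failure cases share exit code 1, so the `retryable` flag is what lets a caller tell them apart without parsing prose.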
YeonGyu-Kim	11326905e9	fix: #163 — remove [turn N] suffix pollution from run_turn_loop; file #164 timeout-cancellation followup #163: run_turn_loop no longer injects f'{prompt} [turn N]' into follow-up prompts. The suffix was never defined or interpreted anywhere — not by the engine, not by the system prompt, not by any LLM. It looked like a real user-typed annotation in the transcript and made replay/analysis fragile. New behaviour: - turn 0 submits the original prompt (unchanged) - turn > 0 submits caller-supplied continuation_prompt if provided, else the loop stops cleanly — no fabricated user turn - added continuation_prompt: str | None = None parameter to run_turn_loop - added --continuation-prompt CLI flag for claws scripting multi-turn loops - zero '[turn' strings ever appear in mutable_messages or stdout now Behaviour change for existing callers: - Before: run_turn_loop(prompt, max_turns=3) submitted 3 turns ('prompt', 'prompt [turn 2]', 'prompt [turn 3]') - After: run_turn_loop(prompt, max_turns=3) submits 1 turn ('prompt') - To preserve old multi-turn behaviour, pass continuation_prompt='Continue.' or any structured follow-up text One existing timeout test (test_budget_is_cumulative_across_turns) updated to pass continuation_prompt so the cumulative-budget contract is actually exercised across turns instead of trivially satisfied by a one-turn loop. #164 filed: addresses reviewer feedback on #161. The wall-clock timeout bounds the caller-facing wait, but the underlying submit_message worker thread keeps running and can mutate engine state after the timeout TurnResult is returned. A cooperative cancel_event pattern is sketched in the pinpoint; real asyncio.Task.cancel() support will come once provider IO is async-native (larger refactor). 
Tests (tests/test_run_turn_loop_continuation.py, 8 tests): - TestNoTurnSuffixInjection (2): zero '[turn' strings in any submitted prompt, both default and explicit-continuation paths - TestContinuationDefaultStopsAfterTurnZero (2): default loops run exactly one turn; engine.submit_message called exactly once despite max_turns=10 - TestExplicitContinuationBehaviour (2): turn 0 = original, turn N = continuation verbatim; max_turns still respected - TestCLIContinuationFlag (2): CLI default emits only '## Turn 1'; --continuation-prompt wires through to multi-turn behaviour Full suite: 67/67 passing. Closes ROADMAP #163. Files #164.	2026-04-30 01:06:57 +09:00
YeonGyu-Kim	c07089eedd	fix: #162 — budget-overflow no longer corrupts session state in submit_message Previously, QueryEnginePort.submit_message() checked the token budget AFTER appending the prompt to mutable_messages, transcript_store, and permission_denials, and AFTER calling compact_messages_if_needed(). On overflow it set stop_reason='max_budget_reached' but the overflow turn was already committed. Any caller that persisted the session afterwards wrote the rejected prompt to disk — the session was silently poisoned even though the TurnResult said the turn never completed. Fix: - Restructure submit_message so the budget check early-returns BEFORE any mutation of mutable_messages, transcript_store, permission_denials, or total_usage. - The returned TurnResult.usage reflects pre-call state (overflow never advanced the usage counter). - Normal (in-budget) path unchanged: mutation happens exactly once, at the end, only on 'completed' results. This closes the atomicity gap: submit_message is now either 'turn committed' (stop_reason='completed') or 'turn rejected, state untouched' (stop_reason in {'max_budget_reached', 'max_turns_reached'}). Callers can safely retry with a fresh budget or a smaller prompt without worrying about phantom committed turns from prior rejections. Tests (tests/test_submit_message_budget.py, 10 tests): - TestBudgetOverflowDoesNotMutate (5): mutable_messages / transcript / permission_denials / total_usage / TurnResult.usage all pre-mutation after overflow - TestOverflowPersistence (2): first-turn overflow persists empty session; successful-turn-then-overflow persists only the successful turn - TestEngineUsableAfterOverflow (2): subsequent in-budget call still works with no residue; repeated overflows don't accumulate hidden state - TestNormalPathStillCommits (1): regression guard — non-overflow path still commits mutable_messages/transcript/usage as expected Full suite: 59/59 passing, zero regression. Blocker: none. Closes ROADMAP #162.	2026-04-30 01:06:57 +09:00
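The check-before-mutate ordering that #162 describes can be illustrated with a small model. This is a sketch under stated assumptions, not QueryEnginePort itself: `BudgetEngine`, the word-count "token" estimate, and the echo output are all hypothetical, while the stop reasons and the rule that rejected turns leave state untouched come from the commit message.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TurnResult:
    output: str
    stop_reason: str
    usage: int


@dataclass
class BudgetEngine:
    """Hypothetical stand-in engine; 'tokens' are just whitespace-split words."""

    max_budget_tokens: int
    mutable_messages: List[str] = field(default_factory=list)
    total_usage: int = 0

    def submit_message(self, prompt: str) -> TurnResult:
        output = f"echo: {prompt}"
        projected = self.total_usage + len(prompt.split()) + len(output.split())
        if projected > self.max_budget_tokens:
            # Reject before touching any state; usage is the pre-call value.
            return TurnResult("", "max_budget_reached", self.total_usage)
        # Commit exactly once, only on the completed path.
        self.mutable_messages.append(prompt)
        self.total_usage = projected
        return TurnResult(output, "completed", self.total_usage)
```

With this ordering, persisting the engine after a rejected turn writes only the turns that actually completed.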
YeonGyu-Kim	af9723cf0a	fix: #161 — wall-clock timeout for run_turn_loop; stalled turns now abort with stop_reason='timeout' Previously, run_turn_loop was bounded only by max_turns (turn count). If engine.submit_message stalled — slow provider, hung network, infinite stream — the loop blocked indefinitely with no cancellation path. Claws calling run_turn_loop in CI or orchestration had no reliable way to enforce a deadline; the loop would hang until OS kill or human intervention. Fix: - Add timeout_seconds parameter to run_turn_loop (default None = legacy unbounded). - When set, each submit_message call runs inside a ThreadPoolExecutor and is bounded by the remaining wall-clock budget (total across all turns, not per-turn). - On timeout, synthesize a TurnResult with stop_reason='timeout' carrying the turn's prompt and routed matches so transcripts preserve orchestration context. - Exhausted/negative budget short-circuits before calling submit_message. - Legacy path (timeout_seconds=None) bypasses the executor entirely — zero overhead for callers that don't opt in. CLI: - Added --timeout-seconds flag to 'turn-loop' command. - Exit code 2 when the loop terminated on timeout (vs 0 for completed), so shell scripts can distinguish 'done' from 'budget exhausted'. Tests (tests/test_run_turn_loop_timeout.py, 6 tests): - Legacy unbounded path unchanged (timeout_seconds=None never emits 'timeout') - Hung submit_message aborted within budget (0.3s budget, 5s mock hang → exit <1.5s) - Budget is cumulative across turns (0.6s budget, 0.4s per turn, not per-turn) - timeout_seconds=0 short-circuits first turn without calling submit_message - Negative timeout treated as exhausted (guard against caller bugs) - Timeout TurnResult carries correct prompt, matches, UsageSummary shape Full suite: 49/49 passing, zero regression. Blocker: none. Closes ROADMAP #161.	2026-04-30 01:06:57 +09:00
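A cumulative wall-clock budget of the kind #161 describes might look roughly like this. The function below is a simplified, hypothetical `run_with_deadline`, not the real run_turn_loop: it returns (prompt, stop_reason) pairs instead of TurnResult objects, and `submit` is any callable supplied by the caller. The standard-library calls used (`ThreadPoolExecutor`, `Future.result(timeout=...)`, `time.monotonic`) are real.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def run_with_deadline(submit, prompts, timeout_seconds):
    """Run submit(prompt) per prompt under one shared wall-clock budget."""
    deadline = time.monotonic() + timeout_seconds
    results = []
    executor = ThreadPoolExecutor(max_workers=1)
    try:
        for prompt in prompts:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                # Exhausted or negative budget: never call submit.
                results.append((prompt, "timeout"))
                break
            future = executor.submit(submit, prompt)
            try:
                future.result(timeout=remaining)
                results.append((prompt, "completed"))
            except FutureTimeout:
                # cancel() only succeeds if the call has not started yet.
                future.cancel()
                results.append((prompt, "timeout"))
                break
    finally:
        # Do not block on a stalled worker thread.
        executor.shutdown(wait=False)
    return results
```

The worker thread is not stopped when the deadline passes; it keeps running in the background, which is the gap the later #164 commit addresses.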
YeonGyu-Kim	b88c899ceb	feat(#160 ): wire claw list-sessions and delete-session CLI commands Closes the last #160 gap: claws can now manage session lifecycle entirely through the CLI without filesystem hacks. New commands: - claw list-sessions [--directory DIR] [--output-format text|json] Enumerates stored session IDs. JSON mode emits {sessions, count}. Missing/empty directories return empty list (exit 0), not an error. - claw delete-session SESSION_ID [--directory DIR] [--output-format text|json] Idempotent: not-found is exit 0 with status='not_found' (no raise). Partial-failure: exit 1 with typed JSON error envelope: {session_id, deleted: false, error: {kind, message, retryable}} The 'session_delete_failed' kind is retryable=true so orchestrators know to retry vs escalate. Public API surface extended in src/__init__.py: - list_sessions, session_exists, delete_session - SessionNotFoundError, SessionDeleteError Tests added (tests/test_porting_workspace.py): - test_list_sessions_cli_runs: text + json modes against tempdir - test_delete_session_cli_idempotent: first call deleted=true, second call deleted=false (exit 0, status=not_found) - test_delete_session_cli_partial_failure_exit_1: permission error surfaces as exit 1 + typed JSON error with retryable=true All 43 tests pass. The session storage abstraction chapter is closed: - storage layer decoupled from claw code (#160 initial impl) - delete contract hardened + caller-audited (#160 hardening pass) - CLI wired with idempotency preserved at exit-code boundary (this commit)	2026-04-30 01:06:57 +09:00
YeonGyu-Kim	e6ea4d248d	fix(#160 ): harden delete_session contract — idempotency, race-safety, typed partial-failure Addresses review feedback on initial #160 implementation: 1. delete_session() contract now explicit: - Idempotent: delete(x); delete(x) is safe, second call returns False - Race-safe: TOCTOU between exists()/unlink() eliminated via unlink-then-catch - Partial-failure typed: permission/IO errors wrapped in SessionDeleteError (OSError subclass) so callers can distinguish 'not found' (return False) from 'could not delete' (raise) 2. New SessionDeleteError class for partial-failure surfacing. Distinct from SessionNotFoundError (KeyError subclass for missing loads). 3. Caller audit confirmed: no code outside session_store globs .port_sessions or imports DEFAULT_SESSION_DIR. Storage layout is fully encapsulated. 4. Added tests/test_session_store.py — 18 tests covering: - list_sessions: empty/missing/sorted/non-json filter - session_exists: true/false/missing-dir - load_session: SessionNotFoundError typing (KeyError subclass, not FileNotFoundError) - delete_session idempotency: first/second/never-existed calls - delete_session partial-failure: SessionDeleteError wraps OSError - delete_session race-safety: concurrent deletion returns False, not raise - Full save->list->exists->load->delete roundtrip All 18 tests pass. Merge-ready: contract documented, caller-audited, race-safe.	2026-04-30 01:06:57 +09:00
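The unlink-then-catch contract described in e6ea4d248d can be sketched as follows. This is an assumption-laden illustration rather than the code in session_store: the `<id>.json` layout and the `SessionDeleteError(OSError)` hierarchy come from the commit messages, while the required `directory` argument and the error text are made up for the example.

```python
from pathlib import Path


class SessionDeleteError(OSError):
    """Raised when a session file exists but could not be removed."""


def delete_session(session_id: str, directory: Path) -> bool:
    """Return True if a file was removed, False if it was already gone."""
    path = Path(directory) / f"{session_id}.json"
    try:
        # Unlink first and classify the failure afterwards. There is no
        # exists() check, so no window between checking and deleting.
        path.unlink()
    except FileNotFoundError:
        # Missing file, including one removed concurrently: idempotent False.
        return False
    except OSError as exc:
        # Any other IO failure is a partial failure the caller must see.
        raise SessionDeleteError(
            f"could not delete session {session_id!r}: {exc}"
        ) from exc
    return True
```

`FileNotFoundError` is itself an `OSError` subclass, so it has to be caught before the broader `except OSError` clause for "not found" to stay distinct from "could not delete".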
YeonGyu-Kim	0c600e76a7	fix: #160 — add list_sessions, session_exists, delete_session to session_store - list_sessions(directory=None) -> list[str]: enumerate stored session IDs - session_exists(session_id, directory=None) -> bool: check existence without FileNotFoundError - delete_session(session_id, directory=None) -> bool: unlink a session file - load_session now raises typed SessionNotFoundError (subclass of KeyError) instead of FileNotFoundError - Claws can now manage session lifecycle without reaching past the module to glob filesystem Closes ROADMAP #160. Acceptance: claw can call list_sessions(), session_exists(id), delete_session(id) without importing Path or knowing .port_sessions/<id>.json layout.	2026-04-30 01:06:57 +09:00
YeonGyu-Kim	424d5aff74	file: #161 — run_turn_loop has no wall-clock timeout, stalled turn blocks indefinitely	2026-04-30 01:06:57 +09:00
Bellman	f65b2b4f0e	Merge pull request #2861 from ultraworkers/docs/roadmap-341-tasks-json-dual-vocab docs(roadmap): add #341 — tasks JSON error envelope uses dual vocabulary	2026-04-30 01:06:27 +09:00
Yeachan-Heo	f4b74e89dd	Document why /tasks JSON errors need one stdout contract Constraint: ROADMAP-only dogfood follow-up for 16:00 nudge on rebuilt claw git_sha 58569131 Rejected: code change in the command dispatcher | request was specifically to add one ROADMAP.md-only item Confidence: high Scope-risk: narrow Directive: Keep /tasks distinct from #340; this is unsupported command stub JSON, not session help Tested: git diff --check; scripts/fmt.sh --check Not-tested: runtime behavior change, because this commit only documents the gap	2026-04-29 16:02:10 +00:00
Bellman	5856913104	Merge pull request #2859 from ultraworkers/docs/roadmap-340-session-help-json-stderr docs(roadmap): add #340 — session help JSON error envelope goes to stderr	2026-04-30 00:54:42 +09:00
Yeachan-Heo	d45a0d2f5b	Document stderr-only session help JSON contract gap Capture the dogfood evidence as a roadmap item so the stdout JSON error-envelope contract can be fixed and regression-tested later. Constraint: User requested exactly one ROADMAP.md-only item #340 from current origin/main. Confidence: high Scope-risk: narrow Tested: git diff --check; scripts/fmt.sh --check Not-tested: Runtime behavior unchanged; documentation-only roadmap entry.	2026-04-29 15:31:59 +00:00
Bellman	dc47482e40	Merge pull request #2857 from ultraworkers/docs/roadmap-339-v2 docs(roadmap): add #339 — session delete not resume-safe, blocks GC automation	2026-04-30 00:26:29 +09:00
YeonGyu-Kim	9537c97231	docs(roadmap): add #339 — session delete not resume-safe, blocks GC automation	2026-04-30 00:18:28 +09:00
Bellman	f56a5afcf7	Merge pull request #2856 from ultraworkers/docs/roadmap-337-workspace-dirty-lifecycle-detail-restore docs(roadmap): restore #337 workspace dirty lifecycle detail gap	2026-04-30 00:14:48 +09:00
Yeachan-Heo	3efaf551ed	Restore roadmap GC lifecycle detail gap Constraint: ROADMAP.md-only restore of lost #337 from PR #2852 / Jobdori dogfood evidence Rejected: Renumbering adjacent items | preserving existing #338 and surrounding roadmap entries keeps history stable Confidence: high Scope-risk: narrow Directive: Keep #337 before #338 and do not collapse the dirty-file detail requirement into the broader help/status backlog Tested: git diff --check; scripts/fmt.sh --check Not-tested: Product behavior changes; documentation-only change	2026-04-29 15:09:40 +00:00
Bellman	30c9b438ef	Merge pull request #2853 from ultraworkers/docs/roadmap-338-help-json-field-drift docs(roadmap): add #338 for help JSON field drift	2026-04-30 00:06:24 +09:00
Yeachan-Heo	587bb18572	docs(roadmap): add #338 for help JSON field drift Constraint: Respond to 14:30 dogfood nudge with one direct claw-code pinpoint. Evidence: rebuilt actual debug binary at git_sha 24ccb59b; compared top-level help --output-format json with resume-safe /help --output-format json. Finding: same help surface uses message in top-level JSON and text in slash/resume JSON. Tested: cargo run --manifest-path rust/Cargo.toml --bin claw -- version --output-format json; ./rust/target/debug/claw help --output-format json; ./rust/target/debug/claw --resume latest /help --output-format json; git diff --check; scripts/fmt.sh --check. Not-tested: full Rust suite; roadmap-only documentation change.	2026-04-29 14:34:26 +00:00
Bellman	24ccb59bd2	Merge pull request #2851 from ultraworkers/docs/roadmap-329-slash-agents-json-opacity docs(roadmap): add #329 for slash agents JSON opacity	2026-04-29 23:33:47 +09:00
Yeachan-Heo	0e8e75ef75	docs(roadmap): add #329 for slash agents JSON opacity Constraint: Respond to dogfood nudge with exactly one concrete clawability pinpoint from direct claw-code use. Evidence: rebuilt actual debug binary at git_sha 0f7578c0; compared resume-safe /agents --output-format json with top-level claw agents --output-format json. Finding: slash /agents JSON only exposes kind,text while top-level agents JSON exposes structured agents[] inventory and provenance. Tested: cargo run --manifest-path rust/Cargo.toml --bin claw -- version --output-format json; ./rust/target/debug/claw --resume latest /agents --output-format json; ./rust/target/debug/claw agents --output-format json; git diff --check; scripts/fmt.sh --check. Not-tested: full Rust suite; roadmap-only documentation change.	2026-04-29 14:01:36 +00:00
Bellman	0f7578c064	Merge pull request #2849 from ultraworkers/docs/roadmap-328-dogfood-pinpoint Add ROADMAP #328 for native-agent source provenance	2026-04-29 22:35:51 +09:00
Yeachan-Heo	213d406cbf	Record why native-agent provenance needs dogfood follow-up Constraint: Scope requested ROADMAP.md only with exactly one new #328 pinpoint from direct claw dogfood. Rejected: Implementing the agents-help fix now | user requested roadmap-only evidence item. Confidence: high Scope-risk: narrow Directive: Keep agent help source roots derived from the same loader registry as agents list; do not hand-maintain a divergent root list. Tested: cargo run --manifest-path rust/Cargo.toml --bin claw -- version --output-format json; ./rust/target/debug/claw version --output-format json; ./rust/target/debug/claw agents help --output-format json; ./rust/target/debug/claw agents --output-format json; git diff --check; scripts/fmt.sh --check Not-tested: Full Rust test suite; roadmap-only documentation change.	2026-04-29 13:33:23 +00:00
Bellman	ee85fed6ca	Merge pull request #2847 from ultraworkers/docs/roadmap-327-dogfood-pinpoint Add ROADMAP #327 for MCP help source mismatch	2026-04-29 22:06:45 +09:00
Yeachan-Heo	3a34d83749	Record why MCP source help needs dogfood follow-up Constraint: Scope limited to ROADMAP.md and one new pinpoint #327 from actual rebuilt claw dogfood. Rejected: Code fix in this branch | user requested roadmap-only filing. Confidence: high Scope-risk: narrow Directive: Keep mcp help source lists derived from actual config discovery, not hard-coded partial docs. Tested: ./rust/target/debug/claw version --output-format json; ./rust/target/debug/claw mcp --help; ./rust/target/debug/claw mcp help --output-format json; temp .claw.json mcp list proof; git diff --check; scripts/fmt.sh --check Not-tested: Full Rust test suite, documentation-only change.	2026-04-29 13:02:27 +00:00
Bellman	981aff7c8b	Merge pull request #2845 from ultraworkers/docs/roadmap-326-dogfood-pinpoint docs(roadmap): add #326 pane inventory opacity pinpoint	2026-04-29 21:35:26 +09:00
Yeachan-Heo	c94940effa	docs: add roadmap 326 pane inventory opacity	2026-04-29 12:33:36 +00:00
Bellman	b90875fa8e	Merge pull request #2843 from ultraworkers/docs/roadmap-325-help-json-schema docs(roadmap): add #325 help json schema opacity pinpoint	2026-04-29 21:05:12 +09:00
Yeachan-Heo	2567cbcc78	Pin help JSON schema opacity for automation Document the dogfood gap where help JSON stays parseable but hides command metadata inside a prose message, so future implementation can expose machine-readable command, slash-command, and resume-safety fields. Constraint: user requested ROADMAP.md-only pinpoint for issue #325 from origin/main d607ff36. Rejected: implementing the schema now | requested fix shape is roadmap documentation only. Confidence: high Scope-risk: narrow Directive: keep message for humans while adding schema/versioned structured help metadata when implementing. Tested: git diff --check; scripts/fmt.sh --check Not-tested: runtime CLI behavior unchanged by docs-only change	2026-04-29 12:02:14 +00:00
Bellman	d607ff3674	Merge pull request #2840 from ultraworkers/docs/roadmap-324-stale-binary-provenance docs(roadmap): add #324 stale binary provenance pinpoint	2026-04-29 20:34:27 +09:00
Yeachan-Heo	cdf6282965	Record why stale binary provenance needs a roadmap pin Constraint: Documentation-only follow-up from current main e7074f47 after PR #2838; edit scope limited to ROADMAP.md. Rejected: Implementing provenance detection now | user requested roadmap entry only. Confidence: high Scope-risk: narrow Directive: Future implementation should compare embedded build git_sha/build date to workspace HEAD/dirty state without leaking secrets. Tested: git diff --check; scripts/fmt.sh --check Not-tested: Runtime provenance behavior; this commit only records the roadmap requirement.	2026-04-29 11:31:19 +00:00
Bellman	e7074f47ee	Merge pull request #2838 from ultraworkers/docs/roadmap-322-323-clean docs(roadmap): add #322 #323 — json stream corruption and session identity contradiction	2026-04-29 19:40:50 +09:00
YeonGyu-Kim	9468383b67	docs(roadmap): add #322 #323 — json stream corruption and session identity contradiction	2026-04-29 19:38:00 +09:00
Bellman	1da2781816	Merge pull request #2835 from ultraworkers/docs/roadmap-249-issue-github-oauth-opacity docs(roadmap): add #249 issue GitHub OAuth opacity pinpoint	2026-04-29 19:31:50 +09:00
Yeachan-Heo	9037430d52	docs(roadmap): add #249 issue github oauth opacity pinpoint	2026-04-29 10:01:16 +00:00
Bellman	8e22f757d8	Merge pull request #2834 from ultraworkers/docs/roadmap-248-prompt-mode-silent-hang docs(roadmap): add #248 prompt-mode silent-hang pinpoint	2026-04-29 18:31:48 +09:00
Yeachan-Heo	7676b376ae	docs(roadmap): add #248 prompt-mode silent-hang pinpoint	2026-04-29 08:24:37 +00:00
Sigrid Jin (ง'̀-'́)ง oO	1011a83823	Merge pull request #2829 from ultraworkers/fix/issue-320-session-lifecycle-classification Fix session lifecycle classification for idle tmux shells	2026-04-29 16:11:58 +09:00
Yeachan-Heo	1376d92064	Filter stub commands from resume-safe help Keep claw --help's resume-safe slash command summary aligned with the interactive command list by filtering STUB_COMMANDS and adding regression coverage.	2026-04-29 03:31:34 +00:00
Yeachan-Heo	be53e04671	Classify saved sessions by live work rather than pane existence Operator status previously treated any tmux pane in a workspace as equivalent to active work. The new classifier uses tmux pane command/path metadata as a soft signal, treats plain shells as idle, and adds dirty-worktree abandoned markers to status and session-list output for clawhip consumers. Constraint: Keep issue #320 prototype minimal and additive without new dependencies Rejected: Screen-scraping pane output | fragile and broader than needed for lifecycle classification Confidence: high Scope-risk: narrow Tested: cargo test -p rusty-claude-cli Tested: cargo check -p rusty-claude-cli Not-tested: cargo clippy -p rusty-claude-cli --all-targets -- -D warnings is blocked by pre-existing commands crate clippy::unnecessary_wraps warnings	2026-04-28 13:12:37 +00:00
Yeachan-Heo	cb56dc12ab	Document Rust formatting wrapper Make scripts/fmt.sh robust to caller cwd and document it as the supported repo-root formatting entrypoint for the Rust workspace.	2026-04-28 09:38:46 +00:00
Yeachan-Heo	71686a20fc	Resolve fmt wrapper path from its own directory The formatting wrapper should remain safe when invoked through different current directories or shell contexts, so resolve the script directory before entering the Rust workspace and forwarding cargo fmt arguments. Constraint: Wrapper must be runnable from repo root while forwarding flags like --check Rejected: Leave relative dirname cd | less robust if invocation context changes Confidence: high Scope-risk: narrow Tested: scripts/fmt.sh --check Tested: git diff --check	2026-04-28 09:38:40 +00:00
Yeachan-Heo	07992b8a1b	Make Rust formatting guidance runnable from repo root The Rust crate layout expects formatting to run from the rust directory, so add a root-level wrapper that preserves the working command while forwarding user flags like --check. Documentation now points contributors at the wrapper instead of the misleading virtual-workspace manifest invocation. Constraint: Root-level cargo fmt --manifest-path rust/Cargo.toml is misleading for this virtual workspace Rejected: Document cd rust && cargo fmt directly | a root wrapper gives one stable repo-root command Confidence: high Scope-risk: narrow Tested: scripts/fmt.sh --check Tested: git diff --check	2026-04-28 09:38:08 +00:00
Yeachan-Heo	74ea754d29	Restore Rust formatting compliance Run rustfmt from the Rust workspace so CI format checks pass without changing behavior. Constraint: Scope is formatting-only across tracked Rust files Confidence: high Scope-risk: narrow Tested: cd rust && cargo fmt --check Tested: git diff --check	2026-04-28 09:19:16 +00:00
Yeachan-Heo	77afde768c	Clarify allowed tool status handling Reject empty --allowedTools inputs instead of treating them as an empty restriction, and surface status JSON metadata that distinguishes default unrestricted tools from flag-provided allow lists. Confidence: high Scope-risk: narrow Tested: cargo test -p rusty-claude-cli rejects_empty_allowed_tools_flag -- --nocapture Tested: cargo test -p tools allowed_tools_rejects_empty_token_lists -- --nocapture Tested: cargo check -p rusty-claude-cli -p tools Tested: cargo test -p rusty-claude-cli -p tools Not-tested: full workspace cargo fmt --check is blocked by pre-existing unrelated formatting drift	2026-04-28 05:44:14 +00:00

966 Commits