claw-code

mirror of https://github.com/ultraworkers/claw-code.git synced 2026-04-24 13:08:11 +08:00

Author	SHA1	Message	Date
YeonGyu-Kim	8322bb8ec6	roadmap: #166 closed — SCHEMAS.md source misdoc fixed (P0 root cause) The aspirational SCHEMAS.md doc (v2.0 target) was the source of truth misdocumentation. Three downstream docs (USAGE, ERROR_HANDLING, CLAUDE) inherited the false claim that v1.0 binary emits common fields it doesn't actually emit. Fixing SCHEMAS.md at the source eliminates the root cause for all four P0 instances. Doc-truthfulness P0 family now complete: 4/4 closed, root cause identified + fixed. All fixes shipped within 6 cycles (#76 audit → #82 execution).	2026-04-23 05:21:22 +09:00
YeonGyu-Kim	86db2e0b03	roadmap: #165 closed with evidence (cycle #81, commit 1a03359) CLAUDE.md Option A implemented. P0 doc-truthfulness family now at 3 closed + 0 open (all 3 fixed within the same dogfood session). Taxonomy refinement added: P0 doc-truthfulness has three distinct subclasses: - active misdocumentation (false sentence) — USAGE.md cycle #78 - copy-paste trap (broken example code) — ERROR_HANDLING.md cycle #79 - target/current boundary collapse (v2.0 as v1.0) — CLAUDE.md cycle #81 All three related to #164 (envelope divergence). Root cause consistent across family; remedies differ per subclass.	2026-04-23 05:11:42 +09:00
YeonGyu-Kim	b34f370645	roadmap: #165 filed — CLAUDE.md documents v2.0 schema as current (P0 active misdoc) CLAUDE.md claims 'Common fields (all envelopes): timestamp, command, exit_code, output_format, schema_version' but the actual binary v1.0 doesn't emit these. This is aspirational (v2.0 target from SCHEMAS.md) documented as current behavior in a file that's supposed to describe the Python reference harness. Filed as 3rd member of doc-truthfulness P0 family (joins #78, #79). Both options documented: update CLAUDE.md for v1.0 OR clarify it's v2.0 aspirational. Recommendation: Option A (keep CLAUDE.md truthful about actual validation). Part of broader #164 family (envelope schema divergence across all docs).	2026-04-23 05:10:01 +09:00
YeonGyu-Kim	a9e87de905	roadmap: doctrine refinement — doc-truthfulness severity scale (cycle #79) Formalizes a 4-level severity scale for documentation-vs-implementation divergence: - P0: Active misdocumentation (consumer code breaks) — immediate fix - P1: Stale docs (consumer confused) — high priority - P2: Incomplete docs (friction, eventual success) — medium - P3: Terminology drift (confusion but survivable) — low Parallel to diagnostic-strictness scale (cycles #57–#69). Both are 'truth-over-convenience' constraints. Evidence: cycles #78–#79 found 2 P0 instances in USAGE.md and ERROR_HANDLING.md, both related to JSON envelope shape. Root cause: SCHEMAS.md is aspirational (v2.0), binary still emits v1.0, docs needed to be empirical not aspirational. Going forward: doc audits compare against actual binary, flag P0 violations immediately, link forward to migration plans (FIX_LOCUS_164.md).	2026-04-23 05:00:55 +09:00
YeonGyu-Kim	5b9097a7ac	roadmap: #164 filed — JSON envelope schema-vs-binary divergence Binary emits different envelope shape than SCHEMAS.md documents: - Missing: timestamp, command, exit_code, output_format, schema_version - Wrong placement: kind is top-level, not nested under error - Extra: type:error field not in schema - Wrong type: error is string, not object with operation/target/retryable Additional issue: 'kind' field is semantically overloaded (verb-id in success envelopes, error-kind in error envelopes) — violates typed contract. Filed as 7th member of typed-error family (joins #102, #121, #127, #129, #130, #245). Recommended fix: Option A — update binary to match schema (principled design).	2026-04-23 04:31:53 +09:00
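The four divergences listed in #164 can be checked mechanically against a captured envelope. The following is a minimal Python sketch, not code from the repository: the function name `envelope_divergence` and the report keys are hypothetical, the common-field list is copied from the commit message, and the sample envelope is the one shown in the #247 repro further down this log.

```python
import json

# Common fields that SCHEMAS.md (v2.0 target) lists for every envelope,
# as quoted in the #164 commit message.
SCHEMA_COMMON_FIELDS = ("timestamp", "command", "exit_code", "output_format", "schema_version")


def envelope_divergence(raw: str) -> dict:
    """Report how one JSON envelope differs from the documented shape.

    Hypothetical helper for illustration; the report keys are assumptions.
    """
    envelope = json.loads(raw)
    error = envelope.get("error")
    return {
        # Fields the schema promises but the envelope does not carry.
        "missing": [f for f in SCHEMA_COMMON_FIELDS if f not in envelope],
        # The schema nests `kind` under `error`; the binary puts it at top level.
        "kind_at_top_level": "kind" in envelope,
        # The schema expects an object; the binary emits a plain string.
        "error_is_string": isinstance(error, str),
        # `type` is emitted by the binary but is not part of the schema.
        "extra_type_field": "type" in envelope,
    }


# Envelope taken from the #247 repro in this log.
sample = '{"error":"prompt subcommand requires a prompt string","hint":null,"kind":"unknown","type":"error"}'
report = envelope_divergence(sample)
print(report)
```

A check of this kind compares docs against what the binary actually emits, which is the audit direction the doc-truthfulness commits call for.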
YeonGyu-Kim	69a15bd707	roadmap: cycle #75 finding — rebase-bridge pattern breaks on multi-conflict branches Attempted cherry-pick of #248 (1 commit) onto main. Encountered 2 conflict zones in main.rs (test definitions + error classification). Manual regex cleanup left orphaned diff markers that Rust compiler rejected. Decision: Rebase-bridge works for 1-conflict branches, but 2+ conflicts in 12K+-line files require author context. Revised strategy: push main to origin, request branch authors rebase locally with IDE support, then merge from updated origin branches. Estimated timeline: 30 min for branch authors to rebase 8 branches in parallel.	2026-04-23 04:26:21 +09:00
YeonGyu-Kim	41c87309f3	roadmap: cycle #74 checkpoint — rebase blocker identified Fresh dogfood found no new pinpoints. All core verbs working correctly. Blocker: 8 remaining review-ready branches on origin have conflicts with cycle #72's 4 merges. Root cause: remote branches predated the merge chain. Example: feat/jobdori-127-verb-suffix-flags rebase fails on commit 3/3 because cycle #72 added 15+ new LocalHelpTopic variants. Recommend: coordinate with branch authors to rebase against new main. Cycle #74 will post integration checkpoint + queue status.	2026-04-23 04:17:54 +09:00
YeonGyu-Kim	a02527826e	roadmap: #163 closed as already-fixed — #130e-A (merged cycle #72) handled help --help Backlog-truthfulness (cycle #60) validated: fresh dogfood on current main confirmed #163 was closed by cycle #72's help-parity chain merge. Zero duplicate work. Cleanup: removed /tmp/jobdori-163 worktree and fix/jobdori-163-help-help-selfref branch.	2026-04-23 04:07:37 +09:00
YeonGyu-Kim	a52a361e16	roadmap: cycle #72 — 4 merges landed, 9 branches integrated via MERGE_CHECKLIST runbook	2026-04-23 04:04:57 +09:00
YeonGyu-Kim	499d84c04a	roadmap: #163 filed — claw help --help emits missing_credentials instead of help topic (help-parity family)	2026-04-23 04:01:24 +09:00
YeonGyu-Kim	6d1c24f9ee	roadmap: doctrine refinement — three-tier artifact classification (doc → support → execution) per cycle #70 framing	2026-04-23 03:56:48 +09:00
YeonGyu-Kim	0527dd608d	roadmap: #161 closed — shipped on fix/jobdori-161-worktree-git-sha (cycle #69)	2026-04-23 03:46:37 +09:00
YeonGyu-Kim	d64c7144ff	roadmap: doctrine extension — CLI discoverability chain completion as doctrine (from #162 closure framing)	2026-04-23 03:40:43 +09:00
YeonGyu-Kim	2a82cf2856	roadmap: #162 closed — shipped on docs/jobdori-162-usage-verb-parity (cycle #68)	2026-04-23 03:39:36 +09:00
YeonGyu-Kim	de7a0ffde6	roadmap: #162 filed — USAGE.md missing docs for dump-manifests, bootstrap-plan, acp, export verbs (parity audit)	2026-04-23 03:37:09 +09:00
YeonGyu-Kim	36883ba4c2	roadmap: cluster update — #161 elevated to diagnostic-strictness family (per gaebal-gajae reframe)	2026-04-23 03:35:03 +09:00
YeonGyu-Kim	f18f45c0cf	roadmap: #161 filed — claw version stale SHA in worktrees (build.rs rerun-if-changed misses worktree HEAD)	2026-04-23 03:31:40 +09:00
YeonGyu-Kim	946e43e0c7	roadmap: doctrine extension — integration support artifacts as first-class deliverable at scale (from #64 framing)	2026-04-23 03:27:19 +09:00
YeonGyu-Kim	ad1cf92620	roadmap: canonical worked example of doctrine loop (#61–#63 sequence preserved for future claws)	2026-04-23 03:17:41 +09:00
YeonGyu-Kim	6a3913e278	roadmap: #160 SHIPPED — reserved-verb classification landed (cycle #63)	2026-04-23 03:16:34 +09:00
YeonGyu-Kim	b54eacaa6e	roadmap: cycle-pattern doctrine — how violation → reframe → protocol loops create self-enforcing doctrine (from #61–#62)	2026-04-23 03:09:13 +09:00
YeonGyu-Kim	51cee23a27	roadmap: principle — integration bandwidth as constraint when queue is saturated (from #62 framing)	2026-04-23 03:06:22 +09:00
YeonGyu-Kim	35fee5ecde	roadmap: #160 investigation update — verb classification table needed for clean fix	2026-04-23 03:04:50 +09:00
YeonGyu-Kim	f034b01733	roadmap: #160 filed — resume with positional args falls through to Prompt dispatch (#251 family)	2026-04-23 03:02:30 +09:00
YeonGyu-Kim	c4054d2fa3	roadmap: principle — backlog truthfulness is execution speed (from #60 framing)	2026-04-23 02:56:23 +09:00
YeonGyu-Kim	7bd91096a8	roadmap: #136 + #153b marked CLOSED — compact+json already correct, PATH docs comprehensive	2026-04-23 02:55:22 +09:00
YeonGyu-Kim	196fe6b493	roadmap: #136 marked CLOSED — compact+json dispatch already correct	2026-04-23 02:54:41 +09:00
YeonGyu-Kim	dc8b275c9f	roadmap: principle — cycle cadence (hygiene cycles are first-class, from #59 framing)	2026-04-23 02:45:43 +09:00
YeonGyu-Kim	8f4f215e27	roadmap: diagnostic-strictness audit checklist (from cycles #57-#58)	2026-04-23 02:38:06 +09:00
YeonGyu-Kim	86b98d07e9	roadmap: principle — diagnostic surfaces must be at least as strict as runtime (from #122 framing)	2026-04-23 02:25:45 +09:00
YeonGyu-Kim	cb8839e050	roadmap: cluster closure + defer #155/#156 design questions (config section validation, mcp/agents soft-warning)	2026-04-23 02:18:46 +09:00
YeonGyu-Kim	41b0006eea	roadmap: cluster closure note — help-parity family complete (#130c, #130d, #130e)	2026-04-23 02:10:07 +09:00
YeonGyu-Kim	762e9bb212	roadmap: file #130e — help-parity sweep reveals 5 additional anomalies (3 dispatch-order, 2 surface)	2026-04-23 02:00:59 +09:00
YeonGyu-Kim	5e29430d4f	roadmap: file #130d — config command silently ignores --help, displays config dump instead	2026-04-23 01:53:31 +09:00
YeonGyu-Kim	0d8adceb67	roadmap: file #130c — pure-local commands reject --help as extra argument (diff, config, status)	2026-04-23 01:44:11 +09:00
YeonGyu-Kim	9eba71da81	roadmap: file #153b — PATH setup guide follow-up to #153	2026-04-23 01:35:24 +09:00
YeonGyu-Kim	ef5aae3ddd	roadmap: file #130b — filesystem errors lose context, emit generic errno strings (export command case)	2026-04-23 01:33:25 +09:00
YeonGyu-Kim	f05bc037de	docs(#250, #251): Align SCHEMAS.md with actual binary, downgrade #250 to scope-reduced Cycle #46 follow-up to cycle #45's #251 implementation. Closes #250's implementation urgency by aligning docs with reality. SCHEMAS.md Updates: For each of the 4 session-management verbs, added: 1. Status marker (Implemented or Stub only) 2. Actual binary envelope (shape produced by the #251-fixed binary) 3. Aspirational (future) shape (original SCHEMAS.md content, preserved as target) 4. Gap notes where the two diverge Per-verb status: - list-sessions: Implemented, nested field layout - load-session: Implemented, nested session object with local session_not_found error - delete-session: Stub, emits not_yet_implemented (local error, not auth) - flush-transcript: Stub, emits not_yet_implemented (local error, not auth) ROADMAP.md Updates: - #251 marked CLOSED: Full status with commit ref, test counts. - #250 marked SCOPE-REDUCED: Option A resolved by #251, Option C moot, only Option B (doc alignment) remains as future cleanup. Why this matters: Every code change should close its documentation loop. #251 landed on the branch, but SCHEMAS.md still described aspirational shapes without marking which were implemented. Claws reading SCHEMAS.md would have assumed full conformance and hit surprises. Now the document tells the truth about which verbs work, which are stubs, and why. Related: - #251 implementation on feat/jobdori-251-session-dispatch branch - #250 scope-reduced to Option B (field-name harmonization) - #145/#146 parser fall-through fix precedent	2026-04-23 01:28:33 +09:00
YeonGyu-Kim	2fcb85ce4e	ROADMAP #251: dispatch-order bug — session-management verbs fall through to Prompt before credential check (filed by gaebal-gajae; formalized by Jobdori cycle #40) Cycle #40: gaebal-gajae conceived #251 in their 00:00 Discord cycle status but hadn't committed to ROADMAP yet. Jobdori verified their diagnosis with code trace and formalized into ROADMAP with the proper framing relationship to #250. ## What This Pinpoint Says Same observable as #250 (session-management verbs emit missing_credentials instead of SCHEMAS.md envelope) but reframed at the dispatch-order layer: - #250 says: surface missing on canonical binary vs SCHEMAS.md promise - #251 says: top-level parser fall-through happens BEFORE dispatcher could intercept, so credential resolution runs before the verb is classified as a purely-local operation #251's framing is sharper because it identifies WHY the fall-through produces auth errors, not just that it does. ## Verified Code Trace - main.rs:1017-1027 is the _other => Prompt catchall - joins all rest[] tokens into joined, constructs CliAction::Prompt - downstream resolves credentials -> emits missing_credentials - No credential call would be needed had the verb been intercepted Same pattern has been fixed before for other purely-local verbs: - #145: plugins (main.rs:888-906, explicit match arm) - #146: config and diff (main.rs:911-935, same shape) #251 extends this to the 4 session-management verbs. ## Recommended Sequence 1. #251 fix (4 match arms mirroring #145/#146) — principled solution 2. #250's Option B (docs scope note) — guard against future drift 3. #250's Option C (reject with redirect) — unnecessary if #251 lands ## Discipline Per cycle #24 calibration: - Red-state bug? Borderline (silent misroute to auth error class) - Real friction? ✓ (4 documented surfaces emit wrong error class) - Evidence-backed? ✓ (code trace + prior-fix precedent #145/#146) - Same-cycle fix? ✗ (filed + document, boundary discipline #36) - Implementation cost? ~40 lines Rust + tests, bounded ## Credit Conception: gaebal-gajae (Discord msg 1496526112254328902, 00:00 KST) Formalization: Jobdori cycle #40 (code trace + precedent linking) This is the right kind of collaboration: gaebal-gajae saw the dispatch pattern I had missed in #250 (I framed as surface parity; they framed as dispatch order). I verified their diagnosis and committed the ROADMAP entry. Two framings make the pinpoint sharper than either alone.	2026-04-23 00:06:46 +09:00
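The dispatch-order problem in #251 comes down to where the catch-all sits relative to the purely-local verbs. The sketch below models that ordering in Python rather than in the Rust `main.rs` the commit refers to; `Action`, `LOCAL_VERBS`, and `dispatch` are hypothetical names, and the verb set is assembled from the verbs the commit messages name (#145, #146, #251).

```python
from dataclasses import dataclass

# Verbs the commit messages describe as purely local (#145, #146, #251).
LOCAL_VERBS = {
    "plugins", "config", "diff",
    "list-sessions", "load-session", "delete-session", "flush-transcript",
}


@dataclass
class Action:
    name: str                # "local" or "prompt" (hypothetical labels)
    payload: str
    needs_credentials: bool


def dispatch(rest: list[str]) -> Action:
    """Classify the verb BEFORE falling through to the prompt catch-all."""
    verb = rest[0] if rest else ""
    if verb in LOCAL_VERBS:
        # Intercepted here, so no credential resolution is triggered.
        return Action("local", verb, needs_credentials=False)
    # Catch-all, analogous to `_other => Prompt`: every token becomes prompt
    # text, and the prompt path is what later demands credentials.
    return Action("prompt", " ".join(rest), needs_credentials=True)


print(dispatch(["list-sessions", "--output-format", "json"]))
print(dispatch(["explain", "this", "repo"]))
```

Without the explicit membership check, `list-sessions` would reach the catch-all and surface as a credentials error, which is the misroute the commit describes.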
YeonGyu-Kim	f1103332d0	ROADMAP #130: re-verify still-open on main HEAD 186d42f; add classifier-cluster pairing note Cycle #39 dogfood re-verification of #130 (filed 2026-04-20). All 5 filesystem failure modes reproduce identically on main HEAD 186d42f, 2 days after original filing. Gap is unchanged. ## What's Added 1. [STILL OPEN — re-verified 2026-04-22 cycle #39] marker on the entry so readers can see immediately that the pinpoint hasn't been accidentally closed. 2. Full 5-mode repro output preserved verbatim for the current HEAD, so future re-verifications have a concrete baseline to diff against. 3. New evidence not in original filing: the classifier actively chose `kind: "unknown"` rather than just omitting the field. This means classify_error_kind() has NO substring match for "Is a directory", "No such file", "Operation not permitted", or "File exists". The typed-error contract is thus twice-broken on this path. 4. Pairing with #247/#248/#249 classifier sweep: the classifier-level part of #130 could land in the same sweep (add substring branches for io::ErrorKind strings). The context-preservation part (fix run_export's bare `?`) is a separate, larger change. ## Why Re-Verification Not Re-Filing Per cycle #24 discipline: speculative re-filings add noise, real confirmations add truth. #130 was already filed with exact repros, code trace, and fix shape. My dogfood hit the same gap on fresh HEAD — the right output is confirming the gap is still there (not filing #251 for the same bug). This is the same pattern as cycle #32's "mark #127 CLOSED" reality-sync: documentation-drift prevention through explicit status markers. ## New Pattern "Reality-sync via re-verification" — re-running a filed pinpoint's repro on fresh HEAD and adding the timestamp + output proves the gap is still real without inventing new filings. Cycle #24 calibration keeps ROADMAP entries honest. Per cycle #24 calibration: - Red-state bug? ⚠️ borderline (errors surfaced, but kind=unknown is demonstrably wrong on a path where the system knows the errno) - Real friction? ✓ (re-verified on fresh HEAD) - Evidence-backed? ✓ (5-mode repro + classifier trace) - Same-cycle fix? ✗ (classifier-level part could join #247/#248/#249 sweep; context-preservation part is larger refactor) - Implementation cost? Classifier part ~10 lines; full context fix ~60 lines Source: Jobdori cycle #39 proactive dogfood in response to Clawhip pinpoint nudge. Probed export filesystem errors; discovered this was #130 reconfirmation, not new bug. Applied reality-sync pattern from cycle #32.	2026-04-23 00:02:58 +09:00
YeonGyu-Kim	186d42f979	ROADMAP #250: CLI surface parity gap — SCHEMAS.md's list-sessions/delete-session/etc. are Python-only; Rust binary falls through to Prompt with cred error Cycle #38 dogfood finding. Probed session management via the top-level subcommand path documented in SCHEMAS.md; discovered the Rust binary doesn't implement these as top-level subcommands. The literal token 'list-sessions' falls through the _other => Prompt arm and returns 'missing Anthropic credentials' instead of the documented envelope. ## The Gap SCHEMAS.md documents 14 CLAWABLE top-level subcommands. Python audit harness (src/main.py) implements all 14. Rust binary implements ~8 of them as top-level, routing session management through /session slash commands via --resume instead. Repro: $ env -i PATH=$PATH HOME=$HOME claw list-sessions --output-format json {"error":"missing Anthropic credentials; ...","kind":"missing_credentials"} $ claw --resume latest /session list --output-format json {"active":"...","kind":"session_list","sessions":[...]} $ python3 -m src.main list-sessions --output-format json {"command":"list-sessions","sessions":[...],"exit_code":0} Same operation, three different CLI shapes across implementations. ## Classification This is BOTH: - a parser-level trust gap (6th in #108/#117/#119/#122/#127 family; same _other => Prompt fall-through), AND - a cross-implementation parity gap (SCHEMAS.md at repo root doesn't match Rust binary's top-level surface) Unlike prior fall-throughs where the input was malformed, the input here IS a documented surface. The fall-through is wrong for a different reason: the surface exists in the protocol but not in this implementation. ## Three Fix Options Option A: Implement surfaces on Rust binary (highest cost, full parity) Option B: Scope SCHEMAS.md to Python harness (docs-only) Option C: Reject at parse time with redirect hint (cheapest, #127 pattern) Recommended: C first (prevents cred misdirection), then B for docs hygiene, then A if demand justifies. ## Discipline Per cycle #24 calibration: - Red-state bug? ⚠️ borderline — silent misroute to cred error on a documented surface. Not a crash but a real wrong-contract response. - Real friction? ✓ (claws reading SCHEMAS.md hit wrong error on canonical binary) - Evidence-backed? ✓ (dogfood probe + SCHEMAS.md cross-reference + code trace) - Implementation cost? Option C: ~30 lines (bounded). Option A: larger. - Same-cycle fix? ✗ (file + document, defer implementation per #36 boundary discipline) ## Family Position Natural bundle: #127 + #250 — parser-level fall-through pair with class distinction. #127 fixed suffix-arg-on-valid-verb case. #250 extends to 'entire Python-harness verb treated as prompt.' Same fall-through arm, different entry class. Source: Jobdori cycle #38 proactive dogfood in response to Clawhip pinpoint nudge at msg 1496518474019639408. Probed session management CLI after gaebal-gajae's status sync confirmed no red-state regressions this cycle; found this cross-implementation surface parity gap by comparing SCHEMAS.md claims against actual Rust binary behavior.	2026-04-22 23:37:45 +09:00
YeonGyu-Kim	5f8d1b92a6	ROADMAP #249: resumed-session slash command error envelopes omit `kind` field Cycle #37 dogfood finding post-#247 merge. Two Err arms in the resumed-session JSON path at main.rs:2747 and main.rs:2783 emit error envelopes WITHOUT the `kind` field required by the §4.44 typed-envelope contract. ## The Pinpoint Probed resumed-session slash command JSON path: $ claw --output-format json --resume latest /session {"command":"/session","error":"unsupported resumed slash command","type":"error"} # no kind field $ claw --output-format json --resume latest /xyz-unknown {"command":"/xyz-unknown","error":"Unknown slash command: /xyz-unknown\n Help /help lists available slash commands","type":"error"} # no kind field AND multi-line error without split hint Compare to happy path which DOES include kind: $ claw --output-format json --resume latest /session list {"active":"...","kind":"session_list",...} Contract awareness exists. It's just not applied in the Err arms. ## Scope Two atomic fixes in main.rs: - Line 2747: SlashCommand::parse() Err → add kind via classify_error_kind() - Line 2783: run_resume_command() Err → add kind + call split_error_hint() ~15 lines Rust total. Bounded. ## Family Classification §4.44 typed-envelope contract sweep: - #179 (parse-error real message quality) — closed - #181 (envelope exit_code matches process exit) — closed - #247 (classify_error_kind misses prompt-patterns) — closed - #248 (verb-qualified unknown option errors) — in-flight (another agent) - #249 (resumed-session slash error envelopes omit kind) — filed Natural bundle #247+#248+#249: classifier/envelope completeness across all three CLI paths (top-level parse, subcommand options, resumed-session slash). ## Discipline Per cycle #24 calibration: - Red-state bug? ✗ (errors surfaced, exit codes correct) - Real friction? ✓ (typed-error contract violation; claws dispatching on error.kind get undefined for all resumed slash-command errors) - Evidence-backed? ✓ (dogfood probe + code trace identified both Err arms) - Implementation cost? ~15 lines (bounded) - Same-cycle fix? ✗ (Rust change, deferred per file-not-fix discipline) ## Not Implementing This Cycle Per the boundary discipline established in cycle #36: I don't touch another agent's in-flight work, and I don't implement a Rust fix same-cycle when the pattern is "file + document + let owner/maintainer decide." Filing with concrete fix shape is the correct output. If demand or red-state symptoms arrive, implementation can follow the same path as #247: file → fix in branch → review → merge. Source: Jobdori cycle #37 proactive dogfood in response to Clawhip pinpoint nudge at msg 1496518474019639408.	2026-04-22 23:33:50 +09:00
YeonGyu-Kim	84466bbb6c	fix: #247 classify prompt-related parse errors + unify JSON hint plumbing Cycle #34 dogfood follow-through on Jobdori cycle #33 pinpoint (#247 filed at fbcbe9d). Closes the two typed-error contract drifts surfaced in that pinpoint against the Rust `claw` binary. ## What was wrong 1. `classify_error_kind()` (main.rs:~251) used substring matching but did NOT match two common prompt-related parse errors: - "prompt subcommand requires a prompt string" - "empty prompt: provide a subcommand..." Both fell through to `"unknown"`. §4.44 typed-error contract specifies `parse | usage | unknown` as distinct classes, so claws dispatching on `error.kind == "cli_parse"` missed those paths entirely. 2. JSON mode dropped the `Run `claw --help` for usage.` hint. Text mode appends it at stderr-print time (main.rs:~234) AFTER split_error_hint() has already serialized the envelope, so JSON consumers never saw it. Text-mode humans got an actionable pointer; machine consumers did not. ## Fix Two small, targeted edits: 1. `classify_error_kind()`: add explicit branches for "prompt subcommand requires" and "empty prompt:" (the latter anchored with `starts_with` so it never hijacks unrelated error messages containing the word). Both route to `cli_parse`. 2. JSON error render path in `main()`: after calling split_error_hint(), if the message carried no embedded hint AND kind is `cli_parse` AND the short-reason does not already embed a `claw --help` pointer, synthesize the same `Run `claw --help` for usage.` trailer that text-mode stderr appends. The embedded-pointer check prevents duplication on the `empty prompt: ... (run `claw --help`)` message which already carries inline guidance. ## Verification Direct repro on the compiled binary: $ claw --output-format json prompt {"error":"prompt subcommand requires a prompt string", "hint":"Run `claw --help` for usage.", "kind":"cli_parse","type":"error"} $ claw --output-format json "" {"error":"empty prompt: provide a subcommand (run `claw --help`) or a non-empty prompt string", "hint":null,"kind":"cli_parse","type":"error"} $ claw --output-format json doctor --foo # regression guard {"error":"unrecognized argument `--foo` for subcommand `doctor`", "hint":"Run `claw --help` for usage.", "kind":"cli_parse","type":"error"} Text mode unchanged in shape; `[error-kind: ...]` prefix now reads `cli_parse` for the two previously-misclassified paths. ## Regression coverage - Unit test `classify_error_kind_covers_prompt_parse_errors_247`: locks both patterns route to `cli_parse` AND that generic "prompt"-containing messages still fall through to `unknown`. - Integration tests in `tests/output_format_contract.rs`: * prompt_subcommand_without_arg_emits_cli_parse_envelope_with_hint_247 * empty_positional_arg_emits_cli_parse_envelope_247 * whitespace_only_positional_arg_emits_cli_parse_envelope_247 * unrecognized_argument_still_classifies_as_cli_parse_247_regression_guard - Full rusty-claude-cli test suite: 218 tests pass (180 bin unit + 15 output_format_contract + 12 resume_slash + 7 compact + 3 mock + 1 cli). ## Family / related Joins §4.44 typed-envelope contract gap family closure: #130, #179, #181, and now #247. All four quartet items now have real fixes landed on the canonical binary surface rather than only the Python harness. ROADMAP.md: #247 marked CLOSED with before/after evidence preserved.	2026-04-22 22:43:14 +09:00
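The two edits described in the #247 fix (new classifier branches, then hint synthesis at render time) can be sketched as follows. This is a Python illustration of the logic, not the Rust code in `main.rs`: `render_error` is a hypothetical name, and the `"unrecognized argument"` branch is an assumption added only so the regression-guard case from the commit classifies as `cli_parse`; the real classifier's substring for that case is not given in the log.

```python
HELP_HINT = "Run `claw --help` for usage."


def classify_error_kind(message: str) -> str:
    """Substring-based classifier mirroring the branches named in #247."""
    if "prompt subcommand requires" in message:
        return "cli_parse"
    # Anchored at the start so unrelated messages that merely contain
    # the phrase are not reclassified.
    if message.startswith("empty prompt:"):
        return "cli_parse"
    # Assumption: some pre-existing branch covers unknown arguments.
    if "unrecognized argument" in message:
        return "cli_parse"
    return "unknown"


def render_error(message: str, embedded_hint=None) -> dict:
    """Build the JSON error envelope, synthesizing the help hint if needed."""
    kind = classify_error_kind(message)
    hint = embedded_hint
    # Add the trailer only when there is no hint yet, the error is a parse
    # error, and the message does not already point at `claw --help`.
    if hint is None and kind == "cli_parse" and "claw --help" not in message:
        hint = HELP_HINT
    return {"error": message, "hint": hint, "kind": kind, "type": "error"}


print(render_error("prompt subcommand requires a prompt string"))
print(render_error("empty prompt: provide a subcommand (run `claw --help`) or a non-empty prompt string"))
```

The embedded-pointer check is what keeps `hint` at `null` for the empty-prompt message, matching the verification output quoted in the commit.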
YeonGyu-Kim	fbcbe9d8d5	ROADMAP #247: classify_error_kind() misses prompt-related parse errors; hint dropped in JSON envelope Cycle #33 dogfood finding from direct probe of Rust claw binary: ## The Pinpoint Two related contract drifts in the typed-error envelope: ### 1. Error-kind misclassification `classify_error_kind()` at main.rs:246-280 uses substring matching but does NOT match two common parse error messages: - "prompt subcommand requires a prompt string" → classified as 'unknown' - "empty prompt: provide a subcommand..." → classified as 'unknown' The §4.44 typed-error contract specifies 'parse | usage | unknown' as DISTINCT classes. Known parse errors should be 'cli_parse', not 'unknown'. ### 2. Hint lost in JSON mode Text mode appends 'Run `claw --help` for usage.' to parse errors. JSON mode emits 'hint: null'. The trailer is added at the stderr-print stage AFTER split_error_hint() has already serialized the envelope, so JSON consumers never see it. ## Repro Dogfooded on main HEAD dd0993c (cycle #33): $ claw --output-format json prompt {"error":"prompt subcommand requires a prompt string","hint":null,"kind":"unknown","type":"error"} Expected: kind="cli_parse" + hint="Run `claw --help` for usage." ## Impact - Claws dispatching on typed error.kind fall back to substring matching - JSON consumers lose actionable hint that text-mode users see - Joins JSON envelope field-quality family (#90, #91, #92, #110, #115, #116, #130, #179, #181, #247) ## Fix Shape 1. Add prompt-pattern clauses to classify_error_kind() (~4 lines) 2. Move hint plumbing to BEFORE JSON envelope serialization (~15 lines) 3. Add golden-fixture regression tests per cycle #30 pattern Not a red-state bug (error IS surfaced, exit code IS correct), but real contract drift. Deferred for implementation; filed per Clawhip nudge to 'add one concrete follow-up to ROADMAP.md'. Per cycle #24 calibration: - Red-state bug? ✗ (errors exit 1 correctly) - Real friction? ✓ (typed-error contract drift) - Evidence-backed? ✓ (dogfood probe + code trace identified both leaks) - Implementation cost? ~20 lines Rust (bounded) - Demand signal needed? Medium — any claw doing error.kind dispatch on prompt-path errors is affected Source: Jobdori cycle #33 direct dogfood 2026-04-22 22:30 KST in response to Clawhip pinpoint nudge at msg 1496503374621970583.	2026-04-22 22:34:35 +09:00
YeonGyu-Kim	dd0993c157	docs: cycle #32 — mark #127 CLOSED; document in-flight branch obsolescence Cycle #32 dogfood finding: #127 was fixed on main via `a3270db` + `79352a2` (2026-04-20), but the ROADMAP.md entry still lacked a [CLOSED] marker. The in-flight branches `feat/jobdori-127-clean` and `feat/jobdori-127-verb-suffix-flags` were superseded and are now obsolete. ## What This Fixes Documentation drift: Pinpoint #127 was complete in code but unmarked in ROADMAP. New contributors checking the roadmap would see it as open work, potentially duplicating effort. Stale branches: Two branches (`feat/jobdori-127-clean`, `feat/jobdori-127-verb-suffix-flags`) contain the fix attempt bundled with an unrelated large-scope refactor (5365 lines removed from ROADMAP.md, root-level governance docs deleted, command infra refactored). Their fix was superseded; branches are functionally obsolete. ## Verification Re-verified all 4 #127 scenarios pass on main HEAD `b903e16`: $ claw doctor --json → rejected with "did you mean" hint $ claw doctor garbage → rejected $ claw doctor --unknown-flag → rejected $ claw doctor --output-format json → works (canonical form) All behavior matches #127 acceptance criteria. ## Cluster Impact Post-closure: parser-level trust gap quintet (#108 + #117 + #119 + #122 + #127) is 5/5 closed. The `_other => Prompt` fall-through audit is complete. ## Discipline Check Per cycle #24 calibration: - Red-state bug? ✗ (behavior is correct on main) - Real friction? ✓ (ROADMAP drift; obsolete branches adrift) - Evidence-backed? ✓ (dogfood probe confirmed closure; git log confirmed supersession; branch diff confirmed scope contamination) ## Relationship to Gaebal-gajae's Option A Guidance Cycle #32 started by proposing separating the #127 fix from the attached refactor. On deeper probe, discovered the fix was already superseded on main via different commits. Option A (separate the fix) is retroactively satisfied: the fix landed cleanly, the refactor never did. The remaining action is governance hygiene: mark closure, document supersession, flag obsolete branches for deletion. ## Next Actions (not in this commit) - Delete `feat/jobdori-127-clean` locally and on fork (after confirmation) - Delete `feat/jobdori-127-verb-suffix-flags` locally and on fork - Monitor whether any attached refactor content should be re-proposed in its own scoped PR Source: Jobdori cycle #32 dogfood in response to Clawhip 10-min nudge. Proposed Option A (separate fix from refactor); probe revealed the fix already landed via a different commit path, rendering the refactor-only branch obsolete.	2026-04-22 22:28:22 +09:00
YeonGyu-Kim	1d155e4304	docs: ROADMAP.md — file #180 (discoverability gap: --help/--version outside JSON contract) Cycle #24 dogfood discovery. Running proactive edge-case dogfood on the JSON contract, hit a real pinpoint: --help and --version are outside the parser-front-door contract. The gap: 1. "claw --help --output-format json" returns text (not envelope) 2. "claw bootstrap --help --output-format json" returns text (not envelope) 3. "claw --version" doesn't exist at all Why it matters: - Claws can't programmatically discover the CLI surface - Version checking requires side-effectful commands - Natural follow-up gap to #178/#179 parser-front-door work Discoverability scenarios: - Orchestrator checking whether a new command (e.g., turn-loop) is available - Version compat check before dispatching work - Enumerating available commands for routing decisions Filed as Pinpoint #180 in ROADMAP.md with: - Gap description + 3-case repro - Impact analysis (version compat, surface enumeration, governance) - Root cause (argparse default HelpAction prints text + exits) - Fix shape (3 stages, ~40 lines total) - Stage A: --version + JSON envelope version metadata - Stage B: --help JSON routing via custom HelpAction - Stage C: optional 'schema-info' command for pre-dispatch discovery - Acceptance criteria (4 cases, including backward compat) - Priority: Medium (not red-state, but real discoverability gap) Status: Filed, implementation deferred. Following maintainership equilibrium: pinpoints stay documented but don't force code changes. If external demand arrives (claw author building a dispatcher, orchestrator doing version checks), the fix can ship in one cycle using the shape already documented. No code changes this cycle. Pure ROADMAP filing. Continues the maintainership pattern: find friction, document it, defer until evidence-backed demand arrives. Source: Jobdori proactive dogfood at 2026-04-22 20:58 KST.	2026-04-22 21:01:40 +09:00
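The root cause named in #180 is that argparse's default help action prints text and exits before any output-format handling runs. One way to route `--help` through a JSON envelope is sketched below. Note the swap: instead of the custom `HelpAction` the commit proposes for Stage B, this sketch disables the built-in help and uses a plain `store_true` flag handled after parsing, so that `--help` and `--output-format` work in either order. The `kind: "help"` envelope shape, the `COMMANDS` list, and `run` are all hypothetical.

```python
import argparse
import json

# Hypothetical subset of the CLI surface, for illustration only.
COMMANDS = ["bootstrap", "doctor"]


def build_parser() -> argparse.ArgumentParser:
    # add_help=False removes argparse's print-and-exit help action.
    parser = argparse.ArgumentParser(prog="claw", add_help=False)
    parser.add_argument("--output-format", choices=["text", "json"], default="text")
    parser.add_argument("-h", "--help", action="store_true", help="show help")
    parser.add_argument("command", nargs="?")
    return parser


def run(argv: list[str]) -> str:
    """Return what the CLI would print for the given arguments."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.help:
        if args.output_format == "json":
            # Assumed envelope shape; the real contract is not in this log.
            return json.dumps({
                "kind": "help",
                "usage": parser.format_usage().strip(),
                "commands": COMMANDS,
            })
        return parser.format_help()
    return ""


print(run(["--help", "--output-format", "json"]))
```

Deferring the decision until after parsing is what makes the flag order irrelevant; a custom action would run as soon as `--help` is seen, possibly before `--output-format` has been read.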
YeonGyu-Kim	85de7f9814	fix: #166 — flush-transcript now accepts --directory / --output-format / --session-id; session-creation command parity with #160/#165 lifecycle triplet	2026-04-22 18:04:25 +09:00
YeonGyu-Kim	d453eedae6	fix: #165 - load-session CLI now parity-matches list/delete (--directory, --output-format, typed JSON errors)
The #160 session-lifecycle CLI triplet was asymmetric: list-sessions and delete-session accepted --directory + --output-format and emitted typed JSON error envelopes, but load-session had neither flag and dumped a raw Python traceback (including the SessionNotFoundError class name) on a missing session.
Three concrete impacts this fix closes:
1. Alternate session-store locations (e.g. /tmp/claw-run-XXX/.port_sessions) were unreachable via load-session; claws had to chdir or monkeypatch DEFAULT_SESSION_DIR to work around it.
2. Not-found emitted a multi-line Python stack, not a parseable envelope. Claws deciding retry/escalate/give-up had only exit code 1 to work with.
3. The traceback leaked 'src.session_store.SessionNotFoundError' verbatim, coupling version-pinned claws to our internal exception class name.
Now all three triplet commands accept the same flag pair and emit the same JSON error shape:
Success (json mode): {"session_id": "alpha", "loaded": true, "messages_count": 3, "input_tokens": 42, "output_tokens": 99}
Not-found: {"session_id": "missing", "loaded": false, "error": {"kind": "session_not_found", "message": "session 'missing' not found in /path", "directory": "/path", "retryable": false}}
Corrupted file: {"session_id": "broken", "loaded": false, "error": {"kind": "session_load_failed", "message": "...", "directory": "/path", "retryable": true}}
Exit code contract:
- 0 on successful load
- 1 on not-found (preserves existing $?)
- 1 on OSError/JSONDecodeError (distinct 'kind' in JSON)
Backward compat: legacy 'claw load-session ID' text output unchanged byte-for-byte. Only new behaviour is the flags and structured error path.
Tests (tests/test_load_session_cli.py, 13 tests):
- TestDirectoryFlagParity (2): --directory works + fallback to CWD/.port_sessions
- TestOutputFormatFlagParity (2): json schema + text-mode backward compat
- TestNotFoundTypedError (2): JSON envelope on not-found; no traceback in either mode; no internal class name leak
- TestLoadFailedDistinctFromNotFound (1): corrupted file = session_load_failed with retryable=true, distinct from session_not_found
- TestTripletParityConsistency (6): parametrised over [list, delete, load] * [--directory, --output-format]; explicit parity guard for future regressions
Full suite: 80/80 passing, zero regression.
Discovered via Jobdori dogfood sweep 2026-04-22 17:44 KST: ran 'claw load-session nonexistent' expecting a clean error, got a Python traceback. Filed #165 + fixed in same commit.
Closes ROADMAP #165.	2026-04-22 17:44:48 +09:00
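The error contract described in the commit above can be sketched as follows. Only the envelope fields, the two `kind` values, the `retryable` flags, and the exit codes come from the commit message. The function name `load_session_envelope`, the one-file-per-session `<id>.json` layout, and the stored keys (`messages`, `input_tokens`, `output_tokens`) are assumptions for illustration, not the project's actual session store.

```python
import json
from pathlib import Path


def _error(session_id, kind, message, directory, retryable):
    """Build the typed error envelope shape quoted in the commit message."""
    return {
        "session_id": session_id,
        "loaded": False,
        "error": {
            "kind": kind,
            "message": message,
            "directory": str(directory),
            "retryable": retryable,
        },
    }


def load_session_envelope(session_id, directory):
    """Return (exit_code, envelope) instead of letting a traceback escape.

    Assumes a hypothetical on-disk layout of <directory>/<session_id>.json.
    """
    path = Path(directory) / f"{session_id}.json"
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
        if not isinstance(data, dict):
            raise ValueError("session file is not a JSON object")
    except FileNotFoundError:
        # Must be caught before OSError, since it is a subclass of it.
        message = f"session {session_id!r} not found in {directory}"
        return 1, _error(session_id, "session_not_found", message,
                         directory, retryable=False)
    except (OSError, ValueError) as exc:
        # json.JSONDecodeError is a ValueError subclass, so corrupt files
        # land here with a distinct kind and retryable=True.
        return 1, _error(session_id, "session_load_failed", str(exc),
                         directory, retryable=True)
    return 0, {
        "session_id": session_id,
        "loaded": True,
        "messages_count": len(data.get("messages", [])),
        "input_tokens": data.get("input_tokens", 0),
        "output_tokens": data.get("output_tokens", 0),
    }
```

A caller deciding between retry and give-up can then branch on `envelope["error"]["kind"]` and `envelope["error"]["retryable"]` without ever seeing an internal exception class name.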
YeonGyu-Kim	79a9f0e6f6	fix: #163 - remove [turn N] suffix pollution from run_turn_loop; file #164 timeout-cancellation followup
#163: run_turn_loop no longer injects f'{prompt} [turn N]' into follow-up prompts. The suffix was never defined or interpreted anywhere: not by the engine, not by the system prompt, not by any LLM. It looked like a real user-typed annotation in the transcript and made replay/analysis fragile.
New behaviour:
- turn 0 submits the original prompt (unchanged)
- turn > 0 submits caller-supplied continuation_prompt if provided, else the loop stops cleanly with no fabricated user turn
- added continuation_prompt: str | None = None parameter to run_turn_loop
- added --continuation-prompt CLI flag for claws scripting multi-turn loops
- zero '[turn' strings ever appear in mutable_messages or stdout now
Behaviour change for existing callers:
- Before: run_turn_loop(prompt, max_turns=3) submitted 3 turns ('prompt', 'prompt [turn 2]', 'prompt [turn 3]')
- After: run_turn_loop(prompt, max_turns=3) submits 1 turn ('prompt')
- To preserve old multi-turn behaviour, pass continuation_prompt='Continue.' or any structured follow-up text
One existing timeout test (test_budget_is_cumulative_across_turns) updated to pass continuation_prompt so the cumulative-budget contract is actually exercised across turns instead of trivially satisfied by a one-turn loop.
#164 filed: addresses reviewer feedback on #161. The wall-clock timeout bounds the caller-facing wait, but the underlying submit_message worker thread keeps running and can mutate engine state after the timeout TurnResult is returned. A cooperative cancel_event pattern is sketched in the pinpoint; real asyncio.Task.cancel() support will come once provider IO is async-native (larger refactor).
Tests (tests/test_run_turn_loop_continuation.py, 8 tests):
- TestNoTurnSuffixInjection (2): zero '[turn' strings in any submitted prompt, both default and explicit-continuation paths
- TestContinuationDefaultStopsAfterTurnZero (2): default loops run exactly one turn; engine.submit_message called exactly once despite max_turns=10
- TestExplicitContinuationBehaviour (2): turn 0 = original, turn N = continuation verbatim; max_turns still respected
- TestCLIContinuationFlag (2): CLI default emits only '## Turn 1'; --continuation-prompt wires through to multi-turn behaviour
Full suite: 67/67 passing.
Closes ROADMAP #163. Files #164.	2026-04-22 17:37:22 +09:00
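The turn-selection rule described in the commit above can be illustrated with a small standalone sketch. The parameter names `prompt`, `max_turns`, and `continuation_prompt` come from the commit message; the free-function form and the `submit` callable are assumptions standing in for the engine's submit_message, so the real signature is likely different.

```python
from typing import Callable, List, Optional


def run_turn_loop(submit: Callable[[str], str],
                  prompt: str,
                  max_turns: int = 3,
                  continuation_prompt: Optional[str] = None) -> List[str]:
    """Run up to max_turns turns without fabricating follow-up prompts.

    `submit` is a hypothetical stand-in for the engine call.
    """
    results: List[str] = []
    for turn in range(max_turns):
        if turn == 0:
            text = prompt               # turn 0: original prompt, unchanged
        elif continuation_prompt is not None:
            text = continuation_prompt  # later turns: caller-supplied text, verbatim
        else:
            break                       # no continuation given: stop cleanly
        results.append(submit(text))
    return results
```

Calling `run_turn_loop(submit, "prompt", max_turns=3)` submits a single turn, while passing `continuation_prompt="Continue."` submits "prompt" once and then "Continue." for the remaining turns, so no '[turn' string can reach the transcript.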
YeonGyu-Kim	41a6091355	file: #163 — run_turn_loop injects [turn N] suffix into follow-up prompts; multi-turn sessions semantically broken	2026-04-22 10:07:35 +09:00


263 Commits