Per gaebal-gajae cycle #117 closing validation:
Authoritative reframe:
'Cycle #117은 PR creation failure를 브랜치 문제에서
organization-level PR authorization barrier로 정확히 격리한
진단 턴입니다.'
The cycle value was NOT 'PR blocked'.
The cycle value WAS 'boundary of the barrier isolated through experiments'.
Four dimensions experimentally separated:
1. Repository state: healthy (push, tests)
2. Branch readiness: visible on origin
3. Token liveness: valid (own-fork PR succeeded)
4. Org PR authorization: BLOCKED (FORBIDDEN for both claws)
Reviewer-ready compression:
'The branch is pushable and reviewable, but PR creation into
ultraworkers/claw-code is blocked specifically at the organization
authorization layer, not by repository state or token liveness.'
Doctrine #32 formalized:
'Merge-wait mode actions must be within the agent's capability
envelope. When blocked externally, diagnose by boundary separation
and hand off to the responsible party, not by retry or redefinition.'
Operational protocol:
1. Isolate boundary through experiments (not retry same path)
2. Document separation explicitly (works vs doesn't work)
3. Escalate to responsible party (web UI, org admin, infra)
4. Do NOT retry, conflate, or redefine the failure
Validation: Cycle #117 both-claws blocked, boundary isolated,
escalation path identified.
Cross-claw coherence:
- Cycle #115: 1 claw attempted, 1 succeeded (hypothesis)
- Cycle #117: 2 claws attempted, 2 blocked, IDENTICAL error (confirmed)
Next action path (per gaebal-gajae):
Author/owner intervention via web UI OR org admin OAuth grant.
'기술적 탐사가 아니라 author/owner intervention입니다.'
Doctrine count: 32 formalized.
Gate status: Blocked pending author intervention.
Mode integrity: Preserved throughout cycle #117.
Per cycle #117 cross-claw diagnosis (both claws attempted independently):
Both Jobdori (code-yeongyu) and gaebal-gajae (Yeachan-Heo) hit
identical GraphQL FORBIDDEN error on createPullRequest mutation.
Diagnosis: Organization-wide OAuth app restriction on
ultraworkers/claw-code, not per-identity issue.
Reviewer-ready compression (per gaebal-gajae):
'The branch is now remotely visible and PR-ready, but actual PR
creation is blocked by GitHub permissions rather than repository
state.'
Confirmed state:
- Branch on origin: Yes (cycle #115)
- PR creation CLI path: Blocked for both claws
- Manual web UI: Required
- Org admin OAuth grant: Long-term fix
Gate sequence updated:
1. Branch on origin (DONE, cycle #115)
2. PR creation - BLOCKED at OAuth (cycle #116/#117)
3. Manual web UI PR creation (REQUIRED next)
4. Review cycle
5. Merge signal
6. Phase 1 Bundle 1 (#181 + #183)
Doctrine #32 (provisional, pending gaebal-gajae formal acceptance):
'Merge-wait mode actions must be within the agent's capability
envelope. When blocked externally, diagnose + document + escalate,
not retry.'
Cross-claw validation: Both claws blocked, same error pattern.
Mode integrity: Preserved throughout both attempts.
Next blocker: External human action (manual web UI or org admin).
Per gaebal-gajae cycle #115 validation pass:
Authoritative reframe:
'Cycle #115 was not an exception to merge-wait mode; it was the first
turn where merge-wait mode actually did what merge-wait mode is
supposed to do.'
Reviewer-ready compression:
'The branch was frozen but not yet reviewable because it had never
been pushed; this cycle converted merge-wait from a declared state
into a remotely visible one.'
Mode semantic correction:
- Merge-wait mode is NOT 'do nothing'
- Merge-wait mode IS 'block discovery + enable merge-readiness'
- Push to origin = merge-readiness action (fits mode, not violation)
Doctrine #31 (formalized):
'Merge-wait mode requires remote visibility.'
Protocol: git ls-remote origin <branch> must return commit hash.
If empty: push before claiming review-ready.
Self-process pinpoint #193 (formalized):
'Dogfood process hygiene gap — declared review-ready claims lacked
remote visibility check for 40+ minutes (cycles #109-#114).'
Applies to dogfood methodology, not claw-code binary.
Gate sequence (per gaebal-gajae):
1. Branch on origin (cycle #115, DONE)
2. PR creation (next concrete action)
3. Review cycle
4. Merge signal
5. Phase 1 Bundle 1 kickoff
Doctrine count: 31 total.
Per gaebal-gajae cycle #110 state designation:
'Phase 0 is no longer in discovery mode; it is in merge-wait mode
with Phase 1 already precommitted.'
Mode distinction formalized:
- Discovery mode: probe + file + refine (previous state)
- Merge-wait mode: hold state, await signal (CURRENT)
- Execution mode: land bundles (post-merge state)
Doctrine #30: 'Modes are state, not suggestions.'
Once closure is declared, mode label acts as operational guard.
Future cycles must respect state designation:
- No new probes (that's discovery)
- No new pinpoints (branch frozen)
- No new branches (Phase 0 must merge first)
- Maintain readiness; respond to signal
Mode history for Phase 0:
- Cycle #97: Discovery begins
- Cycle #108: Exhaustion criteria met
- Cycle #109: Closure declared
- Cycle #110: Merge-wait mode formally entered
Current state: MERGE-WAIT MODE. Awaiting signal.
Per gaebal-gajae 11:58 Seoul closure validation.
Authoritative closure statement:
'Phase 0 has finished discovery. Phase 1 should start by landing
the locked contract foundation bundle, not by opening new
exploratory cycles.'
All four exhaustion criteria met:
1. Unaudited surfaces: 9 probed (full coverage)
2. Probe hypothesis: Fully validated (multi-flag 3-4, simple 0-1)
3. Phase 1 docs: PHASE_1_KICKOFF.md + review guide + priority queue
4. Branch hygiene: 39 commits, 564 tests, 0 regressions, freeze held
Doctrine #29 (final): 'Discovery termination is itself a deliverable.'
- Criteria: surfaces probed, hypothesis validated, plan documented,
branch review-ready
- Anti-pattern: infinite probe continuation
- Correct: explicit closure + pivot to execution
Phase 0 / dogfood cycles formally closed. No more probe filings
on this branch. Next work unit is Phase 1 execution, not discovery.
Pending: Phase 0 merge approval → Phase 1 branch creation in
priority order → bundle-by-bundle execution (~10 min per bundle).
Cycle #108 probe of claw skills install/enable/disable yielded 3 pinpoints:
#190: Design decision needed
skills install (no args) routes to help (action: help, kind: skills).
May be intentional (like agents pattern) or design inconsistency.
Requires verification against agents canonical reference.
#191: Classifier gap (filesystem family extension)
skills install /bad/path emits kind=unknown.
Should be kind=filesystem or filesystem_io_error.
Extends #177/#178/#179 install-surface taxonomy.
#192: Classifier gap (unknown-option family extension)
skills install --bogus-flag emits kind=unknown.
Should be kind=cli_parse (like sandbox).
Now 4 members in unknown-option sub-lineage: #186, #187, #189, #192.
Pinpoint count: 82 filed, 67 genuinely open.
Classifier family: 19 members (+2).
All unaudited surfaces now probed:
- Cycles #104-#108: plugins, agents, init, bootstrap-plan, system-prompt,
export, sandbox, dump-manifests, skills
- Hypothesis fully validated: Multi-flag verbs have 3-4 classifier gaps;
simple verbs have 0-1 gaps.
Per freeze doctrine, no code changes. Doc-only filing.
Per gaebal-gajae cycle #107 validation pass. Three refinements:
1. Framings locked (both verb-specific):
#188: 'dump-manifests --help omits the prerequisite that runtime
behavior actually requires.'
#189: 'dump-manifests unknown-option errors still fall through to
unknown instead of the existing CLI-parse path.'
2. Doc-truthfulness family formally split into 2 sub-axes:
- Audit-flow (5 members: #76, #79, #82, #172, #180) — reading one
file vs another declared source of truth
- Probe-flow (NEW, 1 member: #188) — running verb vs observing
--help text
3. Priority refinement:
- #189 → bundled in feat/jobdori-186-189-classifier-sweep (3 verbs)
- #188 → post-#180 (doc parity sequence: USAGE gap → help-text gap)
- Full sequence: #180 (audit-flow doc-truth) → #188 (probe-flow doc-truth)
4. Key cycle #107 outcome (per gaebal-gajae):
'behavior bug처럼 보이던 걸 help-text truthfulness gap으로 정확히 재분류'
This is the reclassification skill that earned the filing.
Doctrine #28: First observation is hypothesis, not filing. Verify
against SCHEMAS/USAGE/--help before classifying axis. Cost: 30-60s
per probe. Benefit: avoid filing not-a-bug pinpoints.
Priority queue now 6 bundles + 3+ independent, all reviewer-blessed.
Cycle #107 probe of claw dump-manifests yielded 2 pinpoints:
#188: Doc-truthfulness gap (NEW sub-axis)
claw dump-manifests --help describes usage as optional flags, but
the verb fails without --manifests-dir or CLAUDE_CODE_UPSTREAM.
USAGE.md is correct; CLI --help output lies by omission.
This is the first doc-truth pinpoint from probe flow (vs audit flow).
New sub-axis: help text vs behavior (prior doc-truth: SCHEMAS/USAGE/README).
#189: Classifier gap (same pattern as #186/#187)
dump-manifests --bogus-flag falls through to kind=unknown.
Should be cli_parse (like sandbox).
Now at 3 verbs in same pattern: system-prompt (#186), export (#187),
dump-manifests (#189). Rename bundle to feat/jobdori-186-189-classifier-sweep.
Pinpoint count: 79 filed, 65 genuinely open.
Doc-truthfulness family: 6 members (was 5).
Classifier unknown-option sub-lineage: 3 members (was 2).
Per freeze doctrine, no code changes. Doc-only filing.
Per gaebal-gajae cycle #106 validation pass. Two refinements:
1. #187 framing locked:
'export unknown-option errors still fall through to unknown,
unlike the already-canonical sandbox CLI-parse path.'
Surgical parallel to #186 framing (cycle #105):
'system-prompt unknown-option errors still fall through to unknown
instead of the existing CLI-parse classification path.'
Same pattern: verb + drift + reference path.
2. #186 and #187 bundled into feat/jobdori-186-187-classifier-sweep.
Rationale: identical fix pattern, identical test pattern, same
source file, 2x review overhead if separated.
Updated merge priority queue (gaebal-gajae reviewer-blessed):
1. feat/jobdori-181-error-envelope-contract-drift (#181 + #183)
2. feat/jobdori-184-cli-contract-hygiene-sweep (#184 + #185)
3. feat/jobdori-186-187-classifier-sweep (#186 + #187)
Doctrine #27: Same-pattern pinpoints should bundle into one classifier
sweep PR. One-pinpoint = one-branch is not universal; batching
same-pattern fixes halves review/merge overhead.
Cycle #106 probe of export and sandbox verbs. Found:
- export --bogus-flag: kind=unknown (should be cli_parse)
- sandbox --bogus-flag: kind=cli_parse (canonical correct)
#187 is direct sibling of #186 (system-prompt classifier gap).
Both unknown-option, both should use cli_parse classifier.
Observation: sandbox has no gaps. export has 1 classifier gap.
Suggests classifier coverage improving on newer verbs, not consistent
regression across unaudited surfaces.
Hypothesis (#104) partially validated: unaudited surfaces yield
pinpoints, but not uniformly. Single-issue verbs (sandbox) may be
cleaner than multi-flag verbs (export, init, bootstrap-plan).
Pinpoint count: 77 filed, 63 genuinely open.
Per freeze doctrine, no code changes. Doc-only filing.
Per gaebal-gajae cycle #105 validation pass. One-liner state summary
now appears at top (tone-setter for reviewers) and bottom (reinforced
recap):
'Phase 0 is now frozen, reviewer-mapped, and merge-ready;
Phase 1 remains intentionally deferred behind the locked priority order.'
This is the single authoritative sentence that captures branch state.
Use it for PR titles, review summaries, and Phase 1 handoff notes.
Why this framing matters (per gaebal-gajae evaluation):
- 'frozen' signals no scope creep
- 'reviewer-mapped' signals audit trail exists (this guide)
- 'merge-ready' signals gates are passed
- 'intentionally deferred' signals Phase 1 absence is by design, not omission
- 'locked priority order' signals sequencing is validated (cycle #104-#105)
Review guide now doubles as merge-enabler: reviewers parse branch state
in one sentence, then drill into commits as needed.
Doc-only. No code changes. Freeze preserved.
Pre-merge documentation for reviewers. Summarizes:
- What Phase 0 tasks deliver (JSON envelope contracts, regression locks)
- Why dogfood cycles #99-#105 matter (validated methodology, 15 filed pinpoints)
- Commit-by-commit navigation for the 30-commit frozen bundle
- What lands vs what's deferred
- Integration notes for Phase 1 planning
- Known limitations + follow-ups
This is doc-only, no code changes. Serves as audit trail and reviewer
reference without adding scope to the frozen feature branch.
Per gaebal-gajae cycle #105 review pass. Three corrections:
1. #184/#185 belong to #171 lineage (CLI contract hygiene sub-family),
NOT a new family. Same enforcement hole pattern on unaudited verbs.
2. #186 locked as member of #169/#170 classifier lineage. Framing:
'system-prompt unknown-option errors still fall through to unknown
instead of the existing CLI-parse classification path.'
3. agents is the #183 reference implementation. Fix path reframed from
'design new contract' to 'align outliers to existing reference'.
Much smaller scope for feat/jobdori-181-error-envelope-contract-drift.
Canonical reference shape locked:
{action: 'help', kind: <verb>, unexpected: <bad-name>, usage: {...}}
Doctrine #24: Pinpoint lineage continuity. Check existing family
before creating new. Reviewers follow pattern lineages.
Family tree corrected: CLI contract hygiene moved from 'NEW' to
'#171 sub-lineage within classifier family'.
Cycle #105 probe of agents/init/bootstrap-plan/system-prompt verbs
(unaudited per cycle #104 hypothesis) yielded 3 pinpoints:
#184: claw init silently accepts unknown positional arguments. Inconsistent
with #171 CLI contract hygiene pattern.
#185: claw bootstrap-plan silently accepts unknown flags. Same family as
#184, different verb, different surface.
#186: claw system-prompt --<unknown> classified as 'unknown' instead of
'cli_parse'. Classifier family member (#182-style).
Bonus observation (not filed): claw agents bogus-action emits the
canonical mcp-style {action: help, unexpected, usage} shape. This is
the shape that #183 wants as canonical, NOT the plugins-style success
envelope. agents is the reference implementation.
Hypothesis validated: unaudited verb surfaces have 2-3x higher pinpoint
yield. Predicted cycle #104-#105 pattern holds.
Pinpoint count: 76 filed, 62 genuinely open.
Final framing pass for cycle #104 plugin lifecycle pinpoints. Three
one-liner framings captured for reviewer consumption:
#181 (HIGH): 'plugins unknown-subcommand errors currently emit on the
success path instead of the JSON error path.'
#183 (HIGH): 'Invalid subcommand handling is not normalized across
plugins and mcp JSON surfaces.'
#182 (MEDIUM): 'Plugin lifecycle failures still fall through to unknown
instead of canonical error kinds.'
Branch sequencing locked:
1. feat/jobdori-181-error-envelope-contract-drift (bundles #181+#183)
2. feat/jobdori-182-plugin-classifier-alignment (#182, post-merge)
Rationale: #181 is root bug, #183 is sibling symptom, #182 is cleanup
that benefits from clean error envelope landing first.
Branch at 27 commits, 227/227 tests, review-ready.
Cycle #104 probe of plugin lifecycle axis (claw plugins + mcp subcommands)
yielded 3 related gaps:
#181: plugins bogus-subcommand returns SUCCESS-shaped envelope with
error buried in 'message' text field. Consumer parsing via
type=='error' check treats it as success. Severe.
#182: plugins install/enable not-found errors classified as 'unknown'
instead of 'plugin_not_found' or 'not_found'. Classifier family member.
#183: plugins and mcp emit DIFFERENT shapes on unknown subcommand.
plugins has reload_runtime+target+message, mcp has unexpected+usage.
Shape parity gap.
All three filed only per freeze doctrine. Proposed separate branches:
- feat/jobdori-181-unknown-subcommand-error-routing (#181 + #183 bundled)
- feat/jobdori-182-plugin-not-found-classifier (#182 standalone)
Pinpoint count: 73 filed, 59 genuinely open. Typed-error family: 14
members. Emission routing family: 1 new member (#181).
Cycle #103 doc-truthfulness audit found USAGE.md incomplete.
Actual CLI has 14 standalone verbs (status, doctor, mcp, skills, agents,
export, init, sandbox, system-prompt, bootstrap-plan, dump-manifests,
help, version, acp).
USAGE.md covers only 3 entry modes (claw REPL, claw prompt TEXT,
claw --resume). Other verbs absent or underdocumented.
Example: USAGE.md says 'start claw, then /doctor' but doesn't explain
that 'claw doctor' is also a standalone entry point (no REPL needed).
Fix: Add 'Standalone commands' section to USAGE.md with all 14 verbs
documented. Include regression test (grep USAGE.md for each verb).
Doc-truthfulness family: #76, #79, #82, #172, #180.
Pinpoint count: 70 filed, 56 genuinely open.
#175 numbering collision between:
- gaebal-gajae's CI framing (filed at ~10:00 via Discord verbally)
- my filesystem classifier filing (#175 per cycle #102 10:02)
Resolution:
- gaebal-gajae's framing reclaims #175 (higher-level workflow gap)
- My filesystem classifier renumbered to #177
- My export enum naming renumbered to #178
All three pinpoints now filed with correct non-colliding numbers:
- #175: CI fmt/test signal decoupling (gaebal-gajae)
- #177: skills install filesystem classifier (Jobdori, was #175)
- #178: export kind naming consistency (Jobdori, was #176)
Typed-error family membership updated accordingly.
Cycle #102 probe of model/skills/export axis found two related gaps:
#175: skills install filesystem errors classified as 'unknown' instead of
'filesystem' (which is in v1.5 enum).
#176: export uses 'filesystem_io_error' kind but this is NOT in v1.5
declared enum (which only lists 'filesystem'). Inconsistent naming.
Both filed only per freeze doctrine. Proposed bundling as
feat/jobdori-175-filesystem-error-family branch.
Family observation: classifier + enum-naming gaps found simultaneously
in filesystem-error axis. Indicates broader unaudited surface.
Pinpoint count: 68 filed, 54 genuinely-open.
Per gaebal-gajae cycle #101 framing pass. Adds stable framing that
captures scope + root cause + visible effect + surface in one line.
Locks branch name: feat/jobdori-174-resume-trailing-cli-parse
Next-branch prep steps documented so post-168c-merge execution is
zero-friction (classifier branch + regression test pattern already
established by #169/#170/#171).
Cycle #101 probe of session-boot axis (prompt misdelivery / resume
lifecycle) found another typed-error classifier gap.
Filed only, not fixed. Per freeze doctrine (cycles #98-#100), no new
code axis added to feat/jobdori-168c-emission-routing.
Pattern: `--resume trailing arguments must be slash commands` classified
as 'unknown' instead of 'cli_parse'. Side effect: #247 hint synthesizer
doesn't trigger, so hint is null.
Same family as #169, #170, #171 (classifier coverage gaps).
Proposed fix: add `--resume trailing arguments` pattern to
classify_error_kind as cli_parse.
Pinpoint count: 66 filed, 52 genuinely-open + #174 new.
Cycle #100 probe of non-classifier axes (event/log opacity) found new
consumer parity gap: JSON mode missing 'hint' field that text mode
provides for config_load_error scenarios.
Filed only, not fixed. Per freeze doctrine (cycles #98-#99), no new axis
added to feat/jobdori-168c-emission-routing. This pinpoint is a Phase 1
scope candidate for a separate branch.
Affects: claw mcp, claw status, claw doctor (JSON mode).
Text mode shows: Hint `claw doctor` classifies config parse errors...
JSON mode shows: no hint field at all.
Consumer impact: claws parsing JSON output can't programmatically route
errors to recovery paths the way text-mode users can with human guidance.
Family: Consumer parity. Related: #247 (hint synthesizer), #169-#172
(classifier family), #172 (doc-truthfulness).
Proposed fix: add 'hint' field to JSON envelope when config_load_error
is present, with hint taxonomy for typed dispatch.
Pinpoint count: 65 filed, 51 genuinely-open + #173 new.
Pinpoint #172: SCHEMAS.md v1.5 Emission Baseline documentation inaccuracy
discovered during cycle #98 probe.
The Phase 1 normalization targets section claimed:
"unify where `action` field appears (only in 4 inventory verbs)"
But reality is only 3 inventory verbs have `action`:
- mcp
- skills
- agents
list-sessions uses `command` instead (the documented 1-of-13 deviation
already captured elsewhere in v1.5 baseline).
This is a doc-truthfulness issue (same family as cycles #76, #79, #82).
Active misdocumentation leads downstream consumers to assume 4-verb
coverage when building adapters/dispatchers.
Changes:
1. SCHEMAS.md: 'only in 4 inventory verbs' → 'only in 3 inventory verbs: mcp, skills, agents'
2. Added regression test `v1_5_action_field_appears_only_in_3_inventory_verbs_172`
- Asserts mcp/skills/agents HAVE action field
- Asserts help/version/doctor/status/sandbox/system-prompt/bootstrap-plan/list-sessions do NOT have action field
- Forces SCHEMAS.md + binary to stay synchronized
Test added:
- `v1_5_action_field_appears_only_in_3_inventory_verbs_172` (8 negative cases + 3 positive cases)
Tests: 227/227 pass (+1 from #172).
Related: #155 (doc parity family), #168c (emission baseline).
Doc-truthfulness family: #76, #79, #82, #172.
Pinpoint #171: typed-error classifier gap discovered during #141 probe cycle #97.
`claw list-sessions --help` emits:
error: unexpected extra arguments after `claw list-sessions`: --help
This format is used by multiple verbs that reject trailing positional args:
- list-sessions
- plugins (subcommands)
- config (subcommands)
- diff
- load-session
Before fix:
{"error": "unexpected extra arguments after `claw list-sessions`: --help",
"hint": null,
"kind": "unknown",
"type": "error"}
After fix:
{"error": "unexpected extra arguments after `claw list-sessions`: --help",
"hint": "Run `claw --help` for usage.",
"kind": "cli_parse",
"type": "error"}
The pattern `unexpected extra arguments after \`claw` is specific enough
that it won't hijack generic prose mentioning "unexpected extra arguments"
in other contexts (sanity test included).
Side benefit: like #169/#170, correctly classified cli_parse errors now
auto-trigger the #247 hint synthesizer.
Related #141 gap not yet closed: `claw list-sessions --help` still errors
instead of showing help (requires separate parser fix to recognize --help
as a distinct path). This classifier fix at least makes the error surface
typed correctly so consumers can distinguish "parse failure" from "unknown"
and potentially retry without the --help flag.
Test added:
- `classify_error_kind_covers_unexpected_extra_args_171` (4 positive cases
+ 1 sanity guard)
Tests: 226/226 pass (+1 from #171).
Typed-error family: #121, #127, #129, #130, #164, #169, #170, #247.
Cycle #96 dogfood found practical install-experience gap in USAGE.md.
#153 closed by commit 6212f17 (same branch, same cycle).
Part of discoverability family (#155, help/USAGE parity).
Pinpoint count: 62 filed, 51 genuinely-open + #153 closed this cycle.
Pinpoint #153 closure. USAGE.md was missing practical instructions for:
1. Adding the claw binary to PATH (symlink vs export PATH)
2. Verifying the install works (version, doctor, --help)
3. Troubleshooting PATH issues (which, echo $PATH, ls -la)
New subsections:
- "Add binary to PATH" with two common options
- "Verify install" with post-install health checks
- Troubleshooting guide for common failures
Target audience: developers building from source who want to run `claw`
from any directory without typing `./rust/target/debug/claw`.
Discovered during cycle #96 dogfood (10-min reminder cycle).
Tests: 225/225 still pass (doc-only change).
Pinpoint #170: Extended typed-error classifier coverage gap discovered during
dogfood probe 2026-04-23 07:30 Seoul (cycle #95).
The #169 comment claimed to cover `--permission-mode bogus` via the
`unsupported value for --` pattern, but the actual `parse_permission_mode_arg`
message format is `unsupported permission mode 'bogus'` (NO `for --` prefix).
Doc-vs-reality lie in the #169 fix itself — fixed here.
Four classifier gaps closed:
1. `unsupported permission mode '<value>'` → cli_parse
(from: `parse_permission_mode_arg`)
2. `invalid value for --reasoning-effort: '<value>'; must be ...` → cli_parse
(from: `--reasoning-effort` validator)
3. `model string cannot be empty` → cli_parse
(from: empty --model rejection)
4. `slash command /<name> is interactive-only. Start \`claw\` ...` →
slash_command_requires_repl (NEW kind — more specific than cli_parse)
The fourth pattern gets its own kind (`slash_command_requires_repl`) because
it's a command-mode misuse, not a parse error. Downstream consumers can
programmatically offer REPL-launch guidance.
Side benefit: like #169, the correctly classified cli_parse errors now
auto-trigger the #247 hint synthesizer ("Run `claw --help` for usage.").
Test added:
- `classify_error_kind_covers_flag_value_parse_errors_170_extended`
(4 positive cases + 2 sanity guards)
Tests: 225/225 pass (+1 from #170).
Typed-error family: #121, #127, #129, #130, #164, #169, #247.
Discovered via systematic probe angle: 'error message pattern audit' \u2014
grep each error emission for pattern, confirm classifier matches.
Pinpoint #169: typed-error classifier gap discovered during dogfood probe.
`claw --output-format json --output-format xml doctor` was emitting:
{"error": "unsupported value for --output-format: xml ...",
"hint": null,
"kind": "unknown",
"type": "error"}
After fix:
{"error": "unsupported value for --output-format: xml ...",
"hint": "Run `claw --help` for usage.",
"kind": "cli_parse",
"type": "error"}
The change adds two new classifier branches to `classify_error_kind`:
1. `unsupported value for --` → cli_parse
2. `missing value for --` → cli_parse
Covers all `CliOutputFormat::parse` / `parse_permission_mode_arg` rejections
and any future flag-value validation messages using the same pattern.
Side benefit: the #247 hint synthesizer ("Run `claw --help` for usage.")
now triggers automatically because the error is now correctly classified
as cli_parse. Consumers get both correct kind AND helpful hint.
Test added:
- `classify_error_kind_covers_flag_value_parse_errors_169` (4 positive +
1 sanity case)
Tests: 224/224 pass (+1 from #169).
Discovered during dogfood probe 2026-04-23 07:00 Seoul, cycle #94.
Refs: #169, typed-error family (#121, #127, #129, #130, #164, #247)
Pinpoint #155: USAGE.md was missing documentation for three interactive
commands that appear in `claw --help`:
- /ultraplan [task]
- /teleport <symbol-or-path>
- /bughunter [scope]
Also adds full documentation for other underdocumented commands:
- /commit, /pr, /issue, /diff, /plugin, /agents
Converts inline sentence list into structured section 'Interactive slash
commands (inside the REPL)' with brief descriptions for each command.
Closes#155 gap: discovered during dogfood probing of help/USAGE parity.
No code changes. Pure documentation update.
Phase 0 Task 4 of the JSON Productization Program: CI shape parity guard.
This test locks the v1.5 emission baseline (documented in SCHEMAS.md § v1.5
Emission Baseline) so any future PR that introduces shape drift in a documented
verb fails this test at PR time.
Complements Task 2 (no-silent guarantee) by asserting SPECIFIC top-level key
sets, not just 'stdout is non-empty valid JSON'. If a verb adds/removes a
top-level field, this test fails with a clear error message pointing to
SCHEMAS.md § v1.5 Emission Baseline for update guidance.
Coverage:
- 8 success-path verbs with locked shape (help, version, doctor, skills,
agents, system-prompt, bootstrap-plan, list-sessions)
- 2 error-path cases with locked error envelope shape (prompt-no-arg, doctor --foo)
Key enforcement rules:
- Success envelope: exact key set match per verb
- Error envelope: {error, hint, kind, type} (4 keys, all verbs)
- list-sessions deliberately kept as {command, sessions} (Phase 1 target)
Test design intent:
- Locks CURRENT (possibly imperfect) shape, NOT target shape
- Forces PR authors to update both code + SCHEMAS.md + test together
- Makes Phase 1 shape normalization PRs visible: 'update this test'
Phase 0 now COMPLETE:
- Task 1 ✅ Stream routing fix (cycle #89)
- Task 2 ✅ No-silent guarantee (cycle #90)
- Task 3 ✅ Per-verb emission inventory SCHEMAS.md (cycle #91)
- Task 4 ✅ CI shape parity guard (this cycle)
Tests: 18 output_format_contract tests all pass (+1 from Task 4).
v1.5 emission baseline now locked by code + tests + docs.
Refs: #168c, cycle #92, Phase 0 Task 4 (final)
Under --output-format json, error envelopes were emitted to stderr via
eprintln!. This violated the emission contract: stdout should carry the
contractual envelope (success OR error); stderr is reserved for
non-contractual diagnostics.
Cycle #87 controlled matrix audit found bootstrap/dump-manifests/state
exhibited this pattern (exit 1, stdout 0 bytes, stderr N bytes under
--output-format json).
Fix: change eprintln! to println! for the JSON error envelope path in main().
Text mode continues to route errors to stderr (conventional).
Verification:
- bootstrap --output-format json: stdout now carries envelope, exit 1
- dump-manifests --output-format json: stdout now carries envelope, exit 1
- Text mode: errors still on stderr with [error-kind: ...] prefix (no regression)
Tests:
- Updated assert_json_error_envelope helper to read from stdout (was stderr)
- Added error_envelope_emitted_to_stdout_under_output_format_json_168c
regression test that asserts envelope on stdout + non-JSON on stderr
- All 16 output_format_contract tests pass
Phase 0 Task 1 complete: emission routing fixed across all error-path verbs.
Phase 0 Task 2 (no-silent CI guarantee) remains.
Refs: #168c (cycle #87 filing), cycle #88 emission contract framing
Fresh-dogfood validation (cycle #84, #168) proved the original locus premise was
underspecified. v1.0 was never a coherent contract — each verb has a bespoke JSON
shape with no coordination, and bootstrap JSON is completely broken (silent
failure, exit 0 no output).
Revised migration plan:
- Phase 0 (NEW): Emergency fix for silent failures (#168 bootstrap JSON)
- Phase 1 (NEW): v1.5 baseline — minimal JSON invariants across all 14 verbs
- Every command emits valid JSON with --output-format json
- Every command has top-level 'kind' field for verb ID
- Every error envelope follows {error, hint, kind, type}
- Phase 2 (renamed from Phase 1): v2.0 wrapped envelope (opt-in)
- Phase 3 (renamed from Phase 2): v2.0 default
- Phase 4 (renamed from Phase 3): v1.0/v1.5 deprecation
Rationale:
- Can't migrate from 'incoherent' to 'coherent v2.0' in one jump
- Consumers need stable target (v1.5) to transition from
- Silent failures must be fixed BEFORE migration (consumers can't detect breakage)
Effort revision: ~9 dev-days (Phase 0: 1 + Phase 1: 3 + Phase 2: 5) vs original
~6 dev-days for direct v1.0→v2.0 (which would have failed).
Doctrine implication: Fresh-dogfood principle (#9, cycle #73) prevented a multi-day
migration from hitting an unsolvable baseline problem. Evidence-backed mid-design
correction.
Fresh dogfood validation (cycle #84) revealed the binary v1.0 envelope is NOT
consistent across commands:
- list-sessions: {command, sessions}
- doctor: {checks, kind, message, ...}
- bootstrap: (no JSON output at all)
- mcp: {action, kind, status, ...}
Each command has a custom JSON shape. Bootstrap's JSON path is completely broken
(exit 0 but no output). This is not 'v1.0 vs v2.0 design difference' — it's
'no consistent v1.0 ever existed'.
This explains why #164 (envelope migration) is blocked on design: the 'v1.0 from'
was never coherent. The real task is not 'migrate v1.0 to v2.0' but 'migrate
incoherent-per-command shapes to coherent-common-envelope'.
Implications for cycles #76–#82: The P0 doc fixes were correct to mark SCHEMAS.md
as 'aspirational' because the binary never had a consistent contract to document.
The deeper issue: each verb renderer was written independently with no envelope
coordination.
Three options proposed:
- A: accept per-command shapes (status quo + documentation)
- B: enforce common wrapper (FIX_LOCUS_164 full approach)
- C: hybrid (document current incoherence, then migrate 3 pilot verbs)
Recommendation: Option C. Documents truth immediately, enables phased migration.
This filing resolves the #164 design blocker: now we understand what we're
migrating from.