2 Commits

Author SHA1 Message Date
YeonGyu-Kim
7724bf98fd fix: #181 — envelope exit_code must match process exit code (exec-command/exec-tool)
Cycle #26 dogfood found a real red-state bug in the JSON envelope contract.

## The Bug

exec-command and exec-tool not-found cases return exit code 1 from the
process, but the envelope reports exit_code: 0 (the default from
wrap_json_envelope). This is a protocol violation.

Repro (before fix):
  $ claw exec-command unknown-cmd test --output-format json > out.json
  $ echo $?
  1
  $ jq '.exit_code' out.json
  0  # WRONG — envelope lies about exit code

Claws reading the envelope's exit_code field get misinformation. A claw
implementing the canonical ERROR_HANDLING.md pattern (check exit_code,
then classify by error.kind) would incorrectly treat failures as
successes when dispatching on the envelope alone.

## Root Cause

main.py lines 687–739 (exec-command + exec-tool handlers):
- Return statement: 'return 0 if result.handled else 1' (correct)
- Envelope wrap: 'wrap_json_envelope(envelope, args.command)'
  (uses default exit_code=0, IGNORES the return value)

The envelope wrap was called BEFORE the return value was computed, so
the exit_code field was never synchronized with the actual exit code.

## The Fix

Compute exit_code ONCE at the top:
  exit_code = 0 if result.handled else 1

Pass it explicitly to wrap_json_envelope:
  wrap_json_envelope(envelope, args.command, exit_code=exit_code)

Return the same value:
  return exit_code

This ensures the envelope's exit_code field is always truth — the SAME
value the process returns.

## Tests Added (3)

TestEnvelopeExitCodeMatchesProcessExit in test_exec_route_bootstrap_output_format.py:

1. test_exec_command_not_found_envelope_exit_matches:
   Verifies exec-command unknown-cmd returns exit 1 in both envelope
   and process.

2. test_exec_tool_not_found_envelope_exit_matches:
   Same for exec-tool.

3. test_all_commands_exit_code_invariant:
   Audit across 4 known non-zero cases (show-command, show-tool,
   exec-command, exec-tool not-found). Guards against the same bug
   in other surfaces.

## Impact

- 206 → 209 passing tests (+3)
- Zero regressions
- Protocol contract now truthful: envelope.exit_code == process exit
- Claws using the one-handler pattern from ERROR_HANDLING.md now get
  correct information

## Related

- ERROR_HANDLING.md (cycle #22): Documented exit_code as machine-readable
  contract field
- #178/#179 (cycles #19/#20): Closed parser-front-door contract
- This closes a gap in the WORK PROTOCOL contract — envelope values must
  match reality, not just be structurally present.

Classification (per cycle #24 calibration):
- Red-state bug: ✓ (contract violation, claws get misinformation)
- Real friction: ✓ (discovered via dogfood, not speculative)
- Fix ships same-cycle: ✓ (discipline per maintainership mode)

Source: Jobdori cycle #26 dogfood — ran multiple edge-case probes, noticed
exec-command envelope showed exit_code: 0 while process exited 1.
Investigated wrap_json_envelope default behavior, confirmed bug, fixed
and tested in same cycle.
2026-04-22 21:33:57 +09:00
YeonGyu-Kim
60925fa9f7 fix: #168 — exec-command / exec-tool / route / bootstrap now accept --output-format; CLI family JSON parity COMPLETE
Extends the #167 inspect-surface parity fix to the four remaining CLI
outliers: the commands claws actually invoke to DO work, not just
inspect state. After this commit, the entire claw-code CLI family speaks
a unified JSON envelope contract.

Concrete additions:
- exec-command: --output-format {text,json}
- exec-tool: --output-format {text,json}
- route: --output-format {text,json}
- bootstrap: --output-format {text,json}

JSON envelope shapes:

exec-command (handled):
  {name, prompt, source_hint, handled: true, message}
exec-command (not-found):
  {name, prompt, handled: false,
   error: {kind:'command_not_found', message, retryable: false}}

exec-tool (handled):
  {name, payload, source_hint, handled: true, message}
exec-tool (not-found):
  {name, payload, handled: false,
   error: {kind:'tool_not_found', message, retryable: false}}

route:
  {prompt, limit, match_count, matches: [{kind, name, score, source_hint}]}

bootstrap:
  {prompt, limit,
   setup: {python_version, implementation, platform_name, test_command},
   routed_matches: [{kind, name, score, source_hint}],
   command_execution_messages: [str],
   tool_execution_messages: [str],
   turn: {prompt, output, stop_reason},
   persisted_session_path}

Exit codes (unchanged from pre-#168):
  0 = success
  1 = exec not-found (exec-command, exec-tool only)

Backward compatibility:
  - Default (no --output-format) is 'text'
  - exec-command/exec-tool text output byte-identical
  - route text output: unchanged tab-separated kind/name/score/source_hint
  - bootstrap text output: unchanged Markdown runtime session report

Tests (13 new, test_exec_route_bootstrap_output_format.py):
  - TestExecCommandOutputFormat (3): handled + not-found JSON; text compat
  - TestExecToolOutputFormat (3): handled + not-found JSON; text compat
  - TestRouteOutputFormat (3): JSON envelope; zero-matches case; text compat
  - TestBootstrapOutputFormat (2): JSON envelope; text-mode Markdown compat
  - TestFamilyWideJsonParity (2): parametrised over ALL 6 family commands
    (show-command, show-tool, exec-command, exec-tool, route, bootstrap) —
    every one accepts --output-format json and emits parseable JSON; every
    one defaults to text mode without a leading {. One future regression on
    any family member breaks this test.

Full suite: 124 → 137 passing, zero regression.

Closes ROADMAP #168.

This completes the CLI-wide JSON parity sweep:
- Session-lifecycle family: #160 (list/delete), #165 (load), #166 (flush)
- Inspect family: #167 (show-command, show-tool)
- Work-verb family: #168 (exec-command, exec-tool, route, bootstrap)

ENTIRE CLI SURFACE is now machine-readable via --output-format json with
typed errors, deterministic exit codes, and consistent envelope shape.
Claws no longer need to regex-parse any CLI output.

Related clusters:
  - Clawability principle: 'machine-readable in state and failure modes'
    (ROADMAP top-level). 9 pinpoints in this cluster; all now landed.
  - Typed-error envelope consistency: command_not_found / tool_not_found /
    session_not_found / session_load_failed all share {kind, message,
    retryable} shape.
  - Work-verb semantics: exec-* surfaces expose 'handled' boolean (not
    'found') because 'not handled' is the operational signal — claws
    dispatch on whether the work was performed, not whether the entry
    exists in the inventory.
2026-04-22 18:34:26 +09:00