claw-code

elias/claw-code

Fork 0

mirror of https://github.com/ultraworkers/claw-code.git synced 2026-04-24 21:28:11 +08:00

Commit Graph

Author	SHA1	Message	Date
YeonGyu-Kim	de368a2615	docs+test: cycle #29 — document + lock text-mode vs JSON-mode exit divergence Cycle #29 dogfood found a real pinpoint: cross-mode exit code divergence. ## The Pinpoint Dogfooding the CLI revealed that unknown subcommand errors return different exit codes depending on output mode: $ python3 -m src.main nonexistent-cmd # exit 2 $ python3 -m src.main nonexistent-cmd --output-format json # exit 1 ERROR_HANDLING.md documented the exit-code contract (1=parse, 2=timeout) but did NOT explicitly state the contract applies only to JSON mode. Text mode follows argparse defaults (exit 2 for any parse error), which violates the documented contract when interpreted generally. A claw using text mode with 'claw nonexistent' would see exit 2 and misclassify as timeout per the docs. Real protocol contract gap, not implementation bug. ## Classification This is a DOCUMENTATION gap, not a behavior bug: - Text mode follows argparse convention (reasonable for humans) - JSON mode normalizes to documented contract (reasonable for claws) - The divergence is intentional; only the docs were silent about it Fix = document the divergence explicitly + lock it with tests. NOT fix = change text mode exit code to 1 (would break argparse conventions and confuse human users). ## Documentation Changes ERROR_HANDLING.md: 1. Added IMPORTANT callout in Quick Reference section: 'The exit code contract applies ONLY when --output-format json is explicitly set. Text mode follows argparse conventions.' 2. New 'Text mode vs JSON mode exit codes' table showing exact divergence: - Unknown subcommand: text=2, json=1 - Missing required arg: text=2, json=1 - Session not found: text=1, json=1 (app-level, identical) - Success: text=0, json=0 (identical) - Timeout: text=2, json=2 (identical, #161) 3. Practical rule: 'always pass --output-format json' ## Tests Added (5) TestTextVsJsonModeDivergence in test_cross_channel_consistency.py: 1. test_unknown_command_text_mode_exits_2 — text mode argparse default 2. test_unknown_command_json_mode_exits_1 — JSON mode contract normalized 3. test_missing_required_arg_text_mode_exits_2 — same for missing args 4. test_missing_required_arg_json_mode_exits_1 — same normalization 5. test_success_path_identical_in_both_modes — success exit identical These tests LOCK the expected divergence so: - Documentation stays aligned with implementation - Future changes (either direction) are caught as intentional - Claws trust the docs ## Test Status - 217 → 222 tests passing (+5) - Zero regressions ## Discipline This cycle follows the cycle #28 template exactly: - Dogfood probe revealed real friction (test said exit=2, docs said exit=1) - Minimal fix shape (documentation clarification, not code change) - Regression guard via tests - Evidence-backed, not speculative Relationship to #181: - #181 fixed env.exit_code != process exit (WITHIN JSON mode) - #29 clarifies exit code contract scope (ONLY JSON mode) - Both establish: exit codes are deterministic, but only when --output-format json --- Classification (per cycle #24 calibration): - Red-state bug? ✗ (behavior was reasonable, docs were incomplete) - Real friction? ✓ (docs/code divergence revealed by dogfood) - Evidence-backed? ✓ (test suite probed both modes, found the gap) Source: Jobdori cycle #29 proactive dogfood — in response to Clawhip nudge for pinpoint hunting. Found that text-mode errors return exit 2 but ERROR_HANDLING.md implied exit 1 was the parse-error contract universally.	2026-04-22 22:03:08 +09:00
YeonGyu-Kim	fef249d9e7	test: cycle #27 — cross-channel consistency audit suite Cycle #27 ships a new test class systematizing the three-layer protocol invariant framework. ## Context After cycles #20–#26, the protocol has three distinct invariant classes: 1. Structural compliance (#178): Does the envelope exist? 2. Quality compliance (#179): Is stderr silent + error message truthful? 3. Cross-channel consistency (#181 + NEW): Do multiple channels agree? #181 revealed a critical gap: the second test class was incomplete. Envelopes could be structurally valid, quality-compliant, but still lie about their own state (envelope.exit_code != actual exit). ## New Test Class TestCrossChannelConsistency in test_cross_channel_consistency.py captures the third invariant layer with 5 dedicated tests: 1. envelope.command ↔ dispatched subcommand 2. envelope.output_format ↔ --output-format flag 3. envelope.timestamp ↔ actual wall clock (recent, <5s) 4. envelope.exit_code ↔ process exit code (cycle #26/#181 regression guard) 5. envelope boolean fields (found/handled/deleted) ↔ error block presence Each test specifically targets cross-channel truth, not structure or quality. ## Why Separate Test Classes Matter A command can fail all three ways independently: \| Failure mode \| Exit/Crash \| Test class \| Example \| \|---\|---\|---\|---\| \| Structural \| stderr noise \| TestParseErrorEnvelope \| argparse leaks to stderr \| \| Quality \| correct shape, wrong message \| TestParseErrorStderrHygiene \| error instead of real message \| \| Cross-channel \| truthy field, lie about state \| TestCrossChannelConsistency \| exit_code: 0 but exit 1 \| #181 was invisible to the first two classes. A claw passing all structure/ quality tests could still be misled. The third class catches that. ## Audit Results (Cycle #27) All 5 tests pass — no drift detected in any channel pair: - ✅ Envelope command always matches dispatch - ✅ Envelope output_format always matches flag - ✅ Envelope timestamp always recent (<5s) - ✅ Envelope exit_code always matches process exit (post-#181 guard) - ✅ Boolean fields consistent with error block presence The systematic audit proved the fix from #181 holds, and identified no new cross-channel gaps. ## Test Impact - 209 → 214 tests passing (+5) - Zero regressions - New invariant class now has dedicated test suite - Future cross-channel bugs will be caught by this class ## Related - #178 (#20): Parser-front-door structural contract - #179 (#20): Stderr hygiene + real error message quality - #181 (#26): Envelope exit_code must match process exit - #182-N: Future cross-channel contract violations will be caught by TestCrossChannelConsistency This test class is evergreen — as new fields/channels are added to the protocol, invariants for those channels should be added here, not mixed with other test classes. Keeping invariant classes separate makes regression attribution instant (e.g., 'TestCrossChannelConsistency failed' = 'some truth channel disagreed'). Classification (per cycle #24 calibration): - Red-state bug: ✗ (audit is green) - Real friction: ✓ (structured audit of documented invariants) - Proof of equilibrium: ✓ (systematic verification, no gaps found) Source: Jobdori cycle #27 proactive invariant audit — following gaebal guidance to probe documented invariants, not speculative gaps.	2026-04-22 21:45:00 +09:00

Author

SHA1

Message

Date

YeonGyu-Kim

de368a2615

docs+test: cycle #29 — document + lock text-mode vs JSON-mode exit divergence

Cycle #29 dogfood found a real pinpoint: cross-mode exit code divergence.

## The Pinpoint

Dogfooding the CLI revealed that unknown subcommand errors return different
exit codes depending on output mode:

  $ python3 -m src.main nonexistent-cmd                        # exit 2
  $ python3 -m src.main nonexistent-cmd --output-format json   # exit 1

ERROR_HANDLING.md documented the exit-code contract (1=parse, 2=timeout)
but did NOT explicitly state the contract applies only to JSON mode. Text
mode follows argparse defaults (exit 2 for any parse error), which
violates the documented contract when interpreted generally.

A claw using text mode with 'claw nonexistent' would see exit 2 and
misclassify as timeout per the docs. Real protocol contract gap, not
implementation bug.

## Classification

This is a DOCUMENTATION gap, not a behavior bug:
- Text mode follows argparse convention (reasonable for humans)
- JSON mode normalizes to documented contract (reasonable for claws)
- The divergence is intentional; only the docs were silent about it

Fix = document the divergence explicitly + lock it with tests.

NOT fix = change text mode exit code to 1 (would break argparse
conventions and confuse human users).

## Documentation Changes

ERROR_HANDLING.md:
1. Added IMPORTANT callout in Quick Reference section:
   'The exit code contract applies ONLY when --output-format json is
    explicitly set. Text mode follows argparse conventions.'
2. New 'Text mode vs JSON mode exit codes' table showing exact divergence:
   - Unknown subcommand: text=2, json=1
   - Missing required arg: text=2, json=1
   - Session not found: text=1, json=1 (app-level, identical)
   - Success: text=0, json=0 (identical)
   - Timeout: text=2, json=2 (identical, #161)
3. Practical rule: 'always pass --output-format json'

## Tests Added (5)

TestTextVsJsonModeDivergence in test_cross_channel_consistency.py:

1. test_unknown_command_text_mode_exits_2 — text mode argparse default
2. test_unknown_command_json_mode_exits_1 — JSON mode contract normalized
3. test_missing_required_arg_text_mode_exits_2 — same for missing args
4. test_missing_required_arg_json_mode_exits_1 — same normalization
5. test_success_path_identical_in_both_modes — success exit identical

These tests LOCK the expected divergence so:
- Documentation stays aligned with implementation
- Future changes (either direction) are caught as intentional
- Claws trust the docs

## Test Status

- 217 → 222 tests passing (+5)
- Zero regressions

## Discipline

This cycle follows the cycle #28 template exactly:
- Dogfood probe revealed real friction (test said exit=2, docs said exit=1)
- Minimal fix shape (documentation clarification, not code change)
- Regression guard via tests
- Evidence-backed, not speculative

Relationship to #181:
- #181 fixed env.exit_code != process exit (WITHIN JSON mode)
- #29 clarifies exit code contract scope (ONLY JSON mode)
- Both establish: exit codes are deterministic, but only when --output-format json

---

Classification (per cycle #24 calibration):
- Red-state bug? ✗ (behavior was reasonable, docs were incomplete)
- Real friction? ✓ (docs/code divergence revealed by dogfood)
- Evidence-backed? ✓ (test suite probed both modes, found the gap)

Source: Jobdori cycle #29 proactive dogfood — in response to Clawhip nudge
for pinpoint hunting. Found that text-mode errors return exit 2 but
ERROR_HANDLING.md implied exit 1 was the parse-error contract universally.

2026-04-22 22:03:08 +09:00

YeonGyu-Kim

fef249d9e7

test: cycle #27 — cross-channel consistency audit suite

Cycle #27 ships a new test class systematizing the three-layer protocol
invariant framework.

## Context

After cycles #20–#26, the protocol has three distinct invariant classes:

1. **Structural compliance** (#178): Does the envelope exist?
2. **Quality compliance** (#179): Is stderr silent + error message truthful?
3. **Cross-channel consistency** (#181 + NEW): Do multiple channels agree?

#181 revealed a critical gap: the second test class was incomplete.
Envelopes could be structurally valid, quality-compliant, but still
lie about their own state (envelope.exit_code != actual exit).

## New Test Class

TestCrossChannelConsistency in test_cross_channel_consistency.py captures
the third invariant layer with 5 dedicated tests:

1. envelope.command ↔ dispatched subcommand
2. envelope.output_format ↔ --output-format flag
3. envelope.timestamp ↔ actual wall clock (recent, <5s)
4. envelope.exit_code ↔ process exit code (cycle #26/#181 regression guard)
5. envelope boolean fields (found/handled/deleted) ↔ error block presence

Each test specifically targets cross-channel truth, not structure or quality.

## Why Separate Test Classes Matter

A command can fail all three ways independently:

| Failure mode | Exit/Crash | Test class | Example |
|---|---|---|---|
| Structural | stderr noise | TestParseErrorEnvelope | argparse leaks to stderr |
| Quality | correct shape, wrong message | TestParseErrorStderrHygiene | error instead of real message |
| Cross-channel | truthy field, lie about state | TestCrossChannelConsistency | exit_code: 0 but exit 1 |

#181 was invisible to the first two classes. A claw passing all structure/
quality tests could still be misled. The third class catches that.

## Audit Results (Cycle #27)

All 5 tests pass — no drift detected in any channel pair:

- ✅ Envelope command always matches dispatch
- ✅ Envelope output_format always matches flag
- ✅ Envelope timestamp always recent (<5s)
- ✅ Envelope exit_code always matches process exit (post-#181 guard)
- ✅ Boolean fields consistent with error block presence

The systematic audit proved the fix from #181 holds, and identified
no new cross-channel gaps.

## Test Impact

- 209 → 214 tests passing (+5)
- Zero regressions
- New invariant class now has dedicated test suite
- Future cross-channel bugs will be caught by this class

## Related

- #178 (#20): Parser-front-door structural contract
- #179 (#20): Stderr hygiene + real error message quality
- #181 (#26): Envelope exit_code must match process exit
- #182-N: Future cross-channel contract violations will be caught
  by TestCrossChannelConsistency

This test class is evergreen — as new fields/channels are added to the
protocol, invariants for those channels should be added here, not mixed
with other test classes. Keeping invariant classes separate makes
regression attribution instant (e.g., 'TestCrossChannelConsistency failed'
= 'some truth channel disagreed').

Classification (per cycle #24 calibration):
- Red-state bug: ✗ (audit is green)
- Real friction: ✓ (structured audit of documented invariants)
- Proof of equilibrium: ✓ (systematic verification, no gaps found)

Source: Jobdori cycle #27 proactive invariant audit — following gaebal
guidance to probe documented invariants, not speculative gaps.

2026-04-22 21:45:00 +09:00

2 Commits