claw-code/OPT_OUT_AUDIT.md
YeonGyu-Kim c73423871b docs: OPT_OUT_AUDIT.md — decision table for 12 exempt surfaces (#175–#177 prep)
Filed explicit decision criteria for the 12 OPT_OUT surfaces (commands that do
not support --output-format json) documented in test_cli_parity_audit.py.

Categorized by rationale:
- Group A (4): Rich-Markdown reports (summary, manifest, parity-audit, setup-report)
  Markdown-as-output is intentional; JSON would be information loss.
  Unlikely promotions (remain OPT_OUT long-term).

- Group B (3): List filters with --query/--limit (subsystems, commands, tools)
  Query layer already exists; users have escape hatch.
  Remain OPT_OUT (promotion effort >> value).

- Group C (5): Simulation/debug surfaces (remote-mode, ssh-mode, teleport-mode,
  direct-connect-mode, deep-link-mode)
  Intentionally non-production; JSON output doesn't add value.
  Remain OPT_OUT (simulation tools, not orchestration endpoints).

Audit workflow documented:
1. Survey: Check if external claws actually request JSON versions
2. Cost estimate: Schema + tests for each surface
3. Value estimate: Real demand vs hypothetical
4. Decision: CLAWABLE, remain OPT_OUT, or new pinpoint

Promotion criteria locked (only if clear use case + schema simple + demand exists).

Outcome prediction: All 12 likely remain OPT_OUT (documented rationale per group).

Timeline: Survey period (cycles #19–#21), final decision (cycle #22).

Related pinpoints: #175 (summary/manifest JSON parallel?), #176 (--query-json?),
#177 (mode simulators ever CLAWABLE?).

This closes the documentation loop from cycles #173–#174 (protocol closure →
field evolution → reframe). Now governance rules are explicit for future work.
2026-04-22 19:54:41 +09:00


OPT_OUT Surface Audit Roadmap

Status: Pre-audit (decision table ready, survey pending)

This document governs the audit and potential promotion of 12 OPT_OUT surfaces (commands that currently do not support --output-format json).

OPT_OUT Classification Rationale

A surface is classified as OPT_OUT when:

  1. Human-first by nature: Rich Markdown prose / diagrams / structured text where JSON would be information loss
  2. Query-filtered alternative exists: Commands with internal --query / --limit don't need JSON (users already have escape hatch)
  3. Simulation/debug only: Not meant for production orchestration (e.g., mode simulators)
  4. Future JSON work is planned: Documented in ROADMAP with clear upgrade path

OPT_OUT Surfaces (12 Total)

Group A: Rich-Markdown Reports (4 commands)

Rationale: These emit structured narrative prose. JSON would require lossy serialization.

| Command | Output | Current use | JSON case |
| --- | --- | --- | --- |
| summary | Multi-section workspace summary (Markdown) | Human readability | Not applicable; Markdown is the output |
| manifest | Workspace manifest with project tree (Markdown) | Human readability | Not applicable; Markdown is the output |
| parity-audit | TypeScript/Python port comparison report (Markdown) | Human readability | Not applicable; Markdown is the output |
| setup-report | Preflight + startup diagnostics (Markdown) | Human readability | Not applicable; Markdown is the output |

Audit decision: These likely remain OPT_OUT long-term, because Markdown-as-output is intentional. If a JSON version is needed in the future, it would be a separate --output-format json path generating structured data (project summary object, manifest array, audit deltas, setup checklist). That would be a new contract, not an addition to the existing Markdown surfaces.

Pinpoint: #175 (deferred) — audit whether summary/manifest should emit JSON structured versions in parallel with Markdown, or if Markdown-only is the right UX.


Group B: List Commands with Query Filters (3 commands)

Rationale: These already support --query and/or --limit for filtering, so JSON output would add little. Users who need structured data can narrow the results first and then parse the text output.

| Command | Filtering | Current output | JSON case |
| --- | --- | --- | --- |
| subsystems | --limit | Human-readable list | Use --query to filter, users can parse if needed |
| commands | --query, --limit, --no-plugin-commands, --no-skill-commands | Human-readable list | Use --query to filter, users can parse if needed |
| tools | --query, --limit, --simple-mode | Human-readable list | Use --query to filter, users can parse if needed |

Audit decision: --query / --limit are already the machine-friendly escape hatch. These commands are intentionally list-filter-based (not orchestration-primary). Promoting to CLAWABLE would require:

  1. Formalizing what the structured output is (command array? tool array?)
  2. Versioning the schema per command
  3. Updating tests to validate per-command schemas

Cost-benefit: Low. Users who need structured data can already use --query to narrow results, then parse. Effort to promote > value.

Pinpoint: #176 (backlog) — audit --query UX; consider if a --query-json escape hatch (output JSON of matching items) is worth the schema tax.
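
To make the "schema tax" in #176 concrete, here is a minimal sketch of what a --query-json style filter might compute. Everything in it is an assumption for illustration: the function name `query_json`, the item shape (a dict with a "name" key), and the reading of --query as a case-insensitive substring match are not taken from the existing CLI.

```python
import json


def query_json(items, query=None, limit=None):
    """Filter named items and serialize the matches as a JSON array.

    Assumptions (not from the source): each item is a dict with a "name"
    key, and the query is a case-insensitive substring match on that name.
    """
    matches = [
        item
        for item in items
        if query is None or query.lower() in item["name"].lower()
    ]
    if limit is not None:
        # Apply the limit after filtering, so it caps the matches returned.
        matches = matches[:limit]
    return json.dumps(matches, indent=2, sort_keys=True)
```

Even this small sketch forces decisions that would need to be versioned and tested per command (item shape, match semantics, ordering), which is the cost the audit decision above weighs against demand.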


Group C: Simulation / Debug Surfaces (5 commands)

Rationale: These are intentionally not production-orchestrated. They simulate behavior, test modes, or debug scenarios. JSON output doesn't add value.

| Command | Purpose | Output | Use case |
| --- | --- | --- | --- |
| remote-mode | Simulate remote execution | Text (mock session) | Testing harness behavior under remote constraints |
| ssh-mode | Simulate SSH execution | Text (mock SSH session) | Testing harness behavior over SSH-like transport |
| teleport-mode | Simulate teleport hop | Text (mock hop session) | Testing harness behavior with teleport bouncing |
| direct-connect-mode | Simulate direct network | Text (mock session) | Testing harness behavior with direct connectivity |
| deep-link-mode | Simulate deep-link invocation | Text (mock deep-link) | Testing harness behavior from URL/deeplink |

Audit decision: These are intentionally simulation-only. Promoting to CLAWABLE means:

  1. "This simulated mode is now a valid orchestration surface"
  2. Need to define what JSON output means (mock session state? simulation log?)
  3. Need versioning + test coverage

Cost-benefit: Very low. These are debugging tools, not orchestration endpoints. Effort to promote >> value.

Pinpoint: #177 (backlog) — decide if mode simulators should ever be CLAWABLE (probably no).
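
The three groups above can be written down as data, which makes the "12 total" claim checkable. This is a sketch only: the names `OPT_OUT_GROUPS`, `opt_out_surfaces`, and `group_of` are hypothetical and are not claimed to match what test_cli_parity_audit.py actually defines.

```python
# Hypothetical registry of the 12 OPT_OUT surfaces, grouped as in this document.
# The constant and function names are illustrative, not taken from the test suite.
OPT_OUT_GROUPS = {
    "A_markdown_reports": ["summary", "manifest", "parity-audit", "setup-report"],
    "B_list_filters": ["subsystems", "commands", "tools"],
    "C_mode_simulators": [
        "remote-mode",
        "ssh-mode",
        "teleport-mode",
        "direct-connect-mode",
        "deep-link-mode",
    ],
}


def opt_out_surfaces(groups=OPT_OUT_GROUPS):
    """Flatten the groups into one sorted list, rejecting duplicates."""
    flat = [name for members in groups.values() for name in members]
    if len(flat) != len(set(flat)):
        raise ValueError("a surface appears in more than one group")
    return sorted(flat)


def group_of(surface, groups=OPT_OUT_GROUPS):
    """Return the group key for a surface, or None if it is not OPT_OUT."""
    for key, members in groups.items():
        if surface in members:
            return key
    return None
```

Keeping the rationale group next to each surface name means a future promotion removes one entry from one list, and the count check fails loudly if the document and the registry drift apart.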


Audit Workflow (Future Cycles)

For each surface:

  1. Survey: Check if any external claw actually uses --output-format with this surface
  2. Cost estimate: How much schema work + testing?
  3. Value estimate: How much demand for JSON version?
  4. Decision: CLAWABLE, remain OPT_OUT, or new pinpoint?

Promotion criteria (if promoting to CLAWABLE):

A surface moves from OPT_OUT → CLAWABLE only if:

  • Clear use case for JSON (not just "hypothetically could be JSON")
  • Schema is simple and stable (not 20+ fields)
  • At least one external claw has requested it
  • Tests can be added without major refactor
  • Maintainability burden is worth the value

Retention criteria (if staying OPT_OUT):

A surface stays OPT_OUT if:

  • JSON would be information loss (Markdown reports)
  • Equivalent filtering already exists (--query / --limit)
  • Use case is simulation/debug, not production
  • Promotion effort > value to users
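
The promotion checklist above is an all-or-nothing rule, which can be sketched as a small decision function. The dataclass, its field names, and the threshold constant are assumptions made for illustration; in particular, "not 20+ fields" is read here as "fewer than 20 fields". The sketch does not model the third workflow outcome (filing a new pinpoint).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SurfaceAssessment:
    """Answers to the promotion checklist for one surface (hypothetical fields)."""

    name: str
    has_clear_json_use_case: bool
    schema_field_count: int
    schema_is_stable: bool
    external_requests: int
    tests_need_major_refactor: bool
    value_exceeds_maintenance_cost: bool


# Assumption: "not 20+ fields" is interpreted as strictly fewer than 20.
MAX_SIMPLE_SCHEMA_FIELDS = 20


def decide(a: SurfaceAssessment) -> str:
    """Return "CLAWABLE" only if every promotion criterion holds, else "OPT_OUT"."""
    criteria = [
        a.has_clear_json_use_case,
        a.schema_is_stable and a.schema_field_count < MAX_SIMPLE_SCHEMA_FIELDS,
        a.external_requests >= 1,
        not a.tests_need_major_refactor,
        a.value_exceeds_maintenance_cost,
    ]
    return "CLAWABLE" if all(criteria) else "OPT_OUT"
```

Under this reading, a surface with no external requests stays OPT_OUT regardless of how simple its schema would be, which matches the expectation that promotion is driven by demand rather than feasibility.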

Post-Audit Outcomes

Likely scenario (high confidence)

Group A (Markdown reports): Remain OPT_OUT

  • summary, manifest, parity-audit, setup-report are intentionally human-first
  • If a JSON-like structure is needed in the future, it would be delivered as separate *-json commands or a distinct --output-format path, not added to the Markdown surfaces

Group B (List filters): Remain OPT_OUT

  • subsystems, commands, tools have --query / --limit as query layer
  • Users who need structured data already have escape hatch

Group C (Mode simulators): Remain OPT_OUT

  • remote-mode, ssh-mode, etc. are debug tools, not orchestration endpoints
  • No demand for JSON version; promotion would be forced, not driven

Result: OPT_OUT audit concludes that 12/12 surfaces should remain OPT_OUT (no promotions).

If demand emerges

If external claws report needing JSON from any OPT_OUT surface:

  1. File pinpoint with use case + rationale
  2. Estimate cost + value
  3. If value > cost, promote to CLAWABLE with full test coverage
  4. Update SCHEMAS.md
  5. Update CLAUDE.md

Timeline

  • Post-#174 (now): OPT_OUT audit documented (this file)
  • Cycles #19–#21 (deferred): Survey period, collecting data on external demand
  • Cycle #22 (deferred): Final audit decision + any promotions
  • Post-audit: Move to protocol maintenance mode (new commands/fields/surfaces)

Related

  • SCHEMAS.md: Clawable surface contracts
  • CLAUDE.md: Development guidance
  • test_cli_parity_audit.py: Parametrized tests for CLAWABLE_SURFACES enforcement
  • ROADMAP.md: Macro phases (this audit is Phase 3 before Phase 2 closure)