ROADMAP #116: unknown keys in .claw.json hard-fail startup with exit 1; Claude Code migration parity broken (apiKeyHelper rejected); forward-compat impossible; only first error surfaces

Dogfooded 2026-04-18 on main HEAD ad02761 from /tmp/cdRR. Three related gaps in one finding: 1. Unknown keys are strict ERRORS, not warnings: {"permissions":{"defaultMode":"default"},"futureField":"x"} $ claw --output-format json status # stdout: empty # stderr: {"type":"error","error":"unknown key futureField"} # exit: 1 2. Claude Code migration parity broken: $ cp .claude.json .claw.json # .claude.json has apiKeyHelper (real Claude Code field) $ claw --output-format json status # stderr: unknown key apiKeyHelper → exit 1 No 'this is a Claude Code field we don't support, ignored' message. 3. Only errors[0] is reported — iterative discovery required: 3 unknown fields → 3 edit-run-fix cycles to fix them all. Error-routing split with --output-format json: success → stdout errors → stderr (structured JSON) Empty stdout on config errors. A claw piping stdout silently gets nothing. Must capture both streams. No escape hatch. No --ignore-unknown-config, no --strict flag, no strictValidation config option. Trace: config.rs:282-291 ConfigLoader gate: let validation = validate_config_file(...); if !validation.is_ok() { let first_error = &validation.errors[0]; return Err(ConfigError::Parse(first_error.to_string())); } all_warnings.extend(validation.warnings); config_validate.rs:19-47 DiagnosticKind::UnknownKey: level: DiagnosticLevel::Error (not Warning) config_validate.rs schema allow-list is hard-coded. No forward-compat extension (no x-* reserved namespace, no additionalProperties: true, no opt-in lax mode). grep 'apiKeyHelper' rust/crates/runtime/ → 0 matches. Claude-Code-native fields not tolerated as no-ops. grep 'ignore.*unknown|--no-validate|strict.*validation' rust/crates/ → 0 matches. No escape hatch. Fix shape (~100 lines): - Downgrade UnknownKey Error → Warning default. ~5 lines. - Add strict mode flag: .claw.json strictValidation: true OR --strict-config CLI flag. Default off. ~15 lines. - Collect all diagnostics, don't halt on first. ~20 lines. - TOLERATED_CLAUDE_CODE_FIELDS allow-list: apiKeyHelper, env etc. emit migration-hint warning 'not yet supported; ignored' instead of hard-fail. ~30 lines. - Emit structured error envelope on stdout too, not just stderr. --output-format json stdout includes config_diagnostics[]. ~15. - Wire suggestion: Option<String> for UnknownKey via fuzzy match ('permisions' → 'permissions'). ~15 lines. - Regression tests per outcome. Joins Claude Code migration parity (#103, #109) as 3rd member — most severe migration break. #103 silently drops .md files, #109 stderr-prose warnings, #116 outright hard-fails. Joins Reporting-surface/config-hygiene (#90, #91, #92, #110, #115) on error-routing-vs-stdout axis. Joins Silent-flag/documented-but-unenforced (#96-#101, #104, #108, #111, #115) — only first error reported, rest silent. Cross-cluster with Truth-audit — validation.is_ok() hides all but first structured problem. Natural bundles: #103 + #109 + #116 — Claude Code migration parity triangle: loss of compat (.md dropped) + loss of structure (stderr prose warnings) + loss of forward-compat (unknowns hard-fail) #109 + #116 — config validation reporting surface: only first warning surfaces structurally (#109) only first error surfaces structurally AND halts (#116) Filed in response to Clawhip pinpoint nudge 1494925472239321160 in #clawcode-building-in-public.
2026-06-10 11:13:33 +08:00 · 2026-04-18 14:03:20 +09:00 · 2026-04-18 14:03:20 +09:00 · f2d653896d
commit f2d653896d
parent ad02761918
1 changed files with 95 additions and 0 deletions
--- a/ROADMAP.md
+++ b/ROADMAP.md
@ -3609,3 +3609,98 @@ ear], /color [scheme], /effort [low|medium|high], /fast, /summary, /tag [label],
     **Blocker.** Product decision: is `init`-default `danger-full-access` intentional (for low-friction onboarding) or accidental? If intentional, the fix is warning-only. If accidental, the fix is a safer default.

     **Source.** Jobdori dogfood 2026-04-18 against `/tmp/cdPP` on main HEAD `ca09b6b` in response to Clawhip pinpoint nudge at `1494917922076889139`. Joins **Permission-audit / tool-allow-list** (#94, #97, #101, #106) as 5th member — this is the init-time ANCHOR of the permission-posture problem: #87 is absence-of-config, #101 is fail-OPEN on bad env var, **#115** is the init-generated dangerous default. Joins **Silent-flag / documented-but-unenforced** (#96–#101, #104, #108, #111) on the third axis: not a silent flag, but a silent setting (the generated config's security implications are silent in the init output). Cross-cluster with **Reporting-surface / config-hygiene** (#90, #91, #92, #110) on the structured-data-vs-prose axis: `claw init --output-format json` wraps all structure inside `message`. Cross-cluster with **Truth-audit** on "Next step: Review and tailor the generated guidance" phrasing — misleads by omission. Natural bundle: **#87 + #101 + #115** — "permission drift at every boundary": absence default + env-var bypass + init-generated default. Also: **#50 + #87 + #91 + #94 + #97 + #101 + #115** — flagship permission-audit sweep now 7-way. Session tally: ROADMAP #115.
+
+116. **Unknown keys in `.claw.json` are strict ERRORS, not warnings — `claw` hard-fails at startup with exit 1 if any field is unrecognized. Only the FIRST error is reported; all subsequent validation messages are lost. Valid Claude Code config fields (`apiKeyHelper`, `env`, and other Claude-Code-native keys) trigger the same hard-fail, so a user renaming `.claude.json → .claw.json` for migration gets `"unknown key \"apiKeyHelper\"" ... exit 1` with zero guidance on what to delete. The error goes to stderr as structured JSON (`{"type":"error","error":"..."}`) but a `--output-format json` consumer has to read BOTH stdout AND stderr to capture success-or-error — the stdout side is empty on error. There is no `--ignore-unknown-config` flag, no `strict` vs `warn` mode toggle, no forward-compat path — a claw's future-self putting a single new field in the config kills every older claw binary** — dogfooded 2026-04-18 on main HEAD `ad02761` from `/tmp/cdRR`.
+
+     **Concrete repro.**
+     ```
+     # Forward-compat scenario — config has a "future" field:
+     $ cd /tmp/cdRR && git init -q .
+     $ cat > .claw.json << 'EOF'
+     {
+       "permissions": {"defaultMode": "default"},
+       "futureField": "some-feature"
+     }
+     EOF
+     $ claw --output-format json status
+     # stdout: (empty)
+     # stderr: {"type":"error","error":"/private/tmp/cdRR/.claw.json: unknown key \"futureField\" (line 3)"}
+     # exit: 1
+
+     # Claude Code migration scenario — rename .claude.json to .claw.json:
+     $ cat > .claw.json << 'EOF'
+     {
+       "permissions": {"defaultMode": "default"},
+       "apiKeyHelper": "/usr/local/bin/key-helper",
+       "env": {"FOO": "bar"}
+     }
+     EOF
+     $ claw --output-format json status
+     # stderr: {"type":"error","error":"/private/tmp/cdRR/.claw.json: unknown key \"apiKeyHelper\""}
+     # apiKeyHelper is a real Claude Code config field. claw-code refuses it.
+
+     # Multiple unknowns — only the first is reported:
+     $ cat > .claw.json << 'EOF'
+     {
+       "a_bad": 1,
+       "b_bad": 2,
+       "c_bad": 3
+     }
+     EOF
+     $ claw --output-format json status
+     # stderr: unknown key "a_bad" (line 2)
+     # User fixes a_bad, re-runs, gets b_bad error. Iterative discovery.
+
+     # No escape hatch:
+     $ claw --ignore-unknown-config --output-format json status
+     # stderr: unknown option: --ignore-unknown-config
+     ```
+
+     **Trace path.**
+     - `rust/crates/runtime/src/config.rs:282-291` — `ConfigLoader` validation gate:
+       ```rust
+       let validation = crate::config_validate::validate_config_file(
+           &parsed.object,
+           &parsed.source,
+           &entry.path,
+       );
+       if !validation.is_ok() {
+           let first_error = &validation.errors[0];
+           return Err(ConfigError::Parse(first_error.to_string()));
+       }
+       all_warnings.extend(validation.warnings);
+       ```
+       `validation.is_ok()` means `errors.is_empty()`. Any error in the vec halts loading. Only `errors[0]` is surfaced. `validation.warnings` is accumulated and later `eprintln!`d as prose (already covered in #109).
+     - `rust/crates/runtime/src/config_validate.rs:19-47` — `DiagnosticKind::UnknownKey`:
+       ```rust
+       UnknownKey { suggestion: Option<String> }
+       ```
+       Unknown keys produce a `ConfigDiagnostic` with `level: DiagnosticLevel::Error`. They're classified as errors, not warnings.
+     - `rust/crates/runtime/src/config_validate.rs:380-395` — unknown-key detection walks the parsed object, compares keys against a hard-coded known list, emits `Error`-level diagnostics for any mismatch.
+     - `rust/crates/runtime/src/config_validate.rs` — `SCHEMA_FIELDS` or equivalent allow-list is a fixed set. There is no forward-compat extension mechanism (no `extensions` / `x-*` prefix convention, no reserved namespace, no `additionalProperties` toggle).
+     - `grep -rn "apiKeyHelper" rust/crates/runtime/` → zero matches. Claude-Code-native fields are not recognized even as no-ops; they are outright rejected.
+     - `grep -rn "ignore.*unknown\|--no-validate\|strict.*validation" rust/crates/` → zero matches. No escape hatch.
+
+     **Why this is specifically a clawability gap.**
+     1. *Forward-compat is impossible.* If a claw upgrade adds a new config field, any older binary (CI cache, legacy nodes, stuck deployments) hard-fails on the new field. This is the opposite of how tools like `cargo`, `jq`, most JSON APIs, and every serde-derived Rust config loader handle unknowns (warn or silently accept by default).
+     2. *Only `errors[0]` is reported per run.* Fixing N unknown fields requires N edit-run-fix cycles. A claw running `claw status` inside a validation loop has to re-invoke for every unknown. This joins #109 where only the first error surfaces structurally; the rest are discarded.
+     3. *Claude Code migration parity is broken.* The README and user docs for claw-code position it as Claude-Code-compatible. Users who literally `cp .claude.json .claw.json` get immediate hard-fail on `apiKeyHelper`, `env`, and other legitimate Claude Code fields. No graceful "this is a Claude Code field we don't support, ignored" message.
+     4. *Error-routing split.* With `--output-format json`, success goes to stdout, errors go to stderr. Claws orchestrating claw must capture both streams and correlate. A claw that `claw status | jq .permission_mode` silently gets empty output when config is broken — the error is invisible to the pipe consumer.
+     5. *Joins #109 (validation warnings stderr-only).* #109 said warnings are prose-on-stderr and the structured form is discarded. #116 adds: errors also go to stderr (structured as JSON this time, good), but in a hard-fail way that prevents the stdout channel from emitting ANYTHING. A claw gets either pure-JSON success or empty-stdout + JSON-error-stderr; it must always read both.
+     6. *No strict-vs-lax mode.* Tools that support forward-compat typically have two modes: strict (reject unknown) for production, lax (warn on unknown) for developer workflows. claw-code has neither toggle; it's strict always.
+     7. *Joins Claude Code migration parity cluster* (#103, #109). #103 was `claw agents` dropping non-`.toml` files. #109 was stderr-only prose warnings. **#116** is the outright rejection of Claude-Code-native config fields at load time.
+
+     **Fix shape — make unknown keys warnings by default, add explicit strict mode, collect all errors per run.**
+     1. *Downgrade `DiagnosticKind::UnknownKey` from Error to Warning by default.* The parser still surfaces the diagnostic; the CLI just doesn't halt on it. ~5 lines.
+     2. *Add `strict` mode flag.* `.claw.json` top-level `{"strictValidation": true}` OR `--strict-config` CLI flag. When set, unknown keys become errors as today. Default: off. ~15 lines.
+     3. *Collect all diagnostics, don't halt on first.* Replace `errors[0]` return with full `errors: [...]` collection, then decide fatal-or-not based on severity + strict-mode flag. ~20 lines.
+     4. *Recognize Claude-Code-native fields as explicit no-ops.* Add `apiKeyHelper`, `env`, and other known Claude Code fields to a `TOLERATED_CLAUDE_CODE_FIELDS` allow-list that emits a migration-hint warning: `"apiKeyHelper" is a Claude Code field not yet supported by claw-code; ignored.` ~30 lines.
+     5. *Include structured errors in the `--output-format json` stdout payload on hard fail.* Currently `{"type":"error","error":"..."}` goes to stderr and stdout is empty. Emit a machine-readable error envelope on stdout as well (or exclusively), with `config_diagnostics: [{level, field, location, message}]`. Keep stderr human-readable. ~15 lines.
+     6. *Add suggestion-by-default for UnknownKey.* The parser already supports `suggestion: Option<String>` in the DiagnosticKind — wire it to a fuzzy-match across the schema. `"permisions"` → `"permissions"` suggestion. ~15 lines.
+     7. *Regression tests.* (a) Forward-compat config with novel field loads without error. (b) Strict mode opt-in rejects unknown. (c) All diagnostics reported, not just first. (d) apiKeyHelper + env + other Claude Code fields produce migration-hint warning, not hard-fail. (e) `--output-format json` stdout contains error envelope on validation failure.
+
+     **Acceptance.** `cp .claude.json .claw.json && claw status` loads without hard-fail and emits a migration-hint warning for each Claude-Code-native field. `echo '{"newFutureField": 1}' > .claw.json && claw status` loads with a single warning, not a fatal error. `claw --strict-config status` retains today's strict behavior. All diagnostics are reported, not just `errors[0]`. `--output-format json` emits errors on stdout in addition to stderr.
+
+     **Blocker.** Policy decision: does the project want strict-by-default (current) or lax-by-default? The fix shape assumes lax-by-default with strict opt-in, matching industry-standard forward-compat conventions and easing Claude Code migration.
+
+     **Source.** Jobdori dogfood 2026-04-18 against `/tmp/cdRR` on main HEAD `ad02761` in response to Clawhip pinpoint nudge at `1494925472239321160`. Joins **Claude Code migration parity** (#103, #109) as 3rd member — this is the most severe migration-parity break, since it's a HARD FAIL at startup rather than a silent drop (#103) or a stderr-prose warning (#109). Joins **Reporting-surface / config-hygiene** (#90, #91, #92, #110, #115) on the error-routing-vs-stdout axis: `--output-format json` consumers get empty stdout on config errors. Joins **Silent-flag / documented-but-unenforced** (#96–#101, #104, #108, #111, #115) because only the first error is reported and all subsequent errors are silent. Cross-cluster with **Truth-audit / diagnostic-integrity** (#80–#87, #89, #100, #102, #103, #105, #107, #109, #110, #112, #114, #115) because `validation.is_ok()` hides all-but-the-first structured problem. Natural bundle: **#103 + #109 + #116** — Claude Code migration parity triangle: `claw agents` drops `.md` (loss of compatibility) + config warnings stderr-prose (loss of structure) + config unknowns hard-fail (loss of forward-compat). Also **#109 + #116** — config validation reporting surface: only first warning surfaces structurally (#109) + only first error surfaces structurally and halts loading (#116). Session tally: ROADMAP #116.