ROADMAP #103: claw agents silently drops every non-.toml file; claude-code convention .md files ignored, no content validation

Dogfooded 2026-04-18 on main HEAD 6a16f08 from /tmp/cdX.

Two-part gap on agent subsystem:

1. File-format gate silently discards .md (YAML frontmatter):
   commands/src/lib.rs:3180-3220 load_agents_from_roots filters
   extension() != 'toml' and silently continues. No log, no warn.
   .claw/agents/foo.md → agents list count: 0, doctor: ok.
   Same file renamed to .toml → discovered instantly.

2. No content validation inside accepted .toml:
   model='nonexistent/model-that-does-not-exist' → accepted.
   tools=['DoesNotExist', 'AlsoFake'] → accepted.
   reasoning_effort string → unvalidated.
   No check against model registry, tool registry, or
   reasoning-effort enum — all machinery exists elsewhere
   (#97 validates tools for --allowedTools flag).

Compounded:
- agents help JSON lists sources but NOT accepted file formats.
  Operators have zero documentation-surface way to diagnose
  'why does my .md file not work?'
- Doctor check set has no agents check. 3 files present with
  1 silently skipped → summary: 'ok'.
- Skills use .md (SKILL.md). MCP uses .json (.claw.json).
  Agents uses .toml. Three subsystems, three formats, no
  cross-subsystem consistency or documentation.
- Claude Code convention is .md with YAML frontmatter.
  Migrating operators copy that and silently fail.

Fix shape (~100 lines):
- accept .md with YAML frontmatter via existing
  parse_skill_frontmatter helper
- validate model/tools/reasoning_effort against existing
  registries; emit status: 'invalid' + validation_errors
  instead of silently accepting
- agents list summary.skipped: [{path, reason}]
- add agents doctor check (total/active/skipped/invalid)
- agents help: accepted_formats list

Joins truth-audit (#80-#84, #86, #87, #89, #100, #102) on
silent-ok-while-ignoring axis. Joins silent-flag (#96-#101) at
subsystem scale. Joins unplumbed-subsystem (#78, #96, #100,
#102) as 5th unreachable surface: load_agents_from_roots
present, parse_skill_frontmatter present, validation helpers
present, agents path calls none of them.

Also opens new 'Claude Code migration parity' cross-cluster:
claw-code silently breaks the expected convention migration
path for a first-class subsystem.

Natural bundles: #102 + #103 (subsystem-doctor-coverage),
#78 + #96 + #100 + #102 + #103 (unplumbed-surface quintet).

Filed in response to Clawhip pinpoint nudge 1494804679962661187
in #clawcode-building-in-public.
This commit is contained in:
YeonGyu-Kim 2026-04-18 06:03:22 +09:00
parent 6a16f0824d
commit 7447232688

View File

@ -2527,3 +2527,90 @@ ear], /color [scheme], /effort [low|medium|high], /fast, /summary, /tag [label],
**Blocker.** None. The cheapest tier (`which` / absolute-path existence check) is ~10 lines per server transport class and closes the "command doesn't exist on disk" gap entirely. Deeper handshake probes can be added later behind an opt-in `--probe` flag.
**Source.** Jobdori dogfood 2026-04-18 against `/tmp/cdW2` on main HEAD `eabd257` in response to Clawhip pinpoint nudge at `1494797126041862285`. Joins the **unplumbed-subsystem** cross-cluster with #78 (`claw plugins` route never constructed) and #100 (stale-base JSON-invisible) — same shape: machinery exists, diagnostic surface doesn't expose it. Joins **truth-audit / diagnostic-integrity** (#80-#84, #86, #87, #89, #100) because `doctor: ok` is a lie when MCP servers are unreachable. Directly implements the roadmap's own Phase 1 §3.5 (boot preflight), Phase 2 §4 (canonical lane events), Phase 4.4.4 (event provenance), and Product Principle #5 (partial success is first-class). Natural bundle: **#78 + #100 + #102** (unplumbed-surface quartet, now with #96) — four surfaces where the subsystem exists but the JSON diagnostic doesn't expose it; tight family PR. Also **#100 + #102** as the pure "doctor surface coverage" 2-way: #100 surfaces commit identity, #102 surfaces MCP reachability, together they let `claw doctor` actually live up to its name.
103. **`claw agents` silently discards every agent definition that is not a `.toml` file — including `.md` files with YAML frontmatter, which is the Claude Code convention that most operators will reach for first. A `.claw/agents/foo.md` file is silently skipped by the agent-discovery walker; `agents list` reports zero agents; doctor reports ok; neither `agents help` nor `--help` nor any docs mention that `.toml` is the accepted format — the gate is entirely code-side and invisible at the operator layer. Compounded by the agent loader not validating *any* of the values inside a discovered `.toml` (model names, tool names, reasoning effort levels) — so the `.toml` gate filters *form* silently while downstream ignores *content* silently** — dogfooded 2026-04-18 on main HEAD `6a16f08` from `/tmp/cdX`. A `.claw/agents/broken.md` with claude-code-style YAML frontmatter is invisible to `agents list`. The same content moved into `.claw/agents/broken.toml` is loaded instantly — including when it references `model: "nonexistent/model-that-does-not-exist"` and `tools: ["DoesNotExist", "AlsoFake"]`, both of which are accepted without complaint.
**Concrete repro.**
```
$ mkdir -p /tmp/cdX/.claw/agents
$ cat > /tmp/cdX/.claw/agents/broken.md << 'MD'
---
name: broken
description: Test agent with garbage
model: nonexistent/model-that-does-not-exist
tools: ["DoesNotExist", "AlsoFake"]
---
You are a test agent.
MD
$ claw --output-format json agents list | jq '{count, agents: .agents | length, summary}'
{"count": 0, "agents": 0, "summary": {"active": 0, "shadowed": 0, "total": 0}}
# .md file silently skipped — no log, no warning, no doctor signal
$ claw --output-format json doctor | jq '.has_failures, .summary'
false
{"failures": 0, "ok": 4, "total": 6, "warnings": 2}
# doctor: clean
# Now rename the SAME content to .toml:
$ mv /tmp/cdX/.claw/agents/broken.md /tmp/cdX/.claw/agents/broken.toml
# ... (adjusting content to TOML syntax instead of YAML frontmatter)
$ cat > /tmp/cdX/.claw/agents/broken.toml << 'TOML'
name = "broken"
description = "Test agent with garbage"
model = "nonexistent/model-that-does-not-exist"
tools = ["DoesNotExist", "AlsoFake"]
TOML
$ claw --output-format json agents list | jq '.agents[0] | {name, model}'
{"name": "broken", "model": "nonexistent/model-that-does-not-exist"}
# File format (.toml) passes the gate. Garbage content (nonexistent model,
# fake tool names) is accepted without validation.
$ claw --output-format json agents help | jq '.usage'
{
"direct_cli": "claw agents [list|help]",
"slash_command": "/agents [list|help]",
"sources": [".claw/agents", "~/.claw/agents", "$CLAW_CONFIG_HOME/agents"]
}
# Help lists SOURCES but not the required FILE FORMAT.
```
**Trace path.**
- `rust/crates/commands/src/lib.rs:3180-3220``load_agents_from_roots`:
```rust
for entry in fs::read_dir(root)? {
let entry = entry?;
if entry.path().extension().is_none_or(|ext| ext != "toml") {
continue;
}
let contents = fs::read_to_string(entry.path())?;
// ... parse_toml_string(&contents, "name") etc.
}
```
The `extension() != "toml"` check silently drops every non-TOML file. No log. No warning. No collection of skipped-file names for later display. `grep -rn 'extension().*"md"\|parse_yaml_frontmatter\|yaml_frontmatter' rust/crates/commands/src/lib.rs` — zero hits. No code anywhere reads `.md` as an agent source.
- `rust/crates/commands/src/lib.rs``parse_toml_string(&contents, "name")` — falls back to filename stem if parsing fails. Thus a `.toml` file that is *not actually TOML* would still be "discovered" with the filename as the name. `parse_toml_string` presumably handles `description`/`model`/`reasoning_effort` similarly. No structural validation.
- `rust/crates/commands/src/lib.rs` — no validation of `model` against a known-model list, no validation of `tools[]` entries against the canonical tool registry (the registry exists, per #97). Garbage model names and nonexistent tool names flow straight into the `AgentSummary`.
- The `agents help` output emitted at `commands/src/lib.rs` (rendered via `render_agents_help`) exposes the three search roots but *not* the required file extension. A claude-code-migrating operator who drops a `.md` file into `.claw/agents/` gets silent failure and no help-surface hint.
- Skills use `.md` via `SKILL.md`, scanned at `commands/src/lib.rs:3229-3260`. MCP uses `.json` via `.claw.json`. Agents use `.toml`. Three subsystems, three formats, zero consistency documentation; only one of them silently discards the claude-code-convention format.
**Why this is specifically a clawability gap.**
1. *Silent-discard discovery.* Same family as the #96/#97/#98/#99/#100/#101/#102 silent-failure class, now on the agent-registration axis. An operator thinks they defined an agent; claw thinks no agent was defined; doctor says ok. The ground truth mismatch surfaces only when the agent tries to invoke `/agent spawn broken` and the name isn't resolvable — and even then the error is "agent not found" rather than "agent file format wrong."
2. *Claude Code convention collision.* The Anthropic Claude Code reference for agents uses `.md` with YAML frontmatter. Migrating operators copy that convention over. claw-code silently drops their files. There is no migration shim, no "we detected 1 .md file in .claw/agents/ but we only read .toml; did you mean to use TOML format? see docs/agents.md" warning.
3. *Help text is incomplete.* `agents help` lists search directories but not the accepted file format. The operator has nothing documentation-side to diagnose "why does `.md` not work?" without reading source.
4. *No content validation inside accepted files.* Even when the `.toml` gate lets a file through, claw does not validate `model` against the model registry, `tools[]` against the tool registry, `reasoning_effort` against the valid `low|medium|high` set (#97 validated tools for CLI flag but not here). Garbage-in, garbage-out: the agent definition is accepted, stored, listed, and will only fail when actually invoked.
5. *Doctor has no agent check.* The doctor check set is `auth / config / install_source / workspace / sandbox / system`. No `agents` check surfaces "3 files in .claw/agents, 2 accepted, 1 silently skipped because format." Pairs directly with #102's missing `mcp` check — both are doctor-coverage gaps on subsystems that are already implemented.
6. *Format asymmetry undermines plugin authoring.* A plugin or skill author who writes an `.md` agent file for distribution (to match the broader Claude Code ecosystem) ships a file that silently does nothing in every claw-code workspace. The author gets no feedback; the users get no signal. A migration path from claude-code → claw-code for agent definitions is effectively silently broken.
**Fix shape — accept `.md` (YAML frontmatter) as an agent source, validate contents, surface skipped files in doctor.**
1. *Accept `.md` with YAML frontmatter.* Extend `load_agents_from_roots` to also read `.md` files. Reuse the same `parse_skill_frontmatter` helper that skills discovery at `:3229` already uses. If both `foo.toml` and `foo.md` exist, prefer `.toml` but record a `conflict: true` flag in the summary. ~30 lines.
2. *Validate agent content against registries.* Check `model` is a known alias or provider/model string. Check `tools[]` entries exist in the canonical tool registry (shared with #97's proposed validation). Check `reasoning_effort` is in `low|medium|high`. On failure, include the agent in the list with `status: "invalid"` and a `validation_errors` array. Do not silently drop. ~40 lines.
3. *Emit skipped-file counts in `agents list`.* Add `summary: {total, active, shadowed, skipped: [{path, reason}]}` so an operator can see that their `.md` file was not a `.toml` file. ~10 lines.
4. *Add an `agents` doctor check.* Sum across roots: total files present, format-skipped, parse-errored, validation-invalid, active. Emit warn if any files were skipped or parse-failed. ~25 lines.
5. *Update `agents help` to name the accepted file formats.* Add an `accepted_formats: [".toml", ".md (YAML frontmatter)"]` field to the help JSON and mention it in text-mode help. ~5 lines.
6. *Regression tests.* One per format. One for shadowing between `.toml` and `.md`. One for garbage model/tools content. One for doctor-check agent-skipped signal.
**Acceptance.** `claw --output-format json agents list` with a `.claw/agents/foo.md` file exposes the agent (or exposes it with `status: "invalid"` if the frontmatter is malformed) instead of silently dropping it. `claw doctor` emits an `agents` check reporting total/active/skipped counts and a warn status when any file was skipped or parse-failed. `agents help` documents the accepted file formats. Garbage `model`/`tools[]` values surface as `validation_errors` in the agent summary rather than being stored and only failing at invocation.
**Blocker.** None. Three-source agent discovery (`.toml`, `.md`, shared helpers) is ~30 lines. Content validation using existing tool-registry + model-alias machinery is ~40 lines. Doctor check is ~25 lines. All additive; no breaking changes for existing `.toml`-only configs.
**Source.** Jobdori dogfood 2026-04-18 against `/tmp/cdX` on main HEAD `6a16f08` in response to Clawhip pinpoint nudge at `1494804679962661187`. Joins **truth-audit / diagnostic-integrity** (#80-#84, #86, #87, #89, #100, #102) on the agent-discovery axis: another "subsystem silently reports ok while ignoring operator input." Joins **silent-flag / documented-but-unenforced** (#96-#101) on the silent-discard dimension (but subsystem-scale rather than flag-scale). Joins **unplumbed-subsystem** (#78, #96, #100, #102) as the fifth surface with machinery present but operator-unreachable: `load_agents_from_roots` exists, `parse_skill_frontmatter` exists (used for skills), validation helpers exist (used for `--allowedTools`) — the agents path just doesn't call any of them beyond TOML parsing. Natural bundle: **#102 + #103** (subsystem-doctor-coverage 2-way — MCP liveness + agent-format validity); also **#78 + #96 + #100 + #102 + #103** as the unplumbed-surface quintet. And cross-cluster with **Claude Code migration parity** (no other ROADMAP entry captures this yet) — claw-code silently breaks an expected migration path for a first-class subsystem.