ROADMAP #100: claw status/doctor JSON expose no commit identity; stale-base subsystem unplumbed

Dogfooded 2026-04-18 on main HEAD 63a0d30 from /tmp/cdU + /tmp/cdO*.

Three-fold gap:
1. status/doctor JSON workspace object has 13 fields; none of them
   contain: head_sha, head_short_sha, expected_base, base_source,
   stale_base_state, upstream, ahead, behind, merge_base, is_detached,
   is_bare, is_worktree. A claw cannot answer 'is this lane at the
   expected base?' from the JSON surface alone.

2. --base-commit flag is silently accepted by status/doctor/sandbox/
   init/export/mcp/skills/agents and silently dropped on dispatch.
   Same silent-no-op class as #98. A claw running
   'claw --base-commit $expected status' gets zero effect — flag
   parses into a local, discharged at dispatch.

3. runtime::stale_base subsystem is FULLY implemented with 30+ tests
   (BaseCommitState, BaseCommitSource, resolve_expected_base,
   read_claw_base_file, check_base_commit, format_stale_base_warning).
   run_stale_base_preflight at main.rs:3058 calls it from Prompt/Repl
   only, writes output to stderr as human prose. .claw-base file is
   honored internally but invisible to status/doctor JSON. Complete
   implementation, wrong dispatch points.

Plus: detached HEAD reported as magic string 'git_branch: "detached HEAD"'
without accompanying SHA. Bare repo/worktree/submodule indistinguishable
from regular repo in JSON. parse_git_status_branch has latent dot-split
truncation bug on branch names like 'feat.ui' with upstream.

Hits roadmap Product Principle #4 (Branch freshness before blame) and
Phase 2 §4.2 (branch.stale_against_main event) directly — both
unimplementable without commit identity in the JSON surface.

Fix shape (~80 lines plumbing):
- add head_sha/head_short_sha/is_detached/head_ref/is_bare/is_worktree
- add base_commit: {source, expected, state}
- add upstream: {ref, ahead, behind, merge_base}
- wire --base-commit into CliAction::Status + CliAction::Doctor
- add stale_base doctor check
- fix parse_git_status_branch dot-split at :2541

Cross-cluster: truth-audit/diagnostic-integrity (#80-#87, #89) +
silent-flag (#96-#99) + unplumbed-subsystem (#78). Natural bundles:
#89+#100 (git-state completeness) and #78+#96+#100 (unplumbed surface).

Milestone: ROADMAP #100.

Filed in response to Clawhip pinpoint nudge 1494782026660712672
in #clawcode-building-in-public.
This commit is contained in:
YeonGyu-Kim 2026-04-18 04:36:47 +09:00
parent 63a0d30f57
commit d63d58f3d0

View File

@ -2282,3 +2282,69 @@ ear], /color [scheme], /effort [low|medium|high], /fast, /summary, /tag [label],
**Blocker.** None. Two parser changes of ~5-10 lines each plus regression tests. `chrono` dep check is the only minor question.
**Source.** Jobdori dogfood 2026-04-18 against `/tmp/cdN` on main HEAD `0e263be` in response to Clawhip pinpoint nudge at `1494774477009981502`. Joins the **silent-flag no-op / documented-but-unenforced class** with #96 / #97 / #98 but is qualitatively more severe: the failure mode is *system-prompt injection*, not a silent feature no-op. Cross-cluster with the **truth-audit / diagnostic-integrity bundle** (#80#87, #89): both are about "the prompt/diagnostic surface should not lie, and should not be a vehicle for external tampering." Natural sibling of **#83** (system-prompt date = build date) and **#84** (dump-manifests bakes build-machine abs path) — all three are about the system-prompt / manifest surface trusting compile-time or operator-supplied values that should be validated or dynamically sourced.
100. **`claw status` / `claw doctor` JSON surfaces expose no commit identity: no HEAD SHA, no expected-base SHA, no stale-base state, no upstream tracking info (ahead/behind), no merge-base — making the "branch-freshness before blame" principle from this very roadmap (§Product Principles #4) unachievable without a claw shelling out to `git rev-parse HEAD` / `git merge-base` / `git rev-list` itself. The `--base-commit` flag is silently accepted by `status` / `doctor` / `sandbox` / `init` / `export` / `mcp` / `skills` / `agents` and silently dropped — same silent-no-op pattern as #98 but on the stale-base axis. The `.claw-base` file support exists in `runtime::stale_base` but is invisible to every JSON diagnostic surface. Even the detached-HEAD signal is a magic string (`git_branch: "detached HEAD"`) rather than a typed state, with no accompanying commit SHA to tell *which* commit HEAD is detached on** — dogfooded 2026-04-18 on main HEAD `63a0d30` from `/tmp/cdU` and scratch repos under `/tmp/cdO*`. `claw --base-commit abc1234 status` exits 0 with identical JSON to `claw status`; the flag had zero effect on the status/doctor surface. `run_stale_base_preflight` at `main.rs:3058` is wired into `CliAction::Prompt` and `CliAction::Repl` dispatch paths only, and it writes its output to stderr as human prose — never into the JSON envelope.
**Concrete repro.**
```
$ cd /tmp/cdU && git init -q .
$ echo "h" > f && git add f && git -c user.email=x -c user.name=x commit -q -m first
# status JSON — what's missing
$ ~/clawd/claw-code/rust/target/release/claw --output-format json status | jq '.workspace'
{
"changed_files": 0,
"cwd": "/private/tmp/cdU",
"discovered_config_files": 5,
"git_branch": "master",
"git_state": "clean",
"loaded_config_files": 0,
"memory_file_count": 0,
"project_root": "/private/tmp/cdU",
"session": "live-repl",
"session_id": null,
"staged_files": 0,
"unstaged_files": 0,
"untracked_files": 0
}
#
100. **`claw status` / `claw doctor` JSON surfaces expose no commit identity: no HEAD SHA, no expected-base SHA, no stale-base state, no upstream tracking info (ahead/behind), no merge-base — making the "branch-freshness before blame" principle from this very roadmap (Product Principle 4) unachievable without a claw shelling out to `git rev-parse HEAD` / `git merge-base` / `git rev-list` itself. The `--base-commit` flag is silently accepted by `status` / `doctor` / `sandbox` / `init` / `export` / `mcp` / `skills` / `agents` and silently dropped — same silent-no-op pattern as #98 but on the stale-base axis. The `.claw-base` file support exists in `runtime::stale_base` but is invisible to every JSON diagnostic surface. Even the detached-HEAD signal is a magic string (`git_branch: "detached HEAD"`) rather than a typed state, with no accompanying commit SHA to tell *which* commit HEAD is detached on** — dogfooded 2026-04-18 on main HEAD `63a0d30` from `/tmp/cdU` and scratch repos under `/tmp/cdO*`. `claw --base-commit abc1234 status` exits 0 with identical JSON to `claw status`; the flag had zero effect on the status/doctor surface. `run_stale_base_preflight` at `main.rs:3058` is wired into `CliAction::Prompt` and `CliAction::Repl` dispatch paths only, and it writes its output to stderr as human prose — never into the JSON envelope.
**Concrete repro.**
- `claw --output-format json status | jq '.workspace'` in a fresh repo returns 13 fields: `changed_files`, `cwd`, `discovered_config_files`, `git_branch`, `git_state`, `loaded_config_files`, `memory_file_count`, `project_root`, `session`, `session_id`, `staged_files`, `unstaged_files`, `untracked_files`. No `head_sha`. No `head_short_sha`. No `expected_base`. No `base_source`. No `stale_base_state`. No `upstream`. No `ahead`. No `behind`. No `merge_base`. No `is_detached`. No `is_bare`. No `is_worktree`.
- `claw --base-commit $(git rev-parse HEAD) --output-format json status` produces byte-identical output to `claw --output-format json status`. The flag is parsed into a local variable (`main.rs:487-496`) then silently dropped on dispatch to `CliAction::Status { model, permission_mode, output_format }` which has no base_commit field.
- `echo "abc1234" > .claw-base && claw --output-format json doctor | jq '.checks'` returns six standard checks (`auth`, `config`, `install_source`, `workspace`, `sandbox`, `system`). No `stale_base` check. No mention of `.claw-base` anywhere in the doctor report, despite `runtime::stale_base::read_claw_base_file` existing and being tested.
- In a bare repo: `claw --output-format json status | jq '.workspace'` returns `project_root: null` but `git_branch: "master"` — no flag that this is a bare repo.
- In a detached HEAD (tag checkout): `git_branch: "detached HEAD"` and nothing else. The claw has no way to know the underlying commit SHA from this output alone.
- In a worktree: `project_root` points at the worktree directory, not the underlying main gitdir. No `worktree: true` flag. No reference to the parent.
**Trace path.**
- `rust/crates/runtime/src/stale_base.rs:1-122` — the full stale-base subsystem exists: `BaseCommitState` (Matches / Diverged / NoExpectedBase / NotAGitRepo), `BaseCommitSource` (Flag / File), `resolve_expected_base`, `read_claw_base_file`, `check_base_commit`, `format_stale_base_warning`. Complete implementation. 30+ unit tests in the same file.
- `rust/crates/rusty-claude-cli/src/main.rs:3058-3067``run_stale_base_preflight` uses the stale-base subsystem and writes warnings to `eprintln!`. It is called from exactly two places: the `Prompt` dispatch (line 236) and the `Repl` dispatch (line 3079).
- `rust/crates/rusty-claude-cli/src/main.rs:218-222``CliAction::Status { model, permission_mode, output_format }` has three fields; no `base_commit`, no plumbing to `check_base_commit`.
- `rust/crates/rusty-claude-cli/src/main.rs:1478-1508``render_doctor_report` calls `ProjectContext::discover_with_git` which populates `git_status` and `git_diff` but *not* `head_sha`. The resulting doctor check set (line 1506-1511) has no stale-base check.
- `rust/crates/rusty-claude-cli/src/main.rs:487-496``--base-commit` is parsed into a local `base_commit: Option<String>` but only reaches `CliAction::Prompt` / `CliAction::Repl`. `CliAction::Status`, `Doctor`, `Sandbox`, `Init`, `Export`, `Mcp`, `Skills`, `Agents` all silently drop the value.
- `rust/crates/rusty-claude-cli/src/main.rs:2535-2548``parse_git_status_branch` returns the literal string `"detached HEAD"` when the first line of `git status --short --branch` starts with `## HEAD`. This is a sentinel value masquerading as a branch name. Neither the status JSON nor the doctor JSON exposes a typed `is_detached: bool` alongside; a claw has to string-compare against the magic sentinel.
- `rust/crates/runtime/src/git_context.rs:13``GitContext` exists and is computed by `ProjectContext::discover_with_git` but its contents are never surfaced into the status/doctor JSON. It is read internally for render-into-system-prompt and then discarded.
**Why this is specifically a clawability gap.**
1. *The roadmap's own product principles say this should work.* Product Principle #4 ("Branch freshness before blame — detect stale branches before treating red tests as new regressions"). Roadmap Phase 2 item §4.2 ("Canonical lane event schema" — `branch.stale_against_main`). The diagnostic substrate to *implement* any of those is missing: without HEAD SHA in the status JSON, a claw orchestrating lanes has no way to check freshness against a known base commit.
2. *The machinery exists but is unplumbed.* `runtime::stale_base` is a complete implementation with 30+ tests. It is wired into the REPL and Prompt paths — exactly where it is *least* useful for machine orchestration. It is *not* wired into `status` / `doctor` — exactly where it *would* be useful. The gap is plumbing, not design.
3. *Silent `--base-commit` on status/doctor.* Same silent-no-op class as #98 (`--compact`) and #97 (`--allowedTools ""`). A claw that adopts `claw --base-commit $expected status` as its stale-base preflight gets *no warning* that its own preflight was a no-op. The flag parses, lands in a local variable, and is discharged at dispatch.
4. *Detached HEAD is a magic string.* `git_branch: "detached HEAD"` is a sentinel value that a claw must string-match. A proper surface would be `is_detached: true, head_sha: "<sha>", head_ref: null`. Pairs with #99 (system-prompt surface) on the "sentinel strings instead of typed state" failure mode.
5. *Bare / worktree / submodule status is erased.* Bare repo shows `project_root: null` with no `is_bare: true` flag. A worktree shows `project_root` at the worktree dir with no reference to the gitdir or a sibling worktree. A submodule looks identical to a standalone repo. A claw orchestrating multi-worktree lanes (the central use case the roadmap prescribes) cannot distinguish these from JSON alone.
6. *Latent parser bug — `parse_git_status_branch` splits branch names on `.` and space.* `main.rs:2541``let branch = line.split(['.', ' ']).next().unwrap_or_default().trim();`. A branch named `feat.ui` with an upstream produces the `## feat.ui...origin/feat.ui` first line; the parser splits on `.` and takes the first token, yielding `feat` (silently truncated). This is masked in most real runs because `resolve_git_branch_for` (which uses `git branch --show-current`) is tried first, but the fallback path still runs when `--show-current` is unavailable (git < 2.22, or sandboxed PATHs without the full git binary) and in the existing unit test at `:10424`. Latent truncation bug.
**Fix shape — surface commit identity + wire the stale-base subsystem into the JSON diagnostic path.**
1. *Extend the status JSON workspace object with commit identity.* Add `head_sha`, `head_short_sha`, `is_detached`, `head_ref` (branch or tag name, `None` when detached), `is_bare`, `is_worktree`, `gitdir`. All read-only; all computable from `git rev-parse --verify HEAD`, `git rev-parse --is-bare-repository`, `git rev-parse --git-dir`, and the existing `resolve_git_branch_for`. ~40 lines in the status builder.
2. *Extend the status JSON workspace object with base-commit state.* Add `base_commit: { source: "flag"|"file"|null, expected: "<sha>"|null, state: "matches"|"diverged"|"no_expected_base"|"not_a_git_repo" }`. Populates from `resolve_expected_base` + `check_base_commit` (already implemented). ~15 lines.
3. *Extend the status JSON workspace object with upstream tracking.* Add `upstream: { ref: "<remote/branch>"|null, ahead: <int>, behind: <int>, merge_base: "<sha>"|null }`. Computable from `git for-each-ref --format='%(upstream:short)'` and `git rev-list --left-right --count HEAD...@{upstream}` (only when an upstream is configured). ~25 lines.
4. *Wire `--base-commit` into `CliAction::Status` and `CliAction::Doctor`.* Add `base_commit: Option<String>` to both variants and pipe through to the JSON builder. Add a `stale_base` doctor check with `status: ok|warn|fail` based on `BaseCommitState`. ~20 lines.
5. *Fix the `parse_git_status_branch` dot-split bug.* Change `line.split(['.', ' ']).next()` at `:2541` to something that correctly isolates the branch name from the upstream suffix `...origin/foo` (the actual delimiter is the literal string `"..."`, not `.` alone). ~3 lines.
6. *Regression tests.* One per new JSON field in each of the covered git states (clean / dirty / detached / tag checkout / bare / worktree / submodule / stale-base-match / stale-base-diverged / upstream-ahead / upstream-behind). Plus the `feat.ui` branch-name test for the parser fix.
**Acceptance.** `claw --output-format json status | jq '.workspace'` exposes `head_sha`, `head_short_sha`, `is_detached`, `head_ref`, `is_bare`, `is_worktree`, `base_commit`, `upstream`. A claw can do `claw --base-commit $expected --output-format json status | jq '.workspace.base_commit.state'` and get `"matches"` / `"diverged"` without shelling out to `git rev-parse`. The `.claw-base` file is honored by both `status` and `doctor`. `claw doctor` emits a `stale_base` check. `parse_git_status_branch` correctly handles branch names containing dots.
**Blocker.** None. Four additive JSON field groups (~80 lines total) plus one-flag-plumbing change and one three-line parser fix. The underlying stale-base subsystem and git helpers are all already implemented — this is strictly plumbing + surfacing.
**Source.** Jobdori dogfood 2026-04-18 against `/tmp/cdU` + `/tmp/cdO*` scratch repos on main HEAD `63a0d30` in response to Clawhip pinpoint nudge at `1494782026660712672`. Cross-cluster find: primary cluster is **truth-audit / diagnostic-integrity** (joins #80#87, #89) — the status/doctor JSON lies by omission about the git state it claims to report. Secondary cluster is **silent-flag / documented-but-unenforced** (joins #96, #97, #98, #99) — the `--base-commit` flag is a silent no-op on status/doctor. Tertiary cluster is **unplumbed-subsystem**`runtime::stale_base` is fully implemented but only reachable via stderr in the Prompt/Repl paths; this is the same shape as the `claw plugins` CLI route being wired but never constructed (#78). Natural bundle candidates: **#89 + #100** (git-state completeness sweep — #89 adds mid-operation states, #100 adds commit identity + stale-base + upstream); **#78 + #96 + #100** (unplumbed-surface triangle — CLI route never wired, help-listing unfiltered, subsystem present but JSON-invisible). Hits the roadmap's own Product Principle #4 and Phase 2 §4.2 directly — making this pinpoint the most load-bearing of the 20 items filed this dogfood session for the "branch freshness" product thesis. Milestone: ROADMAP #100.